CN111402576A

CN111402576A - Urban road traffic state prediction system based on deep learning

Info

Publication number: CN111402576A
Application number: CN202010024882.0A
Authority: CN
Inventors: 郝威; 易可夫; 高志波; 张兆磊; 戎栋磊; 王杰; 王正武
Original assignee: Changsha University of Science and Technology
Current assignee: Changsha University of Science and Technology
Priority date: 2020-01-10
Filing date: 2020-01-10
Publication date: 2020-07-10

Abstract

A deep learning-based urban road traffic state prediction system performs prediction using the following modules: a traffic environment module, a memory library module, a neural network module, a training and promoting network module, a visualization module and an interaction module. The traffic environment module comprises an acquisition module and a preprocessing module. The acquisition module acquires urban road position information, peak-hour average vehicle speed information, air temperature information, precipitation probability information and congestion length information. The preprocessing module preprocesses the traffic data using Lagrange interpolation and normalization to obtain reliable data for traffic prediction and stores the resulting data in the memory library module; a neural network is then constructed from the reliable data, thereby building a deep recurrent learning network of the traffic state.

Description

Urban road traffic state prediction system based on deep learning

Technical Field

The invention belongs to the field of urban traffic system analysis and traffic condition prediction, and particularly relates to an urban road traffic state prediction system based on deep learning.

Background

With the development of urbanization, the mismatch between traffic infrastructure and the number of vehicles in use has become more severe and congestion more serious, inevitably causing economic loss, longer travel times and environmental pollution. Managing traffic congestion starts with prevention: the short-term trend of the traffic state is predicted from the current state of the road, and early warning is given of possible congestion. Information platforms such as traffic radio and microblogs are then used to issue the warning, guide vehicles to choose reasonable routes and strengthen traffic order management, so as to avoid congestion or reduce its severity. How to build a model with long-term effectiveness that gives timely early warning of traffic congestion is therefore a research hotspot in optimizing urban intelligent transportation systems.

Research at home and abroad on traffic congestion prediction mainly includes prediction analysis based on time-series correlation, machine learning prediction and multi-classification combined prediction, but all of these have shortcomings of varying degree. For example, non-parametric regression relies on a large amount of historical data and on many assumptions, and is therefore difficult to apply to traffic flow with non-linear characteristics. Models based on multilayer perceptron neural networks and back-propagation neural networks are more effective, but their training takes a long time and easily falls into local optima. Most machine learning prediction methods lack robustness when processing big data, so the models generally lack long-term effectiveness and scalability. Deep learning, which recognizes data patterns by simulating the multilayer perceptual structure of the human brain, speeds up data processing but does not address the curse of dimensionality caused by the high-dimensional state of traffic flow parameters.

The model is based on Long Short-Term Memory (LSTM), an improved recurrent neural network (RNN) that can learn long-term dependencies in a time series. Its depth is reflected not only between input and output but also across time steps, which makes it suitable for capturing the spatio-temporal evolution of traffic state parameters such as traffic flow and speed.

Disclosure of Invention

The technical problem to be solved by the invention is to provide an urban road traffic state prediction system based on deep learning, together with a method for predicting future traffic states and delay indexes from historical data. This addresses problems such as excessive data volume, exploding gradients and vanishing gradients, takes the use of multidimensional traffic data fully into account, and further improves the efficiency of traffic delay prediction.

To achieve this purpose, the technical scheme of the invention is as follows: a deep learning-based urban road traffic state prediction system performs prediction using the following modules: a traffic environment module, a memory library module, a neural network module, a training and promoting network module, a visualization module and an interaction module;

the traffic environment module comprises an acquisition module and a preprocessing module, wherein the acquisition module acquires urban road position information, peak-hour average vehicle speed information, air temperature information, precipitation probability information and congestion length information; the preprocessing module preprocesses the traffic data using Lagrange interpolation and normalization to obtain reliable data for traffic prediction and stores the resulting data in the memory library module, and a neural network is constructed from the reliable data so as to build a deep recurrent learning network of the traffic state; the specific preprocessing method comprises the following steps:

Abnormal values are interpolated by the Lagrange interpolation method and some invalid and abnormal data are deleted, so as to improve the value of the data; the data is converted into a list to form a matrix, and the tensor is converted to 3 dimensions for use as LSTM cell input, wherein the Lagrange interpolation function formula is as follows:

In the formula, $y_i$ denotes the polynomial of degree $i-1$, and $x_i$ denotes the parameter corresponding to point $i$.
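As a minimal sketch of this interpolation step, the code below assumes the standard Lagrange form $L(x) = \sum_i y_i \prod_{j \neq i} (x - x_j)/(x_i - x_j)$. The helper `fill_missing`, its window size `k`, and the use of `None` to mark an invalid or abnormal reading are illustrative assumptions and are not taken from the invention.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


def fill_missing(series, k=2):
    """Replace None entries using up to k valid neighbours on each side.

    Hypothetical helper: `k` and the None marker are assumptions made
    for illustration only.
    """
    filled = list(series)
    for idx, value in enumerate(series):
        if value is not None:
            continue
        left = [i for i in range(idx - 1, -1, -1) if series[i] is not None][:k]
        right = [i for i in range(idx + 1, len(series)) if series[i] is not None][:k]
        nodes = sorted(left + right)
        if len(nodes) < 2:
            continue  # too few neighbours to interpolate; leave the gap
        filled[idx] = lagrange_interpolate(nodes, [series[i] for i in nodes], idx)
    return filled


speeds = [30.0, 32.0, None, 36.0, 38.0]  # made-up average speeds, not measured data
print(fill_missing(speeds))
```

Using only a few neighbours on each side keeps the polynomial degree low, which is a common way to limit oscillation in Lagrange interpolation; the source does not state which window it uses.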

Data is processed precisely by a normalization function, which prevents small-valued data from being dominated by large-valued data across different attributes during analysis, thereby ensuring the accuracy and reliability of the test result. The expression of the normalization function is as follows:

X.normalize = (X - X.mean)/X.std

In the formula, X.normalize denotes the normalized data, X.mean denotes the mean, and X.std denotes the standard deviation.
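The expression above can be sketched in a few lines of Python. This is an illustration under assumptions: the text does not say whether the population or the sample standard deviation is meant (the population form is used here), and applying the function separately to each attribute is also an assumption. The function name and the sample values are hypothetical.

```python
import math


def z_score_normalize(values):
    """Return (x - mean) / std for each x, using the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0 for _ in values]  # constant column: no spread to scale by
    return [(v - mean) / std for v in values]


# Made-up attribute columns with very different magnitudes.
congestion_length = [1.2, 0.4, 2.8, 1.6]
air_temperature = [18.0, 21.0, 25.0, 16.0]
normalized = [z_score_normalize(col) for col in (congestion_length, air_temperature)]
print(normalized)
```

After this step each attribute has zero mean and unit spread, so no attribute outweighs another merely because of its units.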

A training set and a test set are selected; the output and input variables and the number of network layers are determined; the initial weights, threshold, learning rate, activation function and training function are determined, and the neural network model is trained. Network training stops when the feedback reaches the optimal state of the Q-value table; if it does not, the parameter values are corrected and adjusted. The parameters are tuned so that prediction on the input test set data yields the optimal prediction result, and the prediction results are analyzed to obtain the final prediction result;

the interaction module selects the three best prediction results and passes them to the visualization module, so that road travel information is provided to the user.

Further, the method comprises: running the prediction to obtain a visualized loss function curve and a Q-value graph.

Further, the method comprises: analyzing the sensitivity of the prediction result to the set parameters. The evaluation uses a three-factor, three-level orthogonal test. The three factors are learning_rate, reward_delay and e_greedy, denoted A, B and C, with corresponding levels 0.01/0.03/0.05, 0.9/0.8/0.7 and 0.9/0.8/0.7, and the degree of decline of the loss curve is the criterion for judging the optimum. Fix A and B at levels A₁ and B₁ and combine them with the three levels of C: A₁B₁C₁, A₁B₁C₂ and A₁B₁C₃. If A₁B₁C₃ is optimal, take level C₃. Then fix A₁ and C₃ and combine them with the other two levels of factor B: A₁B₂C₃ and A₁B₃C₃. If A₁B₂C₃ is optimal after the test, take levels B₂ and C₃ and run two further tests, A₂B₂C₃ and A₃B₂C₃. If A₃B₂C₃ is optimal, it is the best level combination.
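The level-by-level procedure described above can be sketched as a one-factor-at-a-time search. In this sketch the function `sequential_level_search` and the scoring function `toy_score` are hypothetical: `toy_score` is a made-up stand-in for judging the decline of the loss curve, chosen only so that the example reproduces the A₃B₂C₃ walk-through, and it does not represent measured results.

```python
def sequential_level_search(levels, evaluate, order):
    """Tune one factor at a time, keeping the best level found so far.

    levels:   dict mapping factor name -> list of candidate levels
    evaluate: callable taking a dict of chosen levels and returning a
              score to minimise (stand-in for inspecting the loss curve)
    order:    sequence of factor names, in the order they are tuned
    """
    best = {name: options[0] for name, options in levels.items()}  # A1, B1, C1
    trials = []
    for name in order:
        scored = []
        for option in levels[name]:
            candidate = dict(best, **{name: option})
            score = evaluate(candidate)
            scored.append((score, option))
            trials.append((candidate, score))
        # Note: the incumbent level is re-evaluated in each round, so this
        # runs 9 trials where the text's procedure needs only 3 + 2 + 2.
        best[name] = min(scored, key=lambda pair: pair[0])[1]
    return best, trials


levels = {
    "learning_rate": [0.01, 0.03, 0.05],  # factor A
    "reward_delay": [0.9, 0.8, 0.7],      # factor B
    "e_greedy": [0.9, 0.8, 0.7],          # factor C
}


def toy_score(params):
    """Made-up score with its minimum at A3, B2, C3 (illustration only)."""
    return ((params["learning_rate"] - 0.05) ** 2
            + (params["reward_delay"] - 0.8) ** 2
            + (params["e_greedy"] - 0.7) ** 2)


best, trials = sequential_level_search(
    levels, toy_score, order=("e_greedy", "reward_delay", "learning_rate"))
print(best)
```

A search of this kind tests far fewer combinations than the full 27-cell grid, at the cost of assuming that the factors do not interact strongly.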

The algorithm steps for traffic delay index prediction analysis using the LSTM algorithm and Q-learning are as follows:

1): Initialize the DRQN network structure with parameters q, and initialize the target network with parameters q' = q;

2): Initialize the greedy parameter, learning rate, reward, decay coefficient, number of iterations, number of steps T per episode, and the training and neural network parameter replacement period;

3): for episode in Episodes do;

4): Initial traffic state $S_t = S_0$;

5): for t from 0 to T;

6): Select the behavior (output an integer in the range 0 to 2^n_features - 1): with probability $1-\epsilon$ select $a_t = \arg\max_a Q(s, a, \theta)$, and with probability $\epsilon$ randomly select a behavior $a_t$;

7): After the behavior is determined, find all states s_all that match the behavior in the data table, then randomly select one state from s_all as $s_{t+1}$ (if no matching state is found in s_all, re-determine the behavior); then calculate the reward $r_t$ from $s_t$ and $s_{t+1}$;

8): Put the experience $(s_t, a_t, r_t, s_{t+1})$ into the memory pool;

9): Randomly sample batch-size records and compute q_eval and q_next respectively;

10): Construct $y = r_t + \gamma \max_{a_{t+1}} Q(s_{t+1}, a_{t+1} \mid q)$ and assign it to q_target;

11): Back-propagate and improve the network q according to q_eval and q_target;

12): If the iteration count is an integer multiple of transmitter_cycle, update q' = q;

13): Current state = $s_{t+1}$;

14): Stop the training of the current episode when its maximum number of iterations T is reached, and return to the traffic state initialization;

15): end for;

16): end for.
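The core of the loop above can be sketched as follows. This is not the DRQN itself: to stay self-contained, the sketch replaces the LSTM-based network with a plain lookup table, so it illustrates only the ε-greedy selection of step 6, the memory pool of steps 8 and 9, the target of step 10 and the periodic copy q' = q of step 12. The class name, the default values, `sync_every` (standing in for the replacement period) and the use of the target copy q' when forming the target (the usual deep Q-learning convention) are assumptions.

```python
import random
from collections import defaultdict, deque


class TabularQAgent:
    """Lookup-table stand-in for the DRQN agent (no LSTM; illustration only)."""

    def __init__(self, n_actions, epsilon=0.1, gamma=0.9, lr=0.01,
                 memory_size=500, batch_size=15, sync_every=100, seed=0):
        self.n_actions = n_actions
        self.epsilon = epsilon        # probability of a random behavior
        self.gamma = gamma            # discount applied to the next-state value
        self.lr = lr
        self.batch_size = batch_size
        self.sync_every = sync_every  # assumed name for the replacement period
        self.rng = random.Random(seed)
        self.q = defaultdict(lambda: [0.0] * n_actions)         # network q
        self.q_target = defaultdict(lambda: [0.0] * n_actions)  # target q'
        self.memory = deque(maxlen=memory_size)                 # memory pool
        self.updates = 0

    def choose_action(self, state):
        """Step 6: random with probability epsilon, otherwise argmax_a Q(s, a)."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return max(range(self.n_actions), key=values.__getitem__)

    def store(self, s, a, r, s_next):
        """Step 8: put the experience (s_t, a_t, r_t, s_t+1) into the memory pool."""
        self.memory.append((s, a, r, s_next))

    def learn(self):
        """Steps 9 to 12: sample a batch, build q_target, update q, sync q'."""
        size = min(self.batch_size, len(self.memory))
        for s, a, r, s_next in self.rng.sample(list(self.memory), size):
            q_eval = self.q[s][a]
            q_next = max(self.q_target[s_next])
            q_target = r + self.gamma * q_next             # step 10
            self.q[s][a] += self.lr * (q_target - q_eval)  # tabular form of step 11
        self.updates += 1
        if self.updates % self.sync_every == 0:            # step 12: q' = q
            self.q_target = defaultdict(
                lambda: [0.0] * self.n_actions,
                {key: list(vals) for key, vals in self.q.items()})


n_features = 2
agent = TabularQAgent(n_actions=2 ** n_features)  # behaviors 0 .. 2^n_features - 1
agent.store("s0", agent.choose_action("s0"), 1.0, "s1")
agent.learn()
```

Keeping a separate, periodically refreshed copy q' means the targets do not shift on every update, which is the usual motivation for the target network in deep Q-learning.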

Compared with the prior art, the invention has the following beneficial effects: it provides traffic researchers and traffic managers with a traffic state prediction method based on deep Q-learning; it provides traffic planners and government departments with reliable prediction data, which helps them adjust routes and manage traffic in advance and reduces waste of manpower and material resources; it provides road users and drivers with real-time data, so that route plans can be adjusted in time and road congestion and wasted time are reduced; and it is not limited to traffic congestion prediction but can also be used to predict other traffic states such as traffic volume and distribution, and therefore has strong practicability.

Drawings

FIG. 1 shows the overall flow of prediction execution in an example of the present invention, including an input layer, a hidden layer and an output layer.

FIG. 2 is a flow chart of traffic state prediction based on deep learning in an example of the present invention.

FIG. 3 is a loss curve obtained by prediction with the proposed method in an example of the present invention.

FIG. 4 is a Q-value graph generated by prediction with the proposed method in an example of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.

The basic idea of the method is as follows. In view of the high-dimensional characteristics of traffic data, the data is preprocessed by normalization and Lagrange interpolation. A deep reinforcement learning network is built on the Python platform, and the LSTM algorithm is used to relieve the pressure of storing and analyzing the Q-value table. Through continuous training, learning, memory and reinforcement on the training set, the network becomes more sensitive in processing traffic data, and the future traffic delay index and traffic state are finally predicted. The sensitivity of the prediction result to the parameters is studied by adjusting the learning rate, greedy coefficient and delay index, and the optimal result is selected.

An application example of the invention is shown in FIG. 2. Using the Lagrange interpolation method, null and abnormal data are eliminated, and 1650 valid records are finally screened out of all the traffic data for training and prediction. Using the normalization method, the extent to which small-valued data of different attributes is dominated by large-valued data is reduced, so that all traffic data falls within the interval [0,1] and the final prediction accuracy is improved. The formulas are as follows:

Lagrange interpolation function formula

Normalization function formula: X.normalize = (X - X.mean)/X.std
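After interpolation and normalization, the description states that the data is converted to a list, formed into a matrix and reshaped into a 3-dimensional tensor for LSTM input. A minimal sketch of that reshaping is given below, assuming overlapping sliding windows and the layout [window, time step, feature]; the function name, the window length `time_steps` and the sample rows are hypothetical, since the text does not specify them.

```python
def to_lstm_input(rows, time_steps):
    """Group consecutive feature rows into overlapping windows.

    rows:       list of equal-length feature lists (one row per observation)
    time_steps: window length; an illustrative choice, not given in the text
    Returns a nested list shaped [n_windows][time_steps][n_features].
    """
    if time_steps < 1 or time_steps > len(rows):
        raise ValueError("time_steps must be between 1 and len(rows)")
    return [
        [list(row) for row in rows[start:start + time_steps]]
        for start in range(len(rows) - time_steps + 1)
    ]


# Made-up, already normalized rows: [speed, temperature].
matrix = [[0.1, 0.5], [0.2, 0.4], [0.3, 0.6], [0.4, 0.7]]
windows = to_lstm_input(matrix, time_steps=3)
print(len(windows), len(windows[0]), len(windows[0][0]))
```

The nested list can then be converted with `numpy.asarray` to obtain an array with the same three dimensions.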

As shown in FIG. 1 and FIG. 2, the embodiment of the present invention constructs a deep recurrent Q-learning network (DRQN) from multidimensional traffic data and implements the DRQN training method using the LSTM algorithm and Q-learning. The implementation steps are as follows:

The table is a sample of the analyzed data.

(1) According to the data analysis, visualization and platform-building requirements of the embodiment, import the NumPy, Pandas, Matplotlib and TensorFlow libraries to build the base library.

(2) As shown in FIG. 2, building the network requires a traffic environment module, a deep reinforcement learning module, a memory library module and a neural network module to support the data training process; the training and promoting network module, the loss function module and the test data module are used throughout the training and prediction process.

(3) Initialize the DRQN network structure and the target network, setting the parameter q' = q.

(4) Initialize the parameters. Set the learning rate to 0.01, the delay reward coefficient to 0.9, the greedy coefficient to 0.9 and the batch size to 15; the initial weights and threshold lie in the interval [0,1]; the activation function is Sigmoid, the training function is Adam, and the maximum number of iterations is T.

(5) Initial traffic state $S_t = S_0$.

(6-1) Select the behavior. With probability $1-\epsilon$ select the behavior with the largest Q value, and with probability $\epsilon$ select a behavior a at random.

(6-2) After the behavior is determined, find all states s_all that match the behavior in the training set data table and randomly select one of these states as $s_{t+1}$ (if no matching state is found in s_all, re-determine the behavior); then calculate the reward $r_t$ from $s_t$ and $s_{t+1}$.

(7) Randomly sample batch-size records and compute q_eval, q_next and q_target respectively.

(8) When the iteration count is an integer multiple of transmitter_cycle, update q' = q.

(9) Stop the current round of training when the maximum number of iterations is reached.
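Step (6-2) above, where the next state is drawn from the data table rather than from a simulator, can be sketched as follows. The text does not define how a state matches a behavior, so the predicate `matches` is a hypothetical parameter supplied by the caller; the function names, the retry limit `max_tries` and the sample table are likewise assumptions.

```python
import random


def sample_next_state(table, action, matches, rng=random):
    """Pick the next state at random among the rows of `table` that fit `action`.

    table:   list of candidate states (rows of the training-set data table)
    action:  the selected behavior
    matches: hypothetical predicate (state, action) -> bool
    Returns None when no row fits, meaning the behavior must be re-selected.
    """
    s_all = [state for state in table if matches(state, action)]
    if not s_all:
        return None
    return rng.choice(s_all)


def step(table, choose_action, matches, max_tries=100, rng=random):
    """Re-select the behavior until at least one matching state exists."""
    for _ in range(max_tries):
        action = choose_action()
        s_next = sample_next_state(table, action, matches, rng)
        if s_next is not None:
            return action, s_next
    raise RuntimeError("no behavior produced a matching state")


# Made-up table and matching rule, for illustration only.
table = [{"speed": 20}, {"speed": 40}]


def speed_rule(state, action):
    """Toy rule: behavior 1 means 'speed of at least 30', behavior 0 the opposite."""
    return (state["speed"] >= 30) == (action == 1)


print(sample_next_state(table, 1, speed_rule))
```

Bounding the number of retries avoids an endless loop if no behavior has a matching row in the table.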

The embodiment of the invention extracts the prediction data, runs the prediction, and obtains and visualizes the prediction indexes.

After the above steps are completed, obtaining and visualizing the prediction result consists of two parts: visualization of the loss function and visualization of the Q-value. The loss function curve reflects the value and the loss cost of the training process, and the Q-value graph reflects the benefit of the prediction result, as shown in FIG. 3 and FIG. 4.

The embodiment of the present invention analyzes the sensitivity of the prediction result to the set parameters.

The evaluation uses a three-factor, three-level orthogonal test. The three factors are learning_rate, reward_delay and e_greedy, denoted A, B and C, with corresponding levels 0.01/0.03/0.05, 0.9/0.8/0.7 and 0.9/0.8/0.7; the degree of decline of the loss curve is the criterion for judging the optimal solution.

Selection of parameters and optimal results for the first trial (first group)

Selection of parameters for the second trial and optimal results (first group)

Selection of parameters and optimal results for the third trial (second group)

It follows that the optimal prediction result, which is the final predicted value of this embodiment, is obtained when the learning rate is 0.03, the delay reward coefficient is 0.9 and the greedy coefficient is 0.9.

The above description is only an embodiment of the present invention, and not intended to limit the scope of the present invention, and all equivalent structures or equivalent processes performed by the present invention and the contents of the appended drawings, or directly or indirectly applied in other related arts, are encompassed in the scope of the present invention.

Claims

1. An urban road traffic state prediction system based on deep learning, characterized in that prediction is implemented using the following modules: a traffic environment module, a memory library module, a neural network module, a training and promoting network module, a visualization module and an interaction module;

in the formula, $y_i$ denotes the polynomial of degree $i-1$, and $x_i$ denotes the parameter corresponding to point $i$;

data is processed precisely by a normalization function, which prevents small-valued data from being dominated by large-valued data across different attributes during analysis, thereby ensuring the accuracy and reliability of the test result; the expression of the normalization function is as follows:

X.normalize = (X - X.mean)/X.std

in the formula, X.normalize denotes the normalized data, X.mean denotes the mean, and X.std denotes the standard deviation;

2. The traffic state prediction method according to claim 1, characterized in that the method further comprises: running the prediction to obtain a visualized loss function curve and a Q-value graph.

3. The traffic state prediction method according to claim 1, characterized in that the method further comprises analyzing the sensitivity of the prediction result to the set parameters, wherein:

the evaluation uses a three-factor, three-level orthogonal test, the three factors being learning_rate, reward_delay and e_greedy, denoted A, B and C, with corresponding levels 0.01/0.03/0.05, 0.9/0.8/0.7 and 0.9/0.8/0.7, and the degree of decline of the loss curve being the criterion for judging the optimal solution; A and B are fixed at levels A₁ and B₁ and combined with the three levels of C: A₁B₁C₁, A₁B₁C₂ and A₁B₁C₃; if A₁B₁C₃ is optimal, level C₃ is taken; A₁ and C₃ are then fixed and combined with the other two levels of factor B: A₁B₂C₃ and A₁B₃C₃; if A₁B₂C₃ is optimal after the test, levels B₂ and C₃ are taken and two further tests, A₂B₂C₃ and A₃B₂C₃, are run; if A₃B₂C₃ is optimal, it is the best level combination.

4. The traffic state prediction method according to claim 1, characterized in that: