Stacked project: Short-term traffic speed prediction
- Karthik Jamalpur
- Nov 28, 2021
- 7 min read
Updated: Dec 8, 2021
Tools used: Python, Anaconda, R, PuTTY, WinSCP, Windows, Excel, MS PowerPoint
IDEs: Jupyter Notebook, PyCharm, Google Colab
Major packages and methods: scikit-learn, pandas, NumPy, Keras, Matplotlib, K-means, ARIMA
Roles: Research (Machine learning, Deep learning)
Short-term traffic speed prediction is a long-standing key topic in intelligent transport systems and is essential for traffic control and guidance. The project is demanding because the data is collected in large volumes from many different sources, and selecting an accurate, optimal prediction model is challenging. Sampling traffic conditions from real-time traffic information, i.e. persistent floating car data (FCD), strengthens short-term predictions. Among machine learning models, Bayesian and neural networks capture the correlation in time and space and can be applied automatically across a road network. A seasonal autoregressive integrated moving average model is proposed for traffic speed prediction and compared with the machine learning models; such time series models can also provide a prior estimate of speed for the formulation of the Bayesian network and the neural network (LSTM).
In this project, validation errors (MAE, RMSE, MAPE, etc.) are used to compare the results of the time series and machine learning models. A large floating car dataset from the Rome ring road serves as the test set (original values) against which the predicted outputs are compared. All models are then evaluated, and the optimal model is allocated to the road network within a quality framework. Additionally, the data is clustered by different variables to identify distinct traffic states based on similarity and thereby enhance the traffic system.
I used three models in the project:
ARIMA model (Autoregressive Integrated Moving Average)
Bayesian time series model
Recurrent neural network (long short-term memory network)
Outline of the project:
• Aim of the project
• Traffic prediction
• Floating car data
• Time series model
• Bayesian model
• Neural network model
• Clustering (K-means method)
• Conclusion
Traffic speed prediction
Traffic speed prediction is the task of forecasting real-time traffic information based on frequent and thorough floating car data, such as average traffic speed and traffic counts.
Short-term traffic prediction: predicts traffic conditions 10 or 15 minutes ahead.
Long-term traffic prediction: predicts the state of traffic for the next day or the next week.
Both types can be used to forecast travel times, anticipate traffic congestion, and find optimal routes.
Floating car data
Vehicles equipped with GPS devices collect information about their location, their route, and their travel speed throughout the road network [3]. This method of data collection is referred to as floating car data (FCD) and can be used to derive historical speed data or even for real-time applications.
Floating car data based on GPS trajectories opens up many possibilities in traffic modeling and analysis and provides valuable information to traffic planners and decision-makers.
Time series
A time series is a series of data points listed in order of time. Most commonly, a time series is taken at successive equally spaced points in time.
It can be useful for seeing how a dataset's variables, such as transportation variables, change over time. Examples of time series include traffic speed prediction (how mean speed, counts, and standard deviation change over time) and business sales.
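As a hedged illustration (the file name and the timestamp and speed column names below are assumptions, not the project's actual schema), raw FCD speed observations could be aggregated into the 5-minute mean-speed series used throughout this project:

```python
import pandas as pd

# Hypothetical FCD table: one row per probe-vehicle observation.
fcd = pd.read_csv("fcd_observations.csv", parse_dates=["timestamp"])

# Aggregate individual observations into a regular 5-minute mean-speed series.
speed_series = (
    fcd.set_index("timestamp")["speed"]
       .resample("5min")
       .mean()
)
print(speed_series.head())
```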
ARIMA model
Autoregressive integrated moving average (ARIMA) models are a typical class of models for forecasting a time series that can be made stationary by differencing, if necessary in conjunction with nonlinear transformations such as logging or deflating (a minimal fitting sketch follows the order definitions below).
A non-seasonal ARIMA model is classified as an "ARIMA (p, d, q)" model, where:
p is the number of autoregressive terms,
d is the number of nonseasonal differences needed for stationarity, and
q is the number of lagged forecast errors in the prediction equation.
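As a minimal sketch (the order (2, 1, 2) and the hourly forecast horizon are illustrative assumptions, not the orders selected in the project), a non-seasonal ARIMA(p, d, q) can be fitted to the 5-minute mean-speed series with statsmodels:

```python
from statsmodels.tsa.arima.model import ARIMA

# speed_series: a pandas Series of 5-minute mean speeds (as built earlier).
# The order (p, d, q) = (2, 1, 2) is only an example, not the project's choice.
model = ARIMA(speed_series, order=(2, 1, 2))
result = model.fit()

# Forecast the next hour (12 steps of 5 minutes).
print(result.forecast(steps=12))
```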

Bayesian networks
A Bayesian network is a probabilistic graphical model that represents a set of variables and their conditional dependencies via a directed acyclic graph. For time series analysis, state-space models are used here: the observed data is not modeled directly; instead, unobserved (latent) values are inferred from the given data together with a noise term.
Bayesian structural time series
• It is similar to a Bayesian network, but it is fed the past traffic data of the time series and predicts the probability of future speed values. Several structural forms can be used for BSTS (see the sketch after this list):
1. local level,
2. local linear trend,
3. linear trend with seasonality.
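The project's BSTS models may well have been fitted with a dedicated Bayesian package (for example R's bsts); as a hedged Python stand-in, statsmodels' structural time series model (UnobservedComponents) supports the same local level, local linear trend, and seasonal forms, although it is estimated by maximum likelihood rather than full Bayesian sampling. The daily seasonal period of 288 five-minute slots is an assumption.

```python
from statsmodels.tsa.statespace.structural import UnobservedComponents

# speed_series: pandas Series of 5-minute mean speeds (as built earlier).
# Local linear trend plus a daily seasonal cycle (288 five-minute slots);
# the exact structural form used in the project is not reproduced here.
model = UnobservedComponents(
    speed_series,
    level="local linear trend",
    seasonal=288,
)
result = model.fit(disp=False)

# Forecast the next hour (12 steps of 5 minutes).
print(result.get_forecast(steps=12).predicted_mean)
```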


Neural networks
A neural network is a subfield of machine learning and the foundation of deep learning; it is inspired by the structure of the human brain. Input data is fed to the network, which trains itself to recognize patterns in the data and then predicts the output for a new set of similar data.
A neural network can be thought of as the functional unit of deep learning, mimicking the behavior of the human brain to solve complex data-driven problems.
Recurrent neural networks
RNN is a class of neural networks where connections between nodes form a directed graph along a temporal sequence.
The architecture of a long short-term memory (LSTM) network
An LSTM handles both long-term memory (LTM) and short-term memory (STM); to keep the calculations simple and effective, it uses the concept of gates [27].
Forget gate: the LTM goes to the forget gate, which discards information that is not useful, i.e. it takes the event and the previous short-term memory as input and keeps only the information relevant for prediction.
Learn gate: the event (current input) and the STM are combined so that necessary information recently learned from the STM can be applied to the current input.
Remember gate: the LTM information that has not been forgotten is combined with the STM and the event; the result acts as the updated LTM.
Use gate: this gate also uses the LTM, STM, and event to predict the output for the current event, which acts as the updated STM.
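As a minimal, hedged sketch (the layer size of 4 units and the look-back of 3 steps are illustrative assumptions, not the project's settings), a single-layer LSTM for speed regression can be set up in Keras; the gate logic described above is handled internally by the LSTM cell:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

look_back = 3    # number of prior 5-minute steps used as input (assumed)
n_features = 1   # mean speed only

# One LSTM layer (its forget/learn/remember/use gate computations are built in)
# followed by a dense layer that outputs the predicted speed.
model = Sequential([
    LSTM(4, input_shape=(look_back, n_features)),
    Dense(1),
])
model.compile(loss="mean_squared_error", optimizer="adam")
model.summary()
```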

LSTM with memory between batches:
The LSTM's memory can recognize patterns across long sequences. To take advantage of this internal state, the network state is reset explicitly between epochs when fitting the model, rather than after every batch. This means the LSTM can build up state over a long sequence, and even maintain that state when making predictions if needed, and it requires that the training data not be shuffled (see the sketch below).
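A hedged sketch of this setup in Keras (batch size, unit count, look-back, and the dummy data are assumptions): the LSTM layer is made stateful, shuffling is disabled, and the state is reset manually at the end of each epoch.

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

look_back, batch_size, epochs = 3, 1, 5   # illustrative values only

# X: [samples, time steps, features], y: next-step speed (dummy data here).
X = np.random.rand(100, look_back, 1)
y = np.random.rand(100, 1)

model = Sequential([
    # stateful=True keeps the internal state across batches instead of
    # resetting it after every batch, so state builds up over long sequences.
    LSTM(4, batch_input_shape=(batch_size, look_back, 1), stateful=True),
    Dense(1),
])
model.compile(loss="mean_squared_error", optimizer="adam")

for _ in range(epochs):
    # shuffle=False preserves the temporal order the state depends on.
    model.fit(X, y, epochs=1, batch_size=batch_size, shuffle=False, verbose=0)
    model.reset_states()   # reset manually at the end of each epoch
```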
LSTM for regression with time steps:
In this method, data preparation for the LSTM network includes time steps. The same sequence problem may have a varied number of time steps per sample; in this project, say a specific speed value (for example the maximum or minimum speed in the dataset) is reached in a specific time period. Each such value would be a sample, the observations leading up to it would be the time steps, and the variables observed would be the features. Instead of using past observations as separate input features, we use prior time steps of one feature to predict the next time step, which frames the problem more accurately (see the sketch below).
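A hedged sketch of this framing (the window helper, look-back of 3, and dummy speed values are assumptions): windows of prior speeds are placed on the time-step axis, giving inputs of shape [samples, time steps, features].

```python
import numpy as np

def make_windows(series, look_back=3):
    """Split a 1-D array into look_back-length inputs and the next value as target."""
    X, y = [], []
    for i in range(len(series) - look_back):
        X.append(series[i:i + look_back])
        y.append(series[i + look_back])
    return np.array(X), np.array(y)

speeds = np.array([62.0, 64.5, 63.2, 60.1, 58.7, 59.9, 61.3])  # dummy values
X, y = make_windows(speeds, look_back=3)

# Time-steps framing: each prior value is one time step of a single feature,
# so the input shape is [samples, time steps, features] = (n, 3, 1).
X_time_steps = X.reshape((X.shape[0], X.shape[1], 1))
print(X_time_steps.shape, y.shape)
```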
LSTM for regression using the window method:
A window comprises multiple recent time steps used as input at each step, and the window size is a parameter that can be tuned for each problem.
For instance, given the current time (t), to predict the value at the next time in the sequence (t+1), we can use the current time (t) as well as the two prior times (the look-back) as input variables. When phrased as a regression problem, the input variables are t-2, t-1, and t, and the output variable is t+1 (as sketched below).
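A hedged sketch of the window method (dummy speed values and a look-back of 3 are assumptions): the same windows are built, but the look-back values are treated as separate input features of a single time step.

```python
import numpy as np

speeds = np.array([62.0, 64.5, 63.2, 60.1, 58.7, 59.9, 61.3])  # dummy values
look_back = 3

# Inputs are t-2, t-1, t and the target is t+1.
X = np.array([speeds[i:i + look_back] for i in range(len(speeds) - look_back)])
y = speeds[look_back:]

# Window framing: the look-back values act as separate features of a single
# time step, so the input shape is [samples, time steps, features] = (n, 1, 3).
X_window = X.reshape((X.shape[0], 1, look_back))
print(X_window.shape, y.shape)
```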
Stacked LSTM with memory between batches:
LSTMs can be successfully trained when stacked into deep network architectures. In Keras, they can be stacked in the same way as other layer types. The one required addition to the configuration is that each LSTM layer preceding another LSTM layer must return its full sequence (as sketched below).
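A hedged sketch of the stacking in Keras (unit counts, batch size, and look-back are assumptions): the first LSTM layer sets return_sequences=True so the second LSTM layer receives the full sequence, and both layers stay stateful for memory between batches.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

look_back, batch_size = 3, 1   # illustrative values only

model = Sequential([
    # return_sequences=True passes the whole sequence to the next LSTM layer.
    LSTM(4, batch_input_shape=(batch_size, look_back, 1),
         stateful=True, return_sequences=True),
    LSTM(4, stateful=True),
    Dense(1),
])
model.compile(loss="mean_squared_error", optimizer="adam")
model.summary()
```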

Validation errors
Based on the above values, the stacked LSTM with memory between batches is the optimal model. Normalization and standardization were not helpful here; the original data itself gave the best results when fitting the LSTM models.
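The table of values referred to above is not reproduced here; as a hedged sketch (the speed values are dummy numbers), the error indicators themselves can be computed from the original and predicted speeds as follows:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# y_true: original observed speeds, y_pred: model predictions (dummy values).
y_true = np.array([62.0, 64.5, 63.2, 60.1])
y_pred = np.array([60.8, 65.1, 61.9, 59.4])

mae = mean_absolute_error(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
mape = np.mean(np.abs((y_true - y_pred) / y_true)) * 100

print(f"MAE={mae:.2f}  RMSE={rmse:.2f}  MAPE={mape:.2f}%")
```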
Is ARIMA, a Bayesian network, or a neural network best for time series forecasting?
For time series forecasting there is no single best method; it all depends on the data and on the kind of issues that arise. But it is almost always true that bringing in some prior information helps. The FCD we had is not immune to unpredictable events: say we are predicting values for 2020, but in 2020 the corona outbreak caused a drastic drop in counts on the highways; a forecast that does not take this into account will be fairly useless. Carrying out such a project also involves many assumptions, such as smoothness, seasonality, holidays, and other factors, even before looking at the data.
From the whole pre- and post-prediction data analysis, ARIMA has better results than Bayesian time series forecasting.
Below is the predicted graph

Conclusion
Transport challenges keep growing in the modern world, and research on traffic speed prediction has spiked in recent years. This project demonstrates short-term traffic speed prediction on the Rome ring road, in the Lazio region, using persistent vehicle probe data, or floating car data. The data consists of average (mean) speeds composed from individual points collected by a massive number of private, GPS-enabled vehicles. Different machine learning (ML) models, i.e. a Bayesian network and a neural network (NN), are used to predict speed from the observed mean speed. A time series model, the autoregressive integrated moving average (ARIMA) model, is compared with the ML models; it forecasts future traffic values from observed time-space correlations alone. Additionally, a clustering analysis is used in this project to enhance traffic routing and guidance.
A massive dataset, i.e. the floating car data, is collected from a large number of sources and used to fit and validate the different ML models. For validation, error indicators are introduced to identify the optimal model. The FCD contains, per link and time period, the number of vehicles (counts), the mean observed speed, and the standard deviation of the individual observed speeds; the data is divided into 5-minute intervals. The validation process compares the predicted values with the original values. The Bayesian network has an MAE (mean absolute error) of 36.01 and an RMSE (root mean square error) of 42.74, while the neural network (NN) has an MAE of 32.65 and an RMSE of 40.10, so the NN performed better than the Bayesian network. On the other hand, ARIMA showed impressive results compared to the ML models, with an MAE of 11.65 and an RMSE of 12.45. Since ARIMA is a regression model, it reflected traffic congestion that had already taken place.