Impact of Meteorological Factors on Asphalt Pavement Surface Temperatures: A Machine Learning Approach
Abstract:
Recent observations of global warming phenomena have necessitated the evaluation of the service performance of asphalt pavements, which is substantially influenced by surface temperature levels. This study employed twelve distinct machine learning algorithms—K-neighbors, linear regression, multi-layer perceptron, lasso, ridge, support vector regression, decision tree, AdaBoost, random forest, extra tree, gradient boosting, and XGBoost—to predict the surface temperature of asphalt pavements. Data were sourced from the Road Weather Information System of Iowa State University, comprising 12,581 data points including air temperature, dew point temperature, wind speed, wind direction, wind gust, and pavement sensor temperature. These data were segmented into training (80%) and testing (20%) datasets. Analysis of model outcomes indicated that the Extra Tree algorithm was superior, exhibiting the highest R$^2$ value of 0.95, whereas the Support Vector Regression algorithm recorded the lowest, with an R$^2$ value of 0.70. Furthermore, Shapley Additive Explanations were utilized to interpret model results, providing insights into the contributions of various predictors to model outcomes. The findings affirm that machine learning algorithms are effective for predicting asphalt pavement surface temperatures, thereby supporting pavement management systems in adapting to changing environmental conditions.
1. Introduction
Asphalt pavement layers are affected by climatic factors, including temperature, humidity, wind, and freeze-thaw cycles [1]. These climatic factors also affect asphalt pavements during construction and during the maintenance and repair processes that follow. Temperature, a climatic effect to which pavements are constantly exposed and which is intensified by climate change, directly affects the service level, bearing capacity, and overall pavement performance, especially in seasons with high temperatures [2], [3].
Alkaissi [4] studied the effect of temperature on the rutting performance of flexible pavements. The combined effect of high temperatures and traffic load was evaluated with thermal analysis in Abaqus, using a 3D finite element model. The study revealed that temperature and traffic load significantly affect rutting formation, leading to severe damage.
Artificial intelligence algorithms are commonly used in studies that relate predicted future temperature values to pavement performance. Various approaches have been employed, such as artificial intelligence and BELLS modeling [5], regression-based prediction models [6], [7], machine learning [2], [8], [9], and finite element methods [10], [11]. Xu et al. [12] employed the BP neural network method to model temperature predictions for pavements in cold regions and developed early warning systems based on these predictions. Molavi Nojumi et al. [13] utilized a machine learning method for temperature predictions, explicitly determining the minimum, maximum, and average temperature values in different layers of asphalt pavements. Swarna and Hossain [14] developed temperature prediction models for future climatic conditions. They assessed the selection of the asphalt binder performance grade (PG) and developed prediction models based on the average seven-day maximum and minimum pavement temperature values. Qiu et al. [15] investigated the relationship between meteorological factors and asphalt pavement temperature by developing prediction models for the surface temperature of asphalt pavements. Among the algorithms compared, the gradient-boosting decision tree gave the best result.
Tabrizi et al. [16] used Long Short-Term Memory (LSTM), Convolutional-LSTM (ConvLSTM), Sequence-to-Sequence (Seq2Seq), and wavelet neural networks to predict pavement surface temperature. Liu et al. [17] developed models with gradient-boosting to assemble a ReLU (rectified linear unit)/softplus Extreme Learning Machine (ELM) for pavement surface temperature prediction.
The studies cited above show that artificial intelligence methods can successfully predict road surface temperatures. However, studies that predict road surface temperature with machine learning methods remain limited in the literature.
In this study, 12 different machine learning algorithms (K-neighbors, linear regression, multi-layer perceptron, lasso, ridge, support vector regression, decision tree, AdaBoost, random forest, extra tree, gradient boosting, and XGBoost) were used to estimate the pavement surface temperature. The dataset used in the study consisted of 12,581 observations obtained from the Road Weather Information System (RWIS) of Iowa State University. The dataset includes air temperature (tmpf), dew point temperature (dwpf), wind speed (sknt), wind direction (drct), wind gust (gust), and pavement sensor temperature (tfs0). RWIS stations have high installation, maintenance, and operation costs, and sensor failures at these stations may cause data interruptions. In addition, in regions with similar climates where RWIS is unavailable, the pavement surface temperature is unknown, making it difficult to perform adequate maintenance and repair when the road surface deteriorates. The main objective of this study is to evaluate the accuracy of machine learning models that use climatic parameters to predict pavement surface temperature. The proposed machine learning models can be used for pavement surface temperature prediction in regions with similar climates where RWIS is unavailable.
2. Methodology
Asphalt pavements are constantly exposed to atmospheric phenomena such as precipitation, humidity, wind, air temperature, freeze-thaw cycles, and solar radiation. As these climatic factors change, so does the temperature of the pavement. The pavement surface produces reflected radiation, mainly in response to solar and atmospheric radiation. This radiation effect increases heat energy on the road surface, especially in hot climate regions, leading to high surface temperatures [18], [19], [20]. Because the binder is susceptible to temperature, the temperature of an asphalt pavement affects its tensile strength, strength properties, and aging process [21], [22]. Figure 1 shows a one-dimensional schematization of the meteorological factors to which asphalt pavements are exposed.
The meteorological variables tmpf, dwpf, sknt, drct, and gust are used to predict the pavement surface temperature (tfs0). The dataset is obtained from the Road Weather Information System (RWIS) of the Iowa Department of Transportation in the United States [23]. Statistical descriptions of the dataset are provided in Table 1. Figure 2 shows the box and whisker plot of the dataset.
Statistical Description | tmpf | dwpf | sknt | drct | gust | tfs0 |
---|---|---|---|---|---|---|
count | 12581 | 12581 | 12581 | 12581 | 12581 | 12581 |
mean | 26.9 | 21.4 | 9.0 | 209.3 | 12.4 | 32.2 |
std | 11.2 | 11.2 | 4.8 | 100.6 | 6.4 | 12.8 |
min | -22.5 | -27.89 | 0 | 0 | 0.9 | -11.7 |
25% | 20.8 | 16 | 5.2 | 130 | 7.8 | 25.3 |
50% | 28.8 | 23.9 | 8.7 | 210 | 12.2 | 32.2 |
75% | 34 | 28.6 | 12.2 | 300 | 16.5 | 38.8 |
max | 69.6 | 60.3 | 27.8 | 360 | 36.5 | 79.7 |
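Summary statistics of the kind shown in Table 1 can be produced with a single pandas call. The sketch below is illustrative only: the data are synthetic stand-ins generated with assumed distributions (the RWIS records themselves are not reproduced here), and only the column names are taken from the text.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the RWIS records (assumption: the real data are
# not available here, so the distributions below are purely illustrative).
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "tmpf": rng.normal(27, 11, n),    # air temperature
    "dwpf": rng.normal(21, 11, n),    # dew point temperature
    "sknt": rng.uniform(0, 28, n),    # wind speed
    "drct": rng.uniform(0, 360, n),   # wind direction
    "gust": rng.uniform(1, 36, n),    # wind gust
})
# Hypothetical relation, used only to give tfs0 plausible values
df["tfs0"] = df["tmpf"] + 5 + rng.normal(0, 4, n)

# count, mean, std, min, quartiles, and max for every column
summary = df.describe().round(1)
print(summary)
```

The rows of `summary` follow the same order as the rows of Table 1.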
Machine learning, a field within computer science and a branch of artificial intelligence, has emerged as a widely utilized technology, science, and business technique. Its rapid popularity can be attributed to its successful performance in various domains. By establishing connections between input and output data, machine learning methods construct adaptable and flexible models, enabling the prediction of new outputs by uncovering concealed dependencies in a dataset. The three primary approaches to machine learning are supervised, unsupervised, and semi-supervised.
Furthermore, reinforcement learning optimizes a reward or penalty signal to achieve specific objectives. In supervised machine learning, models learn from the labeled examples provided. They identify relationships among the independent variables and generalize them to forecast outputs for unseen data. When the output variable takes continuous real values, as in this study, these methods are categorized as regression models [24], [25].
In this study, 12 different machine-learning algorithms were used for predicting pavement surface temperature. The machine learning models were developed in Python. Information about the algorithms used is presented in Table 2.
Algorithms | Module |
---|---|
K-Nearest Neighbors | sklearn.neighbors |
Linear regression | sklearn.linear_model |
MLP | sklearn.neural_network |
Lasso | sklearn.linear_model |
Ridge | sklearn.linear_model |
Support Vector Regression | sklearn.svm |
Decision tree | sklearn.tree |
AdaBoost | sklearn.ensemble |
Random Forest | sklearn.ensemble |
Extra Trees Regression | sklearn.ensemble |
Gradient Boosting | sklearn.ensemble |
eXtreme Gradient Boost | xgboost |
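As a rough illustration of how the algorithms in Table 2 map onto Python classes, the following sketch collects one regressor per algorithm in a dictionary. The hyperparameters used in the study are not reported in this section, so library defaults (plus a fixed `random_state` and a larger `max_iter` for the MLP) are assumed here. The `xgboost` import is guarded because that package is distributed separately from scikit-learn.

```python
from sklearn.ensemble import (AdaBoostRegressor, ExtraTreesRegressor,
                              GradientBoostingRegressor, RandomForestRegressor)
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

# Assumption: default hyperparameters; the settings used in the study
# are not stated, so these are placeholders rather than a reproduction.
models = {
    "KNN": KNeighborsRegressor(),
    "Linear regression": LinearRegression(),
    "MLP": MLPRegressor(max_iter=1000, random_state=42),
    "Lasso": Lasso(),
    "Ridge": Ridge(),
    "SVR": SVR(),
    "Decision tree": DecisionTreeRegressor(random_state=42),
    "AdaBoost": AdaBoostRegressor(random_state=42),
    "Random forest": RandomForestRegressor(random_state=42),
    "Extra tree": ExtraTreesRegressor(random_state=42),
    "Gradient boosting": GradientBoostingRegressor(random_state=42),
}

try:
    # xgboost is a separate package; skip it if it is not installed
    from xgboost import XGBRegressor
    models["XGBoost"] = XGBRegressor(random_state=42)
except ImportError:
    pass

print(f"{len(models)} regressors configured")
```

Because every entry exposes the same `fit` and `predict` methods, all models can be trained and scored in one loop over `models.items()`.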
3. Results and Discussion
Machine learning algorithms are used to predict pavement surface temperature. For this purpose, weather data from the HKYI4 RWIS station located in Iowa are used, including tmpf, dwpf, sknt, drct, gust, and tfs0. The relationship between the input and output data is examined first, using a correlation matrix (Figure 3) and a pairwise plot (Figure 4).
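A correlation matrix of this kind can be computed directly from a data frame. The sketch below assumes Pearson correlation (the pandas default), since the coefficient type is not stated in the text. It uses synthetic data with hypothetical relations between the variables, so the printed values will not match those of the study.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (assumption); the relations between columns are
# invented for illustration and do not come from the RWIS dataset.
rng = np.random.default_rng(1)
n = 1000
tmpf = rng.normal(27, 11, n)
df = pd.DataFrame({
    "tmpf": tmpf,
    "dwpf": tmpf - rng.uniform(0, 10, n),
    "sknt": rng.uniform(0, 28, n),
    "drct": rng.uniform(0, 360, n),
})
df["gust"] = df["sknt"] + rng.uniform(0, 9, n)
df["tfs0"] = tmpf + 5 + rng.normal(0, 5, n)

# Pairwise Pearson correlation between all columns
corr = df.corr()

# Correlation of each predictor with the target, strongest first
print(corr["tfs0"].drop("tfs0").sort_values(ascending=False).round(2))
```

Plotting libraries such as seaborn provide `heatmap` and `pairplot` functions that can render the matrix and the pairwise scatter plots. Which tool was used for Figures 3 and 4 is not stated.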
Figure 3 and Figure 4 show that the pavement surface temperature correlates positively with air temperature, dew point temperature, wind speed, and wind gust, and negatively with wind direction. The variables most strongly correlated with the pavement surface temperature are the air and dew point temperatures, with correlations of 0.88 and 0.70, respectively. Wind speed, direction, and gust correlate poorly with the pavement surface temperature. After the correlations between the input and output data are determined, machine learning models (KNN, linear regression, MLP, lasso, ridge, SVR, decision tree, AdaBoost, random forest, extra tree, gradient boosting, and XGBoost) are developed. The dataset is divided into 80% training and 20% testing sets for model development. R$^2$ (coefficient of determination), root mean square error (RMSE), and graphical methods are used to evaluate the model results. The results of all models are provided in Table 3.
Algorithms | Train R$^2$ | Train RMSE | Test R$^2$ | Test RMSE |
---|---|---|---|---|
KNN | 0.94 | 3.17 | 0.88 | 4.48 |
Linear regression | 0.83 | 5.24 | 0.83 | 5.36 |
MLP | 0.84 | 5.07 | 0.84 | 5.20 |
Lasso | 0.83 | 5.28 | 0.83 | 5.41 |
Ridge | 0.83 | 5.24 | 0.83 | 5.36 |
SVR | 0.71 | 6.91 | 0.70 | 7.20 |
Decision tree | 1.00 | 0.02 | 0.89 | 4.40 |
AdaBoost | 0.77 | 6.10 | 0.77 | 6.20 |
Random forest | 0.99 | 1.10 | 0.95 | 3.05 |
Extra tree | 1.00 | 0.02 | 0.95 | 2.91 |
Gradient boosting | 0.88 | 4.50 | 0.86 | 4.80 |
XGBoost | 0.93 | 2.29 | 0.93 | 3.38 |
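The evaluation procedure behind Table 3 (an 80/20 split followed by R$^2$ and RMSE on both subsets) can be sketched as follows. The data are synthetic, and the model settings, random seed, and the helper name `evaluate` are assumptions made for illustration, so the scores will differ from those reported.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the five predictors and the target (assumption)
rng = np.random.default_rng(2)
n = 1000
tmpf = rng.normal(27, 11, n)
X = np.column_stack([
    tmpf,                              # tmpf
    0.6 * tmpf + rng.normal(0, 8, n),  # dwpf
    rng.uniform(0, 28, n),             # sknt
    rng.uniform(0, 360, n),            # drct
    rng.uniform(1, 36, n),             # gust
])
y = tmpf + 5 + rng.normal(0, 4, n)     # tfs0 (hypothetical relation)

# 80% training, 20% testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = ExtraTreesRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)


def evaluate(model, X, y):
    """Return (R^2, RMSE) of the model's predictions on (X, y)."""
    pred = model.predict(X)
    rmse = float(np.sqrt(mean_squared_error(y, pred)))
    return r2_score(y, pred), rmse


train_r2, train_rmse = evaluate(model, X_train, y_train)
test_r2, test_rmse = evaluate(model, X_test, y_test)
print(f"train: R2={train_r2:.2f}, RMSE={train_rmse:.2f}")
print(f"test:  R2={test_r2:.2f}, RMSE={test_rmse:.2f}")
```

With scikit-learn's default settings, decision trees and extra trees grow until the leaves are pure, so they reproduce the training targets almost exactly. This may explain the training R$^2$ of 1.00 for these two models in Table 3, if near-default settings were used. The test-set scores are therefore the more informative measure of accuracy.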
When examining Table 3, it is observed that the SVR model has the lowest R$^2$ value for the test set, with an R$^2$ value of 0.70. The machine learning algorithms that perform the best with a high R$^2$ value of 0.95 are random forest and extra tree. When analyzing the RMSE values for the test set, it is observed that the extra tree algorithm has the lowest RMSE value. The scatter plot for the model developed using the extra tree algorithm for the train and test set is provided in Figure 5. In addition, Figure 6 shows the time series graph of the test set of the model developed with the extra tree algorithm.
Subgraph (a) of Figure 5 shows that the observed and predicted values are distributed around the 1:1 line. Subgraph (b) of Figure 5 shows that the observed and predicted values are also distributed around the 1:1 line, although they are more scattered than in subgraph (a). Figure 6 shows that the observed and predicted values overlap. A feature importance graph is used to examine the effects of the input parameters on the extra tree model. Feature importance is given in Figure 7.
In Figure 7, each bar quantitatively represents the importance of a feature in the model. The input parameter tmpf is the most important for the extra tree model, with an importance level of approximately 0.7. The dwpf parameter has less influence on the extra tree model than tmpf but is still an important feature, with an importance level of approximately 0.2. The drct, gust, and sknt parameters are less important in the model. The importance of the drct and gust parameters is approximately 0.1, while that of the sknt parameter is 0.05. In addition, SHAP values are calculated to determine the magnitude and direction (positive/negative) of the effect of each input parameter on the predictions of the extra tree model. The SHAP plot is given in Figure 8.
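Feature importances like those discussed for Figure 7 are exposed by scikit-learn tree ensembles through the `feature_importances_` attribute (impurity-based importances, normalized to sum to 1). The sketch below assumes this is the kind of importance plotted in the figure, which the text does not confirm, and again uses synthetic data with a hypothetical relation in which tmpf dominates.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesRegressor

# Synthetic stand-in data (assumption): tfs0 depends mostly on tmpf,
# slightly on dwpf, and not at all on the wind variables.
rng = np.random.default_rng(3)
n = 1000
tmpf = rng.normal(27, 11, n)
X = pd.DataFrame({
    "tmpf": tmpf,
    "dwpf": 0.6 * tmpf + rng.normal(0, 8, n),
    "sknt": rng.uniform(0, 28, n),
    "drct": rng.uniform(0, 360, n),
    "gust": rng.uniform(1, 36, n),
})
y = X["tmpf"] + 0.2 * X["dwpf"] + rng.normal(0, 3, n)

model = ExtraTreesRegressor(n_estimators=100, random_state=42).fit(X, y)

# Impurity-based importances, one value per input column
importances = pd.Series(model.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False).round(3))
```

Impurity-based importances indicate how much each feature contributes to the splits of the trees, but not whether it pushes a prediction up or down. That directional information is what SHAP values add.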
In Figure 8, each point represents the impact of a feature on the model output for a single observation. The input parameter tmpf usually has a positive SHAP value, which indicates that it tends to increase the model prediction. The red coloring of the dots indicates that high values of tmpf significantly increase the model output. The SHAP values for the dwpf parameter are both positive and negative and are smaller in magnitude, indicating that this parameter has a smaller impact on the model output. The SHAP values for drct, gust, and sknt are closer to the zero axis. This shows that these parameters have less impact on the model than tmpf and dwpf. These three parameters also have both positive and negative effects on the extra tree model.
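To make the meaning of a SHAP value concrete, the sketch below computes exact Shapley values for a single observation by brute force over all feature subsets. This is a swapped-in illustration, not the procedure used for Figure 8, which was presumably produced with a dedicated SHAP implementation. The function name `shapley_values`, the synthetic data, and the choice of background sample are all assumptions. With five features there are only 32 subsets, so exhaustive enumeration is cheap.

```python
from itertools import combinations
from math import factorial

import numpy as np
from sklearn.ensemble import ExtraTreesRegressor


def shapley_values(predict, x, background):
    """Exact Shapley values of each feature for one observation x.

    The value of a feature subset S is the mean prediction when the
    features in S are fixed to their values in x and the remaining
    features are taken from the background rows.
    """
    d = len(x)
    cache = {}

    def value(subset):
        key = frozenset(subset)
        if key not in cache:
            data = background.copy()
            cols = sorted(key)
            if cols:
                data[:, cols] = x[cols]
            cache[key] = predict(data).mean()
        return cache[key]

    phi = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            # Shapley weight for subsets of this size
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi


# Synthetic stand-in data (assumption), columns: tmpf, dwpf, sknt, drct, gust
rng = np.random.default_rng(4)
n = 500
tmpf = rng.normal(27, 11, n)
X = np.column_stack([
    tmpf,
    0.6 * tmpf + rng.normal(0, 8, n),
    rng.uniform(0, 28, n),
    rng.uniform(0, 360, n),
    rng.uniform(1, 36, n),
])
y = tmpf + 0.2 * X[:, 1] + rng.normal(0, 3, n)

model = ExtraTreesRegressor(n_estimators=50, random_state=42).fit(X, y)

background = X[:50]          # reference sample for "absent" features
x = X[np.argmax(X[:, 0])]    # the observation with the highest tmpf
phi = shapley_values(model.predict, x, background)

for name, v in zip(["tmpf", "dwpf", "sknt", "drct", "gust"], phi):
    print(f"{name}: {v:+.2f}")
```

The values sum to the difference between the prediction for `x` and the mean prediction over the background sample. This additive property is what allows each point in a SHAP plot to be read as a signed contribution to the model output.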
4. Conclusions
The precise prediction of pavement surface temperature is imperative for effective planning and management throughout the lifespan of road infrastructure. In this investigation, twelve machine learning algorithms were employed to forecast pavement surface temperature, utilizing predictors such as air temperature, dew point temperature, wind speed, wind direction, and wind gust. Prior to model development, an examination of the relationships between input variables and the pavement temperature was conducted.
The findings from the analysis are summarized as follows:
• A positive correlation was observed between the air and dew point temperatures and the pavement surface temperature.
• A slight negative correlation was noted for wind direction in relation to pavement surface temperature.
• Although wind speed and gust displayed a positive correlation with pavement surface temperature, the strength of these relationships was found to be minimal.
• Evaluations of the developed models revealed that each algorithm effectively predicted the pavement surface temperature.
• Notably, the Random Forest and Extra Tree algorithms demonstrated superior performance, achieving an R$^2$ value of 0.95.
• Further analysis of the RMSE values indicated that the Extra Tree model was the most accurate in predicting pavement surface temperature.
• It is concluded that machine learning algorithms are proficient in forecasting pavement surface temperature.
Conceptualization, Serdal Terzi and Ekinhan Eriskin; methodology, Serdal Terzi, Ekinhan Eriskin, Fatih Ergezer, Tahsin Baykal; formal analysis, Tahsin Baykal; writing (original draft preparation), Serdal Terzi, Ekinhan Eriskin, Fatih Ergezer, Tahsin Baykal; writing (review and editing), Serdal Terzi, Ekinhan Eriskin, Fatih Ergezer, Tahsin Baykal; visualization, Tahsin Baykal; supervision, Serdal Terzi. All authors have read and agreed to the published version of the manuscript.
The data used to support the research findings are available from the corresponding author upon request.
The authors declare no conflict of interest.