References

Aloui, R. & Ben Aïssa, M. S. (2016). Relationship between oil, stock prices and exchange rates: A vine copula based GARCH method. N. Am. J. Econ. Fin., 37, 458–471.
Bhat, S. R., Yadav, J. S., Kumar, C. M. N., Amar, H. A., Rakesh, N., & Kumar, S. V. P. (2024). Oil price volatility and its impact on industry stock return – Bi variate analysis. In Proceedings of the International Conference on Business and Technology (ICBT2024) (pp. 102–111).
Boubaker, S., Karim, S., Naeem, M. A., & Sharma, G. D. (2023). Financial markets, energy shocks, and extreme volatility spillovers. Energy Econ., 126, 107031.
Bouri, E., Rognone, L., Sokhanvar, A., & Wang, Z. K. (2023). From climate risk to the returns and volatility of energy assets and green bonds: A predictability analysis under various conditions. Technol. Forecast. Soc. Change, 194, 122682.
Chaibi, M., Benghoulam, E. M., Tarik, L., Berrada, M., & El Hmaidi, A. (2022). Machine learning models based on random forest feature selection and bayesian optimization for predicting daily global solar radiation. Int. J. Renew. Energy Dev., 11(1), 309–323.
Duchesne, L., Karangelos, E., & Wehenkel, L. (2017). Machine learning of real-time power systems reliability management response. In 2017 IEEE Manchester PowerTech, Manchester, UK (pp. 1–6).
Fieberg, C., Metko, D., Poddig, T., & Loy, T. (2023). Machine learning techniques for cross-sectional equity returns’ prediction. OR Spec., 45(1), 289–323.
Han, J. W., Kamber, M., & Pei, J. (2012). Data Mining: Concepts and Techniques. Elsevier.
He, Z. F., Chen, J. Q., Zhou, F. Z., Zhang, G. Q., & Wen, F. H. (2022). Oil price uncertainty and the risk-return relation in stock markets: Evidence from oil-importing and oil-exporting countries. Int. J. Financ. Econ., 27(1), 1154–1172.
Herm, L. V., Heinrich, K., Wanner, J., & Janiesch, C. (2023). Stop ordering machine learning algorithms by their explainability! A user-centered investigation of performance and explainability. Int. J. Inf. Manag., 69, 102538.
Huang, R. D., Masulis, R. W., & Stoll, H. R. (1996). Energy shocks and financial markets. J. Fut. Mark., 16(1), 1–27.
Kang, W. S. & Ratti, R. A. (2013). Oil shocks, policy uncertainty and stock market return. J. Int. Financ. Mark. Inst. Money, 26, 305–318.
Kennedy, D. & Wallis, I. (2007). Impacts of fuel price changes on New Zealand transport. Land Transport New Zealand Research Report 331.
Kim, S., Jeon, J., & Kim, H. (2024). Drought and energy stock markets in the United States. Environ. Res. Lett., 19(9), 094012.
Kuziak, K. & Górka, J. (2023). Dependence analysis for the energy sector based on energy ETFs. Energies, 16(3), 1329.
Liu, L. B., Chen, C., & Wang, B. (2022). Predicting financial crises with machine learning methods. J. Forecasting, 41(5), 871–910.
Naeem, M. A., Sadorsky, P., & Karim, S. (2023). Sailing across climate-friendly bonds and clean energy stocks: An asymmetric analysis with the Gulf Cooperation Council Stock markets. Energy Econ., 126, 106911.
Naifar, N., Hammoudeh, S., & Tiwari, A. K. (2019). Do energy and banking CDS sector spreads reflect financial risks and economic policy uncertainty? A time-scale decomposition approach. Comput. Econ., 54(2), 507–534.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine learning in Python. J. Mach. Learn. Res., 12, 2825–2830.
Rajput, N., Jeyalakshmi, R., & Ghai, R. K. (2024). The impact of sustainable electrification on stock market performance: An empirical study of Indian energy companies. In 2024 International Conference on Power, Energy, Control and Transmission Systems (ICPECTS), Chennai, India (pp. 1–7).
Sadraoui, T., Regaieg, R., Abdelghani, S., Moussa, W., & Mgadmi, N. (2021). The dependence and risk spillover between energy market and BRICS stock markets: A copula-MGARCH model approach. Global Bus. Rev.
Son, J. & Ryu, D. (2024). Energy price shocks and stock market volatility in an energy-importing country. Energy Environ.
Wang, Y. D. & Hao, X. F. (2023). Forecasting the real prices of crude oil: What is the role of parameter instability? Energy Econ., 117, 106483.
Yilmaz, E. S., Ozpolat, A., & Destek, M. A. (2022). Do Twitter sentiments really effective on energy stocks? Evidence from the intercompany dependency. Environ. Sci. Pollut. Res., 29(52), 78757–78767.
Open Access
Research article

Comparative Evaluation of Statistical Models and Machine Learning Approaches in Modelling the Energy Dependency of the BIST Industrial Index: Balancing Predictive Performance and Interpretability

Ahmet Akusta*
Rectorate, Konya Technical University, 42250 Konya, Türkiye
Journal of Corporate Governance, Insurance, and Risk Management | Volume 11, Issue 3, 2024 | Pages 160-167
Received: 07-14-2024, Revised: 09-11-2024, Accepted: 09-20-2024, Available online: 09-27-2024

Abstract:

Energy dependency plays a pivotal role in shaping the performance of stock markets, particularly in energy-sensitive indices such as the BIST Industrial Index in Turkey. This study presents a comparative evaluation of traditional statistical models and machine learning (ML) techniques in capturing the complex relationship between energy variables and the BIST Industrial Index. A dataset encompassing energy imports, production levels, and energy prices is utilised to assess the effectiveness of Ordinary Least Squares (OLS) regression, Random Forest (RF), and Gradient Boosting (GB) models. The results reveal that ML models substantially outperform traditional statistical methods in their ability to capture nonlinear, intricate relationships between energy metrics and market behaviour. Among the ML models, RF demonstrates the highest predictive accuracy. Feature importance analysis identifies crude oil production as the most significant variable, underscoring the dominant influence of domestic energy dynamics in shaping the BIST Industrial Index. While ML models offer superior forecasting capabilities, they introduce challenges in terms of model interpretability. In contexts where transparency is crucial, statistical models such as OLS remain more favoured for their simplicity and explainability. The findings highlight the need for a balanced approach in model selection, with hybrid models potentially offering the best of both worlds by combining the strengths of traditional and modern methodologies. The insights derived from this study can inform policymakers and investors, particularly within emerging markets, providing a nuanced understanding of the trade-offs between predictive power and model transparency in forecasting energy-sensitive financial indices.

Keywords: Energy dependency, BIST Industrial Index, Energy markets, Emerging markets, Feature importance, Statistical models, Machine learning (ML)

1. Introduction

Energy dependency stands as a crucial determinant of stock market performance, as fluctuations in energy inputs such as crude oil, natural gas, and electricity generation can influence production costs, investor sentiment, and overall firm profitability. Prior research has underscored the sensitivity of market returns to variations in energy availability and input costs (Kim et al., 2024). In many instances, these energy-related factors intertwine with broader macroeconomic indicators, demonstrating that the dynamics of energy markets extend beyond simple price movements and are integral to understanding the performance of industrial indices.

Turkey’s BIST Industrial Index provides a compelling case study for examining energy dependency because of the country’s substantial reliance on energy imports, its evolving energy production landscape, and the pivotal role that industrial firms play in Turkey’s economic growth. The index’s sensitivity to both domestic production and external energy shocks makes it an appropriate focal point for assessing how energy variables shape market behavior in an emerging market context.

While numerous studies have delved into the relationship between energy market factors and stock indices, much of the early work relied on statistical models that imposed linear relationships and may have struggled to capture nonlinearities or complex interactions. For example, conventional regression-based approaches have been extensively used to analyze financial returns, given their interpretability and transparency (Fieberg et al., 2023). However, as market conditions become more volatile and data complexity increases, linear models may fail to identify intricate patterns and nonlinear dependencies that could enhance forecast accuracy.

Recent advances in ML have shown considerable promise in dealing with such complexity, as ML models can uncover nonlinear relationships and intricate interactions among multiple predictors (Herm et al., 2023). In particular, techniques like GB and RF have demonstrated strong predictive capabilities when modeling financial returns driven by energy and climate-related factors (Bouri et al., 2023). However, despite their predictive power, ML models often appear as “black boxes” with limited transparency, posing challenges for stakeholders who need to understand the rationale behind forecasts for informed policy and investment decisions.

This complexity-interpretability trade-off is more than merely an academic concern. Investors, policymakers, and corporate managers increasingly require more nuanced models that balance the benefits of superior predictive accuracy with the necessity for interpretability and trustworthiness (Herm et al., 2023). In highly regulated or risk-averse environments, understanding a model’s decision-making process becomes just as important as its predictive strength. Thus, while ML models hold the potential for improved forecasting in energy-influenced financial settings, statistical models continue to serve an essential role where clarity and transparency are paramount.

Against this backdrop, the present study aims to contribute to the literature by directly comparing classical statistical techniques and ML methods in modeling the relationship between energy dependency variables and the BIST Industrial Index. This focus on a key emerging market index not only provides insights specific to Turkey’s industrial sector but also expands the existing body of work by exploring how different modeling paradigms perform under conditions of energy uncertainty. This research will guide practitioners in choosing between interpretable statistical models and complex, but potentially more accurate, ML algorithms, depending on their specific strategic needs and regulatory constraints. Nevertheless, it is also important to acknowledge potential data limitations, such as occasional gaps in publicly available energy statistics or measurement inconsistencies across different sources, which may influence the robustness and generalizability of the findings.

2. Literature Review

Energy dependency has increasingly been recognized as a significant factor influencing financial market behavior, particularly in energy-sensitive indices. The predictive modeling of financial returns driven by energy variables has long been the subject of academic inquiry, with traditional statistical approaches historically dominating the discourse. However, the rise of ML has offered new possibilities for uncovering complex, nonlinear relationships in financial and energy data.

Statistical models have been extensively applied to examine the relationship between energy variables and financial markets. Traditional methods such as OLS regression and autoregressive integrated moving average (ARIMA) are favored for their simplicity and interpretability. For instance, Wang & Hao (2023) emphasized that statistical models remain useful in contexts requiring transparent and interpretable insights, particularly when addressing parameter instability in time-series data. However, the authors noted that traditional methods often fail to capture the complex and dynamic nature of relationships in energy-driven financial indices, especially in the presence of structural breaks.

Additionally, Aloui & Ben Aïssa (2016) applied a vine copula approach to explore dependencies among energy, stock, and currency markets, demonstrating that statistical models could capture symmetric relationships effectively. However, they highlighted that these models may inadequately represent time-varying dynamics or nonlinear dependencies, particularly during financial crises or market turbulence. Such findings underscore the limitations of statistical models in dealing with complex datasets involving energy variables.

In recent years, ML methods have gained traction for their ability to model nonlinear relationships and capture intricate patterns in financial data. For example, Naeem et al. (2023) used ML techniques such as the cross-quantilogram approach to examine dependencies between clean energy stocks and GCC stock markets. Their study demonstrated that ML models could detect time-varying features and nonlinear dependencies, particularly under extreme market conditions like the COVID-19 pandemic. These findings highlight the adaptability of ML models in analyzing energy-related financial phenomena.

Similarly, Boubaker et al. (2023) employed quantile vector autoregression (VAR) models to capture volatility spillovers across asset classes, demonstrating that ML techniques could outperform traditional mean-based methods under extreme conditions. Their study also emphasized the asymmetric impact of crises on volatility dependencies, showcasing the flexibility of ML in handling complex financial datasets. However, the increased complexity of these models raises questions about their interpretability and usability in practical applications.

Comparative analyses of statistical and ML models have highlighted key trade-offs between interpretability and predictive performance. For instance, Aloui & Ben Aïssa (2016) noted that statistical models, while simpler and more interpretable, often fail to account for nonlinearities and time-varying relationships inherent in energy-sensitive financial markets. Conversely, advanced ML models such as GB and RF are better equipped to handle these complexities but suffer from a lack of transparency.

Yilmaz et al. (2022) explored the impact of social media sentiment on energy stock prices using ML models. They found that such models could integrate external variables like Twitter sentiment into financial forecasting. However, the authors cautioned that the increased complexity of these models might hinder their acceptance among practitioners seeking actionable insights.

Furthermore, Naifar et al. (2019) demonstrated that wavelet-based ML models could capture co-movement dynamics between energy and credit markets more effectively than traditional statistical approaches. Their findings reinforced that ML models provide a more nuanced understanding of financial interdependencies, particularly over intermediate and long investment horizons. However, they also acknowledged the computational challenges and expertise required to implement such models effectively.

Applying ML models to energy-driven financial indices has revealed their potential to enhance predictive accuracy. For instance, Sadraoui et al. (2021) used copula-based multivariate GARCH models to investigate volatility spillovers between energy markets and BRICS stock indices. Their results demonstrated that ML methods could better account for risk spillovers and volatility clustering than traditional statistical models, offering valuable insights for portfolio management and risk mitigation.

However, the trade-off between accuracy and interpretability remains a critical consideration. Liu et al. (2022) highlighted this issue in their exploration of early warning systems for financial crises, noting that while ML models excel in predictive tasks, their complexity often obscures the underlying decision-making processes. As a result, simpler models may be more suitable for scenarios requiring clear communication and transparency.

While ML models offer significant advantages in modeling energy-driven financial indices, their adoption is challenging. Aloui & Ben Aïssa (2016) emphasized that the complexity of these models could lead to overfitting, particularly in high-dimensional datasets. Moreover, Boubaker et al. (2023) pointed out that the lack of interpretability in ML models might limit their practical utility for policymakers and investors.

Despite these challenges, integrating statistical and ML methods holds promise for advancing financial forecasting. For example, Sadraoui et al. (2021) suggested that hybrid models combining the interpretability of statistical approaches with the predictive power of ML could offer a balanced solution. Such models could leverage the strengths of both methodologies to address the complexities of energy-driven financial markets. This aligns with a broader literature stream indicating that domestic production metrics, geopolitical uncertainties, and external price shocks can interact in diverse ways, emphasizing the importance of carefully interpreting feature importance results in light of local energy conditions and global market forces.

3. Data and Methodology

3.1 Variables

This study uses a monthly dataset covering January 2008 to June 2024, sourced from TradingView, that integrates energy-related and economic indicators relevant to the BIST Industrial Index. TradingView served as the primary data source for all series. The variables fall into three groups that reflect critical factors influencing industrial performance: Energy Imports, Energy Production, and Energy Prices. Table 1 lists the variables in each group together with the literature that motivates their inclusion.

Table 1. Variables

| Variable Group | Variable | Literature Reference |
|---|---|---|
| Energy Imports | Crude Oil Import, Natural Gas Imports | Bhat et al. (2024); He et al. (2022); Son & Ryu (2024) |
| Energy Production | Crude Oil Production, Electricity Production | Kang & Ratti (2013); Rajput et al. (2024) |
| Energy Prices | Gasoline Prices, Crude Oil Futures Price, Natural Gas Futures Price | Huang et al. (1996); Kennedy & Wallis (2007); Kuziak & Górka (2023) |

3.2 Data Cleaning and Handling Missing Values

The dataset required preprocessing to ensure its suitability for modeling. Missing values, a common challenge in economic and energy datasets, were addressed using interpolation techniques. This method, commonly employed in time series and economic data, allowed for estimating missing entries by leveraging existing trends and patterns within the data. Specifically, a linear interpolation approach was selected to fill the gaps based on the assumption of gradual changes in energy and financial variables over time, which is a reasonable approach in this context.
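The gap-filling step described above can be sketched in a few lines. This is an illustrative, standard-library version of linear interpolation, not the article's actual code; the function name `interpolate_linear` and the sample series are hypothetical, and leaving leading or trailing gaps unfilled is an assumption, since the article does not say how edge gaps were handled.

```python
from typing import List, Optional


def interpolate_linear(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps (None) on a straight line between the nearest
    observed neighbours. Leading and trailing gaps are left as None."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        gap = right - left
        if gap > 1:
            step = (filled[right] - filled[left]) / gap
            for k in range(1, gap):
                filled[left + k] = filled[left] + step * k
    return filled


# Hypothetical monthly series with two consecutive missing months.
series = [100.0, None, None, 130.0, 128.0]
print(interpolate_linear(series))  # [100.0, 110.0, 120.0, 130.0, 128.0]
```

Each missing month is placed on the straight line joining the last and next observed values, which is what the assumption of gradual change amounts to in practice.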

Additionally, non-relevant columns, such as temporal identifiers, were removed to streamline the dataset. The dependent variable, the BIST Industrial Index, was isolated for analysis, while the remaining columns were standardized to ensure consistency across different measurement scales.

3.3 Data Preparation and Preprocessing

To enhance the dataset’s compatibility with the selected models, the independent variables were standardized using z-scores (Han et al., 2012). This transformation rescales each variable to a mean of zero and a standard deviation of one, thereby minimizing potential biases from variables measured on very different scales. Such standardization can improve convergence and numerical stability for scale-sensitive estimators, and it places the OLS coefficients on a comparable per-standard-deviation footing. Tree-based ensembles such as RF and GB are largely robust to feature scaling, so for them the step mainly ensures that all three models are trained on identical inputs.
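As a small illustration of the z-score transformation, each value x is mapped to (x − mean) / standard deviation. The sketch below uses only the standard library; the function name `zscore` and the sample column are hypothetical. It uses the population standard deviation, which is also what scikit-learn's `StandardScaler` computes.

```python
from statistics import fmean, pstdev
from typing import List


def zscore(column: List[float]) -> List[float]:
    """Standardize one column to mean 0 and (population) std 1."""
    mu = fmean(column)
    sigma = pstdev(column)  # population standard deviation (ddof = 0)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


# Hypothetical raw values for a single predictor.
raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = zscore(raw)
print(scaled)
```

For this sample the mean is 5 and the standard deviation is 2, so the value 9 becomes 2.0 and the value 2 becomes -1.5.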

The processed dataset included all variables in Table 1, enabling a comprehensive analysis of how energy imports, production, and prices influence the BIST Industrial Index. This preprocessing framework ensures that the subsequent model training and evaluation are based on reliable and well-structured data, setting a strong foundation for the comparative analysis of statistical and ML models.

The dataset, comprising energy-driven variables relevant to the Turkish industrial sector, was utilized to predict the BIST Industrial Index. Data preparation involved dropping the datetime column and separating the dependent variable (BIST Industrial Index) from the independent variables. The independent variables were standardized using a StandardScaler to ensure uniformity in scaling and to mitigate the effects of varying magnitudes across features (Pedregosa et al., 2011).

Subsequently, the dataset was partitioned into training (80%) and testing (20%) sets. This division ensured that model evaluation was conducted on unseen data, providing an unbiased assessment of generalization capabilities.
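A minimal sketch of an 80/20 partition follows. The article does not state whether rows were shuffled or split chronologically, so both variants are shown; the helper `split_indices`, the seed, and the row count of 198 (one row per month from January 2008 to June 2024, assuming no month is missing) are assumptions made for illustration.

```python
import math
import random
from typing import List, Tuple


def split_indices(n_rows: int, test_share: float = 0.2,
                  shuffle: bool = True, seed: int = 42) -> Tuple[List[int], List[int]]:
    """Return (train_idx, test_idx); the test set gets ceil(test_share * n_rows) rows."""
    idx = list(range(n_rows))
    if shuffle:
        random.Random(seed).shuffle(idx)
    n_train = n_rows - math.ceil(test_share * n_rows)
    return idx[:n_train], idx[n_train:]


n_months = 198  # assumed: one row per month, 2008/01 through 2024/06
train_idx, test_idx = split_indices(n_months)
print(len(train_idx), len(test_idx))  # 158 40

# Chronological alternative: hold out the most recent months instead.
_, recent_idx = split_indices(n_months, shuffle=False)
```

With time-ordered data the choice matters: a shuffled split lets test months sit between training months, while a chronological split evaluates the model on a period it has never seen.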

3.4 Model Training and Evaluation

Three predictive models were implemented: OLS regression, RF, and GB. These models were chosen to represent diverse methodologies, ranging from statistical regression to ensemble-based ML approaches. Model evaluation used standard metrics: Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and the coefficient of determination (R²). Together, these metrics assess both predictive accuracy and explanatory power.
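For reference, the five metrics can be computed as in the sketch below. It uses only the standard library; the function name `regression_metrics` and the toy values are hypothetical. MAPE is expressed in percent here, which is an assumption about the scale of the values the article reports.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute MAE, MAPE (in percent), MSE, RMSE, and R² for one model."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when a true value is zero; index levels are positive.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MAPE": mape, "MSE": mse, "RMSE": rmse, "R2": r2}


# Toy example with four observations.
metrics = regression_metrics([100.0, 200.0, 300.0, 400.0],
                             [110.0, 190.0, 310.0, 380.0])
print(metrics)
```

MAE and RMSE are in the units of the index, MAPE is scale-free, and R² compares the model's squared error with that of a constant prediction at the mean.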

Model training times were also recorded to evaluate computational efficiency. Feature importance analysis was performed for RF and GB to interpret the significance of predictor variables. For OLS regression, feature coefficients provided insights into linear relationships.
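The workflow in this subsection might look roughly like the following scikit-learn sketch. It runs on synthetic data, not the article's dataset, and the hyperparameters (100 estimators, fixed random seeds) are assumptions, since the article does not report them. Fitting the scaler on the training rows only is likewise a design choice made here, not something the article specifies.

```python
import time

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the article's data: 198 rows, 7 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(198, 7))
y = 3.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=198)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit the scaler on the training rows only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

models = {
    "OLS": LinearRegression(),
    "RF": RandomForestRegressor(n_estimators=100, random_state=42),
    "GB": GradientBoostingRegressor(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    start = time.perf_counter()
    model.fit(X_train_s, y_train)
    seconds = time.perf_counter() - start
    # Tree ensembles expose impurity-based importances; OLS exposes coefficients.
    if hasattr(model, "feature_importances_"):
        weights = model.feature_importances_
    else:
        weights = model.coef_
    results[name] = {
        "seconds": seconds,
        "r2": r2_score(y_test, model.predict(X_test_s)),
        "weights": weights,
    }

for name, row in results.items():
    print(f"{name}: fit in {row['seconds']:.3f}s, test R2 = {row['r2']:.3f}")
```

The impurity-based importances returned by the two ensembles are normalized to sum to one, whereas OLS coefficients are signed and expressed in units of the target, so the two kinds of output are not directly comparable in magnitude.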

4. Results

4.1 Model Performance

The evaluation of the three models highlights distinct strengths and limitations. As shown in Table 2, RF demonstrated superior predictive accuracy, suggesting its ability to effectively capture complex relationships between energy-related variables and the BIST Industrial Index. This model’s robust performance aligns with its inherent capacity to model non-linear interactions and to weigh a diverse set of features.

GB, while slightly less accurate than RF, achieved comparable performance, emphasizing its capability to balance predictive power and computational efficiency. This suggests that GB could be a viable alternative in contexts requiring reduced computational overhead without substantially compromising accuracy.

Table 2. Model performance metrics

| Model | MAE | MAPE | MSE | RMSE | R² |
|---|---|---|---|---|---|
| OLS Regression | 725.374121 | 63.199368 | 1,102,524 | 1,050.011289 | 0.778915 |
| RF | 204.179320 | 10.793117 | 185,960.2 | 431.231004 | 0.962710 |
| GB | 287.593380 | 16.555031 | 478,523.8 | 691.754173 | 0.904044 |

In contrast, the performance of OLS regression was constrained by its linear assumptions. While the model exhibited computational efficiency and simplicity, its relatively weaker fit indicates limitations in addressing non-linear dependencies within the dataset. This underscores the need for more sophisticated models when complex relationships dominate.
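As a quick consistency check on Table 2, RMSE should equal the square root of MSE for each model. The sketch below re-derives RMSE from the reported (rounded) MSE values; the dictionary name `table2` is just a label used here.

```python
import math

# MSE and RMSE as reported in Table 2 (MSE values are rounded in the table).
table2 = {
    "OLS": {"mse": 1_102_524.0, "rmse": 1050.011289},
    "RF": {"mse": 185_960.2, "rmse": 431.231004},
    "GB": {"mse": 478_523.8, "rmse": 691.754173},
}

implied = {name: math.sqrt(row["mse"]) for name, row in table2.items()}
for name, row in table2.items():
    gap = abs(implied[name] - row["rmse"])
    print(f"{name}: reported RMSE {row['rmse']:.3f}, implied {implied[name]:.3f}, gap {gap:.5f}")
```

The implied and reported values agree to within rounding, so the error columns of the table are internally consistent.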

4.2 Feature Importance

The feature importance analysis revealed a consistent pattern across the ML models. “Turkey Crude Oil Production” emerged as the most influential predictor, underscoring its critical role in explaining variations in the BIST Industrial Index. This finding reflects the significant impact of crude oil production on industrial dynamics in Turkey.

Other features, such as “Turkey Gasoline Prices” and “Turkey Electricity Production,” displayed moderate importance, indicating their secondary yet meaningful contributions. These variables may encapsulate broader energy sector trends that indirectly influence industrial performance. Variables such as “Crude Oil Futures Price” and “Natural Gas Futures Price” exhibited minimal importance, suggesting their limited direct impact on the index within the examined context.

Table 3 indicates that Turkey’s crude oil production is the most influential predictor across both the RF and GB models. This result parallels the emphasis on identifying dominant variables in energy systems found in Duchesne et al. (2017). In that work, tree-based ensemble methods were deployed to determine the key factors affecting real-time reliability management, demonstrating that feature importance can offer valuable insights into system behavior. Furthermore, Chaibi et al. (2022) showed that RF-based feature selection effectively isolates critical features when predicting daily global solar radiation, underscoring the general utility of focusing on the top-ranked variables. These findings are consistent with the broader literature’s focus on employing ensemble methods to identify and emphasize dominant predictors.

Table 3. Feature importances of ML models

| Feature | RF Importance | GB Importance |
|---|---|---|
| Turkey Crude Oil Production | 0.953490 | 0.965428 |
| Turkey Gasoline Prices | 0.017171 | 0.010904 |
| Turkey Electricity Production | 0.010426 | 0.015457 |
| Crude Oil Futures Price | 0.006598 | 0.003996 |
| Crude Oil Import | 0.005692 | 0.001665 |
| Natural Gas Futures Price | 0.004882 | 0.000689 |
| Turkey Natural Gas Imports | 0.001740 | 0.001860 |
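The values in Table 3 can be read more easily with a short calculation. The sketch below, which only re-uses the reported numbers, checks that each column sums to roughly one (as normalized importances should) and compares the rankings produced by the two models; the variable names are illustrative.

```python
# (RF importance, GB importance) as reported in Table 3.
importances = {
    "Turkey Crude Oil Production": (0.953490, 0.965428),
    "Turkey Gasoline Prices": (0.017171, 0.010904),
    "Turkey Electricity Production": (0.010426, 0.015457),
    "Crude Oil Futures Price": (0.006598, 0.003996),
    "Crude Oil Import": (0.005692, 0.001665),
    "Natural Gas Futures Price": (0.004882, 0.000689),
    "Turkey Natural Gas Imports": (0.001740, 0.001860),
}

rf_total = sum(rf for rf, _ in importances.values())
gb_total = sum(gb for _, gb in importances.values())

# Rank features from most to least important under each model.
rf_rank = sorted(importances, key=lambda f: importances[f][0], reverse=True)
gb_rank = sorted(importances, key=lambda f: importances[f][1], reverse=True)

print(f"RF total: {rf_total:.6f}, GB total: {gb_total:.6f}")
print("RF top three:", rf_rank[:3])
print("GB top three:", gb_rank[:3])
```

Both models agree on the top-ranked feature, but the second and third places swap between gasoline prices and electricity production. With more than 95% of the importance concentrated in a single variable, the ordering of the remaining features rests on very small differences and should be interpreted cautiously.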

4.3 OLS Regression Coefficients

OLS regression coefficients, shown in Table 4, provided further insights into the linear relationships within the dataset. The strong positive coefficient for “Turkey Crude Oil Production” reinforced its dominant role, aligning with the feature importance results from ML models. Conversely, the negative coefficient for “Turkey Electricity Production” suggested an inverse relationship, potentially pointing to structural or economic dynamics that warrant further exploration. These coefficients offer interpretable insights but do not capture potential non-linear interactions, which limits their standalone applicability.

From a policy and investment standpoint, the positive coefficient of Turkey’s crude oil production implies that strategies to bolster domestic production could have favorable effects on the industrial index, potentially enhancing investor confidence and encouraging capital inflows. Conversely, the negative coefficient for electricity production might indicate inefficiencies or capacity constraints that, if addressed, could positively affect industrial performance.

Table 4. OLS regression coefficients

| Feature | OLS Coefficient |
|---|---|
| Turkey Crude Oil Production | 4204.140144 |
| Turkey Gasoline Prices | 851.968485 |
| Natural Gas Futures Price | 265.966321 |
| Turkey Natural Gas Imports | 256.899648 |
| Crude Oil Import | 84.122354 |
| Crude Oil Futures Price | -91.288507 |
| Turkey Electricity Production | -339.082297 |
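Because the predictors were standardized before fitting (Section 3.3), each coefficient in Table 4 can be read as the change in the fitted index value associated with a one-standard-deviation increase in that predictor, holding the others fixed. The sketch below illustrates this reading. It assumes the dependent variable was left in index points, and the helper `index_change` is hypothetical; the intercept is not reported in the article, so only changes, not levels, can be computed.

```python
# OLS coefficients as reported in Table 4 (predictors are z-scored).
coefficients = {
    "Turkey Crude Oil Production": 4204.140144,
    "Turkey Gasoline Prices": 851.968485,
    "Natural Gas Futures Price": 265.966321,
    "Turkey Natural Gas Imports": 256.899648,
    "Crude Oil Import": 84.122354,
    "Crude Oil Futures Price": -91.288507,
    "Turkey Electricity Production": -339.082297,
}


def index_change(shifts_in_sd: dict) -> float:
    """Fitted change in the index for the given shifts, measured in standard deviations."""
    return sum(coefficients[name] * shift for name, shift in shifts_in_sd.items())


# Half a standard deviation more crude oil production, all else unchanged.
half_sd_oil = index_change({"Turkey Crude Oil Production": 0.5})
# One standard deviation more electricity production, all else unchanged.
one_sd_power = index_change({"Turkey Electricity Production": 1.0})
print(round(half_sd_oil, 3), round(one_sd_power, 3))
```

These are associations within a linear fit, not causal effects, which is worth keeping in mind when the coefficients are used to motivate policy conclusions.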

4.4 Interpretability-Complexity Trade-Off

The models, whose attributes are shown in Table 5, exhibited a clear trade-off between interpretability and complexity. With its straightforward coefficients and transparent mechanics, OLS regression offers high interpretability and low computational demands. This simplicity makes it a suitable choice in scenarios where model explainability is paramount and relationships are expected to be predominantly linear.

ML models, on the other hand, presented higher complexity, as evidenced by the larger number of parameters and longer training times. RF and GB provided moderate interpretability through feature importance metrics and SHAP analysis. These models excel in capturing non-linear relationships but require careful consideration of their complexity, particularly in resource-constrained or time-sensitive applications.

Practitioners can navigate this trade-off by first assessing their regulatory and operational environment. Settings that demand high transparency, such as regulatory compliance or board-level reporting, may benefit from OLS or other simple models. In contrast, competitive industries prioritizing predictive accuracy for high-stakes decisions might opt for RF or GB, supplemented by explainability tools like SHAP to gain additional transparency.

Table 5. Model attributes

| Model | Training Time (s) | Number of Parameters | Interpretability | Complexity |
|---|---|---|---|---|
| OLS Regression | 0.024717 | 7 | High | Low |
| RF | 0.721259 | 700 | Moderate | High |
| GB | 0.389806 | 700 | Moderate | High |

5. Discussion

The comparison of traditional statistical models and advanced ML techniques in forecasting energy-influenced financial indices has garnered increasing attention. Traditional linear models, such as OLS regression, provide transparent and interpretable results, making them suitable in settings where clarity and trustworthiness are paramount (Wang & Hao, 2023). However, these models often struggle to capture the complexity and nonlinearities inherent in energy and financial data, especially when confronted with structural breaks or rapidly changing market conditions (Aloui & Ben Aïssa, 2016).

Recent studies have demonstrated ML methods’ capacity to address traditional models’ limitations in energy-related financial forecasting. ML approaches have been noted for their adaptability during market turbulence, including crises driven by energy shocks (Boubaker et al., 2023). Such findings emphasize their value in capturing time-varying dynamics and nonlinear dependencies.

However, the enhanced predictive power of ML models introduces the well-documented complexity-interpretability trade-off. Stakeholders in highly regulated or risk-sensitive environments may find the opacity of ML models challenging, as understanding the rationale behind the forecasts is often as important as their predictive strength (Herm et al., 2023). Despite the interpretability limitations, hybrid modeling approaches that integrate the transparency of statistical methods with the robustness of ML are emerging as promising avenues, potentially delivering both explainability and accuracy in energy-driven financial forecasting tasks (Sadraoui et al., 2021).

Moreover, these findings hold significant implications for practitioners and policymakers. By understanding which energy factors (e.g., crude oil production) wield the most significant influence, decision-makers can devise more targeted investment strategies and energy policies. For instance, incentives to enhance domestic production or improve production efficiencies may positively affect the overall industrial market performance. Beyond Turkey, these insights may apply to other emerging markets with similar energy-dependence characteristics. However, the specific influence of each variable could differ due to variations in local energy supply chains, regulatory frameworks, and geopolitical conditions.

6. Conclusion

This study compares traditional statistical and ML approaches in modeling the BIST Industrial Index under energy-driven conditions. While OLS regression offers computational efficiency and interpretability, it fails to capture the nonlinearities and complex interdependencies in energy-related data, resulting in weaker predictive performance.

ML models, particularly RF and GB, significantly outperform OLS, demonstrating a superior ability to handle nonlinear relationships. RF achieved the highest accuracy, while GB balanced performance with computational efficiency. Feature importance analysis highlighted Turkey’s crude oil production as the dominant predictor, underscoring the critical role of domestic energy production over global price signals in influencing the index.
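The kind of comparison summarized above can be illustrated with a minimal scikit-learn sketch. It is not the study's pipeline: the data are synthetic, the feature names (`oil_production`, `oil_price`, `gas_price`) and the assumed nonlinear target are hypothetical stand-ins, and the random train/test split is a simplification that a real time-series analysis would likely replace with a chronological split.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the study's data); feature names are hypothetical.
rng = np.random.default_rng(42)
n = 600
feature_names = ["oil_production", "oil_price", "gas_price"]
X = rng.uniform(-2.0, 2.0, size=(n, 3))

# Assumed target: strongly nonlinear in the first feature, weakly linear in
# the second, unrelated to the third, plus Gaussian noise.
y = 5.0 * np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] + rng.normal(0.0, 0.5, size=n)

# Random split for simplicity; a chronological split suits real time series.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "OLS": LinearRegression(),
    "RF": RandomForestRegressor(n_estimators=200, random_state=0),
    "GB": GradientBoostingRegressor(random_state=0),
}

# Fit each model and record its out-of-sample R^2.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: out-of-sample R^2 = {scores[name]:.3f}")

# Impurity-based feature importances from the fitted random forest.
importances = dict(zip(feature_names, models["RF"].feature_importances_))
for feat, imp in sorted(importances.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{feat}: {imp:.3f}")
```

On data of this shape, the tree ensembles should recover the nonlinear signal that the linear fit misses, and the importance ranking should single out the feature that drives the target. Impurity-based importances can be biased toward high-cardinality features, so a permutation-based check on held-out data is a reasonable complement.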

However, the improved accuracy of ML models comes at the cost of reduced transparency. This complexity-interpretability trade-off emphasizes the need for alignment between model choice and strategic objectives. Simple, interpretable models like OLS may be preferable where explainability is paramount, while ML models are better suited for high-stakes contexts requiring robust predictive accuracy.

Future studies could investigate the role of geopolitical events—such as regional conflicts or energy supply disputes—in shaping the relationship between energy variables and financial indices, potentially introducing new layers of uncertainty. Additionally, expanding the analysis to other emerging markets or distinct energy market conditions could reveal variations in the impact of domestic production and global price signals, further refining our understanding of energy dependency. This research also suggests opportunities to develop practical decision-making frameworks integrating interpretability and predictive power, guiding practitioners toward model choices consistent with their risk tolerance, regulatory environment, and strategic objectives.

Data Availability

The data used to support the research findings are available from the corresponding author upon request.

Conflicts of Interest

The author declares no conflict of interest.

References
Aloui, R. & Ben Aïssa, M. S. (2016). Relationship between oil, stock prices and exchange rates: A vine copula based GARCH method. N. Am. J. Econ. Fin., 37, 458–471.
Bhat, S. R., Yadav, J. S., Kumar, C. M. N., Amar, H. A., Rakesh, N., & Kumar, S. V. P. (2024). Oil price volatility and its impact on industry stock return – Bi variate analysis. In Proceedings of the International Conference on Business and Technology (ICBT2024) (pp. 102–111).
Boubaker, S., Karim, S., Naeem, M. A., & Sharma, G. D. (2023). Financial markets, energy shocks, and extreme volatility spillovers. Energy Econ., 126, 107031.
Bouri, E., Rognone, L., Sokhanvar, A., & Wang, Z. K. (2023). From climate risk to the returns and volatility of energy assets and green bonds: A predictability analysis under various conditions. Technol. Forecast. Soc. Change, 194, 122682.
Chaibi, M., Benghoulam, E. M., Tarik, L., Berrada, M., & El Hmaidi, A. (2022). Machine learning models based on random forest feature selection and bayesian optimization for predicting daily global solar radiation. Int. J. Renew. Energy Dev., 11(1), 309–323.
Duchesne, L., Karangelos, E., & Wehenkel, L. (2017). Machine learning of real-time power systems reliability management response. In 2017 IEEE Manchester PowerTech, Manchester, UK (pp. 1–6).
Fieberg, C., Metko, D., Poddig, T., & Loy, T. (2023). Machine learning techniques for cross-sectional equity returns’ prediction. OR Spec., 45(1), 289–323.
Han, J. W., Kamber, M., & Pei, J. (2012). Data Mining: Concepts and Techniques. Elsevier.
He, Z. F., Chen, J. Q., Zhou, F. Z., Zhang, G. Q., & Wen, F. H. (2022). Oil price uncertainty and the risk-return relation in stock markets: Evidence from oil-importing and oil-exporting countries. Int. J. Financ. Econ., 27(1), 1154–1172.
Herm, L. V., Heinrich, K., Wanner, J., & Janiesch, C. (2023). Stop ordering machine learning algorithms by their explainability! A user-centered investigation of performance and explainability. Int. J. Inf. Manag., 69, 102538.
Huang, R. D., Masulis, R. W., & Stoll, H. R. (1996). Energy shocks and financial markets. J. Fut. Mark., 16(1), 1–27.
Kang, W. S. & Ratti, R. A. (2013). Oil shocks, policy uncertainty and stock market return. J. Int. Financ. Mark. Inst. Money, 26, 305–318.
Kennedy, D. & Wallis, I. (2007). Impacts of fuel price changes on New Zealand transport. Land Transport New Zealand Research Report 331.
Kim, S., Jeon, J., & Kim, H. (2024). Drought and energy stock markets in the United States. Environ. Res. Lett., 19(9), 094012.
Kuziak, K. & Górka, J. (2023). Dependence analysis for the energy sector based on energy ETFs. Energies, 16(3), 1329.
Liu, L. B., Chen, C., & Wang, B. (2022). Predicting financial crises with machine learning methods. J. Forecasting, 41(5), 871–910.
Naeem, M. A., Sadorsky, P., & Karim, S. (2023). Sailing across climate-friendly bonds and clean energy stocks: An asymmetric analysis with the Gulf Cooperation Council Stock markets. Energy Econ., 126, 106911.
Naifar, N., Hammoudeh, S., & Tiwari, A. K. (2019). Do energy and banking CDS sector spreads reflect financial risks and economic policy uncertainty? A time-scale decomposition approach. Comput. Econ., 54(2), 507–534.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, É. (2011). Scikit-learn: Machine learning in Python. J. Mach. Learn. Res., 12, 2825–2830.
Rajput, N., Jeyalakshmi, R., & Ghai, R. K. (2024). The impact of sustainable electrification on stock market performance: An empirical study of Indian energy companies. In 2024 International Conference on Power, Energy, Control and Transmission Systems (ICPECTS), Chennai, India (pp. 1–7).
Sadraoui, T., Regaieg, R., Abdelghani, S., Moussa, W., & Mgadmi, N. (2021). The dependence and risk spillover between energy market and BRICS stock markets: A copula-MGARCH model approach. Global Bus. Rev.
Son, J. & Ryu, D. (2024). Energy price shocks and stock market volatility in an energy-importing country. Energy Environ.
Wang, Y. D. & Hao, X. F. (2023). Forecasting the real prices of crude oil: What is the role of parameter instability? Energy Econ., 117, 106483.
Yilmaz, E. S., Ozpolat, A., & Destek, M. A. (2022). Do Twitter sentiments really effective on energy stocks? Evidence from the intercompany dependency. Environ. Sci. Pollut. Res., 29(52), 78757–78767.

Cite this:
Akusta, A. (2024). Comparative Evaluation of Statistical Models and Machine Learning Approaches in Modelling the Energy Dependency of the BIST Industrial Index: Balancing Predictive Performance and Interpretability. J. Corp. Gov. Insur. Risk Manag., 11(3), 160-167. https://doi.org/10.56578/jcgirm110302
©2024 by the author(s). Published by Acadlore Publishing Services Limited, Hong Kong. This article is available for free download and can be reused and cited, provided that the original published version is credited, under the CC BY 4.0 license.