Reliability Analysis of Complex Repairable Systems Using Artificial Neural Networks: A Case Study on Underground Mining Machinery
Abstract:
The effective utilisation of equipment is essential for achieving operational goals within production sectors, particularly in industries involving heavy machinery. Throughout its lifecycle, equipment is exposed to dynamic loads and harsh operational environments, leading to potential failures that may significantly shorten its service life. Therefore, evaluating equipment reliability is crucial for mitigating production losses and ensuring continuous operations. This study presents a comprehensive reliability analysis of underground mining machinery, with a focus on Load-Haul-Dump (LHD) systems, which are key to material handling in mining operations. Reliability assessments are performed using methodologies based on the series configuration of repairable systems. The reliability of each LHD system is quantitatively evaluated by employing a feed-forward back-propagation artificial neural network (ANN) model implemented in MATLAB. This model is designed to predict the optimal responses of each LHD machine under varying operational conditions. The results obtained from the ANN model are compared with the calculated reliability values, demonstrating a high degree of correlation between the predicted and observed outcomes. This strong alignment underscores the potential of ANN-based models in accurately forecasting system reliability. Based on the analysis, recommendations are made to identify the most critical components contributing to the system's unreliability, thereby enabling targeted corrective actions. The findings provide valuable insights for engineers seeking to enhance the performance and operational efficiency of mining machinery through more informed maintenance and operational strategies.
1. Introduction
Inefficient and unproductive use of equipment is a primary factor contributing to the subpar performance of the mining industry. The LHD serves as the essential workhorse for underground mining operations, significantly impacting production and productivity levels. The performance of this equipment is largely influenced by the conditions of its working environment, the manner in which it is utilized, the effectiveness of maintenance and operational practices, and the expertise of the operating personnel [1]. Reliability assessments are valuable for organizing appropriate maintenance and production improvement strategies for capital-intensive equipment [2]. In contrast to software-based methods, analytical and statistical approaches may require more time to resolve complex issues. Currently, numerous researchers are concentrating on the application of soft computing techniques to address intricate numerical challenges.
Real-world challenges often involve systems characterized by non-linearity, temporal variability, uncertainty, and significant complexity [3]. The study of these systems encompasses algorithmic processes that describe and manipulate information, focusing on their theoretical foundations, analytical methods, design principles, efficiency, implementation strategies, and practical applications. Traditional computing, or hard computing, necessitates precise mathematical models and extensive computational resources [4]. In contrast, for such complex issues, computationally intelligent methods that emulate human expertise and adapt to dynamic environments can be employed both effectively and efficiently [5]. Soft computing represents a progressive array of artificial intelligence techniques designed to leverage the inherent tolerance for imprecision and uncertainty found in real-world scenarios, aiming to provide robust, efficient, and optimal solutions while further investigating and harnessing the available design knowledge [6].
Soft computing, which encompasses Fuzzy Logic (FL), ANN, and Evolutionary Computation (EC), has emerged as a pivotal research domain with applications across various engineering disciplines, including aerospace, communication networks, computer science, mechanical engineering, mining, power systems, and control systems [7], [8]. The methodologies within Soft Computing include key components such as Fuzzy Systems (FS), which incorporate FL; EC, which features genetic algorithms; ANN, which involve Neural Computing (NC); machine learning; and Probabilistic Reasoning (PR) [9], [10]. Notably, PR and FL systems are grounded in knowledge-driven reasoning, while ANN and EC represent data-driven search and optimization techniques. In the current research, the user-friendly and comprehensible non-linear model known as ANN has been employed to address complex problems [11]. A significant benefit of utilizing ANN is its non-parametric nature, contrasting with many other statistical methods that are parametric and require a more substantial statistical background [12]. However, ANN operates as a black box learning method, which limits its ability to elucidate the relationship between inputs and outputs and to manage uncertainties [13].
Furthermore, this paper discusses the adaptation of the Detrended Fluctuation Analysis (DFA) technique for the detection of faults in LHD machines. Introduced in 1994, DFA is a fractal scaling method designed to identify long-range autocorrelations in noisy and non-stationary time series [3], [14], [15]. This method has been applied effectively across a diverse array of fields in science, medicine, and engineering, including physiology, geophysics, finance, cardiac dynamics, and bioinformatics. However, its application in machine fault detection remains unexplored [16], [17], [18].
2. Case Study
The case study was conducted within the underground metal mining sector in the northwestern region of India, specifically at M/s Hindustan Zinc Limited (HZL), located in Sindesar Khurd, Rajasthan. This site is recognized as the largest underground metal mine in India, achieving a production volume of 4.5 million metric tons in the fiscal year 2018. The mine features a well-mechanized operation with lower operational costs, housing silver-zinc-lead mineral reserves that average a grade of 7%. Prior to transporting the ore from the surface to the beneficiation plant for secondary processes such as grinding, crushing, flotation, and washing, a primary crushing operation is conducted on the extracted ore. LHD vehicles serve as the primary equipment for ore handling and transportation. These LHDs are employed to scoop the extracted ore, load it into their buckets, and deposit it at the base of the mine for initial crushing before the ore is hoisted to the surface. Currently, the mine operates six LHDs manufactured by M/s Sandvik, each with a capacity of 17 cubic meters. Subgraphs (a) and (b) of Figure 1 illustrate a typical LHD in operation and one in the workshop for maintenance.
3. Data Collection and Classification
In the current study, each LHD was regarded as a distinct system, designated as LH26, LH27, LH28, LH29, LH30, and LH31. Each LHD system has been categorized into seven sub-systems: Engine Sub-System (SSE), Braking Sub-System (SSBr), Tyre Sub-System (SSTy), Hydraulics Sub-System (SSH), Electrical Sub-System (SSEl), Transmission Sub-System (SSTr), and Mechanical Sub-System (SSM). The data collected regarding Time Between Failures (TBF), Time To Repair (TTR), and Failure Frequency (FF) for each sub-system of the LHDs is presented in Table 1.
Machine ID | Parameter | SSE | SSBr | SSTy | SSH | SSEl | SSTr | SSM |
LH26 | FF (No.) | 29 | 28 | 24 | 20 | 24 | 18 | 33 |
LH26 | TBF (Hrs) | 563.7 | 584.4 | 682.8 | 882 | 682.7 | 913.8 | 492.2 |
LH26 | TTR (Hrs) | 170.8 | 184.2 | 210.4 | 254.6 | 210.3 | 310.4 | 164.2 |
LH27 | FF (No.) | 14 | 18 | 16 | 17 | 32 | 12 | 35 |
LH27 | TBF (Hrs) | 1199 | 929.5 | 1047.3 | 984.7 | 517.2 | 1401.4 | 470.6 |
LH27 | TTR (Hrs) | 353 | 286 | 323 | 304 | 168 | 389 | 142 |
LH28 | FF (No.) | 34 | 16 | 18 | 24 | 32 | 44 | 28 |
LH28 | TBF (Hrs) | 489.4 | 1047.2 | 930.1 | 696.7 | 516.5 | 373.5 | 596.3 |
LH28 | TTR (Hrs) | 158 | 326 | 302 | 174 | 159 | 140 | 164 |
LH29 | FF (No.) | 32 | 16 | 18 | 17 | 25 | 20 | 36 |
LH29 | TBF (Hrs) | 520.8 | 1047 | 922.8 | 983.5 | 665.1 | 836.6 | 461.5 |
LH29 | TTR (Hrs) | 160 | 326 | 300 | 322 | 175 | 198 | 148 |
LH30 | FF (No.) | 16 | 11 | 10 | 12 | 9 | 10 | 16 |
LH30 | TBF (Hrs) | 1009.6 | 1477.2 | 1625.7 | 1352.5 | 1805.6 | 1622.5 | 1012 |
LH30 | TTR (Hrs) | 300 | 389 | 412 | 367 | 446 | 408 | 302 |
LH31 | FF (No.) | 28 | 20 | 12 | 20 | 10 | 10 | 24 |
LH31 | TBF (Hrs) | 576.5 | 813.6 | 1357.3 | 813.9 | 1628.4 | 1630.1 | 677.7 |
LH31 | TTR (Hrs) | 184 | 298 | 346 | 299 | 402 | 410 | 208 |
4. Results and Discussion
Reliability, $R(t)$, of equipment is the probability that the equipment performs its intended task in its work environment under specified conditions for a time $t$, and is given by Eq. (1):
$$R(t) = \int_{t}^{\infty} f(x)\,\mathrm{d}x \qquad (1)$$
where $f(t)$ denotes the probability density function (PDF) of the random variable, time to failure. The failure rate, $\lambda(t)$, at which failures occur in the interval $t_1$ to $t_2$ is defined as the probability that a failure occurs in the interval, given that it has not occurred before $t_1$ (the beginning of the interval), divided by the interval length, as expressed in Eq. (2) [19]:
$$\lambda(t) = \frac{R(t_1) - R(t_2)}{(t_2 - t_1)\,R(t_1)} \qquad (2)$$
The failure rate can therefore be interpreted as the probability of failure per unit time of a component that is still operating satisfactorily. A statistical test was used to identify the appropriate reliability modelling approach, namely the Homogeneous Poisson Process (HPP), the Non-Homogeneous Poisson Process (NHPP), or the Renewal Process (RP), and this can be performed using Eq. (3):
$$U = 2\sum_{i=1}^{n-1} \ln\left(\frac{T_n}{T_i}\right) \qquad (3)$$
where $n$ denotes the cumulative failure frequency, $T_n$ is the $n^{\text{th}}$ breakdown time, and $T_i$ is the $i^{\text{th}}$ breakdown time. Under the null hypothesis of an HPP, the statistic $U$ follows a chi-squared distribution with $2(n-1)$ degrees of freedom. The data sets are considered independent and identically distributed when the null hypothesis is not rejected at the 5% level of significance [13]. The reliability characteristics of equipment can be determined by analysing the TBF and TTR. The step-by-step procedure for reliability modeling of a repairable system is given in Figure 2. This detailed flowchart helps model the datasets and is used as a basis for failure and repair data analysis.
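The statistic-U calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual form of the statistic, $U = 2\sum_{i=1}^{n-1}\ln(T_n/T_i)$, with $2(n-1)$ degrees of freedom; the function name `statistic_u` and the failure times are hypothetical and are not taken from the study.

```python
import math


def statistic_u(failure_times):
    """Return (U, degrees_of_freedom) for cumulative failure times T_1..T_n.

    Assumes U = 2 * sum_{i=1}^{n-1} ln(T_n / T_i) with 2(n-1) degrees of freedom.
    """
    times = sorted(failure_times)
    n = len(times)
    if n < 2:
        raise ValueError("at least two failure times are required")
    if times[0] <= 0:
        raise ValueError("failure times must be positive")
    t_n = times[-1]
    # Sum over the first n-1 failure times; the n-th term would be ln(1) = 0.
    u = 2.0 * sum(math.log(t_n / t_i) for t_i in times[:-1])
    return u, 2 * (n - 1)


# Hypothetical cumulative failure times in hours (illustrative only).
example_times = [100.0, 250.0, 400.0, 600.0, 850.0]
u_value, dof = statistic_u(example_times)
print(f"U = {u_value:.3f}, degrees of freedom = {dof}")
```

The computed value would then be compared with the chi-squared critical value for the corresponding degrees of freedom at the chosen significance level, taken from statistical tables or a statistics library.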
The trend analysis for the current study was conducted using graphical methods; an analytical approach can also be employed to examine the trends within the datasets. Prior to fitting the data, it is essential to determine whether a trend is present, specifically whether the failure rates of the equipment are increasing, decreasing, or remaining constant [19]. The trend in failure data can be assessed by plotting the Cumulative Time Between Failures (CTBF) against the number of failures. A concave upward line indicates an improving system, while a concave downward line suggests a deteriorating system. A straight line indicates the absence of a trend within the data. The purpose of the serial correlation test is to detect correlation between successive failures. This test is performed by plotting the i-th TBF against the (i-1)-th TBF. The trend analysis between the CTBF and the Cumulative Failure Frequency (CFF), along with the scatter plot for the serial correlation test between the i-th and (i-1)-th TBF datasets for LH26, is illustrated in subgraphs (a) and (b) of Figure 3. From subgraph (a) in Figure 3, the fitted line is linear, although many data points do not lie exactly on it, indicating the absence of a trend in the dataset. Similarly, the scatter plot in subgraph (b) in Figure 3 shows that the data points are randomly distributed, suggesting that there is no correlation between consecutive breakdowns. Consequently, it is concluded that the dataset exhibits independent and identically distributed characteristics. Additionally, both trend and serial correlation tests were performed on the remaining machines (as shown in Figure 4, Figure 5, Figure 6, Figure 7, and Figure 8). The graphical analysis revealed that none of these machines displayed a trend or serial correlation in their failure data.
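The two graphical checks described above have simple numerical counterparts, sketched below as a rough illustration. The helper names `cumulative_tbf` and `lag1_correlation` and the TBF values are hypothetical; the study itself relied on plots, so the use of a Pearson coefficient between consecutive TBF values is an assumption made here for illustration.

```python
import math
from itertools import accumulate


def cumulative_tbf(tbf):
    """Running sum of TBF values, i.e. the CTBF series plotted against failure number."""
    return list(accumulate(tbf))


def lag1_correlation(tbf):
    """Pearson correlation between the (i-1)-th and i-th TBF values."""
    x, y = tbf[:-1], tbf[1:]
    n = len(x)
    if n < 2:
        raise ValueError("at least three TBF values are required")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant series: correlation is undefined, report none
    return cov / math.sqrt(var_x * var_y)


# Hypothetical TBF values in hours (illustrative only).
example_tbf = [120.0, 95.0, 140.0, 110.0, 130.0, 100.0]
print("CTBF:", cumulative_tbf(example_tbf))
print("lag-1 correlation:", round(lag1_correlation(example_tbf), 3))
```

A CTBF series that grows roughly linearly with the failure number, together with a lag-1 coefficient close to zero, would be consistent with the absence of trend and serial correlation.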
The independent and identically distributed characteristics of the data sets can also be assessed analytically using the statistic-U (chi-squared) test of Eq. (3) [3], [9], [14]. The calculated values of the statistic-U test for the system failures in relation to the TBF are detailed in Table 2. The findings indicate that the null hypothesis was not rejected at the 5% significance level for any of the LHDs. Similar outcomes were observed in both the trend and serial correlation tests. Consequently, it can be concluded that the data sets from all systems exhibit independent and identically distributed characteristics.
Machine | Dataset | DOF | Calculated Statistic U | Rejection of Null Hypothesis at 5% Level of Significance | Status |
LH26 | TBF | 350 | 4.91 | 4.91 < 37.65 | Not Rejected |
LH27 | TBF | 286 | 6.60 | 6.60 < 31.41 | Not Rejected |
LH28 | TBF | 390 | 4.77 | 4.77 < 40.11 | Not Rejected |
LH29 | TBF | 326 | 4.76 | 4.76 < 35.17 | Not Rejected |
LH30 | TBF | 166 | 3.72 | 3.72 < 19.68 | Not Rejected |
LH31 | TBF | 246 | 6.98 | 6.98 < 27.59 | Not Rejected |
Four statistical distributions, namely the exponential, 1-parameter Weibull, 2-parameter Weibull, and 3-parameter Weibull distributions, were examined to determine the best-fit model for all LHDs. The selected distributions are suitable functions for modeling system uncertainties. The Maximum Likelihood Estimate (MLE) method is predominantly employed to estimate the parameters of the best-fitting distribution [8]. In this research, the software 'Isograph Reliability Workbench 13.0' was utilized to conduct the Kolmogorov-Smirnov (K-S) and MLE tests. Following the extraction of significance level ($\alpha$) values from the Cumulative Distribution Function (CDF) curves, the distributions were compared, with the distribution exhibiting the lowest K-S statistic ($D_{max}$) deemed the best fit. The results indicated that the data sets for all systems were most appropriately modeled by the Weibull 2-parameter and 3-parameter distributions. The MLEs of the scale parameter ($\eta$), shape parameter ($\beta$), and location parameter ($\gamma$) were recorded for the best-fitting distribution model. These MLE values served as input data for subsequent analyses. The metrics derived from the K-S test were utilized as input data sets, along with the Mean Time Between Failure (MTBF) of the machine, to develop the ANN reliability model. Additionally, when predicting reliability-based Preventive Maintenance (PM) time intervals, the results from the K-S tests were incorporated as input data alongside the reliability function. The recorded outcomes of the K-S goodness-of-fit test are presented in Table 3. Based on the goodness of fit, the statistical probability distribution parameters were estimated, as shown in Table 4.
Machine | Dmax, Exponential | Dmax, Weibull 1P | Dmax, Weibull 2P | Dmax, Weibull 3P | Dmax, Weibayes | Best Fit Model |
LH26 | 0.2314 | 0.2075 | 0.0804 | 0.0526 | 0.2314 | Weibull 3P |
LH27 | 0.1857 | 0.1647 | 0.0691 | 0.0916 | 0.1857 | Weibull 2P |
LH28 | 0.1889 | 0.1674 | 0.0635 | 0.0367 | 0.1889 | Weibull 3P |
LH29 | 0.2052 | 0.1834 | 0.0543 | 0.0553 | 0.2052 | Weibull 2P |
LH30 | 0.2331 | 0.2097 | 0.0617 | 0.0578 | 0.2331 | Weibull 3P |
LH31 | 0.1739 | 0.1543 | 0.0899 | 0.0727 | 0.1739 | Weibull 3P |
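The selection rule behind the "Best Fit Model" column can be illustrated with a short Python sketch that picks, for each machine, the candidate distribution with the smallest K-S statistic. The values are copied from Table 3; the dictionary layout and the helper name `best_fit` are assumptions introduced only for this illustration.

```python
# K-S statistics (Dmax) per machine and candidate distribution, copied from Table 3.
DMAX = {
    "LH26": {"Exponential": 0.2314, "Weibull 1P": 0.2075, "Weibull 2P": 0.0804,
             "Weibull 3P": 0.0526, "Weibayes": 0.2314},
    "LH27": {"Exponential": 0.1857, "Weibull 1P": 0.1647, "Weibull 2P": 0.0691,
             "Weibull 3P": 0.0916, "Weibayes": 0.1857},
    "LH28": {"Exponential": 0.1889, "Weibull 1P": 0.1674, "Weibull 2P": 0.0635,
             "Weibull 3P": 0.0367, "Weibayes": 0.1889},
    "LH29": {"Exponential": 0.2052, "Weibull 1P": 0.1834, "Weibull 2P": 0.0543,
             "Weibull 3P": 0.0553, "Weibayes": 0.2052},
    "LH30": {"Exponential": 0.2331, "Weibull 1P": 0.2097, "Weibull 2P": 0.0617,
             "Weibull 3P": 0.0578, "Weibayes": 0.2331},
    "LH31": {"Exponential": 0.1739, "Weibull 1P": 0.1543, "Weibull 2P": 0.0899,
             "Weibull 3P": 0.0727, "Weibayes": 0.1739},
}


def best_fit(dmax_by_model):
    """Return the name of the distribution with the smallest K-S statistic."""
    return min(dmax_by_model, key=dmax_by_model.get)


for machine, stats in DMAX.items():
    print(machine, "->", best_fit(stats))
```

Applied to the tabulated values, this rule selects the same model as the "Best Fit Model" column for all six machines.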
Machine | Best Fit Model | MLE of η (Scale/life) | MLE of β (Shape) | MLE of γ (Location) |
LH26 | Weibull 3P | 287.1 | 1.39 | 438.3 |
LH27 | Weibull 2P | 1073 | 2.493 | 0 |
LH28 | Weibull 3P | 411.4 | 1.295 | 308 |
LH29 | Weibull 2P | 870 | 3.267 | 0 |
LH30 | Weibull 3P | 2317 | 7.015 | -760.2 |
LH31 | Weibull 3P | 672.6 | 1.106 | 479.4 |
The data sets devoid of any trends are subjected to further analysis to determine the specific characteristics of the breakdown time approximations for LHDs. Probability distribution functions are typically employed to characterize the nature of the data points. A selection of the software's recommended outcomes for the most suitable distributions, including CDF plots, is illustrated in Figure 9 and Figure 10. An examination of the CDF plots indicates that the best fit for the data sets is given by the Weibull 2-parameter and Weibull 3-parameter distributions. Consequently, the reliability of each individual subsystem was computed using Eq. (6) [19], [20], which corresponds to a 3-parameter Weibull distribution function:
$$R(t) = \exp\left[-\left(\frac{t-\gamma}{\eta}\right)^{\beta}\right] \qquad (6)$$
In the case of the 2-parameter Weibull distribution function, the location parameter $\gamma$ is set to zero. The calculated percentages of unreliability, $Q = 1 - R$, and reliability, $R$, for each subsystem of the LHD in relation to TBF are presented in Table 5.
Machine ID | Parameter | SSE | SSBr | SSTy | SSH | SSEl | SSTr | SSM |
LH26 (3P W) | TBF (Hrs) | 563.7 | 584.4 | 682.8 | 882 | 682.7 | 913.8 | 492.2 |
LH26 (3P W) | Q (%) | 27.24 | 32.50 | 55.20 | 79.05 | 55.18 | 79.74 | 9.38 |
LH26 (3P W) | R (%) | 72.75 | 67.49 | 44.79 | 20.95 | 44.81 | 20.26 | 90.61 |
LH27 (2P W) | TBF (Hrs) | 1199 | 929.5 | 1047.3 | 984.7 | 517.2 | 1401.4 | 470.6 |
LH27 (2P W) | Q (%) | 73.33 | 50.38 | 61.07 | 55.47 | 15.00 | 75.77 | 12.06 |
LH27 (2P W) | R (%) | 26.66 | 49.61 | 38.92 | 44.52 | 84.99 | 24.23 | 87.93 |
LH28 (3P W) | TBF (Hrs) | 489.4 | 1047.2 | 930.1 | 696.7 | 516.5 | 373.5 | 596.3 |
LH28 (3P W) | Q (%) | 29.48 | 78.21 | 71.93 | 60.64 | 34.15 | 9.03 | 46.97 |
LH28 (3P W) | R (%) | 70.51 | 21.78 | 28.06 | 39.35 | 65.84 | 90.96 | 53.02 |
LH29 (2P W) | TBF (Hrs) | 520.8 | 1047 | 922.8 | 983.5 | 665.1 | 836.6 | 461.5 |
LH29 (2P W) | Q (%) | 17.11 | 74.00 | 60.29 | 67.55 | 34.09 | 58.57 | 11.88 |
LH29 (2P W) | R (%) | 82.88 | 25.99 | 39.70 | 32.44 | 65.90 | 41.42 | 88.11 |
LH30 (3P W) | TBF (Hrs) | 1009.6 | 1477.2 | 1625.7 | 1352.5 | 1805.6 | 1622.5 | 1012 |
LH30 (3P W) | Q (%) | 14.00 | 54.24 | 60.70 | 40.72 | 77.06 | 60.36 | 14.13 |
LH30 (3P W) | R (%) | 85.99 | 45.75 | 39.29 | 59.28 | 22.93 | 39.63 | 85.86 |
LH31 (3P W) | TBF (Hrs) | 576.5 | 813.6 | 1357.3 | 813.9 | 1628.4 | 1630.1 | 677.7 |
LH31 (3P W) | Q (%) | 11.14 | 37.01 | 63.90 | 37.04 | 73.61 | 73.66 | 22.87 |
LH31 (3P W) | R (%) | 88.85 | 62.98 | 36.09 | 62.95 | 26.38 | 26.33 | 77.12 |
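As a worked illustration of the sub-system reliability calculation, the sketch below evaluates the standard 3-parameter Weibull reliability function, $R(t) = \exp[-((t-\gamma)/\eta)^{\beta}]$, with $\gamma = 0$ for the 2-parameter case. The parameter values are taken from Table 4 and the TBF values from Table 1; the function name `weibull_reliability` is hypothetical, and treating $R = 1$ for $t \le \gamma$ is an assumption about the failure-free period.

```python
import math


def weibull_reliability(t, eta, beta, gamma=0.0):
    """Reliability R(t) of a 3-parameter Weibull model (gamma = 0 gives the 2P form)."""
    if t <= gamma:
        return 1.0  # assumed: no failures before the location parameter
    return math.exp(-(((t - gamma) / eta) ** beta))


# LH26 engine sub-system (SSE): Weibull 3P, eta = 287.1, beta = 1.39, gamma = 438.3
r_lh26_sse = weibull_reliability(563.7, eta=287.1, beta=1.39, gamma=438.3)

# LH27 engine sub-system (SSE): Weibull 2P, eta = 1073, beta = 2.493
r_lh27_sse = weibull_reliability(1199.0, eta=1073.0, beta=2.493)

for name, r in [("LH26 SSE", r_lh26_sse), ("LH27 SSE", r_lh27_sse)]:
    print(f"{name}: R = {100 * r:.2f} %, Q = {100 * (1 - r):.2f} %")
```

The resulting values agree with the corresponding Table 5 entries (72.75% and 26.66%) to within a few tenths of a percentage point; the small differences are presumably due to rounding of the tabulated parameters.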
In the present study, all the sub-systems or sub-assemblies of the LHD were assumed to be connected in a series configuration. Hence, the percentage of overall system reliability ($R_s$) has been estimated with the series configuration calculation (Table 6), and the estimated results of $R_s$ are shown graphically in Figure 11. $R_s$ was computed using Eq. (4) [7]:
$$R_s = \prod_{i=1}^{n} R_i \qquad (4)$$
where $R_s$ denotes the overall system reliability, $i = 1, 2, 3, \ldots, n$ indexes the sub-systems, and $R_i$ indicates the reliability of the $i$-th sub-system.
Equipment | SSE (%) | SSBr (%) | SSTy (%) | SSH (%) | SSEl (%) | SSTr (%) | SSM (%) | System Reliability, Rs (%) |
LH26 | 72.75 | 67.49 | 44.79 | 20.95 | 44.81 | 20.26 | 90.61 | 59.98 |
LH27 | 26.66 | 49.61 | 38.92 | 44.52 | 84.99 | 24.23 | 87.93 | 68.65 |
LH28 | 70.51 | 21.78 | 28.06 | 39.35 | 65.84 | 90.96 | 53.02 | 65.63 |
LH29 | 82.88 | 25.99 | 39.70 | 32.44 | 65.90 | 41.42 | 88.11 | 69.44 |
LH30 | 85.99 | 45.75 | 39.29 | 59.28 | 22.93 | 39.63 | 85.86 | 69.41 |
LH31 | 88.85 | 62.98 | 36.09 | 62.95 | 26.38 | 26.33 | 77.12 | 67.24 |
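The series-configuration rule can be sketched generically as the product of the sub-system reliabilities, $R_s = \prod_i R_i$. The example below is only an illustration of that rule: the function name `series_reliability` and the three input values are hypothetical and are not taken from the study's tables.

```python
import math


def series_reliability(subsystem_reliabilities):
    """Reliability of a series system: every sub-system must survive."""
    values = list(subsystem_reliabilities)
    if not values:
        raise ValueError("at least one sub-system is required")
    for r in values:
        if not 0.0 <= r <= 1.0:
            raise ValueError("reliabilities must be fractions between 0 and 1")
    return math.prod(values)


# Hypothetical sub-system reliabilities expressed as fractions (illustrative only).
example = [0.90, 0.80, 0.95]
print(f"Rs = {100 * series_reliability(example):.1f} %")
```

Because every factor is at most one, the series reliability can never exceed that of the weakest sub-system, which is why the least reliable sub-systems are the natural targets for corrective maintenance.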
In this study, the reliability of six LHDs was forecasted using an ANN tool, as illustrated in Figure 12. To determine the reliability output response, four input parameters were considered: MTBF, scale parameter ($\eta$), shape parameter ($\beta$), and location parameter ($\gamma$). Consequently, the three predicted values obtained from the output layer of the MATLAB-based ANN model were documented for further discussion.
The reliability model based on ANN was established through the application of MTBF and Mean Time To Repair (MTTR) metrics. The network was trained with the Levenberg-Marquardt training function (TRAINLM), and Gradient Descent with Momentum (LEARNGDM) was selected as the learning function. For the hidden layer, the TANSIG transfer function was utilized, while the output layer employed a linear function (PURELIN) [8], [20]. The model was tested with neuron counts ranging from 4 to 11 and was trained for up to 1000 iterations to achieve optimal results. The optimal value of R² was determined based on the Root Mean Square Error (RMSE), given by Eq. (5):
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2} \qquad (5)$$
where $N$ is the number of samples, $y_i$ is the computed value, and $\hat{y}_i$ is the value predicted by the ANN. The results indicated that the optimal R² value of 0.99815 (Figure 12) corresponded to an RMSE of 0.034731 for LM-10. The R² and RMSE values for each neuron count are presented in Table 7, and the predicted output responses associated with the optimal R² are presented in Table 8.
Sl. No | Number of Neurons | R² | RMSE |
1 | 4 | 0.99287 | 0.64565 |
2 | 5 | 0.98050 | 1.05281 |
3 | 6 | 0.99569 | 0.49692 |
4 | 7 | 0.99283 | 0.63317 |
5 | 8 | 0.98106 | 0.88473 |
6 | 9 | 0.90955 | 0.94432 |
7 | 10 | 0.99815 | 0.03473 |
8 | 11 | 0.97838 | 0.91630 |
Sl. No | Machine ID | MTBF (Hrs) | η | β | γ | Reliability (%) |
1 | LH26 | 4.21 | 287.1 | 1.39 | 438.3 | 62.59 |
2 | LH27 | 4.83 | 1073 | 2.493 | 0 | 60.04 |
3 | LH28 | 4.58 | 411.4 | 1.295 | 308 | 78.28 |
4 | LH29 | 4.91 | 870 | 3.267 | 0 | 58.18 |
5 | LH30 | 4.90 | 2317 | 7.015 | -760.2 | 69.98 |
6 | LH31 | 4.72 | 672.6 | 1.106 | 479.4 | 65.44 |
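The network structure described above (four inputs, one hidden layer with a tanh-type transfer function, and a linear output) can be sketched as a plain forward pass. This is a minimal illustration rather than the MATLAB model: the weights are random placeholders, the input vector is a hypothetical scaled sample, the helper names are invented for this example, and Levenberg-Marquardt training is not shown. MATLAB's TANSIG is mathematically equivalent to the hyperbolic tangent, and PURELIN is the identity function.

```python
import math
import random


def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: tanh hidden layer followed by a linear output neuron."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def random_network(n_inputs=4, n_hidden=10, seed=0):
    """Placeholder (untrained) weights for a 4-10-1 network."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b_hidden = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b_out = rng.uniform(-1, 1)
    return w_hidden, b_hidden, w_out, b_out


# Hypothetical scaled inputs standing in for (MTBF, eta, beta, gamma).
sample = [0.42, 0.12, 0.20, 0.60]
network = random_network()
print("untrained network output:", forward(sample, *network))
```

In practice the inputs would be normalised to a common range before training, and the weights would be fitted to the estimated reliability values rather than drawn at random.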
Following the development and simulation of the reliability model, the computed results were validated against the values predicted by the MATLAB-based ANN, as shown in Table 9. The variation between the estimated results and the ANN predicted values is displayed graphically in Figure 13. The comparison revealed that the agreement between the computed and predicted values of the performance characteristics was satisfactory at the highest R² value. Utilizing this developed ANN model, six datasets of TBF from all the LHD systems in the simulation were analyzed to assess the percentage of system reliability. This model, comprising four hidden layers and a minimal number of neurons in the connection weights, demonstrates a strong learning relationship between the predicted and analytically estimated values. The reliability percentage test phase indicates that the ANN architecture is proficient in generalizing the relationship between input and output variables, yielding reasonably accurate predictions, with only a slight percentage variation error between the computed and predicted results (Figure 13).
Sl. No | Machine | Estimated Reliability from Isograph Reliability Workbench 13.0 | Predicted Reliability from MATLAB Based ANN |
1 | LH26 | 64.78 | 62.59 |
2 | LH27 | 60.14 | 60.04 |
3 | LH28 | 78.28 | 78.28 |
4 | LH29 | 58.18 | 58.18 |
5 | LH30 | 69.98 | 69.98 |
6 | LH31 | 65.44 | 65.44 |
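The agreement between the two columns of Table 9 can be quantified with the same error measures used for model selection. The sketch below computes the RMSE and a coefficient of determination for the six tabulated pairs; defining R² as $1 - SS_{res}/SS_{tot}$ is an assumption made here, since the exact definition used in the study is not stated, and the function names are hypothetical. These figures describe only the six tabulated values and are not the training metrics reported for the LM-10 network.

```python
import math

# Values copied from Table 9 (percent reliability).
estimated = [64.78, 60.14, 78.28, 58.18, 69.98, 65.44]
predicted = [62.59, 60.04, 78.28, 58.18, 69.98, 65.44]


def rmse(actual, forecast):
    """Root mean square error between two equally long sequences."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def r_squared(actual, forecast):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


print(f"RMSE = {rmse(estimated, predicted):.3f} percentage points")
print(f"R^2  = {r_squared(estimated, predicted):.4f}")
```

Almost all of the residual error comes from LH26, the only machine whose predicted value differs from the estimate by more than 0.1 percentage points.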
5. Conclusions
Reliability analysis serves as a crucial indicator for assessing the performance of a system. The enhancement of continuous equipment operation can be achieved through the strategic planning and organization of maintenance practices. Such investigations will facilitate the forecasting of necessary managerial actions or control measures, including potential design modifications and component replacements, to ensure the desired levels of availability and utilization. The following conclusions are drawn from the results obtained:
• The highest system reliability recorded (Table 6) is 69.44% (LH29), while the lowest is 59.98% (LH26). The irregular occurrence of frequent failures, coupled with reduced TBF, contributes to a significant decline in reliability percentages. Therefore, it is advisable to maintain less efficient equipment at an adequate level by implementing optimal maintenance practices. This can be further improved through strict adherence to preventive maintenance schedules, effective management of equipment and personnel, a skilled operating team, and efficient machinery management. A notable increase in availability and utilization can be achieved through shift overlapping.
• The developed ANN reliability model identified an optimal size for the hidden layer. According to the statistical error analysis, the Levenberg-Marquardt (LM) learning algorithm with 10 neurons gave the best results for the reliability function. For the ANN reliability model, an R² value of 0.99815 was achieved at an RMSE of 0.034731 for LM-10. The predicted reliability values, characterized by the highest R², yielded satisfactory results. A slight percentage variation error was observed between the computed and predicted outcomes. It was concluded that ANN is a suitable technique for validating the developed network model.
The data used to support the research findings are available from the corresponding author upon request.
The authors declare no conflict of interest.