Applications of Machine Learning in Aircraft Maintenance
Abstract:
Aircraft maintenance is an expansive multidisciplinary field that entails the robust design and optimization of extensive maintenance operations and procedures. It encompasses fault detection, identification, and rectification; the overhaul, repair, or modification of aircraft systems, subsystems, and components; and the scheduling of maintenance operations in compliance with aviation standards, all with the aim of predicting, pre-empting, and preventing failures and thereby ensuring the continued reliability of aircraft. Advances in Big Data Analytics (BDA) and artificial intelligence techniques have revolutionized predictive maintenance operations. Predictive maintenance is making big strides in the aerospace sector, accompanied by a variety of prognostic health management options. Artificial intelligence algorithms have recently been applied extensively to optimize aircraft maintenance systems and operations. Several researchers have proposed, analysed, and investigated applications of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) based data analytics for predictive maintenance of aircraft systems, subsystems, and components. This paper provides a comprehensive review of ML techniques such as Multilayer Perceptron (MLP), Logistic Regression, Random Forest (RF), Artificial Neural Network (ANN), Support Vector Regression (SVR), Linear Regression (LR), and other common ML techniques, covering their present implementation and potential future applications in aircraft maintenance.
1. Introduction
Predictive maintenance (PdM) involves forecasting future maintenance requirements using time-based data from in-service assets such as airplanes. One of the main goals of this method is to forecast accurately when a component should be repaired or replaced. If a component is replaced too early, part of its useful life is wasted; if it is replaced too late, unforeseen failures can occur, lowering asset availability. As a result, improving the accuracy of estimated component lifetime is a continuous priority. Machine learning (ML) is a subfield of artificial intelligence (AI) in which learning algorithms analyse enormous datasets to discover complex patterns. Although the basic architectures and ideas of ML were devised decades ago, the data volumes and computing power required to make it a reality did not exist until recently. The benefits of ML applications include reduced maintenance cost, fewer repair stops, fewer machine faults, longer spare-part life, reduced inventory, enhanced operator safety, increased production, repair verification, sensor calibration, fuel tank quantity evaluation, icing detection, and an increase in overall profit, among others [1], [2], [3].
For most of its history, aerospace has relied on "preventative" maintenance, which uses simple analytical expressions that avoid attempting to predict failures in favour of erring on the side of caution and replacing parts on a systematic schedule, using equipment sensor readings as a guideline marker of system health. Predictive machine learning algorithms, on the other hand, can account for complicated underlying relationships by using knowledge embedded in previously untapped datasets. As a consequence, such systems can assist in identifying failure patterns in components that would otherwise be difficult to discover.
To begin with, having a vast volume of data is not sufficient on its own. Developing a machine learning model requires defining all of the parameters that potentially influence equipment behaviour. This can be achieved through cooperation between data scientists and experienced field engineers, together with a willingness to understand each other's environment and perspective.
Second, data must be collected and "cleaned" to focus on the variables that affect equipment performance, such as operating conditions, temperature exposure, and previous repairs. However, the data do not need to be flawless; it has been found that freely available data can be used to begin training a machine learning model and improve prediction accuracy.
Finally, the appropriate machine learning model must be chosen and developed for the task. Models tend to become harder to comprehend as they become more effective. A simple "linear" machine learning model will produce outputs that are simpler to interpret (though not as accurate), whereas complicated "neural networks" yield higher accuracy, but the algorithms themselves are much harder to comprehend.
2. Aircraft Maintenance
Apart from unexpected repairs, aircraft maintenance is performed in a sequence of progressively more thorough checks. The frequency of these checks depends on the number of flying hours and take-off and landing cycles, and the checks can be performed at any location with the necessary equipment. Since each aircraft type has its own inventory requirements, integrating facilities for various fleets yields only small savings. Several companies have developed maintenance practices that require periodic checks at least every four days to comply with Federal Aviation Administration (FAA) requirements in the United States. The FAA requires each aircraft to go through four main types of inspections, which vary in scope, timing, and frequency [4].
1. A Type Checks
The FAA requires the first main check (Type A) every 65 flight hours, or approximately once a week. During Type A inspections, critical systems such as engines, landing gear, and control surfaces at the cockpit are inspected [5].
2. B Type Checks
Every 300-600 flying hours, a Type B check is carried out, which includes a comprehensive visual inspection together with greasing of all moving parts such as ailerons and horizontal stabilizers [5].
3. C&D Type Checks
The Type C and D primary inspections are performed about once every four years and require the aircraft to be out of service for at least a month. Type C and D inspections do not have to be considered in maintenance scheduling, both because they occur at relatively long intervals and because of the dynamic nature of the industry.
The airlines' main focus is fulfilling the Type A and B inspection and maintenance criteria within their chosen 4-day inspection and maintenance program. Inspections and repairs are done at night except under exceptional conditions [5].
The problem of aircraft maintenance management is highly complicated, encompassing a variety of factors that can be divided into three major sub-problems: (1) project scheduling and resource allocation for scheduled and unscheduled tasks, (2) personnel capacity planning, and (3) spare parts forecasting and stock control [6]. In this sense, capacity planning refers to the practice by which Maintenance Repair Overhaul (MRO) businesses allocate their workforce at a given point in time to meet future demand. Dijkstra et al. [7] reported their work at KLM, the Royal Dutch Airlines, on designing a Decision Support System (DSS) to solve the capacity planning problem. Because the problem is NP-hard, the optimization model is formulated as an integer program and solved with an approximation approach based on Lagrangian relaxation. Data related to labour and manpower are stored in a database module, a methodology assesses potential outcomes in terms of scenarios, and the DSS is wrapped in a graphical user interface (GUI). The methodology, which makes use of optimization models, contains routines for estimating labour, optimizing workforce size, and organizing and evaluating the quality of the computed labour [8]. The evaluation routine aims to maximize the total amount of work completed with a workforce of given size and composition, whereas the optimization routine seeks the smallest number of maintenance engineers necessary to finish all jobs. Yan et al. [9] published an airplane maintenance capacity planning model that incorporates different maintenance certificates and flexible management strategies. To increase the flexibility of maintenance schedules, Yan et al. [9] suggested three flexible strategies:
(1) flexible shifts, which allow maintenance management to schedule the best shift patterns and their durations; (2) flexible teams (also known as squadrons), which allow companies to modify the number of team members supplied; (3) flexible working hours, which enable the firm to set differing numbers of working hours. The objective of the problem, which is modelled as a mixed-integer program, is to minimize overall maintenance manpower (man-hours) while still fulfilling demand.
Regattieri et al. [10] highlighted two methods for selecting aircraft spare parts: (1) selection based on the operating experience of an enterprise; and (2) the use of prediction methodologies. They compared the performance of twenty prediction methodologies in forecasting irregular demand for aircraft spare parts on a dataset from the airline Alitalia, and observed that the Croston method, the exponentially weighted moving average, and weighted moving average models outperformed the rest when evaluated by mean absolute deviation (MAD). As evidenced by several studies [10], [11], [12], the prediction of aircraft spare parts demand has mostly relied on statistical estimation techniques such as Croston's method, single exponential smoothing (SES), and the simple moving average (SMA). Neural networks (NNs) have been used to predict irregular and lumpy demand patterns [13], [14], although not explicitly in aviation. To the authors' knowledge, optimization methodologies have not previously been used in combination with spare parts predictions. They are, nevertheless, frequently used to solve stock management problems. To address a production/inventory problem with non-stationary supply uncertainty and the availability of near-term supply information (called advance supply information), Atasoy et al. [15] proposed a dynamic programming technique. Wang et al. [16] combined dynamic programming with an enumeration coding approach to optimize the spare parts inventory under a regular preventive maintenance period. Kazemi Zanjani and Nourelfath [17] addressed the management of spare parts logistics and operations for maintenance service providers.
Kazemi Zanjani and Nourelfath [17] first used a theoretical mathematical programming model to determine the best combination of maintenance tasks to carry out and the number of spare part orders that minimizes the costs of inventory, procurement, and late delivery, taking into account the lead time for supplying spare parts. Demand uncertainty for spare parts was represented by a non-stationary stochastic model, which was then reformulated as a multi-stage stochastic program with recourse. Gu et al. [18] presented two non-linear models for deciding the best procurement timing and quantity while keeping costs to a minimum.
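The intermittent-demand forecasting discussed above can be illustrated with a minimal sketch of Croston's method together with the mean absolute deviation (MAD). This is not the implementation used in the cited studies: the smoothing constant alpha = 0.1, the initialisation from the first non-zero demand, and the function names are all assumptions made for illustration.

```python
def croston(demand, alpha=0.1):
    """One-step-ahead Croston forecast of demand per period.

    Non-zero demand sizes and the intervals between them are smoothed
    separately; the forecast is their ratio. alpha is an assumed value.
    """
    size = None      # smoothed size of non-zero demands
    interval = None  # smoothed number of periods between non-zero demands
    gap = 1          # periods elapsed since the last non-zero demand
    for d in demand:
        if d > 0:
            if size is None:
                # Assumed initialisation: first observed size and gap.
                size, interval = float(d), float(gap)
            else:
                size += alpha * (d - size)
                interval += alpha * (gap - interval)
            gap = 1
        else:
            gap += 1
    return 0.0 if size is None else size / interval


def mean_absolute_deviation(actual, forecast):
    """Average absolute forecast error over paired observations."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical monthly demand for one spare part: 4 units every third month.
history = [0, 0, 4, 0, 0, 4, 0, 0, 4]
per_period_forecast = croston(history)
```

For the hypothetical series above, the smoothed demand size is 4 units and the smoothed interval is 3 periods, so the forecast is 4/3 units per period.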
Huang et al. [19] used a genetic algorithm (GA) to allocate human resources in an airplane maintenance problem. They started with a specified sequence of maintenance tasks and then determined the distribution of resources that keeps the maintenance duration as short as possible. Safaei et al. [20] focused on scheduling maintenance tasks for a fleet of military aircraft. The problem was formulated as a mixed-integer programming (MIP) model and solved using a branch-and-bound approach. Maximizing fleet availability was the key objective of the model, while the availability of skilled labour was the main constraint.
3. Recent Research
Nuhu et al. [21] investigated the applicability and effectiveness of ML prediction models for fault diagnosis in smart manufacturing. Azevedo et al. [22] developed a web-based application for predicting the Remaining Useful Lifetime (RUL) of aircraft parts. The application models a Prognostics and Health Management (PHM) system that forecasts the RUL of specific aircraft parts using three proposed methods: an extrapolation-based method, a similarity-based method, and a neural network-based method. All three were implemented in MATLAB 2018a. Nguyen et al. [23] designed a dynamic PdM framework based on long short-term memory (LSTM), a deep learning architecture, to examine the impact of imperfect prognostics on maintenance decisions in a PHM system. Sarkon et al. [24] reviewed ML applications in additive manufacturing, from design to manufacturing and property control. Wang et al. [25] proposed a similarity-based prognostics approach for estimating the RUL of engineered systems: a data-driven RUL estimation methodology that begins with damage assessment and then employs similarity-based matching to compute the RUL. The method consists of two procedures: RUL estimation and performance evaluation.
Capodieci et al. [26] employed various machine learning algorithms to evaluate spare parts demand for aircraft engines. They developed iterative AI-based algorithms to outline the plan for engine removal and maintenance, optimize maintenance costs and engine availability at the customer, and obtain an acquisition plan for parts that is integrated with intervention planning and execution of the maintenance scheme. To this end, machine learning was applied to a workshop dataset to optimize the quantity, cost, and lead time of warehouse spare parts. The dataset comprises information such as repair claims, engine operating hours, forensic evidence, and general information on processed spare parts for a given engine type, spanning several years and fleets. Three machine learning classifiers (Naive Bayes, Logistic Regression, and Random Forest) were compared, with the confusion matrix used to assess the overall results. Naive Bayes and Logistic Regression performed best overall, with an accuracy of roughly 80%, indicating that the models are generally accurate.
Yan and Zhou [27] used term frequency-inverse document frequency (TF-IDF) and random forest to develop a predictive model that forecasts high-priority defects in advance so that preventive maintenance can be carried out. The model was built on historical data from airplane maintenance systems. The prediction of faults of varying importance was formulated as a binary classification problem. Counts of the different defects that occurred in previous flights were used as raw data, the TF-IDF algorithm was used to extract features from the raw data, and the RF method was used to classify defects by priority. Performance on the training dataset was measured using the Receiver Operating Characteristic (ROC) curve. Matthew et al. [28] performed a comparative analysis of existing machine learning techniques for forecasting the RUL of an aircraft's turbofan engine. The techniques were the K-Means algorithm, Linear Regression, Random Forest, Decision Tree, Support Vector Machine (SVM), K-Nearest Neighbours, Gradient Boosting Method (GBM), and AdaBoost. El Afia and Sarhani [29] studied model selection for aircraft maintenance predictive models using particle swarm optimization and tested the approach on the calculation of an aircraft's RUL, which has an impact on maintenance planning. Autoregressive integrated moving average (ARIMA) models are among the most common predictive models and are extremely useful for forecasting; machine learning techniques such as ANN and SVM are also used to create predictive models. Azevedo et al. [30] developed an online simulation tool that allows users to simulate a PHM system and to build and publish exploratory machine learning models for predicting the RUL of aircraft parts affected by a system fault, using a neural network-based method, a similarity-based method, and an extrapolation-based method. Adhikari et al.
[31] developed a data-driven diagnostics and prognostics framework to perform machine-learning-based predictive maintenance of aircraft. Deng and Santos [32] proposed a lookahead approximate dynamic programming framework (incorporating a hybrid lookahead scheduling policy) that uses deterministic forecasts to make the best decisions for heavy aircraft maintenance and stochastic forecasts to decide light maintenance. Given the unpredictability of daily aircraft operation and of the elapsed time of maintenance checks, their work aimed to reduce the wasted interval between maintenance checks. Celikmih et al. [33] used machine learning algorithms to develop a feature selection and data elimination model for forecasting aircraft equipment failures. Using aircraft maintenance and failure data collected over a period of time, the model first separated important parameters from ineffective ones. The K-means algorithm was then applied in a second step to remove inconsistent data. Multilayer Perceptron (MLP), which is an Artificial Neural Network (ANN), together with Support Vector Regression (SVR) and Linear Regression (LR), was applied to the equipment maintenance dataset to evaluate the performance of the model, using the Correlation Coefficient (CC), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE) as performance metrics. The results showed that the hybrid data preparation model is effective at forecasting the equipment failure rate. Ai et al. [34] showed how machine learning algorithms can be used to detect impacts on aircraft composite structures, describing a promising technique for automatically detecting and localizing debris or hail impacts during flight. Acoustic emission (AE) was used as the impact surveillance method. The flowchart of this study is shown in Figure 3.
Rengasamy et al. [35] presented an overview of the deep learning techniques used for aircraft maintenance, repair, and overhaul (MRO). Four architectures were found to have been used for MRO: Deep Autoencoders, Convolutional Neural Networks (CNN), Deep Belief Networks, and Long Short-Term Memory. Paul et al. [36] proposed an MRO system that applies ANN to eliminate deficiencies in existing aircraft MRO systems.
Seale et al. [37] developed novel strategies based on machine learning models to score rotorcraft maintenance logbook event data for reporting purposes. Because the large volume of generated maintenance records has to be inspected manually, only about 10% of the data receives scores from human analysts. Machine learning algorithms have been introduced to address this, but although the computing efficiency of standard classification algorithms has improved significantly, applying them to problems with huge numbers of unique class labels remains difficult. Seale et al. [37] handled this challenge with distributed random forests, using strategic label set segmentation and hierarchical ensemble models. Wade et al. [38] designed two decision-making models, for an engine output gearbox and a turboshaft engine, using data obtained from health and usage monitoring systems (HUMS). A support vector machine (SVM) was used to create decision boundaries on this dataset, and the receiver operating characteristic (ROC) curve was used to analyse the performance of the machine learning method. Dangut et al. [39] proposed a hybrid model for the prognostics of rare aircraft part failures. The model, which combines natural language processing techniques with ensemble learning, was validated on a real log-based dataset from an aircraft central maintenance system. Singh et al. [40] investigated the performance of various machine learning algorithms in the reliability analysis of aircraft engines and proposed a superior approach for predicting RUL. Different classification and regression algorithms were used.
The classification algorithms included an ensemble method and the Random Forest, CatBoost, LightGBM, and XGBoost classifiers; the regression algorithms included Random Forest, Ridge, Lasso, XGBoost, Elastic-net, Support Vector, K-Nearest Neighbour, Decision Tree, AdaBoost, and Linear regression. Andrade et al. [41] improved long-term maintenance check scheduling for aircraft fleets using Reinforcement Learning (RL). Scheduling hangar checks over a given time horizon takes into account the fleet state, repair capacity, and other maintenance constraints. The checks are planned at regular intervals, with the intention of scheduling them as close as possible to their due dates.
Airlines driven by profit have begun to concentrate on finding the best routes that are also feasible to maintain. Ruan et al. [42] investigated the operational aircraft maintenance routing problem (OAMRP) with two goals: first, to offer an Integer Linear Programming (ILP) formulation of the OAMRP that simultaneously considers three key maintenance constraints (the maximum number of take-offs between two successive maintenance checks, the maximum flying hours, and workforce capacity); and second, to design a novel reinforcement learning-based algorithm that provides a fast and efficient solution to the problem. Tests on data received from a prominent Middle Eastern airline showed that the approach produces outstanding solutions for both medium- and large-scale flight schedules. The problem is cast as a Markov Decision Process (MDP), so that a reinforcement learning-based method can be used to find the best routes that are also maintainable. An earlier work by Doğru et al. [43] implemented Mask R-CNN, a modern Convolutional Neural Network architecture, on autonomous drones to assist aircraft maintenance engineers in detecting faults in airplanes. Mask R-CNN [44] was chosen because it detects several objects in an image while producing a segmentation mask for each instance. Doğru et al. [43] built on their earlier work by experimenting with various ways to evaluate prediction and make it more accurate: (1) focusing only on wing images to increase data homogeneity; (2) adding images without dents to balance the original dataset; (3) investigating the potential of augmentation techniques (rotating, flipping, and blurring) to improve model performance; and (4) using a pre-classifier in conjunction with Mask R-CNN. Figure 1 shows the number of articles published per year between 1998 and 2023.
These statistics were obtained from the Scopus database using (TITLE-ABS-KEY ("aircraft maintenance") AND ("machine learning")) as the search query. The number of published articles for the major keywords of this review is shown in Figure 2.


4. Classification of Machine Learning Methods
Similarity-based methods (SBMs) are a generalization of minimal distance (MD) approaches, which are used in a variety of machine learning and pattern recognition algorithms [22], [25]. Similarity, as a machine learning technique, uses a nearest neighbour approach to compare two or more objects by means of distance functions [30], [45].
Different neural network types, such as the artificial neural network (ANN) and the convolutional neural network (CNN), operate according to different principles. A neural network (NN) is an artificial neuronal circuit [22]. An ANN models the connections between neurons as weights between nodes: excitatory connections have a positive weight, while inhibitory connections have a negative weight [29]. Each input is multiplied by its weight and the results are summed; this is termed a linear combination. Lastly, an activation function regulates the amplitude of the output, whose range is typically the interval [0, 1] or the interval [-1, 1] [30]. Figure 4 shows the structure of a neural network. ANNs are used for applications that require training on a dataset, such as predictive modelling and adaptive control. Networks can learn from experience and can therefore draw conclusions from a large and seemingly unconnected set of data [33]. A CNN is a type of NN used for recognising and processing images [35], [43]. An ANN is a massively parallel computing system that consists of a large number of basic processors linked by a large number of interconnections. Rather than following a set of rules established by human experts, ANNs learn the underlying principles from a series of supplied examples, or try to find patterns in a dataset, in order to make predictions. Furthermore, the analytical capability of ANNs stems from the relationships between the network processing units.
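The computation performed by a single node, as described above, can be sketched as a weighted sum followed by an activation function. This is a minimal illustration only: the function name, the choice of the sigmoid and tanh activations, and the input values are assumptions, not taken from the reviewed studies.

```python
import math


def neuron(inputs, weights, bias=0.0, activation="sigmoid"):
    """Output of one artificial neuron.

    The inputs are combined linearly (weighted sum plus bias) and the
    activation function then limits the amplitude of the output.
    """
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    if activation == "sigmoid":
        return 1.0 / (1.0 + math.exp(-s))  # output lies between 0 and 1
    if activation == "tanh":
        return math.tanh(s)                # output lies between -1 and 1
    raise ValueError(f"unknown activation: {activation}")


# Hypothetical example: one excitatory (positive) and one inhibitory
# (negative) connection whose contributions cancel exactly.
balanced = neuron([1.0, 2.0], [0.5, -0.25])
```

In the hypothetical example the weighted sum is zero, so the sigmoid output sits at the midpoint of its range; a larger positive weight would push the output towards 1.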
Extrapolation is a method for estimating the value of a variable beyond the original observation range based on its relationship with another variable [22]. It is similar to interpolation, which provides estimates between known observations, but it is riskier and more likely to produce meaningless results. Extrapolation can also refer to the extension of a method based on the assumption that similar approaches are applicable [30]. More generally, extrapolation is the process of projecting or extending known information into an unknown or previously unexperienced area in order to gain insight into it. In machine learning terms, it estimates a value that lies outside the region covered by the training data [46].
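As a hedged illustration of extrapolation in a prognostics setting, the sketch below fits a straight line to a degrading health indicator and extends it beyond the last observation to estimate when a failure threshold would be reached. The linear trend, the threshold value, the data, and the function names are assumptions made for this example; the cited extrapolation-based methods may use different trend models.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept of y as a linear function of x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolated_rul(cycles, health, threshold):
    """Remaining cycles until the fitted trend crosses the threshold.

    Returns None when the fitted trend is not decreasing, because a
    linear extrapolation then never reaches the failure threshold.
    """
    slope, intercept = fit_line(cycles, health)
    if slope >= 0:
        return None
    failure_cycle = (threshold - intercept) / slope
    return max(0.0, failure_cycle - cycles[-1])


# Hypothetical health index that drops by 0.1 every 10 cycles.
rul = extrapolated_rul([0, 10, 20, 30], [1.0, 0.9, 0.8, 0.7], threshold=0.2)
```

With the hypothetical data the fitted line reaches the threshold at cycle 80, that is 50 cycles after the last observation. The estimate is only as good as the assumption that the observed trend continues, which is exactly the risk of extrapolation noted above.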
Naive Bayes is a classification method built on Bayes' theorem [26]. The Naive Bayes classifier assumes that the presence of a feature in a class has no bearing on the presence of any other feature. The model is easy to develop and is well suited to large data sets. It can also be used to detect and reduce noise in data [47].
The logit model (logistic regression) is a statistical analysis tool used for predictive analytics and modelling. In this approach the dependent variable is categorical: either binary (A or B) or multinomial (A, B, C, or D). Logistic regression helps to understand the relationship between a dependent variable and one or more independent variables by estimating probabilities [26]. Analysing multiple variables with this tool helps to reduce the effect of confounding factors [48].
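A minimal sketch of binary logistic regression with a single independent variable is given below. It estimates a probability by passing a linear combination through the logistic (sigmoid) function and fits the two coefficients by plain gradient descent on the log-loss. The data (a hypothetical standardised sensor reading with a fault/no-fault label), the learning rate, and the number of epochs are assumptions for illustration only.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit weight w and bias b of P(y = 1 | x) = sigmoid(w * x + b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean log-loss with respect to w and b.
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict_probability(x, w, b):
    """Estimated probability that the label is 1 for input x."""
    return sigmoid(w * x + b)


# Hypothetical standardised readings; label 1 marks a recorded fault.
readings = [-4.0, -3.0, -2.0, -1.0, 1.0, 2.0, 3.0, 4.0]
faults = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = fit_logistic(readings, faults)
```

After fitting, readings above zero receive a fault probability greater than 0.5 and readings below zero a probability smaller than 0.5, which gives the binary (A or B) decision described above.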
A decision tree (DT) is a type of probability tree that aids in making a choice about a particular process. For instance, one could choose whether to produce item A or item B [28]. DTs are a useful approach for complicated decisions that involve many variables and are often ambiguous. Since trees can quickly grow complex, software is usually preferred, even though they can be drawn by hand [61]. A decision tree is a network structure made up of nodes and branches, where the nodes comprise a root node, intermediate nodes, and leaf nodes. The leaf nodes indicate a class label, whereas the intermediate nodes test a feature. DT classifiers have grown in prominence in a variety of fields, including character recognition, medical diagnosis, and voice recognition [49], [50], [51].
Random forest (RF), developed by Breiman et al. [52], is an ensemble learning method that solves complex problems by combining multiple classifiers [26], [37]. An RF consists of many decision trees, which are trained using bagging (bootstrap aggregation), a method used to increase their accuracy [53]. The RF determines its outcome from the predictions of the individual DTs: by averaging their results for regression, or by majority vote for classification, so that the output category is decided collectively by the trees in the ensemble. As the number of trees grows, accuracy improves and the generalization error converges. The RF has a number of other advantages. For example, it can handle high-dimensional data without prior feature selection. Trees are independent of one another throughout the training process, implementation is easy, training is normally fast, and generalization performance is good.
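The two ensemble steps named above, bootstrap resampling of the training data and aggregation of the individual tree outputs, can be sketched as follows. The sketch deliberately omits the training of the decision trees themselves; the record identifiers, the vote labels, and the function names are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement (one 'bag')."""
    return [rng.choice(data) for _ in data]


def majority_vote(predictions):
    """Classification: the label predicted by the most trees.

    Ties are resolved in favour of the label seen first.
    """
    return Counter(predictions).most_common(1)[0][0]


def average(predictions):
    """Regression: the mean of the individual tree predictions."""
    return sum(predictions) / len(predictions)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
records = ["r1", "r2", "r3", "r4", "r5"]  # hypothetical maintenance records
bag = bootstrap_sample(records, rng)      # training set for one tree

# Hypothetical outputs of three trained trees for the same input.
label = majority_vote(["fault", "no_fault", "fault"])
estimate = average([10.0, 20.0, 30.0])
```

Each tree would be trained on its own bag, so some records appear several times in a bag and others not at all; this variation between trees is what the aggregation step averages out.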
Term frequency (TF) indicates how frequently a term appears in a document. In natural language a term is usually a word or phrase, but it can be any text token. Because documents vary in length, a term is likely to appear more often in longer documents than in shorter ones [27], and would therefore appear more important in a longer text. To normalize this effect, the term count is usually divided by the total number of terms in the document. TF measures the frequency of a term t in a document d, while document frequency (DF) is the number of documents, out of the N documents in the set, in which the term t appears [54]. In TF-IDF, the term frequency is weighted by the inverse document frequency (IDF), so that terms occurring in many documents receive a lower weight.
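The quantities defined above can be combined into a TF-IDF score as in the following minimal sketch. It assumes the unsmoothed variant IDF = log(N / DF) and non-empty, already tokenised documents; software libraries often use smoothed variants, so the numbers may differ from theirs. The defect-log tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Per-document TF-IDF scores for tokenised, non-empty documents.

    TF is the term count divided by the document length; IDF is the
    assumed unsmoothed form log(N / DF).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical tokenised defect records.
logs = [
    ["hydraulic", "leak", "leak"],
    ["hydraulic", "pump"],
    ["hydraulic", "sensor", "fault"],
]
weights = tf_idf(logs)
```

In this example "hydraulic" occurs in every record and therefore receives a weight of zero, whereas "leak", which is frequent in one record and absent from the others, receives the highest weight in that record.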
A linear model describes a linear relationship between the input variables x and the output variable y [28]; y is determined by a linear combination of the input variables. With a single input variable the procedure is known as simple linear regression, while with several input variables it is known in the statistics literature as multiple linear regression [4]. The prediction is a linear combination of the input variables weighted by the regression coefficients, and the coefficients are determined using a least squares approach. Linear regression is applied in time series regression techniques that have been employed in predictive maintenance [55].
SVM is a linear model that can be used to solve classification and regression problems [28]; with kernel functions it can also handle nonlinear problems [29]. SVM divides data into classes by drawing a line or hyperplane [56]. It is defined as a statistical learning concept with an adaptive computational learning approach. As a supervised machine learning approach, SVM can perform pattern classification, recognition, and regression analysis. SVMs have frequently been used in the PdM of industrial equipment to determine a certain condition based on the recorded signal [57].
The nearest neighbour rule is a technique for classification. It categorizes a sample according to the category of its closest neighbour [28]. When many samples are available, the probability of error of the nearest neighbour rule is less than twice the optimum (Bayes) error, which is the lowest error achievable by any decision rule. To categorize a test pattern, nearest neighbour based classifiers employ part or all of the patterns in the training set. The fundamental goal of these classifiers is to measure the similarity between each pattern in the training set and the test pattern [58].
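A minimal sketch of the nearest neighbour rule is given below. It assumes Euclidean distance and hypothetical two-dimensional feature vectors (for instance, a normalised vibration level and temperature); other distance functions could be substituted.

```python
import math


def nearest_neighbour(training_set, query):
    """Label of the training pattern closest to the query.

    training_set is a list of (features, label) pairs; Euclidean
    distance is an assumption of this sketch.
    """
    features, label = min(
        training_set, key=lambda item: math.dist(item[0], query)
    )
    return label


# Hypothetical labelled patterns.
patterns = [
    ((0.0, 0.0), "healthy"),
    ((1.0, 1.0), "healthy"),
    ((5.0, 5.0), "faulty"),
]
predicted = nearest_neighbour(patterns, (4.0, 4.5))
```

Because the whole training set is scanned for every query, the rule needs no training phase, but the cost of each prediction grows with the number of stored patterns.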
The K-means clustering technique, a flat clustering algorithm, calculates centroids and iterates until it finds the best centroids. It assumes that the number of clusters is known in advance [28]; the 'K' in K-means is the number of clusters the algorithm is asked to find in the data. The data points are assigned to clusters in such a way that the sum of their squared distances from the cluster centroids is as small as possible [33]. Various distance metrics, such as Euclidean, Manhattan, and Minkowski, can be employed to observe their effect on the clustering results [59].
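The iteration described above (assign each point to its nearest centroid, then recompute each centroid as the mean of its points) can be sketched as follows. The sketch assumes Euclidean distance and takes the starting centroids as an explicit argument; this interface and the sample points are hypothetical. The starting centroids matter, since a poor choice can make the iteration settle on a poor local optimum.

```python
import math


def k_means(points, initial_centroids, max_iter=100):
    """Cluster points around K = len(initial_centroids) centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    k = len(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster;
        # an empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break  # converged: the centroids no longer move
        centroids = updated
    labels = [
        min(range(k), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]
    return centroids, labels


# Hypothetical two-dimensional measurements forming two groups.
data = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (8.0, 8.0), (8.2, 7.8), (7.8, 8.2)]
centres, assignment = k_means(data, [(1.0, 1.0), (8.0, 8.0)])
```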
Gradient boosting is used in classification and regression tasks [28]. It underlies a group of sophisticated machine-learning implementations (XGBoost, LightGBM, and CatBoost) that have had a lot of success in a variety of applications [60]. It is an ensemble-based model that adds new models sequentially, each one improving on the prediction accuracy of the ensemble built so far [61].
A Multilayer Perceptron (MLP) is a fully connected multi-layer ANN. In its simplest form it contains three layers, one of which is hidden [33]. MLP uses backpropagation to update the weight parameters by computing the gradient of the loss function. Training of this ANN starts with random initialization of the weight and bias parameters. Training aims at developing a model that generalizes well to new input data and produces highly accurate predictions [62].
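The following sketch shows one backpropagation step for a small three-layer perceptron (two inputs, three sigmoid hidden units, one linear output) with a squared-error loss. The layer sizes, learning rate, input sample and function names are assumptions chosen only to illustrate random initialization, gradient computation and the weight update.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """Return (hidden activations, output) for one input sample."""
    W1, b1, W2, b2 = params
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    out = sum(w * hi for w, hi in zip(W2, h)) + b2
    return h, out

def loss(params, x, y):
    return 0.5 * (forward(params, x)[1] - y) ** 2

def backprop(params, x, y):
    """Gradients of the squared-error loss with respect to every parameter."""
    W1, b1, W2, b2 = params
    h, out = forward(params, x)
    d_out = out - y                                    # dL/d(out)
    gW2 = [d_out * hi for hi in h]
    gb2 = d_out
    # Chain rule through the output weights and the sigmoid derivative h * (1 - h).
    d_h = [d_out * w * hi * (1.0 - hi) for w, hi in zip(W2, h)]
    gW1 = [[dh * xi for xi in x] for dh in d_h]
    gb1 = list(d_h)
    return gW1, gb1, gW2, gb2

def sgd_step(params, grads, lr=0.1):
    """Move every parameter a small step against its gradient."""
    W1, b1, W2, b2 = params
    gW1, gb1, gW2, gb2 = grads
    return (
        [[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(W1, gW1)],
        [b - lr * g for b, g in zip(b1, gb1)],
        [w - lr * g for w, g in zip(W2, gW2)],
        b2 - lr * gb2,
    )

# Random initialisation of weights and biases (seeded for reproducibility).
rng = random.Random(0)
initial_params = (
    [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(3)],
    [rng.uniform(-1, 1) for _ in range(3)],
    [rng.uniform(-1, 1) for _ in range(3)],
    rng.uniform(-1, 1),
)

x, y = (0.5, -0.2), 1.0  # hypothetical input sample and target
loss_before = loss(initial_params, x, y)
updated_params = sgd_step(initial_params, backprop(initial_params, x, y))
loss_after = loss(updated_params, x, y)
print(loss_before, loss_after)
```

In practice this step is repeated over many samples and epochs, and performance is judged on data held out from training.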
Support Vector Regression (SVR) fits a tube of a chosen width around the regression function; support vectors are the points that lie on or outside this tube. The smaller the tube width, the more points fall outside the tube, and hence the more support vectors there are. The name SVM, or Support Vector Machine, is well known among those who work in machine learning or data science. SVR, on the other hand, is not the same as SVM [33]. As the name implies, SVR is a regression algorithm, which means it can be used instead of SVM when working with continuous values [11].
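The relationship between tube width and the number of points outside the tube can be illustrated with the epsilon-insensitive loss that SVR uses. The sketch below holds the predictions fixed, which is a simplifying assumption: in a real SVR fit the regression function itself also changes with the tube width. The data values and function names are hypothetical.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the tube, linear in the excess error outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

def outside_tube(y_true, y_pred, epsilon):
    """Indices of samples whose absolute error exceeds the tube half-width epsilon."""
    return [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if abs(t - p) > epsilon]

# Hypothetical observed values and fixed model predictions.
y_true = [10.0, 12.5, 15.0, 17.2, 20.5]
y_pred = [10.2, 12.0, 15.1, 18.0, 20.4]

for epsilon in (0.6, 0.3, 0.05):
    idx = outside_tube(y_true, y_pred, epsilon)
    total = sum(epsilon_insensitive_loss(t, p, epsilon) for t, p in zip(y_true, y_pred))
    print(epsilon, idx, round(total, 3))
```

Shrinking epsilon moves more samples outside the tube, which matches the statement that a narrower tube yields more support vectors.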
The correlation coefficient, represented by r, measures the strength of the linear relationship between two variables. The formula takes the distance between each data point and the variable mean for both variables and uses the result to assess how well the relationship between the variables can be fitted to a hypothetical line through the data [33]. Correlation considers only the two variables in question and gives no insight into relationships beyond the bivariate data. The coefficient does not detect outliers, so it is important to inspect the plotted data to confirm that the values lie within the expected range and are spread fairly uniformly [63].
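The calculation described above corresponds to the Pearson correlation coefficient, which can be written out directly. The sample values below are hypothetical; the second data set adds a single outlier to show how strongly one point can change r without the coefficient flagging it.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-deviation from the means, scaled by both spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical paired measurements with a perfectly linear relationship.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r_linear = pearson_r(x, y)

# Same data with the last value replaced by an outlier.
y_outlier = [2, 4, 6, 8, -10]
r_outlier = pearson_r(x, y_outlier)

print(r_linear, r_outlier)
```

The sign flip caused by one point is the reason the text recommends inspecting a plot of the data alongside the coefficient.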
Reinforcement learning (RL) employs the Markov Decision Process (MDP), a mathematical framework for describing an environment [42]. RL is a branch of machine learning that covers different strategies for learning an optimal policy (i.e., a mapping from states to actions) for an agent in a given environment. The MDP provides a way of defining an unknown environment in which the agent takes a sequence of actions to maximize its reward [64]. (See Table 1).
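As a small illustration of an MDP and of a policy as a mapping from states to actions, the sketch below solves a toy maintenance problem by value iteration. Everything in it is assumed for illustration: the three component states, the two actions, the transition probabilities and the rewards are hypothetical, and value iteration requires the model to be known, whereas an RL agent would have to learn it from interaction.

```python
# P[state][action] = list of (probability, next_state, reward); all values hypothetical.
P = {
    "good": {
        "operate": [(0.9, "good", 10.0), (0.1, "worn", 10.0)],
        "repair": [(1.0, "good", -2.0)],
    },
    "worn": {
        "operate": [(0.6, "worn", 6.0), (0.4, "failed", -20.0)],
        "repair": [(1.0, "good", -3.0)],
    },
    "failed": {
        "operate": [(1.0, "failed", -20.0)],
        "repair": [(1.0, "good", -15.0)],
    },
}

def action_value(P, V, state, action, gamma):
    """Expected discounted return of taking `action` in `state`."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[state][action])

def value_iteration(P, gamma=0.9, tol=1e-8):
    """Return the optimal state values and the greedy policy for a known MDP."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            best = max(action_value(P, V, s, a, gamma) for a in P[s])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    policy = {s: max(P[s], key=lambda a: action_value(P, V, s, a, gamma)) for s in P}
    return V, policy

V, policy = value_iteration(P)
print(policy)
```

With these assumed numbers the resulting policy keeps a good component in service and repairs a worn or failed one.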
Model | References | Remarks |
Similarity Based Method | [22], [25], [30], [41], [45] | Due to the transitive quality of pairwise similarities, they are robust. They can exploit the pairwise relationships between unlabelled objects to their advantage. |
Neural Network | [22], [29], [30], [33], [35], [41], [65], [66], [67] | They can interpret data by grouping or labelling raw input. They can be thought of as a clustering and classification layer that sits on top of the stored and managed data. Neural networks can also serve as components of broader machine learning applications that include algorithms for classification, regression, and reinforcement learning. |
Extrapolation Based Method | [22], [30], [46] | Enables organizations to make predictions based on the data they have. Data requirements are minimal for this forecasting approach. It’s not necessary to collect a lot of data to forecast future data points. |
Naïve Bayes | [26], [47] | It is simple and straightforward to apply. It doesn't necessitate as much data for training. It is capable of dealing with both continuous and discrete data. It provides fast response when making predictions in real-time. |
Logistic Regression | [26], [48] | Training and analysis with logistic regression are relatively straightforward. Logistic regression should not be used if the number of data points is less than the number of features, as this may result in overfitting. It classifies unfamiliar records fairly quickly. It reduces the effects of confounding factors in the analysed variables. |
Decision Tree | [28] | A benefit of decision trees is that statistical expertise is not needed to analyse their results. They are simple to put together. Data cleansing is less necessary. Multiple problems can be analysed and solved with decision trees. |
Random Forest | [26], [28], [37], [40], [68] | It performs both regression and classification. It generates accurate predictions that are easy to comprehend. It can handle huge datasets effectively. It reduces the overfitting of decision trees and improves prediction accuracy. It is capable of working with categorical and continuous data. |
Term Frequency-Inverse Document Frequency | [27], [54] | Simple to calculate. It knows how to extract the most descriptive terms from a document using some simple metrics. It can quickly calculate the similarity of two documents. |
Linear Regression | [28], [40], [69] | This algorithm is best used when the independent and dependent variables have a linear relationship. Linear regression is simple to use, and the output coefficients are easier to understand. Linear regression models can be trained efficiently on systems with little processing resources. |
Support Vector Machine (SVM) | [28], [29], [70] | Performs efficiently on data that contains a clear margin of distinction among classes. When the number of dimensions is more than the number of samples, SVM is effective. SVM takes little memory space when training a model. |
K-Nearest Neighbours | [28], [71] | Computational time is short. Results are simple to evaluate. It is versatile and can be used for regression and classification. Provides high precision results. |
The K Means algorithm | [28], [59] | Less complicated to implement. It is capable of manipulating big data sets. Convergence is guaranteed. Generalizes to other shapes and sizes of clusters, such as elliptical clusters. It is reliant on starting values. |
Gradient Boosting Method (GBM) | [28], [60], [72] | Frequently gives exceptional forecasting accuracy. Many different loss functions can be optimized, and there are various hyperparameter tuning options that make the function fit quite flexible. |
Multilayer Perceptron (MLP) | [33] | Works well with large amounts of data. Works with linear and non-linear models. It has real-time data learning capability. |
Support Vector Regression (SVR) | [33], [57], [70], [73] | It can tolerate outliers. Updating the decision model is easy. It has a high prediction accuracy and generalizes well. It is simple to implement. |
Correlation Coefficient (CC) | [33], [63] | It can be used to demonstrate the strength of a relationship between two variables. It yields quantitative data that can be analysed quickly. |
Reinforcement Learning via Markov Decision Process | [42], [64] | It enhances performance. Change can be sustained for a long time. |
5. Machine Learning Implementation in Aircraft Maintenance Systems
The replacement of aircraft components is one of the most significant tasks performed by aviation maintenance staff [22]. Instead of doing scheduled maintenance at predetermined intervals, replacement may be based on the state of the aircraft component in order to maximize the lifetime of the aircraft equipment [25], [26]. An intelligent analysis of the data collected from sensors can be used to gain information on the state of various aircraft components [30]. Furthermore, machine learning methods use data to train models of the degradation behaviour of various components and apply them to real-world conditions so as to forecast the remaining useful life of aircraft components [28]. The goal of one such project is to create a web interface for simulating a Prognostics and Health Management (PHM) system, in which multiple machine learning approaches can be utilized to forecast the Remaining Useful Life (RUL) of certain aircraft components [29]. Different approaches based on Artificial Intelligence have been suggested that target accurate prediction of the RUL of aircraft components [35], [37]. To achieve this, models are trained with a training dataset and validated with a test dataset [42], [43].


The aviation sector holds a large amount of information and maintenance data that can be utilized to obtain useful results in projecting future actions. One study introduces machine learning models that use feature extraction and data removal to anticipate failures in aviation systems [27]. In recent aircraft maintenance systems, corrective maintenance is performed by maintenance specialists on the ground using real-time condition monitoring data collected during the flight. By analysing historical data from airplane maintenance systems, a predictive model can be built for predicting high-priority defects in advance, and preventive maintenance can be performed based on the model's prediction results. One study treats the prediction of faults with varying priorities as a binary classification problem. Maintenance technology has advanced for decades, and numerous ways of reducing the likelihood of machine system failure have been found. In general, corrective and preventive maintenance are the two types of maintenance approaches [33].
6. Objectives in Implementing ML in Aircraft Maintenance
1. Replacement of Components of the Aircraft
To maximize equipment lifetime, replacement may be based on the condition of the aircraft components rather than a set time period. An intelligent analysis of sensor data can determine the condition of various aircraft components. In order to forecast the remaining useful life of aircraft components, machine learning techniques can be used to develop models, trained on data, that describe the degradation behaviour of individual components, and to apply these models to real-world settings [22].
2. Maintenance of Parts
Artificial intelligence-based algorithms were used to define engine maintenance work, improve customer engine availability and maintenance costs, and obtain a procurement plan of integrated parts with intervention planning and maintenance strategy implementation. Machine Learning was used on a workshop dataset to optimize the amount, cost, and lead-time of warehouse spare parts in order to achieve this goal. This dataset comprises information such as repair claims, engine operating hours, forensic evidence, and general information regarding processed spare parts for a given engine type, spanning various years and fleets. These data have been used to develop various machine learning models in order to anticipate the repair state of each spare item for better warehouse handling. A multi-label classification strategy was utilized to create and train a Machine Learning model for each spare part that predicts the part repair state in the same way that a multiclass classifier does. Each classifier is specifically requested to predict the part's repair state (categorized as "Efficient," "Repaired," or "Replaced") [26].
3. Maintenance Planning
Maintenance planning is the process of developing a course of action. Effective maintenance planning entails devising a strategy that encompasses all maintenance, repair, and construction tasks. The work to be done should be specified as clearly and completely as feasible [29]. In aviation, aircraft component reliability and availability have always been essential considerations. The reliability of aircraft components and systems can be improved through accurate prediction of possible faults. The overall maintenance and overhaul expenses of airplane components are determined by the schedule of maintenance operations [33]. In the aviation business, intelligent maintenance, repair, and overhaul (MRO) has become increasingly crucial. Aircraft are increasingly equipped with sensors that continuously collect data on their status, diagnosis, and potential defects. Effective maintenance management is aided by the ability to use sensor data to accurately predict and diagnose problems. Furthermore, the extensive use of sensors in aircraft has facilitated the move from time-based maintenance, in which maintenance is scheduled at regular intervals, to condition-based maintenance, in which decisions are made based on data gathered through sensor monitoring [35]. Aircraft planning and scheduling is one of the most time-sensitive and important tasks in the airline business. When the flight legs are sequenced, each flight leg must be covered exactly once. This practice is referred to as aircraft maintenance routing because maintenance constraints are taken into account when establishing each aircraft's route, since each aircraft must undergo routine maintenance inspections [42].
4. Safety
Aircraft maintenance is a crucial task in the field of aeronautics. An airplane is sensitive to component failures caused by degradation over time, which could jeopardize the aircraft's overall reliability and safety. Aircraft maintenance is carried out by qualified professionals to prevent such issues. The current practice of replacing airplane parts at predetermined intervals complies with regulatory standards; however, it does not maximize the useful lifetime of these components. The goal of aviation maintenance is to ensure the safety of the aircraft and its occupants by preventing aircraft subsystems from becoming vulnerable to failure and, if at all possible, to extend component useful lifetimes to reduce costs for the airline industry [30]. According to Airbus, system/component failure or malfunction (SCF) was responsible for roughly 13% of total losses between 1998 and 2017 [75].
5. Maintenance Cost
In aviation, aircraft component reliability and availability have always been essential considerations. The reliability of aircraft components and systems will be improved through accurate prediction of possible faults. The entire maintenance and overhaul expenses of airplane components are determined by the maintenance operations schedule. Maintenance expenditures account for a large amount of an aviation system's total operating expenses [33].
Aviation production planning organizes aircraft maintenance plans, which specify how to schedule maintenance and repairs at the right time, in the right location, and in the right order to guarantee that the actual flight plan is followed and that maintenance expenses are kept to a minimum. Airline profitability has been significantly affected by rising operating costs resulting from greater competition in the aviation sector. Lowering operational expenses has become an important way for civil aviation companies to reach new profit levels. Maintenance costs differ depending on the type of aircraft and can vary from 10% to 45% of annual operational costs. Even 10%, which might not seem like a lot, is significant when overall running costs can reach thousands or millions of dollars [76]. A main problem is therefore how to minimize the maintenance cost and the maintenance time of an aircraft. Mathematical formulations as well as practical methodologies are implemented to solve the problem of airplane maintenance scheduling. The purpose of the formulation and solution approach is to assign aircraft so as to reduce maintenance costs and time. (The above analysis can be shown in Table 2).
Objective | References | Remarks |
Replacement of Components of the Aircraft | [22], [30] | To extend the life of aircraft equipment, replacement may be necessary depending on the status of the aircraft components. |
Maintenance of Parts | [26], [77], [78] | To optimize the amount, cost, and lead-time of warehouse spare parts. |
Maintenance Planning | [33], [35], [42] | Planning and scheduling are time-sensitive and crucial functions in the airline industry. Increases the asset's lifespan. Makes breakdowns less likely. Boosts productivity. Reduces unscheduled downtime. |
Safety | [22], [30], [75] | An improved safety culture. A higher level of safety in the operational environment. Higher levels of compliance. Fewer accidents. Fewer inexcusable safety mishaps. |
Maintenance Costs | [33], [76] | Optimize routine maintenance. Reduce the amount of non-routine maintenance. Improve the reliability of aircraft components. |
7. System Variables/Parameters Considered While Implementing ML in Aircraft Maintenance
1. Time for Repair
Predictive maintenance involves predicting maintenance needs in advance using time-based data from in-service assets such as trains and planes [35]. One main objective of this method is to forecast accurately when it is time to repair or replace a component [29]. If the work is done too far in advance, the advantages of prolonged usage are lost; if it is done too late, unforeseen failures can occur, lowering asset availability [34]. As a result, improving the accuracy of component lifetime estimates is a continuous priority [37].
2. Time for Replacement
The replacement of aircraft components is one of the most significant tasks performed by aviation maintenance staff [76]. Instead of doing scheduled maintenance at predetermined intervals, the condition of the aircraft components should determine when replacement is done, in order to maximize the lifetime of the aircraft equipment [29], [34].
3. Maintenance Cost
The major purpose of one study is to design Artificial Intelligence-based iterative algorithms [79] that define the engine removal plan and its repair work, and that produce an acquisition plan for components integrated with intervention planning and maintenance strategy execution [28].
4. Safety
The purpose of aviation maintenance is to guarantee the safety of the plane and its occupants by preventing aircraft subsystems from becoming vulnerable to failure and, if feasible, to extend component useful lifetimes to reduce costs for the airline industry [27], [28]. (The above analysis can be shown in Table 3).
Variables | Units | References | Remarks |
Time for Repair | seconds | [29], [34], [37], [80] | Forecasting repair time. Avoids failures. Ensures that aircraft will perform safely. Repair on time. Optimize regular repair. Enhance the Reliability of Aircraft Components. |
Time for Replacement | seconds | [29], [34], [37], [80] | Forecasting replacement time. Maximize the equipment utilization lifetime. Using time efficiently. Optimize regular replacement. |
Maintenance Cost | Dollars | [28], [79] | Maintenance cost of parts and aircraft. Achieve a competitive advantage over competitors. Reduce non-regular maintenance. Optimize maintenance. |
8. Constraints in Utilizing ML in Aircraft Maintenance
1. Data might be interpreted incorrectly, resulting in erroneous maintenance requests.
Data Collection: The amount of data required grows with the difficulty of the problem to be addressed. Data sources for a company might be both proprietary and open, such as weather data, traffic data, and so on. Data controlled by a business can be quantitative (loan amount, customer retention rate), categorical (gender, colour, property kind), time stamped (how many things were purchased over a period of time), or even free text (emails, doctor's notes).
Data Transformation: The data must be transformed and missing values removed. When data is acquired from many sources, the formats vary and must be standardized before ML algorithms can process them. This stage involves a lot of feature engineering. It also necessitates the creation of linkages between the various sources of data. For example, sales performance can be separated into day, month, and year category values to reveal more meaningful patterns.
Data Training: The analytics may now begin, and selecting the appropriate model is critical, since different methods suit different tasks. It is also critical to divide the data into training and evaluation sets, commonly in an 80:20 ratio.
Parameter Tuning: The model is evaluated against the evaluation set, and parameters such as the number of training steps and the learning rate are fine-tuned.
2. Contextual information, such as the age of the equipment or the weather, may be ignored by predictive analysis. Time series data are made up of data points recorded at specific moments in time, often at regular intervals. Analysing time series data makes it possible to compare data from week to week, month to month, year to year, or over any other time-based interval. Plain numerical data is just a collection of numbers that are not tied to any particular time period, whereas time series data has set beginning and ending points.
3. Proactive physical examination and equipment maintenance may be discouraged by predictive maintenance. The process of examining objective structural findings through observation and testing is known as physical examination. The information gathered must be carefully linked with the history of the parts. Furthermore, a thorough physical examination should offer 20% of the information needed for parts inspection and treatment.
4. Timelines, rather than actual machine conditions, may trigger preventative maintenance operations. To make sure that any issues are minimized, frequent inspections, maintenance, and service are all carried out. Thus, only the most precise and reliable aircraft parts must be used, and there must be a concerted effort to minimize human error. During inspections, the quality of components and aircraft parts is evaluated using routine manual checks and visual inspections. The main objective is to maintain aircraft in excellent condition to prevent any type of failure that could cause an accident. "Regularly scheduled inspections and preventative maintenance maintain airworthiness," according to the Federal Aviation Administration (FAA). By identifying minor faults and wear and tear early on and enabling specialists to fix and rectify them swiftly, maintenance of aircraft components and services reduces malfunctions and the danger of operational failures. Keeping accurate records is also beneficial to the process. (The above analysis can be shown in Table 4).
Constraints | References | Remarks |
Data | [28], [34], [37], [76], [79] | Data Collection Data Transformation Data Training Parameter Tuning |
Daily Range | [28], [34], [37], [76], [79] | Time Series Data |
Physical Examination | [28], [34], [37], [76], [79] | Part Examination Physical Examination |
Condition | [28], [34], [37], [76], [79] | Risk of Failures Avoid Breakdowns |
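The 80:20 division into training and evaluation sets mentioned under the data constraint can be sketched as below. The function name train_test_split, the seed, and the stand-in records are assumptions for illustration. The records are shuffled before splitting, which suits independent samples; for time series data a chronological split is often preferred so that the model is not evaluated on data older than its training data.

```python
import random

def train_test_split(records, train_fraction=0.8, seed=0):
    """Shuffle a copy of the records and split it into training and evaluation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical maintenance records, represented here by integer identifiers.
records = list(range(100))
train_set, eval_set = train_test_split(records)
print(len(train_set), len(eval_set))
```

Parameters such as the learning rate would then be tuned by comparing model performance on the evaluation set only.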
9. Simulation Platforms Used for Applications of Machine Learning in Aircraft Maintenance
Data science and machine learning platforms provide environments for the development, implementation, and analysis of machine learning algorithms [1]. To do predictive maintenance, sensors are first installed in the system to monitor and gather data on its activities. Predictive maintenance uses time series data, in which each record includes a timestamp, a set of sensor readings recorded at that timestamp, and a device ID. The purpose of predictive maintenance is to anticipate, at time t, whether equipment will fail in the near future using the data collected up to that point. Two approaches to predictive maintenance are: (1) the classification technique, which indicates whether a failure is likely within the next n steps, and (2) the regression technique, which forecasts the amount of time until the next failure, referred to as the Remaining Useful Life (RUL).
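The two problem formulations can be made concrete by showing how training targets might be derived from a run-to-failure record. This is a sketch under assumptions: time is counted in integer cycles, the failure time of the device is known, and the names label_run_to_failure, rul and fails_within_horizon are hypothetical.

```python
def label_run_to_failure(timestamps, failure_time, horizon):
    """Attach a regression target (RUL) and a classification target to each timestamp."""
    rows = []
    for t in timestamps:
        rul = failure_time - t                    # regression target: time left until failure
        rows.append({
            "t": t,
            "rul": rul,
            # classification target: does the failure occur within the next `horizon` steps?
            "fails_within_horizon": 0 <= rul <= horizon,
        })
    return rows

# Hypothetical device observed at cycles 0..9 that fails at cycle 9; horizon n = 3.
rows = label_run_to_failure(range(10), failure_time=9, horizon=3)
for row in rows:
    print(row)
```

A classifier would be trained on the boolean column and a regressor on the rul column, using the sensor readings at each timestamp as inputs.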
1. MATLAB
MATLAB is a computer application used by engineers and scientists to study and build systems and products. The MATLAB language enables a natural representation of computational mathematics [22]. Users can analyse data, develop algorithms, and create models and applications. MATLAB allows ideas to be taken from research to production by deploying them to enterprise applications and embedded devices and by integrating them with Simulink and Model-Based Design [29]. MATLAB contains interactive and visualization tools that simplify machine learning operations. Users can study their data, uncover essential attributes, and share their results using data visualization tools.
2. PYTHON
Python is an interpreted, high-level, general-purpose programming language. Its design philosophy emphasizes code readability through the use of significant indentation. Its language features and object-oriented approach are designed to help programmers write clear, logical code for both small and large-scale projects. Python supports a variety of programming paradigms, including structured (especially procedural) programming, object-oriented programming, and functional programming [42]. It is sometimes referred to as a "batteries included" language due to its extensive standard library [29].
3. R PROGRAMMING
R is a programming language for statistical computing and graphics. It is commonly used by statisticians and data miners to develop statistical software and to analyse data. Polls, data mining surveys, and examinations of scientific literature databases indicate that R is widely used [28]. Its syntax is quick to pick up, and users do not have to be expert programmers: using R effectively is less about coding skill than about understanding which packages to use and how to use them.
4. ERP PLATFORM
Enterprise resource planning (ERP) is a management and integration approach used by organizations for their daily operations [33]. Many ERP software packages combine all of the tasks necessary to run a company into a single system. ERP software can integrate planning, purchasing, marketing, finance, sales, inventory, human resources, and other functions [37]. (The above analysis can be shown in Table 5).
Simulation platforms | References | Remarks |
MATLAB | [22], [29], [30], [81] | Enables fast implementation and testing of algorithms. Creating computational code is easy. Debugging is easy. Large library of pre-installed algorithms. It is simple to manipulate still images and make simulation videos. Symbolic computation is simple to do. External libraries are easy to add. |
PYTHON | [28], [40], [42], [43], [82] | It is simple to read, learn, and write. Python is a high-level programming language. Its syntax is similar to English. Increased productivity. Interpreted language. Dynamically typed. It is free and open-source. |
R PROGRAMMING | [40], [42] | Superb for Statistical Analysis. Open-source. Includes a variety of libraries. Support for several platforms. Supports a variety of data types. |
ERP PLATFORM | [33], [37] | By assisting users in navigating complicated procedures, ERP improves efficiency and productivity. It prevents re-entry of data. Operations such as production, order fulfilment, and delivery can be improved. Procedures are streamlined and made more efficient throughout. |
10. Conclusion
This paper gives a comprehensive review and analytic comparison of the most recent machine learning methods for the design and optimization of aircraft maintenance. In order to optimize the aircraft maintenance system, various evaluation factors such as the time and cost of maintenance are described and summarized. The consideration of parameters such as safety, time of repair, time of replacement and maintenance cost is essential to obtaining an optimal combination for the fastest and cheapest maintenance. These parameters allow repair time to be forecast precisely, failures to be avoided, aircraft safety to be ensured, and regular repair to be optimized. Moreover, the availability of the required parts and the planning of maintenance have an impact on the aircraft maintenance optimization problem. Based on this review, the maintenance scheduling problem [9], [19], [32], [41], the aircraft spare parts prediction problem [10], [11], [12], [16], [17], [26], the prediction of the RUL of aircraft components [22], [25], [28], [29], [30], and the prediction of aircraft equipment failure [33], [39], [40], [43] are some of the problems that researchers have addressed using machine learning. It is important to predict the RUL of components accurately so as to maximize their lifetime and to know when they need repair or replacement in order to avoid unforeseen failures. It is observed that the artificial neural network is one of the main techniques used in developing predictive models of RUL, probably because of its ability to learn patterns and features in data to make accurate predictions [21], [30]. The ability of the TF-IDF technique to extract terms from documents can be utilized to extract important features from data, especially when solving classification problems [27]. Random Forest can handle both classification and regression problems with ease [40]. Additionally, it can handle categorical as well as continuous data, and it can automate the handling of missing values in the data.
Furthermore, neural networks store information across the whole network and can function with incomplete information. They have good fault tolerance and a distributed memory. Most researchers used MATLAB and Python to develop and test machine learning models.
Artificial Intelligence algorithms have recently been extensively applied to the optimization of aircraft maintenance systems. Several researchers have proposed, analysed, and investigated the applications of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) based data analytics for predictive maintenance of aircraft systems, subsystems, and components. In order to inform more experts about current research breakthroughs and to offer some direction for pertinent research in the field of aircraft maintenance, this paper provided a comprehensive review of ML techniques such as Multilayer Perceptron (MLP), Logistic Regression (LR), Random Forest (RF), Artificial Neural Network (ANN), Support Vector Regression (SVR), Linear Regression (LR), and other common ML techniques, covering their present implementation and potential applications in aircraft maintenance.
However, several gaps and research prospects have yet to be pursued. Alternative ML algorithms and architectures must be tested and explored for various aircraft maintenance applications. It is important to identify the ML methodology that is best suited to a particular dataset and gives the best result for a specific aircraft maintenance problem. It is therefore recommended that more real datasets be made available in the public domain so that more researchers can test their proposed ML algorithms on them and improve the efficiency and efficacy of these algorithms, allowing them to be extensively utilized for aircraft maintenance in the near future.
Furthermore, the performance of any ML algorithm is heavily dependent on the training dataset. Major problems faced when using ML algorithms include low-quality or insufficient data and underfitting or overfitting of the training data. Moreover, implementation can be time-consuming, and imperfections in the algorithm can emerge as the data grows. Machine learning therefore remains a laborious task, and there is still a long way to go before the design of maintenance operations can rely extensively and solely on ML. However, with the advent of faster computers, BDA platforms, the Industrial Internet of Things (IIoT), cloud computing platforms, and significant advancements in data acquisition systems and prognostic health management tools, particularly Micro Electro-Mechanical System (MEMS) technologies and high-end microsensors for precision applications, it is anticipated that Machine Learning (ML) will play a far more substantial role in aircraft maintenance in the years to come.
The data used to support the findings of this study are available from the corresponding author upon request.
The authors declare that they have no conflicts of interest.
