A Gearbox Vibration Signal Compressed Sensing Method Based on the Improved GLOW Flow Model
Abstract:
In response to the complex characteristics of gearbox vibration signals, including high frequency, high dimensionality, non-stationarity, non-linearity, and noise interference, this paper proposes a data processing method based on improved compressed sensing. First, a K-Singular Value Decomposition (K-SVD) dictionary is used for sparse representation, ensuring good sparsity in the frequency domain. Next, a random convolution kernel measurement matrix is employed in place of the traditional Gaussian random matrix, satisfying the restricted isometry property (RIP) while enhancing both computational and hardware implementation efficiency. Finally, the Generative Flow (GLOW) model is introduced, incorporating the measurement matrix, dictionary matrix, and sparse coefficient matrix into a unified optimization framework for joint solving. Through reversible mapping and probabilistic distribution modeling, the method effectively addresses noise interference and the challenges posed by complex signal distributions. Experimental results show that, compared with traditional compressed sensing methods, the proposed method offers superior signal reconstruction quality and better noise robustness.
1. Introduction
The gearbox is a critical core component in many mechanical systems and is widely used in aviation, automobiles, wind power generation, and various industrial equipment. Its operational status directly affects the stability and work efficiency of the mechanical system [1]. However, due to the complex and variable working conditions, uncertain loads, and subtle component defects, gearboxes often face various potential issues such as gear damage, bearing failure, and tooth surface wear [2], [3]. Therefore, real-time and accurate analysis and processing of gearbox vibration signals are of great significance for early fault prevention and efficient maintenance of gears.
Compressed sensing theory [4] provides a new approach for achieving high-precision signal reconstruction at lower sampling rates. By leveraging the sparsity of the signal in a certain transformation domain, compressed sensing can achieve information extraction and reconstruction through undersampling measurements well below the traditional Nyquist rate, thus reducing the burden of data collection and processing. In recent years, compressed sensing has achieved significant results in fields such as image reconstruction, feature detection, target localization, and weak signal extraction. Deng and Tian [5] proposed a compressed sensing image reconstruction network based on inverted residual blocks, which enhances the network's ability to capture multi-level image features through an inverted residual structure and multi-scale attention mechanism, significantly improving PSNR and visual effects. Zhang and Yu [6] proposed an adaptive sparse dictionary method based on Laplace wavelets to address the redundancy issue in traditional compressed sensing's sparse dictionaries, effectively reducing memory usage and improving fault feature extraction efficiency in noisy environments. Liu et al. [7] designed a SAMNet network based on a self-attention mechanism, introducing self-attention convolution and bottleneck transformations to improve block effects and optimize reconstruction quality and time, further enhancing the application effects in image processing. Wang et al. [8] proposed an improved ISTA algorithm that optimizes the step size through an acceleration operator, bidirectional search backtracking, and a quadratic approximation model, improving both the accuracy and efficiency in the reconstruction of rolling bearing vibration signals. Singh and Dandapat [9] introduced a joint compressed sensing recovery algorithm based on weighted mixed-norm minimization to enhance the reconstruction accuracy and compression efficiency of multi-channel electrocardiogram signals. Volaric et al. [10] proposed a data-driven compressive sensing ambiguity function (AF) area selection method to optimize time-frequency signal reconstruction and improve convergence speed. Chen [11] presented a compressive sensing signal reconstruction method based on smooth l_p-norm and maximum entropy function, which improved reconstruction accuracy and computational efficiency. However, directly applying compressed sensing technology to gearbox vibration signal processing still faces many challenges: the spectral components of gearbox vibration signals are complex, and the sparse assumption under traditional fixed bases is difficult to fully satisfy, affecting reconstruction accuracy; weak fault features in low signal-to-noise ratio (SNR) environments are difficult to extract accurately, reducing noise robustness; reconstruction errors may lead to incomplete fault features, and the coupling of frequency components in multi-fault modes further complicates fault identification.
To address these issues, this paper proposes an improved compressed sensing method based on the GLOW model, as shown in Figure 1. First, in the sparse representation phase, K-SVD dictionary sparse representation is used to provide a reasonable frequency domain sparse expression foundation for vibration signals. Second, a random convolution kernel measurement matrix is introduced to replace the traditional Gaussian random matrix, satisfying the restricted isometry property while simplifying hardware implementation and reducing computational costs. Finally, the measurement matrix, dictionary matrix, and sparse coefficients are incorporated into the unified optimization framework of the GLOW model. Through reversible mapping and probabilistic distribution modeling, the method effectively improves the adaptability and reconstruction accuracy for noisy environments and complex signal distributions.
2. Compressed Sensing Theory and Related Methods
Compressed sensing theory was proposed by Candes, Donoho, and others in 2006, providing a completely new approach for signal sampling and reconstruction. Its core idea is that if a signal is sparse in some domain, it can be sampled at a rate much lower than the Nyquist sampling rate, and its precise reconstruction can be achieved through nonlinear optimization algorithms.
Figure 2 shows the basic flow diagram of the compressed sensing algorithm. The basic framework of compressed sensing theory includes three core parts: sparse representation, compressed sampling, and signal reconstruction.
Given a signal $x \in \mathbb{R}^N$, if there exists a basis (or overcomplete dictionary) $D \in \mathbb{R}^{N \times K}$ such that the signal can be represented as:
$x = D\alpha$
where, $\alpha$ is the sparse vector of the signal in the basis $D$. If the sparse vector $\alpha$ has only $k$ non-zero (or approximately non-zero) elements, and $k \ll N$, the signal $x$ is said to be $k$-sparse in the basis $D$. The existence of sparsity is the premise of compressed sensing theory and forms the foundation for signal compression and reconstruction.
To compressively sample a sparse signal, a measurement matrix $A \in \mathbb{R}^{M \times N}$ is chosen to perform a linear projection of the signal $x$, obtaining the compressed measurement values:
$y = Ax = AD\alpha$
where, $y \in \mathbb{R}^M$ is the measurement vector after compression, and $M \ll N$. At this point, the signal has been compressed from a high-dimensional space to a lower-dimensional space.
Measurement Matrix Requirements:
(1) Satisfy the Restricted Isometry Property (RIP) condition: To ensure the recoverability of the signal, the combination of the measurement matrix $A$ and the sparse basis $D$ (i.e., the sensing matrix $\Phi = AD$) needs to satisfy the RIP. Specifically, the matrix $\Phi$ should satisfy the following condition for all $k$-sparse vectors $\alpha$:
$(1 - \delta_k)\|\alpha\|_2^2 \leq \|\Phi \alpha\|_2^2 \leq (1 + \delta_k)\|\alpha\|_2^2$
where, $\delta_k$ is the RIP constant, and it must satisfy $0<\delta_k<1$.
(2) Low Cross-Correlation: The cross-correlation between the column vectors of the measurement matrix $A$ should be as low as possible to avoid information loss and ensure the recoverability of the sparse signal in the low-dimensional space.
Given the measurement values $y$ and the measurement matrix $A$, the sparse coefficients $\alpha$ can be reconstructed by solving an optimization problem to recover the original signal $x$:
$\min_{\alpha} \|\alpha\|_0 \quad \text{s.t.} \quad \|y - AD\alpha\|_2 \leq \varepsilon$
where, $\|\cdot\|_0$ represents the $l_0$ norm, and $\varepsilon$ is the allowable reconstruction error.
Since directly solving the $l_0$-minimization problem is NP-hard, the $l_1$ norm is often used as a substitute for the $l_0$ norm, and approximate algorithms such as Basis Pursuit (BP), Matching Pursuit (MP), and Orthogonal Matching Pursuit (OMP) are used for sparse reconstruction.
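As a concrete reference, the following is a minimal NumPy sketch of OMP for recovering a $k$-sparse coefficient vector from $y = \Phi\alpha$; the function name, tolerance, and stopping rule are illustrative rather than taken from the paper.

```python
import numpy as np

def omp(Phi, y, k, tol=1e-6):
    """Orthogonal Matching Pursuit: greedily recover a k-sparse alpha
    from y = Phi @ alpha, where Phi = A @ D is the sensing matrix."""
    residual = y.astype(float).copy()
    support = []
    alpha = np.zeros(Phi.shape[1])
    coef = np.zeros(0)
    for _ in range(k):
        # Select the atom most correlated with the current residual.
        idx = int(np.argmax(np.abs(Phi.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Least-squares refit of y on the selected atoms.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
        if np.linalg.norm(residual) < tol:
            break
    alpha[support] = coef
    return alpha
```

The reconstructed signal is then $\hat{x} = D\alpha$.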
Limitations of Traditional Compressed Sensing Methods:
(a) Inaccurate Sparse Representation: Fixed sparse bases (e.g., Fourier basis, wavelet basis) have difficulty representing complex gearbox vibration signals sparsely enough.
(b) High Computational Complexity of Reconstruction Algorithms: Traditional $l_1$-minimization algorithms are computationally expensive and are not suitable for real-time processing.
(c) Insufficient Noise Robustness: Real-world signals contain noise, and traditional methods are sensitive to noise, which affects the reconstruction quality.
K-SVD [12], [13] is an iterative algorithm used to learn overcomplete dictionaries. The goal is to find a dictionary $D \in \mathbb{R}^{N \times K}$ and a sparse representation matrix $\alpha \in \mathbb{R}^{K \times M}$ for a given training set of $M$ signals, such that the reconstruction error is minimized. The optimization objective function of the K-SVD algorithm is:
$\min_{D, \alpha} \|X - D\alpha\|_F^2 \quad \text{s.t.} \quad \|\alpha_i\|_0 \leq s, \; i = 1, \ldots, M$
where, $X = \left[x_1, x_2, \ldots, x_M\right]$ is the training sample matrix, with each column being a signal sample; $D \in \mathbb{R}^{N \times K}$ is the dictionary matrix to be learned, containing $K$ atoms (column vectors); $\alpha = \left[\alpha_1, \alpha_2, \ldots, \alpha_M\right]$ is the sparse coefficient matrix, with each column $\alpha_i$ being the sparse representation of sample $x_i$; $\|\cdot\|_F$ represents the Frobenius norm; and $s$ is the sparsity constraint, representing the maximum number of non-zero elements in each sparse coefficient vector.
Algorithm Procedure:
Step 1: Initialization
Dictionary Initialization: Set the number of atoms $K$ in the dictionary, typically $K > N$ (where $N$ is the signal dimension), making the dictionary overcomplete. Randomly initialize the dictionary $D^{(0)}$ and normalize each atom:
$d_k \leftarrow d_k / \|d_k\|_2, \quad k = 1, \ldots, K$
Sparsity Setting: Set the sparsity $s$.
Step 2: Sparse Coding Phase
For each signal sample $x_i$, solve for the sparse coefficient vector $\alpha_i$:
$\alpha_i = \arg\min_{\alpha_i} \|x_i - D\alpha_i\|_2^2 \quad \text{s.t.} \quad \|\alpha_i\|_0 \leq s$
This is solved approximately with the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), applied to the $l_1$-relaxed form of the problem.
Step 3: Dictionary Update Phase
For each atom $d_k$, $k = 1, \ldots, K$, perform the following updates:
(a) Compute the reconstruction error excluding the contribution of atom $d_k$:
$E_k = X - \sum_{j \neq k} d_j \alpha_{(j)}$
where, $X$ is the matrix of all training samples, and $\alpha_{(j)}$ is the $j$-th row of the sparse coefficient matrix $\alpha$.
(b) Perform SVD on the error matrix $E_k$ (restricted to the samples that actually use atom $d_k$):
$E_k = U \Sigma V^T$
(c) Update the dictionary atom $d_k$ with the first left singular vector, and the corresponding coefficients with $\sigma_1 v_1^T$:
$d_k = u_1$
(d) Normalize the new dictionary atom (automatic here, since the columns of $U$ are orthonormal):
$d_k \leftarrow d_k / \|d_k\|_2$
Step 4: Stopping Criteria
The iteration stops when the reduction in reconstruction error is smaller than a threshold $\varepsilon$, or when the maximum number of iterations is reached:
$\left| \|X - D^{(t)} \alpha^{(t)}\|_F^2 - \|X - D^{(t+1)} \alpha^{(t+1)}\|_F^2 \right| < \varepsilon$
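To make the dictionary-update phase concrete, here is a minimal NumPy sketch of one K-SVD sweep, assuming the sparse coding step (e.g., FISTA or OMP) has already produced the coefficient matrix; the function and variable names are illustrative.

```python
import numpy as np

def ksvd_dictionary_update(X, D, alpha):
    """One K-SVD dictionary-update sweep (Step 3 above): for each atom,
    take the rank-1 SVD of the residual restricted to the samples that
    use that atom. X: (N, M) samples, D: (N, K) dictionary,
    alpha: (K, M) sparse coefficients."""
    for k in range(D.shape[1]):
        used = np.nonzero(alpha[k, :])[0]          # samples using atom k
        if used.size == 0:
            continue
        # Residual without atom k's contribution, on those samples only.
        E_k = (X[:, used] - D @ alpha[:, used]
               + np.outer(D[:, k], alpha[k, used]))
        U, S, Vt = np.linalg.svd(E_k, full_matrices=False)
        D[:, k] = U[:, 0]                          # unit-norm new atom
        alpha[k, used] = S[0] * Vt[0, :]           # updated coefficients
    return D, alpha
```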
The random convolution kernel measurement matrix is a special structured measurement matrix proposed by Li et al. [14], which is constructed based on the convolution operation. Compared to traditional Gaussian random matrices, the random convolution kernel measurement matrix has significant advantages in hardware implementation and computational efficiency.
Definition of the Random Convolution Kernel Measurement Matrix: Let $h \in \mathbb{R}^L$ be a random convolution kernel vector of length $L$, with $L \ll N$ (the signal length). The random convolution kernel measurement matrix $A \in \mathbb{R}^{M \times N}$ is generated by performing a convolution operation on the signal $x \in \mathbb{R}^N$ to obtain the measurement values $y \in \mathbb{R}^M$:
$y = h * x$
where, * represents the 1D convolution operation.
Generation of the Convolution Kernel:
(1) Random Kernel Generation:
Generate a random convolution kernel $h$ of length $L$, where each element $h_i$ follows an independent standard normal distribution:
$h_i \sim \mathcal{N}(0, 1), \quad i = 1, 2, \ldots, L$
(2) Normalization of the Convolution Kernel:
Normalize the generated convolution kernel to ensure that its energy is normalized:
$h \leftarrow h / \|h\|_2$
(3) Construction of the Random Convolution Measurement Matrix:
Construct the measurement matrix $A$ by sliding the normalized convolution kernel $h$ across the signal. Specifically, each row of the measurement matrix holds the kernel at one shift position:
$A_{i, i+j-1} = h_j, \quad j = 1, \ldots, L, \quad i = 1, \ldots, M$ (all other entries are zero)
where, $M = N - L + 1$, with each row corresponding to one convolution window of the signal $x$.
Through this design, the measurement process can be seen as the convolution of the signal $x$ with the random convolution kernel $h$. Since convolution operations can be accelerated using the Fast Fourier Transform (FFT), this design results in a hardware-friendly and computationally efficient compressed sensing process. The randomly generated convolution kernel, under certain conditions, can ensure that the measurement matrix $A$ satisfies the RIP, improving the recoverability of the signal in low-dimensional space.
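The following sketch illustrates the measurement step with an FFT-accelerated valid convolution. Kernel generation and normalization follow the construction above; truncating the $N-L+1$ valid outputs to the first $M$ measurements, and the fixed seed, are assumptions made for illustration.

```python
import numpy as np

def random_conv_measure(x, L, M, seed=0):
    """Compress x via valid 1-D convolution with a unit-energy random
    kernel, computed in O(N log N) with the FFT."""
    rng = np.random.default_rng(seed)
    h = rng.standard_normal(L)                 # h_i ~ N(0, 1)
    h /= np.linalg.norm(h)                     # normalize kernel energy
    N = x.size
    n_fft = N + L - 1                          # full linear-conv length
    full = np.fft.irfft(np.fft.rfft(x, n_fft) * np.fft.rfft(h, n_fft),
                        n_fft)
    valid = full[L - 1:N]                      # the N - L + 1 valid samples
    return valid[:M], h                        # keep the first M as y
```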
3. GLOW Model and Its Joint Optimization Strategy
GLOW [15] is a generative model based on flow models, which uses a series of reversible transformations to generate data. Unlike traditional Generative Adversarial Networks (GANs) or Variational Autoencoders (VAEs), GLOW maps a simple latent distribution (such as the standard Gaussian distribution) to the complex real data distribution through a series of reversible, nonlinear transformations. The key advantage of the GLOW model lies in its reversibility and exact likelihood estimation, which allow for efficient data sample generation, signal reconstruction, and optimization. The core principles of the GLOW model include the following key components:
The GLOW model uses multiple reversible transformation functions (called layers) to map the input data to the latent space, achieving data generation and reconstruction. Figure 3 below is a schematic of the GLOW model. The design of these transformation functions ensures that the transformation in each layer is reversible, including reversible $1 \times 1$ convolutions and affine coupling layers. Let the input data be $x \in \mathbb{R}^d$; through a series of reversible transformation functions $f_1, f_2, \ldots, f_L$, the input data is transformed into latent space data $z \in \mathbb{R}^d$:
$z = f_L \circ f_{L-1} \circ \cdots \circ f_1(x)$
To recover the original data, each layer of the flow model has an inverse transformation function $f_k^{-1}$, which transforms the latent space data $z$ back to the input data space $x$, such that:
$x = f_1^{-1} \circ f_2^{-1} \circ \cdots \circ f_L^{-1}(z)$
The GLOW model can accurately calculate the probability density of the input data. By estimating the data density in the latent space, a probability value can be assigned to the input data. Assuming $z$ follows a standard Gaussian distribution, the probability density of the input data $x$ can be calculated using the Jacobian determinant:
$\log p(x) = \log p_z(z) + \sum_{l=1}^{L} \log \left| \det \frac{\partial f_l}{\partial h^{(l-1)}} \right|$
where, $p_z$ is the probability density in the latent space, typically assumed to be a Gaussian distribution; $h^{(l-1)}$ represents the input to the transformation at the $l$-th layer (i.e., the output of the previous layer or the initial input data); $\frac{\partial f_l}{\partial h^{(l-1)}}$ is the Jacobian matrix of the transformation function, representing the scale change during the transformation process.
The training of the GLOW model is carried out by maximizing the log-likelihood function, with the goal of learning the model parameters such that the generated data closely matches the distribution of the real data. Let $x_1, x_2, \ldots, x_N$ be the training dataset; the log-likelihood of the data is maximized, i.e.,
$\max_\theta \sum_{i=1}^{N} \log p(x_i)$
where, $p(x_i)$ is the probability density of the $i$-th sample. The parameters $\theta$ of the flow model are optimized by gradient descent, with the gradients computed via backpropagation.
In traditional compressive sensing, the selection of the measurement matrix $A$ and dictionary $D$ is usually done independently, which may lead to the measurement matrix not fully exploiting the sparsity of the dictionary, thereby affecting the signal reconstruction performance. In this paper, the dictionary matrix $D$ is learned using the K-SVD algorithm, which has the advantage of finding sparser representations in high-dimensional and complex spectral environments.
To further improve reconstruction performance, this paper adopts the GLOW model. By introducing latent space distribution modeling, the joint optimization of the measurement matrix $A$, the dictionary matrix $D$, and the sparse coefficient $\alpha$ is achieved. Under noisy conditions and complex signal distributions, the reversibility and probability density estimation ability of the GLOW model help obtain better reconstruction results.
The goal of the joint optimization is to simultaneously optimize the measurement matrix $A$, dictionary matrix $D$, and sparse coefficient matrix $\alpha$, so that the signal can be effectively recovered within the compressive sensing framework. The joint optimization objective function is:
$\min_{A, D, \alpha} \|y - AD\alpha\|_2^2 + \lambda_1 \|\alpha\|_1 + \lambda_2 \|D\|_F^2 + \lambda_3 R(A, D, \alpha)$
where, $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the weights of the sparsity term, the dictionary regularization term, and the GLOW-related regularization term, respectively; $\|y - AD\alpha\|_2^2$ ensures that the measured signal and the reconstructed signal are close; $\lambda_1 \|\alpha\|_1$ enforces sparsity by driving most elements of the coefficient matrix $\alpha$ to zero through the $l_1$-norm; $\lambda_2 \|D\|_F^2$ prevents the dictionary matrix $D$ from overfitting; and $\lambda_3 R(A, D, \alpha)$ is the GLOW-related term that uses the flow model to capture complex data distributions, enhancing reconstruction stability.
In this paper, $R(A, D, \alpha)$ is selected as the negative log-likelihood (NLL) based on the GLOW model, expressed as:
$R(A, D, \alpha) = -\log p_{\text{GLOW}}(\hat{x})$
where, $p_{\text{GLOW}}(\hat{x})$ is the probability density of the reconstructed signal $\hat{x} = D\alpha$ estimated by the GLOW model. The GLOW model maps the signal $\hat{x}$ to the latent space $z$ through a series of reversible transformations and assumes that $z$ follows a standard Gaussian distribution. Therefore, the probability density of the signal $\hat{x}$ can be expressed as:
$p_{\text{GLOW}}(\hat{x}) = p_z(z) \left| \det \frac{\partial z}{\partial \hat{x}} \right|$
This can be further expanded as:
$\log p_{\text{GLOW}}(\hat{x}) = \log p_z(z) + \sum_{l=1}^{L} \log \left| \det \frac{\partial f_l}{\partial h^{(l-1)}} \right|$
Since $z$ is assumed to follow a standard Gaussian distribution $\mathcal{N}(0, I)$, we have:
$\log p_z(z) = -\frac{d}{2} \log(2\pi) - \frac{1}{2} \|z\|_2^2$
Therefore, the expression for $R(A, D, \alpha)$ in the objective function is:
$R(A, D, \alpha) = \frac{1}{2} \|z\|_2^2 + \frac{d}{2} \log(2\pi) - \sum_{l=1}^{L} \log \left| \det \frac{\partial f_l}{\partial h^{(l-1)}} \right|$
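Given the latent code $z$ and the per-layer log-determinants produced by the forward pass, this regularizer reduces to a few lines; a minimal sketch (the argument names are illustrative):

```python
import numpy as np

def glow_nll(z, logdets):
    """R(A, D, alpha) = ||z||^2 / 2 + (d/2) log(2*pi) - sum_l log|det J_l|,
    i.e. the negative log-likelihood of x_hat = D @ alpha under the flow,
    assuming a standard Gaussian latent distribution."""
    d = z.size
    return (0.5 * float(z @ z) + 0.5 * d * np.log(2.0 * np.pi)
            - float(np.sum(logdets)))
```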
The optimization process of the flow model is shown in Figure 4. To optimize the measurement matrix $A$ and the sparse matrix $\alpha$, the gradients of the loss function with respect to these parameters need to be calculated and the parameters updated. The specific steps are as follows:
Step 1: Calculate Gradients
Calculate the gradients of the loss function with respect to the dictionary matrix $D$ and the measurement matrix $A$:
$\nabla_D \mathcal{L} = -2A^T(y - AD\alpha)\alpha^T + 2\lambda_2 D + \lambda_3 \nabla_D R, \quad \nabla_A \mathcal{L} = -2(y - AD\alpha)(D\alpha)^T + \lambda_3 \nabla_A R$
Step 2: Update Dictionary Matrix and Measurement Matrix
Update the dictionary matrix $D$ and measurement matrix $A$ using gradient descent:
$D \leftarrow D - \eta_D \nabla_D \mathcal{L}, \quad A \leftarrow A - \eta_A \nabla_A \mathcal{L}$
where, $\eta_D$ and $\eta_A$ are the learning rates for the dictionary matrix and measurement matrix, respectively.
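For the data-fidelity and dictionary-regularization terms, these gradients have closed forms; the sketch below implements them in NumPy, while the GLOW term's contribution is left to automatic differentiation (an assumption about the practical implementation).

```python
import numpy as np

def fidelity_grads(y, A, D, alpha, lam2):
    """Gradients of ||y - A D alpha||_2^2 + lam2 * ||D||_F^2 with respect
    to D and A. Shapes: y (M,), A (M, N), D (N, K), alpha (K,)."""
    r = y - A @ (D @ alpha)                    # measurement residual
    grad_D = -2.0 * (A.T @ np.outer(r, alpha)) + 2.0 * lam2 * D
    grad_A = -2.0 * np.outer(r, D @ alpha)
    return grad_D, grad_A
```

The Step 2 updates are then simply `D -= eta_D * grad_D` and `A -= eta_A * grad_A`.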
Step 3: Update Sparse Coefficient Matrix $\alpha$
For each column $\alpha_j$, update using the soft thresholding method:
$\alpha_j \leftarrow S_\tau(\alpha_j)$
where, $S_\tau(\cdot)$ represents the soft-threshold function and can be defined as:
$S_\tau(u) = \operatorname{sign}(u) \cdot \max(|u| - \tau, 0)$
where, $\tau$ is the soft threshold parameter.
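In code, the soft-threshold operator is a one-liner (a minimal sketch):

```python
import numpy as np

def soft_threshold(v, tau):
    """S_tau(v) = sign(v) * max(|v| - tau, 0), applied elementwise."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)
```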
Step 4: Perform Forward Transformation
The forward transformation of the GLOW model maps the current iteration's approximate reconstructed signal $\hat{x}$ to the latent space representation $z$. The GLOW model consists of multiple reversible functions; each layer has an invertible function $f_l$, including reversible $1 \times 1$ convolutions and affine coupling layers. Assume the GLOW model has $L$ layers, and each layer's transformation is reversible.
Operations for each layer $l=1,2,3, ..., L$:
(a) Reversible $1 \times 1$ Convolution:
$x^{(l)} = W^{(l)} x^{(l-1)}$
where, $W^{(l)}$ is the $1 \times 1$ reversible convolution matrix for the $l$-th layer, and $x^{(l)}$ is the output of the $l$-th layer.
(b) Affine Coupling Layer:
Split $x^{(l)}$ into two parts $x_A^{(l)}$ and $x_B^{(l)}$, and compute:
$s^{(l)} = \text{ScaleNet}^{(l)}(x_A^{(l)}), \quad t^{(l)} = \text{TranslateNet}^{(l)}(x_A^{(l)})$
$y_A^{(l)} = x_A^{(l)}, \quad y_B^{(l)} = s^{(l)} \odot x_B^{(l)} + t^{(l)}, \quad y^{(l)} = \left[y_A^{(l)}, y_B^{(l)}\right]$
where, $\text{ScaleNet}^{(l)}$ and $\text{TranslateNet}^{(l)}$ are the neural networks of the $l$-th layer, used to generate the scaling factor $s^{(l)}$ and translation factor $t^{(l)}$; $\odot$ represents element-wise multiplication, used for the affine transformation; $y^{(l)}$ is the output of the $l$-th layer. The final output is:
$z = y^{(L)}$
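A minimal sketch of the two per-layer operations follows. The exponential parameterization of the scaling factor (so that $\log|\det J|$ is simply the sum of $s^{(l)}$) is one common convention in coupling-layer flows and is an assumption here, as are the callable `scale_net` and `translate_net` and the even split of the input.

```python
import numpy as np

def invconv_forward(x, W):
    """Invertible 1x1 convolution, realized as a learned mixing matrix W;
    returns the output and its log|det| contribution."""
    return W @ x, np.log(np.abs(np.linalg.det(W)))

def coupling_forward(x, scale_net, translate_net):
    """Affine coupling layer: x_A passes through, x_B is scaled and
    shifted by factors computed from x_A."""
    xA, xB = np.split(x, 2)                    # assumes even length
    s = scale_net(xA)                          # scaling factor s^(l)
    t = translate_net(xA)                      # translation factor t^(l)
    yB = np.exp(s) * xB + t                    # elementwise affine map
    logdet = float(np.sum(s))                  # log|det J| of this layer
    return np.concatenate([xA, yB]), logdet
```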
Step 5: Iterative Update
Each iteration includes updating the measurement matrix, dictionary, sparse coefficients, performing sparse coefficient thresholding, and evaluating the GLOW forward transformation. By repeatedly following the above steps and using GLOW to model the latent distribution of the signal, the reconstruction results are further optimized. This process continues until the loss function converges (i.e., the change is smaller than the preset threshold) or the maximum iteration count is reached.
Step 6: GLOW Inverse Transformation
After joint optimization, the inverse transformation in the GLOW model is applied to recover the original signal $\hat{x}$.
Perform the inverse transformation:
$\hat{x} = f_1^{-1} \circ f_2^{-1} \circ \cdots \circ f_L^{-1}(z)$
The inverse transformation of the GLOW model is the reverse operation of the forward transformation. Each layer's inverse transformation is the inverse operation of its forward transformation.
Operations for each layer $l=L, L-1, \ldots, 1$ :
(a) Inverse operation of the affine coupling layer
Split $y^{(l)}$ into two parts $y_A^{(l)}$ and $y_B^{(l)}$, then:
$x_A^{(l)} = y_A^{(l)}, \quad x_B^{(l)} = \left(y_B^{(l)} - t^{(l)}\right) \oslash s^{(l)}$
where, $s^{(l)}$ and $t^{(l)}$ are recomputed from $y_A^{(l)}$ exactly as in the forward pass, and $\oslash$ denotes element-wise division.
(b) Inverse Operation of Reversible $1 \times 1$ Convolution:
$x^{(l-1)} = \left(W^{(l)}\right)^{-1} x^{(l)}$
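The corresponding inverse operations mirror the forward sketch above: because $y_A^{(l)} = x_A^{(l)}$ passes through unchanged, the same networks recompute $s^{(l)}$ and $t^{(l)}$ exactly (same illustrative assumptions as before).

```python
import numpy as np

def coupling_inverse(y, scale_net, translate_net):
    """Invert the affine coupling layer by recomputing s and t from y_A."""
    yA, yB = np.split(y, 2)
    s = scale_net(yA)
    t = translate_net(yA)
    xB = (yB - t) * np.exp(-s)                 # undo the affine map
    return np.concatenate([yA, xB])

def invconv_inverse(y, W):
    """Invert the 1x1 convolution by solving W x = y."""
    return np.linalg.solve(W, y)
```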
Step 7: Reconstruction Output
The inverse transformation ensures that the signal can be accurately restored through the latent space mapping of the GLOW model, maintaining the temporal characteristics and structural integrity of the signal.
This paper proposes an improved compressive sensing method, combining K-SVD for dictionary updating to achieve efficient sparse representation of signals; introducing a random convolution measurement matrix to compress the signal and reduce its dimensionality; and further integrating the GLOW flow model to jointly optimize the measurement matrix, dictionary, and sparse coefficients. Through the forward transformation, the reconstructed signal is mapped to the latent space, iterating to update the parameters, and ultimately, the inverse transformation outputs a high-precision reconstructed signal.
The GLOW model optimization process is shown in Figure 5. The specific implementation steps are as follows:
Step 1: Use a triaxial accelerometer to acquire gear vibration signal data, and randomly select a 2000-point data segment, reducing data redundancy while retaining key features.
Step 2: Initialize the dictionary, use the FISTA algorithm to solve for the sparse coefficient matrix $\alpha$, and update the dictionary using the K-SVD algorithm until the maximum number of iterations is reached or the reconstruction error is smaller than the preset threshold.
Step 3: Construct a random convolution kernel measurement matrix and apply Eq. (6) to compress the collected raw signal.
Step 4: Introduce the GLOW flow model for joint optimization of the measurement matrix, dictionary matrix, and sparse coefficient matrix, and construct the joint optimization objective function (14). Iteratively run Eqs. (20) to (26), use the backpropagation algorithm to compute the gradients and update the measurement matrix and dictionary matrix, and use the soft-thresholding function to update the sparse coefficient matrix. Simultaneously, execute the forward transformation, gradually mapping the compressed signal into the latent space, until the loss function converges or the maximum number of iterations is reached.
Step 5: Execute the inverse transformation process in the GLOW flow model, step by step executing Eqs. (35) to (40), recover the signal from the latent space, and output the reconstructed signal.
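Pulling the pieces together, the joint optimization loop of Step 4 can be sketched as follows, reusing the helper functions from the earlier sketches. The `glow` object with a `forward(x) -> (z, logdets)` method is a hypothetical interface, and the GLOW term's gradients with respect to $A$ and $D$ are omitted for brevity (obtained by backpropagation in practice).

```python
import numpy as np

def joint_optimize(y, A, D, alpha, glow, lam1=0.1, lam2=0.1, lam3=0.1,
                   eta_D=1e-3, eta_A=1e-3, tau=0.01,
                   n_iters=50, tol=1e-6):
    """Skeleton of the joint optimization loop (Steps 1-5 above)."""
    prev_loss = np.inf
    for _ in range(n_iters):
        g_D, g_A = fidelity_grads(y, A, D, alpha, lam2)   # Step 1
        D -= eta_D * g_D                                  # Step 2
        A -= eta_A * g_A
        alpha = soft_threshold(alpha, tau)                # Step 3
        z, logdets = glow.forward(D @ alpha)              # Step 4
        r = y - A @ (D @ alpha)
        loss = (float(r @ r) + lam1 * np.abs(alpha).sum()
                + lam2 * float(np.sum(D ** 2))
                + lam3 * glow_nll(z, logdets))
        if abs(prev_loss - loss) < tol:                   # Step 5
            break
        prev_loss = loss
    return A, D, alpha
```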
4. Experimental Verification and Results Analysis
To validate the effectiveness of the proposed algorithm, the publicly available experimental data from reference [16] is used for verification and analysis. The dataset accounts for factors such as gear installation and disassembly, and includes five different operating conditions at various rotational speeds: gear health, gear damage, tooth surface wear, root fracture, and missing teeth. Table 1 lists the tooth-count parameters of the planetary gearbox and the formulas for its fault characteristic frequencies.
This dataset includes five working states of the sun gear at rotational speeds of 20Hz, 25Hz, 30Hz, 35Hz, 40Hz, 45Hz, and 50Hz. The data consists of vibration data from the X, Y, and Z axes, as well as encoder data from the planetary gearbox input shaft. The sampling frequency of both the vibration and rotational speed data is 48kHz.
Parameter | Value |
Number of teeth (sun gear) | 28 |
Number of teeth (ring gear) | 100 |
Number of planet gears | 4 |
Meshing frequency | $(175/8) f_r$ |
Sun gear fault frequency | $(25/8) f_r$ |
In the experiment, vibration signals along the X-axis at a rotational speed of 20Hz were selected, and a 2000-dimensional data segment was randomly extracted. The experimental parameter settings are provided in Table 2.
Parameter | Symbol | Value |
Signal sample length | $N$ | 2000-dimensional |
Dictionary matrix atom count | $K$ | 3000 |
Dictionary matrix dimensions | $D$ | $2000 \times 3000$ |
Sparsity | $s$ | From 5 to 50, increment by 5 |
K-SVD algorithm iterations | - | 50 |
K-SVD stopping threshold | - | $1 \times 10^{-6}$ |
Random convolution kernel length | $l$ | 100 |
Measurement matrix dimensions | $A$ | $M \times 2000$ ($M$ set by the compression ratio) |
GLOW model layers | $L$ | 10 |
Sparse regularization parameter | $\lambda_1$ | 0.1 |
Dictionary regularization parameter | $\lambda_2$ | 0.1 |
GLOW-related regularization parameter | $\lambda_3$ | 0.1 |
Dictionary matrix learning rate | $\eta_D$ | 0.001 |
Measurement matrix learning rate | $\eta_A$ | 0.001 |
Soft-threshold parameter | $\tau$ | 0.01 |
GLOW optimization maximum iterations | - | 50 |
GLOW optimization stopping threshold | $\varepsilon$ | $1 \times 10^{-6}$ |
To comprehensively evaluate the performance of different algorithms in compressive sensing, four methods are designed for comparison with this work:
Method 1: Use the FFT method for dictionary learning, with the dictionary size fixed based on the FFT results. A Gaussian random measurement matrix is used for data compression, and the OMP algorithm is used for signal reconstruction.
Method 2: Use the K-SVD method for dictionary learning, with a Gaussian random measurement matrix for data compression, and the OMP algorithm for signal reconstruction.
Method 3: Use the K-SVD method for dictionary learning, with a Gaussian random measurement matrix for data compression, and the CoSaMP algorithm for signal reconstruction.
Method 4: Use the K-SVD method for dictionary learning, with a random convolution kernel measurement matrix for data compression, and the CoSaMP algorithm for signal reconstruction.
For the above methods, sparsity is set to increase from 5 to 50 with a step size of 5, the reconstruction algorithm iteration count is set to 50, and the stopping threshold is $1 \times 10^{-6}$. In Methods 2, 3, and 4, the dictionary atom count is set to 3000.
In the performance evaluation of the compressive sensing algorithm, the three key evaluation metrics are Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and computation time.
MSE: Represents the average squared error between the original and reconstructed signals. The smaller the MSE, the closer the reconstructed signal is to the original, indicating better reconstruction quality. The formula is as follows:
$\text{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left(x_i - \hat{x}_i\right)^2$
where, $x_i$ is the $i$-th sample point of the original signal, $\hat{x}_i$ is the $i$-th sample point of the reconstructed signal, and $N$ is the total number of sample points.
PSNR: A logarithmic ratio used to measure the difference between the reconstructed signal and the original signal. The higher the PSNR, the better the reconstruction quality. The formula is as follows:
$\text{PSNR} = 10 \log_{10} \left( \frac{x_{\max}^2}{\text{MSE}} \right)$
where, $x_{\max}$ is the peak amplitude of the original signal.
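Both metrics are straightforward to compute; a minimal sketch (using the peak of the original signal as the MAX term, one common convention for 1-D signals):

```python
import numpy as np

def mse(x, x_hat):
    """Mean squared error between original and reconstructed signals."""
    return float(np.mean((x - x_hat) ** 2))

def psnr(x, x_hat):
    """Peak signal-to-noise ratio in dB."""
    return 10.0 * np.log10(np.max(np.abs(x)) ** 2 / mse(x, x_hat))
```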
With a fixed compression rate of reducing 2000-dimensional data to 1024 dimensions (a compression ratio of 51.2%), sparsity increases from 5 to 50 with a step size of 5, and the MSE and PSNR metrics at each sparsity level are recorded [17], [18], [19], [20].
In subplot (a) of Figure 6, as the sparsity increases, the MSE of all methods decreases. The proposed method consistently shows the lowest MSE at all sparsity levels, demonstrating its advantage in sparse representation. In subplot (b) of Figure 6, as the sparsity increases, the PSNR of all methods rises. The proposed method exhibits the highest PSNR at all sparsity levels, indicating the best signal reconstruction quality.
With a fixed sparsity of 10, the compression rate is varied, reducing the signal from 2000 dimensions to different dimensions (compression rates from 10% to 90%, step size of 10%), to evaluate the MSE and PSNR metrics at different compression rates.
In subplot (a) of Figure 7, as the compression rate increases (i.e., the compression dimension decreases), the MSE of all methods increases. However, the proposed method consistently shows significantly lower MSE across all compression rates, reflecting its strong resistance to compression. In subplot (b) of Figure 7, as the compression rate increases, the PSNR of all methods decreases. The proposed method maintains higher PSNR than the other methods at all compression rates, indicating its ability to preserve good reconstruction quality even at high compression rates.
With a fixed compression rate of reducing from 2000 dimensions to 1024 dimensions, the sparsity increases from 5 to 50 with a step size of 5, and the computation time for each algorithm at each sparsity level is recorded. The results are shown in Table 3:
Sparsity | Method 1(s) | Method 2(s) | Method 3(s) | Method 4(s) | The Proposed Method (s) |
5 | 0.063 | 0.0822 | 0.104 | 0.1241 | 0.1628 |
10 | 0.0673 | 0.0867 | 0.1069 | 0.1265 | 0.1686 |
15 | 0.073 | 0.0919 | 0.1112 | 0.1329 | 0.1729 |
20 | 0.0772 | 0.0973 | 0.1177 | 0.1379 | 0.1771 |
25 | 0.0828 | 0.1031 | 0.1226 | 0.1409 | 0.1843 |
30 | 0.0873 | 0.1065 | 0.1277 | 0.1478 | 0.1867 |
35 | 0.0917 | 0.112 | 0.1326 | 0.1534 | 0.1923 |
40 | 0.0959 | 0.1174 | 0.1379 | 0.1558 | 0.1987 |
45 | 0.1024 | 0.1229 | 0.1414 | 0.1611 | 0.203 |
50 | 0.1077 | 0.1273 | 0.146 | 0.1689 | 0.2053 |
The computation time of the proposed method is slightly higher than that of the other methods due to the introduction of the GLOW model's joint optimization process. Despite the increased computation time, the significant improvement in reconstruction accuracy and noise resistance makes it more practical for real-world applications.
The experimental results indicate that the proposed method consistently achieves lower MSE and higher PSNR across different sparsity levels and compression rates. This is because the K-SVD dictionary finds a sparser representation in higher-dimensional and complex spectral environments, thus reducing information loss; the random convolution kernel measurement matrix maintains the validity of information during signal compression; and the GLOW flow model, through latent distribution modeling and probability density estimation, enables the system to effectively distinguish signal and noise components in low-dimensional space, achieving stable reconstruction results even under low SNR and high compression rates. In contrast, traditional methods lose their ability to accurately capture signal features under high compression rates and noisy conditions, leading to a more noticeable increase in MSE and decrease in PSNR.
Therefore, the improvements in the proposed method result not from a single element, but from the synergistic effects of enhanced sparse representation, optimized measurement matrices, and latent space probability modeling.
5. Conclusion
This paper proposes an improved compressive sensing method for gearbox vibration signals with high frequency, high dimensionality, non-stationarity, and non-linearity, as well as noise interference and real-time processing requirements [21], [22], [23]. The method combines K-SVD dictionary learning, random convolution kernel measurement matrix design, and GLOW flow model joint optimization strategy. The main conclusions are as follows:
(1) Improved Sparse Representation Ability: By using K-SVD dictionary learning, the sparsity of the signal in the frequency domain is enhanced, enabling effective extraction of key features from complex, non-stationary gearbox vibration signals and significantly improving reconstruction accuracy in low-dimensional space.
(2) Enhanced Compression and Reconstruction Efficiency: The design of a random convolution kernel measurement matrix, combined with convolution operations and fast Fourier transforms, significantly reduces hardware implementation and computational complexity, while satisfying the RIP condition to ensure signal recoverability in low-dimensional space.
(3) Improved Noise and Complex Signal Processing Capability: The introduction of the GLOW flow model, through latent distribution modeling and joint optimization, unifies the measurement matrix, dictionary matrix, and sparse coefficient matrix in a single framework. The forward transformation captures the complex distribution of signals, while the inverse transformation enables precise signal reconstruction, significantly improving the stability and accuracy of signal reconstruction in noisy environments.
Data Availability:
The data used to support the research findings are available from the corresponding author upon request.
Conflict of Interest:
The authors declare no conflict of interest.