Lifetime Extension of Wireless Sensor Networks by Perceptive Selection of Cluster Head Using K-Means and Einstein Weighted Averaging Aggregation Operator under Uncertainty
Abstract:
In the realm of Wireless Sensor Networks (WSNs), energy efficiency emerges as a paramount concern due to the inherent limitations in the energy capacity of sensor nodes. The extension of network lifespan is critically dependent on the strategic selection of Cluster Heads (CHs), a process that necessitates a nuanced approach to optimize communication, resource allocation, and network performance overall. This study proposes a novel methodology for CH selection, integrating Multiple Criteria Decision Making (MCDM) with the K-Means algorithm to facilitate a more discerning aggregation and forwarding of data to the network sink. Central to this approach is the application of the Einstein Weighted Averaging Aggregation (EWA) operator, which introduces a layer of sophistication in handling the uncertainties inherent in WSN deployments. The efficiency of CH selection is vital, as CHs serve as pivotal nodes within the network, their selection and operational efficiency directly influencing the network's energy consumption and data processing capabilities. By employing a meticulously designed clustering process via the K-Means algorithm and selecting CHs based on a comprehensive set of parameters, including, but not limited to, residual energy and node proximity, this methodology seeks to substantially enhance the energy efficiency of WSNs. Comparative analysis with the Low-Energy Adaptive Cluster Hierarchy (LEACH)-Fuzzy Clustering (FC) algorithm underscores the efficacy of the proposed approach, demonstrating a 15% improvement in network lifespan. This advancement not only ensures optimal utilization of limited resources but also promotes the sustainability of WSN deployments, a critical consideration for the widespread application of these networks in various fields. The findings of this study underscore the significance of adopting sophisticated, algorithmically driven strategies for CH selection, highlighting the potential for significant enhancements in WSN longevity through methodical, data-informed decision-making processes.
1. Introduction
Since WSNs are inexpensive, scalable, and simple to set up, they are frequently utilized in real-time applications [1]. As the fundamental components of WSNs, self-organizing sensors can establish an adaptive multi-hop network and send optimized data packets to the base station (BS) for analysis after post-processing [2], [3]. The wireless nodes' memory and remaining energy are the biggest implementation restrictions. For this reason, in order to maximize the benefits of these WSNs, these WSN nodes require a regulating system to regulate their interactions with the access point and one another [4], [5]. Network lifetime can be greatly impacted by sending and receiving multimedia data, using advanced network security techniques, and more [6], [7], [8], [9]. Numerous techniques, like clustering and routing protocols, are used [10] to extend the lifetime of networks and maximize energy efficiency. MCDM, combined with entropy and the EWA operator, can improve CH selection in WSNs, contributing to a longer network lifetime. MCDM aids in evaluating diverse criteria, such as energy consumption, connectivity, and node proximity, to identify optimal CHs. Incorporating entropy assists in quantifying the weight of parameters, allowing for informed decisions on CH selection. The EWA provides a mechanism to aggregate multiple criteria, considering their interdependencies. By employing this integrated approach, WSNs can strategically select CHs based on a comprehensive criteria analysis, promoting efficient energy utilization and balanced network distribution. Consequently, the extension of WSN lifetime is achieved by improving energy depletion, enhancing network resilience and optimizing resource utilization within the specified constraints. To increase energy efficiency and maximize network lifetime, a new algorithm must be developed. This research proposes a new algorithm to maximize the network lifetime, selecting a CH in each round based on specific constraints. The clustering technique based on K-Means [11], which offers energy-efficient clustering in WSN, is the foundation for cluster creation. Fuzzy logic can be effectively used to represent uncertainty in WSNs. WSNs are often deployed in dynamic and unpredictable environments where factors like sensor readings, communication reliability, and environmental conditions can introduce uncertainty. Fuzzy logic provides a mathematical framework to handle this uncertainty by allowing the representation of vague and imprecise information. triangular fuzzy numbers (TFNs) are often used to represent uncertainty because they offer a simple yet effective way to model imprecision and vagueness in real-world data. The choice of TFNs is based on their ease of use, interpretability, and computational efficiency. In this study, TFNs are used to represent uncertainty in WSNs.
2. Related Works
Decision-making holds great importance across scientific disciplines. Employing the MCDM approach proves highly effective for discerning superior alternatives compared to various options in a multitude of scenarios [12]. Numerous studies have been done into clustering using MCDM within WSN, yielding promising results [13], [14], [15], [16], [17], [18]. Heuristic methods are also applicable in WSN for clustering purposes [19], [20]. Clustering approaches in WSN can be broadly categorized into static and dynamic. In the static technique, the CH remains fixed during clustering [21], [22], [23]. Conversely, dynamic clustering involves regular CH rotation. Heinzelman's LEACH [24] stands as an example of dynamic clustering. Despite LEACH's lower algorithmic complexity compared to alternative approaches, its uneven distribution of CHs results in lower energy efficiency. To address this issue in a heterogeneous environment where some nodes possess higher energy capacities, an Energy-Efficient Heterogeneous Cluster (EEHC) [25] has been introduced. In EEHC, nodes take on the role of CHs, with their residual energy determining a weighted election probability. While introducing the concept of heterogeneity, this approach fails to consider several factors in CH selection. A modified version of the LEACH protocol, known as Centralized LEACH (LEACH-C) [26], was introduced to resolve this issue by minimizing the total sum of squared distances between all CHs. This modification resulted in decreased energy consumption when transmitting data from non-cluster-head nodes to their respective CHs. In WSN CH selection, El Alami and Najid [27] introduced an energy-efficient approach based on fuzzy logic. The primary goal of this study is to leverage fuzzy parameters for minimizing energy consumption and enhancing the overall network lifespan. Khan et al. [28] suggested a fuzzy-Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS)-based CH election in mobile networks, employing four criteria. Studies compared this approach with conventional LEACH and fuzzy methods. Azada and Sharma [29] proposed a TOPSIS method focused on the election of cluster leaders using a multiple attribute decision-making approach. In this study, the CHs have been chosen by the MCDM technique based on the entropy-weighted technique. A new algorithm has been developed to select the CHs after each round which directly helps to increase the network lifetime.
3. Preliminaries
Entropy, initially formulated by the German physicist R. Clausius in 1865 as a thermodynamic metric, characterizes the disorder or randomness arising from thermodynamic processes. Claude Shannon later introduced the concept of information entropy in 1948 to quantify uncertainty in communication from information sources. The entropy weight method evaluates the extent to which each criterion in decision-making preserves decision information, determining the relative significance of different features. It essentially gauges the level of unpredictable communication through the utilization of entropy values. The computation of entropy weight involves analyzing the choice matrix. One can consult Sen et al. [30] for further information regarding the entropy weighted technique.
Assume that $Z=\left(z_{i j}\right)_{m \times n}$ be the decision matrix and $w=\left(w_1, w_2, \ldots, w_n\right)$, where $0 \leq w_j \leq 1$ and $\sum w_j=1$ be the weight vector with regard to the $m$ alternatives $A_i(i=1,2, \ldots, m)$ and $n$ criterion $C_j(j=1,2, \ldots, n)$. Now, we can calculate the weight $w_j, j=1,2, \ldots, n$ using the following steps:
Step 1: Compute $p_{i j}=\frac{z_{i j}}{\sum_{i=1}^m z_{i j}}$
Step 2: Compute $E_j=-\frac{1}{\log (m)} \sum_{i=1}^m p_{i j} \log \left(p_{i j}\right)$. It is to be that $p_{i j} \log p_{i j} \rightarrow 0$, when $p_{i j} \rightarrow 0$
Step 3: Compute $U_j=1-E_j$
Step 4: Compute $w_j=\frac{U_j}{\sum_{j=1}^n U_j}=\frac{1-E_j}{\sum_{j=1}^n\left(1-E_j\right)}$
A triangular fuzzy number is a representation of uncertainty that is characterized by a triangular-shaped membership function. It is often used in fuzzy logic and fuzzy set theory to model imprecise or vague information.
A fuzzy number $\tilde{A}=(a, b, c)$, where $a \leq b \leq c$ is called triangular fuzzy number (TFN) whose membership function $\mu_{\tilde{A}}(x): X \rightarrow[ 0,1]$ is as follows:
$\mu_{\tilde{A}}(x)= \begin{cases}\frac{x-a}{b-a} & \text { if } a \leq x \leq b \\ 1 & \text { if } x=b \\ \frac{c-x}{c-b} & \text { if } b \leq x \leq c\end{cases}$
Let $\tilde{A}=(a, b, c)$ be a triangular fuzzy number, then $\alpha$-level set of $\tilde{A}$ is $A_\alpha=\left\{x \in X: \mu_{\tilde{A}}(x) \geq \alpha\right\}=\left[A_\alpha^{-}, A_\alpha^{+}\right]$, where $A_\alpha^{-}=a+(b-a) \alpha$ and $A_\alpha^{+}=c-(c-b) \alpha, \alpha \in[ 0,1]$. Now, we can represent $\tilde{A}$ as $\tilde{A}=\underset{\alpha \in[ 0,1]}{\cup} A_\alpha$. Here, we can derive the signed distance [31] from $\left[A_\alpha^{-}, A_\alpha^{+}\right]$ to $\tilde{0}$ as $D\left(A_\alpha, \tilde{0}\right)=\frac{1}{2}\left(A_\alpha^{-}+A_\alpha^{+}\right)$. If $\tilde{A}=(a, b, c)$ be the TFN then we have $D(\tilde{A}, \tilde{0})=\frac{1}{2} \int_0^1\left(A_\alpha^{-}+A_\alpha^{+}\right) d \alpha=0.25(a+2 b+c)$.
The K-Means [32] algorithm is a popular unsupervised machine learning algorithm used for clustering data. It partitions a dataset into K clusters where each data point belongs to the cluster with the nearest mean. The algorithm is iterative and converges to a solution where the assignment of data points to clusters minimizes the sum of squared distances between data points and their respective cluster centers. K-Means is sensitive to the initial placement of cluster centroids, and different initializations may lead to different results. To mitigate this, the algorithm is often run multiple times with different initializations, and the best result in terms of the sum of squared distances is chosen. K-Means is widely used for tasks such as customer segmentation, image compression, and pattern recognition. However, it has some limitations, such as sensitivity to outliers and the need to specify the number of clusters in advance.
The notion of a triangular norm was presented by Klement et al. [33] as an extension of the triangle inequality observed in metrics. Schweizer and Sklar [34] are credited with developing the concept of a $T$-norm and the accompanying dual operator $T$-conorm. Let $R_j, j=1,2, \ldots, n$ be the collection of real numbers; then EWA operators can be defined as follows:
$E W A_w\left(R_1, R_2, \ldots, R_n\right)=w_1 \otimes R_1 \oplus w_2 \otimes R_2 \oplus w_3 \otimes R_3 \oplus \ldots \oplus w_n \otimes R_n$, where $w=\left(w_1, w_2, \ldots, w_n\right)$ is the weighted vector of $R_j, j=1,2, \ldots, n$, such that $0 \leq w_j \leq 1, j=1,2, \ldots, n$ and $\sum_{j=1}^n w_j=1$.
It is to be noted that for two real numbers $p$ and $q$ we have used Einstein $T$-norm for product $p \otimes q$ and $T$-conorm for sum $p \oplus q$.
Einstein product is a $T$-norm function $T:[ 0,1] \times[ 0,1] \rightarrow[ 0,1]$ such that
Einstein sum is a $T$-conorm function $S:[ 0,1] \times[ 0,1] \rightarrow[ 0,1]$ such that
4. System Model Definition and Formulation
The system infrastructure comprises a solitary base station (BS) and an extensive array of sensor nodes, each classified into two categories: common nodes and cluster head nodes. Common nodes are responsible for monitoring the surroundings and transmitting sensor data to the designated cluster head node. The selection of the cluster head node is a meticulous process facilitated by the common nodes. Upon receiving data from the common nodes, the cluster head node amalgamates the information before relaying it to the BS.
The first-order radio energy model explicitly focuses on energy utilization during the communication phase, encompassing energy expenditure in transmission, reception, and data aggregation processes. Eq. (3) represents the computation of energy consumption within this framework, derived from the exchanged bit data between a cluster head node and a common node.
$\tilde{E}_{T X}(\tilde{L}, \tilde{d})$ is the energy consumption during the transmission of $L$-bit of data and $\tilde{E}_{R X}(\tilde{L})$ is the energy consumption during receiving of data. Eq. (5) can be used to determine the amplifier's energy usage during the transmission phase where $\tilde{\varepsilon}_{a m p}$ is the amplifier energy consumption during transmission phase.
If the value of $\tilde{d}$ less than or equal to $\tilde{d}_0$, then the sensor node will use free-space propagation model. On the other hand, if the system uses multipath fading channel which use $\tilde{\varepsilon}_{f s}$ and $\tilde{\varepsilon}_{m p}$ communication energy parameter, can be used to calculate the value of $\tilde{d}_0$ by Eq. (6).
Determining the number of cluster heads in each cycle is crucial for increasing the WSN's lifetime and energy efficiency. We have determined the optimal cluster size $\tilde{k}_{o p t}$ as:
where, $\tilde{M}, \tilde{N}$ are represented as area covered and number of nodes in the system. BS defines the base location.
5. Experimental Setup and Results
For this research, a network comprising 100 nodes was established, as shown in Figure 1, featuring a BS located at a central point and a random distribution of nodes throughout the area. Each packet type has a 25-byte packet header, and data messages have a fixed length of 4000 bits. The channel bandwidth was set at a constant 1 Mb/s. The K-Means algorithm divides the network into groups of clusters depending on the value calculated by Eq. (7), as shown in Figure 2. As nodes start to become inactive, the number of clusters undergoes adjustments based on node density, and the optimal value is employed to determine the initial number of clusters. Larger and smaller groups are amalgamated. The BS is characterized as a node with limitless processing power and no energy constraints. Table 1 presents a list of symbols used in this study.
Symbol | Description |
$\tilde{d}$ | Distance to base station |
$\tilde{d}_0$ | Fixed measuring distance to base station |
$\left(\tilde{C}_x, \tilde{C}_y\right)$ | Co-ordinate of cluster head in a WSNs |
$\left(\tilde{N}_x, \tilde{N}_y\right)$ | Co-ordinate of node in a WSNs |
$\tilde{E}_{ {initial }}$ | Initial energy |
$\tilde{E}_{ {elec }}$ | Electronics energy |
$\tilde{E}_{T X}$ | Data transmission energy consumption |
$\tilde{\varepsilon}_{f s}$ | Energy amplification to overcome open area |
$\tilde{\varepsilon}_{m p}$ | Energy amplification in order to navigate the multi-path |
$\tilde{E}_{R X}$ | Energy consumption while data reception |
$\tilde{K}_{o p t}$ | Number of cluster heads that is optimal |
$\tilde{L}$ | Length of data |
$\tilde{N}$ | Number of nodes in the network as a whole |
$\tilde{H}$ | Distance between the special node and the common node |
$\tilde{n}$ | Number of clusters |
$\alpha$ | Distance from the sink |
$\beta$ | Average distance of cluster nodes |
$\chi$ | Number of neighbors |
$\delta$ | Residual energy |
As shown in Table 2, the experiment in this study calculates the entropy weights of each parameter, like residual energy, the number of neighbor nodes, the distance from the sink and the average distance of cluster nodes. The best CHs were selected after the first simulation round based on four factors, namely, distance from the sink (BS), average distance of cluster nodes, number of neighbor and residual energy, using the Einstein operator. Using the EWA operator mentioned in Section 3.4, the average weight of each node in each cluster can be calculated.
Distance From the Sink ($\alpha$) | Average Distance of Cluster Nodes ($\beta$) | Number of Neighbors ($\chi$) | Residual Energy ($\delta$) |
0.1062 | 0.1300 | 0.2640 | 0.4980 |
Eq. (8) has been used to determine the weight of each node of each cluster which decided the selection of CHs has been shown in Table 3. We have also considered uncertain parameters for the entire network setup which has been shown in Table 4. Also, we have estimated the optimum range of $\tilde{K}_{ {opt }}$. Here, we have considered $\tilde{\mathrm{N}}$ = 100 nodes, $\tilde{Z}$ = 100m, $\tilde{\varepsilon}_{f s}$ = 10pJ, $\tilde{\varepsilon}_{{mp}}$ = 0.0013pJ and 76m $< \tilde{d}_0 <$ 168m. Therefore, the expected optimum number of clusters to be lied in the range (1,11), i.e. 1 $< \tilde{k}_{ {opt }} <$ 11 which is taken as 9 . Figure 2 shows the clustering of 100 nodes using K-Means algorithm by $\tilde{k}_{ {opt }}$ value.
Cluster Head | Residual Energy | Number of Neighbors | Distance From the Sink | Average Distance of Clusters Nodes |
CH1 | 0.9795 | 7 | 156.203 | 11.232 |
CH2 | 0.9754 | 5 | 78.223 | 15.527 |
CH3 | 0.9798 | 9 | 140.173 | 28.937 |
CH4 | 0.9753 | 6 | 136.059 | 31.049 |
CH5 | 0.9788 | 3 | 93.444 | 49.752 |
CH6 | 0.9641 | 4 | 116.069 | 24.688 |
CH7 | 0.9647 | 5 | 86.988 | 23.348 |
CH8 | 0.9657 | 4 | 105.367 | 26.433 |
CH9 | 0.9649 | 6 | 102.181 | 18.694 |
Parameters | Parametric Value as per Assumptions | Defuzzified Value |
$\tilde{N}$ | 100 | |
$\tilde{E}_{ {initial }}$ | (0.7,1,1.2) | 0.975 |
Coordinate of BS | (50,175) | |
Size of the data packet | (495,500,510) | 501.25 |
Hello/broadcast/CH join message | (22,25,28) | 25 |
$\tilde{\varepsilon}_{f s}$ | (8,10,12) | 10 |
$\tilde{\varepsilon}_{m p}$ | (0.001,0.0013,0.0015) | 0.001275 |
$\tilde{L}$ | (47,50,52) | 49.75 |
The number of cycles before the network's single node runs out of energy is used to describe the network's lifespan. Figure 3 displays the experimental outcomes accordingly. Sensor nodes are dispersed at random across a preset area. The plot of network lifetimes displays the number of active nodes with time in cycles. CHs for the first round are selected using the MCDM technique. For the subsequent round, Algorithm 1 is implemented and described below.
Algorithm 1:
Step 1: 100 nodes have been deployed randomly over (100,100)$\mathrm{m}^2$ area with BS coordinates.
Step 2: For the first rounds, the selected CHs will send the data which has been selected by using Eq. (8).
Step 3: Repeat Steps 4 to 9 to choose CHs for subsequent round until all nodes' residual energy is not diminished.
Step 4: Increment the counter if the remaining energy of a node surpasses that of all other nodes within its cluster.
Step 5: If a node is at a greater distance from the sink than the cumulative distance of all other nodes in the cluster, then a counter is incremented.
Step 6: Increase a counter if a node's average distance from other nodes within the cluster is shorter than that of any other nodes.
Step 7: The node with the highest counter value has been considered CHs for the next round.
Step 8: If a cluster contains less than three nodes, assign additional nodes to the nearest cluster.
Step 9: Move to the next round.
Step 10: End.
LEACH-FC is an extension of the original LEACH algorithm, incorporating fuzzy logic for improved cluster formation. Its main goal is to enhance energy efficiency and prolong the network lifetime of WSNs. In this study, both the LEACH-FC clustering approach and the approach proposed in this study are implemented in the same environment. Figure 4 shows the network lifetime of LEACH-FC. It has been found that the proposed approach shows a 15% greater network lifetime in comparison with LEACH-FC.
6. Conclusions
Research on energy efficiency in WSNs is crucial due to the constrained energy resources of sensor nodes. Improving energy efficiency enhances the longevity of WSNs, allowing them to operate for extended periods without frequent battery replacements. This, in turn, promotes sustainable and cost-effective deployment of sensor networks for various applications such as environmental monitoring, healthcare, and smart cities. Efficient energy utilization also contributes to minimizing environmental impact, making WSNs more environmentally friendly and aligning with the broader goal of creating energy-efficient and eco-friendly technological solutions. Entropy, MCDM and K-Means algorithms play significant roles in creating clusters and selecting CHs in WSNs. The integration of these algorithms enables the creation of energy-efficient and well-organized clusters in WSNs. Entropy aids in evaluating the quality of clusters; MCDM ensures comprehensive decision-making for CH selection; and K-Means contributes to the formation of homogeneous and resource-efficient clusters. This combined approach enhances the performance and longevity of WSNs, particularly in applications where energy conservation and network reliability are critical. It has been found that, by applying Algorithm 1 and using these algorithms, the proposed approach shows better network lifetime performance in comparison with the LEACH-FC algorithm.
Not applicable.
The authors declare no conflict of interest.