Robust Leaf Disease Detection Using Complex Fuzzy Sets and HSV-Based Color Segmentation Techniques
Abstract:
Leaf diseases pose a significant threat to global agricultural productivity, impacting both crop yields and quality. Traditional detection methods often rely on expert knowledge, are labor-intensive, and can be time-consuming. To address these limitations, a novel framework was developed for the segmentation and detection of leaf diseases, incorporating complex fuzzy set (CFS) theory and advanced spatial averaging and difference techniques. This approach leverages the Hue, Saturation, and Value (HSV) color model, which offers superior contrast and visual clarity, to effectively distinguish between healthy and diseased regions in leaf images. The HSV space was utilized due to its ability to enhance the visibility of subtle disease patterns. CFSs were introduced to manage the inherent uncertainty and imprecision associated with disease characteristics, enabling a more accurate delineation of affected areas. Spatial techniques further refine the segmentation, improving detection precision and robustness. Experimental validation on diverse datasets demonstrates the proposed method’s high accuracy across a variety of plant diseases, highlighting its reliability and adaptability to real-world agricultural conditions. Moreover, the framework enhances interpretability by offering insights into the progression of disease, thus supporting informed decision-making for crop protection and management. The proposed model shows considerable potential in practical agricultural applications, where it can assist farmers and agronomists in timely and accurate disease identification, ultimately improving crop management practices.
1. Introduction
The identification of leaf diseases is crucial for effective plant management and pest control. Early detection can lead to timely interventions, reducing the economic impact on farmers. Traditional methods, such as visual inspection, are subjective and can be time-consuming [1]. In modern agriculture, early identification and classification of plant diseases are vital for optimizing crop yields and ensuring healthy plant growth. Several researchers have developed various techniques that leverage computational power and machine learning to facilitate early disease detection [2], [3], [4]. These approaches also encompass precision agriculture and effective crop management strategies, aimed at enhancing productivity. Current automated plant classification systems typically follow a systematic process: collecting leaves, preprocessing them to extract key features, classifying the leaves, building a database, training the model, and evaluating its performance. While leaves are crucial for disease detection, several recent studies have expanded to include other plant parts, such as stems, flowers, and seeds, for more comprehensive classification [4], [5], [6].
Automated disease identification systems are especially useful for non-experts, enabling them to detect and classify diseases efficiently. Early detection is essential for promoting healthy plant growth and preventing damage. For example, the little leaf disease significantly impacted pine tree production in the U.S. over the past six years. Machine learning techniques [7], [8], [9], [10] offer quicker, more reliable, and less labor-intensive methods for disease detection.
Image processing plays a key role in assessing the damage caused by diseases, utilizing color variations to identify affected areas. Effective image segmentation and grouping are crucial for analyzing distinct sections of images, employing techniques ranging from simple thresholding to more advanced color-based methods. Since computers often struggle to recognize objects without assistance, various segmentation techniques have been developed, focusing on features such as color and boundaries of leaf images.
Historically, plant disease detection relied on human expertise and laboratory verification. However, advancements in computer science, including high-resolution cameras and powerful processors, have made it possible to identify plant diseases through sophisticated image processing techniques. While food production has improved globally, threats to food safety, such as plant diseases, remain a significant concern. Despite technological advancements, effective early detection of these diseases is still lacking in many areas.
Given the progress in machine learning and image processing, there is considerable potential for implementing these technologies in leaf disease detection and classification, which can greatly surpass human capabilities and enhance agricultural outcomes.
In this study, a novel algorithm was developed, drawing on the principles of CFS theory, integrated with spatial averaging and difference techniques. The framework incorporates fuzzy logic operations to segment the identified diseased areas accurately. The results were quantitatively evaluated and visually represented, offering insights into the severity and extent of the disease. By integrating CFSs with advanced segmentation techniques, this framework not only enhances the accuracy of leaf disease detection but also provides a robust tool for agricultural monitoring and decision-making.
The study is structured as follows: Section 2 reviews the current literature; Section 3 details the proposed classifier’s design; Section 4 presents the experimental results; and Section 5 concludes with suggestions for future research.
2. Related Work
This section reviews various classification and segmentation techniques for leaf disease identification. The classification process typically involves several common steps: color transformation, green pixel elimination, segmentation, and classification [7]. Islam et al. [11] applied these steps to classify images of 500 plant species using the multimodal hybrid deep learning (MHDL) approach, achieving notable classification results despite facing optimization challenges.
Sibiya and Sumbwanyambe [8] introduced a method specifically designed for identifying and classifying plant diseases, which is particularly advantageous for the Indian economy as it optimizes resources and reduces costs. This study utilized color co-occurrence techniques to extract features from leaf images and employed automatic fuzzy logic (AFL) for disease detection, achieving good accuracy while minimizing computational demands. For root and stem diseases, Liu et al. [12] explored efficient techniques based on texture and color analysis alongside K-means clustering. Different agricultural contexts may require tailored classification methods, including the use of Bayes classifiers, K-means clustering, and Principal Component Analysis (PCA) classifiers. Overall, these studies indicate promising strategies for disease detection and classification that can enhance plant disease management.
To address challenges like overlapping rubber tree leaves, a template-based approach utilizing the Scale Invariant Feature Transform (SIFT) for feature extraction was proposed by Devi et al. [13]. SIFT operates through a three-step process to identify key points, effectively facilitating accurate detection in agricultural applications. Zhao et al. [14] presented an automated method for plant species detection using generated leaf-based descriptors (GLBD), simplifying taxonomic classification. They proposed a histogram-based image classifier (HOS) combined with features extracted using Histogram of Oriented Gradients (HOG) and Zernike Moments (ZM), achieving promising accuracy and demonstrating its value for researchers and farmers. The study by Creswell et al. [15] compared various leaf segmentation techniques on a unique dataset, employing both unsupervised learning methods and optimal template selection with Chamfer matching. The findings provide insights into the strengths and weaknesses of different segmentation approaches.
Yang et al. [16] focused on optimizing leaf shape characterization through central deviation measures, employing bio-inspired algorithms like Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO). This study combined shape and texture features to enhance leaf detection accuracy. The application of the artificial neural network (ANN) in various contexts, such as predicting contact pressures in engineering [17] and analyzing geometric structures in nano-surfaces [18], further demonstrates ANN’s versatility. The study by Yaylaci et al. [19], which used finite element methods alongside ANN to analyze contact problems in functionally graded materials, showcases additional applications of these technologies. Rajagopala et al. [20] proposed a segmentation technique utilizing Fuzzy C-Means (FCM) and PSO for effective partitioning of leaf images and feature extraction that can help in the identification of disease.
Building on these methods and addressing the limitations of the models summarized in Table 1, this study proposes a new algorithm based on CFS theory, integrated with spatial averaging and difference techniques applied over the entire leaf region. The proposed framework utilizes fuzzy logic operations to effectively segment the detected diseased regions. The outcomes were assessed both quantitatively and visually, providing valuable insights into the severity and scope of the disease. Figure 1 shows the flowchart for the proposed model.
Table 1. Advantages and limitations of the competing models
Method | Advantage | Limitation |
MHDL | Accuracy | Optimization issue |
AFL | Color co-occurrence | Poor segmentation |
GLBD | Cost effective | Poor segmentation |
LK-PSVM | Optimization | Stuck in structural deformities |
3. Proposed Framework
Color segmentation of leaf disease is crucial for identifying affected areas on the leaf surface. Combining CFSs with spatial averaging and difference techniques provides a robust method for detecting diseased regions by handling uncertainties and variations in image data. A CFS extends the traditional real-valued fuzzy membership function to a complex-valued one, which therefore carries both a magnitude and a phase component. The membership function is represented in Eq. (1):
$$\mu_A(x)=a(x)+i\,b(x) \tag{1}$$
where $a(x)$ is the real part, representing the traditional membership degree, and $b(x)$ is the imaginary part, capturing uncertainties and variations.
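As a minimal illustration, assuming the rectangular form implied by the real part $a(x)$ and imaginary part $b(x)$ described above, a complex membership grade can be held in a single complex number, from which the magnitude and phase follow directly. The function name `complex_membership` and the sample values are hypothetical and are not taken from the study.

```python
import cmath

def complex_membership(a: float, b: float) -> complex:
    """Build a complex membership grade from real part a and imaginary part b."""
    return complex(a, b)

# Illustrative values only (not from the paper's data).
mu = complex_membership(0.6, 0.8)

# cmath.polar returns (magnitude, phase in radians) of a complex number.
magnitude, phase = cmath.polar(mu)
```

Holding the grade as a native complex number keeps the rectangular and polar views interchangeable, which is convenient when later steps need the magnitude.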
Membership functions for both healthy and diseased regions on the leaf were defined in this study. The membership values were derived based on the hue value $H(x)$ of a pixel $x$. The healthy membership function can be defined as follows in Eq. (2):
The real part $a_{healthy}(x)$ decreases as the hue transitions from healthy to diseased regions. The membership function for the diseased part of the leaf is given in Eq. (3):
This function captures the transition towards diseased regions. The imaginary part $b(x)$ models the uncertainty as follows in Eq. (4):
These functions capture phase differences between healthy and diseased regions. To handle interactions between healthy and diseased regions, complex fuzzy operations were utilized. The complex fuzzy AND (minimum) operation is defined in Eq. (5):
Applying complex multiplication leads to Eq. (6):
The OR (maximum) operation can be defined using the magnitude of the membership values in Eq. (7):
where the magnitude $\left|\mu_A(x)\right|$ is calculated as in Eq. (8):
$$\left|\mu_A(x)\right|=\sqrt{a(x)^2+b(x)^2} \tag{8}$$
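The complex fuzzy operations can be sketched in a few lines. This is an illustration under two assumptions that the text suggests but does not spell out: the AND is realized as the complex product, which for $\mu_A=a_A+i b_A$ and $\mu_B=a_B+i b_B$ expands to $(a_A a_B-b_A b_B)+i(a_A b_B+b_A a_B)$, and the OR returns the larger of the two magnitudes. The function names and sample grades are hypothetical.

```python
def cfs_and(mu_a: complex, mu_b: complex) -> complex:
    """Complex fuzzy AND, assumed here to be the complex product."""
    return mu_a * mu_b

def cfs_or(mu_a: complex, mu_b: complex) -> float:
    """Complex fuzzy OR, assumed here to be the larger magnitude."""
    return max(abs(mu_a), abs(mu_b))

# Illustrative membership grades (not from the paper's data).
healthy = complex(0.8, 0.1)
diseased = complex(0.3, 0.4)

both = cfs_and(healthy, diseased)    # (0.8*0.3 - 0.1*0.4) + i(0.8*0.4 + 0.1*0.3)
either = cfs_or(healthy, diseased)   # max(|healthy|, |diseased|)
```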
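After computing the membership values, spatial averaging was applied over a square window centered on each pixel to reduce noise:
$$\bar{\mu}(x)=\frac{1}{N} \sum_{x^{\prime} \in W_k(x)} \mu\left(x^{\prime}\right) \tag{9}$$
where $W_k(x)$ denotes the $(2k+1) \times (2k+1)$ window centered on pixel $x$ and $N=(2 k+1)^2$ represents the size of the neighborhood.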
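The averaging step can be sketched as a uniform mean over a $(2k+1) \times (2k+1)$ window. The sketch below is a plain-Python illustration rather than the authors' MATLAB implementation; the border handling (clamping indices to the image) is an assumption, since the text does not describe how edges are treated, and `spatial_average` is a hypothetical name.

```python
def spatial_average(mu, k=1):
    """Mean of complex memberships over a (2k+1) x (2k+1) window.

    mu is a 2D list of complex values. Border pixels are handled by
    clamping indices to the image bounds (an assumption).
    """
    rows, cols = len(mu), len(mu[0])
    n = (2 * k + 1) ** 2  # neighborhood size N
    out = [[0j] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0j
            for dr in range(-k, k + 1):
                for dc in range(-k, k + 1):
                    rr = min(max(r + dr, 0), rows - 1)
                    cc = min(max(c + dc, 0), cols - 1)
                    total += mu[rr][cc]
            out[r][c] = total / n
    return out

# A single noisy spike in an otherwise flat 3 x 3 membership map.
grid = [[0j, 0j, 0j],
        [0j, 9 + 9j, 0j],
        [0j, 0j, 0j]]
smoothed = spatial_average(grid, k=1)
```

With `k=1` the spike is spread evenly over its 3 x 3 neighborhood, which is the noise-suppressing effect the averaging step is meant to provide.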
To highlight the boundaries between healthy and diseased areas, the spatial difference between fuzzy memberships can be computed as follows:
This highlights areas with significant transitions between healthy and diseased regions.
A threshold was applied to the spatial difference to segment the diseased areas:
where $T_{\text{diff}}$ is the threshold, chosen based on the specific dataset.
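One possible reading of the difference and thresholding steps is sketched below. It is an assumption, because the exact expressions are not reproduced here: the difference is taken as the pointwise magnitude of the gap between the healthy and diseased memberships, and a pixel is flagged when that value exceeds $T_{\text{diff}}$. The function names, the sample grades, and the value 0.35 (chosen inside the range 0.2 to 0.5 reported in the experiments) are illustrative.

```python
def spatial_difference(mu_healthy, mu_diseased):
    """Pointwise |mu_healthy - mu_diseased| for two equally sized 2D grids."""
    return [[abs(h - d) for h, d in zip(row_h, row_d)]
            for row_h, row_d in zip(mu_healthy, mu_diseased)]

def threshold_mask(diff, t_diff=0.35):
    """Binary mask: 1 where the difference exceeds t_diff, else 0."""
    return [[1 if v > t_diff else 0 for v in row] for row in diff]

# Illustrative one-row membership maps (not from the paper's data).
mu_h = [[0.9 + 0.1j, 0.5 + 0.2j]]
mu_d = [[0.1 + 0.1j, 0.4 + 0.2j]]

diff = spatial_difference(mu_h, mu_d)
mask = threshold_mask(diff, t_diff=0.35)
```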
The use of CFSs allows us to model the uncertainties present in leaf disease detection effectively. The combination of complex membership functions, spatial averaging, and difference calculations provides a robust framework for accurate segmentation of diseased areas.
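To make the hue-driven memberships concrete, the sketch below extracts the hue $H(x)$ of an RGB pixel with Python's standard `colorsys` module and feeds it into two placeholder membership functions. The functional forms (a Gaussian around the green hue for the real part, sinusoidal imaginary parts) and the constants `GREEN_HUE` and `SIGMA` are hypothetical stand-ins chosen only for illustration; they are not the definitions given in Eqs. (2)-(4).

```python
import colorsys
import math

GREEN_HUE = 1.0 / 3.0  # hue of pure green on colorsys's 0..1 scale (assumed healthy center)
SIGMA = 0.08           # hypothetical spread of the healthy hue band

def hue_of(rgb):
    """Hue H(x) in [0, 1) of an 8-bit RGB pixel."""
    r, g, b = (v / 255.0 for v in rgb)
    return colorsys.rgb_to_hsv(r, g, b)[0]

def healthy_membership(h):
    """Hypothetical grade: real part falls off as hue leaves the green band."""
    a = math.exp(-((h - GREEN_HUE) ** 2) / (2 * SIGMA ** 2))
    b = 0.5 * math.sin(2 * math.pi * h)  # placeholder uncertainty term
    return complex(a, b)

def diseased_membership(h):
    """Hypothetical grade: real part rises as hue leaves the green band."""
    a = 1.0 - healthy_membership(h).real
    b = 0.5 * math.cos(2 * math.pi * h)  # placeholder uncertainty term
    return complex(a, b)

green_pixel = (0, 255, 0)    # saturated green
brown_pixel = (139, 69, 19)  # brown, typical of dried tissue

mu_green = healthy_membership(hue_of(green_pixel))
mu_brown = healthy_membership(hue_of(brown_pixel))
```

Under these placeholder choices a green pixel receives a healthy real part near 1 and a brown pixel a value near 0, which is the qualitative behavior the text attributes to the healthy membership function.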
4. Outcomes of the Proposed Model
This section describes a new model designed to segment and classify leaf diseases by analyzing leaf images. The model’s performance was compared to other established methods such as MHDL [11], AFL [8], GLBD [14] and Localized Kernel Principal Support Vector Machine (LK-PSVM) [20]. The experiments were conducted using MATLAB and evaluated based on multiple performance metrics. The experiments were run on a high-performance CPU with 8 GB of RAM and Windows 10 (64-bit) to manage the computational load of processing large-scale images.
To assess the effectiveness of the proposed method, a study on detecting diseases in different leaves was conducted under a diverse set of conditions. A total of 450 images were sourced from publicly available datasets, with the aim of identifying whether a leaf is healthy or infected. For this analysis, a disease pair model was developed for each class label, allowing for precise categorization. All images were resized to 110×110 pixels, and both prediction and optimization were performed on the downscaled images. The threshold in the proposed model was chosen in the range $T_{\mathrm{diff}} \in [0.2, 0.5]$, depending on the specific dataset. Sample images used in the experiment are shown in Figure 2. The fuzzy-based segmentation was applied to these images, dividing them into regions based on pixel intensity. Figure 3 presents a qualitative comparison of the proposed model with the competing models.
To improve the speed of the training process and overall performance, parallel computing was utilized. After the training phase, the model’s performance was assessed using standard metrics like accuracy, precision, recall, and F1-score. The results were then compared with other advanced classification techniques, showing that the proposed model performed effectively. The dataset used in this study consists of leaf images categorized into three types of diseases: sunburn, fungal infections, and paling. Each category was selected based on prevalence and significance in agricultural practices. The sunburn category includes images of leaves exhibiting sunburn damage, characterized by brown, dried areas. The fungal category encompasses leaves affected by various fungal pathogens, displaying irregular spots and decay. Lastly, the paling category includes images of leaves that have lost their green color, indicating nutrient deficiency or other physiological stress.
The classification results in Table 2 give the number of correctly identified instances for each disease type. Of the 20 sunburn images tested, two were misclassified as fungal disease, whereas all 20 fungal images and all 20 paling images were correctly identified. The average of the per-class accuracies is approximately 97%, and the model achieved a classification accuracy of 97.1%, as shown in Figure 4. This high accuracy, combined with strong results on the other metrics, highlights the model’s effectiveness and reliability in detecting leaf diseases with minimal misclassification.
Table 2. Confusion matrix and per-class accuracy of the proposed model
Actual \ Predicted | Sun-burn | Fungal | Paling | Accuracy |
Sun-burn | 18 | 2 | 0 | 90.0% |
Fungal | 0 | 20 | 0 | 100% |
Paling | 0 | 0 | 20 | 100% |
Average | | | | 97% |
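The per-class figures in Table 2 can be reproduced directly from the counts. The short sketch below recomputes them from the confusion matrix, reading rows as the actual class and columns as the predicted class; the variable names are illustrative.

```python
labels = ["Sun-burn", "Fungal", "Paling"]

# Rows: actual class; columns: predicted class (counts from Table 2).
confusion = [
    [18, 2, 0],   # actual Sun-burn
    [0, 20, 0],   # actual Fungal
    [0, 0, 20],   # actual Paling
]

# Per-class accuracy: correctly classified images divided by images of that class.
per_class = {
    label: 100.0 * confusion[i][i] / sum(confusion[i])
    for i, label in enumerate(labels)
}

# Unweighted mean of the per-class accuracies.
mean_accuracy = sum(per_class.values()) / len(per_class)

# Overall accuracy: all correct predictions divided by all test images.
correct = sum(confusion[i][i] for i in range(len(labels)))
overall_accuracy = 100.0 * correct / sum(map(sum, confusion))
```

Because every class has 20 test images, the mean of the per-class accuracies and the overall accuracy coincide at 58/60, which rounds to the 97% listed in the table.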
The classification performance was evaluated using metrics like accuracy, precision, recall, and F-measure as shown in Eqs. (12)-(15). These metrics were calculated based on the instances of true positives, true negatives, false positives, and false negatives. The calculations for each metric were derived from these values, providing a comprehensive assessment of the model’s performance.
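The four metrics can be computed from the confusion counts with their standard definitions, which are assumed here to match Eqs. (12)-(15). The example treats the sunburn class of Table 2 in a one-vs-rest fashion (18 true positives, 2 false negatives, no false positives, 40 true negatives); the function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall, F-measure) from confusion counts.

    Assumes at least one predicted positive and one actual positive,
    so that no denominator is zero.
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_measure

# One-vs-rest counts for the sunburn class, read from Table 2.
acc, prec, rec, f1 = classification_metrics(tp=18, tn=40, fp=0, fn=2)
```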
The performance of the proposed leaf disease detection model was evaluated on several plant image classification tasks using four key metrics. The results show that the proposed method outperforms existing techniques such as MHDL, GLBD, and LK-PSVM. As the dataset size grows, the proposed model demonstrates even clearer advantages, particularly in reducing classification time and increasing accuracy. Table 3 compares the performance metrics of the proposed method with those of MHDL, AFL, GLBD, and LK-PSVM. The proposed model achieves the highest accuracy, 97.10%, ahead of the closest competitor, LK-PSVM, at 92.23%. It also leads in precision at 95.13%, reflecting its ability to correctly identify diseased leaves, and its recall of 94.12% shows that it detects most diseased instances. The F-measure, which balances precision and recall, is 96.54%, well ahead of the other techniques tested. These findings indicate that the proposed model extracts features and optimizes classification effectively.
The segmentation results in Figure 5 and Figure 6 depict the accurate and precise segmentation achieved by the proposed method, while Figure 4, Figure 7, Figure 8, and Figure 9 compare its performance against MHDL, AFL, GLBD, and LK-PSVM. The proposed model stands out for its high accuracy and faster processing time while also delivering excellent image quality.
Table 3. Comparison of performance metrics across methods
Metric | Proposed | MHDL | AFL | GLBD | LK-PSVM |
Accuracy | 97.10% | 84.22% | 85.23% | 82.12% | 92.23% |
Precision | 95.13% | 82.44% | 87.12% | 78.51% | 93.10% |
Recall | 94.12% | 86.31% | 83.13% | 82.24% | 90.14% |
F-Measure | 96.54% | 81.27% | 85.33% | 84.15% | 89.14% |
Figure 4 shows that the proposed method achieves an accuracy of 97.10%, significantly higher than the other techniques. Similarly, Figure 7 displays the model’s precision, which is 95.13%, again surpassing all other methods. Figure 8 and Figure 9 visually compare the recall and F-measure, for which the proposed method achieves 94.12% and 96.54%, respectively.
Table 4 presents the performance metrics for the proposed leaf disease detection model, along with their 95% confidence intervals, which indicate the statistical reliability of the results. The model has an accuracy of 97.1%, with a confidence interval of [95.9%, 98.3%], demonstrating consistent and reliable performance. Precision is 95.1%, with a confidence interval of [94.4%, 96.2%], indicating the model’s ability to correctly identify diseased leaves. Recall is 94.1%, with a confidence interval of [93.7%, 95.8%], showcasing its effectiveness in detecting most diseased instances. The F1-score, a balance between precision and recall, is 96.5%, with a confidence interval of [95.7%, 97.5%], reflecting the model’s strong overall performance. The narrow confidence intervals for each metric suggest low variability, reinforcing the model’s robustness and reliability.
Table 4. Performance metrics of the proposed model with 95% confidence intervals
Metric | Mean Value | 95% Confidence Interval |
Accuracy | 97.1% | [95.9%, 98.3%] |
Precision | 95.1% | [94.4%, 96.2%] |
Recall | 94.1% | [93.7%, 95.8%] |
F1-Score | 96.5% | [95.7%, 97.5%] |
5. Conclusion
The proposed model, based on CFS theory combined with spatial averaging and difference techniques, introduces a novel approach to accurately segment and classify diseased and healthy areas on leaves. This methodology addresses the limitations of traditional segmentation techniques by incorporating fuzziness to handle the inherent uncertainties in image data, making it highly adaptable to different plant species and environmental conditions. The experimental results demonstrate significant improvements in disease detection accuracy, emphasizing its potential use in precision agriculture to monitor plant health and enhance crop management. Moreover, this method can be applied to various disease types, offering a flexible framework that can be extended with additional parameters such as texture or shape analysis. Future work will focus on refining the fuzzy logic parameters, automating the threshold selection process, and testing the system in diverse real-world agricultural environments to ensure broader applicability and scalability. This contribution provides a valuable tool for advancing digital plant disease detection and integrating it into agricultural decision-making processes.
The proposed method has several limitations: its performance degrades under inconsistent lighting conditions, low color contrast, and image noise; it struggles with symptoms that do not manifest as color changes; and it requires careful threshold tuning for accurate detection.
The data used to support the research findings are available from the corresponding author upon request.
The authors declare no conflict of interest.