Advanced Vehicle Detection and License Plate Recognition via the Kanade-Lucas-Tomasi Technique
Abstract:
The optimization of traffic flow, enhancement of safety measures, and minimization of emissions in intelligent transportation systems (ITS) pivotally depend on the Vehicle License Plate Recognition (VLPR) technology. Challenges predominantly arise in the precise localization and accurate identification of license plates, which are critical for the applicability of VLPR across various domains, including law enforcement, traffic management, and both governmental and private sectors. Utilization in electronic toll collection, personal security, visitor management, and smart parking systems is commercially significant. In this investigation, a novel methodology grounded in the Kanade-Lucas-Tomasi (KLT) algorithm is introduced, targeting the localization, segmentation, and recognition of characters within license plates. Implementation was conducted utilizing MATLAB software, with grayscale images derived from both still cameras and video footage serving as the input. An extensive evaluation of the results revealed an accuracy of 99.267%, a precision of 100%, a recall of 99.267%, and an F-Score of 99.632%, thereby surpassing the performance of existing methodologies. The contribution of this research is significant in addressing critical challenges inherent in VLPR systems and achieving an enhanced performance standard.
1. Introduction
The transformative impact of technological advancements on numerous facets of daily life is undeniable, and the realm of transportation management stands as a prime example [1]. Within the ambit of ITS, the development of VLPR systems has been identified as crucial for achieving optimal traffic management and bolstering security measures. These systems are adept at capturing and deciphering vehicle license plate numbers, thereby offering a broad spectrum of applications encompassing access control in parking lots, crime prevention, and traffic analysis. The origins of VLPR can be traced back to the 1970s, marking the commencement of efforts to automate the reading of license plates [2]. Yet, it was the advent of digital cameras and image processing techniques that propelled the widespread adoption of VLPR systems. Early iterations of these systems predominantly employed rule-based approaches, utilizing manually crafted features to discern license plate characters [3], [4]. These initial systems, however, demonstrated limitations in adapting to variances in lighting conditions, image quality, and license plate formats. A paradigm shift towards machine learning-based approaches for VLPR has been observed in recent years [5], [6], [7]. Such algorithms, capable of learning from data, exhibit enhanced robustness to fluctuations in image quality and conditions. VLPR systems, it is acknowledged, occupy a central role across diverse sectors, finding application in scenarios ranging from toll collection and traffic enforcement to border control and vehicle tracking [8], [9]. The present research is committed to contributing to the evolution of VLPR technology, with the objective of forging a system characterized by accuracy, robustness, and efficiency [10], [11]. Emphasis will be placed on image enhancement, license plate detection, character segmentation, character recognition, and the optimization of the overall system.
This paper's structure is as follows: Section 2 provides a comprehensive review of the VLPR literature. In Section 3, the research methodology designed to address existing challenges in VLPR and improve its performance is outlined, with innovative techniques integrated to enhance accuracy, robustness, and efficiency. Empirical findings are presented in Section 4, where performance metrics, comparative studies, and practical insights are explored. The paper concludes in Section 5, emphasizing the contributions of this work to the advancement of VLPR systems and the assurance of safer and more efficient transportation networks.
2. Literature Review
The review encompasses an exploration of various methodologies pertinent to object tracking, with a particular emphasis on applications within traffic monitoring and surveillance contexts. Techniques such as video analytics, vehicle detection, and motion tracking are scrutinized. The evaluation encompasses a range of algorithms, including One-Class Support Vector Machine (OC-SVM) and Convolutional Neural Network (CNN)-based approaches, with a focus on augmenting accuracy and mitigating false alarm occurrences. The overarching aim is the enhancement of object tracking efficacy in real-time scenarios, inclusive of challenging environmental conditions. In the work of Velazquez-Pupo et al. [12], a stationary camera is utilized in a video analytics context, serving multiple functions such as vehicle detection, occlusion handling, vehicle counting, tracking, and classification. Within this context, the application of OC-SVM with an RBF kernel is highlighted, having demonstrated superior performance, particularly in the classification of midsize vehicles, yielding an F-measure of 98.190% and 99.051% respectively. The SVM is reported as the best-performing classifier in this scenario. Furthermore, Qu et al. [13] advocate the implementation of an accurate moving vehicle detector. This encompasses the incorporation of techniques such as candidate target recognition, CNN-based vehicle screening, and the application of motion sensors with image normalization for real-time scenarios, aiming for a high detection rate. Empirical studies employing diverse datasets underscore the effectiveness of moving vehicle detection, achieving up to 90% detection performance for automobiles, while maintaining an average false alarm rate below 10%.
In the work presented by Sarcevic and Pletl [14], a novel technique has been introduced for the filtration of false alarms. Rules have been constructed separately for the various data types derived from the signals, serving as the foundation for the filtration process. The parameters exerting the most significant influence were examined independently for each data type. Subsequently, these parameters were combined into more sophisticated algorithms to yield more precise outcomes. Optimization of the model parameters was achieved through the application of evolutionary algorithms. Results from this approach indicate that, when the rules are carefully crafted, 97% of false detections could be eliminated with a negligible loss of 0.3% of accurate detections. It was observed that even the application of a single parameter could facilitate this process.
In a separate study conducted by Guo et al. [15], an augmented Single Shot MultiBox Detector (SSD) method has been proposed, aimed at addressing the low accuracy and missed detections of existing SSD methodologies for object tracking. The backbone of the proposed SSD network is ResNet50, selected for its capability to extract intricate details pertaining to vehicle features. The Feature Fusion Model, designed to enhance the accuracy of small target vehicle recognition, combines positional data from shallow features with semantic information from the feature representation. The incorporation of a Squeeze-and-Excitation (SE) block within the feature extraction layer further augments the model’s performance, enabling more comprehensive feature extraction and a reweighting of each channel's significance. Experimental findings attest to the efficacy of the modified approach, as evidenced by an average accuracy of 83.09% on a self-built vehicle dataset, surpassing the accuracy of the preceding algorithm by 3.23%.
The work of Ma et al. [16] introduces the Partial Anchors based Detection Network (PADeN), advocating for the identification and subsequent removal of incomplete anchors on vehicles to expedite the object detection process significantly. Contextual information is utilized within PADeN to discern and discard unnecessary anchors, enhancing the efficiency of object detection in images. The integration of the centerness mask branch into the network is highlighted as a pivotal enhancement to PADeN’s performance. Results from this study indicate a Mean Average Precision (mAP) of 76.9%, positioning PADeN as a superior method in comparison to previous object tracking methodologies.
In another study, Barnouti et al. [17] propose the utilization of the KLT tracker in conjunction with the Two-Dimensional Principal Component Analysis (2DPCA) tracker for the purpose of monitoring and recognizing facial features within video sequences. The initial phase employs the Viola-Jones face identification technique for face detection in images or video sequences, followed by the application of the KLT method for face tracking. The KLT tracker maintains a long-term tracking capability of facial objects across successive frames, ensuring continuity even in instances of facial appearance and disappearance. The 2DPCA feature extraction method is utilized for noise reduction and enhancement of face recognition through a distance classifier. The proposed methodology undergoes validation using the Face94 database and webcam images. Experimental results confirm the efficacy of the Viola-Jones method in frontal face detection, the proficiency of the KLT system in face tracking across diverse webcam-shot videos, and the successful face recognition capabilities of 2DPCA in both the Face94 dataset and computer webcam video series.
In the work of Yue [18], a recursive tracking system oriented towards Augmented Reality (AR) for human motion tracking is introduced. This system leverages the positional relationship between consecutive frames, employing the KLT approach in tandem with Oriented FAST and Rotated BRIEF (ORB) feature descriptors. The KLT tracking technique is applied to track the ORB feature descriptors, matching the first frame image against the reference image while concurrently tracking the feature descriptors from the preceding frame in the current frame. This approach effectively mitigates the phenomenon of virtual object jitter. Comparative analysis reveals that the recursive tracking method surpasses the detection tracking strategy in terms of both speed and accuracy. Nevertheless, the study acknowledges remaining challenges, particularly the lack of a feature tracking technique with sufficient accuracy and tracking longevity to reduce the effects of cumulative error.
In the work conducted by Ramakrishnan et al. [19], an investigation into the optimization of the window size in the KLT tracking algorithm was presented, emphasizing the necessity of adapting the window size to mitigate the impact of distortions surrounding each feature point. The researchers introduced an adaptive window size technique, employing the iterations of the KLT algorithm as a metric to assess the quality of the tracks and consequently determine the optimal window sizes. Experimental results from well-established tracking datasets indicated that this adaptive approach exhibits enhanced robustness in comparison to the conventional fixed-window KLT, and offers a comparable level of robustness to the affine KLT, all the while achieving an average runtime speedup of seven-fold.
The system “Traffic Sensor” was introduced by Fernández et al. [20], employing deep learning techniques for the automatic detection and classification of vehicles on highways, utilizing a stationary, calibrated camera. The models were trained on a novel traffic image dataset, inclusive of images captured under sub-optimal lighting and weather conditions, as well as low-resolution images. The system comprises two principal modules: the first responsible for vehicle detection and classification, and the second for vehicle tracking. Extensive evaluation and comparison of various neural models were conducted for the first module, culminating in the selection of a network based on YOLOv3 or YOLOv4, trained on the new traffic dataset. The second module integrates a straightforward spatial association technique with the more intricate KLT tracker for the tracking of moving vehicles. Validation of the system was undertaken through numerous tests on challenging traffic videos, demonstrating the system's capability to detect, track, and classify vehicles on highways effectively and in real time.
Yin et al. [21] detailed the development of an optical flow target tracking system based on the KLT algorithm, implemented on the OpenCV platform and evaluated in the context of a water pipeline intelligent inspection competition. The technique leverages the optical flow method, aiming to achieve high detection certainty and rapid operational speed for frame differentiation, with a particular focus on underwater target detection and localization. The system keeps the control error of the underwater vehicle's motion stable through the application of incremental Proportional-Integral-Derivative (PID) control.
Several limitations have been identified in the prevailing state-of-the-art algorithm employed for the tasks of vehicle detection and tracking. These constraints are primarily attributed to the algorithm’s inherent complexity, its operation within a confined frequency range for feature extraction, and its reliance on the minimum enclosing rectangle (MER) as a mechanism for object detection. The adoption of a SVM for the task of classification, particularly in scenarios characterized by high traffic volumes and noisy datasets, has been observed to yield suboptimal performance [22], [23]. The algorithm’s effectiveness is further compromised by its dependence on a fixed threshold value, a factor that serves to impede its adaptive capabilities.
The proposed approach distinguishes itself through several innovative facets:
• The employment of Haar features in conjunction with the KLT algorithm is central to the development of a vehicle detection and tracking algorithm, which is anticipated to demonstrate both computational efficiency and robustness.
• A deep learning model is integrated with the aim of enhancing the accuracy of the proposed algorithm, particularly when applied to extensive datasets and those characterized by the presence of noise.
• The introduction of a novel thresholding technique is proposed, with the objective of rendering the algorithm less susceptible to variations in threshold values.
In alignment with these innovative aspects, the study sets forth several key research objectives:
• A methodological framework is to be established for the processing and analytical examination of real-time video data, with a particular focus on the accurate localization of vehicle license plates. This effort is expected to significantly contribute to the location and retrieval of lost automobiles.
• The capabilities of Haar features and the KLT algorithm are to be harnessed for the detection and tracking of vehicles within video streams.
• A rigorous comparative analysis is planned, wherein the proposed methodology will be evaluated against existing approaches. This evaluation will utilize a comprehensive set of performance metrics, including but not limited to accuracy, efficiency, recall, and precision, ensuring a thorough assessment of the methodologies in question.
3. Methodology
To fulfill the established research objectives, a novel integration of Haar features and the KLT algorithm is introduced, enhancing the vehicle detection and tracking process. Haar features are utilized for their capacity to robustly identify distinctive object characteristics, while the KLT algorithm is employed to facilitate object tracking across frames within real-time video streams [15], [24]. The objective of employing these techniques is to augment both accuracy and efficiency, thereby mitigating the limitations identified in the extant algorithm.
In the framework of the study, the following methodologies are employed:
Haar Features
Characterized by their simplistic rectangular shape, Haar features serve to represent the edges and corners of objects within images. Their computational efficiency in extraction, coupled with their demonstrated efficacy in diverse image processing tasks, including object detection and tracking, renders them a valuable tool in this context.
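To make the efficiency claim concrete, the following is a minimal sketch of how a two-rectangle Haar-like feature can be evaluated with an integral image (summed-area table), so that each rectangle sum costs four lookups regardless of its size. It is written in plain Python for illustration only; the function names (`integral_image`, `rect_sum`, `haar_vertical_edge`) and the toy 4x4 image are assumptions of this sketch and are not taken from the implementation used in the study.

```python
def integral_image(img):
    """Build a summed-area table with one extra zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle whose top-left pixel is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_vertical_edge(ii, x, y, w, h):
    """Two-rectangle feature: right half minus left half (w must be even)."""
    half = w // 2
    left = rect_sum(ii, x, y, half, h)
    right = rect_sum(ii, x + half, y, half, h)
    return right - left


# Hypothetical toy image: dark left half, bright right half.
image = [[10, 10, 200, 200] for _ in range(4)]
ii = integral_image(image)

edge_response = haar_vertical_edge(ii, 0, 0, 4, 4)  # window straddles the edge
flat_response = haar_vertical_edge(ii, 2, 0, 2, 4)  # window inside the bright area
```

The window that straddles the intensity step produces a large response, whereas the window placed entirely inside the uniform region produces zero, which is the behaviour a detector built on such features exploits.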
KLT Algorithm
The KLT algorithm, grounded in optical flow principles, is employed for feature tracking within video sequences. It operates on the premise that the brightness of a pixel remains invariant over time, thus facilitating the tracking of feature movement.
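The brightness-constancy premise can be written out as a short derivation. The block below is a standard textbook formulation rather than a description of the specific implementation used in this study; the symbols $u$, $v$ (displacement of a feature between frames), $I_x$, $I_y$, $I_t$ (spatial and temporal image derivatives), and $W$ (the tracking window) are notation introduced here for illustration.

```latex
% Brightness constancy: a tracked point keeps its intensity between frames
I(x + u,\; y + v,\; t + 1) = I(x, y, t)

% First-order Taylor expansion gives the optical flow constraint
I_x u + I_y v + I_t = 0

% One equation with two unknowns per pixel, so the constraint is
% solved in the least-squares sense over a window W around the feature
\min_{u, v} \sum_{(x, y) \in W} \left( I_x u + I_y v + I_t \right)^2

% Setting the gradient to zero yields the 2x2 normal equations
\begin{bmatrix}
  \sum_W I_x^2   & \sum_W I_x I_y \\
  \sum_W I_x I_y & \sum_W I_y^2
\end{bmatrix}
\begin{bmatrix} u \\ v \end{bmatrix}
= -
\begin{bmatrix} \sum_W I_x I_t \\ \sum_W I_y I_t \end{bmatrix}
```

The system is well conditioned only where the 2x2 matrix on the left has two sufficiently large eigenvalues, which is why KLT-style trackers select corner-like points as features.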
Deep Learning
Artificial neural networks form the basis of deep learning, a subset of machine learning. This technique has shown substantial effectiveness across a myriad of image processing tasks, including those pertinent to object detection and tracking.
The methodology encompasses several pivotal steps, delineated in Figure 1:
Step 1: Real-Time Video Data Processing: Vehicle license plates are located and extracted through the processing of input video data in real time.
Step 2: Haar Feature-Based Vehicle Detection: Vehicles within video frames are detected with precision, utilizing Haar features.
Step 3: KLT-Based Vehicle Tracking: Subsequent to detection, vehicles are continuously monitored across consecutive frames through the application of the KLT algorithm.
Step 4: Performance Evaluation: The efficacy of the proposed methodology is rigorously assessed through comparative analysis with pre-existing methods. Accuracy, efficiency, recall, and precision are employed as the key performance indicators in this evaluation.
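Step 3 above can be illustrated with a single Lucas-Kanade translation estimate, the core update inside KLT tracking. The sketch below is plain Python rather than the MATLAB implementation reported in this paper, and it is not the authors' code: the function names (`make_bowl`, `lucas_kanade_step`), the synthetic quadratic test image, and the chosen shift are all assumptions made for illustration. A practical tracker would additionally iterate this step, use image pyramids, and select corner-like features.

```python
def make_bowl(size, cx, cy):
    """Synthetic frame: a quadratic intensity bowl centred at (cx, cy)."""
    return [[(x - cx) ** 2 + (y - cy) ** 2 for x in range(size)]
            for y in range(size)]


def lucas_kanade_step(prev, curr, cx, cy, radius):
    """Estimate the translation (u, v) of the window centred at (cx, cy).

    Solves the 2x2 normal equations of the brightness-constancy
    constraint Ix*u + Iy*v + It = 0 accumulated over the window.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # central difference in x
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0  # central difference in y
            it = curr[y][x] - prev[y][x]                  # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("window has too little texture to track")
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Hypothetical test: the pattern moves by (0.4, -0.25) pixels between frames.
true_u, true_v = 0.4, -0.25
prev_frame = make_bowl(11, 5.0, 5.0)
curr_frame = make_bowl(11, 5.0 + true_u, 5.0 + true_v)

u, v = lucas_kanade_step(prev_frame, curr_frame, cx=5, cy=5, radius=3)
```

The quadratic test image is chosen deliberately: with a window centred on the bowl, the linearization holds exactly, so a single step recovers the sub-pixel shift. On real video frames one step is only approximate, which is why the estimate is normally refined iteratively.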
The incorporation of Haar features and the KLT algorithm is substantiated by their proven efficacy in this domain. Haar features excel in robustly discerning vehicles within video frames, whereas the KLT algorithm guarantees seamless tracking of the vehicles once identified, across various frames. The integration of an adaptive thresholding technique further refines the system’s adaptability and overall performance, promising enhanced accuracy, particularly in scenarios characterized by high traffic volumes and prevalent noise.
4. Results and Discussion
In the current urban milieu, the escalating vehicular population necessitates advanced VLPR systems, as instances of vehicle theft, traffic violations, and unauthorized access to restricted areas are witnessing a surge. This manuscript introduces a novel methodology leveraging the KLT algorithm, designed for the localization, segmentation, and recognition of characters on license plates. The method is delineated across key phases: detection of the number plate, segmentation of characters, and subsequent character recognition. A thorough comparative analysis against extant methodologies is presented herein.
Figure 2 shows a car, parked in front of a garage, bearing a rectangular white plate with black lettering. Figure 3 shows the detection of the car plate using KLT and R-CNN: KLT is a feature tracking algorithm that tracks the movement of the plate over time, while R-CNN is an object detection algorithm that detects the location of the plate in the image. The efficacy of the proposed approach is evaluated through several performance metrics, including precision, accuracy, recall, and the F1-Score. These metrics collectively afford a holistic evaluation of the model’s performance capabilities. Furthermore, quantitative insights pertaining to processing time, speed, and computational complexity are elucidated, providing a comprehensive overview of the system's operational efficiency.
Precision, a prevalent metric in assessing the efficacy of text classification and information retrieval systems, quantifies the proportion of retrieved items that are pertinent to the user's query.
Figure 4 shows a bar chart comparing the precision of existing and proposed license plate detection methods. As seen from the figure, the KLT+R-CNN method has the higher precision of the two methods. This is because KLT+R-CNN uses a combination of feature tracking and object detection to identify license plates. Feature tracking helps to identify the plate's location in the image, while object detection helps to confirm whether the object identified is actually a license plate.
Accuracy stands as the predominant metric for gauging the performance of a classification model, being derived from the ratio of correctly classified instances to the total number of instances. The error rate offers a complementary measure of classification efficacy, computed as the ratio of incorrectly classified instances to the total number of instances, so that it equals one minus the accuracy. Figure 5 compares the accuracy of different plate detection methods. The blue bar represents the average accuracy of the proposed algorithm, whereas the orange bar represents the average accuracy of the existing CNN model.
The performance of the proposed method is presented in Table 1, which includes accuracy scores for both our method and existing algorithms. Accuracy is a measure of how many detections are correct. A higher accuracy score indicates that the algorithm is more likely to correctly identify license plates. As shown in the table, the proposed algorithm has the highest accuracy of all the algorithms compared.
Algorithm | Accuracy |
---|---|
ZF | 0.94 |
VGG16 | 0.97 |
VGG-CNN-M-1024 | 0.96 |
ResNet101 | 0.94 |
ResNet50 | 0.97 |
OKM-CNN | 0.98 |
Proposed | 0.99267 |
Recall, the proportion of actual positive instances that the model correctly identifies, is deemed suitable when the primary objective lies in the minimization of false negatives. On certain occasions, the emphasis is instead placed on obtaining precise predictions for the positive class. Figure 6 compares the recall of different plate detection methods. As shown in the figure, the proposed algorithm has a recall of 90%, which is significantly higher than the recall of the existing algorithm, which is 80%. This suggests that the proposed algorithm is more likely to correctly identify plates in an image.
The F-score, alternatively referred to as the F1-score, constitutes a critical metric employed in assessing a model’s performance in dataset analysis. It predominantly finds its application in the scrutiny of binary classification systems, responsible for assigning instances to either ‘positive’ or ‘negative’ classes. Fundamentally, the F-score functions as a balanced mechanism to amalgamate both the precision and recall of the model, being rigorously delineated as the harmonic mean of the model’s respective precision and recall metrics.
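For reference, the standard definitions behind these metrics can be stated compactly. The symbols $TP$, $FP$, and $FN$ (true positives, false positives, and false negatives) are notation introduced here; the source does not report these counts.

```latex
% Precision: fraction of predicted positives that are correct
P = \frac{TP}{TP + FP}

% Recall: fraction of actual positives that are found
R = \frac{TP}{TP + FN}

% F-score (F1): harmonic mean of precision and recall
F_1 = \frac{2 \, P \, R}{P + R}
```

Because the harmonic mean is dominated by the smaller of its two arguments, a model cannot obtain a high F-score by excelling at only one of precision or recall.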
Figure 7 shows the comparison of the model F-score across different feature sets. In this figure, the x-axis represents the different feature sets that were used to train the model, and the y-axis represents the F-score of the model. The blue line shows the F-score of the model that was trained on the full set of features, and the other lines show the F-score of the model that was trained on subsets of the features. As it can be seen, the F-score increases as the number of features increases. However, the increase in F-score starts to level off after a certain number of features. This suggests that there is a point at which adding more features does not significantly improve the performance of the model. Figure 8 provides a visual representation of the performance of the proposed method compared to the existing work. The figure shows that the proposed method consistently outperforms the existing work across all four parameters.
Parameter | Existing Work (%) | Proposed Work (%) |
---|---|---|
Accuracy | 74.572 | 99.267 |
Recall | 74.572 | 99.267 |
Precision | 100 | 100 |
F-Score | 85.434 | 99.632 |
Table 2 compares the performance of the proposed method to the existing work in terms of four key parameters: accuracy, recall, precision, and F1-score. For accuracy, the proposed method achieves 99.267%, while the existing work achieves 74.572%. This means that the proposed method is much better at correctly classifying instances than the existing work.
For recall, the proposed method achieves 99.267%, while the existing work achieves 74.572%. This means that the proposed method also misses far fewer positive instances than the existing work. For precision, both the proposed method and the existing work achieve 100%, meaning that both avoid false positives equally well.
F1-score is the harmonic mean of precision and recall. It is a more balanced measure of overall performance than either precision or recall alone. In this case, the proposed method has an F1-score of 99.632%, which is significantly higher than the F1-score of 85.434% for the existing work. Since both methods reach 100% precision, this gain is attributable entirely to the higher recall of the proposed method.
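The F1-scores in Table 2 can be cross-checked directly from the reported precision and recall. The short Python sketch below does this; the function name `f_score` and the dictionary layout are illustrative choices, and the only data used are the percentages already listed in Table 2.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F-score %) taken from Table 2.
table2 = {
    "Existing Work": (100.0, 74.572, 85.434),
    "Proposed Work": (100.0, 99.267, 99.632),
}

# Recompute each F-score as a percentage from precision and recall.
recomputed = {
    name: 100 * f_score(p / 100, r / 100)
    for name, (p, r, _) in table2.items()
}

for name, (_, _, reported) in table2.items():
    print(f"{name}: reported {reported:.3f}, recomputed {recomputed[name]:.3f}")
```

Both recomputed values agree with the reported ones to three decimal places. As an observation rather than a claim from the source, accuracy equal to recall together with 100% precision is what one would expect if the evaluation set contained no negative examples, although the confusion counts are not reported.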
Overall, the proposed method is significantly more accurate and has a higher recall and a higher F1-score, while matching the existing work on precision. These results suggest that the proposed method is a more effective approach to the task of classification.
Figure 9 presents the performance metrics of the proposed KLT-based approach, revealing an impressive accuracy of 99.267%, a precision of 100%, a recall of 99.267%, and an F-Score of 99.632%. These results markedly outperform those achieved by existing techniques. The KLT method has demonstrated its robustness and precision in character localization and segmentation, culminating in enhanced accuracy for character recognition.
5. Conclusion
The burgeoning population density worldwide necessitates efficacious methodologies for vehicle detection, crucial for traffic management optimization. In this study, a robust VLPR system has been elucidated, utilizing Raspberry Pi for video processing. This system adeptly identifies and extracts numerical information from vehicle license plates through a meticulously designed suite of methodologies and algorithms.
The feasibility of the VLPR system for practical implementation in traffic control and management has been established, with promising implications for enhancing law enforcement, traffic surveillance, and security measures. The KLT-based method proposed herein has demonstrated its capability to develop a vehicle detection and tracking algorithm that is not only computationally efficient but also robust and precise, addressing the limitations inherent in existing methodologies.
In conclusion, the current research has validated the effectiveness of the VLPR system, laying a solid groundwork for its practical and impactful applications. As continual refinements and expansions are made to this work, there is a potential for technology to play a pivotal role in surmounting contemporary challenges, further harnessing its capacity to contribute to societal advancement.
The data used to support the findings of this study are available from the corresponding author upon request.
The authors declare that they have no conflicts of interest.