International Journal of Intelligent Systems, volume 2023, pages 1-17

LE-YOLOv5: A Lightweight and Efficient Road Damage Detection Algorithm Based on Improved YOLOv5

Zhuo Diao 1
Xianfu Huang 2, 3
Han Liu 4
Zhanwei Liu 1
2 State Key Laboratory of Nonlinear Mechanics (LNM), Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, China
4 Beijing Institute of Structure and Environment Engineering, Beijing 100076, China
Publication type: Journal Article
Publication date: 2023-09-28
SCImago: Q1
SJR: 1.264
CiteScore: 11.3
Impact factor: 5
ISSN: 08848173, 1098111X
Artificial Intelligence
Software
Theoretical Computer Science
Human-Computer Interaction
Abstract

Road damage detection is essential for road safety and timely repair. Earlier detection methods rely mainly on human inspectors or large machines, which are costly and inefficient, while existing algorithms are computationally expensive and difficult to deploy on edge devices. To solve this problem, we propose LE-YOLOv5, a lightweight and efficient road damage detection algorithm based on YOLOv5. We propose a global shuffle attention module that addresses the shortcomings of the SE attention module in MobileNetV3 and use it to build a better backbone feature extraction network, which greatly reduces the parameters and GFLOPS of the model while increasing its computational speed. To construct a simple and efficient neck network, we replace the standard convolution in the neck with a lightweight hybrid convolution. We also introduce the lightweight coordinate attention module into the cross-stage partial network module, which was designed using the one-time aggregation method. In addition, we propose a parameter-free attentional feature fusion (PAFF) module, which significantly enhances the model's ability to capture long-range contextual information by guiding and enhancing correlation learning between the channel and spatial directions without introducing additional parameters. The K-means clustering algorithm is used to fit the anchor boxes to the dataset. Finally, we use a label smoothing algorithm to improve the generalization ability of the model. The experimental results show that LE-YOLOv5 detects road damage stably and effectively. Compared to YOLOv5s, LE-YOLOv5 reduces the parameters by 52.6% and the GFLOPS by 57.0%, while its mean average precision (mAP) improves by 5.3%. LE-YOLOv5 is therefore much more lightweight while still providing excellent performance. We also set up visualization experiments comparing multiple detection algorithms in a variety of complex road environments. The results show that LE-YOLOv5 exhibits excellent robustness and reliability in complex road environments.
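
Two of the training-side steps named in the abstract, K-means anchor fitting and label smoothing, can be illustrated with a minimal sketch. The abstract does not give the authors' exact procedure, so everything here is an assumption: the 1 - IoU distance on (width, height) pairs is the formulation commonly used for YOLO anchors, the smoothing formula is the generic multi-class one, and the function names, the `eps` value, and the sample box sizes are hypothetical.

```python
import random


def wh_iou(box, anchor):
    """IoU of two boxes given only as (width, height), aligned at a shared corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs into k anchors using 1 - IoU as the distance."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)  # start from k distinct boxes of the dataset
    for _ in range(iters):
        # Assign each box to the anchor it overlaps most (smallest 1 - IoU).
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            clusters[best].append(box)
        # Move each anchor to the mean width and height of its cluster.
        new_anchors = []
        for anchor, members in zip(anchors, clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_anchors.append((w, h))
            else:
                new_anchors.append(anchor)  # an empty cluster keeps its anchor
        if new_anchors == anchors:
            break  # assignments no longer change the anchors
        anchors = new_anchors
    return sorted(anchors, key=lambda a: a[0] * a[1])


def smooth_labels(one_hot, eps=0.1):
    """Label smoothing: blend a one-hot target with the uniform distribution."""
    n = len(one_hot)
    return [y * (1.0 - eps) + eps / n for y in one_hot]


if __name__ == "__main__":
    # Hypothetical box sizes in pixels: wide cracks, tall cracks, small potholes.
    boxes = [(200, 20), (220, 25), (180, 18), (20, 210), (25, 190), (22, 205),
             (40, 40), (45, 38), (38, 42)]
    print(kmeans_anchors(boxes, k=3))
    print(smooth_labels([0, 0, 1, 0]))
```

Using 1 - IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is the usual motivation for this choice in anchor-based detectors.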

