Journal articles on the topic "Detr"

1

Wang, Runqi, Huixin Sun, Linlin Yang, Shaohui Lin, Chuanjian Liu, Yan Gao, Yao Hu, and Baochang Zhang. "AQ-DETR: Low-Bit Quantized Detection Transformer with Auxiliary Queries". Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 14 (March 24, 2024): 15598–606. http://dx.doi.org/10.1609/aaai.v38i14.29487.

Abstract:
DEtection TRansformer (DETR)-based models have achieved remarkable performance. However, they are accompanied by a large computation overhead cost, which significantly limits their application on resource-limited devices. Prior work attempts to reduce the computational burden of DETR using low-bit quantization, but these methods suffer a severe performance drop under weight-activation-attention low-bit quantization. We observe that the number of matching queries and positive samples strongly affects the representation capacity of queries in DETR, while quantizing the queries of DETR further reduces its representational capacity, thus leading to a severe performance drop. We introduce a new quantization strategy based on Auxiliary Queries for DETR (AQ-DETR), aiming to enhance the capacity of quantized queries. In addition, a layer-by-layer distillation is proposed to reduce the quantization error between quantized attention and full-precision counterpart. Through our extensive experiments on large-scale open datasets, the performance of the 4-bit quantization of DETR and Deformable DETR models is comparable to full-precision counterparts.
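To make the idea of low-bit quantization concrete, here is a minimal sketch of uniform symmetric 4-bit quantization on a short list of values. It is a generic illustration, not the AQ-DETR method; the function names and the example weights are assumptions made up for this sketch.

```python
def quantize_symmetric(values, bits=4):
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit symmetric quantization
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


weights = [0.7, -0.3, 0.12, 0.0]  # made-up example values
codes, scale = quantize_symmetric(weights, bits=4)
restored = dequantize(codes, scale)
# The rounding error per value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

With fewer bits the step size grows, so the rounding error grows too, which is the kind of capacity loss the abstract refers to.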
2

Gu, Yanhong, Tao Zhang, Yuxia Hu, and Fudong Nian. "CC-DETR: DETR with Hybrid Context and Multi-Scale Coordinate Convolution for Crowd Counting". Mathematics 12, no. 10 (May 17, 2024): 1562. http://dx.doi.org/10.3390/math12101562.

Abstract:
Prevailing crowd counting approaches primarily rely on density map regression methods. Despite wonderful progress, significant scale variations and complex background interference within the same image remain challenges. To address these issues, in this paper we propose a novel DETR-based crowd counting framework called Crowd Counting DETR (CC-DETR), which aims to extend the state-of-the-art DETR object detection framework to the crowd counting task. In CC-DETR, a DETR-like encoder–decoder structure (Hybrid Context DETR, i.e., HCDETR) is proposed to tackle complex visual information by fusing features from hybrid semantic levels through a transformer. In addition, we design a Coordinate Dilated Convolution Module (CDCM) to effectively employ position-sensitive context information in different scales. Extensive experiments on three challenging crowd counting datasets (ShanghaiTech, UCF-QNRF, and NWPU) demonstrate that our model is effective and competitive when compared against SOTA crowd counting models.
3

Wang, Dashuai, Zhuolin Li, Xiaoqiang Du, Zenghong Ma, and Xiaoguang Liu. "Farmland Obstacle Detection from the Perspective of UAVs Based on Non-local Deformable DETR". Agriculture 12, no. 12 (November 23, 2022): 1983. http://dx.doi.org/10.3390/agriculture12121983.

Abstract:
In precision agriculture, unmanned aerial vehicles (UAVs) are playing an increasingly important role in farmland information acquisition and fine management. However, discrete obstacles in the farmland environment, such as trees and power lines, pose serious threats to the flight safety of UAVs. Real-time detection of the attributes of obstacles is urgently needed to ensure their flight safety. In the wake of rapid development of deep learning, object detection algorithms based on convolutional neural networks (CNN) and transformer architectures have achieved remarkable results. Detection Transformer (DETR) and Deformable DETR combine CNN and transformer to achieve end-to-end object detection. The goal of this work is to use Deformable DETR for the task of farmland obstacle detection from the perspective of UAVs. However, limited by local receptive fields and local self-attention mechanisms, Deformable DETR lacks the ability to capture long-range dependencies to some extent. Inspired by non-local neural networks, we introduce the global modeling capability to the front-end ResNet to further improve the overall performance of Deformable DETR. We refer to the improved version as Non-local Deformable DETR. We evaluate the performance of Non-local Deformable DETR for farmland obstacle detection through comparative experiments on our proposed dataset. The results show that, compared with the original Deformable DETR network, the mAP value of the Non-local Deformable DETR is increased from 71.3% to 78.0%. Additionally, Non-local Deformable DETR also presents great performance for detecting small and slender objects. We hope this work can provide a solution to the flight safety problems encountered by UAVs in unstructured farmland environments.
4

Wu, Jianshuang, Changji Wen, Hongrui Chen, Zhenyu Ma, Tian Zhang, Hengqiang Su, and Ce Yang. "DS-DETR: A Model for Tomato Leaf Disease Segmentation and Damage Evaluation". Agronomy 12, no. 9 (August 26, 2022): 2023. http://dx.doi.org/10.3390/agronomy12092023.

Abstract:
Early blight and late blight are important factors restricting tomato yield. However, it is still a challenge to accurately and objectively detect and segment crop diseases in order to evaluate disease damage. In this paper, the Disease Segmentation Detection Transformer (DS-DETR) is proposed to segment leaf disease spots efficiently based on several improvements to DETR. Additionally, a damage assessment is carried out by the area ratio of the segmented leaves to the disease spots. First, an unsupervised pre-training method was introduced into DETR with the Plant Disease Classification Dataset (PDCD) to solve the problem of the long training epochs and slow convergence speed of DETR. This method can train the Transformer structures in advance to obtain leaf disease features. Loading the pre-training model weight in DS-DETR can speed up the convergence speed of the model. Then, Spatially Modulated Co-Attention (SMCA) was used to assign Gaussian-like spatial weights to the query box of DS-DETR. The different positions in the image are trained using the query boxes with different weights to improve the accuracy of the model. Finally, an improved relative position code was added to the Transformer structure of DS-DETR. Relative position coding promotes the capture of the sequence order of input tokens by the Transformer. The spatial location feature is strengthened by establishing the location relationship between different instances. Based on these improvements, the DS-DETR model was tested on the Tomato leaf Disease Segmentation Dataset (TDSD) constructed by us. The experimental results show that the DS-DETR proposed by us achieved 0.6823 for APmask, which improved by 12.87%, 8.25%, 3.67%, 1.95%, 10.27%, and 9.52% compared with the state-of-the-art: Mask RCNN, BlendMask, CondInst, SOLOv2, ISTR, and DETR, respectively. In addition, the disease grading accuracy reached 0.9640 according to the segmentation results given by our proposed model.
5

Liu, Minggao, Haifeng Wang, Luyao Du, Fangsong Ji, and Ming Zhang. "Bearing-DETR: A Lightweight Deep Learning Model for Bearing Defect Detection Based on RT-DETR". Sensors 24, no. 13 (June 30, 2024): 4262. http://dx.doi.org/10.3390/s24134262.

Abstract:
Detecting bearing defects accurately and efficiently is critical for industrial safety and efficiency. This paper introduces Bearing-DETR, a deep learning model optimised using the Real-Time Detection Transformer (RT-DETR) architecture. Enhanced with Dysample Dynamic Upsampling, Efficient Model Optimization (EMO) with Meta-Mobile Blocks (MMB), and Deformable Large Kernel Attention (D-LKA), Bearing-DETR offers significant improvements in defect detection while maintaining a lightweight framework suitable for low-resource devices. Validated on a dataset from a chemical plant, Bearing-DETR outperformed the standard RT-DETR, achieving a mean average precision (mAP) of 94.3% at IoU = 0.5 and 57.5% at IoU = 0.5–0.95. It also reduced floating-point operations (FLOPs) to 8.2 G and parameters to 3.2 M, underscoring its enhanced efficiency and reduced computational demands. These results demonstrate the potential of Bearing-DETR to transform maintenance strategies and quality control across manufacturing environments, emphasising adaptability and impact on sustainability and operational costs.
6

Qin, Yujing, Qian Liu, and Chuhan Lu. "Cold Front Identification Using the DETR Model with Satellite Cloud Imagery". Remote Sensing 17, no. 1 (December 26, 2024): 36. https://doi.org/10.3390/rs17010036.

Abstract:
The cloud system characteristics within satellite cloud imagery play a crucial role in the meteorological operational analysis of cold fronts, and integrating satellite cloud imagery into automated frontal identification schemes can provide valuable insights for accurately determining the position and morphology of cold fronts. This study introduces Cloud-DETR, a deep learning identification method that uses the DETR model with satellite cloud imagery, to identify cold fronts from extensive datasets. In the Cloud-DETR method, preprocessed satellite cloud imagery is used to generate training images, which are then put into the DETR model for cold front identification, achieving excellent results. The alignment between the Cloud-DETR cold fronts and weather systems during continuous periods and extreme weather events is assessed. The Cloud-DETR method exhibits high accuracy in both the position and morphology of cold fronts, ensuring stable identification performance. The high matching rate between the Cloud-DETR cold fronts and the manually identified ones in the test set, image dataset and labels from 2017 is verified. This indicates that the Cloud-DETR method can provide an accurate cold fronts dataset. The cold fronts dataset from 2005 to 2023 was obtained using the Cloud-DETR method. It was found that over the past 18 years, the frequency of cold fronts displays distinct seasonal patterns, with the highest occurrences observed during winter, particularly along the mid-latitude storm tracks extending from the east coast of East Asia to the Northwest Pacific. The methodology and findings presented in this study could help advance further research on the characteristics of cold front cloud systems based on long-term datasets.
7

Liu, Zhiyong, Kehan Wang, Changming Li, Yixuan Wang, and Guoqian Luo. "TIG-DETR: Enhancing Texture Preservation and Information Interaction for Target Detection". Applied Sciences 13, no. 14 (July 10, 2023): 8037. http://dx.doi.org/10.3390/app13148037.

Abstract:
FPN (Feature Pyramid Network) and transformer-based target detectors are commonly employed in target detection tasks. However, these approaches suffer from design flaws that restrict their performance. To overcome these limitations, we proposed TIG-DETR (Texturized Instance Guidance DETR), a novel target detection model. TIG-DETR comprises a backbone network, TE-FPN (Texture-Enhanced FPN), and an enhanced DETR detector. TE-FPN addresses the issue of texture information loss in FPN by utilizing a bottom-up architecture, Lightweight Feature-wise Attention, and Feature-wise Attention. These components effectively compensate for texture information loss, mitigate the confounding effect of cross-scale fusion, and enhance the final output features. Additionally, we introduced the Instance Based Advanced Guidance Module in the DETR-based detector to tackle the weak detection of larger objects caused by the limitations of window interactions in Shifted Window-based Self-Attention. By incorporating TE-FPN instead of FPN in Faster RCNN and employing ResNet-50 as the backbone network, we observed an improvement of 1.9 AP in average accuracy. By introducing the Instance-Based Advanced Guidance Module, the average accuracy of the DETR-based target detector has been improved by 0.4 AP. TIG-DETR achieves an impressive average accuracy of 44.1% with ResNet-50 as the backbone network.
8

Zhang, Haiyan, Huiqi Li, Guodong Sun, and Feng Yang. "MDA-DETR: Enhancing Offending Animal Detection with Multi-Channel Attention and Multi-Scale Feature Aggregation". Animals 15, no. 2 (January 17, 2025): 259. https://doi.org/10.3390/ani15020259.

Abstract:
Conflicts between humans and animals in agricultural and settlement areas have recently increased, resulting in significant resource loss and risks to human and animal lives. This growing issue presents a global challenge. This paper addresses the detection and identification of offending animals, particularly in obscured or blurry nighttime images. This article introduces Multi-Channel Coordinated Attention and Multi-Dimension Feature Aggregation (MDA-DETR). It integrates multi-scale features for enhanced detection accuracy, employing a Multi-Channel Coordinated Attention (MCCA) mechanism to incorporate location, semantic, and long-range dependency information and a Multi-Dimension Feature Aggregation Module (DFAM) for cross-scale feature aggregation. Additionally, the VariFocal Loss function is utilized to assign pixel weights, enhancing detail focus and maintaining accuracy. In the dataset section, this article uses a dataset from the Northeast China Tiger and Leopard National Park, which includes images of six common offending animal species. In the comprehensive experiments on the dataset, the mAP50 index of MDA-DETR was 1.3%, 0.6%, 0.3%, 3%, 1.1%, and 0.5% higher than RT-DETR-r18, yolov8n, yolov9-C, DETR, Deformable-detr, and DCA-yolov8, respectively, indicating that MDA-DETR is superior to other advanced methods.
9

Zheng, Zhiqiang, Mengbo Wang, Xiaoyu Zhao, and Zhi Weng. "Adltformer Team-Training with Detr: Enhancing Cattle Detection in Non-Ideal Lighting Conditions Through Adaptive Image Enhancement". Animals 14, no. 24 (December 17, 2024): 3635. https://doi.org/10.3390/ani14243635.

Abstract:
This study proposes an image enhancement detection technique based on Adltformer (Adaptive dynamic learning transformer) team-training with Detr (Detection transformer) to improve model accuracy in suboptimal conditions, addressing the challenge of detecting cattle in real pastures under complex lighting conditions—including backlighting, non-uniform lighting, and low light. This often results in the loss of image details and structural information, color distortion, and noise artifacts, thereby compromising the visual quality of captured images and reducing model accuracy. To train the Adltformer enhancement model, the day-to-night image synthesis (DTN-Synthesis) algorithm generates low-light image pairs that are precisely aligned with normal light images and include controlled noise levels. The Adltformer and Detr team-training (AT-Detr) method is employed to preprocess the low-light cattle dataset for image enhancement, ensuring that the enhanced images are more compatible with the requirements of machine vision systems. The experimental results demonstrate that the AT-Detr algorithm achieves superior detection accuracy, with comparable runtime and model complexity, reaching 97.5% accuracy under challenging illumination conditions, outperforming both Detr alone and sequential image enhancement followed by Detr. This approach provides both theoretical justification and practical applicability for detecting cattle under challenging conditions in real-world farming environments.
10

Cao, Xipeng, Peng Yuan, Bailan Feng, and Kun Niu. "CF-DETR: Coarse-to-Fine Transformers for End-to-End Object Detection". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 1 (June 28, 2022): 185–93. http://dx.doi.org/10.1609/aaai.v36i1.19893.

Abstract:
The recently proposed DEtection TRansformer (DETR) achieves promising performance for end-to-end object detection. However, it has relatively lower detection performance on small objects and suffers from slow convergence. This paper observed that DETR performs surprisingly well even on small objects when measuring Average Precision (AP) at decreased Intersection-over-Union (IoU) thresholds. Motivated by this observation, we propose a simple way to improve DETR by refining the coarse features and predicted locations. Specifically, we propose a novel Coarse-to-Fine (CF) decoder layer constituted of a coarse layer and a carefully designed fine layer. Within each CF decoder layer, the extracted local information (region of interest feature) is introduced into the flow of global context information from the coarse layer to refine and enrich the object query features via the fine layer. In the fine layer, the multi-scale information can be fully explored and exploited via the Adaptive Scale Fusion (ASF) module and Local Cross-Attention (LCA) module. The multi-scale information can also be enhanced by another proposed Transformer Enhanced FPN (TEF) module to further improve the performance. With our proposed framework (named CF-DETR), the localization accuracy of objects (especially for small objects) can be largely improved. As a byproduct, the slow convergence issue of DETR can also be addressed. The effectiveness of CF-DETR is validated via extensive experiments on the COCO benchmark. CF-DETR achieves state-of-the-art performance among end-to-end detectors, e.g., achieving 47.8 AP using ResNet-50 with 36 epochs in the standard 3x training schedule.
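The observation about lower IoU thresholds can be illustrated with a small sketch: the same coarsely localized prediction counts as a hit at a loose threshold and as a miss at a strict one. The box coordinates, the helper names, and the thresholds below are illustrative assumptions, not values from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold):
    """A prediction counts as correct when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


prediction = (0.0, 0.0, 10.0, 10.0)    # hypothetical, coarsely placed box
ground_truth = (5.0, 0.0, 15.0, 10.0)  # hypothetical annotation
for threshold in (0.3, 0.5, 0.75):
    print(threshold, is_true_positive(prediction, ground_truth, threshold))
```

Here the overlap is one third of the union, so the prediction passes only at the loosest threshold.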
11

Xia, Jiaao, Meijuan Li, Weikang Liu, and Xuebo Chen. "DSRA-DETR: An Improved DETR for Multiscale Traffic Sign Detection". Sustainability 15, no. 14 (July 11, 2023): 10862. http://dx.doi.org/10.3390/su151410862.

Abstract:
Traffic sign detection plays an important role in improving the capabilities of automated driving systems by addressing road safety challenges in sustainable urban living. In this paper, we present DSRA-DETR, a novel approach focused on improving multiscale detection performance. Our approach integrates the dilated spatial pyramid pooling model (DSPP) and the multiscale feature residual aggregation module (FRAM) to aggregate features at various scales. These modules excel at reducing feature noise and minimizing loss of low-level features during feature map extraction. Additionally, they enhance the model’s capability to detect objects at different scales, thereby improving the accuracy and robustness of traffic sign detection. We evaluate the performance of our method on two widely used datasets, the GTSDB and CCTSDB, and achieve impressive average accuracies (APs) of 76.13% and 78.24%, respectively. Compared with other well-known algorithms, our method shows a significant improvement in detection accuracy, demonstrating its superiority and generality. Our proposed method shows great potential for improving the performance of traffic sign detection for autonomous driving systems and will help in the development of safe and efficient autonomous driving technologies.
12

Achit, Mohamed, Nacera Yassa, Ahcene Bouzida, Ali Mazari, and Okba Fergani. "Enhancing PV panel diagnosis: RT-DETR vs YOLO v8 for real-time hotspots detection". STUDIES IN ENGINEERING AND EXACT SCIENCES 5, no. 2 (September 27, 2024): e8390. http://dx.doi.org/10.54021/seesv5n2-268.

Abstract:
Accurate and efficient detection of hotspots in photovoltaic (PV) panels is paramount for guiding maintenance decisions, enhancing power generation, and ensuring the safety and stability of power stations. In this context, this study applies the Real-Time Detection Transformer (RT-DETR) for the real-time identification of hotspots on PV panels to improve their performance. The use of RT-DETR eliminates the need for traditional post-processing methods like Non-Maximum Suppression (NMS), leading to faster and more accurate detection. A comparative analysis reveals that RT-DETR stands out with zero postprocessing time, whereas YOLO v8 requires over 1.0 ms for similar image analysis tasks. Notably, RT-DETR achieves a higher mean Average Precision (mAP) of 0.802, compared to YOLO v8-l's 0.673, despite the latter having more parameters. This points to RT-DETR's superior efficiency in both speed and accuracy. Additionally, RT-DETR's F1 score of 88% further demonstrates its precision, surpassing that of YOLO v8-l, while maintaining a high frame rate. These results confirm RT-DETR as an effective tool for real-time hotspot detection, suitable for operation on low-power devices such as drones and infrared-equipped smartphones, offering a practical solution for areas with limited resources. The study showcases the potential of RT-DETR in enabling rapid and accurate hotspot detection in PV panels, a significant step forward in photovoltaic technology management.
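For readers unfamiliar with the post-processing step that this abstract says RT-DETR avoids, the following is a minimal sketch of greedy Non-Maximum Suppression. The boxes, scores, threshold, and function names are made-up assumptions for illustration; production detectors use optimized library routines instead.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, dropping any box that overlaps a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]  # hypothetical detections
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.5)
print(kept)  # indices of the surviving boxes
```

The second box overlaps the first heavily and is suppressed, while the distant third box survives.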
13

Suherman, Endang, Ben Rahman, Djarot Hindarto, and Handri Santoso. "Implementation of ResNet-50 on End-to-End Object Detection (DETR) on Objects". SinkrOn 8, no. 2 (April 25, 2023): 1085–96. http://dx.doi.org/10.33395/sinkron.v8i2.12378.

Abstract:
Object recognition in images is one of the problems that continues to be faced in the world of computer vision. Various approaches have been developed to address this problem, and end-to-end object detection is one relatively new approach. End-to-end object detection involves using the CNN and Transformer architectures to learn object information directly from the image and can produce very good results in object detection. In this research, we implemented ResNet-50 in an End-to-End Object Detection system to improve object detection performance in images. ResNet-50 is a CNN architecture that is well-known for its effectiveness in image recognition tasks, while DETR utilizes Transformers to study object representations directly from images. We tested our system performance on the COCO dataset and demonstrated that ResNet-50 + DETR achieves a better level of accuracy than DETR models that do not use ResNet-50. In addition, we also show that ResNet-50 + DETR can detect objects more quickly than similar traditional CNN models. The results of our research show that the use of ResNet-50 in the DETR system can improve object detection performance in images by about 90%. We also show that using ResNet-50 in DETR systems can improve object detection speed, which is a huge advantage in real-time applications. We hope that the results of this research can contribute to the development of object detection technology in images in the world of computer vision.
14

Liu, Xin, Xudong Yang, Lianhe Shao, Xihan Wang, Quanli Gao, and Hongbo Shi. "GM-DETR: Research on a Defect Detection Method Based on Improved DETR". Sensors 24, no. 11 (June 3, 2024): 3610. http://dx.doi.org/10.3390/s24113610.

Abstract:
Defect detection is an indispensable part of the industrial intelligence process. The introduction of the DETR model marked the successful application of a transformer for defect detection, achieving true end-to-end detection. However, due to the complexity of defective backgrounds, low resolutions can lead to a lack of image detail control and slow convergence of the DETR model. To address these issues, we proposed a defect detection method based on an improved DETR model, called the GM-DETR. We optimized the DETR model by integrating GAM global attention with CNN feature extraction and matching features. This optimization process reduces the defect information diffusion and enhances the global feature interaction, improving the neural network’s performance and ability to recognize target defects in complex backgrounds. Next, to filter out unnecessary model parameters, we proposed a layer pruning strategy to optimize the decoding layer, thereby reducing the model’s parameter count. In addition, to address the issue of poor sensitivity of the original loss function to small differences in defect targets, we replaced the L1 loss in the original loss function with MSE loss to accelerate the network’s convergence speed and improve the model’s recognition accuracy. We conducted experiments on a dataset of road pothole defects to further validate the effectiveness of the GM-DETR model. The results demonstrate that the improved model exhibits better performance, with an increase in average precision of 4.9% (mAP@0.5), while reducing the parameter count by 12.9%.
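The abstract mentions swapping the L1 term for an MSE term in the loss. As a neutral illustration of how the two quantities are computed, here is a minimal sketch on made-up numbers; it does not reproduce the GM-DETR loss, and the function names are assumptions.

```python
def l1_loss(pred, target):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mse_loss(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


predicted = [1.0, 2.0]  # hypothetical box coordinates
expected = [0.0, 4.0]
print(l1_loss(predicted, expected), mse_loss(predicted, expected))
```

Squaring makes the penalty grow faster for large errors and shrink for errors below 1, so which loss suits a task depends on the scale of the residuals.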
15

Dong, Na, Yongqiang Zhang, Mingli Ding, and Gim Hee Lee. "Incremental-DETR: Incremental Few-Shot Object Detection via Self-Supervised Learning". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 1 (June 26, 2023): 543–51. http://dx.doi.org/10.1609/aaai.v37i1.25129.

Abstract:
Incremental few-shot object detection aims at detecting novel classes without forgetting knowledge of the base classes with only a few labeled training data from the novel classes. Most related prior works are on incremental object detection that rely on the availability of abundant training samples per novel class that substantially limits the scalability to real-world setting where novel data can be scarce. In this paper, we propose the Incremental-DETR that does incremental few-shot object detection via fine-tuning and self-supervised learning on the DETR object detector. To alleviate severe over-fitting with few novel class data, we first fine-tune the class-specific components of DETR with self-supervision from additional object proposals generated using Selective Search as pseudo labels. We further introduce an incremental few-shot fine-tuning strategy with knowledge distillation on the class-specific components of DETR to encourage the network in detecting novel classes without forgetting the base classes. Extensive experiments conducted on standard incremental object detection and incremental few-shot object detection settings show that our approach significantly outperforms state-of-the-art methods by a large margin. Our source code is available at https://github.com/dongnana777/Incremental-DETR.
16

He, Xiaohai, Kaiwen Liang, Weimin Zhang, Fangxing Li, Zhou Jiang, Zhengqing Zuo, and Xinyan Tan. "DETR-ORD: An Improved DETR Detector for Oriented Remote Sensing Object Detection with Feature Reconstruction and Dynamic Query". Remote Sensing 16, no. 18 (September 22, 2024): 3516. http://dx.doi.org/10.3390/rs16183516.

Abstract:
Optical remote sensing images often feature high resolution, dense target distribution, and uneven target sizes. While transformer-based detectors like DETR reduce manually designed components, DETR does not support arbitrary-oriented object detection and suffers from high computational costs and slow convergence when handling large sequences of images. Additionally, bipartite graph matching and the limit on the number of queries result in transformer-based detectors performing poorly in scenarios with multiple objects and small object sizes. We propose an improved DETR detector for Oriented remote sensing object detection with Feature Reconstruction and Dynamic Query, termed DETR-ORD. It introduces rotation into the transformer architecture for oriented object detection, reduces computational cost with a hybrid encoder, and includes an IFR (image feature reconstruction) module to address the loss of positional information due to the flattening operation. It also uses ATSS to select auxiliary dynamic training queries for the decoder. This improved DETR-based detector enhances detection performance in challenging oriented optical remote sensing scenarios with similar backbone network parameters. Our approach achieves superior results on most optical remote sensing datasets, such as DOTA-v1.5 (72.07% mAP) and DIOR-R (66.60% mAP), surpassing the baseline detector.
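The bipartite matching mentioned in this abstract pairs each ground-truth object with exactly one query. A minimal brute-force sketch of that one-to-one assignment is shown here under stated assumptions: the cost matrix is made up, the function name is hypothetical, and practical implementations use the Hungarian algorithm rather than enumerating permutations.

```python
from itertools import permutations


def match_queries_to_targets(cost):
    """Find the one-to-one assignment with minimal total cost.

    cost[q][t] is the cost of assigning query q to target t; there must be
    at least as many queries as targets. Returns (assignment, total), where
    assignment[t] is the index of the query matched to target t.
    """
    n_queries, n_targets = len(cost), len(cost[0])
    best_assignment, best_total = None, float("inf")
    # Brute force over every way to pick distinct queries for the targets.
    for candidate in permutations(range(n_queries), n_targets):
        total = sum(cost[q][t] for t, q in enumerate(candidate))
        if total < best_total:
            best_assignment, best_total = candidate, total
    return best_assignment, best_total


# Three queries, two targets; values are illustrative only.
cost_matrix = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]
assignment, total_cost = match_queries_to_targets(cost_matrix)
unmatched = [q for q in range(len(cost_matrix)) if q not in assignment]
print(assignment, total_cost, unmatched)
```

Because every target takes exactly one query, the number of queries caps how many objects can be matched, which is the limitation the abstract points to for crowded scenes.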
17

Huda, Dwi Nurul, Mochammad Rizki Romdoni, Liza Safitri, Ade Winarni, and Abdur Rahman. "Real-time Detection Transformer (RT-DETR) of Ornamental Fish Diseases with YOLOv9 using CNN (Convolutional Neural Network) Algorithm". Journal of Applied Informatics and Computing 8, no. 2 (November 13, 2024): 463–71. https://doi.org/10.30871/jaic.v8i2.8561.

Abstract:
The lack of specialized tools to check the condition of ornamental fish has hindered effective management. This research proposes a novel software architecture that uses the YOLOv9 model combined with RT-DETR to enable accurate and timely identification of ornamental fish conditions including fish diseases, empowering farmers and hobbyists with a valuable resource. This integration is done using Soft Voting Ensemble Learning technique. To achieve this goal, an Android mobile application successfully classified healthy fish and accurately identified common diseases such as bacteria, fungal, parasitic, and whitetail. Based on the test results, the integration accuracy of the YOLOv9 and RT-DETR models produced a high result of 0.8947 while the stand-alone YOLOv9 showed 0.8889 and the stand-alone RT-DETR of 0.8904. Recommendations are given for the combination of YOLOv9 and RT-DETR in condition detection and diagnosis of ornamental fish diseases.
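Soft voting, the ensemble technique named in this abstract, averages the class probabilities of several models and picks the class with the highest average. A minimal sketch follows; the probabilities, class order, and function name are illustrative assumptions and do not come from the paper.

```python
def soft_vote(prob_lists, weights=None):
    """Average per-class probabilities across models and return (label, averages)."""
    if weights is None:
        weights = [1.0] * len(prob_lists)
    total_weight = sum(weights)
    n_classes = len(prob_lists[0])
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total_weight
        for c in range(n_classes)
    ]
    return averaged.index(max(averaged)), averaged


# Hypothetical class probabilities for one image from two detectors.
model_a = [0.6, 0.4, 0.0]
model_b = [0.2, 0.7, 0.1]
label, averaged = soft_vote([model_a, model_b])
print(label, averaged)
```

In this example the first model alone would pick class 0, but the averaged probabilities favor class 1.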
18

Zhen, Peining, Ziyang Gao, Tianshu Hou, Yuan Cheng, and Hai-Bao Chen. "Deeply Tensor Compressed Transformers for End-to-End Object Detection". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 4 (June 28, 2022): 4716–24. http://dx.doi.org/10.1609/aaai.v36i4.20397.

Abstract:
DEtection TRansformer (DETR) is a recently proposed method that streamlines the detection pipeline and achieves competitive results against two-stage detectors such as Faster-RCNN. The DETR models get rid of complex anchor generation and post-processing procedures thereby making the detection pipeline more intuitive. However, the numerous redundant parameters in transformers make the DETR models computation and storage intensive, which seriously hinder them to be deployed on the resources-constrained devices. In this paper, to obtain a compact end-to-end detection framework, we propose to deeply compress the transformers with low-rank tensor decomposition. The basic idea of the tensor-based compression is to represent the large-scale weight matrix in one network layer with a chain of low-order matrices. Furthermore, we propose a gated multi-head attention (GMHA) module to mitigate the accuracy drop of the tensor-compressed DETR models. In GMHA, each attention head has an independent gate to determine the passed attention value. The redundant attention information can be suppressed by adopting the normalized gates. Lastly, to obtain fully compressed DETR models, a low-bitwidth quantization technique is introduced for further reducing the model storage size. Based on the proposed methods, we can achieve significant parameter and model size reduction while maintaining high detection performance. We conduct extensive experiments on the COCO dataset to validate the effectiveness of our tensor-compressed (tensorized) DETR models. The experimental results show that we can attain 3.7 times full model compression with 482 times feed forward network (FFN) parameter reduction and only 0.6 points accuracy drop.
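To show why factorizing a weight matrix saves parameters, here is a minimal sketch that counts parameters for a dense layer and for a plain rank-r matrix factorization. This is a simplification: the paper uses a tensor decomposition into a chain of low-order factors, and the layer shape and rank below are illustrative assumptions.

```python
def dense_params(rows, cols):
    """Parameter count of a full rows x cols weight matrix."""
    return rows * cols


def low_rank_params(rows, cols, rank):
    """Parameter count when W is replaced by A (rows x rank) times B (rank x cols)."""
    return rank * (rows + cols)


rows, cols, rank = 2048, 256, 8  # hypothetical feed-forward layer shape and rank
full = dense_params(rows, cols)
compressed = low_rank_params(rows, cols, rank)
print(full, compressed, full / compressed)
```

The saving grows as the rank shrinks, at the price of a coarser approximation of the original weights.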
19

Sun, Hao, Mingyao Zhou, Wenjing Chen, and Wei Xie. "TR-DETR: Task-Reciprocal Transformer for Joint Moment Retrieval and Highlight Detection." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 5 (March 24, 2024): 4998–5007. http://dx.doi.org/10.1609/aaai.v38i5.28304.

Abstract:
Video moment retrieval (MR) and highlight detection (HD) based on natural language queries are two highly related tasks, which aim to obtain relevant moments within videos and highlight scores of each video clip. Recently, several methods have been devoted to building DETR-based networks to solve both MR and HD jointly. These methods simply add two separate task heads after multi-modal feature extraction and feature interaction, achieving good performance. Nevertheless, these approaches underutilize the reciprocal relationship between two tasks. In this paper, we propose a task-reciprocal transformer based on DETR (TR-DETR) that focuses on exploring the inherent reciprocity between MR and HD. Specifically, a local-global multi-modal alignment module is first built to align features from diverse modalities into a shared latent space. Subsequently, a visual feature refinement is designed to eliminate query-irrelevant information from visual features for modal interaction. Finally, a task cooperation module is constructed to refine the retrieval pipeline and the highlight score prediction process by utilizing the reciprocity between MR and HD. Comprehensive experiments on QVHighlights, Charades-STA and TVSum datasets demonstrate that TR-DETR outperforms existing state-of-the-art methods. Codes are available at https://github.com/mingyao1120/TR-DETR.
20

Abdusalomov, Akmalbek, Sabina Umirzakova, Furkat Safarov, Sanjar Mirzakhalilov, Nodir Egamberdiev, and Young-Im Cho. "A Multi-Scale Approach to Early Fire Detection in Smart Homes." Electronics 13, no. 22 (November 6, 2024): 4354. http://dx.doi.org/10.3390/electronics13224354.

Abstract:
In recent years, advancements in smart home technologies have underscored the need for the development of early fire and smoke detection systems to enhance safety and security. Traditional fire detection methods relying on thermal or smoke sensors exhibit limitations in terms of response time and environmental adaptability. To address these issues, this paper introduces the multi-scale information transformer–DETR (MITI-DETR) model, which incorporates multi-scale feature extraction and transformer-based attention mechanisms, tailored specifically for fire detection in smart homes. MITI-DETR achieves a precision of 99.00%, a recall of 99.50%, and a mean average precision (mAP) of 99.00% on a custom dataset designed to reflect diverse lighting and spatial conditions in smart homes. Extensive experiments demonstrate that MITI-DETR outperforms state-of-the-art models in terms of these metrics, especially under challenging environmental conditions. This work provides a robust solution for early fire detection in smart homes, combining high accuracy with real-time deployment feasibility.
21

Xing, Zijian, Jia Ren, Xiaozhong Fan, and Yu Zhang. "S-DETR: A Transformer Model for Real-Time Detection of Marine Ships." Journal of Marine Science and Engineering 11, no. 4 (March 24, 2023): 696. http://dx.doi.org/10.3390/jmse11040696.

Abstract:
Due to the ever-changing shape and scale of ships, as well as the complex sea background, accurately detecting multi-scale ships on the sea while considering real-time requirements remains a challenge. To address this problem, we propose a model called S-DETR based on the DETR framework for end-to-end detection of ships on the sea. A scale attention module is designed to effectively learn the weights of different scale information by utilizing the global information brought by global average pooling. We analyzed the potential reasons for the performance degradation of the end-to-end detector and proposed a decoder based on Dense Query. Although the computational complexity and convergence of the entire S-DETR model have not been rigorously proven mathematically, Dense Query can reduce the computational complexity of multi-head self-attention from $O(N_q^2)$ to $O(N_q)$. To evaluate the performance of S-DETR, we conducted experiments on the Singapore Maritime Dataset and Marine Image Dataset. The experimental results show that the proposed method can effectively solve the problem of multi-scale ship detection in complex marine environments and achieve state-of-the-art performance. The model inference speed of S-DETR is comparable to that of single-stage target detection models and meets the real-time requirements of shoreside ship detection.
22

Wang, Zhuowei, Zhukang Ruan, and Chong Chen. "DyFish-DETR: Underwater Fish Image Recognition Based on Detection Transformer." Journal of Marine Science and Engineering 12, no. 6 (May 22, 2024): 864. http://dx.doi.org/10.3390/jmse12060864.

Abstract:
Due to the complexity of underwater environments and the lack of training samples, the application of target detection algorithms to the underwater environment has yet to provide satisfactory results. It is crucial to design specialized underwater target recognition algorithms for different underwater tasks. In order to achieve this goal, we created a dataset of freshwater fish captured from multiple angles and lighting conditions, aiming to improve underwater target detection of freshwater fish in natural environments. We propose a method suitable for underwater target detection, called DyFish-DETR (Dynamic Fish Detection with Transformers). In DyFish-DETR, we propose a DyFishNet (Dynamic Fish Net) to better extract fish body texture features. A Slim Hybrid Encoder is designed to fuse fish body feature information. The results of ablation experiments show that DyFishNet can effectively improve the mean Average Precision (mAP) of model detection. The Slim Hybrid Encoder can effectively improve Frame Per Second (FPS). Both DyFishNet and the Slim Hybrid Encoder can reduce model parameters and Floating Point Operations (FLOPs). In our proposed freshwater fish dataset, DyFish-DETR achieved a mAP of 96.6%. The benchmarking experimental results show that the Average Precision (AP) and Average Recall (AR) of DyFish-DETR are higher than several state-of-the-art methods. Additionally, DyFish-DETR, respectively, achieved 99%, 98.8%, and 83.2% mAP in other underwater datasets.
23

Liu, Ziyuan, Chunxia Sun, and Xiaopeng Wang. "DST-DETR: Image Dehazing RT-DETR for Safety Helmet Detection in Foggy Weather." Sensors 24, no. 14 (July 17, 2024): 4628. http://dx.doi.org/10.3390/s24144628.

Abstract:
In foggy weather, outdoor safety helmet detection often suffers from low visibility and unclear objects, hindering optimal detector performance. Moreover, safety helmets typically appear as small objects at construction sites, prone to occlusion and difficult to distinguish from complex backgrounds, further exacerbating the detection challenge. Therefore, the real-time and precise detection of safety helmet usage among construction personnel, particularly in adverse weather conditions such as foggy weather, poses a significant challenge. To address this issue, this paper proposes the DST-DETR, a framework for foggy weather safety helmet detection. The DST-DETR framework comprises a dehazing module, PAOD-Net, and an object detection module, ST-DETR, for joint dehazing and detection. Initially, foggy images are restored within PAOD-Net, enhancing the AOD-Net model by introducing a novel convolutional module, PfConv, guided by the parameter-free average attention module (PfAAM). This module enables more focused attention on crucial features in lightweight models, therefore enhancing performance. Subsequently, the MS-SSIM + ℓ2 loss function is employed to bolster the model’s robustness, making it adaptable to scenes with intricate backgrounds and variable fog densities. Next, within the object detection module, the ST-DETR model is designed to address small objects. By refining the RT-DETR model, its capability to detect small objects in low-quality images is enhanced. The core of this approach lies in utilizing the variant ResNet-18 as the backbone to make the network lightweight without sacrificing accuracy, followed by effectively integrating the small-object layer into the improved BiFPN neck structure, resulting in CCFF-BiFPN-P2. Various experiments were conducted to qualitatively and quantitatively compare our method with several state-of-the-art approaches, demonstrating its superiority. The results validate that the DST-DETR algorithm is better suited for foggy safety helmet detection tasks in construction scenarios.
24

Liu, Yanhong, Fang Zhou, Wenxin Zheng, Tao Bai, Xinwen Chen, and Leifeng Guo. "Recognition of Foal Nursing Behavior Based on an Improved RT-DETR Model." Animals 15, no. 3 (January 24, 2025): 340. https://doi.org/10.3390/ani15030340.

Abstract:
Foal nursing behavior is a crucial indicator of healthy growth. The mare being in a standing posture and the foal being in a suckling posture are important markers for foal suckling behavior. To enable the recognition of a mare’s standing posture and its foal’s suckling posture in stalls, this paper proposes an RT-DETR-Foalnursing model based on RT-DETR. The model employs SACGNet as the backbone to enhance the efficiency of image feature extraction. Furthermore, by incorporating a multiscale multihead attention module and a channel attention module into the Adaptive Instance Feature Integration (AIFI), the model strengthens feature utilization and integration capabilities, thereby improving recognition accuracy. Experimental results demonstrate that the improved RT-DETR achieves a best mAP@50 of 98.5%, increasing by 1.8% compared to the RT-DETR. Additionally, this study achieves real-time statistical analysis of the duration of the foal in the suckling posture, which is one of the important indicators for determining whether the foal is suckling. This has significant implications for the healthy growth of foals.
25

Su, Hailin, Haijiang Sun, and Yongxian Zhao. "Efficient Pruning of Detection Transformer in Remote Sensing Using Ant Colony Evolutionary Pruning." Applied Sciences 15, no. 1 (December 29, 2024): 200. https://doi.org/10.3390/app15010200.

Abstract:
This study mainly addresses the issues of an excessive model parameter count and computational complexity in Detection Transformer (DETR) for remote sensing object detection and similar neural networks. We propose an innovative neural network pruning method called “ant colony evolutionary pruning (ACEP)” which reduces the number of parameters in the neural network to improve the performance and efficiency of DETR-based neural networks in the remote sensing field. To retain the original network’s performance as much as possible, we combine population evolution and ant colony algorithms for dynamic search processes to automatically find efficient sparse sub-networks. Additionally, we design three different sparse operators based on the structural characteristics of DETR-like neural networks. Furthermore, considering the characteristics of remote sensing objects, we introduce sparsity constraints to each network layer to achieve efficient network pruning. The experimental results demonstrate that ACEP is effective on various DETR-like models. After removing a significant number of redundant parameters, it greatly improves the inference speed of these networks when performing remote sensing object detection tasks.
26

WEN, Feng, Mei WANG, and Xiaojie HU. "DFAM-DETR: Deformable Feature Based Attention Mechanism DETR on Slender Object Detection." IEICE Transactions on Information and Systems E106.D, no. 3 (March 1, 2023): 401–9. http://dx.doi.org/10.1587/transinf.2022edp7111.
27

Huang, Yueming, and Guowu Yuan. "AD-DETR: DETR with asymmetrical relation and decoupled attention in crowded scenes." Mathematical Biosciences and Engineering 20, no. 8 (2023): 14158–79. http://dx.doi.org/10.3934/mbe.2023633.

Abstract:
Pedestrian detection in crowded scenes is widely used in computer vision. However, it still has two difficulties: 1) eliminating repeated predictions (multiple predictions corresponding to the same object); 2) false detection and missing detection due to the high scene occlusion rate and the small visible area of detected pedestrians. This paper presents a detection framework based on DETR (detection transformer) to address the above problems, and the model is called AD-DETR (asymmetrical relation detection transformer). We find that the symmetry in a DETR framework causes synchronous prediction updates and duplicate predictions. Therefore, we propose an asymmetric relationship fusion mechanism and let each query asymmetrically fuse the relative relationships of surrounding predictions to learn to eliminate duplicate predictions. Then, we propose a decoupled cross-attention head that allows the model to learn to restrict the range of attention to focus more on visible regions and regions that contribute more to confidence. The method can reduce the noise information introduced by the occluded objects to reduce the false detection rate. Meanwhile, in our proposed asymmetric relations module, we establish a way to encode the relative relation between sets of attention points and improve the baseline. Without additional annotations, combined with the deformable-DETR with Res50 as the backbone, our method can achieve an average precision of 92.6%, MR$^{-2}$ of 40.0% and Jaccard index of 84.4% on the challenging CrowdHuman dataset. Our method exceeds previous methods, such as Iter-E2EDet (progressive end-to-end object detection), MIP (one proposal, multiple predictions), etc. Experiments show that our method can significantly improve the performance of the query-based model for crowded scenes, and it is highly robust for the crowded scene.
28

Yang, Mingji, Rongyu Xu, Chunyu Yang, Haibin Wu, and Aili Wang. "Hybrid-DETR: A Differentiated Module-Based Model for Object Detection in Remote Sensing Images." Electronics 13, no. 24 (December 20, 2024): 5014. https://doi.org/10.3390/electronics13245014.

Abstract:
Currently, embedded unmanned aerial vehicle (UAV) systems face significant challenges in balancing detection accuracy and computational efficiency when processing remote sensing images with complex backgrounds, small objects, and occlusions. This paper proposes the Hybrid-DETR model based on a real-time end-to-end Detection Transformer (RT-DETR), featuring a novel HybridNet backbone network that implements a differentiated hybrid structure through lightweight RepConv Cross-stage Partial Efficient Layer Aggregation Network (RCSPELAN) modules and the Heat-Transfer Cross-stage Fusion (HTCF) modules, effectively balancing feature extraction efficiency and global perception capabilities. Additionally, we introduce a Small-Object Detection Module (SODM) and an EIFI module to enhance the detection capability of small objects in complex scenarios, while employing the Focaler-Shape-IoU loss function to optimize bounding box regression. Experimental results on the VisDrone2019 dataset demonstrate that Hybrid-DETR achieves mAP50 and mAP50:95 scores of 52.2% and 33.3%, respectively, representing improvements of 5.2% and 4.3% compared to RT-DETR-R18, while reducing model parameters by 29.33%. The effectiveness and robustness of our improved method are further validated on multiple challenging datasets, including AI-TOD and HIT-UAV.
29

Wang, Zhijian, Jie Liu, Yixiao Sun, Xiang Zhou, Boyan Sun, Dehong Kong, Jay Xu, Xiaoping Yue, and Wenyu Zhang. "Applying auxiliary supervised depth-assisted transformer and cross modal attention fusion in monocular 3D object detection." PeerJ Computer Science 11 (January 28, 2025): e2656. https://doi.org/10.7717/peerj-cs.2656.

Abstract:
Monocular 3D object detection is the most widely applied and challenging solution for autonomous driving, due to 2D images lacking 3D information. Existing methods are limited by inaccurate depth estimations by inequivalent supervised targets. The use of both depth and visual features also faces problems of heterogeneous fusion. In this article, we propose Depth Detection Transformer (Depth-DETR), applying auxiliary supervised depth-assisted transformer and cross modal attention fusion in monocular 3D object detection. Depth-DETR introduces two additional depth encoders besides the visual encoder. Two depth encoders are supervised by ground truth depth and bounding box respectively, working independently to complement each other’s limitations and predicting more accurate target distances. Furthermore, Depth-DETR employs cross modal attention mechanisms to effectively fuse three different features. A parallel structure of two cross modal transformer is applied to fuse two depth features with visual features. Avoiding early fusion between two depth features enhances the final fused feature for better feature representations. Through multiple experimental validations, the Depth-DETR model has achieved highly competitive results in the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) dataset, with an AP score of 17.49, representing its outstanding performance in 3D object detection.
30

Li, Chenglong, Jianwei Zhang, Bihan Huo, and Yingjian Xue. "DHQ-DETR: Distributed and High-Quality Object Query for Enhanced Dense Detection in Remote Sensing." Remote Sensing 17, no. 3 (February 1, 2025): 514. https://doi.org/10.3390/rs17030514.

Abstract:
With the widespread application of remote sensing technologies and UAV imagery in various fields, dense object detection has become a significant and challenging task in computer vision research. Existing end-to-end detection models, particularly those based on DETR, often face criticism due to their high computational demands, slow convergence rates, and inadequacy in managing dense, multi-scale objects. These challenges are especially acute in remote sensing applications, where accurate analysis of large-scale aerial and satellite imagery relies heavily on effective dense object detection. In this paper, we propose the DHQ-DETR framework, which addresses these issues by modeling bounding box offsets as distributions. DHQ-DETR incorporates the Distribution Focus Loss (DFL) to enhance residual learning, and introduces a High-Quality Query Selection (HQQS) module to effectively balance classification and regression tasks. Additionally, we propose an auxiliary detection head and a sample assignment strategy that complements the Hungarian algorithm to accelerate convergence. Our experimental results demonstrate the superior performance of DHQ-DETR, achieving an average precision (AP) of 53.7% on the COCO val2017 dataset, 54.3% on the DOTAv1.0, and 32.4% on Visdrone, underscoring its effectiveness for real-world dense object detection tasks.
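The abstract above mentions modeling bounding box offsets as distributions. One common way to read a single offset out of such a distribution is to take a softmax over a fixed set of bins and return the expected value. The Python below is a minimal sketch of that decoding step only; the function name, bin layout, and logits are hypothetical, and the abstract does not say whether DHQ-DETR decodes offsets exactly this way.

```python
import math
from typing import Sequence


def expected_offset(logits: Sequence[float], bin_values: Sequence[float]) -> float:
    """Decode one box-side offset as the mean of a discrete distribution.

    `logits` are unnormalized scores, one per bin; `bin_values` are the
    offsets those bins stand for. Both are illustrative assumptions.
    """
    if not logits or len(logits) != len(bin_values):
        raise ValueError("logits and bin_values must be non-empty and equal in length")
    peak = max(logits)  # subtract the max for a numerically stable softmax
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return sum(w / total * v for w, v in zip(weights, bin_values))


# Hypothetical logits for four bins representing offsets 0, 1, 2, 3.
offset = expected_offset([0.1, 2.0, 2.0, 0.1], [0.0, 1.0, 2.0, 3.0])
print(f"decoded offset: {offset:.2f}")
```

A sharply peaked distribution yields an offset close to one bin, while a flat one yields a value between bins, which gives a downstream loss a signal about localization uncertainty.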
31

Liu, Yingfan, Miao He, and Bin Hui. "ESO-DETR: An Improved Real-Time Detection Transformer Model for Enhanced Small Object Detection in UAV Imagery." Drones 9, no. 2 (February 14, 2025): 143. https://doi.org/10.3390/drones9020143.

Abstract:
Object detection is a fundamental capability that enables drones to perform various tasks. However, achieving a suitable equilibrium between performance, efficiency, and lightweight design continues to be a significant challenge for current algorithms. To address this issue, we propose an enhanced small object detection transformer model called ESO-DETR. First, we present a gated single-head attention backbone block, known as the GSHA block, which enhances the extraction of local details. Besides, ESO-DETR utilizes the multiscale multihead self-attention mechanism (MMSA) to efficiently manage complex features within its backbone network. We also introduce a novel and efficient feature fusion pyramid network for enhanced small object detection, termed ESO-FPN. This network integrates large convolutional kernels with dual-domain attention mechanisms. Lastly, we introduce the EMASlideVariFocal loss (ESVF Loss), which dynamically adjusts the weights to improve the model’s focus on more challenging samples. In comparison with the baseline model, ESO-DETR demonstrates enhancements of 3.9% and 4.0% in the mAP50 metric on the VisDrone and HIT-UAV datasets, respectively, while also reducing parameters by 25%. These results highlight the capability of ESO-DETR to improve detection accuracy while maintaining a lightweight and efficient structure.
32

Wan, Yan, Hui Wang, Lingxin Lu, Xin Lan, Feifei Xu, and Shenglin Li. "An Improved Real-Time Detection Transformer Model for the Intelligent Survey of Traffic Safety Facilities." Sustainability 16, no. 23 (November 21, 2024): 10172. http://dx.doi.org/10.3390/su162310172.

Abstract:
The undertaking of traffic safety facility (TSF) surveys represents a significant labor-intensive endeavor, which is not sustainable in the long term. The subject of traffic safety facility recognition (TSFR) is beset with numerous challenges, including those associated with background misclassification, the diminutive dimensions of the targets, the spatial overlap of detection targets, and the failure to identify specific targets. In this study, transformer-based and YOLO (You Only Look Once) series target detection algorithms were employed to construct TSFR models to ensure both recognition accuracy and efficiency. The TSF image dataset, comprising six categories of TSFs in urban areas of three cities, was utilized for this research. The dimensions and intricacies of the Detection Transformer (DETR) family of models are considerably more substantial than those of the YOLO family. YOLO-World and Real-Time Detection Transformer (RT-DETR) models were optimal and comparable for the TSFR task, with the former exhibiting a higher detection efficiency and the latter a higher detection accuracy. The RT-DETR model exhibited a notable reduction in model complexity by 57% in comparison to the DINO (DETR with improved denoising anchor boxes for end-to-end object detection) model while also demonstrating a slight enhancement in recognition accuracy. The incorporation of the RepGFPN (Reparameterized Generalized Feature Pyramid Network) module has markedly enhanced the multi-target detection accuracy of RT-DETR, with a mean average precision (mAP) of 82.3%. The introduction of RepGFPN significantly enhanced the detection rate of traffic rods, traffic sign boards, and water surround barriers and somewhat ameliorated the problem of duplicate detection.
33

Wang, Sen, Huiping Jiang, Jixiang Yang, Xuan Ma, and Jiamin Chen. "AMFEF-DETR: An End-to-End Adaptive Multi-Scale Feature Extraction and Fusion Object Detection Network Based on UAV Aerial Images." Drones 8, no. 10 (September 26, 2024): 523. http://dx.doi.org/10.3390/drones8100523.

Abstract:
To address the challenge of low detection accuracy and slow detection speed in unmanned aerial vehicle (UAV) aerial images target detection tasks, caused by factors such as complex ground environments, varying UAV flight altitudes and angles, and changes in lighting conditions, this study proposes an end-to-end adaptive multi-scale feature extraction and fusion detection network, named AMFEF-DETR. Specifically, to extract target features from complex backgrounds more accurately, we propose an adaptive backbone network, FADC-ResNet, which dynamically adjusts dilation rates and performs adaptive frequency awareness. This enables the convolutional kernels to effectively adapt to varying scales of ground targets, capturing more details while expanding the receptive field. We also propose a HiLo attention-based intra-scale feature interaction (HLIFI) module to handle high-level features from the backbone. This module uses dual-pathway encoding of high and low frequencies to enhance the focus on the details of dense small targets while reducing noise interference. Additionally, the bidirectional adaptive feature pyramid network (BAFPN) is proposed for cross-scale feature fusion, integrating semantic information and enhancing adaptability. The Inner-Shape-IoU loss function, designed to focus on bounding box shapes and incorporate auxiliary boxes, is introduced to accelerate convergence and improve regression accuracy. When evaluated on the VisDrone dataset, the AMFEF-DETR demonstrated improvements of 4.02% and 16.71% in mAP50 and FPS, respectively, compared to the RT-DETR. Additionally, the AMFEF-DETR model exhibited strong robustness, achieving mAP50 values 2.68% and 3.75% higher than the RT-DETR and YOLOv10, respectively, on the HIT-UAV dataset.
34

Smith, Kelly, Karen Winegard, Audrey L. Hicks, and Neil McCartney. "Two Years of Resistance Training in Older Men and Women: The Effects of Three Years of Detraining on the Retention of Dynamic Strength." Canadian Journal of Applied Physiology 28, no. 3 (June 1, 2003): 462–74. http://dx.doi.org/10.1139/h03-034.

Abstract:
Dynamic muscle strength (1-RM) and symptom-limited treadmill endurance were compared among three groups (5 M and 5 F per group) of older adults (mean age 72.5 yrs) who had either weight-trained continuously twice per week for 5 years (Tr), ceased to weight train after 2 years (Detr), or acted as controls throughout (Con). The Tr and Detr trained hard (progressing up to 3 sets at up to 80% of 1-RM) for 2 years; the Tr continued training for an additional 3 years at a maintenance level (2 to 3 sets at 60-70% 1-RM), whereas the Detr stopped training for those 3 years. The Con subjects did not train for the duration of the study but took part in identical testing procedures. After 2 years of resistance training, dynamic strength in the Tr and Detr groups increased significantly above baseline and Con values for all exercises, p < 0.0001. Following 3 years of maintenance level training, arm curl, leg press, and bench press 1-RM (sum of both limbs) in the Tr remained significantly above baseline values (21.6 kg = 17%; 15.7 kg = 82%; 8.3 kg = 34%, respectively). The 1-RM in Detr were 18.4 kg (14%), 5.3 kg (24%), and 1.4 kg (9%) above base line for leg press, arm curl, and bench press after 5 years, whereas the Con declined over the 5-yr period by 18.4 kg (-9.7%), 4.4 kg (-19%), and 3.5 kg (-6%), respectively. There were nonsignificant improvements in treadmill performance in the Tr and Detr, and a decline in the Con after 2 years. Treadmill performance declined between Years 2 and 5 in all groups despite continued training (ns). We conclude that: (1) dynamic strength gains from 2 years of resistance training in older individuals are not entirely lost even after 3 years of detraining; (2) these effects may be specific to the exercises performed in the training program; (3) adoption of maintenance-level moderate-intensity training significantly attenuates the decline in dynamic strength of previously trained muscles. Key words: muscle, weightlifting, overload
35

Liu, Yanli, Jiahe Jin, and Heng Zhang. "Improvement of deformable DETR model for insulator defect classification detection method." Journal of Physics: Conference Series 2858, no. 1 (October 1, 2024): 012006. http://dx.doi.org/10.1088/1742-6596/2858/1/012006.

Abstract:
In response to challenges faced by traditional detection methods such as image blurring, scarcity of insulator defect datasets, and insulators as small targets, we propose an improved Deformable DETR network based on the DETR defect detection algorithm. This network model accurately identifies the position information of insulators and segments the insulators on the insulator string. To address the classification problem after defect detection, we introduce a fused insulator defect classifier with a self-attention mechanism built behind the Deformable DETR model. The detected insulators are classified into three categories. Due to the limited dataset of damaged insulators, corresponding loss functions are set to address the issue of sample imbalance, thereby increasing the model’s focus on damaged insulators. Experimental results demonstrate an accuracy of 97.5% on the test set, highlighting the network’s strong generalization ability.
36

Zhang, Hao, Zheng Ma, and Xiang Li. "RS-DETR: An Improved Remote Sensing Object Detection Model Based on RT-DETR." Applied Sciences 14, no. 22 (November 10, 2024): 10331. http://dx.doi.org/10.3390/app142210331.

Abstract:
Object detection is a fundamental task in computer vision. Recently, deep-learning-based object detection has made significant progress. However, due to large variations in target scale, the predominance of small targets, and complex backgrounds in remote sensing imagery, remote sensing object detection still faces challenges, including low detection accuracy, poor real-time performance, high missed detection rates, and high false detection rates in practical applications. To enhance remote sensing target detection performance, this study proposes a new model, the remote sensing detection transformer (RS-DETR). First, we incorporate cascaded group attention (CGA) into the attention-driven feature interaction module. By capturing features at different levels, it enhances the interaction between features through cascading and improves computational efficiency. Additionally, we propose an enhanced bidirectional feature pyramid network (EBiFPN) to facilitate multi-scale feature fusion. By integrating features across multiple scales, it improves object detection accuracy and robustness. Finally, we propose a novel bounding box regression loss function, Focaler-GIoU, which makes the model focus more on difficult samples, improving detection performance for small and overlapping targets. Experimental results on the satellite imagery multi-vehicles dataset (SIMD) and the high-resolution remote sensing object detection (TGRS-HRRSD) dataset show that the improved algorithm achieved mean average precision (mAP) of 78.2% and 91.6% at an intersection over union threshold of 0.5, respectively, which is an improvement of 2.0% and 1.5% over the baseline model. This result demonstrates the effectiveness and robustness of our proposed method for remote sensing image object detection.
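The abstract above builds its regression loss on GIoU and reports mAP at an IoU threshold of 0.5. As a minimal sketch of the underlying quantity, the Python below computes standard generalized IoU for two axis-aligned boxes. It does not implement the Focaler reweighting the authors describe, whose exact form the abstract does not give; the `Box` alias and the sample boxes are illustrative assumptions.

```python
from typing import Tuple

# (x1, y1, x2, y2) with x1 < x2 and y1 < y2; the alias is an assumption for this sketch.
Box = Tuple[float, float, float, float]


def giou(a: Box, b: Box) -> float:
    """Generalized IoU: IoU minus the share of the enclosing box not covered by the union."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])

    # Intersection rectangle, clamped to zero when the boxes do not overlap.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest axis-aligned box that encloses both inputs.
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))

    return inter / union - (enclose - union) / enclose


# Disjoint boxes have IoU 0 but a negative GIoU that still reflects their distance.
print(giou((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0)))
```

Unlike plain IoU, the value stays informative for non-overlapping boxes, which is why GIoU-style terms are often used as regression losses (typically as 1 minus GIoU).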
37

Wei, Xiaolong, Ling Yin, Liangliang Zhang, and Fei Wu. "DV-DETR: Improved UAV Aerial Small Target Detection Algorithm Based on RT-DETR." Sensors 24, no. 22 (November 19, 2024): 7376. http://dx.doi.org/10.3390/s24227376.

Abstract:
For drone-based detection tasks, accurately identifying small-scale targets like people, bicycles, and pedestrians remains a key challenge. In this paper, we propose DV-DETR, an improved detection model based on the Real-Time Detection Transformer (RT-DETR), specifically optimized for small target detection in high-density scenes. To achieve this, we introduce three main enhancements: (1) ResNet18 as the backbone network to improve feature extraction and reduce model complexity; (2) the integration of recalibration attention units and deformable attention mechanisms in the neck network to enhance multi-scale feature fusion and improve localization accuracy; and (3) the use of the Focaler-IoU loss function to better handle the imbalanced distribution of target scales and focus on challenging samples. Experimental results on the VisDrone2019 dataset show that DV-DETR achieves an mAP@0.5 of 50.1%, a 1.7% improvement over the baseline model, while increasing detection speed from 75 FPS to 90 FPS, meeting real-time processing requirements. These improvements not only enhance the model’s accuracy and efficiency but also provide practical significance in complex, high-density urban environments, supporting real-world applications in UAV-based surveillance and monitoring tasks.
38

Yuan, Xin, Shutong Fang, Ning Li, Qiansheng Ma, Ziheng Wang, Mingfeng Gao, Pingpeng Tang, Changli Yu, Yihan Wang, and José-Fernán Martínez Ortega. "Performance Comparison of Sea Cucumber Detection by the Yolov5 and DETR Approach". Journal of Marine Science and Engineering 11, no. 11 (25 October 2023): 2043. http://dx.doi.org/10.3390/jmse11112043.

Abstract:
Sea cucumber detection is an important step in underwater environmental perception, an indispensable part of intelligent subsea fishing systems. However, water turbidity reduces the clarity of underwater images, which challenges vision-based underwater target detection, so accurate, real-time, and lightweight detection models are required. This work first summarizes the development of subsea target detection. Deep-learning object detection methods including YOLOv5 and DETR, which are, respectively, examples of one-stage and anchor-free object detection approaches, have been increasingly applied in underwater detection scenarios. Building on state-of-the-art underwater sea cucumber detection methods and aiming to provide a reference for practical subsea identification, the detection of adjacent and overlapping sea cucumbers with YOLOv5 and DETR is investigated and compared in detail. For each approach, the detection experiment is carried out on the derived dataset, which consists of a wide variety of sea cucumber sample images. Experiments demonstrate that YOLOv5 surpasses DETR in both computing cost and precision, particularly in the detection of small and dense features. Nevertheless, DETR is developing rapidly and holds promising prospects in underwater object detection applications, owing to its relatively simple architecture and ingenious attention mechanism.
39

Yang, Hua, Xingquan Deng, Hao Shen, Qingfeng Lei, Shuxiang Zhang, and Neng Liu. "Disease Detection and Identification of Rice Leaf Based on Improved Detection Transformer". Agriculture 13, no. 7 (7 July 2023): 1361. http://dx.doi.org/10.3390/agriculture13071361.

Abstract:
In recent years, the domain of diagnosing plant afflictions has predominantly relied upon the utilization of deep learning techniques for classifying images of diseased specimens; however, these classification algorithms remain insufficient for instances where a single plant exhibits multiple ailments. Consequently, we view the region afflicted by the malady of rice leaves as a minuscule issue of target detection, and then avail ourselves of a computational approach to vision to identify the affected area. In this paper, we advance a proposal for a Dense Higher-Level Composition Feature Pyramid Network (DHLC-FPN) that is integrated into the Detection Transformer (DETR) algorithm, thereby proffering a novel Dense Higher-Level Composition Detection Transformer (DHLC-DETR) methodology which can effectively detect three diseases: sheath blight, rice blast, and flax spot. Initially, the proposed DHLC-FPN is utilized to supersede the backbone network of DETR through amalgamation with Res2Net, thus forming a feature extraction network. Res2Net then extracts five feature scales, which are coalesced through the deployment of high-density rank hybrid sampling by the DHLC-FPN architecture. The fused features, in concert with the location encoding, are then fed into the transformer to produce predictions of classes and prediction boxes. Lastly, the prediction classes and the prediction boxes are subjected to bipartite matching through the application of the Hungarian algorithm. On the IDADP datasets, the DHLC-DETR model, through the utilization of data enhancement, elevated mean Average Precision (mAP) by 17.3% in comparison to the DETR model. Additionally, mAP for small target detection was improved by 9.5%, and the magnitude of hyperparameters was reduced by 324.9 M. The empirical outcomes demonstrate that the optimized structure for feature extraction can significantly enhance the average detection accuracy and small target detection accuracy of the model, achieving an average accuracy of 97.44% on the IDADP rice disease dataset.
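The matching step mentioned in the abstract pairs each ground-truth object with exactly one prediction so that the total matching cost is minimal. The sketch below illustrates that objective with a brute-force search over assignments rather than the Hungarian algorithm itself, which reaches the same optimum far more efficiently. The cost values and the function name are hypothetical, and the sketch assumes at least as many predictions as ground-truth objects.

```python
from itertools import permutations


def match_predictions(cost):
    """Find the minimum-cost one-to-one assignment by exhaustive search.

    cost[i][j] is the cost of assigning prediction i to ground-truth object j.
    Returns (pairs, total) where pairs is a list of (prediction, target) tuples.
    Brute force is only practical for tiny inputs; it stands in for the
    Hungarian algorithm purely for illustration.
    """
    n_pred = len(cost)
    n_gt = len(cost[0]) if cost else 0
    best_total, best_perm = float("inf"), ()
    # Each candidate picks a distinct prediction perm[j] for every target j.
    for perm in permutations(range(n_pred), n_gt):
        total = sum(cost[p][j] for j, p in enumerate(perm))
        if total < best_total:
            best_total, best_perm = total, perm
    return [(p, j) for j, p in enumerate(best_perm)], best_total


if __name__ == "__main__":
    # Three predictions, two ground-truth objects (made-up costs).
    example_cost = [
        [0.9, 0.2],
        [0.1, 0.8],
        [0.5, 0.5],
    ]
    pairs, total = match_predictions(example_cost)
    print(pairs, total)
```

Predictions left unmatched (prediction 2 in the example) are the ones a DETR-style detector would train toward the "no object" class.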
40

Sun, Baoshan, and Xin Cheng. "Smoke Detection Transformer: An Improved Real-Time Detection Transformer Smoke Detection Model for Early Fire Warning". Fire 7, no. 12 (23 December 2024): 488. https://doi.org/10.3390/fire7120488.

Abstract:
As one of the important features in the early stage of fires, the detection of smoke can provide a faster early warning of a fire, thus suppressing the spread of the fire in time. However, the features of smoke are not apparent; the shape of smoke is not fixed, and it is easily confused with the background outdoors, which makes smoke difficult to detect. Therefore, this study proposes a model called Smoke Detection Transformer (Smoke-DETR) for smoke detection, which is based on a Real-Time Detection Transformer (RT-DETR). Considering the limited computational resources of smoke detection devices, Enhanced Channel-wise Partial Convolution (ECPConv) is introduced to reduce the number of parameters and the amount of computation. This approach improves Partial Convolution (PConv) by using a selection strategy that selects channels containing more information for each convolution, thereby increasing the network’s ability to learn smoke features. To cope with smoke images with inconspicuous features and irregular shapes, the Efficient Multi-Scale Attention (EMA) module is used to strengthen the feature extraction capability of the backbone network. Additionally, in order to overcome the problem of smoke being easily confused with the background, the Multi-Scale Foreground-Focus Fusion Pyramid Network (MFFPN) is designed to strengthen the model’s attention to the foreground of images, which improves the accuracy of detection in situations where smoke is not well differentiated from the background. Experimental results demonstrate that Smoke-DETR achieves significant improvements in smoke detection. On the self-built dataset, compared to RT-DETR, Smoke-DETR reaches a Precision of 86.2%, an increase of 3.6 percentage points. Similarly, Recall reaches 80%, an improvement of 3.6 percentage points. mAP50 reaches 86.2%, a 3.8 percentage point increase. Furthermore, mAP50:95 reaches 53.9%, a 3.6 percentage point increase.
41

Yang, Zhi, Ke Wang, Sihang Zhang, Bin Liu, Mengxuan Li, Yi Liu, and Binbin Zhao. "Identification of potential external breakage of construction work areas in transmission corridors based on DETR". Journal of Physics: Conference Series 2820, no. 1 (1 August 2024): 012093. http://dx.doi.org/10.1088/1742-6596/2820/1/012093.

Abstract:
Construction work involving large-scale machinery may lead to transmission line failures and endanger the safe operation of power grids. A DETR-based method for identifying the potential damage caused by construction work in transmission corridors is proposed, which combines the advantages of convolutional neural networks (CNNs) and transformers. First, the ResNet-50 network is deployed to extract the image features of the construction work areas. Then, the DETR network with its encoder-decoder structure is used for multi-scale training and prediction to obtain the identification results of potential external breakage in the construction areas of transmission corridors. Experiments show that the average precision (AP) of monitoring and identifying the construction work areas using the DETR network reaches 0.856, so the method can accurately identify potential hazards in construction work areas and provide data for the safe operation of transmission corridors.
42

Xie, Tianming, Zhonghao Zhang, Jing Tian, and Lihong Ma. "Focal DETR: Target-Aware Token Design for Transformer-Based Object Detection". Sensors 22, no. 22 (10 November 2022): 8686. http://dx.doi.org/10.3390/s22228686.

Abstract:
In this paper, we propose a novel target-aware token design for transformer-based object detection. To tackle the target attribute diffusion challenge of transformer-based object detection, we propose two key components in the new target-aware token design mechanism. Firstly, we propose a target-aware sampling module, which forces the sampling patterns to converge inside the target region and obtain its representative encoded features. More specifically, a set of four sampling patterns is designed, including small and large patterns, which focus on the detailed and overall characteristics of a target, respectively, as well as vertical and horizontal patterns, which handle the object’s directional structures. Secondly, we propose a target-aware key-value matrix. This is a unified, learnable, feature-embedding matrix that is directly weighted on the feature map to reduce the interference of non-target regions. With this new design, we propose a new variant of the transformer-based object-detection model, called Focal DETR, which achieves superior performance over the state-of-the-art transformer-based object-detection models on the COCO object-detection benchmark dataset. Experimental results demonstrate that our Focal DETR achieves 44.7 AP on the COCO 2017 test set, which is 2.7 AP and 0.9 AP higher than DETR and Deformable DETR, respectively, using the same training strategy and the same feature-extraction network.
43

Semenyuk, Vladislav, Ildar Kurmashev, Dmitriy Alyoshin, Liliya Kurmasheva, Vasiliy Serbin, and Alessandro Cantelli-Forti. "Study of the Possibility to Combine Deep Learning Neural Networks for Recognition of Unmanned Aerial Vehicles in Optoelectronic Surveillance Channels". Modelling 5, no. 4 (21 November 2024): 1773–88. http://dx.doi.org/10.3390/modelling5040092.

Abstract:
This article explores the challenges of integrating two deep learning neural networks, YOLOv5 and RT-DETR, to enhance the recognition of unmanned aerial vehicles (UAVs) within the optical-electronic channels of Sensor Fusion systems. The authors conducted an experimental study to test YOLOv5 and Faster RT-DETR in order to identify the average accuracy of UAV recognition. A dataset of images of two classes of objects, UAVs and birds, was prepared in advance. The total number of images, including augmentation, amounted to 6337. The authors implemented training, verification, and testing of the neural networks using the PyCharm 2024 IDE. Inference testing was conducted using six videos with UAV flights. On all test videos, RT-DETR-R50 was more accurate by an average of 18.7% in terms of average classification accuracy (Pc). In terms of operating speed, YOLOv5 was 3.4 ms faster. It has been established that using RT-DETR as the only module for UAV classification in optical-electronic detection channels is not effective because of the large volume of calculations, which stems from the relatively large number of parameters. Based on the obtained results, an algorithm for combining the two neural networks is proposed, which increases the accuracy of UAV and bird classification without significant losses in speed.
44

Zhang, Gege, Luping Wang, and Zengping Chen. "A Step-Wise Domain Adaptation Detection Transformer for Object Detection under Poor Visibility Conditions". Remote Sensing 16, no. 15 (25 July 2024): 2722. http://dx.doi.org/10.3390/rs16152722.

Abstract:
To address the performance degradation of cross-domain object detection under various illumination conditions and adverse weather scenarios, this paper introduces a novel method called Step-wise Domain Adaptation DEtection TRansformer (SDA-DETR). Our approach decomposes the adaptation process into three sequential steps, progressively transferring knowledge from a labeled dataset to an unlabeled one using the DETR (DEtection TRansformer) architecture. Each step precisely reduces domain discrepancy, thereby facilitating effective transfer learning. In the initial step, a target-like domain is constructed as an auxiliary to the source domain to reduce the domain gap at the image level. Then, we adaptively align the source domain and target domain features at both global and local levels. To further mitigate model bias towards the source domain, we develop a token-masked autoencoder (t-MAE) to enhance target domain features at the semantic level. Comprehensive experiments demonstrate that the SDA-DETR outperforms several popular cross-domain object detection methods on three challenging public driving datasets.
45

Yang, Yu, Xin Wang, Zhenfang Liu, Min Huang, Shangpeng Sun, and Qibing Zhu. "Detection of multi-size peach in orchard using RGB-D camera combined with an improved DEtection Transformer model". Intelligent Data Analysis: An International Journal 27, no. 5 (September 2023): 1539–54. https://doi.org/10.3233/ida-220449.

Abstract:
The first major contribution of the paper is the use of an improved DEtection Transformer network (named R2N-DETR) with a Kinect-V2 camera for detecting multi-size peaches in orchards with varied illumination and fruit occlusion. The R2N-DETR model first employs Res2Net-50 to extract a fused low-high level feature map containing fine spatial features and precise semantic information of multi-size peaches from Red-Green-Blue-Depth (RGB-D) images. Second, the encoder-decoder is applied to the feature map to obtain the global context. Finally, all objects are detected according to each object’s global context. On 1101 RGB-D images (captured in two orchards over three years), the R2N-DETR model achieves an average precision of 0.944 and an average detection time of 53 ms per image. The developed system could provide precise visual guidance for robotic picking and contribute to improving yield prediction by providing accurate fruit counting.
46

Kong, Yaning, Xiangfeng Shang, and Shijie Jia. "Drone-DETR: Efficient Small Object Detection for Remote Sensing Image Using Enhanced RT-DETR Model". Sensors 24, no. 17 (24 August 2024): 5496. http://dx.doi.org/10.3390/s24175496.

Abstract:
Performing low-latency, high-precision object detection on unmanned aerial vehicles (UAVs) equipped with vision sensors holds significant importance. However, the current limitations of embedded UAV devices present challenges in balancing accuracy and speed, particularly in the analysis of high-precision remote sensing images. This challenge is particularly pronounced in scenarios involving numerous small objects, intricate backgrounds, and occluded overlaps. To address these issues, we introduce the Drone-DETR model, which is based on RT-DETR. To overcome the difficulties associated with detecting small objects and reducing redundant computations arising from complex backgrounds in ultra-wide-angle images, we propose the Effective Small Object Detection Network (ESDNet). This network preserves detailed information about small objects, reduces redundant computations, and adopts a lightweight architecture. Furthermore, we introduce the Enhanced Dual-Path Feature Fusion Attention Module (EDF-FAM) within the neck network. This module is specifically designed to enhance the network’s ability to handle multi-scale objects. We employ a dynamic competitive learning strategy to enhance the model’s capability to efficiently fuse multi-scale features. Additionally, we incorporate the P2 shallow feature layer from the ESDNet into the neck network to enhance the model’s ability to fuse small-object features, thereby enhancing the accuracy of small object detection. Experimental results indicate that the Drone-DETR model achieves an mAP50 of 53.9% with only 28.7 million parameters on the VisDrone2019 dataset, representing an 8.1% enhancement over RT-DETR-R18.
47

Xie, Wei, Fei Wu, Chao Ouyang, Yan Yang, Jian Qian, Shuang Lin, Chenxi Zhou, and Jun Zhang. "Enhanced Self-Supervised Transmission Inspection with Improved Region Prior and Scale Variation". Processes 12, no. 12 (19 December 2024): 2913. https://doi.org/10.3390/pr12122913.

Abstract:
As an important means to ensure the safety of power transmission, the inspection of overhead transmission lines requires high accuracy for detecting small objects on the transmission lines and relies heavily on the construction of large-scale datasets by using deep learning instead of manual inspection. However, transmission inspection data often involve some sensitive information and need to be labeled by professionals, so it is difficult to construct a large transmission inspection dataset. In order to solve the problem of how to effectively train only on a small amount of transmission line data and achieve high object detection accuracy considering the large-scale variation in transmission objects, we propose an enhanced self-supervised pre-training model for DETR-like models, which are innovative object detectors eliminating hand-crafted non-maximum suppression and manual anchor design compared to previous CNN-based detectors. This paper mainly covers the following two points: (i) We compare UP-DETR and DETReg, noting that UP-DETR’s random cropping method performs poorly on small datasets and affects DETR’s localization ability. To address this, we adopt DETReg’s approach, replacing Selective Search with Edge Boxes for better results. (ii) To tackle large-scale variations in transmission inspection datasets, we propose a multi-scale feature reconstruction task, aligning feature embeddings with multi-scale encoder embeddings, and enhancing multi-scale object detection. Our method surpasses UP-DETR and DETReg with DETR variants when fine-tuning on PASCAL VOC and PTL-AI Furnas for object detection.
48

Liao, Wenbing, and Wenwen Li. "Research on DETR-based Weed Detection Algorithm". Frontiers in Computing and Intelligent Systems 11, no. 1 (21 January 2025): 97–101. https://doi.org/10.54097/7n2ye068.

Abstract:
To address complex field conditions and the high similarity between corn seedlings and weeds, this study proposes an improved DETR (Detection Transformer) model for weed detection in corn fields. The model adds the CBAM convolutional attention mechanism to DETR and replaces the traditional cross-entropy loss function with a focal loss function to balance the numbers of positive and negative samples. Compared with the original model, the mAP (mean Average Precision) of the improved model increased by 1.12% to 91.56%.
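To make the contrast between cross-entropy and focal loss concrete, the following sketch implements the standard binary focal loss, which scales cross-entropy by a factor that shrinks for well-classified samples. The defaults `gamma=2.0` and `alpha=0.25` are the commonly used values from the original focal loss formulation and are an assumption here; the abstract does not report the settings used in the weed detection model.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for predicted probability p (0 < p < 1) and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are assumed defaults, not values taken from the paper.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t)**gamma down-weights samples the model already classifies well.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy, hard = 0.9, 0.1  # predicted probability for a positive sample
    print("cross-entropy easy/hard:", cross_entropy(easy, 1), cross_entropy(hard, 1))
    print("focal loss    easy/hard:", focal_loss(easy, 1), focal_loss(hard, 1))
```

With `gamma=0` and `alpha=0.5` the focal loss reduces to half the cross-entropy, so the modulating factor is the only thing that shifts weight toward hard samples.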
49

Peng, Jinmin, Weipeng Fan, Song Lan, and Dingran Wang. "MDD-DETR: Lightweight Detection Algorithm for Printed Circuit Board Minor Defects". Electronics 13, no. 22 (13 November 2024): 4453. http://dx.doi.org/10.3390/electronics13224453.

Abstract:
PCBs (printed circuit boards) are the core components of modern electronic devices, and inspecting them for defects will have a direct impact on the performance, reliability and cost of the product. However, the performance of current detection algorithms in identifying minor PCB defects (e.g., mouse bite and spur) still requires improvement. This paper presents the MDD-DETR algorithm for detecting minor defects in PCBs. The backbone network, MDDNet, is used to efficiently extract features while significantly reducing the number of parameters. Simultaneously, the HiLo attention mechanism captures both high- and low-frequency features, transmitting a broader range of gradient information to the neck. Additionally, the proposed SOEP neck network effectively fuses scale features, particularly those rich in small targets, while INM-IoU loss function optimization enables more effective distinction between defects and background, further improving detection accuracy. Experimental results on the PCB_DATASET show that MDD-DETR achieves a 99.3% mAP, outperforming RT-DETR by 2.0% and reducing parameters by 32.3%, thus effectively addressing the challenges of detecting minor PCB defects.
50

Arriola-Valverde, Sergio, Renato Rimolo-Donadio, Karolina Villagra-Mendoza, Alfonso Chacón-Rodriguez, Ronny García-Ramirez, and Eduardo Somarriba-Chavez. "A Comparative Study of Deep Learning Frameworks Applied to Coffee Plant Detection from Close-Range UAS-RGB Imagery in Costa Rica". Remote Sensing 16, no. 24 (10 December 2024): 4617. https://doi.org/10.3390/rs16244617.

Abstract:
Introducing artificial intelligence techniques in agriculture offers new opportunities for improving crop management, such as in coffee plantations, which constitute a complex agroforestry environment. This paper presents a comparative study of three deep learning frameworks: Deep Forest, RT-DETR, and Yolov9, customized for coffee plant detection and trained on images with a high spatial resolution (cm/pix). Each frame had dimensions of 640 × 640 pixels and was acquired from passive RGB sensors onboard an Unmanned Aerial System (UAS). The image set was structured and consolidated from UAS-RGB imagery acquired at six locations along the Central Valley, Costa Rica, through automated photogrammetric missions. The RT-DETR and Yolov9 frameworks achieved adequate generalization and detection, with mAP50 values higher than 90% and mAP50-95 values higher than 54% in application scenarios with data augmentation techniques. Deep Forest also achieved good metrics, but noticeably lower than those of the other frameworks. RT-DETR and Yolov9 were able to generalize and detect coffee plants in unseen scenarios that include complex forest structures within tropical agroforestry systems (AFS).