Journal articles on the topic 'Convolution dilatée' (dilated convolution)

Consult the top 50 journal articles for your research on the topic 'Convolution dilatée.'

1

Wang, Wei, Yiyang Hu, Ting Zou, Hongmei Liu, Jin Wang, and Xin Wang. "A New Image Classification Approach via Improved MobileNet Models with Local Receptive Field Expansion in Shallow Layers." Computational Intelligence and Neuroscience 2020 (August 1, 2020): 1–10. http://dx.doi.org/10.1155/2020/8817849.

Abstract:
Because deep neural networks (DNNs) are both memory-intensive and computation-intensive, they are difficult to apply to embedded systems with limited hardware resources. Therefore, DNN models need to be compressed and accelerated. By applying depthwise separable convolutions, MobileNet can decrease the number of parameters and computational complexity with little loss of classification precision. Based on MobileNet, three improved MobileNet models with local receptive field expansion in shallow layers, also called Dilated-MobileNet (Dilated Convolution MobileNet) models, are proposed, in which dilated convolutions are introduced into a specific convolutional layer of the MobileNet model. Without increasing the number of parameters, dilated convolutions are used to increase the receptive field of the convolution filters to obtain better classification accuracy. The experiments were performed on the Caltech-101, Caltech-256, and Tübingen Animals with Attributes datasets. The results show that Dilated-MobileNets can obtain up to 2% higher classification accuracy than MobileNet.
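As a rough illustration of the claim that dilation enlarges the receptive field without adding parameters, the sketch below computes the extent covered by a single dilated kernel, k + (k − 1)(d − 1), next to the weight count of a standard 2D convolution. The function names are hypothetical, and the weight count assumes a plain (not depthwise separable) convolution for simplicity; nothing here is taken from the paper's code.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Extent (in input pixels) covered by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a standard 2D convolution (bias excluded).

    The dilation rate does not appear here: it changes where the kernel
    samples the input, not how many weights the kernel has.
    """
    return kernel_size * kernel_size * in_channels * out_channels


if __name__ == "__main__":
    for d in (1, 2, 4):
        print(d, effective_kernel_size(3, d), conv_weight_count(3, 32, 32))
```

For a 3 × 3 kernel, dilation rates 1, 2, and 4 cover 3, 5, and 9 pixels per axis, while the weight count stays the same.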
2

Peng, Wenli, Shenglai Zhen, Xin Chen, Qianjing Xiong, and Benli Yu. "Study on convolutional recurrent neural networks for speech enhancement in fiber-optic microphones." Journal of Physics: Conference Series 2246, no. 1 (April 1, 2022): 012084. http://dx.doi.org/10.1088/1742-6596/2246/1/012084.

Abstract:
In this paper, several improved convolutional recurrent networks (CRN) are proposed, which can enhance speech with non-additive distortion captured by fiber-optic microphones. Our preliminary study shows that the original CRN structure based on amplitude spectrum estimation is seriously distorted due to the loss of phase information. Therefore, we transform the network to run in the time domain and gain improvements of 0.42 in PESQ and 0.03 in STOI. In addition, we integrate dilated convolution into the CRN architecture, and adopt three different types of bottleneck modules, namely long short-term memory (LSTM), gated recurrent units (GRU) and dilated convolutions. The experimental results show that the model with dilated convolution in the encoder-decoder and the model with dilated convolution at the bottleneck layer have the highest PESQ and STOI scores, respectively.
3

Chim, Seyha, Jin-Gu Lee, and Ho-Hyun Park. "Dilated Skip Convolution for Facial Landmark Detection." Sensors 19, no. 24 (December 4, 2019): 5350. http://dx.doi.org/10.3390/s19245350.

Abstract:
Facial landmark detection has gained enormous interest for face-related applications due to its success in facial analysis tasks such as facial recognition, cartoon generation, face tracking and facial expression analysis. Many studies have been proposed and implemented to deal with the challenging problems of localizing facial landmarks from given images, including large appearance variations and partial occlusion. Studies have differed in the way they use the facial appearances and shape information of input images. In our work, we consider facial information within both global and local contexts. We aim to obtain local pixel-level accuracy for local-context information in the first stage and integrate this with knowledge of spatial relationships between each key point in a whole image for global-context information in the second stage. Thus, the pipeline of our architecture consists of two main components: (1) a deep network for local-context subnet that generates detection heatmaps via fully convolutional DenseNets with additional kernel convolution filters and (2) a dilated skip convolution subnet—a combination of dilated convolutions and skip-connections networks—that are in charge of robustly refining the local appearance heatmaps. Through this proposed architecture, we demonstrate that our approach achieves state-of-the-art performance on challenging datasets—including LFPW, HELEN, 300W and AFLW2000-3D—by leveraging fully convolutional DenseNets, skip-connections and dilated convolution architecture without further post-processing.
4

Zhao, Feng, Junjie Zhang, Zhe Meng, and Hanqiang Liu. "Densely Connected Pyramidal Dilated Convolutional Network for Hyperspectral Image Classification." Remote Sensing 13, no. 17 (August 26, 2021): 3396. http://dx.doi.org/10.3390/rs13173396.

Abstract:
Recently, with the extensive application of deep learning techniques in the hyperspectral image (HSI) field, particularly convolutional neural networks (CNNs), research on HSI classification has stepped into a new stage. To avoid the problem that the receptive field of naive convolution is small, dilated convolution has been introduced into the field of HSI classification. However, dilated convolution usually generates blind spots in the receptive field, so the spatial information obtained is discontinuous. To solve this problem, a densely connected pyramidal dilated convolutional network (PDCNet) is proposed in this paper. Firstly, a pyramidal dilated convolutional (PDC) layer that integrates different numbers of sub-dilated convolutional layers is proposed, where the dilation factor of the sub-dilated convolutions increases exponentially, achieving multi-scale receptive fields. Secondly, the number of sub-dilated convolutional layers increases in a pyramidal pattern with the depth of the network, thereby capturing more comprehensive hyperspectral information in the receptive field. Furthermore, a feature fusion mechanism combining pixel-by-pixel addition and channel stacking is adopted to extract more abstract spectral–spatial features. Finally, in order to reuse the features of the previous layers more effectively, dense connections are applied in densely pyramidal dilated convolutional (DPDC) blocks. Experiments on three well-known HSI datasets indicate that the PDCNet proposed in this paper has good classification performance compared with other popular models.
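The "blind spots" mentioned in this abstract can be made concrete with a small 1D sketch. It lists which input offsets a stack of dilated convolutions can reach from one output position, and checks whether any position inside that span is never sampled. The helper names and the 1D, odd-kernel simplification are assumptions made for illustration; this is not the PDCNet architecture.

```python
def covered_offsets(dilations, kernel_size=3):
    """Input offsets (relative to one output position) visible through a
    stack of 1D dilated convolutions. Assumes an odd kernel size."""
    half = kernel_size // 2
    offsets = {0}
    for d in dilations:
        taps = [d * t for t in range(-half, half + 1)]
        offsets = {o + t for o in offsets for t in taps}
    return sorted(offsets)


def has_blind_spots(dilations, kernel_size=3):
    """True if some position inside the receptive-field span is never sampled."""
    offsets = covered_offsets(dilations, kernel_size)
    return len(offsets) < offsets[-1] - offsets[0] + 1


if __name__ == "__main__":
    print(has_blind_spots([2, 2, 2]))  # constant rate: only even offsets are reached
    print(has_blind_spots([1, 2, 4]))  # exponentially increasing rates
```

With a constant rate of 2, only even offsets are ever sampled, whereas the exponentially increasing rates 1, 2, 4 cover every offset from −7 to 7.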
5

Tang, Jingfan, Meijia Zhou, Pengfei Li, Min Zhang, and Ming Jiang. "Crowd Counting Based on Multiresolution Density Map and Parallel Dilated Convolution." Scientific Programming 2021 (January 20, 2021): 1–10. http://dx.doi.org/10.1155/2021/8831458.

Abstract:
The current crowd counting tasks rely on a fully convolutional network to generate a density map that can achieve good performance. However, due to the crowd occlusion and perspective distortion in the image, the directly generated density map usually neglects the scale information and spatial contact information. To solve it, we proposed MDPDNet (Multiresolution Density maps and Parallel Dilated convolutions’ Network) to reduce the influence of occlusion and distortion on crowd estimation. This network is composed of two modules: (1) the parallel dilated convolution module (PDM) that combines three dilated convolutions in parallel to obtain the deep features on the larger receptive field with fewer parameters while reducing the loss of multiscale information; (2) the multiresolution density map module (MDM) that contains three-branch networks for extracting spatial contact information on three different low-resolution density maps as the feature input of the final crowd density map. Experiments show that MDPDNet achieved excellent results on three mainstream datasets (ShanghaiTech, UCF_CC_50, and UCF-QNRF).
6

Ma, Tian, Xinlei Zhou, Jiayi Yang, Boyang Meng, Jiali Qian, Jiehui Zhang, and Gang Ge. "Dental Lesion Segmentation Using an Improved ICNet Network with Attention." Micromachines 13, no. 11 (November 7, 2022): 1920. http://dx.doi.org/10.3390/mi13111920.

Abstract:
Precise segmentation of tooth lesions is critical to the creation of an intelligent tooth lesion detection system. As a solution to the problem that tooth lesions are similar to normal tooth tissues and difficult to segment, an improved segmentation method based on the image cascade network (ICNet) is proposed to segment various lesion types, such as calculus, gingivitis, and tartar. First, the ICNet model is used to achieve real-time segmentation of lesions. Second, the Convolutional Block Attention Module (CBAM) is integrated into the ICNet structure, and large-size convolutions in the spatial attention module are replaced with layered dilated convolutions to enhance the relevant features while suppressing useless features and solve the problem of inaccurate lesion segmentations. Finally, part of the convolution in the network model is replaced with an asymmetric convolution to reduce the calculations added by the attention module. Experimental results show that, compared with Fully Convolutional Networks (FCN), U-Net, SegNet, and other segmentation algorithms, our method significantly improves the segmentation effect and achieves a higher image processing frequency, which satisfies the real-time and accuracy requirements of tooth lesion segmentation.
7

Song, Zhendong, Yupeng Ma, Fang Tan, and Xiaoyi Feng. "Hybrid Dilated and Recursive Recurrent Convolution Network for Time-Domain Speech Enhancement." Applied Sciences 12, no. 7 (March 29, 2022): 3461. http://dx.doi.org/10.3390/app12073461.

Abstract:
In this paper, we propose a fully convolutional neural network based on recursive recurrent convolution for monaural speech enhancement in the time domain. The proposed network is an encoder-decoder structure using a series of hybrid dilated modules (HDM). The encoder creates low-dimensional features of a noisy input frame. In the HDM, the dilated convolution is used to expand the receptive field of the network model. In contrast, the standard convolution is used to make up for the under-utilized local information of the dilated convolution. The decoder is used to reconstruct enhanced frames. The recursive recurrent convolutional network uses GRU to solve the problem of multiple training parameters and complex structures. State-of-the-art results are achieved on two commonly used speech datasets.
8

Viriyasaranon, Thanaporn, Seung-Hoon Chae, and Jang-Hwan Choi. "MFA-net: Object detection for complex X-ray cargo and baggage security imagery." PLOS ONE 17, no. 9 (September 1, 2022): e0272961. http://dx.doi.org/10.1371/journal.pone.0272961.

Abstract:
Deep convolutional networks have been developed to detect prohibited items for automated inspection of X-ray screening systems in the transport security system. To our knowledge, the existing frameworks were developed to recognize threats using only baggage security X-ray scans. Therefore, the detection accuracy in other domains of security X-ray scans, such as cargo X-ray scans, cannot be ensured. We propose an object detection method for efficiently detecting contraband items in both cargo and baggage for X-ray security scans. The proposed network, MFA-net, consists of three plug-and-play modules, including the multiscale dilated convolutional module, fusion feature pyramid network, and auxiliary point detection head. First, the multiscale dilated convolutional module converts the standard convolution of the detector backbone to a conditional convolution by aggregating the features from multiple dilated convolutions using dynamic feature selection to overcome the object-scale variant issue. Second, the fusion feature pyramid network combines the proposed attention and fusion modules to enhance multiscale object recognition and alleviate the object and occlusion problem. Third, the auxiliary point detection head adopts an auxiliary head to predict the new keypoints of the bounding box to emphasize the localizability without requiring further ground-truth information. We tested the performance of the MFA-net on two large-scale X-ray security image datasets from different domains: a Security Inspection X-ray (SIXray) dataset in the baggage domain and our dataset, named CargoX, in the cargo domain. Moreover, MFA-net outperformed state-of-the-art object detectors in both domains. Thus, adopting the proposed modules can further increase the detection capability of the current object detectors on X-ray security images.
9

Zhang, Jianming, Chaoquan Lu, Jin Wang, Lei Wang, and Xiao-Guang Yue. "Concrete Cracks Detection Based on FCN with Dilated Convolution." Applied Sciences 9, no. 13 (July 1, 2019): 2686. http://dx.doi.org/10.3390/app9132686.

Abstract:
In civil engineering, the stability of concrete is of great significance to the safety of people’s lives and property, so it is necessary to detect concrete damage effectively. In this paper, we treat crack detection on concrete surfaces as a semantic segmentation task that distinguishes background from crack at the pixel level. Inspired by Fully Convolutional Networks (FCN), we propose a fully convolutional network based on dilated convolution for concrete crack detection, which consists of an encoder and a decoder. Specifically, we first used the residual network to extract the feature maps of the input image, designed dilated convolutions with different dilation rates to extract the feature maps of different receptive fields, and fused the extracted features from multiple branches. Then, we used stacked deconvolutions to up-sample the fused feature maps. Finally, we used the SoftMax function to classify the feature maps at the pixel level. In order to verify the validity of the model, we introduced the commonly used evaluation indicators of semantic segmentation: Pixel Accuracy (PA), Mean Pixel Accuracy (MPA), Mean Intersection over Union (MIoU), and Frequency Weighted Intersection over Union (FWIoU). The experimental results show that the proposed model converges faster and has better generalization performance on the test set by introducing dilated convolutions with different dilation rates and a multi-branch fusion strategy. Our model has a PA of 96.84%, MPA of 92.55%, MIoU of 86.05% and FWIoU of 94.22% on the test set, which is superior to other models.
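The four indicators named in this abstract are commonly computed from a class confusion matrix. The sketch below uses the usual textbook definitions; the function name is hypothetical, the toy matrix is invented for illustration, and the code assumes every class occurs at least once (otherwise a division by zero would occur). It is not the authors' evaluation code.

```python
def segmentation_metrics(confusion):
    """confusion[i][j] = number of pixels of true class i predicted as class j."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    true_pos = [confusion[i][i] for i in range(n)]
    actual = [sum(confusion[i]) for i in range(n)]  # pixels per true class
    predicted = [sum(confusion[j][i] for j in range(n)) for i in range(n)]  # pixels per predicted class
    # Per-class IoU: intersection over union of true and predicted regions.
    iou = [true_pos[i] / (actual[i] + predicted[i] - true_pos[i]) for i in range(n)]
    return {
        "PA": sum(true_pos) / total,
        "MPA": sum(true_pos[i] / actual[i] for i in range(n)) / n,
        "MIoU": sum(iou) / n,
        "FWIoU": sum(actual[i] * iou[i] for i in range(n)) / total,
    }


if __name__ == "__main__":
    # Toy two-class example (background, crack); the numbers are made up.
    print(segmentation_metrics([[8, 2], [1, 9]]))
```

FWIoU weights each class IoU by how often that class appears, so on a crack dataset dominated by background pixels it can sit well above MIoU.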
10

Rahman, Takowa, Md Saiful Islam, and Jia Uddin. "MRI-Based Brain Tumor Classification Using a Dilated Parallel Deep Convolutional Neural Network." Digital 4, no. 3 (June 28, 2024): 529–54. http://dx.doi.org/10.3390/digital4030027.

Abstract:
Brain tumors are frequently classified with high accuracy using convolutional neural networks (CNNs) to better comprehend the spatial connections among pixels in complex pictures. Due to their tiny receptive fields, the majority of deep convolutional neural network (DCNN)-based techniques overfit and are unable to extract global context information from more significant regions. While dilated convolution retains data resolution at the output layer and increases the receptive field without adding computation, stacking several dilated convolutions has the drawback of producing a grid effect. This research suggests a dilated parallel deep convolutional neural network (PDCNN) architecture that preserves a wide receptive field in order to handle gridding artifacts and extract both coarse and fine features from the images. This article applies multiple preprocessing strategies to the input MRI images used to train the model. By contrasting various dilation rates, the global path uses a low dilation rate (2,1,1), while the local path uses a high dilation rate (4,2,1) for decremental even numbers to tackle gridding artifacts and to extract both coarse and fine features from the two parallel paths. Using three different types of MRI datasets, the suggested dilated PDCNN with the average ensemble method performs best. The accuracy achieved for the multiclass Kaggle dataset-III, Figshare dataset-II, and binary tumor identification dataset-I is 98.35%, 98.13%, and 98.67%, respectively. In comparison to state-of-the-art techniques, the suggested structure improves results by extracting both fine and coarse features, making it efficient.
11

Cao, Ruifen, Xi Pei, Ning Ge, and Chunhou Zheng. "Clinical Target Volume Auto-Segmentation of Esophageal Cancer for Radiotherapy After Radical Surgery Based on Deep Learning." Technology in Cancer Research & Treatment 20 (January 1, 2021): 153303382110342. http://dx.doi.org/10.1177/15330338211034284.

Abstract:
Radiotherapy plays an important role in controlling the local recurrence of esophageal cancer after radical surgery. Segmentation of the clinical target volume is a key step in radiotherapy treatment planning, but it is time-consuming and operator-dependent. This paper introduces a deep dilated convolutional U-network to achieve fast and accurate clinical target volume auto-segmentation of esophageal cancer after radical surgery. The deep dilated convolutional U-network, which integrates the advantages of dilated convolution and the U-network, is an end-to-end architecture that enables rapid training and testing. A dilated convolution module for extracting multiscale context features containing the original information on fine texture and boundaries is integrated into the U-network architecture to avoid information loss due to down-sampling and improve the segmentation accuracy. In addition, batch normalization is added to the deep dilated convolutional U-network for fast and stable convergence. In the present study, the training and validation loss tended to be stable after 40 training epochs. This deep dilated convolutional U-network model was able to segment the clinical target volume with an overall mean Dice similarity coefficient of 86.7% and a respective 95% Hausdorff distance of 37.4 mm, indicating reasonable volume overlap of the auto-segmented and manual contours. The mean Cohen kappa coefficient was 0.863, indicating that the deep dilated convolutional U-network was robust. Comparisons with the U-network and attention U-network showed that the overall performance of the deep dilated convolutional U-network was best for the Dice similarity coefficient, 95% Hausdorff distance, and Cohen kappa coefficient. The test time for segmentation of the clinical target volume was approximately 25 seconds per patient. This deep dilated convolutional U-network could be applied in the clinical setting to save time in delineation and improve the consistency of contouring.
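For readers unfamiliar with the Dice similarity coefficient reported in this abstract, a minimal sketch of the standard definition, 2|A ∩ B| / (|A| + |B|), follows. The function name and the flat-list mask representation are assumptions for illustration only; returning 1.0 for two empty masks is a convention chosen here, not something stated in the paper.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two flat binary masks of equal length."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    size_sum = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if size_sum == 0 else 2.0 * intersection / size_sum


if __name__ == "__main__":
    print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))
```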
12

Yang, Xing-Yao, Shao-Dong Zhang, Rui Xiao, Jiong Yu, and Zi-Yang Li. "Speech Recognition of Accented Mandarin Based on Improved Conformer." Sensors 23, no. 8 (April 16, 2023): 4025. http://dx.doi.org/10.3390/s23084025.

Abstract:
The convolution module in Conformer is capable of providing translationally invariant convolution in time and space. This is often used in Mandarin recognition tasks to address the diversity of speech signals by treating the time-frequency maps of speech signals as images. However, convolutional networks are more effective in local feature modeling, while dialect recognition tasks require the extraction of a long sequence of contextual information features; therefore, the SE-Conformer-TCN is proposed in this paper. By embedding the squeeze-excitation block into the Conformer, the interdependence between the features of channels can be explicitly modeled to enhance the model’s ability to select interrelated channels, thus increasing the weight of effective speech spectrogram features and decreasing the weight of ineffective or less effective feature maps. The multi-head self-attention and temporal convolutional network is built in parallel, in which the dilated causal convolutions module can cover the input time series by increasing the expansion factor and convolutional kernel to capture the location information implied between the sequences and enhance the model’s access to location information. Experiments on four public datasets demonstrate that the proposed model has a higher performance for the recognition of Mandarin with an accent, and the sentence error rate is reduced by 2.1% compared to the Conformer, with only 4.9% character error rate.
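The dilated causal convolution mentioned in this abstract can be sketched in a few lines of plain Python. Each output sample depends only on the current and earlier inputs, spaced `dilation` steps apart, and the receptive field of a stack grows as 1 + (k − 1) · Σd. The function names are hypothetical and the zero-padding of the past is an assumption; this is a generic temporal-convolution sketch, not the SE-Conformer-TCN implementation.

```python
def causal_dilated_conv1d(signal, weights, dilation):
    """y[t] = sum_k weights[k] * signal[t - k * dilation], treating samples
    before t = 0 as zero, so no future input is ever used."""
    output = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * signal[idx]
        output.append(acc)
    return output


def stack_receptive_field(kernel_size, dilations):
    """Number of past-and-present time steps seen by a stack of causal layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    print(causal_dilated_conv1d([1, 2, 3, 4, 5], [1, 1], dilation=2))
    print(stack_receptive_field(3, [1, 2, 4, 8]))
```

Doubling the dilation at each layer makes the receptive field grow roughly exponentially with depth, which is why such stacks can cover long input sequences with few layers.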
13

Yang, Hongbo, and Shi Qiu. "A Novel Dynamic Contextual Feature Fusion Model for Small Object Detection in Satellite Remote-Sensing Images." Information 15, no. 4 (April 18, 2024): 230. http://dx.doi.org/10.3390/info15040230.

Abstract:
Ground objects in satellite images pose unique challenges due to their low resolution, small pixel size, lack of texture features, and dense distribution. Detecting small objects in satellite remote-sensing images is a difficult task. We propose a new detector focusing on contextual information and multi-scale feature fusion. Inspired by the notion that surrounding context information can aid in identifying small objects, we propose a lightweight context convolution block based on dilated convolutions and integrate it into the convolutional neural network (CNN). We integrate dynamic convolution blocks during the feature fusion step to enhance the high-level feature upsampling. An attention mechanism is employed to focus on the salient features of objects. We have conducted a series of experiments to validate the effectiveness of our proposed model. Notably, the proposed model achieved a 3.5% mean average precision (mAP) improvement on the satellite object detection dataset. Another feature of our approach is its lightweight design. We employ group convolution to reduce the computational cost in the proposed contextual convolution module. Compared to the baseline model, our method reduces the number of parameters by 30% and the computational cost by 34%, while keeping an FPS rate close to that of the baseline model. We also validate the detection results through a series of visualizations.
14

Wang, Ran, Ruyu Shi, Xiong Hu, and Changqing Shen. "Remaining Useful Life Prediction of Rolling Bearings Based on Multiscale Convolutional Neural Network with Integrated Dilated Convolution Blocks." Shock and Vibration 2021 (January 25, 2021): 1–11. http://dx.doi.org/10.1155/2021/6616861.

Abstract:
Remaining useful life (RUL) prediction is necessary for guaranteeing machinery’s safe operation. Among deep learning architectures, the convolutional neural network (CNN) has shown achievements in RUL prediction because of its strong ability in representation learning. Features from different receptive fields extracted by different sizes of convolution kernels can provide complete information for prognosis. With the single-size convolution kernel of a traditional CNN, it is difficult to learn comprehensive information from complex signals. Besides, the ability of a conventional CNN to learn local and global features synchronously is limited. Thus, a multiscale convolutional neural network (MS-CNN) is introduced to overcome these problems. Convolution filters with different dilation rates are integrated to form a dilated convolution block, which can learn features in different receptive fields. Then, several stacked integrated dilated convolution blocks at different depths are concatenated to extract local and global features. The effectiveness of the proposed method is verified on a bearing dataset prepared from the PRONOSTIA platform. The results show that the proposed MS-CNN has higher prediction accuracy than many other deep learning-based RUL methods.
15

Ho, David Joon, and Qian Lin. "Person Segmentation Using Convolutional Neural Networks With Dilated Convolutions." Electronic Imaging 2018, no. 10 (January 28, 2018): 455–1. http://dx.doi.org/10.2352/issn.2470-1173.2018.10.imawm-455.

16

Contreras, Jonatan, Martine Ceberio, and Vladik Kreinovich. "Why Dilated Convolutional Neural Networks: A Proof of Their Optimality." Entropy 23, no. 6 (June 18, 2021): 767. http://dx.doi.org/10.3390/e23060767.

Abstract:
One of the most effective image processing techniques is the use of convolutional neural networks that use convolutional layers. In each such layer, the value of the layer’s output signal at each point is a combination of the layer’s input signals corresponding to several neighboring points. To improve the accuracy, researchers have developed a version of this technique, in which only data from some of the neighboring points is processed. It turns out that the most efficient case—called dilated convolution—is when we select the neighboring points whose differences in both coordinates are divisible by some constant ℓ. In this paper, we explain this empirical efficiency by proving that for all reasonable optimality criteria, dilated convolution is indeed better than possible alternatives.
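The definition in this abstract, that a dilated convolution combines only those neighboring points whose coordinate differences are multiples of a constant ℓ, can be sketched directly. The function below is a hypothetical, unpadded, stride-1 version on nested lists, written as cross-correlation (the form deep learning libraries usually call convolution); it is meant only to show the sampling pattern.

```python
def dilated_conv2d(image, kernel, dilation):
    """'Valid' 2D dilated cross-correlation on nested lists (no padding, stride 1).

    For output position (r, c), only inputs at (r + i * dilation, c + j * dilation)
    are combined, i.e. points whose coordinate offsets are multiples of `dilation`.
    """
    kh, kw = len(kernel), len(kernel[0])
    span_h = (kh - 1) * dilation + 1  # rows covered by the dilated kernel
    span_w = (kw - 1) * dilation + 1  # columns covered by the dilated kernel
    out_h = len(image) - span_h + 1
    out_w = len(image[0]) - span_w + 1
    return [
        [
            sum(
                kernel[i][j] * image[r + i * dilation][c + j * dilation]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


if __name__ == "__main__":
    image = [[5 * r + c for c in range(5)] for r in range(5)]
    ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    print(dilated_conv2d(image, ones, dilation=2))
```

With `dilation=1` the function reduces to an ordinary 3 × 3 sliding window; with `dilation=2` the same nine weights are spread over a 5 × 5 area.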
17

Hu, Yicheng, Shufang Tian, and Jia Ge. "Hybrid Convolutional Network Combining Multiscale 3D Depthwise Separable Convolution and CBAM Residual Dilated Convolution for Hyperspectral Image Classification." Remote Sensing 15, no. 19 (October 1, 2023): 4796. http://dx.doi.org/10.3390/rs15194796.

Abstract:
In recent years, convolutional neural networks (CNNs) have been increasingly leveraged for the classification of hyperspectral imagery, displaying notable advancements. To address the issues of insufficient spectral and spatial information extraction and high computational complexity in hyperspectral image classification, we introduce the MDRDNet, an integrated neural network model. This novel architecture is comprised of two main components: a Multiscale 3D Depthwise Separable Convolutional Network and a CBAM-augmented Residual Dilated Convolutional Network. The first component employs depthwise separable convolutions in a 3D setting to efficiently capture spatial–spectral characteristics, thus substantially reducing the computational burden associated with 3D convolutions. Meanwhile, the second component enhances the network by integrating the Convolutional Block Attention Module (CBAM) with dilated convolutions via residual connections, effectively counteracting the issue of model degradation. We have empirically evaluated the MDRDNet’s performance by running comprehensive experiments on three publicly available datasets: Indian Pines, Pavia University, and Salinas. Our findings indicate that the overall accuracy of the MDRDNet on the three datasets reached 98.83%, 99.81%, and 99.99%, respectively, which is higher than the accuracy of existing models. Therefore, the MDRDNet proposed in this study can fully extract spatial–spectral joint information, providing a new idea for solving the problem of large model calculations in 3D convolutions.
18

Hu, Guoping, Fangzheng Zhao, and Bingqi Liu. "Estimation of the Two-Dimensional Direction of Arrival for Low-Elevation and Non-Low-Elevation Targets Based on Dilated Convolutional Networks." Remote Sensing 15, no. 12 (June 14, 2023): 3117. http://dx.doi.org/10.3390/rs15123117.

Abstract:
This paper addresses the problem of the two-dimensional direction-of-arrival (2D DOA) estimation of low-elevation or non-low-elevation targets using L-shaped uniform and sparse arrays by analyzing the signal models’ features and their mapping to 2D DOA. This paper proposes a 2D DOA estimation algorithm based on the dilated convolutional network model, which consists of two components: a dilated convolutional autoencoder and a dilated convolutional neural network. If there are targets at low elevation, the dilated convolutional autoencoder suppresses the multipath signal and outputs a new signal covariance matrix as the input of the dilated convolutional neural network to directly perform 2D DOA estimation in the absence of a low-elevation target. The algorithm employs 3D convolution to fully retain and extract features. The simulation experiments and the analysis of their results revealed that for both L-shaped uniform and L-shaped sparse arrays, the dilated convolutional autoencoder could effectively suppress the multipath signals without affecting the direct wave and non-low-elevation targets, whereas the dilated convolutional neural network could effectively achieve 2D DOA estimation with a matching rate and an effective ratio of pitch and azimuth angles close to 100% without the need for additional parameter matching. Under the condition of a low signal-to-noise ratio, the estimation accuracy of the proposed algorithm was significantly higher than that of the traditional DOA estimation.
19

Wang, Yanjie, Shiyu Hu, Guodong Wang, Chenglizhao Chen, and Zhenkuan Pan. "Multi-scale dilated convolution of convolutional neural network for crowd counting." Multimedia Tools and Applications 79, no. 1-2 (October 17, 2019): 1057–73. http://dx.doi.org/10.1007/s11042-019-08208-6.

20

Wang, Yanjie, Guodong Wang, Chenglizhao Chen, and Zhenkuan Pan. "Multi-scale dilated convolution of convolutional neural network for image denoising." Multimedia Tools and Applications 78, no. 14 (February 23, 2019): 19945–60. http://dx.doi.org/10.1007/s11042-019-7377-y.

21

Heo, Woon-Haeng, Hyemi Kim, and Oh-Wook Kwon. "Source Separation Using Dilated Time-Frequency DenseNet for Music Identification in Broadcast Contents." Applied Sciences 10, no. 5 (March 3, 2020): 1727. http://dx.doi.org/10.3390/app10051727.

Abstract:
We propose a source separation architecture using dilated time-frequency DenseNet for background music identification of broadcast content. We apply source separation techniques to the mixed signals of music and speech. For the source separation purpose, we propose a new architecture to add a time-frequency dilated convolution to the conventional DenseNet in order to effectively increase the receptive field in the source separation scheme. In addition, we apply different convolutions to each frequency band of the spectrogram in order to reflect the different frequency characteristics of the low- and high-frequency bands. To verify the performance of the proposed architecture, we perform singing-voice separation and music-identification experiments. As a result, we confirm that the proposed architecture produces the best performance in both experiments because it uses the dilated convolution to reflect wide contextual information.
22

Wang Aili (王爱丽), Zhang Yuxiao (张宇枭), Wu Haibin (吴海滨), Jiang Kaiyuan (姜开元), and Iwahori Yuji (岩堀祐之). "LiDAR Data Classification Based on Dilated Convolution Capsule Network" [基于空洞卷积胶囊网络的激光雷达数据分类]. Chinese Journal of Lasers 48, no. 11 (2021): 1110003. http://dx.doi.org/10.3788/cjl202148.1110003.

23

Zhang, Guokai, Xiao Liu, Dandan Zhu, Pengcheng He, Lipeng Liang, Ye Luo, and Jianwei Lu. "3D Spatial Pyramid Dilated Network for Pulmonary Nodule Classification." Symmetry 10, no. 9 (September 1, 2018): 376. http://dx.doi.org/10.3390/sym10090376.

Abstract:
Lung cancer mortality is currently the highest among all kinds of fatal cancers. With the help of computer-aided detection systems, a timely detection of malignant pulmonary nodule at early stage could improve the patient survival rate efficiently. However, the sizes of the pulmonary nodules are usually various, and it is more difficult to detect small diameter nodules. The traditional convolution neural network uses pooling layers to reduce the resolution progressively, but it hampers the network’s ability to capture the tiny but vital features of the pulmonary nodules. To tackle this problem, we propose a novel 3D spatial pyramid dilated convolution network to classify the malignancy of the pulmonary nodules. Instead of using the pooling layers, we use 3D dilated convolution to learn the detailed characteristic information of the pulmonary nodules. Furthermore, we show that the fusion of multiple receptive fields from different dilated convolutions could further improve the classification performance of the model. Extensive experimental results demonstrate that our model achieves a better result with an accuracy of 88.6%, which outperforms other state-of-the-art methods.
24

Xu, Jiawei, Jie Wu, Yu Lei, and Yuxiang Gu. "Application of Pseudo-Three-Dimensional Residual Network to Classify the Stages of Moyamoya Disease." Brain Sciences 13, no. 5 (April 29, 2023): 742. http://dx.doi.org/10.3390/brainsci13050742.

Abstract:
It is essential to assess the condition of moyamoya disease (MMD) patients accurately and promptly to prevent MMD from endangering their lives. A Pseudo-Three-Dimensional Residual Network (P3D ResNet) was proposed to process spatial and temporal information, which was implemented in the identification of MMD stages. Digital Subtraction Angiography (DSA) sequences were split into mild, moderate and severe stages in accordance with the progression of MMD, and divided into a training set, a verification set, and a test set with a ratio of 6:2:2 after data enhancement. The features of the DSA images were processed using decoupled three-dimensional (3D) convolution. To increase the receptive field and preserve the features of the vessels, decoupled 3D dilated convolutions that are equivalent to two-dimensional dilated convolutions, plus one-dimensional dilated convolution, were utilized in the spatial and temporal domains, respectively. Then, they were coupled in serial, parallel, and serial–parallel modes to form P3D modules based on the structure of the residual unit. The three kinds of module were placed in a proper sequence to create the complete P3D ResNet. The experimental results demonstrate that the accuracy of P3D ResNet can reach 95.78% with appropriate parameter quantities, making it easy to implement in a clinical setting.
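The decoupling described in this abstract (a 2D spatial convolution plus a 1D temporal convolution in place of a full 3D one) can be illustrated by a rough weight count. This is only a sketch under stated assumptions: the kernel size of 3, the serial arrangement, the channel widths, and the function names are all hypothetical and are not taken from the paper. Biases are ignored, and dilation does not change either count because it only spaces the taps apart.

```python
def full_3d_weights(c_in, c_out, k=3):
    """Weights in one full k x k x k 3D convolution (biases ignored)."""
    return c_in * c_out * k ** 3


def decoupled_3d_weights(c_in, c_mid, c_out, k=3):
    """Weights in a serial 1 x k x k spatial conv followed by a k x 1 x 1 temporal conv."""
    spatial = c_in * c_mid * k * k   # 2D convolution over height and width
    temporal = c_mid * c_out * k     # 1D convolution over the frame axis
    return spatial + temporal


if __name__ == "__main__":
    # Hypothetical channel width of 64 throughout.
    print(full_3d_weights(64, 64))
    print(decoupled_3d_weights(64, 64, 64))
```

Under these assumptions the decoupled pair needs fewer than half the weights of the full 3D kernel, which is one plausible reading of the "appropriate parameter quantities" mentioned above.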
25

Jin, Ran, Xiaozhen Han, and Tongrui Yu. "A Real-Time Image Semantic Segmentation Method Based on Multilabel Classification." Mathematical Problems in Engineering 2021 (May 31, 2021): 1–13. http://dx.doi.org/10.1155/2021/9963974.

Abstract:
Image semantic segmentation plays a crucial part in intelligent driving, medical image analysis, video surveillance, and AR. However, since the scene needs to infer more semantics from video and audio clips and the demand for real-time performance becomes stricter, neither the single-label classification method that was usually used before nor regular manual labeling can meet this end. Given the excellent performance of deep learning algorithms in extensive applications, the image semantic segmentation algorithm based on deep learning framework has been brought under the spotlight of development. This paper attempts to improve the ESPNet (Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation) based on the multilabel classification method by the following steps. First, the standard convolution is replaced by applying Receptive Field in Deep Convolutional Neural Network in the convolution layer, to the extent that every pixel in the covered area would facilitate the ultimate feature response. Second, the ASPP (Atrous Spatial Pyramid Pooling) module is improved based on the atrous convolution, and the DB-ASPP (Delate Batch Normalization-ASPP) is proposed as a way to reduce gridding artifacts due to the multilayer atrous convolution, acquire multiscale information, and integrate the feature information in relation to the image set. Finally, the proposed model and regular models are subject to extensive tests and comparisons on multiple data sets. Results show that the proposed model demonstrates a good accuracy of segmentation, the smallest parameter count at 0.3 M and the fastest speed of segmentation at 25 FPS.
26

Khotimah, Wijayanti Nurul, Farid Boussaid, Ferdous Sohel, Lian Xu, David Edwards, Xiu Jin, and Mohammed Bennamoun. "SC-CAN: Spectral Convolution and Channel Attention Network for Wheat Stress Classification." Remote Sensing 14, no. 17 (August 30, 2022): 4288. http://dx.doi.org/10.3390/rs14174288.

Abstract:
Biotic and abiotic plant stress (e.g., frost, fungi, diseases) can significantly impact crop production. It is thus essential to detect such stress at an early stage before visual symptoms and damage become apparent. To this end, this paper proposes a novel deep learning method, called Spectral Convolution and Channel Attention Network (SC-CAN), which exploits the difference in spectral responses of healthy and stressed crops. The proposed SC-CAN method comprises two main modules: (i) a spectral convolution module, which consists of dilated causal convolutional layers stacked in a residual manner to capture the spectral features; (ii) a channel attention module, which consists of a global pooling layer and fully connected layers that compute inter-relationship between feature map channels before scaling them based on their importance level (attention score). Unlike standard convolution, which focuses on learning local features, the dilated convolution layers can learn both local and global features. These layers also have long receptive fields, making them suitable for capturing long dependency patterns in hyperspectral data. However, because not all feature maps produced by the dilated convolutional layers are important, we propose a channel attention module that weights the feature maps according to their importance level. We used SC-CAN to classify salt stress (i.e., abiotic stress) on four datasets (Chinese Spring (CS), Aegilops columnaris (co(CS)), Ae. speltoides auchery (sp(CS)), and Kharchia datasets) and Fusarium head blight disease (i.e., biotic stress) on Fusarium dataset. Reported experimental results show that the proposed method outperforms existing state-of-the-art techniques with an overall accuracy of 83.08%, 88.90%, 82.44%, 82.10%, and 82.78% on CS, co(CS), sp(CS), Kharchia, and Fusarium datasets, respectively.
27

You, Jiangchuan, and Zhenhong Shang. "Solar Filament Detection Based on Improved DeepLab V3+." Publications of the Astronomical Society of the Pacific 134, no. 1036 (June 1, 2022): 064501. http://dx.doi.org/10.1088/1538-3873/ac6e07.

Abstract:
A novel solar filament detection method based on an improved DeepLab V3+ is proposed to address the low detection accuracy of small solar filaments in Hα full-disk solar images. First, the Xception structure of the backbone network is fine-tuned, and the low-level feature information of the filaments is added to the decoder module of the network to improve the utilization of the solar filament features. Second, the receptive field of dilated convolution is expanded, and the information utilization rate is increased via cascaded dilated convolution to improve the detection accuracy of the small solar filaments. In the decoder module, two depthwise separable convolutions are used instead of ordinary convolutions to reduce incomplete detections. Finally, a dense conditional random field is added to optimize the edge of the detection results. Experiments on a public data set comprising full-disk Hα images show that compared with the original DeepLab V3+ algorithm, the proposed method improves the mean pixel accuracy, mean intersection over union, and F1-score by 1.86%, 1.95%, and 2.18%, respectively, which also demonstrates its superiority over other existing solar filament detection algorithms.
28

Renton, Guillaume, Yann Soullard, Clément Chatelain, Sébastien Adam, Christopher Kermorvant, and Thierry Paquet. "Fully convolutional network with dilated convolutions for handwritten text line segmentation." International Journal on Document Analysis and Recognition (IJDAR) 21, no. 3 (May 30, 2018): 177–86. http://dx.doi.org/10.1007/s10032-018-0304-3.

29

Wu, Junjie, Wen Liu, and Yoshihisa Maruyama. "Automated Road-Marking Segmentation via a Multiscale Attention-Based Dilated Convolutional Neural Network Using the Road Marking Dataset." Remote Sensing 14, no. 18 (September 9, 2022): 4508. http://dx.doi.org/10.3390/rs14184508.

Abstract:
Road markings, including road lanes and symbolic road markings, can convey abundant guidance information to autonomous driving cars. However, recent works have paid less attention to the recognition of symbolic road markings compared with road lanes. In this study, a road-marking-segmentation dataset named the RMD (Road Marking Dataset) is introduced to compensate for the lack of datasets and the limitations of the existing datasets. Furthermore, we propose a novel multiscale attention-based dilated convolutional neural network (MSA-DCNN) to tackle the proposed RMD. The proposed method employs multiscale attention to merge the weighting outputs of adjacent multiscale inputs, and dilated convolution to capture spatial-context information. The performance analysis shows that the proposed MSA-DCNN yields the best results by combining multiscale attention and dilated convolution. Additionally, the proposed method gains the mIoU of 74.88%, which is a significant improvement over the existing techniques.
30

Xiang, Zhenwu, Qi Mao, Jintao Wang, Yi Tian, Yan Zhang, and Wenfeng Wang. "Dmbg-Net: Dilated multiresidual boundary guidance network for COVID-19 infection segmentation." Mathematical Biosciences and Engineering 20, no. 11 (2023): 20135–54. http://dx.doi.org/10.3934/mbe.2023892.

Abstract:
Accurate segmentation of infected regions in lung computed tomography (CT) images is essential for the detection and diagnosis of coronavirus disease 2019 (COVID-19). However, lung lesion segmentation has some challenges, such as obscure boundaries, low contrast and scattered infection areas. In this paper, the dilated multiresidual boundary guidance network (Dmbg-Net) is proposed for COVID-19 infection segmentation in CT images of the lungs. This method focuses on semantic relationship modelling and boundary detail guidance. First, to effectively minimize the loss of significant features, a dilated residual block is substituted for a convolutional operation, and dilated convolutions are employed to expand the receptive field of the convolution kernel. Second, an edge-attention guidance preservation block is designed to incorporate boundary guidance of low-level features into feature integration, which is conducive to extracting the boundaries of the region of interest. Third, the various depths of features are used to generate the final prediction, and the utilization of a progressive multi-scale supervision strategy facilitates enhanced representations and highly accurate saliency maps. The proposed method is used to analyze COVID-19 datasets, and the experimental results reveal that the proposed method has a Dice similarity coefficient of 85.6% and a sensitivity of 84.2%. Extensive experimental results and ablation studies have shown the effectiveness of Dmbg-Net. Therefore, the proposed method has a potential application in the detection, labeling and segmentation of other lesion areas.
31

Zhou, Yuepeng, Huiyou Chang, Yonghe Lu, and Xili Lu. "CDTNet: Improved Image Classification Method Using Standard, Dilated and Transposed Convolutions." Applied Sciences 12, no. 12 (June 12, 2022): 5984. http://dx.doi.org/10.3390/app12125984.

Abstract:
Convolutional neural networks (CNNs) have achieved great success in image classification tasks. In the process of a convolutional operation, a larger input area can capture more context information. Stacking several convolutional layers can enlarge the receptive field, but this increases the parameters. Most CNN models use pooling layers to extract important features, but the pooling operations cause information loss. Transposed convolution can increase the spatial size of the feature maps to recover the lost low-resolution information. In this study, we used two branches with different dilated rates to obtain different size features. The dilated convolution can capture richer information, and the outputs from the two channels are concatenated together as input for the next block. The small size feature maps of the top blocks are transposed to increase the spatial size of the feature maps to recover low-resolution prediction maps. We evaluated the model on three image classification benchmark datasets (CIFAR-10, SVHN, and FMNIST) with four state-of-the-art models, namely, VGG16, VGG19, ResNeXt, and DenseNet. The experimental results show that CDTNet achieved lower loss, higher accuracy, and faster convergence speed in the training and test stages. The average test accuracy of CDTNet increased by 54.81% at most on SVHN with VGG19 and by 1.28% at least on FMNIST with VGG16, which proves that CDTNet has better performance and strong generalization abilities, as well as fewer parameters.
32

Zhao, Haixia, You Zhou, Tingting Bai, and Yuanzhong Chen. "A U-Net Based Multi-Scale Deformable Convolution Network for Seismic Random Noise Suppression." Remote Sensing 15, no. 18 (September 17, 2023): 4569. http://dx.doi.org/10.3390/rs15184569.

Abstract:
Seismic data processing plays a key role in the field of geophysics. The collected seismic data are inevitably contaminated by various types of noise, which makes the effective signals difficult to be accurately discriminated. A fundamental issue is how to improve the signal-to-noise ratio of seismic data. Due to the complex characteristics of noise and signals, it is a challenge for the denoising model to suppress noise and recover weak signals. To suppress random noise in seismic data, we propose a multi-scale deformable convolution neural network denoising model based on U-Net, named MSDC-Unet. The MSDC-Unet mainly contains modules of deformable convolution and dilated convolution. The deformable convolution can change the shape of the convolution kernel to adjust the shape of seismic signals to fit different features, while the dilated convolution with different dilation rates is used to extract feature information at different scales. Furthermore, we combine Charbonnier loss and structure similarity index measure (SSIM) to better characterize geological structures of seismic data. Several examples of synthetic and field seismic data demonstrate that the proposed method is effective in the comprehensive results in terms of quantitative metrics and visual effect of denoising, compared with two traditional denoising methods and two deep convolutional neural network denoising models.
33

Zhu, Yiqun, Guojian Jin, Tongfei Liu, Hanhong Zheng, Mingyang Zhang, Shuang Liang, Jieyi Liu, and Linqi Li. "Self-Attention and Convolution Fusion Network for Land Cover Change Detection over a New Data Set in Wenzhou, China." Remote Sensing 14, no. 23 (November 25, 2022): 5969. http://dx.doi.org/10.3390/rs14235969.

Abstract:
With the process of increasing urbanization, there is great significance in obtaining urban change information by applying land cover change detection techniques. However, these existing methods still struggle to achieve convincing performances and are insufficient for practical applications. In this paper, we constructed a new data set, named Wenzhou data set, aiming to detect the land cover changes of Wenzhou City and thus update the urban expanding geographic data. Based on this data set, we provide a new self-attention and convolution fusion network (SCFNet) for the land cover change detection of the Wenzhou data set. The SCFNet is composed of three modules, including backbone (local–global pyramid feature extractor in SLGPNet), self-attention and convolution fusion module (SCFM), and residual refinement module (RRM). The SCFM combines the self-attention mechanism with convolutional layers to acquire a better feature representation. Furthermore, RRM exploits dilated convolutions with different dilation rates to refine more accurate and complete predictions over changed areas. In addition, to explore the performance of existing computational intelligence techniques in application scenarios, we selected six classical and advanced deep learning-based methods for systematic testing and comparison. The extensive experiments on the Wenzhou and Guangzhou data sets demonstrated that our SCFNet obviously outperforms other existing methods. On the Wenzhou data set, the precision, recall and F1-score of our SCFNet are all better than 85%.
34

Ku, Tao, Qirui Yang, and Hao Zhang. "Multilevel feature fusion dilated convolutional network for semantic segmentation." International Journal of Advanced Robotic Systems 18, no. 2 (March 1, 2021): 172988142110076. http://dx.doi.org/10.1177/17298814211007665.

Abstract:
Recently, convolutional neural networks (CNNs) have led to significant improvement in the field of computer vision, especially in the accuracy and speed of semantic segmentation tasks, which has greatly improved robot scene perception. In this article, we propose a multilevel feature fusion dilated convolution network (Refine-DeepLab). By improving the space pyramid pooling structure, we propose a multiscale hybrid dilated convolution module, which captures the rich context information and effectively alleviates the contradiction between the receptive field size and the dilated convolution operation. At the same time, the high-level semantic information and low-level semantic information obtained through multi-level and multi-scale feature extraction can effectively improve the capture of global information and improve the performance of large-scale target segmentation. The encoder–decoder gradually recovers spatial information while capturing high-level semantic information, resulting in sharper object boundaries. Extensive experiments verify the effectiveness of our proposed Refine-DeepLab model. We evaluate our approach thoroughly on the PASCAL VOC 2012 data set without MS COCO data set pretraining and achieve a state-of-the-art result of 81.73% mean intersection-over-union on the validation set.
35

Zhuang, Zilong, Huichun Lv, Jie Xu, Zizhao Huang, and Wei Qin. "A Deep Learning Method for Bearing Fault Diagnosis through Stacked Residual Dilated Convolutions." Applied Sciences 9, no. 9 (May 1, 2019): 1823. http://dx.doi.org/10.3390/app9091823.

Abstract:
Real-time monitoring and fault diagnosis of bearings are of great significance to improve production safety, prevent major accidents, and reduce production costs. However, there are three primary concerns in the current research, namely real-time performance, effectiveness, and generalization performance. In this paper, a deep learning method based on stacked residual dilated convolutional neural network (SRDCNN) is proposed for real-time bearing fault diagnosis, which is subtly combined by the dilated convolution, the input gate structure of long short-term memory network (LSTM) and the residual network. In the SRDCNN model, the dilated convolution is used to exponentially increase the receptive field of convolution kernel and extract features from the sample with more points, alleviating the influence of randomness. The input gate structure of LSTM could effectively remove noise and control the entry of information contained in the input sample. Meanwhile, the residual network is introduced to overcome the problem of vanishing gradients caused by the deeper structure of the neural network, hence improving the overall classification accuracy. The experimental results indicate that compared with three excellent models, the proposed SRDCNN model has higher denoising ability and better workload adaptability.
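The claim that dilation lets the receptive field grow exponentially with depth can be checked with a small calculation. The sketch below is an illustration only: it assumes stride-1 one-dimensional convolutions with a kernel size of 3 and dilation rates that double per layer, none of which is taken from the SRDCNN paper, and `receptive_field` is a hypothetical helper name.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field, in input samples, of stacked stride-1 dilated convolutions."""
    rf = 1
    for d in dilations:
        # Each layer widens the field by (kernel_size - 1) * dilation samples.
        rf += (kernel_size - 1) * d
    return rf


if __name__ == "__main__":
    plain = receptive_field(3, [1, 1, 1, 1])     # four ordinary layers
    dilated = receptive_field(3, [1, 2, 4, 8])   # four layers, doubling dilation
    print(plain, dilated)
```

With the same four layers and the same number of weights, the undilated stack sees 9 input samples while the stack with doubling dilation rates sees 31.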
36

Qin, Yanjun, Haiyong Luo, Fang Zhao, Chenxing Wang, and Yuchen Fang. "NDGCN: Network in Network, Dilate Convolution and Graph Convolutional Networks Based Transportation Mode Recognition." IEEE Transactions on Vehicular Technology 70, no. 3 (March 2021): 2138–52. http://dx.doi.org/10.1109/tvt.2021.3060761.

37

Ni, Jian, Rui Wang, and Jing Tang. "ADSSD: Improved Single-Shot Detector with Attention Mechanism and Dilated Convolution." Applied Sciences 13, no. 6 (March 22, 2023): 4038. http://dx.doi.org/10.3390/app13064038.

Abstract:
The detection of small objects is easily affected by background information, and a lack of context information makes detection difficult. Therefore, small object detection has become an extremely challenging task. Based on the above problems, we proposed a Single-Shot MultiBox Detector with an attention mechanism and dilated convolution (ADSSD). In the attention module, we strengthened the connection between information in space and channels while using cross-layer connections to accelerate training. In the multi-branch dilated convolution module, we combined three expansion convolutions with different dilated ratios to obtain multi-scale context information and used hierarchical feature fusion to reduce the gridding effect. The results show that on PASCAL VOC2007 and VOC2012 datasets, our 300 × 300 input ADSSD model reaches 78.4% mAP and 76.1% mAP. The results outperform those of SSD and other advanced detectors; the effect of some small object detection is significantly improved. Moreover, the performance of the ADSSD in object detection affected by factors such as dense occlusion is better than that of the traditional SSD.
38

Ma, Hao, Chao Chen, Qing Zhu, Haitao Yuan, Liming Chen, and Minglei Shu. "An ECG Signal Classification Method Based on Dilated Causal Convolution." Computational and Mathematical Methods in Medicine 2021 (February 2, 2021): 1–10. http://dx.doi.org/10.1155/2021/6627939.

Abstract:
The incidence of cardiovascular disease is increasing year by year and is showing a younger trend. At the same time, existing medical resources are tight. The automatic detection of ECG signals becomes increasingly necessary. This paper proposes an automatic classification of ECG signals based on a dilated causal convolutional neural network. To solve the problem that the recurrent neural network framework network cannot be accelerated by hardware equipment, the dilated causal convolutional neural network is adopted. Given the features of the same input and output time steps of the recurrent neural network and the nondisclosure of future information, the network is constructed with fully convolutional networks and causal convolution. To reduce the network depth and prevent gradient explosion or gradient disappearance, the dilated factor is introduced into the model, and the residual blocks are introduced into the model according to the shortcut connection idea. The effectiveness of the algorithm is verified in the MIT-BIH Atrial Fibrillation Database (MIT-BIH AFDB). In the experiment of the MIT-BIH AFDB database, the classification accuracy rate is 98.65%.
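A dilated causal convolution of the kind named in this abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the network from the paper: the function name, the two-tap filter, and the zero padding on the left are assumptions made for the example. The two properties the abstract relies on are visible directly: the output has as many time steps as the input, and the output at time t never reads samples later than t.

```python
def dilated_causal_conv1d(x, weights, dilation):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation], treating x as zero before t = 0."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation   # only the current and past samples are read
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5]
    print(dilated_causal_conv1d(signal, [1, 1], dilation=2))
```

Changing the last sample of `signal` leaves every earlier output untouched, which is the "nondisclosure of future information" property in code form.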
39

Shen, Sheng, Honghui Yang, Xiaohui Yao, Junhao Li, Guanghui Xu, and Meiping Sheng. "Ship Type Classification by Convolutional Neural Networks with Auditory-Like Mechanisms." Sensors 20, no. 1 (January 1, 2020): 253. http://dx.doi.org/10.3390/s20010253.

Abstract:
Ship type classification with radiated noise helps monitor the noise of shipping around the hydrophone deployment site. This paper introduces a convolutional neural network with several auditory-like mechanisms for ship type classification. The proposed model mainly includes a cochlea model and an auditory center model. In the cochlea model, acoustic signal decomposition at the basement membrane is implemented by a time convolutional layer with auditory filters and dilated convolutions. The transformation of neural patterns at hair cells is modeled by a time frequency conversion layer to extract auditory features. In the auditory center model, auditory features are first selectively emphasized in a supervised manner. Then, spectro-temporal patterns are extracted by deep architecture with multistage auditory mechanisms. The whole model is optimized with an objective function of ship type classification to form the plasticity of the auditory system. The contributions compared with an auditory inspired convolutional neural network include the improvements in dilated convolutions, deep architecture and target layer. The proposed model can extract auditory features from a raw hydrophone signal and identify types of ships under different working conditions. The model achieved a classification accuracy of 87.2% on four ship types and ocean background noise.
40

Tran, Song-Toan, Thanh-Tuan Nguyen, Minh-Hai Le, Ching-Hwa Cheng, and Don-Gey Liu. "TDC-Unet: Triple Unet with Dilated Convolution for Medical Image Segmentation." International Journal of Pharma Medicine and Biological Sciences 11, no. 1 (January 2022): 1–7. http://dx.doi.org/10.18178/ijpmbs.11.1.1-7.

41

Xie, Wen, Licheng Jiao, and Wenqiang Hua. "Complex-Valued Multi-Scale Fully Convolutional Network with Stacked-Dilated Convolution for PolSAR Image Classification." Remote Sensing 14, no. 15 (August 4, 2022): 3737. http://dx.doi.org/10.3390/rs14153737.

Abstract:
Polarimetric synthetic aperture radar (PolSAR) image classification is a pixel-wise issue, which has become increasingly prevalent in recent years. As a variant of the Convolutional Neural Network (CNN), the Fully Convolutional Network (FCN), which is designed for pixel-to-pixel tasks, has obtained enormous success in semantic segmentation. Therefore, effectively using the FCN model combined with polarimetric characteristics for PolSAR image classification is quite promising. This paper proposes a novel FCN model by adopting complex-valued domain stacked-dilated convolution (CV-SDFCN). Firstly, a stacked-dilated convolution layer with different dilation rates is constructed to capture multi-scale features of PolSAR image; meanwhile, the sharing weight is employed to reduce the calculation burden. Unfortunately, the labeled training samples of PolSAR image are usually limited. Then, the encoder–decoder structure of the original FCN is reconstructed with a U-net model. Finally, in view of the significance of the phase information for PolSAR images, the proposed model is trained in the complex-valued domain rather than the real-valued domain. The experiment results show that the classification performance of the proposed method is better than several state-of-the-art PolSAR image classification methods.
42

Liu, Junwen, Yongjun Zhang, Jianbin Xie, Yan Wei, Zewei Wang, and Mengjia Niu. "Head Detection Based on DR Feature Extraction Network and Mixed Dilated Convolution Module." Electronics 10, no. 13 (June 29, 2021): 1565. http://dx.doi.org/10.3390/electronics10131565.

Abstract:
Pedestrian detection for complex scenes suffers from pedestrian occlusion issues, such as occlusions between pedestrians. As is well known, compared with the variability of the human body, the shape of the human head and shoulders changes minimally and has high stability. Therefore, head detection is an important research area in the field of pedestrian detection. The translational invariance of neural network enables us to design a deep convolutional neural network, which means that, even if the appearance and location of the target changes, it can still be recognized effectively. However, the problems of scale invariance and high miss detection rates for small targets still exist. In this paper, a feature extraction network DR-Net based on Darknet-53 is proposed to improve the information transmission rate between convolutional layers and to extract more semantic information. In addition, the MDC (mixed dilated convolution) with different sampling rates of dilated convolution is embedded to improve the detection rate of small targets. We evaluated our method on three publicly available datasets and achieved excellent results. The AP (Average Precision) value on the Brainwash dataset, HollywoodHeads dataset, and SCUT-HEAD dataset reached 92.1%, 84.8%, and 90%, respectively.
43

Abid, Fazeel, Ikram Ud Din, Ahmad Almogren, Hasan Ali Khattak, and Mirza Waqar Baig. "Augmentation of Contextualized Concatenated Word Representation and Dilated Convolution Neural Network for Sentiment Analysis." Wireless Communications and Mobile Computing 2021 (November 25, 2021): 1–13. http://dx.doi.org/10.1155/2021/1428710.

Abstract:
Deep learning-based methodologies are significant for performing sentiment analysis on social media data. The valuable insights obtained from social media data through sentiment analysis can be employed to develop intelligent applications. Among many networks, convolution neural networks (CNNs) are widely used in many conventional text classification tasks and play a significant role. However, to capture long-term contextual information and address the detail loss problem, CNNs require stacking multiple convolutional layers. The stacking of convolutional layers in turn requires massive computations and the tuning of additional parameters. To solve these problems, in this paper, a contextualized concatenated word representation (CCWRs) is initialized from text-based social media data, which is essential for handling misspelled and out-of-vocabulary (OOV) words. In CCWRs, different word representation models, for example, Word2Vec, its optimized version FastText, and Global Vectors (GloVe), collectively create contextualized representations upon the sequence of input. Second, a three-layered dilated convolutional neural network (3D-CNN) is proposed that places dilated convolution kernels instead of conventional CNN kernels. Incorporating the extension in the receptive field’s size successfully solves the detail loss problem and achieves long-term context information with different dilation rates. Experiments on datasets demonstrate that the proposed framework achieves reliable results, and that the selection of hyperparameters and configurations for improved optimization leads to reduced computational resources and reliable accuracy.
44

Roy, Sanjiban Sekhar, Nishant Rodrigues, and Y.-h. Taguchi. "Incremental Dilations Using CNN for Brain Tumor Classification." Applied Sciences 10, no. 14 (July 17, 2020): 4915. http://dx.doi.org/10.3390/app10144915.

Abstract:
Brain tumor classification is a challenging task in the field of medical image processing. Technology has now enabled medical doctors to have additional aid for diagnosis. We aim to classify brain tumors using MRI images, which were collected from anonymous patients and artificial brain simulators. In this article, we carry out a comparative study between Simple Artificial Neural Networks with dropout, Basic Convolutional Neural Networks (CNN), and Dilated Convolutional Neural Networks. The experimental results shed light on the high classification performance (accuracy 97%) of Dilated CNN. On the other hand, Dilated CNN suffers from the gridding phenomenon. An incremental, even-numbered dilation rate takes advantage of the reduced computational overhead and also overcomes the adverse effects of gridding. Comparative analysis between different combinations of dilation rates for the different convolution layers helps validate the results. The computational overhead of training the model to reach an acceptable threshold accuracy of 90% is another parameter used to compare model performance.
45

Orhei, Ciprian, Victor Bogdan, Cosmin Bonchis, and Radu Vasiu. "Dilated Filters for Edge-Detection Algorithms." Applied Sciences 11, no. 22 (November 13, 2021): 10716. http://dx.doi.org/10.3390/app112210716.

Abstract:
Edges are a fundamental feature in image processing that is used directly or indirectly in a huge number of applications. Inspired by the expansion of image resolution and processing power, dilated-convolution techniques appeared. Dilated convolutions have impressive results in machine learning, so naturally we discuss the idea of dilating the standard filters from several edge-detection algorithms. In this work, we investigated the research hypothesis that using dilated filters, rather than the extended or classical ones, yields better edge map results. To test this hypothesis, we compared the results of the edge-detection algorithms using the proposed dilated filters with those using the original filters or custom variants. Experimental results confirm our statement that the dilation of filters has a positive impact on edge-detection algorithms, from simple to rather complex ones.
46

Lin, Jiang, and Yepeng Guan. "Load Prediction in Double-Channel Residual Self-Attention Temporal Convolutional Network with Weight Adaptive Updating in Cloud Computing." Sensors 24, no. 10 (May 17, 2024): 3181. http://dx.doi.org/10.3390/s24103181.

Abstract:
When resource demand increases and decreases rapidly, container clusters in the cloud environment need to adjust the number of containers in a timely manner to ensure service quality. Resource load prediction is a prominent challenge given the widespread adoption of cloud computing. A novel cloud computing load prediction method has been proposed, the Double-channel residual Self-attention Temporal convolutional Network with Weight adaptive updating (DSTNW), in order to make the response of the container cluster more rapid and accurate. A Double-channel Temporal Convolution Network model (DTN) has been developed to capture long-term sequence dependencies and enhance feature extraction capabilities when the model handles long load sequences. Double-channel dilated causal convolution has been adopted to replace the single-channel dilated causal convolution in the DTN. A residual temporal self-attention mechanism (SM) has been proposed to improve the performance of the network and focus on features with significant contributions from the DTN. DTN and SM jointly constitute a dual-channel residual self-attention temporal convolutional network (DSTN). In addition, by evaluating the accuracy of single and stacked DSTNs, an adaptive weight strategy has been proposed to assign corresponding weights to the single and stacked DSTNs, respectively. The experimental results highlight that the developed method has outstanding prediction performance for cloud computing in comparison with some state-of-the-art methods. The proposed method achieved an average improvement of 24.16% and 30.48% on the Container dataset and Google dataset, respectively.
47

Deng, Feiyue, Yan Bi, Yongqiang Liu, and Shaopu Yang. "Deep-Learning-Based Remaining Useful Life Prediction Based on a Multi-Scale Dilated Convolution Network." Mathematics 9, no. 23 (November 26, 2021): 3035. http://dx.doi.org/10.3390/math9233035.

Abstract:
Remaining useful life (RUL) prediction of key components is an important influencing factor in making accurate maintenance decisions for mechanical systems. With the rapid development of deep learning (DL) techniques, the research on RUL prediction based on the data-driven model is increasingly widespread. Compared with the conventional convolution neural networks (CNNs), the multi-scale CNNs can extract different-scale feature information, which exhibits a better performance in the RUL prediction. However, the existing multi-scale CNNs employ multiple convolution kernels with different sizes to construct the network framework. There are two main shortcomings of this approach: (1) The convolution operation based on multiple size convolution kernels requires enormous computation and has a low operational efficiency, which severely restricts its application in practical engineering. (2) The convolutional layer with a large size convolution kernel needs a mass of weight parameters, leading to a dramatic increase in the network training time and making it prone to overfitting in the case of small datasets. To address the above issues, a multi-scale dilated convolution network (MsDCN) is proposed for RUL prediction in this article. The MsDCN adopts a new multi-scale dilation convolution fusion unit (MsDCFU), in which the multi-scale network framework is composed of convolution operations with different dilated factors. This effectively expands the range of receptive field (RF) for the convolution kernel without an additional computational burden. Moreover, the MsDCFU employs the depthwise separable convolution (DSC) to further improve the operational efficiency of the prognostics model. Finally, the proposed method was validated with the accelerated degradation test data of rolling element bearings (REBs). The experimental results demonstrate that the proposed MsDCN has a higher RUL prediction accuracy compared to some typical CNNs and better operational efficiency than the existing multi-scale CNNs based on different convolution kernel sizes.
48

Guo, Feng, Hongbing Ma, Liangliang Li, Ming Lv, and Zhenhong Jia. "FCNet: Flexible Convolution Network for Infrared Small Ship Detection." Remote Sensing 16, no. 12 (June 19, 2024): 2218. http://dx.doi.org/10.3390/rs16122218.

Abstract:
The automatic monitoring and detection of maritime targets hold paramount significance in safeguarding national sovereignty, ensuring maritime rights, and advancing national development. Among the principal means of maritime surveillance, infrared (IR) small ship detection technology stands out. However, due to their minimal pixel occupancy and lack of discernible color and texture information, IR small ships have persistently posed a formidable challenge in the realm of target detection. Additionally, the intricate maritime backgrounds often exacerbate the issue by inducing high false alarm rates. In an effort to surmount these challenges, this paper proposes a flexible convolutional network (FCNet), integrating dilated convolutions and deformable convolutions to achieve flexible variations in convolutional receptive fields. Firstly, a feature enhancement module (FEM) is devised to enhance input features by fusing standard convolutions with dilated convolutions, thereby obtaining precise feature representations. Subsequently, a context fusion module (CFM) is designed to integrate contextual information during the downsampling process, mitigating information loss. Furthermore, a semantic fusion module (SFM) is crafted to fuse shallow features with deep semantic information during the upsampling process. Additionally, squeeze-and-excitation (SE) blocks are incorporated during upsampling to bolster channel information. Experimental evaluations conducted on two datasets demonstrate that FCNet outperforms other algorithms in the detection of IR small ships on maritime surfaces. Moreover, to propel research in deep learning-based IR small ship detection on maritime surfaces, we introduce the IR small ship dataset (Maritime-SIRST).
49

Wan, Renzhuo, Chengde Tian, Wei Zhang, Wendi Deng, and Fan Yang. "A Multivariate Temporal Convolutional Attention Network for Time-Series Forecasting." Electronics 11, no. 10 (May 10, 2022): 1516. http://dx.doi.org/10.3390/electronics11101516.

Abstract:
Multivariate time-series forecasting is one of the crucial and persistent challenges in time-series forecasting tasks. As a kind of data with multivariate correlation and volatility, multivariate time series impose highly nonlinear time characteristics on the forecasting model. In this paper, a new multivariate time-series forecasting model, multivariate temporal convolutional attention network (MTCAN), based on a self-attentive mechanism is proposed. MTCAN is based on the Convolution Neural Network (CNN) model, using 1D dilated convolution as the basic unit to construct asymmetric blocks, and then, the feature extraction is performed by the self-attention mechanism to finally obtain the prediction results. The input and output lengths of this network can be determined flexibly. The validation of the method is carried out with three different multivariate time-series datasets. The reliability and accuracy of the prediction results are compared with Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Convolutional Long Short-Term Memory (ConvLSTM), and Temporal Convolutional Network (TCN). The prediction results show that the model proposed in this paper has significantly improved prediction accuracy and generalization.
50

Park, Sangun, and Dong Eui Chang. "Multipath Lightweight Deep Network Using Randomly Selected Dilated Convolution." Sensors 21, no. 23 (November 26, 2021): 7862. http://dx.doi.org/10.3390/s21237862.

Abstract:
Robot vision is an essential research field that enables machines to perform various tasks by classifying/detecting/segmenting objects as humans do. The classification accuracy of machine learning algorithms already exceeds that of a well-trained human, and the results are rather saturated. Hence, in recent years, many studies have been conducted in the direction of reducing the weight of the model and applying it to mobile devices. For this purpose, we propose a multipath lightweight deep network using randomly selected dilated convolutions. The proposed network consists of two sets of multipath networks (minimum 2, maximum 8), where the output feature maps of one path are concatenated with the input feature maps of the other path so that the features are reusable and abundant. We also replace the 3×3 standard convolution of each path with a randomly selected dilated convolution, which has the effect of increasing the receptive field. The proposed network lowers the number of floating point operations (FLOPs) and parameters by more than 50% and the classification error by 0.8% as compared to the state-of-the-art. We show that the proposed network is efficient.