
Journal articles on the topic "3DCNNs"

Create a correct reference in APA, MLA, Chicago, Harvard, and many other styles


Consult the top 50 journal articles for your research on the topic "3DCNNs".

An "Add to bibliography" button is available next to each work in the bibliography. Use it, and we will automatically create a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication as a .pdf file and read its abstract online, whenever such details are available in the metadata.

Browse journal articles on a wide variety of disciplines and organize your bibliography correctly.

1

Paralic, Martin, Kamil Zelenak, Patrik Kamencay and Robert Hudec. "Automatic Approach for Brain Aneurysm Detection Using Convolutional Neural Networks". Applied Sciences 13, no. 24 (December 16, 2023): 13313. http://dx.doi.org/10.3390/app132413313.

Abstract:
The paper introduces an approach for detecting brain aneurysms, a critical medical condition, by utilizing a combination of 3D convolutional neural networks (3DCNNs) and Convolutional Long Short-Term Memory (ConvLSTM). Brain aneurysms pose a significant health risk, and early detection is vital for effective treatment. Traditional methods for aneurysm detection often rely on complex and time-consuming procedures. A specialist radiologist annotates each aneurysm, supporting our work with ground-truth annotations. From the annotated data, we extract images to train the proposed neural networks. The paper experiments with several types of networks, specifically 2D convolutional neural networks (2DCNNs), 3D convolutional neural networks (3DCNNs), and Convolutional Long Short-Term Memory (ConvLSTM). Our goal is to create a virtual assistant that improves the search for aneurysm locations. Subsequently, a specialist radiologist will confirm or reject the presence of an aneurysm, reducing the time spent on the search process and revealing hidden aneurysms. Our experimental results demonstrate the superior performance of the proposed approach compared to existing methods, showcasing its potential as a valuable tool in clinical settings for early and accurate brain aneurysm detection. This innovative fusion of 3DCNN and LSTM (3DCNN-ConvLSTM) techniques not only improves diagnostic precision but also holds promise for advancing the field of medical image analysis, particularly in the domain of neurovascular diseases. Overall, our research underscores the potential of neural networks for the machine detection of brain aneurysms.
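
The volumetric convolution at the heart of the 3DCNNs discussed in this abstract can be illustrated with a minimal numpy sketch. This is not the authors' network; the patch size and the averaging kernel are illustrative assumptions:

```python
import numpy as np

def conv3d_valid(volume, kernel):
    """Naive single-channel 3D convolution (valid padding, stride 1)."""
    D, H, W = volume.shape
    d, h, w = kernel.shape
    out = np.zeros((D - d + 1, H - h + 1, W - w + 1))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                # each output voxel is a weighted sum over a d*h*w window
                out[z, y, x] = np.sum(volume[z:z+d, y:y+h, x:x+w] * kernel)
    return out

# A 16x16x16 patch (e.g. a cropped angiography volume) and a 3x3x3 filter.
patch = np.random.rand(16, 16, 16)
kernel = np.ones((3, 3, 3)) / 27.0   # a simple 3D averaging filter
feat = conv3d_valid(patch, kernel)
print(feat.shape)  # (14, 14, 14)
```

Real implementations use many such filters per layer and learn the kernels; this sketch only shows how the sliding window extends from 2D to 3D.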
2

Vrskova, Roberta, Patrik Kamencay, Robert Hudec and Peter Sykora. "A New Deep-Learning Method for Human Activity Recognition". Sensors 23, no. 5 (March 4, 2023): 2816. http://dx.doi.org/10.3390/s23052816.

Abstract:
Currently, three-dimensional convolutional neural networks (3DCNNs) are a popular approach in the field of human activity recognition. However, due to the variety of methods used for human activity recognition, we propose a new deep-learning model in this paper. The main objective of our work is to optimize the traditional 3DCNN and propose a new model that combines 3DCNN with Convolutional Long Short-Term Memory (ConvLSTM) layers. Our experimental results, which were obtained using the LoDVP Abnormal Activities dataset, UCF50 dataset, and MOD20 dataset, demonstrate the superiority of the 3DCNN + ConvLSTM combination for recognizing human activities. Furthermore, our proposed model is well-suited for real-time human activity recognition applications and can be further enhanced by incorporating additional sensor data. To provide a comprehensive comparison of our proposed 3DCNN + ConvLSTM architecture, we compared our experimental results on these datasets. We achieved a precision of 89.12% when using the LoDVP Abnormal Activities dataset. Meanwhile, the precision we obtained using the modified UCF50 dataset (UCF50mini) and MOD20 dataset was 83.89% and 87.76%, respectively. Overall, our work demonstrates that the combination of 3DCNN and ConvLSTM layers can improve the accuracy of human activity recognition tasks, and our proposed model shows promise for real-time applications.
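
A ConvLSTM cell, as combined with the 3DCNN above, replaces the matrix multiplications of a standard LSTM with convolutions so the hidden state stays a spatial map. Below is a minimal single-channel sketch; the kernel sizes and frame shapes are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def conv2d_same(x, k):
    """3x3 'same' convolution of a single-channel map, zero padding."""
    p = np.pad(x, 1)
    H, W = x.shape
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(p[i:i+3, j:j+3] * k)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, K):
    """One ConvLSTM time step: all four gates are convolutions applied
    to the input frame x and the previous hidden state h."""
    i = sigmoid(conv2d_same(x, K["xi"]) + conv2d_same(h, K["hi"]))
    f = sigmoid(conv2d_same(x, K["xf"]) + conv2d_same(h, K["hf"]))
    o = sigmoid(conv2d_same(x, K["xo"]) + conv2d_same(h, K["ho"]))
    g = np.tanh(conv2d_same(x, K["xg"]) + conv2d_same(h, K["hg"]))
    c_new = f * c + i * g          # cell state keeps a spatial memory map
    h_new = o * np.tanh(c_new)
    return h_new, c_new

rng = np.random.default_rng(0)
K = {name: rng.normal(scale=0.1, size=(3, 3))
     for name in ("xi", "hi", "xf", "hf", "xo", "ho", "xg", "hg")}
h = c = np.zeros((8, 8))
for t in range(5):                 # run over 5 video frames
    frame = rng.random((8, 8))
    h, c = convlstm_step(frame, h, c, K)
print(h.shape)  # (8, 8)
```

In a trained network the kernels `K` are learned and the state has many channels; the point here is only that memory is carried per spatial location.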
3

Wang, Dingheng, Guangshe Zhao, Guoqi Li, Lei Deng and Yang Wu. "Compressing 3DCNNs based on tensor train decomposition". Neural Networks 131 (November 2020): 215–30. http://dx.doi.org/10.1016/j.neunet.2020.07.028.
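
The tensor-train compression named in this entry's title can be sketched with numpy: a dense weight tensor is factorized into a chain of small 3-way cores via successive truncated SVDs, cutting the parameter count. A toy sketch under the assumption of an exactly low-rank tensor (not the paper's actual scheme for conv kernels):

```python
import numpy as np

def tt_svd(T, max_rank):
    """Tensor-train (TT) decomposition of a dense tensor via successive
    truncated SVDs (the classical TT-SVD procedure)."""
    cores, shape, r = [], T.shape, 1
    M = T.reshape(r * shape[0], -1)
    for k in range(len(shape) - 1):
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        rk = min(max_rank, len(s))
        cores.append(U[:, :rk].reshape(r, shape[k], rk))
        M = (np.diag(s[:rk]) @ Vt[:rk]).reshape(rk * shape[k + 1], -1)
        r = rk
    cores.append(M.reshape(r, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the chain of cores back into a dense tensor."""
    out = cores[0].reshape(cores[0].shape[1], -1)
    for G in cores[1:]:
        r, n, r2 = G.shape
        out = (out @ G.reshape(r, n * r2)).reshape(-1, r2)
    return out.reshape([G.shape[1] for G in cores])

# An exactly rank-1 4-way tensor stands in for a conv weight tensor.
rng = np.random.default_rng(0)
T = np.einsum('i,j,k,l->ijkl', *(rng.normal(size=n) for n in (4, 5, 6, 7)))
cores = tt_svd(T, max_rank=2)
n_tt = sum(G.size for G in cores)
print(n_tt, T.size)  # 66 vs 840 parameters
```

For genuinely low-rank weights the reconstruction is near-exact while storing an order of magnitude fewer numbers, which is the compression effect the paper exploits for 3DCNN kernels.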

4

Hong, Qingqing, Xinyi Zhong, Weitong Chen, Zhenghua Zhang, Bin Li, Hao Sun, Tianbao Yang and Changwei Tan. "SATNet: A Spatial Attention Based Network for Hyperspectral Image Classification". Remote Sensing 14, no. 22 (November 21, 2022): 5902. http://dx.doi.org/10.3390/rs14225902.

Abstract:
Thanks to their rich spectral-spatial information, hyperspectral images (HSIs) have been extensively used to categorize feature classes by capturing subtle differences. 3D convolution-based neural networks (3DCNNs) have been widely used in HSI classification because of their powerful feature extraction capability. However, the 3DCNN-based HSI classification approach can only extract local features, and the feature maps it produces contain a lot of spatial information redundancy, which lowers the classification accuracy. To solve these problems, we proposed a spatial attention network (SATNet) by combining 3D OctConv and ViT. Firstly, 3D OctConv divided the feature maps into high-frequency maps and low-frequency maps to reduce spatial information redundancy. Secondly, the ViT model was used to obtain global features and effectively combine local and global features for classification. To verify the effectiveness of the method, a comparison with various mainstream methods on three publicly available datasets was performed, and the results showed the superiority of the proposed method in terms of classification evaluation performance.
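
The high/low-frequency split performed by the OctConv step described above can be sketched in a few lines of numpy. Shown here for a single 2D band; the real layer operates on multi-channel 3D feature maps:

```python
import numpy as np

def octave_split(fmap):
    """Split a feature map into a half-resolution low-frequency map
    (2x2 average pooling) and a full-resolution high-frequency residual,
    in the spirit of OctConv."""
    H, W = fmap.shape
    low = fmap.reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))
    up = np.repeat(np.repeat(low, 2, axis=0), 2, axis=1)  # nearest upsample
    high = fmap - up
    return low, high

band = np.random.rand(8, 8)        # one spectral band of an HSI patch
low, high = octave_split(band)
print(low.shape, high.shape)  # (4, 4) (8, 8)
```

Because `low` stores smooth content at quarter the resolution, later convolutions on it are cheaper, which is how the redundancy reduction claimed in the abstract is realized.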
5

Gomez-Donoso, Francisco, Felix Escalona and Miguel Cazorla. "Par3DNet: Using 3DCNNs for Object Recognition on Tridimensional Partial Views". Applied Sciences 10, no. 10 (May 14, 2020): 3409. http://dx.doi.org/10.3390/app10103409.

Abstract:
Deep learning-based methods have proven to be the best performers when it comes to object recognition, both in images and in tridimensional data. Nonetheless, for 3D object recognition, authors tend to convert the 3D data to images and then perform classification. However, despite its accuracy, this approach has some issues. In this work, we present a deep learning pipeline for object recognition that takes a point cloud as input and provides the classification probabilities as output. Our proposal is trained on synthetic CAD objects and is able to perform accurately when fed with real data provided by commercial sensors. Unlike most approaches, our method is specifically trained to work on partial views of the objects rather than on a full representation, which is not how commercial sensors capture objects. We trained our proposal with the ModelNet10 dataset and achieved 78.39% accuracy. We also tested it by adding noise to the dataset and against a number of other datasets and real data, with high success.
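
Feeding a raw point cloud to a 3DCNN usually starts with voxelization. A minimal sketch follows; the grid size and the synthetic hemisphere "partial view" are illustrative assumptions, not the paper's pipeline:

```python
import numpy as np

def voxelize(points, grid=16):
    """Turn a point cloud into a binary occupancy grid a 3DCNN can consume.
    Points are normalized into the unit cube, then binned."""
    mins = points.min(axis=0)
    span = points.max(axis=0) - mins
    span[span == 0] = 1.0               # guard against flat dimensions
    idx = ((points - mins) / span * (grid - 1)).astype(int)
    vox = np.zeros((grid, grid, grid), dtype=np.uint8)
    vox[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return vox

# A partial view: points sampled from one hemisphere of a sphere.
rng = np.random.default_rng(1)
pts = rng.normal(size=(2000, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
pts = pts[pts[:, 2] > 0]               # keep only the "visible" half
vox = voxelize(pts)
print(vox.shape, int(vox.sum()))
```

Training directly on such partial-view grids, rather than on grids of complete CAD models, is the gap this paper addresses.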
6

Motamed, Sara, and Elham Askari. "Detection of handgun using 3D convolutional neural network model (3DCNNs)". Signal and Data Processing 20, no. 2 (September 1, 2023): 69–79. http://dx.doi.org/10.61186/jsdp.20.2.69.

7

Firsov, Nikita, Evgeny Myasnikov, Valeriy Lobanov, Roman Khabibullin, Nikolay Kazanskiy, Svetlana Khonina, Muhammad A. Butt and Artem Nikonorov. "HyperKAN: Kolmogorov–Arnold Networks Make Hyperspectral Image Classifiers Smarter". Sensors 24, no. 23 (November 30, 2024): 7683. https://doi.org/10.3390/s24237683.

Abstract:
In traditional neural network designs, a multilayer perceptron (MLP) is typically employed as a classification block following the feature extraction stage. However, the Kolmogorov–Arnold Network (KAN) presents a promising alternative to MLP, offering the potential to enhance prediction accuracy. In this paper, we studied KAN-based networks for pixel-wise classification of hyperspectral images. Initially, we compared baseline MLP and KAN networks with varying numbers of neurons in their hidden layers. Subsequently, we replaced the linear, convolutional, and attention layers of traditional neural networks with their KAN-based counterparts. Specifically, six cutting-edge neural networks were modified, including 1D (1DCNN), 2D (2DCNN), and 3D convolutional networks (two different 3DCNNs, NM3DCNN), as well as transformer (SSFTT). Experiments conducted using seven publicly available hyperspectral datasets demonstrated a substantial improvement in classification accuracy across all the networks. The best classification quality was achieved using a KAN-based transformer architecture.
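
The KAN substitution described above replaces each scalar weight with a learnable univariate function on the edge. A toy numpy sketch using Gaussian bumps as the basis; the basis choice and layer sizes are assumptions (real KANs typically use B-splines):

```python
import numpy as np

def rbf_basis(x, centers, width=0.5):
    """Evaluate Gaussian basis functions at each entry of x."""
    return np.exp(-((x[..., None] - centers) ** 2) / (2 * width ** 2))

class KANLayer:
    """Toy KAN layer: every edge (input j -> output i) carries its own
    learnable univariate function, here a weighted sum of Gaussian bumps."""
    def __init__(self, n_in, n_out, n_basis=5, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        self.centers = np.linspace(-1, 1, n_basis)
        # one coefficient vector per edge: (n_out, n_in, n_basis)
        self.coef = rng.normal(scale=0.1, size=(n_out, n_in, n_basis))

    def forward(self, x):
        phi = rbf_basis(x, self.centers)          # (n_in, n_basis)
        # output_i = sum_j sum_b coef[i, j, b] * phi[j, b]
        return np.einsum('ijb,jb->i', self.coef, phi)

layer = KANLayer(n_in=4, n_out=3)
out = layer.forward(np.array([0.1, -0.2, 0.5, 0.9]))
print(out.shape)  # (3,)
```

Swapping such a layer in for an MLP head is, in outline, the modification the paper applies to the 1D/2D/3D CNN and transformer classifiers it studies.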
8

Alharbi, Yasser F., and Yousef A. Alotaibi. "Decoding Imagined Speech from EEG Data: A Hybrid Deep Learning Approach to Capturing Spatial and Temporal Features". Life 14, no. 11 (November 18, 2024): 1501. http://dx.doi.org/10.3390/life14111501.

Abstract:
Neuroimaging is revolutionizing our ability to investigate the brain’s structural and functional properties, enabling us to visualize brain activity during diverse mental processes and actions. One of the most widely used neuroimaging techniques is electroencephalography (EEG), which records electrical activity from the brain using electrodes positioned on the scalp. EEG signals capture both spatial (brain region) and temporal (time-based) data. While a high temporal resolution is achievable with EEG, spatial resolution is comparatively limited. Consequently, capturing both spatial and temporal information from EEG data to recognize mental activities remains challenging. In this paper, we represent spatial and temporal information obtained from EEG signals by transforming EEG data into sequential topographic brain maps. We then apply hybrid deep learning models to capture the spatiotemporal features of the EEG topographic images and classify imagined English words. The hybrid framework utilizes a sequential combination of three-dimensional convolutional neural networks (3DCNNs) and recurrent neural networks (RNNs). The experimental results reveal the effectiveness of the proposed approach, achieving an average accuracy of 77.8% in identifying imagined English speech.
9

Wei, Minghua, and Feng Lin. "A novel multi-dimensional features fusion algorithm for the EEG signal recognition of brain's sensorimotor region activated tasks". International Journal of Intelligent Computing and Cybernetics 13, no. 2 (June 8, 2020): 239–60. http://dx.doi.org/10.1108/ijicc-02-2020-0019.

Abstract:
Purpose: Aiming at the shortcomings of EEG signals generated by the brain's sensorimotor-region-activated tasks, such as poor performance, low efficiency and weak robustness, this paper proposes an EEG signal classification method based on multi-dimensional fusion features. Design/methodology/approach: First, the improved Morlet wavelet is used to extract the spectrum feature maps from EEG signals. Then, the spatial-frequency features are extracted from the PSD maps by using the three-dimensional convolutional neural network (3DCNN) model. Finally, the spatial-frequency features are incorporated into the bidirectional gated recurrent unit (Bi-GRU) models to extract the spatial-frequency-sequential multi-dimensional fusion features for recognition of the brain's sensorimotor-region-activated task. Findings: In the comparative experiments, datasets of motor imagery (MI)/action observation (AO)/action execution (AE) tasks are selected to test the classification performance and robustness of the proposed algorithm. In addition, the impact of the extracted features on the sensorimotor region and on the classification processing is also analyzed by visualization during the experiments. Originality/value: The experimental results show that the proposed algorithm extracts the corresponding brain activation features for different action-related tasks, achieving more stable classification performance in dealing with AO/MI/AE tasks, and has the best robustness on EEG signals of different subjects.
10

Torres, Felipe Soares, Shazia Akbar, Srinivas Raman, Kazuhiro Yasufuku, Felix Baldauf-Lenschen and Natasha B. Leighl. "Automated imaging-based stratification of early-stage lung cancer patients prior to receiving surgical resection using deep learning applied to CTs." Journal of Clinical Oncology 39, no. 15_suppl (May 20, 2021): 1552. http://dx.doi.org/10.1200/jco.2021.39.15_suppl.1552.

Abstract:
Background: Computed tomography (CT) imaging is an important tool to guide further investigation and treatment in patients with lung cancer. For patients with early-stage lung cancer, surgery remains an optimal treatment option. Artificial intelligence applied to pretreatment CTs may have the ability to quantify mortality risk and stratify patients for more individualized diagnostic, treatment and monitoring decisions. Methods: A fully automated, end-to-end model was designed to localize the 36 cm × 36 cm × 36 cm space centered on the lungs and learn deep prognostic features using a 3-dimensional convolutional neural network (3DCNN) to predict 5-year mortality risk. The 3DCNN was trained and validated in a 5-fold cross-validation using 2,924 CTs of 1,689 lung cancer patients from 6 public datasets made available in The Cancer Imaging Archive. We evaluated the 3DCNN's ability to stratify stage I and II patients who received surgery into mortality risk quintiles using the Cox proportional hazards model. Results: 260 of the 1,689 lung cancer patients in the withheld validation dataset were diagnosed as stage I or II, received a surgical resection within 6 months of their pretreatment CT and had known 5-year disease and survival outcomes. Based on the 3DCNN's predicted mortality risk, patients in the highest risk quintile had a 14.2-fold (95% CI 4.3–46.8, p < 0.001) increase in 5-year mortality hazard compared to patients in the lowest risk quintile. Conclusions: Deep learning applied to pretreatment CTs provides personalised prognostic insights for early-stage lung cancer patients who received surgery and has the potential to inform treatment and monitoring decisions. [Table: see text]
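
The quintile stratification used in this abstract reduces to sorting predicted risks into five equal groups. A minimal numpy sketch with synthetic scores standing in for model outputs:

```python
import numpy as np

def risk_quintiles(risk_scores):
    """Assign each patient to a mortality-risk quintile
    (1 = lowest risk, 5 = highest) from predicted risk scores."""
    q = np.quantile(risk_scores, [0.2, 0.4, 0.6, 0.8])
    return np.digitize(risk_scores, q) + 1

rng = np.random.default_rng(2)
scores = rng.random(260)            # e.g. 260 validation patients
groups = risk_quintiles(scores)
print(np.bincount(groups)[1:])      # roughly 52 patients per quintile
```

The survival comparison itself (the 14.2-fold hazard ratio) would then be fit between `groups == 5` and `groups == 1` with a Cox model, which is outside this sketch.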
11

Li, Jin, Xianglong Liu, Zhuofan Zong, Wanru Zhao, Mingyuan Zhang and Jingkuan Song. "Graph Attention Based Proposal 3D ConvNets for Action Detection". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 4626–33. http://dx.doi.org/10.1609/aaai.v34i04.5893.

Abstract:
The recent advances in 3D Convolutional Neural Networks (3D CNNs) have shown promising performance for untrimmed video action detection, employing the popular detection framework that relies heavily on temporal action proposal generation as the input of the action detector and localization regressor. In practice, the proposals usually exhibit strong intra and inter relations among them, mainly stemming from the temporal and spatial variations in the video actions. However, most existing 3D CNNs ignore these relations and thus suffer from redundant proposals that degrade detection performance and efficiency. To address this problem, we propose graph attention based proposal 3D ConvNets (AGCN-P-3DCNNs) for video action detection. Specifically, our proposed graph attention is composed of an intra-attention-based GCN and an inter-attention-based GCN. We use intra attention to learn the intra long-range dependencies inside each action proposal and update the node matrix of the intra-attention-based GCN, and use inter attention to learn the inter dependencies between different action proposals as the adjacency matrix of the inter-attention-based GCN. Afterwards, we fuse intra and inter attention to model intra long-range dependencies and inter dependencies simultaneously. Another contribution is a simple and effective framewise classifier, which enhances the feature representation capabilities of the backbone model. Experiments on two proposal 3D ConvNets based models (P-C3D and P-ResNet) and two popular action detection benchmarks (THUMOS 2014, ActivityNet v1.3) demonstrate the state-of-the-art performance achieved by our method. In particular, P-C3D embedded with our module achieves an average mAP improvement of 3.7% on the THUMOS 2014 dataset compared to the original model.
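
The attention-derived adjacency between proposals described above can be sketched with plain dot-product attention followed by one graph-convolution step. The feature sizes and the single shared weight matrix are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_adjacency(X):
    """Build a soft adjacency matrix between proposals from pairwise
    dot-product attention over their feature vectors."""
    scores = X @ X.T / np.sqrt(X.shape[1])
    return softmax(scores, axis=1)   # each row sums to 1

def gcn_layer(A, X, W):
    """One graph-convolution step: aggregate neighbour features through
    the attention adjacency, then apply a shared linear map + ReLU."""
    return np.maximum(A @ X @ W, 0.0)

rng = np.random.default_rng(3)
X = rng.normal(size=(6, 8))          # 6 temporal proposals, 8-d features
W = rng.normal(scale=0.3, size=(8, 8))
A = attention_adjacency(X)
H = gcn_layer(A, X, W)
print(A.shape, H.shape)  # (6, 6) (6, 8)
```

Learning `A` from the data instead of fixing it is what lets related proposals share evidence and redundant ones be down-weighted.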
12

Li, Haoying. "The application and challenges of different face recognition technologies in the three major fields of security, social media, and medical care". Applied and Computational Engineering 95, no. 1 (October 25, 2024): 174–81. http://dx.doi.org/10.54254/2755-2721/95/2024ch0051.

Abstract:
Face recognition technology is a very important part of modern technology: it can be used not only to ensure security but also for information organization and content division. However, with the popularization and application of face recognition technology, many problems that urgently need to be solved have emerged: excessive computing resources are consumed in the pursuit of high-precision recognition, which creates computational pressure; improving recall requires a great deal of power and memory; and if security is not guaranteed, problems such as data breaches can occur. The demand for face recognition technology differs across fields of use, so the purpose of this study is to combine scenario requirements and technical advantages more reasonably. The research results are as follows: the high accuracy and recall rate of 3D convolutional neural networks (3DCNNs) ensure that they can be used safely in high-precision and high-security scenarios. The lightweight convolutional neural network (MobileNetV2) is suitable for resource-constrained environments due to its low memory consumption and low communication cost. Edge-computing real-time face recognition (EC-RFERNet) is, of the three, the most suitable for large-scale popularization and application because of its lowest power consumption and latency. This study explores in depth the advantages and disadvantages of different facial recognition technologies and finds solutions to their shortcomings. By matching their unique advantages with the requirements of common scenarios, it provides a scientific basis for the deployment of face recognition technology in different fields. However, due to limited information, this paper cannot cover all application scenarios and the latest technologies; it is hoped that in the future the advantages of different technologies can be combined to develop more comprehensive face recognition technology and make more reasonable technical planning.
13

Low, Kah Sin, and Swee Kheng Eng. "Performance evaluation of deep learning techniques for human activity recognition system". Journal of Physics: Conference Series 2641, no. 1 (November 1, 2023): 012012. http://dx.doi.org/10.1088/1742-6596/2641/1/012012.

Abstract:
Abstract Human Activity Recognition (HAR) is crucial in various applications, such as sports and surveillance. This paper focuses on the performance evaluation of a HAR system using deep learning techniques. Features will be extracted using 3DCNN, and classification will be performed using LSTM. Meanwhile, 3DCNN and RNN are two additional, well-known classification techniques that will be applied in order to compare the effectiveness of the three classifiers. The 3DCNN-LSTM approach contributes the highest overall accuracy of 86.57%, followed by 3DCNN-3DCNN and 3DCNN-RNN with the overall accuracy of 86.07% and 79.60%, respectively. Overall, this paper contributes to the field of HAR and provides valuable insights for the development of activity recognition systems.
14

Li, Wenmei, Huaihuai Chen, Qing Liu, Haiyan Liu, Yu Wang and Guan Gui. "Attention Mechanism and Depthwise Separable Convolution Aided 3DCNN for Hyperspectral Remote Sensing Image Classification". Remote Sensing 14, no. 9 (May 5, 2022): 2215. http://dx.doi.org/10.3390/rs14092215.

Abstract:
Hyperspectral Remote Sensing Image (HRSI) classification based on Convolutional Neural Networks (CNNs) has become one of the hot topics in the field of remote sensing. However, the high-dimensional information and limited training samples are prone to the Hughes phenomenon for hyperspectral remote sensing images. Meanwhile, high-dimensional information processing also consumes significant time and computing power, or the extracted features may not be representative, resulting in unsatisfactory classification efficiency and accuracy. To solve these problems, an attention mechanism and depthwise separable convolution are introduced into the three-dimensional convolutional neural network (3DCNN). Thus, 3DCNN-AM and 3DCNN-AM-DSC are proposed for HRSI classification. Firstly, three hyperspectral datasets (Indian Pines, University of Pavia and University of Houston) are used to analyze the effect of patch size and dataset allocation ratio (training set : validation set : test set) on the performance of 3DCNN and 3DCNN-AM. Secondly, in order to improve work efficiency, principal component analysis (PCA) and autoencoder (AE) dimension reduction methods are applied to reduce data dimensionality and maximize the classification accuracy of the 3DCNN, though this still takes time. Furthermore, the HRSI classification models 3DCNN-AM and 3DCNN-AM-DSC are applied to classify the three classic HRSI datasets. Lastly, classification accuracy and time consumption are evaluated. The results indicate that 3DCNN-AM can improve classification accuracy and reduce computing time with the dimension-reduced dataset, and the 3DCNN-AM-DSC model can reduce the training time by up to 91.77% without greatly reducing the classification accuracy. The results on the three classic hyperspectral datasets illustrate that 3DCNN-AM-DSC can improve the classification performance and reduce the time required for model training. It may be a new way to tackle hyperspectral datasets in HRSI classification tasks without dimensionality reduction.
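
The saving from depthwise separable 3D convolution (the "DSC" in 3DCNN-AM-DSC) comes from factorizing each k×k×k convolution into a per-channel spatial filter plus a 1×1×1 channel-mixing step; the arithmetic is easy to check. The channel sizes below are illustrative, not those of the paper's model:

```python
# Parameter count of a standard 3D convolution versus a depthwise-separable
# one (a depthwise 3D convolution followed by a 1x1x1 pointwise convolution).
def params_standard(c_in, c_out, k):
    return c_in * c_out * k ** 3

def params_separable(c_in, c_out, k):
    depthwise = c_in * k ** 3        # one k*k*k filter per input channel
    pointwise = c_in * c_out         # 1x1x1 mixing across channels
    return depthwise + pointwise

c_in, c_out, k = 32, 64, 3
std = params_standard(c_in, c_out, k)    # 32*64*27 = 55296
sep = params_separable(c_in, c_out, k)   # 32*27 + 32*64 = 2912
print(std, sep, round(100 * (1 - sep / std), 1))  # ~94.7% fewer weights
```

Fewer weights means fewer multiply-accumulates per voxel, which is consistent with the large training-time reductions the abstract reports.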
15

Al Barazanchi, Israa Ibraheem, Wahidah Hashim, Reema Thabit and Noor Al-Huda K. Hussein. "Advanced Hybrid Mask Convolutional Neural Network with Backpropagation Optimization for Precise Sensor Node Classification in Wireless Body Area Networks". KHWARIZMIA 2024 (March 13, 2024): 17–31. https://doi.org/10.70470/khwarizmia/2024/004.

Abstract:
Wireless Body Area Networks (WBANs) are crucial in continuous health monitoring, fitness tracking, and other applications where real-time collection of physiological data is needed from sensors worn on the body. Achieving reliable data transmission, energy efficiency, and good overall system performance is important in WBANs. Still, WBANs present several challenges: the collected data are heterogeneous, since they originate from diverse IMDs measuring different bio-physiological signals; they can also be quite noisy because of motion artifacts; and tight limitations on energy, bandwidth, and storage push for low-complexity methods rather than standard deep learning techniques. Common convolutional neural networks (CNNs) are successfully utilized for spatial information extraction, but they cannot capture temporal dependencies well, and the noisy, multi-modal structure of WBAN sensor data poses an additional challenge for traditional CNNs. These limitations emphasize the need for a flexible, fast, and precise classification model tailored to the specific needs of WBAN applications. To overcome these challenges, this paper presents a novel hybrid neural network architecture that combines 2D and 3D convolutions for spatial-temporal feature extraction with masked convolution layers that adaptively ignore uninteresting parts of the data. The model aims at high classification performance balanced against computational efficiency, making it suitable for deployment on resource-constrained WBAN devices. We then apply further backpropagation optimization measures, such as adaptive learning rate scheduling and gradient clipping, to improve training stability and reduce latency, which in turn supports the real-time processing capabilities of the model. By using each of these components, the model is able to deal with the multi-dimensional aspects and high noise levels of WBANs without excessive computational resources [18]. The Hybrid Masked CNN model is shown to outperform existing approaches without such masking, yielding substantially higher accuracy, precision, recall, and F1-score across all metrics defined for the application, compared to traditional 2DCNNs, 3DCNNs, and other hybrid models. The latency of the model is also significantly decreased, which confirms its applicability to real-time WBAN applications. The obtained results confirm the efficacy of hybrid architectures with masked convolutions, along with optimized training techniques, for WBAN sensor node classification. This paper advances WBAN technology by providing a solid and scalable solution that can be implemented when more reliability and flexibility are required from such systems in applications like healthcare, fitness, or other fields.
16

Yang, Da-wei, Xi-bin Jia, Yu-jie Xiao, Xiao-pei Wang, Zhen-chang Wang and Zheng-han Yang. "Noninvasive Evaluation of the Pathologic Grade of Hepatocellular Carcinoma Using MCF-3DCNN: A Pilot Study". BioMed Research International 2019 (April 28, 2019): 1–12. http://dx.doi.org/10.1155/2019/9783106.

Abstract:
Purpose. To evaluate the diagnostic performance of deep learning with a multichannel fusion three-dimensional convolutional neural network (MCF-3DCNN) in the differentiation of the pathologic grades of hepatocellular carcinoma (HCC) based on dynamic contrast-enhanced magnetic resonance images (DCE-MR images). Methods and Materials. Fifty-one histologically proven HCCs from 42 consecutive patients from January 2015 to September 2017 were included in this retrospective study. Pathologic examinations revealed nine well-differentiated (WD), 35 moderately differentiated (MD), and seven poorly differentiated (PD) HCCs. DCE-MR images with five phases were collected using a 3.0 Tesla MR scanner. The 4D-tensor representation was employed to organize the collected data in one temporal and three spatial dimensions by referring to the phases and 3D scanning slices of the DCE-MR images. A deep learning diagnosis model with MCF-3DCNN was proposed, and the structure of MCF-3DCNN was determined to approximate clinical diagnosis experience by taking into account the significance of the spatial and temporal information from DCE-MR images. Then, MCF-3DCNN was trained based on well-labeled samples of HCC lesions from real patient cases by experienced radiologists. The accuracy when differentiating the pathologic grades of HCC was calculated, and the performance of MCF-3DCNN in lesion diagnosis was assessed. Additionally, the areas under the receiver operating characteristic curves (AUC) for distinguishing WD, MD, and PD HCCs were calculated. Results. MCF-3DCNN achieved an average accuracy of 0.7396±0.0104 with regard to totally differentiating the pathologic grade of HCC. MCF-3DCNN also achieved the highest diagnostic performance for discriminating WD HCCs from others, with an average AUC, accuracy, sensitivity, and specificity of 0.96, 91.00%, 96.88%, and 89.62%, respectively. Conclusions. This study indicates that MCF-3DCNN can be a promising technology for evaluating the pathologic grade of HCC based on DCE-MR images.
17

Vrskova, Roberta, Robert Hudec, Patrik Kamencay and Peter Sykora. "Human Activity Classification Using the 3DCNN Architecture". Applied Sciences 12, no. 2 (January 17, 2022): 931. http://dx.doi.org/10.3390/app12020931.

Abstract:
Interest in utilizing neural networks in a variety of scientific and academic studies and in industrial applications is increasing. In addition to the growing interest in neural networks, there is also a rising interest in video classification. Object detection from an image is used as a tool for various applications and is the basis for video classification. Identifying objects in videos is more difficult than in single images, as the information in videos has a time continuity constraint. Common neural networks such as ConvLSTM (Convolutional Long Short-Term Memory) and 3DCNN (3D Convolutional Neural Network), as well as many others, have been used to detect objects from video. Here, we propose a 3DCNN for the detection of human activity from video data. The experimental results show that the optimized proposed 3DCNN provides better results than neural network architectures based on motion, static and hybrid features. The proposed 3DCNN obtains the highest recognition precision of the methods considered, 87.4%. In contrast, the neural network architectures for motion, static and hybrid features achieve precisions of 65.4%, 63.1% and 71.2%, respectively. We also compare our results with previous research: a previous 3DCNN architecture achieved only 29% on the UCF YouTube Action database, worse than the architecture proposed in this article. The experimental results on the UCF YouTube Action dataset demonstrate the effectiveness of the proposed 3DCNN for recognition of human activity. For a broader comparison of the proposed neural network, the modified UCF101 dataset, the full UCF50 dataset and the full UCF101 dataset were also evaluated. An overall precision of 82.7% was obtained using the modified UCF101 dataset. On the other hand, the precision using the full UCF50 dataset and the full UCF101 dataset was 80.6% and 78.5%, respectively.
18

Erbey, Ali, and Necaattin Barışçı. "Lip-Reading Classification of Turkish Digits Using Ensemble Learning Architecture Based on 3DCNN". Applied Sciences 15, no. 2 (January 8, 2025): 563. https://doi.org/10.3390/app15020563.

Abstract:
Understanding others correctly is of great importance for maintaining effective communication. Factors such as hearing difficulties or environmental noise can disrupt this process. Lip reading offers an effective solution to these challenges. With the growing success of deep learning architectures, research on lip reading has gained momentum. The aim of this study is to create a lip reading dataset for Turkish digit recognition and to conduct predictive analyses. The dataset has been divided into two subsets: the face region and the lip region. CNN, LSTM, and 3DCNN-based models, including C3D, I3D, and 3DCNN+BiLSTM, were used. While LSTM models are effective in processing temporal data, 3DCNN-based models, which can process both spatial and temporal information, achieved higher accuracy in this study. Experimental results showed that the dataset containing only the lip region performed better; accuracy rates for CNN, LSTM, C3D, and I3D on the lip region were 67.12%, 75.53%, 86.32%, and 93.24%, respectively. The 3DCNN-based models achieved higher accuracy due to their ability to process spatio-temporal data. Furthermore, an additional 1.23% improvement was achieved through ensemble learning, with the best result reaching 94.53% accuracy. Ensemble learning, by combining the strengths of different models, provided a meaningful improvement in overall performance. These results demonstrate that 3DCNN architectures and ensemble learning methods yield high success in addressing the problem of lip reading in the Turkish language. While our study focuses on Turkish digit recognition, the proposed methods have the potential to be successful in other languages or broader lip reading applications.
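
The ensemble step reported above can be as simple as soft voting over the models' class probabilities. A sketch with made-up outputs for three clips and three classes (a cut-down stand-in for the ten Turkish digits; the weights and values are arbitrary assumptions):

```python
import numpy as np

def soft_vote(prob_list, weights=None):
    """Soft-voting ensemble: average the class-probability outputs of
    several models (optionally weighted), then take the argmax."""
    P = np.average(np.stack(prob_list), axis=0, weights=weights)
    return P.argmax(axis=1), P

# Hypothetical per-model probabilities for 3 clips over 3 classes.
p_c3d  = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.2, 0.7]])
p_i3d  = np.array([[0.5, 0.4, 0.1], [0.1, 0.7, 0.2], [0.3, 0.3, 0.4]])
p_lstm = np.array([[0.7, 0.2, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6]])
pred, P = soft_vote([p_c3d, p_i3d, p_lstm], weights=[1.0, 1.5, 1.0])
print(pred)  # class index per clip
```

Weighting the strongest model (I3D in the paper's results) more heavily is one common design choice; equal weights are the default.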
19

Takei, Yuma, and Takashi Ishida. "P3CMQA: Single-Model Quality Assessment Using 3DCNN with Profile-Based Features". Bioengineering 8, no. 3 (19.03.2021): 40. http://dx.doi.org/10.3390/bioengineering8030040.

Abstract:
Model quality assessment (MQA), which selects near-native structures from structure models, is an important process in protein tertiary structure prediction. The three-dimensional convolutional neural network (3DCNN) was applied to the task, but its performance was only comparable to existing methods because it used only atom-type features as the input. Thus, we added sequence profile-based features, which are also used in other methods, to improve the performance. We developed a single-model MQA method for protein structures based on a 3DCNN using sequence profile-based features, namely, P3CMQA. Performance evaluation using a CASP13 dataset showed that profile-based features improved the assessment performance, and the proposed method was better than currently available single-model MQA methods, including the previous 3DCNN-based method. We also implemented a web interface for the method to make it more user-friendly.
20

Torres, Felipe, Shazia Akbar, Felix Baldauf-Lenschen and Natasha B. Leighl. "Improved prognostication for lung cancer patients from computed tomography imaging using deep learning." Journal of Clinical Oncology 38, no. 15_suppl (20.05.2020): 2044. http://dx.doi.org/10.1200/jco.2020.38.15_suppl.2044.

Abstract:
2044 Background: Clinical TNM staging derived from computed tomography (CT) imaging is a key prognostic factor for lung cancer patients when making decisions about treatment, monitoring, and clinical trial eligibility. However, heterogeneity among patients, including by molecular subtype, may result in variability in the survival outcomes of patients with the same TNM stage who receive the same treatment. Artificial intelligence may offer additional, individualized prognostic information based on both known and unknown features present in CTs to facilitate more precise clinical decision making. We developed a novel deep learning-based technique to predict 2-year survival from pretreatment CTs of pathologically confirmed lung cancer patients. Methods: A fully automated, end-to-end model was designed to localize the three-dimensional (3D) space comprising the lungs and heart, and to learn deep prognostic features using a 3D convolutional neural network (3DCNN). The 3DCNN was trained and validated using 1,841 CTs of 1,184 patients from five public datasets made available in The Cancer Imaging Archive. Spearman's rank correlation (R) and the concordance index (C-index) between the model output and the survival status of each patient after 2-year follow-up from CT acquisition were assessed, in addition to sensitivity, specificity, and accuracy stratified by stage. Results: The 3DCNN showed an overall prediction accuracy of 75.0% (R = 0.32, C-index = 0.67, p < 0.0001), with higher performance achieved for stage I patients (Table). The 3DCNN showed better overall correlation with survival for the 1,124 patients with available TNM staging than TNM staging alone (R = 0.19, C-index = 0.63, p < 0.0001); however, a weighted linear combination of TNM staging and the 3DCNN yielded a superior correlation (R = 0.34, C-index = 0.73, p < 0.0001).
Conclusions: Deep learning applied to pretreatment CT images provides personalized prognostic information that complements clinical staging and may help facilitate more precise prognostication of patients diagnosed with lung cancer. [Table: see text]
21

Mahareek, Esraa A., Eman K. ElSayed, Nahed M. ElDesouky and Kamal A. ElDahshan. "Detecting anomalies in security cameras with 3D-convolutional neural network and convolutional long short-term memory". International Journal of Electrical and Computer Engineering (IJECE) 14, no. 1 (1.02.2024): 993. http://dx.doi.org/10.11591/ijece.v14i1.pp993-1004.

Abstract:
This paper presents a novel deep learning-based approach for anomaly detection in surveillance videos. The foundation of the proposed method is a deep network trained to recognize objects and human activity in videos. To detect anomalies in surveillance footage, the proposed method combines the strengths of a 3D-convolutional neural network (3DCNN) and convolutional long short-term memory (ConvLSTM). The 3DCNN is used to extract spatiotemporal features from the video frames, while ConvLSTM is employed to capture temporal relationships between frames. The technique was evaluated on five large-scale real-world datasets (UCFCrime, XDViolence, UBIFights, CCTVFights, UCF101) containing both indoor and outdoor video clips, as well as synthetic datasets with a range of object shapes, sizes, and behaviors. The results further demonstrate that combining the 3DCNN with ConvLSTM can increase precision and reduce false positives, achieving high accuracy and area under the receiver operating characteristic curve (ROC-AUC) in both indoor and outdoor scenarios when compared to the cutting-edge techniques included in the comparison.
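For readers unfamiliar with the building block shared by several entries on this page, the core 3DCNN operation (a convolution that slides over time as well as space) can be sketched in plain Python. This is a didactic toy, not code from the cited paper:

```python
# Toy valid-mode 3D convolution (no padding, stride 1) over a clip shaped
# (time, height, width). A 3DCNN stacks many such filters to extract
# spatiotemporal features before a ConvLSTM models longer-range dynamics.

def conv3d(clip, kernel):
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    T, H, W = len(clip), len(clip[0]), len(clip[0][0])
    out = []
    for t in range(T - kt + 1):
        plane = []
        for i in range(H - kh + 1):
            row = []
            for j in range(W - kw + 1):
                s = sum(
                    clip[t + dt][i + di][j + dj] * kernel[dt][di][dj]
                    for dt in range(kt)
                    for di in range(kh)
                    for dj in range(kw)
                )
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out

# A temporal-difference kernel: responds to change between consecutive frames.
clip = [[[float(t)] * 2 for _ in range(2)] for t in range(3)]  # frame t filled with t
diff = conv3d(clip, [[[-1.0]], [[1.0]]])
```

Because each frame here increases by exactly 1, every output value of the temporal-difference filter equals 1.0.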
22

Collins, Toby, Marianne Maktabi, Manuel Barberio, Valentin Bencteux, Boris Jansen-Winkeln, Claire Chalopin, Jacques Marescaux, Alexandre Hostettler, Michele Diana and Ines Gockel. "Automatic Recognition of Colon and Esophagogastric Cancer with Machine Learning and Hyperspectral Imaging". Diagnostics 11, no. 10 (30.09.2021): 1810. http://dx.doi.org/10.3390/diagnostics11101810.

Abstract:
There are approximately 1.8 million diagnoses of colorectal cancer, 1 million diagnoses of stomach cancer, and 0.6 million diagnoses of esophageal cancer each year globally. An automatic computer-assisted diagnostic (CAD) tool to rapidly detect colorectal and esophagogastric cancer tissue in optical images would be hugely valuable to a surgeon during an intervention. Based on a colon dataset with 12 patients and an esophagogastric dataset of 10 patients, several state-of-the-art machine learning methods have been trained to detect cancer tissue using hyperspectral imaging (HSI), including Support Vector Machines (SVM) with radial basis function kernels, Multi-Layer Perceptrons (MLP) and 3D Convolutional Neural Networks (3DCNN). A leave-one-patient-out cross-validation (LOPOCV) with and without combining these sets was performed. The ROC-AUC score of the 3DCNN was slightly higher than the MLP and SVM with a difference of 0.04 AUC. The best performance was achieved with the 3DCNN for colon cancer and esophagogastric cancer detection with a high ROC-AUC of 0.93. The 3DCNN also achieved the best DICE scores of 0.49 and 0.41 on the colon and esophagogastric datasets, respectively. These scores were significantly improved using a patient-specific decision threshold to 0.58 and 0.51, respectively. This indicates that, in practical use, an HSI-based CAD system using an interactive decision threshold is likely to be valuable. Experiments were also performed to measure the benefits of combining the colorectal and esophagogastric datasets (22 patients), and this yielded significantly better results with the MLP and SVM models.
23

Ha, Manh-Hung. "Top-Heavy CapsNets Based on Spatiotemporal Non-Local for Action Recognition". Journal of Computing Theories and Applications 2, no. 1 (25.05.2024): 39–50. http://dx.doi.org/10.62411/jcta.10551.

Abstract:
To effectively comprehend human actions, we have developed a Deep Neural Network (DNN) that utilizes inner spatiotemporal non-locality to capture meaningful semantic context for efficient action identification. This work introduces the Top-Heavy CapsNet as a novel approach for video analysis, incorporating a 3D Convolutional Neural Network (3DCNN) to apply the thematic actions of local classifiers for effective classification based on motion from the spatiotemporal context in videos. This DNN comprises multiple layers, including 3D Convolutional Neural Network (3DCNN), Spatial Depth-Based Non-Local (SBN) layer, and Deep Capsule (DCapsNet). Firstly, the 3DCNN extracts structured and semantic information from RGB and optical flow streams. Secondly, the SBN layer processes feature blocks with spatial depth to emphasize visually advantageous cues, potentially aiding in action differentiation. Finally, DCapsNet is more effective in exploiting vectorized prominent features to represent objects and various action features for the ultimate label determination. Experimental results demonstrate that the proposed DNN achieves an average accuracy of 97.6%, surpassing conventional DNNs on the traffic police dataset. Furthermore, the proposed DNN attains average accuracies of 98.3% and 80.7% on the UCF101 and HMDB51 datasets, respectively. This underscores the applicability of the proposed DNN for effectively recognizing diverse actions performed by subjects in videos.
24

Riahi, Ali, Omar Elharrouss and Somaya Al-Maadeed. "BEMD-3DCNN-based method for COVID-19 detection". Computers in Biology and Medicine 142 (March 2022): 105188. http://dx.doi.org/10.1016/j.compbiomed.2021.105188.

25

Al-Hammadi, Muneer, Ghulam Muhammad, Wadood Abdul, Mansour Alsulaiman, Mohamed A. Bencherif and Mohamed Amine Mekhtiche. "Hand Gesture Recognition for Sign Language Using 3DCNN". IEEE Access 8 (2020): 79491–509. http://dx.doi.org/10.1109/access.2020.2990434.

26

Xu, Hao, Wei Yao, Li Cheng and Bo Li. "Multiple Spectral Resolution 3D Convolutional Neural Network for Hyperspectral Image Classification". Remote Sensing 13, no. 7 (25.03.2021): 1248. http://dx.doi.org/10.3390/rs13071248.

Abstract:
In recent years, benefiting from the rapid development of deep learning technology in the field of computer vision, the study of hyperspectral image (HSI) classification has also made great progress. However, compared with ordinary RGB images, HSIs are more like 3D cubes; therefore, it is necessary and beneficial to explore classification methods suitable for the very special data structure of HSIs. In this paper, we propose Multiple Spectral Resolution 3D Convolutional Neural Network (MSR-3DCNN) for HSI classification tasks. In MSR-3DCNN, we expand the idea of multi-scale feature fusion and dilated convolution from the spatial dimension to the spectral dimension, and combine 3D convolution and residual connection; therefore, it can better adapt to the 3D cubic form of hyperspectral data and make efficient use of spectral information in different bands. Experimental results on four benchmark datasets show the effectiveness of the proposed approach and its superiority as compared with some state-of-the-art (SOTA) HSI classification methods.
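The spectral-dimension dilated convolution mentioned in this abstract can be illustrated with a one-dimensional toy: kernel taps are spaced `dilation` bands apart, so the spectral receptive field grows without adding weights. This is a hypothetical sketch, not the authors' implementation:

```python
# Valid-mode 1D dilated convolution over a single pixel's spectrum.

def dilated_conv1d(spectrum, kernel, dilation=1):
    span = (len(kernel) - 1) * dilation + 1  # receptive field in bands
    return [
        sum(kernel[k] * spectrum[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(spectrum) - span + 1)
    ]

# With dilation 2, a two-tap kernel sums bands that are two apart.
result = dilated_conv1d([0, 1, 2, 3, 4, 5], [1, 1], dilation=2)  # [2, 4, 6, 8]
```

Running the same kernel at several dilations and fusing the outputs gives the multi-scale spectral view the abstract describes.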
27

Do, Luu-Ngoc, Byung Hyun Baek, Seul Kee Kim, Hyung-Jeong Yang, Ilwoo Park and Woong Yoon. "Automatic Assessment of ASPECTS Using Diffusion-Weighted Imaging in Acute Ischemic Stroke Using Recurrent Residual Convolutional Neural Network". Diagnostics 10, no. 10 (9.10.2020): 803. http://dx.doi.org/10.3390/diagnostics10100803.

Abstract:
The early detection and rapid quantification of acute ischemic lesions play pivotal roles in stroke management. We developed a deep learning algorithm for the automatic binary classification of the Alberta Stroke Program Early Computed Tomographic Score (ASPECTS) using diffusion-weighted imaging (DWI) in acute stroke patients. Three hundred and ninety DWI datasets with acute anterior circulation stroke were included. A classifier algorithm utilizing a recurrent residual convolutional neural network (RRCNN) was developed for classification between low (1–6) and high (7–10) DWI-ASPECTS groups. The model performance was compared with a pre-trained VGG16, Inception V3, and a 3D convolutional neural network (3DCNN). The proposed RRCNN model demonstrated higher performance than the pre-trained models and 3DCNN with an accuracy of 87.3%, AUC of 0.941, and F1-score of 0.888 for classification between the low and high DWI-ASPECTS groups. These results suggest that the deep learning algorithm developed in this study can provide a rapid assessment of DWI-ASPECTS and may serve as an ancillary tool that can assist physicians in making urgent clinical decisions.
28

Li, Xin, and Yan Piao. "TCANet: Three-dimensional cross-attention mechanism for stereo-matching". Journal of Physics: Conference Series 2858, no. 1 (1.10.2024): 012004. http://dx.doi.org/10.1088/1742-6596/2858/1/012004.

Abstract:
Effective disparity estimation is a current hotspot in stereo vision research, and cost aggregation is an important part of disparity prediction; performing cost aggregation more effectively is the core step in improving the accuracy of disparity prediction. In previous studies, a 3DCNN with a stacked hourglass structure was often used for cost aggregation. In this research, we propose an effective 3D cross-attention stereo network that utilizes the attention mechanism to obtain contextual information for cost aggregation in a more efficient way. Specifically, the 3D cross-attention module in TCANet acquires the geometric information of all pixels on the 3D cross path. By repeating this operation twice, each pixel can eventually obtain global dependencies from all other pixels. Using the 3D cross-attention module in a stacked hourglass-structured 3DCNN increases the number of parameters by only a very small amount and can effectively improve the performance of the model. Experimental results show that TCANet performs well on the synthetic Scene Flow dataset and the real-world KITTI datasets.
29

Zhang, Bo, Lizbeth Goodman and Xiaoqing Gu. "Novel 3D Contextual Interactive Games on a Gamified Virtual Environment Support Cultural Learning Through Collaboration Among Intercultural Students". SAGE Open 12, no. 2 (April 2022): 215824402210961. http://dx.doi.org/10.1177/21582440221096141.

Abstract:
This study aims to help international students learn the language and cultural knowledge of their future study destination by collaborating with local students through coplaying games in online virtual rooms. Therefore, this study explores whether the 3D interactive game with specific contexts on a virtual platform can support intercultural collaboration and improve the students’ language and cultural learning. This study created novel 3D contextual interactive games (3DCIGs) in a gamified virtual environment (GVE), established on a unique virtual platform named Terf®. Terf® enables the observing and recording of data related to the conversations and behaviors of users. To investigate the effects of 3DCIGs on students, a focus group consisting of newly arrived Chinese students and Irish students from an Irish university participated in this study. The study adopted mixed methods of qualitative and quantitative analysis to examine whether 3DCIGs effectively motivate the collaborative learning of intercultural students compared with text-based assignments set in the Game Play Rooms. The findings reveal that the novel 3DCIGs developed in this study have a positive potential to motivate intercultural students to engage in team collaboration and help their cultural and language knowledge exchange.
30

Li, Zhenjiang, Guangli Wu, Ye Liu, Yifan Shuai and Lei Wang. "Video Abnormal Event Detection Based on Optical Flow and 3DCNN". Journal of Physics: Conference Series 1881, no. 2 (1.04.2021): 022022. http://dx.doi.org/10.1088/1742-6596/1881/2/022022.

31

A. Alameen, Sara, and Areej M. Alhothali. "A Lightweight Driver Drowsiness Detection System Using 3DCNN With LSTM". Computer Systems Science and Engineering 44, no. 1 (2023): 895–912. http://dx.doi.org/10.32604/csse.2023.024643.

32

Zhu, Guangming, Liang Zhang, Peiyi Shen, Juan Song, Syed Afaq Ali Shah and Mohammed Bennamoun. "Continuous Gesture Segmentation and Recognition Using 3DCNN and Convolutional LSTM". IEEE Transactions on Multimedia 21, no. 4 (April 2019): 1011–21. http://dx.doi.org/10.1109/tmm.2018.2869278.

33

Rajagopal, Sureshkumar, Tamilvizhi Thanarajan, Youseef Alotaibi and Saleh Alghamdi. "Brain Tumor: Hybrid Feature Extraction Based on UNet and 3DCNN". Computer Systems Science and Engineering 45, no. 2 (2023): 2093–109. http://dx.doi.org/10.32604/csse.2023.032488.

34

Ullah, Hayat, and Arslan Munir. "A 3DCNN-Based Knowledge Distillation Framework for Human Activity Recognition". Journal of Imaging 9, no. 4 (14.04.2023): 82. http://dx.doi.org/10.3390/jimaging9040082.

Abstract:
Human action recognition has been actively explored over the past two decades to further advancements in the video analytics domain. Numerous research studies have been conducted to investigate the complex sequential patterns of human actions in video streams. In this paper, we propose a knowledge distillation framework, which distills spatio-temporal knowledge from a large teacher model to a lightweight student model using an offline knowledge distillation technique. The proposed offline knowledge distillation framework takes two models: a large pre-trained 3DCNN (three-dimensional convolutional neural network) teacher model and a lightweight 3DCNN student model (i.e., the teacher model is pre-trained on the same dataset on which the student model is to be trained). During offline knowledge distillation training, the distillation algorithm trains only the student model, helping it achieve the same level of prediction accuracy as the teacher model. To evaluate the performance of the proposed method, we conduct extensive experiments on four benchmark human action datasets. The obtained quantitative results verify the efficiency and robustness of the proposed method over state-of-the-art human action recognition methods, with up to a 35% improvement in accuracy over existing methods. Furthermore, we evaluate the inference time of the proposed method and compare the obtained results with the inference times of the state-of-the-art methods. Experimental results reveal that the proposed method attains an improvement of up to 50× in terms of frames per second (FPS) over the state-of-the-art methods. The short inference time and high accuracy make our proposed framework suitable for human activity recognition in real-time applications.
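The offline distillation objective described here is commonly implemented as a KL divergence between temperature-softened teacher and student distributions (the Hinton-style formulation). A minimal sketch under that assumption, not the authors' exact loss:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax over a list of logits."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)  # soft targets (teacher is frozen)
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return kl * temperature ** 2  # conventional T^2 rescaling of gradients
```

In practice this term is mixed with the ordinary cross-entropy on ground-truth labels, and only the student's parameters receive gradients.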
35

Alimasi, Alimina, Hongchen Liu and Chengang Lyu. "Low Frequency Vibration Visual Monitoring System Based on Multi-Modal 3DCNN-ConvLSTM". Sensors 20, no. 20 (17.10.2020): 5872. http://dx.doi.org/10.3390/s20205872.

Abstract:
Low frequency vibration monitoring has significant implications for environmental safety and engineering practices. Vibration expressed as visual information should contain sufficient spatial information, and an RGB-D camera can record diverse spatial information of vibration in frame images. Deep learning can adaptively transform frame images into deep abstract features through nonlinear mapping, which is an effective way to make vibration monitoring more intelligent. In this paper, a multi-modal low frequency visual vibration monitoring system based on Kinect v2 and 3DCNN-ConvLSTM is proposed. A Microsoft Kinect v2 collects RGB and depth video information of vibrating objects under unstable ambient light. The 3DCNN-ConvLSTM architecture can effectively learn the spatial-temporal characteristics of multi-frequency vibration: the short-term spatiotemporal features of the collected vibration information are learned through 3D convolution networks, and the long-term spatiotemporal features are learned through convolutional LSTM. Multi-modal fusion of the RGB and depth modes further improves the monitoring accuracy to 93% in the low frequency vibration range of 0–10 Hz. The results show that the system can monitor low frequency vibration and meets the basic measurement requirements.
36

Zheng, Yijie, Jianxin Luo, Weiwei Chen, Yanyan Zhang, Haixun Sun and Zhisong Pan. "Unsupervised 3D Reconstruction with Multi-Measure and High-Resolution Loss". Sensors 23, no. 1 (23.12.2022): 136. http://dx.doi.org/10.3390/s23010136.

Abstract:
Multi-view 3D reconstruction technology based on deep learning is developing rapidly. Unsupervised learning has become a research hotspot because it does not need ground-truth labels. Current unsupervised methods mainly use a 3DCNN to regularize the cost volume and regress image depth, an approach that results in high memory requirements and long computing times. In this paper, we propose an end-to-end unsupervised multi-view 3D reconstruction network framework based on PatchMatch, Unsup_patchmatchnet, which dramatically reduces memory requirements and computing time. We propose a feature point consistency loss function and incorporate various self-supervised signals, such as photometric consistency loss and semantic consistency loss, into the loss function. At the same time, we propose a high-resolution loss method, which improves the reconstruction of high-resolution images. Experiments show that the memory usage of the network is reduced by 80% and the running time is reduced by more than 50% compared with networks using the 3DCNN method. The overall error of the reconstructed 3D point cloud is only 0.501 mm, which is superior to most current unsupervised multi-view 3D reconstruction networks. Tests on different datasets verify that the network generalizes well.
37

Alqaraghuli, Sarah Mohammed, and Oguz Karan. "Using Deep Learning Technology Based Energy-Saving For Software Defined Wireless Sensor Networks (SDWSN) Framework". Babylonian Journal of Artificial Intelligence 2024 (30.04.2024): 34–45. http://dx.doi.org/10.58496/bjai/2024/006.

Abstract:
This paper discusses the significance of Wireless Sensor Networks (WSNs) in collecting critical data from various environments, highlighting the challenges presented by the limited resources of small, highly mobile sensors. The integration of WSNs into the Internet of Things (IoT) enables the collection and transmission of data to centralized locations. Especially in complex network topologies, efficient routing of packets is crucial for optimizing resource utilization in WSN nodes. Software-Defined Networks (SDNs), in which a centralized controller makes routing decisions based on network and packet data, are replacing traditional static routing. Nevertheless, due to the complexity of WSN topologies and cost-effectiveness concerns, Machine Learning (ML) techniques are currently being used to improve SDWSN decision-making. This paper presents a technique that employs a neural network trained via Deep Reinforcement Learning (DRL) to extend the lifespan of WSNs by optimizing energy utilization through efficient routing. 2DCNN and 3DCNN neural networks are evaluated, with the 3DCNN showing superior performance and yielding an 18% increase in network lifespan. Additionally, the study emphasizes the significance of avoiding resource depletion in high-traffic nodes by considering alternative routing paths to preserve the lifespan of the network.
38

Miao, Sheng, Guoqing Ni, Guangze Kong, Xiuhe Yuan, Chao Liu, Xiang Shen and Weijun Gao. "A spatial interpolation method based on 3D-CNN for soil petroleum hydrocarbon pollution". PLOS ONE 20, no. 1 (24.01.2025): e0316940. https://doi.org/10.1371/journal.pone.0316940.

Abstract:
Petroleum hydrocarbon pollution causes significant damage to soil, so accurate prediction and early intervention are crucial for sustainable soil management. However, traditional soil analysis often relies on statistical methods, which depend on specific assumptions and are sensitive to outliers. Existing machine-learning-based methods convert features containing spatial information into one-dimensional vectors, resulting in the loss of some spatial features of the data. This study explores the application of Three-Dimensional Convolutional Neural Networks (3DCNN) to spatial interpolation for evaluating soil pollution. By introducing a Channel Attention Mechanism (CAM), the model assigns different weights to auxiliary variables, improving the prediction accuracy of soil hydrocarbon content. We collected soil pollution data and validated the spatial distribution map generated using this method against the drilling dataset. The results indicate that, compared with the traditional Kriging3D method (R2 = 0.318) and other machine learning methods such as support vector regression (R2 = 0.582), the proposed 3DCNN-based method achieves better accuracy (R2 = 0.954). This approach provides a sustainable tool for soil pollution management, supports decision-makers in developing effective remediation strategies, and promotes the sustainable development of spatial interpolation techniques in environmental science.
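The channel attention idea mentioned above (weighting auxiliary-variable channels by importance) can be sketched with a toy, parameter-free version. The real CAM learns these weights from data, so treat this purely as an illustration of the rescaling step:

```python
import math

# Toy channel attention: summarize each channel by global average pooling,
# softmax the summaries into weights, and rescale each channel by its weight
# so that more informative auxiliary variables contribute more.

def channel_attention(channels):
    """channels: list of equally sized value lists, one per auxiliary variable."""
    desc = [sum(c) / len(c) for c in channels]  # global average pool per channel
    exps = [math.exp(d) for d in desc]
    total = sum(exps)
    weights = [e / total for e in exps]         # softmax over channels
    return [[w * v for v in c] for w, c in zip(weights, channels)]

weighted = channel_attention([[0.0, 0.0], [1.0, 1.0]])
```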
39

Sanchez-Garcia, Ruben, Carlos Sorzano, Jose Carazo and Joan Segura. "3DCONS-DB: A Database of Position-Specific Scoring Matrices in Protein Structures". Molecules 22, no. 12 (15.12.2017): 2230. http://dx.doi.org/10.3390/molecules22122230.

40

Zhu, Maochang, Sheng Bin and Gengxin Sun. "Lite-3DCNN Combined with Attention Mechanism for Complex Human Movement Recognition". Computational Intelligence and Neuroscience 2022 (9.09.2022): 1–9. http://dx.doi.org/10.1155/2022/4816549.

Abstract:
The three-dimensional convolutional network (3DCNN) is an essential field of motion recognition research. This work optimizes the traditional three-dimensional convolution network, introduces a self-attention mechanism, and proposes a new network model to analyze and process complex human motion videos. In this study, average frame-skipping sampling, scaling, and one-hot encoding are used for data pre-processing to retain more features in the limited data. The experimental results show that the innovatively designed lightweight three-dimensional convolutional network combined with an attention mechanism framework reduces the number of model parameters by more than 90%, to only about 1.7 million. This study compared the performance of different models on different classifications and found that the proposed model performs well in complex human motion video classification; its recognition rate increased by 1%–8% compared with the C3D model.
41

Gionfrida, Letizia, Wan M. R. Rusli, Angela E. Kedgley and Anil A. Bharath. "A 3DCNN-LSTM Multi-Class Temporal Segmentation for Hand Gesture Recognition". Electronics 11, no. 15 (4.08.2022): 2427. http://dx.doi.org/10.3390/electronics11152427.

Abstract:
This paper introduces a multi-class hand gesture recognition model developed to identify a set of hand gesture sequences from two-dimensional RGB video recordings, using both the appearance and spatiotemporal parameters of consecutive frames. The classifier utilizes a convolutional network combined with a long short-term memory unit. To mitigate the need for a large-scale dataset, the model is first trained on a public dataset and then fine-tuned on the hand gestures of relevance, a technique known as transfer learning. Validation curves performed over a batch size of 64 indicate an accuracy of 93.95% (±0.37), with a mean Jaccard index of 0.812 (±0.105) for 22 participants. The fine-tuned architecture illustrates the possibility of refining a model with a small set of data (113,410 fully labelled image frames) to cover previously unknown hand gestures. The main contribution of this work is a custom hand gesture recognition network, driven by monocular RGB video sequences, that outperforms previous temporal segmentation models while embracing a small-sized architecture that facilitates wide adoption.
42

Zhou, Ying, Yanxin Song, Lei Chen, Yang Chen, Xianye Ben and Yewen Cao. "A novel micro-expression detection algorithm based on BERT and 3DCNN". Image and Vision Computing 119 (March 2022): 104378. http://dx.doi.org/10.1016/j.imavis.2022.104378.

43

Ullah, Hayat, and Arslan Munir. "Human Action Representation Learning Using an Attention-Driven Residual 3DCNN Network". Algorithms 16, no. 8 (31.07.2023): 369. http://dx.doi.org/10.3390/a16080369.

Abstract:
The recognition of human activities using vision-based techniques has become a crucial research field in video analytics. Over the last decade, there have been numerous advancements in deep learning algorithms aimed at accurately detecting complex human actions in video streams. While these algorithms have demonstrated impressive performance in activity recognition, they often exhibit a bias towards either model performance or computational efficiency. This biased trade-off between robustness and efficiency poses challenges when addressing complex human activity recognition problems. To address this issue, this paper presents a computationally efficient yet robust approach, exploiting saliency-aware spatial and temporal features for human action recognition in videos. To achieve effective representation of human actions, we propose an efficient approach called the dual-attentional Residual 3D Convolutional Neural Network (DA-R3DCNN). Our proposed method utilizes a unified channel-spatial attention mechanism, allowing it to efficiently extract significant human-centric features from video frames. By combining dual channel-spatial attention layers with residual 3D convolution layers, the network becomes more discerning in capturing spatial receptive fields containing objects within the feature maps. To assess the effectiveness and robustness of our proposed method, we have conducted extensive experiments on four well-established benchmark datasets for human action recognition. The quantitative results obtained validate the efficiency of our method, showcasing significant improvements in accuracy of up to 11% as compared to state-of-the-art human action recognition methods. Additionally, our evaluation of inference time reveals that the proposed method achieves up to a 74× improvement in frames per second (FPS) compared to existing approaches, thus showing the suitability and effectiveness of the proposed DA-R3DCNN for real-time human activity recognition.
44

Chen, Youqiang, Ridong Zhang and Furong Gao. "Fault diagnosis of industrial process using attention mechanism with 3DCNN-LSTM". Chemical Engineering Science 293 (July 2024): 120059. http://dx.doi.org/10.1016/j.ces.2024.120059.

45

Chen, Suting, Song Zhang, Huantong Geng, Yaodeng Chen, Chuang Zhang and Jinzhong Min. "Strong Spatiotemporal Radar Echo Nowcasting Combining 3DCNN and Bi-Directional Convolutional LSTM". Atmosphere 11, no. 6 (29.05.2020): 569. http://dx.doi.org/10.3390/atmos11060569.

Abstract:
In order to solve the problems of spatiotemporal information loss and low forecast accuracy in traditional radar echo nowcasting, this paper proposes an encoding-forecasting model (3DCNN-BCLSTM) combining a 3DCNN with bi-directional convolutional long short-term memory. The model first structures the input data into 3D tensors with spatiotemporal features, extracts local short-term spatiotemporal features of radar echoes through 3D convolution networks, then uses the constructed bi-directional convolutional LSTM to learn global long-term spatiotemporal dependencies, and finally forecasts echo image changes with a forecasting network. This structure fully captures the spatiotemporal correlation of radar echoes in continuous motion and yields a more accurate forecast of the short-term movement of radar echoes within a region. Radar echo images recorded by the Shenzhen and Hong Kong meteorological stations are used in the experiments. The results show that the critical success index (CSI) of the proposed model for eight predicted echoes reaches 0.578 at an echo threshold of 10 dBZ, the false alarm ratio (FAR) is 20% lower than that of the convolutional LSTM network (ConvLSTM), and the mean square error (MSE) is 16% lower than that of real-time optical flow by variational methods (ROVER), outperforming current state-of-the-art radar echo nowcasting methods.
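The CSI and FAR scores quoted in the abstract follow the standard forecast-verification contingency table (hits, misses, false alarms), thresholded in dBZ. A minimal sketch; the sample reflectivity values are illustrative:

```python
def csi_far(pred_dbz, obs_dbz, threshold=10.0):
    """Critical success index (CSI) and false alarm ratio (FAR) for a
    radar echo nowcast. A pixel counts as an echo when its reflectivity
    meets the dBZ threshold."""
    hits = misses = false_alarms = 0
    for p, o in zip(pred_dbz, obs_dbz):
        pred_echo, obs_echo = p >= threshold, o >= threshold
        if pred_echo and obs_echo:
            hits += 1
        elif obs_echo:
            misses += 1
        elif pred_echo:
            false_alarms += 1
    csi = hits / (hits + misses + false_alarms)
    far = false_alarms / (hits + false_alarms)
    return csi, far

# 4 pixels: one hit, one miss, one false alarm, one correct negative
csi, far = csi_far([12, 5, 20, 8], [15, 12, 3, 5])
```

CSI rewards hits while penalizing both misses and false alarms, which is why it is the headline metric for echo thresholds like the 10 dBZ case above.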
46

Li, Huiguang, Hanzhao Guo, and Hong Huang. "Analytical Model of Action Fusion in Sports Tennis Teaching by Convolutional Neural Networks". Computational Intelligence and Neuroscience 2022 (31 July 2022): 1–8. http://dx.doi.org/10.1155/2022/7835241.

Abstract:
In order to improve the effectiveness of tennis teaching and enhance students' understanding and mastery of standard tennis movements, the problem of action recognition is studied in depth on the basis of a three-dimensional (3D) convolutional neural network architecture. First, the recognition of human poses in tennis videos is carried out with OpenPose, and an athlete-tracking algorithm is designed to follow the players. Based on the tracking data and the movement characteristics of tennis, real-time semantic analysis discriminates movement types from the displacement of human key points. Second, the types of tennis movements are analyzed through 2D pose estimation of the players. Finally, for player action recognition, a lightweight multiscale convolutional model is proposed, together with a key frame segment network (KFSN) that fuses local information around keyframes; the network improves the efficiency of learning from whole action videos. In simulation experiments on the public UCF101 dataset, the proposed 3DCNN-based KFSN achieves a recognition rate of 94.8%, the average time per iteration is only 1/3 of that of the C3D network, and the model converges significantly faster. The discussed 3DCNN-based information-fusion recognition method can effectively improve the recognition of tennis actions and students' learning and understanding of the movements during teaching.
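Segment-based keyframe sampling of the kind the KFSN abstract describes can be sketched in a few lines. This is a hypothetical illustration (the paper's actual keyframe selection may differ): the video is split into equal segments and the center frame of each is taken as that segment's representative.

```python
def keyframe_segments(num_frames, num_segments):
    """Return one frame index per segment: split a video of
    `num_frames` frames into `num_segments` equal segments and pick
    the center frame of each. Illustrative sketch of keyframe-based
    segment sampling, not the published KFSN selection rule."""
    seg_len = num_frames / num_segments
    return [int(i * seg_len + seg_len / 2) for i in range(num_segments)]

frames = keyframe_segments(100, 4)   # 4 keyframes from a 100-frame clip
```

Training on a few keyframe segments instead of every frame is what lets such networks cut per-iteration time while still covering the whole action.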
47

Li, Zhengdao, Yupei Zhang, Hanwen Xing, and Kwok-Leung Chan. "Facial Micro-Expression Recognition Using Double-Stream 3D Convolutional Neural Network with Domain Adaptation". Sensors 23, no. 7 (29 March 2023): 3577. http://dx.doi.org/10.3390/s23073577.

Abstract:
Humans show micro-expressions (MEs) under some circumstances. MEs are a display of emotions that a human wants to conceal. The recognition of MEs has been applied in various fields. However, automatic ME recognition remains a challenging problem due to two major obstacles. As MEs are typically of short duration and low intensity, it is hard to extract discriminative features from ME videos. Moreover, it is tedious to collect ME data. Existing ME datasets usually contain insufficient video samples. In this paper, we propose a deep learning model, double-stream 3D convolutional neural network (DS-3DCNN), for recognizing MEs captured in video. The recognition framework contains two streams of 3D-CNN. The first extracts spatiotemporal features from the raw ME videos. The second extracts variations of the facial motions within the spatiotemporal domain. To facilitate feature extraction, the subtle motion embedded in a ME is amplified. To address the insufficient ME data, a macro-expression dataset is employed to expand the training sample size. Supervised domain adaptation is adopted in model training in order to bridge the difference between ME and macro-expression datasets. The DS-3DCNN model is evaluated on two publicly available ME datasets. The results show that the model outperforms various state-of-the-art models; in particular, the model outperformed the best model presented in MEGC2019 by more than 6%.
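The double-stream input with amplified subtle motion can be sketched as follows. This is a simplified stand-in, not the paper's magnification method: here the motion stream linearly amplifies each frame's deviation from the temporal mean, with an illustrative gain `alpha`.

```python
import numpy as np

def two_stream_inputs(frames, alpha=4.0):
    """Build the two inputs of a double-stream 3D-CNN from a clip of
    shape (T, H, W): the raw frame volume, and a motion stream whose
    deviations from the temporal mean are amplified by `alpha`
    (a crude proxy for subtle-motion magnification)."""
    mean = frames.mean(axis=0, keepdims=True)
    motion = mean + alpha * (frames - mean)
    return frames, motion

clip = np.array([[[0.0]], [[1.0]]])     # 2 frames of 1x1 pixels
raw, motion = two_stream_inputs(clip)
```

Amplifying deviations before the second 3D-CNN stream makes low-intensity facial motion large enough for spatiotemporal filters to pick up, which is the motivation the abstract gives for motion amplification.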
48

Lin, Min-Wen, Shanq-Jang Ruan i Ya-Wen Tu. "A 3DCNN-LSTM Hybrid Framework for sEMG-Based Noises Recognition in Exercise". IEEE Access 8 (2020): 162982–88. http://dx.doi.org/10.1109/access.2020.3021344.

49

Liu, Yiqing, Tao Zhang i Zhen Li. "3DCNN-Based Real-Time Driver Fatigue Behavior Detection in Urban Rail Transit". IEEE Access 7 (2019): 144648–62. http://dx.doi.org/10.1109/access.2019.2945136.

50

Pramanto, Haryo, and Suharjito Suharjito. "Continuous Sign Language Recognition Using Combination of Two Stream 3DCNN and SubUNet". JURNAL TEKNIK INFORMATIKA 16, no. 2 (22 December 2023): 170–84. http://dx.doi.org/10.15408/jti.v16i2.27030.

Abstract:
Research on sign language recognition using deep learning has been carried out by many researchers in computer science, but obstacles remain in reaching the expected level of accuracy. Many researchers who intend to work on Continuous Sign Language Recognition end up confined to Isolated Sign Language Recognition instead. The purpose of this study was to find the best method for performing Continuous Sign Language Recognition using deep learning. The RWTH-PHOENIX-Weather 2014 dataset was used in this study; it was selected through a literature study of datasets commonly used in Continuous Sign Language Recognition research and served to develop the proposed method. A combination of 3DCNN, LSTM, and CTC models forms part of the proposed architecture. The collected dataset is also converted into optical-flow frame sequences, used as a two-stream input alongside the original RGB frame sequences. Word Error Rate on the predictions is used to assess the performance of the developed method. The best Word Error Rate achieved in this research is 94.1%, using the C3D BLSTM CTC model with spatio-stream input.
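The Word Error Rate reported above is the standard metric for continuous sign language recognition: edit distance over word tokens divided by the reference length. A minimal stdlib sketch; the example sentences are illustrative, not from the dataset:

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of
    reference words, computed via Levenshtein alignment over tokens."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                       # delete all reference words
    for j in range(len(h) + 1):
        d[0][j] = j                       # insert all hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution/match
    return d[len(r)][len(h)] / len(r)

wer = word_error_rate("the weather will be cold", "the weather is cold")
```

One substitution ("will" → "is") plus one deletion ("be") over five reference words gives a WER of 0.4; lower is better, which puts the 94.1% figure above in context.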