Journal articles on the topic 'DNN architecture'

Consult the top 50 journal articles for your research on the topic 'DNN architecture.'

1

Roorda, Esther, Seyedramin Rasoulinezhad, Philip H. W. Leong, and Steven J. E. Wilton. "FPGA Architecture Exploration for DNN Acceleration." ACM Transactions on Reconfigurable Technology and Systems 15, no. 3 (September 30, 2022): 1–37. http://dx.doi.org/10.1145/3503465.

Abstract:
Recent years have seen an explosion of machine learning applications implemented on Field-Programmable Gate Arrays (FPGAs). FPGA vendors and researchers have responded by updating their fabrics to more efficiently implement machine learning accelerators, including innovations such as enhanced Digital Signal Processing (DSP) blocks and hardened systolic arrays. Evaluating architectural proposals is difficult, however, due to the lack of publicly available benchmark circuits. This paper addresses this problem by presenting an open-source benchmark circuit generator that creates realistic DNN-oriented circuits for use in FPGA architecture studies. Unlike previous generators, which create circuits that are agnostic of the underlying FPGA, our circuits explicitly instantiate embedded blocks, allowing for meaningful comparison of recent architectural proposals without the need for a complete inference computer-aided design (CAD) flow. Our circuits are compatible with the VTR CAD suite, allowing for architecture studies that investigate routing congestion and other low-level architectural implications. In addition to addressing the lack of machine learning benchmark circuits, the architecture exploration flow that we propose allows for a more comprehensive evaluation of FPGA architectures than traditional static benchmark suites. We demonstrate this through three case studies which illustrate how realistic benchmark circuits can be generated to target different heterogeneous FPGAs.
2

Elola, Andoni, Elisabete Aramendi, Unai Irusta, Artzai Picón, Erik Alonso, Pamela Owens, and Ahamed Idris. "Deep Neural Networks for ECG-Based Pulse Detection during Out-of-Hospital Cardiac Arrest." Entropy 21, no. 3 (March 21, 2019): 305. http://dx.doi.org/10.3390/e21030305.

Abstract:
The automatic detection of pulse during out-of-hospital cardiac arrest (OHCA) is necessary for the early recognition of the arrest and the detection of return of spontaneous circulation (end of the arrest). The only signal available in every single defibrillator and valid for the detection of pulse is the electrocardiogram (ECG). In this study we propose two deep neural network (DNN) architectures to detect pulse using short ECG segments (5 s), i.e., to classify the rhythm into pulseless electrical activity (PEA) or pulse-generating rhythm (PR). A total of 3914 5-s ECG segments, 2372 PR and 1542 PEA, were extracted from 279 OHCA episodes. Data were partitioned patient-wise into training (80%) and test (20%) sets. The first DNN architecture was a fully convolutional neural network, and the second architecture added a recurrent layer to learn temporal dependencies. Both DNN architectures were tuned using Bayesian optimization, and the results for the test set were compared to state-of-the-art PR/PEA discrimination algorithms based on machine learning and hand-crafted features. The PR/PEA classifiers were evaluated in terms of sensitivity (Se) for PR, specificity (Sp) for PEA, and the balanced accuracy (BAC), the average of Se and Sp. The Se/Sp/BAC of the DNN architectures were 94.1%/92.9%/93.5% for the first one, and 95.5%/91.6%/93.5% for the second one. Both architectures improved the performance of state-of-the-art methods by more than 1.5 points in BAC.
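For readers reproducing this kind of evaluation, here is a minimal sketch of the Se/Sp/BAC metrics exactly as defined in the abstract (an illustrative implementation with toy labels, not the authors' code):

```python
import numpy as np

def pr_pea_metrics(y_true, y_pred):
    """Se for PR, Sp for PEA, and BAC = (Se + Sp) / 2.
    Labels: 1 = pulse-generating rhythm (PR), 0 = pulseless electrical activity (PEA)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    se = np.mean(y_pred[y_true == 1] == 1)  # sensitivity: PR segments detected as PR
    sp = np.mean(y_pred[y_true == 0] == 0)  # specificity: PEA segments detected as PEA
    return se, sp, (se + sp) / 2

# Toy example over ten 5-s ECG segments
se, sp, bac = pr_pea_metrics([1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
                             [1, 1, 1, 1, 0, 0, 0, 0, 1, 0])
print(f"Se={se:.1%}  Sp={sp:.1%}  BAC={bac:.1%}")
```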
3

Tran, Van Duy, Duc Khai Lam, and Thi Hong Tran. "Hardware-Based Architecture for DNN Wireless Communication Models." Sensors 23, no. 3 (January 23, 2023): 1302. http://dx.doi.org/10.3390/s23031302.

Abstract:
Multiple Input Multiple Output Orthogonal Frequency Division Multiplexing (MIMO OFDM) is a key technology for wireless communication systems. However, because of the problem of a high peak-to-average power ratio (PAPR), OFDM symbols can be distorted at the MIMO OFDM transmitter. This degrades the signal detection and channel estimation performance at the MIMO OFDM receiver. In this paper, three deep neural network (DNN) models are proposed to solve the problem of non-linear distortions introduced by the power amplifier (PA) of the transmitters and to replace the conventional digital signal processing (DSP) modules at the receivers in 2 × 2 MIMO OFDM and 4 × 4 MIMO OFDM systems. Proposed model type I uses the DNN model to de-map the signals at the receiver. Proposed model type II uses the DNN model to learn and filter out the channel noises at the receiver. Proposed model type III uses the DNN model to de-map and detect the signals at the receiver. All three model types attempt to solve the non-linear problem. Robust bit error rate (BER) performance of the proposed receivers is demonstrated through software and hardware implementation results. In addition, we have implemented appropriate hardware architectures for the proposed DNN models using techniques such as quantization and pipelining to check their feasibility in practice, which recent studies have not done. Our hardware architectures are successfully designed and implemented on the Virtex 7 vc709 FPGA board.
4

Turner, Daniel, Pedro J. S. Cardoso, and João M. F. Rodrigues. "Modular Dynamic Neural Network: A Continual Learning Architecture." Applied Sciences 11, no. 24 (December 18, 2021): 12078. http://dx.doi.org/10.3390/app112412078.

Abstract:
Learning to recognize a new object after having learned to recognize other objects may be a simple task for a human, but not for machines. The present go-to approaches for teaching a machine to recognize a set of objects are based on the use of deep neural networks (DNN). So, intuitively, the solution for teaching new objects on the fly to a machine should also be a DNN. The problem is that the trained DNN weights used to classify the initial set of objects are extremely fragile, meaning that any change to those weights can severely damage the capacity to perform the initial recognitions; this phenomenon is known as catastrophic forgetting (CF). This paper presents a new DNN continual learning (CL) architecture that can deal with CF, the modular dynamic neural network (MDNN). The presented architecture consists of two main components: (a) the ResNet50-based feature extraction component as the backbone; and (b) the modular dynamic classification component, which consists of multiple sub-networks and progressively builds itself up in a tree-like structure that rearranges itself as it learns over time, in such a way that each sub-network can function independently. The main contribution of the paper is a new architecture whose strength lies in its modular dynamic training feature. This modular structure allows new classes to be added while altering only specific sub-networks, in such a way that previously known classes are not forgotten. Tests on the CORe50 dataset showed results above the state of the art for CL architectures.
5

Lee, Junghwan, Huanli Sun, Yuxia Liu, Xue Li, Yixin Liu, and Myungjun Kim. "State-of-Health Estimation and Anomaly Detection in Li-Ion Batteries Based on a Novel Architecture with Machine Learning." Batteries 9, no. 5 (May 8, 2023): 264. http://dx.doi.org/10.3390/batteries9050264.

Abstract:
Variations across cells, modules, packs, and vehicles can cause significant errors in the state estimation of lithium-ion batteries (LIBs) using machine learning algorithms, especially when trained with small datasets. Training with large datasets that account for all variations is often impractical due to resource and time constraints at initial product release. To address this issue, we proposed a novel architecture that leverages electronic control units, edge computers, and the cloud to detect unrevealed variations and abnormal degradations in LIBs. The architecture comprised a generalized deep neural network (DNN) for generalizability, a personalized DNN for accuracy within a vehicle, and a detector. We emphasized that a generalized DNN trained with small datasets must show reasonable estimation accuracy during cross validation, which is critical for real applications before online training. We demonstrated the feasibility of the architecture by conducting experiments on 65 DNN models, where we found distinct hyperparameter configurations. The results showed that the personalized DNN achieves a root mean square error (RMSE) of 0.33%, while the generalized DNN achieves an RMSE of 4.6%. Finally, the Mahalanobis distance was used to assess the state-of-health (SOH) differences between the generalized DNN and the personalized DNN to detect abnormal degradations.
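The Mahalanobis-distance check mentioned at the end of the abstract can be illustrated with a short sketch; the assumption that the detector scores pairs of (generalized, personalized) SOH estimates against their distribution under normal degradation is mine, and all numbers are toy values:

```python
import numpy as np

# Hypothetical history of paired SOH estimates (%) under normal degradation:
# column 0 = generalized-DNN estimate, column 1 = personalized-DNN estimate.
normal = np.array([[96.1, 95.8], [94.9, 95.2], [93.7, 93.5],
                   [92.8, 93.1], [91.6, 91.4], [90.5, 90.9]])
mu = normal.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(normal, rowvar=False))

def mahalanobis(x):
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

# A large distance means the two estimators disagree abnormally, which the
# architecture would flag as a candidate abnormal degradation.
print(mahalanobis(np.array([92.0, 92.3])))  # consistent pair -> small distance
print(mahalanobis(np.array([91.0, 84.0])))  # disagreeing pair -> large distance
```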
6

Mudgil, Pooja, Pooja Gupta, Iti Mathur, and Nisheeth Joshi. "An ontological architecture for context data retrieval and ranking using SVM and DNN." Journal of Information and Optimization Sciences 44, no. 3 (2023): 369–82. http://dx.doi.org/10.47974/jios-1347.

Abstract:
Context retrieval and ranking have always been areas of interest for researchers around the world. Ranking gives significance to the data presented to users, but it also consumes time if the ranking architecture is not well organized. Retrieval depends upon the correlation among the data attributes that are supplied against a class label, also referred to as ground truth, while ranking depends upon the sensing polarity, which indicates how strongly the outcome relates to the requested information. This paper illustrates an ontological architecture that involves two phases, namely context retrieval and ranking. The ranking phase is composed of three different algorithm architectures, namely k-means, Support Vector Machines (SVM), and Deep Neural Networks (DNN). The DNN is tuned to fit and work as per the total number of available samples. The proposed work has been evaluated on both quantitative and qualitative parameters in different sets and scenarios, and has also been compared in the paper with other state-of-the-art techniques.
7

Elsisi, Mahmoud, and Minh-Quang Tran. "Development of an IoT Architecture Based on a Deep Neural Network against Cyber Attacks for Automated Guided Vehicles." Sensors 21, no. 24 (December 18, 2021): 8467. http://dx.doi.org/10.3390/s21248467.

Abstract:
This paper introduces an integrated IoT architecture to handle the problem of cyber attacks, based on a deep neural network (DNN) with rectified linear units, in order to provide reliable and secure online monitoring for automated guided vehicles (AGVs). The developed IoT architecture based on a DNN introduces a new approach for the online monitoring of AGVs against cyber attacks with a cheap and easy implementation, in contrast to the traditional cyber attack detection schemes in the literature. The proposed DNN is trained on experimental AGV data that represent the real state of the AGV and different types of cyber attacks, including random, ramp, pulse, and sinusoidal attacks injected by the attacker into the network. The proposed DNN is compared with different deep learning and machine learning algorithms, such as a one-dimensional convolutional neural network (1D-CNN), a support vector machine (SVM), random forest, extreme gradient boosting (XGBoost), and a decision tree, for further validation. Furthermore, the proposed IoT architecture based on a DNN provides effective detection of the AGV status with an excellent accuracy of 96.77%, significantly greater than the accuracy of the traditional schemes. The AGV status based on the proposed IoT architecture with a DNN is visualized by an advanced IoT platform named CONTACT Elements for IoT. Different test scenarios with a practical setup of an AGV with IoT are carried out to demonstrate the performance of the suggested IoT architecture. The results confirm the usefulness of the proposed IoT architecture in providing effective cybersecurity, data visualization, and tracking of the AGV status, which enhances decision-making and improves industrial productivity.
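To make the four attack types concrete, here is a hedged sketch of how such injections are commonly simulated on a clean measurement stream; the exact signal shapes are my assumptions of the usual definitions, not the authors' generator:

```python
import numpy as np

def inject(signal, kind, rng=None):
    """Return a copy of `signal` with a synthetic cyber-attack added."""
    rng = rng or np.random.default_rng(0)
    t = np.linspace(0, 1, signal.size)
    attacks = {
        "random":     rng.normal(0.0, 0.5, signal.size),          # random attack
        "ramp":       2.0 * t,                                    # ramp attack
        "pulse":      np.where((t > 0.4) & (t < 0.5), 3.0, 0.0),  # pulse attack
        "sinusoidal": np.sin(2 * np.pi * 5 * t),                  # sinusoidal attack
    }
    return signal + attacks[kind]

clean = np.ones(200)  # stand-in for one AGV measurement channel
for kind in ("random", "ramp", "pulse", "sinusoidal"):
    print(kind, inject(clean, kind)[:3].round(2))
```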
8

P, Shanmugavadivu, Mary Shanthi Rani M, Chitra P, Lakshmanan S, Nagaraja P, and Vignesh U. "Bio-Optimization of Deep Learning Network Architectures." Security and Communication Networks 2022 (September 20, 2022): 1–11. http://dx.doi.org/10.1155/2022/3718340.

Abstract:
Deep learning is reaching new heights as a result of its cutting-edge performance in a variety of fields, including computer vision, natural language processing, time series analysis, and healthcare. Deep learning is commonly implemented using batch and stochastic gradient descent methods with a few standard optimizers, which can lead to subpar model performance; considerable effort is therefore being devoted to improving deep learning performance through gradient optimization methods. The suggested work analyses convolutional neural networks (CNN) and deep neural networks (DNN) using several cutting-edge optimizers to enhance the performance of the architectures. This work uses specific optimizers (SGD, RMSprop, Adam, Adadelta, etc.) to enhance the performance of designs on different types of datasets for result matching. A thorough report on the optimizers' performance across a variety of architectures and datasets concludes the study. This research will help researchers in developing their frameworks and choosing appropriate architecture optimizers. The proposed work involves eight optimizers applied to four CNN and DNN architectures. The experimental results demonstrate improvements in the efficiency of CNN and DNN architectures across various datasets.
9

Krishnan, Gokul, Sumit K. Mandal, Chaitali Chakrabarti, Jae-Sun Seo, Umit Y. Ogras, and Yu Cao. "Impact of On-chip Interconnect on In-memory Acceleration of Deep Neural Networks." ACM Journal on Emerging Technologies in Computing Systems 18, no. 2 (April 30, 2022): 1–22. http://dx.doi.org/10.1145/3460233.

Abstract:
With the widespread use of Deep Neural Networks (DNNs), machine learning algorithms have evolved in two diverse directions—one with ever-increasing connection density for better accuracy and the other with more compact sizing for energy efficiency. The increase in connection density increases on-chip data movement, which makes efficient on-chip communication a critical function of the DNN accelerator. The contribution of this work is threefold. First, we illustrate that the point-to-point (P2P)-based interconnect is incapable of handling a high volume of on-chip data movement for DNNs. Second, we evaluate P2P and network-on-chip (NoC) interconnect (with a regular topology such as a mesh) for SRAM- and ReRAM-based in-memory computing (IMC) architectures for a range of DNNs. This analysis shows the necessity for the optimal interconnect choice for an IMC DNN accelerator. Finally, we perform an experimental evaluation for different DNNs to empirically obtain the performance of the IMC architecture with both NoC-tree and NoC-mesh. We conclude that, at the tile level, NoC-tree is appropriate for compact DNNs employed at the edge, and NoC-mesh is necessary to accelerate DNNs with high connection density. Furthermore, we propose a technique to determine the optimal choice of interconnect for any given DNN. In this technique, we use analytical models of NoC to evaluate end-to-end communication latency of any given DNN. We demonstrate that the interconnect optimization in the IMC architecture results in up to 6× improvement in energy-delay-area product for VGG-19 inference compared to the state-of-the-art ReRAM-based IMC architectures.
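As a toy illustration of the tree-versus-mesh trade-off discussed above, the following hop-count comparison uses a simple uniform-traffic model; this is my simplification for intuition only, not the paper's analytical NoC latency model:

```python
import math

def mesh_avg_hops(k):
    """Exact mean Manhattan distance between two tiles drawn uniformly
    at random from a k x k mesh: 2 * (k*k - 1) / (3*k) hops."""
    return 2 * (k * k - 1) / (3 * k)

def tree_worst_hops(n_tiles):
    """Worst case in a binary NoC-tree: up to the root and back down."""
    return 2 * math.ceil(math.log2(n_tiles))

for n in (16, 64, 256):
    k = math.isqrt(n)
    print(f"{n:3d} tiles | mesh avg hops: {mesh_avg_hops(k):5.2f}"
          f" | tree worst-case hops: {tree_worst_hops(n)}")

# Hop count alone favors the tree at scale, but all cross-subtree traffic
# funnels through the root links, which is why high connection density
# congests a NoC-tree and makes the NoC-mesh the better choice there.
```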
10

Zhao, Jiaqi, Ming Xu, Yunzhi Chen, and Guoliang Xu. "A DNN Architecture Generation Method for DDoS Detection via Genetic Algorithm." Future Internet 15, no. 4 (March 26, 2023): 122. http://dx.doi.org/10.3390/fi15040122.

Abstract:
Nowadays, DNNs (Deep Neural Networks) are widely used in the field of DDoS attack detection. However, designing a good DNN architecture relies on the designer's experience and requires considerable work. In this paper, a GA (genetic algorithm) is used to automatically generate the DNN architecture for DDoS detection, minimizing human intervention in the design process. Furthermore, given the complexity of contemporary networks and the diversity of DDoS attacks, the objective of this paper is to generate a DNN model with superior performance, real-time capability, and generalization ability to tackle intricate network scenarios. This paper presents a fitness function that guarantees the best generated model possesses a specified level of real-time capability, and the proposed method employs multiple datasets to train the generated models jointly, thereby enhancing generalization performance. This paper conducts several experiments to validate the viability of the proposed method. Firstly, the best model generated with one dataset is compared with existing DNN models on the CICDDoS2019 dataset. The experimental results indicate that the model generated with one dataset has higher precision and F1-score than the existing DNN models. Secondly, model generation experiments are conducted on the CICIDS2017 and CICIDS2018 datasets, and the best generated model still performs well. Finally, this paper conducts comparative experiments on multiple datasets using the best model generated with six datasets and the best model generated by existing methods. The experimental results demonstrate that the best model generated with six datasets has better generalization ability and real-time capability.
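A compact sketch of the kind of GA loop the abstract describes, with a fitness that rejects candidates violating a latency budget; the genome encoding, surrogate scores, and budget are illustrative assumptions, not the paper's settings:

```python
import random
random.seed(0)

WIDTHS = [32, 64, 128, 256]     # candidate hidden-layer widths (assumed encoding)
LATENCY_BUDGET_MS = 2.0         # assumed real-time constraint per sample

def evaluate(genome):
    # Placeholder: the real method trains/validates a DNN on DDoS datasets
    # and measures inference latency; here we use toy surrogates.
    f1 = 0.9 - 0.001 * abs(sum(genome) - 300)
    latency_ms = 0.005 * sum(genome)
    return 0.0 if latency_ms > LATENCY_BUDGET_MS else f1  # real-time penalty

population = [[random.choice(WIDTHS) for _ in range(3)] for _ in range(8)]
for _ in range(20):                                   # generations
    population.sort(key=evaluate, reverse=True)
    parents = population[:4]
    children = []
    for _ in range(4):                                # uniform crossover
        a, b = random.sample(parents, 2)
        children.append([random.choice(genes) for genes in zip(a, b)])
    for child in children:                            # mutation
        if random.random() < 0.3:
            child[random.randrange(len(child))] = random.choice(WIDTHS)
    population = parents + children

print("best architecture (layer widths):", max(population, key=evaluate))
```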
11

Kulkarni, Uday, Shreya B. Devagiri, Rohit B. Devaranavadagi, Sneha Pamali, Nishanth R. Negalli, and V. Prabakaran. "Depth Estimation using DNN Architecture and Vision-Based Transformers." ITM Web of Conferences 53 (2023): 02010. http://dx.doi.org/10.1051/itmconf/20235302010.

Abstract:
Depth estimation is the process of estimating the depth of objects with a 2D image as an input. Its importance lies in providing critical information about the 3D structure of a scene given a sequence of 2D images, which is helpful in various applications such as robotics, virtual reality, autonomous driving, and medical imaging. Our main objective here is to make the environment more perceivable by autonomous vehicles. Unlike traditional approaches, which try to extract depth from the input image directly, we apply instance segmentation to the image and then use the segmented image as the input to the depth map generator. In this paper, we use a fully connected neural network with dense prediction transformers; the image after instance segmentation is given as the input to the transformers. The transformer is the backbone for producing high-resolution images. Our experimental results have shown many improvements in the detail and sharpness of the image compared to traditional depth estimation techniques. The architecture is applied and tested on the KITTI dataset.
12

Aspri, Maria, Grigorios Tsagkatakis, and Panagiotis Tsakalides. "Distributed Training and Inference of Deep Learning Models for Multi-Modal Land Cover Classification." Remote Sensing 12, no. 17 (August 19, 2020): 2670. http://dx.doi.org/10.3390/rs12172670.

Abstract:
Deep Neural Networks (DNNs) have established themselves as a fundamental tool in numerous computational modeling applications, overcoming the challenge of defining use-case-specific feature extraction processing by incorporating this stage into unified end-to-end trainable models. Despite their capabilities in modeling, training large-scale DNN models is a very computation-intensive task that most single machines are often incapable of accomplishing. To address this issue, different parallelization schemes have been proposed. Nevertheless, network overheads as well as optimal resource allocation pose major challenges, since network communication is generally slower than intra-machine communication while some layers are more computationally expensive than others. In this work, we consider a novel multimodal DNN based on the Convolutional Neural Network architecture and explore several different ways to optimize its performance when training is executed on an Apache Spark cluster. We evaluate the performance of different architectures via the metrics of network traffic and processing power, considering the case of land cover classification from remote sensing observations. Furthermore, we compare our architectures with an identical DNN architecture modeled after a data parallelization approach by using the metrics of classification accuracy and inference execution time. The experiments show that the way a model is parallelized has a tremendous effect on resource allocation and that hyperparameter tuning can reduce network overheads. Experimental results also demonstrate that the proposed model parallelization schemes achieve more efficient resource use and more accurate predictions compared to data parallelization approaches.
13

Shu, Deqin, Hao Fan, and Liang Zhang. "Research on the Overview of Image Processing Architecture of Computer Based Deep Neural Network Accelerator." Journal of Physics: Conference Series 2074, no. 1 (November 1, 2021): 012010. http://dx.doi.org/10.1088/1742-6596/2074/1/012010.

Abstract:
The DNN algorithm still has many shortcomings in operation that need to be further addressed. Specifically, DNNs involve extensive data reuse, and repeated accesses to the global cache take up considerable resources and computation, thus reducing the efficiency of operation. Based on this, this paper first analyses the research status and value of the DNN accelerator, then studies the image processing architecture of the DNN accelerator, and finally gives an analysis of the computer DNN model and acceleration algorithm.
14

Tao, Zhe, Stephanie Nawas, Jacqueline Mitchell, and Aditya V. Thakur. "Architecture-Preserving Provable Repair of Deep Neural Networks." Proceedings of the ACM on Programming Languages 7, PLDI (June 6, 2023): 443–67. http://dx.doi.org/10.1145/3591238.

Abstract:
Deep neural networks (DNNs) are becoming increasingly important components of software, and are considered the state-of-the-art solution for a number of problems, such as image recognition. However, DNNs are far from infallible, and incorrect behavior of DNNs can have disastrous real-world consequences. This paper addresses the problem of architecture-preserving V-polytope provable repair of DNNs. A V-polytope defines a convex bounded polytope using its vertex representation. V-polytope provable repair guarantees that the repaired DNN satisfies the given specification on the infinite set of points in the given V-polytope. An architecture-preserving repair only modifies the parameters of the DNN, without modifying its architecture. The repair has the flexibility to modify multiple layers of the DNN, and runs in polynomial time. It supports DNNs with activation functions that have some linear pieces, as well as fully-connected, convolutional, pooling and residual layers. To the best of our knowledge, this is the first provable repair approach that has all of these features. We implement our approach in a tool called APRNN. Using MNIST, ImageNet, and ACAS Xu DNNs, we show that it has better efficiency, scalability, and generalization compared to PRDNN and REASSURE, prior provable repair methods that are not architecture preserving.
15

Kapočiūtė-Dzikienė, Jurgita, Kaspars Balodis, and Raivis Skadiņš. "Intent Detection Problem Solving via Automatic DNN Hyperparameter Optimization." Applied Sciences 10, no. 21 (October 22, 2020): 7426. http://dx.doi.org/10.3390/app10217426.

Abstract:
Accurate intent detection-based chatbots are usually trained on larger datasets that are not available for some languages. Seeking the most accurate models, three English benchmark datasets that were human-translated into four morphologically complex languages (i.e., Estonian, Latvian, Lithuanian, Russian) were used. Two types of word embeddings (fastText and BERT), three types of deep neural network (DNN) classifiers (convolutional neural network (CNN); long short-term memory method (LSTM), and bidirectional LSTM (BiLSTM)), different DNN architectures (shallower and deeper), and various DNN hyperparameter values were investigated. DNN architecture and hyperparameter values were optimized automatically using the Bayesian method and random search. On three datasets of 2/5/8 intents for English, Estonian, Latvian, Lithuanian, and Russian languages, accuracies of 0.991/0.890/0.712, 0.972/0.890/0.644, 1.000/0.890/0.644, 0.981/0.872/0.712, and 0.972/0.881/0.661 were achieved, respectively. The BERT multilingual vectorization with the CNN classifier was proven to be a good choice for all datasets for all languages. Moreover, in the majority of models, the same set of optimal hyperparameter values was determined. The results obtained in this research were also compared with the previously reported values (where hyperparameter values of DNN models were selected by an expert). This comparison revealed that automatically optimized models are competitive or even more accurate when created with larger training datasets.
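As an aside for readers who want to reproduce the flavor of this search, a minimal random-search sketch over a DNN configuration space follows; the search space and the validation stub are illustrative assumptions, not the paper's exact setup:

```python
import random
random.seed(42)

space = {
    "classifier": ["CNN", "LSTM", "BiLSTM"],
    "depth":      [1, 2],                  # shallower vs. deeper architecture
    "units":      [64, 128, 256],
    "lr":         [1e-2, 1e-3, 1e-4],
    "embedding":  ["fastText", "BERT"],
}

def dev_accuracy(config):
    # Placeholder: in the real study this trains the intent classifier and
    # returns accuracy on held-out data; here it is a random stub.
    return random.random()

candidates = [{k: random.choice(v) for k, v in space.items()} for _ in range(50)]
best = max(candidates, key=dev_accuracy)
print("selected configuration:", best)
```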
16

Li, Guihong, Sumit K. Mandal, Umit Y. Ogras, and Radu Marculescu. "FLASH: Fast Neural Architecture Search with Hardware Optimization." ACM Transactions on Embedded Computing Systems 20, no. 5s (October 31, 2021): 1–26. http://dx.doi.org/10.1145/3476994.

Abstract:
Neural architecture search (NAS) is a promising technique to design efficient and high-performance deep neural networks (DNNs). As the performance requirements of ML applications grow continuously, the hardware accelerators start playing a central role in DNN design. This trend makes NAS even more complicated and time-consuming for most real applications. This paper proposes FLASH, a very fast NAS methodology that co-optimizes the DNN accuracy and performance on a real hardware platform. As the main theoretical contribution, we first propose the NN-Degree, an analytical metric to quantify the topological characteristics of DNNs with skip connections (e.g., DenseNets, ResNets, Wide-ResNets, and MobileNets). The newly proposed NN-Degree allows us to do training-free NAS within one second and build an accuracy predictor by training as few as 25 samples out of a vast search space with more than 63 billion configurations. Second, by performing inference on the target hardware, we fine-tune and validate our analytical models to estimate the latency, area, and energy consumption of various DNN architectures while executing standard ML datasets. Third, we construct a hierarchical algorithm based on simplicial homology global optimization (SHGO) to optimize the model-architecture co-design process, while considering the area, latency, and energy consumption of the target hardware. We demonstrate that, compared to the state-of-the-art NAS approaches, our proposed hierarchical SHGO-based algorithm enables more than four orders of magnitude speedup (specifically, the execution time of the proposed algorithm is about 0.1 seconds). Finally, our experimental evaluations show that FLASH is easily transferable to different hardware architectures, thus enabling us to do NAS on a Raspberry Pi-3B processor in less than 3 seconds.
17

Singh, Manu, and Vibhakar Shrimali. "Classification of Brain Tumor using Hybrid Deep Learning Approach." BRAIN. Broad Research in Artificial Intelligence and Neuroscience 13, no. 2 (June 30, 2022): 308–27. http://dx.doi.org/10.18662/brain/13.2/345.

Abstract:
Diagnosis of a tumor at its early stage is the most challenging task for its treatment in the area of neurology. As brain tumor is among the most common such problems, tremendous research is being carried out to detect the cancer during its onset stages. Diagnosis, as well as its automation, has been extremely difficult using conventional image processing methods. In view of this, a novel technique based on a convolutional neural network architecture has been proposed to classify brain tumors, assisting radiologists and physicians to make their decisions fast and accurately. The proposed deep learning structure helps to analyze and produce better feature maps to classify the variations between normal and malignant cases. The proposed method, the Hybrid Deep Neural Network (H-DNN) architecture, is the combination of two different DNNs. The first Deep Neural Network (DNN-1) uses the spatial texture information of the cranial Magnetic Resonance (MR) images, whereas the second Deep Neural Network (DNN-2) uses the frequency-domain information of the MRI scans. Finally, we combine both neural networks to produce a better classification result based on the prediction score. The training input to DNN-1 is the texture computed by Local Binary Patterns, whereas DNN-2 uses frequencies calculated by wavelet transformation as its training input. Two datasets have been used for the evaluation of the proposed model, a real MRI dataset and the BraTS 2012 MRI dataset, for T2-weighted MRI scans. In this study, the proposed model provides 98.7% classification accuracy, which outperforms the other methods reported in the related work. Comparisons of the accuracy, sensitivity, and specificity of the proposed method with the DNN-1 and DNN-2 architectures also indicate that the reported model gives better results than the individual networks.
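The score-level fusion of the two networks can be sketched in a few lines; the equal-weight averaging rule below is an assumption, since the abstract states only that the outputs are combined based on prediction score:

```python
import numpy as np

def fuse_predictions(p_texture, p_frequency, w=0.5):
    """Combine class-probability vectors from DNN-1 (LBP texture input) and
    DNN-2 (wavelet/frequency input) into one decision."""
    p = w * np.asarray(p_texture) + (1 - w) * np.asarray(p_frequency)
    return p, int(np.argmax(p))

# Toy two-class (normal vs. malignant) outputs from the two networks
probs, label = fuse_predictions([0.35, 0.65], [0.20, 0.80])
print(probs, "->", "malignant" if label == 1 else "normal")
```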
18

Pandey, Pramesh, Noel Daniel Gundi, Prabal Basu, Tahmoures Shabanian, Mitchell Craig Patrick, Koushik Chakraborty, and Sanghamitra Roy. "Challenges and Opportunities in Near-Threshold DNN Accelerators around Timing Errors." Journal of Low Power Electronics and Applications 10, no. 4 (October 16, 2020): 33. http://dx.doi.org/10.3390/jlpea10040033.

Abstract:
AI evolution is accelerating, and Deep Neural Network (DNN) inference accelerators are at the forefront of ad hoc architectures evolving to support the immense throughput required for AI computation. However, much more energy-efficient design paradigms are indispensable for realizing the complete potential of AI evolution and curtailing energy consumption. The Near-Threshold Computing (NTC) design paradigm can serve as the best candidate for providing the required energy efficiency. However, NTC operation is plagued by significant performance and reliability concerns arising from timing errors. In this paper, we dive deep into DNN architecture to uncover some unique challenges and opportunities for operation in the NTC paradigm. By performing rigorous simulations in a TPU systolic array, we reveal the severity of timing errors and their impact on inference accuracy at NTC. We analyze various attributes—such as the data–delay relationship, delay disparity within arithmetic units, utilization pattern, hardware homogeneity, and workload characteristics—and uncover unique localized and global techniques to deal with the timing errors in NTC.
19

Galitsky, Boris, Dmitry Ilvovsky, and Saveli Goldberg. "Shaped-Charge Learning Architecture for the Human–Machine Teams." Entropy 25, no. 6 (June 12, 2023): 924. http://dx.doi.org/10.3390/e25060924.

Abstract:
In spite of great progress in recent years, deep neural networks (DNNs) and transformers have strong limitations for supporting human–machine teams due to a lack of explainability, a lack of information on what exactly was generalized, a lack of machinery for integration with various reasoning techniques, and weak defense against possible adversarial attacks by opponent team members. Due to these shortcomings, stand-alone DNNs offer limited support for human–machine teams. We propose a Meta-learning/DNN → kNN architecture that overcomes these limitations by integrating deep learning with explainable nearest-neighbor learning (kNN) to form the object level, having a deductive reasoning-based meta-level control learning process, and performing validation and correction of predictions in a way that is more interpretable by peer team members. We address our proposal from structural and maximum entropy production perspectives.
20

Greif, Kevin, and Kevin Lannon. "Physics Inspired Deep Neural Networks for Top Quark Reconstruction." EPJ Web of Conferences 245 (2020): 06029. http://dx.doi.org/10.1051/epjconf/202024506029.

Abstract:
Deep neural networks (DNNs) have been applied to the fields of computer vision and natural language processing with great success in recent years. The success of these applications has hinged on the development of specialized DNN architectures that take advantage of specific characteristics of the problem to be solved, namely convolutional neural networks for computer vision and recurrent neural networks for natural language processing. This research explores whether a neural network architecture specific to the task of identifying t → Wb decays in particle collision data yields better performance than a generic, fully-connected DNN. Although applied here to resolved top quark decays, this approach is inspired by a DNN technique for tagging boosted top quarks, which consists of defining custom neural network layers known as the combination and Lorentz layers. These layers encode knowledge of relativistic kinematics applied to combinations of particles, and the output of these specialized layers can then be fed into a fully connected neural network to learn tasks such as classification. This research compares the performance of these physics inspired networks to that of a generic, fully-connected DNN, to see if there is any advantage in terms of classification performance, size of the network, or ease of training.
21

Kanimozhi, G., and P. Shanmugavadivu. "OPTIMIZED DEEP NEURAL NETWORKS ARCHITECTURE MODEL FOR BREAST CANCER DIAGNOSIS." YMER Digital 20, no. 11 (November 16, 2021): 161–75. http://dx.doi.org/10.37896/ymer20.11/15.

Abstract:
Breast cancer has increasingly claimed the lives of women. Oncologists use digital mammograms as a viable source to detect breast cancer and classify it into benign and malignant based on severity. The performance of traditional methods of breast cancer detection could not be improved beyond a certain point due to the limitations and scope of computing, and the constrained scope of image processing techniques in developing automated breast cancer detection systems has motivated researchers to shift their focus towards artificial intelligence-based models. Neural Networks (NN) have exhibited great scope for the development of automated medical image analysis systems with a high degree of accuracy, as the NN model enables an automated system to capture the features of a problem without being explicitly programmed. Optimization of NNs offers an additional payoff in accuracy, computational complexity, and time. As the scope and suitability of optimization methods are data-dependent, the selection of an appropriate optimization method is itself emerging as a prominent domain of research. In this paper, Deep Neural Networks (DNN) with different optimizers and learning rates were designed for the prediction of breast cancer and its classification. A comparative performance analysis of five distinct first-order gradient-based optimization techniques, namely Adaptive Gradient (Adagrad), Root Mean Square Propagation (RMSProp), Adaptive Delta (Adadelta), Adaptive Moment Estimation (Adam), and Stochastic Gradient Descent (SGD), is carried out for predictions on the classification of breast cancer masses. For this purpose, the Mammographic Mass dataset was chosen for experimentation. The experimental parameters were the number of hidden layers and the learning rate, along with hyperparameter tuning. The impacts of those optimizers were tested on the NN with One Hidden Layer (NN1HL), the DNN with Three Hidden Layers (DNN4HL), and the DNN with Eight Hidden Layers (DNN8HL). The experimental results showed that DNN8HL-Adam (DNN8HL-AM) produced the highest accuracy of 91% among its counterparts. This research endorsed that the incorporation of optimizers in DNNs contributes to increased accuracy and an optimized architecture for automated system development using neural networks.
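A hedged Keras sketch of the experimental grid described above, looping the five optimizers over the three network depths; the layer widths and the commented-out training call are placeholders, since the paper's exact topology is not given in the abstract:

```python
import tensorflow as tf

def make_dnn(n_hidden, n_features=5):
    """Fully connected binary classifier with n_hidden hidden layers."""
    model = tf.keras.Sequential([tf.keras.layers.Input(shape=(n_features,))])
    for _ in range(n_hidden):
        model.add(tf.keras.layers.Dense(32, activation="relu"))
    model.add(tf.keras.layers.Dense(1, activation="sigmoid"))
    return model

# x_train / y_train would hold the Mammographic Mass features and labels.
for n_hidden in (1, 3, 8):                 # NN1HL, DNN4HL, DNN8HL depths
    for opt in ("adagrad", "rmsprop", "adadelta", "adam", "sgd"):
        model = make_dnn(n_hidden)
        model.compile(optimizer=opt, loss="binary_crossentropy",
                      metrics=["accuracy"])
        # model.fit(x_train, y_train, epochs=50, validation_split=0.2)
```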
22

Kavitha, S., and J. Manikandan. "Design of a Bottleneck Layered DNN Algorithm for Intrusion Detection System." IRO Journal on Sustainable Wireless Systems 3, no. 4 (May 16, 2022): 242–58. http://dx.doi.org/10.36548/jsws.2021.4.004.

Abstract:
Deep learning algorithms are very effective in classification and prediction applications compared with traditional estimators. The proposed work employs a bottleneck layer algorithm on the CICIDS-2017 dataset to prove its efficacy in predicting cyber-attacks. The bottleneck model architecture is incorporated into Artificial Neural Network (ANN) and Deep Neural Network (DNN) models and compared with traditional ANN, DNN, and Support Vector Machine (SVM) models. The experimental work reaches a maximum accuracy of 92.35% with the DNN algorithm and 90.98% with the ANN algorithm.
23

Mei, Linyan, Pouya Houshmand, Vikram Jain, Sebastian Giraldo, and Marian Verhelst. "ZigZag: Enlarging Joint Architecture-Mapping Design Space Exploration for DNN Accelerators." IEEE Transactions on Computers 70, no. 8 (August 1, 2021): 1160–74. http://dx.doi.org/10.1109/tc.2021.3059962.

24

Shamasundar, Bharath, and Ananthanarayanan Chockalingam. "A DNN Architecture for the Detection of Generalized Spatial Modulation Signals." IEEE Communications Letters 24, no. 12 (December 2020): 2770–74. http://dx.doi.org/10.1109/lcomm.2020.3018260.

25

Long, Yun, Daehyun Kim, Edward Lee, Priyabrata Saha, Burhan Ahmad Mudassar, Xueyuan She, Asif Islam Khan, and Saibal Mukhopadhyay. "A Ferroelectric FET-Based Processing-in-Memory Architecture for DNN Acceleration." IEEE Journal on Exploratory Solid-State Computational Devices and Circuits 5, no. 2 (December 2019): 113–22. http://dx.doi.org/10.1109/jxcdc.2019.2923745.

26

Yu, Ye, Yingmin Li, Shuai Che, Niraj K. Jha, and Weifeng Zhang. "Software-Defined Design Space Exploration for an Efficient DNN Accelerator Architecture." IEEE Transactions on Computers 70, no. 1 (January 1, 2021): 45–56. http://dx.doi.org/10.1109/tc.2020.2983694.

27

Kulkarni, Uday, Abhishek Patil, Rohit Devaranavadagi, Shreya B. Devagiri, Sneha K. Pamali, and Raunak Ujawane. "Vision-Based Quality Control Check of Tube Shaft using DNN Architecture." ITM Web of Conferences 53 (2023): 02009. http://dx.doi.org/10.1051/itmconf/20235302009.

Abstract:
Quality control is the process of ensuring that a product or service meets certain predetermined standards of quality. This can involve testing, inspection, and other methods to ensure that the product or service is fit for its intended use. The tube shaft is a component used in the drive shaft of a vehicle. It undergoes several stages from raw material to final product to increase its structural properties. Following the first step, which is hardening, preliminary quality control is done by cutting the tube shaft into two parts lengthwise to check the intensity of hardening and decide whether to accept or reject the part. We present a machine vision-based quality control system that uses You Only Look Once (YOLO) v5 to assess hardening intensity by analyzing the pattern formed on the cut piece’s surface.
28

Zhou, Siqi, Mohamed K. Helwa, and Angela P. Schoellig. "Deep neural networks as add-on modules for enhancing robot performance in impromptu trajectory tracking." International Journal of Robotics Research 39, no. 12 (September 11, 2020): 1397–418. http://dx.doi.org/10.1177/0278364920953902.

Abstract:
High-accuracy trajectory tracking is critical to many robotic applications, including search and rescue, advanced manufacturing, and industrial inspection, to name a few. Yet the unmodeled dynamics and parametric uncertainties of operating in such complex environments make it difficult to design controllers that are capable of accurately tracking arbitrary, feasible trajectories from the first attempt (i.e., impromptu trajectory tracking). This article proposes a platform-independent, learning-based “add-on” module to enhance the tracking performance of black-box control systems in impromptu tracking tasks. Our approach is to pre-cascade a deep neural network (DNN) to a stabilized baseline control system, in order to establish an identity mapping from the desired output to the actual output. Previous research involving quadrotors showed that, for 30 arbitrary hand-drawn trajectories, the DNN-enhancement control architecture reduces tracking errors by 43% on average, as compared with the baseline controller. In this article, we provide a platform-independent formulation and practical design guidelines for the DNN-enhancement approach. In particular, we: (1) characterize the underlying function of the DNN module; (2) identify necessary conditions for the approach to be effective; (3) provide theoretical insights into the stability of the overall DNN-enhancement control architecture; (4) derive a condition that supports data-efficient training of the DNN module; and (5) compare the novel theory-driven DNN design with the prior trial-and-error design using detailed quadrotor experiments. We show that, as compared with the prior trial-and-error design, the novel theory-driven design allows us to reduce the input dimension of the DNN by two thirds while achieving similar tracking performance.
29

Kao, Hsu-Yu, Shih-Hsu Huang, and Wei-Kai Cheng. "Design Framework for ReRAM-Based DNN Accelerators with Accuracy and Hardware Evaluation." Electronics 11, no. 13 (July 5, 2022): 2107. http://dx.doi.org/10.3390/electronics11132107.

Abstract:
To achieve faster design closure, there is a need to provide a design framework for the design of ReRAM-based DNN (deep neural network) accelerator at the early design stage. In this paper, we develop a high-level ReRAM-based DNN accelerator design framework. The proposed design framework has the following three features. First, we consider ReRAM’s non-linear properties, including lognormal distribution, leakage current, IR drop, and sneak path. Thus, model accuracy and circuit performance can be accurately evaluated. Second, we use SystemC with TLM modeling method to build our virtual platform. To our knowledge, the proposed design framework is the first behavior-level ReRAM deep learning accelerator simulator that can simulate real hardware behavior. Third, the proposed design framework can evaluate not only model accuracy but also hardware cost. As a result, the proposed design framework can be used for behavior-level design space exploration. In the experiments, we have deployed different DNN models on the virtual platform. Circuit performance can be easily evaluated on the proposed design framework. Furthermore, experiment results also show that the noise effects are different in different ReRAM array architectures. Based on the proposed design framework, we can easily mitigate noise effects by tuning architecture parameters.
30

Uto, Masaki. "A review of deep-neural automated essay scoring models." Behaviormetrika 48, no. 2 (July 2021): 459–84. http://dx.doi.org/10.1007/s41237-021-00142-y.

Abstract:
Automated essay scoring (AES) is the task of automatically assigning scores to essays as an alternative to grading by humans. Although traditional AES models typically rely on manually designed features, deep neural network (DNN)-based AES models that obviate the need for feature engineering have recently attracted increased attention. Various DNN-AES models with different characteristics have been proposed over the past few years. To our knowledge, however, no study has provided a comprehensive review of DNN-AES models while introducing each model in detail. Therefore, this review presents a comprehensive survey of DNN-AES models, describing the main idea and detailed architecture of each model. We classify the AES task into four types and introduce existing DNN-AES models according to this classification.
31

Hosseini, Fateme S., Fanruo Meng, Chengmo Yang, Wujie Wen, and Rosario Cammarota. "Tolerating Defects in Low-Power Neural Network Accelerators Via Retraining-Free Weight Approximation." ACM Transactions on Embedded Computing Systems 20, no. 5s (October 31, 2021): 1–21. http://dx.doi.org/10.1145/3477016.

Abstract:
Hardware accelerators are essential to the accommodation of ever-increasing Deep Neural Network (DNN) workloads on the resource-constrained embedded devices. While accelerators facilitate fast and energy-efficient DNN operations, their accuracy is threatened by faults in their on-chip and off-chip memories, where millions of DNN weights are held. The use of emerging Non-Volatile Memories (NVM) further exposes DNN accelerators to a non-negligible rate of permanent defects due to immature fabrication, limited endurance, and aging. To tolerate defects in NVM-based DNN accelerators, previous work either requires extra redundancy in hardware or performs defect-aware retraining, imposing significant overhead. In comparison, this paper proposes a set of algorithms that exploit the flexibility in setting the fault-free bits in weight memory to effectively approximate weight values, so as to mitigate defect-induced accuracy drop. These algorithms can be applied as a one-step solution when loading the weights to embedded devices. They only require trivial hardware support and impose negligible run-time overhead. Experiments on popular DNN models show that the proposed techniques successfully boost inference accuracy even in the face of elevated defect rates in the weight memory.
32

Pham, Tuan Anh, Van Quan Tran, and Huong-Lan Thi Vu. "Evolution of Deep Neural Network Architecture Using Particle Swarm Optimization to Improve the Performance in Determining the Friction Angle of Soil." Mathematical Problems in Engineering 2021 (May 7, 2021): 1–17. http://dx.doi.org/10.1155/2021/5570945.

Abstract:
This study focuses on the use of a deep neural network (DNN) to predict the soil friction angle, one of the crucial parameters in geotechnical design. Besides, a particle swarm optimization (PSO) algorithm was used to improve the performance of the DNN by selecting the best structural DNN parameters, namely the optimal numbers of hidden layers and neurons in each hidden layer. For this aim, a database containing 245 laboratory tests collected from a project in Ho Chi Minh City, Vietnam, was used for the development of the proposed hybrid PSO-DNN model, including seven input factors (soil state, standard penetration test value, unit weight of soil, void ratio, thickness of soil layer, top elevation of soil layer, and bottom elevation of soil layer), with the friction angle considered as the target. The data set was divided into three parts, namely the training, validation, and testing sets, for the construction, validation, and testing phases of the model. Various quality assessment criteria, namely the coefficient of determination (R2), mean absolute error (MAE), and root mean square error (RMSE), were used to estimate the performance of the PSO-DNN models. The PSO algorithm showed a remarkable ability to find an optimal DNN architecture for the prediction process. The results showed that the PSO-DNN model using 10 hidden layers outperformed the DNN model, improving R2 by 1.83%, MAE by 5.94%, and RMSE by 8.58%. Besides, a global sensitivity analysis technique was used to detect the most important inputs, showing that, among the seven input variables, the top and bottom elevations of the soil layer played an important role in predicting the friction angle of soil.
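A minimal PSO sketch over the two structural hyperparameters named in the abstract (number of hidden layers, neurons per layer); the objective is a toy surrogate standing in for training a DNN and scoring it on the friction-angle data:

```python
import numpy as np
rng = np.random.default_rng(1)

def score(pos):
    # Placeholder for -RMSE of a DNN with round(pos[0]) hidden layers and
    # round(pos[1]) neurons per layer; toy optimum at (10, 40).
    layers, neurons = np.round(pos)
    return -((layers - 10) ** 2 + (neurons - 40) ** 2)

n, lo, hi = 12, np.array([1, 8]), np.array([15, 128])
pos = rng.uniform(lo, hi, size=(n, 2))
vel = np.zeros((n, 2))
pbest, pbest_val = pos.copy(), np.array([score(p) for p in pos])
gbest = pbest[pbest_val.argmax()].copy()

for _ in range(60):
    r1, r2 = rng.random((n, 2)), rng.random((n, 2))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    vals = np.array([score(p) for p in pos])
    better = vals > pbest_val
    pbest[better], pbest_val[better] = pos[better], vals[better]
    gbest = pbest[pbest_val.argmax()].copy()

print("selected (hidden layers, neurons):", np.round(gbest))
```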
33

Ahmad, Zeeshan, Adnan Shahid Khan, Kashif Nisar, Iram Haider, Rosilah Hassan, Muhammad Reazul Haque, Seleviawati Tarmizi, and Joel J. P. C. Rodrigues. "Anomaly Detection Using Deep Neural Network for IoT Architecture." Applied Sciences 11, no. 15 (July 30, 2021): 7050. http://dx.doi.org/10.3390/app11157050.

Abstract:
The revolutionary idea of the internet of things (IoT) architecture has gained enormous popularity over the last decade, resulting in exponential growth in IoT networks, connected devices, and the data processed therein. Since IoT devices generate and exchange sensitive data over the traditional internet, security has become a prime concern due to the threat of zero-day cyberattacks. A network-based intrusion detection system (NIDS) can provide the much-needed efficient security solution to the IoT network by protecting the network entry points through constant network traffic monitoring. Recent NIDSs have a high false alarm rate (FAR) in detecting anomalies, including novel and zero-day anomalies. This paper proposes an efficient anomaly detection mechanism using mutual information (MI), considering a deep neural network (DNN) for an IoT network. A comparative analysis of different deep learning models such as the DNN, the Convolutional Neural Network, the Recurrent Neural Network, and its variants, such as the Gated Recurrent Unit and Long Short-Term Memory, is performed on the IoT-Botnet 2020 dataset. Experimental results show an improvement of 0.57–2.6% in model accuracy, while at the same time reducing the FAR by 0.23–7.98%, demonstrating the effectiveness of the DNN-based NIDS model compared to the well-known deep learning models. It was also observed that using only the 16–35 best numerical features selected using MI, instead of the dataset's 80 features, results in almost negligible degradation in model performance while decreasing the overall model complexity. In addition, the overall detection accuracy of the DL-based models is further improved by almost 0.99–3.45% when considering only the top five categorical and numerical features.
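The mutual-information feature selection step can be illustrated with scikit-learn; the synthetic data below merely stands in for the IoT-Botnet 2020 features, and the cut-off of 16 features is taken from the low end of the abstract's 16–35 range:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 80))            # stand-in for the 80 dataset features
y = (X[:, 3] + X[:, 17] > 0).astype(int)   # toy dependence on two features

mi = mutual_info_classif(X, y, random_state=0)
top = np.argsort(mi)[::-1][:16]            # keep the 16 best-scoring features
print("selected feature indices:", sorted(top.tolist()))
X_reduced = X[:, top]                      # reduced input for the DNN-based NIDS
```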
34

Паршин, А. И., М. Н. Аралов, В. Ф. Барабанов, and Н. И. Гребенникова. "RANDOM MULTI-MODAL DEEP LEARNING IN THE PROBLEM OF IMAGE RECOGNITION." ВЕСТНИК ВОРОНЕЖСКОГО ГОСУДАРСТВЕННОГО ТЕХНИЧЕСКОГО УНИВЕРСИТЕТА, no. 4 (October 20, 2021): 21–26. http://dx.doi.org/10.36622/vstu.2021.17.4.003.

Abstract:
The image recognition task is one of the most difficult in machine learning, requiring both deep knowledge and large time and computational resources from the researcher. When using nonlinear and complex data, various deep neural network architectures are applied, but the problem of choosing a neural network remains a difficult one. The main architectures in widespread use are convolutional neural networks (CNN), recurrent neural networks (RNN), and deep neural networks (DNN). On the basis of recurrent neural networks (RNN), long short-term memory networks (LSTM) and gated recurrent unit networks (GRU) were developed. Each neural network architecture has its own structure, its own tunable and trainable parameters, and its own advantages and disadvantages. By combining different types of neural networks, one can significantly improve prediction quality in various machine learning problems. Given that choosing the optimal network architecture and its parameters is an extremely difficult task, one method for constructing neural network architectures based on a combination of convolutional, recurrent, and deep neural networks is considered. We show that such architectures are superior to classical machine learning algorithms.
35

Qasrina Ann, Nurnajmin, Dwi Pebrianti, Mohd Fadhil Abas, and Luhur Bayuaji. "Automated-tuned hyper-parameter deep neural network by using arithmetic optimization algorithm for Lorenz chaotic system." International Journal of Electrical and Computer Engineering (IJECE) 13, no. 2 (April 1, 2023): 2167. http://dx.doi.org/10.11591/ijece.v13i2.pp2167-2176.

Abstract:
Deep neural networks (DNNs) are very dependent on their parameterization and require experts to determine which methods to implement and how to modify the hyper-parameter values. This study proposes automated hyper-parameter tuning for DNNs using a metaheuristic optimization algorithm, the arithmetic optimization algorithm (AOA). AOA makes use of the distribution properties of the primary arithmetic operators of mathematics, including multiplication, division, addition, and subtraction. AOA is mathematically modeled and implemented to optimize processes across a broad range of search spaces. The performance of AOA is evaluated against 29 benchmark functions, and several real-world engineering design problems demonstrate AOA's applicability. The hyper-parameter tuning framework consists of a set of Lorenz chaotic system datasets, a hybrid DNN architecture, and AOA, and works automatically. As a result, AOA produced the highest accuracy on the test dataset with a combination of optimized hyper-parameters for the DNN architecture. The boxplot analysis also identified ten AOA particles as the most accurate choice: AOA with ten particles had the smallest boxplot spread for all hyper-parameters, which yielded the best solution. In particular, the proposed system outperforms the same architecture tuned with particle swarm optimization.
APA, Harvard, Vancouver, ISO, and other styles
36

Cho, Hyungmin. "RiSA: A Reinforced Systolic Array for Depthwise Convolutions and Embedded Tensor Reshaping." ACM Transactions on Embedded Computing Systems 20, no. 5s (October 31, 2021): 1–20. http://dx.doi.org/10.1145/3476984.

Full text
Abstract:
Depthwise convolutions are widely used in convolutional neural networks (CNNs) targeting mobile and embedded systems. Depthwise convolution layers reduce the computation loads and the number of parameters compared to conventional convolution layers. Many deep neural network (DNN) accelerators adopt an architecture that exploits the high data-reuse factor of DNN computations, such as a systolic array. However, depthwise convolutions have a low data-reuse factor and under-utilize the processing elements (PEs) in systolic arrays. In this paper, we present a DNN accelerator design called RiSA, which provides a novel mechanism that boosts the PE utilization for depthwise convolutions on a systolic array with minimal overheads. In addition, the PEs in systolic arrays can be efficiently used only if the data items (tensors) are arranged in the desired layout. Typical DNN accelerators provide various types of PE interconnects or additional modules to flexibly rearrange the data items and manage data movements during DNN computations. RiSA provides a lightweight set of tensor management tasks within the PE array itself that eliminates the need for an additional module for tensor reshaping tasks. Using this embedded tensor reshaping, RiSA supports various DNN models, including convolutional neural networks and natural language processing models, while maintaining a high area efficiency. Compared to Eyeriss v2, RiSA improves the area and energy efficiency for MobileNet-V1 inference by 1.91× and 1.31×, respectively.
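The under-utilization problem the abstract mentions can be seen with a back-of-the-envelope calculation; the mapping assumed below (input channels across PE rows, output channels across columns) is a generic weight-stationary layout, not RiSA's actual design.

# Why depthwise convolutions under-utilize a systolic array (illustrative mapping).
def utilization(rows, cols, in_channels, out_channels, depthwise):
    # Standard conv: each PE row handles one input channel, each column one
    # output channel, so both dimensions can be filled.
    # Depthwise conv: each output channel reads exactly one input channel,
    # so only one PE per column does useful work.
    active_rows = 1 if depthwise else min(rows, in_channels)
    active_cols = min(cols, out_channels)
    return active_rows * active_cols / (rows * cols)

print(utilization(32, 32, 64, 64, depthwise=False))  # 1.0 (fully used)
print(utilization(32, 32, 64, 64, depthwise=True))   # 0.03125 (1/32)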
APA, Harvard, Vancouver, ISO, and other styles
37

Lei, Hong, Yue Xiao, Yanchun Liang, Dalin Li, and Heow Pueh Lee. "DLD: An Optimized Chinese Speech Recognition Model Based on Deep Learning." Complexity 2022 (May 2, 2022): 1–8. http://dx.doi.org/10.1155/2022/6927400.

Full text
Abstract:
Speech recognition technology has played an indispensable role in realizing human-computer intelligent interaction. However, most current Chinese speech recognition systems are provided online, while offline models suffer from low accuracy and poor performance. To improve the performance of offline Chinese speech recognition, we propose a hybrid acoustic model of deep convolutional neural network, long short-term memory, and deep neural network (DCNN-LSTM-DNN, DLD). This model utilizes a DCNN to reduce frequency variation and adds a batch normalization (BN) layer after its convolutional layer to ensure the stability of the data distribution, then uses LSTM to effectively address the vanishing gradient problem. Finally, the fully connected structure of the DNN is utilized to efficiently map the input features into a separable space, which is helpful for data classification. Leveraging the strengths of DCNN, LSTM, and DNN by combining them into a unified architecture can therefore effectively improve speech recognition performance. Our model was tested on the open Chinese speech database THCHS-30 released by the Center for Speech and Language Technology (CSLT) of Tsinghua University, and the DLD model with 3 layers of LSTM and 3 layers of DNN had the best performance, achieving a word error rate (WER) of 13.49%.
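A minimal PyTorch sketch of the DCNN-LSTM-DNN layering described above might look as follows; the feature dimensions, the output vocabulary, and the single convolutional stage are assumptions for illustration.

# Sketch of a DCNN -> BN -> 3-layer LSTM -> 3-layer DNN acoustic model.
import torch
import torch.nn as nn

class DLD(nn.Module):
    def __init__(self, n_mels=80, n_tokens=1000):
        super().__init__()
        self.dcnn = nn.Sequential(            # reduce frequency variation
            nn.Conv2d(1, 32, kernel_size=3, stride=(1, 2), padding=1),
            nn.BatchNorm2d(32),               # stabilize the data distribution
            nn.ReLU(),
        )
        self.lstm = nn.LSTM(32 * (n_mels // 2), 256, num_layers=3,
                            batch_first=True)
        self.dnn = nn.Sequential(             # map features to a separable space
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, n_tokens),
        )

    def forward(self, x):                     # x: (batch, time, n_mels)
        h = self.dcnn(x.unsqueeze(1))         # (batch, 32, time, n_mels/2)
        h = h.permute(0, 2, 1, 3).flatten(2)  # (batch, time, 32*n_mels/2)
        h, _ = self.lstm(h)
        return self.dnn(h)                    # per-frame token logits

out = DLD()(torch.randn(2, 100, 80))
print(out.shape)  # torch.Size([2, 100, 1000])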
APA, Harvard, Vancouver, ISO, and other styles
38

Wang, Yi-Ren, and Yi-Jyun Wang. "Flutter speed prediction by using deep learning." Advances in Mechanical Engineering 13, no. 11 (November 2021): 168781402110622. http://dx.doi.org/10.1177/16878140211062275.

Full text
Abstract:
Deep learning technology has been widely used in various fields in recent years. This study uses deep learning algorithms to analyze the aeroelastic phenomenon and compares Deep Neural Networks (DNN) and Long Short-Term Memory (LSTM) networks applied to flutter speed prediction. In the present work, DNN and LSTM are used to address complex aeroelastic systems by superimposing multi-layer artificial neural networks. Under such an architecture, the neurons in the neural network can extract features from various flight data. Instead of the time-consuming high-fidelity computational fluid dynamics (CFD) method, this study uses the K method to build a large dataset of aeroelastic flutter speeds for different flight conditions. The flutter speeds for various flight conditions are predicted by the deep learning methods and verified by the K method. The detailed physical meaning of aerodynamics and aeroelasticity in the prediction results is studied. The LSTM model has a cyclic architecture, which enables it to store information and update it with the latest information at the same time. Although training this model is more time-consuming than training the DNN, the method increases the effective memory capacity. The results of this work show that the LSTM model established in this study provides more accurate flutter speed prediction than the DNN algorithm.
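The contrast between the two model families can be sketched in a few lines of PyTorch; the four-feature flight-condition input and all layer sizes are assumptions, not the authors' setup.

# Feed-forward DNN vs. recurrent LSTM for flutter speed regression (sketch).
import torch
import torch.nn as nn

dnn = nn.Sequential(nn.Linear(4, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))

class LSTMRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(4, 64, batch_first=True)  # cyclic architecture
        self.out = nn.Linear(64, 1)
    def forward(self, x):                 # x: (batch, time, 4)
        h, _ = self.lstm(x)
        return self.out(h[:, -1])         # predict from the last time step

x_seq = torch.randn(32, 10, 4)            # 10-step flight-condition history
# The DNN sees only the current condition; the LSTM sees the whole history.
print(dnn(x_seq[:, -1]).shape, LSTMRegressor()(x_seq).shape)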
APA, Harvard, Vancouver, ISO, and other styles
39

Pedram, Ardavan, Ali Shafie Ardestani, Ling Li, Hamzah Abdelaziz, Jun Fang, and Joseph Hassoun. "Algorithm/architecture solutions to improve beyond uniform quantization in embedded DNN accelerators." Journal of Systems Architecture 126 (May 2022): 102454. http://dx.doi.org/10.1016/j.sysarc.2022.102454.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Gowda, Kavitha Malali Vishveshwarappa, Sowmya Madhavan, Stefano Rinaldi, Parameshachari Bidare Divakarachari, and Anitha Atmakur. "FPGA-Based Reconfigurable Convolutional Neural Network Accelerator Using Sparse and Convolutional Optimization." Electronics 11, no. 10 (May 22, 2022): 1653. http://dx.doi.org/10.3390/electronics11101653.

Full text
Abstract:
Nowadays, the data flow architecture is considered a general solution for the acceleration of a deep neural network (DNN) because of its higher parallelism. However, conventional DNN accelerators offer only restricted flexibility for diverse network models. To overcome this, a reconfigurable convolutional neural network (RCNN) accelerator, i.e., an accelerator for one type of DNN, needs to be developed on the field-programmable gate array (FPGA) platform. In this paper, sparse optimization of weights (SOW) and convolutional optimization (CO) are proposed to improve the performance of the RCNN accelerator. The combination of SOW and CO is used to optimize the feature map and weight sizes of the RCNN accelerator; therefore, the hardware resources consumed by the RCNN in the FPGA are minimized. The performance of RCNN-SOW-CO is analyzed in terms of feature map size, weight size, sparseness of the input feature map (IFM), weight parameter proportion, block random access memory (BRAM), digital signal processing (DSP) elements, look-up tables (LUTs), slices, delay, power, and accuracy. The existing architectures OIDSCNN, LP-CNN, and DPR-NN are used to demonstrate the efficiency of RCNN-SOW-CO. The LUT count of RCNN-SOW-CO with AlexNet implemented on the Zynq-7020 is 5150, which is lower than that of OIDSCNN and DPR-NN.
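The weight-sparsification idea (SOW) can be illustrated with simple magnitude pruning: zeroing small weights shrinks the parameter footprint the FPGA must store and move. The keep-ratio policy below is an assumption for illustration, not the paper's exact optimization.

# Magnitude pruning as a stand-in for sparse optimization of weights (SOW).
import numpy as np

def sparsify(weights, keep_ratio=0.25):
    """Keep only the largest-magnitude fraction of weights, zero the rest."""
    flat = np.abs(weights).ravel()
    k = max(1, int(len(flat) * keep_ratio))
    threshold = np.partition(flat, -k)[-k]   # k-th largest magnitude
    return np.where(np.abs(weights) >= threshold, weights, 0.0)

w = np.random.randn(64, 64)
w_sparse = sparsify(w)
print("nonzero fraction:", np.count_nonzero(w_sparse) / w.size)  # ~0.25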
APA, Harvard, Vancouver, ISO, and other styles
41

Nabavinejad, Seyed Morteza, and Sherief Reda. "BayesTuner: Leveraging Bayesian Optimization For DNN Inference Configuration Selection." IEEE Computer Architecture Letters 20, no. 2 (July 1, 2021): 166–70. http://dx.doi.org/10.1109/lca.2021.3123695.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Kwon, Hyoukjun, Michael Pellauer, Angshuman Parashar, and Tushar Krishna. "Flexion: A Quantitative Metric for Flexibility in DNN Accelerators." IEEE Computer Architecture Letters 20, no. 1 (January 1, 2021): 1–4. http://dx.doi.org/10.1109/lca.2020.3044607.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Safavi, Anahid Robert, Alberto G. Perotti, Branislav M. Popovic, Mahdi Boloursaz Mashhadi, and Deniz Gündüz. "Deep extended feedback codes." ITU Journal on Future and Evolving Technologies 2, no. 6 (September 13, 2021): 33–41. http://dx.doi.org/10.52953/snlm1743.

Full text
Abstract:
A new Deep Neural Network (DNN)-based error correction encoder architecture for channels with feedback, called Deep Extended Feedback (DEF), is presented in this paper. The encoder in the DEF architecture transmits an information message followed by a sequence of parity symbols, which are generated based on the message as well as on the observations of the past forward channel outputs sent to the transmitter through a feedback channel. DEF codes generalize Deepcode in several ways: parity symbols are generated based on forward channel output observations over longer time intervals, providing better error correction capability, and high-order modulation formats are deployed in the encoder to achieve increased spectral efficiency. Performance evaluations show that DEF codes outperform other DNN-based codes for channels with feedback.
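A conceptual sketch of a DEF-style encoder loop, where each parity symbol is produced from the message plus the channel observations fed back so far, is shown below; the GRU cell, BPSK mapping, and Gaussian noise model are illustrative assumptions, not the DEF design itself.

# Feedback-driven parity generation with a recurrent cell (conceptual sketch).
import torch
import torch.nn as nn

msg_len, n_parity, noise_sigma = 8, 4, 0.5
gru = nn.GRUCell(input_size=msg_len + 1, hidden_size=32)
to_parity = nn.Linear(32, 1)

message = torch.randint(0, 2, (1, msg_len)).float() * 2 - 1  # BPSK symbols
feedback = torch.zeros(1, 1)            # last observed channel output
h = torch.zeros(1, 32)
parities = []
for _ in range(n_parity):
    h = gru(torch.cat([message, feedback], dim=1), h)
    p = torch.tanh(to_parity(h))        # next parity symbol
    feedback = p + noise_sigma * torch.randn(1, 1)  # noisy forward output, fed back
    parities.append(p)
print(torch.cat(parities, dim=1))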
APA, Harvard, Vancouver, ISO, and other styles
44

Mateen, Muhammad, Junhao Wen, Nasrullah, Sun Song, and Zhouping Huang. "Fundus Image Classification Using VGG-19 Architecture with PCA and SVD." Symmetry 11, no. 1 (December 20, 2018): 1. http://dx.doi.org/10.3390/sym11010001.

Full text
Abstract:
Automated medical image analysis is an emerging field of research that identifies diseases with the help of imaging technology. Diabetic retinopathy (DR) is a retinal disease diagnosed in diabetic patients. Deep neural networks (DNNs) are widely used to classify diabetic retinopathy from fundus images collected from suspected persons. The proposed DR classification system achieves a symmetrically optimized solution through the combination of a Gaussian mixture model (GMM), visual geometry group network (VGGNet), singular value decomposition (SVD) and principal component analysis (PCA), and softmax, for region segmentation, high-dimensional feature extraction, feature selection, and fundus image classification, respectively. The experiments were performed using a standard Kaggle dataset containing 35,126 images. The proposed VGG-19 DNN-based DR model outperformed AlexNet and the scale-invariant feature transform (SIFT) in terms of classification accuracy and computational time. Utilizing PCA and SVD feature selection with fully connected (FC) layers yielded classification accuracies of 92.21%, 98.34%, 97.96%, and 98.13% for FC7-PCA, FC7-SVD, FC8-PCA, and FC8-SVD, respectively.
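The PCA/SVD feature-selection step can be sketched independently of the VGG network; below, random vectors stand in for real FC7 activations, and the 64-component cut-off is an assumption.

# PCA via SVD on (stand-in) FC-layer features before a downstream classifier.
import numpy as np

rng = np.random.default_rng(1)
fc7 = rng.normal(size=(200, 4096))        # 200 images x 4096-dim FC7 features

# Center the features; principal axes are the right-singular vectors.
centered = fc7 - fc7.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
reduced = centered @ vt[:64].T            # keep the top 64 components
print(reduced.shape)                      # (200, 64)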
APA, Harvard, Vancouver, ISO, and other styles
45

Wang, Jihong, Hao Wang, Xiaodan Wang, and Huiyou Chang. "Predicting Drug-target Interactions via FM-DNN Learning." Current Bioinformatics 15, no. 1 (February 6, 2020): 68–76. http://dx.doi.org/10.2174/1574893614666190227160538.

Full text
Abstract:
Background: Identifying drug-target interactions (DTIs) is a major challenge for current drug discovery and drug repositioning. Compared to traditional experimental approaches, in silico methods are fast and inexpensive. With the increase in open-access experimental data, numerous computational methods have been applied to predict DTIs. Methods: In this study, we propose an end-to-end learning model combining a factorization machine and a deep neural network (FM-DNN), which captures both low-order (first- or second-order) and high-order (higher than second-order) feature interactions without any feature engineering beyond the raw features. This approach combines the power of FM and DNN learning in a new neural network architecture. Results: The basic experimental DTI features comprise 609 drug characteristics and 1819 target characteristics, plus the drug ID and target ID, for a total of 2430. We compared eight models, including SVM, GBDT, and WIDE-DEEP; the FM-DNN model obtained the best results, with an AUC of 0.8866 and an AUPR of 0.8281. Conclusion: Feature engineering requires expert knowledge and is often difficult and time-consuming to do well. FM-DNN can automatically learn a low-order representation via the FM and a high-order representation via the DNN, and it has outstanding advantages over other commonly used models.
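A minimal sketch of the FM-plus-DNN combination on raw dense features follows; the factor dimension and hidden sizes are assumptions, and the 2430-dimensional input merely mirrors the feature count reported above.

# Factorization machine (low-order) + DNN (high-order) on raw features.
import torch
import torch.nn as nn

class FMDNN(nn.Module):
    def __init__(self, n_features=2430, k=16):
        super().__init__()
        self.linear = nn.Linear(n_features, 1)          # first-order term
        self.v = nn.Parameter(torch.randn(n_features, k) * 0.01)  # FM factors
        self.dnn = nn.Sequential(nn.Linear(n_features, 128), nn.ReLU(),
                                 nn.Linear(128, 1))     # high-order term

    def forward(self, x):
        # FM second-order term: 0.5 * sum_f [ (x V)_f^2 - (x^2)(V^2)_f ].
        xv = x @ self.v
        pairwise = 0.5 * (xv ** 2 - (x ** 2) @ (self.v ** 2)).sum(1, keepdim=True)
        return torch.sigmoid(self.linear(x) + pairwise + self.dnn(x))

prob = FMDNN()(torch.rand(8, 2430))
print(prob.shape)  # torch.Size([8, 1]) - predicted interaction probabilities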
APA, Harvard, Vancouver, ISO, and other styles
46

Zhang, Tunhou, Hsin-Pai Cheng, Zhenwen Li, Feng Yan, Chengyu Huang, Hai Li, and Yiran Chen. "AutoShrink: A Topology-Aware NAS for Discovering Efficient Neural Architecture." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 6829–36. http://dx.doi.org/10.1609/aaai.v34i04.6163.

Full text
Abstract:
Resources are an important constraint when deploying Deep Neural Networks (DNNs) on mobile and edge devices. Existing works commonly adopt the cell-based search approach, which limits the flexibility of network patterns in learned cell structures. Moreover, due to the topology-agnostic nature of existing works, including both cell-based and node-based approaches, the search process is time-consuming and the performance of the found architecture may be sub-optimal. To address these problems, we propose AutoShrink, a topology-aware Neural Architecture Search (NAS) method for searching efficient building blocks of neural architectures. Our method is node-based and thus can learn flexible network patterns in cell structures within a topological search space. Directed Acyclic Graphs (DAGs) are used to abstract DNN architectures and progressively optimize the cell structure through edge shrinking. As the search space intrinsically shrinks along with the edges, AutoShrink explores a more flexible search space with even less search time. We evaluate AutoShrink on image classification and language tasks by crafting ShrinkCNN and ShrinkRNN models. ShrinkCNN achieves up to 48% parameter reduction and saves 34% Multiply-Accumulates (MACs) on ImageNet-1K with accuracy comparable to state-of-the-art (SOTA) models. Moreover, both ShrinkCNN and ShrinkRNN are crafted within 1.5 GPU hours, which is 7.2× and 6.7× faster than the crafting time of SOTA CNN and RNN models, respectively.
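The edge-shrinking idea can be reduced to a toy loop over a dense DAG; the random edge scores below are a stand-in for AutoShrink's actual evaluation of candidate structures.

# Progressive edge shrinking over a dense DAG (toy stand-in for AutoShrink).
import itertools, random

random.seed(0)
nodes = range(5)
edges = {(i, j) for i, j in itertools.combinations(nodes, 2)}  # dense DAG
score = {e: random.random() for e in edges}   # stand-in for an edge's utility

target = 4                                    # edges to keep in the cell
while len(edges) > target:
    weakest = min(edges, key=lambda e: score[e])
    edges.remove(weakest)                     # shrink the search space
print(sorted(edges))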
APA, Harvard, Vancouver, ISO, and other styles
47

Munoz-Martinez, Francisco, Jose L. Abellan, Manuel E. Acacio, and Tushar Krishna. "STONNE: Enabling Cycle-Level Microarchitectural Simulation for DNN Inference Accelerators." IEEE Computer Architecture Letters 20, no. 2 (July 1, 2021): 122–25. http://dx.doi.org/10.1109/lca.2021.3097253.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Jang, Yongjoo, Sejin Kim, Daehoon Kim, Sungjin Lee, and Jaeha Kung. "Deep Partitioned Training From Near-Storage Computing to DNN Accelerators." IEEE Computer Architecture Letters 20, no. 1 (January 1, 2021): 70–73. http://dx.doi.org/10.1109/lca.2021.3081752.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Lin, Shaoxiong, Wangyou Zhang, and Yanmin Qian. "Two-Stage Single-Channel Speech Enhancement with Multi-Frame Filtering." Applied Sciences 13, no. 8 (April 14, 2023): 4926. http://dx.doi.org/10.3390/app13084926.

Full text
Abstract:
Speech enhancement has been extensively studied and applied in fields such as automatic speech recognition (ASR) and speaker recognition. With advances in deep learning, attempts to apply Deep Neural Networks (DNNs) to speech enhancement have achieved remarkable results, and the quality of enhanced speech has greatly improved. In this study, we propose a two-stage model for single-channel speech enhancement. The model has two DNNs with the same architecture. In the first stage, only the first DNN is trained. In the second stage, the second DNN is trained to refine the enhanced output of the first DNN, while the first DNN is frozen. A multi-frame filter is introduced to help the second DNN reduce the distortion of the enhanced speech. Experimental results on both synthetic and real datasets show that the proposed model outperforms other enhancement models not only in terms of speech enhancement evaluation metrics and word error rate (WER), but also in its superior generalization ability. The results of the ablation experiments also demonstrate that combining the two-stage model with the multi-frame filter yields better enhancement performance and less distortion.
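The two-stage training scheme can be sketched as follows; the shared architecture, the 257-bin spectral features, and the single gradient steps are illustrative assumptions, and the multi-frame filter is omitted.

# Stage 1: train DNN1 alone. Stage 2: freeze DNN1, train DNN2 on its output.
import torch
import torch.nn as nn

def make_dnn():                      # same architecture for both stages
    return nn.Sequential(nn.Linear(257, 512), nn.ReLU(), nn.Linear(512, 257))

dnn1, dnn2 = make_dnn(), make_dnn()
noisy, clean = torch.randn(16, 257), torch.randn(16, 257)

# Stage 1: train only dnn1 (one illustrative step).
opt1 = torch.optim.Adam(dnn1.parameters())
nn.functional.mse_loss(dnn1(noisy), clean).backward()
opt1.step()

# Stage 2: freeze dnn1, train dnn2 to refine dnn1's estimate.
for p in dnn1.parameters():
    p.requires_grad_(False)
opt2 = torch.optim.Adam(dnn2.parameters())
refined = dnn2(dnn1(noisy))
nn.functional.mse_loss(refined, clean).backward()
opt2.step()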
APA, Harvard, Vancouver, ISO, and other styles
50

Xiao, Dongwei, Zhibo Liu, Yuanyuan Yuan, Qi Pang, and Shuai Wang. "Metamorphic Testing of Deep Learning Compilers." ACM SIGMETRICS Performance Evaluation Review 50, no. 1 (June 20, 2022): 65–66. http://dx.doi.org/10.1145/3547353.3522655.

Full text
Abstract:
The prosperous trend of deploying deep neural network (DNN) models on diverse hardware platforms has boosted the development of deep learning (DL) compilers. DL compilers take high-level DNN model specifications as input and generate optimized DNN executables for diverse hardware architectures such as CPUs, GPUs, and hardware accelerators. We introduce MT-DLComp, a metamorphic testing framework specifically designed for DL compilers to uncover erroneous compilations. Our approach leverages deliberately designed metamorphic relations (MRs) to launch semantics-preserving mutations on DNN models to generate their variants. This way, DL compilers can be automatically tested for compilation correctness by comparing the execution outputs of the compiled DNN models and their variants without manual intervention. We detected over 435 inputs that result in erroneous compilations in four popular DL compilers, all of which are industry-strength products maintained by Amazon, Facebook, Microsoft, and Google. We uncovered four bugs in these compilers by debugging them using the error-triggering inputs.
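The metamorphic-relation idea can be demonstrated with a NumPy stand-in for the compile-and-run step: apply a semantics-preserving mutation to a model and flag a bug if the outputs diverge. In the real framework the two variants would be compiled and executed by an actual DL compiler, which is not modeled here.

# A semantics-preserving mutation (identity insertion) and an output check.
import numpy as np

def model(x, w):
    return np.maximum(x @ w, 0.0)            # ReLU(x W)

def mutated_model(x, w):
    eye = np.eye(w.shape[0])
    return np.maximum((x @ eye) @ w, 0.0)    # same semantics, mutated graph

rng = np.random.default_rng(2)
x, w = rng.normal(size=(4, 8)), rng.normal(size=(8, 3))
out_a, out_b = model(x, w), mutated_model(x, w)
# A correct compiler must preserve this relation; divergence signals a bug.
assert np.allclose(out_a, out_b, atol=1e-6), "possible mis-compilation"
print("metamorphic relation holds")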
APA, Harvard, Vancouver, ISO, and other styles