Academic literature on the topic 'Deep Photonic Neural Networks'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Deep Photonic Neural Networks.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Deep Photonic Neural Networks"

1

Pai, Sunil, Zhanghao Sun, Tyler W. Hughes, Taewon Park, Ben Bartlett, Ian A. D. Williamson, Momchil Minkov, et al. "Experimentally realized in situ backpropagation for deep learning in photonic neural networks." Science 380, no. 6643 (April 28, 2023): 398–404. http://dx.doi.org/10.1126/science.ade8450.

Full text
Abstract:
Integrated photonic neural networks provide a promising platform for energy-efficient, high-throughput machine learning with extensive scientific and commercial applications. Photonic neural networks efficiently transform optically encoded inputs using Mach-Zehnder interferometer mesh networks interleaved with nonlinearities. We experimentally trained a three-layer, four-port silicon photonic neural network with programmable phase shifters and optical power monitoring to solve classification tasks using “in situ backpropagation,” a photonic analog of the most popular method to train conventional neural networks. We measured backpropagated gradients for phase-shifter voltages by interfering forward- and backward-propagating light and simulated in situ backpropagation for 64-port photonic neural networks trained on MNIST image recognition given errors. All experiments performed comparably to digital simulations (>94% test accuracy), and energy scaling analysis indicated a route to scalable machine learning.
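
The scheme above measures gradients optically by interfering forward- and backward-propagating fields. Below is a minimal NumPy sketch of the underlying adjoint identity, not the authors' implementation: a random unitary stands in for the MZI mesh, and all variable names and values are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(n):
    # Haar-like random unitary via QR decomposition (a stand-in for an MZI mesh section).
    q, r = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return q * (np.diag(r) / np.abs(np.diag(r)))

n = 4
U_in, U_out = random_unitary(n), random_unitary(n)   # fixed mesh sections
phi = rng.uniform(0, 2 * np.pi, n)                   # programmable phase shifters
x = rng.normal(size=n) + 1j * rng.normal(size=n)     # optically encoded input
t = rng.normal(size=n) + 1j * rng.normal(size=n)     # target output field

def forward(phi):
    a = U_in @ x                      # field arriving at the phase shifters
    b = np.exp(1j * phi) * a          # field leaving the phase shifters
    y = U_out @ b                     # output field
    return a, b, y

def loss(phi):
    _, _, y = forward(phi)
    return np.sum(np.abs(y - t) ** 2)

# Adjoint ("in situ backpropagation"-style) gradient: combine the forward field
# with the backward-propagated error field at each phase shifter.
a, b, y = forward(phi)
err_back = U_out.conj().T @ (y - t)                  # error field propagated backward
grad_adjoint = -2 * np.imag(b * np.conj(err_back))

# Sanity check against central finite differences.
eps = 1e-6
grad_fd = np.array([(loss(phi + eps * np.eye(n)[k]) - loss(phi - eps * np.eye(n)[k])) / (2 * eps)
                    for k in range(n)])
print(np.allclose(grad_adjoint, grad_fd, atol=1e-5))  # should print True
```
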
APA, Harvard, Vancouver, ISO, and other styles
2

Sheng, Huayi. "Review of Integrated Diffractive Deep Neural Networks." Highlights in Science, Engineering and Technology 24 (December 27, 2022): 264–78. http://dx.doi.org/10.54097/hset.v24i.3957.

Full text
Abstract:
An integrated photonic diffractive deep neural network (ID²NN) is one of the most exciting cross-disciplinary fields between artificial intelligence and optical computing, combining deep learning with the power of light-speed processing on an integrated platform. Neural networks on digital computers are built on transistors, which face significant challenges in keeping pace with Moore's law and limit real-time processing applications because of their growing computational costs. However, with the remarkable progress in silicon photonic integrated circuits over the last few decades, ID²NNs hold the promise of on-chip miniaturisation and high-speed performance with low power consumption. This paper covers the essential theoretical background for constructing an ID²NN and reviews the research status of optical diffractive neural networks in the field of neuromorphic computing. Problems that narrow down current ID²NN applications are also included in this review. Finally, future research directions for ID²NNs are discussed, and conclusions are delivered.
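
For readers unfamiliar with diffractive deep neural networks, each layer can be modelled numerically as a trainable phase mask followed by free-space diffraction. The sketch below is a generic angular-spectrum toy model, not code from the review; the grid size, wavelength, pixel pitch, and propagation distance are assumed values.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, pixel, distance):
    """Propagate a 2-D complex field by `distance` using the angular-spectrum method."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=pixel)
    FX, FY = np.meshgrid(fx, fx, indexing="ij")
    k = 2 * np.pi / wavelength
    kz = np.sqrt(np.maximum(0.0, k**2 - (2 * np.pi * FX)**2 - (2 * np.pi * FY)**2))
    H = np.exp(1j * kz * distance)            # transfer function (evanescent components dropped)
    return np.fft.ifft2(np.fft.fft2(field) * H)

def diffractive_layer(field, phase_mask, wavelength, pixel, distance):
    """One D2NN layer: multiply by a trainable phase mask, then diffract to the next plane."""
    return angular_spectrum_propagate(field * np.exp(1j * phase_mask), wavelength, pixel, distance)

# Toy usage: push a plane wave through two random phase layers and read detector intensities.
n, wavelength, pixel, distance = 64, 10.6e-6, 20e-6, 2e-3   # assumed values
rng = np.random.default_rng(1)
field = np.ones((n, n), dtype=complex)
for _ in range(2):
    field = diffractive_layer(field, rng.uniform(0, 2 * np.pi, (n, n)), wavelength, pixel, distance)
intensity = np.abs(field) ** 2                # what photodetectors at the output plane would measure
print(intensity.shape)
```
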
APA, Harvard, Vancouver, ISO, and other styles
3

Jiang, Jiaqi, and Jonathan A. Fan. "Multiobjective and categorical global optimization of photonic structures based on ResNet generative neural networks." Nanophotonics 10, no. 1 (September 22, 2020): 361–69. http://dx.doi.org/10.1515/nanoph-2020-0407.

Full text
Abstract:
We show that deep generative neural networks, based on global optimization networks (GLOnets), can be configured to perform the multiobjective and categorical global optimization of photonic devices. A residual network scheme enables GLOnets to evolve from a deep architecture, which is required to properly search the full design space early in the optimization process, to a shallow network that generates a narrow distribution of globally optimal devices. As a proof-of-concept demonstration, we adapt our method to design thin-film stacks consisting of multiple material types. Benchmarks with known globally optimized antireflection structures indicate that GLOnets can find the global optimum with orders of magnitude faster speeds compared to conventional algorithms. We also demonstrate the utility of our method in complex design tasks with its application to incandescent light filters. These results indicate that advanced concepts in deep learning can push the capabilities of inverse design algorithms for photonics.
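
The essence of the GLOnet approach described above is a generator network whose samples are pushed toward high figures of merit through an exponentially reweighted loss. The toy sketch below substitutes a simple differentiable function for the electromagnetic (adjoint) simulation and a plain fully connected generator for the ResNet generator, so it illustrates only the training loop, under those stated assumptions.

```python
import torch

# Toy differentiable figure of merit for a "device" described by a parameter vector
# (stands in for the electromagnetic solver used in the actual GLOnet work).
def figure_of_merit(x):
    return -((x - 0.7) ** 2).sum(dim=-1)       # maximized when every parameter equals 0.7

generator = torch.nn.Sequential(                # latent noise -> candidate device parameters
    torch.nn.Linear(8, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 5), torch.nn.Sigmoid(),
)
opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
sigma = 0.5                                     # temperature of the exponential reweighting

for step in range(2000):
    z = torch.randn(128, 8)                     # batch of latent samples
    devices = generator(z)                      # candidate devices in [0, 1]
    fom = figure_of_merit(devices)
    loss = -torch.exp(fom / sigma).mean()       # GLOnet-style loss: reward high-FoM samples exponentially
    opt.zero_grad()
    loss.backward()
    opt.step()

# The generated distribution should collapse toward the optimum.
print(figure_of_merit(generator(torch.randn(1000, 8))).mean())
```
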
APA, Harvard, Vancouver, ISO, and other styles
4

Mao, Simei, Lirong Cheng, Caiyue Zhao, Faisal Nadeem Khan, Qian Li, and H. Y. Fu. "Inverse Design for Silicon Photonics: From Iterative Optimization Algorithms to Deep Neural Networks." Applied Sciences 11, no. 9 (April 23, 2021): 3822. http://dx.doi.org/10.3390/app11093822.

Full text
Abstract:
Silicon photonics is a low-cost and versatile platform for various applications. When designing silicon photonic devices, the light-matter interaction within their complex subwavelength geometry is difficult to investigate analytically, so numerical simulations are mainly adopted. To make the design process more time-efficient and to push device performance to its physical limits, various methods have been proposed over the past few years to manipulate the geometries of the silicon platform for specific applications. In this review paper, we summarize the design methodologies for silicon photonics, including iterative optimization algorithms and deep neural networks. For iterative optimization methods, we discuss different scenarios in order of increasing degrees of freedom: empirical structures, QR-code-like structures, and irregular structures. We also review inverse design approaches assisted by deep neural networks, which generate multiple devices with similar structures much faster than iterative optimization methods and are thus suitable when large numbers of optical components are needed. Finally, the applications of inverse design methodology in optical neural networks are also discussed. This review intends to suggest the most suitable design methodology for a given scenario.
APA, Harvard, Vancouver, ISO, and other styles
5

Dang, Dharanidhar, Sai Vineel Reddy Chittamuru, Sudeep Pasricha, Rabi Mahapatra, and Debashis Sahoo. "BPLight-CNN: A Photonics-Based Backpropagation Accelerator for Deep Learning." ACM Journal on Emerging Technologies in Computing Systems 17, no. 4 (October 31, 2021): 1–26. http://dx.doi.org/10.1145/3446212.

Full text
Abstract:
Training deep learning networks involves continuous weight updates across the various layers of the deep network while using a backpropagation (BP) algorithm. This results in expensive computation overheads during training. Consequently, most deep learning accelerators today employ pretrained weights and focus only on improving the design of the inference phase. The recent trend is to build a complete deep learning accelerator by incorporating the training module. Such efforts require an ultra-fast chip architecture for executing the BP algorithm. In this article, we propose a novel photonics-based backpropagation accelerator for high-performance deep learning training. We present the design for a convolutional neural network (CNN), BPLight-CNN, which incorporates the silicon photonics-based backpropagation accelerator. BPLight-CNN is a first-of-its-kind photonic and memristor-based CNN architecture for end-to-end training and prediction. We evaluate BPLight-CNN using a photonic CAD framework (IPKISS) on deep learning benchmark models, including LeNet and VGG-Net. The proposed design achieves (i) at least 34× speedup, 34× improvement in computational efficiency, and 38.5× energy savings during training; and (ii) 29× speedup, 31× improvement in computational efficiency, and 38.7× improvement in energy savings during inference compared with the state-of-the-art designs. All of these comparisons are done at a 16-bit resolution, and BPLight-CNN achieves these improvements at a cost of approximately 6% lower accuracy compared with the state-of-the-art.
APA, Harvard, Vancouver, ISO, and other styles
6

Ahmed, Moustafa, Yas Al-Hadeethi, Ahmed Bakry, Hamed Dalir, and Volker J. Sorger. "Integrated photonic FFT for photonic tensor operations towards efficient and high-speed neural networks." Nanophotonics 9, no. 13 (June 26, 2020): 4097–108. http://dx.doi.org/10.1515/nanoph-2020-0055.

Full text
Abstract:
The technologically relevant task of feature extraction from data performed in deep-learning systems is routinely accomplished as repeated fast Fourier transforms (FFT) electronically in prevalent domain-specific architectures such as graphics processing units (GPUs). However, electronic systems are limited with respect to power dissipation and delay, owing to wire-charging challenges related to interconnect capacitance. Here we present a silicon photonics-based architecture for convolutional neural networks that harnesses the phase property of light to perform FFTs efficiently by executing the convolution as a multiplication in the Fourier domain. The algorithmic execution time is determined by the time of flight of the signal through this photonic reconfigurable passive FFT ‘filter’ circuit and is on the order of tens of picoseconds. A sensitivity analysis shows that this optical processor must be thermally phase-stabilized to within a few degrees. Furthermore, we find that, for a small sample number, the obtainable number of convolutions per unit time, power, and chip area outperforms GPUs by about two orders of magnitude. Lastly, we show that, conceptually, the optical FFT and convolution-processing performance is directly linked to the optoelectronic device level, and that improvements in plasmonics, metamaterials, or nanophotonics are fueling next-generation densely interconnected intelligent photonic circuits with relevance for edge-computing 5G networks by processing tensor operations optically.
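
The architecture above relies on the convolution theorem: convolution in the signal domain equals elementwise multiplication in the Fourier domain. A short, purely software check of that identity (no photonics involved; sizes and data are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
x = rng.normal(size=n)            # input vector (e.g., one row of image data)
h = rng.normal(size=n)            # convolution kernel

# Convolution theorem: circular convolution equals an elementwise product of spectra.
via_fft = np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)).real

# Direct circular convolution for reference.
direct = np.array([sum(x[k] * h[(i - k) % n] for k in range(n)) for i in range(n)])

print(np.allclose(via_fft, direct))   # should print True
```
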
APA, Harvard, Vancouver, ISO, and other styles
7

Sun, Yichen, Mingli Dong, Mingxin Yu, Jiabin Xia, Xu Zhang, Yuchen Bai, Lidan Lu, and Lianqing Zhu. "Nonlinear All-Optical Diffractive Deep Neural Network with 10.6 μm Wavelength for Image Classification." International Journal of Optics 2021 (February 27, 2021): 1–16. http://dx.doi.org/10.1155/2021/6667495.

Full text
Abstract:
A photonic artificial intelligence chip based on an optical neural network (ONN) offers low power consumption, low delay, and strong anti-interference ability. The all-optical diffractive deep neural network has recently demonstrated its inference capabilities on image classification tasks. However, the physical model has not yet been miniaturized or integrated, and optical nonlinearity has not been incorporated into the diffractive neural network. By introducing nonlinear characteristics into the network, complex tasks can be completed with high accuracy. In this study, a nonlinear all-optical diffractive deep neural network (N-D²NN) model operating at a 10.6 μm wavelength is constructed by combining the ONN with complex-valued neural networks and introducing a nonlinear activation function into the structure. Specifically, improved variants of the rectified linear unit (ReLU), i.e., Leaky-ReLU, parametric ReLU (PReLU), and randomized ReLU (RReLU), are selected as the activation functions of the N-D²NN model. Numerical simulation proves that the N-D²NN model based on the 10.6 μm wavelength has excellent representation ability, enabling it to perform classification learning tasks on the MNIST handwritten digit dataset and the Fashion-MNIST dataset, respectively. The results show that the N-D²NN model with the RReLU activation function achieves the highest classification accuracies of 97.86% and 89.28%, respectively. These results provide a theoretical basis for the fabrication of miniaturized and integrated N-D²NN photonic artificial intelligence chips.
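
The activation functions compared in this paper differ only in how they scale negative inputs. Their standard functional forms, written out in NumPy for reference (textbook definitions, not code from the paper):

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Fixed small slope for negative inputs.
    return np.where(x > 0, x, alpha * x)

def prelu(x, alpha):
    # Same form as Leaky-ReLU, but alpha is a learnable parameter.
    return np.where(x > 0, x, alpha * x)

def rrelu(x, lower=1/8, upper=1/3, rng=np.random.default_rng(0)):
    # Negative slope drawn at random per element during training;
    # at inference time the mean slope (lower + upper) / 2 is typically used.
    alpha = rng.uniform(lower, upper, size=np.shape(x))
    return np.where(x > 0, x, alpha * x)

x = np.linspace(-2, 2, 5)
print(leaky_relu(x), prelu(x, alpha=0.25), rrelu(x), sep="\n")
```
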
APA, Harvard, Vancouver, ISO, and other styles
8

Ren, Yangming, Lingxuan Zhang, Weiqiang Wang, Xinyu Wang, Yufang Lei, Yulong Xue, Xiaochen Sun, and Wenfu Zhang. "Genetic-algorithm-based deep neural networks for highly efficient photonic device design." Photonics Research 9, no. 6 (May 24, 2021): B247. http://dx.doi.org/10.1364/prj.416294.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Asano, Takashi, and Susumu Noda. "Iterative optimization of photonic crystal nanocavity designs by using deep neural networks." Nanophotonics 8, no. 12 (November 16, 2019): 2243–56. http://dx.doi.org/10.1515/nanoph-2019-0308.

Full text
Abstract:
Devices based on two-dimensional photonic-crystal nanocavities, which are defined by their air-hole patterns, usually require a high quality (Q) factor to achieve high performance. We demonstrate that hole patterns with very high Q factors can be found efficiently by an iterative procedure consisting of machine learning of the relation between the hole pattern and the corresponding Q factor, and new dataset generation based on the regression function obtained by machine learning. First, a dataset comprising randomly generated cavity structures and their first-principles Q factors is prepared. Then a deep neural network is trained on the initial dataset to obtain a regression function that approximately predicts the Q factors from the structural parameters. Several candidates for higher Q factors are chosen by searching the parameter space using the regression function. After adding these new structures and their first-principles Q factors to the training dataset, the above process is repeated. As an example, a standard silicon-based L3 cavity is optimized by this method. A cavity design with a high Q factor exceeding 11 million is found within 101 iteration steps and a total of 8070 cavity structures. This theoretical Q factor is more than twice the previously reported record values for cavity designs found by the evolutionary algorithm and the leaky-mode visualization method. It is found that structures with higher Q factors can be detected in fewer iteration steps by exploring not only the parameter space near the present highest-Q structure but also regions distant from the present dataset.
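
The iterative procedure described above alternates between fitting a regressor and using it to propose new structures for first-principles simulation. Here is a schematic Python loop with a toy objective standing in for the Q-factor simulation and an off-the-shelf regressor standing in for the authors' deep network; all choices below are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
dim = 10                                        # number of structural parameters (e.g., hole shifts)

def expensive_simulation(x):
    # Stand-in for a first-principles Q-factor calculation.
    return -np.sum((x - 0.3) ** 2, axis=-1)

# Initial random dataset of structures and their simulated figures of merit.
X = rng.uniform(-1, 1, size=(200, dim))
y = expensive_simulation(X)

for iteration in range(10):
    # 1. Train a neural-network regressor on everything simulated so far.
    model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0).fit(X, y)
    # 2. Search the parameter space cheaply with the regressor.
    candidates = rng.uniform(-1, 1, size=(20000, dim))
    best = candidates[np.argsort(model.predict(candidates))[-20:]]
    # 3. Simulate the most promising candidates and add them to the dataset.
    X = np.vstack([X, best])
    y = np.concatenate([y, expensive_simulation(best)])
    print(iteration, y.max())
```
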
APA, Harvard, Vancouver, ISO, and other styles
10

Li, Renjie, Xiaozhe Gu, Yuanwen Shen, Ke Li, Zhen Li, and Zhaoyu Zhang. "Smart and Rapid Design of Nanophotonic Structures by an Adaptive and Regularized Deep Neural Network." Nanomaterials 12, no. 8 (April 16, 2022): 1372. http://dx.doi.org/10.3390/nano12081372.

Full text
Abstract:
The design of nanophotonic structures based on deep learning is emerging rapidly in the research community. Design methods using Deep Neural Networks (DNNs) are outperforming conventional physics-based simulations performed iteratively by human experts. Here, a self-adaptive and regularized DNN based on Convolutional Neural Networks (CNNs) for the smart and fast characterization of nanophotonic structures in high-dimensional design parameter spaces is presented. The proposed CNN model, named LRS-RCNN, utilizes dynamic learning-rate scheduling and L2 regularization to overcome overfitting and speed up training convergence, and is shown to surpass the performance of all previous algorithms, with the exception of two metrics on which it achieves a comparable level relative to prior works. We applied the model to two challenging types of photonic structures: 2D photonic crystals (e.g., the L3 nanocavity) and 1D photonic crystals (e.g., the nanobeam), and the results show that LRS-RCNN achieves record-high prediction accuracies, strong generalizability, and substantially faster convergence than prior works. Although still a proof-of-concept model, the proposed LRS-RCNN has been shown to greatly accelerate the design of photonic crystal structures as a state-of-the-art predictor for both the Q factor and the mode volume V. It can also be modified and generalized to predict any type of optical property for designing a wide range of nanophotonic structures. The complete dataset and code will be released to aid the development of related research endeavors.
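
The two training ingredients named in the abstract, dynamic learning-rate scheduling and L2 regularization, are generic techniques. One way they are commonly wired together in PyTorch (an illustrative recipe, not the LRS-RCNN code; the tiny network and random data are placeholders):

```python
import torch

model = torch.nn.Sequential(                 # placeholder network; the paper uses a CNN regressor
    torch.nn.Linear(16, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1))

# L2 regularization enters through the optimizer's weight_decay term.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
# Dynamic learning-rate scheduling: shrink the LR when the monitored loss plateaus.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5, patience=10)

x, y = torch.randn(256, 16), torch.randn(256, 1)   # toy data standing in for (structure, Q/V) pairs
for epoch in range(200):
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step(loss)                     # a held-out validation loss would normally be used here
```
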
APA, Harvard, Vancouver, ISO, and other styles

Dissertations / Theses on the topic "Deep Photonic Neural Networks"

1

Liu, Qian. "Deep spiking neural networks." Thesis, University of Manchester, 2018. https://www.research.manchester.ac.uk/portal/en/theses/deep-spiking-neural-networks(336e6a37-2a0b-41ff-9ffb-cca897220d6c).html.

Full text
Abstract:
Neuromorphic Engineering (NE) has led to the development of biologically-inspired computer architectures whose long-term goal is to approach the performance of the human brain in terms of energy efficiency and cognitive capabilities. Although there are a number of neuromorphic platforms available for large-scale Spiking Neural Network (SNN) simulations, the problem of programming these brain-like machines to be competent in cognitive applications still remains unsolved. On the other hand, Deep Learning has emerged in Artificial Neural Network (ANN) research to dominate state-of-the-art solutions for cognitive tasks. Thus the main research problem emerges of understanding how to operate and train biologically-plausible SNNs to close the gap in cognitive capabilities between SNNs and ANNs. SNNs can be trained by first training an equivalent ANN and then transferring the tuned weights to the SNN. This method is called ‘off-line’ training, since it does not take place on an SNN directly, but rather on an ANN instead. However, previous work on such off-line training methods has struggled in terms of poor modelling accuracy of the spiking neurons and high computational complexity. In this thesis we propose a simple and novel activation function, Noisy Softplus (NSP), to closely model the response firing activity of biologically-plausible spiking neurons, and introduce a generalised off-line training method using the Parametric Activation Function (PAF) to map the abstract numerical values of the ANN to concrete physical units, such as current and firing rate in the SNN. Based on this generalised training method and its fine tuning, we achieve the state-of-the-art accuracy on the MNIST classification task using spiking neurons, 99.07%, on a deep spiking convolutional neural network (ConvNet). We then take a step forward to ‘on-line’ training methods, where Deep Learning modules are trained purely on SNNs in an event-driven manner. Existing work has failed to provide SNNs with recognition accuracy equivalent to ANNs due to the lack of mathematical analysis. Thus we propose a formalised Spike-based Rate Multiplication (SRM) method which transforms the product of firing rates to the number of coincident spikes of a pair of rate-coded spike trains. Moreover, these coincident spikes can be captured by the Spike-Time-Dependent Plasticity (STDP) rule to update the weights between the neurons in an on-line, event-based, and biologically-plausible manner. Furthermore, we put forward solutions to reduce correlations between spike trains; thereby addressing the result of performance drop in on-line SNN training. The promising results of spiking Autoencoders (AEs) and Restricted Boltzmann Machines (SRBMs) exhibit equivalent, sometimes even superior, classification and reconstruction capabilities compared to their non-spiking counterparts. To provide meaningful comparisons between these proposed SNN models and other existing methods within this rapidly advancing field of NE, we propose a large dataset of spike-based visual stimuli and a corresponding evaluation methodology to estimate the overall performance of SNN models and their hardware implementations.
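
The Noisy Softplus activation proposed in this thesis is, roughly, a softplus whose scale tracks the input noise level so that it mimics the response firing rate of a noisy integrate-and-fire neuron. A hedged sketch of such a noise-scaled softplus follows; the exact parameterization and constants in the thesis may differ, and k = 0.3 below is an assumption.

```python
import numpy as np

def noisy_softplus(x, sigma, k=0.3):
    """Noise-scaled softplus: a smooth approximation to a noisy spiking neuron's
    response firing rate, where sigma is the noise level of the input current and
    k is a fitted scaling constant (the value 0.3 here is an illustrative assumption)."""
    s = k * sigma
    return s * np.log1p(np.exp(x / s))

x = np.linspace(-1.0, 1.0, 5)
for sigma in (0.5, 1.0, 2.0):
    print(sigma, noisy_softplus(x, sigma))
```
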
APA, Harvard, Vancouver, ISO, and other styles
2

Squadrani, Lorenzo. "Deep neural networks and thermodynamics." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020.

Find full text
Abstract:
Deep learning is the most effective and widely used approach to artificial intelligence, and yet it is far from being properly understood. Understanding it is the way to further improve its effectiveness and, in the best case, to gain some understanding of "natural" intelligence. We attempt a step in this direction with the aid of physics. We describe a convolutional neural network for image classification (trained on CIFAR-10) within the descriptive framework of thermodynamics. In particular, we define and study the temperature of each component of the network. Our results provide a new point of view on deep learning models, which may be a starting point towards a better understanding of artificial intelligence.
APA, Harvard, Vancouver, ISO, and other styles
3

Mancevo, del Castillo Ayala Diego. "Compressing Deep Convolutional Neural Networks." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-217316.

Full text
Abstract:
Deep Convolutional Neural Networks and "deep learning" in general stand at the cutting edge on a range of applications, from image based recognition and classification to natural language processing, speech and speaker recognition and reinforcement learning. Very deep models however are often large, complex and computationally expensive to train and evaluate. Deep learning models are thus seldom deployed natively in environments where computational resources are scarce or expensive. To address this problem we turn our attention towards a range of techniques that we collectively refer to as "model compression" where a lighter student model is trained to approximate the output produced by the model we wish to compress. To this end, the output from the original model is used to craft the training labels of the smaller student model. This work contains some experiments on CIFAR-10 and demonstrates how to use the aforementioned techniques to compress a people counting model whose precision, recall and F1-score are improved by as much as 14% against our baseline.
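
The "model compression" setup described above, where a lighter student is trained on the outputs of the model being compressed, is usually implemented as knowledge distillation. A generic PyTorch sketch of that recipe (not the thesis code; the networks, temperature, and mixing weight are placeholders):

```python
import torch
import torch.nn.functional as F

teacher = torch.nn.Sequential(torch.nn.Linear(32, 256), torch.nn.ReLU(), torch.nn.Linear(256, 10))
student = torch.nn.Sequential(torch.nn.Linear(32, 32), torch.nn.ReLU(), torch.nn.Linear(32, 10))
teacher.eval()                                     # assume the teacher is already trained

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
T, alpha = 4.0, 0.7                                # softmax temperature and soft/hard loss mix

x = torch.randn(512, 32)                           # toy inputs
hard_labels = torch.randint(0, 10, (512,))         # ground-truth labels (when available)

for step in range(100):
    with torch.no_grad():
        teacher_logits = teacher(x)
    student_logits = student(x)
    # Soft targets: match the teacher's temperature-softened output distribution.
    soft_loss = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                         F.softmax(teacher_logits / T, dim=1),
                         reduction="batchmean") * T * T
    hard_loss = F.cross_entropy(student_logits, hard_labels)
    loss = alpha * soft_loss + (1 - alpha) * hard_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```
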
APA, Harvard, Vancouver, ISO, and other styles
4

Abbasi, Mahdieh. "Toward robust deep neural networks." Doctoral thesis, Université Laval, 2020. http://hdl.handle.net/20.500.11794/67766.

Full text
Abstract:
In this thesis, our goal is to develop robust and reliable yet accurate learning models, particularly Convolutional Neural Networks (CNNs), in the presence of adversarial examples and Out-of-Distribution (OOD) samples. As the first contribution, we propose to predict adversarial instances with high uncertainty by encouraging diversity in an ensemble of CNNs. To this end, we devise an ensemble of diverse specialists along with a simple and computationally efficient voting mechanism to predict adversarial examples with low confidence while keeping the predictive confidence of clean samples high. In the presence of high entropy in our ensemble, we prove that the predictive confidence can be upper-bounded, leading to a globally fixed threshold over the predictive confidence for identifying adversaries. We analytically justify the role of diversity in our ensemble in mitigating the risk of both black-box and white-box adversarial examples. Finally, we empirically assess the robustness of our ensemble to black-box and white-box attacks on several benchmark datasets. The second contribution aims to address the detection of OOD samples through an end-to-end model trained on an appropriate OOD set. To this end, we address the following central question: how can the many available OOD sets be differentiated with respect to a given in-distribution task so as to select the most appropriate one, which in turn induces a model with a high detection rate on unseen OOD sets? To answer this question, we hypothesize that the "protection" level of in-distribution sub-manifolds by each OOD set can be a good property for differentiating OOD sets. To measure the protection level, we design three novel, simple, and cost-effective metrics using a pre-trained vanilla CNN. In an extensive series of experiments on image and audio classification tasks, we empirically demonstrate the ability of an Augmented-CNN (A-CNN) and an explicitly calibrated CNN to detect a significantly larger portion of unseen OOD samples, provided they are trained on the most protective OOD set. Interestingly, we also observe that the A-CNN trained on the most protective OOD set can also detect black-box Fast Gradient Sign (FGS) adversarial examples. As the third contribution, we investigate more closely the capacity of the A-CNN for detecting wider types of black-box adversaries. To increase the capability of the A-CNN to detect a larger number of adversaries, we augment its OOD training set with inter-class interpolated samples. Then, we demonstrate that the A-CNN trained on the most protective OOD set along with the interpolated samples has a consistent detection rate on all types of unseen adversarial examples, whereas training an A-CNN on Projected Gradient Descent (PGD) adversaries does not lead to a stable detection rate on all types of adversaries, particularly the unseen ones. We also visually assess the feature space and the decision boundaries in the input space of a vanilla CNN and its augmented counterpart in the presence of adversarial and clean samples. With a properly trained A-CNN, we aim to take a step toward a unified and reliable end-to-end learning model with small risk rates on both clean samples and unusual ones, e.g., adversarial and OOD samples. The last contribution is to show a use case of the A-CNN for training a robust object detector on a partially labeled dataset, particularly a merged dataset.
Merging various datasets from similar contexts but with different sets of Objects of Interest (OoI) is an inexpensive way to craft a large-scale dataset that covers a larger spectrum of OoIs. Moreover, merging datasets allows achieving a unified object detector, instead of having several separate ones, resulting in reduced computational and time costs. However, merging datasets, especially from a similar context, causes many missing-label instances. With the goal of training an integrated robust object detector on a partially labeled but large-scale dataset, we propose a self-supervised training framework to overcome the issue of missing-label instances in the merged datasets. Our framework is evaluated on a merged dataset with a high missing-label rate. The empirical results confirm the viability of our generated pseudo-labels in enhancing the performance of YOLO, as the current (to date) state-of-the-art object detector.
APA, Harvard, Vancouver, ISO, and other styles
5

Lu, Yifei. "Deep neural networks and fraud detection." Thesis, Uppsala universitet, Tillämpad matematik och statistik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-331833.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Kalogiras, Vasileios. "Sentiment Classification with Deep Neural Networks." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-217858.

Full text
Abstract:
Sentiment analysis is a subfield of natural language processing (NLP) that attempts to analyze the sentiment of written text. It is a complex problem that entails many challenges, and for this reason it has been studied extensively. In past years, traditional machine learning algorithms and handcrafted methodologies provided state-of-the-art results. However, the recent deep learning renaissance has shifted interest towards end-to-end deep learning models. On the one hand this has resulted in more powerful models, but on the other hand clear mathematical reasoning or intuition behind distinct models is still lacking. As a result, this thesis attempts to shed some light on recently proposed deep learning architectures for sentiment classification. A study of their differences is performed, along with empirical results on how changes in the structure or capacity of a model can affect its accuracy and the way it represents and "comprehends" sentences.
APA, Harvard, Vancouver, ISO, and other styles
7

Choi, Keunwoo. "Deep neural networks for music tagging." Thesis, Queen Mary, University of London, 2018. http://qmro.qmul.ac.uk/xmlui/handle/123456789/46029.

Full text
Abstract:
In this thesis, I present my hypothesis, experimental results, and discussion related to various aspects of deep neural networks for music tagging. Music tagging is the task of automatically predicting suitable semantic labels for a given piece of music. Generally speaking, the input of a music tagging system can be any entity that constitutes music, e.g., audio content, lyrics, or metadata, but only the audio content is considered in this thesis. My hypothesis is that we can find effective deep learning practices for music tagging that improve classification performance. As a computational model to realise a music tagging system, I use deep neural networks. Combined with the research problem, the scope of this thesis is the understanding, interpretation, optimisation, and application of deep neural networks in the context of music tagging systems. The ultimate goal of this thesis is to provide insight that can help improve deep learning-based music tagging systems. There are many smaller goals in this regard. Since using deep neural networks is a data-driven approach, it is crucial to understand the dataset. Selecting and designing a better architecture is the next topic to discuss. Since tagging is done with audio input, preprocessing the audio signal becomes one of the important research topics. After building (or training) a music tagging system, finding a suitable way to re-use it for other music information retrieval tasks is a compelling topic, in addition to interpreting the trained system. The evidence presented in the thesis supports the view that deep neural networks are powerful and credible methods for building a music tagging system.
APA, Harvard, Vancouver, ISO, and other styles
8

Yin, Yonghua. "Random neural networks for deep learning." Thesis, Imperial College London, 2018. http://hdl.handle.net/10044/1/64917.

Full text
Abstract:
The random neural network (RNN) is a mathematical model for an 'integrate and fire' spiking network that closely resembles the stochastic behaviour of neurons in mammalian brains. Since its proposal in 1989, there have been numerous investigations into the RNN's applications and learning algorithms. Deep learning (DL) has achieved great success in machine learning, but there has been no research into the properties of the RNN for DL to combine their power. This thesis intends to bridge the gap between RNNs and DL, in order to provide powerful DL tools that are faster, and that can potentially be used with less energy expenditure than existing methods. Based on the RNN function approximator proposed by Gelenbe in 1999, the approximation capability of the RNN is investigated and an efficient classifier is developed. By combining the RNN, DL and non-negative matrix factorisation, new shallow and multi-layer non-negative autoencoders are developed. The autoencoders are tested on typical image datasets and real-world datasets from different domains, and the test results yield the desired high learning accuracy. The concept of dense nuclei/clusters is examined, using RNN theory as a basis. In dense nuclei, neurons may interconnect via soma-to-soma interactions and conventional synaptic connections. A mathematical model of the dense nuclei is proposed and the transfer function can be deduced. A multi-layer architecture of the dense nuclei is constructed for DL, whose value is demonstrated by experiments on multi-channel datasets and server-state classification in cloud servers. A theoretical study into the multi-layer architecture of the standard RNN (MLRNN) for DL is presented. Based on the layer-output analyses, the MLRNN is shown to be a universal function approximator. The effects of the layer number on the learning capability and high-level representation extraction are analysed. A hypothesis for transforming the DL problem into a moment-learning problem is also presented. The power of the standard RNN for DL is investigated. The ability of the RNN with only positive parameters to conduct image convolution operations is demonstrated. The MLRNN equipped with the developed training algorithm achieves comparable or better classification at a lower computation cost than conventional DL methods.
APA, Harvard, Vancouver, ISO, and other styles
9

Zagoruyko, Sergey. "Weight parameterizations in deep neural networks." Thesis, Paris Est, 2018. http://www.theses.fr/2018PESC1129/document.

Full text
Abstract:
Multilayer neural networks were first proposed more than three decades ago, and various architectures and parameterizations have been explored since. Recently, graphics processing units have enabled very efficient neural network training and allowed training much larger networks on larger datasets, dramatically improving performance on various supervised learning tasks. However, generalization is still far from human level, and it is difficult to understand what the decisions made are based on. To improve generalization and understanding, we revisit the problems of weight parameterization in deep neural networks. We identify what are, to our mind, the most important problems in modern architectures: network depth, parameter efficiency, and learning multiple tasks at the same time, and we try to address them in this thesis. We start with one of the core problems of computer vision, patch matching, and propose to use convolutional neural networks of various architectures to solve it, instead of hand-crafted descriptors. Then, we address the task of object detection, where a network should simultaneously learn to predict both the class of the object and its location. In both tasks we find that the number of parameters in the network is the major factor determining its performance, and we explore this phenomenon in residual networks. Our findings show that their original motivation, training deeper networks for better representations, does not fully hold, and wider networks with fewer layers can be as effective as deeper ones with the same number of parameters. Overall, we present an extensive study on architectures and weight parameterizations, and on ways of transferring knowledge between them.
APA, Harvard, Vancouver, ISO, and other styles
10

Ioannou, Yani Andrew. "Structural priors in deep neural networks." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/278976.

Full text
Abstract:
Deep learning has in recent years come to dominate the previously separate fields of research in machine learning, computer vision, natural language understanding and speech recognition. Despite breakthroughs in training deep networks, there remains a lack of understanding of both the optimization and structure of deep networks. The approach advocated by many researchers in the field has been to train monolithic networks with excess complexity and strong regularization --- an approach that leaves much to be desired in efficiency. Instead we propose that carefully designing networks in consideration of our prior knowledge of the task and learned representation can improve the memory and compute efficiency of state-of-the-art networks, and even improve generalization --- what we propose to denote as structural priors. We present two such novel structural priors for convolutional neural networks, and evaluate them in state-of-the-art image classification CNN architectures. The first of these methods proposes to exploit our knowledge of the low-rank nature of most filters learned for natural images by structuring a deep network to learn a collection of mostly small, low-rank filters. The second addresses the filter/channel extents of convolutional filters, by learning filters with limited channel extents. The size of these channel-wise basis filters increases with the depth of the model, giving a novel sparse connection structure that resembles a tree root. Both methods are found to improve the generalization of these architectures while also decreasing the size and increasing the efficiency of their training and test-time computation. Finally, we present work towards conditional computation in deep neural networks, moving towards a method of automatically learning structural priors in deep networks. We propose a new discriminative learning model, conditional networks, that jointly exploits the accurate representation learning capabilities of deep neural networks and the efficient conditional computation of decision trees. Conditional networks yield smaller models, and offer test-time flexibility in the trade-off of computation vs. accuracy.
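
The first structural prior mentioned above exploits the low-rank nature of learned filters. One common way to impose such a prior is to factorize a k x k convolution into a k x 1 convolution followed by a 1 x k convolution; the PyTorch sketch below shows that construction and its parameter saving, as one plausible instance rather than the exact scheme used in the thesis.

```python
import torch
import torch.nn as nn

class LowRankConv2d(nn.Module):
    """Approximates a full k x k convolution with a vertical (k x 1) followed by a
    horizontal (1 x k) convolution, cutting weights per filter pair from k*k to roughly 2*k."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        mid = out_ch                      # width of the intermediate representation (a design choice)
        self.vertical = nn.Conv2d(in_ch, mid, kernel_size=(k, 1), padding=(k // 2, 0))
        self.horizontal = nn.Conv2d(mid, out_ch, kernel_size=(1, k), padding=(0, k // 2))

    def forward(self, x):
        return self.horizontal(self.vertical(x))

full = nn.Conv2d(64, 64, kernel_size=3, padding=1)
lowrank = LowRankConv2d(64, 64, k=3)
n_params = lambda m: sum(p.numel() for p in m.parameters())
# For k = 3 the separable version uses about two-thirds of the weights; the saving grows with kernel size.
print(n_params(full), n_params(lowrank))
```
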
APA, Harvard, Vancouver, ISO, and other styles

Books on the topic "Deep Photonic Neural Networks"

1

Aggarwal, Charu C. Neural Networks and Deep Learning. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-94463-0.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Aggarwal, Charu C. Neural Networks and Deep Learning. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-29642-0.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Moolayil, Jojo. Learn Keras for Deep Neural Networks. Berkeley, CA: Apress, 2019. http://dx.doi.org/10.1007/978-1-4842-4240-7.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Caterini, Anthony L., and Dong Eui Chang. Deep Neural Networks in a Mathematical Framework. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-75304-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Razaghi, Hooshmand Shokri. Statistical Machine Learning & Deep Neural Networks Applied to Neural Data Analysis. [New York, N.Y.?]: [publisher not identified], 2020.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
6

Fingscheidt, Tim, Hanno Gottschalk, and Sebastian Houben, eds. Deep Neural Networks and Data for Automated Driving. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-01233-4.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Modrzyk, Nicolas. Real-Time IoT Imaging with Deep Neural Networks. Berkeley, CA: Apress, 2020. http://dx.doi.org/10.1007/978-1-4842-5722-7.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Iba, Hitoshi. Evolutionary Approach to Machine Learning and Deep Neural Networks. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-13-0200-8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Lu, Le, Yefeng Zheng, Gustavo Carneiro, and Lin Yang, eds. Deep Learning and Convolutional Neural Networks for Medical Image Computing. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-42999-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Tetko, Igor V., Věra Kůrková, Pavel Karpov, and Fabian Theis, eds. Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-30484-3.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Book chapters on the topic "Deep Photonic Neural Networks"

1

Sheu, Bing J., and Joongho Choi. "Photonic Neural Networks." In Neural Information Processing and VLSI, 369–96. Boston, MA: Springer US, 1995. http://dx.doi.org/10.1007/978-1-4615-2247-8_13.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Calin, Ovidiu. "Neural Networks." In Deep Learning Architectures, 167–98. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-36721-3_6.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Vasudevan, Shriram K., Sini Raj Pulari, and Subashri Vasudevan. "Recurrent Neural Networks." In Deep Learning, 157–83. New York: Chapman and Hall/CRC, 2021. http://dx.doi.org/10.1201/9781003185635-7.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Yu, Dong, and Li Deng. "Deep Neural Networks." In Automatic Speech Recognition, 57–77. London: Springer London, 2014. http://dx.doi.org/10.1007/978-1-4471-5779-3_4.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Awad, Mariette, and Rahul Khanna. "Deep Neural Networks." In Efficient Learning Machines, 127–47. Berkeley, CA: Apress, 2015. http://dx.doi.org/10.1007/978-1-4302-5990-9_7.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Sun, Yanan, Gary G. Yen, and Mengjie Zhang. "Deep Neural Networks." In Evolutionary Deep Neural Architecture Search: Fundamentals, Methods, and Recent Advances, 9–30. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-16868-0_2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Denuit, Michel, Donatien Hainaut, and Julien Trufin. "Deep Neural Networks." In Springer Actuarial, 63–82. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-25827-6_3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Hopgood, Adrian A. "Deep Neural Networks." In Intelligent Systems for Engineers and Scientists, 229–45. 4th ed. Boca Raton: CRC Press, 2021. http://dx.doi.org/10.1201/9781003226277-9.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Xiong, Momiao. "Deep Neural Networks." In Artificial Intelligence and Causal Inference, 1–44. Boca Raton: Chapman and Hall/CRC, 2022. http://dx.doi.org/10.1201/9781003028543-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Wang, Liang, and Jianxin Zhao. "Deep Neural Networks." In Architecture of Advanced Numerical Analysis Systems, 121–47. Berkeley, CA: Apress, 2022. http://dx.doi.org/10.1007/978-1-4842-8853-5_5.

Full text
Abstract:
There are many articles teaching people how to build intelligent applications using different frameworks such as TensorFlow, PyTorch, etc. However, except for very specialized research papers, very few articles can give us a comprehensive understanding of how to develop such frameworks. In this chapter, rather than just “casting spells,” we focus on explaining how to make the magic work in the first place. We will dissect the deep neural network module in Owl, then demonstrate how to assemble different building blocks to build a working framework. Owl’s neural network module is a full-featured DNN framework. You can define a neural network in a very compact and elegant way thanks to OCaml’s expressiveness. The DNN applications built on Owl can achieve state-of-the-art performance.
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Deep Photonic Neural Networks"

1

Shastri, Bhavin J., Matthew J. Filipovich, Zhimu Guo, Paul R. Prucnal, Sudip Shekhar, and Volker J. Sorger. "Silicon Photonics Neural Networks for Training and Inference." In Photonic Networks and Devices. Washington, D.C.: Optica Publishing Group, 2022. http://dx.doi.org/10.1364/networks.2022.new2d.2.

Full text
Abstract:
Deep learning hardware accelerators based on analog photonic networks are trained on standard digital electronics. We discuss on-chip training of neural networks enabled by a silicon photonic architecture for parallel, efficient, and fast data operations.
APA, Harvard, Vancouver, ISO, and other styles
2

Leelar, Bhawani Shankar, E. S. Shivaleela, and T. Srinivas. "Learning with Deep Photonic Neural Networks." In 2017 IEEE Workshop on Recent Advances in Photonics (WRAP). IEEE, 2017. http://dx.doi.org/10.1109/wrap.2017.8468594.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Shastri, Bhavin J., Matthew J. Filipovich, Zhimu Guo, Paul R. Prucnal, Sudip Shekhar, and Volker J. Sorger. "Silicon Photonics for Training Deep Neural Networks." In Conference on Lasers and Electro-Optics/Pacific Rim. Washington, D.C.: Optica Publishing Group, 2022. http://dx.doi.org/10.1364/cleopr.2022.ctha13b_02.

Full text
Abstract:
Analog photonic networks as deep learning hardware accelerators are trained on standard digital electronics. We propose an on-chip training of neural networks enabled by a silicon photonic architecture for parallel, efficient, and fast data operations.
APA, Harvard, Vancouver, ISO, and other styles
4

Ashtiani, Farshid, Mehmet Berkay On, David Sanchez-Jacome, Daniel Perez-Lopez, S. J. Ben Yoo, and Andrea Blanco-Redondo. "Photonic Max-Pooling for Deep Neural Networks Using a Programmable Photonic Platform." In Optical Fiber Communication Conference. Washington, D.C.: Optica Publishing Group, 2023. http://dx.doi.org/10.1364/ofc.2023.m1j.6.

Full text
Abstract:
We propose a photonic max-pooling architecture for photonic neural networks which is compatible with integrated photonic platforms. As a proof of concept, we have experimentally demonstrated the max-pooling function on a programmable photonic platform consisting of a hexagonal mesh of Mach-Zehnder interferometers.
APA, Harvard, Vancouver, ISO, and other styles
5

Tanimura, Takahito, Yuichi Akiyama, and Takeshi Hoshida. "Physical-layer Visualization and Analysis toward Efficient Network Operation by Deep Neural Networks." In Photonic Networks and Devices. Washington, D.C.: OSA, 2019. http://dx.doi.org/10.1364/networks.2019.neth1d.2.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Picco, Enrico, and Serge Massar. "Real-Time Photonic Deep Reservoir Computing for Speech Recognition." In 2023 International Joint Conference on Neural Networks (IJCNN). IEEE, 2023. http://dx.doi.org/10.1109/ijcnn54540.2023.10191786.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Beyene, Yonatan, Nicola Peserico, Xiaoxuan Ma, and Volker J. Sorger. "Towards the full integration of Silicon Photonic Chip for Deep Neural Networks." In Bragg Gratings, Photosensitivity and Poling in Glass Waveguides and Materials. Washington, D.C.: Optica Publishing Group, 2022. http://dx.doi.org/10.1364/bgppm.2022.jw3a.31.

Full text
Abstract:
Neural networks are taking a central role, while the integration of novel technologies has lagged behind. Here we present the integration of a silicon photonic MVM chip into a stand-alone PCB, aiming at photonic “black-box” integration.
APA, Harvard, Vancouver, ISO, and other styles
8

Ashtiani, Farshid, Mehmet Berkay On, David Sanchez-Jacome, Daniel Perez-Lopez, S. J. Ben Yoo, and Andrea Blanco-Redondo. "Photonic Max-Pooling for Deep Neural Networks Using a Programmable Photonic Platform." In 2023 Optical Fiber Communications Conference and Exhibition (OFC). IEEE, 2023. http://dx.doi.org/10.23919/ofc49934.2023.10116774.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Pankov, Artem V., Oleg S. Sidelnikov, Ilya D. Vatnik, Dmitry V. Churkin, and Andrey A. Sukhorukov. "Deep Neural Networks with Time-Domain Synthetic Photonic Lattices." In 2021 Conference on Lasers and Electro-Optics Europe & European Quantum Electronics Conference (CLEO/Europe-EQEC). IEEE, 2021. http://dx.doi.org/10.1109/cleo/europe-eqec52157.2021.9542271.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Shi, B., N. Calabretta, and R. Stabile. "SOA-Based Photonic Integrated Deep Neural Networks for Image Classification." In CLEO: Science and Innovations. Washington, D.C.: OSA, 2019. http://dx.doi.org/10.1364/cleo_si.2019.sf1n.5.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Reports on the topic "Deep Photonic Neural Networks"

1

Yu, Haichao, Haoxiang Li, Honghui Shi, Thomas S. Huang, and Gang Hua. Any-Precision Deep Neural Networks. Web of Open Science, December 2020. http://dx.doi.org/10.37686/ejai.v1i1.82.

Full text
Abstract:
We present Any-Precision Deep Neural Networks (Any-Precision DNNs), which are trained with a new method that empowers learned DNNs to be flexible in any numerical precision during inference. The same model at runtime can be flexibly and directly set to different bit-widths, by truncating the least significant bits, to support dynamic speed and accuracy trade-offs. When all layers are set to low bits, we show that the model achieves accuracy comparable to dedicated models trained at the same precision. This nice property facilitates flexible deployment of deep learning models in real-world applications, where in practice trade-offs between model accuracy and runtime efficiency are often sought. Previous literature presents solutions to train models at each individual fixed efficiency/accuracy trade-off point. But how to produce a model flexible in runtime precision is largely unexplored. When the demand for efficiency/accuracy trade-offs varies from time to time or even changes dynamically at runtime, it is infeasible to re-train models accordingly, and the storage budget may forbid keeping multiple models. Our proposed framework achieves this flexibility without performance degradation. More importantly, we demonstrate that this achievement is agnostic to model architectures. We experimentally validated our method with different deep network backbones (AlexNet-small, Resnet-20, Resnet-50) on different datasets (SVHN, Cifar-10, ImageNet) and observed consistent results.
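
The runtime flexibility described above comes from storing weights at a high precision and truncating least significant bits on demand. A toy NumPy illustration of that mechanism with a simple uniform quantizer (the quantization scheme and names are assumptions, not the authors' implementation):

```python
import numpy as np

def quantize(weights, bits_stored=8):
    """Uniform symmetric quantization of float weights to signed integers at the stored precision."""
    scale = np.max(np.abs(weights)) / (2 ** (bits_stored - 1) - 1)
    return np.round(weights / scale).astype(np.int32), scale

def truncate_to(q_weights, bits_stored, bits_runtime):
    """Reduce precision at runtime by dropping the least significant bits."""
    shift = bits_stored - bits_runtime
    return (q_weights >> shift) << shift     # arithmetic shift keeps the sign

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=1000)
q8, scale = quantize(w, bits_stored=8)

for bits in (8, 4, 2):
    w_hat = truncate_to(q8, 8, bits) * scale
    print(bits, np.mean((w - w_hat) ** 2))   # reconstruction error grows as bits are dropped
```
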
APA, Harvard, Vancouver, ISO, and other styles
2

Koh, Christopher Fu-Chai, and Sergey Igorevich Magedov. Bond Order Prediction Using Deep Neural Networks. Office of Scientific and Technical Information (OSTI), August 2019. http://dx.doi.org/10.2172/1557202.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Shevitski, Brian, Yijing Watkins, Nicole Man, and Michael Girard. Digital Signal Processing Using Deep Neural Networks. Office of Scientific and Technical Information (OSTI), April 2023. http://dx.doi.org/10.2172/1984848.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Talathi, S. S. Deep Recurrent Neural Networks for seizure detection and early seizure detection systems. Office of Scientific and Technical Information (OSTI), June 2017. http://dx.doi.org/10.2172/1366924.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Armstrong, Derek Elswick, and Joseph Gabriel Gorka. Using Deep Neural Networks to Extract Fireball Parameters from Infrared Spectral Data. Office of Scientific and Technical Information (OSTI), May 2020. http://dx.doi.org/10.2172/1623398.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Thulasidasan, Sunil, Gopinath Chennupati, Jeff Bilmes, Tanmoy Bhattacharya, and Sarah E. Michalak. On Mixup Training: Improved Calibration and Predictive Uncertainty for Deep Neural Networks. Office of Scientific and Technical Information (OSTI), June 2019. http://dx.doi.org/10.2172/1525811.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Ellis, John, Attila Cangi, Normand Modine, John Stephens, Aidan Thompson, and Sivasankaran Rajamanickam. Accelerating Finite-temperature Kohn-Sham Density Functional Theory\ with Deep Neural Networks. Office of Scientific and Technical Information (OSTI), October 2020. http://dx.doi.org/10.2172/1677521.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Ellis, Austin, Lenz Fielder, Gabriel Popoola, Normand Modine, John Stephens, Aidan Thompson, and Sivasankaran Rajamanickam. Accelerating Finite-Temperature Kohn-Sham Density Functional Theory with Deep Neural Networks. Office of Scientific and Technical Information (OSTI), June 2021. http://dx.doi.org/10.2172/1817970.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Stevenson, G. Analysis of Pre-Trained Deep Neural Networks for Large-Vocabulary Automatic Speech Recognition. Office of Scientific and Technical Information (OSTI), July 2016. http://dx.doi.org/10.2172/1289367.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Chronopoulos, Ilias, Katerina Chrysikou, George Kapetanios, James Mitchell, and Aristeidis Raftapostolos. Deep Neural Network Estimation in Panel Data Models. Federal Reserve Bank of Cleveland, July 2023. http://dx.doi.org/10.26509/frbc-wp-202315.

Full text
Abstract:
In this paper we study neural networks and their approximating power in panel data models. We provide asymptotic guarantees on deep feed-forward neural network estimation of the conditional mean, building on the work of Farrell et al. (2021), and explore latent patterns in the cross-section. We use the proposed estimators to forecast the progression of new COVID-19 cases across the G7 countries during the pandemic. We find significant forecasting gains over both linear panel and nonlinear time-series models. Containment or lockdown policies, as instigated at the national level by governments, are found to have out-of-sample predictive power for new COVID-19 cases. We illustrate how the use of partial derivatives can help open the "black box" of neural networks and facilitate semi-structural analysis: school and workplace closures are found to have been effective policies at restricting the progression of the pandemic across the G7 countries. But our methods illustrate significant heterogeneity and time variation in the effectiveness of specific containment policies.
APA, Harvard, Vancouver, ISO, and other styles