Dissertations / Theses on the topic 'Efficient Neural Networks'

Consult the top 50 dissertations / theses for your research on the topic 'Efficient Neural Networks.'

1

Silfa, Franyell. "Energy-efficient architectures for recurrent neural networks." Doctoral thesis, Universitat Politècnica de Catalunya, 2021. http://hdl.handle.net/10803/671448.

Full text
Abstract:
Deep Learning algorithms have been remarkably successful in applications such as Automatic Speech Recognition and Machine Translation, and these applications are now ubiquitous in our lives, found in a plethora of devices. The algorithms are built on Deep Neural Networks (DNNs), such as Convolutional Neural Networks and Recurrent Neural Networks (RNNs), which have a large number of parameters and require a large amount of computation. Evaluating DNNs is therefore challenging due to their large memory and power requirements. RNNs are employed to solve sequence-to-sequence problems such as Machine Translation. They contain data dependencies among the executions of time-steps, so the amount of parallelism is severely limited; evaluating them in an energy-efficient manner is thus more challenging than evaluating other DNN algorithms. This thesis studies applications using RNNs with the goal of improving their energy efficiency on specialized architectures. Specifically, we propose novel energy-saving techniques and highly efficient architectures tailored to the evaluation of RNNs, focusing on the most successful RNN topologies: the Long Short-Term Memory and the Gated Recurrent Unit. First, we characterize a set of RNNs running on a modern SoC and identify that fetching the model weights from memory is the main source of energy consumption. We therefore propose E-PUR, an energy-efficient processing unit for RNN inference. E-PUR achieves a 6.8x speedup and reduces energy consumption by 88x compared to the SoC; these benefits are obtained by improving the temporal locality of the model weights. In E-PUR, fetching the parameters remains the main source of energy consumption, so we strive to reduce memory accesses and propose a scheme to reuse previous computations. Our observation is that, when evaluating the input sequences of an RNN model, the output of a given neuron tends to change only slightly between consecutive evaluations. We therefore develop a scheme that caches the neurons' outputs and reuses them whenever it detects that the change between the current and the previously computed output value of a given neuron is small, avoiding fetching the weights. To decide when to reuse a previous value, we employ a Binary Neural Network (BNN) as a predictor of reusability; this low-cost BNN can be employed in this context since its output is highly correlated with the output of the RNN. We show that our proposal avoids more than 24.2% of the computations, so that, on average, energy consumption is reduced by 18.5% for a speedup of 1.35x. The memory footprint of RNN models is usually reduced by using low precision for evaluation and storage. In this case, the minimum precision is identified offline and set such that the model maintains its accuracy, and the same precision is used to compute all time-steps. Yet we observe that some time-steps can be evaluated with a lower precision while preserving accuracy. We therefore propose a technique that dynamically selects the precision used to compute each time-step. A challenge of this proposal is choosing the lower bit-width; we address it by recognizing that information from a previous evaluation can be employed to determine the precision required in the current time-step. Our scheme evaluates 57% of the computations at a bit-width lower than the fixed precision employed by static methods. Implemented on E-PUR, it provides a 1.46x speedup and 19.2% energy savings on average.
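The computation-reuse scheme described in this abstract is compact enough to sketch. Below is a minimal, hypothetical NumPy rendering of the idea, not E-PUR's actual design: a cheap binarized surrogate of each neuron's dot product stands in for the BNN reusability predictor, and the full-precision evaluation (with its weight fetches) is skipped whenever the surrogate barely changes. The function names and the threshold value are illustrative assumptions.

```python
import numpy as np

def rnn_layer_with_reuse(x, W, cache, bnn_sign, threshold=0.1):
    """Evaluate one time-step of a recurrent layer, reusing cached
    neuron outputs when a cheap binary predictor says they have
    barely changed (illustrative sketch only)."""
    # Cheap surrogate: binarized weights and inputs approximate the
    # dot product at a fraction of the cost of a full evaluation.
    approx = np.sign(x) @ bnn_sign            # BNN-style prediction
    out = np.empty_like(cache["y"])
    for i in range(W.shape[1]):
        if abs(approx[i] - cache["approx"][i]) < threshold:
            out[i] = cache["y"][i]            # reuse: no weight fetch
        else:
            out[i] = np.tanh(x @ W[:, i])     # full-precision evaluation
            cache["y"][i] = out[i]
            cache["approx"][i] = approx[i]
    return out

# Usage: W is the weight matrix, np.sign(W) its binarized copy.
W = np.random.randn(64, 32)
cache = {"y": np.zeros(32), "approx": np.zeros(32)}
y = rnn_layer_with_reuse(np.random.randn(64), W, cache, np.sign(W))
```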
APA, Harvard, Vancouver, ISO, and other styles
2

Golea, Mostefa. "On efficient learning algorithms for neural networks." Thesis, University of Ottawa (Canada), 1993. http://hdl.handle.net/10393/6508.

Full text
Abstract:
Inductive Inference Learning can be described in terms of finding a good approximation to some unknown classification rule f, based on a pre-classified set of training examples ⟨x, f(x)⟩. One particular class of learning systems that has attracted much attention recently is the class of neural networks. But despite the excitement generated by neural networks, learning in these systems has proven to be a difficult task. In this thesis, we investigate different ways and means to overcome the difficulty of training feedforward neural networks. Our goal is to come up with efficient learning algorithms for new classes (or architectures) of neural nets. In the first approach, we relax the constraint of fixed architecture adopted by most neural learning algorithms. We describe two constructive learning algorithms for two-layer and tree-like networks. In the second approach, we adopt the "probably approximately correct" (PAC) learning model and we look for positive learnability results by restricting the distribution generating the training examples, the connectivity of the networks, and/or the weight values. This enables us to identify new classes of neural networks that are efficiently learnable in the chosen setting. In the third and final approach, we look at the problem of learning in neural networks from the average-case point of view. In particular, we investigate the average-case behavior of the well-known clipped Hebb rule when learning different neural networks with binary weights. The arguments given for "efficient learnability" range from extensive simulations to rigorous mathematical proofs.
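The clipped Hebb rule mentioned in the third approach has a particularly compact textbook form: each binary weight is the sign of the Hebbian correlation between an input component and the label. The sketch below is a generic reconstruction under that definition, not code from the thesis.

```python
import numpy as np

def clipped_hebb(X, y):
    """Clipped Hebb rule for a binary-weight perceptron:
    w_i = sign(sum_mu y^mu * x_i^mu)."""
    return np.sign(X.T @ y)

# Toy usage: learn a random binary teacher from +/-1 examples.
rng = np.random.default_rng(0)
teacher = rng.choice([-1, 1], size=21)
X = rng.choice([-1, 1], size=(500, 21))
y = np.sign(X @ teacher)          # labels from the teacher perceptron
w = clipped_hebb(X, y)
accuracy = np.mean(np.sign(X @ w) == y)
```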
APA, Harvard, Vancouver, ISO, and other styles
3

Islam, Taj-ul. "Channel routing: efficient solutions using neural networks." Online version of thesis, 1993. http://hdl.handle.net/1850/11154.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Zhao, Wei. "Efficient neural networks for prediction of turbulent flow." Diss., Georgia Institute of Technology, 1997. http://hdl.handle.net/1853/16939.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Billings, Rachel Mae. "On Efficient Computer Vision Applications for Neural Networks." Thesis, Virginia Tech, 2021. http://hdl.handle.net/10919/102957.

Full text
Abstract:
Since approximately the dawn of the new millennium, neural networks and other machine learning algorithms have become increasingly capable of adeptly performing difficult, dull, and dangerous work conventionally carried out by humans. As these algorithms become steadily more commonplace in everyday consumer and industry applications, it is increasingly important to consider how they may be implemented on constrained hardware systems such as smartphones and Internet-of-Things (IoT) peripheral devices in a time- and power-efficient manner, and to understand the scenarios in which they fail. This work investigates implementations of convolutional neural networks specifically in the context of image inference tasks. Three areas are analyzed: (1) a time- and power-efficient face recognition framework, (2) the development of a COVID-19-related mask classification system suitable for deployment on low-cost, low-power devices, and (3) an investigation into the implementation of spiking neural networks on mobile hardware and their conversion from traditional neural network architectures.
Master of Science
The subject of machine learning and its associated jargon have become ubiquitous in the past decade as industries seek to develop automated tools and applications and researchers continue to develop new methods for artificial intelligence and improve upon existing ones. Neural networks are a type of machine learning algorithm that can make predictions in complex situations based on input data with human-like (or better) accuracy. Real-time, low-power, and low-cost systems using these algorithms are increasingly used in consumer and industry applications, often improving the efficiency of completing mundane and hazardous tasks traditionally performed by humans. The focus of this work is (1) to explore when and why neural networks may make incorrect decisions in the domain of image-based prediction tasks, (2) the demonstration of a low-power, low-cost machine learning use case using a mask recognition system intended to be suitable for deployment in support of COVID-19-related mask regulations, and (3) the investigation of how neural networks may be implemented on resource-limited technology in an efficient manner using an emerging form of computing.
APA, Harvard, Vancouver, ISO, and other styles
6

Bozorgmehr, Pouya. "An efficient online feature extraction algorithm for neural networks." Diss., [La Jolla] : University of California, San Diego, 2009. http://wwwlib.umi.com/cr/ucsd/fullcit?p1470604.

Full text
Abstract:
Thesis (M.S.)--University of California, San Diego, 2009.
Title from first page of PDF file (viewed January 13, 2010). Available via ProQuest Digital Dissertations. Includes bibliographical references (p. 61-63).
APA, Harvard, Vancouver, ISO, and other styles
7

Al-Hindi, Khalid A. "Flexible basis function neural networks for efficient analog implementations." Free to MU campus, to others for purchase, 2002. http://wwwlib.umi.com/cr/mo/fullcit?p3074367.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Ekman, Carl. "Traffic Sign Classification Using Computationally Efficient Convolutional Neural Networks." Thesis, Linköpings universitet, Datorseende, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157453.

Full text
Abstract:
Traffic sign recognition is an important problem for autonomous cars and driver assistance systems. With recent developments in the field of machine learning, high performance can be achieved, but typically at a large computational cost. This thesis aims to investigate the relation between classification accuracy and computational complexity for the visual recognition problem of classifying traffic signs. In particular, the benefits of partitioning the classification problem into smaller sub-problems using prior knowledge in the form of shape or current region are investigated. In the experiments, the convolutional neural network (CNN) architecture MobileNetV2 is used, as it is specifically designed to be computationally efficient. To incorporate prior knowledge, separate CNNs are used for the different subsets generated when partitioning the dataset based on region or shape. The separate CNNs are trained from scratch or initialized by pre-training on the full dataset. The results support the intuitive idea that performance initially increases with network size and indicate a network size where the improvement stops. Including shape information using the two investigated methods does not result in a significant improvement. Including region information using pre-trained separate classifiers results in a small improvement for small complexities, for one of the regions in the experiments. In the end, none of the investigated methods of including prior knowledge are considered to yield an improvement large enough to justify the added implementation complexity. However, some other methods are suggested, which would be interesting to study in future work.
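The partitioning idea in this abstract amounts to routing each sample to a smaller specialist classifier selected by prior knowledge. The sketch below illustrates this with torchvision's MobileNetV2; the shape categories, class count, and routing function are assumptions for illustration, not the thesis code.

```python
import torch
from torchvision.models import mobilenet_v2

SHAPES = ["circular", "triangular", "rectangular"]  # assumed partition

# One specialist classifier per shape subset; each could be trained
# from scratch or initialized from a model pre-trained on the full set.
specialists = {s: mobilenet_v2(num_classes=10).eval() for s in SHAPES}

def classify(image: torch.Tensor, shape: str) -> int:
    """Route the image to the specialist CNN for its known shape prior."""
    with torch.no_grad():
        logits = specialists[shape](image.unsqueeze(0))  # add batch dim
    return int(logits.argmax(dim=1))

label = classify(torch.randn(3, 224, 224), "triangular")
```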
APA, Harvard, Vancouver, ISO, and other styles
9

Adamu, Abdullahi S. "An empirical study towards efficient learning in artificial neural networks by neuronal diversity." Thesis, University of Nottingham, 2016. http://eprints.nottingham.ac.uk/33799/.

Full text
Abstract:
Artificial Neural Networks (ANNs) are biologically inspired algorithms, and it is natural that biology continues to inspire research in them. From the recent breakthrough of deep learning to the wake-sleep training routine, all draw inspiration from a common source: biology. The transfer functions of artificial neural networks play the important role of forming the decision boundaries necessary for learning. However, there has been relatively little research on transfer function optimization compared to other aspects of neural network optimization. In this work, neuronal diversity - a property found in biological neural networks - is explored as a potentially promising method of transfer function optimization. This work shows how neural diversity can improve generalization, in the context of the literature on the bias-variance decomposition and meta-learning. It then demonstrates that neural diversity - represented in the form of transfer function diversity - can yield diverse and accurate computational strategies that can be used as ensembles with competitive results, without supplementing them with other diversity maintenance schemes that tend to be computationally expensive. This work also presents neural network meta-features, described as problem signatures, sampled from models with diverse transfer functions for problem characterization. These were shown to meet the criteria of the basic properties desired of any meta-feature, i.e. consistency for a problem and discrimination between different problems. Furthermore, these meta-features were used to study the underlying computational strategies adopted by the neural network models, which led to the discovery of the strong discriminatory property of the evolved transfer function. The culmination of this study is the co-evolution of neurally diverse neurons with their weights and topology for efficient learning, which achieves significant generalization ability, as demonstrated by an average MSE of 0.30 on 22 different benchmarks with minimal resources (i.e. two hidden units). These are precisely the properties associated with neural diversity, showing that efficiency and increased computational capacity can be replicated with transfer function diversity in artificial neural networks.
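As a rough illustration of transfer function diversity, the sketch below trains an ensemble of small networks that differ only in their activation (transfer) function and averages their predictions. It uses scikit-learn's built-in activations as stand-ins for the evolved transfer functions in the thesis, so it is an analogy rather than a reproduction.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.datasets import make_friedman1
from sklearn.model_selection import train_test_split

X, y = make_friedman1(n_samples=400, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# One small network per transfer function: the only source of
# diversity in this ensemble is the activation itself.
activations = ["identity", "logistic", "tanh", "relu"]
models = [MLPRegressor(hidden_layer_sizes=(2,), activation=a,
                       max_iter=5000, random_state=0).fit(X_tr, y_tr)
          for a in activations]

ensemble_pred = np.mean([m.predict(X_te) for m in models], axis=0)
mse = np.mean((ensemble_pred - y_te) ** 2)
```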
APA, Harvard, Vancouver, ISO, and other styles
10

Etchells, Terence Anthony. "Rule extraction from neural networks : a practical and efficient approach." Thesis, Liverpool John Moores University, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.402847.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Lundström, Dennis. "Data-efficient Transfer Learning with Pre-trained Networks." Thesis, Linköpings universitet, Datorseende, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-138612.

Full text
Abstract:
Deep learning has dominated the computer vision field since 2012, but a common criticism of deep learning methods is their dependence on large amounts of data. To combat this criticism, research into data-efficient deep learning is growing. The foremost success in data-efficient deep learning is transfer learning with networks pre-trained on the ImageNet dataset. Pre-trained networks have achieved state-of-the-art performance on many tasks. We consider the pre-trained network method for a new task where we have to collect the data. We hypothesize that the data efficiency of pre-trained networks can be improved through informed data collection. After exhaustive experiments on CaffeNet and VGG16, we conclude that the data efficiency indeed can be improved. Furthermore, we investigate an alternative approach to data-efficient learning, namely adding domain knowledge in the form of a spatial transformer to the pre-trained networks. We find that spatial transformers are difficult to train and do not seem to improve data efficiency.
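The pre-trained-network method referred to here is, at its core, fine-tuning an ImageNet model on the newly collected data. A minimal sketch with torchvision's VGG16 (one of the two architectures studied) follows; the layer-freezing choice and class count are illustrative assumptions.

```python
import torch.nn as nn
from torchvision.models import vgg16, VGG16_Weights

NUM_CLASSES = 5  # assumed label count of the newly collected dataset

# Start from ImageNet weights and freeze the convolutional features,
# so only the classifier head is learned from the small new dataset.
model = vgg16(weights=VGG16_Weights.IMAGENET1K_V1)
for p in model.features.parameters():
    p.requires_grad = False
model.classifier[6] = nn.Linear(4096, NUM_CLASSES)

# `model` can now be trained with any standard PyTorch loop on the new data.
```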
APA, Harvard, Vancouver, ISO, and other styles
12

Riggelsen, Carsten. "Approximation methods for efficient learning of Bayesian networks /." Amsterdam ; Washington, DC : IOS Press, 2008. http://www.loc.gov/catdir/toc/fy0804/2007942192.html.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Ponca, Marek, and Gerd Scarbata. "Towards efficient implementation of artificial neural networks in systems on chip." Ilmenau : ISLE, 2007. http://www.gbv.de/dms/ilmenau/toc/530583380.PDF.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Harper, Kevin M. "Challenging the Efficient Market Hypothesis with Dynamically Trained Artificial Neural Networks." UNF Digital Commons, 2016. http://digitalcommons.unf.edu/etd/718.

Full text
Abstract:
A review of the literature applying Multilayer Perceptron (MLP) based Artificial Neural Networks (ANNs) to market forecasting leads to three observations: 1) it is clear that simple ANNs, like other nonlinear machine learning techniques, are capable of approximating general market trends; 2) it is not clear to what extent such forecasted trends are reliably exploitable in terms of profits obtained via trading activity; 3) most research with ANNs reporting profitable trading activity relies on ANN models trained over one fixed interval which is then tested on a separate out-of-sample fixed interval, and it is not clear to what extent these results may generalize to other out-of-sample periods. Very little research has tested the profitability of ANN models over multiple out-of-sample periods, and the author knows of no pure ANN (non-hybrid) systems that do so while being dynamically retrained on new data. This thesis tests the capacity of MLP-type ANNs to reliably generate profitable trading signals over rolling training and testing periods. Traditional error statistics serve as descriptive rather than performance measures in this research, as they are of limited use for assessing a system's ability to consistently produce above-market returns. Performance of the ANN system is measured by the average returns accumulated over multiple runs over multiple periods, and these averages are compared with the traditional buy-and-hold returns for the same periods. In some cases, our models were able to produce above-market returns over many years. These returns, however, proved to be highly sensitive to variability in the training, validation and testing datasets as well as to the market dynamics at play during initial deployment. We argue that credible challenges to the Efficient Market Hypothesis (EMH) by machine learning techniques must demonstrate that the returns produced by their models are not similarly susceptible to such variability.
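The rolling protocol tested in this thesis can be written down generically: retrain on a sliding window, trade on the next out-of-sample period, then roll forward. The sketch below is a schematic rendering with a placeholder model and signal rule, not the thesis's actual system.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def rolling_backtest(returns, features, train_len=250, test_len=21):
    """Retrain on each rolling window, then trade on the next period.
    Long when the predicted return is positive, flat otherwise."""
    equity = []
    start = 0
    while start + train_len + test_len <= len(returns):
        tr = slice(start, start + train_len)
        te = slice(start + train_len, start + train_len + test_len)
        model = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000,
                             random_state=0).fit(features[tr], returns[tr])
        signal = (model.predict(features[te]) > 0).astype(float)
        equity.extend(signal * returns[te])    # realized strategy returns
        start += test_len                      # roll the window forward
    return np.array(equity)
```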
APA, Harvard, Vancouver, ISO, and other styles
15

Allen, Michael James. "Artificial intelligence techniques for efficient object location in image sequences." Thesis, University of Wolverhampton, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.343257.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Storkey, Amos James. "Efficient covariance matrix methods for Bayesian Gaussian processes and Hopfield neural networks." Thesis, Imperial College London, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.313335.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Hu, Xu. "Towards efficient learning of graphical models and neural networks with variational techniques." Thesis, Paris Est, 2019. http://www.theses.fr/2019PESC1037.

Full text
Abstract:
In this thesis, I will mainly focus on variational inference and probabilistic models. In particular, I will cover several projects I worked on during my PhD on improving the efficiency of AI/ML systems with variational techniques. The thesis consists of two parts. In the first part, the computational efficiency of probabilistic graphical models is studied. In the second part, several problems of learning deep neural networks are investigated, which are related to either energy efficiency or sample efficiency.
APA, Harvard, Vancouver, ISO, and other styles
18

Highlander, Tyler. "Efficient Training of Small Kernel Convolutional Neural Networks using Fast Fourier Transform." Wright State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=wright1432747175.

Full text
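No abstract is available for this entry, but the technique named in the title rests on the standard convolution theorem: convolution in the spatial domain is pointwise multiplication in the frequency domain, which pays off when kernels are applied repeatedly during training. The NumPy sketch below illustrates the general principle only and is not drawn from the thesis.

```python
import numpy as np

def fft_conv2d(image, kernel):
    """Circular 2-D convolution via the convolution theorem:
    ifft2(fft2(image) * fft2(kernel))."""
    H, W = image.shape
    K = np.fft.fft2(kernel, s=(H, W))     # zero-pad kernel to image size
    return np.real(np.fft.ifft2(np.fft.fft2(image) * K))

image = np.random.rand(32, 32)
kernel = np.random.rand(3, 3)
out = fft_conv2d(image, kernel)           # matches circular convolution
```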
APA, Harvard, Vancouver, ISO, and other styles
19

Ioannou, Yani Andrew. "Structural priors in deep neural networks." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/278976.

Full text
Abstract:
Deep learning has in recent years come to dominate the previously separate fields of research in machine learning, computer vision, natural language understanding and speech recognition. Despite breakthroughs in training deep networks, there remains a lack of understanding of both the optimization and structure of deep networks. The approach advocated by many researchers in the field has been to train monolithic networks with excess complexity, and strong regularization --- an approach that leaves much to desire in efficiency. Instead we propose that carefully designing networks in consideration of our prior knowledge of the task and learned representation can improve the memory and compute efficiency of state-of-the-art networks, and even improve generalization --- what we propose to denote as structural priors. We present two such novel structural priors for convolutional neural networks, and evaluate them in state-of-the-art image classification CNN architectures. The first of these methods proposes to exploit our knowledge of the low-rank nature of most filters learned for natural images by structuring a deep network to learn a collection of mostly small, low-rank, filters. The second addresses the filter/channel extents of convolutional filters, by learning filters with limited channel extents. The size of these channel-wise basis filters increases with the depth of the model, giving a novel sparse connection structure that resembles a tree root. Both methods are found to improve the generalization of these architectures while also decreasing the size and increasing the efficiency of their training and test-time computation. Finally, we present work towards conditional computation in deep neural networks, moving towards a method of automatically learning structural priors in deep networks. We propose a new discriminative learning model, conditional networks, that jointly exploits the accurate representation learning capabilities of deep neural networks with the efficient conditional computation of decision trees. Conditional networks yield smaller models, and offer test-time flexibility in the trade-off of computation vs. accuracy.
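The first structural prior, low-rank filters, can be approximated in a few lines: a full 3x3 convolution is replaced by a 3x1 and a 1x3 convolution through a small rank bottleneck. This PyTorch sketch is a generic rendering of that idea under assumed channel sizes, not the exact blocks from the thesis.

```python
import torch.nn as nn

def low_rank_conv(in_ch, out_ch, rank):
    """Rank-constrained replacement for a 3x3 convolution:
    a vertical 3x1 map into `rank` channels, then a horizontal 1x3 map."""
    return nn.Sequential(
        nn.Conv2d(in_ch, rank, kernel_size=(3, 1), padding=(1, 0)),
        nn.Conv2d(rank, out_ch, kernel_size=(1, 3), padding=(0, 1)),
    )

block = low_rank_conv(64, 64, rank=16)  # far fewer weights than 64*64*3*3
```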
APA, Harvard, Vancouver, ISO, and other styles
20

Lee, Hyuk-Jae 1965. "An efficient cooling algorithm for annealed neural networks with applications to optimization problems." Thesis, The University of Arizona, 1991. http://hdl.handle.net/10150/278008.

Full text
Abstract:
In this thesis we consider an efficient cooling schedule for the mean field annealing (MFA) algorithm. We combine the MFA algorithm with the microcanonical simulation (MCS) method and propose a new algorithm called microcanonical mean field annealing (MCMFA). In the proposed algorithm, the cooling speed is controlled by the current temperature so that the amount of computation in MFA can be reduced without a degradation of performance. Unlike that produced by MFA, the solution quality produced by MCMFA is not affected by the choice of the initial temperature. Properties of MCMFA are analyzed and simulated with Hopfield neural networks (HNNs). In order to compare MCMFA with MFA, we apply both algorithms to three problems, namely the graph bipartitioning problem, the traveling salesman problem, and the weighted matching problem. Simulation results show that MCMFA produces performance superior to that of MFA.
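Plain mean field annealing has a compact form: relax continuous neuron states through a temperature-damped update and gradually cool. The sketch below shows MFA on a Hopfield-style energy, with the cooling rate tied to the current temperature as a crude nod to MCMFA's temperature-controlled cooling; all constants are illustrative assumptions.

```python
import numpy as np

def mean_field_annealing(W, T0=10.0, T_min=0.05, sweeps=50):
    """Minimize a Hopfield-style energy E = -0.5 v^T W v by mean field
    annealing: v <- tanh(W v / T), cooling T between sweeps."""
    n = W.shape[0]
    v = np.random.uniform(-0.1, 0.1, n)
    T = T0
    while T > T_min:
        for _ in range(sweeps):
            v = np.tanh(W @ v / T)
        # Illustrative temperature-dependent cooling: cool faster at
        # high T, slower near the freezing point (MCMFA-like behaviour).
        T *= 0.8 if T > 1.0 else 0.95
    return np.sign(v)

W = np.random.randn(20, 20); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
state = mean_field_annealing(W)   # approximate low-energy configuration
```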
APA, Harvard, Vancouver, ISO, and other styles
21

Geras, Krzysztof Jerzy. "Exploiting diversity for efficient machine learning." Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/28839.

Full text
Abstract:
A common practice for solving machine learning problems is currently to consider each problem in isolation, starting from scratch every time a new learning problem is encountered or a new model is proposed. This is a perfectly feasible solution when the problems are sufficiently easy or when, if the problem is hard, a large amount of resources, in terms of both training data and computation, is available. Although this naive approach has been the main focus of research in machine learning for a few decades and has had a lot of success, it becomes infeasible if the problem is too hard in proportion to the available resources. When using a complex model in this naive approach, it is necessary to collect large data sets (if possible at all) to avoid overfitting, and hence also to use large computational resources, first during training to process the large data set and then at test time to execute the complex model. An alternative to treating each learning problem independently is to leverage related data sets and the computation encapsulated in previously trained models. By doing so we can decrease the amount of data necessary to reach a satisfactory level of performance and, consequently, improve the achievable accuracy and decrease training time. Our attack on this problem is to exploit diversity - in the structure of the data set, in the features learnt and in the inductive biases of different neural network architectures. In the setting of learning from multiple sources, we introduce multiple-source cross-validation, which gives an unbiased estimator of the test error when the data set is composed of data coming from multiple sources and the data at test time come from a new unseen source. We also propose new estimators of the variance of standard k-fold cross-validation and of multiple-source cross-validation, which have lower bias than previously known ones. To improve unsupervised learning, we introduce scheduled denoising autoencoders, which learn a more diverse set of features than the standard denoising autoencoder. This is thanks to their training procedure, which starts with a high level of noise, when the network is learning coarse features, and then lowers the noise gradually, allowing the network to learn more local features. A connection between this training procedure and curriculum learning is also drawn. We develop the idea of learning a diverse representation further by explicitly incorporating the goal of obtaining a diverse representation into the training objective. The proposed model, the composite denoising autoencoder, learns multiple subsets of features focused on modelling variations in the data set at different levels of granularity. Finally, we introduce the idea of model blending, a variant of model compression, in which the two models, the teacher and the student, are both strong models that differ in their inductive biases. As an example, we train convolutional networks using the guidance of bidirectional long short-term memory (LSTM) networks. This allows the convolutional neural network to be trained to be more accurate than the LSTM network at no extra cost at test time.
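Multiple-source cross-validation, the first contribution listed, amounts to holding out one source at a time so that the test fold always comes from an unseen source. A minimal sketch follows; the helper name and toy data are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def multiple_source_cv(model_factory, X, y, sources):
    """Hold out each source in turn: train on the remaining sources,
    test on the held-out one, and average the per-source errors."""
    errors = []
    for s in np.unique(sources):
        test = sources == s
        model = model_factory().fit(X[~test], y[~test])
        errors.append(np.mean(model.predict(X[test]) != y[test]))
    # Per the abstract, this estimates test error for a new unseen source.
    return np.mean(errors)

# Toy usage: three sources, error estimated on held-out sources.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 5))
y = (X[:, 0] > 0).astype(int)
sources = np.repeat([0, 1, 2], 100)
err = multiple_source_cv(lambda: LogisticRegression(), X, y, sources)
```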
APA, Harvard, Vancouver, ISO, and other styles
22

Aboubakar, Moussa. "Efficient management of IoT low power networks." Thesis, Compiègne, 2020. http://www.theses.fr/2020COMP2571.

Full text
Abstract:
In recent years, connected objects such as computers, sensors and smart watches have become part of modern living and form the Internet of Things (IoT). The basic idea of the IoT is to enable interaction among connected objects in order to achieve a desirable goal. The IoT paradigm spans many areas of our daily life such as smart transportation, smart cities, smart agriculture and smart factories. Nowadays, IoT networks are characterized by the presence of billions of heterogeneous embedded devices with limited resources (e.g. limited memory, battery, CPU and bandwidth) deployed to enable various IoT applications. However, due to both the resource constraints and the heterogeneity of IoT devices, IoT networks face various problems (e.g. link quality deterioration, node failure, network congestion, etc.). It is therefore important to manage IoT low power networks efficiently in order to ensure their good performance. To achieve this, the network management solution should be able to perform self-configuration of devices to cope with the complexity introduced by current IoT networks (due to the increasing number of IoT devices and the dynamic nature of IoT networks). Moreover, network management should provide a mechanism to deal with the heterogeneity of the IoT ecosystem, and it should be energy efficient in order to prolong the operational time of battery-powered IoT devices. In this thesis, we therefore address the problem of configuration of IoT low power networks by proposing efficient solutions that help to optimize the performance of IoT networks. We start by providing a comparative analysis of existing solutions for the management of IoT low power networks. We then propose an intelligent solution that uses a deep neural network model to determine the efficient transmission power of RPL networks. The performance evaluation shows that the proposed solution enables the configuration of a transmission range that reduces the network's energy consumption while maintaining network connectivity. Besides, we also propose an efficient and adaptive solution for configuring the IEEE 802.15.4 MAC parameters of devices in dynamic IoT low power networks. Simulation results show that our proposal improves the end-to-end delay compared to the usage of the standard IEEE 802.15.4 MAC. Additionally, we study existing solutions for congestion control in IoT low power networks and propose a novel scheme for collecting the congestion state of devices in a given routing path of an IoT network, so as to enable efficient mitigation of the congestion by the network manager (the device in charge of configuring the IoT network).
APA, Harvard, Vancouver, ISO, and other styles
23

Jackson, Thomas C. "Building Efficient Neuromorphic Networks in Hardware with Mixed Signal Techniques and Emerging Technologies." Research Showcase @ CMU, 2017. http://repository.cmu.edu/dissertations/1096.

Full text
Abstract:
In recent years, neuromorphic architectures have become an increasingly effective tool for solving big data problems. Hardware neural networks have not been able to fully exploit the power-efficient properties of the neural paradigm, however, due to limitations of standard CMOS. One of the largest challenges is the quadratic scaling of the synapses in a neural network. There has been some work on using post-CMOS technology as synapses to overcome this limitation, but systems to date have not been scalable due to the design of their neurons. This dissertation aims to design and build scalable neural network architectures that can use emerging resistive memory technology as synapses. Using analog computing techniques to build networks is promising, especially given the development of dense, CMOS-compatible analog resistive memories. Building functional analog networks in advanced technology nodes, however, is challenging due to the relatively poor performance of analog components in these nodes. This work explores oscillatory neural networks (ONNs), which use phase as the analog state variable instead of voltage or current, reducing the number of traditional analog components required and making the networks better suited for advanced nodes. This thesis develops additional ONN theory with regard to hardware networks, since previous work did not consider the effect of transmission delay on network dynamics. Transmission delay is proven to cause desynchronization in unmodified ONNs, and the theoretical analysis suggests ways to build networks which do synchronize. Conclusions from the theoretical development are used to build a PLL-based ONN in hardware. The PLL-based ONN is more energy efficient than comparable systems implemented in digital CMOS, although the neuron area is somewhat larger. Measurement of the PLL-based ONN also reveals additional poorly studied facets of ONN dynamics. Using the knowledge gained from the PLL-based ONN, a larger, PLL-free ONN is built in the same technology. Removing the PLL in each neuron reduces the power and area consumption without sacrificing any functionality. This dissertation demonstrates that ONNs are well suited to take advantage of emerging resistive memory technology to build efficient hardware neural networks.
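ONN dynamics of this kind are commonly modeled with phase-coupled (Kuramoto-style) oscillators, with transmission delay entering as a phase lag. The NumPy sketch below simulates such a network to illustrate how a lag can disrupt synchronization; the model and constants are generic textbook assumptions, not the dissertation's circuit equations.

```python
import numpy as np

def simulate_onn(K, omega, delay_phase=0.0, steps=2000, dt=1e-3):
    """Euler-integrate Kuramoto dynamics with a uniform phase lag:
    dtheta_i/dt = omega_i + sum_j K_ij * sin(theta_j - theta_i - lag)."""
    n = len(omega)
    theta = 2 * np.pi * np.random.rand(n)
    for _ in range(steps):
        diff = theta[None, :] - theta[:, None] - delay_phase
        theta += dt * (omega + np.sum(K * np.sin(diff), axis=1))
    # Order parameter r in [0, 1]: r ~ 1 means synchronized.
    return np.abs(np.mean(np.exp(1j * theta)))

K = np.full((8, 8), 2.0); np.fill_diagonal(K, 0.0)
omega = np.random.randn(8)
r_no_delay = simulate_onn(K, omega, delay_phase=0.0)
r_delay = simulate_onn(K, omega, delay_phase=1.5)  # lag can desynchronize
```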
APA, Harvard, Vancouver, ISO, and other styles
24

Cross, Richard J. (Richard John). "Efficient Tools For Reliability Analysis Using Finite Mixture Distributions." Thesis, Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/4853.

Full text
Abstract:
The complexity of many failure mechanisms and variations in component manufacture often make standard probability distributions inadequate for reliability modeling. Finite mixture distributions provide the necessary flexibility for modeling such complex phenomena but add considerable difficulty to the inference. This difficulty is overcome by drawing an analogy to neural networks. With appropriate modifications, a neural network can represent a finite mixture CDF or PDF exactly. Training with Bayesian regularization gives an efficient empirical Bayesian inference of the failure time distribution. Training also yields an effective number of parameters from which the number of components in the mixture can be estimated. Credible sets for functions of the model parameters can be estimated using a simple closed-form expression. Complete, censored, and inspection samples can be considered by appropriate choice of the likelihood function. In this work, architectures for Exponential, Weibull, Normal, and Log-Normal mixture networks have been derived. The capabilities of mixture networks have been demonstrated for complete, censored, and inspection samples from Weibull and Log-Normal mixtures. Furthermore, mixture networks' ability to model arbitrary failure distributions has been demonstrated. A sensitivity analysis has been performed to determine how mixture network estimator errors are affected by mixture component spacing and sample size. It is shown that mixture network estimators are asymptotically unbiased and that errors decay with sample size at least as well as with MLE.
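The core representation, a finite mixture distribution written as a small network, is easy to sketch: a softmax over mixing weights plays the role of the constrained output layer above the component distributions. Below is an illustrative NumPy/SciPy rendering of a two-component Weibull mixture likelihood; the parameterization is an assumption, not the thesis's exact architecture.

```python
import numpy as np
from scipy import stats

def weibull_mixture_pdf(t, logits, shapes, scales):
    """PDF of a finite Weibull mixture. The softmax over `logits` plays
    the role of the network's mixing-weight layer (weights sum to one)."""
    w = np.exp(logits) / np.sum(np.exp(logits))
    comps = [stats.weibull_min.pdf(t, c, scale=s)
             for c, s in zip(shapes, scales)]
    return np.dot(w, comps)

def neg_log_likelihood(t, logits, shapes, scales):
    # For complete samples; censored/inspection data would change this.
    return -np.sum(np.log(weibull_mixture_pdf(t, logits, shapes, scales)))

t = stats.weibull_min.rvs(1.5, scale=2.0, size=200)   # toy failure times
nll = neg_log_likelihood(t, np.zeros(2), [1.0, 2.0], [1.0, 3.0])
```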
APA, Harvard, Vancouver, ISO, and other styles
25

Minasny, Budiman. "Efficient Methods for Predicting Soil Hydraulic Properties." University of Sydney. Land, Water & Crop Sciences, 2000. http://hdl.handle.net/2123/853.

Full text
Abstract:
Both empirical and process-simulation models are useful for evaluating the effects of management practices on environmental quality and crop yield. The use of these models is limited, however, because they need many soil property values as input. The first step towards modelling is the collection of input data. Soil properties can be highly variable spatially and temporally, and measuring them is time-consuming and expensive. Efficient methods for estimating soil hydraulic properties, which consider the uncertainty and cost of measurements, form the main thrust of this study. Hydraulic properties are affected by other soil physical and chemical properties, therefore it is possible to develop empirical relations to predict them. This idea, when quantified, is called a pedotransfer function. Such functions may be global or restricted to a country or region. The different classification of particle-size fractions used in Australia compared with other countries presents a problem for the immediate adoption of exotic pedotransfer functions. A database of Australian soil hydraulic properties has been compiled. Pedotransfer functions for estimating water retention and saturated hydraulic conductivity from particle size and bulk density for Australian soil are presented. Different approaches for deriving hydraulic transfer functions have been presented and compared. Published pedotransfer functions were also evaluated; generally they provide a satisfactory estimation of water retention and saturated hydraulic conductivity, depending on the spatial scale and accuracy of prediction. Several pedotransfer functions were developed in this study to predict water retention and hydraulic conductivity. The pedotransfer functions developed here may predict adequately over large areas, but for site-specific applications local calibration is needed. There is much uncertainty in the input data, and consequently the transfer functions can produce varied outputs, so uncertainty analysis is needed. A general approach to quantifying uncertainty is to use Monte Carlo methods. By sampling repeatedly from the assumed probability distributions of the input variables and evaluating the response of the model, the statistical distribution of the outputs can be estimated. A modified Latin hypercube method is presented for sampling joint multivariate probability distributions. This method is applied to quantify the uncertainties in pedotransfer functions of soil hydraulic properties. Hydraulic properties predicted using the pedotransfer functions developed in this study are also used in a field soil-water model to analyze the uncertainties in the prediction of dynamic soil-water regimes. The use of the disc permeameter in the field conventionally requires the placement of a layer of sand in order to provide good contact between the soil surface and the disc supply membrane. The effect of the sand on water infiltration into the soil and on the estimate of sorptivity was investigated in a numerical study and a field experiment on heavy clay. Placement of sand significantly increased the cumulative infiltration but made little difference to the infiltration rate. Estimation of sorptivity based on Philip's two-term algebraic model using different methods was also examined. The field experiment revealed that the error in infiltration measurement was proportional to the cumulative infiltration curve. Infiltration without placement of sand was considerably smaller because of the poor contact between the disc and the soil surface.
An inverse method for predicting soil hydraulic parameters from disc permeameter data has been developed. A numerical study showed that the inverse method is quite robust in identifying the hydraulic parameters. However, application to field data showed that the estimated water retention curve is generally smaller than the one obtained in laboratory measurements. Nevertheless, the estimated near-saturated hydraulic conductivity matched the analytical solution quite well. The author believes that the inverse method can give a reasonable estimate of soil hydraulic parameters. Some experimental and theoretical problems were identified and discussed. A formal analysis was carried out to evaluate the efficiency of the different methods in predicting water retention and hydraulic conductivity. The analysis identified the contribution of individual sources of measurement error to the overall uncertainty. For single measurements, the inverse disc-permeameter analysis is economically more efficient than using pedotransfer functions or measuring hydraulic properties in the laboratory. However, given the large spatial variation of soil hydraulic properties, it is perhaps not surprising that many cheap and imprecise measurements, e.g. by hand texturing, are more efficient than a few expensive precise ones.
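The uncertainty analysis described above, propagating input uncertainty through a pedotransfer function by Latin hypercube sampling, can be illustrated briefly. This sketch uses SciPy's stock Latin hypercube sampler and a made-up pedotransfer function, both standing in for the modified multivariate method and the fitted functions of the thesis.

```python
import numpy as np
from scipy.stats import qmc, norm

def pedotransfer(clay, bulk_density):
    """Hypothetical pedotransfer function: predicted water content."""
    return 0.05 + 0.004 * clay - 0.10 * (bulk_density - 1.3)

# Latin hypercube sample of the (assumed) input distributions.
sampler = qmc.LatinHypercube(d=2, seed=0)
u = sampler.random(n=1000)
clay = norm(loc=30, scale=5).ppf(u[:, 0])        # % clay
bd = norm(loc=1.35, scale=0.08).ppf(u[:, 1])     # bulk density, g/cm^3

theta = pedotransfer(clay, bd)
lo, hi = np.percentile(theta, [2.5, 97.5])       # output uncertainty band
```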
APA, Harvard, Vancouver, ISO, and other styles
26

Karlsson, Nils. "Comparison of linear regression and neural networks for stock price prediction." Thesis, Uppsala universitet, Signaler och system, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-445237.

Full text
Abstract:
Stock market prediction has been a hot topic lately due to advances in computer technology and economics. One economic theory, the Efficient Market Hypothesis (EMH), states that all known information is already factored into prices, which makes it impossible to predict the stock market. Despite the EMH, many researchers have been successful in predicting the stock market using neural networks on historical data. This thesis investigates stock prediction using both linear regression and neural networks (NNs), with a twist: the inputs to the proposed methods are a number of profit predictions calculated with stochastic methods such as generalized autoregressive conditional heteroskedasticity (GARCH) and autoregressive integrated moving average (ARIMA) models, whereas the traditional approach uses raw data as inputs. The proposed methods show superior results in yielding profit: at best 1.1% in the Swedish market and 4.6% in the American market. The neural network yielded more profit than the linear regression model, which is reasonable given its ability to find nonlinear patterns. The historical data was used with different window sizes, which gives a good understanding of the impact of window size on prediction performance.
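The feature-engineering step, feeding model-based forecasts rather than raw prices into the learner, can be sketched with standard libraries. The fragment below builds one ARIMA forecast feature for a linear model; the ARIMA order, window length, and single-feature setup are illustrative assumptions (the thesis also uses GARCH-based predictions).

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA
from sklearn.linear_model import LinearRegression

def arima_forecast_features(prices, order=(1, 1, 1), window=100):
    """Rolling one-step-ahead ARIMA forecasts, used as the model input
    instead of the raw price series."""
    returns = np.diff(np.log(prices))
    feats, targets = [], []
    for t in range(window, len(returns)):
        fit = ARIMA(returns[t - window:t], order=order).fit()
        feats.append(fit.forecast(steps=1)[0])  # predicted next return
        targets.append(returns[t])              # realized next return
    return np.array(feats).reshape(-1, 1), np.array(targets)

prices = 100 * np.cumprod(1 + 0.01 * np.random.randn(160))
X, y = arima_forecast_features(prices)
model = LinearRegression().fit(X, y)  # linear layer on forecast features
```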
APA, Harvard, Vancouver, ISO, and other styles
27

Fonda, James William. "Energy efficient wireless sensor network protocols for monitoring and prognostics of large scale systems." Diss., Rolla, Mo. : Missouri University of Science and Technology, 2008. http://scholarsmine.mst.edu/thesis/pdf/fonda_09007dcc805070d4.pdf.

Full text
Abstract:
Thesis (Ph. D.)--Missouri University of Science and Technology, 2008.
Vita. The entire thesis text is included in file. Title from title screen of thesis/dissertation PDF file (viewed May 27, 2008) Includes bibliographical references.
APA, Harvard, Vancouver, ISO, and other styles
28

Hayward, Ross. "Analytic and inductive learning in an efficient connectionist rule-based reasoning system." Thesis, Queensland University of Technology, 2001.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
29

Kuai, Wenming. "Neural networks constructed using families of dense subsets of L₂(R) functions and their capabilities in efficient and flexible training." Diss., Georgia Institute of Technology, 1991. http://hdl.handle.net/1853/29587.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Peh, Lawrence T. W. "An efficient algorithm for extracting Boolean functions from linear threshold gates, and a synthetic decompositional approach to extracting Boolean functions from feedforward neural networks with arbitrary transfer functions." University of Western Australia. Dept. of Computer Science, 2000. http://theses.library.uwa.edu.au/adt-WU2003.0013.

Full text
Abstract:
[Formulae and special characters can only be approximated here. Please see the pdf version of the Abstract for an accurate reproduction.] Artificial neural networks are universal function approximators that represent functions subsymbolically by weights, thresholds and network topology. Naturally, the representation remains the same regardless of the problem domain. Suppose a network is applied to a symbolic domain. It is difficult for a human to dynamically construct the symbolic function from the neural representation. It is also difficult to retrain networks on perturbed training vectors, to resume training with different training sets, to form a new neuron by combining trained neurons, and to reason with trained neurons. Even the original training set does not provide a symbolic representation of the function implemented by the trained network, because the set may be incomplete or inconsistent, and the training phase may terminate with residual errors. The symbolic information in the network would be more useful if it were available in the language of the problem domain. Algorithms that translate the subsymbolic neural representation to a symbolic representation are called extraction algorithms. I argue that extraction algorithms that operate on single-output, layered feedforward networks are sufficient to analyse the class of multiple-output networks with arbitrary connections, including recurrent networks. The translucency dimensions of the ADT taxonomy for feedforward networks classify extraction approaches as pedagogical, eclectic, or decompositional. Pedagogical and eclectic approaches typically use a symbolic learning algorithm that takes the network's input-output behaviour as its raw data. Both approaches construct a set of input patterns and observe the network's output for each pattern. Eclectic and pedagogical approaches construct the input patterns respectively with and without reference to the network's internal information. These approaches are suitable for approximating the network's function using a probably-approximately-correct (PAC) or similar framework, but they are unsuitable for constructing the network's complete function. Decompositional approaches use internal information from a network more directly to produce the network's function in symbolic form. Decompositional algorithms have two components. The first component is a core extraction algorithm that operates on a single neuron that is assumed to implement a symbolic function. The second component provides the superstructure for the first. It consists of a decomposition rule for producing such neurons and a recomposition rule for symbolically aggregating the extracted functions into the symbolic function of the network. This thesis makes contributions to both components for Boolean extraction. I introduce a relatively efficient core algorithm called WSX based on a novel Boolean form called BvF. The algorithm has a worst-case complexity of O(2^n / sqrt(n)) for a neuron with n inputs, but in all cases its complexity can also be expressed as O(l), with an O(n) precalculation phase, where l is the length of the extracted expression in terms of the number of symbols it contains. I extend WSX for approximate extraction (AWSX) by introducing an interval about the neuron's threshold.
Assuming that the input patterns far from the threshold are more symbolically significant to the neuron than those near the threshold, AWSX ignores the neuron's mappings for the symbolically insignificant input patterns, remapping them as convenient for efficiency. In experiments, this dramatically decreased extraction time while retaining most of the neuron's mappings for the training set. Synthetic decomposition is this thesis' contribution to the second component of decompositional extraction. Classical decomposition decomposes the network into its constituent neurons. By extracting symbolic functions from these neurons, classical decomposition assumes that the neurons implement symbolic functions, or that approximating the subsymbolic computation in the neurons with symbolic computation does not significantly affect the network's symbolic function. I show experimentally that this assumption does not always hold. Instead of decomposing a network into its constituent neurons, synthetic decomposition uses constraints in the network that have the same functional form as neurons that implement Boolean functions; these neurons are called synthetic neurons. I present a starting point for constructing synthetic decompositional algorithms, and proceed to construct two such algorithms, each with a different strategy for decomposition and recomposition. One of the algorithms, ACX, works for networks with arbitrary monotonic transfer functions, so long as an inverse exists for the functions. It also has an elegant geometric interpretation that leads to meaningful approximations. I also show that ACX can be extended to layered networks with any number of layers.
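The object WSX operates on, a linear threshold gate assumed to compute a Boolean function, is simple enough that a brute-force reference extractor fits in a few lines. The sketch below enumerates the truth table and emits one DNF term per true row; it illustrates the extraction problem itself, not the WSX algorithm, whose point is precisely to avoid this exponential enumeration.

```python
from itertools import product

def extract_dnf(weights, threshold):
    """Brute-force Boolean extraction from a linear threshold gate:
    one DNF term per satisfying truth-table row (exponential in n)."""
    n = len(weights)
    terms = []
    for bits in product([0, 1], repeat=n):
        if sum(w * b for w, b in zip(weights, bits)) >= threshold:
            term = " & ".join(f"x{i}" if b else f"~x{i}"
                              for i, b in enumerate(bits))
            terms.append(f"({term})")
    return " | ".join(terms) or "False"

print(extract_dnf([2, 2, -1], 2))  # DNF of a 3-input threshold gate
```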
APA, Harvard, Vancouver, ISO, and other styles
31

Hallberg, David, and Erik Renström. "PC Regression, Vector Autoregression, and Recurrent Neural Networks: How do they compare when predicting stock index returns for building efficient portfolios?" Thesis, KTH, Optimeringslära och systemteori, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-252557.

Full text
Abstract:
This thesis examines the statistical and economic performance of modeling and predicting equity index returns by application of various statistical models on a set of macroeconomic and financial variables. By combining linear principal component regression, vector autoregressive models, and LSTM neural networks, the authors find that while a majority of the models display high statistical significance, virtually none of them successfully outperform classic portfolio theory on efficient markets in terms of risk-adjusted returns. Several implications are also discussed based on the results.
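Of the three model families compared, principal component regression is the most compact to illustrate: project the macro-financial predictors onto their leading principal components, then regress index returns on those components. The scikit-learn sketch below uses placeholder data and an assumed component count.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Placeholder data: 300 months of 15 macro/financial predictors
# and the next-month index return.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 15))
y = X[:, :3] @ np.array([0.4, -0.2, 0.1]) + 0.05 * rng.standard_normal(300)

# Principal component regression: PCA for dimension reduction,
# then ordinary least squares on the retained components.
pcr = make_pipeline(PCA(n_components=3), LinearRegression())
pcr.fit(X[:250], y[:250])
pred = pcr.predict(X[250:])   # out-of-sample return forecasts
```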
APA, Harvard, Vancouver, ISO, and other styles
32

Vogel, Sebastian A. A. [Verfasser], Gerd [Akademischer Betreuer] Ascheid, and Walter [Akademischer Betreuer] Stechele. "Design and implementation of number representations for efficient multiplierless acceleration of convolutional neural networks / Sebastian A. A. Vogel ; Gerd Ascheid, Walter Stechele." Aachen : Universitätsbibliothek der RWTH Aachen, 2020. http://d-nb.info/1220082716/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Phan, Leon L. "A methodology for the efficient integration of transient constraints in the design of aircraft dynamic systems." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/34750.

Full text
Abstract:
Transient regimes experienced by dynamic systems may have severe impacts on the operation of the aircraft. They are often regulated by dynamic constraints, requiring the dynamic signals to remain within bounds whose values vary with time. The verification of these peculiar types of constraints, which generally requires high-fidelity time-domain simulation, intervenes late in the system development process, thus potentially causing costly design iterations. The research objective of this thesis is to develop a methodology that integrates the verification of dynamic constraints into the early specification of dynamic systems. In order to circumvent the inefficiencies of time-domain simulation, multivariate dynamic surrogate models of the original time-domain simulation models are generated using wavelet neural networks (or wavenets). Concurrently, an alternate approach is formulated, in which the envelope of the dynamic response, extracted via a wavelet-based multiresolution analysis scheme, is subject to transient constraints. Dynamic surrogate models using sigmoid-based neural networks are generated to emulate the transient behavior of the envelope of the time-domain response. The run-time efficiency of the resulting dynamic surrogate models enables the implementation of a data farming approach, in which the full design space is sampled through a Monte Carlo simulation. An interactive visualization environment, enabling what-if analyses, is developed; the user can thereby instantaneously comprehend the transient response of the system (or its envelope) and its sensitivities to design and operation variables, as well as filter the design space to have it exhibit only the design scenarios verifying the dynamic constraints. The proposed methodology, along with its foundational hypotheses, is tested on the design and optimization of a 350 VDC network, where a generator and its control system are concurrently designed in order to minimize the electrical losses, while ensuring that the transient undervoltage induced by peak demands in the consumption of a motor does not violate transient power quality constraints.
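As a minimal illustration of the kind of check the surrogate models accelerate, the sketch below verifies a toy time-domain response against a bound whose value varies with time; the 350 VDC waveform and the undervoltage limits are invented for the example and are not taken from the thesis.

```python
# Hedged sketch: verify a transient response against a time-varying bound.
import numpy as np

t = np.linspace(0.0, 1.0, 2001)
# toy bus voltage: a peak motor demand causes a decaying undervoltage transient
v = 350.0 - 60.0 * np.exp(-12.0 * t) * (t > 0.05)

# time-varying limit: a deeper dip is tolerated briefly, then a tighter bound applies
limit = np.where(t < 0.25, 280.0, 330.0)

violations = t[v < limit]
print("constraint satisfied" if violations.size == 0 else
      f"violated from t={violations[0]:.3f}s")
```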
APA, Harvard, Vancouver, ISO, and other styles
34

Valenti, Giacomo. "Secure, efficient automatic speaker verification for embedded applications." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS471.

Full text
Abstract:
This industrial CIFRE PhD thesis addresses automatic speaker verification (ASV) issues in the context of embedded applications. The first part of this thesis focuses on more traditional problems and topics. The first work investigates the minimum enrolment data requirements for a practical, text-dependent short-utterance ASV system. Contributions in part A of the thesis consist of a statistical analysis whose objective is to isolate text-dependent factors and prove they are consistent across different sets of speakers. For very short utterances, the influence of a specific text content on the system performance can be considered a speaker-independent factor. Part B of the thesis focuses on neural network-based solutions. While it was clear that neural networks and deep learning were becoming state-of-the-art in several machine learning domains, their use for embedded solutions was hindered by their complexity. Contributions described in the second part of the thesis comprise blue-sky, experimental research which tackles the substitution of hand-crafted, traditional speaker features in favour of operating directly upon the audio waveform, and the search for optimal network architectures and weights by means of genetic algorithms. This work is the most fundamental contribution: lightweight, neuro-evolved network structures which are able to learn from the raw audio input.
APA, Harvard, Vancouver, ISO, and other styles
35

Westphal, Florian. "Efficient Document Image Binarization using Heterogeneous Computing and Interactive Machine Learning." Licentiate thesis, Blekinge Tekniska Högskola, Institutionen för datalogi och datorsystemteknik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-16797.

Full text
Abstract:
Large collections of historical document images have been collected by companies and government institutions for decades. More recently, these collections have been made available to a larger public via the Internet. However, to make accessing them truly useful, the contained images need to be made readable and searchable. One step in that direction is document image binarization, the separation of text foreground from page background. This separation makes the text shown in the document images easier to process by humans and other image processing algorithms alike. While reasonably well-performing binarization algorithms exist, it is not sufficient to just be able to perform the separation of foreground and background well. This separation also has to be achieved in an efficient manner, in terms of execution time, but also in terms of training data used by machine learning-based methods. This is necessary to make binarization not only theoretically possible, but also practically viable. In this thesis, we explore different ways to achieve efficient binarization in terms of execution time by improving the implementation and the algorithm of a state-of-the-art binarization method. We find that parameter prediction, as well as mapping the algorithm onto the graphics processing unit (GPU), helps to improve its execution performance. Furthermore, we propose a binarization algorithm based on recurrent neural networks and evaluate the choice of its design parameters with respect to their impact on execution time and binarization quality. Here, we identify a trade-off between binarization quality and execution performance based on the algorithm's footprint size and show that dynamically weighted training loss tends to improve the binarization quality. Lastly, we address the problem of training data efficiency by evaluating the use of interactive machine learning for reducing the required amount of training data for our recurrent neural network-based method. We show that user feedback can help to achieve better binarization quality with less training data and that visualized uncertainty helps to guide users to give more relevant feedback.
Scalable resource-efficient systems for big data analytics
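For readers unfamiliar with the task, the sketch below performs document image binarization with a classical Sauvola-style local threshold. It is a conventional baseline for separating text foreground from page background, not the recurrent-network method developed in the thesis, and the window size and parameters are assumptions.

```python
# Hedged baseline sketch: Sauvola-style local-threshold binarization.
import numpy as np
from scipy.ndimage import uniform_filter

def sauvola_binarize(img, window=25, k=0.2, r=128.0):
    mean = uniform_filter(img, window)                    # local mean
    sq_mean = uniform_filter(img ** 2, window)
    std = np.sqrt(np.maximum(sq_mean - mean ** 2, 0.0))   # local std deviation
    threshold = mean * (1.0 + k * (std / r - 1.0))        # Sauvola's formula
    return (img > threshold).astype(np.uint8)             # 1 = background, 0 = ink

page = np.random.rand(128, 128) * 255.0                   # stand-in for a scanned page
binary = sauvola_binarize(page)
print(binary.mean())                                      # fraction labeled background
```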
APA, Harvard, Vancouver, ISO, and other styles
36

Elbita, Abdulhakim M. "Efficient Processing of Corneal Confocal Microscopy Images. Development of a computer system for the pre-processing, feature extraction, classification, enhancement and registration of a sequence of corneal images." Thesis, University of Bradford, 2013. http://hdl.handle.net/10454/6463.

Full text
Abstract:
Corneal diseases are one of the major causes of visual impairment and blindness worldwide. Used for diagnoses, a laser confocal microscope provides a sequence of images, at incremental depths, of the various corneal layers and structures. From these, ophthalmologists can extract clinical information on the state of health of a patient's cornea. However, many factors impede ophthalmologists in forming diagnoses, starting with the large number and variable quality of the individual images (blurring, non-uniform illumination within images, variable illumination between images and noise), and there are also difficulties posed for automatic processing caused by eye movements in both lateral and axial directions during the scanning process. Aiding ophthalmologists working with long sequences of corneal images requires the development of new algorithms which enhance, correctly order and register the corneal images within a sequence. The novel algorithms devised for this purpose and presented in this thesis are divided into four main categories. The first is enhancement to reduce the problems within individual images. The second is automatic image classification to identify which part of the cornea each image belongs to, when they may not be in the correct sequence. The third is automatic reordering of the images to place the images in the right sequence. The fourth is automatic registration of the images with each other. A flexible application called CORNEASYS has been developed and implemented using MATLAB and the C language to provide and run all the algorithms and methods presented in this thesis. CORNEASYS offers users a collection of all the proposed approaches and algorithms in this thesis in one platform package. CORNEASYS also provides a facility to help the research team and ophthalmologists, who are in discussions to determine future system requirements which meet clinicians' needs.
The data and image files accompanying this thesis are not available online.
APA, Harvard, Vancouver, ISO, and other styles
37

Bergström, Carl, and Oscar Hjelm. "Impact of Time Steps on Stock Market Prediction with LSTM." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-262221.

Full text
Abstract:
Machine learning models as tools for predicting time series have in recent years proven to perform exceptionally well. With financial time series in the form of stock indices being inherently complex and subject to noise and volatility, the prediction of stock market movements has proven to be especially difficult throughout extensive research. The objective of this study is to thoroughly analyze the LSTM architecture for neural networks and its performance when applied to the S&P 500 stock index. The main research question revolves around quantifying the impact of varying the number of time steps in the LSTM model on predictive performance when applied to the S&P 500 index. The data used in the model is highly reliable, downloaded from the Bloomberg Terminal, with the closing price used as the model's feature. Other constituents of the model have been based on previous research, where satisfactory results have been reached. The results indicate that, among the evaluated time steps, ten steps provided the best performance. However, the impact of varying the number of time steps is not all too significant for the overall performance of the model. Finally, the implications of the results present themselves as a good basis for future research, where parameters are varied and fine-tuned in pursuit of optimal performance.
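A hedged sketch of the hyper-parameter under study: the closing-price series is windowed into sequences of a chosen number of time steps, one sequence per prediction; the toy series and the step counts compared are illustrative, not the study's data.

```python
# Windowing a closing-price series into LSTM inputs with `steps` time steps.
import numpy as np

def make_windows(prices, steps):
    X = np.stack([prices[i:i + steps] for i in range(len(prices) - steps)])
    y = prices[steps:]                    # next-day closing-price targets
    return X[..., None], y                # shape (samples, steps, 1) for an LSTM

close = np.cumsum(np.random.randn(500)) + 2800.0   # toy S&P 500-like series
for steps in (5, 10, 20):                          # the varied hyper-parameter
    X, y = make_windows(close, steps)
    print(steps, X.shape, y.shape)
```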
APA, Harvard, Vancouver, ISO, and other styles
38

Elbita, Abdulhakim Mehemed. "Efficient processing of corneal confocal microscopy images : development of a computer system for the pre-processing, feature extraction, classification, enhancement and registration of a sequence of corneal images." Thesis, University of Bradford, 2013. http://hdl.handle.net/10454/6463.

Full text
Abstract:
Corneal diseases are one of the major causes of visual impairment and blindness worldwide. Used for diagnoses, a laser confocal microscope provides a sequence of images, at incremental depths, of the various corneal layers and structures. From these, ophthalmologists can extract clinical information on the state of health of a patient's cornea. However, many factors impede ophthalmologists in forming diagnoses, starting with the large number and variable quality of the individual images (blurring, non-uniform illumination within images, variable illumination between images and noise), and there are also difficulties posed for automatic processing caused by eye movements in both lateral and axial directions during the scanning process. Aiding ophthalmologists working with long sequences of corneal images requires the development of new algorithms which enhance, correctly order and register the corneal images within a sequence. The novel algorithms devised for this purpose and presented in this thesis are divided into four main categories. The first is enhancement to reduce the problems within individual images. The second is automatic image classification to identify which part of the cornea each image belongs to, when they may not be in the correct sequence. The third is automatic reordering of the images to place the images in the right sequence. The fourth is automatic registration of the images with each other. A flexible application called CORNEASYS has been developed and implemented using MATLAB and the C language to provide and run all the algorithms and methods presented in this thesis. CORNEASYS offers users a collection of all the proposed approaches and algorithms in this thesis in one platform package. CORNEASYS also provides a facility to help the research team and ophthalmologists, who are in discussions to determine future system requirements which meet clinicians' needs.
APA, Harvard, Vancouver, ISO, and other styles
39

Fernandez, Brillet Lucas. "Réseaux de neurones CNN pour la vision embarquée." Thesis, Université Grenoble Alpes, 2020. http://www.theses.fr/2020GRALM043.

Full text
Abstract:
Recently, Convolutional Neural Networks have become the state-of-the-art solution (SOA) to most computer vision problems. In order to achieve high accuracy rates, CNNs require a high parameter count, as well as a high number of operations. This greatly complicates the deployment of such solutions in embedded systems, which strive to reduce memory size. Indeed, while most embedded systems are typically in the range of a few KBytes of memory, CNN models from the SOA usually account for multiple MBytes, or even GBytes, in model size. Throughout this thesis, multiple novel ideas allowing to ease this issue are proposed. This requires jointly designing the solution across three main axes: Application, Algorithm and Hardware. In this manuscript, the main levers allowing to tailor the computational complexity of a generic CNN-based object detector are identified and studied. Since object detection requires scanning every possible location and scale across an image through a fixed-input CNN classifier, the number of operations quickly grows for high-resolution images. In order to perform object detection in an efficient way, the detection process is divided into two stages. The first stage involves a region proposal network which allows to trade off recall for the number of operations required to perform the search, as well as the number of regions passed on to the next stage. Techniques such as bounding box regression also greatly help reduce the dimension of the search space. This in turn simplifies the second stage, since it allows to reduce the task's complexity to the set of possible proposals. Therefore, parameter counts can greatly be reduced. Furthermore, CNNs also exhibit properties that confirm their over-dimensioning. This over-dimensioning is one of the key success factors of CNNs in practice, since it eases the optimization process by allowing a large set of equivalent solutions. However, this also greatly increases computational complexity, and therefore complicates deploying the inference stage of these algorithms on embedded systems. In order to ease this problem, we propose a CNN compression method which is based on Principal Component Analysis (PCA). PCA allows to find, for each layer of the network independently, a new representation of the set of learned filters by expressing them in a more appropriate PCA basis. This PCA basis is hierarchical, meaning that basis terms are ordered by importance, and by removing the least important basis terms, it is possible to optimally trade off approximation error for parameter count. Through this method, it is possible to compress, for example, a ResNet-32 network by a factor of x2 both in the number of parameters and operations with a loss of accuracy <2%. It is also shown that the proposed method is compatible with other SOA methods which exploit other CNN properties in order to reduce computational complexity, mainly pruning, Winograd and quantization. By combining them all, we have been able to reduce the size of a ResNet-110 from 6.88 MBytes to 370 kBytes, i.e. a x19 memory gain, with a 3.9% accuracy loss. All this knowledge is applied in order to achieve an efficient CNN-based solution for a consumer face detection scenario. The proposed solution consists of just 29.3 kBytes model size. This is x65 smaller than other SOA CNN face detectors, while providing equal detection performance and a lower number of operations.
Our face detector is also compared to a more traditional Viola-Jones face detector, exhibiting approximately an order of magnitude faster computation, as well as the ability to scale to higher detection rates by slightly increasing computational complexity. Both networks are finally implemented on a custom embedded multiprocessor, verifying that the theoretical and measured gains from PCA are consistent. Furthermore, parallelizing the PCA-compressed network over 8 PEs achieves a x11.68 speed-up with respect to the original network running on a single PE.
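The following sketch illustrates the PCA compression idea on one convolutional layer: filters are flattened, a PCA basis is fitted, and only the leading basis terms are kept, trading approximation error for parameter count. The layer shape and the number of retained components are assumptions, and scikit-learn's PCA stands in for the thesis's per-layer implementation.

```python
# Hedged sketch: PCA-based compression of one conv layer's filters.
import numpy as np
from sklearn.decomposition import PCA

filters = np.random.randn(64, 3, 3, 3)          # (out_ch, in_ch, k, k), illustrative
flat = filters.reshape(64, -1)                  # one row per filter

pca = PCA(n_components=8)                       # keep 8 of 27 basis terms
coeffs = pca.fit_transform(flat)                # 64 x 8 coefficients
approx = pca.inverse_transform(coeffs).reshape(filters.shape)

orig = flat.size
kept = coeffs.size + pca.components_.size + pca.mean_.size
print(f"params: {orig} -> {kept}, rel. error "
      f"{np.linalg.norm(filters - approx) / np.linalg.norm(filters):.3f}")
```

Dropping more components shrinks the layer further at the cost of a larger approximation error, which is the hierarchical trade-off the abstract describes.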
APA, Harvard, Vancouver, ISO, and other styles
40

Nasser, Yehya. "An Efficient Computer-Aided Design Methodology for FPGA&ASIC High-Level Power Estimation Based on Machine Learning." Thesis, Rennes, INSA, 2019. http://www.theses.fr/2019ISAR0014.

Full text
Abstract:
Nowadays, advanced digital systems are required to address complex functionalities in a very wide range of applications. System complexity requires designers to respect different design constraints such as performance, area, power consumption and time-to-market. The best design choice is the one that respects all of these constraints. To select an efficient design, designers need to quickly assess the possible architectures. In this thesis, we focus on facilitating the evaluation of power consumption for both signal processing and hardware design engineers, so that it is possible to maintain fast, accurate and flexible power estimation. We present NeuPow as a system-level FPGA/ASIC power estimation method based on machine learning. We exploit neural networks to aid designers in exploring the dynamic power consumption of possible architectural solutions. NeuPow relies on propagating signals throughout connected neural models to predict the power consumption of a composite system at a high level of abstraction. We also provide an upgraded version that offers frequency-aware estimation. To prove the effectiveness of the proposed methodology, assessments such as technology and scalability studies have been conducted on ASIC and FPGA. Results show very good estimation accuracy, with less than 10% relative error independently of the technology and the design size. NeuPow also maintains high design productivity, where the simulation time obtained is significantly improved compared to that obtained with conventional design tools.
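As a rough sketch of the NeuPow principle, the code below chains tiny per-component neural power models so that each consumes propagated signal statistics and contributes a dynamic-power estimate. The components, features and random stand-in models are assumptions for illustration only, not NeuPow's trained models.

```python
# Hedged sketch: per-component neural power models chained by propagated
# signal statistics (e.g. toggle rates); total power is the sum of parts.
import numpy as np

rng = np.random.default_rng(0)

def make_component_model(n_stats=4):
    W = rng.normal(scale=0.3, size=(n_stats + 1, n_stats))  # stand-in for trained weights
    def predict(stats):
        out = np.tanh(W @ stats)
        power, next_stats = abs(out[0]), np.abs(out[1:])     # head 0: power; rest: stats
        return power, next_stats
    return predict

pipeline = [make_component_model() for _ in range(3)]        # e.g. adder -> mult -> acc
stats = np.array([0.5, 0.3, 0.2, 0.1])                       # input toggle-rate features

total = 0.0
for component in pipeline:
    p, stats = component(stats)                              # propagate statistics forward
    total += p
print(f"estimated dynamic power: {total:.3f} (arbitrary units)")
```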
APA, Harvard, Vancouver, ISO, and other styles
41

Shuvo, Md Kamruzzaman. "Hardware Efficient Deep Neural Network Implementation on FPGA." OpenSIUC, 2020. https://opensiuc.lib.siu.edu/theses/2792.

Full text
Abstract:
In recent years, there has been a significant push to implement Deep Neural Networks (DNNs) on edge devices, which requires power- and hardware-efficient circuits to carry out the intensive matrix-vector multiplication (MVM) operations. This work presents hardware-efficient MVM implementation techniques using bit-serial arithmetic and a novel MSB-first computation circuit. The proposed designs take advantage of the pre-trained network weight parameters, which are already known in the design stage. Thus, the partial computation results can be pre-computed and stored in look-up tables. Then the MVM results can be computed in a bit-serial manner without using multipliers. The proposed novel circuit implementation for convolution filters and the rectified linear activation function used in deep neural networks conducts computation in an MSB-first bit-serial manner. It can predict early whether the outcomes of filter computations will be negative and subsequently terminate the remaining computations to save power. The benefits of using the proposed MVM implementation techniques are demonstrated by comparing the proposed design with a conventional implementation. The proposed circuit is implemented on an FPGA. It shows significant power and performance improvements compared to the conventional designs implemented on the same FPGA.
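A minimal sketch of the MSB-first principle described above: a dot product followed by ReLU is evaluated one input bit-plane at a time, terminating early once the remaining planes can no longer make the result positive. The thesis additionally precomputes partial results into look-up tables, which this sketch omits; the bit widths and weights are assumptions.

```python
# Hedged sketch: MSB-first bit-serial dot product + ReLU with early termination.
import numpy as np

def msb_first_relu_dot(weights, x, bits=8):
    # x holds unsigned integers in [0, 2**bits); weights are pre-trained ints
    acc = 0
    pos_w = weights[weights > 0].sum()
    for b in range(bits - 1, -1, -1):               # MSB first
        plane = (x >> b) & 1                        # current bit-plane
        acc += (weights @ plane) << b               # shift-and-add, no multiplier
        remaining_max = pos_w * ((1 << b) - 1)      # best case for unseen planes
        if acc + remaining_max < 0:                 # outcome is surely negative
            return 0                                # stop early: ReLU(neg) = 0
    return max(acc, 0)

w = np.array([-3, 1, -2, 2])
x = np.array([200, 3, 180, 7], dtype=np.int64)
print(msb_first_relu_dot(w, x))                     # equals max(w @ x, 0)
```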
APA, Harvard, Vancouver, ISO, and other styles
42

Limnios, Stratis. "Graph Degeneracy Studies for Advanced Learning Methods on Graphs and Theoretical Results Edge degeneracy: Algorithmic and structural results Degeneracy Hierarchy Generator and Efficient Connectivity Degeneracy Algorithm A Degeneracy Framework for Graph Similarity Hcore-Init: Neural Network Initialization based on Graph Degeneracy." Thesis, Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAX038.

Full text
Abstract:
Extracting meaningful substructures from graphs has always been a key part of graph studies. In machine learning frameworks, supervised or unsupervised, as well as in theoretical graph analysis, finding dense subgraphs and specific decompositions is primordial in many social and biological applications, among many others. In this thesis we aim at studying graph degeneracy, starting from a theoretical point of view, and building upon our results to find the most suited decompositions for the tasks at hand. Hence, in the first part of the thesis we work on structural results in graphs with bounded edge admissibility, proving that such graphs can be reconstructed by aggregating graphs with almost-bounded edge degree. We also provide computational complexity guarantees for the different degeneracy decompositions, i.e. whether they are NP-complete or polynomial, depending on the length of the paths on which the given degeneracy is defined. In the second part we unify the degeneracy and admissibility frameworks based on degree and connectivity. Within those frameworks we pick the most expressive, on the one hand, and the most computationally efficient, on the other, namely the 1-edge-connectivity degeneracy, to experiment on standard degeneracy tasks, such as finding influential spreaders. Following the previous results, which proved to perform poorly, we go back to using the k-core, but plugging it into a supervised framework, i.e. graph kernels. Thus, providing a general framework named core-kernel, we use the k-core decomposition as a preprocessing step for the kernel and apply the latter on every subgraph obtained by the decomposition for comparison. We are able to achieve state-of-the-art performance on graph classification for a small computational cost trade-off. Finally, we design a novel degree degeneracy framework for hypergraphs and simultaneously for bipartite graphs, as the latter are the incidence graphs of hypergraphs. This decomposition is then applied directly to pretrained neural network architectures, as they induce bipartite graphs, using the coreness of the neurons to re-initialize the neural network weights. This framework not only outperforms state-of-the-art initialization techniques but is also applicable to any pair of convolutional and linear layers, and thus to any type of architecture.
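The sketch below illustrates the final contribution under stated assumptions: two consecutive layers are viewed as a bipartite graph, k-core (coreness) numbers are computed with networkx, and the random initialization is rescaled by coreness. The weight thresholding and the exact scaling rule are illustrative, not the thesis's Hcore-Init scheme.

```python
# Hedged sketch: coreness-aware re-initialization of one layer's weights.
import networkx as nx
import numpy as np

rng = np.random.default_rng(0)
n_in, n_out = 6, 4
W = rng.normal(scale=0.5, size=(n_out, n_in))     # stand-in pretrained weights

# bipartite graph induced by the (thresholded) weight matrix
G = nx.Graph()
G.add_nodes_from([("in", j) for j in range(n_in)])
G.add_nodes_from([("out", i) for i in range(n_out)])
G.add_edges_from([(("in", j), ("out", i))
                  for i in range(n_out) for j in range(n_in)
                  if abs(W[i, j]) > 0.3])          # assumed significance threshold

core = nx.core_number(G)                           # k-core number of each neuron
scale = np.array([[1.0 + core.get(("out", i), 0) + core.get(("in", j), 0)
                   for j in range(n_in)] for i in range(n_out)])
W_init = rng.normal(size=(n_out, n_in)) / np.sqrt(scale * n_in)
print(W_init.round(2))
```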
APA, Harvard, Vancouver, ISO, and other styles
43

Naoto, Chiche Benjamin. "Video classification with memory and computation-efficient convolutional neural network." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254678.

Full text
Abstract:
Video understanding involves problems such as video classification, which consists in labeling videos based on their contents and frames. In many real-world applications such as robotics, self-driving cars, augmented reality, and the Internet of Things (IoT), video understanding tasks need to be carried out in a real-time manner on a device with limited memory resources and computation capabilities, while meeting latency requirements. In this context, whereas neural networks that are memory- and computation-efficient, i.e., that present a reasonable trade-off between accuracy and efficiency with respect to memory size and computational speed, have been developed for image recognition tasks, studies about video classification have not made the most of these networks. To fill this gap, this project answers the following research question: how can video classification pipelines be built based on memory- and computation-efficient convolutional neural networks (CNNs), and how do they perform? In order to answer this question, the project builds and evaluates video classification pipelines that are new artefacts. This research involves triangulation (i.e., it is qualitative and quantitative at the same time) and the empirical research method is used for the evaluation. The artefacts are based on one of the existing memory- and computation-efficient CNNs, and their evaluation is based on a public video classification dataset and multiclass classification performance metrics. The case study research strategy is adopted: we try to generalize obtained results as far as possible to other memory- and computation-efficient CNNs and video classification datasets. The abductive research approach is used in order to verify or falsify hypotheses. As results, the artefacts are built and show satisfactory performance metrics compared both to baseline pipelines that are also developed in this thesis and to metric values reported in other papers that used the same dataset. To conclude, video classification pipelines based on memory- and computation-efficient CNNs can be built by designing and developing artefacts that combine approaches inspired by existing papers with new approaches, and these artefacts present satisfactory performance. In particular, we observe that the drop in accuracy induced by a memory- and computation-efficient CNN when dealing with video frames is, to some extent, compensated by capturing temporal information via consideration of the sequence of these frames.
APA, Harvard, Vancouver, ISO, and other styles
44

Batbayar, Batsukh. "Improving Time Efficiency of Feedforward Neural Network Learning." RMIT University, Electrical and Computer Engineering, 2009. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20090303.114706.

Full text
Abstract:
Feedforward neural networks have been widely studied and used in many applications in science and engineering. The training of this type of network is mainly undertaken using the well-known backpropagation-based learning algorithms. One major problem with this type of algorithm is the slow training convergence speed, which hinders their applications. In order to improve the training convergence speed of this type of algorithm, many researchers have developed different improvements and enhancements. However, the slow convergence problem has not been fully addressed. This thesis makes several contributions by proposing new backpropagation learning algorithms based on the terminal attractor concept to improve existing backpropagation learning algorithms such as the gradient descent and Levenberg-Marquardt algorithms. These new algorithms enable fast convergence both at a distance from and in close range of the ideal weights. In particular, a new fast convergence mechanism is proposed which is based on the fast terminal attractor concept. Comprehensive simulation studies are undertaken to demonstrate the effectiveness of the proposed backpropagation algorithms with terminal attractors. Finally, three practical application cases of time series forecasting, character recognition and image interpolation are chosen to show the practicality and usefulness of the proposed learning algorithms, with comprehensive comparative studies against existing algorithms.
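As a hedged sketch of the terminal attractor concept, the update below scales a gradient step by a fractional power of the error so that convergence near the optimum is finite-time rather than asymptotic. The exponent, learning rate and toy one-dimensional problem are assumptions, not the thesis's exact algorithms.

```python
# Hedged sketch: a terminal-attractor-style update on a toy 1-D problem.
import numpy as np

def terminal_attractor_step(w, error, lr=0.1, beta=1.0 / 3.0):
    # scaling by |error|^beta keeps the step moderate far from the solution,
    # while near it the fractional power yields finite-time (terminal)
    # convergence instead of the asymptotic decay of plain gradient descent
    return w - lr * np.sign(error) * abs(error) ** beta

w, target = 5.0, 2.0
for _ in range(100):
    w = terminal_attractor_step(w, w - target)   # residual of a toy 1-D fit
print(f"w = {w:.2f} (target 2.0, reached up to the discretization band)")
```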
APA, Harvard, Vancouver, ISO, and other styles
45

Harte, T. P. "Efficient neural network classification of magnetic resonance images of the breast." Thesis, University of Cambridge, 1998. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.603805.

Full text
Abstract:
This dissertation proposes a new method of automated malignancy recognition in contrast-enhanced magnetic resonance images of the human breast using the multi-layer perceptron (MLP) feed-forward neural network paradigm. The fundamental limitation is identified as being the efficiency of such a classifier: the computational budget demanded by multi-dimensional image data sets is immense. Without optimization the MLP flounders. This work proposes a new efficient algorithm for MLP classification of large multi-dimensional data sets based on fast discrete orthogonal transforms. This is possible given the straightforward observation that point-wise mask-processing of image data for classification purposes is linear spatial convolution. The novel observation, then, is that the MLP permits convolution at its input layer due to the linearity of the inner product which it computes. Optimized fast Fourier transforms (FFTs) are investigated and an order of magnitude improvement in the execution time of a four-dimensional transform is achieved over commonly-implemented FFTs. One of the principal retardations in common multi-dimensional FFTs is observed to be the lack of attention paid to memory-hierarchy considerations. A simple, but fast, technique for optimizing cache performance is implemented. The abstract mathematical basis for convolution is investigated and a finite integer number theoretic transform (NTT) approach suggests itself, because such a transform can be defined that is fast, purely real, has parsimony of memory requirements, and has compact hardware realizations. A new method for multi-dimensional convolution with long-length number theoretic transforms is presented. This is an extension of previous work where NTTs were implemented over pseudo-Mersenne and pseudo-Fermat surrogate moduli. A suitable modulus is identified which allows long-length transforms that readily lend themselves to the multi-dimensional convolution problem involved in classifying large magnetic resonance image data sets.
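The core observation admits a compact sketch: applying the first MLP layer at every pixel position is a linear spatial convolution with each hidden unit's weight mask, so it can be computed with FFTs. Sizes are illustrative, and scipy's fftconvolve stands in for the optimized transforms and NTTs developed in the dissertation.

```python
# Hedged sketch: the first MLP layer applied image-wide via FFT convolution.
import numpy as np
from scipy.signal import fftconvolve

image = np.random.rand(256, 256)          # toy MR slice
mask = np.random.randn(9, 9)              # one hidden unit's 9x9 weight mask
bias = 0.1

# spatial inner product at every position == 2-D correlation == convolution
# with the flipped mask; FFT makes this O(N^2 log N) instead of O(N^2 k^2)
pre_activation = fftconvolve(image, mask[::-1, ::-1], mode='valid') + bias
hidden = np.tanh(pre_activation)          # MLP nonlinearity applied pointwise
print(hidden.shape)                       # (248, 248) responses, one per window
```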
APA, Harvard, Vancouver, ISO, and other styles
46

Zhou, Helong, and 周賀龍. "Efficient Kernel Sharing Convolutional Neural Networks." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/hx3vqj.

Full text
Abstract:
Master's thesis
National Taiwan University of Science and Technology
Department of Electronic Engineering
106 (2017)
Increasing focus has been put on pursuing computation-efficient convolutional neural network (CNN) models. To lessen the redundancy of convolutional kernels, this thesis proposes two new convolutional structures, i.e., kernel sharing convolution (KSC) and weighted kernel sharing convolution (WKSC), where an extra weighting is imposed on each input in WKSC to manifest the diversity of input channels. Inspired by the fact that in traditional convolution each input channel has its respective kernel to convolve with, which may lead to redundant kernels, both of the proposed schemes gather the inputs using the same kernel together, so the inputs in each group can share the same convolutional kernel. As a consequence, the number of kernels can be greatly reduced, leading to a reduction of model parameters and a speedup of inference. Moreover, WKSC is also combined with depthwise separable convolutions, resulting in a highly compressed architecture. Extensive experiments on CIFAR-100, Caltech-256 and ImageNet classification demonstrate the effectiveness of the new approach in both computation cost and the parameters required compared with state-of-the-art works.
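A PyTorch sketch of the proposed structures under stated assumptions: input channels are split into groups, channels within a group are weighted (the WKSC weighting) and summed, and a single shared kernel set convolves each group. The group size and contiguous grouping follow a plain reading of the abstract rather than the thesis's exact formulation.

```python
# Hedged sketch: weighted kernel sharing convolution (WKSC).
import torch
import torch.nn as nn

class WKSC(nn.Module):
    def __init__(self, in_ch, out_ch, share=4, k=3):
        super().__init__()
        assert in_ch % share == 0
        self.share = share
        self.alpha = nn.Parameter(torch.ones(in_ch))        # WKSC channel weights
        self.conv = nn.Conv2d(in_ch // share, out_ch, k, padding=k // 2)

    def forward(self, x):
        n, c, h, w = x.shape
        x = x * self.alpha.view(1, c, 1, 1)                 # weight each input channel
        x = x.view(n, c // self.share, self.share, h, w).sum(2)  # group summation
        return self.conv(x)                                 # one shared kernel per group

y = WKSC(16, 32)(torch.randn(2, 16, 8, 8))
print(y.shape)   # torch.Size([2, 32, 8, 8]) with 4x fewer kernels than plain conv
```

Setting `alpha` to a fixed all-ones vector recovers plain KSC, where grouped inputs share a kernel without per-channel weighting.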
APA, Harvard, Vancouver, ISO, and other styles
47

Stanley, Kenneth Owen. "Efficient evolution of neural networks through complexification." Thesis, 2004. http://hdl.handle.net/2152/1266.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

"Energy Efficient Hardware Design of Neural Networks." Master's thesis, 2018. http://hdl.handle.net/2286/R.I.51597.

Full text
Abstract:
Hardware implementation of deep neural networks is earning significant importance nowadays. Deep neural networks are mathematical models that use learning algorithms inspired by the brain. Numerous deep learning algorithms such as multi-layer perceptrons (MLP) have demonstrated human-level recognition accuracy in image and speech classification tasks. Multiple layers of processing elements called neurons, with several connections between them called synapses, are used to build these networks. Hence, they involve operations that exhibit a high level of parallelism, making them computationally and memory intensive. Constrained by computing resources and memory, most of the applications require a neural network which utilizes less energy. Energy-efficient implementation of these computationally intense algorithms on neuromorphic hardware demands a lot of architectural optimizations. One of these optimizations is the reduction in network size using compression, and several studies investigated compression by introducing element-wise or row-/column-/block-wise sparsity via pruning and regularization. Additionally, numerous recent works have concentrated on reducing the precision of activations and weights, with some reducing to a single bit. However, combining various sparsity structures with binarized or very-low-precision (2-3 bit) neural networks has not been comprehensively explored. Output activations in these deep neural network algorithms are habitually non-binary, making it difficult to exploit sparsity. On the other hand, biologically realistic models like spiking neural networks (SNN) closely mimic the operations in biological nervous systems and explore new avenues for brain-like cognitive computing. These networks deal with binary spikes, and they can exploit input-dependent sparsity or redundancy to dynamically scale the amount of computation, in turn leading to energy-efficient hardware implementation. This work discusses a configurable spiking neuromorphic architecture that supports multiple hidden layers exploiting hardware reuse. It also presents design techniques for minimum-area/-energy DNN hardware with minimal degradation in accuracy. Area, performance and energy results of the DNN and SNN hardware are reported for the MNIST dataset. The neuromorphic hardware designed for the SNN algorithm in 28nm CMOS demonstrates high classification accuracy (>98% on MNIST) and low energy (51.4-773 nJ per classification). The optimized DNN hardware designed in 40nm CMOS, which combines 8X structured compression and 3-bit weight precision, showed 98.4% accuracy at 33 nJ per classification.
Dissertation/Thesis
Masters Thesis Electrical Engineering 2018
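As a small illustration of the very-low-precision weights discussed above, the sketch quantizes a weight tensor to 3-bit integers with a uniform symmetric scheme; the scale choice is a common convention assumed here, and the structured-sparsity side of the work is not shown.

```python
# Hedged sketch: uniform symmetric 3-bit weight quantization.
import numpy as np

def quantize_weights(w, bits=3):
    qmax = 2 ** (bits - 1) - 1                   # 3 bits -> integer levels -4..3
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale              # integers for hardware, scale for inference

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_weights(w)
print(q)                                         # 3-bit integer weights
print(np.abs(w - q * s).max())                   # worst-case quantization error
```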
APA, Harvard, Vancouver, ISO, and other styles
49

Stanley, Kenneth Owen, and Risto Miikkulainen (supervisor). "Efficient evolution of neural networks through complexification." 2004. http://wwwlib.umi.com/cr/utexas/fullcit?p3143474.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Gupta, Kartik. "Towards Efficient and Reliable Deep Neural Networks." Phd thesis, 2022. http://hdl.handle.net/1885/275682.

Full text
Abstract:
Deep neural networks have achieved state-of-the-art performance for various machine learning tasks in different domains such as computer vision, natural language processing, bioinformatics, speech processing, etc. Despite this success, their excessive computational and memory requirements limit their practical usability for real-time applications or in resource-limited devices. Neural network quantization has become increasingly popular due to the efficient memory consumption and faster computation resulting from bit-wise operations on the quantized networks; the objective is to learn a network while restricting the parameters (and activations) to take values from a small discrete set. Another important aspect of modern neural networks is the adversarial vulnerability and reliability of their predictions. In addition to obtaining accurate predictions, it is also critical to accurately quantify the predictive uncertainty of deep neural networks in many real-world decision-making applications. Calibrating neural networks is of utmost importance when employing them in safety-critical applications where the downstream decision-making depends on the predicted probabilities. Further to this, modern machine vision algorithms have also been shown to be extremely susceptible to small and almost imperceptible perturbations of their inputs. To this end, we tackle these fundamental challenges in modern neural networks, focusing on the efficiency and reliability of neural networks. Neural network quantization is usually formulated as a constrained optimization problem and optimized via a modified version of gradient descent. By interpreting the continuous (unconstrained) parameters as the dual of the quantized ones, we first introduce a Mirror Descent (MD) framework for NN quantization. Specifically, we provide conditions on the projections (i.e., the mapping from continuous to quantized parameters) which enable us to derive valid mirror maps and in turn the respective MD updates. Furthermore, we present a numerically stable implementation of MD that requires storing an additional set of auxiliary (unconstrained) variables, and show that it is strikingly analogous to the STE-based method which is typically viewed as a "trick" to avoid the vanishing gradients issue. Our experiments on multiple computer vision classification datasets with multiple network architectures demonstrate that our MD variants yield state-of-the-art performance. Even though quantized networks exhibit excellent generalization capabilities, their robustness properties are not well understood. Therefore, next we systematically study the robustness of quantized networks against gradient-based adversarial attacks and demonstrate that these quantized models suffer from gradient vanishing issues and show a false sense of robustness. By attributing gradient vanishing to poor forward-backward signal propagation in the trained network, we introduce a simple temperature scaling approach to mitigate this issue while preserving the decision boundary. Experiments on multiple image classification datasets with multiple network architectures demonstrate that our temperature-scaled attacks obtain a near-perfect success rate on quantized networks. Finally, we introduce a binning-free calibration measure inspired by the classical Kolmogorov-Smirnov (KS) statistical test, in which the main idea is to compare the respective cumulative probability distributions.
From this, by approximating the empirical cumulative distribution using a differentiable function via splines, we obtain a recalibration function, which maps the network outputs to actual (calibrated) class assignment probabilities. We tested our method against existing calibration approaches on various image classification datasets and our spline-based recalibration approach consistently outperforms existing methods on KS error as well as other commonly used calibration measures.
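A minimal sketch of the binning-free measure, assuming the usual construction: the cumulative distribution of predicted confidences is compared against the cumulative distribution of observed correctness and the maximum gap is reported, in the spirit of the KS test. The toy predictions are illustrative and the spline recalibration step is omitted.

```python
# Hedged sketch: binning-free KS calibration error.
import numpy as np

def ks_calibration_error(confidence, correct):
    order = np.argsort(confidence)
    conf, hit = confidence[order], correct[order].astype(float)
    n = len(conf)
    cum_conf = np.cumsum(conf) / n        # cumulative predicted probability
    cum_acc = np.cumsum(hit) / n          # cumulative observed accuracy
    return np.max(np.abs(cum_conf - cum_acc))

conf = np.array([0.95, 0.80, 0.70, 0.99, 0.60])
correct = np.array([1, 1, 0, 1, 1])
print(ks_calibration_error(conf, correct))   # 0 would mean perfect calibration
```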
APA, Harvard, Vancouver, ISO, and other styles