Dissertations / Theses on the topic 'Approximate identity neural networks'

Listed below are the top 24 dissertations / theses for research on the topic 'Approximate identity neural networks.'
1

Ling, Hong. "Implementation of Stochastic Neural Networks for Approximating Random Processes." Master's thesis, Lincoln University. Environment, Society and Design Division, 2007. http://theses.lincoln.ac.nz/public/adt-NZLIU20080108.124352/.

Abstract:
Artificial Neural Networks (ANNs) can be viewed as mathematical models that simulate natural and biological systems by mimicking the information-processing methods of the human brain. Current ANNs, however, focus only on approximating arbitrary deterministic input-output mappings; they do not adequately represent the variability observed in systems' natural settings, nor do they capture the complexity of whole-system behaviour. This thesis develops a new class of neural networks, Stochastic Neural Networks (SNNs), to simulate the internal stochastic properties of systems. A suitable mathematical model for SNNs is derived from the canonical representation of stochastic processes or systems given by the Karhunen-Loève theorem. Successful real examples, such as the analysis of the full displacement field of wood in compression, confirm the validity of the proposed neural networks. Furthermore, analysis of the internal workings of SNNs provides an in-depth view of their operation that helps to build a better understanding of how SNNs simulate stochastic processes.
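The Karhunen-Loève representation that underpins the proposed SNNs can be sketched in a few lines. The following is a generic illustration, not the thesis's implementation: it samples paths of a zero-mean Gaussian process from a truncated KL expansion, using the Brownian-motion covariance min(s, t) as a stand-in example.

```python
import numpy as np

def kl_expansion(cov, n_terms, n_paths, rng):
    """Sample approximate paths of a zero-mean Gaussian process whose
    covariance matrix on a grid is `cov`, via a truncated
    Karhunen-Loeve expansion with `n_terms` eigenfunctions."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    # keep the n_terms largest eigenpairs (eigh returns ascending order)
    lam = eigvals[-n_terms:]
    phi = eigvecs[:, -n_terms:]
    # independent standard-normal coefficients, one set per path
    xi = rng.standard_normal((n_paths, n_terms))
    return (xi * np.sqrt(lam)) @ phi.T  # shape (n_paths, cov.shape[0])

# Example: Brownian-motion covariance min(s, t) on (0, 1]
t = np.linspace(0.01, 1.0, 100)
cov = np.minimum.outer(t, t)
paths = kl_expansion(cov, n_terms=20, n_paths=500, rng=np.random.default_rng(0))
```

With all eigenpairs retained the expansion reproduces the covariance exactly; truncation keeps only the dominant modes, which is the sense in which the representation is canonical.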
2

Garces, Freddy. "Dynamic neural networks for approximate input- output linearisation-decoupling of dynamic systems." Thesis, University of Reading, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.368662.

3

Li, Yingzhen. "Approximate inference : new visions." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/277549.

Abstract:
Nowadays machine learning (especially deep learning) techniques are being incorporated into many intelligent systems affecting the quality of human life. The ultimate purpose of these systems is to perform automated decision making, and in order to achieve this, predictive systems need to return estimates of their confidence. Powered by the rules of probability, Bayesian inference is the gold-standard method for coherent reasoning under uncertainty. It is generally believed that intelligent systems following the Bayesian approach can better incorporate uncertainty information for reliable decision making, and be less vulnerable to attacks such as data poisoning. Critically, the success of Bayesian methods in practice, including the recent resurgence of Bayesian deep learning, relies on fast and accurate approximate Bayesian inference applied to probabilistic models. These approximate inference methods perform (approximate) Bayesian reasoning at a relatively low cost in terms of time and memory, thus allowing the principles of Bayesian modelling to be applied to many practical settings. However, more work needs to be done to scale approximate Bayesian inference methods to big systems such as deep neural networks and large-scale datasets such as ImageNet. In this thesis we develop new algorithms towards addressing the open challenges in approximate inference. In the first part of the thesis we develop two new approximate inference algorithms, by drawing inspiration from the well known expectation propagation and message passing algorithms. Both approaches provide a unifying view of existing variational methods from different algorithmic perspectives. We also demonstrate that they lead to better calibrated inference results for complex models such as neural network classifiers and deep generative models, and scale to large datasets containing hundreds of thousands of data-points.
In the second theme of the thesis we propose a new research direction for approximate inference: developing algorithms for fitting posterior approximations of arbitrary form, by rethinking the fundamental principles of Bayesian computation and the necessity of algorithmic constraints in traditional inference schemes. We specify four algorithmic options for the development of such new generation approximate inference methods, with one of them further investigated and applied to Bayesian deep learning tasks.
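As a toy illustration of what approximate inference methods compute (and unrelated to the specific algorithms developed in the thesis), the sketch below fits a Gaussian to an intractable 1-D posterior by moment matching on a grid, the basic operation behind expectation propagation. The prior, likelihood factor, and grid are invented for the example.

```python
import numpy as np

def gaussian_moment_match(grid, log_unnorm):
    """Fit a Gaussian to an unnormalised 1-D density by matching its
    first two moments, computed numerically on a uniform grid."""
    w = np.exp(log_unnorm - log_unnorm.max())  # stabilised, unnormalised
    w /= w.sum()                               # normalise (uniform grid)
    mean = (w * grid).sum()
    var = (w * (grid - mean) ** 2).sum()
    return mean, var

# Toy "posterior": N(0, 1) prior times a logistic likelihood factor,
# which skews the density and makes it non-Gaussian
x = np.linspace(-8.0, 8.0, 4001)
log_post = -0.5 * x ** 2 - np.log1p(np.exp(-3.0 * x))
mean, var = gaussian_moment_match(x, log_post)
```

The log-concave likelihood pulls the mean positive and shrinks the variance below the prior's, exactly the kind of correction a Gaussian approximate posterior is meant to capture.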
4

Liu, Leo. "Acoustic models for speech recognition using Deep Neural Networks based on approximate math." M. Eng. thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/100633.

Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.
Includes bibliographical references (pages 81-83).
Deep Neural Networks (DNNs) are effective models for machine learning. Unfortunately, training a DNN is extremely time-consuming, even with the aid of a graphics processing unit (GPU). DNN training is especially slow for tasks with large datasets. Existing approaches for speeding up the process involve parallelizing the Stochastic Gradient Descent (SGD) algorithm used to train DNNs. Those approaches do not guarantee the same results as normal SGD since they introduce non-trivial changes into the algorithm. A new approach for faster training that avoids significant changes to SGD is to use low-precision hardware. The low-precision hardware is faster than a GPU, but it performs arithmetic with 1% error. In this arithmetic, 98 + 2 = 99.776 and 10 * 10 = 100.863. This thesis determines whether DNNs would still be able to produce state-of-the-art results using this low-precision arithmetic. To answer this question, we implement an approximate DNN that uses the low-precision arithmetic and evaluate it on the TIMIT phoneme recognition task and the WSJ speech recognition task. For both tasks, we find that acoustic models based on approximate DNNs perform as well as ones based on conventional DNNs; both produce similar recognition error rates. The approximate DNN is able to match the conventional DNN only if it uses Kahan summations to preserve precision. These results show that DNNs can run on low-precision hardware without the arithmetic causing any loss in recognition ability. The low-precision hardware is therefore a suitable approach for speeding up DNN training.
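The role of Kahan summation flagged in the abstract is easy to reproduce. The sketch below is an illustration rather than the thesis's code: it emulates low-precision accumulation with NumPy's float16 and compares a naive running sum against a compensated (Kahan) sum. The exact sum of the values is roughly 99.98.

```python
import numpy as np

def naive_sum(xs):
    s = np.float16(0.0)
    for x in xs:
        s = np.float16(s + x)   # every addition rounds to low precision
    return float(s)

def kahan_sum(xs):
    s = np.float16(0.0)
    c = np.float16(0.0)         # running compensation for lost low-order bits
    for x in xs:
        y = np.float16(x - c)
        t = np.float16(s + y)
        c = np.float16(np.float16(t - s) - y)
        s = t
    return float(s)

vals = [np.float16(0.1)] * 1000   # the float16 nearest to 0.1 is ~0.0999756
plain, compensated = naive_sum(vals), kahan_sum(vals)
```

The compensated sum stays close to the true value, while the naive sum drifts noticeably once the running total is large enough that each 0.1 falls below half the float16 spacing.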
5

Scotti, Andrea. "Graph Neural Networks and Learned Approximate Message Passing Algorithms for Massive MIMO Detection." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-284500.

Abstract:
Massive multiple-input and multiple-output (MIMO) is a method to improve the performance of wireless communication systems by having a large number of antennas at both the transmitter and the receiver. In the fifth-generation (5G) mobile communication system, Massive MIMO is a key technology to face the increasing number of mobile users and satisfy user demands. At the same time, recovering the transmitted information in a massive MIMO uplink receiver requires more computational complexity when the number of transmitters increases. Indeed, the optimal maximum likelihood (ML) detector has a complexity exponentially increasing with the number of transmitters. Therefore, one of the main challenges in the field is to find the best sub-optimal MIMO detection algorithm according to the performance/complexity tradeoff. In this work, all the algorithms are empirically evaluated for large MIMO systems and higher-order modulations. Firstly, we show how MIMO detection can be represented by a Markov Random Field (MRF) and addressed by the loopy belief propagation (LBP) algorithm to approximately solve the equivalent MAP (maximum a posteriori) inference problem. Then, we propose a novel algorithm (BP-MMSE) that starts from the minimum mean square error (MMSE) solution and updates the prior in each iteration with the LBP belief. To avoid the complexity of computing MMSE, we use Graph Neural Networks (GNNs) to learn a message-passing algorithm that solves the inference task on the same graph. To further reduce the complexity of message-passing algorithms, we recall how, in the large system limit, approximate message passing (AMP), a low-complexity iterative algorithm, can be derived from LBP to solve MIMO detection for i.i.d. Gaussian channels. Then, we show numerically how AMP with damping (DAMP) can be robust to low/medium correlation among the channels.
To conclude, we propose a low-complexity deep neural iterative scheme (Pseudo-MMNet) for solving MIMO detection in the presence of highly correlated channels, at the expense of online training for each channel realization. Pseudo-MMNet is based on the MMNet algorithm presented in [24] (in turn based on AMP), and it significantly reduces the online training complexity that makes MMNet far from realistic implementations.
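The MMSE starting point that BP-MMSE builds on is the classical linear detector. A minimal sketch, with QPSK symbols, an i.i.d. Gaussian channel, and all dimensions chosen arbitrarily for illustration:

```python
import numpy as np

def mmse_detect(H, y, noise_var, symbols):
    """Linear MMSE estimate x_hat = (H^H H + sigma^2 I)^-1 H^H y,
    followed by a hard decision to the nearest constellation symbol."""
    n = H.shape[1]
    G = np.conj(H.T) @ H + noise_var * np.eye(n)
    x_hat = np.linalg.solve(G, np.conj(H.T) @ y)
    # hard decision: nearest symbol per transmit antenna
    return symbols[np.argmin(np.abs(x_hat[:, None] - symbols[None, :]), axis=1)]

rng = np.random.default_rng(1)
qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
# 64 receive antennas, 16 transmitters, i.i.d. Gaussian channel
H = (rng.standard_normal((64, 16)) + 1j * rng.standard_normal((64, 16))) / np.sqrt(2 * 64)
x = qpsk[rng.integers(0, 4, 16)]
y = H @ x + 0.01 * (rng.standard_normal(64) + 1j * rng.standard_normal(64))
x_dec = mmse_detect(H, y, 2e-4, qpsk)
```

At high SNR the linear detector already recovers the transmitted symbols; the thesis's contribution is improving on this baseline when channels are correlated and the system is large.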
6

Gaur, Yamini. "Exploring Per-Input Filter Selection and Approximation Techniques for Deep Neural Networks." Thesis, Virginia Tech, 2019. http://hdl.handle.net/10919/90404.

Abstract:
We propose a dynamic, input dependent filter approximation and selection technique to improve the computational efficiency of Deep Neural Networks. The approximation techniques convert 32 bit floating point representation of filter weights in neural networks into smaller precision values. This is done by reducing the number of bits used to represent the weights. In order to calculate the per-input error between the trained full precision filter weights and the approximated weights, a metric called Multiplication Error (ME) has been chosen. For convolutional layers, ME is calculated by subtracting the approximated filter weights from the original filter weights, convolving the difference with the input and calculating the grand-sum of the resulting matrix. For fully connected layers, ME is calculated by subtracting the approximated filter weights from the original filter weights, performing matrix multiplication between the difference and the input and calculating the grand-sum of the resulting matrix. ME is computed to identify approximated filters in a layer that result in low inference accuracy. In order to maintain the accuracy of the network, these filters weights are replaced with the original full precision weights. Prior work has primarily focused on input independent (static) replacement of filters to low precision weights. In this technique, all the filter weights in the network are replaced by approximated filter weights. This results in a decrease in inference accuracy. The decrease in accuracy is higher for more aggressive approximation techniques. Our proposed technique aims to achieve higher inference accuracy by not approximating filters that generate high ME. Using the proposed per-input filter selection technique, LeNet achieves an accuracy of 95.6% with 3.34% drop from the original accuracy value of 98.9% for truncating to 3 bits for the MNIST dataset. 
On the other hand, with static filter approximation, LeNet achieves an accuracy of 90.5%, an 8.5% drop from the original accuracy. The aim of our research is to potentially use low-precision weights in deep learning algorithms to achieve high classification accuracy with less computational overhead. We explore various filter approximation techniques and implement a per-input filter selection and approximation technique that selects the filters to approximate at run-time.
Master of Science
Deep neural networks, just like the human brain can learn important information about the data provided to them and can classify a new input based on the labels corresponding to the provided dataset. Deep learning technology is heavily employed in devices using computer vision, image and video processing and voice detection. The computational overhead incurred in the classification process of DNNs prohibits their use in smaller devices. This research aims to improve network efficiency in deep learning by replacing 32 bit weights in neural networks with less precision weights in an input-dependent manner. Trained neural networks are numerically robust. Different layers develop tolerance to minor variations in network parameters. Therefore, differences induced by low-precision calculations fall well within tolerance limit of the network. However, for aggressive approximation techniques like truncating to 3 and 2 bits, inference accuracy drops severely. We propose a dynamic technique that during run-time, identifies the approximated filters resulting in low inference accuracy for a given input and replaces those filters with the original filters to achieve high inference accuracy. The proposed technique has been tested for image classification on Convolutional Neural Networks. The datasets used are MNIST and CIFAR-10. The Convolutional Neural Networks used are 4-layered CNN, LeNet-5 and AlexNet.
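The Multiplication Error computation can be reconstructed for a fully connected layer roughly as follows. This is a hypothetical reading of the abstract, with a simple range-based truncating quantiser standing in for the thesis's approximation schemes; all shapes and names are invented for illustration.

```python
import numpy as np

def truncate_weights(w, n_bits):
    """Quantise weights to n_bits by uniform truncation over the
    tensor's dynamic range (a stand-in for the thesis's schemes)."""
    scale = np.abs(w).max() / (2 ** (n_bits - 1) - 1)
    return np.trunc(w / scale) * scale

def multiplication_error(w, w_approx, x):
    """ME for a fully connected layer: grand-sum of (W - W_q) @ x."""
    return abs(float(((w - w_approx) @ x).sum()))

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))   # hypothetical layer weights
x = rng.standard_normal(128)         # one input
errors = {b: multiplication_error(W, truncate_weights(W, b), x)
          for b in (2, 3, 8)}
```

Filters whose ME for the current input exceeds a threshold would then be swapped back to full precision, which is the per-input selection step the thesis describes.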
7

Dumlupinar, Taha. "Approximate Analysis and Condition Assessment of Reinforced Concrete T-beam Bridges Using Artificial Neural Networks." Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/3/12609732/index.pdf.

Abstract:
In recent years, artificial neural networks (ANNs) have been employed for estimation and prediction purposes in many areas of civil/structural engineering. In this thesis, a multilayered feedforward backpropagation algorithm is used for the approximate analysis and calibration of RC T-beam bridges and for modeling the bridge ratings of these bridges. Currently, bridges are analyzed using a standard FEM program. However, when a large population of bridges is concerned, such as the one considered in this project (the Pennsylvania T-beam bridge population), it is impractical to carry out FEM analysis of all bridges in the population, because the development and analysis of every single bridge requires considerable time as well as effort. Rapid and acceptably approximate analysis of bridges is possible using the ANN approach. The first part of the study describes the application of neural network (NN) systems in developing the relationships between bridge parameters and bridge responses. The NN models are trained using training data that are obtained from finite-element analyses and that contain bridge parameters as inputs and critical responses as outputs. In the second part, ANN systems are used for the calibration of the finite element model of a typical RC T-beam bridge - the Manoa Road Bridge from Pennsylvania's T-beam bridge population - based on field test data. Manual calibration of these models is extremely time consuming and laborious. Therefore, a neural network-based method is developed for easy and practical calibration of these models. The ANN model is trained using training data that are obtained from finite-element analyses and that contain modal and displacement parameters as inputs and structural parameters as outputs. After the training is completed, the field-measured data set is fed into the trained ANN model. Then, the FE model is updated with the predicted structural parameters from the ANN model. In the final part, Neural Networks (NNs) are used to model the bridge ratings of RC T-beam bridges based on bridge parameters. Bridge load ratings are calculated more accurately by taking into account the actual geometry and detailing of the T-beam bridges. Then, an ANN solution is developed to easily compute bridge load ratings.
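The surrogate-modelling idea, training a multilayered feedforward network by backpropagation on data generated by simulations, can be sketched in plain NumPy. The "bridge parameters" and response function below are synthetic stand-ins for the FEM training data, not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical surrogate task: 4 normalised input parameters -> 1 response.
# The target function here is synthetic, standing in for FEM outputs.
X = rng.uniform(-1, 1, (256, 4))
y = X[:, :1] ** 2 + 0.5 * X[:, 1:2] - 0.3 * X[:, 2:3] * X[:, 3:4]

# One hidden layer of 16 tanh units, trained by full-batch gradient descent
W1 = rng.standard_normal((4, 16)) * 0.5
b1 = np.zeros(16)
W2 = rng.standard_normal((16, 1)) * 0.5
b2 = np.zeros(1)

losses = []
lr = 0.05
for epoch in range(500):
    h = np.tanh(X @ W1 + b1)          # forward pass
    pred = h @ W2 + b2
    err = pred - y
    losses.append(float((err ** 2).mean()))
    # backpropagation of the mean-squared-error gradient
    gW2 = h.T @ err / len(X)
    gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h ** 2)  # tanh derivative
    gW1 = X.T @ dh / len(X)
    gb1 = dh.mean(axis=0)
    W2 -= lr * gW2
    b2 -= lr * gb2
    W1 -= lr * gW1
    b1 -= lr * gb1
```

Once trained, evaluating the network is orders of magnitude cheaper than re-running the simulation, which is the point of the approximate-analysis approach.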
8

Tornstad, Magnus. "Evaluating the Practicality of Using a Kronecker-Factored Approximate Curvature Matrix in Newton's Method for Optimization in Neural Networks." Thesis, KTH, Skolan för teknikvetenskap (SCI), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-275741.

Abstract:
For a long time, second-order optimization methods have been regarded as computationally inefficient and intractable for solving the optimization problem associated with deep learning. However, recent research has proposed an adaptation of Newton's method for optimization in which the Hessian is approximated by a Kronecker-factored approximate curvature matrix, known as KFAC. This work aims to assess its practicality for use in deep learning. Benchmarks were performed using abstract, binary classification problems, as well as the real-world Boston Housing regression problem, and both deep and shallow network architectures were employed. KFAC was found to offer great savings in computational complexity compared to a naive approximate second-order implementation using the Gauss-Newton matrix. Comparing performance in deep and shallow networks, the loss convergence of both stochastic gradient descent (SGD) and KFAC showed a dependency upon network architecture: KFAC tended to converge quicker in deep networks, and SGD tended to converge quicker in shallow networks. The study concludes that KFAC can perform well in deep learning, showing competitive loss minimization versus basic SGD, but that it can be sensitive to initial weights. This sensitivity could be remedied by allowing the first steps to be taken by SGD, in order to set KFAC on a favorable trajectory.
9

Hanselmann, Thomas. "Approximate dynamic programming with adaptive critics and the algebraic perceptron as a fast neural network related to support vector machines." University of Western Australia. School of Electrical, Electronic and Computer Engineering, 2003. http://theses.library.uwa.edu.au/adt-WU2004.0005.

Abstract:
[Truncated abstract. Please see the pdf version for the complete text. Also, formulae and special characters can only be approximated here. Please see the pdf version of this abstract for an accurate reproduction.] This thesis treats two aspects of intelligent control: The first part is about long-term optimization by approximating dynamic programming and in the second part a specific class of a fast neural network, related to support vector machines (SVMs), is considered. The first part relates to approximate dynamic programming, especially in the framework of adaptive critic designs (ACDs). Dynamic programming can be used to find an optimal decision or control policy over a long-term period. However, in practice it is difficult, and often impossible, to calculate a dynamic programming solution, due to the 'curse of dimensionality'. The adaptive critic design framework addresses this issue and tries to find a good solution by approximating the dynamic programming process for a stationary environment. In an adaptive critic design there are three modules, the plant or environment to be controlled, a critic to estimate the long-term cost and an action or controller module to produce the decision or control strategy. Even though there have been many publications on the subject over the past two decades, there are some points that have had less attention. While most of the publications address the training of the critic, one of the points that has not received systematic attention is training of the action module.¹ Normally, training starts with an arbitrary, hopefully stable, decision policy and its long-term cost is then estimated by the critic. Often the critic is a neural network that has to be trained, using a temporal difference and Bellman's principle of optimality. Once the critic network has converged, a policy improvement step is carried out by gradient descent to adjust the parameters of the controller network. 
Then the critic is retrained again to give the new long-term cost estimate. However, it would be preferable to focus more on extremal policies earlier in the training. Therefore, the Calculus of Variations is investigated to discard the idea of using the Euler equations to train the actor. However, an adaptive critic formulation for a continuous plant with a short-term cost as an integral cost density is made and the chain rule is applied to calculate the total derivative of the short-term cost with respect to the actor weights. This is different from the discrete systems, usually used in adaptive critics, which are used in conjunction with total ordered derivatives. This idea is then extended to second order derivatives such that Newton's method can be applied to speed up convergence. Based on this, an almost concurrent actor and critic training was proposed. The equations are developed for any non-linear system and short-term cost density function and these were tested on a linear quadratic regulator (LQR) setup. With this approach the solution to the actor and critic weights can be achieved in only a few actor-critic training cycles. Some other, more minor issues, in the adaptive critic framework are investigated, such as the influence of the discounting factor in the Bellman equation on total ordered derivatives, the target interpretation in backpropagation through time as moving and fixed targets, the relation between simultaneous recurrent networks and dynamic programming is stated and a reinterpretation of the recurrent generalized multilayer perceptron (GMLP) as a recurrent generalized finite impulse MLP (GFIR-MLP) is made. Another subject in this area that is investigated, is that of a hybrid dynamical system, characterized as a continuous plant and a set of basic feedback controllers, which are used to control the plant by finding a switching sequence to select one basic controller at a time. 
The special but important case is considered when the plant is linear, but with some uncertainty in the state space and in the observation vector, and with a quadratic cost function. This is a form of robust control, where a dynamic programming solution has to be calculated. ¹Werbos comments that most treatments of action nets or policies either assume enumerative maximization, which is good only for small problems, except for the games of Backgammon or Go [1], or gradient-based training. The latter is prone to difficulties with local minima due to the non-convex nature of the cost-to-go function. With incremental methods, such as backpropagation through time, calculus of variations and model-predictive control, the dangers of non-convexity of the cost-to-go function with respect to the control are much smaller than with respect to the critic parameters, when the sampling times are small. Therefore, getting the critic right has priority. But with larger sampling times, when the control represents a more complex plan, non-convexity becomes more serious.
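For the LQR setting mentioned in the abstract, the critic's fixed point is the discrete-time Riccati equation, and "training" reduces to the classical value-iteration recursion. A minimal sketch on an invented double-integrator plant (not the thesis's actor-critic scheme, just the exact solution it converges to in the LQR case):

```python
import numpy as np

# Toy discrete-time LQR: x' = A x + B u, stage cost x^T Q x + u^T R u
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P = np.zeros((2, 2))  # critic: long-term cost is x^T P x
for _ in range(500):
    # greedy policy given the current critic
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    # Riccati / value-iteration update of the critic
    P = Q + A.T @ P @ (A - B @ K)
```

In an adaptive critic design the same two alternating steps appear, except that P is replaced by a neural network trained by temporal differences and K by an action network trained by gradient descent.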
10

Malfatti, Guilherme Meneguzzi. "Técnicas de agrupamento de dados para computação aproximativa." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2017. http://hdl.handle.net/10183/169096.

Abstract:
Two of the major drivers of increased performance in single-thread applications - increase in operation frequency and exploitation of instruction-level parallelism - have had little advances in the last years due to power constraints. In this context, considering the intrinsic imprecision-tolerance (i.e., outputs may present an acceptable level of noise without compromising the result) of many modern applications, such as image processing and machine learning, approximate computation becomes a promising approach. This technique is based on computing approximate instead of accurate results, which can increase performance and reduce energy consumption at the cost of quality. In the current state of the art, the most common way of exploiting the technique is through neural networks (more specifically, the Multilayer Perceptron model), due to the ability of these structures to learn arbitrary functions and to approximate them. Such networks are usually implemented in a dedicated neural accelerator. However, this implementation requires a large amount of chip area and usually does not offer enough improvements to justify this additional cost. The goal of this work is to propose a new mechanism to address approximate computation, based on approximate reuse of functions and code fragments. This technique automatically groups input and output data by similarity and stores this information in a sofware-controlled memory. Based on these data, the quantized values can be reused through a search to this table, in which the most appropriate output will be selected and, therefore, execution of the original code will be replaced. Applying this technique is effective, achieving an average 97.1% reduction in Energy-Delay-Product (EDP) when compared to neural accelerators.
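The reuse mechanism can be sketched as an approximate memoisation table: inputs are quantised into buckets, and the first computed output for a bucket is reused for later, similar inputs. The class below is a software illustration of the idea, with all names and the sine example invented; it is not the dissertation's implementation.

```python
import math

class ApproxMemo:
    """Approximate function reuse: quantise inputs into buckets and
    reuse the first computed output per bucket, trading accuracy for
    skipped computation."""
    def __init__(self, fn, grid=0.1):
        self.fn, self.grid, self.table = fn, grid, {}
        self.hits = self.misses = 0

    def __call__(self, *args):
        key = tuple(round(a / self.grid) for a in args)
        if key in self.table:
            self.hits += 1          # reuse: skip the real computation
        else:
            self.misses += 1
            self.table[key] = self.fn(*args)
        return self.table[key]

approx_sin = ApproxMemo(math.sin, grid=0.05)
outs = [approx_sin(x / 1000) for x in range(1000)]  # inputs sweep [0, 1)
```

A coarser grid raises the hit rate (fewer real evaluations, lower energy) at the cost of larger output error, which is exactly the quality/efficiency trade-off the abstract describes.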
11

Romano, Michele. "Near real-time detection and approximate location of pipe bursts and other events in water distribution systems." Thesis, University of Exeter, 2012. http://hdl.handle.net/10871/9862.

Abstract:
The research work presented in this thesis describes the development and testing of a new data analysis methodology for the automated near real-time detection and approximate location of pipe bursts and other events which induce similar abnormal pressure/flow variations (e.g., unauthorised consumption, equipment failures, etc.) in Water Distribution Systems (WDSs). This methodology makes synergistic use of several self-learning Artificial Intelligence (AI) and statistical/geostatistical techniques for the analysis of the stream of data (i.e., signals) collected and communicated on-line by the hydraulic sensors deployed in a WDS. These techniques include: (i) wavelets for the de-noising of the recorded pressure/flow signals, (ii) Artificial Neural Networks (ANNs) for the short-term forecasting of future pressure/flow signal values, (iii) Evolutionary Algorithms (EAs) for the selection of optimal ANN input structures and parameter sets, (iv) Statistical Process Control (SPC) techniques for the short- and long-term analysis of the burst/other event-induced pressure/flow variations, (v) Bayesian Inference Systems (BISs) for inferring the probability of a burst/other event occurrence and raising the detection alarms, and (vi) geostatistical techniques for determining the approximate location of a detected burst/other event. The results of applying the new methodology to the pressure/flow data from several District Metered Areas (DMAs) in the United Kingdom (UK) with real-life bursts/other events and simulated (i.e., engineered) burst events are also reported in this thesis. These results show that the developed methodology detected the aforementioned events quickly and reliably, and successfully determined their approximate location within a DMA.
The results additionally show the potential of the methodology presented here to yield substantial improvements to the state-of-the-art in near real-time WDS incident management by enabling water companies to save water, energy and money, achieve higher levels of operational efficiency, and improve their customer service. The new data analysis methodology developed and tested as part of the research work presented in this thesis has been patented (International Application Number: PCT/GB2010/000961).
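The SPC stage of such a pipeline can be sketched as a control chart on the residuals between forecast and observed flow. The snippet below is only a minimal illustration of that one step, with made-up data; the thesis pairs it with wavelet de-noising, EA-tuned ANN forecasting, Bayesian alarm inference and geostatistical localisation, all omitted here.

```python
import numpy as np

def spc_burst_alarm(observed, forecast, k=3.0):
    """Shewhart-style control chart on forecast residuals.

    Flags samples whose residual falls outside mean +/- k*sigma,
    with mean and sigma estimated from the residual history."""
    residuals = observed - forecast
    mu, sigma = residuals.mean(), residuals.std()
    return np.abs(residuals - mu) > k * sigma

# normal flow with a simulated (engineered) burst-induced drop at the end
rng = np.random.default_rng(0)
forecast = np.full(200, 10.0)                      # ANN forecast stand-in
observed = forecast + rng.normal(0, 0.1, 200)      # sensor noise
observed[190:] -= 2.0                              # burst lowers the flow
alarms = spc_burst_alarm(observed, forecast)
```

In the real methodology the forecast comes from the short-term ANN predictor rather than a constant, and the alarm decision is further filtered through the Bayesian inference system.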
APA, Harvard, Vancouver, ISO, and other styles
12

Gómez, Cerdà Vicenç. "Algorithms and complex phenomena in networks: Neural ensembles, statistical, interference and online communities." Doctoral thesis, Universitat Pompeu Fabra, 2008. http://hdl.handle.net/10803/7548.

Full text
Abstract:
This thesis is about algorithms and complex phenomena in networks.

In the first part we study a network model of stochastic spiking neurons. We propose a modelling technique based on a mesoscopic description level and show the presence of a phase transition around a critical coupling strength. We derive a local plasticity rule which drives the network towards the critical point.

We then deal with approximate inference in probabilistic networks. We develop an algorithm which corrects the belief propagation solution for loopy graphs based on a loop series expansion. By adding correction terms, one for each "generalized loop" in the network, the exact result is recovered. We introduce and analyze numerically a particular way of truncating the series.

Finally, we analyze the social interaction of an Internet community by characterizing the structure of the network of users, their discussion threads and the temporal patterns of reaction times to a new post.
APA, Harvard, Vancouver, ISO, and other styles
13

Glaros, Anastasios. "Data-driven Definition of Cell Types Based on Single-cell Gene Expression Data." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-297498.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Matula, Tomáš. "Využití aproximovaných aritmetických obvodů v neuronových sítí." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2019. http://www.nusl.cz/ntk/nusl-399179.

Full text
Abstract:
This thesis investigates the use of approximate circuits in neural networks with the aim of saving energy. Studies on this topic already exist, but most of them were either too application-specific or demonstrated only at a small scale. To explore the possibilities further, we made non-trivial modifications to the open-source TensorFlow framework, creating a platform that can simulate the use of approximate circuits in popular, robust neural networks such as Inception or MobileNet. The point of interest was the replacement of the most computationally demanding parts of convolutional neural networks, namely the multiplication operations in the convolutional layers. We experimentally evaluated and compared several variants and, even though we proceeded without retraining the networks, we obtained interesting results. For example, with the Inception v4 architecture we achieved almost 8% energy savings with no drop in accuracy at all. Such savings can certainly find use in mobile devices or in large neural networks with enormous computational demands.
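As a rough illustration of the kind of substitution involved, the sketch below replaces the exact integer multiplications of a convolution with a truncation-based approximate multiply. This is a generic textbook scheme, not the actual approximate multiplier circuits simulated in the thesis; all names and values are illustrative.

```python
import numpy as np

def approx_mul(a, b, drop_bits=3):
    # Truncate the low bits of each non-negative operand before
    # multiplying -- a classic energy-for-accuracy trade-off.
    mask = ~((1 << drop_bits) - 1)
    return (a & mask) * (b & mask)

def conv1d(x, w, mul):
    # Valid-mode 1-D correlation built on a pluggable multiplier,
    # so the exact multiply can be swapped for an approximate one.
    n = len(x) - len(w) + 1
    return [sum(mul(int(x[i + j]), int(w[j])) for j in range(len(w)))
            for i in range(n)]

x = np.array([12, 40, 100, 7, 55])
w = np.array([3, 9])
exact = conv1d(x, w, lambda a, b: a * b)
approx = conv1d(x, w, approx_mul)
```

Because truncation only shrinks non-negative operands, every approximate product is bounded above by the exact one, which makes the error behaviour easy to reason about.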
APA, Harvard, Vancouver, ISO, and other styles
15

Rodrigues, Dirceu Zeferino. "Redes neurais, identidade de modelos e resposta da cebola à adubação nitrogenada." Universidade Federal de Viçosa, 2013. http://locus.ufv.br/handle/123456789/4064.

Full text
Abstract:
The study of productivity curves compared with the amount of nitrogen absorbed by the onion crop is fundamentally important for the elaboration of a fertilization plan that is more efficient in both technical and economic terms. Many statistical techniques have been proposed, tested, and improved in order to help boost research in this direction. The justification for this research is the need to assess and improve new statistical techniques that help in obtaining accurate information, in order to assist decision making for improving productivity. Accordingly, this study aimed to use and evaluate two statistical methods, with different specific objectives, for the evaluation of nitrogen application in the production of onion cultivars. In the first evaluation, statistical techniques based on regression models were used to fit curves relating nitrogen levels to productivity, for a survey with four onion cultivars grown in different locations, and then the possibility of grouping these statistical models was evaluated using the model identity test. In this step, the aim was to estimate a single curve that could jointly represent the fertilization response pattern in all four evaluated sites. In the second study, the goal was to verify the efficiency of techniques based on neural networks, i.e., to determine whether this new concept based on artificial neural networks can safely be used in research on the response of onion cultivars to nitrogen fertilization. In general, this study describes the successful use of new statistical techniques, with emphasis on neural networks, that help improve onion productivity, and thereafter the implementation and dissemination of techniques based on computational intelligence for prediction and modeling studies.
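The model identity test mentioned above can be sketched as an F-test comparing separate per-site response curves (full model) against a single pooled curve (reduced model). The snippet below is a generic illustration with made-up data, not the thesis' actual dose-response models.

```python
import numpy as np

def identity_f(doses, yields, sites, degree=2):
    """F statistic of a model identity test: separate polynomial curves
    per site (full model) vs one pooled curve (reduced model)."""
    def rss(x, y):
        coef = np.polyfit(x, y, degree)
        return float(((y - np.polyval(coef, x)) ** 2).sum())
    site_ids = np.unique(sites)
    rss_full = sum(rss(doses[sites == s], yields[sites == s]) for s in site_ids)
    rss_red = rss(doses, yields)
    k = degree + 1                                # parameters per curve
    df1 = k * (len(site_ids) - 1)                 # extra params in full model
    df2 = len(doses) - k * len(site_ids)
    return ((rss_red - rss_full) / df1) / (rss_full / df2)

rng = np.random.default_rng(0)
d = np.tile(np.arange(11.0), 2)                   # two sites, doses 0..10
sites = np.repeat([0, 1], 11)
same = 1 + 2 * d - 0.1 * d ** 2 + rng.normal(0, 0.1, 22)
diff = same + np.where(sites == 1, 5.0, 0.0)      # site 1 shifted upward
f_same, f_diff = identity_f(d, same, sites), identity_f(d, diff, sites)
```

A large F (compared against the critical value of an F distribution with df1, df2 degrees of freedom) rejects identity, i.e., the sites cannot share one curve.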
APA, Harvard, Vancouver, ISO, and other styles
16

Uppala, Roshni. "Simulating Large Scale Memristor Based Crossbar for Neuromorphic Applications." University of Dayton / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1429296073.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Andrade, Gustavo Araújo de. "PROGRAMAÇÃO DINÂMICA HEURÍSTICA DUAL E REDES DE FUNÇÕES DE BASE RADIAL PARA SOLUÇÃO DA EQUAÇÃO DE HAMILTON-JACOBI-BELLMAN EM PROBLEMAS DE CONTROLE ÓTIMO." Universidade Federal do Maranhão, 2014. http://tedebc.ufma.br:8080/jspui/handle/tede/517.

Full text
Abstract:
The main objective of this work is to present the development of online learning algorithms for the solution of the algebraic Hamilton-Jacobi-Bellman equation. The concepts covered focus on a methodology for control systems, using techniques that aim to design online adaptive controllers which reject sensor noise, parametric variations and modelling errors. Concepts from neurodynamic programming and reinforcement learning are discussed in order to design algorithms in which the context of a given operating point causes the control system to adapt and thus meet the performance specifications of the design. Methods are designed for the online estimation of the adaptive critic, with effort concentrated on techniques for estimating the gradient of the environment's value function.
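As a toy illustration of the adaptive-critic idea (far simpler than the radial-basis-function critics developed in the dissertation, and with all numbers made up), the snippet below learns a quadratic value function for a stable scalar system by gradient steps on the temporal-difference (Bellman) residual.

```python
# Learn V(x) ~ w * x^2 for the stable scalar system x' = 0.8 x with
# stage cost r = x^2, by stochastic-gradient steps on the TD residual.
a, alpha, w = 0.8, 0.1, 0.0
for episode in range(200):
    x = 1.0
    while abs(x) > 1e-3:
        x_next = a * x
        r = x ** 2
        td = r + w * x_next ** 2 - w * x ** 2   # Bellman (TD) residual
        w += alpha * td * x ** 2                # critic update along dV/dw
        x = x_next
# analytic fixed point: V(x) = x^2 / (1 - a^2), i.e. w* = 1/0.36
```

The critic converges to the analytic value because the residual vanishes for all states only at w* = 1/(1 - a²); the dissertation's methods do the analogous estimation online, with function approximators rich enough for nontrivial dynamics.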
APA, Harvard, Vancouver, ISO, and other styles
18

Chiu, Jih-Sheng, and 邱日聖. "Improving Asymmetric Approximate Search through Neural Networks." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/twwteg.

Full text
Abstract:
Master's thesis
National Chiayi University
Department of Computer Science and Information Engineering
2017
Due to advances in information technology, we have to deal with ever-growing volumes of digital data. Traditional linear search becomes impractical because of the large amount of data, so many researchers have turned to developing approximate search methods. Before approximate search, the data must be clustered. During the search process, we compute the Euclidean distance between the query and each cluster centre, and then pick enough candidates according to those distances. However, the distance-based approach is not always the best way to pick candidates. In this study, we propose employing neural networks to model the relevance between the query and each cluster centre so that candidate quality can be further improved. Experimental results show that the proposed method achieves satisfactory accuracy compared with our past work.
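The candidate-selection step described above can be sketched as follows. The neural scorer itself is omitted (training it is the contribution of the thesis), so a negative-Euclidean-distance baseline stands in for it; all names are illustrative.

```python
import numpy as np

def pick_candidates(query, centers, score_fn, k=2):
    """Rank cluster centres by a relevance score and keep the top-k
    clusters, whose members then form the candidate set."""
    scores = np.array([score_fn(query, c) for c in centers])
    return np.argsort(scores)[::-1][:k]     # higher score = more relevant

# Baseline relevance score: negative Euclidean distance. The thesis
# replaces this function with a trained neural network scorer.
euclidean_score = lambda q, c: -np.linalg.norm(q - c)

centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
top = pick_candidates(np.array([0.5, 0.2]), centers, euclidean_score)
```

Because `score_fn` is pluggable, swapping the distance baseline for a learned relevance model changes only one argument, which is exactly the substitution the thesis evaluates.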
APA, Harvard, Vancouver, ISO, and other styles
19

"Approximate Neural Networks for Speech Applications in Resource-Constrained Environments." Master's thesis, 2016. http://hdl.handle.net/2286/R.I.39402.

Full text
Abstract:
Speech recognition and keyword detection are becoming increasingly popular applications for mobile systems. While deep neural network (DNN) implementations of these systems have very good performance, they have large memory and compute resource requirements, making their implementation on a mobile device quite challenging. In this thesis, techniques to reduce the memory and computation cost of keyword detection and speech recognition networks (or DNNs) are presented. The first technique is based on representing all weights and biases by a small number of bits and mapping all nodal computations into fixed-point ones with minimal degradation in accuracy. Experiments conducted on the Resource Management (RM) database show that for the keyword detection neural network, representing the weights by 5 bits results in a 6-fold reduction in memory compared to a floating-point implementation, with very little loss in performance. Similarly, for the speech recognition neural network, representing the weights by 6 bits results in a 5-fold reduction in memory while maintaining an error rate similar to a floating-point implementation. Additional reduction in memory is achieved by a technique called weight pruning, where the weights are classified as sensitive and insensitive and the sensitive weights are represented with higher precision. A combination of these two techniques helps reduce the memory footprint by 81-84% for speech recognition and keyword detection networks respectively. Further reduction in memory size is achieved by judiciously dropping connections for large blocks of weights. The corresponding technique, termed coarse-grain sparsification, introduces hardware-aware sparsity during DNN training, which leads to efficient weight memory compression and a significant reduction in the number of computations during classification without loss of accuracy.
Keyword detection and speech recognition DNNs trained with 75% of the weights dropped and classified with 5-6 bit weight precision effectively reduced the weight memory requirement by ~95% compared to a fully-connected network with double precision, while showing similar performance in keyword detection accuracy and word error rate.
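The low-bit weight representation can be sketched as uniform symmetric fixed-point quantization. This is a generic illustration of the idea with synthetic weights, not the exact scheme or sensitivity-based precision assignment from the thesis.

```python
import numpy as np

def quantize_weights(w, bits=5):
    """Uniform symmetric quantization of a weight tensor to `bits` bits
    (one bit of the budget effectively covers the sign)."""
    levels = 2 ** (bits - 1) - 1          # e.g. 15 positive levels for 5 bits
    scale = np.max(np.abs(w)) / levels    # step size between adjacent codes
    return np.round(w / scale) * scale, scale

rng = np.random.default_rng(1)
w = rng.normal(0, 0.5, 1000)              # stand-in for a trained weight matrix
wq, scale = quantize_weights(w, bits=5)
```

The rounding error per weight is bounded by half the step size, which is why accuracy degrades gracefully as the bit width shrinks until the step size becomes comparable to the weights that matter.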
Dissertation/Thesis
Masters Thesis Computer Science 2016
APA, Harvard, Vancouver, ISO, and other styles
20

(9178400), Sanchari Sen. "Efficient and Robust Deep Learning through Approximate Computing." Thesis, 2020.

Find full text
Abstract:

Deep Neural Networks (DNNs) have greatly advanced the state-of-the-art in a wide range of machine learning tasks involving image, video, speech and text analytics, and are deployed in numerous widely-used products and services. Improvements in the capabilities of hardware platforms such as Graphics Processing Units (GPUs) and specialized accelerators have been instrumental in enabling these advances as they have allowed more complex and accurate networks to be trained and deployed. However, the enormous computational and memory demands of DNNs continue to increase with growing data size and network complexity, posing a continuing challenge to computing system designers. For instance, state-of-the-art image recognition DNNs require hundreds of millions of parameters and hundreds of billions of multiply-accumulate operations, while state-of-the-art language models require hundreds of billions of parameters and several trillion operations to process a single input instance. Another major obstacle in the adoption of DNNs, despite their impressive accuracies on a range of datasets, has been their lack of robustness. Specifically, recent efforts have demonstrated that small, carefully-introduced input perturbations can force a DNN to behave in unexpected and erroneous ways, which can have severe consequences in safety-critical DNN applications like healthcare and autonomous vehicles. In this dissertation, we explore approximate computing as an avenue to improve the speed and energy efficiency of DNNs, as well as their robustness to input perturbations.

Approximate computing involves executing selected computations of an application in an approximate manner, while generating favorable trade-offs between computational efficiency and output quality. The intrinsic error resilience of machine learning applications makes them excellent candidates for approximate computing, allowing us to achieve execution time and energy reductions with minimal effect on the quality of outputs. This dissertation performs a comprehensive analysis of different approximate computing techniques for improving the execution efficiency of DNNs. Complementary to generic approximation techniques like quantization, it identifies approximation opportunities based on the specific characteristics of three popular classes of networks - Feed-forward Neural Networks (FFNNs), Recurrent Neural Networks (RNNs) and Spiking Neural Networks (SNNs), which vary considerably in their network structure and computational patterns.

First, in the context of feed-forward neural networks, we identify sparsity, or the presence of zero values in the data structures (activations, weights, gradients and errors), to be a major source of redundancy and therefore, an easy target for approximations. We develop lightweight micro-architectural and instruction set extensions to a general-purpose processor core that enable it to dynamically detect zero values when they are loaded and skip future instructions that are rendered redundant by them. Next, we explore LSTMs (the most widely used class of RNNs), which map sequences from an input space to an output space. We propose hardware-agnostic approximations that dynamically skip redundant symbols in the input sequence and discard redundant elements in the state vector to achieve execution time benefits. Following that, we consider SNNs, which are an emerging class of neural networks that represent and process information in the form of sequences of binary spikes. Observing that spike-triggered updates along synaptic connections are the dominant operation in SNNs, we propose hardware and software techniques to identify connections that minimally impact the output quality and deactivate them dynamically, skipping any associated updates.
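A software analogue of the zero-skipping idea for feed-forward networks is sketched below. The dissertation implements this with micro-architectural and instruction-set extensions that skip the dependent instructions in hardware; this Python stand-in only illustrates the arithmetic that gets skipped.

```python
import numpy as np

def sparse_dot(activations, weights):
    """Dot product that skips all work for zero activations -- a software
    analogue of detecting zeros at load time and skipping dependent MACs."""
    acc, skipped = 0.0, 0
    for a, w in zip(activations, weights):
        if a == 0.0:
            skipped += 1          # hardware would skip these instructions
            continue
        acc += a * w
    return acc, skipped

a = np.array([0.0, 1.5, 0.0, 0.0, 2.0])   # post-ReLU activations are often sparse
w = np.array([0.3, 0.2, 0.9, 0.4, 0.1])
acc, skipped = sparse_dot(a, w)
```

The fraction of skipped multiply-accumulates tracks the activation sparsity, which is the quantity the proposed extensions exploit for speedup.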

The dissertation also delves into the efficacy of combining multiple approximate computing techniques to improve the execution efficiency of DNNs. In particular, we focus on the combination of quantization, which reduces the precision of DNN data-structures, and pruning, which introduces sparsity in them. We observe that the ability of pruning to reduce the memory demands of quantized DNNs decreases with precision, as the overhead of storing non-zero locations alongside the values starts to dominate in different sparse encoding schemes. We analyze this overhead and the overall compression of three different sparse formats across a range of sparsity and precision values, and propose a hybrid compression scheme that identifies the optimal sparse format for a pruned low-precision DNN.

Along with improved execution efficiency of DNNs, the dissertation explores an additional advantage of approximate computing in the form of improved robustness. We propose ensembles of quantized DNN models with different numerical precisions as a new approach to increase robustness against adversarial attacks. It is based on the observation that quantized neural networks often demonstrate much higher robustness to adversarial attacks than full precision networks, but at the cost of a substantial loss in accuracy on the original (unperturbed) inputs. We overcome this limitation to achieve the best of both worlds, i.e., the higher unperturbed accuracies of the full precision models combined with the higher robustness of the low precision models, by composing them in an ensemble.


In summary, this dissertation establishes approximate computing as a promising direction to improve the performance, energy efficiency and robustness of neural networks.

APA, Harvard, Vancouver, ISO, and other styles
21

Abdella, Mussa Ismael. "The use of genetic algorithms and neural networks to approximate missing data in database." Thesis, 2006. http://hdl.handle.net/10539/105.

Full text
Abstract:
Missing data creates various problems in the analysis and processing of data in databases. For this reason, missing data has been an area of research in various disciplines for quite a long time. This report introduces a new method aimed at approximating missing data in a database using a combination of genetic algorithms and neural networks. The proposed method uses a genetic algorithm to minimise an error function derived from an auto-associative neural network. The error function is expressed as the square of the difference between the actual observations and the values predicted by the auto-associative neural network. In the event of missing data, not all values of the actual observations are known; hence, the error function is decomposed to depend on the known and unknown (missing) values. Multi-Layer Perceptron (MLP) and Radial Basis Function (RBF) neural networks are employed to train the networks. The research also focuses on investigating whether the proposed method approximates missing data with high accuracy as the number of missing cases within a single record increases, and on the impact of different neural network architectures on training and on the approximations found for the missing values. The approximations of missing data obtained using the proposed model are observed to be highly accurate, with a 95% correlation coefficient between the actual missing values and the corresponding approximated values. Results obtained using RBF are better than those using MLP, and results obtained using the combination of both MLP and RBF are better than those obtained using either alone. It is also observed that there is no significant reduction in the accuracy of results as the number of missing cases in a single record increases.
Approximations found for missing data are also found to depend on the particular neural network architecture employed in training the data set.
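The core of the method can be sketched as follows. A fixed consistency map stands in for the trained MLP/RBF auto-associator, and a tiny truncation-selection genetic algorithm searches over the missing entry so as to minimise the reconstruction error; all data, names and parameters are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a trained auto-associative network on data obeying x2 = 2*x1.
# (The thesis trains MLP/RBF networks; this fixed map is illustrative only.)
def autoassoc(x):
    return np.array([x[1] / 2.0, 2.0 * x[0]])

def error(missing, known=3.0):
    # squared reconstruction error, decomposed over known and missing parts
    x = np.array([known, missing])
    return float(np.sum((x - autoassoc(x)) ** 2))

# tiny GA over the single missing value
pop = rng.uniform(-10, 10, 40)
for gen in range(60):
    fitness = np.array([error(v) for v in pop])
    parents = pop[np.argsort(fitness)[:10]]                     # truncation selection
    children = np.repeat(parents, 4) + rng.normal(0, 0.2, 40)   # Gaussian mutation
    children[0] = parents[0]                                    # elitism
    pop = children
best = pop[np.argmin([error(v) for v in pop])]
```

With the known value fixed at 3 and the data model x2 = 2*x1, the reconstruction error is minimised at a missing value of 6, which the GA recovers.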
APA, Harvard, Vancouver, ISO, and other styles
22

Pereira, Silvério Matos. "Anomaly detection in mobile networks." Master's thesis, 2021. http://hdl.handle.net/10773/31374.

Full text
Abstract:
Big data has become an increasingly important topic in recent years; with new sources of data comes the need to be aware of the trade-offs involved, necessitating great care in both the choice and the implementation of algorithms, as well as in how existing algorithms are adapted to handle this new setting. At the same time, the interpretability and understanding of a small to medium number of features is still key in many areas where understanding the data is paramount. In this thesis we show how to tackle both these issues with the aid of self-organizing algorithms. Two objectives were achieved. Firstly, we created an anomaly detection system with emphasis on feature interpretability and show its results on real-world mobile network data provided by Nokia. Secondly, we propose and implement modifications to the growing neural gas algorithm, an algorithm that has seen use in fields such as anomaly detection, 3D reconstruction and data compression. This modification uses approximate nearest neighbour techniques with the purpose of creating an algorithm that can efficiently trade accuracy for execution time, making growing neural gas usable in high-dimensional settings and with a larger model size.
Master's in Computer and Telematics Engineering
APA, Harvard, Vancouver, ISO, and other styles
23

(6634835), Syed Sarwar. "Exploration of Energy Efficient Hardware and Algorithms for Deep Learning." Thesis, 2019.

Find full text
Abstract:
Deep Neural Networks (DNNs) have emerged as the state-of-the-art technique in a wide range of machine learning tasks for analytics and computer vision in the next generation of embedded (mobile, IoT, wearable) devices. Despite their success, they suffer from high energy requirements both in inference and training. In recent years, the inherent error resiliency of DNNs has been exploited by introducing approximations at either the algorithmic or the hardware levels (individually) to obtain energy savings while incurring tolerable accuracy degradation. We perform a comprehensive analysis to determine the effectiveness of cross-layer approximations for the energy-efficient realization of large-scale DNNs. Our experiments on recognition benchmarks show that cross-layer approximation provides substantial improvements in energy efficiency for different accuracy/quality requirements. Furthermore, we propose a synergistic framework for combining the approximation techniques.
To reduce the training complexity of Deep Convolutional Neural Networks (DCNNs), we replace certain weight kernels of the convolutional layers with Gabor filters. The convolutional layers use the Gabor filters as fixed weight kernels, which extract intrinsic features, alongside the regular trainable weight kernels. This combination creates a balanced system that gives better training performance in terms of energy and time, compared to the standalone deep CNN (without any Gabor kernels), in exchange for tolerable accuracy degradation. We also explore an efficient training methodology that incrementally grows a DCNN to allow new classes to be learned while sharing part of the base network. Our approach is an end-to-end learning framework in which we focus on reducing the incremental training complexity while achieving accuracy close to the upper bound without using any of the old training samples. We have also explored spiking neural networks for energy efficiency. Training deep spiking neural networks from direct spike inputs is difficult, since their temporal dynamics are not well suited to the standard supervision-based training algorithms used for DNNs. We propose a spike-based backpropagation training methodology for state-of-the-art deep Spiking Neural Network (SNN) architectures. This methodology enables real-time training in deep SNNs while achieving comparable inference accuracies on standard image recognition tasks.
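The fixed Gabor kernels mentioned above can be generated as follows; the parameter values and bank size are illustrative, not those used in the thesis.

```python
import numpy as np

def gabor_kernel(size=5, theta=0.0, sigma=2.0, lam=4.0):
    """Real part of a Gabor filter: a Gaussian envelope modulating a
    cosine wave at orientation `theta` -- usable as a fixed
    (non-trainable) convolutional kernel."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)      # rotated coordinates
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * xr / lam)

# a small bank of orientations, standing in for the fixed half of a
# convolutional layer's kernels
bank = np.stack([gabor_kernel(theta=t)
                 for t in np.linspace(0, np.pi, 4, endpoint=False)])
```

Since these kernels are never updated by backpropagation, their gradients need not be computed or stored, which is the source of the training energy and time savings.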
APA, Harvard, Vancouver, ISO, and other styles
24

Chapados, Nicolas. "Sequential Machine learning Approaches for Portfolio Management." Thèse, 2009. http://hdl.handle.net/1866/3578.

Full text
Abstract:
This thesis considers a number of approaches to make machine learning algorithms better suited to the sequential nature of financial portfolio management tasks. We start by considering the problem of the general composition of learning algorithms that must handle temporal learning tasks, in particular that of creating and efficiently updating the training sets in a sequential simulation framework. We enumerate the desiderata that composition primitives should satisfy, and underscore the difficulty of rigorously and efficiently reaching them. We follow by introducing a set of algorithms that accomplish the desired objectives, presenting a case-study of a real-world complex learning system for financial decision-making that uses those techniques. We then describe a general method to transform a non-Markovian sequential decision problem into a supervised learning problem using a K-best paths search algorithm. We consider an application in financial portfolio management where we train a learning algorithm to directly optimize a Sharpe Ratio (or other risk-averse non-additive) utility function. We illustrate the approach by demonstrating extensive experimental results using a neural network architecture specialized for portfolio management and compare against well-known alternatives. Finally, we introduce a functional representation of time series which allows forecasts to be performed over an unspecified horizon with progressively-revealed information sets. By virtue of using Gaussian processes, a complete covariance matrix between forecasts at several time-steps is available. This information is put to use in an application to actively trade price spreads between commodity futures contracts. The approach delivers impressive out-of-sample risk-adjusted returns after transaction costs on a portfolio of 30 spreads.
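The idea of directly optimising a non-additive criterion like the Sharpe ratio can be sketched with finite-difference gradient ascent over fixed portfolio weights. The thesis instead trains a neural network policy on sequential data; the snippet below uses synthetic returns and made-up parameters purely to show the objective being optimised directly.

```python
import numpy as np

def sharpe(weights, returns):
    """Sharpe ratio of a fixed-weight portfolio (annualization omitted)."""
    port = returns @ weights
    return port.mean() / port.std()

# direct ascent on the Sharpe criterion via finite-difference gradients
rng = np.random.default_rng(0)
returns = rng.normal([0.001, 0.0002], [0.01, 0.01], size=(500, 2))
w = np.array([0.5, 0.5])
for _ in range(200):
    grad = np.array([(sharpe(w + 1e-5 * e, returns) - sharpe(w, returns)) / 1e-5
                     for e in np.eye(2)])
    w += 0.1 * grad
    w = np.clip(w, 0, None)       # long-only constraint
    w /= w.sum()                  # fully invested
final = sharpe(w, returns)
```

Because the Sharpe ratio is a ratio of moments over the whole return sequence, it cannot be decomposed into per-step rewards, which is exactly why the thesis treats it as a direct training criterion rather than an additive utility.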
APA, Harvard, Vancouver, ISO, and other styles
We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!

To the bibliography