Theses on the topic "Artificial Neural Networks and Recurrent Neural Networks"

Cite a source in APA, MLA, Chicago, Harvard, and many other styles

Choose the source type:

See the top 50 dissertations (master's and doctoral theses) for research on the topic "Artificial Neural Networks and Recurrent Neural Networks".

Next to every source in the list of references there is an "Add to bibliography" button. Click it, and we will automatically generate the bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication as a .pdf and read its abstract online, when one is available in the metadata.

Browse dissertations from many scientific disciplines and compile a correct bibliography.

1

Kolen, John F. "Exploring the computational capabilities of recurrent neural networks". The Ohio State University, 1994. http://rave.ohiolink.edu/etdc/view?acc_num=osu1487853913100192.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Shao, Yuanlong. "Learning Sparse Recurrent Neural Networks in Language Modeling". The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1398942373.

Full text
3

Gudjonsson, Ludvik. "Comparison of two methods for evolving recurrent artificial neural networks for". Thesis, University of Skövde, 1998. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-155.

Full text
Abstract:

In this dissertation a comparison of two evolutionary methods for evolving ANNs for robot control is made. The methods compared are SANE with enforced sub-population and delta-coding, and marker-based encoding. In an attempt to speed up evolution, marker-based encoding is extended with delta-coding. The task selected for comparison is the hunter-prey task. This task requires the robot controller to possess some form of memory, as the prey can move out of sensor range. Incremental evolution is used to evolve the complex behaviour required to handle this task successfully. The comparison is based on the computational power needed for evolution, and on the complexity, robustness, and generalisation of the resulting ANNs. The results show that marker-based encoding is the most efficient method tested and does not need delta-coding to speed up the evolution process. Additionally, the results indicate that delta-coding does not increase the speed of evolution with marker-based encoding.
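Delta-coding, shared by both methods compared above, restarts the evolutionary search around the current best solution using a shrinking encoding range. A minimal, generic sketch under assumed details (real-valued genome and a toy `sphere` fitness; the thesis's actual task is robot control, and `delta_coding_restart` is a hypothetical name):

```python
import random

def sphere(genome):
    """Toy fitness: lower is better (hypothetical stand-in for the robot task)."""
    return sum(x * x for x in genome)

def delta_coding_restart(best, delta_range, pop_size, generations):
    """After a converged run, re-search a shrinking region around `best`:
    each individual encodes a small delta applied to the previous champion."""
    champion, champion_fit = best, sphere(best)
    for _ in range(generations):
        pop = [[random.uniform(-delta_range, delta_range) for _ in best]
               for _ in range(pop_size)]
        for delta in pop:
            candidate = [b + d for b, d in zip(champion, delta)]
            fit = sphere(candidate)
            if fit < champion_fit:
                champion, champion_fit = candidate, fit
        delta_range *= 0.5  # shrink the search interval after each restart
    return champion, champion_fit

random.seed(0)
best, fit = delta_coding_restart([0.8, -0.6], delta_range=1.0,
                                 pop_size=20, generations=10)
```

Because the champion is only replaced by strictly better candidates, the returned fitness can never be worse than the starting point.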

4

Parfitt, Shan Helen. "Explorations in anaphora resolution in artificial neural networks : implications for nativism". Thesis, Imperial College London, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.267247.

Full text
5

Napoli, Christian. "A-I: Artificial intelligence". Doctoral thesis, Università degli studi di Catania, 2016. http://hdl.handle.net/20.500.11769/490996.

Full text
Abstract:
In this thesis we proposed new neural architectures and information-theory approaches. By means of wavelet analysis, neural networks, and the results of our own creations, namely the wavelet recurrent neural networks and the radial basis probabilistic neural networks, we tried to better understand, model and cope with human behavior itself. The first idea was to model the workers of a crowdsourcing project as nodes of a cloud-computing system; we also hope to have exceeded the limits of such a definition, opening a door to new possibilities for modeling the behavior of socially interconnected groups of people cooperating on a common task. We showed how wavelet recurrent neural networks can model something as complex as the availability of resources on an online service or a computational cloud, and then that the availability of crowd workers, as well as the execution time of tasks performed by them, can be modeled similarly. In doing so we created a tool to tamper with the timeline, allowing us to obtain predictions regarding the status of the crowd in terms of available workers and executed workflows. Moreover, with our inanimate reasoner based on the developed radial basis probabilistic neural networks, first applied to social networks and then to living companies, we also learned how to model and manage cooperative networks in terms of workgroup creation and optimization. We did this by automatically interpreting worker profiles, then automatically extracting and interpreting the relevant information among hundreds of features for each worker, in order to create workgroups based on their skills, professional attitudes, experience, etc. Finally, also thanks to the suggestions of Prof. Michael Bernstein of Stanford University, we proposed to connect the developed automata.
We used artificial intelligence to model the availability of human resources, then a second level of artificial intelligence to model human workgroups and skills, and finally a third level to model the workflows executed by those human resources once organized into groups and levels according to their experience. In our best intentions, such a three-level artificial intelligence could address the limits that have so far kept crowds from growing into companies with a well-recognizable pyramidal structure that rewards the experience, skill and professionalism of their workers. We cannot frankly say whether our work will really contribute to the so-called "crowdsourcing revolution", but we hope at least to have shed some light on the agreeable possibilities yet to come.
6

Kramer, Gregory Robert. "An analysis of neutral drift's effect on the evolution of a CTRNN locomotion controller with noisy fitness evaluation". Wright State University / OhioLINK, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=wright1182196651.

Full text
7

Rallabandi, Pavan Kumar. "Processing hidden Markov models using recurrent neural networks for biological applications". Thesis, University of the Western Cape, 2013. http://hdl.handle.net/11394/4525.

Full text
Abstract:
Philosophiae Doctor - PhD
In this thesis, we present a novel hybrid architecture combining two of the most popular sequence recognition models: Recurrent Neural Networks (RNNs) and Hidden Markov Models (HMMs). Though sequence recognition problems can potentially be modelled by well-trained HMMs, HMMs alone do not provide a reasonable solution to the most complicated recognition problems. In contrast, the ability of RNNs to handle complex sequence recognition problems is known to be exceptionally good. Methods for mapping HMMs into RNNs have been developed by other researchers in the past; however, to the best of our knowledge, no algorithm for processing HMMs through learning has been given. Taking advantage of the structural similarities between the architectural dynamics of RNNs and HMMs, in this work we analyze the combination of these two systems into a hybrid architecture. The main objective of this study is to improve sequence recognition/classification performance by applying a hybrid neural/symbolic approach. In particular, trained HMMs are used as the initial symbolic domain theory and directly encoded into an appropriate RNN architecture, so that the prior knowledge is refined through the training of the RNN. The proposed algorithm is then implemented on sample test beds and on real-time biological applications.
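The structural similarity this abstract builds on is visible in the HMM forward algorithm, whose recursion is itself a simple recurrent update: the state-probability vector is propagated through the transition matrix and re-weighted by emission probabilities at each step. A minimal numpy sketch with a toy two-state HMM (illustrative only, not the thesis's architecture):

```python
import numpy as np

A = np.array([[0.7, 0.3],    # state-transition matrix P(s_t | s_{t-1})
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],    # emission matrix P(o_t | s_t), two symbols
              [0.2, 0.8]])
pi = np.array([0.5, 0.5])    # initial state distribution

def forward_likelihood(obs):
    """P(obs) via the forward recursion alpha_t = B[:, o_t] * (A.T @ alpha_{t-1}).
    This is exactly a linear 'RNN' whose recurrent weight matrix is A.T."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = B[:, o] * (A.T @ alpha)
    return alpha.sum()

lik = forward_likelihood([0, 1, 0])
```

Encoding a trained HMM this way gives the RNN a principled initialization that further gradient training can then refine.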
8

Salihoglu, Utku. "Toward a brain-like memory with recurrent neural networks". Doctoral thesis, Universite Libre de Bruxelles, 2009. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210221.

Full text
Abstract:
For the last twenty years, several assumptions have been expressed in the fields of information processing, neurophysiology and cognitive sciences. First, that neural networks and their dynamical behaviors, described in terms of attractors, are the natural way adopted by the brain to encode information: any information item to be stored in a neural network should be coded in one of the dynamical attractors of the brain, and retrieved by stimulating the network so as to trap its dynamics in the desired item's basin of attraction. The second view shared by neural network researchers is to base the learning of the synaptic matrix on a local Hebbian mechanism. The third assumption is the presence of chaos and the benefit gained from it: chaos, although very simply produced, inherently possesses an infinite number of cyclic regimes that can be exploited for coding information. Moreover, the network spontaneously and randomly wanders around these unstable regimes, rapidly proposing alternative responses to external stimuli and easily switching from one of these potential attractors to another in response to any incoming stimulus. Finally, since their introduction sixty years ago, cell assemblies have proved to be a powerful paradigm for brain information processing. After their introduction in artificial intelligence, cell assemblies became commonly used in computational neuroscience as a neural substrate for content-addressable memories.

Based on these assumptions, this thesis provides a computer model of neural network simulation of a brain-like memory. It first shows experimentally that the more information is to be stored in robust cyclic attractors, the more chaos appears as a regime in the background, erratically itinerating among brief appearances of these attractors. Chaos does not appear to be the cause, but the consequence of the learning. However, it appears as an helpful consequence that widens the network’s encoding capacity. To learn the information to be stored, two supervised iterative Hebbian learning algorithm are proposed. One leaves the semantics of the attractors to be associated with the feeding data unprescribed, while the other defines it a priori. Both algorithms show good results, even though the first one is more robust and has a greater storing capacity. Using these promising results, a biologically plausible alternative to these algorithms is proposed using cell assemblies as substrate for information. Even though this is not new, the mechanisms underlying their formation are poorly understood and, so far, there are no biologically plausible algorithms that can explain how external stimuli can be online stored in cell assemblies. This thesis provide such a solution combining a fast Hebbian/anti-Hebbian learning of the network's recurrent connections for the creation of new cell assemblies, and a slower feedback signal which stabilizes the cell assemblies by learning the feed forward input connections. This last mechanism is inspired by the retroaxonal hypothesis.
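The first two assumptions, information stored as attractors learned by a local Hebbian rule, can be illustrated with a classic Hopfield-style sketch (a deliberately simple stand-in; the thesis studies chaotic recurrent networks, not this model):

```python
import numpy as np

patterns = np.array([[ 1, -1,  1, -1,  1, -1],   # two items to store
                     [ 1,  1, -1, -1,  1,  1]])

# Local Hebbian rule: each weight depends only on the activities of its
# two endpoint neurons.
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0.0)

def recall(state, steps=10):
    """Iterate the network dynamics; a stored pattern acts as an attractor."""
    s = state.copy()
    for _ in range(steps):
        s = np.where(W @ s >= 0, 1, -1)
    return s

noisy = patterns[0].copy()
noisy[0] *= -1                 # corrupt one bit
restored = recall(noisy)       # falls back into the stored pattern's basin
```

Stimulating the network with a corrupted cue traps the dynamics in the basin of attraction of the stored item, which is the retrieval mechanism the abstract describes.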


Doctorat en Sciences

9

Yang, Jidong. "Road crack condition performance modeling using recurrent Markov chains and artificial neural networks". [Tampa, Fla.] : University of South Florida, 2004. http://purl.fcla.edu/fcla/etd/SFE0000567.

Full text
10

Willmott, Devin. "Recurrent Neural Networks and Their Applications to RNA Secondary Structure Inference". UKnowledge, 2018. https://uknowledge.uky.edu/math_etds/58.

Full text
Abstract:
Recurrent neural networks (RNNs) are state-of-the-art sequential machine learning tools, but have difficulty learning sequences with long-range dependencies due to the exponential growth or decay of gradients backpropagated through the RNN. Some methods overcome this problem by modifying the standard RNN architecture to force the recurrent weight matrix W to remain orthogonal throughout training. The first half of this thesis presents a novel orthogonal RNN architecture that enforces orthogonality of W by parametrizing it with a skew-symmetric matrix via the Cayley transform. We present rules for backpropagation through the Cayley transform, show how to deal with the Cayley transform's singularity, and compare the architecture's performance on benchmark tasks to other orthogonal RNN architectures. The second half explores two deep learning approaches to problems in RNA secondary structure inference and compares them to a standard structure inference tool, the nearest neighbor thermodynamic model (NNTM). The first approach uses RNNs to detect paired or unpaired nucleotides in the RNA structure, which are then converted into synthetic auxiliary data that direct NNTM structure predictions. The second uses recurrent and convolutional networks to directly infer RNA base pairs. In many cases, these approaches improve over NNTM structure predictions by 20-30 percentage points.
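The Cayley transform mentioned here maps any real skew-symmetric matrix to an orthogonal one, which is what keeps the recurrent weights orthogonal. A small numpy sanity check (generic illustration; the thesis's exact parametrization and scaling details may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
A = M - M.T                      # skew-symmetric: A.T == -A

# Cayley transform: W = (I + A)^(-1) (I - A) is orthogonal whenever
# I + A is invertible, which always holds for real skew-symmetric A
# (its eigenvalues are purely imaginary).
I = np.eye(n)
W = np.linalg.solve(I + A, I - A)

orthogonality_error = np.abs(W.T @ W - I).max()
```

Training then updates the unconstrained skew-symmetric parameters, and W stays exactly orthogonal by construction instead of drifting off the manifold.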
11

Napoli, Christian. "A-I: Artificial intelligence". Doctoral thesis, Università di Catania, 2016. http://hdl.handle.net/10761/3974.

Full text
Abstract:
In this thesis we proposed new neural architectures and information-theory approaches. By means of wavelet analysis, neural networks, and the results of our own creations, namely the wavelet recurrent neural networks and the radial basis probabilistic neural networks, we tried to better understand, model and cope with human behavior itself. The first idea was to model the workers of a crowdsourcing project as nodes of a cloud-computing system; we also hope to have exceeded the limits of such a definition, opening a door to new possibilities for modeling the behavior of socially interconnected groups of people cooperating on a common task. We showed how wavelet recurrent neural networks can model something as complex as the availability of resources on an online service or a computational cloud, and then that the availability of crowd workers, as well as the execution time of tasks performed by them, can be modeled similarly. In doing so we created a tool to tamper with the timeline, allowing us to obtain predictions regarding the status of the crowd in terms of available workers and executed workflows. Moreover, with our inanimate reasoner based on the developed radial basis probabilistic neural networks, first applied to social networks and then to living companies, we also learned how to model and manage cooperative networks in terms of workgroup creation and optimization. We did this by automatically interpreting worker profiles, then automatically extracting and interpreting the relevant information among hundreds of features for each worker, in order to create workgroups based on their skills, professional attitudes, experience, etc. Finally, also thanks to the suggestions of Prof. Michael Bernstein of Stanford University, we proposed to connect the developed automata.
We used artificial intelligence to model the availability of human resources, then a second level of artificial intelligence to model human workgroups and skills, and finally a third level to model the workflows executed by those human resources once organized into groups and levels according to their experience. In our best intentions, such a three-level artificial intelligence could address the limits that have so far kept crowds from growing into companies with a well-recognizable pyramidal structure that rewards the experience, skill and professionalism of their workers. We cannot frankly say whether our work will really contribute to the so-called "crowdsourcing revolution", but we hope at least to have shed some light on the agreeable possibilities yet to come.
12

Vikström, Filip. "A recurrent neural network approach to quantification of risks surrounding the Swedish property market". Thesis, Umeå universitet, Institutionen för matematik och matematisk statistik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-126192.

Full text
Abstract:
As the real estate market plays a central role in a country's financial situation, Skandia, as a life insurer, bank and property developer, wants a method for better assessing the risks connected to the real estate market. The goal of this paper is to increase the understanding of property market risk and its covariate risks, and to analyze how a fall in real estate prices could affect Skandia's exposed assets. This paper explores a recurrent neural network model with the aim of quantifying identified risk factors using exogenous data. The recurrent neural network model is compared to a vector autoregressive model with exogenous inputs that represent economic conditions. The results of this paper are inconclusive as to which method produces the most accurate model under the specified settings. The recurrent neural network approach produces what seem to be better results in out-of-sample validation, but both models fail to capture the hypothesized relationship between the exogenous and modeled variables. Although the results do not fit previous assumptions, further research into artificial neural networks, with additional variables and longer sample series for calibration, is suggested, as the model preconditions are promising.
13

Condarcure, Thomas A. 1952. "A learning automaton approach to trajectory learning and control system design using dynamic recurrent neural networks". Thesis, The University of Arizona, 1993. http://hdl.handle.net/10150/291987.

Full text
Abstract:
This thesis presents a method for the training of dynamic, recurrent neural networks to generate continuous-time trajectories. In the past, most methods for this type of training were based on gradient descent methods and were deterministic. The method presented here is stochastic in nature. The problem of local minima is addressed by adding the enhancement of incremental learning to the learning automaton; i.e., small learning goals are used to train the neural network from its initialized state to its final parameters for the desired response. The method is applied to the learning of a benchmark continuous-time trajectory--the circle. Then the learning automaton approach is applied to stabilization and tracking problems for linear and nonlinear plant models, using either state or output feedback as needed.
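The stochastic, gradient-free flavor of the learning-automaton approach can be illustrated with a minimal linear reward-inaction scheme on a two-action environment (a generic textbook scheme, not the thesis's controller; the action reward rates are made up):

```python
import random

def linear_reward_inaction(success_prob, steps=5000, lr=0.05, seed=1):
    """L_R-I scheme: when an action is rewarded, shift probability mass
    toward it; when it fails, leave the probabilities unchanged."""
    rng = random.Random(seed)
    p = [0.5, 0.5]                          # action-selection probabilities
    for _ in range(steps):
        a = 0 if rng.random() < p[0] else 1
        rewarded = rng.random() < success_prob[a]
        if rewarded:
            # move p toward the unit vector of the rewarded action
            for i in range(2):
                target = 1.0 if i == a else 0.0
                p[i] += lr * (target - p[i])
    return p

p = linear_reward_inaction(success_prob=[0.2, 0.8])
```

Incremental learning, as used in the thesis, corresponds to running such an automaton against a sequence of progressively harder reward criteria rather than a fixed one.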
14

Gattoni, Giacomo. "Improving the reliability of recurrent neural networks while dealing with bad data". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.

Search for the full text
Abstract:
In practical applications, machine learning and deep learning models can have difficulty achieving generalization, especially when dealing with training samples that are noisy or limited in quantity. Standard neural networks do not guarantee monotonicity of the output with respect to input features, so they lack interpretability and predictability when it is known a priori that the input-output relationship should be monotonic. This problem arises in the consumer packaged goods (CPG) industry, where it is not possible to ensure that a deep learning model will learn the increasing monotonic relationship between promotional mechanics and sales. To overcome this issue, this thesis proposes combining recurrent neural networks, a type of artificial neural network specifically designed for data structured as sequences, with lattice networks, conceived to guarantee monotonicity of the desired input features with respect to the output. The proposed architecture has proven to be more reliable when new samples are fed to the network, demonstrating its ability to infer the evolution of sales as a function of promotions, even when trained on bad data.
15

Grose, Mitchell. "Forecasting Atmospheric Turbulence Conditions From Prior Environmental Parameters Using Artificial Neural Networks: An Ensemble Study". University of Dayton / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1619632748733788.

Full text
16

Lindell, Adam. "Pulse Repetition Interval Time Series Modeling for Radar Waves using Long Short-Term Memory Artificial Recurrent Neural Networks". Thesis, Uppsala universitet, Avdelningen för beräkningsvetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-377865.

Full text
Abstract:
This project is a performance study of Long Short-Term Memory artificial neural networks in the context of a specific time series prediction problem consisting of radar pulse trains. The network is tested both in terms of accuracy on a regular time series but also on an incomplete time series where values have been removed in order to test its robustness/resistance to small errors. The results indicate that the network can perform very well when no values are removed and can be trained relatively quickly using the parameters set in this project, although the robustness of the network seems to be quite low using this particular implementation.
17

Svebrant, Henrik. "Latent variable neural click models for web search". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-232311.

Full text
Abstract:
User click modeling in web search is most commonly done through probabilistic graphical models. Given the successful use of machine learning techniques in other fields of research, it is interesting to evaluate how machine learning can be applied to click modeling. In this thesis, modeling is done using recurrent neural networks trained on a distributed representation of the state-of-the-art user browsing model (UBM). It is further evaluated how extending this representation with a set of latent variables that are easily derivable from click logs affects the model's prediction performance. Results show that a model using the original representation does not perform very well. However, the inclusion of simple variables can drastically increase performance on the click prediction task, for which it manages to outperform the two chosen baseline models, which are themselves already well performing. It also leads to increased performance on the relevance prediction task, although the results are not as significant. It can be argued that the relevance prediction task is not a fair comparison to the baseline functions, since they need significantly more data to learn the respective probabilities. However, it is favorable that the neural models manage to perform quite well using smaller amounts of data. It would be interesting to see how well such models would perform when trained on far greater data quantities than were used in this project, and to tailor the model to use LSTM units, which could presumably increase performance even more. Evaluating representations other than the one used here would also be of interest, as this representation did not perform remarkably on its own.
18

Chancan, Leon Marvin Aldo. "The role of motion-and-visual perception in robot place learning and navigation". Thesis, Queensland University of Technology, 2022. https://eprints.qut.edu.au/229769/8/Marvin%20Aldo_Chancan%20Leon_Thesis.pdf.

Full text
Abstract:
This thesis was a step forward in developing new robot learning-based localisation and navigation systems using real world data and simulation environments. Three new methods were proposed to provide new insights on the role of joint motion-and-vision-based end-to-end robot learning in both place recognition and navigation tasks, within modern reinforcement learning and deep learning frameworks. Inspired by biological neural circuits underlying these complex tasks in insect and rat mammalian brains, these methods were shown to be orders of magnitude faster than classical techniques, while setting new state-of-the-art performance standards in terms of accuracy, throughput and latency.
19

Canaday, Daniel M. "Modeling and Control of Dynamical Systems with Reservoir Computing". The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu157469471458874.

Full text
20

Max, Lindblad. "The impact of parsing methods on recurrent neural networks applied to event-based vehicular signal data". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-223966.

Full text
Abstract:
This thesis examines two different approaches to parsing event-based vehicular signal data to produce input to a neural network prediction model: event parsing, where the data is kept unevenly spaced over the temporal domain, and slice parsing, where the data is made to be evenly spaced over the temporal domain instead. The dataset used as a basis for these experiments consists of a number of vehicular signal logs taken at Scania AB. Comparisons between the parsing methods have been made by first training long short-term memory (LSTM) recurrent neural networks (RNN) on each of the parsed datasets and then measuring the output error and resource costs of each such model after having validated them on a number of shared validation sets. The results from these tests clearly show that slice parsing compares favourably to event parsing.
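The difference between the two parsing methods can be sketched on a toy event log: event parsing keeps the unevenly spaced (timestamp, value) pairs as-is, while slice parsing resamples them onto an even grid. A minimal sketch assuming last-observation-carried-forward resampling (hypothetical data; the thesis's exact resampling scheme may differ):

```python
# Each event: (timestamp in seconds, signal value); unevenly spaced.
events = [(0.0, 1.0), (0.4, 3.0), (2.1, 2.0), (2.2, 5.0), (4.9, 4.0)]

def slice_parse(events, dt):
    """Resample onto an even grid, carrying the last seen value forward."""
    t_end = events[-1][0]
    slices, i, last = [], 0, events[0][1]
    t = events[0][0]
    while t <= t_end:
        while i < len(events) and events[i][0] <= t:
            last = events[i][1]
            i += 1
        slices.append((round(t, 6), last))
        t += dt
    return slices

sliced = slice_parse(events, dt=1.0)
# Grid points at t = 0, 1, 2, 3, 4, each holding the latest observed value.
```

Evenly spaced slices give the RNN a consistent notion of a time step, which is one plausible reason slice parsing compares favourably in the experiments.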
21

Mohammadisohrabi, Ali. "Design and implementation of a Recurrent Neural Network for Remaining Useful Life prediction". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2020.

Search for the full text
Abstract:
A key idea underlying many predictive maintenance solutions is the Remaining Useful Life (RUL) of machine parts: a prediction of the time remaining before a machine part is likely to require repair or replacement. Nowadays, as systems grow more complex, innovative machine learning and deep learning algorithms can be deployed to study the more sophisticated correlations in complex systems, and the exponential increase in both data accumulation and processing power makes deep learning algorithms more desirable than before. In this paper, a Long Short-Term Memory (LSTM) recurrent neural network is designed to predict the Remaining Useful Life of turbofan engines, using a dataset taken from the NASA data repository. Finally, the performance obtained by the RNN is compared to the best machine learning algorithm for the dataset.
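The LSTM unit at the core of such a model can be sketched as a single forward step in numpy; the weights here are random and the example only illustrates the gating mechanics and shapes, not a trained RUL predictor:

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W: (4H, D), U: (4H, H), b: (4H,).
    Gates are stacked in the order [input, forget, output, candidate]."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = 1 / (1 + np.exp(-z[0:H]))          # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))        # forget gate
    o = 1 / (1 + np.exp(-z[2*H:3*H]))      # output gate
    g = np.tanh(z[3*H:4*H])                # candidate cell state
    c = f * c_prev + i * g                 # gated memory update
    h = o * np.tanh(c)                     # hidden state / output
    return h, c

rng = np.random.default_rng(0)
D, H = 3, 5                                # input and hidden sizes
W = rng.standard_normal((4 * H, D))
U = rng.standard_normal((4 * H, H))
b = np.zeros(4 * H)
h = c = np.zeros(H)
for x in rng.standard_normal((7, D)):      # run a length-7 sensor sequence
    h, c = lstm_step(x, h, c, W, U, b)
```

For RUL prediction, the final hidden state `h` would feed a small regression head that outputs the remaining cycles.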
22

Bahceci, Oktay. "Deep Neural Networks for Context Aware Personalized Music Recommendation : A Vector of Curation". Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-210252.

Full text
Abstract:
Information filtering and recommender systems have been used and implemented in various ways by various entities since the dawn of the Internet, and state-of-the-art approaches rely on machine learning and deep learning to create accurate and personalized recommendations for users in a given context. These models require big amounts of data with a variety of features, such as time, location and user data, in order to find correlations and patterns that classical models such as matrix factorization and collaborative filtering cannot. This thesis researches, implements and compares a variety of models, with a primary focus on machine learning and deep learning, for the task of music recommendation, and does so successfully by representing the recommendation task as a multi-class extreme classification task with 100,000 distinct labels. Across fourteen different experiments, all implemented models successfully learn features such as time, location, user features and previous listening history to create context-aware personalized music predictions, and solve the cold-start problem by using user demographic information. The best model is capable of capturing the intended label in its top-100 list of recommended items for more than a third of the unseen data in an offline evaluation on randomly selected examples from the unseen following week.
23

Forslund, John, e Jesper Fahlén. "Predicting customer purchase behavior within Telecom : How Artificial Intelligence can be collaborated into marketing efforts". Thesis, KTH, Skolan för industriell teknik och management (ITM), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-279575.

Testo completo
Abstract (sommario):
This study investigates the implementation of an AI model that predicts customer purchases in the telecom industry. The thesis also outlines how such an AI model can assist decision-making in marketing strategies. It is concluded that designing the AI model as a Recurrent Neural Network (RNN) architecture with a Long Short-Term Memory (LSTM) layer allows for a successful implementation with satisfactory model performance. Stepwise instructions for constructing such a model are presented in the methodology section of the study. The RNN-LSTM model further serves as an assisting tool for marketers to assess, in a quantitative way, how a consumer's website behavior affects their purchase behavior over time - by observing what the authors refer to as the Customer Purchase Propensity Journey (CPPJ). The firm empirical basis of the CPPJ can help organizations improve their allocation of marketing resources, as well as benefit their online presence by allowing for personalization of the customer experience.
Gli stili APA, Harvard, Vancouver, ISO e altri
24

Howard, Shaun Michael. "Deep Learning for Sensor Fusion". Case Western Reserve University School of Graduate Studies / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=case1495751146601099.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
25

Kišš, Martin. "Rozpoznávání historických textů pomocí hlubokých neuronových sítí". Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2018. http://www.nusl.cz/ntk/nusl-385912.

Testo completo
Abstract (sommario):
The aim of this work is to create a tool for automatic transcription of historical documents. The work focuses mainly on the recognition of texts from the modern period written in the Fraktur script. The problem is solved with a newly designed recurrent convolutional neural network and a Spatial Transformer Network. Part of the solution is an implemented generator of artificial historical texts. Using this generator, an artificial data set is created on which the convolutional neural network for line recognition is trained. This network is then tested on real historical lines of text, on which it achieves up to 89.0% character accuracy. The contribution of this work is primarily the newly designed neural network for text line recognition and the implemented artificial text generator, with which it is possible to train the neural network to recognize real historical lines of text.
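The character-accuracy figure quoted above is conventionally derived from the edit distance between the recognized line and the ground-truth transcription; a minimal sketch (not the thesis's implementation):

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def character_accuracy(reference, hypothesis):
    """1 minus the character error rate: edit distance normalized
    by the length of the reference transcription."""
    return 1.0 - levenshtein(reference, hypothesis) / max(len(reference), 1)
```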
Gli stili APA, Harvard, Vancouver, ISO e altri
26

Křepský, Jan. "Rekurentní neuronové sítě v počítačovém vidění". Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2011. http://www.nusl.cz/ntk/nusl-237029.

Testo completo
Abstract (sommario):
The thesis concentrates on using recurrent neural networks in computer vision. The theoretical part covers the basics of artificial neural networks, with a focus on recurrent architectures, and presents some possible applications of recurrent neural networks to real problems. The practical part concentrates on face recognition from an image sequence using the Elman simple recurrent network, trained with the backpropagation and backpropagation-through-time algorithms.
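The Elman simple recurrent network used in the practical part keeps a copy of the previous hidden layer (the context layer) and feeds it back alongside the new input at every step; a minimal sketch of one forward step (weight layout and sizes are illustrative):

```python
import math

def elman_step(x, h_prev, W_xh, W_hh, b_h):
    """One forward step of an Elman network: the new hidden state is computed
    from the current input x and the fed-back context (previous hidden state)."""
    h = []
    for i in range(len(b_h)):
        s = b_h[i]
        s += sum(W_xh[i][j] * xj for j, xj in enumerate(x))       # input drive
        s += sum(W_hh[i][j] * hj for j, hj in enumerate(h_prev))  # context drive
        h.append(math.tanh(s))
    return h
```

Training with backpropagation through time then unrolls this step over the image sequence.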
Gli stili APA, Harvard, Vancouver, ISO e altri
27

Etienne, Caroline. "Apprentissage profond appliqué à la reconnaissance des émotions dans la voix". Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS517.

Testo completo
Abstract (sommario):
This thesis deals with the application of artificial intelligence to the automatic classification of audio sequences according to the emotional state of the customer during a commercial phone call. The goal is to improve on existing data preprocessing and machine learning models, and to propose a model that is as efficient as possible on the reference IEMOCAP audio dataset. We draw from previous work on deep neural networks for automatic speech recognition and extend it to the speech emotion recognition task. We are therefore interested in end-to-end neural architectures that perform the classification task, including autonomous extraction of acoustic features from the audio signal. Traditionally, the audio signal is preprocessed using paralinguistic features, as part of an expert approach. We choose a naive approach for data preprocessing that does not rely on specialized paralinguistic knowledge, and compare it with the expert approach: the raw audio signal is transformed into a time-frequency spectrogram using a short-term Fourier transform. In order to apply a neural network to a prediction task, a number of aspects need to be considered. On the one hand, the best possible hyperparameters must be identified. On the other hand, biases present in the database should be minimized (non-discrimination), for example by adding data and taking into account the characteristics of the chosen dataset. We study these aspects in order to develop an end-to-end neural architecture that combines convolutional layers, specialized in modeling visual information, with recurrent layers, specialized in modeling temporal information. We propose a deep supervised learning model, competitive with the current state of the art when trained on the IEMOCAP dataset, justifying its use for the rest of the experiments.
This classification model consists of four convolutional layers and a bidirectional long short-term memory recurrent neural network (BLSTM). Our model is evaluated on two English audio databases proposed by the scientific community: IEMOCAP and MSP-IMPROV. A first contribution is to show that, with a deep neural network, we obtain high performance on IEMOCAP, and that the results are promising on MSP-IMPROV. Another contribution of this thesis is a comparative study of the output values of the layers of the convolutional module and the recurrent module according to the data preprocessing method used: spectrograms (naive approach) or paralinguistic indices (expert approach). We analyze the data according to their emotion class using the Euclidean distance, a deterministic proximity measure, and try to understand the characteristics of the emotional information extracted autonomously by the network. The idea is to contribute to research focused on the understanding of deep neural networks used in speech emotion recognition and to bring more transparency and explainability to these systems, whose decision-making mechanism is still largely misunderstood.
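The naive preprocessing described above, turning raw audio into a time-frequency spectrogram via a short-term Fourier transform, can be sketched with NumPy (frame and hop sizes are illustrative assumptions):

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram: slice the raw signal into overlapping
    Hann-windowed frames and take the magnitude of each frame's real FFT."""
    window = np.hanning(frame_len)
    frames = [signal[i:i + frame_len] * window
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.abs(np.fft.rfft(np.stack(frames), axis=1))  # (n_frames, n_bins)
```

The resulting 2D array is what the convolutional module consumes, treating the spectrogram like an image.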
Gli stili APA, Harvard, Vancouver, ISO e altri
28

Schäfer, Anton Maximilian. "Reinforcement Learning with Recurrent Neural Networks". Doctoral thesis, 2008. https://repositorium.ub.uni-osnabrueck.de/handle/urn:nbn:de:gbv:700-2008112111.

Testo completo
Abstract (sommario):
Controlling a high-dimensional dynamical system with continuous state and action spaces in a partially unknown environment, such as a gas turbine, is a challenging problem. So far, hard-coded rules based on experts' knowledge and experience are often used; machine learning techniques, which comprise the field of reinforcement learning, are generally only applied to sub-problems. A reason for this is that most standard RL approaches still fail to produce satisfactory results in those complex environments. Besides, they are rarely data-efficient, a fact which is crucial for most real-world applications, where the available amount of data is limited. In this thesis, recurrent neural reinforcement learning approaches to identify and control dynamical systems in discrete time are presented. They form a novel connection between recurrent neural networks (RNN) and reinforcement learning (RL) techniques. RNN are used as they allow for the identification of dynamical systems in the form of high-dimensional, non-linear state space models; they have also been shown to be very data-efficient. In addition, a proof is given of their universal approximation capability for open dynamical systems. Moreover, it is pointed out that, in contrast to an often cited statement, they are well able to capture long-term dependencies. As a first step towards reinforcement learning, it is shown that RNN can map and reconstruct (partially observable) MDP well. In the so-called hybrid RNN approach, the resulting inner state of the network is then used as a basis for standard RL algorithms. The further-developed recurrent control neural network combines system identification and determination of an optimal policy in one network; in contrast to most RL methods, it determines the optimal policy directly, without making use of a value function. The methods are tested on several standard benchmark problems and, in addition, applied to different kinds of gas turbine simulations of industrial scale.
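System identification with an RNN, as described above, amounts to fitting a non-linear state-space model to observed input/output data; a minimal rollout sketch (the matrices, sizes and the tanh non-linearity are illustrative assumptions, not the thesis's architecture):

```python
import numpy as np

def rnn_rollout(u_seq, A, B, C, x0):
    """Open dynamical system in RNN form: hidden state x evolves as
    x_{t+1} = tanh(A x_t + B u_t) and each step emits y_t = C x_{t+1}.
    Returns the sequence of observations for a sequence of controls u."""
    x, ys = x0, []
    for u in u_seq:
        x = np.tanh(A @ x + B @ u)
        ys.append(C @ x)
    return np.stack(ys)
```

Fitting A, B, C by gradient descent on the prediction error over such rollouts is the identification step; the control network then builds a policy on top of the learned inner state.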
Gli stili APA, Harvard, Vancouver, ISO e altri
29

Rodríguez, Sotelo José Manuel. "Speech synthesis using recurrent neural networks". Thèse, 2016. http://hdl.handle.net/1866/19111.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
30

Chung, Junyoung. "On Deep Multiscale Recurrent Neural Networks". Thèse, 2018. http://hdl.handle.net/1866/21588.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
31

Rossi, Alberto. "Siamese and Recurrent neural networks for Medical Image Processing". Doctoral thesis, 2021. http://hdl.handle.net/2158/1238384.

Testo completo
Abstract (sommario):
In recent years, computer vision applications have been pervaded by deep convolutional neural networks (CNNs). These networks allow practitioners to achieve state-of-the-art performance in image segmentation, classification and object localization, but in each case the results obtained are directly correlated with the size of the training set, the quality of the annotations, the network depth and the power of modern GPUs. The same rules apply to medical image analysis, although, in this case, collecting tagged images is harder than ever, due to the scarcity of data — because of privacy policies and acquisition difficulties — and to the need for experts in the field to make annotations. Very recently, scientific interest in the study and application of CNNs to medical imaging has grown significantly, opening up challenging new tasks but also raising fundamental issues that are still open. Is there a way to use deep networks for image retrieval in a database, to compare and analyze a new image? Are CNNs robust enough to be trusted by doctors? How can small institutions with limited funds manage the expensive equipment, such as modern GPUs, needed to train very deep neural networks? This thesis investigates many of the issues described above, adopting two deep learning architectures, namely siamese networks and recurrent neural networks. We start with the use of siamese networks to build a Content-Based Image Retrieval system for prostate MRI, to provide radiologists with a tool for comparing multi-parametric MRIs in order to facilitate a new diagnosis. Moreover, an investigation is proposed on the use of a composite-loss classifier for prostate MRI, based on siamese networks, to increase robustness to random noise and adversarial attacks, yielding more reliable results. Finally, a new method for intra-procedural registration of prostate MRIs based on siamese networks was developed.
The use of recurrent neural networks is then explored for skin lesion classification and age estimation based on brain MRI. In particular, a newly devised recurrent architecture, called C-FRPN, is employed for classifying natural images of nevi and melanomas, achieving good performance with a reduced computational load. A similar conclusion can be drawn for brain MRI, where 3D images can be sliced and processed by recurrent architectures in an efficient yet reliable way.
Gli stili APA, Harvard, Vancouver, ISO e altri
32

Lin, Wen-chung, e 林文中. "Qualitative Modeling of Genetic Regulatory Networks via Recurrent Artificial Neural Network". Thesis, 2002. http://ndltd.ncl.edu.tw/handle/76416405983136620066.

Testo completo
Abstract (sommario):
Master's thesis
Chang Gung University
Graduate Institute of Information Management
Republic of China year 90 (2001-2002)
According to the 2001 statistical abstract of the Department of Health, Taiwan, R.O.C., cancer remains the leading cause of death, and cancer therapy is therefore widely emphasized. Clinically, we cannot tell the specific difference between a normal cell and a cancer cell, which is one of the barriers to developing cancer therapies. We therefore propose a qualitative model to help workers on gene-related diseases understand and reason about the effect of toxic chemicals and medicines that are capable of activating or inactivating certain genes in the treatment of such diseases. In this thesis, we propose to model gene regulatory networks qualitatively via a recurrent artificial neural network, under the assumption that the gene regulatory network is deterministic. Such a computational and representational model can reason about the interactions among related genes effectively and intuitively. It can trace snapshots of gene regulatory dynamics at any two consecutive time steps concurrently along the discrete time line, and it can produce what-if scenarios when certain genes are purposely activated or inactivated as needed. Hence, it can serve as an auxiliary tool for workers on gene-related diseases.
Gli stili APA, Harvard, Vancouver, ISO e altri
33

Krueger, David. "Designing Regularizers and Architectures for Recurrent Neural Networks". Thèse, 2016. http://hdl.handle.net/1866/14019.

Testo completo
Gli stili APA, Harvard, Vancouver, ISO e altri
34

Peterson, Cole. "Generating rhyming poetry using LSTM recurrent neural networks". Thesis, 2019. http://hdl.handle.net/1828/10801.

Testo completo
Abstract (sommario):
Current approaches to generating rhyming English poetry with a neural network involve constraining output to enforce the condition of rhyme. We investigate whether this approach is necessary, or whether recurrent neural networks can learn rhyme patterns on their own. We compile a new dataset of amateur poetry whose size and high frequency of rhymes allow rhyme to be learned without external constraints. We then evaluate models trained on the new dataset using a novel framework that automatically measures the system's knowledge of poetic form and its generalizability. We find that our trained model is able to generalize the pattern of rhyme and generate rhymes unseen in the training data, and that the learned word embeddings for rhyming sets of words are linearly separable. Our model generates a couplet which rhymes 68.15% of the time; this is the first time that a recurrent neural network has been shown to generate rhyming poetry a high percentage of the time. Additionally, we show that crowd-sourced workers can distinguish between our generated couplets and couplets from our dataset only 63.3% of the time, indicating that our model generates poetry with coherency, semantic meaning, and fluency comparable to couplets written by humans.
Graduate
Gli stili APA, Harvard, Vancouver, ISO e altri
35

Anbil, Parthipan Sarath Chandar. "On challenges in training recurrent neural networks". Thèse, 2019. http://hdl.handle.net/1866/23435.

Testo completo
Abstract (sommario):
In a multi-step prediction problem, the prediction at each time step can depend on the input at any of the previous time steps far in the past. Modelling such long-term dependencies is one of the fundamental problems in machine learning. In theory, Recurrent Neural Networks (RNNs) can model any long-term dependency. In practice, they can only model short-term dependencies due to the problem of vanishing and exploding gradients. This thesis explores the problem of vanishing gradient in recurrent neural networks and proposes novel solutions for the same. Chapter 3 explores the idea of using external memory to store the hidden states of a Long Short Term Memory (LSTM) network. By making the read and write operations of the external memory discrete, the proposed architecture reduces the rate of gradients vanishing in an LSTM. These discrete operations also enable the network to create dynamic skip connections across time. Chapter 4 attempts to characterize all the sources of vanishing gradients in a recurrent neural network and proposes a new recurrent architecture which has significantly better gradient flow than state-of-the-art recurrent architectures. The proposed Non-saturating Recurrent Units (NRUs) have no saturating activation functions and use additive cell updates instead of multiplicative cell updates. Chapter 5 discusses the challenges of using recurrent neural networks in the context of lifelong learning. In the lifelong learning setting, the network is expected to learn a series of tasks over its lifetime. The dependencies in lifelong learning are not just within a task, but also across the tasks. This chapter discusses the two fundamental problems in lifelong learning: (i) catastrophic forgetting of old tasks, and (ii) network capacity saturation. Further, it proposes a solution to solve both these problems while training a recurrent neural network.
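The vanishing-gradient problem discussed above can be reproduced in a few lines: back-propagating through repeated applications of h_t = σ(W h_{t-1}) multiplies the gradient by diag(σ′) · W at every step, and because the sigmoid derivative never exceeds 0.25 the product typically shrinks geometrically. A small NumPy demonstration (sizes and weight scales are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.3, size=(8, 8))       # modest recurrent weights
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

h = rng.normal(size=8)
grad = np.ones(8)            # gradient arriving at the last hidden state
norms = []
for _ in range(50):
    h = sigmoid(W @ h)
    jac = (h * (1 - h))[:, None] * W   # diag(sigma'(z)) @ W, one BPTT step
    grad = jac.T @ grad                # push the gradient one step back
    norms.append(float(np.linalg.norm(grad)))
```

The gradient norm collapses toward zero across the 50 steps; additive cell updates without saturating activations, as in the NRUs proposed in Chapter 4, avoid this repeated multiplicative shrinkage.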
Gli stili APA, Harvard, Vancouver, ISO e altri
36

Zhu, Yuqing. "Nonlinear system identification using a genetic algorithm and recurrent artificial neural networks". Thesis, 2006. http://spectrum.library.concordia.ca/9060/1/MR20771.pdf.

Testo completo
Abstract (sommario):
In this study, the application of Recurrent Artificial Neural Networks (RANN) to nonlinear system identification is extensively explored. Three RANN-based identification models are presented to describe the behavior of nonlinear systems. The approximation accuracy of RANN-based models relies on two key factors: architecture and weights. Due to its inherent parallelism and evolutionary mechanism, a Genetic Algorithm (GA) is a promising technique for obtaining a good neural network architecture; in this study, a GA is developed to approach the optimal architecture of a RANN with multiple hidden layers. Approaching the optimal architecture in the sense of minimizing the identification error demands an effective encoding scheme, so a new Direct Matrix Mapping Encoding (DMME) method is proposed to represent the architecture of a neural network. A modified back-propagation (BP) algorithm, which tunes not only the network weights but other adjustable parameters as well, is utilized for training. The RANN with an optimized or approximately optimized architecture and trained weights has been applied to the identification of nonlinear dynamic systems with unknown nonlinearities, which is a challenge in the control community. The effectiveness of these models and identification algorithms is extensively verified on several complex nonlinear systems, such as a "smart" actuator preceded by hysteresis and a friction-plagued harmonic drive.
Gli stili APA, Harvard, Vancouver, ISO e altri
37

Ghazi-Zahedi, Keyan Mahmoud. "Self-Regulating Neurons. A model for synaptic plasticity in artificial recurrent neural networks". Doctoral thesis, 2009. https://repositorium.ub.uni-osnabrueck.de/handle/urn:nbn:de:gbv:700-2009020616.

Testo completo
Abstract (sommario):
Robustness and adaptivity are important behavioural properties observed in biological systems which are still widely absent in artificial intelligence applications. Such static, non-plastic artificial systems are limited to their very specific problem domain. This work introduces a general model for synaptic plasticity in embedded artificial recurrent neural networks, which is related to short-term plasticity by synaptic scaling in biological systems. The model is general in the sense that it does not require trigger mechanisms or artificial limitations, and it operates on recurrent neural networks of arbitrary structure. A Self-Regulating Neuron is defined as a homeostatic unit which regulates its activity against external disturbances towards a target value by modulating its incoming and outgoing synapses. Embedded and situated in the sensori-motor loop, a network of these neurons is permanently driven by external stimuli and will generally not settle at its asymptotically stable state. The system's behaviour is determined by the local interactions of the Self-Regulating Neurons. The neuron model is analysed as a dynamical system with respect to its attractor landscape and its transient dynamics. The latter analysis is conducted on control structures for obstacle avoidance of increasing structural complexity, derived from the literature. The result is a controller that shows first traces of adaptivity. Next, two controllers for different tasks are evolved and their transient dynamics are fully analysed. The results of this work not only show that the proposed neuron model enhances the behavioural properties, but also point out the limitations of short-term plasticity, which does not account for learning and memory.
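The homeostatic regulation described above — a unit nudging its own synapses so that its activity moves toward a target value — can be sketched as a simple scaling rule (the update form and constants are illustrative assumptions, not the thesis's exact equations):

```python
def self_regulate(weights, activity, target=0.5, rate=0.1):
    """Homeostatic synaptic scaling: scale a unit's synapses up when its
    current activity lies below the target value, and down when above."""
    scale = 1.0 + rate * (target - activity)
    return [w * scale for w in weights]
```

Applied continuously inside the sensori-motor loop, such a rule counteracts persistent external disturbances without any explicit trigger mechanism.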
Gli stili APA, Harvard, Vancouver, ISO e altri
38

Boulanger-Lewandowski, Nicolas. "Modeling High-Dimensional Audio Sequences with Recurrent Neural Networks". Thèse, 2014. http://hdl.handle.net/1866/11181.

Testo completo
Abstract (sommario):
This thesis studies models of high-dimensional sequences based on recurrent neural networks (RNNs) and their application to music and speech. While in principle RNNs can represent the long-term dependencies and complex temporal dynamics present in real-world sequences such as video, audio and natural language, they had not been used to their full potential since their introduction by Rumelhart et al. (1986a), due to the difficulty of training them efficiently by gradient-based optimization. In recent years, the successful application of Hessian-free optimization and other advanced training techniques motivated an increase in their use in many state-of-the-art systems. The work of this thesis is part of this development. The main idea is to exploit the power of RNNs to learn a probabilistic description of sequences of symbols, i.e. high-level information associated with observed signals, that in turn can be used as a prior to improve the accuracy of information retrieval. For example, by modeling the evolution of note patterns in polyphonic music, chords in a harmonic progression, phones in a spoken utterance, or individual sources in an audio mixture, we can significantly improve the accuracy of polyphonic transcription, chord recognition, speech recognition and audio source separation respectively. The practical application of our models to these tasks is detailed in the last four articles presented in this thesis. In the first article, we replace the output layer of an RNN with conditional restricted Boltzmann machines to describe much richer multimodal output distributions. In the second article, we review and develop advanced techniques to train RNNs. In the last four articles, we explore various ways to combine our symbolic models with deep networks and non-negative matrix factorization algorithms, namely using products of experts, input/output architectures, and generative frameworks that generalize hidden Markov models.
We also propose and analyze efficient inference procedures for those models, such as greedy chronological search, high-dimensional beam search, dynamic programming-like pruned beam search and gradient descent. Finally, we explore issues such as label bias, teacher forcing, temporal smoothing, regularization and pre-training.
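Among the inference procedures listed above, beam search is the most generic; a simplified sketch follows (the toy per-step distributions and the function are illustrative, not the thesis implementation):

```python
import math

def beam_search(step_log_probs, beam_width=2):
    """Generic beam search over per-step log-probability tables.

    step_log_probs: one dict per time step mapping symbol -> log-probability
    (a stand-in for an RNN's output distribution at that step).
    Returns the highest-scoring sequence and its total log-probability.
    """
    beams = [((), 0.0)]  # (partial sequence, cumulative log-probability)
    for dist in step_log_probs:
        candidates = [
            (seq + (sym,), score + lp)
            for seq, score in beams
            for sym, lp in dist.items()
        ]
        # keep only the beam_width best partial sequences
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0]

# toy 3-step distributions over two symbols
dists = [
    {"a": math.log(0.6), "b": math.log(0.4)},
    {"a": math.log(0.3), "b": math.log(0.7)},
    {"a": math.log(0.5), "b": math.log(0.5)},
]
best_seq, best_score = beam_search(dists, beam_width=2)
```

Because the beam only keeps a few partial hypotheses per step, its cost stays linear in sequence length, which is what makes it attractive for the high-dimensional output spaces considered here.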
39

Kanuparthi, Bhargav. "Towards better understanding and improving optimization in recurrent neural networks". Thesis, 2020. http://hdl.handle.net/1866/24319.

Full text
Abstract:
Recurrent neural networks (RNNs) are known for their notorious exploding and vanishing gradient problem (EVGP). This problem becomes more evident in tasks where the information needed to solve them correctly exists over long time scales, because it prevents important gradient components from being back-propagated adequately over a large number of steps. The papers in this work formalize gradient propagation in parametric and semi-parametric RNNs to gain a better understanding of the source of this problem. The first paper introduces a simple stochastic algorithm (h-detach) that is specific to LSTM optimization and targeted at addressing the EVGP. Using it, we show significant improvements over the vanilla LSTM in terms of convergence speed, robustness to seed and learning rate, and generalization on various benchmark datasets. The next paper focuses on semi-parametric RNNs and self-attentive networks. Self-attention provides a way for a system to dynamically access past states (stored in memory), which helps mitigate vanishing gradients. Although useful, it is difficult to scale, as the size of the computational graph grows quadratically with the number of time steps involved. In the paper we describe a relevancy screening mechanism, inspired by the cognitive process of memory consolidation, that allows for a scalable use of sparse self-attention with recurrence while ensuring good gradient propagation.
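The EVGP discussed in this abstract can be illustrated numerically: in a linear recurrence, the backpropagated gradient is a product of one factor per time step, so it decays or blows up geometrically with sequence length (a toy scalar illustration, not the paper's analysis):

```python
def gradient_norm_through_time(w, steps):
    """|d h_T / d h_0| for the scalar linear recurrence h_t = w * h_{t-1}:
    the backpropagated gradient is the product of `steps` identical factors."""
    g = 1.0
    for _ in range(steps):
        g *= w
    return abs(g)

vanishing = gradient_norm_through_time(0.9, 100)   # ~2.7e-5: gradient vanishes
exploding = gradient_norm_through_time(1.1, 100)   # ~1.4e4: gradient explodes
```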
40

Mehri, Soroush. "Sequential modeling, generative recurrent neural networks, and their applications to audio". Thèse, 2016. http://hdl.handle.net/1866/18762.

Full text
41

Agrawal, Harish. "Novel Neural Architectures based on Recurrent Connections and Symmetric Filters for Visual Processing". Thesis, 2022. https://etd.iisc.ac.in/handle/2005/6022.

Full text
Abstract:
Artificial Neural Networks (ANNs) have been very successful due to their ability to extract meaningful information without any need for pre-processing raw data. The first artificial neural networks were created, in essence, to understand how the human brain works, in the expectation that we would gain a deeper understanding of brain function and human cognition that biological experiments or intuition alone cannot explain. The field has since grown so much that ANNs are no longer limited to the purpose for which they emerged, but are also exploited for their unmatched pattern-matching and learning capabilities in addressing many complex problems that are difficult or impossible to solve by standard computational and statistical methods. Research has gone from using ANNs only to understand brain function to creating new types of ANN based on the neuronal pathways present in the brain. This thesis proposes two novel neural network layers based on studies of the human brain. The first is a type of recurrent convolutional neural network layer called a Long-Short-Term-Convolutional-Neural-Network (LST_CNN); the other is a Symmetric Convolutional Neural Network layer based on symmetric filters. Current feedforward neural network models have been successful in visual processing; as a result, lateral and feedback processing has been under-explored. Existing visual processing networks (convolutional neural networks) lack the recurrent neuronal dynamics present in the ventral visual pathways of human and non-human primate brains, which contain similar densities of feedforward and feedback connections. Furthermore, current convolutional models are limited to learning spatial information, whereas we should also focus on learning temporal visual information, considering that the world is dynamic rather than static.
This motivates incorporating recurrence into convolutional neural networks. The layer we propose (LST_CNN) is not limited to spatial learning but is also capable of exploiting temporal knowledge from the data, due to the implicit presence of recurrence in its structure. The capability of LST_CNN’s spatiotemporal learning is examined by testing it on object detection and tracking. Because LST_CNN is based on the LSTM, we also explicitly evaluate its spatial learning capabilities through experiments. The visual cortex in the human brain has evolved to detect patterns and has hence specialized in detecting the pervasive symmetry in nature. When filter weights from deep state-of-the-art networks are visualized, several of them are symmetric, much like the features they represent, inspiring the idea of constraining standard convolutional filter weights to be symmetric. Given that the computational requirements for DNN training have doubled every few months, researchers have been proposing architectural changes to combat this. In that light, deploying symmetric filters reduces not only computational resources but also the memory footprint, so symmetric filters are beneficial both during training and at inference. Despite the reduction in trainable parameters, accuracy is comparable to that of the standard version, suggesting that the constraint helps prevent over-fitting. We thus establish the value of symmetric filters in neural network models.
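The parameter saving behind symmetric filters can be sketched by building a transpose-symmetric k×k filter from only k(k+1)/2 free weights (an illustrative counting argument; the thesis may constrain other symmetry types):

```python
def symmetric_filter(free_params, k):
    """Build a k x k transpose-symmetric filter from k*(k+1)//2 free weights."""
    assert len(free_params) == k * (k + 1) // 2
    f = [[0.0] * k for _ in range(k)]
    idx = 0
    for i in range(k):
        for j in range(i, k):
            # tie the (i, j) and (j, i) entries to one trainable weight
            f[i][j] = f[j][i] = free_params[idx]
            idx += 1
    return f

k = 3
n_free = k * (k + 1) // 2  # 6 trainable weights instead of 9
filt = symmetric_filter(list(range(1, n_free + 1)), k)
```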
42

Ghazi-Zahedi, Keyan Mahmoud [Verfasser]. "Self-regulating neurons : a model for synaptic plasticity in artificial recurrent neural networks / Keyan Mahmoud Ghazi-Zahedi". 2009. http://d-nb.info/992767202/34.

Full text
43

Sainath, Pravish. "Modeling functional brain activity of human working memory using deep recurrent neural networks". Thesis, 2020. http://hdl.handle.net/1866/25468.

Full text
Abstract:
In cognitive systems, the role of working memory is crucial for visual reasoning and decision making. Tremendous progress has been made in understanding the mechanisms of human/animal working memory, as well as in formulating different frameworks of memory-augmented artificial neural networks. The overall objective of our project is to train artificial neural network models that are capable of consolidating memory over a short period of time to solve a memory task, and to relate them to the brain activity of humans who solved the same task. The project is interdisciplinary in nature, trying to bridge aspects of Artificial Intelligence (deep learning) and Neuroscience. The cognitive task used is the N-back task, a very popular one in Cognitive Neuroscience, in which subjects are presented with a sequence of images, each of which needs to be identified as to whether it was already seen or not. The functional imaging (fMRI) dataset used was collected as part of the Courtois NeuroMod Project. We study multiple variants of recurrent neural network models that learn to remember input images across timesteps. These trained neural networks, optimized for the memory task, are ultimately used to generate feature representations for the stimulus images seen by the human subjects during their recordings while solving the task. The representations derived from these neural networks are then used to create an encoding model to predict the fMRI BOLD activity of the subjects. We then probe the relationship between the neural network model and brain activity by analyzing the predictive ability of the model in different areas of the brain involved in working memory.
This work presents a way of using artificial neural networks to model the behavior and information processing of the working memory of the brain and to use brain imaging data captured from human subjects during the N-back task to potentially understand some memory mechanisms of the brain in relation to these artificial neural network models.
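An encoding model of the kind described above is commonly a regularized linear regression from network features to the BOLD signal; a one-feature ridge-regression sketch on toy data (the regression form and regularization value are illustrative assumptions, not the thesis code):

```python
def ridge_fit_1d(x, y, lam):
    """Closed-form ridge regression for one feature (no intercept):
    w = (x . y) / (x . x + lam)."""
    num = sum(xi * yi for xi, yi in zip(x, y))
    den = sum(xi * xi for xi in x) + lam
    return num / den

# toy network-layer feature x and toy BOLD response y = 2 * x
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]
w = ridge_fit_1d(x, y, lam=0.0)          # recovers the true slope 2.0
w_shrunk = ridge_fit_1d(x, y, lam=30.0)  # regularization shrinks the estimate
```

In practice the fitted model's predictive accuracy per brain region is what is analyzed, as the abstract describes.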
44

Ting, Chien-Chung, e 丁建中. "Robust Stabilization Analysis and Estimator Design for Uncertain Neutral Recurrent Neural Networks with Interval Time-varying Discrete and Distributed Delays". Thesis, 2010. http://ndltd.ncl.edu.tw/handle/58694883170618759753.

Full text
Abstract:
Master's thesis
National Changhua University of Education
Department of Industrial Education and Technology
98 (ROC academic year, corresponding to 2009)
This thesis presents a complete study of stability analysis and state-estimator design. The focus is on neutral neural networks with both interval discrete and distributed time-varying delays, where the time-varying delays lie in a given range. In the stability-analysis problem, the purpose is to develop globally robust delay-dependent stability criteria for neutral uncertain neural networks with both discrete and distributed delays. The activation functions are assumed to be bounded and globally Lipschitz continuous. By using a Lyapunov function approach and linear matrix inequality (LMI) techniques, stability criteria for neutral uncertain neural networks with both discrete and distributed delays are established in the form of LMIs, which can be readily verified using standard numerical software. In the estimator-design problem, state estimation for neutral neural networks with both discrete and distributed interval time-varying delays is investigated. Using the Lyapunov-Krasovskii method, an LMI approach is developed to construct sufficient conditions for the existence of admissible state estimators such that the error-state system is globally asymptotically stable. We then show that both the existence conditions and the explicit expression of the desired estimator can be characterized in terms of the solution to an LMI. Finally, some illustrative examples demonstrate the effectiveness of the proposed approach.
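In the delay-free linear special case, LMI conditions of the kind referred to above reduce to the classical Lyapunov inequality A^T P + P A < 0; a minimal 2x2 numeric check (illustrative only — the thesis treats the far harder neutral, delayed, uncertain case):

```python
def is_negative_definite(m):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    return m[0][0] < 0 and (m[0][0] * m[1][1] - m[0][1] * m[1][0]) > 0

def matmul2(x, y):
    """2x2 matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lyapunov_lhs(a, p):
    """Compute A^T P + P A for 2x2 matrices A and P."""
    at = [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]  # transpose of A
    m1, m2 = matmul2(at, p), matmul2(p, a)
    return [[m1[i][j] + m2[i][j] for j in range(2)] for i in range(2)]

P = [[1.0, 0.0], [0.0, 1.0]]            # candidate Lyapunov matrix
A_stable = [[-1.0, 0.5], [0.0, -2.0]]
A_unstable = [[1.0, 0.0], [0.0, -2.0]]
stable = is_negative_definite(lyapunov_lhs(A_stable, P))         # certified
not_certified = is_negative_definite(lyapunov_lhs(A_unstable, P))
```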
45

Laurent, César. "Advances in parameterisation, optimisation and pruning of neural networks". Thesis, 2020. http://hdl.handle.net/1866/25592.

Full text
Abstract:
Neural networks are a family of Machine Learning models able to learn complex tasks directly from data. Although they already produce impressive results in many areas such as speech recognition, computer vision or machine translation, there are still many challenges in both the training and the deployment of neural networks. In particular, training neural networks typically requires huge amounts of computational resources, and trained models are often too big or too computationally expensive to be deployed on resource-limited devices, such as smartphones or low-power chips. The articles presented in this thesis investigate solutions to these different issues. The first couple of articles focus on improving the training of Recurrent Neural Networks (RNNs), networks specially designed to process sequential data. RNNs are notoriously hard to train, so we propose to improve their parameterisation by upgrading them with Batch Normalisation (BN), a very effective parameterisation hitherto used only in feed-forward networks. In the first article, we apply BN to the input-to-hidden connections of the RNN, thereby reducing internal covariate shift between layers. In the second article, we show how to apply it to both the input-to-hidden and hidden-to-hidden connections of the Long Short-Term Memory (LSTM), a popular RNN architecture, thus also reducing internal covariate shift between time steps. Our experiments show that these proposed parameterisations allow for faster and better training of RNNs on several benchmarks. In the third article, we propose a new optimiser to accelerate the training of neural networks. Traditional diagonal optimisers, such as RMSProp, operate in parameter coordinates, which is not optimal when several parameters are updated at the same time. Instead, we propose to apply such optimisers in a basis in which the diagonal approximation is likely to be more effective.
We leverage the same approximation used in Kronecker-factored Approximate Curvature (K-FAC) to efficiently build this Kronecker-factored Eigenbasis (KFE). Our experiments show improvements over K-FAC in training speed for several deep network architectures. The last article focuses on network pruning, the action of removing parameters from the network, in order to reduce its memory footprint and computational cost. Typical pruning methods rely on first or second order Taylor approximations of the loss landscape to identify which parameters can be discarded. We propose to study the impact of the assumptions behind such approximations. Moreover, we systematically compare methods based on first and second order approximations with Magnitude Pruning (MP), showing how they perform both before and after a fine-tuning phase. Our experiments show that better preserving the original network function does not necessarily transfer to better performing networks after fine-tuning, suggesting that only considering the impact of pruning on the loss might not be a sufficient objective to design good pruning criteria.
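Magnitude Pruning (MP), the baseline in the pruning comparison above, simply removes the smallest-magnitude weights; a minimal sketch (toy weight vector, not the thesis code):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    # indices of the n_prune weights closest to zero
    prune_idx = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:n_prune]
    pruned = list(weights)
    for i in prune_idx:
        pruned[i] = 0.0
    return pruned

w = [0.8, -0.05, 0.3, -0.9, 0.01, 0.2]
pruned = magnitude_prune(w, 0.5)  # removes the 3 smallest-magnitude weights
```

Taylor-based criteria replace the |w| ranking with an estimate of each weight's effect on the loss; the article's finding is that this better loss preservation does not necessarily survive fine-tuning.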
46

Leszko, Dominika. "Time series forecasting for a call center in a Warsaw holding company". Master's thesis, 2020. http://hdl.handle.net/10362/102939.

Full text
Abstract:
Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics
In recent years, artificial intelligence and cognitive technologies have been actively adopted in industries that use conversational marketing. Workforce managers face the constant challenge of balancing the priorities of service levels and related service costs. This problem is especially common when inaccurate forecasts lead to inefficient scheduling decisions, which in turn have a dramatic impact on customer engagement and experience, and thus on the call center’s profitability. The main trigger for this project was Company X’s struggle to estimate the number of inbound phone calls expected over the upcoming 40 days. An accurate call-volume forecast could significantly improve consultants’ time management as well as service quality. With this goal in mind, the main focus of this internship is to conduct a set of experiments with various types of predictive models and identify the best-performing one for the analyzed use case. After a thorough review of the literature on time-series analysis, the empirical part of the internship describes the development of both univariate and multivariate statistical models. The methods used in the report also include two types of recurrent neural networks commonly used for time-series prediction. The exogenous variables used in the multivariate models are derived from the company’s Media Planning department, which stores information about the ads published in newspapers. The outcome of the research shows that the statistical models outperformed the neural networks in this specific application. This report covers an overview of the statistical and neural network models used, followed by a comparative study of all tested models, from which one best-performing model is selected.
The experiments showed that the SARIMAX model yields the best predictions for the analyzed use case; it is therefore recommended to the company for better staff management, driving a more pleasant customer experience at the call center.
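As a hedged illustration of the kind of simple baseline such forecasting models are judged against, a seasonal-naive forecast repeats the value observed one season earlier (the weekly season and the toy call counts are assumptions, not the company's data):

```python
def seasonal_naive_forecast(history, season, horizon):
    """Forecast each future step with the value observed one season earlier."""
    extended = list(history)
    forecast = []
    for _ in range(horizon):
        forecast.append(extended[-season])  # look back one full season
        extended.append(forecast[-1])
    return forecast

# three identical toy weeks of daily inbound call counts
daily_calls = [120, 95, 100, 110, 130, 60, 40] * 3
pred = seasonal_naive_forecast(daily_calls, season=7, horizon=7)
```

SARIMAX generalizes this idea by modeling seasonal autoregressive and moving-average structure plus exogenous regressors (here, the ad-publication variables).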
47

(11048391), Hao Sha. "SOLVING PREDICTION PROBLEMS FROM TEMPORAL EVENT DATA ON NETWORKS". Thesis, 2021.

Search for full text
Abstract:

Many complex processes can be viewed as sequential events on a network. In this thesis, we study the interplay between a network and the event sequences on it. We first focus on predicting events on a known network. Examples include modeling retweet cascades, forecasting earthquakes, and tracing the source of a pandemic. Specifically, given the network structure, we solve two types of problems: (1) forecasting future events based on historical events, and (2) identifying the initial event(s) based on later observations of the dynamics. The inverse problem, inferring the unknown network topology or links based on the events, is also of great importance. Examples along this line include constructing influence networks among Twitter users from their tweets, soliciting new members to join an event based on their participation history, and recommending positions to job seekers according to their work experience. Following this direction, we study two types of problems: (1) recovering influence networks, and (2) predicting links between a node and a group of nodes, from event sequences.

48

Savard, François. "Réseaux de neurones à relaxation entraînés par critère d'autoencodeur débruitant". Thèse, 2011. http://hdl.handle.net/1866/6176.

Full text
Abstract:
Machine learning is a vast field where we seek to learn model parameters from concrete data, with the goal of executing tasks that require abilities normally associated more with human intelligence than with a computer program, such as the ability to process high-dimensional data containing many variations. Artificial neural networks are a large class of such models. In some neural networks said to be deep, we can observe that high-level (or "abstract") concepts are automatically learned. The work we present here takes its inspiration from deep neural networks, from recurrent networks and also from the neuroscience of the visual system. Our test tasks are classification and denoising of near-binary images. We aim to take advantage of a feedback mechanism through which high-level representations, that is to say relatively abstract concepts, can influence lower-level representations. This influence happens during what we call relaxation: iterations during which the different levels (or layers) of the model influence each other. We present two families of architectures based on this mechanism. One, the fully connected architecture, can in principle accept generic data; the other, the convolutional one, is specifically made for images. Both were trained on images, though, mostly images of written characters. In one type of experiment, we want to reconstruct data that has been corrupted. In these tasks, we observed the feedback influence phenomenon described above by comparing the results obtained with and without relaxation. We also note some numerical and visual improvement in reconstruction performance when we add the upper layers’ influence. In another type of task, classification, little gain was observed. Still, in one setting where we tried to classify noisy data with a representation trained without prior class information, relaxation did seem to improve results significantly.
The convolutional architecture, a bit riskier at first, was shown to produce numerical and visual reconstruction results close to those obtained with the fully connected version, even though its connectivity is much more constrained.
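The relaxation mechanism described above can be caricatured as a damped fixed-point iteration in which a unit balances bottom-up evidence against a top-down prediction (a toy sketch; the update rule and parameters are illustrative assumptions, not the thesis model):

```python
def relax(bottom_up, top_down_weight, steps=20, damping=0.5):
    """Toy relaxation: a hidden value h settles between a fixed bottom-up
    drive and a top-down prediction (top_down_weight * h)."""
    h = bottom_up
    for _ in range(steps):
        target = 0.5 * (bottom_up + top_down_weight * h)  # combined influence
        h = (1 - damping) * h + damping * target          # damped update
    return h

h_final = relax(bottom_up=1.0, top_down_weight=0.5)  # settles near 2/3
```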
49

Straková, Jana. "Rozpoznávání pojmenovaných entit pomocí neuronových sítí". Doctoral thesis, 2017. http://www.nusl.cz/ntk/nusl-368176.

Full text
Abstract:
Title: Neural Network Based Named Entity Recognition Author: Jana Straková Institute: Institute of Formal and Applied Linguistics Supervisor of the doctoral thesis: prof. RNDr. Jan Hajič, Dr., Institute of Formal and Applied Linguistics Abstract: Czech named entity recognition (the task of automatic identification and classification of proper names in text, such as names of people, locations and organizations) has become a well-established field since the publication of the Czech Named Entity Corpus (CNEC). This doctoral thesis presents the author's research on named entity recognition, mainly in the Czech language. It presents work and research carried out during CNEC publication and its evaluation. It further encompasses the author's research results, which have improved the Czech state of the art in named entity recognition in recent years, with special focus on artificial neural network based solutions. Starting with a simple feed-forward neural network with a softmax output layer and a standard set of classification features for the task, the thesis presents methodology and results which were later used in the open-source software solution for named entity recognition, NameTag. The thesis finalizes with a recurrent neural network based recognizer with word embeddings and character-level word embeddings,...
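The softmax output layer that the thesis's baseline classifier starts from can be sketched as follows (toy class scores for a single token; this is not the NameTag implementation):

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)  # subtract the max to avoid overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# toy unnormalized scores for entity classes of one token
classes = ["PER", "LOC", "ORG", "O"]
probs = softmax([2.0, 0.5, 0.1, -1.0])
predicted = classes[probs.index(max(probs))]
```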
50

Dutil, Francis. "Prédiction et génération de données structurées à l'aide de réseaux de neurones et de décisions discrètes". Thèse, 2018. http://hdl.handle.net/1866/22124.

Full text