Dissertations / Theses on the topic 'Restricted Boltzmann Machine (RBM)'

Consult the top 41 dissertations / theses for your research on the topic 'Restricted Boltzmann Machine (RBM).'

1

Bertholds, Alexander, and Emil Larsson. "An intelligent search for feature interactions using Restricted Boltzmann Machines." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-202208.

Abstract:
Klarna uses a logistic regression to estimate the probability that an e-store customer will default on the credit it has been given. The logistic regression is a linear statistical model which cannot detect non-linearities in the data. The aim of this project has been to develop a program which can be used to find suitable non-linear interaction variables. This can be achieved using a Restricted Boltzmann Machine, an unsupervised neural network whose hidden nodes can be used to model the distribution of the data. By using the hidden nodes as new variables in the logistic regression it is possible to see which nodes have the greatest impact on the probability-of-default estimates. The contents of the hidden nodes, corresponding to different parts of the data distribution, can be used to find suitable interaction variables which allow the modelling of non-linearities. It was possible to find the data distribution using the Restricted Boltzmann Machine, and adding its hidden nodes to the logistic regression improved the model's ability to predict the probability of default. The hidden nodes could be used to create interaction variables which improve Klarna's internal models used for credit risk estimates.
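As an aside for readers who want to experiment with the idea in this abstract (RBM hidden units appended as extra variables for a logistic regression), a minimal scikit-learn sketch is given below; the data, layer size and hyperparameters are illustrative assumptions, not the settings used in the thesis.

    # Sketch: use RBM hidden units as extra (non-linear) features for logistic regression.
    # Assumes 0/1-scaled input features; all sizes and hyperparameters are illustrative.
    import numpy as np
    from sklearn.neural_network import BernoulliRBM
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(1000, 30)).astype(float)   # stand-in for preprocessed credit features
    y = rng.integers(0, 2, size=1000)                        # stand-in for default / no-default labels

    rbm = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=30, random_state=0)
    H = rbm.fit_transform(X)                                 # hidden-unit activation probabilities

    # Augment the original variables with the hidden units and fit the logistic regression.
    X_aug = np.hstack([X, H])
    clf = LogisticRegression(max_iter=1000).fit(X_aug, y)

    # Hidden units with large absolute coefficients point at parts of the data
    # distribution that may be worth turning into explicit interaction variables.
    print(clf.coef_[0, X.shape[1]:])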
2

Moody, John Matali. "Process monitoring with restricted Boltzmann machines." Thesis, Stellenbosch : Stellenbosch University, 2014. http://hdl.handle.net/10019.1/86467.

Abstract:
Thesis (MScEng)--Stellenbosch University, 2014.
Process monitoring and fault diagnosis are used to detect abnormal events in processes. The early detection of such events or faults is crucial to continuous process improvement. Although principal component analysis and partial least squares are widely used for process monitoring and fault diagnosis in the metallurgical industries, these models are linear in principle; nonlinear approaches should provide more compact and informative models. The use of auto-associative neural networks, or autoencoders, provides a principled approach for process monitoring. However, until very recently, these multiple-layer neural networks have been difficult to train and have therefore not been used to any significant extent in process monitoring. With newly proposed algorithms based on the pre-training of the layers of the neural networks, it is now possible to train neural networks with very complex structures, i.e. deep neural networks. These neural networks can be used as autoencoders to extract features from high-dimensional data. In this study, the application of deep autoencoders in the form of restricted Boltzmann machines (RBMs) to the extraction of features from process data is considered. These networks have mostly been used for data visualization to date and have not been applied in the context of fault diagnosis or process monitoring as yet. The objective of this investigation is therefore to assess the feasibility of using restricted Boltzmann machines in various fault detection schemes. The use of RBMs in process monitoring schemes will be discussed, together with the application of these models in automated control frameworks.
3

McCoppin, Ryan R. "An Evolutionary Approximation to Contrastive Divergence in Convolutional Restricted Boltzmann Machines." Wright State University / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=wright1418750414.

4

Vrábel, Jakub. "Popis Restricted Boltzmann machine metody ve vztahu se statistickou fyzikou a jeho následné využití ve zpracování spektroskopických dat." Master's thesis, Vysoké učení technické v Brně. Fakulta strojního inženýrství, 2019. http://www.nusl.cz/ntk/nusl-402522.

Abstract:
The thesis deals with the connections between statistical physics and machine learning, with emphasis on the basic principles and their consequences. It further addresses the general properties of spectroscopic data and how to account for them in advanced data processing. The beginning of the thesis is devoted to deriving the partition function of a statistical system and to studying the Ising model using the mean-field approach. Subsequently, alongside a basic introduction to machine learning, the equivalence between the Ising model and the Hopfield network, a machine learning model, is shown. At the end of the theoretical part, the Restricted Boltzmann Machine (RBM) model is derived from the Hopfield network. The suitability of RBMs for processing spectroscopic data is discussed and demonstrated on dimensionality reduction of such data. The results are compared with the commonly used Principal Component Analysis (PCA), together with an assessment of the approach and possibilities for further improvement.
5

Svoboda, Jiří. "Multi-modální "Restricted Boltzmann Machines"." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2013. http://www.nusl.cz/ntk/nusl-236426.

Abstract:
This thesis explores how multi-modal Restricted Boltzmann Machines (RBMs) can be used in content-based image tagging. The work also contains a brief analysis of the modalities that can be used for multi-modal classification, and describes various RBM variants suited to different kinds of input data. The design and implementation of a multi-modal RBM are described, together with the results of preliminary experiments.
6

Fredriksson, Gustav, and Anton Hellström. "Restricted Boltzmann Machine as Recommendation Model for Venture Capital." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-252703.

Abstract:
In this thesis, we introduce restricted Boltzmann machines (RBMs) as a recommendation model in the context of venture capital. A network of connections is used as a proxy for investors' preferences for companies. The main focus of the thesis is to investigate how RBMs can be implemented on a network of connections and whether conditional information can be used to boost RBMs. The network of connections is created by using board composition data of Swedish companies. For the network, RBMs are implemented with and without companies' place of origin as conditional data. The RBMs are evaluated by their learning abilities and their ability to recreate withheld connections. The findings show that RBMs perform poorly when used to recreate withheld connections but can be tuned to acquire good learning abilities. Adding place of origin as conditional information improves the model significantly and shows potential as a recommendation model, both with respect to learning abilities and the ability to recreate withheld connections.
7

Juel, Bjørn Erik. "Investigating the Consistency and Convexity of Restricted Boltzmann Machine Learning." Thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for nevromedisin, 2013. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-25696.

Abstract:
In this thesis we assess the consistency and convexity of the parameter inference in Boltzmann machine learning algorithms based on gradient ascent on the likelihood surface. We do this by first developing standard tools for generating equilibrium data drawn from a Boltzmann distribution, as well as analytically exact algorithms for inferring the parameters of restricted and semi-restricted Boltzmann machine architectures. After testing, and showing, the functionality of our algorithms, we assess how different network properties affect the inference quality of restricted Boltzmann machines. Subsequently, we look closer at the likelihood function itself, in an attempt to uncover more rigid details about its curvature and the nature of its convexity. As we present the results of our investigation, we discuss the findings, before suggesting possible future directions to take, improvements to make and aspects to investigate further. We conclude that the standard, analytically exact restricted Boltzmann machine algorithm is convex up to certain permutations of the parameters, when initialized within reasonable ranges of parameter values, and given that the strength of connectivity in the underlying model is within a specified range. Additionally, for such strengths of connectivity, the distribution of Hessian eigenvalues of the likelihood function, as a function of the distance to a peak, may be stable both within and across network sizes.
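For reference, the objective whose consistency and convexity are being studied here is the RBM log-likelihood, maximized by gradient ascent; in standard notation (not quoted from the thesis) the per-example log-likelihood and its weight gradient are

    \[
    \log p(\mathbf{v};\theta) \;=\; \log \sum_{\mathbf{h}} e^{-E(\mathbf{v},\mathbf{h};\theta)} \;-\; \log \sum_{\mathbf{v}',\mathbf{h}} e^{-E(\mathbf{v}',\mathbf{h};\theta)},
    \qquad
    \frac{\partial \log p(\mathbf{v};\theta)}{\partial w_{ij}} \;=\; \langle v_i h_j \rangle_{\mathrm{data}} - \langle v_i h_j \rangle_{\mathrm{model}} .
    \]

Gradient ascent on this surface is exact only when both expectations can be computed, which is what the "analytically exact" algorithms mentioned above do for small networks.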
8

Tubiana, Jérôme. "Restricted Boltzmann machines : from compositional representations to protein sequence analysis." Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLEE039/document.

Abstract:
Restricted Boltzmann machines (RBMs) are graphical models that jointly learn a probability distribution and a representation of data. Despite their simple architecture, they can learn complex data distributions very well, such as the handwritten digit database MNIST. Moreover, they are empirically known to learn compositional representations of data, i.e. representations that effectively decompose configurations into their constitutive parts. However, not all variants of RBM perform equally well, and few theoretical arguments exist for these empirical observations. In the first part of this thesis, we ask how such a simple model can learn such complex probability distributions and representations. By analyzing an ensemble of RBMs with random weights using the replica method, we have characterised a compositional regime for RBMs, and shown under which conditions (statistics of the weights, choice of transfer function) it can and cannot arise. Both the qualitative and quantitative predictions obtained with our theoretical analysis are in agreement with observations from RBMs trained on real data. In a second part, we present an application of RBMs to protein sequence analysis and design. Owing to their large size, it is very difficult to run physical simulations of proteins, and hence to predict their structure and function. It is however possible to infer information about a protein's structure from the way its sequence varies across organisms. For instance, Boltzmann machines can leverage correlations of mutations to predict spatial proximity of the sequence's amino acids. Here, we have shown on several synthetic and real protein families that, provided a compositional regime is enforced, RBMs can go beyond structure and extract extended motifs of coevolving amino acids that reflect phylogenetic, structural and functional constraints within proteins. Moreover, RBMs can be used to design new protein sequences with putative functional properties by recombining these motifs at will. Lastly, we have designed new training algorithms and model parametrizations that significantly improve RBM generative performance, to the point where they can compete with state-of-the-art generative models such as Generative Adversarial Networks or Variational Autoencoders on medium-scale data.
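For context, the generalized RBM energy underlying this kind of analysis is often written with an arbitrary hidden-unit potential, which is where the "choice of transfer function" mentioned above enters; the notation below is a common convention, not a quotation from the thesis:

    \[
    E(\mathbf{v},\mathbf{h}) \;=\; -\sum_{i} g_i(v_i) \;-\; \sum_{\mu} \mathcal{U}_\mu(h_\mu) \;-\; \sum_{i,\mu} w_{i\mu}\, v_i h_\mu ,
    \qquad
    p(\mathbf{v},\mathbf{h}) \;=\; \frac{e^{-E(\mathbf{v},\mathbf{h})}}{Z} .
    \]

Different choices of the hidden potential (binary, quadratic, ReLU-like) correspond to different hidden-unit transfer functions, and the statistics of the weights then determine whether a compositional regime can emerge.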
9

Spiliopoulou, Athina. "Probabilistic models for melodic sequences." Thesis, University of Edinburgh, 2013. http://hdl.handle.net/1842/8876.

Abstract:
Structure is one of the fundamentals of music, yet the complexity arising from the vast number of possible variations of musical elements such as rhythm, melody, harmony, key, texture and form, along with their combinations, makes music modelling a particularly challenging task for machine learning. The research presented in this thesis focuses on the problem of learning a generative model for melody directly from musical sequences belonging to the same genre. Our goal is to develop probabilistic models that can automatically capture the complex statistical dependencies evident in music without the need to incorporate significant domain-specific knowledge. At all stages we avoid making assumptions explicit to music and consider models that can be readily applied in different music genres and can easily be adapted for other sequential data domains. We develop the Dirichlet Variable-Length Markov Model (Dirichlet-VMM), a Bayesian formulation of the Variable-Length Markov Model (VMM), where smoothing is performed in a systematic probabilistic manner. The model is a general-purpose, dictionary-based predictor with a formal smoothing technique and is shown to perform significantly better than the standard VMM in melody modelling. Motivated by the ability of the Restricted Boltzmann Machine (RBM) to extract high-quality latent features in an unsupervised manner, we next develop the Time-Convolutional Restricted Boltzmann Machine (TC-RBM), a novel adaptation of the Convolutional RBM for modelling sequential data. We show that the TC-RBM learns descriptive musical features such as chords, octaves and typical melody movement patterns. To deal with the non-stationarity of music, we develop the Variable-gram Topic model, which employs the Dirichlet-VMM for the parametrisation of the topic distributions. The Dirichlet-VMM models the local temporal structure, while the latent topics represent different music regimes. The model does not make any assumptions explicit to music, but it is particularly suitable in this context, as it couples the latent topic formalism with an expressive model of contextual information.
10

de, Giorgio Andrea. "A study on the similarities of Deep Belief Networks and Stacked Autoencoders." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-174341.

Abstract:
Restricted Boltzmann Machines (RBMs) and autoencoders have been used, in several variants, for similar tasks, such as reducing dimensionality or extracting features from signals. Even though their structures are quite similar, they rely on different training theories. Lately, they have been largely used as building blocks in deep learning architectures called deep belief networks (rather than stacked RBMs) and stacked autoencoders. In light of this, this thesis aims to understand the extent of the similarities and the overall pros and cons of using either RBMs, autoencoders or denoising autoencoders in deep networks. Important characteristics are tested, such as robustness to noise, the influence of data availability on training, and the tendency to overtrain. Part of the thesis is then dedicated to studying how the three deep networks under examination form their deep internal representations and how similar these can be to each other. As a result, a novel approach for the evaluation of internal representations, named F-Mapping, is presented. Results are reported and discussed.
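As an illustration of the two constructions being compared, the sketch below greedily stacks RBMs and, separately, trains an autoencoder-style network on the same data with scikit-learn; the data, sizes and hyperparameters are illustrative assumptions and this is not the experimental setup of the thesis.

    # Sketch: greedy layer-wise RBM stacking vs. a stacked-autoencoder-style bottleneck MLP.
    # Purely illustrative; sizes, data and hyperparameters are assumptions.
    import numpy as np
    from sklearn.neural_network import BernoulliRBM, MLPRegressor

    rng = np.random.default_rng(1)
    X = rng.random((500, 64))          # stand-in for input signals scaled to [0, 1]

    # "Deep belief network" style: stack RBMs, each trained on the previous layer's output.
    layer_sizes = [32, 16]
    H, rbms = X, []
    for n in layer_sizes:
        rbm = BernoulliRBM(n_components=n, learning_rate=0.05, n_iter=20, random_state=0)
        H = rbm.fit_transform(H)       # greedy, unsupervised, one layer at a time
        rbms.append(rbm)

    # "Stacked autoencoder" style: train a bottleneck network to reconstruct its input.
    ae = MLPRegressor(hidden_layer_sizes=(32, 16, 32), max_iter=500, random_state=0)
    ae.fit(X, X)                       # reconstruction objective instead of contrastive divergence

    print(H.shape, ae.loss_)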
11

Dahlin, Fredrik. "Investigating user behavior by analysis of gaze data : Evaluation of machine learning methods for user behavior analysis in web applications." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-190906.

Abstract:
User behavior analysis in web applications is currently performed mainly by analysing statistical measurements based on user interactions, or by creating personas to better understand users. Both of these methods give great insight into how users utilize a web site, but do not give any additional information about what they are actually doing. This thesis attempts to use eye tracking data for the analysis of user activities in web applications. Eye tracking data has been recorded, labeled and analyzed for 25 test participants. No data source except eye tracking data has been used, and two different approaches are attempted: the first relies on a gaze-map representation of the data, and the second relies on sequences of features. The results indicate that it is possible to distinguish user activities in web applications, but only with a high error rate. Improvements are possible by implementing a less subjective labeling process and by including features from other data sources.
12

Nair, Binu Muraleedharan. "Learning Latent Temporal Manifolds for Recognition and Prediction of Multiple Actions in Streaming Videos using Deep Networks." University of Dayton / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1429532297.

13

Jin, Wenjing. "Modeling of Machine Life Using Accelerated Prognostics and Health Management (APHM) and Enhanced Deep Learning Methodology." University of Cincinnati / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1479821186023747.

14

Dupuy, Nathalie. "Neurocomputational model for learning, memory consolidation and schemas." Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/33144.

Abstract:
This thesis investigates how through experience the brain acquires and stores memories, and uses these to extract and modify knowledge. This question is being studied by both computational and experimental neuroscientists as it is of relevance for neuroscience, but also for artificial systems that need to develop knowledge about the world from limited, sequential data. It is widely assumed that new memories are initially stored in the hippocampus, and later are slowly reorganised into distributed cortical networks that represent knowledge. This memory reorganisation is called systems consolidation. In recent years, experimental studies have revealed complex hippocampal-neocortical interactions that have blurred the lines between the two memory systems, challenging the traditional understanding of memory processes. In particular, the prior existence of cortical knowledge frameworks (also known as schemas) was found to speed up learning and consolidation, which seemingly is at odds with previous models of systems consolidation. However, the underlying mechanisms of this effect are not known. In this work, we present a computational framework to explore potential interactions between the hippocampus, the prefrontal cortex, and associative cortical areas during learning as well as during sleep. To model the associative cortical areas, where the memories are gradually consolidated, we have implemented an artificial neural network (Restricted Boltzmann Machine) so as to get insight into potential neural mechanisms of memory acquisition, recall, and consolidation. We analyse the network's properties using two tasks inspired by neuroscience experiments. The network gradually built a semantic schema in the associative cortical areas through the consolidation of multiple related memories, a process promoted by hippocampal-driven replay during sleep. To explain the experimental data we suggest that, as the neocortical schema develops, the prefrontal cortex extracts characteristics shared across multiple memories. We call this information meta-schema. In our model, the semantic schema and meta-schema in the neocortex are used to compute consistency, conflict and novelty signals. We propose that the prefrontal cortex uses these signals to modulate memory formation in the hippocampus during learning, which in turn influences consolidation during sleep replay. Together, these results provide theoretical framework to explain experimental findings and produce predictions for hippocampal-neocortical interactions during learning and systems consolidation.
15

Côté, Marc-Alexandre. "Réseaux de neurones génératifs avec structure." Thèse, Université de Sherbrooke, 2017. http://hdl.handle.net/11143/10489.

Abstract:
This thesis deals with generative models in machine learning. Two new models based on neural networks are proposed. The first model has an internal representation on which a certain structure has been imposed in order to organise the learned features. The second model exploits the topological structure of the observed data and takes it into account during the generative phase. The thesis also presents one of the first applications of machine learning to the problem of brain tractography: a recurrent neural network is applied to diffusion data in order to obtain a representation of white-matter fibres as sequences of points in three dimensions.
16

Schneider, C. "Using unsupervised machine learning for fault identification in virtual machines." Thesis, University of St Andrews, 2015. http://hdl.handle.net/10023/7327.

Abstract:
Self-healing systems promise operating cost reductions in large-scale computing environments through the automated detection of, and recovery from, faults. However, at present there appears to be little empirical evidence comparing the different approaches, or demonstrating that such implementations reduce costs. This thesis compares previous and current self-healing approaches before demonstrating a new, unsupervised approach that combines artificial neural networks with performance tests to perform fault identification in an automated fashion, i.e. the correct and accurate determination of which computer features are associated with a given performance test failure. Several key contributions are made in the course of this research, including an analysis of the different types of self-healing approaches based on their contextual use, a baseline for future comparisons between self-healing frameworks that use artificial neural networks, and a successful, automated fault identification in cloud infrastructure, more specifically in virtual machines. This approach uses three established machine learning techniques: Naïve Bayes, Baum-Welch, and Contrastive Divergence learning. The latter minimises human interaction beyond previous implementations by producing a list, in decreasing order of likelihood, of potential root causes (i.e. fault hypotheses), which brings the state of the art one step closer toward fully self-healing systems. This thesis also examines the impact that different types of faults have on their respective identification. This helps in understanding the validity of the data being presented and how the field is progressing, whilst examining the differences in impact on identification between emulated thread crashes and errant user changes, a contribution believed to be unique to this research. Lastly, future research avenues and conclusions in automated fault identification are described along with lessons learned throughout this endeavor. This includes the progression of artificial neural networks, how learning algorithms are being developed and understood, and possibilities for automatically generating feature locality data.
17

Pasa, Luca. "Linear Models and Deep Learning: Learning in Sequential Domains." Doctoral thesis, Università degli studi di Padova, 2017. http://hdl.handle.net/11577/3425865.

Abstract:
With the diffusion of cheap sensors, sensor-equipped devices (e.g., drones), and sensor networks (such as the Internet of Things), as well as the development of inexpensive human-machine interaction interfaces, the ability to quickly and effectively process sequential data is becoming more and more important. There are many tasks that may benefit from advancement in this field, ranging from monitoring and classification of human behavior to prediction of future events. Most of the above tasks require pattern recognition and machine learning capabilities. Many approaches have been proposed in the past to learn in sequential domains, especially extensions in the field of Deep Learning. Deep Learning is based on highly nonlinear systems, which very often reach quite good classification/prediction performance, but at the expense of a substantial computational burden. When facing learning in a sequential, or more generally structured, domain, it is common practice to readily resort to nonlinear systems. However, the task does not always require a nonlinear system. The risk is then to run into difficult and computationally expensive training procedures only to obtain a solution that improves by an epsilon (if at all) on the performance that can be reached by a simple linear dynamical system, with its simpler training procedures and much lower computational effort. The aim of this thesis is to discuss the role that linear dynamical systems may have in learning in sequential domains. On the one hand, we point out that a linear dynamical system (LDS) is able, in many cases, to provide good performance at a relatively low computational cost. On the other hand, when a linear dynamical system is not enough to provide a reasonable solution, we show that it can be used as a building block to construct more complex and powerful models, or to design quite effective pre-training techniques for nonlinear dynamical systems, such as Echo State Networks (ESNs) and simple Recurrent Neural Networks (RNNs). Specifically, in this thesis we consider the task of predicting the next event in a sequence of events. The datasets used to test the various discussed models involve polyphonic music and contain quite long sequences. We start by introducing a simple state-space LDS. Three different approaches to train the LDS are then considered. We then introduce new models, inspired by the LDS, that aim to increase the prediction/classification capabilities of the simple linear models. We then move to study the most common nonlinear models, in particular RNN models, which are significantly more computationally demanding. We experimentally show that, at least for the addressed prediction task and the considered datasets, the introduction of pre-training approaches involving linear systems leads to quite large improvements in prediction performance. Specifically, we introduce pre-training via a linear autoencoder, and an alternative based on Hidden Markov Models (HMMs). Experimental results suggest that linear models may play an important role for learning in sequential domains, both when used directly and indirectly (as the basis for pre-training approaches): in fact, when used directly, linear models may by themselves return state-of-the-art performance, while requiring a much lower computational effort with respect to their nonlinear counterparts.
Moreover, even when linear models do not perform well, it is always possible to successfully exploit them within pre-training approaches for nonlinear systems.
18

Habrnál, Matěj. "Hluboké neuronové sítě." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2014. http://www.nusl.cz/ntk/nusl-236132.

Abstract:
The thesis addresses the topic of Deep Neural Networks, in particular the Deep Learning methods used to initialize the weights and the learning process itself within Deep Neural Networks. The focus is also put on the basic theory of classical Neural Networks, which is important for a comprehensive understanding of the issue. The aim of this work is to determine the optimal set of optional parameters of the algorithms on image recognition tasks of various complexity levels, through experimentation with a created application applying Deep Neural Networks. Furthermore, an evaluation and analysis of the results, and the lessons learned from the experimentation with classical and Deep Neural Networks, are integrated into the thesis.
19

da, Costa Joel. "Online Non-linear Prediction of Financial Time Series Patterns." Master's thesis, Faculty of Science, 2020. http://hdl.handle.net/11427/32221.

Abstract:
We consider a mechanistic non-linear machine learning approach to learning signals in financial time series data. A modularised and decoupled algorithm framework is established and is proven on daily sampled closing time-series data for JSE equity markets. The input patterns are based on input data vectors of data windows preprocessed into a sequence of daily, weekly and monthly or quarterly sampled feature measurement changes (log feature fluctuations). The data processing is split into a batch-processed step, where features are learnt using a Stacked AutoEncoder (SAE) via unsupervised learning, and then both batch and online supervised learning are carried out on Feedforward Neural Networks (FNNs) using these features. The FNN output is a point prediction of measured time-series feature fluctuations (log-differenced data) in the future (ex-post). Weight initializations for these networks are implemented with restricted Boltzmann machine pretraining, and variance-based initializations. The validity of the FNN backtest results is shown under a rigorous assessment of backtest overfitting using both Combinatorially Symmetrical Cross Validation and Probabilistic and Deflated Sharpe Ratios. Results are further used to develop a view on the phenomenology of financial markets and the value of complex historical data under unstable dynamics.
20

Yogeswaran, Arjun. "Self-Organizing Neural Visual Models to Learn Feature Detectors and Motion Tracking Behaviour by Exposure to Real-World Data." Thesis, Université d'Ottawa / University of Ottawa, 2018. http://hdl.handle.net/10393/37096.

Abstract:
Advances in unsupervised learning and deep neural networks have led to increased performance in a number of domains, and to the ability to draw strong comparisons between the biological method of self-organization conducted by the brain and computational mechanisms. This thesis aims to use real-world data to tackle two areas in the domain of computer vision which have biological equivalents: feature detection and motion tracking. The aforementioned advances have allowed efficient learning of feature representations directly from large sets of unlabeled data instead of using traditional handcrafted features. The first part of this thesis evaluates such representations by comparing regularization and preprocessing methods which incorporate local neighbouring information during training on a single-layer neural network. The networks are trained and tested on the Hollywood2 video dataset, as well as the static CIFAR-10, STL-10, COIL-100, and MNIST image datasets. The induction of topography or simple image blurring via Gaussian filters during training produces better discriminative features as evidenced by the consistent and notable increase in classification results that they produce. In the visual domain, invariant features are desirable such that objects can be classified despite transformations. It is found that most of the compared methods produce more invariant features, however, classification accuracy does not correlate to invariance. The second, and paramount, contribution of this thesis is a biologically-inspired model to explain the emergence of motion tracking behaviour in early development using unsupervised learning. The model’s self-organization is biased by an original concept called retinal constancy, which measures how similar visual contents are between successive frames. In the proposed two-layer deep network, when exposed to real-world video, the first layer learns to encode visual motion, and the second layer learns to relate that motion to gaze movements, which it perceives and creates through bi-directional nodes. This is unique because it uses general machine learning algorithms, and their inherent generative properties, to learn from real-world data. It also implements a biological theory and learns in a fully unsupervised manner. An analysis of its parameters and limitations is conducted, and its tracking performance is evaluated. Results show that this model is able to successfully follow targets in real-world video, despite being trained without supervision on real-world video.
21

Hubený, Marek. "Koncepty strojového učení pro kategorizaci objektů v obrazu." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2017. http://www.nusl.cz/ntk/nusl-316388.

Abstract:
This work is focused on object and scene recognition using machine learning and computer vision tools. Before solving this problem, the basic phases of the machine learning concept and statistical models were studied, with emphasis on their division into discriminative and generative methods. Further, the Bag-of-Words method and its modifications were investigated and described. In the practical part of this work, an implementation of the Bag-of-Words method with an SVM classifier was created in the Matlab environment, and the model was tested on various sets of publicly available images.
22

Tsai, Chang-Hung, and 蔡長宏. "Restricted Boltzmann Machine (RBM) Processor Design for Neural Network and Machine Learning Applications." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/55826222299703019418.

Abstract:
Doctoral dissertation, National Chiao Tung University, Institute of Electronics, academic year 105 (ROC calendar).
Recently, machine learning techniques have been widely applied in signal processing systems to support intelligent capabilities, such as AdaBoost, K-NN, mean-shift, and SVM for data classification, and HOG and SIFT for feature extraction in multimedia applications. In the past decades, neural network (NN) algorithms have been considered among the state-of-the-art solutions in many applications, with both feature extraction and data classification integrated and cascaded in neural networks. In the big data era, huge datasets allow neural network learning algorithms to train powerful and accurate models for machine learning applications. As network structures become deeper and deeper to achieve higher accuracy, the traditional neural network learning algorithm with feedforward and error-backpropagation passes becomes inefficient for training multi-layer neural networks. Moreover, data labeling is very expensive, especially for big datasets, and how to initialize a neural network without any domain knowledge is also a crucial issue for model training. In this dissertation, a restricted Boltzmann machine (RBM) processor is designed and implemented. In the proposed RBM processor, 32 RBM cores are integrated for parallel computing, supporting a neural network structure of up to 4k neurons per layer and 128 candidates per sample for inference. In the learning mode, batch-level parallelism is achieved for RBM model training with supervised and unsupervised learning; in the inference mode, sample-level parallelism is achieved for data classification. Moreover, several features are proposed and implemented in the RBM processor to save computation time, hardware cost, external memory bandwidth, and power consumption. To realize the proposed RBM processor, two implementations are designed in this dissertation. Implemented in a Xilinx Virtex-7 FPGA, the proposed RBM processor operates at 125 MHz and occupies 114.0k LUTs, 107.1k flip-flops, and 80 block memory blocks. Implemented in UMC 65nm LL RVT CMOS technology, the proposed RBM processor chip costs 2.2M gates and 128kB internal SRAM with 8.8 mm2 area to integrate 32 RBM cores in 2 clusters, and the maximal operating frequency of the chip reaches 210 MHz in both learning and inference modes at a 1.2V supply voltage. According to the measurement results, the FPGA-based system prototype achieves 4.60G neuron weights/s (NWPS) learning performance and 3.87G NWPS inference performance for RBM model training and data classification, respectively. The RBM processor chip operated at 210 MHz achieves 4.61G NWPS and 3.86G NWPS with 69.50 pJ/NW and 81.20 pJ/NW energy efficiency in the learning and inference modes, respectively. Compared to software solutions implemented on CPUs and powerful multi-core processors, the proposed RBM processor achieves faster processing time and higher energy efficiency in both RBM model learning and data inference. Since battery life is a crucial issue in IoT and handheld devices, our proposal offers an energy-efficient way to integrate the proposed RBM processor chip into emerging energy-constrained devices to support intelligent capabilities with learning and inference for in-time model training and real-time decision making.
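The per-batch arithmetic that an RBM accelerator of this kind typically parallelizes is a contrastive-divergence-style update; a plain NumPy sketch of a generic CD-1 step is given below to make that workload explicit (all sizes and constants are illustrative assumptions, not a description of the chip's actual datapath).

    # Sketch: one CD-1 update for a binary-binary RBM, the per-batch workload an RBM
    # accelerator parallelizes. All sizes and constants are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    n_vis, n_hid, batch = 784, 500, 64
    W = 0.01 * rng.standard_normal((n_vis, n_hid))
    b, c = np.zeros(n_vis), np.zeros(n_hid)
    v0 = (rng.random((batch, n_vis)) < 0.5).astype(float)    # stand-in for a training batch

    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

    ph0 = sigmoid(v0 @ W + c)                                # positive phase
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ W.T + b)                              # one Gibbs step back to the visibles
    ph1 = sigmoid(pv1 @ W + c)                               # negative phase

    lr = 0.05
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / batch             # weight update
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)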
23

Pandey, Gaurav. "Deep Learning with Minimal Supervision." Thesis, 2017. http://etd.iisc.ac.in/handle/2005/4315.

Abstract:
In recent years, deep neural networks have achieved extraordinary performance on supervised learning tasks. Convolutional neural networks (CNNs) have vastly improved the state of the art for most computer vision tasks, including object recognition and segmentation. However, their success relies on the presence of a large amount of labeled data. In contrast, relatively little work has been done in deep learning to handle scenarios where access to ground truth is limited, partial or completely absent. In this thesis, we propose models to handle challenging problems with limited labeled information. Our first contribution is a neural architecture that allows for the extraction of infinitely many features from an object while allowing for tractable inference. This is achieved by using the 'kernel trick', that is, we express the inner product in the infinite-dimensional feature space as a kernel. The kernel can either be computed exactly for single-layer feedforward networks, or approximated by an iterative algorithm for deep convolutional networks. The corresponding models are referred to as stretched deep networks (SDNs). We show that when the amount of training data is limited, SDNs with random weights drastically outperform fully supervised CNNs with similar architectures. While SDNs perform reasonably well for classification with limited labeled data, they cannot utilize unlabeled data, which is often much easier to obtain. A common approach to utilizing unlabeled data is to couple the classifier with an autoencoder (or its variants), thereby minimizing reconstruction error in addition to the classification error. We discuss the limitations of decoder-based architectures and propose a model that allows for the utilization of unlabeled data without the need for a decoder. This is achieved by jointly modeling the distribution of data and latent features in a manner that explicitly assigns zero probability to unobserved data. The joint probability of the data and the latent features is maximized using a two-step EM-like procedure. Depending on the task, we allow the latent features to be one-hot or real-valued vectors and define a suitable prior on the features. For instance, one-hot features correspond to class labels and are directly used for the unsupervised and semi-supervised classification tasks. For real-valued features, we use hierarchical Bayesian models as priors over the latent features. Hence, the proposed model, which we refer to as the discriminative encoder (or DisCoder), is flexible in the type of latent features that it can capture. The proposed model achieves state-of-the-art performance on several challenging datasets. Having addressed the problem of utilizing unlabeled data for classification, we move to a domain where obtaining labels is far more expensive, namely semantic segmentation of images. Explicitly labeling each pixel of an image with the object that the pixel belongs to is an expensive operation, in terms of time as well as effort. Currently, only a few classes of images have been densely (pixel-level) labeled. Even among these classes, only a few images per class have pixel-level supervision. Models that rely on densely labeled images cannot utilize the much larger set of weakly annotated images available on the web. Moreover, these models cannot learn the segmentation masks for new classes, where there is no densely labeled data. Hence, we propose a model for utilizing weakly labeled data for semantic segmentation of images.
This is achieved by generating fake labels for each image, while simultaneously forcing the output of the CNN to satisfy the mean-field constraints imposed by a conditional random field. We show that one can enforce the CRF constraints by forcing the distribution at each pixel to be close to the distribution of its neighbors. The proposed model is very fast to train and achieves state-of-the-art performance on the popular VOC-2012 dataset for the task of weakly supervised semantic segmentation of images.
24

Anderson, David John. "Automatic speech feature extraction using a convolutional restricted boltzmann machine." Thesis, 2017. https://hdl.handle.net/10539/26165.

Abstract:
A dissertation submitted to the Faculty of Science, University of the Witwatersrand, in fulfillment of the requirements for the degree of Master of Science 2017
Restricted Boltzmann Machines (RBMs) are a statistical learning concept that can be interpreted as Artificial Neural Networks. They are capable of learning, in an unsupervised fashion, a set of features with which to describe a data set. Connected in series, RBMs form a model called a Deep Belief Network (DBN), learning abstract feature combinations from lower layers. Convolutional RBMs (CRBMs) are a variation on the RBM architecture in which the learned features are kernels that are convolved across spatial portions of the input data to generate feature maps identifying whether a feature is detected in a portion of the input data. Features extracted from speech audio data by a trained CRBM have recently been shown to compete with the state of the art for a number of speaker identification tasks. This project implements a similar CRBM architecture in order to verify previous work, as well as to gain insight into Digital Signal Processing (DSP), Generative Graphical Models, unsupervised pre-training of Artificial Neural Networks, and Machine Learning classification tasks. The CRBM architecture is trained on the TIMIT speech corpus and the learned features are verified by using them to train a linear classifier on tasks such as speaker genetic sex classification and speaker identification. The implementation is quantitatively proven to successfully learn and extract a useful feature representation for the given classification tasks.
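To make the "kernels convolved across portions of the input" idea concrete, a small NumPy/SciPy sketch of computing CRBM-style feature maps from a spectrogram-like input follows; the kernels here are random stand-ins for learned features and all shapes are assumptions, not the configuration used in the dissertation.

    # Sketch: CRBM-style feature maps = sigmoid(input convolved with kernel + bias), one map per kernel.
    # Kernels are random stand-ins for learned features; shapes are illustrative.
    import numpy as np
    from scipy.signal import convolve2d

    rng = np.random.default_rng(0)
    spectrogram = rng.random((64, 200))            # frequency bins x time frames (assumed input)
    kernels = rng.standard_normal((8, 64, 6))      # 8 kernels spanning all bins, 6 frames wide
    biases = np.zeros(8)

    feature_maps = [
        1.0 / (1.0 + np.exp(-(convolve2d(spectrogram, k[::-1, ::-1], mode="valid") + b)))
        for k, b in zip(kernels, biases)
    ]
    print(feature_maps[0].shape)                   # (1, 195): detections of one feature over time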
25

Huang, Chien-Ming, and 黃建銘. "Research in Recognition Method Based on Continuous Restricted Boltzmann Machine." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/97998030304152120598.

Full text
Abstract:
Master's thesis
National Tsing Hua University
Department of Electrical Engineering
102
In recent years, the biomedical applications of electronic nose sensor systems have attracted attention; this thesis focuses on the recognition of pneumonia data from patients. However, the sensitivity of the sensor array is not high enough, so the captured data overlap to some extent. In order to analyze these data further, this thesis proposes several methods to classify them with a probabilistic model, the Continuous Restricted Boltzmann Machine (CRBM). The CRBM is a generative probabilistic model that can cluster and classify, and that can reconstruct the data distribution from training data. Therefore, there are three possible ways to classify pneumonia data with a CRBM. First, as a clusterer, the CRBM can re-project data into a higher-dimensional or lower-dimensional space so that the data can be classified more easily. Second, as a classifier, the CRBM uses an additional neuron as a label to learn the class of the training data. Finally, as a generative model, the CRBM can regenerate the data distribution of the training data according to its energy function, so that the probability density in the space can be estimated; a Bayesian classifier can then classify with that density estimate. In addition, this thesis proposes a setup to test the 3rd CRBM analog chip. Since a training mechanism was not designed into this chip, we use a data acquisition (DAQ) system and an FPGA card to implement the CRBM training algorithm, the so-called chip-in-a-loop training. The performance of this training mechanism is evaluated.
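As a sketch of the continuous stochastic units the abstract refers to, the function below samples one CRBM layer using the commonly cited formulation: a noisy weighted sum passed through a bounded sigmoid whose slope is set by a per-unit noise-control parameter. The asymptote values, noise level and exact functional form are assumptions taken from that general formulation rather than from this thesis.

import numpy as np

def sample_crbm_layer(s_in, W, a, sigma=0.2, lo=-1.0, hi=1.0, rng=np.random):
    """Sample one CRBM layer given the states of the opposite layer.
    s_in : states of the opposite layer, shape (n_in,)
    W    : weights, shape (n_in, n_out)
    a    : per-unit noise-control (gain) parameters, shape (n_out,)
    Each output unit is lo + (hi - lo) * sigmoid(a * (W^T s_in + noise)).
    """
    pre = s_in @ W + sigma * rng.standard_normal(W.shape[1])
    return lo + (hi - lo) / (1.0 + np.exp(-a * pre))

# Toy usage: project 6 sensor readings onto 4 continuous hidden units.
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((6, 4))
h = sample_crbm_layer(rng.random(6), W, a=np.ones(4), rng=rng)
print(h)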
APA, Harvard, Vancouver, ISO, and other styles
26

Teng, Chih-Jung, and 鄧智嶸. "Training Restricted Boltzmann Machine for People Counting with PIR Sensors." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/17252842870795458226.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Upadhya, Vidyadhar. "Efficient Algorithms for Learning Restricted Boltzmann Machines." Thesis, 2020. https://etd.iisc.ac.in/handle/2005/4840.

Full text
Abstract:
Probabilistic generative models learn useful features from unlabeled data which can be used for subsequent problem-specific tasks, such as classification, regression or information retrieval. The RBM is one such important energy-based probabilistic generative model, and RBMs are also the building blocks for several deep generative models. It is difficult to train and evaluate RBMs mainly because the normalizing constant (known as the partition function) for the distribution that they represent is computationally hard to evaluate. Therefore, various approximate methods (based on noisy gradients of the log-likelihood estimated through sampling) are used to train RBMs. Thus, building efficient learning algorithms for the RBM is an important problem. In this thesis, we consider the problem of maximum likelihood learning of RBMs, for both binary-binary (BB) RBMs and Gaussian-binary RBMs. We propose a new algorithm for learning binary-binary RBMs by exploiting the property that the BB-RBM log-likelihood function is a difference of convex functions of its parameters. In standard difference of convex functions programming (DCP), the optimization proceeds by solving a convex optimization problem at each iteration. In the case of the RBM, this convex objective function contains the partition function and hence its gradient computation may be intractable. We propose a stochastic variant of the difference of convex functions optimization algorithm, termed S-DCP, where the convex optimization problem at each iteration is approximately solved through a few iterations of stochastic gradient descent. The resulting algorithm is simple, and the contrastive divergence (CD) algorithm, the current standard algorithm for learning RBMs, can be derived as a special case of the proposed algorithm. Empirical studies show that S-DCP improves the optimization dynamics of learning binary-binary RBMs. We further modify this algorithm to accommodate centered gradients. Through extensive empirical studies on a number of benchmark datasets, we demonstrate the superior performance of the proposed algorithms. It is well documented in the literature that learning Gaussian-binary RBMs is more difficult than learning binary-binary RBMs. We extend the S-DCP algorithm to learn Gaussian-binary RBMs by proving that the Gaussian-binary RBM log-likelihood function is also a difference of convex functions of the weights and hidden biases, under the assumption that the conditional distribution of the visible units has a fixed variance. Through extensive empirical studies on a number of benchmark datasets, we demonstrate that S-DCP learns good models more efficiently than CD and persistent CD, the current standard algorithms for learning Gaussian-binary RBMs. We further modify the S-DCP algorithm to accommodate a variance update (outside the inner loop of the convex optimization) so that the variance parameter of the visible units can be learned as well instead of being kept fixed. We empirically analyse the resulting algorithm and show that it is more efficient than the current algorithms. Second order learning methods provide invariance to re-parameterization of the model and also improve the optimization dynamics of the learning algorithm by providing parameter-specific (adaptive) learning rates. However, the Hessian of the log-likelihood, required for a second order learning algorithm, can only be estimated through sampling, and this noisy estimate makes the optimization algorithm unstable.
Moreover, the computation of the Hessian inverse is expensive. We propose a second order learning algorithm on the convex S-DCP objective function using a diagonal approximation of the Hessian which, we show, can be easily computed from the gradient estimates. To compensate for the noise in the Hessian estimate and to make the algorithm stable, we use an exponential averaging over these estimates. We show empirically that the resulting algorithm, termed S-DCP-D, is computationally cheap, stable and further improves the performance of S-DCP. Our empirical results show that the centered S-DCP as well as the diagonally scaled S-DCP are effective and efficient methods for learning RBMs. In all methods for learning RBMs, the log-likelihood achieved on held-out test samples is used to evaluate the quality of the learnt RBMs and to fix the hyperparameters. However, the presence of the partition function makes estimating the log-likelihood intractable for models with large dimension, so sampling-based methods are currently used to approximate it. We provide an empirical analysis of these sampling-based algorithms for estimating the log-likelihood and suggest some simple techniques to improve these estimates.
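A minimal numpy sketch of the S-DCP idea summarized above for a binary-binary RBM: the positive-phase (data) statistics come from the linearized data term and stay fixed, while the convex sub-problem containing log Z is approximately minimized by a few inner stochastic gradient steps whose log Z gradients are estimated with short Gibbs chains. The step size, the number of inner iterations d, the chain length k and the CD-style chain initialization are illustrative assumptions; with d = 1 the update reduces to ordinary CD, consistent with the abstract.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sdcp_minibatch_update(W, b, c, v_data, d=5, k=1, lr=0.05, rng=np.random):
    """One S-DCP-style update of a binary-binary RBM (sketch).
    W: (n_v, n_h) weights, b: (n_v,) visible biases, c: (n_h,) hidden biases.
    v_data: (batch, n_v) binary mini-batch."""
    n, n_v = v_data.shape
    n_h = W.shape[1]
    # Positive-phase statistics (gradient of the linearized data term); held fixed below.
    ph = sigmoid(v_data @ W + c)
    pos_W, pos_b, pos_c = v_data.T @ ph / n, v_data.mean(0), ph.mean(0)
    v = v_data.copy()                          # CD-style initialization of the chains
    for _ in range(d):                         # inner SGD steps on the convex sub-problem
        for _ in range(k):                     # k Gibbs transitions to estimate the log Z gradient
            h = (sigmoid(v @ W + c) > rng.random((n, n_h))).astype(float)
            v = (sigmoid(h @ W.T + b) > rng.random((n, n_v))).astype(float)
        nh = sigmoid(v @ W + c)
        W += lr * (pos_W - v.T @ nh / n)
        b += lr * (pos_b - v.mean(0))
        c += lr * (pos_c - nh.mean(0))
    return W, b, c

# Toy usage on random binary data (hypothetical sizes).
rng = np.random.default_rng(1)
W = 0.01 * rng.standard_normal((20, 10))
b, c = np.zeros(20), np.zeros(10)
batch = (rng.random((32, 20)) > 0.5).astype(float)
W, b, c = sdcp_minibatch_update(W, b, c, batch, rng=rng)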
APA, Harvard, Vancouver, ISO, and other styles
28

Tai, Chih-Yuan, and 戴志遠. "An Intelligent System for Object Recognition Using Extended Restricted Boltzmann Machine." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/q55x53.

Full text
Abstract:
Master's thesis
National Taipei University of Technology
Graduate Institute of Computer and Communication
100
In this paper, we propose an approach that implements an intelligent system for object recognition using an Extended Restricted Boltzmann Machine (ERBM). A typical neural network can recognize objects well, but the problem of local minima remains to be solved; hence, the proposed method is a neural network designed to reach a global minimum. First, objects are segmented from the image captured by the camera. In order to describe many kinds of objects completely, low-level features such as shape, texture, and color are essential. Because these low-level features are noisy, the accuracy is limited under real conditions. The trained ERBM produces an approximately optimal solution for object classification. Finally, an inference engine outputs an intelligent explanation for the result based on a designed knowledge base that stores high-level semantic rules. The experimental results show that the proposed method is feasible.
APA, Harvard, Vancouver, ISO, and other styles
29

WANG, JEN-HUO, and 王仁和. "Design of Continuous Restricted Boltzmann Machine IC for Electronic Nose System." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/65860388450607202371.

Full text
Abstract:
Master's thesis
National Tsing Hua University
Department of Electrical Engineering
102
Many portable or implantable microsystems have incorporated sensor arrays for various biomedical applications. The raw sensory signals are usually high-dimensional, noisy, and drifting. To facilitate in-situ diagnosis or to reduce the data for wireless transmission, a low-power embedded system is demanded for fusing the sensory signals robustly in real time. A probabilistic neural network called the Continuous Restricted Boltzmann Machine (CRBM) has been shown capable of classifying biomedical data reliably, so the CRBM is suitable to act as a signal pre-processing unit in such a system. This thesis discusses how to use the CRBM to process the sensory data of an electronic nose system. First, a pilot simulation in software confirms the capability of the CRBM for processing sensory data. The thesis then studies how to implement the CRBM in VLSI (Very Large Scale Integration) and integrate it with the electronic nose system. The CRBM chip integrated with the electronic nose system has been designed and fabricated in the 0.18 μm and 90 nm technologies provided by TSMC (Taiwan Semiconductor Manufacturing Company). The measurement results show that the CRBM hardware system produces good processing results as expected.
APA, Harvard, Vancouver, ISO, and other styles
30

Hung, Lin, and 洪琳. "Unsupervised sound summarization from an environment based on the Restricted Boltzmann Machine." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/hq5a3n.

Full text
Abstract:
Master's thesis
National Tsing Hua University
Department of Electrical Engineering
105
Machine listening has played an important role in machine-human interaction applications in recent years. The prospect of making computers imitate the learning ability of the human brain has also become a popular topic with the rise of neural networks. Imagine that we go to a new place where labeled sound data is not available. How can we let users know what sound events happen frequently over a period of time by applying machine learning methods? These kinds of unsupervised learning applications are relatively rare in machine listening research. We propose this idea and use neural networks and other unsupervised algorithms to summarize sound events that happen repeatedly in a place. In the simulation experiments of this thesis, we use self-recorded audio containing common indoor sounds such as people talking and object collisions. Two electrical alarm sounds are also designed as target sound events, where the duration of each event is less than 10% of the total recording time. First, we take the sound signal, apply the Fourier transform, and pass the result through a Mel-frequency filter bank to obtain the Mel-spectrogram as our feature. A restricted Boltzmann machine is chosen as our training model. Finally, we use a clustering algorithm and successfully summarize the spectrograms that occur repeatedly. The user can distinguish the two target sound events by listening to the summarized sound events.
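A compact sketch of the pipeline described in this abstract — Fourier transform, Mel-frequency filter bank, an RBM trained on the resulting frames, and clustering of the hidden codes. It leans on librosa for the Mel-spectrogram and on scikit-learn's BernoulliRBM and k-means as stand-ins; those library choices, the frame parameters and the number of clusters are assumptions, not the thesis configuration.

import numpy as np
import librosa
from sklearn.neural_network import BernoulliRBM
from sklearn.cluster import KMeans

def summarize(audio_path, n_mels=40, n_hidden=64, n_clusters=5, sr=16000):
    y, sr = librosa.load(audio_path, sr=sr)
    # STFT magnitudes passed through a Mel filter bank, then log-scaled.
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    frames = librosa.power_to_db(mel).T                       # (n_frames, n_mels)
    frames = (frames - frames.min()) / (frames.max() - frames.min() + 1e-8)
    # Unsupervised feature learning with an RBM, then cluster the hidden codes.
    rbm = BernoulliRBM(n_components=n_hidden, learning_rate=0.05, n_iter=20)
    codes = rbm.fit_transform(frames)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(codes)
    return labels   # one cluster index per frame; clusters that recur often point to repeated events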
APA, Harvard, Vancouver, ISO, and other styles
31

Chen, Jyung-Ting, and 陳峻廷. "An Application of differential evolution algorithm-based restricted Boltzmann machine to recommendation systems." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/cy8m4b.

Full text
Abstract:
Master's thesis
National Taiwan University of Science and Technology
Department of Industrial Management
104
Global e-commerce has grown very fast, with daily revenue reaching billions of US dollars. Many companies, such as Amazon and Taobao, have followed this trend and earned large profits. To raise revenue, most e-commerce companies endeavor to develop recommendation systems to find potential customers or retain existing ones. Recommendation systems can be implemented with many methods, the most well-known being collaborative filtering, which mainly uses similar users' records to recommend what those users like; its advantage is that there is no need to analyze the product's profile. This study uses a restricted Boltzmann machine (RBM) for collaborative filtering and applies a differential evolution algorithm to optimize the RBM's parameters to improve prediction performance, whereas the original RBM is trained with a mini-batch gradient descent method.
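The study couples a differential evolution (DE) search with RBM training. As a rough sketch of the DE part alone, the loop below evolves a small population of candidate parameter vectors (for example, a learning rate and a hidden-layer size) using the classic DE/rand/1/bin mutation and crossover; the fitness function, bounds and parameter encoding are placeholders and would have to be replaced by actual RBM training and validation error.

import numpy as np

def differential_evolution(fitness, bounds, pop_size=20, F=0.8, CR=0.9,
                           generations=50, rng=np.random.default_rng(0)):
    """Minimize `fitness` over a box given by `bounds` with DE/rand/1/bin."""
    lo, hi = np.array(bounds).T
    dim = len(bounds)
    pop = lo + rng.random((pop_size, dim)) * (hi - lo)
    scores = np.array([fitness(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = pop[rng.choice(pop_size, 3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True            # keep at least one mutant gene
            trial = np.where(cross, mutant, pop[i])
            s = fitness(trial)
            if s < scores[i]:                          # greedy selection
                pop[i], scores[i] = trial, s
    return pop[scores.argmin()], scores.min()

# Placeholder fitness: stands in for "validation error of an RBM trained with
# learning rate x[0] and round(x[1]) hidden units" (hypothetical).
best, err = differential_evolution(lambda x: (x[0] - 0.05)**2 + (x[1] - 100)**2 / 1e4,
                                   bounds=[(1e-4, 0.5), (10, 500)])
print(best, err)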
APA, Harvard, Vancouver, ISO, and other styles
32

Hong, Chun-Yu, and 洪昌諭. "Design of a programmable system circuit for the Continuous Restricted Boltzmann Machine in VLSI." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/57105076950350456044.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Kai-YueHong and 洪凱悅. "A Refined Sample Data Method for Hyperspectral Images Classification Based on Restricted Boltzmann Machine." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/km26m8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

KOUTOU, Wend-Nougui Odilon, and 江歐狄. "Similarity-Boosted Hybrid Conditional Restricted Boltzmann Machine (SB H-CRBM) for Drug-Target Interaction Prediction." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/3v2pxs.

Full text
Abstract:
Master's thesis
National Tsing Hua University
Institute of Information Systems and Applications
106
Uncovering drug-target interactions plays a key role in the drug development process. Recently, in silico techniques (docking simulation and machine learning-based methods) have emerged as an alternative to costly and time-consuming biochemical experiments. Among machine learning-based techniques, many network-based approaches have been proposed, such as the Restricted Boltzmann Machine (RBM), Bipartite Local Models (BLM), Network-Based Inference (NBI), the weighted profile method and the Advanced Local Drug-Target Interaction Prediction Technique (ALADIN). In this research, we extend the RBM by integrating important features such as drug-drug and target-target similarity. In addition, we incorporate the correlations between drugs, which were not taken into account in the original RBM. Finally, we propose a Similarity-Boosted Hybrid Conditional RBM (SB H-CRBM), inspired by the Content-Boosted Restricted Boltzmann Machine (CB-RBM) [1] from the recommendation systems community. Our experimental results show that our method performs better than the RBM previously proposed by Wang and Zeng.
APA, Harvard, Vancouver, ISO, and other styles
35

Su, Hong-Yi, and 蘇泓伊. "A Study of Applying Modular Restricted Boltzmann Machine to Steady-State Visual Evoked Potentials Based Brain Computer Interface." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/ejub5y.

Full text
Abstract:
Master's thesis
Southern Taiwan University of Science and Technology
Department of Electrical Engineering
106
Patients with severe disabilities face many problems in daily life, such as difficulty expressing themselves and moving, and traditional assistive devices are hard for them to use. Although brain signal analysis technologies such as the Brain-Computer Interface (BCI) have developed considerably, the accuracy of identifying brain signals is still not ideal. This thesis combines different statistical and spectral calculation methods in a modular architecture to improve the accuracy of recognizing brain signals, with the ultimate goal of applying the result to assistive equipment for patients with severe disabilities and thereby improving their quality of life. The modular restricted Boltzmann machine (MRBM) designed in this dissertation extracts features from several different input parameters through multiple identification layers, and a decision-making layer is connected at the end to integrate these features. First, canonical correlation analysis (CCA) is used to compute the temporal correlation of steady-state visual evoked potentials (SSVEP). Second, the fast Fourier transform (FFT) is used to transfer the SSVEP signals to the frequency domain, where a window function is applied to extract the characteristics of the target frequency. Third, the frequency-domain correlation of the SSVEP is computed with the magnitude squared coherence (MSC). For each type of feature, a corresponding RBM is constructed in the identification layer and used to produce a decision. A decision RBM then fuses the decisions obtained from the different feature types into a single result: the restricted Boltzmann machines of the identification layer extract the parameter characteristics, and the restricted Boltzmann machine of the decision-making layer integrates the outputs of the three identification layers and is ultimately used to determine the steady-state visual evoked potential. Keywords: modular, restricted Boltzmann machine, Brain-Computer Interface, steady-state visual evoked potentials, EEG
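A schematic sketch of the modular arrangement described in this abstract: each feature type (CCA correlations, windowed FFT magnitudes, MSC values) feeds its own RBM "identification layer", and the resulting hidden codes are fused by a decision layer. In the sketch, scikit-learn's BernoulliRBM plays the identification layers and a logistic regression stands in for the decision RBM; layer sizes and the training procedure are assumptions for illustration only.

import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression

def train_modular_fusion(feature_sets, labels, n_hidden=32):
    """feature_sets: list of arrays, one (n_trials, n_features_k) matrix per
    feature type (e.g. CCA scores, FFT magnitudes, MSC values), scaled to [0, 1].
    Each feature type gets its own RBM identification layer; a logistic
    regression stands in for the decision layer that fuses their hidden codes."""
    rbms = [BernoulliRBM(n_components=n_hidden, n_iter=30).fit(X) for X in feature_sets]
    fused = np.hstack([rbm.transform(X) for rbm, X in zip(rbms, feature_sets)])
    decision = LogisticRegression(max_iter=1000).fit(fused, labels)
    return rbms, decision

def predict(rbms, decision, feature_sets):
    fused = np.hstack([rbm.transform(X) for rbm, X in zip(rbms, feature_sets)])
    return decision.predict(fused)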
APA, Harvard, Vancouver, ISO, and other styles
36

Susskind, Joshua Matthew. "Interpreting Faces with Neurally Inspired Generative Models." Thesis, 2011. http://hdl.handle.net/1807/29884.

Full text
Abstract:
Becoming a face expert takes years of learning and development. Many research programs are devoted to studying face perception, particularly given its prerequisite role in social interaction, yet its fundamental neural operations are poorly understood. One reason is that there are many possible explanations for a change in facial appearance, such as lighting, expression, or identity. Despite general agreement that the brain extracts multiple layers of feature detectors arranged into hierarchies to interpret causes of sensory information, very little work has been done to develop computational models of these processes, especially for complex stimuli like faces. The studies presented in this thesis used nonlinear generative models developed within machine learning to solve several face perception problems. Applying a deep hierarchical neural network, we showed that it is possible to learn representations capable of perceiving facial actions, expressions, and identities, better than similar non-hierarchical architectures. We then demonstrated that a generative architecture can be used to interpret high-level neural activity by synthesizing images in a top-down pass. Using this approach we showed that deep layers of a network can be activated to generate faces corresponding to particular categories. To facilitate training models to learn rich and varied facial features, we introduced a new expression database with the largest number of labeled faces collected to date. We found that a model trained on these images learned to recognize expressions comparably to human observers. Next we considered models trained on pairs of images, making it possible to learn how faces change appearance to take on different expressions. Modeling higher-order associations between images allowed us to efficiently match images of the same type according to a learned pairwise similarity measure. These models performed well on several tasks, including matching expressions and identities, and demonstrated performance superior to competing models. In sum, these studies showed that neural networks that extract highly nonlinear features from images using architectures inspired by the brain can solve difficult face perception tasks with minimal guidance by human experts.
APA, Harvard, Vancouver, ISO, and other styles
37

Yu, Kuan-Chih, and 余觀至. "Recognition of Patients with Chronic Obstructive Pulmonary Disease by Applying Continuous Restricted Boltzmann Machine and Data-Mining Methods to Sensory Data of E-Nose." Thesis, 2017. http://ndltd.ncl.edu.tw/handle/b32448.

Full text
Abstract:
Master's thesis
National Tsing Hua University
Department of Electrical Engineering
106
The purpose of this thesis is to recognize Chronic Obstructive Pulmonary Disease (COPD) by applying machine-learning algorithms. Previous literature has confirmed that specific organic compounds are exhaled by most patients suffering from COPD, so COPD can be diagnosed by using machine-learning algorithms to classify the sensory data of an electronic nose. An electronic nose (e-Nose) consists of an array of diverse neuromorphic sensors, each exhibiting its own characteristic response to different odorants. Therefore, this study aims to identify a machine-learning algorithm able to detect COPD by classifying the sensory data of an e-Nose. To ease data classification, the following methods are employed to preprocess the e-Nose data: (1) baseline manipulation, (2) sensor selection based on the receiver operating characteristic (ROC) curve, and (3) normalization. For data classification, the performance of three linear classifiers is compared: (1) the support vector machine, (2) linear discriminant analysis, and (3) linear programming. In addition, the Continuous Restricted Boltzmann Machine (CRBM) is employed as a nonlinear, probabilistic classifier, and how the CRBM can improve the classification task is further explored in this thesis. Based on the fact that the CRBM learns to regenerate its training data, an algorithm for estimating the likelihood of unknown data under a CRBM model is developed, enabling the CRBM to function reliably as a probabilistic classifier. However, our experimental results indicate that all algorithms are unable to recognize unknown data because the different types of pre-processed COPD data overlap significantly. Further analysis indicates that sensor selection based on the ROC curve filters out some important dimensions; without the sensor selection, better classification results are achieved.
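A small sketch of the ROC-based sensor selection step mentioned in this abstract: for each sensor, the area under the ROC curve of its (baseline-corrected, normalized) response against the COPD/control labels is computed, and only sensors whose AUC is sufficiently far from chance are kept. The threshold and the use of scikit-learn's roc_auc_score are assumptions made for illustration.

import numpy as np
from sklearn.metrics import roc_auc_score

def select_sensors(X, y, auc_threshold=0.7):
    """X: (n_samples, n_sensors) preprocessed e-Nose responses,
       y: binary labels (1 = COPD, 0 = control).
    Returns indices of sensors whose single-sensor ROC-AUC is informative,
    treating AUCs below 0.5 as equally informative (inverted response)."""
    aucs = np.array([roc_auc_score(y, X[:, j]) for j in range(X.shape[1])])
    informative = np.maximum(aucs, 1.0 - aucs) >= auc_threshold
    return np.where(informative)[0], aucs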
APA, Harvard, Vancouver, ISO, and other styles
38

"EXPLORATION OF NEURAL CODING IN RAT'S AGRANULAR MEDIAL AND AGRANULAR LATERAL CORTICES DURING LEARNING OF A DIRECTIONAL CHOICE TASK." Doctoral diss., 2014. http://hdl.handle.net/2286/R.I.25034.

Full text
Abstract:
Animals learn to choose a proper action among alternatives according to the circumstance. Through trial and error, animals improve their odds by making correct associations between their behavioral choices and external stimuli. While there is an extensive literature on the theory of learning, it is still unclear how individual neurons and a neural network adapt as learning progresses. In this dissertation, single units in the medial and lateral agranular (AGm and AGl) cortices were recorded as rats learned a directional choice task. The task required the rat to make a left/right side lever press if a light cue appeared on the left/right side of the interface panel. Behavior analysis showed that the rat's movement parameters during performance of directional choices became stereotyped very quickly (2-3 days), while learning to solve the directional choice problem took weeks. The entire learning process was further broken down into 3 stages, each having a similar number of recording sessions (days). Single-unit firing rate analysis revealed that 1) directional rate modulation was observed in both cortices; 2) the averaged mean rate between left and right trials in the neural ensemble each day did not change significantly among the three learning stages; and 3) the rate difference between left and right trials of the ensemble did not change significantly either. Moreover, for either left or right trials, the trial-to-trial firing variability of single neurons did not change significantly over the three stages. To explore the spatiotemporal neural pattern of the recorded ensemble, support vector machines (SVMs) were constructed each day to decode the direction of choice in single trials. Improved classification accuracy indicated enhanced discriminability between neural patterns of left and right choices as learning progressed. When a restricted Boltzmann machine (RBM) model was used to extract features from neural activity patterns, the results further supported the idea that neural firing patterns adapted during the three learning stages to facilitate the neural coding of directional choices. Taken together, these findings suggest a spatiotemporal neural coding scheme in the rat AGl and AGm neural ensemble that may be responsible for and contribute to learning the directional choice task.
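The decoding analysis described above — a support vector machine trained each recording day to classify left versus right trials from ensemble firing-rate vectors — can be sketched with scikit-learn as follows. The linear kernel, regularization constant and cross-validation scheme are illustrative assumptions rather than the dissertation's exact settings.

import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def daily_decoding_accuracy(firing_rates, choices, folds=5):
    """firing_rates: (n_trials, n_neurons) trial firing rates for one session,
       choices: (n_trials,) array with 0 = left, 1 = right lever press.
    Returns the mean cross-validated accuracy of a linear SVM decoder."""
    clf = SVC(kernel="linear", C=1.0)
    scores = cross_val_score(clf, firing_rates, choices, cv=folds)
    return scores.mean()

Calling this once per session and tracking the accuracy over the three learning stages mirrors the improved discriminability described in the abstract.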
Dissertation/Thesis
Ph.D. Electrical Engineering 2014
APA, Harvard, Vancouver, ISO, and other styles
39

Larochelle, Hugo. "Étude de techniques d'apprentissage non-supervisé pour l'amélioration de l'entraînement supervisé de modèles connexionnistes." Thèse, 2008. http://hdl.handle.net/1866/6435.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Lajoie, Isabelle. "Apprentissage de représentations sur-complètes par entraînement d’auto-encodeurs." Thèse, 2009. http://hdl.handle.net/1866/3768.

Full text
Abstract:
Les avancés dans le domaine de l’intelligence artificielle, permettent à des systèmes informatiques de résoudre des tâches de plus en plus complexes liées par exemple à la vision, à la compréhension de signaux sonores ou au traitement de la langue. Parmi les modèles existants, on retrouve les Réseaux de Neurones Artificiels (RNA), dont la popularité a fait un grand bond en avant avec la découverte de Hinton et al. [22], soit l’utilisation de Machines de Boltzmann Restreintes (RBM) pour un pré-entraînement non-supervisé couche après couche, facilitant grandement l’entraînement supervisé du réseau à plusieurs couches cachées (DBN), entraînement qui s’avérait jusqu’alors très difficile à réussir. Depuis cette découverte, des chercheurs ont étudié l’efficacité de nouvelles stratégies de pré-entraînement, telles que l’empilement d’auto-encodeurs traditionnels(SAE) [5, 38], et l’empilement d’auto-encodeur débruiteur (SDAE) [44]. C’est dans ce contexte qu’a débuté la présente étude. Après un bref passage en revue des notions de base du domaine de l’apprentissage machine et des méthodes de pré-entraînement employées jusqu’à présent avec les modules RBM, AE et DAE, nous avons approfondi notre compréhension du pré-entraînement de type SDAE, exploré ses différentes propriétés et étudié des variantes de SDAE comme stratégie d’initialisation d’architecture profonde. Nous avons ainsi pu, entre autres choses, mettre en lumière l’influence du niveau de bruit, du nombre de couches et du nombre d’unités cachées sur l’erreur de généralisation du SDAE. Nous avons constaté une amélioration de la performance sur la tâche supervisée avec l’utilisation des bruits poivre et sel (PS) et gaussien (GS), bruits s’avérant mieux justifiés que celui utilisé jusqu’à présent, soit le masque à zéro (MN). De plus, nous avons démontré que la performance profitait d’une emphase imposée sur la reconstruction des données corrompues durant l’entraînement des différents DAE. Nos travaux ont aussi permis de révéler que le DAE était en mesure d’apprendre, sur des images naturelles, des filtres semblables à ceux retrouvés dans les cellules V1 du cortex visuel, soit des filtres détecteurs de bordures. Nous aurons par ailleurs pu montrer que les représentations apprises du SDAE, composées des caractéristiques ainsi extraites, s’avéraient fort utiles à l’apprentissage d’une machine à vecteurs de support (SVM) linéaire ou à noyau gaussien, améliorant grandement sa performance de généralisation. Aussi, nous aurons observé que similairement au DBN, et contrairement au SAE, le SDAE possédait une bonne capacité en tant que modèle générateur. Nous avons également ouvert la porte à de nouvelles stratégies de pré-entraînement et découvert le potentiel de l’une d’entre elles, soit l’empilement d’auto-encodeurs rebruiteurs (SRAE).
Progress in the machine learning domain allows computational systems to address more and more complex tasks associated with vision, audio signals or natural language processing. Among the existing models, we find the Artificial Neural Network (ANN), whose popularity increased suddenly with the recent breakthrough of Hinton et al. [22], which consists in using Restricted Boltzmann Machines (RBM) for an unsupervised, layer-by-layer pre-training initialization of a Deep Belief Network (DBN), enabling the subsequent successful supervised training of such an architecture. Since this discovery, researchers have studied the efficiency of other similar pre-training strategies such as the stacking of traditional auto-encoders (SAE) [5, 38] and the stacking of denoising auto-encoders (SDAE) [44]. This is the context in which the present study started. After a brief introduction of the basic machine learning principles and of the pre-training methods used until now with RBM, AE and DAE modules, we performed a series of experiments to deepen our understanding of pre-training with SDAE, explored its different properties and explored variations on the DAE algorithm as alternative strategies to initialize deep networks. We evaluated the sensitivity to the noise level, and the influence of the number of layers and number of hidden units on the generalization error obtained with SDAE. We experimented with other noise types and saw improved performance on the supervised task with the use of pepper-and-salt noise (PS) or Gaussian noise (GS), noise types that are better justified than the one used until now, which is masking noise (MN). Moreover, modifying the algorithm by imposing an emphasis on the reconstruction of the corrupted components during the unsupervised training of each DAE showed encouraging performance improvements. Our work also revealed that the DAE was capable of learning, on natural images, filters similar to those found in V1 cells of the visual cortex, which are in essence edge detectors. In addition, we were able to verify that the learned representations of the SDAE are very good features to feed to a linear or Gaussian-kernel support vector machine (SVM), considerably enhancing its generalization performance. Also, we observed that, like the DBN, and unlike the SAE, the SDAE has the potential to be used as a good generative model. As well, we opened the door to novel pre-training strategies and discovered the potential of one of them: the stacking of renoising auto-encoders (SRAE).
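A condensed numpy sketch of one training step of a tied-weight denoising auto-encoder with the three corruption types compared in the thesis (zero-masking MN, salt-and-pepper PS, Gaussian GS) and an optional emphasis on reconstructing the corrupted components. The layer sizes, the sigmoid/cross-entropy choice and the emphasis weighting are assumptions for illustration, not the thesis code.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def corrupt(x, kind="masking", level=0.25, rng=np.random.default_rng(0)):
    noisy, mask = x.copy(), rng.random(x.shape) < level
    if kind == "masking":            # MN: set a fraction of inputs to zero
        noisy[mask] = 0.0
    elif kind == "salt_pepper":      # PS: flip a fraction of inputs to 0 or 1
        noisy[mask] = (rng.random(mask.sum()) > 0.5).astype(float)
    elif kind == "gaussian":         # GS: add isotropic Gaussian noise
        noisy = noisy + level * rng.standard_normal(x.shape)
        mask = np.ones_like(mask)    # every component counts as corrupted
    return noisy, mask

def dae_step(x, W, b, c, kind="salt_pepper", lr=0.1, emphasis=2.0,
             rng=np.random.default_rng(0)):
    """One SGD step of a tied-weight denoising auto-encoder on a single
    example x in [0,1]^d, with extra weight on corrupted components."""
    noisy, mask = corrupt(x, kind, rng=rng)
    h = sigmoid(noisy @ W + b)                    # encoder, shape (k,)
    r = sigmoid(h @ W.T + c)                      # decoder, shape (d,)
    w = np.where(mask, emphasis, 1.0)             # per-component emphasis
    delta_r = w * (r - x)                         # grad of weighted cross-entropy wrt decoder pre-activation
    delta_h = (delta_r @ W) * h * (1.0 - h)
    W -= lr * (np.outer(delta_r, h) + np.outer(noisy, delta_h))
    b -= lr * delta_h
    c -= lr * delta_r
    return W, b, c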
APA, Harvard, Vancouver, ISO, and other styles
41

Taylor, Graham William. "Composable, Distributed-state Models for High-dimensional Time Series." Thesis, 2009. http://hdl.handle.net/1807/19238.

Full text
Abstract:
In this thesis we develop a class of nonlinear generative models for high-dimensional time series. The first key property of these models is their distributed, or "componential" latent state, which is characterized by binary stochastic variables which interact to explain the data. The second key property is the use of an undirected graphical model to represent the relationship between latent state (features) and observations. The final key property is composability: the proposed class of models can form the building blocks of deep networks by successively training each model on the features extracted by the previous one. We first propose a model based on the Restricted Boltzmann Machine (RBM) that uses an undirected model with binary latent variables and real-valued "visible" variables. The latent and visible variables at each time step receive directed connections from the visible variables at the last few time-steps. This "conditional" RBM (CRBM) makes on-line inference efficient and allows us to use a simple approximate learning procedure. We demonstrate the power of our approach by synthesizing various motion sequences and by performing on-line filling in of data lost during motion capture. We also explore CRBMs as priors in the context of Bayesian filtering applied to multi-view and monocular 3D person tracking. We extend the CRBM in a way that preserves its most important computational properties and introduces multiplicative three-way interactions that allow the effective interaction weight between two variables to be modulated by the dynamic state of a third variable. We introduce a factoring of the implied three-way weight tensor to permit a more compact parameterization. The resulting model can capture diverse styles of motion with a single set of parameters, and the three-way interactions greatly improve its ability to blend motion styles or to transition smoothly among them. In separate but related work, we revisit Products of Hidden Markov Models (PoHMMs). We show how the partition function can be estimated reliably via Annealed Importance Sampling. This enables us to demonstrate that PoHMMs outperform various flavours of HMMs on a variety of tasks and metrics, including log likelihood.
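The conditional RBM (CRBM) described above adds directed connections from the visible variables at the last few time steps, which act as dynamic biases on the current visible and hidden units. A minimal numpy sketch of those dynamic biases and the resulting hidden activation probabilities follows; the matrix names and the flattened history encoding are illustrative assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def crbm_hidden_probs(v_t, v_history, W, A, B, b, c):
    """Hidden activation probabilities of a conditional RBM (sketch).
    v_t       : current visible frame, shape (n_v,)
    v_history : concatenated previous frames, shape (n_past * n_v,)
    W : (n_v, n_h) undirected weights between current visibles and hiddens
    A : (n_past * n_v, n_v) directed weights, history -> visible bias
    B : (n_past * n_v, n_h) directed weights, history -> hidden bias
    b, c : static visible and hidden biases
    """
    dyn_b = b + v_history @ A        # dynamic visible bias (used when reconstructing v_t)
    dyn_c = c + v_history @ B        # dynamic hidden bias
    p_h = sigmoid(v_t @ W + dyn_c)   # P(h = 1 | v_t, history)
    return p_h, dyn_b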
APA, Harvard, Vancouver, ISO, and other styles