Dissertations on the topic "Unsupervised learning"
Cite a source in APA, MLA, Chicago, Harvard, and other styles
Browse the top 50 dissertations for research on the topic "Unsupervised learning".
Giobergia, Flavio. "Machine learning with limited label availability: algorithms and applications." Doctoral thesis, Politecnico di Torino, 2023. https://hdl.handle.net/11583/2976594.
Snyder, Benjamin. "Unsupervised multilingual learning." Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/62455.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 241-254).
For centuries, scholars have explored the deep links among human languages. In this thesis, we present a class of probabilistic models that exploit these links as a form of naturally occurring supervision. These models allow us to substantially improve performance for core text processing tasks, such as morphological segmentation, part-of-speech tagging, and syntactic parsing. Besides these traditional NLP tasks, we also present a multilingual model for lost language decipherment. We test this model on the ancient Ugaritic language. Our results show that we can automatically uncover much of the historical relationship between Ugaritic and Biblical Hebrew, a known related language.
by Benjamin Snyder.
Ph.D.
Geigel, Arturo. "Unsupervised Learning Trojan." NSUWorks, 2014. http://nsuworks.nova.edu/gscis_etd/17.
Mathieu, Michael. "Unsupervised Learning under Uncertainty." Thesis, New York University, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10261120.
Deep learning, in particular neural networks, has achieved remarkable success in recent years. However, most of it is based on supervised learning and relies on ever larger datasets and immense computing power. One step towards general artificial intelligence is to build a model of the world, with enough knowledge to acquire a kind of "common sense". Representations learned by such a model could be reused in a number of other tasks. It would reduce the requirement for labelled samples and possibly acquire a deeper understanding of the problem. The vast quantities of knowledge required to build common sense preclude the use of supervised learning and suggest relying on unsupervised learning instead.
The concept of uncertainty is central to unsupervised learning. The task is usually to learn a complex, multimodal distribution. Density estimation and generative models aim at representing the whole distribution of the data, while predictive learning consists of predicting the state of the world given the context and, more often than not, the prediction is not unique. That may be because the model lacks the capacity or the computing power to make a certain prediction, or because the future depends on parameters that are not part of the observation. Finally, the world can be chaotic or truly stochastic. Representing complex, multimodal continuous distributions with deep neural networks is still an open problem.
In this thesis, we first assess the difficulties of representing probabilities in high dimensional spaces, and review the related work in this domain. We then introduce two methods to address the problem of video prediction, first using a novel form of linearizing auto-encoders and latent variables, and secondly using Generative Adversarial Networks (GANs). We show how GANs can be seen as trainable loss functions to represent uncertainty, then how they can be used to disentangle factors of variation. Finally, we explore a new non-probabilistic framework for GANs.
Boschini, Matteo. "Unsupervised Learning of Scene Flow." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/16226/.
Jelacic, Mersad. "Unsupervised Learning for Plant Recognition." Thesis, Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-247.
Six methods are used for clustering data containing two different objects: sugar-beet plants
and weed. These objects are described by 19 different features, i.e. shape and color features.
There is also information about the distance between sugar-beet plants that is used for
labeling clusters. The methods that are evaluated: k-means, k-medoids, hierarchical clustering,
competitive learning, self-organizing maps and fuzzy c-means. After using the methods on
plant data, clusters are formed. The clusters are labeled with three different proposed
methods: expert, database and context method. Expert method is using a human for giving
initial cluster centers that are labeled. The database method is using a database as an expert
that provides initial cluster centers. The context method is using information about the
environment, which is the distance between sugar-beet plants, for labeling the clusters.
The algorithms that were tested, with the lowest achieved corresponding error, are: k-means
(3.3%), k-medoids (3.8%), hierarchical clustering (5.3%), competitive learning (6.8%), self-
organizing maps (4.9%) and fuzzy c-means (7.9%). Three different datasets were used and the
lowest error on dataset0 is 3.3%, compared to supervised learning methods where it is 3%.
For dataset1 the error is 18.7% and for dataset2 it is 5.8%. With supervised methods, by comparison,
the error on dataset1 is 11% and on dataset2 it is 5.1%. The high error rate on dataset1 arises
because the samples are not well separated into different clusters. The features for dataset1 were
extracted from lower-resolution images than those of the other datasets; another difference
between the datasets is that the sugar-beet plants are in different growth stages.
The performance of the three methods for labeling clusters is: expert method (6.8% as the
lowest error achieved), database method (3.7%) and context method (6.8%). These results
show the clustering results by competitive learning where the real error is 6.8%.
Unsupervised-learning methods for clustering can very well be used for plant identification.
Because the samples are not classified, an automatic labeling technique must be used if plants
are to be identified. The three proposed techniques can be used for automatic labeling of
plants.
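The "expert method" described in the abstract above (labeled initial cluster centers, followed by clustering) can be illustrated with a minimal sketch. This is not the thesis's implementation: the two-feature toy data, the names `kmeans` and `error_rate`, and the labels "beet" and "weed" are assumptions made for illustration, and plain Lloyd's k-means stands in for the six methods that were evaluated.

```python
import math

def kmeans(points, centers, iterations=20):
    """Plain Lloyd's k-means on tuples of floats, starting from the given centers."""
    centers = [tuple(c) for c in centers]
    assignment = []
    for _ in range(iterations):
        # Assignment step: index of the nearest center for every point.
        assignment = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(old)  # an empty cluster keeps its center
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return centers, assignment

def error_rate(assignment, center_labels, true_labels):
    """Fraction of samples whose cluster label differs from the true label."""
    wrong = sum(center_labels[a] != t for a, t in zip(assignment, true_labels))
    return wrong / len(true_labels)

# Hypothetical toy data: two features per sample (the thesis uses 19).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
true_labels = ["beet", "beet", "beet", "weed", "weed", "weed"]
# "Expert" input: one labeled initial center per class.
seeds = [(0.0, 0.0), (6.0, 6.0)]
seed_labels = ["beet", "weed"]

centers, assignment = kmeans(points, seeds)
print(error_rate(assignment, seed_labels, true_labels))
```

Each cluster inherits the label of the seed it grew from, so the error rate can be computed without labeling individual samples.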
Amin, Khizer, and Mehmood ul haq Minhas. "Facebook Blocket with Unsupervised Learning." Thesis, Blekinge Tekniska Högskola, Institutionen för tillämpad signalbehandling, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-1969.
Korkontzelos, Ioannis. "Unsupervised learning of multiword expressions." Thesis, University of York, 2010. http://etheses.whiterose.ac.uk/2091/.
Liang, Yingyu. "Modern aspects of unsupervised learning." Diss., Georgia Institute of Technology, 2014. http://hdl.handle.net/1853/52282.
Xiao, Ying. "New tools for unsupervised learning." Diss., Georgia Institute of Technology, 2014. http://hdl.handle.net/1853/52995.
Luo, Jiaming. "Unsupervised learning of morphological forests." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/111923.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 39-41).
This thesis focuses on unsupervised modeling of morphological families, collectively comprising a forest over the language vocabulary. This formulation enables us to capture edge-wise properties reflecting single-step morphological derivations, along with global distributional properties of the entire forest. These global properties constrain the size of the affix set and encourage formation of tight morphological families. The resulting objective is solved using Integer Linear Programming (ILP) paired with contrastive estimation. We train the model by alternating between optimizing the local log-linear model and the global ILP objective. We evaluate our system on three tasks: root detection, clustering of morphological families and segmentation. Our experiments demonstrate that our model yields consistent gains in all three tasks compared with the best published results.
by Jiaming Luo.
S.M.
Drexler, Jennifer Fox. "Deep unsupervised learning from speech." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/105696.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 87-92).
Automatic speech recognition (ASR) systems have become hugely successful in recent years - we have become accustomed to speech interfaces across all kinds of devices. However, despite the huge impact ASR has had on the way we interact with technology, it is out of reach for a significant portion of the world's population. This is because these systems rely on a variety of manually-generated resources - like transcripts and pronunciation dictionaries - that can be both expensive and difficult to acquire. In this thesis, we explore techniques for learning about speech directly from speech, with no manually generated transcriptions. Such techniques have the potential to revolutionize speech technologies for the vast majority of the world's population. The cognitive science and computer science communities have both been investing increasing time and resources into exploring this problem. However, a full unsupervised speech recognition system is a hugely complicated undertaking and is still a long ways away. As in previous work, we focus on the lower-level tasks which will underlie an eventual unsupervised speech recognizer. We specifically focus on two tasks: developing linguistically meaningful representations of speech and segmenting speech into phonetic units. This thesis approaches these tasks from a new direction: deep learning. While modern deep learning methods have their roots in ideas from the 1960s and even earlier, deep learning techniques have recently seen a resurgence, thanks to huge increases in computational power and new efficient learning algorithms. Deep learning algorithms have been instrumental in the recent progress of traditional supervised speech recognition; here, we extend that work to unsupervised learning from speech.
by Jennifer Fox Drexler.
S.M.
Jauk, Igor. "Unsupervised learning for expressive speech synthesis." Doctoral thesis, Universitat Politècnica de Catalunya, 2017. http://hdl.handle.net/10803/460814.
Nowadays, especially with the rise of neural networks, speech synthesis is almost entirely data-driven. The goal of this thesis is to provide methods for automatic, unsupervised learning from data for expressive speech synthesis. Compared with "neutral" synthesis systems, it is harder to find reliable training data for expressive synthesis, despite the wide availability of resources such as the internet. The main difficulty stems from the nature of expressive speech, which depends heavily on the speaker and the situation and therefore shows many acoustic variations. The consequences are, first, that it is very difficult to define labels that reliably identify all the details of expressive speech. The typical definition of 6 basic emotions is a simplification that will have inexcusable consequences when dealing with data from outside the laboratory. Second, even if a set of labels were defined, apart from the enormous manual effort this would involve, it would be very difficult to obtain enough training data for each variant while respecting all its nuances. The goal of this thesis is to study automatic training methods for expressive speech synthesis that avoid labels, and to develop applications based on these proposals. The approach covers the acoustic and semantic domains. In the acoustic domain, the goal is to find acoustic features suitable for representing expressive speech, especially in the multi-speaker domain, moving closer to real, uncontrolled data. To this end, the perspective shifts away from traditional, mainly prosody-based features toward features obtained through factor analysis, attempting to identify the principal components of expressiveness, specifically i-vectors.
The results show that a combination of traditional features and i-vector-based features performs better in the task of unsupervised clustering of expressive speech than traditional features alone, and even better than large state-of-the-art feature sets in the multi-speaker domain. Once defined, the feature set is used for unsupervised clustering of an audiobook, with one voice trained from each cluster. The method was evaluated in an audiobook editing application in which users employed the synthetic voices to create their own dialogues. The results obtained validate the proposal. In the editing application, users choose synthetic voices and assign them to sentences, taking the characters and the expressiveness into account. By involving the semantic domain, this assignment could be performed automatically. In this part of the thesis, words and sentences are represented numerically in trainable vector spaces, called embeddings, and can be used to predict expressiveness. This method not only enables automatic reading of text passages that takes the local context into account, but can also be used as a semantic search tool for training data. Both applications were evaluated in a perceptual experiment that demonstrated the potential of the proposed methodology. Finally, following new trends in neural-network-based speech synthesis, an expressive speech synthesis system using this technology was developed and evaluated. Emotionally motivated semantic representations of text, called "sentiment embeddings" and trained on movie reviews, are used as an additional input to the system. The neural network now learns not only from segmental and contextual information but also from this sentiment representation, which especially affects prosody.
The system was evaluated in two perceptual experiments, showing a preference for the system that includes this new representation [TRUNCATED]
Gonzàlez Pellicer, Edgar. "Unsupervised learning of relation detection patterns." Doctoral thesis, Universitat Politècnica de Catalunya, 2012. http://hdl.handle.net/10803/83906.
Information extraction is the natural language processing area whose goal is to obtain structured data from the relevant information contained in textual fragments. Information extraction requires a significant amount of linguistic knowledge. The specificity of such knowledge limits the portability of the systems, as a change of language, domain or style demands a costly human effort. Machine learning techniques have been applied for decades to overcome this portability bottleneck, progressively reducing the amount of human supervision involved. However, as the availability of large document collections increases, completely unsupervised approaches become necessary in order to mine the knowledge contained in them. The proposal of this thesis is to incorporate clustering techniques into pattern learning for information extraction, in order to further reduce the elements of supervision involved in the process. In particular, the work focuses on the problem of relation detection. The achievement of this ultimate goal has required, first, considering the different strategies in which this combination could be carried out; second, developing or adapting clustering algorithms suitable to our needs; and third, devising pattern learning procedures which incorporated clustering information. By the end of this thesis, we had been able to develop and implement an approach for learning of relation detection patterns which, using clustering techniques and minimal human supervision, is competitive with and even outperforms other comparable approaches in the state of the art.
Panholzer, Georg. "Identifying Deviating Systems with Unsupervised Learning." Thesis, Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-1146.
We present a technique to identify deviating systems among a group of systems in a
self-organized way. A compressed representation of each system is used to compute similarity measures, which are combined in an affinity matrix of all systems. Deviation detection and clustering is then used to identify deviating systems based on this affinity matrix.
The compressed representation is computed with Principal Component Analysis and
Kernel Principal Component Analysis. The similarity measure between two compressed
representations is based on the angle between the spaces spanned by the principal
components, but other methods of calculating a similarity measure are suggested as
well. The subsequent deviation detection is carried out by computing the probability of
each system to be observed given all the other systems. Clustering of the systems is
done with hierarchical clustering and spectral clustering. The whole technique is demonstrated on four data sets of mechanical systems, two of a simulated cooling system and two of human gait. The results show its applicability on these mechanical systems.
Smith, Reuben. "Correlating intrusion alerts with unsupervised learning." Thesis, University of Ottawa (Canada), 2006. http://hdl.handle.net/10393/27179.
Kit, Chun Yu. "Unsupervised lexical learning as inductive inference." Thesis, University of Sheffield, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.340205.
Domingues, Rémi. "Machine Learning for Unsupervised Fraud Detection." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-181027.
Zeltner, Felix. "Autonomous Terrain Classification Through Unsupervised Learning." Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-60893.
Örjehag, Erik. "Unsupervised Learning for Structure from Motion." Thesis, Linköpings universitet, Datorseende, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173731.
Prost, Vincent. "Sparse unsupervised learning for metagenomic data." Electronic Thesis or Diss., Université Paris-Saclay, 2020. http://www.theses.fr/2020UPASL013.
The development of massively parallel sequencing technologies makes it possible to sequence DNA at high throughput and low cost, fueling the rise of metagenomics, which is the study of complex microbial communities sequenced in their natural environment. Metagenomic problems are usually computationally difficult and are further complicated by the massive amount of data involved. In this thesis we consider two different metagenomics problems: 1. raw reads binning and 2. microbial network inference from taxonomic abundance profiles. We address them using unsupervised machine learning methods leveraging the parsimony principle, typically involving l1-penalized log-likelihood maximization. The assembly of genomes from raw metagenomic datasets is a challenging task akin to assembling a mixture of large puzzles composed of billions or trillions of pieces (DNA sequences). In the first part of this thesis, we consider the related task of clustering sequences into biologically meaningful partitions (binning). Most of the existing computational tools perform binning after read assembly as a pre-processing step, which is error-prone (yielding artifacts like chimeric contigs) and discards vast amounts of information in the form of unassembled reads (up to 50% for highly diverse metagenomes). This motivated us to address the problem of raw read binning (without prior assembly). We exploit the co-abundance of species across samples as a discriminative signal. Abundance is usually measured via the number of occurrences of long k-mers (subsequences of size k). The use of Locality Sensitive Hashing (LSH) allows us to contain, at the cost of some approximation, the combinatorial explosion of long k-mer indexing. The first contribution of this thesis is to propose a sparse Non-Negative Matrix Factorization (NMF) of the samples x k-mers count matrix in order to extract abundance variation signals.
We first show that using sparse NMF is well-grounded since the data is a sparse linear mixture of non-negative components. Sparse NMF exploiting online dictionary learning algorithms retained our attention, including for its decent behavior on largely asymmetric data matrices. Because the validation of metagenomic binning is difficult on real datasets, owing to the absence of ground truth, we created and used several benchmarks on which the different methods were evaluated. We illustrated that sparse NMF improves on state-of-the-art binning methods on those datasets. Experiments conducted on a real metagenomic cohort of 1135 human gut microbiota showed the relevance of the approach. In the second part of the thesis, we consider metagenomic data after taxonomic profiling: multivariate data representing abundances of taxa across samples. It is known that microbes live in communities structured by ecological interaction between the members of the community. We focus on the problem of the inference of microbial interaction networks from taxonomic profiles. This problem is frequently cast into the paradigm of Gaussian graphical models (GGMs), for which efficient structure inference algorithms are available, like the graphical lasso. Unfortunately, GGMs or variants thereof cannot properly account for the extremely sparse patterns occurring in real-world metagenomic taxonomic profiles. In particular, structural zeros corresponding to true absences of biological signals fail to be properly handled by most statistical methods. We present in this part a zero-inflated log-normal graphical model specifically aimed at handling such "biological" zeros, and demonstrate significant performance gains over state-of-the-art statistical methods for the inference of microbial association networks, with the most notable gains obtained when analyzing taxonomic profiles displaying sparsity levels on par with real-world metagenomic datasets.
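The "samples x k-mers count matrix" that the abstract above factorizes can be pictured with a small sketch. The function names `kmer_counts` and `count_matrix`, the toy reads, and the choice k = 3 are assumptions made for illustration; the sketch counts short k-mers exactly and includes neither the LSH indexing nor the sparse NMF step of the thesis.

```python
from collections import Counter

def kmer_counts(read, k):
    """Count every overlapping substring of length k in one read."""
    return Counter(read[i:i + k] for i in range(len(read) - k + 1))

def count_matrix(samples, k):
    """Build a samples x k-mers count matrix.

    `samples` maps a sample name to its list of reads. Returns the sorted
    k-mer vocabulary and one row of counts per sample, in dict order.
    """
    per_sample = {}
    for name, reads in samples.items():
        total = Counter()
        for read in reads:
            total.update(kmer_counts(read, k))
        per_sample[name] = total
    # Column order: every k-mer seen in any sample, sorted alphabetically.
    vocabulary = sorted(set().union(*per_sample.values()))
    rows = [[per_sample[name][kmer] for kmer in vocabulary] for name in per_sample]
    return vocabulary, rows

# Hypothetical toy input: two samples with very short reads.
samples = {"s1": ["ACGT", "CGTA"], "s2": ["ACGA"]}
vocabulary, rows = count_matrix(samples, k=3)
print(vocabulary)  # ['ACG', 'CGA', 'CGT', 'GTA']
print(rows)        # [[1, 0, 2, 1], [1, 1, 0, 0]]
```

Every entry is non-negative and most are zero once k grows, which is the property that motivates a sparse non-negative factorization of such a matrix.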
Chakeri, Alireza. "Scalable Unsupervised Learning with Game Theory." Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/6616.
Ott, Lionel. "Unsupervised learning for long-term autonomy." Thesis, The University of Sydney, 2014. http://hdl.handle.net/2123/13334.
Dolfe, Rafael, and Keivan Matinzadeh. "Investigating Skin Cancer with Unsupervised Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-259363.
Skin cancer is one of the most common types of cancer in the world. The most common way to diagnose skin cancer is for a dermatologist to analyze skin lesions on a patient's body. Today's medical diagnostics uses an established set of labels for different types of skin lesions. An alternative to this kind of diagnosis could be to let a computer with no prior knowledge of the data (images of skin lesions) perform the analysis. This categorization could then be compared with the existing medical categories assigned to each image. To investigate this, three unsupervised learning algorithms were used to produce clusterings of a dataset containing images of skin lesions. These algorithms were K-means, agglomerative clustering, and spectral clustering. We found no obvious clusterings and no connection to the current medical labels. The clustering that scored highest in internal evaluation was the one generated by spectral clustering. This occurred when the number of clusters the algorithm was to split the data into was set to two. A deeper investigation of the structure of this clustering showed that one of the clusters contained practically every image. Even though the silhouette value for this clustering was low, the value indicates that the underlying structure is best represented by a single cluster.
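The internal evaluation mentioned in the abstract above relies on the silhouette value. As a minimal sketch, assuming Euclidean distance and small in-memory data, the mean silhouette coefficient can be computed as follows; the function name `silhouette_score` and the toy points are illustrative assumptions, not the thesis's code.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")
    total = 0.0
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so only the divisor changes).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for name, other in clusters.items()
            if name != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical toy data: two tight groups far apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(silhouette_score(points, [0, 0, 1, 1]))  # close to 1: well separated
print(silhouette_score(points, [0, 1, 0, 1]))  # negative: groups are mixed
```

Values near 1 indicate compact, well-separated clusters, while values near 0 or below suggest that the partition does not reflect real structure in the data.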
Qian, Jing. "Unsupervised learning in high-dimensional space." Thesis, Boston University, 2014. https://hdl.handle.net/2144/12951.
In machine learning, the problem of unsupervised learning is that of trying to explain key features and find hidden structures in unlabeled data. In this thesis we focus on three unsupervised learning scenarios: graph based clustering with imbalanced data, point-wise anomaly detection and anomalous cluster detection on graphs. In the first part we study spectral clustering, a popular graph based clustering technique. We investigate the reason why spectral clustering performs badly on imbalanced and proximal data. We then propose the partition constrained minimum cut (PCut) framework based on a novel parametric graph construction method, that is shown to adapt to different degrees of imbalanced data. We analyze the limit cut behavior of our approach, and demonstrate the significant performance improvement through clustering and semi-supervised learning experiments on imbalanced data. [TRUNCATED]
Sebbar, Mehdi. "On unsupervised learning in high dimension." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLG003/document.
In this thesis, we discuss two topics, high-dimensional clustering on the one hand and estimation of mixing densities on the other. The first chapter is an introduction to clustering. We present various popular methods and we focus on one of the main models of our work which is the mixture of Gaussians. We also discuss the problems with high-dimensional estimation (Section 1.3) and the difficulty of estimating the number of clusters (Section 1.1.4). In what follows, we present briefly the concepts discussed in this manuscript. Consider a mixture of $K$ Gaussians in $\mathbb{R}^p$. One of the common approaches to estimate the parameters is to use the maximum likelihood estimator. Since this problem is not convex, we cannot guarantee the convergence of classical methods such as gradient descent or Newton's algorithm. However, by exploiting the biconvexity of the negative log-likelihood, the iterative 'Expectation-Maximization' (EM) procedure described in Section 1.2.1 can be used. Unfortunately, this method is not well suited to meet the challenges posed by the high dimension. In addition, it is necessary to know the number of clusters in order to use it. Chapter 2 presents three methods that we have developed to try to solve the problems described above. The works presented there have not been thoroughly researched for various reasons. The first method that could be called 'graphical lasso on Gaussian mixtures' consists in estimating the inverse matrices of covariance matrices $\Sigma$ (Section 2.1) in the hypothesis that they are parsimonious. We adapt the graphical lasso method of [Friedman et al., 2007] to a component in the case of a mixture and experimentally evaluate this method. The other two methods address the problem of estimating the number of clusters in the mixture.
The first is a penalized estimate of the matrix of posterior probabilities $\tau \in \mathbb{R}^{n \times K}$ whose component $(i, j)$ is the probability that the $i$-th observation is in the $j$-th cluster. Unfortunately, this method proved to be too expensive in complexity (Section 2.2.1). Finally, the second method considered is to penalize the weight vector $\pi$ in order to make it parsimonious. This method shows promising results (Section 2.2.2). In Chapter 3, we study the maximum likelihood estimator of density of $n$ i.i.d. observations, under the assumption that it is well approximated by a mixture with a large number of components. The main focus is on statistical properties with respect to the Kullback-Leibler loss. We establish risk bounds taking the form of sharp oracle inequalities both in deviation and in expectation. A simple consequence of these bounds is that the maximum likelihood estimator attains the optimal rate $((\log K)/n)^{1/2}$, up to a possible logarithmic correction, in the problem of convex aggregation when the number $K$ of components is larger than $n^{1/2}$. More importantly, under the additional assumption that the Gram matrix of the components satisfies the compatibility condition, the obtained oracle inequalities yield the optimal rate in the sparsity scenario. That is, if the weight vector is (nearly) $D$-sparse, we get the rate $(D \log K)/n$. As a natural complement to our oracle inequalities, we introduce the notion of nearly-$D$-sparse aggregation and establish matching lower bounds for this type of aggregation. Finally, in Chapter 4, we propose an algorithm that performs the Kullback-Leibler aggregation of components of a dictionary as discussed in Chapter 3. We compare its performance with different methods: the kernel density estimator, the 'Adaptive Dantzig' estimator, the SPADES and EM estimator with the BIC criterion. We then propose a method to build the dictionary of densities and study it numerically.
This thesis was carried out within the framework of a CIFRE agreement with the company ARTEFACT.
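The EM procedure for a mixture of $K$ Gaussians that the abstract above refers to can be sketched in one dimension. This is a textbook version under stated assumptions, not the thesis's algorithm: the names `normal_pdf` and `em_gmm_1d`, the variance floor `min_var`, and the toy data are hypothetical, and $K$ is fixed by the number of initial means, which reflects the abstract's remark that the number of clusters must be known in advance.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, means, iterations=50, min_var=1e-6):
    """EM for a 1-D Gaussian mixture; K is the number of initial means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k  # the weight vector pi
    tau = []
    for _ in range(iterations):
        # E-step: tau[i][j] is the posterior probability that
        # observation i belongs to component j.
        tau = []
        for x in data:
            joint = [w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint)
            tau.append([value / total for value in joint])
        # M-step: re-estimate weights, means and variances from tau.
        for j in range(k):
            resp = sum(row[j] for row in tau)
            weights[j] = resp / len(data)
            means[j] = sum(row[j] * x for row, x in zip(tau, data)) / resp
            var = sum(row[j] * (x - means[j]) ** 2 for row, x in zip(tau, data)) / resp
            variances[j] = max(min_var, var)  # floor avoids a collapsing component
    return weights, means, variances, tau

# Hypothetical toy data: two well-separated groups around 0 and 5.
data = [-0.2, 0.0, 0.2, 4.8, 5.0, 5.2]
weights, means, variances, tau = em_gmm_1d(data, means=[1.0, 4.0])
print([round(m, 2) for m in means])
```

The returned `tau` is an $n \times K$ list of lists whose rows sum to one, which corresponds to the matrix of posterior probabilities described in the abstract.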
Tsang, Wai-Hung. "Kernel methods in supervised and unsupervised learning." View Abstract or Full-Text, 2003. http://library.ust.hk/cgi/db/thesis.pl?COMP%202003%20TSANG.
Includes bibliographical references (leaves 46-49). Also available in electronic version. Access restricted to campus users.
Vollgraf, Roland. "Unsupervised learning methods for statistical signal processing." [S.l.] : [s.n.], 2006. http://opus.kobv.de/tuberlin/volltexte/2007/1488.
Meinicke, Peter. "Unsupervised learning in a generalized regression framework." [S.l. : s.n.], 2000. http://deposit.ddb.de/cgi-bin/dokserv?idn=960755594.
Giguère, Philippe. "Unsupervised learning for mobile robot terrain classification." Thesis, McGill University, 2010. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=95062.
In this thesis, we examine the problem of perceiving differences between terrains for an autonomous mobile robot. The intended application of our research is terrain type identification. Done robustly, this identification extends the capabilities of mobile systems in both locomotion and navigation. For example, a legged amphibious robot that has learned to distinguish sand from sea can choose the appropriate gait on its own: walking on sand and swimming in water. The same terrain type information can also be used to guide a robot, allowing it to avoid specific terrain types. We approach terrain identification along two main axes: an information capture (sensing) problem and a learning problem. In the information capture problem, the question is how to extract the information relevant to identifying the ground type from sensors on a robot or with a tactile probe. In particular, we show that by combining information from an inertial measurement unit with information from the actuators of a legged robot, it is possible to identify certain ground types. In addition, we present a new tactile probe with characteristics that improve the capture of terrain-related information. For the learning problem, we analyze how spatial and temporal continuity can be exploited to separate time series or images into their constituent classes (clustering). We present a new clustering algorithm based on this principle. By combining the sensing approach with this new algorithm, we obtain an architecture that allows terrains to be learned autonomously. This approach is [TRUNCATED]
Afzal, Naveed. "Unsupervised relation extraction for e-learning applications." Thesis, University of Wolverhampton, 2011. http://hdl.handle.net/2436/299064.
Titsias, Michalis. "Unsupervised learning of multiple objects in images." Thesis, University of Edinburgh, 2005. http://hdl.handle.net/1842/776.
Watts, Oliver Samuel. "Unsupervised learning for text-to-speech synthesis." Thesis, University of Edinburgh, 2013. http://hdl.handle.net/1842/7982.
Khaliq, Bilal. "Unsupervised learning of Arabic non-concatenative morphology." Thesis, University of Sussex, 2015. http://sro.sussex.ac.uk/id/eprint/53865/.
Morita, Takashi Ph D. Massachusetts Institute of Technology. "Unsupervised learning of lexical subclasses from phonotactics." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/120612.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 203-215).
Languages are constantly borrowing words from one another. Since the donor and recipient languages typically differ in their phonology and phonotactics, the native words and the loanwords of the borrower language can also exhibit different phonology/phonotactics. Accordingly, it has been proposed that the phonotactics of languages such as Japanese is better explained if words are classified into etymologically defined sublexica. However, this sublexical analysis is challenged by a learnability problem: the sublexical membership of words is not directly observable. This study applies a state-of-the-art clustering method (a Dirichlet process mixture model) to a substantial number of Japanese and English words extracted from corpora. It turns out that the predicted clusters largely correspond to the etymologically defined sublexica. Since the clustering method is domain-general and not specialized to sublexicon identification, the results can be taken as statistical evidence for the heterogeneous lexica of the two languages. Moreover, the unsupervised nature of the clustering method demonstrates the learnability of sublexica from naturalistic data. The learned sublexica also replicate linguistic characterizations of actual sublexica proposed in previous literature, such as the biased distribution of (certain substrings of) segments to particular sublexica. In addition, the learned sublexica make informative predictions based on previous experimental studies. These results suggest that the predicted sublexica are linguistically sound. Finally, the predicted sublexica reveal hitherto unnoticed phonotactic properties. These discoveries can be used for further investigation of native speakers' knowledge.
by Takashi Morita.
Ph. D. in Linguistics
Martin, del Campo Barraza Sergio. "Unsupervised feature learning applied to condition monitoring." Doctoral thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-63113.
Pelletier, Bertrand. "Unsupervised learning from a goal-driven agent." Dissertation, Carleton University, Systems and Computer Engineering, Ottawa, 1993.
Berkes, Pietro. "Temporal slowness as an unsupervised learning principle." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät I, 2006. http://dx.doi.org/10.18452/15414.
In this thesis we investigate the relevance of temporal slowness as a principle for the self-organization of the visual cortex and for technical applications. We first introduce and discuss this principle and put it into mathematical terms. We then define the slow feature analysis (SFA) algorithm, which solves the mathematical problem for multidimensional, discrete time series in a finite-dimensional function space. In the main part of the thesis we apply temporal slowness as a learning principle of receptive fields in the visual cortex. Using SFA we learn the input-output functions that, when applied to natural image sequences, vary as slowly as possible in time and thus optimize the slowness objective. The resulting functions can be interpreted as nonlinear spatio-temporal receptive fields and compared to neurons in the primary visual cortex (V1). We find that they reproduce (qualitatively and quantitatively) many of the properties of complex cells in V1, not only the two basic ones, namely a Gabor-like optimal stimulus and phase-shift invariance, but also secondary ones like direction selectivity, non-orthogonal inhibition, end-inhibition and side-inhibition. These results show that a single unsupervised learning principle can account for a rich repertoire of receptive field properties. In order to analyze the nonlinear functions learned by SFA in our model, we developed a set of mathematical and numerical tools to characterize quadratic forms as receptive fields. We expand them in a subsequent chapter to be of more general interest for theoretical and physiological models. We conclude this thesis by showing the application of the temporal slowness principle to pattern recognition. We reformulate the SFA algorithm such that it can be applied to pattern recognition problems that lack a temporal structure and present the optimal solutions in this case. We then apply the system to a standard handwritten digits database with good performance.
Nikbakht, Silab Rasoul. "Unsupervised learning for parametric optimization in wireless networks." Doctoral thesis, Universitat Pompeu Fabra, 2021. http://hdl.handle.net/10803/671246.
This thesis studies parametric optimization in cellular and cell-free networks, exploiting data-driven and expert-driven paradigms. Power allocation and power control, which adjust the transmit power to meet different fairness criteria such as max-min or max-product, are crucial tasks in wireless communications that fall into the category of parametric optimization. State-of-the-art approaches to power control and allocation often demand huge computational costs and are not suitable for real-time applications. To address this issue, we develop a general-purpose unsupervised learning approach for solving parametric optimizations and, at the same time, extend the well-known fractional power control algorithm. In the data-driven paradigm, we create an unsupervised learning framework that defines a custom neural network (NN), incorporating expert knowledge into the NN cost function to solve the power control and allocation problems. Within this approach, a feedforward NN is trained by repeatedly sampling the parameter space but, rather than solving the associated optimization problem completely, a single step is taken along the gradient of the objective function. The resulting method is applicable to both convex and non-convex optimization problems. It offers a speed-up of two to three orders of magnitude in the power control and allocation problems compared with a convex solver, whenever one is applicable. In the expert-driven paradigm, we investigate the extension of fractional power control to cell-free networks. The resulting closed-form solution can be evaluated effortlessly for the uplink and the downlink, and attains a (nearly) optimal solution in the uplink case.
In both paradigms, we place particular emphasis on the large-scale gains: the amount of attenuation experienced by the local-average received power. The slowly varying nature of the large-scale gains relaxes the need for frequent updates of the solutions in both the data-driven and the expert-driven paradigm, enabling the use of both methods in real-time applications.
Hasenjäger, Martina. "Active data selection in supervised and unsupervised learning." [S.l. : s.n.], 2000. http://deposit.ddb.de/cgi-bin/dokserv?idn=960209220.
Bhaskar, Dhananjay. "Morphology based cell classification : unsupervised machine learning approach." Thesis, University of British Columbia, 2017. http://hdl.handle.net/2429/61342.
Science, Faculty of
Mathematics, Department of
Graduate
Jossen, Quentin. "Unsupervised learning procedure for nonintrusive appliance load monitoring." Doctoral thesis, Universite Libre de Bruxelles, 2013. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/209369.
[…] energy advice. The required functionalities must however be rapidly defined if they are expected to be integrated in the future massive roll out.
Nonintrusive appliance load monitoring aims to derive appliance-specific information from the aggregate electricity consumption. While techniques have been developed since the 1980s, they mainly address the identification of previously learned appliances from a database. Building such a database is an intrusive and tedious process that should be avoided. Although most recent efforts have focused on unsupervised techniques to disaggregate energy consumption into individual appliances, they usually rely on prior information about the measured appliances, such as the number of appliances, the number of states of each appliance, and the power consumed in each state. This information should ideally be learned from the data. This topic is addressed in the present research.
This work presents a framework for unsupervised learning for nonintrusive appliance load monitoring. It aims to discover information about the appliances of a household solely from its aggregate consumption data, with neither prior information nor user intervention. The learning process can be segmented into five tasks: the detection of on/off switching, the extraction of individual load signatures, the identification of recurrent signatures, the discovery of two-state electrical devices and, finally, the elaboration of appliance models. The first four steps are addressed in this work.
The suite of algorithms proposed in this work makes it possible to discover the set of two-state electrical loads from their aggregated consumption. This, along with the evaluation of their operating sequences, is a prerequisite for learning appliance models from the data. Results show that loads consuming as little as a few dozen watts can be learned from the data. This should encourage future researchers to consider such unsupervised learning.
Doctorate in Engineering Sciences
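The first and fourth tasks named in the Jossen abstract above (detecting on/off switching and discovering two-state loads) can be illustrated with a very small sketch. It is not the thesis's algorithm: the function names, the 30 W threshold, the 10 W matching tolerance and the sample readings are all hypothetical, chosen only to show the idea of finding step changes in aggregate power and pairing each switch-on with a later switch-off of similar size.

```python
def detect_switch_events(power, threshold=30.0):
    """Return (index, delta) pairs where aggregate power steps by >= threshold watts."""
    events = []
    for i in range(1, len(power)):
        delta = power[i] - power[i - 1]
        if abs(delta) >= threshold:
            events.append((i, delta))
    return events


def pair_two_state_loads(events, tolerance=10.0):
    """Greedily match each switch-off with the earliest unmatched switch-on
    whose magnitude is within `tolerance` watts (a simplifying assumption)."""
    pending = []  # switch-on events not yet matched
    loads = []
    for index, delta in events:
        if delta > 0:
            pending.append((index, delta))
            continue
        for on in pending:
            if abs(on[1] + delta) <= tolerance:
                loads.append({"on": on[0], "off": index, "watts": on[1]})
                pending.remove(on)
                break  # stop iterating right after mutating the list
    return loads


# Hypothetical aggregate readings in watts: a 60 W load and a 1000 W load
# switch on and off over a 100 W baseline.
readings = [100, 100, 160, 160, 1160, 1160, 1100, 1100, 100, 100]
events = detect_switch_events(readings)
loads = pair_two_state_loads(events)
```

With real meter data the steps would be noisy, so a practical detector would need smoothing and a less naive matching rule than this greedy pairing.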
Khan, Najeed Ahmed. "Unsupervised learning of object detectors for everyday scenes." Thesis, University of Leeds, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.540773.
Bourrier, Anthony. "Compressed sensing and dimensionality reduction for unsupervised learning." Phd thesis, Université Rennes 1, 2014. http://tel.archives-ouvertes.fr/tel-01023030.
Dong, Shuonan. "Unsupervised learning and recognition of physical activity plans." Thesis, Massachusetts Institute of Technology, 2007. http://hdl.handle.net/1721.1/42195.
Includes bibliographical references (p. 125-129).
This thesis aims to enable a new kind of interaction between humans and computational agents, such as robots or computers, by allowing the agent to anticipate and adapt to human intent. In the future, more robots may be deployed in situations that require collaboration with humans, such as scientific exploration, search and rescue, hospital assistance, and even domestic care. These situations require robots to work together with humans, as part of a team, rather than as a stand-alone tool. The intent recognition capability is necessary for computational agents to play a more collaborative role in human-robot interactions, moving beyond the standard master-slave relationship of humans and computers today. We provide an innovative capability for recognizing human intent, through statistical plan learning and online recognition. We approach the plan learning problem by employing unsupervised learning to automatically determine the activities in a plan based on training data. The plan activities are described by a mixture of multivariate probability densities. The number of distributions in the mixture used to describe the data is assumed to be given. The training data trajectories are fed again through the activities' density distributions to determine each possible sequence of activities that make up a plan. These activity sequences are then summarized with temporal information in a temporal plan network, which consists of a network of all possible plans. Our approach to plan recognition begins with formulating the temporal plan network as a hidden Markov model. Next, we determine the most likely path using the Viterbi algorithm. Finally, we refer back to the temporal plan network to obtain predicted future activities. Our research presents several innovations. First, we introduce a modified representation of temporal plan networks that incorporates probabilistic information into the state space and temporal representations. Second, we learn plans from actual data, such that the notion of an activity is not arbitrarily or manually defined, but is determined by the characteristics of the data. Third, we develop a recognition algorithm that can perform recognition continuously by making probabilistic updates. Finally, our recognizer not only identifies previously executed activities, but also predicts future activities based on the plan network. We demonstrate the capabilities of our algorithms on motion capture data. Our results show that the plan learning algorithm is able to generate reasonable temporal plan networks, depending on the dimensions of the training data and the recognition resolution used. The plan recognition algorithm is also successful in recognizing the correct activity sequences in the temporal plan network corresponding to the observed test data.
by Shuonan Dong.
S.M.
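The recognition step described in the Dong abstract above, finding the most likely activity sequence in a hidden Markov model with the Viterbi algorithm, can be sketched as follows. This is a generic textbook Viterbi decoder in log space, not the thesis's implementation; the two activities ("reach", "grasp"), the observation symbols and every probability below are hypothetical values chosen for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (log_prob, path) for the most likely hidden state sequence.
    All probabilities are assumed to be strictly positive."""
    # best[t][s] = (log-prob of the best path ending in s at time t, predecessor)
    first = observations[0]
    best = [{s: (math.log(start_p[s]) + math.log(emit_p[s][first]), None)
             for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        column = {}
        for s in states:
            score, arg = max(
                (prev[r][0] + math.log(trans_p[r][s]), r) for r in states
            )
            column[s] = (score + math.log(emit_p[s][obs]), arg)
        best.append(column)
    # Backtrack from the best final state.
    last = max(states, key=lambda s: best[-1][s][0])
    log_prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return log_prob, path


# Hypothetical two-activity model observed through a coarse speed feature.
states = ("reach", "grasp")
start_p = {"reach": 0.8, "grasp": 0.2}
trans_p = {"reach": {"reach": 0.6, "grasp": 0.4},
           "grasp": {"reach": 0.1, "grasp": 0.9}}
emit_p = {"reach": {"fast": 0.7, "slow": 0.3},
          "grasp": {"fast": 0.2, "slow": 0.8}}

log_prob, path = viterbi(["fast", "fast", "slow", "slow"],
                         states, start_p, trans_p, emit_p)
```

Working in log space avoids numerical underflow on long observation sequences, which matters for continuous recognition over motion data.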
Chan, Kevin S. (Kevin Sao Wei). "Multiview monocular depth estimation using unsupervised learning methods." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/119753.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 50-51).
Existing learned methods for monocular depth estimation use only a single view of a scene for depth evaluation, so they inherently overfit to their training scenes and cannot generalize well to new datasets. This thesis presents a neural network for multiview monocular depth estimation. Teaching a network to estimate depth via structure from motion allows it to generalize better to new environments with unfamiliar objects. This thesis extends recent work in unsupervised methods for single-view monocular depth estimation and uses the reconstruction losses for training posed in those works. Models and baseline models were evaluated on a variety of datasets, and results indicate that multiview models generalize across datasets better than previous work. This work is unique in that it emphasizes cross-domain performance and the ability to generalize more than performance on the training set.
by Kevin S. Chan.
M. Eng.
Wichrowska, Olga N. "Unsupervised syntactic category learning from child-directed speech." Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/62756.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 57-59).
The goal of this research was to discover what kinds of syntactic categories can be learned using distributional analysis on linear context of words, specifically in child-directed speech. The idea behind this is that the categories used by children could very well be different from adult categories. There is some evidence that distributional analysis could be used for some aspects of language acquisition, though very strong arguments exist for why it is not enough to acquire grammar. These experiments can help identify what kind of data can be learned from linear context and statistics only. This paper reports the results of three established automatic syntactic category learning algorithms on a small, edited input set of child-directed speech from the CHILDES database. Hierarchical clustering, K-Means analysis, and an implementation of a substitution algorithm are all used to assign syntactic categories to words based on their linear distributional context. Overall, open classes (nouns, verbs, adjectives) were reliably categorized, and some methods were able to distinguish prepositions, adverbs, subjects vs. objects, and verbs by subcategorization frame. The main barrier standing between these methods and human-like categorization is the inability to deal with the ambiguity that is omnipresent in natural language and poses an important problem for future models of syntactic category acquisition.
by Olga N. Wichrowska.
M.Eng.
Mansinghka, Vikash Kumar. "Nonparametric Bayesian methods for supervised and unsupervised learning." Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/53172.
Includes bibliographical references (leaves 44-45).
I introduce two nonparametric Bayesian methods for solving problems of supervised and unsupervised learning. The first method simultaneously learns causal networks and causal theories from data. For example, given synthetic co-occurrence data from a simple causal model for the medical domain, it can learn relationships like "having a flu causes coughing", while also learning that observable quantities can be usefully grouped into categories like diseases and symptoms, and that diseases tend to cause symptoms, not the other way around. The second method is an online algorithm for learning a prototype-based model for categorial concepts, and can be used to solve problems of multiclass classification with missing features. I apply it to problems of categorizing newsgroup posts and recognizing handwritten digits. These approaches were inspired by a striking capacity of human learning, which should also be a desideratum for any intelligent system: the ability to learn certain kinds of "simple" or "natural" structures very quickly, while still being able to learn arbitrary, and arbitrarily complex, structures given enough data. In each case, I show how nonparametric Bayesian modeling and inference based on stochastic simulation give us some of the tools we need to achieve this goal.
by Vikash Kumar Mansinghka.
M.Eng.
Sani, Lorenzo. "Unsupervised clustering of MDS data using federated learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amslaurea.unibo.it/25591/.
Nallabolu, Adithya Reddy. "Unsupervised Learning of Spatiotemporal Features by Video Completion." Thesis, Virginia Tech, 2017. http://hdl.handle.net/10919/79702.
Master of Science