Dissertations / Theses on the topic 'Unsupervised learning'

To see the other types of publications on this topic, follow the link: Unsupervised learning.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Unsupervised learning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.

1

Giobergia, Flavio. "Machine learning with limited label availability: algorithms and applications." Doctoral thesis, Politecnico di Torino, 2023. https://hdl.handle.net/11583/2976594.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Snyder, Benjamin Ph D. Massachusetts Institute of Technology. "Unsupervised multilingual learning." Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/62455.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 241-254).
For centuries, scholars have explored the deep links among human languages. In this thesis, we present a class of probabilistic models that exploit these links as a form of naturally occurring supervision. These models allow us to substantially improve performance for core text processing tasks, such as morphological segmentation, part-of-speech tagging, and syntactic parsing. Besides these traditional NLP tasks, we also present a multilingual model for lost language decipherment. We test this model on the ancient Ugaritic language. Our results show that we can automatically uncover much of the historical relationship between Ugaritic and Biblical Hebrew, a known related language.
by Benjamin Snyder.
Ph.D.
APA, Harvard, Vancouver, ISO, and other styles
3

Geigel, Arturo. "Unsupervised Learning Trojan." NSUWorks, 2014. http://nsuworks.nova.edu/gscis_etd/17.

Full text
Abstract:
This work presents a proof of concept of an Unsupervised Learning Trojan. The Unsupervised Learning Trojan presents new challenges over previous work on the Neural Network Trojan, since the attacker does not control most of the environment. The current work presents an analysis of how the attack can succeed by proposing new assumptions under which it becomes viable. A general analysis of how the compromise can be theoretically supported is presented, providing enough background for the development of practical implementations. The analysis was carried out using 3 selected algorithms that cover a wide variety of unsupervised learning circumstances. A selection of 4 encoding schemes on 4 datasets was chosen to represent actual scenarios under which the Trojan compromise might be targeted. A detailed procedure is presented to demonstrate the attack's viability under the assumed circumstances. Two hypothesis tests concerning the experimental setup were carried out, both of which yielded acceptance of the null hypothesis. Various aspects of practical implementation issues and real-world scenarios in which this attack might be attempted are also discussed.
APA, Harvard, Vancouver, ISO, and other styles
4

Mathieu, Michael. "Unsupervised Learning under Uncertainty." Thesis, New York University, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10261120.

Full text
Abstract:

Deep learning, in particular neural networks, has achieved remarkable success in recent years. However, most of it is based on supervised learning, and relies on ever larger datasets and immense computing power. One step towards general artificial intelligence is to build a model of the world, with enough knowledge to acquire a kind of "common sense". Representations learned by such a model could be reused in a number of other tasks. It would reduce the requirement for labelled samples and possibly lead to a deeper understanding of the problem. The vast quantity of knowledge required to build common sense precludes the use of supervised learning, and suggests relying on unsupervised learning instead.

The concept of uncertainty is central to unsupervised learning. The task is usually to learn a complex, multimodal distribution. Density estimation and generative models aim at representing the whole distribution of the data, while predictive learning consists of predicting the state of the world given the context and, more often than not, the prediction is not unique. That may be because the model lacks the capacity or the computing power to make a certain prediction, or because the future depends on parameters that are not part of the observation. Finally, the world can be chaotic or truly stochastic. Representing complex, multimodal continuous distributions with deep neural networks is still an open problem.

In this thesis, we first assess the difficulties of representing probabilities in high dimensional spaces, and review the related work in this domain. We then introduce two methods to address the problem of video prediction, first using a novel form of linearizing auto-encoders and latent variables, and secondly using Generative Adversarial Networks (GANs). We show how GANs can be seen as trainable loss functions to represent uncertainty, then how they can be used to disentangle factors of variation. Finally, we explore a new non-probabilistic framework for GANs.
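
As a hedged illustration of the idea that a GAN's discriminator acts as a trainable loss for the generator, here is a minimal PyTorch sketch on toy one-dimensional data; it is not the video-prediction architecture of the thesis, and the network sizes, learning rates and data are arbitrary assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
G = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1))   # generator: noise -> sample
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))   # discriminator: sample -> logit
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 2.0          # toy "real" data centred on 2.0
    fake = G(torch.randn(64, 4))

    # Discriminator update: learn to tell real samples from generated ones.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator update: the discriminator now acts as a trainable loss
    # telling the generator how implausible its samples look.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print(float(G(torch.randn(1000, 4)).mean()))   # should drift towards 2.0
```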

APA, Harvard, Vancouver, ISO, and other styles
5

Boschini, Matteo. "Unsupervised Learning of Scene Flow." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/16226/.

Full text
Abstract:
As Computer Vision-powered autonomous systems are increasingly deployed to solve problems in the wild, the case is made for developing visual understanding methods that are robust and flexible. One of the most challenging tasks for this purpose is the extraction of scene flow, that is, the dense three-dimensional vector field that associates each world point with its corresponding position in the next observed frame, hence describing its three-dimensional motion entirely. The recent addition of a limited amount of ground truth scene flow information to the popular KITTI dataset prompted a renewed interest in the study of techniques for scene flow inference, although the solutions proposed in the literature mostly rely on computation-intensive techniques and are characterised by execution times that are not suited for real-time application. In the wake of the recent widespread adoption of Deep Learning techniques in Computer Vision and in light of the convenience of Unsupervised Learning for scenarios in which ground truth collection is difficult and time-consuming, this thesis proposes the first neural network architecture to be trained in an end-to-end fashion for unsupervised scene flow regression from monocular visual data, called Pantaflow. The proposed solution is much faster than currently available state-of-the-art methods and therefore represents a step towards the achievement of real-time scene flow inference.
APA, Harvard, Vancouver, ISO, and other styles
6

Jelacic, Mersad. "Unsupervised Learning for Plant Recognition." Thesis, Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-247.

Full text
Abstract:

Six methods are used for clustering data containing two different objects: sugar-beet plants and weed. These objects are described by 19 different features, i.e. shape and color features. There is also information about the distance between sugar-beet plants that is used for labeling clusters. The methods that are evaluated are: k-means, k-medoids, hierarchical clustering, competitive learning, self-organizing maps and fuzzy c-means. After using the methods on the plant data, clusters are formed. The clusters are labeled with three different proposed methods: the expert, database and context methods. The expert method uses a human to provide initial cluster centers that are labeled. The database method uses a database as an expert that provides initial cluster centers. The context method uses information about the environment, which is the distance between sugar-beet plants, for labeling the clusters.

The algorithms that were tested, with the lowest achieved corresponding error, are: k-means (3.3%), k-medoids (3.8%), hierarchical clustering (5.3%), competitive learning (6.8%), self-organizing maps (4.9%) and fuzzy c-means (7.9%). Three different datasets were used; the lowest error on dataset0 is 3.3%, compared to 3% for supervised learning methods. For dataset1 the error is 18.7% and for dataset2 it is 5.8%; with supervised methods, the error on dataset1 is 11% and on dataset2 it is 5.1%. The high error rate on dataset1 is due to the samples not being well separated into different clusters. The features from dataset1 are extracted from lower-resolution images than the other datasets, and another difference between the datasets is that the sugar-beet plants are in different growth stages.

The performance of the three methods for labeling clusters is: expert method (6.8% as the lowest error achieved), database method (3.7%) and context method (6.8%). These results are based on the clustering produced by competitive learning, where the real error is 6.8%.

Unsupervised-learning methods for clustering can very well be used for plant identification. Because the samples are not classified, an automatic labeling technique must be used if plants are to be identified. The three proposed techniques can be used for automatic labeling of plants.
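
As a hedged illustration of the kind of pipeline described in this abstract (clustering shape/colour feature vectors and then labelling the resulting clusters from a few known examples, in the spirit of the expert method), the sketch below uses scikit-learn's k-means; the feature array and the expert-labelled samples are hypothetical stand-ins, not data or code from the thesis.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in data: 200 samples described by 19 shape/colour features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 19))

# Cluster into two groups (intended to separate sugar-beet plants from weed).
X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_scaled)

# "Expert method" in miniature: a human labels one sample per cluster,
# and every other member of that cluster inherits the label.
expert_labels = {0: "sugar-beet", 1: "weed"}   # hypothetical expert input (sample index -> label)
cluster_to_label = {
    kmeans.predict(X_scaled[[i]])[0]: name for i, name in expert_labels.items()
}
predicted = [cluster_to_label.get(c, "unknown") for c in kmeans.labels_]
print(predicted[:10])
```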

APA, Harvard, Vancouver, ISO, and other styles
7

Amin, Khizer, and Mehmood ul haq Minhas. "Facebook Blocket with Unsupervised Learning." Thesis, Blekinge Tekniska Högskola, Institutionen för tillämpad signalbehandling, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-1969.

Full text
Abstract:
The Internet has become a valuable channel for both business-to-consumer and business-to-business e-commerce. It has changed the way many companies manage their business. Every day, more and more companies establish their presence on the Internet. Websites are launched for online shopping, as web shops or online stores are a popular means of goods distribution. The number of items sold through the internet has grown significantly in the past few years. Moreover, it has become convenient for customers to shop at their ease. Thus, the aim of this thesis is to design and implement a consumer-to-consumer application for Facebook, which is one of the largest social networking websites. The application allows Facebook users to use their regular profile (on Facebook) to buy and sell goods or services through Facebook. As already mentioned, there are many web shops such as eBay, Amazon, and applications like Blocket on Facebook. However, none of them interacts directly with Facebook users, and all of them use their own platform. Users may follow the web shop link from their Facebook profile and be redirected to the web shop. On the other hand, most applications on Facebook use notifications to introduce themselves, or they push their application on Facebook pages. This application provides an opportunity for Facebook users to interact directly with other users and use the Facebook platform as a selling/buying point. The application is developed using a modular approach. Initially a Python web framework, Django, is used, and association rule learning is applied for the classification of users' advertisements. The Apriori algorithm generates the rules, which are stored as a separate text file. The rule file is further used to classify advertisements and is updated regularly.
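
As a hedged sketch of the association rule learning step mentioned above, the following pure-Python miniature performs a two-level Apriori-style pass (frequent items, then frequent pairs, then rules filtered by confidence); the advertisement "transactions", the keywords and the thresholds are hypothetical.

```python
from itertools import combinations

# Hypothetical ad "transactions": keyword sets extracted from advertisements.
transactions = [
    {"iphone", "electronics", "used"},
    {"iphone", "electronics", "new"},
    {"sofa", "furniture", "used"},
    {"sofa", "furniture", "new"},
    {"iphone", "electronics", "used"},
]
min_support, min_confidence = 0.4, 0.7
n = len(transactions)

def support(itemset):
    # Fraction of transactions containing the whole itemset.
    return sum(itemset <= t for t in transactions) / n

items = {i for t in transactions for i in t}
frequent_1 = {frozenset([i]) for i in items if support({i}) >= min_support}
candidates_2 = {a | b for a, b in combinations(frequent_1, 2)}
frequent_2 = {c for c in candidates_2 if support(c) >= min_support}

# Derive rules X -> Y with confidence = support(X u Y) / support(X).
for itemset in frequent_2:
    for antecedent in itemset:
        consequent = itemset - {antecedent}
        conf = support(itemset) / support({antecedent})
        if conf >= min_confidence:
            print(f"{antecedent} -> {set(consequent)} (conf={conf:.2f})")
```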
APA, Harvard, Vancouver, ISO, and other styles
8

Korkontzelos, Ioannis. "Unsupervised learning of multiword expressions." Thesis, University of York, 2010. http://etheses.whiterose.ac.uk/2091/.

Full text
Abstract:
Multiword expressions are expressions consisting of two or more words that correspond to some conventional way of saying things (Manning & Schutze 1999). Due to the idiomatic nature of many of them and their high frequency of occurrence in all sorts of text, they cause problems in many Natural Language Processing (NLP) applications and are frequently responsible for their shortcomings. Efficiently recognising multiword expressions and deciding the degree of their idiomaticity would be useful to all applications that require some degree of semantic processing, such as question-answering, summarisation, parsing, language modelling and language generation. In this thesis we investigate the issues of recognising multiword expressions, domain-specific or not, and of deciding whether they are idiomatic. Moreover, we inspect the extent to which multiword expressions can contribute to a basic NLP task such as shallow parsing and ways that the basic property of multiword expressions, idiomaticity, can be employed to define a novel task for Compositional Distributional Semantics (CDS). The results show that it is possible to recognise multiword expressions and decide their compositionality in an unsupervised manner, based on cooccurrence statistics and distributional semantics. Further, multiword expressions are beneficial for other fundamental applications of Natural Language Processing either by direct integration or as an evaluation tool. In particular, termhood-based methods, which are based on nestedness information, are shown to outperform unithood-based methods, which measure the strength of association among the constituents of a multi-word candidate term. A simple heuristic was shown to perform better than more sophisticated methods. A new graph-based algorithm employing sense induction is proposed to address multiword expression compositionality and is shown to perform better than a standard vector space model. Its parameters were estimated by an unsupervised scheme based on graph connectivity. Multiword expressions are shown to contribute to shallow parsing. Moreover, they are used to define a new evaluation task for distributional semantic composition models.
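
As a hedged illustration of a unithood-style association measure between the constituents of a candidate multiword expression, the sketch below computes pointwise mutual information (PMI) over unigram and bigram counts; PMI is a standard measure of this family, not necessarily the one used in the thesis, and the toy corpus is hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus; PMI measures how much more often two words
# co-occur as a bigram than chance would predict.
corpus = "kick the bucket kick the ball fill the bucket kick the bucket".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
total_uni, total_bi = sum(unigrams.values()), sum(bigrams.values())

def pmi(w1, w2):
    p_w1 = unigrams[w1] / total_uni
    p_w2 = unigrams[w2] / total_uni
    p_pair = bigrams[(w1, w2)] / total_bi
    return math.log2(p_pair / (p_w1 * p_w2))

print(pmi("kick", "the"), pmi("the", "bucket"))
```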
APA, Harvard, Vancouver, ISO, and other styles
9

Liang, Yingyu. "Modern aspects of unsupervised learning." Diss., Georgia Institute of Technology, 2014. http://hdl.handle.net/1853/52282.

Full text
Abstract:
Unsupervised learning has become more and more important due to the recent explosion of data. Clustering, a key topic in unsupervised learning, is a well-studied task arising in many applications ranging from computer vision to computational biology to the social sciences. This thesis is a collection of work exploring two modern aspects of clustering: stability and scalability. In the first part, we study clustering under a stability property called perturbation resilience. As an alternative approach to worst case analysis, this novel theoretical framework aims at understanding the complexity of clustering instances that satisfy natural stability assumptions. In particular, we show how to correctly cluster instances whose optimal solutions are resilient to small multiplicative perturbations on the distances between data points, significantly improving existing guarantees. We further propose a generalized property that allows small changes in the optimal solutions after perturbations, and provide the first known positive results in this more challenging setting. In the second part, we study the problem of clustering large scale data distributed across nodes which communicate over the edges of a connected graph. We provide algorithms with small communication cost and provable guarantees on the clustering quality. We also propose algorithms for distributed principal component analysis, which can be used to reduce the communication cost of clustering high dimensional data while barely compromising the clustering quality. In the third part, we study community detection, the modern extension of clustering to network data. We propose a theoretical model of communities that are stable in the presence of noisy nodes in the network, and design an algorithm that provably detects all such communities. We also provide a local algorithm for large scale networks, whose running time depends on the sizes of the output communities but not that of the entire network.
APA, Harvard, Vancouver, ISO, and other styles
10

Xiao, Ying. "New tools for unsupervised learning." Diss., Georgia Institute of Technology, 2014. http://hdl.handle.net/1853/52995.

Full text
Abstract:
In an unsupervised learning problem, one is given an unlabelled dataset and hopes to find some hidden structure; the prototypical example is clustering similar data. Such problems often arise in machine learning and statistics, but also in signal processing, theoretical computer science, and any number of quantitative scientific fields. The distinguishing feature of unsupervised learning is that there are no privileged variables or labels which are particularly informative, and thus the greatest challenge is often to differentiate between what is relevant or irrelevant in any particular dataset or problem. In the course of this thesis, we study a number of problems which span the breadth of unsupervised learning. We make progress in Gaussian mixtures, independent component analysis (where we solve the open problem of underdetermined ICA), and we formulate and solve a feature selection/dimension reduction model. Throughout, our goal is to give finite sample complexity bounds for our algorithms -- these are essentially the strongest type of quantitative bound that one can prove for such algorithms. Some of our algorithmic techniques turn out to be very efficient in practice as well. Our major technical tool is tensor spectral decomposition: tensors are generalisations of matrices, and often allow access to the "fine structure" of data. Thus, they are often the right tools for unravelling the hidden structure in an unsupervised learning setting. However, naive generalisations of matrix algorithms to tensors run into NP-hardness results almost immediately, and thus to solve our problems, we are obliged to develop two new tensor decompositions (with robust analyses) from scratch. Both of these decompositions are polynomial time, and can be viewed as efficient generalisations of PCA extended to tensors.
APA, Harvard, Vancouver, ISO, and other styles
11

Luo, Jiaming S. M. Massachusetts Institute of Technology. "Unsupervised learning of morphological forests." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/111923.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 39-41).
This thesis focuses on unsupervised modeling of morphological families, collectively comprising a forest over the language vocabulary. This formulation enables us to capture edge-wise properties reflecting single-step morphological derivations, along with global distributional properties of the entire forest. These global properties constrain the size of the affix set and encourage formation of tight morphological families. The resulting objective is solved using Integer Linear Programming (ILP) paired with contrastive estimation. We train the model by alternating between optimizing the local log-linear model and the global ILP objective. We evaluate our system on three tasks: root detection, clustering of morphological families and segmentation. Our experiments demonstrate that our model yields consistent gains in all three tasks compared with the best published results.
by Jiaming Luo.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
12

Drexler, Jennifer Fox. "Deep unsupervised learning from speech." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/105696.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 87-92).
Automatic speech recognition (ASR) systems have become hugely successful in recent years - we have become accustomed to speech interfaces across all kinds of devices. However, despite the huge impact ASR has had on the way we interact with technology, it is out of reach for a significant portion of the world's population. This is because these systems rely on a variety of manually-generated resources - like transcripts and pronunciation dictionaries - that can be both expensive and difficult to acquire. In this thesis, we explore techniques for learning about speech directly from speech, with no manually generated transcriptions. Such techniques have the potential to revolutionize speech technologies for the vast majority of the world's population. The cognitive science and computer science communities have both been investing increasing time and resources into exploring this problem. However, a full unsupervised speech recognition system is a hugely complicated undertaking and is still a long ways away. As in previous work, we focus on the lower-level tasks which will underlie an eventual unsupervised speech recognizer. We specifically focus on two tasks: developing linguistically meaningful representations of speech and segmenting speech into phonetic units. This thesis approaches these tasks from a new direction: deep learning. While modern deep learning methods have their roots in ideas from the 1960s and even earlier, deep learning techniques have recently seen a resurgence, thanks to huge increases in computational power and new efficient learning algorithms. Deep learning algorithms have been instrumental in the recent progress of traditional supervised speech recognition; here, we extend that work to unsupervised learning from speech.
by Jennifer Fox Drexler.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
13

Jauk, Igor. "Unsupervised learning for expressive speech synthesis." Doctoral thesis, Universitat Politècnica de Catalunya, 2017. http://hdl.handle.net/10803/460814.

Full text
Abstract:
Nowadays, especially with the upswing of neural networks, speech synthesis is almost totally data driven. The goal of this thesis is to provide methods for automatic and unsupervised learning from data for expressive speech synthesis. In comparison to "ordinary" synthesis systems, it is more difficult to find reliable expressive training data, despite the huge availability of sources like the Internet. The main difficulty lies in the highly speaker- and situation-dependent nature of expressiveness, which causes many acoustically substantial variations. The consequences are, first, that it is very difficult to define labels which reliably identify expressive speech with all its nuances. The typical definition of 6 basic emotions, or the like, is a simplification which will have inexcusable consequences when dealing with data outside the lab. Second, even if a label set is defined, apart from the enormous manual effort, it is difficult to gather sufficient training data for models respecting all the nuances and variations. The goal of this thesis is to study automatic training methods for expressive speech synthesis that avoid labeling, and to develop applications from these proposals. The focus lies on the acoustic and the semantic domains. For the acoustic domain, the goal is to find suitable acoustic features to represent expressive speech, especially in the multi-speaker setting, as a step closer to real-life uncontrolled data. For this, the perspective slides away from traditional, mainly prosody-based, features towards features obtained with factor analysis, trying to identify the principal components of expressiveness, namely using i-vectors. Results show that a combination of traditional and i-vector based features performs better in unsupervised clustering of expressive speech than traditional features alone, and even better than large state-of-the-art feature sets in the multi-speaker domain. Once the feature set is defined, it is used for unsupervised clustering of an audiobook, where a voice is trained from each cluster. The method is then evaluated in an audiobook-editing application, where users can use the synthetic voices to create their own dialogues. The obtained results validate the proposal. In this editing application users choose synthetic voices and assign them to sentences considering the speaking characters and the expressiveness. Involving the semantic domain, this assignment can be achieved automatically, at least partly. Words and sentences are represented numerically in trainable semantic vector spaces, called embeddings, and these can be used to predict expressiveness to some extent. This method not only permits fully automatic reading of larger text passages, considering the local context, but can also be used as a semantic search engine for training data. Both applications are evaluated in a perceptual test showing the potential of the proposed method. Finally, accounting for the new tendencies in the speech synthesis world, deep neural network based expressive speech synthesis is designed and tested. Emotionally motivated semantic representations of text, sentiment embeddings, trained on the positive and negative polarity of movie reviews, are used as an additional input to the system. The neural network now learns not only from segmental and contextual information, but also from the sentiment embeddings, which affect especially the prosody.
The system is evaluated in two perceptual experiments which show preferences for the inclusion of sentiment embeddings as an additional input.
APA, Harvard, Vancouver, ISO, and other styles
14

Gonzàlez, Pellicer Edgar. "Unsupervised learning of relation detection patterns." Doctoral thesis, Universitat Politècnica de Catalunya, 2012. http://hdl.handle.net/10803/83906.

Full text
Abstract:
Information extraction is the natural language processing area whose goal is to obtain structured data from the relevant information contained in textual fragments. Information extraction requires a significant amount of linguistic knowledge. The specificity of such knowledge supposes a drawback on the portability of the systems, as a change of language, domain or style demands a costly human effort. Machine learning techniques have been applied for decades so as to overcome this portability bottleneck, progressively reducing the amount of involved human supervision. However, as the availability of large document collections increases, completely unsupervised approaches become necessary in order to mine the knowledge contained in them. The proposal of this thesis is to incorporate clustering techniques into pattern learning for information extraction, in order to further reduce the elements of supervision involved in the process. In particular, the work focuses on the problem of relation detection. The achievement of this ultimate goal has required, first, considering the different strategies in which this combination could be carried out; second, developing or adapting clustering algorithms suitable to our needs; and third, devising pattern learning procedures which incorporated clustering information. By the end of this thesis, we had been able to develop and implement an approach for learning of relation detection patterns which, using clustering techniques and minimal human supervision, is competitive and even outperforms other comparable approaches in the state of the art.
APA, Harvard, Vancouver, ISO, and other styles
15

Panholzer, Georg. "Identifying Deviating Systems with Unsupervised Learning." Thesis, Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-1146.

Full text
Abstract:

We present a technique to identify deviating systems among a group of systems in a self-organized way. A compressed representation of each system is used to compute similarity measures, which are combined in an affinity matrix of all systems. Deviation detection and clustering is then used to identify deviating systems based on this affinity matrix.

The compressed representation is computed with Principal Component Analysis and Kernel Principal Component Analysis. The similarity measure between two compressed representations is based on the angle between the spaces spanned by the principal components, but other methods of calculating a similarity measure are suggested as well. The subsequent deviation detection is carried out by computing the probability of each system to be observed given all the other systems. Clustering of the systems is done with hierarchical clustering and spectral clustering. The whole technique is demonstrated on four data sets of mechanical systems, two of a simulated cooling system and two of human gait. The results show its applicability on these mechanical systems.
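
A hedged sketch of the similarity computation described in this abstract: each system is compressed with PCA, the similarity between two systems is derived from the principal angles between their PCA subspaces, and the resulting affinity matrix is fed to spectral clustering. The synthetic "systems" and all parameter values are hypothetical, not taken from the thesis.

```python
import numpy as np
from scipy.linalg import subspace_angles
from sklearn.decomposition import PCA
from sklearn.cluster import SpectralClustering

rng = np.random.default_rng(0)
# Hypothetical systems: (samples x sensors) matrices; five share the same
# dominant directions, the last one has a different variance structure.
systems = [rng.normal(size=(100, 6)) * np.array([4, 2, 1, 1, 1, 1]) for _ in range(5)]
systems.append(rng.normal(size=(100, 6)) * np.array([1, 1, 1, 1, 2, 4]))  # deviating system

# Compress each system: keep the leading principal components as a subspace basis.
bases = [PCA(n_components=2).fit(s).components_.T for s in systems]  # shape (6, 2) each

# Similarity from principal angles between the spanned subspaces.
n = len(bases)
affinity = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        angles = subspace_angles(bases[i], bases[j])
        affinity[i, j] = np.prod(np.cos(angles))  # 1.0 for identical subspaces

labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                            random_state=0).fit_predict(affinity)
print(labels)
```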

APA, Harvard, Vancouver, ISO, and other styles
16

Smith, Reuben. "Correlating intrusion alerts with unsupervised learning." Thesis, University of Ottawa (Canada), 2006. http://hdl.handle.net/10393/27179.

Full text
Abstract:
Alert correlation systems attempt to discover the relationships between intrusion detection system (IDS) alerts to determine the motivation of attackers. IDSs are deployed to detect computer attacks against a network, but the output of IDSs is considered low level since a single attack can be represented by several alerts. An alert correlation system enables the intrusion analyst to find important alerts and filter false positives more efficiently. We present an alert correlation system based on unsupervised machine learning algorithms that is accurate and low maintenance. The system is implemented in two stages of correlation. At the first stage of correlation alerts are grouped together such that each group forms one step of an attack. At the second stage the groups created at the first stage are combined such that each combination of groups contains the alerts of precisely one full attack. (Abstract shortened by UMI.)
APA, Harvard, Vancouver, ISO, and other styles
17

Kit, Chun Yu. "Unsupervised lexical learning as inductive inference." Thesis, University of Sheffield, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.340205.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Domingues, Rémi. "Machine Learning for Unsupervised Fraud Detection." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-181027.

Full text
Abstract:
Fraud is a threat that most online service providers must address in the development of their systems to ensure an efficient security policy and the integrity of their revenue. Amadeus, a Global Distribution System providing a transaction platform for flight booking by travel agents, is targeted by fraud attempts that could lead to revenue losses and indemnifications. The objective of this thesis is to detect fraud attempts by applying machine learning algorithms to bookings represented by Passenger Name Record history. Due to the lack of labelled data, the current study presents a benchmark of unsupervised algorithms and aggregation methods. It also describes anomaly detection techniques which can be applied to self-organizing maps and hierarchical clustering. Considering the large number of transactions per second processed by Amadeus back-ends, we finally highlight potential bottlenecks and alternatives.
APA, Harvard, Vancouver, ISO, and other styles
19

Zeltner, Felix. "Autonomous Terrain Classification Through Unsupervised Learning." Thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-60893.

Full text
Abstract:
A key component of autonomous outdoor navigation in unstructured environments is the classification of terrain. Recent developments in the area of machine learning show promising results in the task of scene segmentation but are limited by the labels used during their supervised training. In this work, we present and evaluate a flexible strategy for terrain classification based on three components: a deep convolutional neural network trained on colour, depth and infrared data which provides feature vectors for image segmentation, a set of exchangeable segmentation engines that operate in this feature space, and a novel, air-pressure-based actuator responsible for distinguishing rigid obstacles from those that only appear as such. Through the use of unsupervised learning we eliminate the need for labeled training data and allow our system to adapt to previously unseen terrain classes. We evaluate the performance of this classification scheme on a mobile robot platform in an environment containing vegetation and trees, with a Kinect v2 sensor as a low-cost depth camera. Our experiments show that the features generated by our neural network are currently not competitive with state-of-the-art implementations and that our system is not yet ready for real-world applications.
APA, Harvard, Vancouver, ISO, and other styles
20

Örjehag, Erik. "Unsupervised Learning for Structure from Motion." Thesis, Linköpings universitet, Datorseende, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-173731.

Full text
Abstract:
Perception of depth, ego-motion and robust keypoints is critical for SLAM and structure from motion applications. Neural networks have achieved great performance in perception tasks in recent years. But collecting labeled data for supervised training is labor intensive and costly. This thesis explores recent methods in unsupervised training of neural networks that can predict depth, ego-motion, keypoints and do geometric consensus maximization. The benefit of unsupervised training is that the networks can learn from raw data collected from the camera sensor, instead of labeled data. The thesis focuses on training on images from a monocular camera, where no stereo or LIDAR data is available. The experiments compare different techniques for depth and ego-motion prediction from previous research, and show how the techniques can be combined successfully. A keypoint prediction network is evaluated and its performance is compared with the ORB detector provided by OpenCV. A geometric consensus network is also implemented and its performance is compared with the RANSAC algorithm in OpenCV. The consensus maximization network is trained on the output of the keypoint prediction network. For future work it is suggested that all networks could be combined and trained jointly to reach a better overall performance. The results show (1) which techniques in unsupervised depth prediction are most effective, (2) that the keypoint predicting network outperformed the ORB detector, and (3) that the consensus maximization network was able to classify outliers with comparable performance to the RANSAC algorithm of OpenCV.
APA, Harvard, Vancouver, ISO, and other styles
21

Prost, Vincent. "Sparse unsupervised learning for metagenomic data." Electronic Thesis or Diss., université Paris-Saclay, 2020. http://www.theses.fr/2020UPASL013.

Full text
Abstract:
The development of massively parallel sequencing technologies makes it possible to sequence DNA at high throughput and low cost, fueling the rise of metagenomics, the study of complex microbial communities sequenced in their natural environment. Metagenomic problems are usually computationally difficult and are further complicated by the massive amount of data involved. In this thesis we consider two different metagenomics problems: 1. raw reads binning and 2. microbial network inference from taxonomic abundance profiles. We address them using unsupervised machine learning methods leveraging the parsimony principle, typically involving l1 penalized log-likelihood maximization. The assembly of genomes from raw metagenomic datasets is a challenging task akin to assembling a mixture of large puzzles composed of billions or trillions of pieces (DNA sequences). In the first part of this thesis, we consider the related task of clustering sequences into biologically meaningful partitions (binning). Most of the existing computational tools perform binning after read assembly as a pre-processing step, which is error-prone (yielding artifacts like chimeric contigs) and discards vast amounts of information in the form of unassembled reads (up to 50% for highly diverse metagenomes). This motivated us to address the raw read binning (without prior assembly) problem. We exploit the co-abundance of species across samples as a discriminative signal. Abundance is usually measured via the number of occurrences of long k-mers (subsequences of size k). The use of Locality-Sensitive Hashing (LSH) allows us to contain, at the cost of some approximation, the combinatorial explosion of long k-mer indexing. The first contribution of this thesis is to propose a sparse Non-Negative Matrix Factorization (NMF) of the samples x k-mers count matrix in order to extract abundance variation signals. We first show that using sparse NMF is well-grounded since the data is a sparse linear mixture of non-negative components. Sparse NMF exploiting online dictionary learning algorithms retained our attention, including its decent behavior on largely asymmetric data matrices. The validation of metagenomic binning being difficult on real datasets, because of the absence of ground truth, we created and used several synthetic benchmarks on which the different methods were evaluated. We illustrate that sparse NMF improves on state-of-the-art binning methods on those datasets. Experiments conducted on a real metagenomic cohort of 1135 human gut microbiota showed the relevance of the approach. In the second part of the thesis, we consider metagenomic data after taxonomic profiling: multivariate data representing abundances of taxa across samples. It is known that microbes live in communities structured by ecological interaction between the members of the community. We focus on the problem of the inference of microbial interaction networks from taxonomic profiles. This problem is frequently cast into the paradigm of Gaussian graphical models (GGMs) for which efficient structure inference algorithms are available, like the graphical lasso. Unfortunately, GGMs or variants thereof cannot properly account for the extremely sparse patterns occurring in real-world metagenomic taxonomic profiles. In particular, structural zeros corresponding to true absences of biological signals fail to be properly handled by most statistical methods. We present in this part a zero-inflated log-normal graphical model specifically aimed at handling such "biological" zeros, and demonstrate significant performance gains over state-of-the-art statistical methods for the inference of microbial association networks, with the most notable gains obtained when analyzing taxonomic profiles displaying sparsity levels on par with real-world metagenomic datasets.
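
As a hedged illustration of the core idea in the first part (factorising a samples x k-mers count matrix with a sparsity-inducing NMF and binning k-mers by their factor loadings), here is a minimal scikit-learn sketch; the random count matrix, the number of components and the regularisation values are hypothetical, and the regularisation parameter names differ across scikit-learn versions.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# Hypothetical count matrix: 20 samples x 5000 k-mers (real matrices are far larger).
counts = rng.poisson(lam=1.0, size=(20, 5000)).astype(float)

# Sparse NMF: counts ~ W @ H, with an l1 penalty encouraging sparse factors.
# (scikit-learn >= 1.0 uses alpha_W / alpha_H; older versions use a single alpha.)
model = NMF(n_components=8, init="nndsvda", l1_ratio=1.0,
            alpha_W=0.01, alpha_H=0.01, max_iter=400, random_state=0)
W = model.fit_transform(counts)   # per-sample abundance of each latent component
H = model.components_             # per-k-mer loading on each component

# Bin each k-mer by the component on which it loads most strongly.
kmer_bins = H.argmax(axis=0)
print(np.bincount(kmer_bins))
```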
APA, Harvard, Vancouver, ISO, and other styles
22

Chakeri, Alireza. "Scalable Unsupervised Learning with Game Theory." Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/6616.

Full text
Abstract:
Recently dominant sets, a generalization of the notion of the maximal clique to edge-weighted graphs, have proven to be an effective tool for unsupervised learning and have found applications in different domains. Although they were initially established using optimization and graph theory concepts, recent work has shown fascinating connections with evolutionary game theory, leading to the clustering game framework. However, considering the size of today's data sets, existing methods need to be modified in order to handle massive data. Hence, in this research work, we first address the limitations of the clustering game framework for large data sets theoretically. We propose a new and important question for the clustering community: "How can a cluster of a subset of a dataset be a cluster of the entire dataset?". We show that this problem is coNP-hard in the clustering game framework. Thus, we modify the definition of a cluster from a stable concept to a non-stable but optimal one (a Nash equilibrium). Experiments show that this relaxation does not change the quality of the clusters in practice. Following this alteration and the fact that equilibria are generally compact subsets of vertices, we design an effective strategy to find equilibria representing well-distributed clusters. After finding such equilibria, a linear game-theoretic relation is proposed to assign vertices to the clusters and partition the graph. However, the method inherits a space complexity issue, that is, the similarities between every pair of objects are required, which proves practically intractable for large data sets. To overcome this limitation, after establishing the necessary theoretical tools for a special type of graphs that we call vertex-repeated graphs, we propose the scalable clustering game framework. This approach divides a data set into disjoint chunks of tractable size. The exact clusters of the entire data set are then approximated by the clusters of the chunks; in fact, the exact equilibria of the entire graph are approximated by the equilibria of subsets of the graph. We show theorems that enable significantly improved time complexity for the model. The applications include, but are not limited to, the maximum weight clique problem, large data clustering and image segmentation. Experiments have been carried out on random graphs and the DIMACS benchmark for the maximum weight clique problem, and on magnetic resonance images (MRI) of the human brain consisting of about 4 million examples for large data clustering. Also, on the Berkeley Segmentation Dataset, the proposed method achieves results comparable to the state of the art, providing a parallel framework for image segmentation without any training phase. The results show the effectiveness and efficiency of our approach. In another part of this research work, we generalize the clustering game method to cluster uncertain data where the similarities between the data points are not exactly known, leading to the uncertain clustering game framework. Here, contrary to ensemble clustering approaches, where the results of different similarity matrices are combined, we focus on the average utilities of an uncertain game. We show that the game-theoretic solutions provide stable clusters even in the presence of severe uncertainties. In addition, based on this framework, we propose a novel concept in uncertain data clustering whereby every subset of objects can have a "cluster degree".
Extensive experiments on real-world data sets, as well as on the Berkeley image segmentation dataset, confirm the performance of the proposed method. Finally, instead of dividing a graph into chunks to make the clustering scalable, we study the effect on the clustering outputs of spectral sparsification based on sampling by effective resistance. Through experimental and theoretical observations, we show that the clustering results obtained from sparsified graphs are very similar to the results on the original non-sparsified graphs. The Rand index is consistently around 0.9 to 0.99 in our experiments, even with heavy sparsification.
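
In this framework a cluster is an equilibrium of a clustering game played over a similarity matrix; one standard way to compute such an equilibrium in the dominant-set literature this framework builds on is discrete replicator dynamics. The sketch below is a hedged, minimal illustration on a random similarity matrix with two planted groups, not the author's implementation.

```python
import numpy as np

def replicator_equilibrium(A, iters=2000, tol=1e-12):
    """Run discrete replicator dynamics x_i <- x_i * (A x)_i / (x^T A x)
    on a non-negative similarity matrix A; the support of the fixed point
    it converges to is taken as one cluster (an equilibrium of the game)."""
    n = A.shape[0]
    x = np.full(n, 1.0 / n)                 # start from the barycenter of the simplex
    for _ in range(iters):
        Ax = A @ x
        new_x = x * Ax / (x @ Ax)
        if np.linalg.norm(new_x - x, 1) < tol:
            break
        x = new_x
    return x

rng = np.random.default_rng(0)
# Hypothetical similarity matrix with two planted groups and zero self-similarity.
A = rng.uniform(0.0, 0.2, size=(10, 10))
A[:5, :5] += 0.8
A[5:, 5:] += 0.8
A = (A + A.T) / 2
np.fill_diagonal(A, 0.0)

x = replicator_equilibrium(A)
print("cluster members:", np.where(x > 1e-3)[0])
```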
APA, Harvard, Vancouver, ISO, and other styles
23

Ott, Lionel. "Unsupervised learning for long-term autonomy." Thesis, The University of Sydney, 2014. http://hdl.handle.net/2123/13334.

Full text
Abstract:
This thesis investigates methods to enable a robot to build and maintain an environment model in an automatic manner. Such capabilities are especially important in long-term autonomy, where robots operate for extended periods of time without human intervention. In such scenarios we can no longer assume that the environment and the models will remain static. Rather, changes are expected and the robot needs to adapt to the new, unseen circumstances automatically. The approach described in this thesis is based on clustering the robot's sensing information. This provides a compact representation of the data which can be updated as more information becomes available. The work builds on affinity propagation (Frey and Dueck, 2007), a recent clustering method which obtains high quality clusters while only requiring similarities between pairs of points, and, importantly, selecting the number of clusters automatically. This is essential for real autonomy as we typically do not know "a priori" how many clusters best represent the data. The contributions of this thesis are threefold. First, a self-supervised method capable of learning a visual appearance model in long-term autonomy settings is presented. Second, affinity propagation is extended to handle multiple sensor modalities, which often occur in robotics, in a principled way. Third, a method for joint clustering and outlier selection is proposed which selects a user-defined number of outliers while clustering the data. This is solved using an extension of affinity propagation as well as a Lagrangian duality approach which provides guarantees on the optimality of the solution.
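
A hedged sketch of the building block the abstract describes: affinity propagation needs only pairwise similarities and selects the number of clusters itself. The toy 2-D data and the negative squared-distance similarity are illustrative assumptions, not the thesis setup.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(0)
# Hypothetical sensing data: three loose groups of 2-D points.
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(30, 2)) for c in ([0, 0], [3, 0], [0, 3])])

# Affinity propagation needs only a similarity between every pair of points;
# a common choice is the negative squared Euclidean distance.
S = -pairwise_distances(X, metric="sqeuclidean")

ap = AffinityPropagation(affinity="precomputed", random_state=0).fit(S)
print("number of clusters found:", len(ap.cluster_centers_indices_))
```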
APA, Harvard, Vancouver, ISO, and other styles
24

Dolfe, Rafael, and Keivan Matinzadeh. "Investigating Skin Cancer with Unsupervised Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-259363.

Full text
Abstract:
Skin cancer is one of the most commonly diagnosed cancers in the world. Diagnosis of skin cancer is commonly performed by analysing skin lesions on the patient's body. Today's medical diagnostics use an established set of labels for different types of skin lesions. Another way of categorising skin lesions could be to let a computer perform the analysis without any prior knowledge of the data, where the data is a data set of skin lesion images. This categorisation could then be compared to the already existing medical labels assigned to each image. This categorisation and comparison could provide insight into underlying structures of skin lesion data. To investigate this, three unsupervised learning algorithms (K-means, agglomerative clustering, and spectral clustering) have been used to produce cluster partitionings on a data set of skin lesion images. We found no clear cluster partitionings and no connection to the already existing medical labels. The highest scoring partitioning was produced by spectral clustering when the number of clusters was set to two. Further investigation into the structure of this partitioning revealed that one cluster contained essentially every image. Although relatively low, the score does indicate that the underlying structure may be best represented by a single cluster.
APA, Harvard, Vancouver, ISO, and other styles
25

Qian, Jing. "Unsupervised learning in high-dimensional space." Thesis, Boston University, 2014. https://hdl.handle.net/2144/12951.

Full text
Abstract:
Thesis (Ph.D.)--Boston University
In machine learning, the problem of unsupervised learning is that of trying to explain key features and find hidden structures in unlabeled data. In this thesis we focus on three unsupervised learning scenarios: graph based clustering with imbalanced data, point-wise anomaly detection and anomalous cluster detection on graphs. In the first part we study spectral clustering, a popular graph based clustering technique. We investigate the reason why spectral clustering performs badly on imbalanced and proximal data. We then propose the partition constrained minimum cut (PCut) framework based on a novel parametric graph construction method, that is shown to adapt to different degrees of imbalanced data. We analyze the limit cut behavior of our approach, and demonstrate the significant performance improvement through clustering and semi-supervised learning experiments on imbalanced data. [TRUNCATED]
APA, Harvard, Vancouver, ISO, and other styles
26

Sebbar, Mehdi. "On unsupervised learning in high dimension." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLG003/document.

Full text
Abstract:
In this thesis, we discuss two topics, high-dimensional clustering on the one hand and estimation of mixing densities on the other. The first chapter is an introduction to clustering. We present various popular methods and we focus on one of the main models of our work, the mixture of Gaussians. We also discuss the problems of high-dimensional estimation (Section 1.3) and the difficulty of estimating the number of clusters (Section 1.1.4). In what follows, we briefly present the concepts discussed in this manuscript. Consider a mixture of $K$ Gaussians in $\mathbb{R}^p$. One of the common approaches to estimate the parameters is to use the maximum likelihood estimator. Since this problem is not convex, we cannot guarantee the convergence of classical methods such as gradient descent or Newton's algorithm. However, by exploiting the biconvexity of the negative log-likelihood, the iterative 'Expectation-Maximization' (EM) procedure described in Section 1.2.1 can be used. Unfortunately, this method is not well suited to meet the challenges posed by the high dimension. In addition, it is necessary to know the number of clusters in order to use it. Chapter 2 presents three methods that we have developed to try to solve the problems described above. The works presented there have not been thoroughly researched, for various reasons. The first method, which could be called 'graphical lasso on Gaussian mixtures', consists in estimating the inverses of the covariance matrices $\Sigma$ (Section 2.1) under the hypothesis that these are sparse. We adapt the graphical lasso method of [Friedman et al., 2007] to one component in the case of a mixture and evaluate this method experimentally. The other two methods address the problem of estimating the number of clusters in the mixture. The first is a penalized estimate of the matrix of posterior probabilities $T \in \mathbb{R}^{n \times K}$, whose component $(i, j)$ is the probability that the $i$-th observation is in the $j$-th cluster. Unfortunately, this method proved to be too expensive in complexity (Section 2.2.1). Finally, the second method considered is to penalize the weight vector $\pi$ in order to make it sparse. This method shows promising results (Section 2.2.2). In Chapter 3, we study the maximum likelihood estimator of the density of $n$ i.i.d. observations, under the assumption that it is well approximated by a mixture with a large number of components. The main focus is on statistical properties with respect to the Kullback-Leibler loss. We establish risk bounds taking the form of sharp oracle inequalities both in deviation and in expectation. A simple consequence of these bounds is that the maximum likelihood estimator attains the optimal rate $((\log K)/n)^{1/2}$, up to a possible logarithmic correction, in the problem of convex aggregation when the number $K$ of components is larger than $n^{1/2}$. More importantly, under the additional assumption that the Gram matrix of the components satisfies the compatibility condition, the obtained oracle inequalities yield the optimal rate in the sparsity scenario. That is, if the weight vector is (nearly) $D$-sparse, we get the rate $(D \log K)/n$. As a natural complement to our oracle inequalities, we introduce the notion of nearly-$D$-sparse aggregation and establish matching lower bounds for this type of aggregation. Finally, in Chapter 4, we propose an algorithm that performs the Kullback-Leibler aggregation of components of a dictionary as discussed in Chapter 3.
We compare its performance with different methods: the kernel density estimator, the 'Adaptive Dantzig' estimator, SPADES, and the EM estimator with the BIC criterion. We then propose a method to build the dictionary of densities and study it numerically. This thesis was carried out within the framework of a CIFRE agreement with the company ARTEFACT.
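The following sketch is only meant to illustrate the EM-plus-sparse-weights intuition discussed above: a Gaussian mixture is fitted by EM with deliberately too many components, and components with negligible weight are treated as candidates for pruning. The data and the 0.05 threshold are invented for the example.

```python
# Sketch: fit a Gaussian mixture by EM with deliberately too many components
# and inspect the weight vector; tiny weights mark prunable components.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-3, 1, (200, 2)), rng.normal(3, 1, (200, 2))])  # two true clusters

gm = GaussianMixture(n_components=6, covariance_type="full", random_state=0).fit(X)
print("estimated weights:", np.round(gm.weights_, 3))
print("components with weight > 0.05:", int(np.sum(gm.weights_ > 0.05)))
```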
APA, Harvard, Vancouver, ISO, and other styles
27

Tsang, Wai-Hung. "Kernel methods in supervised and unsupervised learning /." View Abstract or Full-Text, 2003. http://library.ust.hk/cgi/db/thesis.pl?COMP%202003%20TSANG.

Full text
Abstract:
Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2003.
Includes bibliographical references (leaves 46-49). Also available in electronic version. Access restricted to campus users.
APA, Harvard, Vancouver, ISO, and other styles
28

Vollgraf, Roland. "Unsupervised learning methods for statistical signal processing." [S.l.] : [s.n.], 2006. http://opus.kobv.de/tuberlin/volltexte/2007/1488.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Meinicke, Peter. "Unsupervised learning in a generalized regression framework." [S.l. : s.n.], 2000. http://deposit.ddb.de/cgi-bin/dokserv?idn=960755594.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Giguère, Philippe. "Unsupervised learning for mobile robot terrain classification." Thesis, McGill University, 2010. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=95062.

Full text
Abstract:
In this thesis, we consider the problem of having a mobile robot autonomously learn to perceive differences between terrains. The targeted application is for terrain identification. Robust terrain identification can be used to enhance the capabilities of mobile systems, both in terms of locomotion and navigation. For example, a legged amphibious robot that has learned to differentiate sand from water can automatically select its gait on a beach: walking for sand, and swimming for water. The same terrain information can also be used to guide a robot in order to avoid specific terrain types. The problem of autonomous terrain identification is decomposed into two sub-problems: a sensing sub-problem, and a learning sub-problem. In the sensing sub-problem, we look at extracting terrain information from existing sensors, and at the design of a new tactile probe. In particular, we show that inertial sensor measurements and actuator feedback information can be combined to enable terrain identification for a legged robot. In addition, we describe a novel tactile probe designed for improved terrain sensing. In the learning sub-problem, we discuss how temporal or spatial continuities can be exploited to perform the clustering of both time-series and images. Specifically, we present a new algorithm that can be used to train a number of classifiers in order to perform clustering when temporal or spatial dependencies between samples are present. We combine our sensing approach with this clustering technique, to obtain a computational architecture that can learn autonomously to differentiate terrains. This approach is validated experimentally using several different sensing modalities (proprioceptive and tactile) and with two different robotic platforms (on a legged robot named AQUA and a wheeled robot iRobot Create). Finally, we show that the same clustering technique, when combined with image information, can be used to define a new image segmentation algorithm.
APA, Harvard, Vancouver, ISO, and other styles
31

Afzal, Naveed. "Unsupervised relation extraction for e-learning applications." Thesis, University of Wolverhampton, 2011. http://hdl.handle.net/2436/299064.

Full text
Abstract:
In this modern era many educational institutes and business organisations are adopting the e-Learning approach as it provides an effective method for educating and testing their students and staff. The continuous development in the area of information technology and the increasing use of the internet have resulted in a huge global market and rapid growth for e-Learning. Multiple Choice Tests (MCTs) are a popular form of assessment and are quite frequently used by many e-Learning applications as they are well adapted to assessing factual, conceptual and procedural information. In this thesis, we present an alternative to the lengthy and time-consuming activity of developing MCTs by proposing a Natural Language Processing (NLP) based approach that relies on semantic relations extracted using Information Extraction to automatically generate MCTs. Information Extraction (IE) is an NLP field used to recognise the most important entities present in a text, and the relations between those concepts, regardless of their surface realisations. In IE, text is processed at a semantic level that allows a partial representation of the meaning of a sentence to be produced. IE has two major subtasks: Named Entity Recognition (NER) and Relation Extraction (RE). In this work, we present two unsupervised RE approaches (surface-based and dependency-based). The aim of both approaches is to identify the most important semantic relations in a document without assigning explicit labels to them, in order to ensure broad coverage unrestricted to predefined types of relations. In the surface-based approach, we examined different surface pattern types, each implementing different assumptions about the linguistic expression of semantic relations between named entities, while in the dependency-based approach we explored how dependency relations based on dependency trees can be helpful in extracting relations between named entities. Our findings indicate that the presented approaches are capable of achieving high precision rates. Our experiments make use of traditional, manually compiled corpora along with similar corpora automatically collected from the Web. We found that an automatically collected web corpus is still unable to ensure the same level of topic relevance as attained in manually compiled traditional corpora. Comparison between the surface-based and the dependency-based approaches revealed that the dependency-based approach performs better. Our research enabled us to automatically generate questions regarding the important concepts present in a domain by relying on unsupervised relation extraction approaches, as extracted semantic relations allow us to identify key information in a sentence. The extracted patterns (semantic relations) are then automatically transformed into questions. In the surface-based approach, questions are automatically generated from sentences matched by the extracted surface-based semantic patterns, relying on a certain set of rules. Conversely, in the dependency-based approach questions are automatically generated by traversing the dependency tree of the extracted sentence matched by the dependency-based semantic patterns. The MCQ systems produced from these surface-based and dependency-based semantic patterns were extrinsically evaluated by two domain experts in terms of question and distractor readability, usefulness of semantic relations, relevance, acceptability of questions and distractors, and overall MCQ usability.
The evaluation results revealed that the MCQ system based on dependency-based semantic relations performed better than the surface-based one. A major outcome of this work is an integrated system for MCQ generation that has been evaluated by potential end users.
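A toy sketch of the surface-based idea described above: the token sequence between two pre-tagged named entities is treated as a candidate relation pattern and patterns are ranked by frequency. The sentences, entity markup and ranking criterion are invented for illustration and are not the thesis pipeline.

```python
# Sketch: count the token sequences that appear between two tagged entities
# and treat the most frequent ones as candidate relation patterns.
from collections import Counter
import re

sentences = [
    "<E>Insulin</E> is produced by the <E>pancreas</E> .",
    "<E>Bile</E> is produced by the <E>liver</E> .",
    "<E>Oxygen</E> is carried by <E>haemoglobin</E> .",
]

pattern_counts = Counter()
for sentence in sentences:
    match = re.search(r"<E>.+?</E>(.+?)<E>.+?</E>", sentence)
    if match:
        pattern_counts[match.group(1).strip()] += 1

for pattern, count in pattern_counts.most_common():
    print(count, repr(pattern))
```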
APA, Harvard, Vancouver, ISO, and other styles
32

Titsias, Michalis. "Unsupervised learning of multiple objects in images." Thesis, University of Edinburgh, 2005. http://hdl.handle.net/1842/776.

Full text
Abstract:
Developing computer vision algorithms able to learn from unsegmented images containing multiple objects is important since this is how humans constantly learn from visual experiences. In this thesis we consider images containing views of multiple objects and our task is to learn about each of the objects present in the images. This task can be approached as a factorial learning problem, where each image is explained by instantiating a model for each of the objects present with the correct instantiation parameters. A major problem with learning a factorial model is that as the number of objects increases, there is a combinatorial explosion of the number of configurations that need to be considered. We develop a greedy algorithm to extract object models sequentially from the data by making use of a robust statistical method, thus avoiding the combinatorial explosion. When we have video data, we greatly speed up the greedy algorithm by carrying out approximate tracking of the multiple objects in the scene. This method is applied to raw image sequence data and extracts the objects one at a time. First, the (possibly moving) background is learned, and moving objects are found at later stages. The algorithm recursively updates an appearance model so that occlusion is taken into account, and matches this model to the frames through the sequence. We apply this method to learn multiple objects in image sequences as well as articulated parts of the human body. Additionally, we learn a distribution over parts undergoing full affine transformations that expresses the relative movements of the parts. The idea of fitting a model to data sequentially using robust statistics is quite general and it can be applied to other models. We describe a method for training mixture models by learning one component at a time and thus building the mixture model in a sequential manner. We do this by incorporating an outlier component into the mixture model which allows us to fit just one data cluster by "ignoring" the rest of the clusters. Once a model is fitted we remove from consideration all the data explained by this model and then repeat the operation. This algorithm can be used to provide a sensible initialization of the mixture components when we train a mixture model.
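A rough sketch of the "one component at a time" idea under simplifying assumptions (1-D data, a uniform outlier component with a fixed density): fit a single Gaussian robustly by EM, remove the points it explains, and repeat for the next component. All constants are illustrative, not taken from the thesis.

```python
# Sketch: fit one Gaussian plus a fixed-density uniform "outlier" component by
# EM, discard the points the Gaussian claims, and repeat for the next component.
import numpy as np

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(8.0, 1.0, 200)])

def fit_one_component(x, n_iter=50, outlier_density=0.01):
    # initialise on the densest histogram bin so the Gaussian latches onto one cluster
    hist, edges = np.histogram(x, bins=30)
    mu = 0.5 * (edges[np.argmax(hist)] + edges[np.argmax(hist) + 1])
    sigma, w = 1.0, 0.5
    for _ in range(n_iter):
        gauss = w * np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        resp = gauss / (gauss + (1.0 - w) * outlier_density)       # E-step
        mu = np.sum(resp * x) / np.sum(resp)                        # M-step
        sigma = np.sqrt(np.sum(resp * (x - mu) ** 2) / np.sum(resp))
        w = np.mean(resp)
    return mu, sigma, resp

remaining = data
for k in range(2):
    mu, sigma, resp = fit_one_component(remaining)
    print(f"component {k}: mean = {mu:.2f}, std = {sigma:.2f}")
    remaining = remaining[resp <= 0.5]   # keep only the points this component did not explain
```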
APA, Harvard, Vancouver, ISO, and other styles
33

Watts, Oliver Samuel. "Unsupervised learning for text-to-speech synthesis." Thesis, University of Edinburgh, 2013. http://hdl.handle.net/1842/7982.

Full text
Abstract:
This thesis introduces a general method for incorporating the distributional analysis of textual and linguistic objects into text-to-speech (TTS) conversion systems. Conventional TTS conversion uses intermediate layers of representation to bridge the gap between text and speech. Collecting the annotated data needed to produce these intermediate layers is a far from trivial task, possibly prohibitively so for languages in which no such resources are in existence. Distributional analysis, in contrast, proceeds in an unsupervised manner, and so enables the creation of systems using textual data that are not annotated. The method therefore aids the building of systems for languages in which conventional linguistic resources are scarce, but is not restricted to these languages. The distributional analysis proposed here places the textual objects analysed in a continuous-valued space, rather than specifying a hard categorisation of those objects. This space is then partitioned during the training of acoustic models for synthesis, so that the models generalise over objects' surface forms in a way that is acoustically relevant. The method is applied to three levels of textual analysis: to the characterisation of sub-syllabic units, word units and utterances. Entire systems for three languages (English, Finnish and Romanian) are built with no reliance on manually labelled data or language-specific expertise. Results of a subjective evaluation are presented.
APA, Harvard, Vancouver, ISO, and other styles
34

Khaliq, Bilal. "Unsupervised learning of Arabic non-concatenative morphology." Thesis, University of Sussex, 2015. http://sro.sussex.ac.uk/id/eprint/53865/.

Full text
Abstract:
Unsupervised approaches to learning the morphology of a language play an important role in computer processing of language from a practical and theoretical perspective, due to their minimal reliance on manually produced linguistic resources and human annotation. Such approaches have been widely researched for the problem of concatenative affixation, but less attention has been paid to the intercalated (non-concatenative) morphology exhibited by Arabic and other Semitic languages. The aim of this research is to learn the root and pattern morphology of Arabic, with accuracy comparable to manually built morphological analysis systems. The approach is kept free from human supervision or manual parameter settings, assuming only that roots and patterns intertwine to form a word. Promising results were obtained by applying a technique adapted from previous work in concatenative morphology learning, which uses machine learning to determine relatedness between words. The output, with probabilistic relatedness values between words, was then used to rank all possible roots and patterns to form a lexicon. Analysis using trilateral roots resulted in correct root identification accuracy of approximately 86% for inflected words. Although the machine learning-based approach is effective, it is conceptually complex, so an alternative, simpler and computationally efficient approach was then devised to obtain morpheme scores based on comparative counts of roots and patterns. In this approach, root and pattern scores are defined in terms of each other in a mutually recursive relationship, converging to an optimized morpheme ranking. This technique gives slightly better accuracy while being conceptually simpler and more efficient. The approach, after further enhancements, was evaluated on a version of the Quranic Arabic Corpus, attaining a final accuracy of approximately 93%. A comparative evaluation shows this to be superior to two existing, widely used, manually built Arabic stemmers, thus demonstrating the practical feasibility of unsupervised learning of non-concatenative morphology.
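The mutually recursive scoring idea can be illustrated with a toy example (Latin transliteration, invented word list): candidate roots are scored by the patterns they co-occur with and vice versa, iterating until the ranking stabilises. This is a sketch of the principle, not the thesis algorithm.

```python
# Sketch: score candidate roots by the patterns they co-occur with and
# vice versa, iterating the mutually recursive update a few times.
from itertools import combinations
from collections import defaultdict

words = ["kataba", "kutiba", "maktab", "kitab", "darasa", "madras", "dars"]

def candidate_analyses(word):
    """Every way of picking 3 letters as a root; the remainder is the pattern."""
    for idx in combinations(range(len(word)), 3):
        root = "".join(word[i] for i in idx)
        pattern = "".join("_" if i in idx else c for i, c in enumerate(word))
        yield root, pattern

pairs = [rp for w in words for rp in candidate_analyses(w)]
root_score = defaultdict(lambda: 1.0)
patt_score = defaultdict(lambda: 1.0)

for _ in range(10):                       # mutually recursive score updates
    new_root, new_patt = defaultdict(float), defaultdict(float)
    for root, patt in pairs:
        new_root[root] += patt_score[patt]
        new_patt[patt] += root_score[root]
    zr, zp = sum(new_root.values()), sum(new_patt.values())
    root_score = {r: s / zr for r, s in new_root.items()}
    patt_score = {p: s / zp for p, s in new_patt.items()}

print(sorted(root_score, key=root_score.get, reverse=True)[:5])
```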
APA, Harvard, Vancouver, ISO, and other styles
35

Morita, Takashi Ph D. Massachusetts Institute of Technology. "Unsupervised learning of lexical subclasses from phonotactics." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/120612.

Full text
Abstract:
Thesis: Ph. D. in Linguistics, Massachusetts Institute of Technology, Department of Linguistics and Philosophy, 2018.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 203-215).
Languages are constantly borrowing words from one another. Since the donor and recipient languages typically differ in their phonology and phonotactics, the native words and the loanwords of the borrower language can also exhibit different phonology/phonotactics. Accordingly, it has been proposed that the phonotactics of languages such as Japanese is better explained if words are classified into etymologically defined sublexica. However, this sublexical analysis is challenged by a learnability problem: the sublexical membership of words is not directly observable. This study applies a state-of-the-art clustering method (a Dirichlet process mixture model) to a substantial number of Japanese and English words extracted from corpora. It turns out that the predicted clusters largely correspond to the etymologically defined sublexica. Since the clustering method is domain-general and not specialized to sublexicon identification, the results can be taken as statistical evidence for the heterogeneous lexica of the two languages. Moreover, the unsupervised nature of the clustering method demonstrates the learnability of sublexica from naturalistic data. The learned sublexica also replicate linguistic characterizations of actual sublexica proposed in previous literature, such as the biased distribution of (certain substrings of) segments to particular sublexica. In addition, the learned sublexica make informative predictions based on previous experimental studies. These results suggest that the predicted sublexica are linguistically sound. Finally, the predicted sublexica reveal hitherto unnoticed phonotactic properties. These discoveries can be used for further investigation of native speakers' knowledge.
by Takashi Morita.
Ph. D. in Linguistics
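As a sketch of the clustering machinery mentioned in the abstract above, a truncated Dirichlet process mixture (scikit-learn's BayesianGaussianMixture) infers the number of occupied clusters from data. The synthetic vectors below merely stand in for the phonotactic word representations used in the study.

```python
# Sketch: a truncated Dirichlet-process mixture infers how many clusters are
# actually occupied; synthetic vectors stand in for word representations.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (300, 5)), rng.normal(4, 1, (300, 5))])

dpmm = BayesianGaussianMixture(
    n_components=10,                                   # truncation level
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(X)
labels = dpmm.predict(X)
print("occupied clusters:", len(np.unique(labels)))
print("mixture weights  :", np.round(dpmm.weights_, 2))
```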
APA, Harvard, Vancouver, ISO, and other styles
36

Martin, del Campo Barraza Sergio. "Unsupervised feature learning applied to condition monitoring." Doctoral thesis, Luleå tekniska universitet, Institutionen för system- och rymdteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-63113.

Full text
Abstract:
Improving the reliability and efficiency of rotating machinery are central problems in many application domains, such as energy production and transportation. This requires efficient condition monitoring methods, including analytics needed to predict and detect faults and manage the high volume and velocity of data. Rolling element bearings are essential components of rotating machines, which are particularly important to monitor due to the high requirements on the operational conditions. Bearings are also located near the rotating parts of the machines and thereby the signal sources that characterize faults and abnormal operational conditions. Thus, bearings with embedded sensing, analysis and communication capabilities are developed.   However, the analysis of signals from bearings and the surrounding components is a challenging problem due to the high variability and complexity of the systems. For example, machines evolve over time due to wear and maintenance, and the operational conditions typically also vary over time. Furthermore, the variety of fault signatures and failure mechanisms makes it difficult to derive generally useful and accurate models, which enable early detection of faults at reasonable cost. Therefore, investigations of machine learning methods that avoid some of these difficulties by automated on-line adaptation of the signal model are motivated. In particular, can unsupervised feature learning methods be used to automatically derive useful information about the state and operational conditions of a rotating machine? What additional methods are needed to recognize normal operational conditions and detect abnormal conditions, for example in terms of learned features or changes of model parameters?   Condition monitoring systems are typically based on condition indicators that are pre-defined by experts, such as the amplitudes in certain frequency bands of a vibration signal, or the temperature of a bearing. Condition indicators are used to define alarms in terms of thresholds; when the indicator is above (or below) the threshold, an alarm indicating a fault condition is generated, without further information about the root cause of the fault. Similarly, machine learning methods and labeled datasets are used to train classifiers that can be used for the detection of faults. The accuracy and reliability of such condition monitoring methods depends on the type of condition indicators used and the data considered when determining the model parameters. Hence, this approach can be challenging to apply in the field where machines and sensor systems are different and change over time, and parameters have different meaning depending on the conditions. Adaptation of the model parameters to each condition monitoring application and operational condition is also difficult due to the need for labeled training data representing all relevant conditions, and the high cost of manual configuration. Therefore, neither of these solutions is viable in general.   In this thesis I investigate unsupervised methods for feature learning and anomaly detection, which can operate online without pre-training with labeled datasets. Concepts and methods for validation of normal operational conditions and detection of abnormal operational conditions based on automatically learned features are proposed and studied. In particular, dictionary learning is applied to vibration and acoustic emission signals obtained from laboratory experiments and condition monitoring systems. 
The methodology is based on the assumption that signals can be described as a linear superposition of noise and learned atomic waveforms of arbitrary shape, amplitude and position. Greedy sparse coding algorithms and probabilistic gradient methods are used to learn dictionaries of atomic waveforms enabling sparse representation of the vibration and acoustic emission signals. As a result, the model can adapt automatically to different machine configurations, and environmental and operational conditions, with a minimum of initial configuration. In addition, sparse coding results in reduced data rates that can simplify the processing and communication of information in resource-constrained systems. Measures that can be used to detect anomalies in a rotating machine are introduced and studied, like the dictionary distance between an online propagated dictionary and a set of dictionaries learned when the machine is known to operate in healthy conditions. In addition, the possibility of generalizing a dictionary learned from the vibration signal of one machine to another similar machine is studied in the case of wind turbines. The main contributions of this thesis are the extension of unsupervised dictionary learning to condition monitoring for anomaly detection purposes, and the related case studies demonstrating that the learned features can be used to obtain information about the condition. The case studies include vibration signals from controlled ball bearing experiments and wind turbines, and acoustic emission signals from controlled tensile strength tests and bearing contamination experiments. It is found that the dictionary distance between an online propagated dictionary and a baseline dictionary trained in healthy conditions can increase up to three times when a fault appears, without reference to kinematic information like defect frequencies. Furthermore, it is found that in the presence of a bearing defect, impulse-like waveforms with center frequencies that are about two times higher than in the healthy condition are learned. In the case of acoustic emission analysis, it is shown that the representations of signals from different strain stages of stainless steel appear as distinct clusters. Furthermore, the repetition rates of learned acoustic emission waveforms are found to be markedly different for a bearing with and without particles in the lubricant, especially at high rotational speeds above 1000 rpm, where particle contaminants are difficult to detect using conventional methods. Different hyperparameters are investigated and it is found that the model is useful for anomaly detection with as little as 2.5% of the coefficients preserved.
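A minimal sketch, under assumptions, of two ingredients discussed above: learning atomic waveforms from windows of a 1-D signal with dictionary learning, and comparing two dictionaries with a simple correlation-based distance. The signals, window sizes and the particular distance are invented for illustration and are not the thesis implementation.

```python
# Sketch: learn atomic waveforms from overlapping windows of a 1-D signal and
# compare two dictionaries with a simple correlation-based distance.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

def windows(signal, width=64, step=16):
    return np.array([signal[i:i + width] for i in range(0, len(signal) - width, step)])

def learn_dictionary(signal, n_atoms=8, seed=0):
    model = MiniBatchDictionaryLearning(n_components=n_atoms, alpha=1.0, random_state=seed)
    return model.fit(windows(signal)).components_

def dictionary_distance(D1, D2):
    """1 minus the average best absolute correlation between atoms of D1 and D2."""
    D1 = D1 / np.linalg.norm(D1, axis=1, keepdims=True)
    D2 = D2 / np.linalg.norm(D2, axis=1, keepdims=True)
    return 1.0 - np.mean(np.max(np.abs(D1 @ D2.T), axis=1))

t = np.arange(20000)
rng = np.random.default_rng(0)
healthy = np.sin(0.2 * t) + 0.1 * rng.normal(size=t.size)
faulty = healthy.copy()
faulty[::500] += 5.0                      # periodic impulses mimicking a bearing defect

D_healthy = learn_dictionary(healthy)
print("healthy vs healthy:", dictionary_distance(D_healthy, learn_dictionary(healthy, seed=1)))
print("healthy vs faulty :", dictionary_distance(D_healthy, learn_dictionary(faulty)))
```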
APA, Harvard, Vancouver, ISO, and other styles
37

Pelletier, Bertrand. "Unsupervised learning from a goal-driven agent." Dissertation, Carleton University, Department of Systems and Computer Engineering, Ottawa, 1993.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
38

Berkes, Pietro. "Temporal slowness as an unsupervised learning principle." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät I, 2006. http://dx.doi.org/10.18452/15414.

Full text
Abstract:
In this thesis we investigate the relevance of temporal slowness as a principle for the self-organization of the visual cortex and for technical applications. We first introduce and discuss this principle and put it into mathematical terms. We then define the slow feature analysis (SFA) algorithm, which solves the mathematical problem for multidimensional, discrete time series in a finite-dimensional function space. In the main part of the thesis we apply temporal slowness as a learning principle of receptive fields in the visual cortex. Using SFA we learn the input-output functions that, when applied to natural image sequences, vary as slowly as possible in time and thus optimize the slowness objective. The resulting functions can be interpreted as nonlinear spatio-temporal receptive fields and compared to neurons in the primary visual cortex (V1). We find that they reproduce (qualitatively and quantitatively) many of the properties of complex cells in V1, not only the two basic ones, namely a Gabor-like optimal stimulus and phase-shift invariance, but also secondary ones like direction selectivity, non-orthogonal inhibition, end-inhibition and side-inhibition. These results show that a single unsupervised learning principle can account for a rich repertoire of receptive field properties. In order to analyze the nonlinear functions learned by SFA in our model, we developed a set of mathematical and numerical tools to characterize quadratic forms as receptive fields. We extend these tools in a subsequent chapter so that they are of more general interest for theoretical and physiological models. We conclude this thesis by showing the application of the temporal slowness principle to pattern recognition. We reformulate the SFA algorithm such that it can be applied to pattern recognition problems that lack a temporal structure, and present the optimal solutions for this case. We then apply the system to a standard handwritten digits database with good performance.
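A compact sketch of linear slow feature analysis on a toy mixture of a slow and a fast source: whiten the signals, then take the direction in which the temporal derivative has minimal variance. The nonlinear (quadratic) expansion used in the thesis is omitted, and all data are synthetic.

```python
# Sketch of linear SFA: whiten the observed signals, then take the direction
# in which the temporal derivative has minimal variance (the slowest feature).
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0, 2 * np.pi, 2000)
sources = np.column_stack([np.sin(t), np.sin(37 * t)])   # a slow and a fast source
X = sources @ rng.normal(size=(2, 2))                    # unknown linear mixture
X = X - X.mean(axis=0)

evals, evecs = np.linalg.eigh(np.cov(X.T))               # whitening transform
Z = X @ (evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T)

dZ = np.diff(Z, axis=0)                                  # temporal derivative
_, slow_vecs = np.linalg.eigh(np.cov(dZ.T))              # ascending eigenvalues
slowest = Z @ slow_vecs[:, 0]                            # slowest extracted feature

print("|correlation| with the slow source:",
      abs(np.corrcoef(slowest, sources[:, 0])[0, 1]))
```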
APA, Harvard, Vancouver, ISO, and other styles
39

Nikbakht, Silab Rasoul. "Unsupervised learning for parametric optimization in wireless networks." Doctoral thesis, Universitat Pompeu Fabra, 2021. http://hdl.handle.net/10803/671246.

Full text
Abstract:
This thesis studies parametric optimization in cellular and cell-free networks, exploring data-based and expert-based paradigms. Power allocation and power control, which adjust the transmit power to meet different fairness criteria such as max-min or max-product, are crucial tasks in wireless communications that fall into the parametric optimization category. The state-of-the-art approaches for power control and power allocation often demand huge computational costs and are not suitable for real-time applications. To address this issue, we develop a general-purpose unsupervised-learning approach for solving parametric optimizations and extend the well-known fractional power control algorithm. In the data-based paradigm, we create an unsupervised learning framework that defines a custom neural network (NN), incorporating expert knowledge into the NN loss function to solve the power control and power allocation problems. In this approach, a feedforward NN is trained by repeatedly sampling the parameter space, but, rather than solving the associated optimization problem completely, a single step is taken along the gradient of the objective function. The resulting method is applicable to both convex and non-convex optimization problems. It offers a two-to-three orders of magnitude speedup on the power control and power allocation problems compared to a convex solver, whenever the latter is applicable. In the expert-driven paradigm, we investigate the extension of fractional power control to cell-free networks. The resulting closed-form solution can be evaluated for uplink and downlink effortlessly and reaches an (almost) optimum solution in the uplink case. In both paradigms, we place a particular focus on the large-scale gains, i.e. the amount of attenuation experienced by the local-average received power. The slow-varying nature of the large-scale gains relaxes the need for a frequent update of the solutions in both the data-driven and expert-driven paradigms, enabling real-time application for both methods.
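A hedged sketch of the data-driven idea: a feedforward network maps sampled channel gains to transmit powers and is trained by stepping along the gradient of its own (negative) objective, with no labelled optimal powers. The toy interference-free rate objective, the network sizes and all constants are assumptions for illustration, not the thesis model.

```python
# Sketch: the network's own objective (negative sum rate) is the training
# loss; each step samples fresh channel gains, so no labels are ever needed.
import torch

n_users, p_max = 4, 1.0
net = torch.nn.Sequential(
    torch.nn.Linear(n_users, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, n_users), torch.nn.Sigmoid(),    # powers in (0, p_max)
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    gains = torch.rand(128, n_users)                      # sample the parameter space
    power = p_max * net(gains)
    rate = torch.log2(1.0 + gains * power)                # toy interference-free rates
    loss = -rate.sum(dim=1).mean()                        # maximise the average sum rate
    opt.zero_grad()
    loss.backward()
    opt.step()

print("final objective:", -loss.item())
```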
APA, Harvard, Vancouver, ISO, and other styles
40

Hasenjäger, Martina. "Active data selection in supervised and unsupervised learning." [S.l. : s.n.], 2000. http://deposit.ddb.de/cgi-bin/dokserv?idn=960209220.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Bhaskar, Dhananjay. "Morphology based cell classification : unsupervised machine learning approach." Thesis, University of British Columbia, 2017. http://hdl.handle.net/2429/61342.

Full text
Abstract:
Individual cells adapt their morphology as a function of their differentiation status and in response to environmental cues and selective pressures. While it is known that the great majority of these cues and pressures are mediated by changes in intracellular signal transduction, the precise regulatory mechanisms that govern cell shape, size and polarity are not well understood. Systematic investigation of cell morphology involves experimentally perturbing biochemical pathways and observing changes in phenotype. In order to facilitate this work, experimental biologists need software capable of analyzing a large number of microscopic images to classify cells and recognize cell types. Furthermore, automatic cell classification enables pathologists to rapidly diagnose diseases like leukemia that are marked by cell shape deformation. This thesis describes a methodology to identify cells in microscopy images and compute quantitative descriptors that characterize their morphology. Phase-contrast microscopy data is used for the purpose of demonstration. Cells are identified with minimal user input using advanced image segmentation methods. Features (e.g. area, perimeter, curvature, circularity, convexity, etc.) are extracted from the segmented cell boundary to quantify cell morphology. Correlated features are combined to reduce dimensionality, and the resulting feature set is clustered to identify distinct cell morphologies. Clustering results obtained from different combinations of features are compared to identify a minimal set of features without compromising classification accuracy.
Science, Faculty of
Mathematics, Department of
Graduate
APA, Harvard, Vancouver, ISO, and other styles
42

Jossen, Quentin. "Unsupervised learning procedure for nonintrusive appliance load monitoring." Doctoral thesis, Universite Libre de Bruxelles, 2013. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/209369.

Full text
Abstract:
There is a continuously growing number of appliances and energy-dependent services in households. To date, efforts have mostly focused on energy efficiency; however, behavior changes are required for a more sustainable energy consumption. People therefore need to understand their consumption habits to be able to adapt them. Appliance-specific feedback is probably the most efficient way to impact behaviors, since people need to ‘see’ where their electricity goes. Smart meters, currently being extensively rolled out in Europe and in the U.S., are good potential candidates to provide end-users with energy advice. The required functionalities must, however, be rapidly defined if they are to be integrated in the future massive roll-out. Nonintrusive appliance load monitoring aims to derive appliance-specific information from the aggregate electricity consumption. While techniques have been developed since the 80’s, they mainly address the identification of previously learned appliances from a database. Building such a database is an intrusive and tedious process which should be avoided. Although the most recent efforts have focused on unsupervised techniques to disaggregate energy consumption into individual appliances, they usually rely on prior information about the measured appliances, such as the number of appliances, the number of states in each appliance and the power they consume in each state. This information should ideally be learned from the data, and this topic is addressed in the present research. This work presents a framework for unsupervised learning for nonintrusive appliance load monitoring. It aims to discover information about the appliances of a household solely from its aggregate consumption data, with neither prior information nor user intervention. The learning process can be segmented into five tasks: the detection of on/off switching, the extraction of individual load signatures, the identification of recurrent signatures, the discovery of two-state electrical devices and, finally, the elaboration of appliance models. The first four steps are addressed in this work. The suite of algorithms proposed here makes it possible to discover the set of two-state electrical loads from their aggregated consumption. This, along with the evaluation of their operating sequences, is a prerequisite for learning appliance models from the data. Results show that loads consuming power down to a few dozen watts can be learned from the data. This should encourage future researchers to consider such unsupervised learning.
Doctorate in Engineering Sciences
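A toy sketch of the first learning task listed in the abstract above, the detection of on/off switching: step changes in the aggregate power signal that exceed a threshold are flagged as events. The signal and the threshold are invented for illustration and are not the thesis procedure.

```python
# Sketch: flag step changes in the aggregate power signal that exceed a
# threshold as on/off switching events.
import numpy as np

rng = np.random.default_rng(0)
power = np.full(600, 80.0)                  # 80 W base load
power[100:300] += 1500.0                    # a large appliance switching on, then off
power[350:500] += 60.0                      # a small two-state appliance
power += rng.normal(0.0, 2.0, power.size)   # measurement noise

diff = np.diff(power)
threshold = 30.0                            # watts
for i in np.nonzero(np.abs(diff) > threshold)[0]:
    kind = "ON " if diff[i] > 0 else "OFF"
    print(f"{kind} event at t={i + 1}, step of {diff[i]:+.0f} W")
```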

APA, Harvard, Vancouver, ISO, and other styles
43

Khan, Najeed Ahmed. "Unsupervised learning of object detectors for everyday scenes." Thesis, University of Leeds, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.540773.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Bourrier, Anthony. "Compressed sensing and dimensionality reduction for unsupervised learning." Phd thesis, Université Rennes 1, 2014. http://tel.archives-ouvertes.fr/tel-01023030.

Full text
Abstract:
This thesis is motivated by the prospect of bringing signal processing and statistical learning closer together, and more specifically by the use of compressed sensing techniques to reduce the cost of learning tasks. After recalling the basics of compressed sensing and mentioning a few data analysis techniques that rely on similar ideas, we propose a framework for estimating the parameters of mixtures of probability densities in which the training data are compressed into a fixed-size representation. We instantiate this framework on a mixture model of isotropic Gaussians. This proof of concept suggests the existence of theoretical signal reconstruction guarantees for models going beyond the usual sparse-vector model. In a second step we therefore study the generalization of stability results for linear inverse problems to entirely general signal models, and we propose conditions under which reconstruction guarantees can be given in a general setting. Finally, we turn to an approximate nearest-neighbour search problem in which signatures of the vectors are computed in order to reduce complexity. In the setting where the distance of interest derives from a Mercer kernel, we propose to combine an explicit embedding of the data with a subsequent signature computation, which in particular leads to a more accurate approximate search.
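A sketch of the compressive idea in this thesis, under invented notation: the whole training set is summarised by a fixed-size sketch of averaged random Fourier features, whose size does not grow with the number of samples. Recovering the mixture parameters from the sketch is beyond this illustration.

```python
# Sketch: summarise a large training set by a fixed-size vector of averaged
# random Fourier features; its size does not depend on the number of samples.
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20_000, 2, 64                        # many samples, small sketch
X = np.vstack([rng.normal(-2, 1, (n // 2, d)), rng.normal(2, 1, (n // 2, d))])

Omega = rng.normal(size=(d, m))                # random frequencies
sketch = np.exp(1j * X @ Omega).mean(axis=0)   # m complex numbers summarise all of X

print("data shape  :", X.shape)
print("sketch shape:", sketch.shape)
```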
APA, Harvard, Vancouver, ISO, and other styles
45

Dong, Shuonan. "Unsupervised learning and recognition of physical activity plans." Thesis, Massachusetts Institute of Technology, 2007. http://hdl.handle.net/1721.1/42195.

Full text
Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 2007.
Includes bibliographical references (p. 125-129).
This thesis aims to enable a new kind of interaction between humans and computational agents, such as robots or computers, by allowing the agent to anticipate and adapt to human intent. In the future, more robots may be deployed in situations that require collaboration with humans, such as scientific exploration, search and rescue, hospital assistance, and even domestic care. These situations require robots to work together with humans, as part of a team, rather than as a stand-alone tool. The intent recognition capability is necessary for computational agents to play a more collaborative role in human-robot interactions, moving beyond the standard master-slave relationship of humans and computers today. We provide an innovative capability for recognizing human intent, through statistical plan learning and online recognition. We approach the plan learning problem by employing unsupervised learning to automatically determine the activities in a plan based on training data. The plan activities are described by a mixture of multivariate probability densities. The number of distributions in the mixture used to describe the data is assumed to be given. The training data trajectories are fed again through the activities' density distributions to determine each possible sequence of activities that make up a plan. These activity sequences are then summarized with temporal information in a temporal plan network, which consists of a network of all possible plans. Our approach to plan recognition begins with formulating the temporal plan network as a hidden Markov model. Next, we determine the most likely path using the Viterbi algorithm. Finally, we refer back to the temporal plan network to obtain predicted future activities. Our research presents several innovations:
First, we introduce a modified representation of temporal plan networks that incorporates probabilistic information into the state space and temporal representations. Second, we learn plans from actual data, such that the notion of an activity is not arbitrarily or manually defined, but is determined by the characteristics of the data. Third, we develop a recognition algorithm that can perform recognition continuously by making probabilistic updates. Finally, our recognizer not only identifies previously executed activities, but also predicts future activities based on the plan network. We demonstrate the capabilities of our algorithms on motion capture data. Our results show that the plan learning algorithm is able to generate reasonable temporal plan networks, depending on the dimensions of the training data and the recognition resolution used. The plan recognition algorithm is also successful in recognizing the correct activity sequences in the temporal plan network corresponding to the observed test data.
by Shuonan Dong.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
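Editorial aside: the recognition step described in the entry above rests on Viterbi decoding of a hidden Markov model. A minimal, self-contained sketch of that decoding step, using hypothetical toy matrices rather than the thesis's actual plan network, might look like this:

```python
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most likely hidden-state path for an HMM.
    log_pi: (S,) initial log-probs; log_A: (S, S) transition log-probs;
    log_B: (S, O) emission log-probs; obs: sequence of observation indices."""
    S, T = len(log_pi), len(obs)
    delta = np.empty((T, S))
    back = np.zeros((T, S), dtype=int)
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A     # (S, S): previous -> next state
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):                  # backtrack through the pointers
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy example: 2 activities, 3 observation symbols.
log_pi = np.log([0.6, 0.4])
log_A = np.log([[0.7, 0.3], [0.4, 0.6]])
log_B = np.log([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
print(viterbi(log_pi, log_A, log_B, [0, 1, 2, 2]))
```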
46

Chan, Kevin S. (Kevin Sao Wei). "Multiview monocular depth estimation using unsupervised learning methods." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/119753.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 50-51).
Existing learned methods for monocular depth estimation use only a single view of a scene for depth estimation, so they inherently overfit to their training scenes and cannot generalize well to new datasets. This thesis presents a neural network for multiview monocular depth estimation. Teaching a network to estimate depth via structure from motion allows it to generalize better to new environments with unfamiliar objects. This thesis extends recent work in unsupervised methods for single-view monocular depth estimation and uses the reconstruction losses for training posed in those works. Models and baselines were evaluated on a variety of datasets, and the results indicate that multiview models generalize across datasets better than previous work. This work is unique in that it emphasizes cross-domain performance and the ability to generalize more than performance on the training set.
by Kevin S. Chan.
M. Eng.
APA, Harvard, Vancouver, ISO, and other styles
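Editorial aside: the "reconstruction losses" mentioned in the entry above are, in this line of work, typically photometric losses between a target frame and a source frame warped into the target view. A minimal sketch of such a loss, assuming the warped source image has already been produced by some differentiable warping step; the function and variable names are illustrative, not the thesis's code.

```python
import torch

def photometric_loss(warped_src, target, mask=None, eps=1e-6):
    """Mean absolute photometric error between a source frame warped into the
    target view and the target frame itself; used as an unsupervised training
    signal for depth (and pose) networks."""
    diff = (warped_src - target).abs()          # (B, 3, H, W)
    if mask is not None:                        # ignore pixels that warp out of view
        diff = diff * mask
        return diff.sum() / (mask.sum() * diff.shape[1] + eps)
    return diff.mean()

# Toy usage with random tensors standing in for real frames.
target = torch.rand(2, 3, 64, 64)
warped = torch.rand(2, 3, 64, 64)
mask = torch.ones(2, 1, 64, 64)
print(photometric_loss(warped, target, mask).item())
```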
47

Wichrowska, Olga N. "Unsupervised syntactic category learning from child-directed speech." Thesis, Massachusetts Institute of Technology, 2010. http://hdl.handle.net/1721.1/62756.

Full text
Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 57-59).
The goal of this research was to discover what kinds of syntactic categories can be learned using distributional analysis of the linear context of words, specifically in child-directed speech. The idea behind this is that the categories used by children could very well be different from adult categories. There is some evidence that distributional analysis could be used for some aspects of language acquisition, though very strong arguments exist for why it is not enough to acquire grammar. These experiments can help identify what kind of data can be learned from linear context and statistics alone. This paper reports the results of three established automatic syntactic category learning algorithms on a small, edited input set of child-directed speech from the CHILDES database. Hierarchical clustering, K-Means analysis, and an implementation of a substitution algorithm are all used to assign syntactic categories to words based on their linear distributional context. Overall, open classes (nouns, verbs, adjectives) were reliably categorized, and some methods were able to distinguish prepositions, adverbs, subjects vs. objects, and verbs by subcategorization frame. The main barrier standing between these methods and human-like categorization is the inability to deal with the ambiguity that is omnipresent in natural language, which poses an important problem for future models of syntactic category acquisition.
by Olga N. Wichrowska.
M.Eng.
APA, Harvard, Vancouver, ISO, and other styles
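Editorial aside: the clustering methods named in the entry above (hierarchical clustering, K-Means) operate on distributional context vectors. A minimal sketch of that pipeline on a hypothetical toy corpus, using immediate left/right context counts and scikit-learn's KMeans; this is illustrative only and not the thesis's exact setup.

```python
from collections import Counter, defaultdict

import numpy as np
from sklearn.cluster import KMeans

corpus = "the dog runs the cat runs the dog sleeps a cat sleeps".split()

# Build context vectors: counts of the immediately preceding and following word.
vocab = sorted(set(corpus))
contexts = defaultdict(Counter)
for i, w in enumerate(corpus):
    if i > 0:
        contexts[w]["L:" + corpus[i - 1]] += 1
    if i < len(corpus) - 1:
        contexts[w]["R:" + corpus[i + 1]] += 1

feat_names = sorted({f for c in contexts.values() for f in c})
X = np.array([[contexts[w][f] for f in feat_names] for w in vocab], dtype=float)

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
for w, lab in zip(vocab, labels):
    print(lab, w)   # words with similar contexts tend to share a cluster
```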
48

Mansinghka, Vikash Kumar. "Nonparametric Bayesian methods for supervised and unsupervised learning." Thesis, Massachusetts Institute of Technology, 2009. http://hdl.handle.net/1721.1/53172.

Full text
Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009.
Includes bibliographical references (leaves 44-45).
I introduce two nonparametric Bayesian methods for solving problems of supervised and unsupervised learning. The first method simultaneously learns causal networks and causal theories from data. For example, given synthetic co-occurrence data from a simple causal model for the medical domain, it can learn relationships like "having a flu causes coughing", while also learning that observable quantities can be usefully grouped into categories like diseases and symptoms, and that diseases tend to cause symptoms, not the other way around. The second method is an online algorithm for learning a prototype-based model for categorial concepts, and can be used to solve problems of multiclass classification with missing features. I apply it to problems of categorizing newsgroup posts and recognizing handwritten digits. These approaches were inspired by a striking capacity of human learning, which should also be a desideratum for any intelligent system: the ability to learn certain kinds of "simple" or "natural" structures very quickly, while still being able to learn arbitrary, and arbitrarily complex, structures given enough data. In each case, I show how nonparametric Bayesian modeling and inference based on stochastic simulation give us some of the tools we need to achieve this goal.
by Vikash Kumar Mansinghka.
M.Eng.
APA, Harvard, Vancouver, ISO, and other styles
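Editorial aside: nonparametric Bayesian methods of the kind described in the entry above typically rest on priors such as the Chinese restaurant process (CRP), which let the number of clusters or categories grow with the data. A minimal sketch of drawing a partition from a CRP prior, with hypothetical parameter values and purely for illustration:

```python
import random

def crp_partition(n, alpha, seed=0):
    """Sample a partition of n items from a Chinese restaurant process prior.
    Item i joins an existing table with probability proportional to its size,
    or opens a new table with probability proportional to alpha."""
    rng = random.Random(seed)
    tables = []          # current table sizes
    assignments = []
    for _ in range(n):
        weights = tables + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(tables):
            tables.append(1)     # new table = new cluster
        else:
            tables[k] += 1
        assignments.append(k)
    return assignments

print(crp_partition(20, alpha=1.0))   # e.g. a handful of clusters for 20 items
```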
49

Sani, Lorenzo. "Unsupervised clustering of MDS data using federated learning." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amslaurea.unibo.it/25591/.

Full text
Abstract:
In this master's thesis we developed a model for unsupervised clustering on a data set of biomedical data. The data were collected by the GenoMed4All consortium from patients affected by Myelodysplastic Syndrome (MDS), a haematological disease. The main focus is on the collected genetic mutations, which are used as patient features for clustering. Clustering approaches have been used in several studies concerning haematological diseases such as MDS. A neural network-based model was used to solve the task. The results of the clustering have been compared with labels from a "gold standard" technique, i.e. hierarchical Dirichlet processes (HDP). Our model was designed to be implemented in the context of federated learning (FL) as well. This technique achieves machine learning objectives without the need to collect all the data in a single center, allowing strict privacy policies to be respected. Federated learning was used because of these properties and because of the sensitivity of the data. Several recent studies on clinical problems addressed with machine learning endorse the development of federated learning settings in this context, because its privacy-preserving properties could be a cornerstone for applying machine learning techniques to medical data. This work then discusses the clustering performance of the model, as well as its generative capabilities.
APA, Harvard, Vancouver, ISO, and other styles
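Editorial aside: the federated learning setting mentioned in the entry above is most commonly realized via federated averaging, in which each center trains locally and only model parameters are aggregated. A minimal sketch of one aggregation round, with a hypothetical model and hypothetical centers, purely for illustration:

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Aggregate per-center model parameters by a data-size-weighted mean
    (FedAvg). Only parameters leave each center, never the patient data."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# Three hypothetical centers, each holding a locally trained parameter vector.
rng = np.random.default_rng(0)
local_models = [rng.normal(size=4) for _ in range(3)]
local_counts = [120, 45, 80]                 # number of patients per center
global_model = federated_average(local_models, local_counts)
print(global_model)
```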
50

Nallabolu, Adithya Reddy. "Unsupervised Learning of Spatiotemporal Features by Video Completion." Thesis, Virginia Tech, 2017. http://hdl.handle.net/10919/79702.

Full text
Abstract:
In this work, we present an unsupervised representation learning approach for learning rich spatiotemporal features from videos without supervision from semantic labels. We propose to learn the spatiotemporal features by training a 3D convolutional neural network (CNN) using video completion as a surrogate task. Using a large collection of unlabeled videos, we train the CNN to predict the missing pixels of a spatiotemporal hole given the remaining parts of the video, by minimizing a per-pixel reconstruction loss. To achieve good reconstruction results on color videos, the CNN needs a certain level of understanding of the scene dynamics so that it can predict plausible, temporally coherent content. We further explore jointly reconstructing both color frames and flow fields. By exploiting the statistical temporal structure of images, we show that the learned representations capture meaningful spatiotemporal structure from raw videos. We validate the effectiveness of our approach for CNN pre-training on action recognition and action similarity labeling problems. Our quantitative results demonstrate that our method compares favorably against learning without external data and against existing unsupervised learning approaches.
Master of Science
APA, Harvard, Vancouver, ISO, and other styles
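Editorial aside: the video-completion surrogate task described in the entry above amounts to predicting a masked spatiotemporal region and penalizing per-pixel reconstruction error. A minimal sketch of that loss with a tiny 3D CNN in PyTorch; the architecture, sizes, and hole location are hypothetical and purely illustrative.

```python
import torch
import torch.nn as nn

# A tiny 3D conv net standing in for the encoder-decoder used for completion.
net = nn.Sequential(
    nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv3d(16, 3, kernel_size=3, padding=1),
)

video = torch.rand(1, 3, 8, 32, 32)          # (batch, channels, time, H, W)
mask = torch.zeros_like(video)
mask[:, :, 2:6, 8:24, 8:24] = 1.0            # spatiotemporal hole to be filled

corrupted = video * (1 - mask)               # input with the hole blanked out
pred = net(corrupted)

# Per-pixel reconstruction loss computed only inside the hole.
loss = ((pred - video) ** 2 * mask).sum() / mask.sum()
loss.backward()
print(loss.item())
```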