Dissertations / Theses on the topic 'Deep Unsupervised Learning'


Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Deep Unsupervised Learning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Drexler, Jennifer Fox. "Deep unsupervised learning from speech." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/105696.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 87-92).
Automatic speech recognition (ASR) systems have become hugely successful in recent years - we have become accustomed to speech interfaces across all kinds of devices. However, despite the huge impact ASR has had on the way we interact with technology, it is out of reach for a significant portion of the world's population. This is because these systems rely on a variety of manually-generated resources - like transcripts and pronunciation dictionaries - that can be both expensive and difficult to acquire. In this thesis, we explore techniques for learning about speech directly from speech, with no manually generated transcriptions. Such techniques have the potential to revolutionize speech technologies for the vast majority of the world's population. The cognitive science and computer science communities have both been investing increasing time and resources into exploring this problem. However, a full unsupervised speech recognition system is a hugely complicated undertaking and is still a long ways away. As in previous work, we focus on the lower-level tasks which will underlie an eventual unsupervised speech recognizer. We specifically focus on two tasks: developing linguistically meaningful representations of speech and segmenting speech into phonetic units. This thesis approaches these tasks from a new direction: deep learning. While modern deep learning methods have their roots in ideas from the 1960s and even earlier, deep learning techniques have recently seen a resurgence, thanks to huge increases in computational power and new efficient learning algorithms. Deep learning algorithms have been instrumental in the recent progress of traditional supervised speech recognition; here, we extend that work to unsupervised learning from speech.
by Jennifer Fox Drexler.
S.M.
APA, Harvard, Vancouver, ISO, and other styles
2

Ahn, Euijoon. "Unsupervised Deep Feature Learning for Medical Image Analysis." Thesis, University of Sydney, 2020. https://hdl.handle.net/2123/23002.

Full text
Abstract:
The availability of annotated image datasets and recent advances in supervised deep learning methods are enabling the end-to-end derivation of representative image features that can impact a variety of image analysis problems. These supervised methods use prior knowledge derived from labelled training data and approaches, for example, convolutional neural networks (CNNs) have produced impressive results in natural (photographic) image classification. CNNs learn image features in a hierarchical fashion. Each deeper layer of the network learns a representation of the image data that is higher level and semantically more meaningful. However, the accuracy and robustness of image features with supervised CNNs are dependent on the availability of large-scale labelled training data. In medical imaging, these large labelled datasets are scarce mainly due to the complexity of manual annotation and inter- and intra-observer variability in label assignment. The concept of ‘transfer learning’ – the adoption of image features from different domains, e.g., image features learned from natural photographic images – was introduced to address the lack of large amounts of labelled medical image data. These image features, however, are often generic and do not perform well in specific medical image analysis problems. An alternative approach was to optimise these features by retraining the generic features using a relatively small set of labelled medical images. This ‘fine-tuning’ approach, however, is not able to match the overall accuracy of learning image features directly from large collections of data that are specifically related to the problem at hand. An alternative approach is to use unsupervised feature learning algorithms to build features from unlabelled data, which then allows unannotated image archives to be used. Many unsupervised feature learning algorithms such as sparse coding (SC), auto-encoder (AE) and Restricted Boltzmann Machines (RBMs), however, have often been limited to learning low-level features such as lines and edges. In an attempt to address these limitations, in this thesis, we present several new unsupervised deep learning methods to learn semantic high-level features from unlabelled medical images to address the challenge of learning representative visual features in medical image analysis. We present two methods to derive non-linear and non-parametric models, which are crucial to unsupervised feature learning algorithms; one method embeds a kernel learning within CNNs while the other couples clustering with CNNs. We then further improved the quality of image features using domain adaptation methods (DAs) that learn representations that are invariant to domains with different data distributions. We present a deep unsupervised feature extractor to transform the feature maps from the pre-trained CNN on natural images to a set of non-redundant and relevant medical image features. Our feature extractor preserves meaningful generic features from the pre-trained domain and learns specific local features that are more representative of the medical image data. We conducted extensive experiments on 4 public datasets which have diverse visual characteristics of medical images including X-ray, dermoscopic and CT images. Our results show that our methods had better accuracy when compared to other conventional unsupervised methods and competitive accuracy to methods that used state-of-the-art supervised CNNs. 
Our findings suggest that our methods could scale to many different transfer learning or domain adaptation approaches where no, or only small, sets of labelled data are available.
APA, Harvard, Vancouver, ISO, and other styles
3

Caron, Mathilde. "Unsupervised Representation Learning with Clustering in Deep Convolutional Networks." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-227926.

Full text
Abstract:
This master thesis tackles the problem of unsupervised learning of visual representations with deep Convolutional Neural Networks (CNN). Closing the gap between unsupervised and supervised representation learning is one of the main current challenges in image recognition. We propose a novel and simple way of training CNNs on fully unlabeled datasets. Our method jointly optimizes a grouping of the representations and trains a CNN using the groups as supervision. We evaluate the models trained with our method on standard transfer learning experiments from the literature. We find that our method outperforms all self-supervised and unsupervised state-of-the-art approaches. More importantly, our method outperforms those methods even when the unsupervised training set is not ImageNet but an arbitrary subset of images from Flickr.
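The core loop described in this abstract (group the CNN's representations, then use the groups as supervision) can be sketched in a few lines. Below is a minimal illustration of that alternating scheme, assuming k-means as the grouping step; the tiny network, random stand-in images, cluster count and hyper-parameters are placeholders, not the thesis's implementation.

```python
# Minimal sketch: alternate k-means clustering of CNN features with
# classification training on the resulting pseudo-labels.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

torch.manual_seed(0)
images = torch.randn(512, 1, 28, 28)           # stand-in for an unlabelled image set

features = nn.Sequential(                       # small convolutional trunk
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten())
n_clusters = 10
classifier = nn.Linear(32, n_clusters)          # pseudo-label classification head
opt = torch.optim.Adam(list(features.parameters()) + list(classifier.parameters()), lr=1e-3)

for epoch in range(3):
    # 1) cluster the current representations to obtain pseudo-labels
    with torch.no_grad():
        feats = features(images).numpy()
    pseudo = torch.as_tensor(
        KMeans(n_clusters, n_init=10, random_state=0).fit_predict(feats),
        dtype=torch.long)
    # 2) train the CNN to predict its own cluster assignments
    for i in range(0, len(images), 64):
        x, y = images[i:i + 64], pseudo[i:i + 64]
        loss = nn.functional.cross_entropy(classifier(features(x)), y)
        opt.zero_grad(); loss.backward(); opt.step()
    print(f"epoch {epoch}: pseudo-label loss {loss.item():.3f}")
```

In the full method the classification head would be re-initialised after each clustering step, since cluster indices are arbitrary; the sketch keeps it fixed for brevity.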
APA, Harvard, Vancouver, ISO, and other styles
4

Manjunatha, Bharadwaj Sandhya. "Land Cover Quantification using Autoencoder based Unsupervised Deep Learning." Thesis, Virginia Tech, 2020. http://hdl.handle.net/10919/99861.

Full text
Abstract:
This work aims to develop a deep learning model for land cover quantification through hyperspectral unmixing using an unsupervised autoencoder. Land cover identification and classification is instrumental in urban planning, environmental monitoring and land management. With the technological advancements in remote sensing, hyperspectral imagery which captures high resolution images of the earth's surface across hundreds of wavelength bands, is becoming increasingly popular. The high spectral information in these images can be analyzed to identify the various target materials present in the image scene based on their unique reflectance patterns. An autoencoder is a deep learning model that can perform spectral unmixing by decomposing the complex image spectra into its constituent materials and estimating their abundance compositions. The advantage of using this technique for land cover quantification is that it is completely unsupervised and eliminates the need for labelled data which generally requires years of field survey and formulation of detailed maps. We evaluate the performance of the autoencoder on various synthetic and real hyperspectral images consisting of different land covers using similarity metrics and abundance maps. The scalability of the technique with respect to landscapes is assessed by evaluating its performance on hyperspectral images spanning across 100m x 100m, 200m x 200m, 1000m x 1000m, 4000m x 4000m and 5000m x 5000m regions. Finally, we analyze the performance of this technique by comparing it to several supervised learning methods like Support Vector Machine (SVM), Random Forest (RF) and multilayer perceptron using F1-score, Precision and Recall metrics and other unsupervised techniques like K-Means, N-Findr, and VCA using cosine similarity, mean square error and estimated abundances. The land cover classification obtained using this technique is compared to the existing United States National Land Cover Database (NLCD) classification standard.
Master of Science
This work aims to develop an automated deep learning model for identifying and estimating the composition of the different land covers in a region using hyperspectral remote sensing imagery. With the technological advancements in remote sensing, hyperspectral imagery which captures high resolution images of the earth's surface across hundreds of wavelength bands, is becoming increasingly popular. As every surface has a unique reflectance pattern, the high spectral information contained in these images can be analyzed to identify the various target materials present in the image scene. An autoencoder is a deep learning model that can perform spectral unmixing by decomposing the complex image spectra into its constituent materials and estimate their percent compositions. The advantage of this method in land cover quantification is that it is an unsupervised technique which does not require labelled data which generally requires years of field survey and formulation of detailed maps. The performance of this technique is evaluated on various synthetic and real hyperspectral datasets consisting of different land covers. We assess the scalability of the model by evaluating its performance on images of different sizes spanning over a few hundred square meters to thousands of square meters. Finally, we compare the performance of the autoencoder based approach with other supervised and unsupervised deep learning techniques and with the current land cover classification standard.
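As a rough illustration of the unmixing autoencoder idea described above, the sketch below constrains the bottleneck to the abundance simplex with a softmax and uses a linear decoder whose weight columns play the role of endmember spectra. The synthetic 200-band pixels, four endmembers and training settings are assumptions for illustration, not the thesis's model or data.

```python
# Minimal sketch of a spectral-unmixing autoencoder: the softmax bottleneck acts
# as abundance fractions (non-negative, summing to one) and the linear decoder's
# weight matrix acts as the endmember spectra.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_bands, n_endmembers = 200, 4
true_E = torch.rand(n_endmembers, n_bands)                 # synthetic endmember spectra
true_A = torch.distributions.Dirichlet(torch.ones(n_endmembers)).sample((1024,))
pixels = true_A @ true_E + 0.01 * torch.randn(1024, n_bands)

encoder = nn.Sequential(nn.Linear(n_bands, 64), nn.ReLU(),
                        nn.Linear(64, n_endmembers))
decoder = nn.Linear(n_endmembers, n_bands, bias=False)     # columns ~ endmember spectra
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-2)

for step in range(500):
    abundances = torch.softmax(encoder(pixels), dim=1)     # abundance simplex constraint
    recon = decoder(abundances)                            # linear mixing model
    loss = nn.functional.mse_loss(recon, pixels)
    opt.zero_grad(); loss.backward(); opt.step()

estimated_endmembers = decoder.weight.detach().T           # (n_endmembers, n_bands)
print("reconstruction MSE:", loss.item())
```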
APA, Harvard, Vancouver, ISO, and other styles
5

Martin, Damien W. "Fault detection in manufacturing equipment using unsupervised deep learning." Thesis, Massachusetts Institute of Technology, 2021. https://hdl.handle.net/1721.1/130698.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2021
Cataloged from the official PDF of thesis.
Includes bibliographical references (pages 87-90).
We investigate the use of unsupervised deep learning to create a general purpose automated fault detection system for manufacturing equipment. Unexpected equipment faults can be costly to manufacturing lines, but data driven fault detection systems often require a high level of application specific expertise to implement and continued human oversight. Collecting large labeled datasets to train such a system can also be challenging due to the sparse nature of faults. To address this, we focus on unsupervised deep learning approaches, and their ability to generalize across applications without changes to the hyper-parameters or architecture. Previous work has demonstrated the efficacy of autoencoders in unsupervised anomaly detection systems. In this work we propose a novel variant of the deep auto-encoding Gaussian mixture model, optimized for time series applications, and test its efficacy in detecting faults across a range of manufacturing equipment. It was tested against fault datasets from three milling machines, two plasma etchers, and one spinning ball bearing. In our tests, the model is able to detect over 80% of faults in all cases without the use of labeled data and without hyperparameter changes between applications. We also find that the model is capable of classifying different failure modes in some of our tests, and explore other ways the system can be used to provide useful diagnostic information. We present preliminary results from a continual learning variant of our fault detection architecture aimed at tackling the problem of system drift.
by Damien W. Martin.
M. Eng.
M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
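A heavily simplified sketch of the autoencoding-Gaussian-mixture idea in the abstract above: an autoencoder is trained on sliding windows assumed to be fault-free, a Gaussian mixture is then fitted to the latent code together with the reconstruction error, and windows with low likelihood under that mixture are flagged. The synthetic signal, window length, helper names and model sizes are illustrative assumptions, and this is not the thesis's time-series DAGMM variant.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
t = np.arange(20000) / 100.0
signal = np.sin(2 * np.pi * t) + 0.05 * rng.standard_normal(t.size)
signal[15000:15200] += 1.5                        # injected "fault" segment

win = 64
windows = np.stack([signal[i:i + win] for i in range(0, signal.size - win, win)])
x = torch.tensor(windows, dtype=torch.float32)
train = x[:200]                                   # early, fault-free windows

ae = nn.Sequential(nn.Linear(win, 16), nn.ReLU(), nn.Linear(16, 4),
                   nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, win))
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
for _ in range(400):
    loss = nn.functional.mse_loss(ae(train), train)
    opt.zero_grad(); loss.backward(); opt.step()

def energy_features(batch):
    # concatenate the latent code with the per-window reconstruction error
    with torch.no_grad():
        latent = ae[:3](batch)                    # encoder half of the stack
        err = ((ae(batch) - batch) ** 2).mean(dim=1, keepdim=True)
    return torch.cat([latent, err], dim=1).numpy()

gmm = GaussianMixture(n_components=2, random_state=0).fit(energy_features(train))
scores = -gmm.score_samples(energy_features(x))   # higher = more anomalous
print("most anomalous window index:", int(scores.argmax()))
```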
APA, Harvard, Vancouver, ISO, and other styles
6

Liu, Dongnan. "Supervised and Unsupervised Deep Learning-based Biomedical Image Segmentation." Thesis, The University of Sydney, 2021. https://hdl.handle.net/2123/24744.

Full text
Abstract:
Biomedical image analysis plays a crucial role in the development of healthcare, with a wide scope of applications including disease diagnosis, clinical treatment, and prognosis. Among the various biomedical image analysis techniques, segmentation is an essential step, which aims at assigning each pixel a category and instance label of interest. At an early stage, segmentation results were obtained via manual annotation, which is time-consuming and error-prone. Over the past few decades, hand-crafted feature based methods have been proposed to segment biomedical images automatically. However, these methods heavily rely on prior knowledge, which limits their generalization ability across diverse biomedical images. With the recent advance of deep learning techniques, convolutional neural network (CNN) based methods have achieved state-of-the-art performance on various natural and biomedical image segmentation tasks. The great success of CNN based segmentation methods results from their ability to learn contextual and local information from a high dimensional feature space. However, biomedical image segmentation tasks are particularly challenging, due to complicated background components, high variability of object appearances, numerous overlapping objects, and ambiguous object boundaries. To this end, it is necessary to establish automated deep learning-based segmentation paradigms capable of processing the complicated semantic and morphological relationships in various biomedical images. In this thesis, we propose novel deep learning-based methods for fully supervised and unsupervised biomedical image segmentation tasks. In the first part of the thesis, we introduce fully supervised deep learning-based segmentation methods for various biomedical image analysis scenarios. First, we design a panoptic structure paradigm for nuclei instance segmentation in histopathology images and cell instance segmentation in fluorescence microscopy images. Traditional proposal-based and proposal-free instance segmentation methods are only capable of leveraging either global contextual or local instance information; our panoptic paradigm integrates both and therefore achieves better performance. Second, we propose a multi-level feature fusion architecture for semantic neuron membrane segmentation in electron microscopy (EM) images. Third, we propose a 3D anisotropic paradigm for brain tumor segmentation in magnetic resonance images, which enlarges the model receptive field while maintaining memory efficiency. Although our fully supervised methods achieve competitive performance on several biomedical image segmentation tasks, they heavily rely on annotations of the training images. Labeling pixel-level segmentation ground truth for biomedical images is expensive and labor-intensive, so exploring unsupervised segmentation methods that do not require annotations is an important topic for biomedical image analysis. In the second part of the thesis, we focus on unsupervised biomedical image segmentation methods. First, we propose a panoptic feature alignment paradigm for unsupervised nuclei instance segmentation in histopathology images and mitochondria instance segmentation in EM images. To the best of our knowledge, this is the first unsupervised deep learning-based method designed for a range of biomedical image instance segmentation tasks.
Second, we design a feature disentanglement architecture for unsupervised object recognition. In addition to unsupervised instance segmentation for biomedical images, our method also achieves state-of-the-art performance on unsupervised object detection for natural images, which further demonstrates its effectiveness and generalization ability.
APA, Harvard, Vancouver, ISO, and other styles
7

Nasrin, Mst Shamima. "Pathological Image Analysis with Supervised and Unsupervised Deep Learning Approaches." University of Dayton / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1620052562772676.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Wu, Xinheng. "A Deep Unsupervised Anomaly Detection Model for Automated Tumor Segmentation." Thesis, The University of Sydney, 2020. https://hdl.handle.net/2123/22502.

Full text
Abstract:
Many studies have investigated computer-aided diagnosis (CAD) for automated tumor segmentation in various medical imaging modalities, e.g., magnetic resonance (MR), computed tomography (CT) and positron emission tomography (PET). Recent advances in automated tumor segmentation have been achieved by supervised deep learning (DL) methods trained on large labelled datasets that cover tumor variations. However, such training data are scarce due to the cost of the labeling process, and with insufficient training data, supervised DL methods have difficulty generating effective feature representations for tumor segmentation. This thesis aims to develop an unsupervised DL method that exploits the large volumes of unlabeled data generated during the clinical process. Our assumption, following unsupervised anomaly detection (UAD), is that normal data have constrained anatomy and variations, while anomalies, i.e., tumors, usually differ from normality with high diversity. We demonstrate our method for automated tumor segmentation on two different image modalities. Firstly, given the bilateral symmetry of normal human brains and the asymmetry introduced by brain tumors, we propose a symmetry-driven deep UAD model based on a generative adversarial network (GAN) that models the normal symmetric variations and segments tumors by their asymmetry. We evaluated our method on two benchmark datasets. Our results show that our method outperformed the state-of-the-art unsupervised brain tumor segmentation methods and achieved competitive performance with supervised segmentation methods. Secondly, we propose a multi-modal deep UAD model for PET-CT tumor segmentation. We model a manifold of normal variations shared across normal CT and PET pairs; this manifold, representing normal pairing, can then be used to segment the anomalies. We evaluated our method on two PET-CT datasets and the results show that it outperformed state-of-the-art unsupervised methods, supervised methods and baseline fusion techniques.
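A toy sketch of the symmetry prior behind the first model described above: a small network is trained on (roughly symmetric) normal images to predict one hemisphere from the mirrored other, so at test time asymmetric structures stand out as large residuals. The synthetic 2-D "scans" and the plain convolutional regressor are stand-ins for the thesis's MR data and GAN.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def synthetic_scan(n, tumour=False):
    # roughly symmetric "anatomy" plus, optionally, a one-sided bright blob
    yy, xx = torch.meshgrid(torch.linspace(-1, 1, 64), torch.linspace(-1, 1, 64),
                            indexing="ij")
    scans = torch.exp(-(xx ** 2 + yy ** 2) / 0.4).repeat(n, 1, 1)
    scans = scans + 0.05 * torch.randn(n, 64, 64)
    if tumour:
        scans = scans + 0.8 * torch.exp(-((xx - 0.4) ** 2 + (yy + 0.3) ** 2) / 0.02)
    return scans.unsqueeze(1)

normal = synthetic_scan(64)
net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(8, 1, 3, padding=1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

left = lambda x: x[..., :32]                          # left hemisphere
right_mirrored = lambda x: torch.flip(x[..., 32:], dims=[-1])

for _ in range(300):                                  # learn right-from-left on normals
    loss = nn.functional.mse_loss(net(left(normal)), right_mirrored(normal))
    opt.zero_grad(); loss.backward(); opt.step()

test = synthetic_scan(1, tumour=True)
residual = (net(left(test)) - right_mirrored(test)).abs().detach()
print("max asymmetry residual (tumour case):", residual.max().item())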
APA, Harvard, Vancouver, ISO, and other styles
9

Längkvist, Martin. "Modeling time-series with deep networks." Doctoral thesis, Örebro universitet, Institutionen för naturvetenskap och teknik, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-39415.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Dekhtiar, Jonathan. "Deep Learning and unsupervised learning to automate visual inspection in the manufacturing industry." Thesis, Compiègne, 2019. http://www.theses.fr/2019COMP2513.

Full text
Abstract:
The exponential growth of computing needs and resources implies a growing need for the automation of industrial processes. This is particularly visible for automatic visual inspection on production lines which, although studied since 1970, still struggles to be applied at large scale and at low cost. The methods used depend greatly on the availability of domain experts, which inevitably leads to increased costs and reduced flexibility. Since 2012, advances in the field of Deep Learning have enabled much progress in this direction, particularly thanks to convolutional neural networks, which have achieved near-human performance in many areas associated with visual perception (e.g., object recognition and detection). This thesis proposes an unsupervised approach to meet the needs of automatic visual inspection. The method, called AnoAEGAN, combines adversarial learning and the estimation of a probability density function. These two complementary approaches make it possible to jointly estimate the pixel-by-pixel probability of a visual defect in an image. The model is trained from a very limited number of images (i.e., fewer than 1000) without using expert knowledge to label the data beforehand. This allows increased flexibility through a limited training time and great versatility, demonstrated on ten different tasks without any modification of the model. The method should reduce development costs and the time required to deploy in production, and it can also be deployed in a complementary way to a supervised approach in order to benefit from the advantages of each.
APA, Harvard, Vancouver, ISO, and other styles
11

Boschini, Matteo. "Unsupervised Learning of Scene Flow." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/16226/.

Full text
Abstract:
As Computer Vision-powered autonomous systems are increasingly deployed to solve problems in the wild, the case is made for developing visual understanding methods that are robust and flexible. One of the most challenging tasks for this purpose is given by the extraction of scene flow, that is the dense three-dimensional vector field that associates each world point with its corresponding position in the next observed frame, hence describing its three-dimensional motion entirely. The recent addition of a limited amount of ground truth scene flow information to the popular KITTI dataset prompted a renewed interest in the study of techniques for scene flow inference, although the proposed solutions in literature mostly rely on computation-intensive techniques and are characterised by execution times that are not suited for real-time application. In the wake of the recent widespread adoption of Deep Learning techniques to Computer Vision tasks and in light of the convenience of Unsupervised Learning for scenarios in which ground truth collection is difficult and time-consuming, this thesis work proposes the first neural network architecture to be trained in end-to-end fashion for unsupervised scene flow regression from monocular visual data, called Pantaflow. The proposed solution is much faster than currently available state-of-the-art methods and therefore represents a step towards the achievement of real-time scene flow inference.
APA, Harvard, Vancouver, ISO, and other styles
12

Feng, Zeyu. "Learning Deep Representations from Unlabelled Data for Visual Recognition." Thesis, The University of Sydney, 2021. https://hdl.handle.net/2123/26876.

Full text
Abstract:
Self-supervised learning (SSL) aims to extract transferable semantic features from abundant unlabelled images; these features benefit various downstream visual tasks by reducing the sample complexity when human annotated labels are scarce. SSL is promising because it also boosts performance in diverse tasks when combined with the knowledge of existing techniques. Therefore, it is important and meaningful to study how SSL leads to better transferability and to design novel SSL methods. To this end, this thesis proposes several methods to improve SSL and its function in downstream tasks. We begin by investigating the effect of unlabelled training data, and introduce an information-theoretical constraint for SSL from multiple related domains. In contrast to the conventional single-dataset setting, exploiting multiple domains has the benefits of decreasing the built-in bias of individual domains and allowing knowledge transfer across domains. Thus, the learned representation is more unbiased and transferable. Next, we describe a feature decoupling (FD) framework that incorporates invariance into predicting transformations, one main category of SSL methods, motivated by the observation that such methods often lead to co-variant features unfavourable for transfer. Our model learns a split representation that contains both transformation-related and transformation-unrelated parts. FD achieves state-of-the-art results on SSL benchmarks. We also present a multi-task method, with theoretical analysis, for contrastive learning, the other main category of SSL, by leveraging the semantic information from synthetic images to facilitate the learning of class-related semantics. Finally, we explore self-supervision in open-set unsupervised classification with knowledge of the source domain. We propose to enforce consistency under transformations of the target data and to discover pseudo-labels from confident predictions. Experimental results outperform state-of-the-art open-set domain adaptation methods.
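The feature decoupling idea mentioned in this abstract can be illustrated with a rotation-prediction toy example: one half of the representation is trained to predict the applied rotation, while the other half is penalised for changing under it. Everything below (random stand-in images, the 50/50 split of the feature vector, the loss weighting) is an assumption for illustration rather than the thesis's FD model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
images = torch.randn(256, 1, 32, 32)              # stand-in for unlabelled images

encoder = nn.Sequential(nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
                        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 64))
rot_head = nn.Linear(32, 4)                       # classifies 0/90/180/270 degrees
opt = torch.optim.Adam(list(encoder.parameters()) + list(rot_head.parameters()), lr=1e-3)

for step in range(100):
    idx = torch.randint(0, len(images), (64,))
    k = torch.randint(0, 4, (64,))                # rotation label per example
    rotated = torch.stack([torch.rot90(images[i], int(r), dims=(1, 2))
                           for i, r in zip(idx, k)])
    z = encoder(rotated)
    z_var, z_inv = z[:, :32], z[:, 32:]           # decoupled halves of the feature
    cls_loss = nn.functional.cross_entropy(rot_head(z_var), k)   # rotation-related part
    inv_loss = nn.functional.mse_loss(z_inv, encoder(images[idx])[:, 32:])  # invariant part
    loss = cls_loss + 0.1 * inv_loss
    opt.zero_grad(); loss.backward(); opt.step()

print("rotation loss:", cls_loss.item(), "invariance loss:", inv_loss.item())
```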
APA, Harvard, Vancouver, ISO, and other styles
13

Jiménez-Pérez, Guillermo. "Deep learning and unsupervised machine learning for the quantification and interpretation of electrocardiographic signals." Doctoral thesis, Universitat Pompeu Fabra, 2022. http://hdl.handle.net/10803/673555.

Full text
Abstract:
Electrocardiographic signals, whether acquired on the patient's skin (surface electrocardiogram, ECG) or invasively through catheterization (intracavitary electrocardiogram, iECG), offer rich insight into the patient's cardiac condition and function given their ability to represent the electrical activity of the heart. However, the interpretation of ECG and iECG signals is a complex task that requires years of experience, making correct diagnosis difficult for non-specialists, during stressful situations such as in the intensive care unit, or in radiofrequency ablation (RFA) procedures where the physician has to interpret hundreds or thousands of individual signals. From the computational point of view, the development of high-performing analysis pipelines suffers from the lack of large-scale annotated databases and from the "black-box" nature of state-of-the-art analysis approaches. This thesis develops machine learning-based algorithms that aid physicians in the task of automatic ECG and iECG interpretation. The contributions of this thesis are fourfold. Firstly, an ECG delineation tool has been developed for the markup of the onsets and offsets of the main cardiac waves (P, QRS and T waves) in recordings comprising any configuration of leads. Secondly, a novel synthetic data augmentation algorithm has been developed to palliate the impact of small-scale datasets on the development of robust delineation algorithms. Thirdly, this methodology was applied to similar data, intracavitary electrocardiographic recordings, with the objective of marking the onsets and offsets of local and far-field activations to facilitate the localization of suitable ablation sites; for this purpose, the previously developed ECG delineation algorithm was employed to pre-process the data and mark the QRS detection fiducials. Finally, the ECG delineation approach was employed alongside a dimensionality reduction algorithm, Multiple Kernel Learning, to aggregate the information of 12-lead ECGs with the objective of developing a pipeline for risk stratification of sudden cardiac death in patients with hypertrophic cardiomyopathy.
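Delineation as described above can be framed as per-sample segmentation; the hedged sketch below trains a small 1-D convolutional network to label every sample as background/P/QRS/T and then reads wave onsets and offsets off the transitions of the predicted label track. The synthetic strips, label layout and architecture are placeholders, not the thesis's network or data.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
ecg = 0.05 * torch.randn(8, 1, 500)               # 8 synthetic 500-sample strips
labels = torch.zeros(8, 500, dtype=torch.long)    # 0=background, 1=P, 2=QRS, 3=T
ecg[:, 0, 100:140] += 0.3; labels[:, 100:140] = 1  # P-like bump
ecg[:, 0, 200:230] += 1.0; labels[:, 200:230] = 2  # QRS-like bump
ecg[:, 0, 300:380] += 0.5; labels[:, 300:380] = 3  # T-like bump

net = nn.Sequential(nn.Conv1d(1, 16, 9, padding=4), nn.ReLU(),
                    nn.Conv1d(16, 16, 9, padding=4), nn.ReLU(),
                    nn.Conv1d(16, 4, 1))          # per-sample class scores
opt = torch.optim.Adam(net.parameters(), lr=1e-2)
for _ in range(300):
    loss = nn.functional.cross_entropy(net(ecg), labels)
    opt.zero_grad(); loss.backward(); opt.step()

pred = net(ecg).argmax(dim=1)[0]                  # label track of the first strip
boundaries = (torch.nonzero(pred[1:] != pred[:-1]).flatten() + 1).tolist()
print("predicted wave onset/offset sample indices:", boundaries)
```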
APA, Harvard, Vancouver, ISO, and other styles
14

De, Deuge Mark. "Manifold Learning Approaches to Compressing Latent Spaces of Unsupervised Feature Hierarchies." Thesis, The University of Sydney, 2015. http://hdl.handle.net/2123/14551.

Full text
Abstract:
Field robots encounter dynamic unstructured environments containing a vast array of unique objects. In order to make sense of the world in which they are placed, they collect large quantities of unlabelled data with a variety of sensors. Producing robust and reliable applications depends entirely on the ability of the robot to understand the unlabelled data it obtains. Deep Learning techniques have had a high level of success in learning powerful unsupervised representations for a variety of discriminative and generative models. Applying these techniques to problems encountered in field robotics remains a challenging endeavour. Modern Deep Learning methods are typically trained with a substantial labelled dataset, while datasets produced in a field robotics context contain limited labelled training data. The primary motivation for this thesis stems from the problem of applying large scale Deep Learning models to field robotics datasets that are label poor. While the lack of labelled ground truth data drives the desire for unsupervised methods, the need for improving the model scaling is driven by two factors: performance and computational requirements. When utilising unsupervised layer outputs as representations for classification, the classification performance increases with layer size. Scaling up models with multiple large layers of features is problematic, as the sizes of subsequent hidden layers scale with the size of the previous layer. This quadratic scaling, and the associated time required to train such networks, has prevented adoption of large Deep Learning models beyond cluster computing. The contributions in this thesis are developed from the observation that parameters or filter elements learnt in Deep Learning systems are typically highly structured, and contain related elements. Firstly, the structure of unsupervised filters is utilised to construct a mapping from the high dimensional filter space to a low dimensional manifold. This creates a significantly smaller representation for subsequent feature learning. This mapping, and its effect on the resulting encodings, highlights the need for the ability to learn highly overcomplete sets of convolutional features. Driven by this need, the unsupervised pretraining of Deep Convolutional Networks is developed to include a number of modern training and regularisation methods. These pretrained models are then used to provide initialisations for supervised convolutional models trained on low quantities of labelled data. By utilising pretraining, a significant increase in classification performance on a number of publicly available datasets is achieved. In order to apply these techniques to outdoor 3D Laser Illuminated Detection And Ranging data, we develop a set of resampling techniques to provide uniform input to Deep Learning models. The features learnt in these systems outperform the high effort hand engineered features developed specifically for 3D data. The representation of a given signal is then reinterpreted as a combination of modes that exist on the learnt low dimensional filter manifold. From this, we develop an encoding technique that allows the high dimensional layer output to be represented as a combination of low dimensional components. This allows the growth of subsequent layers to depend only on the intrinsic dimensionality of the filter manifold and not on the number of elements contained in the previous layer.
Finally, the resulting unsupervised convolutional model, the encoding frameworks and the embedding methodology are used to produce a new unsupervised learning strategy that is able to encode images in terms of overcomplete filter spaces, without producing an explosion in the size of the intermediate parameter spaces. This model produces classification results on par with state of the art models, yet requires significantly fewer computational resources and is suitable for use in the constrained computation environment of a field robot.
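One way to picture the filter-manifold idea in this abstract is to map a structured filter bank to a handful of coordinates with a linear dimensionality reduction. The sketch below builds Gabor-like filters (a two-parameter family, so intrinsically low-dimensional) and compresses the 81-dimensional filter space with PCA; both the synthetic filters and the choice of PCA are illustrative assumptions, not the thesis's embedding method.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
size, n_filters = 9, 128
xx, yy = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size))
filters = []
for _ in range(n_filters):                        # structured, related filter elements
    theta = rng.uniform(0, np.pi)
    phase = rng.uniform(0, 2 * np.pi)
    u = xx * np.cos(theta) + yy * np.sin(theta)
    filters.append(np.exp(-(xx ** 2 + yy ** 2) / 0.5) * np.cos(6 * u + phase))
filters = np.stack(filters).reshape(n_filters, -1)   # (128, 81) filter space

pca = PCA(n_components=5).fit(filters)            # low-dimensional filter manifold
coords = pca.transform(filters)                   # each filter as 5 coordinates
print("explained variance:", pca.explained_variance_ratio_.sum().round(3))
print("filter-space dim:", filters.shape[1], "-> manifold dim:", coords.shape[1])
```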
APA, Harvard, Vancouver, ISO, and other styles
15

Varshney, Varun. "Supervised and unsupervised learning for plant and crop row detection in precision agriculture." Thesis, Kansas State University, 2017. http://hdl.handle.net/2097/35463.

Full text
Abstract:
Master of Science
Department of Computing and Information Sciences
William H. Hsu
The goal of this research is to present a comparison between different clustering and segmentation techniques, both supervised and unsupervised, to detect plant and crop rows. Aerial images, taken by an Unmanned Aerial Vehicle (UAV), of a corn field at various stages of growth were acquired in RGB format through the Agronomy Department at the Kansas State University. Several segmentation and clustering approaches were applied to these images, namely K-Means clustering, Excessive Green (ExG) Index algorithm, Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and a deep learning approach based on Fully Convolutional Networks (FCN), to detect the plants present in the images. A Hough Transform (HT) approach was used to detect the orientation of the crop rows and rotate the images so that the rows became parallel to the x-axis. The result of applying different segmentation methods to the images was then used in estimating the location of crop rows in the images by using a template creation method based on Green Pixel Accumulation (GPA) that calculates the intensity profile of green pixels present in the images. Connected component analysis was then applied to find the centroids of the detected plants. Each centroid was associated with a crop row, and centroids lying outside the row templates were discarded as being weeds. A comparison between the various segmentation algorithms based on the Dice similarity index and average run-times is presented at the end of the work.
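A compact sketch of the row-orientation part of this pipeline: an Excess-Green-style index separates plants from soil, and a coarse Hough-style sweep over candidate angles finds the direction along which the green pixels line up into rows. The synthetic field image, the global threshold and the variance-of-histogram scoring rule are simplifications of the methods named in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
h, w, true_angle = 200, 200, np.deg2rad(20)       # crop rows tilted 20 degrees
yy, xx = np.mgrid[0:h, 0:w]
row_coord = -xx * np.sin(true_angle) + yy * np.cos(true_angle)
plants = (np.mod(row_coord, 25) < 4) & (rng.random((h, w)) < 0.6)

rgb = np.zeros((h, w, 3))
rgb[..., 0], rgb[..., 2] = 0.35, 0.25             # soil-like red/blue everywhere
rgb[..., 1] = 0.3 + 0.5 * plants                  # plant pixels are distinctly green

exg = 2 * rgb[..., 1] - rgb[..., 0] - rgb[..., 2]  # Excess Green index 2G - R - B
mask = exg > exg.mean()                            # simple global threshold
ys, xs = np.nonzero(mask)

def row_alignment_score(angle):
    # project green pixels onto the axis perpendicular to the candidate rows;
    # well-aligned rows give a strongly peaked (high-variance) histogram
    proj = -xs * np.sin(angle) + ys * np.cos(angle)
    hist, _ = np.histogram(proj, bins=60)
    return hist.var()

angles = np.deg2rad(np.arange(-45, 46))
best = angles[np.argmax([row_alignment_score(a) for a in angles])]
print("estimated row angle (deg):", round(np.degrees(best), 1))
```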
APA, Harvard, Vancouver, ISO, and other styles
16

Sahasrabudhe, Mihir. "Unsupervised and weakly supervised deep learning methods for computer vision and medical imaging." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASC010.

Full text
Abstract:
The first two contributions of this thesis (Chapters 2 and 3) are models for unsupervised dense 2D image alignment and for learning 3D object surfaces, called Deforming Autoencoders (DAE) and Lifting Autoencoders (LAE). These models are capable of identifying a canonical space in which to represent different object properties: the object's appearance in the canonical space, the dense deformation that maps this appearance back to image space, and, for human faces, a 3D model of the face, its facial expression, and the camera viewpoint. We further illustrate applications of these models to other domains, namely the alignment of lung MRI images in medical image analysis and the alignment of satellite images in remote sensing. In Chapter 4, we concentrate on a problem in medical image analysis: the diagnosis of lymphocytosis. We propose a convolutional network that encodes images of blood smears obtained from a patient, followed by an aggregation operation that gathers information from all the images into a single feature vector used to determine the diagnosis. Our results show that the performance of the proposed models is on par with that of biologists and can therefore augment their diagnosis.
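The DAE decomposition described above (a canonical appearance plus a dense deformation that maps it to each image) can be sketched with a learned template warped by a predicted sampling grid. In the toy example below the appearance is a single shared template and the images are shifted squares; these are simplifying assumptions, whereas the thesis's model learns per-image appearance and is applied to faces, lung MRI and satellite imagery.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
n, size = 64, 32
base = torch.zeros(size, size); base[12:20, 12:20] = 1.0       # canonical square
shifts = torch.randint(-5, 6, (n, 2))
images = torch.stack([torch.roll(base, tuple(s.tolist()), dims=(0, 1)) for s in shifts])
images = images.unsqueeze(1)                                   # (n, 1, 32, 32)

template = nn.Parameter(torch.rand(1, 1, size, size))          # canonical appearance
encoder = nn.Sequential(nn.Flatten(), nn.Linear(size * size, 64), nn.ReLU(),
                        nn.Linear(64, 2 * size * size))        # dense (dx, dy) field
opt = torch.optim.Adam(list(encoder.parameters()) + [template], lr=1e-3)

ys, xs = torch.meshgrid(torch.linspace(-1, 1, size), torch.linspace(-1, 1, size),
                        indexing="ij")
identity = torch.stack([xs, ys], dim=-1)                       # (H, W, 2) sampling grid

for step in range(500):
    offsets = 0.5 * torch.tanh(encoder(images)).view(n, size, size, 2)
    grid = identity.unsqueeze(0) + offsets                     # per-image deformation
    recon = F.grid_sample(template.expand(n, -1, -1, -1), grid, align_corners=False)
    loss = F.mse_loss(recon, images)
    opt.zero_grad(); loss.backward(); opt.step()

print("reconstruction MSE:", loss.item())
```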
APA, Harvard, Vancouver, ISO, and other styles
17

Landi, Isotta. "Stratification of autism spectrum conditions by deep encodings." Doctoral thesis, Università degli studi di Trento, 2020. http://hdl.handle.net/11572/252684.

Full text
Abstract:
This work aims at developing a novel machine learning method to investigate heterogeneity in neurodevelopmental disorders, with a focus on autism spectrum conditions (ASCs). In ASCs, heterogeneity is shown at several levels of analysis, e.g., genetic, behavioral, throughout developmental trajectories, which hinders the development of effective treatments and the identification of biological pathways involved in gene-cognition-behavior links. ASC diagnosis comes from behavioral observations, which determine the cohort composition of studies in every scientific field (e.g., psychology, neuroscience, genetics). Thus, uncovering behavioral subtypes can provide stratified ASC cohorts that are more representative of the true population. Ideally, behavioral stratification can (1) help to revise and shorten the diagnostic process highlighting the characteristics that best identify heterogeneity; (2) help to develop personalized treatments based on their effectiveness for subgroups of subjects; (3) investigate how the longitudinal course of the condition might differ (e.g., divergent/convergent developmental trajectories); (4) contribute to the identification of genetic variants that may be overlooked in case-control studies; and (5) identify possible disrupted neuronal activity in the brain (e.g., excitatory/inhibitory mechanisms). The characterization of the temporal aspects of heterogeneous manifestations based on their multi-dimensional features is thus the key to identify the etiology of such disorders and establish personalized treatments. Features include trajectories described by a multi-modal combination of electronic health records (EHRs), cognitive functioning and adaptive behavior indicators. This thesis contributes in particular to a data-driven discovery of clinical and behavioral trajectories of individuals with complex disorders and ASCs. Machine learning techniques, such as deep learning and word embedding, that proved successful for e.g., natural language processing and image classification, are gaining ground in healthcare research for precision medicine. Here, we leverage these methods to investigate the feasibility of learning data-driven pathways that have been difficult to identify in the clinical practice to help disentangle the complexity of conditions whose etiology is still unknown. In Chapter 1, we present a new computational method, based on deep learning, to stratify patients with complex disorders; we demonstrate the method on multiple myeloma, Alzheimer’s disease, and Parkinson’s disease, among others. We use clinical records from a heterogeneous patient cohort (i.e., multiple disease dataset) of 1.6M temporally-ordered EHR sequences from the Mount Sinai health system’s data warehouse to learn unsupervised patient representations. These representations are then leveraged to identify subgroups within complex condition cohorts via hierarchical clustering. We investigate the enrichment of terms that code for comorbidities, medications, laboratory tests and procedures, to clinically validate our results. A data analysis protocol is developed in Chapter 2 that produces behavioral embeddings from observational measurements to represent subjects with ASCs in a latent space able to capture multiple levels of assessment (i.e., multiple tests) and the temporal pattern of behavioral-cognitive profiles. The computational framework includes clustering algorithms and state-of-the-art word and text representation methods originally developed for natural language processing. 
The aim is to detect subgroups within ASC cohorts towards the identification of possible subtypes based on behavioral, cognitive, and functioning aspects. The protocol is applied to ASC behavioral data of 204 children and adolescents referred to the Laboratory of Observation Diagnosis and Education (ODFLab) at the University of Trento. In Chapter 3 we develop a case study for ASCs. From the learned representations of Chapter 1, we select 1,439 individuals with ASCs and investigate whether such representations generalize well to any disorder. Specifically, we identify three subgroups within individuals with ASCs that are further clinically validated to detect clinical profiles based on different term enrichment that can inform comorbidities, therapeutic treatments, medication side effects, and screening policies. This work has been developed in partnership with ODFLab (University of Trento) and the Predictive Models for Biomedicine and Environment unit at FBK. The study reported in Chapter 1 has been conducted at the Institute for Next Generation Healthcare, Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai (NY).
APA, Harvard, Vancouver, ISO, and other styles
18

Larsson, Frans. "Algorithmic trading surveillance : Identifying deviating behavior with unsupervised anomaly detection." Thesis, Uppsala universitet, Matematiska institutionen, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-389941.

Full text
Abstract:
The financial markets are no longer what they used to be and one reason for this is the breakthrough of algorithmic trading. Although this has had several positive effects, there have been recorded incidents where algorithms have been involved. It is therefore of interest to find effective methods to monitor algorithmic trading. The purpose of this thesis was therefore to contribute to this research area by investigating if machine learning can be used for detecting deviating behavior. Since the real world data set used in this study lacked labels, an unsupervised anomaly detection approach was chosen. Two models, isolation forest and deep denoising autoencoder, were selected and evaluated. Because the data set lacked labels, artificial anomalies were injected into the data set to make evaluation of the models possible. These synthetic anomalies were generated by two different approaches, one based on a downsampling strategy and one based on manual construction and modification of real data. The evaluation of the anomaly detection models shows that both isolation forest and deep denoising autoencoder outperform a trivial baseline model, and have the ability to detect deviating behavior. Furthermore, it is shown that a deep denoising autoencoder outperforms isolation forest, with respect to both area under the receiver operating characteristics curve and area under the precision-recall curve. A deep denoising autoencoder is therefore recommended for the purpose of algorithmic trading surveillance.
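In the spirit of the evaluation described above, the sketch below scores data containing injected synthetic anomalies with both an isolation forest and a small denoising autoencoder, and compares them by ROC AUC. The Gaussian stand-in features and the injection scheme are assumptions; they replace the unlabelled trading data used in the thesis.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import IsolationForest
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
normal = rng.normal(size=(2000, 10)).astype(np.float32)
anomalies = rng.normal(loc=3.0, size=(60, 10)).astype(np.float32)  # injected behaviour
X = np.vstack([normal, anomalies])
y = np.r_[np.zeros(len(normal)), np.ones(len(anomalies))]

# Isolation forest: lower decision_function means more isolated, i.e. anomalous.
iso = IsolationForest(n_estimators=200, random_state=0).fit(normal)
iso_scores = -iso.decision_function(X)

# Denoising autoencoder: reconstruct normal data from noisy inputs, then use the
# reconstruction error as the anomaly score.
ae = nn.Sequential(nn.Linear(10, 6), nn.ReLU(), nn.Linear(6, 3),
                   nn.Linear(3, 6), nn.ReLU(), nn.Linear(6, 10))
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
clean = torch.tensor(normal)
for _ in range(400):
    noisy = clean + 0.2 * torch.randn_like(clean)
    loss = nn.functional.mse_loss(ae(noisy), clean)
    opt.zero_grad(); loss.backward(); opt.step()
with torch.no_grad():
    xt = torch.tensor(X)
    ae_scores = ((ae(xt) - xt) ** 2).mean(dim=1).numpy()

print("isolation forest AUC:", round(roc_auc_score(y, iso_scores), 3))
print("denoising AE AUC:   ", round(roc_auc_score(y, ae_scores), 3))
```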
APA, Harvard, Vancouver, ISO, and other styles
19

Merrill, Nicholas Swede. "Modified Kernel Principal Component Analysis and Autoencoder Approaches to Unsupervised Anomaly Detection." Thesis, Virginia Tech, 2020. http://hdl.handle.net/10919/98659.

Full text
Abstract:
Unsupervised anomaly detection is the task of identifying examples that differ from the normal or expected pattern without the use of labeled training data. Our research addresses shortcomings in two existing anomaly detection algorithms, Kernel Principal Component Analysis (KPCA) and Autoencoders (AE), and proposes novel solutions to improve both of their performances in the unsupervised setting. Anomaly detection has several useful applications, such as intrusion detection, fault monitoring, and vision processing. More specifically, anomaly detection can be used in autonomous driving to identify obscured signage or to monitor intersections. Kernel techniques are desirable because of their ability to model highly non-linear patterns, but they are limited in the unsupervised setting due to their sensitivity to parameter choices and the absence of a validation step. Additionally, conventional KPCA suffers from quadratic time and memory complexity in the construction of the Gram matrix and cubic time complexity in its eigendecomposition. The problem of tuning the Gaussian kernel parameter, σ, is solved using mini-batch stochastic gradient descent (SGD) optimization of a loss function that maximizes the dispersion of the kernel matrix entries. Secondly, the computational time is greatly reduced, while still maintaining high accuracy, by using an ensemble of small skeleton models and combining their scores. The performance of traditional machine learning approaches to anomaly detection plateaus as the volume and complexity of data increases. Deep anomaly detection (DAD) involves the application of multilayer artificial neural networks to identify anomalous examples. AEs are fundamental to most DAD approaches. Conventional AEs rely on the assumption that a trained network will learn to reconstruct normal examples better than anomalous ones. In practice, however, given sufficient capacity and training time, an AE will generalize to reconstruct even very rare examples. Three methods are introduced to more reliably train AEs for unsupervised anomaly detection: Cumulative Error Scoring (CES) leverages the entire history of training errors to minimize the importance of early stopping, and Percentile Loss (PL) training aims to prevent anomalous examples from contributing to parameter updates. Lastly, early stopping via knee detection aims to limit the risk of over-training. Ultimately, the two modified methods proposed in this research, Unsupervised Ensemble KPCA (UE-KPCA) and the modified training and scoring AE (MTS-AE), demonstrate improved detection performance and reliability compared to many baseline algorithms across a number of benchmark datasets.
Master of Science
Anomaly detection is the task of identifying examples that differ from the normal or expected pattern. The challenge of unsupervised anomaly detection is distinguishing normal and anomalous data without the use of labeled examples to demonstrate their differences. This thesis addresses shortcomings in two anomaly detection algorithms, Kernel Principal Component Analysis (KPCA) and Autoencoders (AE), and proposes new solutions to apply them in the unsupervised setting. Ultimately, the two modified methods, Unsupervised Ensemble KPCA (UE-KPCA) and the Modified Training and Scoring AE (MTS-AE), demonstrate improved detection performance and reliability compared to many baseline algorithms across a number of benchmark datasets.
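As a rough illustration of the kernel-width selection problem described above, the sketch below replaces the thesis's mini-batch SGD with a simple grid search over $\sigma$ that maximizes the dispersion (variance) of the RBF kernel entries, then runs standard KernelPCA with the selected width; the data and parameter ranges are placeholders, not the thesis's procedure.

```python
# Hedged sketch: the thesis tunes the Gaussian kernel width with mini-batch SGD on a
# dispersion loss; here a simple grid search over sigma that maximizes the variance of
# the kernel matrix entries is used instead, purely for illustration.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))

def dispersion(sigma, X):
    # Variance of the RBF kernel entries; tiny or huge sigmas push all entries
    # towards 0 or 1 and therefore give low dispersion.
    K = rbf_kernel(X, gamma=1.0 / (2.0 * sigma ** 2))
    return K.var()

sigmas = np.logspace(-2, 2, 50)
best_sigma = max(sigmas, key=lambda s: dispersion(s, X))
print("selected sigma:", best_sigma)

# Use the selected width for (unsupervised) KPCA; anomaly scores could then be
# built from reconstruction errors in the induced feature space.
kpca = KernelPCA(n_components=3, kernel="rbf", gamma=1.0 / (2.0 * best_sigma ** 2))
Z = kpca.fit_transform(X)
print(Z.shape)
```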
APA, Harvard, Vancouver, ISO, and other styles
21

ABUKMEIL, MOHANAD. "UNSUPERVISED GENERATIVE MODELS FOR DATA ANALYSIS AND EXPLAINABLE ARTIFICIAL INTELLIGENCE." Doctoral thesis, Università degli Studi di Milano, 2022. http://hdl.handle.net/2434/889159.

Full text
Abstract:
For more than a century, the methods of representation learning and the exploration of the intrinsic structures of data have developed remarkably and currently include supervised, semi-supervised, and unsupervised methods. However, recent years have witnessed the flourishing of big data, where typical dataset dimensions are high, and the data can come in messy, missing, incomplete, unlabeled, or corrupted forms. Consequently, discovering and learning the hidden structure buried inside such data becomes highly challenging. From this perspective, latent data analysis and dimensionality reduction play a substantial role in decomposing the exploratory factors and learning the hidden structures of data, which encompass the significant features that characterize the categories and trends among data samples in an ordered manner, that is, by extracting patterns, differentiating trends, and testing hypotheses to identify anomalies, learn compact knowledge, and perform many different machine learning (ML) tasks such as classification, detection, and prediction. Unsupervised generative learning (UGL) methods are a class of ML characterized by their ability to analyze and decompose latent data, reduce dimensionality, visualize the manifold of data, and learn representations with limited levels of predefined labels and prior assumptions. Furthermore, explainable artificial intelligence (XAI) is an emerging field of ML that deals with explaining the decisions and behaviors of learned models. XAI is also associated with UGL models to explain the hidden structure of data, and to explain the learned representations of ML models. However, current UGL models lack large-scale generalizability and explainability in the testing stage, which restricts their potential in ML and XAI applications. To overcome the aforementioned limitations, this thesis proposes innovative methods that integrate UGL and XAI to enable data factorization and dimensionality reduction and to improve the generalizability of the learned ML models. Moreover, the proposed methods enable visual explainability in modern applications such as anomaly detection and autonomous driving systems. The main research contributions are listed as follows:
• A novel overview of UGL models including blind source separation (BSS), manifold learning (MfL), and neural networks (NNs), which also considers open issues and challenges for each UGL method.
• An innovative method to identify the dimensions of the compact feature space via a generalized rank in the application of image dimensionality reduction.
• An innovative method to hierarchically reduce and visualize the manifold of data to improve generalizability in limited-data learning scenarios and in computational complexity reduction applications.
• An original method to visually explain autoencoders by reconstructing an attention map in the application of anomaly detection and explainable autonomous driving systems.
The novel methods introduced in this thesis are benchmarked on publicly available datasets, and they outperformed state-of-the-art methods on different evaluation metrics. Furthermore, superior results were obtained with respect to the state of the art, confirming the feasibility of the proposed methodologies with regard to computational complexity, availability of learning data, model explainability, and data reconstruction accuracy.
APA, Harvard, Vancouver, ISO, and other styles
22

Olsson, Sebastian. "Automated sleep scoring using unsupervised learning of meta-features." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-189234.

Full text
Abstract:
Sleep is an important part of life, as it affects one's performance during all waking hours. The study of sleep and wakefulness is therefore of great interest, particularly to the clinical and medical fields where sleep disorders are diagnosed. When studying sleep, it is common to talk about different types, or stages, of sleep. A common task in sleep research is to determine the sleep stage of the sleeping subject as a function of time. This process is known as sleep stage scoring. In this study, I seek to determine whether there is any benefit to using unsupervised feature learning in the context of electroencephalogram-based (EEG) sleep scoring. More specifically, the effect of generating and making use of new feature representations for hand-crafted features of sleep data – meta-features – is studied. For this purpose, two scoring algorithms have been implemented and compared. Both scoring algorithms involve segmentation of the EEG signal, feature extraction, feature selection and classification using a support vector machine (SVM). Unsupervised feature learning was implemented in the form of a dimensionality-reducing deep-belief network (DBN) through which the feature space was processed. Both scorers were shown to have a classification accuracy of about 76%. The application of unsupervised feature learning did not affect the accuracy significantly. It is speculated that, with a better choice of parameters for the DBN, future work may improve the accuracy significantly.
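The pipeline described above (hand-crafted features, an unsupervised dimensionality-reducing network, then an SVM) can be sketched as follows; a single scikit-learn BernoulliRBM layer stands in for the DBN (a DBN stacks several such RBMs), and the digits dataset stands in for the extracted EEG features, so all of this is illustrative rather than the thesis's implementation.

```python
# Hedged sketch: a single BernoulliRBM layer from scikit-learn stands in for the
# dimensionality-reducing deep-belief network used in the thesis (a DBN is a stack
# of such RBMs); the transformed features are then classified with an SVM.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)          # stand-in for hand-crafted sleep features
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scorer = Pipeline([
    ("scale", MinMaxScaler()),               # RBM expects inputs in [0, 1]
    ("rbm", BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20, random_state=0)),
    ("svm", SVC(kernel="rbf", C=10.0)),
])
scorer.fit(X_tr, y_tr)
print("accuracy: %.3f" % scorer.score(X_te, y_te))
```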
APA, Harvard, Vancouver, ISO, and other styles
23

Budaraju, Sri Datta. "Unsupervised 3D Human Pose Estimation." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291435.

Full text
Abstract:
The thesis proposes an unsupervised representation learning method to predict 3D human pose from a 2D skeleton via a VAE-GAN (Variational Autoencoder Generative Adversarial Network) hybrid network. The method learns to lift poses from 2D to 3D using self-supervision and adversarial learning techniques. The method does not use images, heatmaps, 3D pose annotations, paired/unpaired 2D-to-3D skeletons, 3D priors, synthetic 2D skeletons, multi-view or temporal information in any shape or form. The 2D skeleton input is taken by a VAE that encodes it in a latent space and then decodes that latent representation to a 3D pose. The 3D pose is then reprojected to 2D for a constrained, self-supervised optimization using the input 2D pose. In parallel, the 3D pose is also randomly rotated and reprojected to 2D to generate a 'novel' 2D view for unconstrained adversarial optimization using a discriminator network. The combination of the optimizations of the original and the novel 2D views of the predicted 3D pose results in 'realistic' 3D pose generation. The thesis shows that the encoding and decoding process of the VAE addresses the major challenge of erroneous and incomplete skeletons produced by 2D detection networks as inputs, and that the variance of the VAE can be altered to obtain various plausible 3D poses for a given 2D input. Additionally, the latent representation could be used for cross-modal training and many downstream applications. The results on the Human3.6M dataset outperform previous unsupervised approaches with less model complexity while addressing more hurdles in scaling the task to the real world.
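The random-rotation-and-reprojection step is the core adversarial signal; the sketch below shows a heavily simplified version of that idea (an MLP lifter, orthographic projection, and a plain GAN loss) rather than the thesis's full VAE-GAN pipeline. Joint count, network sizes and the stand-in data are all assumptions.

```python
# Highly simplified sketch of the random-rotation / reprojection idea, not the thesis's
# full VAE-GAN: an MLP lifts 2D joints to depth values, the 3D pose is rotated about the
# vertical axis, orthographically reprojected to 2D, and a discriminator judges whether
# the new 2D view looks like a plausible pose. All shapes and data are placeholders.
import torch
import torch.nn as nn

J = 17                                           # number of joints (assumption)
lifter = nn.Sequential(nn.Linear(2 * J, 256), nn.ReLU(), nn.Linear(256, J))  # z per joint
disc = nn.Sequential(nn.Linear(2 * J, 256), nn.ReLU(), nn.Linear(256, 1))
opt_g = torch.optim.Adam(lifter.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def project(pose3d):                             # orthographic projection: drop depth
    return pose3d[..., :2].reshape(pose3d.shape[0], -1)

for step in range(100):
    pose2d = torch.randn(32, J, 2)               # stand-in for detected 2D skeletons
    z = lifter(pose2d.reshape(32, -1)).unsqueeze(-1)
    pose3d = torch.cat([pose2d, z], dim=-1)

    theta = torch.rand(32) * 2 * torch.pi        # random azimuth rotation
    c, s = torch.cos(theta), torch.sin(theta)
    x, y, d = pose3d[..., 0], pose3d[..., 1], pose3d[..., 2]
    rotated = torch.stack([c[:, None] * x + s[:, None] * d, y,
                           -s[:, None] * x + c[:, None] * d], dim=-1)
    novel2d = project(rotated)

    # Discriminator step: real 2D poses vs. reprojected novel views.
    d_loss = bce(disc(pose2d.reshape(32, -1)), torch.ones(32, 1)) + \
             bce(disc(novel2d.detach()), torch.zeros(32, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Lifter step: make the novel views look plausible to the discriminator.
    g_loss = bce(disc(novel2d), torch.ones(32, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```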
APA, Harvard, Vancouver, ISO, and other styles
24

Lind, Johan. "Evaluating CNN-based models for unsupervised image denoising." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176092.

Full text
Abstract:
Images are often corrupted by noise, which reduces their visual quality and interferes with analysis. Convolutional Neural Networks (CNNs) have become a popular method for denoising images, but their training typically relies on access to thousands of pairs of noisy and clean versions of the same underlying picture. Unsupervised methods lack this requirement and can instead be trained purely on noisy images. This thesis evaluated two different unsupervised denoising algorithms: Noise2Self (N2S) and Parametric Probabilistic Noise2Void (PPN2V), both of which train an internal CNN to denoise images. Four different CNNs were tested in order to investigate how the performance of these algorithms is affected by different network architectures. The testing used two different datasets: one containing clean images corrupted by synthetic noise, and one containing images damaged by real noise originating from the camera used to capture them. Two of the networks, UNet and a CBAM-augmented UNet, achieved high performance competitive with the strong classical denoisers BM3D and NLM. The other two networks, GRDN and MultiResUNet, generally performed poorly.
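Both evaluated algorithms rely on self-supervised training in which the network never sees a clean target; the sketch below illustrates the masking idea behind this family of methods (hide a few pixels, predict them from their surroundings, and compute the loss only on the hidden pixels). The tiny CNN and the random data are placeholders, not the architectures or datasets compared in the thesis.

```python
# Hedged sketch of the masking idea behind self-supervised denoisers such as Noise2Self:
# a random subset of pixels is hidden from the network and the loss is evaluated only on
# those pixels, so the net cannot simply copy its noisy input.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(200):
    noisy = torch.rand(8, 1, 64, 64) + 0.1 * torch.randn(8, 1, 64, 64)  # stand-in noisy images
    mask = (torch.rand_like(noisy) < 0.03).float()        # ~3% of pixels are masked out
    # Replace masked pixels with random values so the net must infer them from context.
    masked_input = noisy * (1 - mask) + torch.rand_like(noisy) * mask
    pred = net(masked_input)
    loss = ((pred - noisy) ** 2 * mask).sum() / mask.sum().clamp(min=1)
    opt.zero_grad(); loss.backward(); opt.step()

denoised = net(noisy)   # at inference the full noisy image is fed through the network
```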
APA, Harvard, Vancouver, ISO, and other styles
25

Farouni, Tarek. "An Overview of Probabilistic Latent Variable Models with anApplication to the Deep Unsupervised Learning of ChromatinStates." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1492189894812539.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Bujwid, Sebastian. "GANtruth – a regularization method for unsupervised image-to-image translation." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233849.

Full text
Abstract:
In this work, we propose a novel and effective method for constraining the output space of the ill-posed problem of unsupervised image-to-image translation. We make the assumption that the environment of the source domain is known, and we propose to explicitly enforce preservation of the ground-truth labels on the images translated from the source to the target domain. We run empirical experiments on preserving information such as semantic segmentation and disparity, and show evidence that our method achieves improved performance over the baseline model UNIT on translating images from SYNTHIA to Cityscapes. The generated images are perceived as more realistic in human surveys and yield reduced errors when used as adapted images in the domain adaptation scenario. Moreover, the underlying ground-truth preservation assumption is complementary to alternative approaches, and by combining it with the UNIT framework we improve the results even further.
APA, Harvard, Vancouver, ISO, and other styles
27

Anand, Gaurangi. "Unsupervised visual perception-based representation learning for time-series and trajectories." Thesis, Queensland University of Technology, 2021. https://eprints.qut.edu.au/212901/1/Gaurangi_Anand_Thesis.pdf.

Full text
Abstract:
Representing time-series without relying on domain knowledge, and independently of the end task, is a challenging problem. The same situation applies to trajectory data, where sufficient labelled information is often unavailable to learn effective representations. This thesis addresses this problem and explores unsupervised ways of representing temporal data. The novel methods use deep learning to imitate human visual perception of pictorial depictions of such data.
APA, Harvard, Vancouver, ISO, and other styles
28

Ackerman, Wesley. "Semantic-Driven Unsupervised Image-to-Image Translation for Distinct Image Domains." BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8684.

Full text
Abstract:
We expand the scope of image-to-image translation to include more distinct image domains, where the image sets have analogous structures, but may not share object types between them. Semantic-Driven Unsupervised Image-to-Image Translation for Distinct Image Domains (SUNIT) is built to more successfully translate images in this setting, where content from one domain is not found in the other. Our method trains an image translation model by learning encodings for semantic segmentations of images. These segmentations are translated between image domains to learn meaningful mappings between the structures in the two domains. The translated segmentations are then used as the basis for image generation. Beginning image generation with encoded segmentation information helps maintain the original structure of the image. We qualitatively and quantitatively show that SUNIT improves image translation outcomes, especially for image translation tasks where the image domains are very distinct.
APA, Harvard, Vancouver, ISO, and other styles
29

Mehr, Éloi. "Unsupervised Learning of 3D Shape Spaces for 3D Modeling." Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS566.

Full text
Abstract:
Even though 3D data is becoming increasingly popular, especially with the democratization of virtual and augmented experiences, it remains very difficult to manipulate a 3D shape, even for designers or experts. Given a database containing 3D instances of one or several categories of objects, we want to learn the manifold of plausible shapes in order to develop new intelligent 3D modeling and editing tools. However, this manifold is often much more complex compared to the 2D domain. Indeed, 3D surfaces can be represented using various embeddings, and may also exhibit different alignments and topologies. In this thesis we study the manifold of plausible shapes in the light of the aforementioned challenges, by investigating three different points of view in depth. First of all, we consider the manifold as a quotient space, in order to learn the shapes' intrinsic geometry from a dataset where the 3D models are not co-aligned. Then, we assume that the manifold is disconnected, which leads to a new deep learning model that is able to automatically cluster and learn the shapes according to their typology. Finally, we study the conversion of an unstructured 3D input to an exact geometry, represented as a structured tree of continuous solid primitives.
APA, Harvard, Vancouver, ISO, and other styles
30

McClintick, Kyle W. "Training Data Generation Framework For Machine-Learning Based Classifiers." Digital WPI, 2018. https://digitalcommons.wpi.edu/etd-theses/1276.

Full text
Abstract:
In this thesis, we propose a new framework for the generation of training data for machine learning techniques used for classification in communications applications. Machine learning-based signal classifiers do not generalize well when training data does not describe the underlying probability distribution of real signals. The simplest way to accomplish statistical similarity between training and testing data is to synthesize training data passed through a permutation of plausible forms of noise. To accomplish this, a framework is proposed that implements arbitrary channel conditions and baseband signals. A dataset generated using the framework is considered, and is shown to be appropriately sized by having 11% lower entropy than state-of-the-art datasets. Furthermore, unsupervised domain adaptation can allow for powerful generalized training via deep feature transforms on unlabeled evaluation-time signals. A novel Deep Reconstruction-Classification Network (DRCN) application is introduced, which attempts to maintain near-peak signal classification accuracy despite dataset bias, or perturbations on testing data unforeseen in training. Together, feature transforms and diverse training data generated from the proposed framework, covering a range of plausible noise, can train a deep neural net to classify signals well in many real-world scenarios despite unforeseen perturbations.
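The data-generation idea (pass clean baseband frames through randomized channel impairments) might look roughly like the sketch below; the QPSK modulation, the SNR and frequency-offset ranges, and the I/Q array layout are illustrative assumptions rather than the framework's actual configuration.

```python
# Hedged sketch of synthesizing labeled training signals under randomized channel
# conditions (AWGN level, carrier frequency offset, phase offset). The modulation and
# parameter ranges are assumptions for illustration, not the framework's settings.
import numpy as np

rng = np.random.default_rng(0)

def qpsk_frame(n_symbols=128):
    bits = rng.integers(0, 2, size=(n_symbols, 2))
    return ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)

def impair(x, snr_db, cfo, phase):
    n = np.arange(len(x))
    x = x * np.exp(1j * (2 * np.pi * cfo * n + phase))        # frequency / phase offset
    noise_power = 10 ** (-snr_db / 10)
    noise = np.sqrt(noise_power / 2) * (rng.normal(size=len(x)) + 1j * rng.normal(size=len(x)))
    return x + noise

dataset = []
for _ in range(1000):
    frame = impair(qpsk_frame(),
                   snr_db=rng.uniform(0, 20),
                   cfo=rng.uniform(-1e-3, 1e-3),
                   phase=rng.uniform(0, 2 * np.pi))
    dataset.append(np.stack([frame.real, frame.imag]))        # I/Q channels for a CNN
dataset = np.array(dataset)                                   # shape (1000, 2, 128)
print(dataset.shape)
```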
APA, Harvard, Vancouver, ISO, and other styles
31

Marchesin, Stefano. "Developing unsupervised knowledge-enhanced models to reduce the semantic Gap in information retrieval." Doctoral thesis, Università degli studi di Padova, 2020. http://hdl.handle.net/11577/3426253.

Full text
Abstract:
In this thesis we tackle the semantic gap, a long-standing problem in Information Retrieval(IR). The semantic gap can be described as the mismatch between users’ queries and the way retrieval models answer to such queries. Two main lines of work have emerged over the years to bridge the semantic gap: (i) the use of external knowledge resources to enhance the bag-of-words representations used by lexical models, and (ii) the use of semantic models to perform matching between the latent representations of queries and documents. To deal with this issue, we first perform an in-depth evaluation of lexical and semantic models through different analyses. The objective of this evaluation is to understand what features lexical and semantic models share, if their signals are complementary, and how they can be combined to effectively address the semantic gap. In particular, the evaluation focuses on (semantic) neural models and their critical aspects. Then, we build on the insights of this evaluation to develop lexical and semantic models addressing the semantic gap. Specifically, we develop unsupervised models that integrate knowledge from external resources, and we evaluate them for the medical domain – a domain with a high social value, where the semantic gap is prominent, and the large presence of authoritative knowledge resources allows us to explore effective ways to leverage external knowledge to address the semantic gap. For lexical models, we propose and evaluate several knowledge-based query expansion and reduction techniques. These query reformulations are used to increase the probability of retrieving relevant documents by adding to or removing from the original query highly specific terms. Regarding semantic models, we first analyze the limitations of the knowledge-enhanced neural models presented in the literature. Then, to overcome these limitations, we propose SAFIR, an unsupervised knowledge-enhanced neural framework for IR. The representations learned within this framework are optimized for IR and encode linguistic features that are relevant to address the semantic gap.
APA, Harvard, Vancouver, ISO, and other styles
32

Li, Yingzhen. "Approximate inference : new visions." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/277549.

Full text
Abstract:
Nowadays machine learning (especially deep learning) techniques are being incorporated into many intelligent systems affecting the quality of human life. The ultimate purpose of these systems is to perform automated decision making, and in order to achieve this, predictive systems need to return estimates of their confidence. Powered by the rules of probability, Bayesian inference is the gold standard method to perform coherent reasoning under uncertainty. It is generally believed that intelligent systems following the Bayesian approach can better incorporate uncertainty information for reliable decision making, and be less vulnerable to attacks such as data poisoning. Critically, the success of Bayesian methods in practice, including the recent resurgence of Bayesian deep learning, relies on fast and accurate approximate Bayesian inference applied to probabilistic models. These approximate inference methods perform (approximate) Bayesian reasoning at a relatively low cost in terms of time and memory, thus allowing the principles of Bayesian modelling to be applied to many practical settings. However, more work needs to be done to scale approximate Bayesian inference methods to big systems such as deep neural networks and large-scale datasets such as ImageNet. In this thesis we develop new algorithms towards addressing the open challenges in approximate inference. In the first part of the thesis we develop two new approximate inference algorithms, by drawing inspiration from the well-known expectation propagation and message passing algorithms. Both approaches provide a unifying view of existing variational methods from different algorithmic perspectives. We also demonstrate that they lead to better calibrated inference results for complex models such as neural network classifiers and deep generative models, and scale to large datasets containing hundreds of thousands of data points. In the second theme of the thesis we propose a new research direction for approximate inference: developing algorithms for fitting posterior approximations of arbitrary form, by rethinking the fundamental principles of Bayesian computation and the necessity of algorithmic constraints in traditional inference schemes. We specify four algorithmic options for the development of such new-generation approximate inference methods, with one of them further investigated and applied to Bayesian deep learning tasks.
APA, Harvard, Vancouver, ISO, and other styles
33

Andraghetti, Lorenzo. "Monocular Depth Estimation enhancement by depth from SLAM Keypoints." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/16626/.

Full text
Abstract:
Training a neural network in a supervised way is extremely challenging, since ground truth is expensive, time consuming to obtain and limited. A better choice is therefore to train it in an unsupervised way, exploiting easier-to-obtain binocular stereo images and epipolar geometry constraints. Sometimes, however, this is not enough to predict reasonably correct depth maps, because of ambiguities in colour images due, for instance, to shadows, reflective surfaces and so on. A Simultaneous Localization and Mapping (SLAM) algorithm keeps track of hundreds of 3D landmarks in each frame of a sequence. Therefore, under the base assumption that it recovers the right scale, it can help depth prediction by providing a value for each of those 3D points. This work proposes a novel approach that enhances depth prediction by exploiting the potential of SLAM depth points to their limits.
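One straightforward way to inject the sparse SLAM information into training is an extra loss term on the pixels where a landmark depth is available; the sketch below shows such a combined loss under the assumption of a binary validity mask and an arbitrary weighting, which may differ from how the thesis actually integrates the points.

```python
# Hedged sketch of adding a sparse SLAM-point term to a self-supervised depth loss:
# wherever a SLAM landmark provides a depth value, an L1 penalty pulls the predicted
# depth towards it; elsewhere only the usual photometric term applies. Tensors and the
# weighting are placeholders, not the thesis's exact formulation.
import torch

def depth_loss(photometric, pred_depth, slam_depth, slam_mask, weight=0.1):
    """photometric: scalar self-supervised term already computed elsewhere.
    pred_depth, slam_depth: (B, 1, H, W); slam_mask is 1 where a SLAM point exists."""
    sparse_term = (slam_mask * (pred_depth - slam_depth).abs()).sum() / slam_mask.sum().clamp(min=1)
    return photometric + weight * sparse_term

# Toy usage with random tensors standing in for network outputs and projected SLAM points.
pred = torch.rand(4, 1, 64, 64) * 10
slam = torch.rand(4, 1, 64, 64) * 10
mask = (torch.rand(4, 1, 64, 64) < 0.01).float()   # SLAM provides only a few hundred points
print(depth_loss(torch.tensor(0.5), pred, slam, mask))
```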
APA, Harvard, Vancouver, ISO, and other styles
34

Kilinc, Ismail Ozsel. "Graph-based Latent Embedding, Annotation and Representation Learning in Neural Networks for Semi-supervised and Unsupervised Settings." Scholar Commons, 2017. https://scholarcommons.usf.edu/etd/7415.

Full text
Abstract:
Machine learning has been immensely successful in supervised learning, with outstanding examples in major industrial applications such as voice and image recognition. Following these developments, the most recent research has now begun to focus primarily on algorithms which can exploit very large sets of unlabeled examples to reduce the amount of manually labeled data required for existing models to perform well. In this dissertation, we propose graph-based latent embedding/annotation/representation learning techniques in neural networks tailored for semi-supervised and unsupervised learning problems. Specifically, we propose a novel regularization technique called Graph-based Activity Regularization (GAR) and a novel output layer modification called Auto-clustering Output Layer (ACOL), which can be used separately or collaboratively to develop scalable and efficient learning frameworks for semi-supervised and unsupervised settings. First, using only the GAR technique, we develop a framework providing an effective and scalable graph-based solution for semi-supervised settings in which there exists a large number of observations but only a small subset with ground-truth labels. The proposed approach is natural for the classification framework on neural networks, as it requires no additional task calculating the reconstruction error (as in autoencoder-based methods) or implementing a zero-sum game mechanism (as in adversarial training-based methods). We demonstrate that GAR effectively and accurately propagates the available labels to unlabeled examples. Our results show comparable performance with state-of-the-art generative approaches for this setting using an easier-to-train framework. Second, we explore a different type of semi-supervised setting where a coarse level of labeling is available for all the observations, but the model has to learn a fine, deeper level of latent annotations for each one. Problems in this setting are likely to be encountered in many domains such as text categorization, protein function prediction and image classification, as well as in exploratory scientific studies such as medical and genomics research. We consider this setting as simultaneously performed supervised classification (per the available coarse labels) and unsupervised clustering (within each one of the coarse labels) and propose a novel framework combining GAR with ACOL, which enables the network to perform concurrent classification and clustering. We demonstrate how the coarse label supervision impacts performance and how the classification task actually helps propagate useful clustering information between sub-classes. Comparative tests on the most popular image datasets rigorously demonstrate the effectiveness and competitiveness of the proposed approach. The third and final setup builds on the prior framework to unlock fully unsupervised learning, where we propose to substitute real, yet unavailable, parent-class information with pseudo class labels. In this novel unsupervised clustering approach, the network can exploit hidden information indirectly introduced through a pseudo classification objective. We train an ACOL network through this pseudo supervision together with an unsupervised objective based on GAR and ultimately obtain a k-means-friendly latent representation. Furthermore, we demonstrate how the chosen transformation type impacts performance and helps propagate the latent information that is useful in revealing unknown clusters.
Our results show state-of-the-art performance for unsupervised clustering tasks on MNIST, SVHN and USPS datasets with the highest accuracies reported to date in the literature.
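The final, fully unsupervised setup hinges on replacing unavailable parent-class labels with pseudo labels derived from input transformations. The sketch below is a toy stand-in for that idea (not a GAR/ACOL implementation): image rotations supply the pseudo classes, a small network is trained on them, and k-means is run on its latent features. The transformation choice, network and data are assumptions for illustration.

```python
# Hedged sketch of the pseudo-supervision idea: invent pseudo classes from a simple
# input transformation (here rotations, an arbitrary choice), train a small network on
# that pseudo task, then run k-means on its latent features.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

images = torch.rand(256, 1, 28, 28)                      # unlabeled stand-in data

# Build the pseudo-labeled set: each image appears in 4 rotations, label = rotation id.
x = torch.cat([torch.rot90(images, k, dims=[2, 3]) for k in range(4)])
y = torch.arange(4).repeat_interleave(len(images))

backbone = nn.Sequential(
    nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Flatten(), nn.Linear(32 * 7 * 7, 64), nn.ReLU(),
)
head = nn.Linear(64, 4)
opt = torch.optim.Adam(list(backbone.parameters()) + list(head.parameters()), lr=1e-3)
ce = nn.CrossEntropyLoss()

for epoch in range(5):
    perm = torch.randperm(len(x))
    for i in range(0, len(x), 64):
        idx = perm[i:i + 64]
        loss = ce(head(backbone(x[idx])), y[idx])
        opt.zero_grad(); loss.backward(); opt.step()

# Cluster the latent representation of the original (un-rotated) images.
with torch.no_grad():
    latent = backbone(images).numpy()
clusters = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(latent)
print(clusters[:20])
```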
APA, Harvard, Vancouver, ISO, and other styles
35

Guiraud, Enrico [Verfasser], Jörg [Akademischer Betreuer] Lücke, and Ralf [Akademischer Betreuer] Häfner. "Scalable unsupervised learning for deep discrete generative models: novel variational algorithms and their software realizations / Enrico Guiraud ; Jörg Lücke, Ralf Häfner." Oldenburg : BIS der Universität Oldenburg, 2020. http://d-nb.info/1226287077/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Baur, Christoph [Verfasser], Nassir [Akademischer Betreuer] Navab, Nassir [Gutachter] Navab, and Ben [Gutachter] Glocker. "Anomaly Detection in Brain MRI: From Supervised to Unsupervised Deep Learning / Christoph Baur ; Gutachter: Nassir Navab, Ben Glocker ; Betreuer: Nassir Navab." München : Universitätsbibliothek der TU München, 2021. http://d-nb.info/1236343115/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Cherti, Mehdi. "Deep generative neural networks for novelty generation : a foundational framework, metrics and experiments." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS029/document.

Full text
Abstract:
In recent years, significant advances made in deep neural networks have enabled the creation of groundbreaking technologies such as self-driving cars and voice-enabled personal assistants. Almost all successes of deep neural networks are about prediction, whereas the initial breakthroughs came from generative models. Today, although we have very powerful deep generative modeling techniques, these techniques are essentially being used for prediction or for generating known objects (i.e., good quality images of known classes): any generated object that is a priori unknown is considered as a failure mode (Salimans et al., 2016) or as spurious (Bengio et al., 2013b). In other words, when prediction seems to be the only possible objective, novelty is seen as an error that researchers have been trying hard to eliminate. This thesis defends the point of view that, instead of trying to eliminate these novelties, we should study them and the generative potential of deep nets to create useful novelty, especially given the economic and societal importance of creating new objects in contemporary societies. The thesis sets out to study novelty generation in relationship with data-driven knowledge models produced by deep generative neural networks. Our first key contribution is the clarification of the importance of representations and their impact on the kind of novelties that can be generated: a key consequence is that a creative agent might need to re-represent known objects to access various kinds of novelty. We then demonstrate that traditional objective functions of statistical learning theory, such as maximum likelihood, are not necessarily the best theoretical framework for studying novelty generation. We propose several other alternatives at the conceptual level. A second key result is the confirmation that current models, with traditional objective functions, can indeed generate unknown objects. This also shows that even though objectives like maximum likelihood are designed to eliminate novelty, practical implementations do generate novelty. Through a series of experiments, we study the behavior of these models and the novelty they generate. In particular, we propose a new task setup and metrics for selecting good generative models. Finally, the thesis concludes with a series of experiments clarifying the characteristics of models that can exhibit novelty. Experiments show that sparsity, the noise level, and restricting the capacity of the net eliminate novelty, and that models that are better at recognizing novelty are also good at generating novelty.
APA, Harvard, Vancouver, ISO, and other styles
38

Juan, Albarracín Javier. "Unsupervised learning for vascular heterogeneity assessment of glioblastoma based on magnetic resonance imaging: The Hemodynamic Tissue Signature." Doctoral thesis, Universitat Politècnica de València, 2020. http://hdl.handle.net/10251/149560.

Full text
Abstract:
[EN] The future of medical imaging is linked to Artificial Intelligence (AI). The manual analysis of medical images is nowadays an arduous, error-prone and often unaffordable task for humans, which has caught the attention of the Machine Learning (ML) community. Magnetic Resonance Imaging (MRI) provides us with a wide variety of rich representations of the morphology and behavior of lesions completely inaccessible without a risky invasive intervention. Nevertheless, harnessing the powerful but often latent information contained in MRI acquisitions is a very complicated task, which requires computational intelligent analysis techniques. Central nervous system tumors are one of the most critical diseases studied through MRI. Specifically, glioblastoma represents a major challenge, as it remains a lethal cancer that, to date, lacks a satisfactory therapy. Of the entire set of characteristics that make glioblastoma so aggressive, a particular aspect that has been widely studied is its vascular heterogeneity. The strong vascular proliferation of glioblastomas, as well as their robust angiogenesis and extensive microvasculature heterogeneity have been claimed responsible for the high lethality of the neoplasm. This thesis focuses on the research and development of the Hemodynamic Tissue Signature (HTS) method: an unsupervised ML approach to describe the vascular heterogeneity of glioblastomas by means of perfusion MRI analysis. The HTS builds on the concept of habitats. A habitat is defined as a sub-region of the lesion with a particular MRI profile describing a specific physiological behavior. The HTS method delineates four habitats within the glioblastoma: the HAT habitat, as the most perfused region of the enhancing tumor; the LAT habitat, as the region of the enhancing tumor with a lower angiogenic profile; the potentially IPE habitat, as the non-enhancing region adjacent to the tumor with elevated perfusion indexes; and the VPE habitat, as the remaining edema of the lesion with the lowest perfusion profile. The research and development of the HTS method has generated a number of contributions to this thesis. First, in order to verify that unsupervised learning methods are reliable to extract MRI patterns to describe the heterogeneity of a lesion, a comparison among several unsupervised learning methods was conducted for the task of high grade glioma segmentation. Second, a Bayesian unsupervised learning algorithm from the family of Spatially Varying Finite Mixture Models is proposed. The algorithm integrates a Markov Random Field prior density weighted by the probabilistic Non-Local Means function, to codify the idea that neighboring pixels tend to belong to the same semantic object. Third, the HTS method to describe the vascular heterogeneity of glioblastomas is presented. The HTS method has been applied to real cases, both in a local single-center cohort of patients, and in an international retrospective cohort of more than 180 patients from 7 European centers. A comprehensive evaluation of the method was conducted to measure the prognostic potential of the HTS habitats. Finally, the technology developed in this thesis has been integrated into an online open-access platform for its academic use. The ONCOhabitats platform is hosted at https://www.oncohabitats.upv.es, and provides two main services: 1) glioblastoma tissue segmentation, and 2) vascular heterogeneity assessment of glioblastomas by means of the HTS method. 
The results of this thesis have been published in ten scientific contributions, including top-ranked journals and conferences in the areas of Medical Informatics, Statistics and Probability, Radiology & Nuclear Medicine, and Machine Learning. An industrial patent registered in Spain, Europe and the USA was also issued. Finally, the original ideas conceived in this thesis led to the foundation of ONCOANALYTICS CDX, a company framed within the business model of companion diagnostics for pharmaceutical compounds.
In this regard, I want to thank the different institutions and research funding structures that have contributed to the development of this thesis. In particular, I want to thank the Universitat Politècnica de València, where I have developed my entire academic and scientific career, as well as the Ministerio de Ciencia e Innovación, the Ministerio de Economía y Competitividad, the European Commission, the EIT Health Programme and the Caixa Impulse foundation.
Juan Albarracín, J. (2020). Unsupervised learning for vascular heterogeneity assessment of glioblastoma based on magnetic resonance imaging: The Hemodynamic Tissue Signature [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/149560
APA, Harvard, Vancouver, ISO, and other styles
39

Donati, Lorenzo. "Domain Adaptation through Deep Neural Networks for Health Informatics." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/14888/.

Full text
Abstract:
The PreventIT project is an EU Horizon 2020 project aimed at preventing early functional decline at younger old age. The analysis of causal links between risk factors and functional decline has been made possible by the cooperation of several research institutes' studies. However, since each research institute collects and delivers different kinds of data in different formats, so far the analysis has been assisted by expert geriatricians, whose role is to detect the best candidates among hundreds of fields and offer a semantic interpretation of the values. This manual data harmonization approach is very common in both scientific and industrial environments. In this thesis project, an alternative method for parsing heterogeneous data is proposed. Since all the datasets represent semantically related data, all being derived from longitudinal studies of aging-related metrics, it is possible to train an artificial neural network to perform an automatic domain adaptation. To achieve this goal, a Stacked Denoising Autoencoder has been implemented and trained to extract a domain-invariant representation of the data. Then, from this high-level representation, multiple classifiers have been trained to validate the model and ultimately to predict the probability of functional decline of the patient. This innovative approach to the domain adaptation process can provide an easy and fast solution to many research fields that now rely on human interaction to analyze the semantic data model and perform cross-dataset analysis. Functional decline classifiers show a great improvement in their performance when trained on the domain-invariant features extracted by the Stacked Denoising Autoencoder. Furthermore, this project applies multiple deep neural network classifiers on top of the Stacked Denoising Autoencoder representation, achieving excellent results for the prediction of functional decline in a real case study that involves two different datasets.
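The core mechanism (corrupt the inputs, train an autoencoder to reconstruct the clean records, then reuse the encoder's representation for classification) can be sketched as below; this is a single-layer toy version with random placeholder data, not the project's stacked architecture or the PreventIT fields.

```python
# Hedged sketch: a denoising autoencoder is trained to reconstruct clean records from
# corrupted ones, and the learned encoding then feeds a simple downstream classifier.
# All data, sizes and the corruption scheme are assumptions for illustration.
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

X = torch.rand(1000, 120)                       # 120 heterogeneous input fields (assumption)
y = (X[:, :10].sum(dim=1) > 5).long().numpy()   # synthetic "functional decline" target

encoder = nn.Sequential(nn.Linear(120, 32), nn.ReLU())
decoder = nn.Sequential(nn.Linear(32, 120), nn.Sigmoid())
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

for epoch in range(200):
    corrupted = X * (torch.rand_like(X) > 0.2).float()   # masking noise on ~20% of fields
    recon = decoder(encoder(corrupted))
    loss = ((recon - X) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# The (frozen) encoder output serves as the shared representation for classification.
with torch.no_grad():
    Z = encoder(X).numpy()
clf = LogisticRegression(max_iter=1000).fit(Z[:800], y[:800])
print("held-out accuracy: %.3f" % clf.score(Z[800:], y[800:]))
```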
APA, Harvard, Vancouver, ISO, and other styles
40

Espis, Andrea. "Object detection and semantic segmentation for assisted data labeling." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2022.

Find full text
Abstract:
The automation of data labeling tasks is a solution to the errors and time costs related to human labeling. In this thesis work, CenterNet, DeepLabV3, and K-Means applied to the RGB color space are deployed to build a pipeline for assisted data labeling: a semi-automatic process to iteratively improve the quality of the annotations. The proposed pipeline pointed out a total of 1547 wrong and missing annotations when applied to a dataset originally containing 8,300 annotations. Moreover, the quality of each annotation has been drastically improved and, at the same time, more than 600 hours of work have been saved. The same models have also been used to address the real-time tire inspection task, regarding the detection of markers on the surface of tires. According to the experiments, the combination of the DeepLabV3 output and post-processing based on the area and shape of the predicted blobs achieves a maximum mean precision of 0.992 (with mean recall 0.982) and a maximum mean recall of 0.998 (with mean precision 0.960).
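The area- and shape-based post-processing mentioned above can be illustrated with a simple connected-component filter; the thresholds, the eccentricity criterion, and the random stand-in mask below are assumptions for illustration, not the thesis's tuned values or its DeepLabV3 output.

```python
# Hedged sketch of the post-processing step: keep only predicted blobs whose area and
# shape fall inside plausible ranges for a marker.
import numpy as np
from skimage import measure

mask = np.random.rand(256, 256) > 0.995            # stand-in for a binary segmentation
labels = measure.label(mask)

kept = np.zeros_like(mask)
for region in measure.regionprops(labels):
    round_enough = region.eccentricity < 0.9       # reject very elongated blobs
    big_enough = 20 <= region.area <= 2000         # reject specks and huge regions
    if round_enough and big_enough:
        kept[labels == region.label] = True

print("blobs kept:", int(measure.label(kept).max()))
```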
APA, Harvard, Vancouver, ISO, and other styles
41

ZHU, XIANGPING. "Learning Discriminative Features for Person Re-Identification." Doctoral thesis, Università degli studi di Genova, 2020. http://hdl.handle.net/11567/997742.

Full text
Abstract:
To fulfill the requirements of public safety in modern cities, more and more large-scale surveillance camera systems are deployed, resulting in an enormous amount of visual data. Automatically processing and interpreting these data promotes the development and application of visual data analytic technologies. As one of the important research topics in surveillance systems, person re-identification (re-id) aims at retrieving the target person across non-overlapping camera-views that are implemented in a number of distributed space-time locations. It is a fundamental problem for many practical surveillance applications, e.g., person search, cross-camera tracking, multi-camera human behavior analysis and prediction, and it has received considerable attention from both academia and industry. Learning discriminative feature representations is an essential task in person re-id. Although many methodologies have been proposed, discriminative re-id feature extraction is still a challenging problem due to: (1) Intra- and inter-personal variations. The intrinsic properties of camera deployment in surveillance systems lead to various changes in person poses, viewpoints, illumination conditions, etc. This may result in large intra-personal variations and/or small inter-personal variations, thus incurring problems in matching person images. (2) Domain variations. The domain variations between different datasets give rise to the problem of the generalization capability of a re-id model. Directly applying a re-id model trained on one dataset to another one usually causes a large performance degradation. (3) Difficulties in data creation and annotation. Existing person re-id methods, especially deep re-id methods, rely mostly on a large set of inter-camera identity-labelled training data, requiring a tedious data collection and annotation process. This leads to poor scalability in practical person re-id applications. Corresponding to the challenges in learning discriminative re-id features, this thesis contributes to the re-id domain by proposing three related methodologies and one new re-id setting: (1) Gaussian mixture importance estimation. Handcrafted features are usually not discriminative enough for person re-id because of noisy information, such as background clutter. To precisely evaluate the similarities between person images, the main task of distance metric learning is to filter out the noisy information. Keep It Simple and Straightforward MEtric (KISSME) is an effective method in person re-id. However, it is sensitive to the feature dimensionality and cannot capture the multiple modes in a dataset. To this end, a Gaussian Mixture Importance Estimation re-id approach is proposed, which exploits Gaussian Mixture Models to estimate the observed commonalities of similar and dissimilar person pairs in the feature space. (2) Unsupervised domain-adaptive person re-id based on pedestrian attributes. In person re-id, person identities usually do not overlap among different domains (or datasets), and this raises difficulties in generalizing re-id models. Different from person identity, pedestrian attributes, e.g., hair length, clothes type and color, are consistent across different domains (or datasets). However, most re-id datasets lack attribute annotations. On the other hand, in the field of pedestrian attribute recognition, there are a number of datasets labeled with attributes.
Exploiting such data for re-id purpose can alleviate the shortage of attribute annotations in re-id domain and improve the generalization capability of re-id model. To this end, an unsupervised domain-adaptive re-id feature learning framework is proposed to make full use of attribute annotations. Specifically, an existing unsupervised domain adaptation method has been extended to transfer attribute-based features from attribute recognition domain to the re-id domain. With the proposed re-id feature learning framework, the domain invariant feature representations can be effectively extracted. (3) Intra-camera supervised person re-id. Annotating the large-scale re-id datasets requires a tedious data collection and annotation process and therefore leads to poor scalability in practical person re-id applications. To overcome this fundamental limitation, a new person re-id setting is considered without inter-camera identity association but only with identity labels independently annotated within each camera-view. This eliminates the most time-consuming and tedious inter-camera identity association annotating process and thus significantly reduces the amount of human efforts required during annotation. It hence gives rise to a more scalable and more feasible learning scenario, which is named as Intra-Camera Supervised (ICS) person re-id. Under this ICS setting, a new re-id method, i.e., Multi-task Mulit-label (MATE) learning method, is formulated. Given no inter-camera association, MATE is specially designed for self-discovering the inter-camera identity correspondence. This is achieved by inter-camera multi-label learning under a joint multi-task inference framework. In addition, MATE can also efficiently learn the discriminative re-id feature representations using the available identity labels within each camera-view.
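The KISSME baseline that the first contribution builds on can be summarised with a short sketch. The code below is a generic, simplified illustration of KISSME-style metric learning from similar and dissimilar pairs (not the thesis's Gaussian Mixture Importance Estimation); the synthetic descriptors and their dimensionality are placeholder assumptions.

```python
import numpy as np

def kissme_metric(X_a, X_b, same):
    """Learn a KISSME-style Mahalanobis matrix M from feature pairs.

    X_a, X_b : (n_pairs, d) arrays of paired person descriptors.
    same     : boolean array, True where a pair shows the same identity.
    Dissimilarity of two descriptors x, y is then (x - y)^T M (x - y).
    """
    diff = X_a - X_b
    cov_s = np.cov(diff[same], rowvar=False)    # covariance of similar pairs
    cov_d = np.cov(diff[~same], rowvar=False)   # covariance of dissimilar pairs
    M = np.linalg.inv(cov_s) - np.linalg.inv(cov_d)
    # project back onto the cone of positive semi-definite matrices
    w, V = np.linalg.eigh((M + M.T) / 2)
    return V @ np.diag(np.clip(w, 0, None)) @ V.T

# toy example with random placeholder descriptors
rng = np.random.default_rng(0)
d, n = 32, 500
X_a = rng.normal(size=(n, d))
same = rng.random(n) < 0.5
# similar pairs are small perturbations, dissimilar pairs are independent
X_b = np.where(same[:, None], X_a + 0.1 * rng.normal(size=(n, d)),
               rng.normal(size=(n, d)))
M = kissme_metric(X_a, X_b, same)
dist = np.einsum('ij,jk,ik->i', X_a - X_b, M, X_a - X_b)
print("mean distance (same):", dist[same].mean(),
      "mean distance (different):", dist[~same].mean())
```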
APA, Harvard, Vancouver, ISO, and other styles
42

Carlsson, Filip, and Philip Lindgren. "Deep Scenario Generation of Financial Markets." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-273631.

Full text
Abstract:
The goal of this thesis is to explore a new clustering algorithm, VAE-Clustering, and examine whether it can be applied to find differences in the distribution of stock returns and to augment the distribution of a current portfolio of stocks and see how it performs in different market conditions. The VAE-clustering method is, as mentioned, a newly introduced method that has not been widely tested, especially not on time series. The first step is therefore to see if and how well the clustering works. We first apply the algorithm to a dataset containing monthly time series of the power demand in Italy. The purpose of this part is to focus on how well the method works technically. Once the model works well and generates proper results with the Italian power demand data, we move on and apply the model to stock return data. In the latter application we are unable to find meaningful clusters and are therefore unable to move forward towards the goal of the thesis. The results show that the VAE-clustering method is applicable to time series. The power demand has clear differences from season to season, and the model can successfully identify those differences. When it comes to the financial data, we hoped that the model would be able to find different market regimes based on time periods. The model is, however, not able to distinguish different time periods from each other. We therefore conclude that the VAE-clustering method is applicable to time series data, but that the structure and setting of the financial data in this thesis make it too hard to find meaningful clusters. The major finding is that the VAE-clustering method can be applied to time series. We highly encourage further research to determine whether the method can be successfully used on financial data in settings other than those tested in this thesis.
The purpose of this thesis is to explore a new clustering algorithm, VAE-Clustering, and examine whether it can be applied to find differences in the distribution of stock returns and to alter the distribution of a current stock portfolio and see how it performs under different market conditions. The VAE-clustering method is, as mentioned, a newly introduced method and not widely tested, especially not on time series. The first step is therefore to see whether and how well the clustering works. We first apply the algorithm to a dataset containing monthly time series of the power demand in Italy. The purpose of this part is to focus on how well the method works technically. When the model works well and gives satisfactory results, we move on and apply the model to stock return data. In the latter application we are unable to find meaningful clusters and therefore cannot move towards the goal, which was to simulate different markets and see how a current portfolio performs under different market regimes. The results show that the VAE-clustering method is well applicable to time series. The demand for electricity has clear differences from season to season, and the model can successfully identify these differences. As for the financial data, we hoped that the model would be able to find different market regimes based on time periods. However, the model cannot distinguish different time periods from each other. We therefore conclude that the VAE-clustering method is applicable to time series data, but that the structure of the financial data examined in this thesis makes it difficult to find meaningful clusters. The most important finding is that the VAE-clustering method can be applied to time series. We encourage further research to find out whether the method can be successfully used on financial data in forms other than those tested in this thesis.
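As a rough illustration of the workflow described above, the sketch below trains a small variational autoencoder on fixed-length windows of a univariate time series and then clusters the latent means with k-means. It is a minimal sketch under placeholder assumptions (synthetic data, window length, latent size, and a plain VAE followed by k-means rather than the exact VAE-Clustering procedure of the thesis).

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class VAE(nn.Module):
    def __init__(self, window=24, latent=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(window, 32), nn.ReLU())
        self.mu, self.logvar = nn.Linear(32, latent), nn.Linear(32, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(), nn.Linear(32, window))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation
        return self.dec(z), mu, logvar

# synthetic placeholder series: a seasonal signal plus noise, cut into windows
t = torch.arange(0, 24 * 200, dtype=torch.float32)
series = torch.sin(2 * torch.pi * t / 24) + 0.1 * torch.randn_like(t)
windows = series.reshape(-1, 24)

model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(200):
    recon, mu, logvar = model(windows)
    rec_loss = ((recon - windows) ** 2).sum(dim=1).mean()
    kld = (-0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(dim=1)).mean()
    loss = rec_loss + kld
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    _, mu, _ = model(windows)
labels = KMeans(n_clusters=4, n_init=10).fit_predict(mu.numpy())
print(labels[:20])  # cluster assignment of the first windows
```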
APA, Harvard, Vancouver, ISO, and other styles
43

Choi, Jin-Woo. "Action Recognition with Knowledge Transfer." Diss., Virginia Tech, 2021. http://hdl.handle.net/10919/101780.

Full text
Abstract:
Recent progress on deep neural networks has shown remarkable action recognition performance from videos. The remarkable performance is often achieved by transfer learning: training a model on a large-scale labeled dataset (source) and then fine-tuning the model on small-scale labeled datasets (targets). However, existing action recognition models do not always generalize well on new tasks or datasets because of the following two reasons. i) Current action recognition datasets have a spurious correlation between action types and background scene types. The models trained on these datasets are biased towards the scene instead of focusing on the actual action. This scene bias leads to poor generalization performance. ii) Directly testing the model trained on the source data on the target data leads to poor performance as the source and target distributions are different. Fine-tuning the model on the target data can mitigate this issue. However, manually labeling small-scale target videos is labor-intensive. In this dissertation, I propose solutions to these two problems. For the first problem, I propose to learn scene-invariant action representations to mitigate the scene bias in action recognition models. Specifically, I augment the standard cross-entropy loss for action classification with 1) an adversarial loss for the scene types and 2) a human mask confusion loss for videos where the human actors are invisible. These two losses encourage learning representations unsuitable for predicting 1) the correct scene types and 2) the correct action types when there is no evidence. I validate the efficacy of the proposed method by transfer learning experiments. I transfer the pre-trained model to three different tasks, including action classification, temporal action localization, and spatio-temporal action detection. The results show consistent improvement over the baselines for every task and dataset. I formulate human action recognition as an unsupervised domain adaptation (UDA) problem to handle the second problem. In the UDA setting, we have many labeled videos as source data and unlabeled videos as target data. We can use already existing labeled video datasets as source data in this setting. The task is to align the source and target feature distributions so that the learned model can generalize well on the target data. I propose 1) aligning the more important temporal part of each video and 2) encouraging the model to focus on action, not the background scene, to learn domain-invariant action representations. The proposed method is simple and intuitive while achieving state-of-the-art performance without training on a lot of labeled target videos. I then relax the unsupervised target data setting to a sparsely labeled target data setting and explore semi-supervised video action recognition, where we have a lot of labeled videos as source data and sparsely labeled videos as target data. The semi-supervised setting is practical as sometimes we can afford a small cost for labeling target data. I propose multiple video data augmentation methods to inject photometric, geometric, temporal, and scene invariances into the action recognition model in this setting. The resulting method shows favorable performance on the public benchmarks.
Doctor of Philosophy
Recent progress on deep learning has shown remarkable action recognition performance. The remarkable performance is often achieved by transferring the knowledge learned from existing large-scale data to the small-scale data specific to applications. However, existing action recognition models do not always work well on new tasks and datasets because of the following two problems. i) Current action recognition datasets have a spurious correlation between action types and background scene types. The models trained on these datasets are biased towards the scene instead of focusing on the actual action. This scene bias leads to poor performance on the new datasets and tasks. ii) Directly testing the model trained on the source data on the target data leads to poor performance as the source and target distributions are different. Fine-tuning the model on the target data can mitigate this issue. However, manually labeling small-scale target videos is labor-intensive. In this dissertation, I propose solutions to these two problems. To tackle the first problem, I propose to learn scene-invariant action representations that mitigate background-scene-biased human action recognition models. Specifically, the proposed method learns representations that cannot predict the scene types or the correct actions when there is no evidence. I validate the proposed method's effectiveness by transferring the pre-trained model to multiple action understanding tasks. The results show consistent improvement over the baselines for every task and dataset. To handle the second problem, I formulate human action recognition as an unsupervised learning problem on the target data. In this setting, we have many labeled videos as source data and unlabeled videos as target data. We can use already existing labeled video datasets as source data in this setting. The task is to align the source and target feature distributions so that the learned model can generalize well on the target data. I propose 1) aligning the more important temporal part of each video and 2) encouraging the model to focus on action, not the background scene. The proposed method is simple and intuitive while achieving state-of-the-art performance without training on a lot of labeled target videos. I then relax the unsupervised target data setting to a sparsely labeled target data setting. Here, we have many labeled videos as source data and sparsely labeled videos as target data. The setting is practical as sometimes we can afford a small cost for labeling target data. I propose multiple video data augmentation methods to inject color, spatial, temporal, and scene invariances into the action recognition model in this setting. The resulting method shows favorable performance on the public benchmarks.
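The scene-debiasing idea in the first part (an action classifier trained with an adversarial scene term and a confusion term on human-masked clips) can be summarised by the loss sketch below. It is a hedged illustration with placeholder networks, clip sizes and loss weights, using a gradient-reversal layer for the adversarial scene branch and an entropy-style confusion loss; it is not the dissertation's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass, negated gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return x
    @staticmethod
    def backward(ctx, grad):
        return -grad

feat_net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 32 * 32, 256), nn.ReLU())
action_head = nn.Linear(256, 10)   # placeholder: 10 action classes
scene_head = nn.Linear(256, 5)     # placeholder: 5 scene classes

def total_loss(clip, clip_masked, action_y, scene_y, w_adv=0.5, w_conf=0.5):
    f = feat_net(clip)
    loss_action = F.cross_entropy(action_head(f), action_y)
    # adversarial scene loss: the scene head learns scenes, the features unlearn them
    loss_scene = F.cross_entropy(scene_head(GradReverse.apply(f)), scene_y)
    # confusion loss: on human-masked clips the action prediction should be uniform
    logp = F.log_softmax(action_head(feat_net(clip_masked)), dim=1)
    loss_conf = -logp.mean()       # cross-entropy against a uniform target
    return loss_action + w_adv * loss_scene + w_conf * loss_conf

# toy batch of 2 clips (channels, frames, height, width are placeholder sizes)
clip = torch.randn(2, 3, 8, 32, 32)
clip_masked = torch.randn(2, 3, 8, 32, 32)
loss = total_loss(clip, clip_masked, torch.tensor([1, 3]), torch.tensor([0, 2]))
loss.backward()
print(loss.item())
```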
APA, Harvard, Vancouver, ISO, and other styles
44

Alise, Dario Fioravante. "Algoritmo di "Label Propagation" per il clustering di documenti testuali." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/14388/.

Full text
Abstract:
In the last years of the previous century, the advent of the Internet made countless amounts of text available for online consultation, coming both from books and magazines and from new forms of network communication such as email, forums, newsgroups and chat.
The solutions adopted in the field of Text Mining (from now on abbreviated as TM), which is the extension of Data Mining to unstructured textual data, are based on computational, statistical and linguistic foundations and are in principle applicable to documents of any size.
With the advent of Social Networks, the quantity and size of the textual data to be analysed has grown in a sub-exponential manner, and although the available techniques remain valid and applicable, in the last four to five years research has focused on an emerging technique, called semantic hashing, which makes it possible to map documents of any type into binary strings.
Building on this new branch of research, the main aim of this thesis is to define, design and implement a clustering algorithm that, taking these binary data as input, is able to label them more precisely and in less time than the other approaches in the literature.
After a description of the main TM techniques, a discussion of semantic hashing and of the theoretical foundations on which it rests will follow, before introducing the algorithm used for clustering, presenting its architectural scheme of operation and its implementation.
Finally, the results of running the algorithm, from now on called Label Propagation (abbreviated as LP), will be compared and analysed against those obtained with standard techniques.
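A generic label-propagation clustering step of the kind described above can be sketched as follows: binary codes are compared by Hamming distance, a k-nearest-neighbour graph is built, and each node repeatedly adopts the most frequent label among its neighbours until the labelling stabilises. The code is a simplified, hedged illustration (random placeholder codes and a plain synchronous update), not the exact LP algorithm of the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, bits, k = 300, 64, 10

# placeholder binary codes, e.g. produced upstream by semantic hashing
codes = rng.integers(0, 2, size=(n, bits), dtype=np.uint8)

# pairwise Hamming distances and k nearest neighbours per code
ham = (codes[:, None, :] != codes[None, :, :]).sum(axis=2)
np.fill_diagonal(ham, bits + 1)                 # exclude each code from its own neighbours
neighbors = np.argsort(ham, axis=1)[:, :k]

labels = np.arange(n)                           # every code starts in its own cluster
for _ in range(30):                             # iterate until stable (or a maximum number of rounds)
    new_labels = labels.copy()
    for i in range(n):
        neigh = labels[neighbors[i]]
        vals, counts = np.unique(neigh, return_counts=True)
        new_labels[i] = vals[np.argmax(counts)]  # adopt the most frequent neighbour label
    if np.array_equal(new_labels, labels):
        break
    labels = new_labels

print("number of clusters found:", len(np.unique(labels)))
```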
APA, Harvard, Vancouver, ISO, and other styles
45

Yuan, Xiao. "Graph neural networks for spatial gene expression analysis of the developing human heart." Thesis, Uppsala universitet, Institutionen för biologisk grundutbildning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-427330.

Full text
Abstract:
Single-cell RNA sequencing and in situ sequencing were combined in a recent study of the developing human heart to explore the transcriptional landscape at three developmental stages. However, the method used in the study to create the spatial cellular maps has some limitations: it relies on image segmentation of the nuclei and on cell types defined in advance by single-cell sequencing. In this study, we applied a new unsupervised approach based on graph neural networks to the in situ sequencing data of the human heart to find spatial gene expression patterns and detect novel cell and sub-cell types. In this thesis, we first introduce some relevant background knowledge about the sequencing techniques that generate our data, machine learning in single-cell analysis, and deep learning on graphs. We have explored several graph neural network models and algorithms to learn embeddings for spatial gene expression. Dimensionality reduction and cluster analysis were performed on the embeddings for visualization and identification of biologically functional domains. Based on the cluster gene expression profiles, the locations of the clusters in the heart sections, and a comparison with cell types defined in the previous study, the results of our experiments demonstrate that graph neural networks can learn meaningful representations of spatial gene expression in the human heart. We hope that further validation of our clustering results could give new insights into the cell development and differentiation processes of the human heart.
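A minimal version of the pipeline described above (a neighbourhood graph over spatial spots, graph-based mixing of gene expression, dimensionality reduction and clustering) can be sketched as below. It avoids a dedicated GNN library and uses a simple normalised-adjacency propagation as a stand-in for a trained graph neural network; the synthetic coordinates and expression matrix are placeholder assumptions.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_spots, n_genes = 500, 40

# placeholder data: spatial coordinates and a gene expression count matrix
coords = rng.uniform(0, 1, size=(n_spots, 2))
expr = rng.poisson(2.0, size=(n_spots, n_genes)).astype(float)

# spatial k-NN graph and symmetrically normalised adjacency with self-loops
A = kneighbors_graph(coords, n_neighbors=8, mode='connectivity').toarray()
A = np.maximum(A, A.T) + np.eye(n_spots)
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_norm = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

# two rounds of propagation mix each spot's expression with its neighbourhood,
# mimicking what a simple (untrained) graph convolution would compute
H = A_norm @ (A_norm @ expr)

# low-dimensional embedding and clustering into candidate spatial domains
emb = PCA(n_components=10).fit_transform(H)
domains = KMeans(n_clusters=6, n_init=10).fit_predict(emb)
print(np.bincount(domains))  # number of spots assigned to each domain
```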
APA, Harvard, Vancouver, ISO, and other styles
46

Chafaa, Irched. "Machine learning for beam alignment in mmWave networks." Electronic Thesis or Diss., université Paris-Saclay, 2021. http://www.theses.fr/2021UPASG044.

Full text
Abstract:
To cope with the exponential growth of mobile data traffic, one possible solution is to exploit the large spectral bands available in the millimeter-wave part of the electromagnetic spectrum. However, the transmitted signal is strongly attenuated, implying a limited propagation range and a small number of propagation paths (sparse channel). Consequently, directional beams must be used to focus the energy of the transmitted signal towards its user and compensate for the propagation losses. These beams need to be steered appropriately to guarantee the reliability of the communication link. This is the beam alignment problem for millimeter-wave communication systems. Indeed, the beams of the transmitter and the receiver must be constantly adjusted and aligned to combat the difficult propagation conditions of the millimeter band. Moreover, beam alignment techniques must take into account user mobility and the unpredictable dynamics of the network. This leads to a high signalling and training overhead that impacts network performance. In the first part of this thesis, we reformulate the beam alignment problem using multi-armed bandits, which are relevant in the case of unpredictable and arbitrary network dynamics (non-stationary or even adversarial). We propose online and adaptive methods to independently align the beams of the two nodes of the communication link using only a single bit of feedback. Building on the exponential weights algorithm (EXP3) and the sparse nature of the millimeter-wave channel, we propose a modified version of the original algorithm (MEXP3) with theoretical guarantees in terms of asymptotic regret. Moreover, for a finite time horizon, our regret upper bound is tighter than that of the EXP3 algorithm, indicating better performance in practice. We also introduce a second modification that uses the temporal correlations between successive beam choices in a new beam alignment technique (NBT-MEXP3). In the second part of this thesis, deep learning tools are examined to select beams in an access point -- user link. We exploit unsupervised deep learning to use channel information below 6 GHz in order to predict beams in the millimeter band; this complex channel-to-beam mapping is learned using unlabelled data from the DeepMIMO dataset. We also discuss the choice of an optimal size for the neural network as a function of the number of transmit and receive antennas at the access point. Furthermore, we study the impact of the availability of training data and introduce an approach based on federated learning to predict beams in a multi-link network by sharing only the parameters of the locally trained neural networks (and not the local data). We consider both synchronous and asynchronous federated learning methods. Our numerical results show the potential of our approach, particularly when the training data are scarce or imperfect (noisy).
Finally, we compare our deep-learning-based methods with those of the first part. Simulations show that the choice of a suitable beam alignment method depends on the nature of the application and involves a trade-off between the achieved rate and the computational complexity.
To cope with the ever-increasing mobile data traffic, an envisioned solution for future wireless networks is to exploit the large available spectrum in the millimeter wave (mmWave) band. However, communicating at these high frequencies is very challenging as the transmitted signal suffers from strong attenuation, which leads to a limited propagation range and few multipath components (sparse mmWave channels). Hence, highly directional beams have to be employed to focus the signal energy towards the intended user and compensate for all those losses. Such beams need to be steered appropriately to guarantee a reliable communication link. This represents the so-called beam alignment problem, where the beams of the transmitter and the receiver need to be constantly aligned. Moreover, beam alignment policies need to support device mobility and the unpredicted dynamics of the network, which result in significant signaling and training overhead affecting the overall performance. In the first part of the thesis, we formulate the beam alignment problem via the adversarial multi-armed bandit framework, which copes with arbitrary network dynamics including non-stationary or adversarial components. We propose online and adaptive beam alignment policies relying only on one-bit feedback to steer the beams of both nodes of the communication link in a distributed manner. Building on the well-known exponential weights algorithm (EXP3) and exploiting the sparse nature of mmWave channels, we propose a modified policy (MEXP3), which comes with optimal theoretical guarantees in terms of asymptotic regret. Moreover, for finite horizons, our regret upper bound is tighter than that of the original EXP3, suggesting better performance in practice. We then introduce an additional modification that accounts for the temporal correlation between successive beams and propose another beam alignment policy (NBT-MEXP3). In the second part of the thesis, deep learning tools are investigated to select mmWave beams in an access point -- user link. We leverage unsupervised deep learning to exploit the channel knowledge at sub-6 GHz and predict beamforming vectors in the mmWave band; this complex channel-beam mapping is learned from data drawn from the DeepMIMO dataset without ground-truth labels. We also show how to choose an optimal size of our neural network depending on the number of transmit and receive antennas at the access point. Furthermore, we investigate the impact of training data availability and introduce a federated learning (FL) approach to predict the beams of multiple links by sharing only the parameters of the locally trained neural networks (and not the local data). We investigate both synchronous and asynchronous FL methods. Our numerical simulations show the high potential of our approach, especially when the locally available data is scarce or imperfect (noisy). Finally, we compare our proposed deep learning methods with the reinforcement learning methods derived in the first part. Simulations show that choosing an appropriate beam steering method depends on the target application and is a tradeoff between rate performance and computational complexity.
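The bandit formulation in the first part can be illustrated with the standard EXP3 update over a codebook of beams, driven by one-bit feedback (1 if the chosen beam closed the link, 0 otherwise). This is a hedged sketch of plain EXP3 with a synthetic reward model; the MEXP3 and NBT-MEXP3 modifications described above are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n_beams, horizon, gamma = 16, 2000, 0.05

# placeholder environment: one beam direction succeeds most of the time
best_beam = 11
def one_bit_feedback(beam):
    p_success = 0.9 if beam == best_beam else 0.1
    return float(rng.random() < p_success)

weights = np.ones(n_beams)
picks = np.zeros(n_beams, dtype=int)
for t in range(horizon):
    # mix the exponential-weights distribution with uniform exploration
    probs = (1 - gamma) * weights / weights.sum() + gamma / n_beams
    beam = rng.choice(n_beams, p=probs)
    reward = one_bit_feedback(beam)
    # importance-weighted reward estimate updates only the played beam
    weights[beam] *= np.exp(gamma * reward / (probs[beam] * n_beams))
    picks[beam] += 1

print("most frequently selected beam:", picks.argmax(), "out of", n_beams)
```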
APA, Harvard, Vancouver, ISO, and other styles
47

Sjökvist, Henrik. "Text feature mining using pre-trained word embeddings." Thesis, KTH, Matematisk statistik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-228536.

Full text
Abstract:
This thesis explores a machine learning task where the data contains not only numerical features but also free-text features. In order to employ a supervised classifier and make predictions, the free-text features must be converted into numerical features.  In this thesis, an algorithm is developed to perform that conversion. The algorithm uses a pre-trained word embedding model which maps each word to a vector. The vectors for multiple word embeddings belonging to the same sentence are then combined to form a single sentence embedding. The sentence embeddings for the whole dataset are clustered to identify distinct groups of free-text strings. The cluster labels are output as the numerical features. The algorithm is applied on a specific case concerning operational risk control in banking. The data consists of modifications made to trades in financial instruments. Each such modification comes with a short text string which documents the modification, a trader comment. Converting these strings to numerical trader comment features is the objective of the case study. A classifier is trained and used as an evaluation tool for the trader comment features. The performance of the classifier is measured with and without the trader comment feature. Multiple models for generating the features are evaluated. All models lead to an improvement in classification rate over not using a trader comment feature. The best performance is achieved with a model where the sentence embeddings are generated using the SIF weighting scheme and then clustered using the DBSCAN algorithm.
This thesis deals with a machine learning problem where the data contains free text in addition to numerical attributes. In order to use all the data for supervised learning, the free text must be converted into numerical values. An algorithm is developed in this work to perform that conversion. The algorithm uses pre-trained word embedding models which convert each word into a vector. The vectors for several words in the same sentence can then be combined into a sentence vector. The sentence vectors in the whole dataset are then clustered to identify groups of similar text strings. The output of the algorithm is each data point's cluster membership. The algorithm is applied to a specific case concerning operational risk in the banking sector. The data consists of modifications of financial transactions. Each such modification has an associated text comment that describes the modification, a trader comment. Converting these comments into numerical values is the goal of the case study. A classification model is trained and used to evaluate the numerical values from the trader comments. The classification accuracy is measured with and without the numerical values. Different models for generating the values from the trader comments are evaluated. All models lead to an improvement in classification over not using the trader comments. The best classification accuracy is achieved with a model where the sentence vectors are generated using SIF weighting and then clustered using the DBSCAN algorithm.
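The feature-construction pipeline described above (word vectors, frequency-based SIF weighting, averaging into sentence vectors, clustering with DBSCAN, cluster id as the numeric feature) can be sketched roughly as follows. The tiny embedding table, word frequencies and example comments are placeholder assumptions standing in for a real pre-trained model and dataset, and the common-component removal step of full SIF is included in a simplified form.

```python
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(0)

# placeholder pre-trained embeddings and unigram probabilities
vocab = ["amend", "trade", "fix", "booking", "error", "fee", "late", "cancel"]
emb = {w: rng.normal(size=16) for w in vocab}
p_word = {w: 1.0 / len(vocab) for w in vocab}
a = 1e-3  # SIF smoothing parameter

def sentence_vector(sentence):
    words = [w for w in sentence.lower().split() if w in emb]
    if not words:
        return np.zeros(16)
    # SIF: weight each word vector by a / (a + p(w)), then average
    vecs = [a / (a + p_word[w]) * emb[w] for w in words]
    return np.mean(vecs, axis=0)

comments = ["amend trade booking", "fix booking error", "cancel late trade",
            "amend trade fee", "fix error", "cancel trade"]
X = np.vstack([sentence_vector(c) for c in comments])

# remove the first principal component (common-component removal of SIF)
u = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)[2][0]
X = X - np.outer(X @ u, u)
X = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)  # unit-length rows

labels = DBSCAN(eps=0.7, min_samples=2).fit_predict(X)
print(dict(zip(comments, labels)))  # the cluster id becomes the numeric feature
```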
APA, Harvard, Vancouver, ISO, and other styles
48

Yogeswaran, Arjun. "Self-Organizing Neural Visual Models to Learn Feature Detectors and Motion Tracking Behaviour by Exposure to Real-World Data." Thesis, Université d'Ottawa / University of Ottawa, 2018. http://hdl.handle.net/10393/37096.

Full text
Abstract:
Advances in unsupervised learning and deep neural networks have led to increased performance in a number of domains, and to the ability to draw strong comparisons between the biological method of self-organization conducted by the brain and computational mechanisms. This thesis aims to use real-world data to tackle two areas in the domain of computer vision which have biological equivalents: feature detection and motion tracking. The aforementioned advances have allowed efficient learning of feature representations directly from large sets of unlabeled data instead of using traditional handcrafted features. The first part of this thesis evaluates such representations by comparing regularization and preprocessing methods which incorporate local neighbouring information during training on a single-layer neural network. The networks are trained and tested on the Hollywood2 video dataset, as well as the static CIFAR-10, STL-10, COIL-100, and MNIST image datasets. The induction of topography or simple image blurring via Gaussian filters during training produces better discriminative features as evidenced by the consistent and notable increase in classification results that they produce. In the visual domain, invariant features are desirable such that objects can be classified despite transformations. It is found that most of the compared methods produce more invariant features, however, classification accuracy does not correlate to invariance. The second, and paramount, contribution of this thesis is a biologically-inspired model to explain the emergence of motion tracking behaviour in early development using unsupervised learning. The model’s self-organization is biased by an original concept called retinal constancy, which measures how similar visual contents are between successive frames. In the proposed two-layer deep network, when exposed to real-world video, the first layer learns to encode visual motion, and the second layer learns to relate that motion to gaze movements, which it perceives and creates through bi-directional nodes. This is unique because it uses general machine learning algorithms, and their inherent generative properties, to learn from real-world data. It also implements a biological theory and learns in a fully unsupervised manner. An analysis of its parameters and limitations is conducted, and its tracking performance is evaluated. Results show that this model is able to successfully follow targets in real-world video, despite being trained without supervision on real-world video.
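The first part of the thesis, comparing unsupervised single-layer feature learning with and without local smoothing, can be illustrated with a common patch-based recipe: extract image patches, optionally blur them with a Gaussian filter, and learn a feature dictionary with mini-batch k-means. The sketch below uses random placeholder images and mini-batch k-means as a stand-in for the thesis's single-layer networks and datasets; it only shows the comparison point, not the reported results.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)

def random_patches(images, n_patches=2000, size=8):
    """Sample square patches uniformly from a stack of grayscale images."""
    patches = []
    for _ in range(n_patches):
        img = images[rng.integers(len(images))]
        y, x = rng.integers(0, img.shape[0] - size, 2)
        patches.append(img[y:y + size, x:x + size].ravel())
    return np.array(patches)

def learn_dictionary(images, n_features=64, blur_sigma=None):
    if blur_sigma is not None:
        images = [gaussian_filter(img, sigma=blur_sigma) for img in images]
    P = random_patches(images)
    # per-patch contrast normalisation before dictionary learning
    P = (P - P.mean(axis=1, keepdims=True)) / (P.std(axis=1, keepdims=True) + 1e-8)
    return MiniBatchKMeans(n_clusters=n_features, n_init=3).fit(P).cluster_centers_

# placeholder 32x32 grayscale images standing in for a real dataset
images = [rng.random((32, 32)) for _ in range(50)]
D_raw = learn_dictionary(images)                 # features from raw patches
D_blur = learn_dictionary(images, blur_sigma=1)  # features from Gaussian-blurred patches
print(D_raw.shape, D_blur.shape)
```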
APA, Harvard, Vancouver, ISO, and other styles
49

Chafik, Sanaa. "Machine learning techniques for content-based information retrieval." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLL008/document.

Full text
Abstract:
With the evolution of digital technologies and the proliferation of the Internet, the quantity of digital information has grown considerably. Similarity search (or nearest-neighbour search) is a problem that several research communities have tried to solve. Content-based information retrieval systems are one of the promising solutions to this problem. These systems are essentially composed of three fundamental units: a data representation unit for feature extraction, a multidimensional indexing unit for structuring the feature space, and a nearest-neighbour search unit for retrieving similar information. Information (image, text, audio, video) can be represented by a multidimensional vector describing the overall content of the input data. The second unit structures the feature space into an index structure, in which the third unit, similarity search, operates. In our research work, we propose three content-based nearest-neighbour retrieval systems. The three approaches are unsupervised, and therefore suited to both labelled and unlabelled data. They are based on the concept of hashing for efficient multidimensional nearest-neighbour search. Unlike existing hashing approaches, which are binary, the proposed approaches provide index structures with real-valued hashing. Although binary hashing approaches provide a good quality/computation-time trade-off, their performance in terms of quality (accuracy) degrades because of the information lost during the binarisation process. By contrast, real-valued hashing approaches provide good search quality with a better approximation of the original space, but generally induce an additional cost in computation time. This last problem is addressed in the third contribution. The proposed approaches are classified into two categories, shallow and deep. In the first category, we propose two shallow hashing techniques, called Symmetries of the Cube Locality Sensitive Hashing (SC-LSH) and Cluster-Based Data Oriented Hashing (CDOH), based respectively on randomized hashing and shallow statistical learning. SC-LSH offers a solution to the memory-space problem faced by most randomized hashing approaches by considering a semi-random hashing scheme that partially reduces their random component, and hence their memory requirements, while preserving their efficiency in structuring heterogeneous spaces. The second technique, CDOH, proposes to eliminate the randomness effect by combining unsupervised learning techniques with the hashing concept. CDOH provides better performance in terms of computation time, memory space and search quality. The third contribution is a deep-neural-network-based hashing approach called "Unsupervised Deep Neuron-per-Neuron Hashing" (UDN2H). UDN2H proposes to index individually the output of each neuron of the central layer of an unsupervised model.
The latter is a deep autoencoder capturing a high-level individual structure of each output neuron. Our three approaches, SC-LSH, CDOH and UDN2H, were proposed sequentially during this thesis, with an increasing level of complexity in terms of the models developed and of the search quality obtained on large information databases.
The amount of media data is growing at high speed with the fast growth of Internet and media resources. Performing an efficient similarity (nearest neighbor) search in such a large collection of data is a very challenging problem that the scientific community has been attempting to tackle. One of the most promising solutions to this fundamental problem is Content-Based Media Retrieval (CBMR) systems. The latter are search systems that perform the retrieval task in large media databases based on the content of the data. CBMR systems consist essentially of three major units, a Data Representation unit for feature representation learning, a Multidimensional Indexing unit for structuring the resulting feature space, and a Nearest Neighbor Search unit to perform efficient search. Media data (i.e. image, text, audio, video, etc.) can be represented by meaningful numeric information (i.e. multidimensional vector), called Feature Description, describing the overall content of the input data. The task of the second unit is to structure the resulting feature descriptor space into an index structure, where the third unit, effective nearest neighbor search, is performed.In this work, we address the problem of nearest neighbor search by proposing three Content-Based Media Retrieval approaches. Our three approaches are unsupervised, and thus can adapt to both labeled and unlabeled real-world datasets. They are based on a hashing indexing scheme to perform effective high dimensional nearest neighbor search. Unlike most recent existing hashing approaches, which favor indexing in Hamming space, our proposed methods provide index structures adapted to a real-space mapping. Although Hamming-based hashing methods achieve good accuracy-speed tradeoff, their accuracy drops owing to information loss during the binarization process. By contrast, real-space hashing approaches provide a more accurate approximation in the mapped real-space as they avoid the hard binary approximations.Our proposed approaches can be classified into shallow and deep approaches. In the former category, we propose two shallow hashing-based approaches namely, "Symmetries of the Cube Locality Sensitive Hashing" (SC-LSH) and "Cluster-based Data Oriented Hashing" (CDOH), based respectively on randomized-hashing and shallow learning-to-hash schemes. The SC-LSH method provides a solution to the space storage problem faced by most randomized-based hashing approaches. It consists of a semi-random scheme reducing partially the randomness effect of randomized hashing approaches, and thus the memory storage problem, while maintaining their efficiency in structuring heterogeneous spaces. The CDOH approach proposes to eliminate the randomness effect by combining machine learning techniques with the hashing concept. The CDOH outperforms the randomized hashing approaches in terms of computation time, memory space and search accuracy.The third approach is a deep learning-based hashing scheme, named "Unsupervised Deep Neuron-per-Neuron Hashing" (UDN2H). The UDN2H approach proposes to index individually the output of each neuron of the top layer of a deep unsupervised model, namely a Deep Autoencoder, with the aim of capturing the high level individual structure of each neuron output.Our three approaches, SC-LSH, CDOH and UDN2H, were proposed sequentially as the thesis was progressing, with an increasing level of complexity in terms of the developed models, and in terms of the effectiveness and the performances obtained on large real-world datasets
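As background for the hashing-based indexing that all three contributions rely on, the sketch below shows classical random-hyperplane locality-sensitive hashing for approximate nearest-neighbour search: descriptors are hashed to short binary signatures, and only candidates in the query's bucket are ranked exactly. This illustrates the generic LSH idea with placeholder data, not SC-LSH, CDOH or UDN2H themselves.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
n, d, n_bits = 5000, 64, 16

data = rng.normal(size=(n, d))                 # placeholder feature descriptors
planes = rng.normal(size=(n_bits, d))          # random hyperplanes define the hash

def signature(x):
    """Binary signature: which side of each random hyperplane x falls on."""
    return tuple((planes @ x > 0).astype(int))

# index all descriptors into hash buckets
buckets = defaultdict(list)
for i, x in enumerate(data):
    buckets[signature(x)].append(i)

def query(q, top_k=5):
    candidates = buckets.get(signature(q), [])
    if not candidates:                         # fall back to brute force if the bucket is empty
        candidates = range(n)
    dists = [(np.linalg.norm(data[i] - q), i) for i in candidates]
    return sorted(dists)[:top_k]

q = data[42] + 0.05 * rng.normal(size=d)       # a slightly perturbed known point
print(query(q))                                # index 42 should rank near the top
```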
APA, Harvard, Vancouver, ISO, and other styles
50

Sala, Cardoso Enric. "Advanced energy management strategies for HVAC systems in smart buildings." Doctoral thesis, Universitat Politècnica de Catalunya, 2019. http://hdl.handle.net/10803/668528.

Full text
Abstract:
The efficacy of energy management systems at dealing with energy consumption in buildings has been a topic of growing interest in recent years due to the ever-increasing global energy demand and the large percentage of energy currently used by buildings. The scale of this sector has attracted research effort with the objective of uncovering potential improvement avenues and materializing them with the help of recent technological advances that could be exploited to lower the energetic footprint of buildings. Specifically, in the area of heating, ventilating and air conditioning installations, the availability of large amounts of historical data in building management software suites makes possible the study of how resource-efficient these systems really are when entrusted with ensuring occupant comfort. Actually, recent reports have shown that there is a gap between the ideal operating performance and the performance achieved in practice. Accordingly, this thesis considers the research of novel energy management strategies for heating, ventilating and air conditioning installations in buildings, aimed at narrowing the performance gap by employing data-driven methods to increase their context awareness, allowing management systems to steer the operation towards higher efficiency. This includes the advancement of modeling methodologies capable of extracting actionable knowledge from historical building behavior databases, through load forecasting and equipment operational performance estimation supporting the identification of a building's context and energetic needs, and the development of a generalizable multi-objective optimization strategy aimed at meeting these needs while minimizing the consumption of energy. The experimental results obtained from the implementation of the developed methodologies show a significant potential for increasing the energy efficiency of heating, ventilating and air conditioning systems, while being sufficiently generic to support their usage in different installations having diverse equipment. In conclusion, a complete analysis and actuation framework was developed, implemented and validated by means of an experimental database acquired from a pilot plant during the research period of this thesis. The obtained results demonstrate the efficacy of the proposed standalone contributions and, as a whole, represent a suitable solution for helping to increase the performance of heating, ventilating and air conditioning installations without affecting the comfort of their occupants.
The efficacy of energy management systems in addressing energy consumption in buildings is a topic that has received growing interest in recent years due to the increasing global energy demand and the large percentage of energy currently used by buildings. The scale of this sector has attracted considerable research attention aimed at discovering possible avenues of improvement and materialising them with the help of recent technological advances that could be exploited to reduce the energy needs of buildings. Specifically, in the area of heating, ventilation and air conditioning installations, the availability of large historical databases in building management systems makes it possible to study how efficient these systems really are when they are entrusted with ensuring the comfort of their occupants. In fact, recent reports indicate that there is a gap between the ideal operating performance and the performance usually achieved in practice. Consequently, this thesis considers the investigation of new energy management strategies for heating, ventilation and air conditioning installations in buildings, aimed at reducing the performance gap through the use of data-driven methods to increase their contextual awareness, allowing management systems to steer operation towards higher-performing working regions. This includes the advancement of modelling methodologies capable of extracting knowledge from databases of historical building behaviour, through the forecasting of consumption loads and the estimation of the operational performance of the equipment, supporting the identification of a building's operating context and energy needs, as well as the development of a generalisable multi-objective optimisation strategy to minimise energy consumption while satisfying those energy needs. The experimental results obtained from the implementation of the developed methodologies show significant potential for increasing the energy efficiency of air conditioning systems, while being generic enough to allow their use in different installations with diverse equipment. In conclusion, a complete analysis and actuation framework was developed, implemented and validated using an experimental database acquired from a pilot plant during the research period of this thesis. The results obtained demonstrate the efficacy of the individual contributions and, as a whole, represent a suitable solution for helping to increase the performance of air conditioning installations without affecting the comfort of their occupants.
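One building block mentioned above, short-term load forecasting from historical building data, can be illustrated with a simple lag-feature regression. The sketch below uses a synthetic hourly load profile and a gradient-boosting regressor as a stand-in for the thesis's modelling methodology; the feature choices and horizons are placeholder assumptions.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# synthetic hourly HVAC load: daily cycle + weekly effect + noise (placeholder data)
hours = np.arange(24 * 90)
load = (10 + 5 * np.sin(2 * np.pi * hours / 24)
        + 2 * np.sin(2 * np.pi * hours / (24 * 7))
        + rng.normal(0, 0.5, hours.size))

# lag features: load 1 h, 24 h and 168 h earlier, plus the hour of day
lags = [1, 24, 168]
start = max(lags)
X = np.column_stack([load[start - lag:-lag] for lag in lags]
                    + [hours[start:] % 24])
y = load[start:]

split = int(0.8 * len(y))                      # simple chronological train/test split
model = GradientBoostingRegressor().fit(X[:split], y[:split])
pred = model.predict(X[split:])
mae = np.mean(np.abs(pred - y[split:]))
print(f"test MAE: {mae:.2f} (load units)")
```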
APA, Harvard, Vancouver, ISO, and other styles