Dissertations on the topic "Deep Generative Models"

To view the other types of publications on this topic, follow this link: Deep Generative Models.

Explore the top 50 dissertations for research on the topic "Deep Generative Models".

Next to each work in the bibliography, an "Add to bibliography" option is available. Use it, and the bibliographic reference for the chosen work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).

You can also download the full text of the scholarly publication as a PDF and read an online annotation of the work if the relevant parameters are available in its metadata.

Browse dissertations from a wide variety of disciplines and compile a correctly formatted bibliography.

1

Miao, Yishu. „Deep generative models for natural language processing“. Thesis, University of Oxford, 2017. http://ora.ox.ac.uk/objects/uuid:e4e1f1f9-e507-4754-a0ab-0246f1e1e258.

Full text of the source
Annotation:
Deep generative models are essential to Natural Language Processing (NLP) due to their outstanding ability to use unlabelled data, to incorporate abundant linguistic features, and to learn interpretable dependencies among data. As the structures become deeper and more complex, an effective and efficient inference method becomes increasingly important. In this thesis, neural variational inference is applied to carry out inference for deep generative models. While traditional variational methods derive an analytic approximation for the intractable distributions over latent variables, here we construct an inference network conditioned on the discrete text input to provide the variational distribution. Powerful neural networks are able to approximate complicated non-linear distributions and open up possibilities for more interesting and expressive generative models. We therefore develop the potential of neural variational inference and apply it to a variety of models for NLP with continuous or discrete latent variables.

This thesis is divided into three parts. Part I introduces a generic variational inference framework for generative and conditional models of text. For continuous or discrete latent variables, we apply a continuous reparameterisation trick or the REINFORCE algorithm to build low-variance gradient estimators. To further explore Bayesian non-parametrics in deep neural networks, we propose a family of neural networks that parameterise categorical distributions with continuous latent variables. Using the stick-breaking construction, an unbounded categorical distribution is incorporated into our deep generative models, which can be optimised by stochastic gradient back-propagation with a continuous reparameterisation.

Part II explores continuous latent variable models for NLP. Chapter 3 discusses the Neural Variational Document Model (NVDM): an unsupervised generative model of text which aims to extract a continuous semantic latent variable for each document. In Chapter 4, neural topic models modify the neural document model by parameterising categorical distributions with continuous latent variables, so that topics are explicitly modelled by discrete latent variables. These models are further extended to neural unbounded topic models with the help of the stick-breaking construction, and a truncation-free variational inference method is proposed based on a Recurrent Stick-Breaking construction (RSB). Chapter 5 describes the Neural Answer Selection Model (NASM) for learning a latent stochastic attention mechanism to model the semantics of question-answer pairs and predict their relatedness.

Part III discusses discrete latent variable models. Chapter 6 introduces latent sentence compression models. The Auto-encoding Sentence Compression Model (ASC), a discrete variational auto-encoder, generates a sentence through a sequence of discrete latent variables representing explicit words. The Forced Attention Sentence Compression Model (FSC) incorporates a combined pointer network biased towards using words from the source sentence, which significantly improves performance when jointly trained with the ASC model in a semi-supervised fashion. Chapter 7 describes the Latent Intention Dialogue Model (LIDM), which employs a discrete latent variable to learn underlying dialogue intentions. The latent intentions can be interpreted as actions guiding the generation of machine responses, which can be further refined autonomously by reinforcement learning. Finally, Chapter 8 summarises our findings and directions for future work.
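The continuous reparameterisation trick that the abstract relies on for low-variance gradient estimators can be sketched in a few lines (an illustrative sketch, not code from the thesis):

```python
import math
import random

def reparameterise(mu, log_var):
    """Sample z ~ N(mu, exp(log_var)) as z = mu + sigma * eps with
    eps ~ N(0, 1).  The stochasticity is isolated in the parameter-free
    noise eps, so the sample is a differentiable function of mu and
    log_var and gradients can flow through both."""
    eps = random.gauss(0.0, 1.0)
    return mu + math.exp(0.5 * log_var) * eps
```

In an actual model, `mu` and `log_var` would be outputs of the inference network conditioned on the text input.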
APA, Harvard, Vancouver, ISO, and other citation styles
2

Misino, Eleonora. „Deep Generative Models with Probabilistic Logic Priors“. Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/24058/.

Annotation:
Many different extensions of the VAE framework have been introduced in the past. However, the vast majority of them focused on purely sub-symbolic approaches that are not sufficient for solving generative tasks that require a form of reasoning. In this thesis, we propose the probabilistic logic VAE (PLVAE), a neuro-symbolic deep generative model that combines the representational power of VAEs with the reasoning ability of probabilistic logic programming. The strength of PLVAE resides in its probabilistic logic prior, which provides an interpretable structure to the latent space that can be easily changed in order to apply the model to different scenarios. We provide empirical results of our approach by training PLVAE on a base task and then using the same model to generalize to novel tasks that involve reasoning with the same set of symbols.
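The idea of a logic-structured prior can be illustrated independently of the thesis's implementation. In this hypothetical sketch, the prior places uniform mass on exactly those discrete latent states that satisfy a symbolic constraint (a real PLVAE prior is defined by a probabilistic logic program, but the interpretable structure is analogous):

```python
def logic_prior(states, constraint):
    """Uniform probability over the latent states that satisfy
    `constraint`, zero elsewhere.  `states` is a list of hashable
    latent configurations; `constraint` is a boolean predicate."""
    satisfying = [s for s in states if constraint(s)]
    p = 1.0 / len(satisfying)
    return {s: (p if constraint(s) else 0.0) for s in states}
```

Swapping the constraint changes the latent-space structure without retraining the rest of the model, which is the flexibility the abstract highlights.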
3

Nilsson, Mårten. „Augmenting High-Dimensional Data with Deep Generative Models“. Thesis, KTH, Robotik, perception och lärande, RPL, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-233969.

Annotation:
Data augmentation is a technique that can be performed in various ways to improve the training of discriminative models. The recent developments in deep generative models offer new ways of augmenting existing data sets. In this thesis, a framework for augmenting annotated data sets with deep generative models is proposed, together with a method for quantitatively evaluating the quality of the generated data sets. Using this framework, two data sets for pupil localization were generated with different generative models, including both well-established models and a novel model proposed for this purpose. The novel model was shown, both qualitatively and quantitatively, to generate the best data sets. A set of smaller experiments on standard data sets also revealed cases where this generative model could improve the performance of an existing discriminative model. The results indicate that generative models can be used to augment or replace existing data sets when training discriminative models.
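At its core, augmenting an annotated data set with a generative model amounts to mixing sampled (example, label) pairs into the real data. A minimal sketch, with a hypothetical `generator` callable standing in for a trained deep generative model:

```python
import random

def augment(real_pairs, generator, n_synthetic, seed=0):
    """Mix n_synthetic generated (example, label) pairs into an
    annotated data set.  `generator` is a hypothetical callable that
    returns one synthetic labelled example per call."""
    rng = random.Random(seed)
    mixed = list(real_pairs) + [generator() for _ in range(n_synthetic)]
    rng.shuffle(mixed)
    return mixed
```

Evaluating the quality of the generated pairs, as the thesis does, is the harder part; this only shows where they enter the training pipeline.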
4

Lindqvist, Niklas. „Automatic Question Paraphrasing in Swedish with Deep Generative Models“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-294320.

Annotation:
Paraphrase generation refers to the task of automatically generating a paraphrase given an input sentence or text. It is a fundamental yet challenging natural language processing (NLP) task, utilized in a variety of applications such as question answering, information retrieval, and conversational systems. In this study, we address the problem of paraphrase generation for questions in Swedish by evaluating two deep generative models that have shown promising results on paraphrase generation for questions in English. The first model is a Conditional Variational Autoencoder (C-VAE); the second extends the first by introducing a discriminator network to form a Generative Adversarial Network (GAN) architecture. In addition to these models, a method not based on machine learning was implemented to act as a baseline. The models were evaluated using both quantitative and qualitative measures, including grammatical correctness and equivalence to the source question. The results show that the deep generative models outperformed the baseline across all quantitative metrics. Furthermore, the qualitative evaluation showed that the deep generative models outperformed the baseline at generating grammatically correct sentences, but there was no noticeable difference between the models in terms of equivalence to the source question.
5

Gane, Georgiana Andreea. „Building generative models over discrete structures : from graphical models to deep learning“. Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/121611.

Annotation:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
The goal of this thesis is to investigate generative models over discrete structures, such as binary grids, alignments, or arbitrary graphs. We focus on developing models that are easy to sample from, and we approach the task from two broad perspectives: defining models via structured potential functions, and via neural network based decoders. In the first case, we investigate Perturbation Models, a family of implicit distributions where samples emerge through optimization of randomized potential functions. Designed explicitly for efficient sampling, Perturbation Models are strong candidates for building generative models over structures, and the leading open questions pertain to understanding the properties of the induced models and developing practical learning algorithms.
In this thesis, we present theoretical results showing that, in contrast to the more established Gibbs models, low-order potential functions, after undergoing randomization and maximization, lead to high-order dependencies in the induced distributions. Furthermore, while conditioning in Gibbs distributions is straightforward, conditioning in Perturbation Models is typically not; we theoretically characterize cases where the straightforward approach produces the correct results. Finally, we introduce a new learning algorithm for Perturbation Models based on Inverse Combinatorial Optimization. We illustrate empirically both the induced dependencies and the inverse optimization approach in learning tasks inspired by computer vision problems. In the second case, we sequentialize the structures, converting structure generation into a sequence of discrete decisions, to enable the use of sequential models.
We explore maximum likelihood training with step-wise supervision and continuous relaxations of the intermediate decisions. With respect to intermediate discrete representations, the main directions consist of using gradient estimators or designing continuous relaxations. We discuss these solutions in the context of unsupervised scene understanding with generative models. In particular, we asked whether a continuous relaxation of the counting problem also discovers the objects in an unsupervised fashion (given the increased training stability that continuous relaxations provide) and we proposed an approach based on Adaptive Computation Time (ACT) which achieves the desired result. Finally, we investigated the task of iterative graph generation. We proposed a variational lower-bound to the maximum likelihood objective, where the approximate posterior distribution renormalizes the prior distribution over local predictions which are plausible for the target graph.
For instance, the local predictions may be binary values indicating the presence or absence of an edge indexed by the given time step, for a canonical edge indexing chosen a priori. The plausibility of each local prediction is assessed by solving a combinatorial optimization problem, and we discuss relevant approaches, including an induced-subgraph-isomorphism-based algorithm for the generic graph generation case, and a polynomial algorithm for the special case of graph generation arising from graph clustering tasks. In this thesis, we focus on the generic case and investigate the approximate posterior's relevance on synthetic graph datasets.
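The perturb-and-maximise idea behind Perturbation Models can be illustrated in the simplest, unstructured case (a sketch, not the thesis's structured setting): perturbing each potential with independent Gumbel noise and taking the argmax samples exactly from the corresponding Gibbs (softmax) distribution. With low-order perturbations of structured potentials, the induced distribution differs from the Gibbs one, which is what the theoretical results characterise.

```python
import math
import random

def perturb_and_max(potentials, rng):
    """Sample a category by maximising Gumbel-perturbed potentials.
    For a single categorical variable this reproduces the softmax
    distribution over `potentials` exactly (the Gumbel-max trick)."""
    def gumbel():
        # Inverse-CDF sampling of a standard Gumbel variate.
        return -math.log(-math.log(rng.random()))
    scores = [theta + gumbel() for theta in potentials]
    return max(range(len(scores)), key=scores.__getitem__)
```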
6

Mescheder, Lars Morten [Verfasser]. „Stability and Expressiveness of Deep Generative Models / Lars Morten Mescheder“. Tübingen : Universitätsbibliothek Tübingen, 2020. http://d-nb.info/1217249257/34.

7

Rastgoufard, Rastin. „Multi-Label Latent Spaces with Semi-Supervised Deep Generative Models“. ScholarWorks@UNO, 2018. https://scholarworks.uno.edu/td/2486.

Annotation:
Expert labeling, tagging, and assessment are far more costly than the processes of collecting raw data. Generative modeling is a very powerful tool to tackle this real-world problem. It is shown here how these models can be used to allow for semi-supervised learning that performs very well in label-deficient conditions. The foundation for the work in this dissertation is built upon visualizing generative models' latent spaces to gain deeper understanding of data, analyze faults, and propose solutions. A number of novel ideas and approaches are presented to improve single-label classification. This dissertation's main focus is on extending semi-supervised Deep Generative Models for solving the multi-label problem by proposing unique mathematical and programming concepts and organization. In all naive mixtures, using multiple labels is detrimental and causes each label's predictions to be worse than models that utilize only a single label. Examining latent spaces reveals that in many cases, large regions in the models generate meaningless results. Enforcing a priori independence is essential, and only when applied can multi-label models outperform the best single-label models. Finally, a novel learning technique called open-book learning is described that is capable of surpassing the state-of-the-art classification performance of generative models for multi-labeled, semi-supervised data sets.
8

Douwes, Constance. „On the Environmental Impact of Deep Generative Models for Audio“. Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS074.

Annotation:
In this thesis, we investigate the environmental impact of deep learning models for audio generation and aim to put computational cost at the core of the evaluation process. In particular, we focus on different types of deep learning models specialized in raw waveform audio synthesis. These models are now a key component of modern audio systems, and their use has increased significantly in recent years. Their flexibility and generalization capabilities make them powerful tools in many contexts, from text-to-speech synthesis to unconditional audio generation. However, these benefits come at the cost of expensive training sessions on large amounts of data, operated on energy-intensive dedicated hardware, which incurs large greenhouse gas emissions. The measures we use as a scientific community to evaluate our work are at the heart of this problem. Currently, deep learning researchers evaluate their work primarily based on improvements in accuracy, log-likelihood, reconstruction, or opinion scores, all of which overshadow the computational cost of generative models. Therefore, we propose a new methodology based on Pareto optimality to help the community better evaluate the significance of their work while bringing the energy footprint -- and ultimately carbon emissions -- to the same level of interest as sound quality. In the first part of this thesis, we present a comprehensive report on the use of various evaluation measures of deep generative models for audio synthesis tasks. Even though computational efficiency is increasingly discussed, quality measurements are the most commonly used metrics to evaluate deep generative models, while energy consumption is almost never mentioned. Therefore, we address this issue by estimating the carbon cost of training generative models and comparing it to other noteworthy carbon costs to demonstrate that it is far from insignificant.
In the second part of this thesis, we propose a large-scale evaluation of pervasive neural vocoders, a class of generative models used for speech generation conditioned on mel-spectrograms. We introduce a multi-objective analysis based on Pareto optimality of both quality from human-based evaluation and energy consumption. Within this framework, we show that lighter models can perform better than more costly ones. By proposing to rely on a novel definition of efficiency, we intend to provide practitioners with a decision basis for choosing the best model based on their requirements. In the last part of the thesis, we propose a method to reduce the inference costs of neural vocoders, based on quantized neural networks. We show a significant gain in memory size and give some hints for the future use of these models on embedded hardware. Overall, we provide keys to better understand the impact of deep generative models for audio synthesis as well as a new framework for developing models while accounting for their environmental impact. We hope that this work raises awareness of the need to investigate energy-efficient models while maintaining high perceived quality.
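The multi-objective view the thesis takes (sound quality versus energy consumption) reduces to computing a Pareto front. A minimal sketch over illustrative (name, quality, energy) triples, not data from the thesis:

```python
def pareto_front(models):
    """Keep the models not dominated on two objectives:
    quality (higher is better) and energy (lower is better).
    A model is dominated if some other model is at least as good
    on both objectives and strictly better on one."""
    front = []
    for name, q, e in models:
        dominated = any(
            q2 >= q and e2 <= e and (q2 > q or e2 < e)
            for _, q2, e2 in models)
        if not dominated:
            front.append(name)
    return front
```

Under this view, a lighter model with slightly lower quality can still sit on the front, which is how lighter vocoders can "perform better" than more costly ones.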
9

Patsanis, Alexandros. „Network Anomaly Detection and Root Cause Analysis with Deep Generative Models“. Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-397367.

Annotation:
The project's objective is to detect network anomalies occurring in a telecommunication network due to hardware malfunction or software defects after a vast upgrade of the network's system over a specific area, such as a city. The network's system generates statistical data at 15-minute intervals for different locations in the area of interest. For every interval, all statistical data generated over the area are aggregated and converted to images. In this way, an image represents a snapshot of the network for a specific interval, where statistical data are represented as points with different density values. To address this problem, the project makes use of Generative Adversarial Networks (GANs), which learn a manifold of the normal network pattern. Mapping new unseen images to the learned manifold then yields an anomaly score used to detect anomalies. The anomaly score is a combination of the reconstruction error and the learned feature representation. Two models for detecting anomalies are used in this project, AnoGAN and f-AnoGAN. f-AnoGAN uses a state-of-the-art approach called Wasserstein GAN with gradient penalty, which improves on the initial formulation of GANs. Both quantitative and qualitative evaluation measurements are used to assess the GAN models: F1 score and Wasserstein loss for quantitative evaluation, and linear interpolation in the hidden space for qualitative evaluation. Moreover, to set a threshold, a prediction model is used to predict the expected behaviour of the network for a specific interval. The predicted behaviour is then used with the anomaly detection model to define a threshold automatically. Our experiments were implemented successfully for both the prediction and anomaly detection models. We additionally tested known abnormal behaviours, which were detected and visualised. However, more research has to be done on the evaluation of GANs, as there is no universal approach to evaluating them.
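The anomaly score described above, a combination of reconstruction error and an error in the learned feature representation, can be sketched as follows (an illustrative sketch; the weighting lam = 0.1 follows the AnoGAN papers, and flat lists stand in for images and discriminator feature maps):

```python
def anomaly_score(x, x_rec, feat_x, feat_rec, lam=0.1):
    """AnoGAN-style score: (1 - lam) * residual error between a query
    image and its reconstruction from the learned manifold, plus
    lam * error between their discriminator feature representations.
    Higher scores indicate samples farther from the normal pattern."""
    residual = sum(abs(a - b) for a, b in zip(x, x_rec))
    feature = sum(abs(a - b) for a, b in zip(feat_x, feat_rec))
    return (1.0 - lam) * residual + lam * feature
```

A threshold on this score, set from the predicted normal behaviour as the abstract describes, separates normal from anomalous intervals.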
10

Alabdallah, Abdallah. „Human Understandable Interpretation of Deep Neural Networks Decisions Using Generative Models“. Thesis, Högskolan i Halmstad, Halmstad Embedded and Intelligent Systems Research (EIS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-41035.

Annotation:
Deep Neural Networks have long been considered black-box systems, and their interpretability is a concern when they are applied in safety-critical systems. In this work, a novel approach to interpreting the decisions of DNNs is proposed. The approach depends on exploiting generative models and the interpretability of their latent space. Three methods for ranking features are explored, two of which depend on sensitivity analysis, while the third depends on a Random Forest model. The Random Forest model was the most successful at ranking the features, given its accuracy and inherent interpretability.
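One simple sensitivity-style ranking in the spirit of the methods compared (a generic permutation-importance sketch, not the thesis's exact procedure): permute one feature and measure how much a model's score drops.

```python
import random

def permutation_importance(predict, X, y, score, feature_idx, seed=0):
    """Shuffle one feature column and report the drop in score.
    `predict` maps a feature list to a prediction; `score` maps
    (predictions, targets) to a number where higher is better.
    Features whose permutation barely hurts the score rank low."""
    base = score([predict(row) for row in X], y)
    rng = random.Random(seed)
    column = [row[feature_idx] for row in X]
    rng.shuffle(column)
    X_perm = [row[:feature_idx] + [v] + row[feature_idx + 1:]
              for row, v in zip(X, column)]
    return base - score([predict(row) for row in X_perm], y)
```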
11

Franceschi, Jean-Yves. „Apprentissage de représentations et modèles génératifs profonds dans les systèmes dynamiques“. Electronic Thesis or Diss., Sorbonne université, 2022. http://www.theses.fr/2022SORUS014.

Annotation:
The recent rise of deep learning has been motivated by numerous scientific breakthroughs, particularly regarding representation learning and generative modeling. However, most of these achievements have been obtained on image or text data, whose evolution through time remains challenging for existing methods. Given their importance for autonomous systems that must adapt in a constantly evolving environment, these challenges have been actively investigated in a growing body of work. In this thesis, we follow this line of work and study several aspects of temporality and dynamical systems in deep unsupervised representation learning and generative modeling. Firstly, we present a general-purpose deep unsupervised representation learning method for time series that tackles the scalability and adaptivity issues arising in practical applications. In a second part, we then further study representation learning for sequences by focusing on structured and stochastic spatiotemporal data: videos and physical phenomena. We show in this context that performant temporal generative prediction models help to uncover meaningful and disentangled representations, and conversely. We highlight to this end the crucial role of differential equations in the modeling and embedding of these natural sequences within sequential generative models. Finally, in a third part, we more broadly analyze a popular class of generative models, generative adversarial networks, through the lens of dynamical systems. We study the evolution of the involved neural networks with respect to training time by describing it with a differential equation, allowing us to gain a novel understanding of this generative model.
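The dynamical-systems view of adversarial training can be illustrated on a toy bilinear game (a standard textbook example, not taken from the thesis): for min_x max_y x*y, simultaneous gradient steps discretise the ODE x' = -y, y' = x, and plain Euler steps spiral away from the equilibrium at the origin instead of converging.

```python
def simultaneous_gradient_steps(steps, lr):
    """Euler discretisation of the gradient flow of the bilinear game
    min_x max_y x*y.  Each step multiplies the squared radius
    x^2 + y^2 by exactly (1 + lr^2), so the iterates orbit outwards
    rather than converging to the equilibrium (0, 0)."""
    x, y = 1.0, 0.0
    for _ in range(steps):
        x, y = x - lr * y, y + lr * x
    return x, y
```

Analysing such dynamics with differential equations is what lets one explain (and fix) this kind of training instability.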
12

Reichert, David Paul. „Deep Boltzmann machines as hierarchical generative models of perceptual inference in the cortex“. Thesis, University of Edinburgh, 2012. http://hdl.handle.net/1842/8300.

Annotation:
The mammalian neocortex is integral to all aspects of cognition, in particular perception across all sensory modalities. Whether computational principles can be identified that would explain why the cortex is so versatile and capable of adapting to various inputs is not clear. One well-known hypothesis is that the cortex implements a generative model, actively synthesising internal explanations of the sensory input. This ‘analysis by synthesis’ could be instantiated in the top-down connections in the hierarchy of cortical regions, and allow the cortex to evaluate its internal model and thus learn good representations of sensory input over time. Few computational models however exist that implement these principles. In this thesis, we investigate the deep Boltzmann machine (DBM) as a model of analysis by synthesis in the cortex, and demonstrate how three distinct perceptual phenomena can be interpreted in this light: visual hallucinations, bistable perception, and object-based attention. A common thread is that in all cases, the internally synthesised explanations go beyond, or deviate from, what is in the visual input. The DBM was recently introduced in machine learning, but combines several properties of interest for biological application. It constitutes a hierarchical generative model and carries both the semantics of a connectionist neural network and a probabilistic model. Thus, we can consider neuronal mechanisms but also (approximate) probabilistic inference, which has been proposed to underlie cortical processing, and contribute to the ongoing discussion concerning probabilistic or Bayesian models of cognition. 
Concretely, making use of the model’s capability to synthesise internal representations of sensory input, we model the complex visual hallucinations that result from loss of vision in Charles Bonnet syndrome. We demonstrate that homeostatic regulation of neuronal firing could be the underlying cause, reproduce various aspects of the syndrome, and examine a role for the neuromodulator acetylcholine. Next, we relate bistable perception to approximate, sampling-based probabilistic inference, and show how neuronal adaptation can be incorporated by providing a biological interpretation for a recently developed sampling algorithm. Finally, we explore how analysis by synthesis could be related to attentional feedback processing, employing the generative aspect of the DBM to implement a form of object-based attention. We thus present a model that uniquely combines several computational principles (sampling, neural processing, unsupervised learning) and is general enough to address a range of distinct perceptual phenomena. The connection to machine learning ensures theoretical grounding and practical evaluation of the underlying principles. Our results lend further credence to the hypothesis of a generative model in the brain, and promise fruitful interaction between neuroscience and Deep Learning approaches.
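The ‘synthesis’ machinery of a Boltzmann machine rests on block Gibbs sampling between layers. A minimal numpy sketch of one visible-hidden layer pair, i.e. a restricted Boltzmann machine with made-up random weights rather than the thesis's trained DBM:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, W, b, c, rng):
    # One block-Gibbs sweep in a binary RBM (a single pair of layers):
    # sample hiddens given visibles, then resample visibles given hiddens.
    p_h = sigmoid(v @ W + c)
    h = (rng.random(p_h.shape) < p_h).astype(float)
    p_v = sigmoid(h @ W.T + b)
    v = (rng.random(p_v.shape) < p_v).astype(float)
    return v, h

W = rng.normal(0, 0.1, size=(6, 4))   # 6 visible, 4 hidden units
b = np.zeros(6)                       # visible biases
c = np.zeros(4)                       # hidden biases
v = (rng.random(6) < 0.5).astype(float)
for _ in range(10):                   # repeated top-down resampling
    v, h = gibbs_step(v, W, b, c, rng)
```

The top-down half of each sweep is what lets the model synthesise internal explanations of its input.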
13

Ionascu, Beatrice. „Modelling user interaction at scale with deep generative methods“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239333.

Der volle Inhalt der Quelle
Annotation:
Understanding how users interact with a company's service is essential for data-driven businesses that want to better cater to their users and improve their offering. By using a generative machine learning approach it is possible to model user behaviour and to generate new data to simulate, or to recognize and explain, typical usage patterns. In this work we introduce an approach for modelling users' interaction behaviour at scale in a client-service model. We propose a novel representation of multivariate time-series data as time pictures that express temporal correlations through spatial organization. This representation exhibits two key properties that convolutional networks have been built to exploit, and it allows us to develop an approach based on deep generative models that use convolutional networks as a backbone. In introducing this approach to feature learning for time-series data, we expand the application of convolutional neural networks in the multivariate time-series domain, and specifically to user interaction data. We adopt a variational approach inspired by the β-VAE framework in order to learn hidden factors that define different user behaviour patterns. We explore different values for the regularization parameter β and show that it is possible to construct a model that learns a latent representation of identifiable and distinct user behaviours. We show on real-world data that the model generates realistic samples that capture the true population-level statistics of the interaction behaviour data, learns different user behaviours, and provides accurate imputations of missing data.
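The β-VAE objective referred to above weighs a reconstruction term against the KL divergence from the approximate posterior to a standard-normal prior, with β controlling the pressure towards disentangled factors. A minimal sketch for a diagonal-Gaussian posterior; this is illustrative only, the thesis's encoder and decoder are convolutional networks:

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta):
    # ELBO-style objective of the beta-VAE framework:
    # reconstruction error plus beta-weighted KL(q(z|x) || N(0, I)).
    recon = np.sum((x - x_recon) ** 2)
    kl = 0.5 * np.sum(mu ** 2 + np.exp(log_var) - 1.0 - log_var)
    return recon + beta * kl

mu = np.zeros(3)                     # q(z|x) = N(0, I)  ->  KL term = 0
log_var = np.zeros(3)
x = np.array([1.0, 2.0])
x_recon = np.array([1.0, 1.0])
loss = beta_vae_loss(x, x_recon, mu, log_var, beta=4.0)
```

Setting β > 1 trades reconstruction fidelity for a more factorized latent code, which is the mechanism the thesis exploits to separate user behaviour patterns.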
14

Chen, Mickaël. „Learning with weak supervision using deep generative networks“. Electronic Thesis or Diss., Sorbonne université, 2020. http://www.theses.fr/2020SORUS024.

Der volle Inhalt der Quelle
Annotation:
Many successes of deep learning rely on the availability of massive annotated datasets that can be exploited by supervised algorithms. Obtaining those labels at a large scale, however, can be difficult, or even impossible, in many situations. Designing methods that are less dependent on annotations is therefore a major research topic, and many semi-supervised and weakly supervised methods have been proposed. Meanwhile, the recent introduction of deep generative networks provided deep learning methods with the ability to manipulate complex distributions, allowing for breakthroughs in tasks such as image editing and domain adaptation. In this thesis, we explore how these new tools can be useful to further alleviate the need for annotations. Firstly, we tackle the task of performing stochastic predictions: designing systems for structured prediction that take into account the variability in possible outputs. We propose, in this context, two models. The first one performs predictions on multi-view data with missing views, and the second one predicts possible futures of a video sequence. Then, we study adversarial methods to learn a factorized latent space in a setting with two explanatory factors, only one of which is annotated. We propose models that aim to uncover semantically consistent latent representations for those factors. One model is applied to the conditional generation of motion capture data, and another one to multi-view data. Finally, we focus on the task of image segmentation, which is of crucial importance in computer vision. Building on previously explored ideas, we propose a model for object segmentation that is entirely unsupervised.
15

Skalic, Miha 1990. „Deep learning for drug design : modeling molecular shapes“. Doctoral thesis, Universitat Pompeu Fabra, 2019. http://hdl.handle.net/10803/667503.

Der volle Inhalt der Quelle
Annotation:
Designing novel drugs is a complex process which requires finding molecules in a vast chemical space that bind to a specific biomolecular target and have favorable physicochemical properties. Machine learning methods can leverage previous data and use it for new predictions, helping the process of selecting candidate molecules without relying exclusively on experiments. In particular, deep learning can be applied to extract complex patterns from simple representations. In this work we leverage deep learning to extract patterns from three-dimensional representations of molecules. We apply classification and regression models to predict bioactivity and binding affinity, respectively. Furthermore, we show that it is possible to predict ligand properties for a particular protein pocket. Finally, we employ deep generative modeling for compound design. Given a ligand shape we show that we can generate similar compounds, and given a protein pocket we can generate potentially binding compounds.
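The three-dimensional molecular representations used in this line of work amount to placing atoms on a voxel grid. A toy occupancy-grid sketch with a single channel and nearest-voxel assignment; real pipelines of this kind typically use smooth atomic densities and one channel per atom type:

```python
import numpy as np

def voxelize(coords, grid_size=8, resolution=1.0):
    # Map 3D atom coordinates into a binary occupancy grid centred on
    # the molecule, assigning each atom to its nearest voxel.
    grid = np.zeros((grid_size,) * 3)
    center = coords.mean(axis=0)
    idx = np.round((coords - center) / resolution).astype(int) + grid_size // 2
    for i, j, k in idx:
        if 0 <= i < grid_size and 0 <= j < grid_size and 0 <= k < grid_size:
            grid[i, j, k] = 1.0
    return grid

# Three hypothetical atom positions (angstrom-like units, made up).
atoms = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.5, 0.0]])
grid = voxelize(atoms)
```

A 3D convolutional network then consumes such grids exactly as a 2D network consumes images.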
16

ABUKMEIL, MOHANAD. „UNSUPERVISED GENERATIVE MODELS FOR DATA ANALYSIS AND EXPLAINABLE ARTIFICIAL INTELLIGENCE“. Doctoral thesis, Università degli Studi di Milano, 2022. http://hdl.handle.net/2434/889159.

Der volle Inhalt der Quelle
Annotation:
For more than a century, methods for learning representations and exploring the intrinsic structures of data have developed remarkably, and they currently include supervised, semi-supervised, and unsupervised methods. However, recent years have witnessed the flourishing of big data, where typical dataset dimensions are high and the data can come in messy, missing, incomplete, unlabeled, or corrupted forms. Consequently, discovering and learning the hidden structure buried inside such data becomes highly challenging. From this perspective, latent data analysis and dimensionality reduction play a substantial role in decomposing the exploratory factors and learning the hidden structures of data, which encompass the significant features that characterize the categories and trends among data samples in an ordered manner: extracting patterns, differentiating trends, and testing hypotheses in order to identify anomalies, learn compact knowledge, and perform many different machine learning (ML) tasks such as classification, detection, and prediction. Unsupervised generative learning (UGL) methods are a class of ML characterized by their ability to analyze and decompose latent data, reduce dimensionality, visualize the manifold of data, and learn representations with limited levels of predefined labels and prior assumptions. Furthermore, explainable artificial intelligence (XAI) is an emerging field of ML that deals with explaining the decisions and behaviors of learned models. XAI is also associated with UGL models, both to explain the hidden structure of data and to explain the learned representations of ML models. However, current UGL models lack large-scale generalizability and explainability in the testing stage, which restricts their potential in ML and XAI applications.
To overcome the aforementioned limitations, this thesis proposes innovative methods that integrate UGL and XAI to enable data factorization and dimensionality reduction and to improve the generalizability of the learned ML models. Moreover, the proposed methods enable visual explainability in modern applications such as anomaly detection and autonomous driving systems. The main research contributions are as follows:
• A novel overview of UGL models, including blind source separation (BSS), manifold learning (MfL), and neural networks (NNs), which also considers the open issues and challenges of each UGL method.
• An innovative method to identify the dimensions of the compact feature space via a generalized rank, applied to image dimensionality reduction.
• An innovative method to hierarchically reduce and visualize the manifold of data, improving generalizability in limited-data learning scenarios and reducing computational complexity.
• An original method to visually explain autoencoders by reconstructing an attention map, applied to anomaly detection and explainable autonomous driving systems.
The novel methods introduced in this thesis are benchmarked on publicly available datasets, where they outperformed state-of-the-art methods on different evaluation metrics, confirming the feasibility of the proposed methodologies with respect to computational complexity, availability of learning data, model explainability, and data reconstruction accuracy.
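The attention-map contribution can be caricatured with a linear autoencoder: fit the model on normal data only, then explain a test sample by its per-feature reconstruction residual, where large residuals mark what the model considers anomalous. Below, PCA stands in for the learned autoencoder and all data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)

# 'Normal' data: most variance lives in the first three features.
normal = rng.normal(size=(200, 8)) @ np.diag([3, 2, 1, .1, .1, .1, .1, .1])
mean = normal.mean(axis=0)
_, _, Vt = np.linalg.svd(normal - mean, full_matrices=False)
W = Vt[:3]                            # keep 3 principal components

def residual_map(x):
    z = (x - mean) @ W.T              # encode
    x_hat = z @ W + mean              # decode
    return np.abs(x - x_hat)          # per-feature anomaly score

x = mean.copy()
x[5] += 10.0                          # inject an anomaly in feature 5
r = residual_map(x)                   # residual peaks at the anomaly
```

In the thesis the same idea is applied per pixel, so the residual/attention map localizes anomalous image regions.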
17

Carlsson, Filip, and Philip Lindgren. „Deep Scenario Generation of Financial Markets“. Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-273631.

Der volle Inhalt der Quelle
Annotation:
The goal of this thesis is to explore a new clustering algorithm, VAE-clustering, examine whether it can be applied to find differences in the distribution of stock returns, augment the distribution of a current portfolio of stocks, and see how it performs in different market conditions. The VAE-clustering method is, as mentioned, newly introduced and not yet widely tested, especially not on time series. The first step is therefore to see if and how well the clustering works. We first apply the algorithm to a dataset containing monthly time series of the power demand in Italy. The purpose of this part is to establish how well the method works technically. Once the model works well and generates proper results on the Italian power demand data, we move forward and apply it to stock return data. In the latter application we are unable to find meaningful clusters and are therefore unable to move forward towards the goal of the thesis. The results show that the VAE-clustering method is applicable to time series. The power demand has clear differences from season to season, and the model can successfully identify those differences. For the financial data we hoped that the model would be able to find different market regimes based on time periods; the model is, however, not able to distinguish different time periods from each other. We therefore conclude that the VAE-clustering method is applicable to time series data, but that the structure and setting of the financial data in this thesis make it too hard to find meaningful clusters. The major finding is that the VAE-clustering method can be applied to time series. We highly encourage further research into whether the method can be successfully used on financial data in settings other than those tested in this thesis.
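The clustering pipeline can be caricatured in a few lines: embed each series into a low-dimensional latent space and cluster the codes. Here PCA stands in for the learned VAE encoder, and the two synthetic "seasons" are sine waves of different frequency; everything below is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Two regimes of monthly-like series: slow vs. fast oscillation.
t = np.linspace(0, 1, 48)
winter = np.stack([np.sin(2 * np.pi * 2 * t) + 0.05 * rng.normal(size=48)
                   for _ in range(20)])
summer = np.stack([np.sin(2 * np.pi * 5 * t) + 0.05 * rng.normal(size=48)
                   for _ in range(20)])
X = np.vstack([winter, summer])
X = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
Z = X @ Vt[:2].T                      # 2-d latent codes (PCA 'encoder')

def kmeans2(Z, iters=20):
    # Two-cluster k-means, initialised with one point from each end.
    centers = Z[[0, -1]].copy()
    for _ in range(iters):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(2):
            if (labels == j).any():
                centers[j] = Z[labels == j].mean(0)
    return labels

labels = kmeans2(Z)                   # recovers the two seasonal regimes
```

When the regimes differ as clearly as seasonal power demand does, the latent codes separate cleanly; with stock returns, per the thesis, they do not.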
18

Buys, Jan Moolman. „Incremental generative models for syntactic and semantic natural language processing“. Thesis, University of Oxford, 2017. https://ora.ox.ac.uk/objects/uuid:a9a7b5cf-3bb1-4e08-b109-de06bf387d1d.

Der volle Inhalt der Quelle
Annotation:
This thesis investigates the role of linguistically-motivated generative models of syntax and semantic structure in natural language processing (NLP). Syntactic well-formedness is crucial in language generation, but most statistical models do not account for the hierarchical structure of sentences. Many applications exhibiting natural language understanding rely on structured semantic representations to enable querying, inference and reasoning. Yet most semantic parsers produce domain-specific or inadequately expressive representations. We propose a series of generative transition-based models for dependency syntax which can be applied as both parsers and language models while being amenable to supervised or unsupervised learning. Two models are based on Markov assumptions commonly made in NLP: The first is a Bayesian model with hierarchical smoothing, the second is parameterised by feed-forward neural networks. The Bayesian model enables careful analysis of the structure of the conditioning contexts required for generative parsers, but the neural network is more accurate. As a language model the syntactic neural model outperforms both the Bayesian model and n-gram neural networks, pointing to the complementary nature of distributed and structured representations for syntactic prediction. We propose approximate inference methods based on particle filtering. The third model is parameterised by recurrent neural networks (RNNs), dropping the Markov assumptions. Exact inference with dynamic programming is made tractable here by simplifying the structure of the conditioning contexts. We then shift the focus to semantics and propose models for parsing sentences to labelled semantic graphs. We introduce a transition-based parser which incrementally predicts graph nodes (predicates) and edges (arguments). This approach is contrasted against predicting top-down graph traversals. 
RNNs and pointer networks are key components in approaching graph parsing as an incremental prediction problem. The RNN architecture is augmented to condition the model explicitly on the transition system configuration. We develop a robust parser for Minimal Recursion Semantics, a linguistically-expressive framework for compositional semantics which has previously been parsed only with grammar-based approaches. Our parser is much faster than the grammar-based model, while the same approach improves the accuracy of neural Abstract Meaning Representation parsing.
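The transition systems underlying these parsers reduce tree building to a sequence of discrete actions over a stack and a buffer; the generative variants additionally emit words. A minimal arc-standard sketch, with the action sequence supplied by hand here rather than scored by the Bayesian or neural models of the thesis:

```python
# Minimal arc-standard transition system for dependency parsing.
def parse(words, actions):
    stack, buffer, arcs = [], list(range(len(words))), []
    for act in actions:
        if act == "SHIFT":
            stack.append(buffer.pop(0))
        elif act == "LEFT":            # second-from-top becomes dependent
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        elif act == "RIGHT":           # top of stack becomes dependent
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs                        # list of (head, dependent) pairs

# "she ate fish": both 'she' and 'fish' attach to the verb 'ate'.
arcs = parse(["she", "ate", "fish"],
             ["SHIFT", "SHIFT", "LEFT", "SHIFT", "RIGHT"])
```

A parser model scores the legal actions at each step; a generative variant factors the joint probability of the sentence and tree over this same action sequence.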
19

Waldow, Walter E. „An Adversarial Framework for Deep 3D Target Template Generation“. Wright State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=wright1597334881614898.

Der volle Inhalt der Quelle
20

TOMA, ANDREA. „PHY-layer Security in Cognitive Radio Networks through Learning Deep Generative Models: an AI-based approach“. Doctoral thesis, Università degli studi di Genova, 2020. http://hdl.handle.net/11567/1003576.

Der volle Inhalt der Quelle
Annotation:
Recently, Cognitive Radio (CR) has been conceived as an intelligent radio endowed with cognition, which can be developed by implementing Artificial Intelligence (AI) techniques. Specifically, data-driven Self-Awareness (SA) functionalities, such as detection of spectrum abnormalities, can be effectively implemented, as shown by the proposed research. One important application is PHY-layer security, since it is essential to establish secure wireless communications against external jamming attacks. In this framework, signals are non-stationary, and features of such a dynamic spectrum, with multiple high-sampling-rate signals, are extracted through the dual-resolution Stockwell Transform (ST), which has been proposed and validated in this work as part of the spectrum sensing techniques. Afterwards, a review of the state of the art in learning dynamic models from observed features covers the relevant theoretical aspects of Machine Learning (ML). In particular, following recent advances in ML, learning deep generative models with several layers of non-linear processing has been selected as the AI method for the proposed spectrum abnormality detection in CR, towards a brain-inspired, data-driven SA. In the proposed approach, the features extracted from the ST representation of the wideband spectrum are organized in a high-dimensional generalized state vector, and a generative model is then learned and employed to detect any deviation from normal situations in the analysed spectrum (abnormal signals or behaviours). Specifically, conditional GAN (C-GAN), auxiliary classifier GAN (AC-GAN), and deep VAE have been considered as deep generative models. A dataset of a dynamic spectrum with multi-OFDM signals has been generated using the National Instruments mm-Wave Transceiver, which operates at a 28 GHz central carrier frequency with an 800 MHz frequency range.
Training of the deep generative model is performed on the generalized state vector representing the mmWave spectrum with a normality pattern, without any malicious activity. Testing is based on new and independent data samples corresponding to an abnormality pattern, in which the moving signal follows a behaviour that has not been observed during training. An abnormality indicator is measured and used for binary classification (normality hypothesis versus abnormality hypothesis), while the performance of the generative models is evaluated and compared through ROC curves and accuracy metrics.
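The abnormality-indicator logic can be sketched with a much simpler density model standing in for the C-GAN/AC-GAN/VAE: fit a Gaussian to "normal" generalized state vectors, score test vectors by Mahalanobis distance, and flag an abnormality when the score exceeds a quantile of the training scores. All data below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(3)

# 'Normal' state vectors seen during training (no malicious activity).
normal_train = rng.normal(0, 1, size=(500, 5))
mu = normal_train.mean(0)
prec = np.linalg.inv(np.cov(normal_train.T))

def score(x):
    # Abnormality indicator: squared Mahalanobis distance to normality.
    d = x - mu
    return float(d @ prec @ d)

train_scores = np.array([score(x) for x in normal_train])
threshold = np.quantile(train_scores, 0.99)   # decision threshold

jammed = rng.normal(6, 1, size=5)   # abnormal: shifted statistics
clean = rng.normal(0, 1, size=5)    # drawn from the normality pattern
```

Sweeping the threshold instead of fixing it at one quantile traces out the ROC curve used to compare the generative models.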
21

Cherti, Mehdi. „Deep generative neural networks for novelty generation : a foundational framework, metrics and experiments“. Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS029/document.

Der volle Inhalt der Quelle
Annotation:
In recent years, significant advances in deep neural networks have enabled the creation of groundbreaking technologies such as self-driving cars and voice-enabled personal assistants. Almost all successes of deep neural networks concern prediction, whereas the initial breakthroughs came from generative models. Today, although we have very powerful deep generative modeling techniques, these techniques are essentially used for prediction or for generating known objects (i.e., good-quality images of known classes): any generated object that is a priori unknown is considered a failure mode (Salimans et al., 2016) or spurious (Bengio et al., 2013b). In other words, when prediction seems to be the only possible objective, novelty is seen as an error that researchers have been trying hard to eliminate. This thesis defends the point of view that, instead of trying to eliminate these novelties, we should study them and the generative potential of deep nets to create useful novelty, especially given the economic and societal importance of creating new objects in contemporary societies. The thesis sets out to study novelty generation in relationship with the data-driven knowledge models produced by deep generative neural networks. Our first key contribution is the clarification of the importance of representations and their impact on the kinds of novelty that can be generated: a key consequence is that a creative agent might need to re-represent known objects to access various kinds of novelty. We then demonstrate that traditional objective functions of statistical learning theory, such as maximum likelihood, are not necessarily the best theoretical framework for studying novelty generation. We propose several alternatives at the conceptual level. A second key result is the confirmation that current models, with traditional objective functions, can indeed generate unknown objects.
This also shows that even though objectives like maximum likelihood are designed to eliminate novelty, practical implementations do generate novelty. Through a series of experiments, we study the behavior of these models and the novelty they generate. In particular, we propose a new task setup and metrics for selecting good generative models for novelty generation. Finally, the thesis concludes with a series of experiments clarifying the characteristics of models that can exhibit novelty. The experiments show that sparsity, the noise level, and restricting the capacity of the net eliminate novelty, and that models that are better at recognizing novelty are generally also better at generating novelty.
22

Wu, Xinheng. „A Deep Unsupervised Anomaly Detection Model for Automated Tumor Segmentation“. Thesis, The University of Sydney, 2020. https://hdl.handle.net/2123/22502.

Der volle Inhalt der Quelle
Annotation:
Much research has investigated computer-aided diagnosis (CAD) for automated tumor segmentation in various medical images, e.g., magnetic resonance (MR), computed tomography (CT) and positron-emission tomography (PET) images. Recent advances in automated tumor segmentation have been achieved by supervised deep learning (DL) methods trained on large labelled datasets that cover tumor variations. However, such training data are scarce due to the cost of the labeling process. With insufficient training data, supervised DL methods thus have difficulty in generating effective feature representations for tumor segmentation. This thesis aims to develop an unsupervised DL method that exploits the large amounts of unlabeled data generated during the clinical process. Our assumption is that of unsupervised anomaly detection (UAD): normal data have constrained anatomy and variations, while anomalies, i.e., tumors, usually differ from the normality with high diversity. We demonstrate our method for automated tumor segmentation on two different image modalities. Firstly, given the bilateral symmetry of normal human brains and the asymmetry of brain tumors, we propose a symmetry-driven deep UAD model that uses a GAN to model the normal symmetric variations, thus segmenting tumors by their asymmetry. We evaluated our method on two benchmark datasets. Our results show that our method outperformed the state-of-the-art unsupervised brain tumor segmentation methods and achieved performance competitive with supervised segmentation methods. Secondly, we propose a multi-modal deep UAD model for PET-CT tumor segmentation. We model a manifold of normal variations shared across normal CT and PET pairs; this manifold, representing normal pairing, can be used to segment the anomalies. We evaluated our method on two PET-CT datasets; the results show that we outperformed the state-of-the-art unsupervised methods, supervised methods and baseline fusion techniques.
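The symmetry-driven idea can be caricatured without any learning: in a perfectly bilaterally symmetric image, a lesion breaks left-right symmetry, so the difference between the image and its mirror highlights it. A toy sketch; the thesis instead learns normal symmetric variation with a GAN, which among other things avoids the mirror artifact visible below:

```python
import numpy as np

# Toy 'brain': a left structure, its right mirror, and one lesion.
img = np.zeros((8, 8))
img[2:4, 1] = 1.0                 # symmetric structure, left side
img[2:4, 6] = 1.0                 # its mirror image on the right
img[5, 2] = 2.0                   # 'tumor': breaks the symmetry

mirror = img[:, ::-1]             # left-right flip
anomaly = np.abs(img - mirror)    # symmetric anatomy cancels out
mask = anomaly > 0.5              # crude segmentation by thresholding
# Note the naive difference also flags the lesion's mirror location
# (5, 5); modeling normal variation with a generative model removes
# this kind of artifact.
```

Real brains are only approximately symmetric, which is exactly why a learned model of normal symmetric variation is needed in place of a raw flip.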
APA, Harvard, Vancouver, ISO, and other citation styles
23

Gruselius, Hanna. „Generative Models and Feature Extraction on Patient Images and Structure Data in Radiation Therapy“. Thesis, KTH, Matematisk statistik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-229407.

Full text of the source
Annotation:
This Master thesis focuses on generative models for medical patient data for radiation therapy. The objective of the project is to implement and investigate the characteristics of a Variational Autoencoder applied to this diverse and versatile data. The questions this thesis aims to answer are: (i) whether the VAE can capture salient features of medical image data; (ii) whether these features can be used to compare similarity between patients; (iii) whether the VAE network can successfully reconstruct its input; and lastly (iv) whether the VAE can generate artificial data with a reasonable anatomical appearance. The experiments carried out indicated that the VAE is a promising method for feature extraction, since it appeared to ascertain similarity between patient images. Moreover, the reconstruction of training inputs demonstrated that the method is capable of identifying and preserving anatomical details. Regarding its generative abilities, the artificial samples generally exhibited fairly realistic anatomical structures. Future work could investigate the VAE's ability to generalize, with respect both to the amount of data required and to the underlying probabilistic assumptions.
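The patient-similarity idea in this abstract (compare latent codes rather than raw images) can be illustrated with a toy stand-in encoder; a trained VAE inference network would replace the fixed non-linear projection below, and all names and data are hypothetical:

```python
import numpy as np

def encode(x, weights):
    """Toy deterministic 'encoder': a fixed non-linear projection
    standing in for the VAE's trained inference network."""
    return np.tanh(x @ weights)

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)
weights = rng.standard_normal((64, 8)) / 8.0   # 64-pixel "scan" -> 8-D code

patient_a = rng.standard_normal(64)
patient_a_repeat = patient_a + 0.05 * rng.standard_normal(64)  # near-identical scan
patient_b = rng.standard_normal(64)                            # unrelated patient

za = encode(patient_a, weights)
za2 = encode(patient_a_repeat, weights)
zb = encode(patient_b, weights)

# Similar anatomy should land close together in latent space.
print(cosine_similarity(za, za2) > cosine_similarity(za, zb))
```

The same comparison in latent space is what makes retrieval of "similar patients" cheap once the encoder is trained.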
APA, Harvard, Vancouver, ISO, and other citation styles
24

Grechka, Asya. „Image editing with deep neural networks“. Electronic Thesis or Diss., Sorbonne université, 2023. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2023SORUS683.pdf.

Full text of the source
Annotation:
Image editing has a rich history dating back more than two centuries. That said, "classic" image editing requires strong artistic skills as well as considerable time, often on the scale of hours, to modify an image. In recent years, considerable progress has been made in generative modeling, which has enabled realistic and high-quality image synthesis. However, real image editing is still a challenge that requires a balance between synthesizing novel features and faithfully preserving parts of the original image. In this thesis, we explore different approaches to edit images, leveraging three families of generative models: GANs, VAEs and diffusion models. First, we study how to use a GAN to edit a real image. While methods exist to modify generated images, they do not generalize easily to real images. We analyze the reasons for this limitation and propose a solution to better project a real image into the GAN's latent space so as to make it editable. Then, we use variational autoencoders with vector quantization to directly obtain a compact image representation (which we could not obtain with GANs) and optimize the latent vector so as to match a desired text input. We aim to constrain this problem, which on its face could be vulnerable to adversarial examples. We propose a method to choose the hyperparameters by trading off the fidelity and the edit quality of the modified images. We present a robust evaluation protocol and demonstrate the interest of our approach. Finally, we approach image editing from the particular angle of inpainting. Our goal is to synthesize part of an image while preserving the rest unmodified. For this, we leverage pre-trained diffusion models and build on the classic inpainting method, replacing, at each denoising step, the part we do not wish to modify with the noisy real image. However, this method can lead to a disharmonization between the generated and real parts. We propose an approach based on computing the gradient of a loss that evaluates the harmonization between the two parts, and we guide the denoising process with this gradient.
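The inpainting loop the abstract describes (re-impose the noised real image on the kept region at each denoising step, then nudge the generated region with the gradient of a harmonization loss) can be sketched numerically; the "denoiser" below is a smoothing stand-in for a trained diffusion model, and all constants are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
real = np.linspace(0.0, 1.0, 16)     # the real "image" (a 1-D toy signal)
mask = np.zeros(16, dtype=bool)
mask[6:10] = True                    # True = region to inpaint

def toy_denoiser(x):
    """Stand-in for one denoising step of a pretrained diffusion model:
    it simply nudges the signal toward a locally smooth version."""
    return x + 0.3 * (np.convolve(x, [0.25, 0.5, 0.25], mode="same") - x)

def harmonization_grad(x):
    """Gradient of a loss penalizing jumps between neighbouring values,
    a crude proxy for a harmonization loss between the two parts."""
    g = np.zeros_like(x)
    d = np.diff(x)
    g[:-1] -= 2 * d   # d/dx_i of sum_i (x_{i+1} - x_i)^2
    g[1:] += 2 * d
    return g

x = rng.standard_normal(16)          # start the masked region from noise
for t in range(200, 0, -1):
    noise_level = t / 200.0
    # Classic inpainting trick: re-impose the (noised) real image on
    # the kept region at every denoising step.
    x[~mask] = real[~mask] + 0.1 * noise_level * rng.standard_normal((~mask).sum())
    x = toy_denoiser(x)
    # Guidance: push the generated part down the harmonization gradient.
    x[mask] -= 0.05 * harmonization_grad(x)[mask]

x[~mask] = real[~mask]               # finally clamp the kept pixels exactly
max_jump = float(np.max(np.abs(np.diff(x))))
print(round(max_jump, 3))            # small: the filled region blends in
```

Without the per-step replacement and guidance, the generated region would drift freely and leave a visible seam at the mask boundary.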
APA, Harvard, Vancouver, ISO, and other citation styles
25

Farouni, Tarek. „An Overview of Probabilistic Latent Variable Models with an Application to the Deep Unsupervised Learning of Chromatin States“. The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1492189894812539.

Full text of the source
APA, Harvard, Vancouver, ISO, and other citation styles
26

Kalchbrenner, Nal. „Encoder-decoder neural networks“. Thesis, University of Oxford, 2017. http://ora.ox.ac.uk/objects/uuid:d56e48db-008b-4814-bd82-a5d612000de9.

Full text of the source
Annotation:
This thesis introduces the concept of an encoder-decoder neural network and develops architectures for the construction of such networks. Encoder-decoder neural networks are probabilistic conditional generative models of high-dimensional structured items such as natural language utterances and natural images. Encoder-decoder neural networks estimate a probability distribution over structured items belonging to a target set conditioned on structured items belonging to a source set. The distribution over structured items is factorized into a product of tractable conditional distributions over individual elements that compose the items. The networks estimate these conditional factors explicitly. We develop encoder-decoder neural networks for core tasks in natural language processing and natural image and video modelling. In Part I, we tackle the problem of sentence modelling and develop deep convolutional encoders to classify sentences; we extend these encoders to models of discourse. In Part II, we go beyond encoders to study the longstanding problem of translating from one human language to another. We lay the foundations of neural machine translation, a novel approach that views the entire translation process as a single encoder-decoder neural network. We propose a beam search procedure to search over the outputs of the decoder to produce a likely translation in the target language. Besides known recurrent decoders, we also propose a decoder architecture based solely on convolutional layers. Since the publication of these new foundations for machine translation in 2013, encoder-decoder translation models have been richly developed and have displaced traditional translation systems both in academic research and in large-scale industrial deployment. In services such as Google Translate these models process in the order of a billion translation queries a day. 
In Part III, we shift from the linguistic domain to the visual one to study distributions over natural images and videos. We describe two- and three-dimensional recurrent and convolutional decoder architectures and address the longstanding problem of learning a tractable distribution over high-dimensional natural images and videos, where the likely samples from the distribution are visually coherent. The empirical validation of encoder-decoder neural networks as state-of-the-art models of tasks ranging from machine translation to video prediction has a two-fold significance. On the one hand, it validates the notions of assigning probabilities to sentences or images and of learning a distribution over a natural language or a domain of natural images; it shows that a probabilistic principle of compositionality, whereby a high-dimensional item is composed from individual elements at the encoder side and whereby a corresponding item is decomposed into conditional factors over individual elements at the decoder side, is a general method for modelling cognition involving high-dimensional items; and it suggests that the relations between the elements are best learnt in an end-to-end fashion as non-linear functions in distributed space. On the other hand, the empirical success of the networks on the tasks characterizes the underlying cognitive processes themselves: a cognitive process as complex as translating from one language to another that takes a human a few seconds to perform correctly can be accurately modelled via a learnt non-linear deterministic function of distributed vectors in high-dimensional space.
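The decoding side of such models (a factorized product of per-token conditionals searched with a beam, as in the abstract's beam search procedure) can be illustrated with a toy bigram table standing in for the decoder network; all probabilities are invented for illustration:

```python
import math

# A fixed bigram table stands in for the decoder network's per-token
# conditional p(next | prefix); the probabilities are invented.
BIGRAM = {
    "<s>": {"a": 0.6, "b": 0.4},
    "a":   {"a": 0.1, "b": 0.4, "</s>": 0.5},
    "b":   {"a": 0.3, "b": 0.1, "</s>": 0.6},
}

def beam_search(beam_width=2, max_len=5):
    """Keep the `beam_width` highest log-probability prefixes at each
    step; a hypothesis that emits </s> is complete."""
    beams = [(0.0, ["<s>"])]
    complete = []
    for _ in range(max_len):
        candidates = []
        for logp, prefix in beams:
            for tok, p in BIGRAM[prefix[-1]].items():
                hyp = (logp + math.log(p), prefix + [tok])
                (complete if tok == "</s>" else candidates).append(hyp)
        beams = sorted(candidates, key=lambda h: -h[0])[:beam_width]
        if not beams:
            break
    return max(complete, key=lambda h: h[0])

score, tokens = beam_search()
print(tokens)  # ['<s>', 'a', '</s>'] is the most probable sequence here
```

In a real translation model the bigram lookup is replaced by a neural network conditioned on the full prefix and on the encoded source sentence.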
APA, Harvard, Vancouver, ISO, and other citation styles
27

Viljoen, Christiaan Gerhardus. „Machine learning for particle identification & deep generative models towards fast simulations for the Alice Transition Radiation Detector at CERN“. Master's thesis, Faculty of Science, 2019. https://hdl.handle.net/11427/31781.

Full text of the source
Annotation:
This Master's thesis outlines the application of machine learning techniques, predominantly deep learning, to certain aspects of particle physics. Its two main aims, particle identification and high-energy physics detector simulation, are pertinent to research avenues pursued by physicists working with the ALICE (A Large Ion Collider Experiment) Transition Radiation Detector (TRD) at the Large Hadron Collider (LHC) at CERN (the European Organization for Nuclear Research).
APA, Harvard, Vancouver, ISO, and other citation styles
28

Guiraud, Enrico [author], Jörg [academic supervisor] Lücke and Ralf [academic supervisor] Häfner. „Scalable unsupervised learning for deep discrete generative models: novel variational algorithms and their software realizations / Enrico Guiraud ; Jörg Lücke, Ralf Häfner“. Oldenburg : BIS der Universität Oldenburg, 2020. http://d-nb.info/1226287077/34.

Full text of the source
APA, Harvard, Vancouver, ISO, and other citation styles
29

PANFILO, DANIELE. „Generating Privacy-Compliant, Utility-Preserving Synthetic Tabular and Relational Datasets Through Deep Learning“. Doctoral thesis, Università degli Studi di Trieste, 2022. http://hdl.handle.net/11368/3030920.

Full text of the source
Annotation:
Two trends have rapidly been redefining the artificial intelligence (AI) landscape over the past several decades. The first of these is the rapid technological development that makes increasingly sophisticated AI feasible. From a hardware point of view, this includes increased computational power and efficient data storage. From a conceptual and algorithmic viewpoint, fields such as machine learning have undergone a surge, and synergies between AI and other disciplines have resulted in considerable developments. The second trend is the growing societal awareness around AI. While institutions are becoming increasingly aware that they have to adopt AI technology to stay competitive, issues such as data privacy and explainability have become part of public discourse. Combined, these developments result in a conundrum: AI can improve all aspects of our lives, from healthcare to environmental policy to business opportunities, but invoking it requires the use of sensitive data. Unfortunately, traditional anonymization techniques do not provide a reliable solution to this conundrum. Not only are they insufficient to protect personal data, they also reduce the analytic value of the data through distortion. However, the emerging study of deep-learning generative models (DLGM) may form a more refined alternative to traditional anonymization. Originally conceived for image processing, these models capture the probability distributions underlying datasets. Such distributions can subsequently be sampled, giving new data points not present in the original dataset; yet the overall distribution of synthetic datasets, consisting of data sampled in this manner, is equivalent to that of the original dataset. In our research activity, we study the use of DLGM as an enabling technology for wider AI adoption. To do so, we first study legislation around data privacy with an emphasis on the European Union. In doing so, we also provide an outline of traditional data anonymization technology.
We then provide an introduction to AI and deep-learning. Two case studies are discussed to illustrate the field’s merits, namely image segmentation and cancer diagnosis. We then introduce DLGM, with an emphasis on variational autoencoders. The application of such methods to tabular and relational data is novel and involves innovative preprocessing techniques. Finally, we assess the developed methodology in reproducible experiments, evaluating both the analytic utility and the degree of privacy protection through statistical metrics.
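The utility-versus-privacy evaluation the abstract describes can be illustrated on toy tabular data; a fitted Gaussian stands in for the thesis's variational autoencoder, and the two metrics below are generic stand-ins rather than the thesis's exact ones:

```python
import numpy as np

rng = np.random.default_rng(42)
# "Real" tabular data: two correlated numeric columns.
true_cov = np.array([[1.0, 0.8], [0.8, 1.0]])
real = rng.multivariate_normal([0.0, 0.0], true_cov, size=5000)

# Generative stand-in: fit a Gaussian to the real table and sample a
# synthetic table from it (the thesis's VAE would play this role).
fitted_mean = real.mean(axis=0)
fitted_cov = np.cov(real, rowvar=False)
synthetic = rng.multivariate_normal(fitted_mean, fitted_cov, size=5000)

# Utility: the synthetic table should preserve the correlation
# structure that downstream analyses rely on.
corr_gap = abs(np.corrcoef(real, rowvar=False)[0, 1]
               - np.corrcoef(synthetic, rowvar=False)[0, 1])

# Naive privacy probe: no synthetic row should be an exact copy of a
# real row (distance from each synthetic row to the nearest real row).
dists = np.linalg.norm(real[None, :100, :] - synthetic[:100, None, :], axis=2)
min_dist = float(dists.min())
print(round(corr_gap, 3), min_dist > 0.0)
```

The point of the sketch is the evaluation pattern: one statistic measures how much analytic structure survives, the other checks that no original record leaks through verbatim.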
APA, Harvard, Vancouver, ISO, and other citation styles
30

Testolin, Alberto. „Modeling cognition with generative neural networks: The case of orthographic processing“. Doctoral thesis, Università degli studi di Padova, 2015. http://hdl.handle.net/11577/3424619.

Full text of the source
Annotation:
This thesis investigates the potential of generative neural networks to model cognitive processes. In contrast to many popular connectionist models, the computational framework adopted in this research work emphasizes the generative nature of cognition, suggesting that one of the primary goals of cognitive systems is to learn an internal model of the surrounding environment that can be used to infer causes and make predictions about the upcoming sensory information. In particular, we consider a powerful class of recurrent neural networks that learn probabilistic generative models from experience in a completely unsupervised way, by extracting high-order statistical structure from a set of observed variables. Notably, this type of network can be conveniently formalized within the more general framework of probabilistic graphical models, which provides a unified language to describe both neural networks and structured Bayesian models. Moreover, recent advances make it possible to extend basic network architectures to build more powerful systems, which exploit multiple processing stages to perform learning and inference over hierarchical models, or which exploit delayed recurrent connections to process sequential information. We argue that these advanced network architectures constitute a promising alternative to the more traditional, feed-forward, supervised neural networks, because they more neatly capture the functional and structural organization of cortical circuits, providing a principled way to combine top-down, high-level contextual information with bottom-up, sensory evidence. We provide empirical support justifying the use of these models by studying how efficient implementations of hierarchical and temporal generative networks can extract information from large datasets containing thousands of patterns.
In particular, we perform computational simulations of recognition of handwritten and printed characters belonging to different writing scripts, which are successively combined spatially or temporally in order to build more complex orthographic units such as those constituting English words.
APA, Harvard, Vancouver, ISO, and other citation styles
31

Ljung, Mikael. „Synthetic Data Generation for the Financial Industry Using Generative Adversarial Networks“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301307.

Full text of the source
Annotation:
Following the introduction of new laws and regulations to ensure data protection, such as the GDPR and PIPEDA, interest in technologies to protect data privacy has increased. A promising research trajectory in this area is found in Generative Adversarial Networks (GANs), an architecture trained to produce data that reflects the statistical properties of its underlying dataset without compromising the integrity of the data subjects. Despite the technology's young age, prior research has made significant progress in the generation of so-called synthetic data, and current models can generate high-quality images. Due to the architecture's success with images, it has been adapted to new domains, and this study examines its potential to synthesize financial tabular data. The study investigates a state-of-the-art tabular GAN model, CTGAN, together with two proposed ideas to enhance its generative ability. The results indicate that a modified training dynamic and a novel early stopping strategy improve the architecture's capacity to synthesize data. The generated data presents realistic features with clear influences from its underlying dataset, and the conclusions inferred from subsequent analyses are similar to those based on the original data. Thus, the conclusion is that GANs have great potential to generate tabular data that can serve as a substitute for sensitive data, which could enable organizations to adopt more generous data sharing policies.
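The early stopping idea mentioned in the abstract (halt GAN training once a synthetic-vs-real quality metric stops improving) can be sketched generically; the patience rule below is a common pattern offered as an assumption, not the thesis's exact strategy:

```python
def early_stop_index(metric_history, patience=3):
    """Epoch to roll back to: the best epoch so far, once `patience`
    epochs pass with no improvement. Lower is better (e.g. a divergence
    between synthetic and real feature distributions); the concrete
    rule is a generic stand-in, not the thesis's exact strategy."""
    best, best_epoch = float("inf"), 0
    for epoch, value in enumerate(metric_history):
        if value < best:
            best, best_epoch = value, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_epoch

# Hypothetical per-epoch quality scores of a GAN's synthetic output.
history = [0.90, 0.55, 0.40, 0.42, 0.41, 0.43, 0.39, 0.44]
print(early_stop_index(history))  # stops at epoch 2 (score 0.40)
```

Because GAN losses do not directly track sample quality, stopping on an external synthetic-vs-real metric like this is often more reliable than stopping on the adversarial loss itself.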
APA, Harvard, Vancouver, ISO, and other citation styles
32

Pakdaman, Hesam. „Updating the generator in PPGN-h with gradients flowing through the encoder“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-224867.

Full text of the source
Annotation:
The Generative Adversarial Network framework has shown success in implicitly modeling data distributions and is able to generate realistic samples. Its architecture comprises a generator, which produces fake data that superficially seems to belong to the real data distribution, and a discriminator, whose task is to distinguish fake from genuine samples. The Noiseless Joint Plug & Play model extends the framework by simultaneously training autoencoders. This model uses a pre-trained encoder as a feature extractor, feeding the generator with global information. Using the Plug & Play network as a baseline, we design a new model by adding discriminators to the Plug & Play architecture. These additional discriminators are trained to discern real and fake latent codes, which are the outputs of the encoder for genuine and generated inputs, respectively. We proceed to investigate whether this approach is viable. Experiments conducted on the MNIST manifold show that this is indeed the case.
APA, Harvard, Vancouver, ISO, and other citation styles
33

Folin, Veronika. „Abstractive Long Document Summarization: Studio e Sperimentazione di Modelli Generativi Retrieval-Augmented“. Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/24283/.

Full text of the source
Annotation:
This thesis presents the study and experimental evaluation of a retrieval-augmented generative model, based on Transformers, for the task of abstractive summarization of long legal sentences. Automatic text summarization has become a very important Natural Language Processing (NLP) task nowadays, given the enormous amount of data coming from the web and from databases. Moreover, it makes it possible to automate a process that is very burdensome for experts, especially in the legal sector, where documents are long and complicated and therefore difficult and expensive to summarize. State-of-the-art models for automatic text summarization are based on deep learning solutions, in particular on Transformers, which represent the most established architecture for NLP tasks. The model proposed in this thesis is a solution for long document summarization, i.e., for generating summaries of long textual sequences. In particular, the architecture is based on the RAG (Retrieval-Augmented Generation) model, recently introduced by the Facebook AI research team for the question answering task. The objective is to modify the RAG architecture in order to adapt it to the task of abstractive long document summarization. In detail, we exploit and test the model's non-parametric memory, with the aim of enriching the representation of the input text to be summarized. To this end, several configurations of the model were tested in different types of experiments, and the generated summaries were evaluated with several automatic metrics.
APA, Harvard, Vancouver, ISO, and other citation styles
34

Bjöörn, Anton. „Employing a Transformer Language Model for Information Retrieval and Document Classification : Using OpenAI's generative pre-trained transformer, GPT-2“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281766.

Full text of the source
Annotation:
As the information flow on the Internet keeps growing, it becomes increasingly easy to miss important news that does not have mass appeal. Combating this problem calls for increasingly sophisticated information retrieval methods. Pre-trained transformer-based language models have shown great generalization performance on many natural language processing tasks. This work investigates how well such a language model, OpenAI's Generative Pre-trained Transformer 2 (GPT-2), generalizes to information retrieval and classification of online news articles written in English, with the purpose of comparing this approach with the more traditional method of Term Frequency-Inverse Document Frequency (TF-IDF) vectorization. The aim is to shed light on how useful state-of-the-art transformer-based language models are for the construction of personalized information retrieval systems. Using transfer learning, the smallest version of GPT-2 is trained to rank and classify news articles, achieving results similar to the purely TF-IDF-based approach. While the average Normalized Discounted Cumulative Gain (NDCG) achieved by the GPT-2-based model was about 0.74 percentage points higher, the sample size was too small to give these results high statistical certainty.
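The NDCG figure quoted above can be made concrete with a short sketch (the relevance values are invented):

```python
# NDCG, the ranking metric used to compare the GPT-2 and TF-IDF based
# rankers: discounted cumulative gain of the produced ordering, normalized
# by the gain of the ideal ordering.
import math

def dcg(relevances):
    # Gain of each item discounted by its (log-scaled) rank position
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal else 0.0

perfect = ndcg([3, 2, 1])   # most relevant article ranked first
swapped = ndcg([1, 3, 2])   # imperfect ordering scores below 1.0
```

A difference of 0.74 percentage points between two rankers corresponds to a 0.0074 gap in this score.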
APA, Harvard, Vancouver, ISO, and other citation styles
35

Oskarsson, Joel. „Probabilistic Regression using Conditional Generative Adversarial Networks“. Thesis, Linköpings universitet, Statistik och maskininlärning, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-166637.

Full text of the source
Annotation:
Regression is a central problem in statistics and machine learning with applications everywhere in science and technology. In probabilistic regression the relationship between a set of features and a real-valued target variable is modelled as a conditional probability distribution. There are cases where this distribution is very complex and not properly captured by simple approximations, such as assuming a normal distribution. This thesis investigates how conditional Generative Adversarial Networks (GANs) can be used to properly capture more complex conditional distributions. GANs have seen great success in generating complex high-dimensional data, but less work has been done on their use for regression problems. This thesis presents experiments to better understand how conditional GANs can be used in probabilistic regression. Different versions of GANs are extended to the conditional case and evaluated on synthetic and real datasets. It is shown that conditional GANs can learn to estimate a wide range of different distributions and be competitive with existing probabilistic regression models.
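The structure of conditional GAN probabilistic regression can be sketched with toy linear stand-ins for the networks (untrained, all weights invented):

```python
# Structural sketch: the generator turns noise z and features x into a
# sample y from a learned conditional distribution p(y|x); the
# discriminator scores (x, y) pairs as real or generated. The linear
# "networks" below are placeholders for real neural networks.
import math
import random

random.seed(0)

def generator(z, x, w=(0.5, 2.0, 1.0)):
    # One draw from the model's conditional distribution: y = w0*z + w1*x + w2
    return w[0] * z + w[1] * x + w[2]

def discriminator(x, y, w=(1.0, -1.0, 0.0)):
    # Sigmoid score: estimated probability that (x, y) is a real pair
    s = w[0] * x + w[1] * y + w[2]
    return 1.0 / (1.0 + math.exp(-s))

def sample_conditional(x, n=1000):
    # p(y|x) is represented only through samples, which is what lets the
    # approach capture complex, non-Gaussian conditional distributions
    return [generator(random.gauss(0.0, 1.0), x) for _ in range(n)]

samples = sample_conditional(x=1.0)
mean = sum(samples) / len(samples)
```

With a trained generator, empirical quantiles of `samples` would yield prediction intervals rather than a single point estimate.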
APA, Harvard, Vancouver, ISO, and other citation styles
36

Pumarola, Peris Albert. „Bridging the gap between reconstruction and synthesis“. Doctoral thesis, TDX (Tesis Doctorals en Xarxa), 2021. http://hdl.handle.net/10803/673266.

Full text of the source
Annotation:
3D reconstruction and image synthesis are two of the main pillars in computer vision. Early works focused on simple tasks such as multi-view reconstruction and texture synthesis. With the spur of Deep Learning, the field has rapidly progressed, making it possible to achieve more complex and high level tasks. For example, the 3D reconstruction results of traditional multi-view approaches are currently obtained with single view methods. Similarly, early pattern based texture synthesis works have resulted in techniques that allow generating novel high-resolution images. In this thesis we have developed a hierarchy of tools that cover all these range of problems, lying at the intersection of computer vision, graphics and machine learning. We tackle the problem of 3D reconstruction and synthesis in the wild. Importantly, we advocate for a paradigm in which not everything should be learned. Instead of applying Deep Learning naively we propose novel representations, layers and architectures that directly embed prior 3D geometric knowledge for the task of 3D reconstruction and synthesis. We apply these techniques to problems including scene/person reconstruction and photo-realistic rendering. We first address methods to reconstruct a scene and the clothed people in it while estimating the camera position. Then, we tackle image and video synthesis for clothed people in the wild. Finally, we bridge the gap between reconstruction and synthesis under the umbrella of a unique novel formulation. Extensive experiments conducted along this thesis show that the proposed techniques improve the performance of Deep Learning models in terms of the quality of the reconstructed 3D shapes / synthesised images, while reducing the amount of supervision and training data required to train them. 
In summary, we provide a variety of low, mid and high level algorithms that can be used to incorporate prior knowledge into different stages of the Deep Learning pipeline and improve performance in tasks of 3D reconstruction and image synthesis.
Automatic Control, Robotics and Vision
APA, Harvard, Vancouver, ISO, and other citation styles
37

Kryściński, Wojciech. „Training Neural Models for Abstractive Text Summarization“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-236973.

Full text of the source
Annotation:
Abstractive text summarization aims to condense long textual documents into a short, human-readable form while preserving the most important information from the source document. A common approach to training summarization models is by using maximum likelihood estimation with the teacher forcing strategy. Despite its popularity, this method has been shown to yield models with suboptimal performance at inference time. This work examines how using alternative, task-specific training signals affects the performance of summarization models. Two novel training signals are proposed and evaluated as part of this work. One, a novelty metric, measuring the overlap between n-grams in the summary and the summarized article. The other, utilizing a discriminator model to distinguish human-written summaries from generated ones on a word-level basis. Empirical results show that using the mentioned metrics as rewards for policy gradient training yields significant performance gains measured by ROUGE scores, novelty scores and human evaluation.
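The first training signal, the n-gram novelty metric, can be sketched as follows (the exact n and weighting used in the thesis may differ):

```python
# Hedged sketch of a novelty reward: the fraction of the summary's n-grams
# that do NOT appear in the source article, so copying scores 0.0 and a
# fully abstractive summary scores 1.0.
def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def novelty(summary, article, n=3):
    s, a = summary.split(), article.split()
    grams = ngrams(s, n)
    if not grams:
        return 0.0
    copied = len(grams & ngrams(a, n))
    return 1.0 - copied / len(grams)

score = novelty("the cat sat on the mat", "the cat sat on the mat")  # pure copy
```

In policy-gradient training, such a score would be used as (part of) the reward for a sampled summary.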
APA, Harvard, Vancouver, ISO, and other citation styles
38

Besedin, Andrey. „Continual forgetting-free deep learning from high-dimensional data streams“. Electronic Thesis or Diss., Paris, CNAM, 2019. http://www.theses.fr/2019CNAM1263.

Full text of the source
Annotation:
In this thesis, we propose a new deep-learning-based approach for online classification on streams of high-dimensional data. In recent years, Neural Networks (NNs) have become the primary building block of state-of-the-art methods in various machine learning problems. Most of these methods, however, are designed to solve the static learning problem, where all data are available at once at training time. Performing online deep learning is exceptionally challenging. The main difficulty is that NN-based classifiers usually rely on the assumption that the sequence of data batches used during training is stationary, or in other words, that the distribution of data classes is the same for all batches (the i.i.d. assumption). When this assumption does not hold, neural networks tend to forget the concepts that are temporarily not available in the stream. In the literature, this phenomenon is known as catastrophic forgetting. The approaches we propose in this thesis aim to guarantee the i.i.d. nature of each batch that comes from the stream and to compensate for the lack of historical data. To do this, we train generative and pseudo-generative models capable of producing synthetic samples from classes that are absent or misrepresented in the stream, and complete the stream's batches with these samples. We test our approaches in an incremental learning scenario and in a specific type of continual learning. Our approaches perform classification on dynamic data streams with accuracy close to the results obtained in the static classification configuration, where all data are available for the duration of the learning. Besides, we demonstrate the ability of our methods to adapt to unseen data classes and to new instances of already known data categories, while avoiding forgetting the previously acquired knowledge.
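The batch-completion idea can be sketched minimally (names invented, generator stubbed): each stream batch is topped up with synthetic samples for the known classes that are missing or under-represented, so the batch better approximates an i.i.d. draw over all classes.

```python
# Sketch of generative replay for stream batches. stub_generator is a
# placeholder for a trained class-conditional generative model.
import random

random.seed(1)

def stub_generator(label, dim=4):
    # Stand-in for sampling a synthetic feature vector for class `label`
    return [random.random() for _ in range(dim)]

def complete_batch(batch, known_classes, per_class=2):
    # batch: list of (features, label) pairs coming from the stream
    counts = {c: 0 for c in known_classes}
    for _, label in batch:
        if label in counts:
            counts[label] += 1
    completed = list(batch)
    for c in known_classes:
        for _ in range(max(0, per_class - counts[c])):
            completed.append((stub_generator(c), c))
    return completed

stream_batch = [([0.1, 0.2, 0.3, 0.4], "cat"), ([0.5, 0.6, 0.7, 0.8], "cat")]
balanced = complete_batch(stream_batch, known_classes=["cat", "dog"])
```

The classifier is then trained on `balanced` batches, so classes absent from the stream still contribute gradient signal.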
APA, Harvard, Vancouver, ISO, and other citation styles
39

Diffner, Fredrik, and Hovig Manjikian. „Training a Neural Network using Synthetically Generated Data“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280334.

Full text of the source
Annotation:
A major challenge in training machine learning models is the gathering and labeling of a sufficiently large training data set. A common solution is to use a synthetically generated data set to expand or replace a real data set. This paper examines the performance of a machine learning model trained on a synthetic data set versus the same model trained on real data. This approach was applied to the problem of character recognition using a machine learning model that implements convolutional neural networks. A synthetic data set of 1,240,000 images and two real data sets, Chars74K and ICDAR 2003, were used. The result was that the model trained on the synthetic data set achieved an accuracy about 50% better than that of the same model trained on the real data set.
APA, Harvard, Vancouver, ISO, and other citation styles
40

Hadjeres, Gaëtan. „Modèles génératifs profonds pour la génération interactive de musique symbolique“. Thesis, Sorbonne université, 2018. http://www.theses.fr/2018SORUS027/document.

Full text of the source
Annotation:
This thesis discusses the use of deep generative models for symbolic music generation. We focus on devising interactive generative models, i.e. models that establish a fruitful dialogue between a human composer and a computer during the creative process. Recent advances in artificial intelligence have led to the development of powerful generative models able to generate musical content without human intervention. I believe that this practice cannot thrive, since human intervention and appreciation are at the crux of artistic production. However, the need for flexible and expressive tools which could enhance content creators' creativity is patent; the development and the potential of such novel AI-augmented computer music tools are promising. In this manuscript, I propose novel architectures that put artists back in the loop. The proposed models share a common characteristic: they are devised so that a user can control the generated musical content in a creative way. To make this interaction user-friendly, user interfaces were developed. I believe that new compositional paradigms will emerge from the possibilities offered by these enhanced controls. This thesis ends with the presentation of genuine musical projects (scores, concerts) featuring these new creative tools.
APA, Harvard, Vancouver, ISO, and other citation styles
41

Budaraju, Sri Datta. „Unsupervised 3D Human Pose Estimation“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-291435.

Full text of the source
Annotation:
The thesis proposes an unsupervised representation learning method to predict 3D human pose from a 2D skeleton via a VAE-GAN (Variational Autoencoder Generative Adversarial Network) hybrid network. The method learns to lift poses from 2D to 3D using self-supervision and adversarial learning techniques. The method does not use images, heatmaps, 3D pose annotations, paired/unpaired 2D-to-3D skeletons, 3D priors, synthetic 2D skeletons, multi-view or temporal information in any shape or form. The 2D skeleton input is taken by a VAE that encodes it in a latent space and then decodes that latent representation to a 3D pose. The 3D pose is then reprojected to 2D for a constrained, self-supervised optimization using the input 2D pose. In parallel, the 3D pose is also randomly rotated and reprojected to 2D to generate a 'novel' 2D view for unconstrained adversarial optimization using a discriminator network. The combination of the optimizations of the original and the novel 2D views of the predicted 3D pose results in 'realistic' 3D pose generation. The thesis shows that the encoding and decoding process of the VAE addresses the major challenge of erroneous and incomplete skeletons from 2D detection networks as inputs, and that the variance of the VAE can be altered to obtain various plausible 3D poses for a given 2D input. Additionally, the latent representation could be used for cross-modal training and many downstream applications. The results on the Human3.6M dataset outperform previous unsupervised approaches with less model complexity while addressing more hurdles in scaling the task to the real world.
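The rotate-and-reproject self-supervision described above can be sketched as follows, assuming a simple orthographic camera (the thesis's projection model may differ):

```python
# Geometry sketch: a predicted 3D pose is (a) reprojected to 2D for
# comparison with the input skeleton, and (b) randomly rotated about the
# vertical axis before reprojection to yield a "novel" 2D view for the
# discriminator.
import math
import random

def rotate_y(points, angle):
    # Rotate a list of (x, y, z) joints about the vertical (y) axis
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]

def project(points):
    # Orthographic reprojection: drop the depth coordinate
    return [(x, y) for x, y, _ in points]

pose_3d = [(0.0, 1.0, 0.5), (0.2, 0.8, -0.1)]       # toy predicted pose
view_original = project(pose_3d)                      # compared to the input 2D pose
view_novel = project(rotate_y(pose_3d,
                              random.uniform(-math.pi, math.pi)))
```

In training, `view_original` would drive the constrained self-supervised loss and `view_novel` would be fed to the discriminator.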
APA, Harvard, Vancouver, ISO, and other citation styles
42

Chali, Samy. „Robustness Analysis of Classifiers Against Out-of-Distribution and Adversarial Inputs“. Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPAST012.

Full text of the source
Annotation:
Many issues addressed by AI involve the classification of complex input data that needs to be separated into different classes. The functions that transform the complex input values into a simpler, linearly separable space are achieved either through learning (deep convolutional networks) or by projecting into a high-dimensional space to obtain a 'rich' non-linear representation of the inputs, followed by a linear mapping between the high-dimensional space and the output units, as used in Support Vector Machines (Vapnik's work, 1966-1995). The thesis aims to create an optimized, generic architecture capable of preprocessing data to prepare it for classification with a minimum of operations. Additionally, this architecture aims to enhance the model's autonomy by enabling continuous learning, robustness to corrupted data, and the identification of data that the model cannot process.
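The projection-then-linear-readout idea credited to Vapnik's line of work can be illustrated with a minimal sketch (dimensions, data, and training schedule are all invented): a fixed random non-linear projection makes the XOR problem, which no linear map solves in the raw 2D space, linearly separable.

```python
# Fixed random high-dimensional non-linear projection followed by a
# trainable linear readout (perceptron), solving XOR.
import math
import random

random.seed(3)
DIM, HIDDEN = 2, 64
W = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(HIDDEN)]
B = [random.gauss(0, 1) for _ in range(HIDDEN)]

def project(x):
    # Fixed random 'rich' non-linear representation of the input
    return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + b)
            for w, b in zip(W, B)]

def train_readout(data, epochs=200, lr=0.1):
    # Only the linear readout is learned; the projection stays fixed
    w, b = [0.0] * HIDDEN, 0.0
    for _ in range(epochs):
        for x, target in data:
            h = project(x)
            pred = 1 if sum(wi * hi for wi, hi in zip(w, h)) + b > 0 else 0
            err = target - pred
            w = [wi + lr * err * hi for wi, hi in zip(w, h)]
            b += lr * err
    return w, b

xor = [((0.0, 0.0), 0), ((0.0, 1.0), 1), ((1.0, 0.0), 1), ((1.0, 1.0), 0)]
w, b = train_readout(xor)
preds = [1 if sum(wi * hi for wi, hi in zip(w, project(x))) + b > 0 else 0
         for x, _ in xor]
```

This mirrors the architecture the abstract describes: heavy non-linear preprocessing up front, then a cheap linear classification stage.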
APA, Harvard, Vancouver, ISO, and other citation styles
44

Crestel, Léopold. „Neural networks for automatic musical projective orchestration“. Electronic Thesis or Diss., Sorbonne université, 2018. http://www.theses.fr/2018SORUS625.

Full text of the source
Annotation:
Orchestration is the art of composing a musical discourse over a combinatorial set of instrumental possibilities. For centuries, musical orchestration has only been addressed in an empirical way, as a scientific theory of orchestration appears elusive. In this work, we attempt to build the first system for automatic projective orchestration, relying on machine learning. We start by formalizing this novel task. We focus our effort on projecting a piano piece onto a full symphonic orchestra, in the style of notable classical composers such as Mozart or Beethoven. The first objective is to design a system for live orchestration, which takes as input the sequence of chords played by a pianist and generates its orchestration in real time. Afterwards, we relax the real-time constraints in order to use slower but more powerful models and to generate scores in a non-causal way, which is closer to the writing process of a human composer. By observing a large dataset of orchestral music written by composers and its reductions for piano, we hope to capture, through statistical learning methods, the mechanisms involved in the orchestration of a piano piece. Deep neural networks seem to be a promising lead for their ability to model complex behaviour from a large dataset in an unsupervised way. More specifically, in the challenging context of symbolic music, which is characterized by a high-dimensional target space and few examples, we investigate autoregressive models. At the price of a slower generation process, autoregressive models allow accounting for more complex dependencies between the different elements of the score, which we believe to be of the foremost importance in the case of orchestration.
APA, Harvard, Vancouver, ISO and other citation styles
45

Salakhutdinov, Ruslan. „Learning Deep Generative Models“. Thesis, 2009. http://hdl.handle.net/1807/19226.

Full text of the source
Annotation:
Building intelligent systems that are capable of extracting high-level representations from high-dimensional sensory data lies at the core of solving many AI-related tasks, including object recognition, speech perception, and language understanding. Theoretical and biological arguments strongly suggest that building such systems requires models with deep architectures that involve many layers of nonlinear processing. The aim of this thesis is to demonstrate that deep generative models that contain many layers of latent variables and millions of parameters can be learned efficiently, and that the learned high-level feature representations can be successfully applied in a wide spectrum of application domains, including visual object recognition, information retrieval, and classification and regression tasks. In addition, similar methods can be used for nonlinear dimensionality reduction.
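A classic building block of such layered generative models is the restricted Boltzmann machine, trained with contrastive divergence. The sketch below shows one step of CD-1 on toy binary data; the data, sizes, and learning rate are assumptions for the illustration, not taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy binary data: two repeated prototype patterns.
data = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 50, dtype=float)

n_vis, n_hid = 4, 3
W = rng.normal(scale=0.1, size=(n_vis, n_hid))
b_vis = np.zeros(n_vis)
b_hid = np.zeros(n_hid)

lr = 0.1
for epoch in range(500):
    v0 = data
    # Positive phase: hidden activations given the data.
    h0 = sigmoid(v0 @ W + b_hid)
    # One Gibbs step (CD-1): sample hiddens, reconstruct visibles, re-infer hiddens.
    h_sample = (rng.random(h0.shape) < h0).astype(float)
    v1 = sigmoid(h_sample @ W.T + b_vis)
    h1 = sigmoid(v1 @ W + b_hid)
    # Approximate log-likelihood gradient: data statistics minus model statistics.
    W += lr * (v0.T @ h0 - v1.T @ h1) / len(data)
    b_vis += lr * (v0 - v1).mean(axis=0)
    b_hid += lr * (h0 - h1).mean(axis=0)

# Mean-field reconstruction error after training (should be small).
recon_err = np.mean((data - sigmoid(sigmoid(data @ W + b_hid) @ W.T + b_vis)) ** 2)
print(recon_err)
```

Stacking such layers, so that each layer's hidden activities become the next layer's data, is the greedy pretraining idea behind deep generative models of this kind.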
APA, Harvard, Vancouver, ISO and other citation styles
46

„Learning Transferable Data Representations Using Deep Generative Models“. Master's thesis, 2018. http://hdl.handle.net/2286/R.I.49287.

Full text of the source
Annotation:
Machine learning models convert raw data in the form of video, images, audio, text, etc. into feature representations that are convenient for computational processing. Deep neural networks have proven to be very efficient feature extractors for a variety of machine learning tasks. Generative models based on deep neural networks introduce constraints on the feature space to learn transferable and disentangled representations. Transferable feature representations help in training machine learning models that are robust across different distributions of data. For example, with the application of transferable features in domain adaptation, models trained on a source distribution can be applied to data from a target distribution even though the distributions may be different. In style transfer and image-to-image translation, disentangled representations allow for the separation of style and content when translating images. This thesis examines learning transferable data representations in novel deep generative models. The Semi-Supervised Adversarial Translator (SAT) utilizes adversarial methods and cross-domain weight sharing in a neural network to extract transferable representations. These transferable representations can then be decoded into the original image or a similar image in another domain. The Explicit Disentangling Network (EDN) utilizes generative methods to disentangle images into their core attributes and then segments sets of related attributes. The EDN can separate these attributes by controlling the flow of information using a novel combination of losses and network architecture. This separation of attributes allows precise modifications to specific components of the data representation, boosting the performance of machine learning tasks. The effectiveness of these models is evaluated across domain adaptation, style transfer, and image-to-image translation tasks.
Dissertation/Thesis
Masters Thesis Computer Science 2018
APA, Harvard, Vancouver, ISO and other citation styles
47

Bordes, Florian. „Learning to sample from noise with deep generative models“. Thèse, 2017. http://hdl.handle.net/1866/19370.

Full text of the source
Annotation:
Machine learning, and especially deep learning, has established itself in recent years as a way to solve a wide variety of tasks. One of the most remarkable applications concerns computer vision. Detection and classification systems have made major advances thanks to deep learning. However, many obstacles remain to an understanding of the world similar to that of living beings, who need no labels to classify or to extract features from the real world. Unsupervised learning is one of the research directions that focuses on solving this problem. In this thesis, I present a new way to train neural networks in an unsupervised manner. I present a method for sampling iteratively from noise in order to generate data close to the training data. This iterative procedure is called Infusion training and is a new approach for learning the transition operator of a Markov chain. In the first chapter, I introduce background on machine learning and probability theory. In the second chapter, I present the generative models that inspired this work. In the third and last chapter, I present how to improve sampling in generative models with Infusion training.
Machine learning, and specifically deep learning, has made significant breakthroughs in recent years on a variety of tasks. One well-known application of deep learning is computer vision. Tasks such as detection or classification are nearly considered solved by the community. However, training state-of-the-art models for such tasks requires labels associated with the data we want to classify. A more general goal, akin to what animal brains achieve, is to design algorithms that can extract meaningful features from unlabelled data. Unsupervised learning is one of the directions that tries to solve this problem. In this thesis, I present a new way to train a neural network as a generative model capable of generating quality samples (a task akin to imagining). I explain how, by starting from noise, it is possible to obtain samples which are close to the training data. This iterative procedure is called Infusion training and is a novel approach to learning the transition operator of a generative Markov chain. The first chapter presents some background on machine learning and probabilistic models. The second chapter presents the generative models that inspired this work. The third and last chapter presents and investigates our novel approach to learning a generative model with Infusion training.
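The core idea — a Markov chain whose transition operator walks from pure noise toward the data distribution — can be sketched with a toy, hand-written operator standing in for the learned network. The target point, infusion rate, and noise schedule below are all invented for the illustration; Infusion training itself learns the operator from data.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "data distribution": samples should end up near this target point.
target = np.array([3.0, -2.0])

def transition(x, t, n_steps):
    """One step of a Markov chain that gradually infuses information about
    the data into the sample. In Infusion training this operator is a
    learned neural network; here a hand-written stand-in is used, with the
    infusion rate increasing along the chain and the noise annealed down."""
    rate = (t + 1) / n_steps            # how strongly we pull toward the data
    noise = 0.5 * (1.0 - t / n_steps)   # annealed noise scale
    return x + rate * (target - x) + noise * rng.normal(size=x.shape)

def sample(n_steps=20):
    x = rng.normal(size=2)              # start the chain from pure noise
    for t in range(n_steps):
        x = transition(x, t, n_steps)
    return x

samples = np.array([sample() for _ in range(200)])
print(np.abs(samples.mean(axis=0) - target))  # chain mean ends near the target
```

Early steps are noisy and barely informed by the data; late steps are nearly deterministic — the same annealing intuition the thesis exploits when learning the operator.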
APA, Harvard, Vancouver, ISO and other citation styles
48

Killedar, Vinayak. „Solving Inverse Problems Using a Deep Generative Prior“. Thesis, 2021. https://etd.iisc.ac.in/handle/2005/5234.

Full text of the source
Annotation:
In an inverse problem, the objective is to recover a signal from its measurements, given knowledge of the measurement operator. In this thesis, we address the problems of compressive sensing (CS) and compressive phase retrieval (CPR) using a generative prior model with sparse latent sampling. These problems are ill-posed and have infinitely many solutions. Structural assumptions such as smoothness, sparsity, and non-negativity are imposed on the solution to obtain a unique and meaningful one. The standard CS and CPR formulations impose a sparsity prior on the signal. Recently, generative modeling approaches have removed the sparsity constraint and shown superior performance over traditional CS and CPR techniques in recovering signals from fewer measurements. These approaches use a pre-trained network, the generator of a Generative Adversarial Network (GAN) or the decoder of a Variational Autoencoder (VAE), to model the distribution of the signal, and impose a Set-Restricted Eigenvalue Condition (S-REC) on the measurement operator. The S-REC property places a condition on the L2-norm of the difference, in the signal and measurement domains, for signals coming from the set S. Solving CS and CPR using generative models has some limitations. The reconstructed signal is constrained to lie in the range space of the generator, and the reconstruction process is slow because the latent space is optimized through gradient descent (GD) and requires several restarts. It has been argued that the distribution of natural images is not confined to a single manifold, but a union of submanifolds. To take advantage of this property, we propose a sparsity-driven latent space sampling (SDLSS) framework, where sparsity is imposed in the latent space. The effect is to divide the latent space into subspaces such that the generator maps each subspace onto a submanifold.
We propose a proximal meta-learning (PML) algorithm to optimize the parameters of the generative model along with the latent code. The PML algorithm reduces the number of gradient steps required during testing and imposes sparsity in the latent space. We derive the sample complexity bounds within the SDLSS framework for the linear CS model, which is a generalization of the result available in the literature. The results demonstrate that, for a higher degree of compression, the SDLSS method is more efficient than the state-of-the-art deep compressive sensing (DCS) method. We consider both linear and learned nonlinear sensing mechanisms, where the nonlinear operator is a learned fully connected neural network or a convolutional neural network, and show that the learned nonlinear version is superior to the linear one. As an application of the nonlinear sensing operator, we consider compressive phase retrieval, wherein the problem is to reconstruct a signal from the magnitude of its compressed linear measurements. We adapt the S-REC imposed on the measurement operator and propose a novel cost function. The SDLSS framework, along with the PML algorithm, is applied to optimize the sparse latent space such that the adapted S-REC loss and data-fitting error are minimized. The reconstruction process is fast and requires few gradient steps during testing compared with the state-of-the-art deep phase retrieval technique. Experiments are conducted on standard datasets such as MNIST, Fashion-MNIST, CIFAR-10, and CelebA to validate the efficiency of the SDLSS framework for CS and CPR. The results show that, for a given dataset, there exists an effective input latent dimension for the generative model. Performance quantification is carried out by employing three objective metrics: peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and reconstruction error (RE) per pixel, which are averaged across the test dataset.
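In spirit, the recovery procedure — optimize a sparse latent code so that the generated signal matches the compressed measurements — can be sketched with proximal gradient descent on the latent space. A linear stand-in plays the role of the trained generator so the sketch stays self-contained; all sizes, the operator, and the threshold are assumptions for the illustration, not the thesis's models.

```python
import numpy as np

rng = np.random.default_rng(3)

n, m, k = 64, 20, 10   # signal dim, number of measurements, latent dim (toy sizes)

# A linear "generator" stands in for a trained GAN generator or VAE decoder.
G = rng.normal(size=(n, k)) / np.sqrt(k)
A = rng.normal(size=(m, n)) / np.sqrt(m)    # compressive measurement operator

# Ground-truth signal lies in the generator's range, with a sparse latent code
# (the sparse-latent assumption).
z_true = np.zeros(k)
z_true[[1, 4, 7]] = rng.normal(size=3)
x_true = G @ z_true
y = A @ x_true                               # compressed measurements

def soft_threshold(z, lam):
    """Proximal operator of the L1 norm: promotes sparsity in the latent code."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

# Proximal gradient descent on z: minimize ||A G z - y||^2 with a sparse z.
z = np.zeros(k)
M = A @ G
step = 1.0 / (2 * np.linalg.norm(M, 2) ** 2)  # safe step for the quadratic loss
for _ in range(500):
    grad = 2 * M.T @ (M @ z - y)
    z = soft_threshold(z - step * grad, lam=1e-4)

x_hat = G @ z
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))  # relative error
```

The soft-thresholding step is what keeps the latent code sparse during optimization; in the thesis this role is played by the PML algorithm operating on a learned, nonlinear generator.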
APA, Harvard, Vancouver, ISO and other citation styles
49

Barros, Maria de Fátima da Silva. „Deep Generative Models for 3D Breast MRI Shapes and Textures“. Master's thesis, 2021. https://hdl.handle.net/10216/135381.

Full text of the source
APA, Harvard, Vancouver, ISO and other citation styles
50

Barros, Maria de Fátima da Silva. „Deep Generative Models for 3D Breast MRI Shapes and Textures“. Dissertação, 2021. https://hdl.handle.net/10216/135381.

Full text of the source
APA, Harvard, Vancouver, ISO and other citation styles
