Dissertations on the topic "Deep generative modeling"

To see the other types of publications on this topic, follow the link: Deep generative modeling.

Consult the top 33 dissertations for your research on the topic "Deep generative modeling".

Next to every entry in the bibliography there is an "Add to bibliography" option. Use it, and the bibliographic reference for the chosen work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).

You can also download the full text of the scholarly publication as a PDF and read an online annotation of the work, provided the relevant parameters are available in the metadata.

Browse dissertations from a wide range of disciplines and compile your bibliography correctly.

1

Skalic, Miha 1990. „Deep learning for drug design : modeling molecular shapes“. Doctoral thesis, Universitat Pompeu Fabra, 2019. http://hdl.handle.net/10803/667503.

Annotation:
Designing novel drugs is a complex process which requires finding, in a vast chemical space, molecules that bind to a specific biomolecular target and have favorable physicochemical properties. Machine learning methods can leverage previous data for new predictions, helping the process of selecting candidate molecules without relying exclusively on experiments. In particular, deep learning can be applied to extract complex patterns from simple representations. In this work we leverage deep learning to extract patterns from three-dimensional representations of molecules. We apply classification and regression models to predict bioactivity and binding affinity, respectively. Furthermore, we show that it is possible to predict ligand properties for a particular protein pocket. Finally, we employ deep generative modeling for compound design. Given a ligand shape we show that we can generate similar compounds, and given a protein pocket we can generate potentially binding compounds.
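The three-dimensional molecular representations mentioned above are typically voxel grids rasterized from atom coordinates. The sketch below shows one minimal way such a grid could be built, assuming only a list of atom positions in ångströms; the box size, resolution, Gaussian width, and single-channel treatment are illustrative choices, not parameters from the thesis.

```python
import numpy as np

def voxelize(coords, box=24.0, resolution=1.0, sigma=1.0):
    """Rasterize atom coordinates (N, 3) into a cubic occupancy grid.

    Each atom contributes a Gaussian blob centred on its position;
    the box is centred on the molecule's centroid.
    """
    n = int(box / resolution)
    grid = np.zeros((n, n, n))
    centred = coords - coords.mean(axis=0)
    axis = (np.arange(n) + 0.5) * resolution - box / 2.0
    xx, yy, zz = np.meshgrid(axis, axis, axis, indexing="ij")
    for x, y, z in centred:
        d2 = (xx - x) ** 2 + (yy - y) ** 2 + (zz - z) ** 2
        grid += np.exp(-d2 / (2.0 * sigma ** 2))
    return grid

# Toy usage with random "atom" positions.
atoms = np.random.default_rng(0).uniform(-5, 5, size=(20, 3))
channel = voxelize(atoms)
print(channel.shape)  # (24, 24, 24)
```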
2

Chen, Tian Qi. „Deep kernel mean embeddings for generative modeling and feedforward style transfer“. Thesis, University of British Columbia, 2017. http://hdl.handle.net/2429/62668.

Annotation:
The generation of data has traditionally been specified using hand-crafted algorithms. However, oftentimes the exact generative process is unknown while only a limited number of samples are observed. One such case is generating images that look visually similar to an exemplar image or as if coming from a distribution of images. We look into learning the generating process by constructing a similarity function that measures how close the generated image is to the target image. We discuss a framework in which the similarity function is specified by a pre-trained neural network without fine-tuning, as is the case for neural texture synthesis, and a framework where the similarity function is learned along with the generative process in an adversarial setting, as is the case for generative adversarial networks. The main point of discussion is the combined use of neural networks and maximum mean discrepancy as a versatile similarity function. Additionally, we describe an improvement to state-of-the-art style transfer that allows faster computations while maintaining generality of the generating process. The proposed objective has desirable properties such as a simpler optimization landscape, intuitive parameter tuning, and consistent frame-by-frame performance on video. We use 80,000 natural images and 80,000 paintings to train a procedure for artistic style transfer that is efficient but also allows arbitrary content and style images.
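The central quantity in the abstract above is maximum mean discrepancy (MMD) used as a similarity function between sets of images or their network features. Below is a minimal numpy sketch of the biased MMD² estimator with a Gaussian kernel; the feature dimensions and kernel bandwidth are placeholders rather than values from the thesis.

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel between rows of x and rows of y.
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    # Biased estimate of squared maximum mean discrepancy between samples x and y.
    kxx = rbf_kernel(x, x, sigma).mean()
    kyy = rbf_kernel(y, y, sigma).mean()
    kxy = rbf_kernel(x, y, sigma).mean()
    return kxx + kyy - 2.0 * kxy

# Toy usage: features of generated vs. target images (random stand-ins here).
rng = np.random.default_rng(0)
feat_generated = rng.normal(0.0, 1.0, size=(128, 64))
feat_target = rng.normal(0.5, 1.0, size=(128, 64))
print(mmd2(feat_generated, feat_target))
```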
3

Brodie, Michael B. „Methods for Generative Adversarial Output Enhancement“. BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8763.

Annotation:
Generative Adversarial Networks (GAN) learn to synthesize novel samples for a given data distribution. While GANs can train on diverse data of various modalities, the most successful use cases to date apply GANs to computer vision tasks. Despite significant advances in training algorithms and network architectures, GANs still struggle to consistently generate high-quality outputs after training. We present a series of papers that improve GAN output inference qualitatively and quantitatively. The first chapter, Alpha Model Domination, addresses a related subfield of Multiple Choice Learning, which -- like GANs -- aims to generate diverse sets of outputs. The next chapter, CoachGAN, introduces a real-time refinement method for the latent input space that improves inference quality for pretrained GANs. The following two chapters introduce finetuning methods for arbitrary, end-to-end differentiable GANs. The first, PuzzleGAN, proposes a self-supervised puzzle-solving task to improve global coherence in generated images. The latter, Trained Truncation Trick, improves upon a common inference heuristic by better maintaining output diversity while increasing image realism. Our final work, Two Second StyleGAN Projection, reduces the time for high-quality, image-to-latent GAN projections by two orders of magnitude. We present a wide array of results and applications of our method. We conclude with implications and directions for future work.
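For context on the inference heuristic that the Trained Truncation Trick chapter improves upon, here is a minimal sketch of the standard truncation trick, which resamples out-of-range latent coordinates before feeding them to a pretrained generator; the threshold, latent dimension, and the generator G referred to in the comment are illustrative, not taken from the thesis.

```python
import numpy as np

def truncated_latents(n, dim, threshold=0.7, rng=None):
    # Classic truncation trick: resample any latent coordinate whose magnitude
    # exceeds the threshold, trading sample diversity for realism.
    rng = rng or np.random.default_rng()
    z = rng.standard_normal((n, dim))
    mask = np.abs(z) > threshold
    while mask.any():
        z[mask] = rng.standard_normal(mask.sum())
        mask = np.abs(z) > threshold
    return z

z = truncated_latents(16, 512, threshold=0.7)
# z would then be fed to a pretrained generator, e.g. images = G(z).
print(z.shape, np.abs(z).max())
```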
4

Testolin, Alberto. „Modeling cognition with generative neural networks: The case of orthographic processing“. Doctoral thesis, Università degli studi di Padova, 2015. http://hdl.handle.net/11577/3424619.

Annotation:
This thesis investigates the potential of generative neural networks to model cognitive processes. In contrast to many popular connectionist models, the computational framework adopted in this research work emphasizes the generative nature of cognition, suggesting that one of the primary goals of cognitive systems is to learn an internal model of the surrounding environment that can be used to infer causes and make predictions about the upcoming sensory information. In particular, we consider a powerful class of recurrent neural networks that learn probabilistic generative models from experience in a completely unsupervised way, by extracting high-order statistical structure from a set of observed variables. Notably, this type of network can be conveniently formalized within the more general framework of probabilistic graphical models, which provides a unified language to describe both neural networks and structured Bayesian models. Moreover, recent advances make it possible to extend basic network architectures into more powerful systems, which exploit multiple processing stages to perform learning and inference over hierarchical models, or which exploit delayed recurrent connections to process sequential information. We argue that these advanced network architectures constitute a promising alternative to the more traditional, feed-forward, supervised neural networks, because they more neatly capture the functional and structural organization of cortical circuits, providing a principled way to combine top-down, high-level contextual information with bottom-up, sensory evidence. We provide empirical support justifying the use of these models by studying how efficient implementations of hierarchical and temporal generative networks can extract information from large datasets containing thousands of patterns. In particular, we perform computational simulations of recognition of handwritten and printed characters belonging to different writing scripts, which are successively combined spatially or temporally in order to build more complex orthographic units such as those constituting English words.
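The unsupervised generative networks discussed above are typically built from restricted Boltzmann machines trained with contrastive divergence. A minimal numpy sketch of a single CD-1 update for a binary RBM follows; the layer sizes and learning rate are illustrative and not taken from the thesis.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.01, rng=None):
    """One contrastive-divergence (CD-1) update for a binary RBM.

    v0: batch of visible vectors (batch, n_visible)
    W:  weights (n_visible, n_hidden); b, c: visible and hidden biases.
    """
    rng = rng or np.random.default_rng()
    # Positive phase: infer hidden units from the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one step of Gibbs sampling ("reconstruction").
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    # Gradient approximations and parameter updates.
    batch = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / batch
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```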
5

Yan, Guowei. „Interactive Modeling of Elastic Materials and Splashing Liquids“. The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1593098802306904.

6

Sadok, Samir. „Audiovisual speech representation learning applied to emotion recognition“. Electronic Thesis or Diss., CentraleSupélec, 2024. http://www.theses.fr/2024CSUP0003.

Annotation:
Emotions are vital in our daily lives, becoming a primary focus of ongoing research. Automatic emotion recognition has gained considerable attention owing to its wide-ranging applications across sectors such as healthcare, education, entertainment, and marketing. This advancement in emotion recognition is pivotal for fostering the development of human-centric artificial intelligence. Supervised emotion recognition systems have significantly improved over traditional machine learning approaches. However, this progress encounters limitations due to the complexity and ambiguous nature of emotions. Acquiring extensive emotionally labeled datasets is costly, time-intensive, and often impractical. Moreover, the subjective nature of emotions results in biased datasets, impacting the learning models' applicability in real-world scenarios. Motivated by how humans learn and conceptualize complex representations from an early age with minimal supervision, this approach demonstrates the effectiveness of leveraging prior experience to adapt to new situations. Unsupervised or self-supervised learning models draw inspiration from this paradigm. Initially, they aim to establish a general representation from unlabeled data, akin to the foundational prior experience in human learning. These representations should adhere to criteria like invariance, interpretability, and effectiveness. Subsequently, these learned representations are applied to downstream tasks with limited labeled data, such as emotion recognition. This mirrors the assimilation of new situations in human learning. In this thesis, we aim to propose unsupervised and self-supervised representation learning methods designed explicitly for multimodal and sequential data and to explore their potential advantages in the context of emotion recognition tasks. The main contributions of this thesis are: 1. developing generative models via unsupervised or self-supervised learning for audiovisual speech representation learning, incorporating joint temporal and multimodal (audiovisual) modeling; 2. structuring the latent space to enable disentangled representations, enhancing interpretability by controlling human-interpretable latent factors; 3. validating the effectiveness of our approaches through both qualitative and quantitative analyses, in particular on the emotion recognition task. Our methods facilitate signal analysis, transformation, and generation.
7

Luc, Pauline. „Apprentissage autosupervisé de modèles prédictifs de segmentation à partir de vidéos“. Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAM024/document.

Annotation:
Predictive models of the environment hold promise for allowing the transfer of recent reinforcement learning successes to many real-world contexts, by decreasing the number of interactions needed with the real world.Video prediction has been studied in recent years as a particular case of such predictive models, with broad applications in robotics and navigation systems.While RGB frames are easy to acquire and hold a lot of information, they are extremely challenging to predict, and cannot be directly interpreted by downstream applications.Here we introduce the novel tasks of predicting semantic and instance segmentation of future frames.The abstract feature spaces we consider are better suited for recursive prediction and allow us to develop models which convincingly predict segmentations up to half a second into the future.Predictions are more easily interpretable by downstream algorithms and remain rich, spatially detailed and easy to obtain, relying on state-of-the-art segmentation methods.We first focus on the task of semantic segmentation, for which we propose a discriminative approach based on adversarial training.Then, we introduce the novel task of predicting future semantic segmentation, and develop an autoregressive convolutional neural network to address it.Finally, we extend our method to the more challenging problem of predicting future instance segmentation, which additionally segments out individual objects.To deal with a varying number of output labels per image, we develop a predictive model in the space of high-level convolutional image features of the Mask R-CNN instance segmentation model.We are able to produce visually pleasing segmentations at a high resolution for complex scenes involving a large number of instances, and with convincing accuracy up to half a second ahead
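The core loop behind predicting segmentations up to half a second into the future is an autoregressive rollout in a feature space, where each predicted feature map is fed back as input. The sketch below is purely schematic: the placeholder predictor and feature shapes stand in for the trained convolutional model and Mask R-CNN features described above.

```python
import numpy as np

def rollout(step_fn, context, horizon):
    # Autoregressive prediction in feature space: each predicted feature map is
    # appended to the context and reused as input for the next step.
    feats = list(context)
    preds = []
    for _ in range(horizon):
        nxt = step_fn(np.stack(feats[-len(context):]))
        preds.append(nxt)
        feats.append(nxt)
    return preds

# Toy stand-in for a trained predictor: the mean of the context feature maps.
predictor = lambda ctx: ctx.mean(axis=0)
context = [np.random.rand(256, 32, 32) for _ in range(4)]  # placeholder high-level CNN features
future = rollout(predictor, context, horizon=3)
print(len(future), future[0].shape)
```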
8

Ionascu, Beatrice. „Modelling user interaction at scale with deep generative methods“. Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239333.

Annotation:
Understanding how users interact with a company's service is essential for data-driven businesses that want to better cater to their users and improve their offering. By using a generative machine learning approach it is possible to model user behaviour and generate new data to simulate, or to recognize and explain, typical usage patterns. In this work we introduce an approach for modelling users' interaction behaviour at scale in a client-service model. We propose a novel representation of multivariate time-series data as time pictures that express temporal correlations through spatial organization. This representation exhibits two key properties that convolutional networks are built to exploit, and it allows us to develop an approach based on deep generative models that use convolutional networks as a backbone. By introducing this approach to feature learning for time-series data, we expand the application of convolutional neural networks in the multivariate time-series domain, and specifically to user interaction data. We adopt a variational approach inspired by the β-VAE framework in order to learn hidden factors that define different user behaviour patterns. We explore different values of the regularization parameter β and show that it is possible to construct a model that learns a latent representation of identifiable and distinct user behaviours. We show on real-world data that the model generates realistic samples that capture the true population-level statistics of the interaction behaviour data, learns different user behaviours, and provides accurate imputations of missing data.
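The β-VAE framework referenced above reweights the KL regularizer of the variational objective by a factor β to encourage disentangled latent factors. A minimal numpy sketch of that objective is given below, assuming a Gaussian reconstruction term and a diagonal Gaussian posterior; all shapes and the value of β are illustrative.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Per-batch beta-VAE objective: reconstruction error + beta * KL term.

    mu, logvar parameterize the diagonal Gaussian q(z|x); the KL divergence is
    taken against a standard normal prior. beta > 1 encourages disentanglement.
    """
    recon = ((x - x_recon) ** 2).sum(axis=-1)  # Gaussian reconstruction term (up to constants)
    kl = 0.5 * (np.exp(logvar) + mu ** 2 - 1.0 - logvar).sum(axis=-1)
    return (recon + beta * kl).mean()

# Toy usage with random stand-ins for encoder/decoder outputs.
rng = np.random.default_rng(0)
x, x_hat = rng.random((8, 784)), rng.random((8, 784))
mu, logvar = rng.normal(size=(8, 10)), rng.normal(size=(8, 10))
print(beta_vae_loss(x, x_hat, mu, logvar, beta=4.0))
```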
9

McClintick, Kyle W. „Training Data Generation Framework For Machine-Learning Based Classifiers“. Digital WPI, 2018. https://digitalcommons.wpi.edu/etd-theses/1276.

Annotation:
In this thesis, we propose a new framework for the generation of training data for machine learning techniques used for classification in communications applications. Machine learning-based signal classifiers do not generalize well when training data does not describe the underlying probability distribution of real signals. The simplest way to accomplish statistical similarity between training and testing data is to synthesize training data passed through a permutation of plausible forms of noise. To accomplish this, a framework is proposed that implements arbitrary channel conditions and baseband signals. A dataset generated using the framework is considered, and is shown to be appropriately sized by having 11% lower entropy than state-of-the-art datasets. Furthermore, unsupervised domain adaptation can allow for powerful generalized training via deep feature transforms on unlabeled evaluation-time signals. A novel Deep Reconstruction-Classification Network (DRCN) application is introduced, which attempts to maintain near-peak signal classification accuracy despite dataset bias, or perturbations on testing data unforeseen in training. Together, feature transforms and diverse training data generated from the proposed framework, covering a range of plausible noise, can train a deep neural net to classify signals well in many real-world scenarios despite unforeseen perturbations.
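The framework above synthesizes training signals under permutations of plausible channel impairments. The sketch below illustrates the idea for a single impairment, additive white Gaussian noise applied to QPSK bursts at randomly drawn SNRs; the modulation, SNR range, and burst length are illustrative and not the framework's actual configuration.

```python
import numpy as np

def awgn(signal, snr_db, rng):
    # Add white Gaussian noise at the requested SNR (in dB) to a complex baseband signal.
    power = np.mean(np.abs(signal) ** 2)
    noise_power = power / (10 ** (snr_db / 10.0))
    noise = np.sqrt(noise_power / 2) * (rng.standard_normal(signal.shape)
                                        + 1j * rng.standard_normal(signal.shape))
    return signal + noise

def make_dataset(n_examples, n_samples=128, rng=None):
    # Synthesize QPSK bursts passed through randomly drawn noise conditions.
    rng = rng or np.random.default_rng()
    symbols = np.exp(1j * (np.pi / 4 + np.pi / 2 * rng.integers(0, 4, (n_examples, n_samples))))
    snrs = rng.uniform(0, 20, n_examples)  # vary the noise level per example
    return np.stack([awgn(s, snr, rng) for s, snr in zip(symbols, snrs)]), snrs

X, snrs = make_dataset(1000)
print(X.shape)  # (1000, 128) complex baseband samples
```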
10

Fang, Zhufeng. „USING GEOSTATISTICS, PEDOTRANSFER FUNCTIONS TO GENERATE 3D SOIL AND HYDRAULIC PROPERTY DISTRIBUTIONS FOR DEEP VADOSE ZONE FLOW SIMULATIONS“. Thesis, The University of Arizona, 2009. http://hdl.handle.net/10150/193439.

Annotation:
We use geostatistics and pedotransfer functions to estimate the three-dimensional distributions of soil types and hydraulic properties in a relatively large volume of the vadose zone underlying the Maricopa Agriculture Center near Phoenix, Arizona. Soil texture and bulk density data from the site are analyzed geostatistically to reveal the underlying stratigraphy as well as finer features of their three-dimensional spatial variability. Such fine features are revealed by cokriging soil texture and water content measured prior to large-scale, long-term infiltration experiments. Resultant estimates of soil texture and bulk density across the site are then used as input into a pedotransfer function to produce estimates of soil hydraulic parameter distributions (saturated and residual water content θs and θr, saturated hydraulic conductivity Ks, and van Genuchten parameters α and n) across the site in three dimensions. We compare these estimates with laboratory-measured values of the same hydraulic parameters and find that the estimated parameters match the measurements well for θs, n and Ks but not for θr or α, while some measured extreme values are not captured. Finally, the estimated soil hydraulic parameters are put into a numerical simulator to test the reliability of the models. Resultant simulated water contents do not agree well with those observed, indicating that inverse calibration is required to improve the modeling performance. The results of this research are consistent with previous work by Wang et al. (2003). This research also fills gaps in Wang's work by generating 3-D heterogeneous fields of soil texture and bulk density through cokriging and by providing comparisons between estimated and measured soil hydraulic parameters using new field and laboratory measurements of water retention datasets.
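The geostatistical analysis described above rests on variogram modelling of the texture and bulk density data. Below is a minimal sketch of the empirical semivariogram computation that such an analysis starts from; the lag spacing, tolerance, and the random "measurements" are illustrative, not site data.

```python
import numpy as np

def empirical_semivariogram(coords, values, lags, tol):
    """gamma(h): average of 0.5*(z_i - z_j)^2 over point pairs separated by ~h."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    upper = np.triu(np.ones_like(d, dtype=bool), k=1)  # count each pair once
    sq = 0.5 * (values[:, None] - values[None, :]) ** 2
    gamma = []
    for h in lags:
        pairs = upper & (np.abs(d - h) <= tol)
        gamma.append(sq[pairs].mean() if pairs.any() else np.nan)
    return np.array(gamma)

# Toy usage: sand-fraction "measurements" at random 3D locations.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(200, 3))
values = rng.normal(30, 5, size=200)
print(empirical_semivariogram(coords, values, lags=np.arange(5, 50, 5), tol=2.5))
```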
11

Marin-Moreno, Hector. „Numerical modelling of overpressure generation in deep basins and response of Arctic gas hydrate to ocean warming“. Thesis, University of Southampton, 2014. https://eprints.soton.ac.uk/364170/.

Annotation:
This thesis is split into the two scientific topics studied: overpressure development in deep basins, and present-day and future gas hydrate dissociation in the Arctic. Locating and quantifying overpressure is essential to understand basin evolution and hydrocarbon migration in deep basins and thickly sedimented continental margins. The first part of this thesis develops two new methods, including an inverse model, to impose seismic and geological constraints on models of overpressure generated by the disequilibrium compaction and aquathermal expansion mechanisms. The results provide greater understanding of a low velocity zone (LVZ), inferred from wide-angle seismic data, in the centre of the Eastern Black Sea Basin (EBSB). The application of both methods in the study area indicates that the LVZ, located within the Maikop formation at ~3500-6500 m below the seabed (mbsf), is linked to overpressure generated mainly by disequilibrium compaction.
12

He, Sheng. „Thermal History and Deep Overpressure Modelling in the Northern Carnarvon Basin, North West Shelf, Australia“. Thesis, Curtin University, 2002. http://hdl.handle.net/20.500.11937/1292.

Annotation:
The Northern Carnarvon Basin is the richest petroleum province in Australia. About 50 gas/condensate and oil fields, associated mainly with Jurassic source rocks, have been discovered in the sub-basins and on the Rankin Platform since 1964. The basin is located at the southern end of the North West Shelf of Australia. It can be mainly subdivided into the Exmouth, Barrow, Dampier and Beagle Sub-basins, the Rankin Platform and Exmouth Plateau. The sub-basins are rift-related grabens and half-grabens developed during the Jurassic to the earliest Cretaceous and contain over 10 kilometres of Mesozoic and Cainozoic sedimentary rocks, among which are several thousand metres of Jurassic rocks. The formations of the Jurassic and the lower part of the Barrow Group of Early Cretaceous age in the sub-basins of the Northern Carnarvon Basin were found to be overpressured, with excess pressures of 5-29 MPa at depths of 2900-3600 m indicated by repeat formation tests (RFTs) and drill stem tests (DSTs). The characteristics of organic matter, thermal history and thermal maturity, pressure seal and overpressure evolution in the sub-basins are crucial to a proper understanding of the nature and dynamic processes of hydrocarbon generation and migration in the basin. Based on organic geochemical data, the important source rocks in the basin are Jurassic organic-rich fine-grained rocks including the Murat Siltstone, the rift-related Athol Formation and Dingo Claystone. The Mungaroo Formation of the Middle-Upper Triassic contains gas-generating source rocks. These formations were recognised to be organic rich based on 1256 values of total organic carbon content (TOC, %) from 17 wells. Average TOC values (calculated from samples with TOC < 15 %) are about 2.19 % in the Mungaroo Formation, about 2.09 % in the Murat Siltstone and about 1.74 % in the Athol Formation and Dingo Claystone.
Data from kerogen element analysis, Rock-Eval pyrolysis, visual kerogen composition and some biomarkers have been used to evaluate the kerogen type in the basin. It appears that type III kerogen is the dominant organic-matter type in the Triassic and Jurassic source rocks, while the Dingo Claystone may contain some oil-prone organic matter. The vitrinite reflectance (Ro) data in some wells of the Northern Carnarvon Basin are anomalously low. As a major thermal maturity indicator, the anomalously low Ro data seriously hinder the assessment of thermal maturity in the basin. This study differs from other studies in that it has paid more attention to Rock-Eval Tmax data. Therefore, problems affecting Tmax data in evaluating thermal maturity were investigated. A case study of contaminated Rock-Eval data in Bambra-2 and thermal modelling using Tmax data in 16 wells from different tectonic subdivisions were carried out. The major problems for using Tmax data were found to be contamination by drilling-mud additives and natural bitumen, and suppression due to hydrogen index (HI) > 150 in some wells. Although the data reveal uncertainties and there is about ±3-10 % error in thermal modelling using the proposed relationship of Ro and Tmax, the "reliable" Tmax data are found to be important and useful to assess thermal maturity and reduce the influence of unexpectedly low Ro data.
This study analyzed the characteristics of deep overpressured zones and top pressure seals in detail in 7 wells, based on the observed fluid pressure data and petrophysical data. The deep overpressured system (depth greater than 2650-3000 m) in the Jurassic formations and the lower part of the Barrow Group is shown by the measured fluid pressure data, including RFTs, DSTs and mud weights. The highly overpressured Jurassic fine-grained rocks also exhibit well-log responses of high sonic transit times and low formation resistivities. The deep overpressured zone, however, may not necessarily be caused by anomalously high porosities due to undercompaction. The porosities in the deep overpressured Jurassic rocks may be significantly less than the well-log derived porosities, which may indicate that the sonic log and resistivity log also respond directly to the overpressuring in the deep overpressured fine-grained rocks of the sub-basins. Based on the profiles of fluid pressure and well-log data in 5 wells of the Barrow Sub-basin, a top pressure seal was interpreted to be consistent with the transitional pressure zone in the Barrow Sub-basin. This top pressure seal was observed to consist of a rock layer of 60-80 % claystone and siltstone. The depths of the rock layer range from 2650 m to 3300 m, with thicknesses of 300-500 m and temperatures of 110-135 °C. Based on the well-log data, measured porosity and sandstone diagenesis, the rock layer seems to be well compacted and cemented, with a porosity range of about 2-5 % and calculated permeabilities of about 10⁻¹⁹ to 10⁻²² m².
This study performed thermal history and maturity modelling in 14 wells using the BasinMod 1D software. It was found that the thermal maturity data in 4 wells are consistent with the maturity curves predicted by the rifting heat flow history associated with the tectonic regime of this basin. The maximum heat flows during the rift event of the Jurassic and earliest Cretaceous possibly ranged from 60-70 mW/m² along the sub-basins and 70-80 mW/m² on the southern and central Exmouth Plateau. This study also carried out two case studies of thermal maturity and thermal modelling within the deep overpressured system in the Barrow and Bambra wells of the Barrow Sub-basin. These case studies were aimed at understanding whether overpressure has a determinable influence on thermal maturation in this region. It was found that there is no evidence for overpressure-related retardation of thermal maturity in the deep overpressured system, based on the measured maturity, biomarker maturity parameters and 1D thermal modelling. Therefore, based on the data analysed, overpressure is an insignificant factor in thermal maturity and hydrocarbon generation in this basin.
Three seismic lines in the Exmouth, Barrow and Dampier Sub-basins were selected and converted to depth cross-sections, and then 2D geological models were created for overpressure evolution modelling. A major objective of these 2D geological models was to define the critical faults. A top pressure seal was also detected based on the 2D model of the Barrow Sub-basin. Two-dimensional overpressure modelling was performed using the BasinMod 2D software. The mathematical 2D model takes into consideration compaction, fluid thermal expansion, pressure produced by hydrocarbon generation, and quartz cementation. The sealed overpressured conditions can be modelled with fault sealing, a bottom pressure seal (permeabilities of 10⁻²³ to 10⁻²⁵ m²) and a top pressure seal (permeabilities of 10⁻¹⁹ to 10⁻²² m²). The modelling supports the development of a top pressure seal with quartz cementation. The 2D modelling suggests that rapid sedimentation rates can cause compaction disequilibrium in the fine-grained rocks, which may be a mechanism for overpressure generation during the Jurassic to the Early Cretaceous. The data suggest that the present-day deep overpressure is not associated with the porosity anomaly due to compaction disequilibrium and that compaction may be much less important than recurrent pressure charges, because most of the porosity in the Jurassic source rocks has been lost through compaction and deposition rates have been very slow since the beginning of the Cainozoic.
Three simple 1D models were developed and applied to estimate how rapidly the overpressure dissipates. The results suggest that the present-day overpressure would be almost dissipated after 2 million years with a pressure seal of average permeability 10⁻²² m² (10⁻⁷ mD). Given the numerous accumulations of oil and gas expelled from the overpressured Jurassic source rocks in the basin, and the pressure seal modelling, it seems that a top pressure seal with permeabilities of 10⁻¹⁹ to 10⁻²² m² (10⁻⁴ to 10⁻⁷ mD) is not enough to retain the deep overpressure for tens of millions of years without pressure recharging. Only if the permeabilities were 10⁻²³ m² (10⁻⁸ mD) or less would a long-lived overpressured system be preserved. This study suggests that hydrocarbon generation, especially gas generation and thermal expansion, within sealed conditions of low permeability is a likely major cause for maintaining the deep overpressure over the past tens of millions of years. Keywords: Thermal history; Deep overpressure; Type III kerogen; Rock-Eval Tmax; Thermal maturity; Palaeoheatflow modelling; Pressure seal; 2D deep overpressure modelling; Pressure behaviour modelling; Overpressure generation; Northern Carnarvon Basin.
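The simple 1D dissipation models mentioned above can be illustrated with a characteristic pressure-diffusion timescale, t ≈ L²/D, where D = k/(φμc_t) is the hydraulic diffusivity of the seal. The short sketch below evaluates this for the quoted seal permeability and thickness; the pore-water viscosity and total compressibility are generic assumed values, not data from the thesis, so the result is only an order-of-magnitude check.

```python
# Back-of-the-envelope check of the dissipation timescale quoted above, using a
# characteristic pressure-diffusion time t ~ L^2 / D with D = k / (phi * mu * c_t).
# Viscosity and compressibility are generic assumptions, not thesis data.
k = 1e-22    # seal permeability, m^2 (the ~10^-7 mD case discussed above)
phi = 0.03   # seal porosity (the abstract quotes ~2-5 %)
mu = 1e-3    # pore-water viscosity, Pa.s (assumed)
c_t = 1e-9   # total compressibility, 1/Pa (assumed)
L = 400.0    # seal thickness, m (the abstract quotes 300-500 m)

D = k / (phi * mu * c_t)    # hydraulic diffusivity, m^2/s
t_seconds = L ** 2 / D
print(t_seconds / 3.15e13)  # ~1.5 Myr, broadly consistent with the ~2 Myr figure above
```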
13

Buys, Jan Moolman. „Incremental generative models for syntactic and semantic natural language processing“. Thesis, University of Oxford, 2017. https://ora.ox.ac.uk/objects/uuid:a9a7b5cf-3bb1-4e08-b109-de06bf387d1d.

Annotation:
This thesis investigates the role of linguistically-motivated generative models of syntax and semantic structure in natural language processing (NLP). Syntactic well-formedness is crucial in language generation, but most statistical models do not account for the hierarchical structure of sentences. Many applications exhibiting natural language understanding rely on structured semantic representations to enable querying, inference and reasoning. Yet most semantic parsers produce domain-specific or inadequately expressive representations. We propose a series of generative transition-based models for dependency syntax which can be applied as both parsers and language models while being amenable to supervised or unsupervised learning. Two models are based on Markov assumptions commonly made in NLP: The first is a Bayesian model with hierarchical smoothing, the second is parameterised by feed-forward neural networks. The Bayesian model enables careful analysis of the structure of the conditioning contexts required for generative parsers, but the neural network is more accurate. As a language model the syntactic neural model outperforms both the Bayesian model and n-gram neural networks, pointing to the complementary nature of distributed and structured representations for syntactic prediction. We propose approximate inference methods based on particle filtering. The third model is parameterised by recurrent neural networks (RNNs), dropping the Markov assumptions. Exact inference with dynamic programming is made tractable here by simplifying the structure of the conditioning contexts. We then shift the focus to semantics and propose models for parsing sentences to labelled semantic graphs. We introduce a transition-based parser which incrementally predicts graph nodes (predicates) and edges (arguments). This approach is contrasted against predicting top-down graph traversals. RNNs and pointer networks are key components in approaching graph parsing as an incremental prediction problem. The RNN architecture is augmented to condition the model explicitly on the transition system configuration. We develop a robust parser for Minimal Recursion Semantics, a linguistically-expressive framework for compositional semantics which has previously been parsed only with grammar-based approaches. Our parser is much faster than the grammar-based model, while the same approach improves the accuracy of neural Abstract Meaning Representation parsing.
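The transition-based parsers above build dependency structures through a sequence of shift/reduce decisions. Below is a minimal sketch of one common transition system (arc-standard), not necessarily the exact system used in the thesis, with a pluggable decision function that a Bayesian or neural scorer would provide; the toy policy at the end is purely illustrative.

```python
def parse(words, choose_action):
    """Greedy arc-standard transition-based dependency parsing.

    choose_action(stack, buffer) returns 'SHIFT', 'LEFT', or 'RIGHT';
    in a learned parser this decision comes from the model's scores.
    """
    stack, buffer, arcs = [], list(range(len(words))), []
    while buffer or len(stack) > 1:
        action = choose_action(stack, buffer)
        if action == "SHIFT" and buffer:
            stack.append(buffer.pop(0))
        elif action == "LEFT" and len(stack) >= 2:
            head, dep = stack[-1], stack.pop(-2)  # second-from-top depends on top
            arcs.append((head, dep))
        elif action == "RIGHT" and len(stack) >= 2:
            dep = stack.pop()                     # top depends on second-from-top
            arcs.append((stack[-1], dep))
        else:                                     # invalid action: shift if possible, else stop
            if buffer:
                stack.append(buffer.pop(0))
            else:
                break
    return arcs

# Toy usage with a trivial policy that shifts everything, then attaches rightward.
policy = lambda stack, buffer: "SHIFT" if buffer else "RIGHT"
print(parse(["the", "cat", "sleeps"], policy))  # [(1, 2), (0, 1)]
```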
14

He, Sheng. „Thermal History and Deep Overpressure Modelling in the Northern Carnarvon Basin, North West Shelf, Australia“. Curtin University of Technology, Department of Applied Geology, 2002. http://espace.library.curtin.edu.au:80/R/?func=dbin-jump-full&object_id=11998.

15

Martin, Alice. „Deep learning models and algorithms for sequential data problems : applications to language modelling and uncertainty quantification“. Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAS007.

Annotation:
In this thesis, we develop new models and algorithms to solve deep learning tasks on sequential data, with the aim of tackling the pitfalls of current approaches for learning language models based on neural networks. A first research work develops a new deep generative model for sequential data based on Sequential Monte Carlo methods, which makes it possible to better model diversity in language modelling tasks and to better quantify uncertainty in sequential regression problems. A second research work aims to facilitate the use of SMC techniques within deep learning architectures by developing a new online smoothing algorithm with reduced computational cost that is applicable to a wider class of state-space models, including deep generative models. Finally, a third research work proposes the first reinforcement learning algorithm that makes it possible to learn conditional language models from scratch (i.e. without supervised datasets), based on a mechanism that truncates the natural-language action space with a pretrained language model.
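The Sequential Monte Carlo methods the thesis builds on start from the bootstrap particle filter. Below is a generic numpy sketch of that filter with multinomial resampling; the toy AR(1) state-space model in the usage example stands in for the deep generative models considered in the thesis.

```python
import numpy as np

def bootstrap_particle_filter(observations, init, transition, likelihood,
                              n_particles=500, rng=None):
    """Generic bootstrap particle filter (sequential Monte Carlo).

    init(n) samples initial particles; transition(x) propagates them;
    likelihood(y, x) returns p(y | x) for each particle x.
    """
    rng = rng or np.random.default_rng()
    particles = init(n_particles)
    estimates = []
    for y in observations:
        particles = transition(particles)                     # propagate through the model
        w = likelihood(y, particles)
        w /= w.sum()                                          # normalise importance weights
        idx = rng.choice(n_particles, size=n_particles, p=w)  # multinomial resampling
        particles = particles[idx]
        estimates.append(particles.mean())                    # filtering estimate of the state
    return np.array(estimates)

# Toy usage: a latent AR(1) process observed with Gaussian noise.
rng = np.random.default_rng(0)
x, ys = 0.0, []
for _ in range(50):
    x = 0.9 * x + rng.normal(0, 0.5)
    ys.append(x + rng.normal(0, 1.0))
est = bootstrap_particle_filter(
    ys,
    init=lambda n: rng.normal(0, 1, n),
    transition=lambda p: 0.9 * p + rng.normal(0, 0.5, p.shape),
    likelihood=lambda y, p: np.exp(-0.5 * (y - p) ** 2),
)
print(est[-5:])
```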
16

Devineau, Guillaume. „Deep learning for multivariate time series : from vehicle control to gesture recognition and generation“. Thesis, Université Paris sciences et lettres, 2020. http://www.theses.fr/2020UPSLM037.

Annotation:
Artificial intelligence is the scientific field which studies how to create machines that are capable of intelligent behaviour. Deep learning is a family of artificial intelligence methods based on neural networks. In recent years, deep learning has led to groundbreaking developments in the image and natural language processing fields. However, in many domains, input data consists of neither images nor text documents, but of time series that describe the temporal evolution of observed or computed quantities. In this thesis, we study and introduce different representations for time series, based on deep learning models. Firstly, in the autonomous driving domain, we show that the analysis of a temporal window by a neural network can lead to better vehicle control results than classical approaches that do not use neural networks, especially in highly coupled situations. Secondly, in the gesture and action recognition domain, we introduce 1D parallel convolutional neural network models. In these models, convolutions are performed over the temporal dimension only, in order for the neural network to detect, and benefit from, temporal invariances. Thirdly, in the human pose motion generation domain, we introduce 2D convolutional generative adversarial neural networks where the spatial and temporal dimensions are convolved in a joint manner. Finally, we introduce an embedding where spatial representations of human poses are sorted in a latent space based on their temporal relationships.
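The 1D convolutional models described above convolve over the temporal dimension only, with each filter spanning all input channels within a short time window. A minimal numpy sketch of such a layer follows; the channel counts, kernel size, and the skeleton-like toy input are illustrative, not the architectures from the thesis.

```python
import numpy as np

def temporal_conv1d(x, kernels, bias):
    """Convolve a multivariate time series along the time axis only.

    x:       (T, C_in) time series (e.g. joint coordinates of a hand skeleton)
    kernels: (C_out, K, C_in) filters; bias: (C_out,)
    Returns: (T - K + 1, C_out) feature sequence ("valid" convolution).
    """
    T, _ = x.shape
    c_out, k, _ = kernels.shape
    out = np.zeros((T - k + 1, c_out))
    for t in range(T - k + 1):
        window = x[t:t + k]  # (K, C_in) temporal window
        out[t] = np.tensordot(kernels, window, axes=([1, 2], [0, 1])) + bias
    return np.maximum(out, 0.0)  # ReLU non-linearity

# Toy usage: 100 frames of 66 channels (22 joints x 3 coordinates, illustrative).
rng = np.random.default_rng(0)
series = rng.normal(size=(100, 66))
features = temporal_conv1d(series, rng.normal(size=(16, 7, 66)) * 0.1, np.zeros(16))
print(features.shape)  # (94, 16)
```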
17

Wen, Tsung-Hsien. „Recurrent neural network language generation for dialogue systems“. Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/275648.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
Annotation:
Language is the principal medium for ideas, while dialogue is the most natural and effective way for humans to interact with and access information from machines. Natural language generation (NLG) is a critical component of spoken dialogue and it has a significant impact on usability and perceived quality. Many commonly used NLG systems employ rules and heuristics, which tend to generate inflexible and stylised responses without the natural variation of human language. Moreover, the frequent repetition of identical output forms can quickly make a dialogue tedious for most real-world users. Additionally, these rules and heuristics are not scalable and hence not trivially extensible to other domains or languages. A statistical approach to language generation can learn language decisions directly from data without relying on hand-coded rules or heuristics, which brings scalability and flexibility to NLG. Statistical models also provide an opportunity to learn in-domain human colloquialisms and cross-domain model adaptations. A robust and quasi-supervised NLG model is proposed in this thesis. The model leverages a Recurrent Neural Network (RNN)-based surface realiser and a gating mechanism applied to input semantics, and is motivated by the Long Short-Term Memory (LSTM) network. The RNN-based surface realiser and gating mechanism use a neural network to learn end-to-end language generation decisions from input dialogue act and sentence pairs; the model also integrates sentence planning and surface realisation into a single optimisation problem. This single optimisation not only bypasses the costly intermediate linguistic annotations but also generates more natural and human-like responses. Furthermore, a domain adaptation study shows that the proposed model can be readily adapted and extended to new dialogue domains via a proposed recipe. Continuing the success of end-to-end learning, the second part of the thesis explores building an end-to-end dialogue system by framing it as a conditional generation problem. The proposed model encapsulates a belief tracker with a minimal state representation and a generator that takes the dialogue context to produce responses. These features suggest comprehension and fast learning. The proposed model is capable of understanding requests and accomplishing tasks after training on only a few hundred human-human dialogues. A complementary Wizard-of-Oz data collection method is also introduced to facilitate the collection of human-human conversations from online workers. The results demonstrate that the proposed model can talk to human judges naturally, without any difficulty, for a sample application domain. The results also suggest that the introduction of a stochastic latent variable can help the system model intrinsic variation in communicative intention much better.
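As a rough illustration of gating input semantics into an RNN realiser, here is a deliberately simplified PyTorch sketch (our own toy construction, not the thesis's SC-LSTM equations): a reading gate driven by the hidden state progressively consumes a dialogue-act vector that is concatenated to the word embedding at each step.

```python
import torch
import torch.nn as nn

class GatedSemanticGenerator(nn.Module):
    """Toy surface realiser: an LSTM cell whose input is the previous word embedding
    concatenated with a gated dialogue-act (DA) vector. A reading gate driven by the
    hidden state lets the model progressively 'consume' the DA slots as it generates."""
    def __init__(self, vocab_size, da_size, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.cell = nn.LSTMCell(emb + da_size, hidden)
        self.gate = nn.Linear(hidden, da_size)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, da):                 # tokens: (B, T) ints, da: (B, da_size)
        B, T = tokens.shape
        h = torch.zeros(B, self.cell.hidden_size)
        c = torch.zeros_like(h)
        d = da.clone()
        logits = []
        for t in range(T):
            d = torch.sigmoid(self.gate(h)) * d    # reading gate decays the remaining semantics
            x = torch.cat([self.embed(tokens[:, t]), d], dim=-1)
            h, c = self.cell(x, (h, c))
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)          # (B, T, vocab) word logits

model = GatedSemanticGenerator(vocab_size=1000, da_size=20)
print(model(torch.randint(0, 1000, (4, 12)), torch.rand(4, 20)).shape)
```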
18

Lucas, Thomas. „Modèles génératifs profonds : sur-généralisation et abandon de mode“. Thesis, Université Grenoble Alpes, 2020. http://www.theses.fr/2020GRALM049.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
Annotation:
Cette dissertation explore le sujet des modèles génératifs appliqués aux images naturelles. Cette tâche consiste à modéliser la distribution des données observées, et peut permettre de générer des données artificielles semblables aux données d'origine, ou de compresser des images. Les modèles à variables latentes, qui sont au cœur de cette thèse, cherchent à résumer les principaux facteurs de variation d'une image en une variable qui peut être manipulée. En particulier, nos contributions sont basées sur deux modèles génératifs à variables latentes : le modèle génératif adversarial (GAN) et l'auto-encodeur variationnel (VAE). Récemment, les GAN ont significativement amélioré la qualité des images générées par des modèles profonds, produisant des images très convaincantes. Malheureusement, ces modèles ont du mal à modéliser tous les modes de la distribution d'origine, c'est-à-dire qu'ils ne couvrent pas les données dans toute leur variabilité. À l'inverse, les modèles basés sur le maximum de vraisemblance tels que les VAE couvrent typiquement toute la variabilité des données, et en offrent une mesure objective. Mais ces modèles produisent des échantillons de qualité visuelle inférieure, qui sont plus facilement distingués de vraies images. Le travail présenté dans cette thèse a pour but d'obtenir le meilleur des deux mondes : des échantillons de bonne qualité tout en modélisant tout le support de la distribution. La première contribution de ce manuscrit est un modèle génératif profond qui encode la structure globale des images dans une variable latente, basé sur le VAE, et utilise un modèle autorégressif pour modéliser les détails de bas niveau. Nous proposons une procédure d'entraînement qui utilise une fonction de perte auxiliaire pour contrôler quelle information est capturée par la variable latente et quelle information est laissée à un décodeur autorégressif. Au contraire des précédentes approches pour construire des modèles hybrides de ce genre, notre modèle ne nécessite pas de contraindre la capacité du décodeur autorégressif pour empêcher des modèles dégénérés qui ignorent la variable latente. La deuxième contribution est bâtie sur le modèle du GAN standard, qui utilise un discriminateur pour guider le modèle génératif. Le discriminateur évalue généralement la qualité d'échantillons individuels, ce qui rend difficile l'évaluation de la variabilité des données. À la place, nous proposons de fournir au discriminateur des ensembles de données, ou batches, qui mélangent de vraies images et des images générées. Nous l'entraînons à prédire le ratio de vrais et de faux éléments dans l'ensemble. Ces batches servent d'approximation de la vraie distribution des images générées et permettent au discriminateur d'approximer des statistiques sur leur distribution. Les lacunes mutuelles des VAE et des GAN peuvent, en principe, être réglées en entraînant des modèles hybrides qui utilisent les deux types d'objectif. Dans notre troisième contribution, nous montrons que les hypothèses paramétriques habituelles faites par les VAE produisent un conflit entre les deux, menant à des performances décevantes pour les modèles hybrides. Nous proposons une solution basée sur des modèles profonds inversibles, qui entraîne un espace de caractéristiques dans lequel les hypothèses habituelles peuvent être faites sans poser problème. Notre approche fournit des évaluations de vraisemblance dans l'espace des images tout en étant capable de tirer profit de l'entraînement adversaire. Elle obtient des échantillons de qualité équivalente aux modèles pleinement adversariaux, tout en améliorant les scores de maximum de vraisemblance au moment de la publication, ce qui constitue une amélioration significative.
This dissertation explores the topic of generative modelling of natural images, which is the task of fitting a data generating distribution. Such models can be used to generate artificial data resembling the true data, or to compress images. Latent variable models, which are at the core of our contributions, seek to capture the main factors of variation of an image into a variable that can be manipulated. In particular we build on two successful latent variable generative models, the generative adversarial network (GAN) and variational autoencoder (VAE). Recently GANs significantly improved the quality of images generated by deep models, obtaining very compelling samples. Unfortunately these models struggle to capture all the modes of the original distribution, i.e. they do not cover the full variability of the dataset. Conversely, likelihood-based models such as VAEs typically cover the full variety of the data well and provide an objective measure of coverage. However these models produce samples of inferior visual quality that are more easily distinguished from real ones. The work presented in this thesis strives for the best of both worlds: to obtain compelling samples while modelling the full support of the distribution. To achieve that, we focus on i) the optimisation problems used and ii) practical model limitations that hinder performance. The first contribution of this manuscript is a deep generative model that encodes global image structure into latent variables, built on the VAE, and autoregressively models low-level detail. We propose a training procedure relying on an auxiliary loss function to control what information is captured by the latent variables and what information is left to an autoregressive decoder. Unlike previous approaches to such hybrid models, ours does not need to restrict the capacity of the autoregressive decoder to prevent degenerate models that ignore the latent variables. The second contribution builds on the standard GAN model, which trains a discriminator network to provide feedback to a generative network. The discriminator usually assesses the quality of individual samples, which makes it hard to evaluate the variability of the data. Instead we propose to feed the discriminator with batches that mix both true and fake samples, and train it to predict the ratio of true samples in the batch. These batches work as approximations of the distribution of generated images and allow the discriminator to approximate distributional statistics. We introduce an architecture that is well suited to solve this problem efficiently, and show experimentally that our approach reduces mode collapse in GANs on two synthetic datasets, and obtains good results on the CIFAR10 and CelebA datasets. The mutual shortcomings of VAEs and GANs can in principle be addressed by training hybrid models that use both types of objective. In our third contribution, we show that usual parametric assumptions made in VAEs induce a conflict between them, leading to lackluster performance of hybrid models. We propose a solution based on deep invertible transformations, that trains a feature space in which usual assumptions can be made without harm. Our approach provides likelihood computations in image space while being able to take advantage of adversarial training. It obtains GAN-like samples that are competitive with fully adversarial models while improving likelihood scores over existing hybrid models at the time of publication, which is a significant advancement.
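The batch-level discriminator of the second contribution can be sketched as follows: a permutation-invariant network pools per-sample features over a mixed batch and regresses the fraction of real samples it contains. The pooling, loss and layer sizes below are illustrative choices, not the architecture proposed in the thesis.

```python
import torch
import torch.nn as nn

class BatchRatioDiscriminator(nn.Module):
    """Scores a whole batch: per-sample features are averaged (a permutation-invariant
    pooling) and mapped to the predicted fraction of real samples in the batch."""
    def __init__(self, dim, hidden=128):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.head = nn.Sequential(nn.Linear(hidden, 1), nn.Sigmoid())

    def forward(self, batch):                      # batch: (N, dim)
        return self.head(self.encode(batch).mean(dim=0)).squeeze()

def ratio_loss(disc, real, fake, rho):
    """Mix round(rho * N) real samples with fake ones and regress the true ratio."""
    n = real.shape[0]
    k = int(round(rho * n))
    mixed = torch.cat([real[:k], fake[: n - k]], dim=0)
    mixed = mixed[torch.randperm(n)]               # shuffle so order carries no signal
    return (disc(mixed) - rho) ** 2

disc = BatchRatioDiscriminator(dim=32)
real, fake = torch.randn(64, 32), torch.randn(64, 32)
print(ratio_loss(disc, real, fake, rho=0.5).item())
```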
19

Rafael-Palou, Xavier. „Detection, quantification, malignancy prediction and growth forecasting of pulmonary nodules using deep learning in follow-up CT scans“. Doctoral thesis, Universitat Pompeu Fabra, 2021. http://hdl.handle.net/10803/672964.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
Annotation:
Nowadays, lung cancer assessment is a complex and tedious task mainly performed by radiological visual inspection of suspicious pulmonary nodules, using computed tomography (CT) scan images taken from patients over time. Several computational tools relying on conventional artificial intelligence and computer vision algorithms have been proposed for supporting lung cancer detection and classification. These solutions mostly rely on the analysis of individual lung CT images of patients and on the use of hand-crafted image descriptors. Unfortunately, this makes them unable to cope with the complexity and variability of the problem. Recently, the advent of deep learning has led to a major breakthrough in the medical image domain, outperforming conventional approaches. Despite recent promising achievements in nodule detection, segmentation, and lung cancer classification, radiologists are still reluctant to use these solutions in their day-to-day clinical practice. One of the main reasons is that current solutions do not provide support for automatic analysis of the temporal evolution of lung tumours. The difficulty of collecting and annotating longitudinal lung CT cases to train models may partially explain the lack of deep learning studies that address this issue. In this dissertation, we investigate how to automatically provide lung cancer assessment through deep learning algorithms and computer vision pipelines, especially taking into consideration the temporal evolution of the pulmonary nodules. To this end, our first goal consisted of obtaining accurate methods for lung cancer assessment (diagnostic ground truth) based on individual lung CT images. Since these types of labels are expensive and difficult to collect (e.g. usually after biopsy), we proposed to train different deep learning models, based on 3D convolutional neural networks (CNN), to predict nodule malignancy from radiologist visual inspection annotations (which are reasonably easy to obtain). These classifiers were built based on ground truth consisting of the nodule malignancy, the position and the size of the nodules to classify. Next, we evaluated different ways of synthesizing the knowledge embedded in the nodule malignancy neural network into an end-to-end pipeline aimed at detecting pulmonary nodules and predicting lung cancer at the patient level, given a lung CT image. The positive results confirmed the suitability of using CNNs for modelling nodule malignancy, according to radiologists, for the automatic prediction of lung cancer. Next, we focused on the analysis of lung CT image series. We first faced the problem of automatically re-identifying pulmonary nodules from different lung CT scans of the same patient. To do this, we present a novel method based on a Siamese neural network (SNN) to rank similarity between nodules, bypassing the need for image registration. This change of paradigm avoided introducing potentially erroneous image deformations and provided computationally faster results. Different configurations of the SNN were examined, including the application of transfer learning, the use of different loss functions, and the combination of several feature maps from different network levels. This method obtained state-of-the-art performance for nodule matching, both in an isolated manner and embedded in an end-to-end nodule growth detection pipeline. Afterwards, we moved to the core problem of supporting radiologists in the longitudinal management of lung cancer. For this purpose, we created a novel end-to-end deep learning pipeline, composed of four stages that completely automate the process from the detection of nodules to the classification of cancer, through the detection of nodule growth. In addition, the pipeline integrated a novel approach for nodule growth detection, which relies on a recent hierarchical probabilistic segmentation network adapted to report uncertainty estimates. A second novel method was also introduced for lung cancer nodule classification, integrating into a two-stream 3D-CNN the estimated nodule malignancy probabilities derived from a pre-trained nodule malignancy network. The pipeline was evaluated on a longitudinal cohort and the reported outcomes (i.e. nodule detection, re-identification, growth quantification, and malignancy prediction) were comparable with state-of-the-art work focused on solving one or a few of the functionalities of our pipeline. Thereafter, we also investigated how to help clinicians prescribe more accurate tumour treatments and surgical plans. Thus, we created a novel method to forecast nodule growth given a single image of the nodule. In particular, the method relied on a hierarchical, probabilistic and generative deep neural network able to produce multiple consistent future segmentations of the nodule at a given time. To do this, the network learned to model the multimodal posterior distribution of future lung tumour segmentations by using variational inference and injecting the posterior latent features. Finally, by applying Monte-Carlo sampling to the outputs of the trained network, we estimated the expected tumour growth and the uncertainty associated with the prediction. Although further evaluation in a larger cohort would be highly recommended, the proposed methods reported results accurate enough to adequately support the radiological workflow of pulmonary nodule follow-up. Beyond this specific application, the outlined innovations, such as the methods for integrating CNNs into computer vision pipelines, the re-identification of suspicious regions over time based on SNNs without the need to warp the inherent image structure, or the proposed deep generative and probabilistic network that models tumour growth considering ambiguous images and label uncertainty, could be easily applied to other types of cancer (e.g. pancreas), clinical diseases (e.g. Covid-19) or medical applications (e.g. therapy follow-up).
Avui en dia, l’avaluació del càncer de pulmó és una tasca complexa i tediosa, principalment realitzada per inspecció visual radiològica de nòduls pulmonars sospitosos, mitjançant imatges de tomografia computada (TC) preses als pacients al llarg del temps. Actualment, existeixen diverses eines computacionals basades en intel·ligència artificial i algorismes de visió per computador per donar suport a la detecció i classificació del càncer de pulmó. Aquestes solucions es basen majoritàriament en l’anàlisi d’imatges individuals de TC pulmonar dels pacients i en l’ús de descriptors d’imatges fets a mà. Malauradament, això les fa incapaces d’afrontar completament la complexitat i la variabilitat del problema. Recentment, l’aparició de l’aprenentatge profund ha permès un gran avenç en el camp de la imatge mèdica. Malgrat els prometedors assoliments en detecció de nòduls, segmentació i classificació del càncer de pulmó, els radiòlegs encara són reticents a utilitzar aquestes solucions en el seu dia a dia. Un dels principals motius és que les solucions actuals no proporcionen suport automàtic per analitzar l’evolució temporal dels tumors pulmonars. La dificultat de recopilar i anotar cohorts longitudinals de TC pulmonar pot explicar la manca de treballs d’aprenentatge profund que aborden aquest problema. En aquesta tesi investiguem com abordar el suport automàtic a l’avaluació del càncer de pulmó, construint algoritmes d’aprenentatge profund i pipelines de visió per ordinador que, especialment, tenen en compte l’evolució temporal dels nòduls pulmonars. Així doncs, el nostre primer objectiu va consistir a obtenir mètodes precisos per a l’avaluació del càncer de pulmó basats en imatges de CT pulmonar individuals. Atès que aquests tipus d’etiquetes són costoses i difícils d’obtenir (per exemple, després d’una biòpsia), vam dissenyar diferents xarxes neuronals profundes, basades en xarxes de convolució 3D (CNN), per predir la malignitat dels nòduls basada en la inspecció visual dels radiòlegs (més senzilles de recol·lectar). A continuació, vàrem avaluar diferents maneres de sintetitzar aquest coneixement representat en la xarxa neuronal de malignitat, en una pipeline destinada a proporcionar predicció del càncer de pulmó a nivell de pacient, donada una imatge de TC pulmonar. Els resultats positius van confirmar la conveniència d’utilitzar CNN per modelar la malignitat dels nòduls, segons els radiòlegs, per a la predicció automàtica del càncer de pulmó. Seguidament, vam dirigir la nostra investigació cap a l’anàlisi de sèries d’imatges de TC pulmonar. Per tant, ens vam enfrontar primer a la reidentificació automàtica de nòduls pulmonars de diferents tomografies pulmonars. Per fer-ho, vam proposar utilitzar xarxes neuronals siameses (SNN) per classificar la similitud entre nòduls, superant la necessitat de registre d’imatges. Aquest canvi de paradigma va evitar possibles pertorbacions de la imatge i va proporcionar resultats computacionalment més ràpids. Es van examinar diferents configuracions del SNN convencional, que van des de l’aplicació de l’aprenentatge de transferència, utilitzant diferents funcions de pèrdua, fins a la combinació de diversos mapes de característiques de diferents nivells de xarxa. Aquest mètode va obtenir resultats d’estat de la tècnica per reidentificar nòduls de manera aïllada, i de forma integrada en una pipeline per a la quantificació de creixement de nòduls. A més, vam abordar el problema de donar suport als radiòlegs en la gestió longitudinal del càncer de pulmó.
Amb aquesta finalitat, vam proposar una nova pipeline d’aprenentatge profund, composta de quatre etapes que s’automatitzen completament i que van des de la detecció de nòduls fins a la classificació del càncer, passant per la detecció del creixement dels nòduls. A més, la pipeline va integrar un nou enfocament per a la detecció del creixement dels nòduls, que es basava en una recent xarxa de segmentació probabilística jeràrquica adaptada per informar estimacions d’incertesa. A més, es va introduir un segon mètode per a la classificació dels nòduls del càncer de pulmó, que integrava en una xarxa 3D-CNN de dos fluxos les probabilitats estimades de malignitat dels nòduls derivades de la xarxa pre-entrenada de malignitat dels nòduls. La pipeline es va avaluar en una cohort longitudinal i va informar rendiments comparables a l’estat de la tècnica utilitzats individualment o en pipelines però amb menys components que la proposada. Finalment, també vam investigar com ajudar els metges a prescriure de forma més acurada tractaments tumorals i planificacions quirúrgiques més precises. Amb aquesta finalitat, hem realitzat un nou mètode per predir el creixement dels nòduls donada una única imatge del nòdul. Particularment, el mètode es basa en una xarxa neuronal profunda jeràrquica, probabilística i generativa capaç de produir múltiples segmentacions de nòduls futurs consistents del nòdul en un moment determinat. Per fer-ho, la xarxa aprèn a modelar la distribució posterior multimodal de futures segmentacions de tumors pulmonars mitjançant la utilització d’inferència variacional i la injecció de les característiques latents posteriors. Finalment, aplicant el mostreig de Monte-Carlo a les sortides de la xarxa, podem estimar la mitjana de creixement del tumor i la incertesa associada a la predicció. Tot i que es recomanable una avaluació posterior en una cohort més gran, els mètodes proposats en aquest treball han informat resultats prou precisos per donar suport adequadament al flux de treball radiològic del seguiment dels nòduls pulmonars. Més enllà d’aquesta aplicació especifica, les innovacions presentades com, per exemple, els mètodes per integrar les xarxes CNN a pipelines de visió per ordinador, la reidentificació de regions sospitoses al llarg del temps basades en SNN, sense la necessitat de deformar l’estructura de la imatge inherent o la xarxa probabilística per modelar el creixement del tumor tenint en compte imatges ambigües i la incertesa en les prediccions, podrien ser fàcilment aplicables a altres tipus de càncer (per exemple, pàncrees), malalties clíniques (per exemple, Covid-19) o aplicacions mèdiques (per exemple, seguiment de la teràpia).
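The Monte-Carlo step at the end of the growth-forecasting method can be illustrated with a short NumPy sketch: repeated stochastic forward passes of a probabilistic segmentation network (replaced here by a toy random-sphere sampler) give a distribution of future volumes, from which an expected growth and its uncertainty are read off. The voxel size, sampler and all numbers are assumptions for illustration only.

```python
import numpy as np

def mc_growth_estimate(sample_segmentation, current_volume_mm3, n_samples=50, voxel_mm3=1.0):
    """Monte-Carlo estimate of expected tumour growth and its uncertainty.
    `sample_segmentation()` stands in for one stochastic forward pass of a
    probabilistic segmentation network, returning a binary 3D mask."""
    volumes = np.array([sample_segmentation().sum() * voxel_mm3 for _ in range(n_samples)])
    growth = volumes - current_volume_mm3
    return growth.mean(), growth.std()

# Toy stand-in for the network: random spheres of varying radius.
rng = np.random.default_rng(0)
def fake_sampler(shape=(32, 32, 32)):
    z, y, x = np.ogrid[:shape[0], :shape[1], :shape[2]]
    r = rng.uniform(6, 9)
    return ((z - 16) ** 2 + (y - 16) ** 2 + (x - 16) ** 2 <= r ** 2).astype(np.uint8)

mean_growth, growth_sd = mc_growth_estimate(fake_sampler, current_volume_mm3=900.0)
print(f"expected growth: {mean_growth:.0f} mm^3 +/- {growth_sd:.0f}")
```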
20

Risso, Davide. „Simultaneous inference for RNA-Seq data“. Doctoral thesis, Università degli studi di Padova, 2012. http://hdl.handle.net/11577/3421731.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
Annotation:
In the last few years, RNA-Seq has become a popular choice for high-throughput studies of gene expression, revealing its potential to overcome microarrays and become the new standard for transcriptional profiling. At a gene-level, RNA-Seq yields counts rather than continuous measures of expression, leading to the need for novel methods to deal with count data in high-dimensional problems. In this Thesis, we aim at shedding light on the problems related to the exploration and modeling of RNA-Seq data. In particular, we introduce simple and effective ways to summarize and visualize the data; we define a novel algorithm for the clustering of RNA-Seq data and we implement simple normalization strategies to deal with technology-related biases. Finally, we present a hierarchical Bayesian approach to the modeling of RNA-Seq data. The model accounts for the difference in sequencing depth, as well as for overdispersion, automatically accounting for different types of normalization.
Negli ultimi anni il sequenziamento massivo di RNA (RNA-Seq) è diventato una scelta frequente per gli studi di espressione genica. Questa tecnica ha il potenziale di superare i microarray come tecnica standard per lo studio dei profili trascrizionali. A livello genico, i dati di RNA-Seq si presentano sotto forma di conteggi, al contrario dei microarray che stimano l’espressione su una scala continua. Questo porta alla necessità di sviluppare nuovi metodi e modelli per l'analisi di dati di conteggio in problemi con dimensionalità elevata. In questa tesi verranno affrontati alcuni problemi relativi all'esplorazione e alla modellazione dei dati di RNA-Seq. In particolare, verranno introdotti metodi per la visualizzazione e il riassunto numerico dei dati. Inoltre si definirà un nuovo algoritmo per il raggruppamento dei dati e alcune strategie per la normalizzazione, volte a eliminare le distorsioni specifiche di questa tecnologia. Infine, verrà definito un modello gerarchico Bayesiano per modellare l'espressione di dati RNA-Seq e verificarne le eventuali differenze in diverse condizioni sperimentali. Il modello tiene in considerazione la profondità di sequenziamento e la sovra-dispersione e automaticamente sviluppa diversi tipi di normalizzazione.
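The key modelling ingredients mentioned in the abstract, count data with a sequencing-depth offset and overdispersion, can be illustrated with a generic negative-binomial log-likelihood; this sketch is not the hierarchical Bayesian model of the thesis, and the example values are invented.

```python
import numpy as np
from scipy import stats

def nb_loglik(counts, depth, mu, phi):
    """Log-likelihood of gene counts under a negative binomial with mean
    m = depth * mu (sequencing-depth offset) and overdispersion phi, so that
    Var = m + phi * m**2. Mapped to scipy's (n, p) parameterisation."""
    m = depth * mu
    n = 1.0 / phi          # NB 'size' parameter
    p = n / (n + m)
    return stats.nbinom.logpmf(counts, n, p).sum()

counts = np.array([12, 30, 7, 55])        # one gene across 4 samples (toy values)
depth = np.array([1.0, 2.1, 0.8, 3.0])    # relative sequencing depths
print(nb_loglik(counts, depth, mu=14.0, phi=0.2))
```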
21

Loaiza, Ganem Gabriel. „Advances in Deep Generative Modeling With Applications to Image Generation and Neuroscience“. Thesis, 2019. https://doi.org/10.7916/d8-yp88-e002.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
Annotation:
Deep generative modeling is an increasingly popular area of machine learning that takes advantage of recent developments in neural networks in order to estimate the distribution of observed data. In this dissertation we introduce three advances in this area. The first one, Maximum Entropy Flow Networks, enables maximum entropy modeling by combining normalizing flows with the augmented Lagrangian optimization method. The second one is the continuous Bernoulli, a new [0,1]-supported distribution which we introduce with the motivation of fixing the pervasive error in variational autoencoders of using a Bernoulli likelihood for non-binary data. The last one, Deep Random Splines, is a novel distribution over functions, where samples are obtained by sampling Gaussian noise and transforming it through a neural network to obtain the parameters of a spline. We apply these to model texture images, natural images and neural population data, respectively, and observe significant improvements over current state-of-the-art alternatives.
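The continuous Bernoulli has a closed-form normalising constant, so its log-density can be written down directly; the sketch below does so, with our own numerical handling of the λ ≈ 0.5 limit.

```python
import numpy as np

def log_cont_bernoulli(x, lam, eps=1e-6):
    """Log-density of the continuous Bernoulli CB(lam) on [0, 1]:
       p(x | lam) = C(lam) * lam**x * (1 - lam)**(1 - x),
       C(lam) = 2 * artanh(1 - 2*lam) / (1 - 2*lam), with C(0.5) = 2."""
    lam = np.clip(lam, eps, 1 - eps)
    log_unnorm = x * np.log(lam) + (1 - x) * np.log1p(-lam)
    near_half = np.abs(lam - 0.5) < 1e-3
    # Use the lam -> 0.5 limit (C = 2) near 0.5 to avoid 0/0; exact formula elsewhere.
    log_c = np.where(
        near_half,
        np.log(2.0),
        np.log(np.abs(2.0 * np.arctanh(1.0 - 2.0 * lam))) - np.log(np.abs(1.0 - 2.0 * lam)),
    )
    return log_unnorm + log_c

x = np.array([0.1, 0.5, 0.9])
print(log_cont_bernoulli(x, lam=0.3))
# Sanity check: the density integrates to ~1 over [0, 1].
print(np.exp(log_cont_bernoulli(np.linspace(0, 1, 2001), 0.3)).mean())
```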
22

Yahi, Alexandre. „Simulating drug responses in laboratory test time series with deep generative modeling“. Thesis, 2019. https://doi.org/10.7916/d8-arta-jt32.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
Annotation:
Drug effects can be unpredictable and vary widely among patients depending on environmental, genetic, and clinical factors. Randomized controlled trials (RCTs) are not sufficient to identify adverse drug reactions (ADRs), and the electronic health record (EHR) along with medical claims have become an important resource for pharmacovigilance. Among all the data collected in hospitals, laboratory tests represent the most documented and reliable data type in the EHR. Laboratory tests are at the core of the clinical decision process and are used for diagnosis, monitoring, screening, and research by physicians. They can be linked to drug effects either directly, with therapeutic drug monitoring (TDM), or indirectly using drug laboratory effects (DLEs) that affect surrogate tests. Unfortunately, very few automated methods use laboratory tests to inform clinical decision making and predict drug effects, partly due to the complexity of these time series, which are irregularly sampled, highly dependent on other clinical covariates, and non-stationary. Deep learning, the branch of machine learning that relies on high-capacity artificial neural networks, has seen renewed popularity over the past decade and has transformed fields such as computer vision and natural language processing. Deep learning holds the promise of better performance compared to established machine learning models, although it requires larger training datasets due to its higher degrees of freedom. These models are more flexible with multi-modal inputs and can make sense of large numbers of features without extensive engineering. Both qualities make deep learning models ideal candidates for complex, multi-modal, noisy healthcare datasets. With the development of novel deep learning methods such as generative adversarial networks (GANs), there is an unprecedented opportunity to learn how to augment existing clinical datasets with realistic synthetic data and increase predictive performance. Moreover, GANs have the potential to simulate the effects of individual covariates such as drug exposures by leveraging the properties of implicit generative models. In this dissertation, I present a body of work that aims at paving the way for next-generation laboratory test-based clinical decision support systems powered by deep learning. To this end, I organized my experiments around three building blocks: (1) the evaluation of various deep learning architectures on laboratory test time series and their covariates with a forecasting task; (2) the development of implicit generative models of laboratory test time series using the Wasserstein GAN framework; (3) the inference properties of these models for the simulation of drug effects in laboratory test time series, and their application to data augmentation. Each component has its own evaluation: the forecasting task enabled me to explore the properties and performances of different learning architectures; the Wasserstein GAN models are evaluated with both intrinsic metrics and extrinsic tasks, and I always set baselines to avoid reporting results in a "neural-network-only" frame of reference. Applied machine learning, and even more so deep learning, is an empirical science. While the datasets used in this dissertation are not publicly available due to patient privacy regulation, I described pre-processing steps, hyper-parameter selection and training processes with reproducibility and transparency in mind. 
In the specific context of these studies involving laboratory test time series and their clinical covariates, I found that for supervised tasks, machine learning holds up well against deep learning methods. Complex recurrent architectures like long short-term memory (LSTM) do not perform well on these short time series, while convolutional neural networks (CNNs) and multi-layer perceptrons (MLPs) provide the best performances, at the cost of extensive hyper-parameter tuning. Generative adversarial networks, enabled by deep learning models, were able to generate high-fidelity laboratory test time series, and the quality of the generated samples was increased with conditional models using drug exposures as auxiliary information. Interestingly, forecasting models trained on synthetic data exclusively still retain good performances, confirming the potential of GANs in privacy-oriented applications. Finally, conditional GANs demonstrated an ability to interpolate samples from drug exposure combinations not seen during training, opening the way for laboratory test simulation with larger auxiliary information spaces. In specific cases, augmenting real training sets with synthetic data improved performances in the forecasting tasks, and could be extended to other applications where rare cases present a high prediction error.
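A minimal sketch of the conditional Wasserstein setup described above: the critic receives the lab-test series concatenated with a drug-exposure code, and is trained with the usual gradient-penalty objective. The network shapes and the simple concatenation-based conditioning are illustrative assumptions, not the dissertation's exact models.

```python
import torch
import torch.nn as nn

class CondCritic(nn.Module):
    """Wasserstein critic over a lab-test series conditioned on a drug-exposure code."""
    def __init__(self, seq_len, cond_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(seq_len + cond_dim, hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, 1),
        )

    def forward(self, series, cond):
        return self.net(torch.cat([series, cond], dim=-1))

def wgan_gp_critic_loss(critic, real, fake, cond, gp_weight=10.0):
    """Standard WGAN-GP critic objective; conditioning is simply concatenated."""
    eps = torch.rand(real.size(0), 1)
    interp = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grad = torch.autograd.grad(critic(interp, cond).sum(), interp, create_graph=True)[0]
    gp = ((grad.norm(2, dim=1) - 1) ** 2).mean()
    return critic(fake, cond).mean() - critic(real, cond).mean() + gp_weight * gp

critic = CondCritic(seq_len=24, cond_dim=8)
real, fake, cond = torch.randn(16, 24), torch.randn(16, 24), torch.rand(16, 8)
print(wgan_gp_critic_loss(critic, real, fake, cond).item())
```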
23

Abdal, Rameen. „Image Embedding into Generative Adversarial Networks“. Thesis, 2020. http://hdl.handle.net/10754/662516.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
Annotation:
We propose an efficient algorithm to embed a given image into the latent space of StyleGAN. This embedding enables semantic image editing operations that can be applied to existing photographs. Taking the StyleGAN trained on the FFHQ dataset as an example, we show results for image morphing, style transfer, and expression transfer. Studying the results of the embedding algorithm provides valuable insights into the structure of the StyleGAN latent space. We propose a set of experiments to test what class of images can be embedded, how they are embedded, what latent space is suitable for embedding, and if the embedding is semantically meaningful.
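The general recipe can be sketched as latent-code optimisation against a frozen generator: minimise a pixel loss (optionally plus a perceptual term) between the generated image and the target. The toy generator below stands in for StyleGAN and the optional perceptual callable for a VGG-style feature extractor; both are placeholders, not the paper's setup.

```python
import torch
import torch.nn as nn

def embed_image(generator, target, latent_dim=512, steps=200, lr=0.05, perceptual=None):
    """Project `target` into the latent space of a frozen `generator` by gradient
    descent on a reconstruction loss. `generator` and `perceptual` are placeholders
    (a pretrained StyleGAN and a VGG feature extractor in the original setting)."""
    w = torch.zeros(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        img = generator(w)
        loss = nn.functional.mse_loss(img, target)
        if perceptual is not None:
            loss = loss + nn.functional.mse_loss(perceptual(img), perceptual(target))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach(), loss.item()

# Toy stand-ins so the sketch runs end to end.
toy_gen = nn.Sequential(nn.Linear(512, 3 * 16 * 16), nn.Tanh(), nn.Unflatten(1, (3, 16, 16)))
target = torch.rand(1, 3, 16, 16) * 2 - 1
w_opt, final_loss = embed_image(toy_gen, target)
print(final_loss)
```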
24

Mehri, Soroush. „Sequential modeling, generative recurrent neural networks, and their applications to audio“. Thèse, 2016. http://hdl.handle.net/1866/18762.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
25

Lamb, Alexander. „Generative models : a critical review“. Thèse, 2018. http://hdl.handle.net/1866/21282.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
26

Parent-Lévesque, Jérôme. „Towards deep unsupervised inverse graphics“. Thesis, 2020. http://hdl.handle.net/1866/25467.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
Annotation:
Un objectif de longue date dans le domaine de la vision par ordinateur est de déduire le contenu 3D d’une scène à partir d’une seule photo, une tâche connue sous le nom d’inverse graphics. L’apprentissage automatique a, dans les dernières années, permis à de nombreuses approches de faire de grands progrès vers la résolution de ce problème. Cependant, la plupart de ces approches requièrent des données de supervision 3D qui sont coûteuses et parfois impossible à obtenir, ce qui limite les capacités d’apprentissage de telles œuvres. Dans ce travail, nous explorons l’architecture des méthodes d’inverse graphics non-supervisées et proposons deux méthodes basées sur des représentations 3D et algorithmes de rendus différentiables distincts: les surfels ainsi qu’une nouvelle représentation basée sur Voronoï. Dans la première méthode basée sur les surfels, nous montrons que, bien qu’efficace pour maintenir la cohérence visuelle, la production de surfels à l’aide d’une carte de profondeur apprise entraîne des ambiguïtés car la relation entre la carte de profondeur et le rendu n’est pas bijective. Dans notre deuxième méthode, nous introduisons une nouvelle représentation 3D basée sur les diagrammes de Voronoï qui modélise des objets/scènes à la fois explicitement et implicitement, combinant ainsi les avantages des deux approches. Nous montrons comment cette représentation peut être utilisée à la fois dans un contexte supervisé et non-supervisé et discutons de ses avantages par rapport aux représentations 3D traditionnelles
A long standing goal of computer vision is to infer the underlying 3D content in a scene from a single photograph, a task known as inverse graphics. Machine learning has, in recent years, enabled many approaches to make great progress towards solving this problem. However, most approaches rely on 3D supervision data which is expensive and sometimes impossible to obtain and therefore limits the learning capabilities of such work. In this work, we explore the deep unsupervised inverse graphics training pipeline and propose two methods based on distinct 3D representations and associated differentiable rendering algorithms: namely surfels and a novel Voronoi-based representation. In the first method based on surfels, we show that, while effective at maintaining view-consistency, producing view-dependent surfels using a learned depth map results in ambiguities as the mapping between depth map and rendering is non-bijective. In our second method, we introduce a novel 3D representation based on Voronoi diagrams which models objects/scenes both explicitly and implicitly simultaneously, thereby combining the benefits of both. We show how this representation can be used in both a supervised and unsupervised context and discuss its advantages compared to traditional 3D representations.
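One way to make a Voronoi-style representation differentiable is to soften the nearest-site assignment with a temperature-controlled softmax over distances, so that gradients flow back to the site positions; the sketch below is our own illustration of that idea, not the renderer proposed in the thesis.

```python
import torch

def soft_voronoi(points, sites, values, temperature=0.05):
    """Differentiable Voronoi-style lookup: each query point receives a softmax-weighted
    mixture of per-site values, with weights based on (negative) distance to the sites.
    As temperature -> 0 this approaches the hard nearest-site assignment."""
    d = torch.cdist(points, sites)                 # (N, K) pairwise distances
    w = torch.softmax(-d / temperature, dim=-1)    # (N, K) soft cell membership
    return w @ values                              # (N, C) interpolated values

sites = torch.rand(8, 2, requires_grad=True)       # 2D site positions
values = torch.rand(8, 3)                          # e.g. an RGB value per cell
queries = torch.rand(100, 2)
out = soft_voronoi(queries, sites, values)
out.sum().backward()                               # gradients flow back to the sites
print(out.shape, sites.grad.shape)
```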
27

Mastropietro, Olivier. „Deep Learning for Video Modelling“. Thèse, 2017. http://hdl.handle.net/1866/20192.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
28

Sylvain, Tristan. „Locality and compositionality in representation learning for complex visual tasks“. Thesis, 2021. http://hdl.handle.net/1866/25594.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
Annotation:
L'utilisation d'architectures neuronales profondes associée à des innovations spécifiques telles que les méthodes adversarielles, l’entraînement préalable sur de grands ensembles de données et l'estimation de l'information mutuelle a permis, ces dernières années, de progresser rapidement dans de nombreuses tâches de vision par ordinateur complexes telles que la classification d'images de catégories préalablement inconnues (apprentissage zéro-coups), la génération de scènes ou la classification multimodale. Malgré ces progrès, il n’est pas certain que les méthodes actuelles d’apprentissage de représentations suffiront à atteindre une performance équivalente au niveau humain sur des tâches visuelles arbitraires et, de fait, cela pose des questions quant à la direction de la recherche future. Dans cette thèse, nous nous concentrerons sur deux aspects des représentations qui semblent nécessaires pour atteindre de bonnes performances en aval pour l'apprentissage des représentations : la localité et la compositionalité. La localité peut être comprise comme la capacité d'une représentation à retenir des informations locales. Ceci sera pertinent dans de nombreux cas, et bénéficiera particulièrement à la vision informatique, domaine dans lequel les images naturelles comportent intrinsèquement des informations locales, par exemple des parties pertinentes d’une image, des objets multiples présents dans une scène... D'autre part, une représentation compositionnelle peut être comprise comme une représentation qui résulte d'une combinaison de parties plus simples. Les réseaux neuronaux convolutionnels sont intrinsèquement compositionnels, et de nombreuses images complexes peuvent être considérées comme la composition de sous-composantes pertinentes : les objets et attributs individuels dans une scène, les attributs sémantiques dans l'apprentissage zéro-coups en sont deux exemples. Nous pensons que ces deux propriétés détiennent la clé pour concevoir de meilleures méthodes d'apprentissage de représentations. Dans cette thèse, nous présentons trois articles traitant de la localité et/ou de la compositionnalité, et de leur application à l'apprentissage de représentations pour des tâches visuelles complexes. Dans le premier article, nous introduisons des méthodes de mesure de la localité et de la compositionnalité pour les représentations d'images, et nous démontrons que les représentations locales et compositionnelles sont plus performantes dans l'apprentissage zéro-coups. Nous utilisons également ces deux notions comme base pour concevoir un nouvel algorithme d'apprentissage des représentations qui atteint des performances de pointe dans notre cadre expérimental, une variante de l'apprentissage "zéro-coups" plus difficile où les informations externes, par exemple un pré-entraînement sur d'autres ensembles de données d'images, ne sont pas autorisées. Dans le deuxième article, nous montrons qu'en encourageant un générateur à conserver des informations locales au niveau de l'objet, à l'aide d'un module dit de similarité de graphes de scène, nous pouvons améliorer les performances de génération de scènes. Ce modèle met également en évidence l'importance de la composition, car de nombreux composants fonctionnent individuellement sur chaque objet présent. Pour démontrer pleinement la portée de notre approche, nous effectuons une analyse détaillée et proposons un nouveau cadre pour évaluer les modèles de génération de scènes. 
Enfin, dans le troisième article, nous montrons qu'en encourageant une forte information mutuelle entre les représentations multimodales locales et globales des images médicales en 2D et 3D, nous pouvons améliorer la classification et la segmentation des images. Ce cadre général peut être appliqué à une grande variété de contextes et démontre les avantages non seulement de la localité, mais aussi de la compositionnalité, car les représentations multimodales sont combinées pour obtenir une représentation plus générale.
The use of deep neural architectures coupled with specific innovations such as adversarial methods, pre-training on large datasets and mutual information estimation has in recent years allowed rapid progress in many complex vision tasks such as zero-shot learning, scene generation, or multi-modal classification. Despite such progress, it is still not clear if current representation learning methods will be enough to attain human-level performance on arbitrary visual tasks, and if not, what direction should future research take. In this thesis, we will focus on two aspects of representations that seem necessary to achieve good downstream performance for representation learning: locality and compositionality. Locality can be understood as a representation's ability to retain local information. This will be relevant in many cases, and will specifically benefit computer vision where natural images inherently feature local information, i.e. relevant patches of an image, multiple objects present in a scene... On the other hand, a compositional representation can be understood as one that arises from a combination of simpler parts. Convolutional neural networks are inherently compositional, and many complex images can be seen as composition of relevant sub-components: individual objects and attributes in a scene, semantic attributes in zero-shot learning are two examples. We believe both properties hold the key to designing better representation learning methods. In this thesis, we present 3 articles dealing with locality and/or compositionality, and their application to representation learning for complex visual tasks. In the first article, we introduce ways of measuring locality and compositionality for image representations, and demonstrate that local and compositional representations perform better at zero-shot learning. We also use these two notions as the basis for designing class-matching deep info-max, a novel representation learning algorithm that achieves state-of-the-art performance on our proposed "Zero-shot from scratch" setting, a harder zero-shot setting where external information, e.g. pre-training on other image datasets is not allowed. In the second article, we show that by encouraging a generator to retain local object-level information, using a scene-graph similarity module, we can improve scene generation performance. This model also showcases the importance of compositionality as many components operate individually on each object present. To fully demonstrate the reach of our approach, we perform detailed analysis, and propose a new framework to evaluate scene generation models. Finally, in the third article, we show that encouraging high mutual information between local and global multi-modal representations of 2D and 3D medical images can lead to improvements in image classification and segmentation. This general framework can be applied to a wide variety of settings, and demonstrates the benefits of not only locality, but also of compositionality as multi-modal representations are combined to obtain a more general one.
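The local-global mutual-information objective from the third article can be sketched with an InfoNCE-style loss that ties each local feature-map patch to the global vector of its own image and contrasts it against the other images in the batch; the estimator choice and tensor shapes are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def local_global_infonce(local_feats, global_feats, temperature=0.1):
    """local_feats: (B, C, H, W) feature map; global_feats: (B, C) image-level vector.
    Each local patch is a positive for its own image's global vector and a negative
    for every other image in the batch (InfoNCE-style contrastive loss)."""
    B, C, H, W = local_feats.shape
    locals_flat = F.normalize(local_feats.permute(0, 2, 3, 1).reshape(B * H * W, C), dim=-1)
    globals_n = F.normalize(global_feats, dim=-1)
    logits = locals_flat @ globals_n.t() / temperature       # (B*H*W, B) similarity scores
    targets = torch.arange(B).repeat_interleave(H * W)       # patch -> index of its own image
    return F.cross_entropy(logits, targets)

loss = local_global_infonce(torch.randn(4, 64, 8, 8), torch.randn(4, 64))
print(loss.item())
```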
29

Dumoulin, Vincent. „Representation Learning for Visual Data“. Thèse, 2018. http://hdl.handle.net/1866/21140.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
30

Ehrler, Matthew. „VConstruct: a computationally efficient method for reconstructing satellite derived Chlorophyll-a data“. Thesis, 2021. http://hdl.handle.net/1828/13346.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
Annotation:
The annual phytoplankton bloom is an important marine event. Its annual variability can be easily recognized by ocean-color satellite sensors through the increase in surface Chlorophyll-a concentration, a key indicator to quantitatively characterize all phytoplankton groups. However, a common problem is that the satellites used to gather the data are obstructed by clouds and other artifacts. This means that time series data from satellites can suffer from spatial data loss. There are a number of algorithms that are able to reconstruct the missing parts of these images to varying degrees of accuracy, with Data INterpolating Empirical Orthogonal Functions (DINEOF) being the most popular. However, DINEOF has a high computational cost, taking both significant time and memory to generate reconstructions. We propose a machine learning approach to reconstruction of Chlorophyll-a data using a Variational Autoencoder (VAE). Our method is 3-5x faster than DINEOF (50-200x if the method has already been run once in the area). Our method also uses less memory, and increasing the size of the data being reconstructed causes its computational cost to grow at a significantly better rate than DINEOF's. We show that our method's accuracy is within a margin of error of DINEOF's, though slightly lower, as found by our own experiments and similar experiments from other studies. Lastly, we discuss other potential benefits of our method that could be investigated in future work, including generating data under certain conditions or anomaly detection.
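The underlying idea can be sketched with a small convolutional VAE trained on Chlorophyll-a patches where cloud-masked pixels are excluded from the reconstruction loss, and missing values are later read from the reconstruction; the architecture, sizes and masking scheme below are illustrative assumptions, not VConstruct itself.

```python
import torch
import torch.nn as nn

class ConvVAE(nn.Module):
    """Tiny convolutional VAE; cloud-masked pixels are excluded from the
    reconstruction loss and later read off from the reconstruction."""
    def __init__(self, latent=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(1, 16, 4, 2, 1), nn.ReLU(),
                                 nn.Conv2d(16, 32, 4, 2, 1), nn.ReLU(), nn.Flatten())
        self.to_mu, self.to_logvar = nn.Linear(32 * 8 * 8, latent), nn.Linear(32 * 8 * 8, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 32 * 8 * 8), nn.ReLU(), nn.Unflatten(1, (32, 8, 8)),
                                 nn.ConvTranspose2d(32, 16, 4, 2, 1), nn.ReLU(),
                                 nn.ConvTranspose2d(16, 1, 4, 2, 1))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterisation trick
        return self.dec(z), mu, logvar

def masked_vae_loss(recon, x, mask, mu, logvar):
    rec = ((recon - x) ** 2 * mask).sum() / mask.sum()            # only observed pixels
    kld = -0.5 * torch.mean(1 + logvar - mu ** 2 - logvar.exp())
    return rec + kld

x = torch.rand(4, 1, 32, 32)               # toy 32x32 Chlorophyll-a patches
mask = (torch.rand_like(x) > 0.3).float()  # 1 = clear sky, 0 = cloud
recon, mu, logvar = ConvVAE()(x * mask)
print(masked_vae_loss(recon, x, mask, mu, logvar).item())
```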
31

Dinh, Laurent. „Reparametrization in deep learning“. Thèse, 2018. http://hdl.handle.net/1866/21139.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
32

Chung, Junyoung. „On Deep Multiscale Recurrent Neural Networks“. Thèse, 2018. http://hdl.handle.net/1866/21588.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen
33

Xu, Kelvin. „Exploring Attention Based Model for Captioning Images“. Thèse, 2017. http://hdl.handle.net/1866/20194.

Der volle Inhalt der Quelle
APA, Harvard, Vancouver, ISO und andere Zitierweisen

Zur Bibliographie