Dissertations on the topic "Deep generative modeling"
Below are the top 33 dissertations on the topic "Deep generative modeling".
Skalic, Miha 1990. "Deep learning for drug design : modeling molecular shapes." Doctoral thesis, Universitat Pompeu Fabra, 2019. http://hdl.handle.net/10803/667503.
The design of novel drugs is a complex process that requires finding, among a vast range of possibilities, suitable molecules capable of binding the target protein with favourable physicochemical properties. Machine learning methods allow us to leverage historical data about molecules and use it for new predictions, helping in the selection of candidate molecules without relying exclusively on experiments. In particular, deep learning can be applied to extract complex patterns from simple representations. In this thesis we use deep learning to extract patterns from three-dimensional representations of molecules. We apply classification and regression models to predict bioactivity and binding affinity, respectively. Furthermore, we show that we can predict ligand properties for a given protein pocket. Finally, we use a deep generative model for compound design. Given the shape of a ligand, we show that we can generate similar compounds and, given a protein pocket, we can generate compounds that could potentially bind it.
Chen, Tian Qi. "Deep kernel mean embeddings for generative modeling and feedforward style transfer." Thesis, University of British Columbia, 2017. http://hdl.handle.net/2429/62668.
Brodie, Michael B. "Methods for Generative Adversarial Output Enhancement." BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8763.
Testolin, Alberto. "Modeling cognition with generative neural networks: The case of orthographic processing." Doctoral thesis, Università degli studi di Padova, 2015. http://hdl.handle.net/11577/3424619.
In this thesis, several cognitive processes are studied using recent generative neural network models. Unlike most connectionist models, the computational approach adopted in this thesis emphasises the generative nature of cognition, suggesting that one of the main objectives of cognitive systems is to learn an internal model of the surrounding environment, which can be used to infer causal relationships and make predictions about incoming sensory information. In particular, we consider a powerful class of recurrent neural networks capable of learning probabilistic generative models from experience, extracting higher-order statistical information from a set of variables in a completely unsupervised way. These networks can be formalised using the theory of probabilistic graphical models, which allows neural network models and structured Bayesian models to be described in the same formal language. Moreover, basic network architectures can be extended into more sophisticated systems, exploiting multiple processing levels to learn hierarchical generative models, or exploiting directed recurrent connections to process sequentially organised information. We argue that these advanced architectures constitute a promising alternative to more traditional supervised feed-forward neural networks, because they more faithfully reproduce the functional and structural organisation of cortical circuits, explaining how sensory evidence can effectively be combined with contextual information conveyed by top-down feedback connections.
To justify the use of this type of model, in a series of simulations we study in detail how efficient implementations of hierarchical and temporal generative networks can extract information from large databases containing thousands of training examples. In particular, we provide empirical evidence on the recognition of printed and handwritten characters belonging to different writing systems, which can then be combined spatially or temporally to build more complex orthographic units such as those represented by English words.
Yan, Guowei. "Interactive Modeling of Elastic Materials and Splashing Liquids." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1593098802306904.
Sadok, Samir. "Audiovisual speech representation learning applied to emotion recognition." Electronic Thesis or Diss., CentraleSupélec, 2024. http://www.theses.fr/2024CSUP0003.
Emotions are vital in our daily lives and have become a primary focus of ongoing research. Automatic emotion recognition has gained considerable attention owing to its wide-ranging applications across sectors such as healthcare, education, entertainment, and marketing. This advancement in emotion recognition is pivotal for fostering the development of human-centric artificial intelligence. Supervised emotion recognition systems have improved significantly over traditional machine learning approaches. However, this progress encounters limitations due to the complexity and ambiguous nature of emotions. Acquiring extensive emotionally labeled datasets is costly, time-intensive, and often impractical. Moreover, the subjective nature of emotions results in biased datasets, impacting the learning models' applicability in real-world scenarios. Motivated by how humans learn and conceptualize complex representations from an early age with minimal supervision, this approach demonstrates the effectiveness of leveraging prior experience to adapt to new situations. Unsupervised or self-supervised learning models draw inspiration from this paradigm. Initially, they aim to establish general representations learned from unlabeled data, akin to the foundational prior experience in human learning. These representations should adhere to criteria like invariance, interpretability, and effectiveness. Subsequently, these learned representations are applied to downstream tasks with limited labeled data, such as emotion recognition. This mirrors the assimilation of new situations in human learning. In this thesis, we aim to propose unsupervised and self-supervised representation learning methods designed explicitly for multimodal and sequential data, and to explore their potential advantages in the context of emotion recognition tasks. The main contributions of this thesis encompass: (1) developing generative models via unsupervised or self-supervised learning for audiovisual speech representation learning, incorporating joint temporal and multimodal (audiovisual) modeling; (2) structuring the latent space to enable disentangled representations, enhancing interpretability by controlling human-interpretable latent factors; and (3) validating the effectiveness of our approaches through both qualitative and quantitative analyses, in particular on the emotion recognition task. Our methods facilitate signal analysis, transformation, and generation.
Luc, Pauline. "Apprentissage autosupervisé de modèles prédictifs de segmentation à partir de vidéos." Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAM024/document.
Predictive models of the environment hold promise for allowing the transfer of recent reinforcement learning successes to many real-world contexts, by decreasing the number of interactions needed with the real world. Video prediction has been studied in recent years as a particular case of such predictive models, with broad applications in robotics and navigation systems. While RGB frames are easy to acquire and hold a lot of information, they are extremely challenging to predict, and cannot be directly interpreted by downstream applications. Here we introduce the novel tasks of predicting semantic and instance segmentation of future frames. The abstract feature spaces we consider are better suited for recursive prediction and allow us to develop models which convincingly predict segmentations up to half a second into the future. Predictions are more easily interpretable by downstream algorithms and remain rich, spatially detailed and easy to obtain, relying on state-of-the-art segmentation methods. We first focus on the task of semantic segmentation, for which we propose a discriminative approach based on adversarial training. Then, we introduce the novel task of predicting future semantic segmentation, and develop an autoregressive convolutional neural network to address it. Finally, we extend our method to the more challenging problem of predicting future instance segmentation, which additionally segments out individual objects. To deal with a varying number of output labels per image, we develop a predictive model in the space of high-level convolutional image features of the Mask R-CNN instance segmentation model. We are able to produce visually pleasing segmentations at a high resolution for complex scenes involving a large number of instances, and with convincing accuracy up to half a second ahead.
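The recursive prediction scheme described in this abstract can be sketched generically: each predicted frame (in an abstract feature space) is appended to the context and fed back as input to predict the next one. In this hedged sketch, `step_fn` is a hypothetical stand-in for the thesis's convolutional predictor, not its actual model:

```python
import numpy as np

def predict_autoregressively(step_fn, context, horizon):
    """Multi-step prediction by recursion: each newly predicted frame is
    appended to the history and fed back in to predict the next one."""
    frames = list(context)
    for _ in range(horizon):
        recent = np.stack(frames[-len(context):])  # most recent frames as input
        frames.append(step_fn(recent))
    return np.stack(frames[len(context):])         # only the predicted frames

# Toy predictor: the next "frame" is the last one shifted by a constant.
context = [np.zeros(2), np.ones(2)]
preds = predict_autoregressively(lambda ctx: ctx[-1] + 1.0, context, horizon=3)
print(preds[:, 0])  # -> [2. 3. 4.]
```

The key property illustrated is that prediction errors compound through the feedback loop, which is why the choice of feature space matters for predicting further ahead.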
Ionascu, Beatrice. "Modelling user interaction at scale with deep generative methods." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239333.
Understanding how users interact with a company's service is essential for data-driven businesses that want to serve their users better and improve their offering. Generative machine learning makes it possible to model user behaviour and generate new data in order to simulate, or identify and explain, typical user patterns. In this work we introduce an approach for modelling user interaction at scale in a client-service setting. We propose a new representation of multivariate time-series data in the form of time pictures, which encode temporal correlations through spatial organisation. This representation shares two key properties that convolutional networks were developed to exploit, which allows us to develop an approach based on deep generative models built on convolutional networks. By introducing this approach for time-series data we extend the application of convolutional networks to the multivariate time-series domain, specifically to user-interaction data. We use an approach inspired by the β-VAE framework in order to make the model learn hidden factors that define different user patterns. We explore different values of the regularisation parameter β and show that it is possible to construct a model that learns a latent representation of multiple, identifiable user behaviours. Using real-world data, we show that the model generates realistic samples that capture the population-level statistics of the user-interaction data, learns different user behaviours, and provides accurate imputations of missing data.
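The β-VAE framework referenced in this abstract weights the KL term of the standard VAE objective by a scalar β to encourage disentangled latent factors. A minimal numpy sketch of that per-sample objective, assuming a squared-error reconstruction term and a diagonal Gaussian posterior (the thesis's actual encoder/decoder architectures are not reproduced here):

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Per-sample beta-VAE objective: squared-error reconstruction plus a
    beta-weighted KL divergence between the approximate posterior
    N(mu, diag(exp(log_var))) and the standard normal prior."""
    recon = np.sum((x - x_recon) ** 2, axis=-1)
    kl = 0.5 * np.sum(mu ** 2 + np.exp(log_var) - 1.0 - log_var, axis=-1)
    return recon + beta * kl

# With perfect reconstructions and a posterior equal to the prior, the loss is zero.
x = np.zeros((2, 3))
loss = beta_vae_loss(x, x, np.zeros((2, 4)), np.zeros((2, 4)))
print(loss)  # -> [0. 0.]
```

Raising β above 1 penalizes posteriors that deviate from the isotropic prior more strongly, which is the mechanism credited with pushing individual latent dimensions toward interpretable factors.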
McClintick, Kyle W. "Training Data Generation Framework For Machine-Learning Based Classifiers." Digital WPI, 2018. https://digitalcommons.wpi.edu/etd-theses/1276.
Fang, Zhufeng. "Using Geostatistics, Pedotransfer Functions to Generate 3D Soil and Hydraulic Property Distributions for Deep Vadose Zone Flow Simulations." Thesis, The University of Arizona, 2009. http://hdl.handle.net/10150/193439.
Marin-Moreno, Hector. "Numerical modelling of overpressure generation in deep basins and response of Arctic gas hydrate to ocean warming." Thesis, University of Southampton, 2014. https://eprints.soton.ac.uk/364170/.
He, Sheng. "Thermal History and Deep Overpressure Modelling in the Northern Carnarvon Basin, North West Shelf, Australia." Thesis, Curtin University, 2002. http://hdl.handle.net/20.500.11937/1292.
Buys, Jan Moolman. "Incremental generative models for syntactic and semantic natural language processing." Thesis, University of Oxford, 2017. https://ora.ox.ac.uk/objects/uuid:a9a7b5cf-3bb1-4e08-b109-de06bf387d1d.
He, Sheng. "Thermal History and Deep Overpressure Modelling in the Northern Carnarvon Basin, North West Shelf, Australia." Curtin University of Technology, Department of Applied Geology, 2002. http://espace.library.curtin.edu.au:80/R/?func=dbin-jump-full&object_id=11998.
Data from kerogen element analysis, Rock-Eval pyrolysis, visual kerogen composition and some biomarkers have been used to evaluate the kerogen type in the basin. It appears that type III kerogen is the dominant organic-matter type in the Triassic and Jurassic source rocks, while the Dingo Claystone may contain some oil-prone organic matter. The vitrinite reflectance (Ro) data in some wells of the Northern Carnarvon Basin are anomalously low. As a major thermal maturity indicator, the anomalously low Ro data seriously hinder the assessment of thermal maturity in the basin. This study differs from other studies in that it has paid more attention to Rock-Eval Tmax data. Therefore, problems affecting Tmax data in evaluating thermal maturity were investigated. A case study of contaminated Rock-Eval data in Bambra-2 and thermal modelling using Tmax data in 16 wells from different tectonic subdivisions were carried out. The major problems for using Tmax data were found to be contamination by drilling-mud additives, natural bitumen and suppression due to hydrogen index (HI) > 150 in some wells. Although the data reveal uncertainties and there is about ±3-10 % error for thermal modelling by using the proposed relationship of Ro and Tmax, the "reliable" Tmax data are found to be important, and useful to assess thermal maturity and reduce the influence of unexpectedly low Ro data.
This study analyzed the characteristics of deep overpressured zones and top pressure seals, in detail, in 7 wells based on the observed fluid pressure data and petrophysical data. The deep overpressured system (depth greater than 2650-3000 m) in the Jurassic formations and the lower part of the Barrow Group is shown by the measured fluid pressure data including RFTs, DSTs and mud weights. The highly overpressured Jurassic fine-grained rocks also exhibit well-log responses of high sonic transit times and low formation resistivities. The deep overpressured zone, however, may not necessarily be caused by anomalously high porosities due to undercompaction. The porosities in the deep overpressured Jurassic rocks may be significantly less than the well-log derived porosities, which may indicate that the sonic-log and resistivity-log also directly respond to the overpressuring in the deep overpressured fine-grained rocks of the sub-basins. Based on the profiles of fluid pressure and well-log data in 5 wells of the Barrow Sub-basin, a top pressure seal was interpreted to be consistent with the transitional pressure zone in the Barrow Sub-basin. This top pressure seal was observed to consist of a rock layer of 60-80 % claystone and siltstone. The depths of the rock layer range from 2650 m to 3300 m with thicknesses of 300-500 m and temperatures of 110-135 °C. Based on the well-log data, measured porosity and sandstone diagenesis, the rock layer seems to be well compacted and cemented with a porosity range of about 2-5 % and calculated permeabilities of about 10⁻¹⁹ to 10⁻²² m².
This study performed thermal history and maturity modelling in 14 wells using the BasinMod 1D software. It was found that the thermal maturity data in 4 wells are consistent with the maturity curves predicted by the rifting heat flow history associated with the tectonic regime of this basin. The maximum heat flows during the rift event of the Jurassic and earliest Cretaceous possibly ranged from 60-70 mW/m² along the sub-basins and 70-80 mW/m² on the southern and central Exmouth Plateau. This study also carried out two case studies of thermal maturity and thermal modelling within the deep overpressured system in the Barrow and Bambra wells of the Barrow Sub-basin. These case studies were aimed at understanding whether overpressure has a determinable influence on thermal maturation in this region. It was found that there is no evidence for overpressure-related retardation of thermal maturity in the deep overpressured system, based on the measured maturity, biomarker maturity parameters and 1D thermal modelling. Therefore, based on the data analysed, overpressure is an insignificant factor in thermal maturity and hydrocarbon generation in this basin.
Three seismic lines in the Exmouth, Barrow and Dampier Sub-basins were selected and converted to depth cross-sections, and then 2D geological models were created for overpressure evolution modelling. A major objective of these 2D geological models was to define the critical faults. A top pressure seal was also detected based on the 2D model of the Barrow Sub-basin. Two-dimensional overpressure modelling was performed using the BasinMod 2D software. The mathematical 2D model takes into consideration compaction, fluid thermal expansion, pressure produced by hydrocarbon generation and quartz cementation. The sealed overpressured conditions can be modelled with fault sealing, bottom pressure seal (permeabilities of 10⁻²³ to 10⁻²⁵ m²) and top pressure seal (permeabilities of 10⁻¹⁹ to 10⁻²² m²). The modelling supports the development of a top pressure seal with quartz cementation. The 2D modelling suggests the rapid sedimentation rates can cause compaction disequilibrium in the fine-grained rocks, which may be a mechanism for overpressure generation during the Jurassic to the Early Cretaceous. The data suggest that the present-day deep overpressure is not associated with the porosity anomaly due to compaction disequilibrium and that compaction may be much less important than recurrent pressure charges because most of the porosity in the Jurassic source rocks has been lost through compaction and deposition rates have been very slow since the beginning of the Cainozoic.
Three simple 1D models were developed and applied to estimate how rapidly the overpressure dissipates. The results suggest that the present-day overpressure would be almost dissipated after 2 million years with a pressure seal with an average permeability of 10⁻²² m² (10⁻⁷ md). On the basis of numerous accumulations of oil and gas to be expelled from the overpressured Jurassic source rocks in the basin and the pressure seal modelling, it seems that the top pressure seal with permeabilities of 10⁻¹⁹ to 10⁻²² m² (10⁻⁴ to 10⁻⁷ md) is not enough to retain the deep overpressure for tens of millions of years without pressure recharging. Only if the permeabilities were 10⁻²³ m² (10⁻⁸ md) or less, would a long-lived overpressured system be preserved. This study suggests that hydrocarbon generation, especially gas generation and thermal expansion, within sealed conditions of low permeability is a likely major cause for maintaining the deep overpressure over the past tens of millions of years. Keywords: Thermal history; Deep overpressure; Type III kerogen; Rock-Eval Tmax; Thermal maturity; Palaeoheatflow modelling; Pressure seal; 2D deep overpressure modelling; Pressure behaviour modelling; Overpressure generation; Northern Carnarvon Basin.
Martin, Alice. "Deep learning models and algorithms for sequential data problems : applications to language modelling and uncertainty quantification." Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAS007.
In this thesis, we develop new models and algorithms to solve deep learning tasks on sequential data, with the aim of tackling the pitfalls of current approaches for learning language models based on neural networks. A first research work develops a new deep generative model for sequential data based on Sequential Monte Carlo (SMC) methods, which enables better modelling of diversity in language modelling tasks and better quantification of uncertainty in sequential regression problems. A second research work aims to facilitate the use of SMC techniques within deep learning architectures, by developing a new online smoothing algorithm with reduced computational cost that is applicable to a wider class of state-space models, including deep generative models. Finally, a third research work proposes the first reinforcement learning algorithm that learns conditional language models from scratch (i.e., without supervised datasets), based on a truncation mechanism of the natural language action space with a pretrained language model.
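As background for the Sequential Monte Carlo machinery this abstract refers to, a minimal bootstrap particle filter on a toy one-dimensional state-space model might look as follows. This is a generic textbook sketch, not the thesis's models or its smoothing algorithm; the toy transition and likelihood are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap_particle_filter(observations, n_particles, transition, likelihood):
    """Minimal bootstrap particle filter: propagate particles through the
    transition kernel, weight them by the observation likelihood, and
    resample so that computation concentrates on plausible states."""
    particles = rng.normal(size=n_particles)       # samples from the prior
    means = []
    for y in observations:
        particles = transition(particles)          # predict
        w = likelihood(y, particles)               # weight by the new observation
        w /= w.sum()
        means.append(np.sum(w * particles))        # filtered posterior mean
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx]                 # multinomial resampling
    return np.array(means)

# Toy Gaussian random walk: x_t = x_{t-1} + noise, observed with noise.
transition = lambda x: x + rng.normal(0.0, 0.1, size=x.shape)
likelihood = lambda y, x: np.exp(-0.5 * (y - x) ** 2 / 0.1)
estimates = bootstrap_particle_filter(np.linspace(0.0, 2.0, 20), 2000, transition, likelihood)
```

In deep generative sequence models, the same propagate-weight-resample loop is run with a learned neural transition and likelihood, which is what makes the uncertainty quantification described above possible.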
Devineau, Guillaume. "Deep learning for multivariate time series : from vehicle control to gesture recognition and generation." Thesis, Université Paris sciences et lettres, 2020. http://www.theses.fr/2020UPSLM037.
Artificial intelligence is the scientific field which studies how to create machines that are capable of intelligent behaviour. Deep learning is a family of artificial intelligence methods based on neural networks. In recent years, deep learning has led to groundbreaking developments in the image and natural language processing fields. However, in many domains, input data consists neither of images nor of text documents, but of time series that describe the temporal evolution of observed or computed quantities. In this thesis, we study and introduce different representations for time series, based on deep learning models. Firstly, in the autonomous driving domain, we show that the analysis of a temporal window by a neural network can lead to better vehicle control results than classical approaches that do not use neural networks, especially in highly-coupled situations. Secondly, in the gesture and action recognition domain, we introduce 1D parallel convolutional neural network models. In these models, convolutions are performed over the temporal dimension, in order for the neural network to detect (and benefit from) temporal invariances. Thirdly, in the human pose motion generation domain, we introduce 2D convolutional generative adversarial neural networks where the spatial and temporal dimensions are convolved jointly. Finally, we introduce an embedding where spatial representations of human poses are sorted in a latent space based on their temporal relationships.
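A temporal convolution of the kind described in this abstract slides the same kernel along the time axis of a multivariate series, which is what lets the network detect temporally invariant patterns. A plain-numpy sketch, with shapes and names that are illustrative rather than taken from the thesis:

```python
import numpy as np

def temporal_conv1d(series, kernels):
    """'Valid' 1D convolution along the time axis of a multivariate series.
    series: (T, C_in) time steps by channels; kernels: (C_out, K, C_in).
    The same kernels are applied at every time step."""
    T, _ = series.shape
    C_out, K, _ = kernels.shape
    out = np.empty((T - K + 1, C_out))
    for t in range(T - K + 1):
        # contract the (K, C_in) window against every kernel at once
        out[t] = np.tensordot(kernels, series[t:t + K], axes=([1, 2], [0, 1]))
    return out

# A [-1, 0, 1] kernel responds to the local temporal slope of its channel.
ramp = np.arange(5, dtype=float).reshape(5, 1)
slopes = temporal_conv1d(ramp, np.array([[[-1.0], [0.0], [1.0]]]))
print(slopes.ravel())  # -> [2. 2. 2.]
```

Stacking several such layers (with nonlinearities in between) is the basic building block of the 1D convolutional models the abstract mentions.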
Wen, Tsung-Hsien. "Recurrent neural network language generation for dialogue systems." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/275648.
Lucas, Thomas. "Modèles génératifs profonds : sur-généralisation et abandon de mode." Thesis, Université Grenoble Alpes, 2020. http://www.theses.fr/2020GRALM049.
This dissertation explores the topic of generative modelling of natural images, which is the task of fitting a data generating distribution. Such models can be used to generate artificial data resembling the true data, or to compress images. Latent variable models, which are at the core of our contributions, seek to capture the main factors of variation of an image in a variable that can be manipulated. In particular, we build on two successful latent variable generative models, the generative adversarial network (GAN) and the variational autoencoder (VAE). Recently, GANs significantly improved the quality of images generated by deep models, obtaining very compelling samples. Unfortunately, these models struggle to capture all the modes of the original distribution, i.e. they do not cover the full variability of the dataset. Conversely, likelihood-based models such as VAEs typically cover the full variety of the data well and provide an objective measure of coverage. However, these models produce samples of inferior visual quality that are more easily distinguished from real ones. The work presented in this thesis strives for the best of both worlds: to obtain compelling samples while modelling the full support of the distribution. To achieve that, we focus on (i) the optimisation problems used and (ii) practical model limitations that hinder performance. The first contribution of this manuscript is a deep generative model that encodes global image structure into latent variables, built on the VAE, and autoregressively models low-level detail. We propose a training procedure relying on an auxiliary loss function to control what information is captured by the latent variables and what information is left to an autoregressive decoder. Unlike previous approaches to such hybrid models, ours does not need to restrict the capacity of the autoregressive decoder to prevent degenerate models that ignore the latent variables. The second contribution builds on the standard GAN model, which trains a discriminator network to provide feedback to a generative network. The discriminator usually assesses the quality of individual samples, which makes it hard to evaluate the variability of the data. Instead, we propose to feed the discriminator with batches that mix both true and fake samples, and train it to predict the ratio of true samples in the batch. These batches work as approximations of the distribution of generated images and allow the discriminator to approximate distributional statistics. We introduce an architecture that is well suited to solve this problem efficiently, and show experimentally that our approach reduces mode collapse in GANs on two synthetic datasets, and obtains good results on the CIFAR10 and CelebA datasets. The mutual shortcomings of VAEs and GANs can in principle be addressed by training hybrid models that use both types of objective. In our third contribution, we show that the usual parametric assumptions made in VAEs induce a conflict between them, leading to lackluster performance of hybrid models. We propose a solution based on deep invertible transformations, which trains a feature space in which the usual assumptions can be made without harm. Our approach provides likelihood computations in image space while being able to take advantage of adversarial training. It obtains GAN-like samples that are competitive with fully adversarial models while improving likelihood scores over existing hybrid models at the time of publication, which is a significant advancement.
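The batch-level discriminator idea from the second contribution can be sketched minimally: assemble batches mixing real and generated samples, and regress the fraction of real ones. This hedged sketch uses a toy permutation-invariant predictor (a linear score mean-pooled over the batch), not the architecture proposed in the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mixed_batch(real, fake, n):
    """Assemble a batch mixing real and generated samples; the
    discriminator's regression target is the fraction of real ones."""
    k = int(rng.integers(0, n + 1))  # number of real samples in this batch
    batch = np.concatenate([
        real[rng.choice(len(real), size=k, replace=False)],
        fake[rng.choice(len(fake), size=n - k, replace=False)],
    ])
    rng.shuffle(batch)               # hide which rows are real
    return batch, k / n

def batch_discriminator(batch, w):
    """Permutation-invariant ratio predictor: a per-sample linear score,
    mean-pooled over the batch, squashed to [0, 1]."""
    return 1.0 / (1.0 + np.exp(-(batch @ w).mean()))

real = rng.normal(1.0, 0.1, size=(100, 5))   # stand-in for true images
fake = rng.normal(-1.0, 0.1, size=(100, 5))  # stand-in for generated images
batch, target = make_mixed_batch(real, fake, n=16)
pred = batch_discriminator(batch, np.ones(5))  # train w so pred matches target
```

Because the predictor pools over the whole batch before its output, it must rely on distributional statistics of the batch rather than on any single sample, which is the property the abstract exploits against mode collapse.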
Rafael-Palou, Xavier. "Detection, quantification, malignancy prediction and growth forecasting of pulmonary nodules using deep learning in follow-up CT scans." Doctoral thesis, Universitat Pompeu Fabra, 2021. http://hdl.handle.net/10803/672964.
Avui en dia, l’avaluació del càncer de pulmó ´es una tasca complexa i tediosa, principalment realitzada per inspecció visual radiològica de nòduls pulmonars sospitosos, mitjançant imatges de tomografia computada (TC) preses als pacients al llarg del temps. Actualment, existeixen diverses eines computacionals basades en intel·ligència artificial i algorismes de visió per computador per donar suport a la detecció i classificació del càncer de pulmó. Aquestes solucions es basen majoritàriament en l’anàlisi d’imatges individuals de TC pulmonar dels pacients i en l’ús de descriptors d’imatges fets a mà. Malauradament, això les fa incapaces d’afrontar completament la complexitat i la variabilitat del problema. Recentment, l’aparició de l’aprenentatge profund ha permès un gran avenc¸ en el camp de la imatge mèdica. Malgrat els prometedors assoliments en detecció de nòduls, segmentació i classificació del càncer de pulmó, els radiòlegs encara són reticents a utilitzar aquestes solucions en el seu dia a dia. Un dels principals motius ´es que les solucions actuals no proporcionen suport automàtic per analitzar l’evolució temporal dels tumors pulmonars. La dificultat de recopilar i anotar cohorts longitudinals de TC pulmonar poden explicar la manca de treballs d’aprenentatge profund que aborden aquest problema. En aquesta tesi investiguem com abordar el suport automàtic a l’avaluació del càncer de pulmó, construint algoritmes d’aprenentatge profund i pipelines de visió per ordinador que, especialment, tenen en compte l’evolució temporal dels nòduls pulmonars. Així doncs, el nostre primer objectiu va consistir a obtenir mètodes precisos per a l’avaluació del càncer de pulmó basats en imatges de CT pulmonar individuals. 
Since these types of labels are costly and difficult to obtain (for example, after a biopsy), we designed different deep neural networks, based on 3D convolutional networks (CNNs), to predict nodule malignancy based on radiologists' visual inspection (easier to collect). Next, we evaluated different ways of distilling the knowledge represented in the malignancy network into a pipeline aimed at providing patient-level lung cancer prediction from a lung CT image. The positive results confirmed the suitability of using CNNs to model nodule malignancy, as assessed by radiologists, for automatic lung cancer prediction. We then directed our research towards the analysis of series of lung CT images. We therefore first addressed the automatic re-identification of pulmonary nodules across different lung CT scans. To do so, we proposed using siamese neural networks (SNNs) to rank the similarity between nodules, removing the need for image registration. This paradigm shift avoided possible image perturbations and delivered computationally faster results. Different configurations of the conventional SNN were examined, ranging from applying transfer learning and using different loss functions to combining several feature maps from different network levels. This method achieved state-of-the-art results for re-identifying nodules both in isolation and integrated into a pipeline for nodule growth quantification. In addition, we addressed the problem of supporting radiologists in the longitudinal management of lung cancer.
To this end, we proposed a novel deep learning pipeline composed of four fully automated stages, ranging from nodule detection to cancer classification, passing through nodule growth detection. Moreover, the pipeline integrated a new approach for nodule growth detection, relying on a recent hierarchical probabilistic segmentation network adapted to report uncertainty estimates. In addition, a second method for lung cancer nodule classification was introduced, integrating into a two-stream 3D-CNN the estimated nodule malignancy probabilities derived from the pre-trained nodule malignancy network. The pipeline was evaluated on a longitudinal cohort and reported performances comparable to state-of-the-art methods used individually or in pipelines, but with fewer components than those proposals. Finally, we also investigated how to help physicians prescribe more accurate tumour treatments and more precise surgical planning. To this end, we developed a novel method to predict nodule growth given a single image of the nodule. In particular, the method relies on a hierarchical, probabilistic and generative deep neural network capable of producing multiple consistent future segmentations of the nodule at a given time point. To do so, the network learns to model the multimodal posterior distribution of future lung tumour segmentations by using variational inference and injecting the posterior latent features. Finally, by applying Monte-Carlo sampling to the network outputs, we can estimate the mean tumour growth and the uncertainty associated with the prediction.
Although further evaluation on a larger cohort is recommended, the methods proposed in this work reported results accurate enough to adequately support the radiological workflow of pulmonary nodule follow-up. Beyond this specific application, the presented innovations, such as the methods for integrating CNNs into computer vision pipelines, the SNN-based re-identification of suspicious regions over time without the need to deform the inherent image structure, or the probabilistic network for modelling tumour growth while accounting for ambiguous images and prediction uncertainty, could easily be applied to other types of cancer (for example, pancreas), clinical diseases (for example, Covid-19) or medical applications (for example, therapy follow-up).
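The final Monte-Carlo step described in the abstract above, drawing several plausible future segmentations from a probabilistic network and summarising their mean and spread, can be sketched as follows. This is a minimal illustration, not the thesis code; `sample_segmentation` is a hypothetical stand-in for one stochastic forward pass of the segmentation network:

```python
import numpy as np

def sample_segmentation(rng, shape=(32, 32)):
    # Hypothetical stand-in for one stochastic forward pass of a
    # probabilistic segmentation network: returns one plausible
    # binary nodule mask.
    probs = rng.uniform(0.3, 0.7, size=shape)
    return (rng.uniform(size=shape) < probs).astype(float)

def mc_growth_estimate(n_samples=100, seed=0):
    # Draw several plausible future segmentations and summarise them:
    # the mean mask size estimates expected tumour growth, the spread
    # of sizes quantifies the uncertainty of that prediction.
    rng = np.random.default_rng(seed)
    sizes = np.array([sample_segmentation(rng).sum()
                      for _ in range(n_samples)])
    return sizes.mean(), sizes.std()

mean_size, uncertainty = mc_growth_estimate()
```

In the thesis, the samples would come from the hierarchical probabilistic network rather than a random sampler, but the summarisation step is the same: per-sample statistics averaged over Monte-Carlo draws.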
Risso, Davide. "Simultaneous inference for RNA-Seq data." Doctoral thesis, Università degli studi di Padova, 2012. http://hdl.handle.net/11577/3421731.
In recent years, massive RNA sequencing (RNA-Seq) has become a frequent choice for gene expression studies. This technique has the potential to supersede microarrays as the standard technique for studying transcriptional profiles. At the gene level, RNA-Seq data come in the form of counts, unlike microarrays, which estimate expression on a continuous scale. This creates the need for new methods and models for the analysis of count data in high-dimensional problems. This thesis addresses several problems related to the exploration and modelling of RNA-Seq data. In particular, methods for visualizing and numerically summarizing the data are introduced. A new algorithm for clustering the data is then defined, along with normalization strategies aimed at removing the biases specific to this technology. Finally, a hierarchical Bayesian model is defined to model the expression of RNA-Seq data and test for differences across experimental conditions. The model accounts for sequencing depth and overdispersion and automatically incorporates different types of normalization.
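The abstract above mentions a hierarchical Bayesian model that accounts for sequencing depth and overdispersion. A common building block for such RNA-Seq count models is the gamma-Poisson (negative binomial) distribution with a library-size offset; the sketch below illustrates only that building block under assumed parameter values, not the thesis model itself:

```python
import numpy as np

def simulate_counts(mu, depth, dispersion, n, seed=0):
    # Gamma-Poisson (negative binomial) simulation of RNA-Seq counts:
    # the expected count scales with sequencing depth, and `dispersion`
    # adds extra-Poisson variance (Var = m + dispersion * m**2,
    # where m = mu * depth).
    rng = np.random.default_rng(seed)
    m = mu * depth
    shape = 1.0 / dispersion      # gamma shape
    scale = m * dispersion        # gamma scale, so E[rate] = m
    rates = rng.gamma(shape, scale, size=n)
    return rng.poisson(rates)

counts = simulate_counts(mu=5.0, depth=2.0, dispersion=0.1, n=20000)
```

For a Poisson model the variance would equal the mean; the gamma mixing inflates it, which is the overdispersion that RNA-Seq models such as this one must capture.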
Loaiza, Ganem Gabriel. "Advances in Deep Generative Modeling With Applications to Image Generation and Neuroscience." Thesis, 2019. https://doi.org/10.7916/d8-yp88-e002.
Yahi, Alexandre. "Simulating drug responses in laboratory test time series with deep generative modeling." Thesis, 2019. https://doi.org/10.7916/d8-arta-jt32.
Abdal, Rameen. "Image Embedding into Generative Adversarial Networks." Thesis, 2020. http://hdl.handle.net/10754/662516.
Mehri, Soroush. "Sequential modeling, generative recurrent neural networks, and their applications to audio." Thèse, 2016. http://hdl.handle.net/1866/18762.
Lamb, Alexander. "Generative models : a critical review." Thèse, 2018. http://hdl.handle.net/1866/21282.
Parent-Lévesque, Jérôme. "Towards deep unsupervised inverse graphics." Thesis, 2020. http://hdl.handle.net/1866/25467.
A long-standing goal of computer vision is to infer the underlying 3D content of a scene from a single photograph, a task known as inverse graphics. In recent years, machine learning has enabled many approaches to make great progress towards solving this problem. However, most approaches rely on 3D supervision data, which is expensive and sometimes impossible to obtain, and this limits the learning capabilities of such work. In this work, we explore the deep unsupervised inverse graphics training pipeline and propose two methods based on distinct 3D representations and associated differentiable rendering algorithms: namely surfels and a novel Voronoi-based representation. In the first method, based on surfels, we show that, while effective at maintaining view-consistency, producing view-dependent surfels from a learned depth map results in ambiguities, as the mapping between depth map and rendering is non-bijective. In our second method, we introduce a novel 3D representation based on Voronoi diagrams which models objects and scenes both explicitly and implicitly at the same time, thereby combining the benefits of both. We show how this representation can be used in both supervised and unsupervised contexts and discuss its advantages over traditional 3D representations.
Mastropietro, Olivier. "Deep Learning for Video Modelling." Thèse, 2017. http://hdl.handle.net/1866/20192.
Sylvain, Tristan. "Locality and compositionality in representation learning for complex visual tasks." Thesis, 2021. http://hdl.handle.net/1866/25594.
The use of deep neural architectures coupled with specific innovations such as adversarial methods, pre-training on large datasets and mutual information estimation has in recent years allowed rapid progress in many complex vision tasks such as zero-shot learning, scene generation, or multi-modal classification. Despite such progress, it is still not clear whether current representation learning methods will be enough to attain human-level performance on arbitrary visual tasks, and if not, what direction future research should take. In this thesis, we focus on two properties of representations that seem necessary to achieve good downstream performance in representation learning: locality and compositionality. Locality can be understood as a representation's ability to retain local information. This is relevant in many cases and specifically benefits computer vision, where natural images inherently contain local information, e.g. relevant patches of an image or multiple objects present in a scene. On the other hand, a compositional representation can be understood as one that arises from a combination of simpler parts. Convolutional neural networks are inherently compositional, and many complex images can be seen as compositions of relevant sub-components: individual objects and attributes in a scene, and semantic attributes in zero-shot learning, are two examples. We believe both properties hold the key to designing better representation learning methods. In this thesis, we present 3 articles dealing with locality and/or compositionality, and their application to representation learning for complex visual tasks. In the first article, we introduce ways of measuring locality and compositionality for image representations, and demonstrate that local and compositional representations perform better at zero-shot learning.
We also use these two notions as the basis for designing class-matching Deep InfoMax, a novel representation learning algorithm that achieves state-of-the-art performance in our proposed "zero-shot from scratch" setting, a harder zero-shot setting where external information, e.g. pre-training on other image datasets, is not allowed. In the second article, we show that by encouraging a generator to retain local object-level information, using a scene-graph similarity module, we can improve scene generation performance. This model also showcases the importance of compositionality, as many components operate individually on each object present. To fully demonstrate the reach of our approach, we perform detailed analyses and propose a new framework for evaluating scene generation models. Finally, in the third article, we show that encouraging high mutual information between local and global multi-modal representations of 2D and 3D medical images can lead to improvements in image classification and segmentation. This general framework can be applied to a wide variety of settings, and demonstrates the benefits not only of locality, but also of compositionality, as multi-modal representations are combined to obtain a more general one.
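The objective of maximizing mutual information between local and global representations, as described in the abstract above, is commonly operationalised with an InfoNCE-style contrastive lower bound. The following minimal numpy sketch illustrates the general idea; it is an assumption on our part, not the articles' implementation:

```python
import numpy as np

def infonce_lower_bound(local_feats, global_feats):
    # InfoNCE-style lower bound on mutual information between paired
    # local and global features: each local vector should score highest
    # against the global vector of its own image (the positive pair),
    # relative to the other images in the batch (the negatives).
    scores = local_feats @ global_feats.T              # (n, n) similarities
    scores -= scores.max(axis=1, keepdims=True)        # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    n = len(local_feats)
    return log_probs[np.arange(n), np.arange(n)].mean() + np.log(n)

rng = np.random.default_rng(0)
g = rng.normal(size=(8, 16))            # "global" features
l = g + 0.1 * rng.normal(size=(8, 16))  # correlated "local" features
mi_est = infonce_lower_bound(l, g)
```

The bound is capped at log(batch size), which is why such objectives are trained with reasonably large batches; strongly correlated pairs, as in the toy data above, push the estimate close to that cap.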
Dumoulin, Vincent. "Representation Learning for Visual Data." Thèse, 2018. http://hdl.handle.net/1866/21140.
Ehrler, Matthew. "VConstruct: a computationally efficient method for reconstructing satellite derived Chlorophyll-a data." Thesis, 2021. http://hdl.handle.net/1828/13346.
Dinh, Laurent. "Reparametrization in deep learning." Thèse, 2018. http://hdl.handle.net/1866/21139.
Chung, Junyoung. "On Deep Multiscale Recurrent Neural Networks." Thèse, 2018. http://hdl.handle.net/1866/21588.
Xu, Kelvin. "Exploring Attention Based Model for Captioning Images." Thèse, 2017. http://hdl.handle.net/1866/20194.