Dissertations / Theses on the topic 'Deep learning'

To see the other types of publications on this topic, follow the link: Deep learning.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Deep learning.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.

1

Dufourq, Emmanuel. "Evolutionary deep learning." Doctoral thesis, Faculty of Science, 2019. http://hdl.handle.net/11427/30357.

Full text
Abstract:
The primary objective of this thesis is to investigate whether evolutionary concepts can improve the performance, speed and convenience of algorithms in various active areas of machine learning research. Deep neural networks are exhibiting an explosion in the number of parameters that need to be trained, as well as the number of permutations of possible network architectures and hyper-parameters. There is little guidance on how to choose these and brute-force experimentation is prohibitively time-consuming. We show that evolutionary algorithms can help tame this explosion of freedom, by developing an algorithm that robustly evolves near-optimal deep neural network architectures and hyper-parameters across a wide range of image and sentiment classification problems. We further develop an algorithm that automatically determines whether a given data science problem is of classification or regression type, successfully choosing the correct problem type with more than 95% accuracy. Together these algorithms show that a great deal of the current "art" in the design of deep learning networks - and in the job of the data scientist - can be automated. Having discussed the general problem of optimising deep learning networks, the thesis moves on to a specific application: the automated extraction of human sentiment from text and images of human faces. Our results reveal that our approach is able to outperform several public and/or commercial text sentiment analysis algorithms using an evolutionary algorithm that learned to encode and extend sentiment lexicons. A second analysis looked at using evolutionary algorithms to estimate text sentiment while simultaneously compressing text data. An extensive analysis of twelve sentiment datasets reveals that accurate compression is possible with 3.3% loss in classification accuracy even with 75% compression of text size, which is useful in environments where data volumes are a problem. Finally, the thesis presents improvements to automated sentiment analysis of human faces to identify emotion, an area where there has been a tremendous amount of progress using convolutional neural networks. We provide a comprehensive critique of past work, highlight recommendations and list some open, unanswered questions in facial expression recognition using convolutional neural networks. One serious challenge when implementing such networks for facial expression recognition is the large number of trainable parameters, which results in long training times. We propose a novel method based on evolutionary algorithms to reduce the number of trainable parameters whilst simultaneously retaining classification performance, and in some cases achieving superior performance. We are robustly able to reduce the number of parameters on average by 95% with no loss in classification accuracy. Overall, our analyses show that evolutionary algorithms are a valuable addition to machine learning in the deep learning era: automating, compressing and/or improving results significantly, depending on the desired goal.
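For readers who want a concrete picture of the kind of search described above, the sketch below shows a bare-bones evolutionary loop over network hyper-parameters. The search space, the mutation rate and the placeholder fitness function are invented for illustration and are not Dufourq's algorithm.

    import random

    # Hypothetical hyper-parameter search space; the thesis also evolves architectures.
    SPACE = {
        "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3],
        "num_layers": [2, 3, 4, 5],
        "units": [64, 128, 256],
        "dropout": [0.0, 0.25, 0.5],
    }

    def random_individual():
        return {k: random.choice(v) for k, v in SPACE.items()}

    def mutate(individual, rate=0.3):
        child = dict(individual)
        for key in SPACE:
            if random.random() < rate:
                child[key] = random.choice(SPACE[key])
        return child

    def fitness(individual):
        # Placeholder: in practice, train a network with these settings and
        # return its validation accuracy.
        return -abs(individual["learning_rate"] - 1e-3) - 0.01 * individual["num_layers"]

    def evolve(pop_size=10, generations=5):
        population = [random_individual() for _ in range(pop_size)]
        for _ in range(generations):
            population.sort(key=fitness, reverse=True)
            parents = population[: pop_size // 2]            # truncation selection
            children = [mutate(random.choice(parents)) for _ in parents]
            population = parents + children
        return max(population, key=fitness)

    print(evolve())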
APA, Harvard, Vancouver, ISO, and other styles
2

He, Fengxiang. "Theoretical Deep Learning." Thesis, The University of Sydney, 2021. https://hdl.handle.net/2123/25674.

Full text
Abstract:
Deep learning has long been criticised as a black-box model for lacking sound theoretical explanation. During the PhD course, I explore and establish theoretical foundations for deep learning. In this thesis, I present my contributions positioned upon existing literature: (1) analysing the generalizability of the neural networks with residual connections via complexity and capacity-based hypothesis complexity measures; (2) modeling stochastic gradient descent (SGD) by stochastic differential equations (SDEs) and their dynamics, and further characterizing the generalizability of deep learning; (3) understanding the geometrical structures of the loss landscape that drives the trajectories of the dynamic systems, which sheds light on reconciling the over-representation and excellent generalizability of deep learning; and (4) discovering the interplay between generalization, privacy preservation, and adversarial robustness, which have seen rising concerns in deep learning deployment.
APA, Harvard, Vancouver, ISO, and other styles
3

FRACCAROLI, MICHELE. "Explainable Deep Learning." Doctoral thesis, Università degli studi di Ferrara, 2023. https://hdl.handle.net/11392/2503729.

Full text
Abstract:
The great success that Machine and Deep Learning have achieved in areas that are strategic for our society, such as industry, defence, medicine, etc., has led more and more organisations to invest in and explore the use of this technology. Machine Learning and Deep Learning algorithms and learned models can now be found in almost every area of our lives. From phones to smart home appliances, to the cars we drive. So it can be said that this pervasive technology is now in touch with our lives, and therefore we have to deal with it. This is why eXplainable Artificial Intelligence or XAI was born, one of the research trends currently in vogue in the field of Deep Learning and Artificial Intelligence. The idea behind this line of research is to make and/or design the new Deep Learning algorithms so that they are interpretable and comprehensible to humans. This necessity is due precisely to the fact that neural networks, the mathematical model underlying Deep Learning, act like a black box, making the internal reasoning they carry out to reach a decision incomprehensible and untrustworthy to humans. As we are delegating more and more important decisions to these mathematical models, it is very important to be able to understand the motivations that lead these models to make certain decisions. This is because we have integrated them into the most delicate processes of our society, such as medical diagnosis, autonomous driving or legal processes. The work presented in this thesis consists of studying and testing Deep Learning algorithms integrated with symbolic Artificial Intelligence techniques. This integration has a twofold purpose: to make the models more powerful, enabling them to carry out reasoning or constraining their behaviour in complex situations, and to make them interpretable. The thesis focuses on two macro topics: the explanations obtained through neuro-symbolic integration and the exploitation of explanations to make the Deep Learning algorithms more capable or intelligent. The neuro-symbolic integration was addressed in two ways, by experimenting with the integration of symbolic algorithms with neural networks. A first approach was to create a system to guide the training of the networks themselves in order to find the best combination of hyper-parameters to automate the design of these networks. This is done by integrating neural networks with Probabilistic Logic Programming (PLP). This integration makes it possible to exploit probabilistic rules tuned by the behaviour of the networks during the training phase or inherited from the experience of experts in the field. These rules are triggered when a problem occurs during network training. This generates an explanation of what was done to improve the training once a particular issue was identified. A second approach was to make probabilistic logic systems cooperate with neural networks for medical diagnosis on heterogeneous data sources. The second topic addressed in this thesis concerns the exploitation of explanations. In particular, the explanations one can obtain from neural networks are used to create attention modules that help in constraining and improving the performance of neural networks. All works developed during the PhD and described in this thesis have led to the publications listed in Chapter 14.2.
APA, Harvard, Vancouver, ISO, and other styles
4

Halle, Alex, and Alexander Hasse. "Topologieoptimierung mittels Deep Learning." Technische Universität Chemnitz, 2019. https://monarch.qucosa.de/id/qucosa%3A34343.

Full text
Abstract:
Topology optimisation is the search for an optimal component geometry for a given application. For complex problems, topology optimisation can require a great deal of time and computing capacity due to a high level of detail. These drawbacks of topology optimisation are to be reduced by means of deep learning, so that topology optimisation can serve the design engineer as an aid that delivers results within seconds. Deep learning is the extension of artificial neural networks, with which patterns or behavioural rules can be learned. The topology optimisation that has so far been computed numerically is thus to be solved with a deep learning approach. For this purpose, approaches, a computation scheme and first conclusions are presented and discussed.
APA, Harvard, Vancouver, ISO, and other styles
5

Goh, Hanlin. "Learning deep visual representations." Paris 6, 2013. http://www.theses.fr/2013PA066356.

Full text
Abstract:
Recent advancements in the areas of deep learning and visual information processing have presented an opportunity to unite both fields. These complementary fields combine to tackle the problem of classifying images into their semantic categories. Deep learning brings learning and representational capabilities to a visual processing model that is adapted for image classification. This thesis addresses problems that lead to the proposal of learning deep visual representations for image classification. The problem of deep learning is tackled on two fronts. The first aspect is the problem of unsupervised learning of latent representations from input data. The main focus is the integration of prior knowledge into the learning of restricted Boltzmann machines (RBM) through regularization. Regularizers are proposed to induce sparsity, selectivity and topographic organization in the coding to improve discrimination and invariance. The second direction introduces the notion of gradually transiting from unsupervised layer-wise learning to supervised deep learning. This is done through the integration of bottom-up information with top-down signals. Two novel implementations supporting this notion are explored. The first method uses top-down regularization to train a deep network of RBMs. The second method combines predictive and reconstructive loss functions to optimize a stack of encoder-decoder networks. The proposed deep learning techniques are applied to tackle the image classification problem. The bag-of-words model is adopted due to its strengths in image modeling through the use of local image descriptors and spatial pooling schemes. Deep learning with spatial aggregation is used to learn a hierarchical visual dictionary for encoding the image descriptors into mid-level representations. This method achieves leading image classification performance for object and scene images. The learned dictionaries are diverse and non-redundant. The speed of inference is also high. From this, a further optimization is performed for the subsequent pooling step. This is done by introducing a differentiable pooling parameterization and applying the error backpropagation algorithm. This thesis represents one of the first attempts to synthesize deep learning and the bag-of-words model. This union results in many challenging research problems, leaving much room for further study in this area.
APA, Harvard, Vancouver, ISO, and other styles
6

Geirsson, Gunnlaugur. "Deep learning exotic derivatives." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-430410.

Full text
Abstract:
Monte Carlo methods in derivative pricing are computationally expensive, in particular for evaluating a model's partial derivatives with respect to its inputs. This research proposes the use of deep learning to approximate such valuation models for highly exotic derivatives, using automatic differentiation to evaluate input sensitivities. Deep learning models are trained to approximate Phoenix Autocall valuation using a proprietary model used by Svenska Handelsbanken AB. Models are trained on large datasets of low-accuracy (10^4 simulations) Monte Carlo data, successfully learning the true model with an average error of 0.1% on validation data generated by 10^8 simulations. A specific model parametrisation is proposed for 2-day valuation only, to be recalibrated interday using transfer learning. Automatic differentiation approximates sensitivity to (normalised) underlying asset prices with a mean relative error generally below 1.6%. Overall error when predicting sensitivity to implied volatility is found to lie within 10%-40%. Nearly identical results are found using finite differences and automatic differentiation in both cases. Automatic differentiation is not successful at capturing sensitivity to interday contract change in value, though errors of 8%-25% are achieved by finite difference. Model recalibration by transfer learning proves to converge over 15 times faster and with up to 14% lower relative error than training using random initialisation. The results show that deep learning models can efficiently learn Monte Carlo valuation, and that these can be quickly recalibrated by transfer learning. The deep learning model gradient computed by automatic differentiation proves a good approximation of the true model sensitivities. Future research proposals include studying optimised recalibration schedules, using training data generated by single Monte Carlo price paths, and studying additional parameters and contracts.
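As a rough, generic illustration of the workflow the abstract describes (a network fitted to Monte Carlo prices, then differentiated for sensitivities), the PyTorch sketch below uses made-up data and layer sizes; it is not Handelsbanken's proprietary model or the thesis code.

    import torch
    import torch.nn as nn

    # Toy stand-in data: x = normalised market inputs, y = noisy "Monte Carlo" prices.
    x = torch.rand(4096, 8)
    y = x.sum(dim=1, keepdim=True) + 0.1 * torch.randn(4096, 1)

    net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(),
                        nn.Linear(64, 64), nn.ReLU(),
                        nn.Linear(64, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)

    for _ in range(200):                         # fit the surrogate pricing function
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(x), y)
        loss.backward()
        opt.step()

    # Sensitivities of the learned price w.r.t. the inputs via automatic differentiation.
    query = torch.rand(1, 8, requires_grad=True)
    price = net(query).sum()                     # scalar price for one query point
    price.backward()
    print(price.item(), query.grad)              # approximate sensitivities of the surrogate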
APA, Harvard, Vancouver, ISO, and other styles
7

Wülfing, Jan [Verfasser], and Martin [Akademischer Betreuer] Riedmiller. "Stable deep reinforcement learning." Freiburg : Universität, 2019. http://d-nb.info/1204826188/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

White, Martin. "Deep Learning Software Repositories." W&M ScholarWorks, 2017. https://scholarworks.wm.edu/etd/1516639667.

Full text
Abstract:
Bridging the abstraction gap between artifacts and concepts is the essence of software engineering (SE) research problems. SE researchers regularly use machine learning to bridge this gap, but there are three fundamental issues with traditional applications of machine learning in SE research. Traditional applications are too reliant on labeled data, too reliant on human intuition, and not capable of learning expressive yet efficient internal representations. Ultimately, SE research needs approaches that can automatically learn representations of massive, heterogeneous datasets in situ, apply the learned features to a particular task and possibly transfer knowledge from task to task. Improvements in both computational power and the amount of memory in modern computer architectures have enabled new approaches to canonical machine learning tasks. Specifically, these architectural advances have enabled machines that are capable of learning deep, compositional representations of massive data depots. The rise of deep learning has ushered in tremendous advances in several fields. Given the complexity of software repositories, we presume deep learning has the potential to usher in new analytical frameworks and methodologies for SE research and the practical applications it reaches. This dissertation examines and enables deep learning algorithms in different SE contexts. We demonstrate that deep learners significantly outperform state-of-the-practice software language models at code suggestion on a Java corpus. Further, these deep learners for code suggestion automatically learn how to represent lexical elements. We use these representations to transmute source code into structures for detecting similar code fragments at different levels of granularity—without declaring features for how the source code is to be represented. Then we use our learning-based framework for encoding fragments to intelligently select and adapt statements in a codebase for automated program repair. In our work on code suggestion, code clone detection, and automated program repair, everything for representing lexical elements and code fragments is mined from the source code repository. Indeed, our work aims to move SE research from the art of feature engineering to the science of automated discovery.
APA, Harvard, Vancouver, ISO, and other styles
9

Sun, Haozhe. "Modularity in deep learning." Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPASG090.

Full text
Abstract:
This Ph.D. thesis is dedicated to enhancing the efficiency of Deep Learning by leveraging the principle of modularity. It contains several main contributions: a literature survey on modularity in Deep Learning; the introduction of OmniPrint and Meta-Album, tools that facilitate the investigation of data modularity; case studies examining the effects of episodic few-shot learning, an instance of data modularity; a modular evaluation mechanism named LTU for assessing privacy risks; and the method RRR for reusing pre-trained modular models to create more compact versions. Modularity, which involves decomposing an entity into sub-entities, is a prevalent concept across various disciplines. This thesis examines modularity across three axes of Deep Learning: data, task, and model. OmniPrint and Meta-Album assist in benchmarking modular models and exploring data modularity's impacts. LTU ensures the reliability of the privacy assessment. RRR significantly enhances the utilization efficiency of pre-trained modular models. Collectively, this thesis bridges the modularity principle with Deep Learning and underscores its advantages in selected fields of Deep Learning, contributing to more resource-efficient Artificial Intelligence.
APA, Harvard, Vancouver, ISO, and other styles
10

Arnold, Ludovic. "Learning Deep Representations : Toward a better new understanding of the deep learning paradigm." Phd thesis, Université Paris Sud - Paris XI, 2013. http://tel.archives-ouvertes.fr/tel-00842447.

Full text
Abstract:
Since 2006, deep learning algorithms which rely on deep architectures with several layers of increasingly complex representations have been able to outperform state-of-the-art methods in several settings. Deep architectures can be very efficient in terms of the number of parameters required to represent complex operations which makes them very appealing to achieve good generalization with small amounts of data. Although training deep architectures has traditionally been considered a difficult problem, a successful approach has been to employ an unsupervised layer-wise pre-training step to initialize deep supervised models. First, unsupervised learning has many benefits w.r.t. generalization because it only relies on unlabeled data which is easily found. Second, the possibility to learn representations layer by layer instead of all layers at once improves generalization further and reduces computational time. However, deep learning is a very recent approach and still poses a lot of theoretical and practical questions concerning the consistency of layer-wise learning with many layers and difficulties such as evaluating performance, performing model selection and optimizing layers. In this thesis we first discuss the limitations of the current variational justification for layer-wise learning which does not generalize well to many layers. We ask if a layer-wise method can ever be truly consistent, i.e. capable of finding an optimal deep model by training one layer at a time without knowledge of the upper layers. We find that layer-wise learning can in fact be consistent and can lead to optimal deep generative models. To do this, we introduce the Best Latent Marginal (BLM) upper bound, a new criterion which represents the maximum log-likelihood of a deep generative model where the upper layers are unspecified. We prove that maximizing this criterion for each layer leads to an optimal deep architecture, provided the rest of the training goes well. Although this criterion cannot be computed exactly, we show that it can be maximized effectively by auto-encoders when the encoder part of the model is allowed to be as rich as possible. This gives a new justification for stacking models trained to reproduce their input and yields better results than the state-of-the-art variational approach. Additionally, we give a tractable approximation of the BLM upper-bound and show that it can accurately estimate the final log-likelihood of models. Taking advantage of these theoretical advances, we propose a new method for performing layer-wise model selection in deep architectures, and a new criterion to assess whether adding more layers is warranted. As for the difficulty of training layers, we also study the impact of metrics and parametrization on the commonly used gradient descent procedure for log-likelihood maximization. We show that gradient descent is implicitly linked with the metric of the underlying space and that the Euclidean metric may often be an unsuitable choice as it introduces a dependence on parametrization and can lead to a breach of symmetry. To mitigate this problem, we study the benefits of the natural gradient and show that it can restore symmetry, regrettably at a high computational cost. We thus propose that a centered parametrization may alleviate the problem with almost no computational overhead.
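The stacking of models "trained to reproduce their input" mentioned above can be pictured with the minimal greedy layer-wise pretraining sketch below. It does not implement the BLM upper bound or the natural-gradient analysis; the data, layer sizes and training settings are arbitrary placeholders.

    import torch
    import torch.nn as nn

    def pretrain_layer(data, in_dim, hid_dim, epochs=50):
        # Train one autoencoder layer to reconstruct its input, then keep only the encoder.
        enc = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.Sigmoid())
        dec = nn.Linear(hid_dim, in_dim)
        opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
        for _ in range(epochs):
            opt.zero_grad()
            loss = nn.functional.mse_loss(dec(enc(data)), data)
            loss.backward()
            opt.step()
        return enc

    x = torch.rand(1024, 32)                    # toy unlabeled data
    sizes, encoders, h = [32, 16, 8], [], x
    for i in range(len(sizes) - 1):
        enc = pretrain_layer(h, sizes[i], sizes[i + 1])
        encoders.append(enc)
        h = enc(h).detach()                     # codes become the next layer's input

    deep_encoder = nn.Sequential(*encoders)     # initialisation for supervised fine-tuning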
APA, Harvard, Vancouver, ISO, and other styles
11

Hussein, Ahmed. "Deep learning based approaches for imitation learning." Thesis, Robert Gordon University, 2018. http://hdl.handle.net/10059/3117.

Full text
Abstract:
Imitation learning refers to an agent's ability to mimic a desired behaviour by learning from observations. The field is rapidly gaining attention due to recent advances in computational and communication capabilities as well as rising demand for intelligent applications. The goal of imitation learning is to describe the desired behaviour by providing demonstrations rather than instructions. This enables agents to learn complex behaviours with general learning methods that require minimal task specific information. However, imitation learning faces many challenges. The objective of this thesis is to advance the state of the art in imitation learning by adopting deep learning methods to address two major challenges of learning from demonstrations. Firstly, representing the demonstrations in a manner that is adequate for learning. We propose novel Convolutional Neural Networks (CNN) based methods to automatically extract feature representations from raw visual demonstrations and learn to replicate the demonstrated behaviour. This alleviates the need for task specific feature extraction and provides a general learning process that is adequate for multiple problems. The second challenge is generalizing a policy over situations not seen in the training demonstrations. This is a common problem because demonstrations typically show the best way to perform a task and don't offer any information about recovering from suboptimal actions. Several methods are investigated to improve the agent's generalization ability based on its initial performance. Our contributions in this area are threefold. Firstly, we propose an active data aggregation method that queries the demonstrator in situations of low confidence. Secondly, we investigate combining learning from demonstrations and reinforcement learning. A deep reward shaping method is proposed that learns a potential reward function from demonstrations. Finally, memory architectures in deep neural networks are investigated to provide context to the agent when taking actions. Using recurrent neural networks addresses the dependency between the state-action sequences taken by the agent. The experiments are conducted in simulated environments on 2D and 3D navigation tasks that are learned from raw visual data, as well as a 2D soccer simulator. The proposed methods are compared to state-of-the-art deep reinforcement learning methods. The results show that deep learning architectures can learn suitable representations from raw visual data and effectively map them to atomic actions. The proposed methods for addressing generalization show improvements over using supervised learning and reinforcement learning alone. The results are thoroughly analysed to identify the benefits of each approach and situations in which it is most suitable.
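A minimal sketch of the supervised imitation step described above (a CNN mapping raw visual input to atomic actions) is given below; the frame size, network shape, action count and data are illustrative assumptions rather than the architectures evaluated in the thesis.

    import torch
    import torch.nn as nn

    # Toy demonstration data: 84x84 grayscale frames with the demonstrator's discrete actions.
    frames = torch.rand(64, 1, 84, 84)
    actions = torch.randint(0, 4, (64,))

    policy = nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=8, stride=4), nn.ReLU(),
        nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
        nn.Flatten(),
        nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
        nn.Linear(256, 4),                       # logits over 4 atomic actions
    )
    opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

    for _ in range(10):                          # supervised imitation of the demonstrations
        opt.zero_grad()
        loss = nn.functional.cross_entropy(policy(frames), actions)
        loss.backward()
        opt.step()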
APA, Harvard, Vancouver, ISO, and other styles
12

Zhang, Jingwei [Verfasser], and Wolfram [Akademischer Betreuer] Burgard. "Learning navigation policies with deep reinforcement learning." Freiburg : Universität, 2021. http://d-nb.info/1235325571/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Rodés-Guirao, Lucas. "Deep Learning for Digital Typhoon : Exploring a typhoon satellite image dataset using deep learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-249514.

Full text
Abstract:
Efficient early warning systems can help in the management of natural disaster events, by allowing for adequate evacuations and resource administration. Several approaches have been used to implement proper early warning systems, such as simulations or statistical models, which rely on the collection of meteorological data. Data-driven techniques have proven effective for building statistical models that are able to generalise to unseen data. Motivated by this, in this work, we explore deep learning techniques applied to the typhoon meteorological satellite image dataset "Digital Typhoon". We focus on intensity measurement and categorisation of different natural phenomena. Firstly, we build a classifier to differentiate natural tropical cyclones and extratropical cyclones and, secondly, we implement a regression model to estimate the centre pressure value of a typhoon. In addition, we also explore cleaning methodologies to ensure that the data used is reliable. The results obtained show that deep learning techniques can be effective under certain circumstances, providing reliable classification and regression models and feature extractors. Further research is expected in the future to draw more conclusions and validate the obtained results.
APA, Harvard, Vancouver, ISO, and other styles
14

Franceschelli, Giorgio. "Generative Deep Learning and Creativity." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.

Find full text
Abstract:
"It has no pretensions to originate anything; it can only do whatever we know how to order it to perform." Thus, over 150 years ago, Lady Lovelace commented on Babbage's Analytical Engine, the ancestor of our computers. A remark that, so many years later, sounds almost like a challenge: thanks to the spread of Generative Deep Learning techniques and to research in Computational Creativity, more and more effort has been devoted to refuting the now-famous Lovelace Objection. Starting precisely from this, four questions form the cornerstones of Computational Creativity: whether it is possible to exploit computational techniques to understand human creativity; and, above all, whether computers can do things that appear creative (if not things that actually are creative), and whether they can learn to recognise creativity. This thesis aims to contribute to this context by exploring the latter two questions with Deep Learning techniques. In particular, building on the definition of creativity proposed by Margaret Boden, a metric given by the weighted sum of three individual components (value, novelty and surprise) is presented for recognising creativity. In addition, exploiting this measure, UCAN (Unexpectedly Creative Adversarial Network) is also presented: a creativity-oriented generative model that learns to produce creative works by maximising the above metric. Both the generator and the metric were tested on nineteenth-century American poetry; the results show that the metric is indeed able to capture the historical trajectory, and that it can represent an important step forward for the study of Computational Creativity; the generator, while not achieving equally excellent results, stands as a starting point for the future definition of a genuinely creative model.
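As a purely illustrative sketch of the metric described in the abstract (a weighted sum of value, novelty and surprise), one might write something like the following; the weights and the component scores are hypothetical placeholders, not the ones defined or learned in the thesis.

    def creativity_score(value, novelty, surprise, weights=(0.4, 0.3, 0.3)):
        # Weighted sum of the three components named in the abstract.
        # The weights and the way each component is scored are assumptions for
        # illustration; the thesis instantiates them for 19th-century American poetry.
        w_value, w_novelty, w_surprise = weights
        return w_value * value + w_novelty * novelty + w_surprise * surprise

    print(creativity_score(value=0.8, novelty=0.6, surprise=0.4))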
APA, Harvard, Vancouver, ISO, and other styles
15

Shakibi, Babak. "Predicting parameters in deep learning." Thesis, University of British Columbia, 2014. http://hdl.handle.net/2429/50999.

Full text
Abstract:
The recent success of large and deep neural network models has motivated the training of even larger and deeper networks with millions of parameters. Training these models usually requires parallel training methods, where communicating a large number of parameters becomes one of the main bottlenecks. We show that many deep learning models are over-parameterized and that their learned features can be predicted given only a small fraction of their parameters. We then propose a method which exploits this fact during training to reduce the number of parameters that need to be learned. Our method is orthogonal to the choice of network architecture and can be applied in a wide variety of neural network architectures and application areas. We evaluate this technique using various experiments in image and speech recognition and show that we can learn only a fraction of the parameters (up to 10% in some cases) and predict the rest without a significant loss in the predictive accuracy of the model.
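The idea of learning only a fraction of the parameters and predicting the rest can be pictured with the simplified low-rank sketch below; it is a generic stand-in under assumed dimensions, not the prediction method developed in the thesis.

    import torch
    import torch.nn as nn

    class FactorizedLinear(nn.Module):
        # The full weight matrix W (out x in) is reconstructed as U @ V, where only V
        # (and the bias) are trained and U is a fixed random basis.
        def __init__(self, in_dim, out_dim, rank):
            super().__init__()
            self.register_buffer("U", torch.randn(out_dim, rank) / rank ** 0.5)
            self.V = nn.Parameter(0.01 * torch.randn(rank, in_dim))
            self.bias = nn.Parameter(torch.zeros(out_dim))

        def forward(self, x):
            weight = self.U @ self.V              # "predict" the full weight matrix
            return x @ weight.t() + self.bias

    layer = FactorizedLinear(in_dim=784, out_dim=256, rank=20)
    out = layer(torch.rand(32, 784))              # behaves like a normal linear layer
    trained = sum(p.numel() for p in layer.parameters())
    dense = 784 * 256 + 256
    print(f"trained parameters: {trained} vs dense layer: {dense}")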
APA, Harvard, Vancouver, ISO, and other styles
16

Nguyen, Thien Huu. "Deep Learning for Information Extraction." Thesis, New York University, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10260911.

Full text
Abstract:

The explosion of data has made it crucial to analyze the data and distill important information effectively and efficiently. A significant part of such data is presented in unstructured and free-text documents. This has prompted the development of techniques for information extraction that allow computers to automatically extract structured information from natural free-text data. Information extraction is a branch of natural language processing in artificial intelligence that has a wide range of applications, including question answering, knowledge base population, information retrieval, etc. The traditional approach for information extraction has mainly involved hand-designing large feature sets (feature engineering) for different information extraction problems, i.e., entity mention detection, relation extraction, coreference resolution, event extraction, and entity linking. This approach is limited by the laborious and expensive effort required for feature engineering for different domains, and suffers from the unseen word/feature problem of natural languages.

This dissertation explores a different approach for information extraction that uses deep learning to automate the representation learning process and generate more effective features. Deep learning is a subfield of machine learning that uses multiple layers of connections to reveal the underlying representations of data. I develop the fundamental deep learning models for information extraction problems and demonstrate their benefits through systematic experiments.

First, I examine word embeddings, a general word representation that is produced by training a deep learning model on a large unlabelled dataset. I introduce methods to use word embeddings to obtain new features that generalize well across domains for relation extraction. This is done for both the feature-based method and the kernel-based method of relation extraction.

Second, I investigate deep learning models for different problems, including entity mention detection, relation extraction and event detection. I develop new mechanisms and network architectures that allow deep learning to model the structures of information extraction problems more effectively. Some extensive experiments are conducted on the domain adaptation and transfer learning settings to highlight the generalization advantage of the deep learning models for information extraction.

Finally, I investigate joint frameworks to simultaneously solve several information extraction problems and benefit from the inter-dependencies among these problems. I design a novel memory-augmented network for deep learning to properly exploit such inter-dependencies. I demonstrate the effectiveness of this network on two important problems of information extraction, i.e., event extraction and entity linking.

APA, Harvard, Vancouver, ISO, and other styles
17

Palasek, Petar. "Action recognition using deep learning." Thesis, Queen Mary, University of London, 2017. http://qmro.qmul.ac.uk/xmlui/handle/123456789/30828.

Full text
Abstract:
In this thesis we study deep learning architectures for the problem of human action recognition in image sequences, i.e. the problem of automatically recognizing what people are doing in a given video. As unlabeled video data is easily accessible these days, we first explore models that can learn meaningful representations of sequences without actually having to know what is happening in the sequences at hand. More specifically, we first explore the convolutional restricted Boltzmann machine (RBM) and show how a stack of convolutional RBMs can be used to learn and extract features from sequences in an unsupervised way. Using the classical Fisher vector pipeline to encode the extracted features, we apply them to the task of action classification. We move on to feature extraction using larger, deep convolutional neural networks and propose a novel architecture which expresses the processing steps of the classical Fisher vector pipeline as network layers. In contrast to other methods where these steps are performed consecutively and the corresponding parameters are learned in an unsupervised manner, defining them as a single neural network allows us to refine the whole model discriminatively in an end-to-end fashion. We show that our method achieves significant improvements in comparison to the classical Fisher vector extraction chain and results in a comparable performance to other convolutional networks, while largely reducing the number of required trainable parameters. Finally, we explore how the proposed architecture can be modified into a hybrid network that combines the benefits of both unsupervised and supervised training methods, resulting in a model that learns a semi-supervised Fisher vector descriptor of the input data. We evaluate the proposed model on image classification and action recognition problems and show how the model's classification performance improves as the amount of unlabeled data increases during training.
APA, Harvard, Vancouver, ISO, and other styles
18

Zhuang, Zhongfang. "Deep Learning on Attributed Sequences." Digital WPI, 2019. https://digitalcommons.wpi.edu/etd-dissertations/507.

Full text
Abstract:
Recent research in feature learning has been extended to sequence data, where each instance consists of a sequence of heterogeneous items with a variable length. However, in many real-world applications, the data exists in the form of attributed sequences, which are composed of a set of fixed-size attributes and variable-length sequences with dependencies between them. In the attributed sequence context, feature learning remains challenging due to the dependencies between sequences and their associated attributes. In this dissertation, we focus on analyzing and building deep learning models for four new problems on attributed sequences. First, we propose a framework, called NAS, to produce feature representations of attributed sequences in an unsupervised fashion. NAS is capable of producing task-independent embeddings that can be used in various mining tasks of attributed sequences. Second, we study the problem of deep metric learning on attributed sequences. The goal is to learn a distance metric based on pairwise user feedback. In this task, we propose a framework, called MLAS, to learn a distance metric that measures the similarity and dissimilarity between attributed sequence feedback pairs. Third, we study the problem of one-shot learning on attributed sequences. This problem is important for a variety of real-world applications ranging from fraud prevention to network intrusion detection. We design a deep learning framework OLAS to tackle this problem. Once OLAS is trained, we can then use it to make predictions not only for the new data but also for entire previously unseen new classes. Lastly, we investigate the problem of attributed sequence classification with an attention model. This is challenging because we now need to assess the importance of each item in each sequence, considering both the sequence itself and the associated attributes. In this work, we propose a framework, called AMAS, to classify attributed sequences using the information from the sequences, metadata, and the computed attention. Our extensive experiments on real-world datasets demonstrate that the proposed solutions significantly improve the performance of each task over the state-of-the-art methods on attributed sequences.
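A generic way to encode an attributed sequence (a fixed-size attribute vector plus a variable-length sequence) is sketched below in PyTorch; it only pictures the problem setting and is not the NAS, MLAS, OLAS or AMAS framework, and all dimensions are invented.

    import torch
    import torch.nn as nn

    class AttributedSequenceEncoder(nn.Module):
        # Fuses a fixed-size attribute vector with the final state of a GRU that
        # reads the variable-length sequence of items.
        def __init__(self, num_items, attr_dim, hidden=64):
            super().__init__()
            self.attr_net = nn.Sequential(nn.Linear(attr_dim, hidden), nn.ReLU())
            self.item_emb = nn.Embedding(num_items, hidden)
            self.seq_net = nn.GRU(hidden, hidden, batch_first=True)
            self.fuse = nn.Linear(2 * hidden, hidden)

        def forward(self, attrs, seq):
            a = self.attr_net(attrs)                   # fixed-size attributes
            _, h = self.seq_net(self.item_emb(seq))    # variable-length sequence
            return torch.tanh(self.fuse(torch.cat([a, h[-1]], dim=-1)))

    enc = AttributedSequenceEncoder(num_items=1000, attr_dim=10)
    z = enc(torch.rand(4, 10), torch.randint(0, 1000, (4, 7)))
    print(z.shape)                                     # torch.Size([4, 64])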
APA, Harvard, Vancouver, ISO, and other styles
19

Zhang, Chiyuan Ph D. Massachusetts Institute of Technology. "Deep learning and structured data." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/115643.

Full text
Abstract:
In recent years, deep learning has witnessed successful applications in many different domains such as visual object recognition, detection and segmentation, automatic speech recognition, natural language processing, and reinforcement learning. In this thesis, we will investigate deep learning from a spectrum of different perspectives. First of all, we will study the question of generalization, which is one of the most fundamental notions in machine learning theory. We will show how, in the regime of deep learning, the characterization of generalization becomes different from the conventional way, and propose alternative ways to approach it. Moving from theory to more practical perspectives, we will show two different applications of deep learning. One originates from a real-world problem of automatic geophysical feature detection from seismic recordings to help oil & gas exploration; the other is motivated by computational neuroscientific modeling and study of the human auditory system. More specifically, we will show how deep learning could be adapted to play nicely with the unique structures associated with the problems from different domains. Lastly, we move to the computer system design perspective, and present our efforts in building better deep learning systems to allow efficient and flexible computation in both academic and industrial worlds.
APA, Harvard, Vancouver, ISO, and other styles
20

Drexler, Jennifer Fox. "Deep unsupervised learning from speech." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/105696.

Full text
Abstract:
Automatic speech recognition (ASR) systems have become hugely successful in recent years - we have become accustomed to speech interfaces across all kinds of devices. However, despite the huge impact ASR has had on the way we interact with technology, it is out of reach for a significant portion of the world's population. This is because these systems rely on a variety of manually-generated resources - like transcripts and pronunciation dictionaries - that can be both expensive and difficult to acquire. In this thesis, we explore techniques for learning about speech directly from speech, with no manually generated transcriptions. Such techniques have the potential to revolutionize speech technologies for the vast majority of the world's population. The cognitive science and computer science communities have both been investing increasing time and resources into exploring this problem. However, a full unsupervised speech recognition system is a hugely complicated undertaking and is still a long way away. As in previous work, we focus on the lower-level tasks which will underlie an eventual unsupervised speech recognizer. We specifically focus on two tasks: developing linguistically meaningful representations of speech and segmenting speech into phonetic units. This thesis approaches these tasks from a new direction: deep learning. While modern deep learning methods have their roots in ideas from the 1960s and even earlier, deep learning techniques have recently seen a resurgence, thanks to huge increases in computational power and new efficient learning algorithms. Deep learning algorithms have been instrumental in the recent progress of traditional supervised speech recognition; here, we extend that work to unsupervised learning from speech.
APA, Harvard, Vancouver, ISO, and other styles
21

Rippel, Oren. "Sculpting representations for deep learning." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/104581.

Full text
Abstract:
In machine learning, the choice of space in which to represent our data is of vital importance to their effective and efficient analysis. In this thesis, we develop approaches to address a number of problems in representation learning. We employ deep learning as a means of sculpting our representations, and also develop improved representations for deep learning models. We present contributions that are based on five papers and make progress in several different research directions. First, we present techniques which leverage spatial and relational structure to achieve greater computational efficiency of model optimization and query retrieval. This allows us to train distance metric learning models 5-30 times faster; optimize convolutional neural networks 2-5 times faster; perform content-based image retrieval hundreds of times faster on codes hundreds of times longer than feasible before; and improve the complexity of Bayesian optimization to linear in the number of observations in contrast to the cubic dependence in its naive Gaussian process formulation. Furthermore, we introduce ideas to facilitate preservation of relevant information within the learned representations, and demonstrate this leads to improved supervision results. Our approaches achieve state-of-the-art classification and transfer learning performance on a number of well-known machine learning benchmarks. In addition, while deep learning models are able to discover structure in high dimensional input domains, they only offer implicit probabilistic descriptions. We develop an algorithm to enable probabilistic interpretability of deep representations. It constructs a transformation to a representation space under which the map of the distribution is approximately factorized and has known marginals. This allows tractable density estimation and inference within this alternate domain.
APA, Harvard, Vancouver, ISO, and other styles
22

Simonovsky, Martin. "Deep learning on attributed graphs." Thesis, Paris Est, 2018. http://www.theses.fr/2018PESC1133/document.

Full text
Abstract:
A graph is a powerful concept for representing relations between pairs of entities. Data with underlying graph structure can be found across many disciplines, describing chemical compounds, surfaces of three-dimensional models, social interactions, or knowledge bases, to name only a few. There is a natural desire for understanding such data better. Deep learning (DL) has achieved significant breakthroughs in a variety of machine learning tasks in recent years, especially where data is structured on a grid, such as in text, speech, or image understanding. However, surprisingly little has been done to explore the applicability of DL on graph-structured data directly. The goal of this thesis is to investigate architectures for DL on graphs and study how to transfer, adapt or generalize concepts working well on sequential and image data to this domain. We concentrate on two important primitives: embedding graphs or their nodes into a continuous vector space representation (encoding) and, conversely, generating graphs from such vectors back (decoding). To that end, we make the following contributions. First, we introduce Edge-Conditioned Convolutions (ECC), a convolution-like operation on graphs performed in the spatial domain where filters are dynamically generated based on edge attributes. The method is used to encode graphs with arbitrary and varying structure. Second, we propose SuperPoint Graph, an intermediate point cloud representation with rich edge attributes encoding the contextual relationship between object parts. Based on this representation, ECC is employed to segment large-scale point clouds without major sacrifice in fine details. Third, we present GraphVAE, a graph generator that can decode graphs with a variable but upper-bounded number of nodes, using approximate graph matching to align the predictions of an autoencoder with its inputs. The method is applied to the task of molecule generation.
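A minimal sketch of the edge-conditioned convolution idea described above, in which a small network generates a filter from each edge attribute vector; the layer sizes, the mean aggregation and the filter-generating MLP are illustrative assumptions rather than the exact ECC formulation.

    import torch
    import torch.nn as nn

    class EdgeConditionedConv(nn.Module):
        """Toy ECC-style layer: an MLP maps each edge attribute vector to a
        filter matrix, which is applied to the neighbour's features before averaging."""
        def __init__(self, in_dim, out_dim, edge_dim):
            super().__init__()
            self.in_dim, self.out_dim = in_dim, out_dim
            self.filter_net = nn.Sequential(
                nn.Linear(edge_dim, 64), nn.ReLU(),
                nn.Linear(64, in_dim * out_dim))
            self.root = nn.Linear(in_dim, out_dim)   # transform of the node's own features

        def forward(self, x, edge_index, edge_attr):
            # x: [N, in_dim]; edge_index: [2, E] with rows (source, target); edge_attr: [E, edge_dim]
            src, dst = edge_index
            W = self.filter_net(edge_attr).view(-1, self.out_dim, self.in_dim)   # one filter per edge
            msg = torch.bmm(W, x[src].unsqueeze(-1)).squeeze(-1)                 # [E, out_dim]
            agg = torch.zeros(x.size(0), self.out_dim).index_add_(0, dst, msg)
            deg = torch.zeros(x.size(0)).index_add_(0, dst, torch.ones(dst.size(0))).clamp_(min=1.0)
            return self.root(x) + agg / deg.unsqueeze(-1)                        # self term + mean of messages

    # Toy usage: 5 nodes with 8-dim features, 4 edges with 3-dim attributes.
    x = torch.randn(5, 8)
    edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
    edge_attr = torch.randn(4, 3)
    layer = EdgeConditionedConv(8, 16, 3)
    out = layer(x, edge_index, edge_attr)            # [5, 16]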
APA, Harvard, Vancouver, ISO, and other styles
23

Piano, Francesco. "Deep Reinforcement Learning con PyTorch." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amslaurea.unibo.it/25340/.

Full text
Abstract:
Reinforcement Learning is a research field of Machine Learning in which an agent solves problems by choosing the most suitable action to perform through an iterative learning process, in a dynamic environment that provides incentives through rewards. Deep Learning, also a Machine Learning approach, exploits an artificial neural network to apply representation learning methods in order to obtain a data structure that is better suited to processing. Only recently has Deep Reinforcement Learning, created by combining these two learning paradigms, made it possible to solve problems previously considered intractable, achieving remarkable success and renewing researchers' interest in the application of Reinforcement Learning algorithms. This thesis deepens the study of Reinforcement Learning applied to simple problems and then examines how it can overcome its characteristic limitations through the use of artificial neural networks, so that it can be applied in a Deep Learning context using the PyTorch framework, a library currently widely used for scientific computing and Machine Learning.
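As an illustration of the kind of Deep Q-Learning setup such a thesis studies with PyTorch, a minimal Q-network and temporal-difference update are sketched below; the state/action sizes and hyper-parameters are assumptions, not taken from the thesis.

    import random
    import torch
    import torch.nn as nn

    class QNetwork(nn.Module):
        """Small fully connected network approximating Q(s, a)."""
        def __init__(self, state_dim=4, n_actions=2):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, 64), nn.ReLU(),
                nn.Linear(64, n_actions))

        def forward(self, s):
            return self.net(s)

    q_net, target_net = QNetwork(), QNetwork()
    target_net.load_state_dict(q_net.state_dict())
    optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
    gamma = 0.99

    def td_update(batch):
        """One DQN update on a batch of tensors (state, action, reward, next_state, done)."""
        s, a, r, s2, done = batch
        q_sa = q_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            target = r + gamma * (1 - done) * target_net(s2).max(dim=1).values
        loss = nn.functional.smooth_l1_loss(q_sa, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

    def epsilon_greedy(state, eps=0.1):
        """Exploration policy: random action with probability eps, greedy otherwise."""
        if random.random() < eps:
            return random.randrange(2)
        return int(q_net(state.unsqueeze(0)).argmax(dim=1).item())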
APA, Harvard, Vancouver, ISO, and other styles
24

Broström, Axel, and Richard Kristiansson. "Exotic Derivatives and Deep Learning." Thesis, KTH, Matematisk statistik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-228476.

Full text
Abstract:
This thesis investigates the use of Artificial Neural Networks (ANNs) for calculating present values, Value-at-Risk and Expected Shortfall of options, both European call options and more complex rainbow options. The performance of the ANN is evaluated by comparing it to a second-order Taylor polynomial using pre-calculated sensitivities to certain risk-factors. A multilayer perceptron approach is chosen based on previous literature and applied to both types of options. The data is generated from a financial risk-management software for both call options and rainbow options along with the related Taylor approximations. The study shows that while the ANN outperforms the Taylor approximation in calculating present values and risk measures for certain movements in the underlying risk-factors, the general conclusion is that an ANN trained and evaluated in accordance with the method in this study does not outperform a Taylor approximation even if it is theoretically possible for the ANN to do so. The important conclusion of the study is that the ANN seems to be able to learn to calculate present values that otherwise require Monte Carlo simulation. Thus, the study is a proof of concept that requires further development for implementation.
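The second-order Taylor benchmark mentioned above can be written out explicitly. The sketch below uses delta, gamma and vega sensitivities for a single underlying and its implied volatility, an illustrative simplification of the multi-factor setup in the thesis; the numbers are made up.

    def taylor_pv_change(delta, gamma, vega, dS, dSigma):
        """Second-order Taylor approximation of the change in present value
        for a shift dS in the underlying and dSigma in implied volatility."""
        return delta * dS + 0.5 * gamma * dS ** 2 + vega * dSigma

    # Illustrative numbers (not from the thesis): a call with delta 0.6, gamma 0.02, vega 0.25.
    pv_change = taylor_pv_change(delta=0.6, gamma=0.02, vega=0.25, dS=-3.0, dSigma=0.01)
    # 0.6*(-3) + 0.5*0.02*9 + 0.25*0.01 = -1.7075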
APA, Harvard, Vancouver, ISO, and other styles
25

Watson, Cody. "Deep Learning In Software Engineering." W&M ScholarWorks, 2020. https://scholarworks.wm.edu/etd/1616444371.

Full text
Abstract:
Software evolves and therefore requires an evolving field of Software Engineering. The evolution of software can be seen on an individual project level through the software life cycle, as well as on a collective level, as we study the trends and uses of software in the real world. As the needs and requirements of users change, so must software evolve to reflect those changes. This cycle is never ending and has led to continuous and rapid development of software projects. More importantly, it has put a great responsibility on software engineers, causing them to adopt practices and tools that allow them to increase their efficiency. However, these tools suffer the same fate as software designed for the general population; they need to change in order to reflect the user’s needs. Fortunately, the demand for this evolving software has given software engineers a plethora of data and artifacts to analyze. The challenge arises when attempting to identify and apply patterns learned from the vast amount of data. In this dissertation, we explore and develop techniques to take advantage of the vast amount of software data and to aid developers in software development tasks. Specifically, we exploit the tool of deep learning to automatically learn patterns discovered within previous software data and automatically apply those patterns to present day software development. We first set out to investigate the current impact of deep learning in software engineering by performing a systematic literature review of top tier conferences and journals. This review provides guidelines and common pitfalls for researchers to consider when implementing DL (Deep Learning) approaches in SE (Software Engineering). In addition, the review provides a research road map for areas within SE where DL could be applicable. Our next piece of work developed an approach that simultaneously learned different representations of source code for the task of clone detection. We found that the use of multiple representations, such as Identifiers, ASTs, CFGs and bytecode, can lead to the identification of similar code fragments. Through the use of deep learning strategies, we automatically learned these different representations without the requirement of hand-crafted features. Lastly, we designed a novel approach for automating the generation of assert statements through seq2seq learning, with the goal of increasing the efficiency of software testing. Given the test method and the context of the associated focal method, we automatically generated semantically and syntactically correct assert statements for a given, unseen test method. We exemplify that the techniques presented in this dissertation provide a meaningful advancement to the field of software engineering and the automation of software development tasks. We provide analytical evaluations and empirical evidence that substantiate the impact of our findings and usefulness of our approaches toward the software engineering community.
APA, Harvard, Vancouver, ISO, and other styles
26

Kim, Alisa. "Deep Learning for Uncertainty Measurement." Doctoral thesis, Humboldt-Universität zu Berlin, 2021. http://dx.doi.org/10.18452/22161.

Full text
Abstract:
This thesis focuses on solving the problem of uncertainty measurement and its impact on business decisions while pursuing two goals: first, to develop and validate accurate and robust models for uncertainty quantification, employing both well-established statistical models and newly developed machine learning tools, with a particular focus on deep learning. The second goal revolves around the industrial application of the proposed models, applying them to real-world cases where measuring volatility or making a risky decision entails a direct and substantial gain or loss. This thesis started with the exploration of implied volatility (IV) as a proxy for investors' perception of uncertainty for a new class of assets - crypto-currencies. The second paper focused on methods to identify risk-loving traders and employed a DNN infrastructure to further investigate the risk-taking behavior of market actors that both stems from and perpetuates uncertainty. The third paper addressed the challenging endeavor of fraud detection and offered a decision support model that allowed a more accurate and interpretable evaluation of financial reports submitted for audit. Following the importance of risk assessment and agents' expectations in economic development, and building on the existing work of Baker (2016) and their economic policy uncertainty (EPU) index, the fourth paper offered a novel DL-NLP-based method for the quantification of economic policy uncertainty. In summary, this thesis offers insights that are highly relevant to both researchers and practitioners. The new deep learning-based solutions exhibit superior performance to existing approaches to quantify and explain economic uncertainty, allowing for more accurate forecasting, enhanced planning capacities, and mitigated risks. The offered use-cases provide a road-map for further development of the DL tools in practice and constitute a platform for further research.
APA, Harvard, Vancouver, ISO, and other styles
27

Rosar, Kós Lassance Carlos Eduardo. "Graphs for deep learning representations." Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2020. http://www.theses.fr/2020IMTA0204.

Full text
Abstract:
In recent years, Deep Learning methods have achieved state-of-the-art performance in a vast range of machine learning tasks, including image classification and multilingual automatic text translation. These architectures are trained to solve machine learning tasks in an end-to-end fashion. In order to reach top-tier performance, these architectures often require a very large number of trainable parameters. There are multiple undesirable consequences, and in order to tackle these issues, it is desirable to be able to open the black boxes of deep learning architectures. Doing so is difficult due to the high dimensionality of representations and the stochasticity of the training process. In this thesis, we investigate these architectures by introducing a graph formalism based on the recent advances in Graph Signal Processing (GSP). Namely, we use graphs to represent the latent spaces of deep neural networks. We showcase that this graph formalism allows us to answer various questions, including: measuring generalization abilities; reducing the number of arbitrary choices in the design of the learning process; improving robustness to small perturbations added to the inputs; and reducing computational complexity.
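One concrete instance of the graph formalism sketched above is to build a k-nearest-neighbour graph over latent representations and measure how smoothly the label signal varies on it. The snippet below (k-NN graph, Laplacian quadratic form of one-hot labels) is an assumed illustration in the spirit of GSP, not the exact construction of the thesis.

    import numpy as np

    def knn_graph(Z, k=5):
        """Symmetric adjacency matrix of a k-NN graph over latent vectors Z [n, d]."""
        d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
        np.fill_diagonal(d2, np.inf)
        A = np.zeros_like(d2)
        nn = np.argsort(d2, axis=1)[:, :k]
        rows = np.repeat(np.arange(Z.shape[0]), k)
        A[rows, nn.ravel()] = 1.0
        return np.maximum(A, A.T)                    # symmetrize

    def label_smoothness(Z, y, k=5):
        """Laplacian quadratic form trace(S^T L S) of the one-hot label signal S;
        lower values mean same-class points are better clustered in the latent space."""
        A = knn_graph(Z, k)
        L = np.diag(A.sum(1)) - A
        S = np.eye(int(y.max()) + 1)[y]              # one-hot labels, [n, C]
        return float(np.trace(S.T @ L @ S))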
APA, Harvard, Vancouver, ISO, and other styles
28

Gunér, Gustaf. "Receipt Scanning Using Deep Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-279697.

Full text
Abstract:
Employees often make purchases on behalf of the companies that they are working for. These purchases must be reported manually, either by the employees themselves or by sending the receipts to the company's accountant. In both cases, parts of the receipts are transcribed manually. This process is time-consuming and poses a risk that the human factor causes transcription errors, which can lead to ambiguities in the company's financial statements. A fully-automatic receipt scanner, which from a photograph of a receipt can extract metadata (e.g. total price, VAT, and individual item names), would solve many of these problems. Not only would it make the reporting process more efficient, which would reduce costs and save time, but the correctness of the data could be increased too. In this report, the possibilities of using Deep Learning (DL) as an approach to receipt scanning are evaluated, in comparison to a heuristic Computer Vision (CV) solution. Both approaches detect the receipt in a photograph, preprocess the original photograph based on the location information and extract the text from it using Optical Character Recognition (OCR). The approaches were evaluated based on the accuracy of the predicted receipt locations and the accuracy of the extracted texts. The results show that the Deep Learning approach achieved significantly better results than the heuristic approach in both tasks. In the generic test set, which combined all test instances, the Deep Learning approach achieved 31.1 percentage points higher average Intersection over Union (IoU), 23.4 percentage points lower average Character Error Rate (CER) and 17.5 percentage points lower average Word Error Rate (WER).
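The evaluation metrics quoted above are standard; a minimal sketch of IoU for axis-aligned boxes, given as (x1, y1, x2, y2), and of a Levenshtein-based Character Error Rate follows. The box format is an assumption for illustration.

    def iou(box_a, box_b):
        """Intersection over Union for axis-aligned boxes (x1, y1, x2, y2)."""
        ax1, ay1, ax2, ay2 = box_a
        bx1, by1, bx2, by2 = box_b
        iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
        ih = max(0.0, min(ay2, by2) - max(ay1, by1))
        inter = iw * ih
        union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
        return inter / union if union > 0 else 0.0

    def cer(reference, hypothesis):
        """Character Error Rate = edit distance / reference length."""
        m, n = len(reference), len(hypothesis)
        dp = [[i + j if i * j == 0 else 0 for j in range(n + 1)] for i in range(m + 1)]
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
                dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + cost)
        return dp[m][n] / max(m, 1)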
APA, Harvard, Vancouver, ISO, and other styles
29

Elmarakeby, Haitham Abdulrahman. "Deep Learning for Biological Problems." Diss., Virginia Tech, 2017. http://hdl.handle.net/10919/86264.

Full text
Abstract:
The last decade has witnessed a tremendous increase in the amount of available biological data. Different technologies for measuring the genome, epigenome, transcriptome, proteome, metabolome, and microbiome in different organisms are producing large amounts of high-dimensional data every day. High-dimensional data provides unprecedented challenges and opportunities to gain a better understanding of biological systems. Unlike other data types, biological data imposes more constraints on researchers. Biologists are not only interested in accurate predictive models that capture complex input-output relationships, but they also seek a deep understanding of these models. In the last few years, deep models have achieved better performance in computational prediction tasks compared to other approaches. Deep models have been extensively used in processing natural data, such as images, text, and recently sound. However, the application of deep models in biology is limited. Here, I propose to use deep models for output prediction, dimension reduction, and feature selection of biological data to get better interpretation and understanding of biological systems. I demonstrate the applicability of deep models in a domain that has a high and direct impact on health care. In this research, novel deep learning models have been introduced to solve pressing biological problems. The research shows that deep models can be used to automatically extract features from raw inputs without the need to manually craft features. Deep models are used to reduce the dimensionality of the input space, which resulted in faster training. Deep models are shown to have better performance and less variable output when compared to shallow models, even when an ensemble of shallow models is used. Deep models are also shown to be able to process non-classical inputs such as sequences, naturally extracting useful features from them.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
30

Amar, Gilad. "Deep learning for supernovae detection." Master's thesis, University of Cape Town, 2017. http://hdl.handle.net/11427/27090.

Full text
Abstract:
In future astronomical sky surveys it will be humanly impossible to classify the tens of thousands of candidate transients detected per night. This thesis explores the potential of using state-of-the-art machine learning algorithms to handle this burden more accurately and quickly than trained astronomers. To this end, Deep Learning methods are applied to classify transients using real-world data from the Sloan Digital Sky Survey. Using cutting-edge training techniques, several convolutional neural networks are trained and their hyper-parameters tuned to outperform previous approaches, and the analysis finds that human labelling errors are the primary obstacle to further improvement. The tuning and optimisation of the deep models took in excess of 700 hours on a 4-Titan X GPU cluster.
APA, Harvard, Vancouver, ISO, and other styles
31

Stigenberg, Jakob. "Scheduling using Deep Reinforcement Learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-284506.

Full text
Abstract:
As radio networks have continued to evolve in recent decades, so have their complexity and the difficulty in efficiently utilizing the available resources. In a cellular network, the scheduler controls the allocation of time, frequency and spatial resources to users in both uplink and downlink directions. The scheduler is therefore a key component in terms of efficient usage of network resources. Although the scope and characteristics of available resources for schedulers are well defined in network standards, e.g. Long-Term Evolution or New Radio, the scheduler's actual implementation is not. Most previous work focuses on constructing heuristics, based on metrics such as Quality of Service (QoS) classes, channel quality and delay, from which packets are then sorted and scheduled. In this work, a new approach to time domain scheduling using reinforcement learning is presented. The proposed algorithm leverages model-free reinforcement learning in order to treat the frequency domain scheduler as a black box. The proposed algorithm uses end-to-end learning and considers all packets, including control packets such as scheduling requests and CSI reports. Using a Deep Q-Network, the algorithm was evaluated in a setting with multiple delay sensitive VoIP users and one best effort user. Compared to a priority based scheduler, the agent was able to improve total cell throughput by 20.5%, 23.5%, and 16.2% in the 10th, 50th, and 90th percentiles, respectively, while simultaneously reducing the VoIP packet delay by 29.6%, thus improving QoS.
APA, Harvard, Vancouver, ISO, and other styles
32

Ramesh, Shreyas. "Deep Learning for Taxonomy Prediction." Thesis, Virginia Tech, 2019. http://hdl.handle.net/10919/89752.

Full text
Abstract:
The last decade has seen great advances in Next-Generation Sequencing technologies, and, as a result, there has been a rise in the number of genomes sequenced each year. In 2017, there were as many as 10,000 new organisms sequenced and added into the RefSeq Database. Taxonomy prediction is a science involving the hierarchical classification of DNA fragments up to the rank species. In this research, we introduce Predicting Linked Organisms, Plinko, for short. Plinko is a fully-functioning, state-of-the-art predictive system that accurately captures DNA - Taxonomy relationships where other state-of-the-art algorithms falter. Plinko leverages multi-view convolutional neural networks and the pre-defined taxonomy tree structure to improve multi-level taxonomy prediction. In the Plinko strategy, each network takes advantage of different word usage patterns corresponding to different levels of evolutionary divergence. Plinko has the advantages of a relatively low storage footprint and GPGPU-parallel training and inference, making the solution portable and scalable with anticipated genome database growth. To the best of our knowledge, Plinko is the first to use multi-view convolutional neural networks as the core algorithm in a compositional, alignment-free approach to taxonomy prediction.
Master of Science
Taxonomy prediction is a science involving the hierarchical classification of DNA fragments up to the rank species. Given species diversity on Earth, taxonomy prediction gets challenging with (i) increasing number of species (labels) to classify and (ii) decreasing input (DNA) size. In this research, we introduce Predicting Linked Organisms, Plinko, for short. Plinko is a fully-functioning, state-of-the-art predictive system that accurately captures DNA - Taxonomy relationships where other state-of-the-art algorithms falter. Three major challenges in taxonomy prediction are (i) large dataset sizes (order of 10^9 sequences), (ii) large label spaces (order of 10^3 labels) and (iii) low resolution inputs (100 base pairs or less). Plinko leverages multi-view convolutional neural networks and the pre-defined taxonomy tree structure to improve multi-level taxonomy prediction for hard to classify sequences under the three conditions stated above. Plinko has the advantage of relatively low storage footprint, making the solution portable, and scalable with anticipated genome database growth. To the best of our knowledge, Plinko is the first to use multi-view convolutional neural networks as the core algorithm in a compositional, alignment-free approach to taxonomy prediction.
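The "word usage patterns" mentioned above presumably refer to k-mer composition, the usual basis of compositional, alignment-free methods; the sketch below turns a short DNA read into normalised k-mer count vectors at several k, as an assumed illustration of multi-view inputs rather than Plinko's actual encoding.

    from itertools import product
    import numpy as np

    def kmer_vector(seq, k=3):
        """Count vector over all 4**k DNA k-mers (one 'view' of a read)."""
        vocab = {''.join(p): i for i, p in enumerate(product('ACGT', repeat=k))}
        v = np.zeros(len(vocab), dtype=np.float32)
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if kmer in vocab:                 # skips ambiguous bases such as 'N'
                v[vocab[kmer]] += 1.0
        return v / max(v.sum(), 1.0)          # normalise to relative frequencies

    # Toy multi-view input for a single read, one vector per k.
    views = [kmer_vector("ACGTACGTTGCA", k) for k in (3, 4, 5)]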
APA, Harvard, Vancouver, ISO, and other styles
33

Xiao, Yao. "Vehicle Detection in Deep Learning." Thesis, Virginia Tech, 2019. http://hdl.handle.net/10919/91375.

Full text
Abstract:
Computer vision techniques are becoming increasingly popular. For example, face recognition is used to help police find criminals, vehicle detection is used to prevent drivers from serious traffic accidents, and written word recognition is used to convert written words into printed words. Despite the rapid development of vehicle detection using deep learning techniques, there are still concerns about the performance of state-of-the-art vehicle detection techniques. For example, state-of-the-art vehicle detectors are restricted by the large variation of scales. People working on vehicle detection are developing techniques to solve this problem. This thesis proposes an advanced vehicle detection model that adopts two classical network components: the residual neural network and the region proposal network. The model utilizes the residual neural network as a feature extractor and the region proposal network to locate potential objects.
Master of Science
Computer vision techniques are becoming increasingly popular. For example, face recognition is used to help police find criminals, vehicle detection is used to prevent drivers from serious traffic accidents, and written word recognition is used to convert written words into printed words. Despite the rapid development of vehicle detection using deep learning techniques, there are still concerns about the performance of state-of-the-art vehicle detection techniques. For example, state-of-the-art vehicle detectors are restricted by the large variation of scales. People working on vehicle detection are developing techniques to solve this problem. This thesis proposes an advanced vehicle detection model, utilizing deep learning techniques to detect potential objects.
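The pairing described above, a residual network as feature extractor plus a region proposal network, is also what torchvision's off-the-shelf Faster R-CNN provides; the snippet below uses that public model as an assumed stand-in for the thesis's custom architecture.

    import torch
    from torchvision.models.detection import fasterrcnn_resnet50_fpn

    # ResNet-50 feature extractor + region proposal network, with COCO-pretrained
    # weights (older torchvision versions use pretrained=True instead of weights=).
    model = fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()

    image = torch.rand(3, 480, 640)                  # placeholder RGB tensor in [0, 1]
    with torch.no_grad():
        detections = model([image])[0]               # dict with 'boxes', 'labels', 'scores'
    # Index 3 is 'car' in the COCO label map used by torchvision.
    vehicles = detections['boxes'][detections['labels'] == 3]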
APA, Harvard, Vancouver, ISO, and other styles
34

Howard, Shaun Michael. "Deep Learning for Sensor Fusion." Case Western Reserve University School of Graduate Studies / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=case1495751146601099.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Abrishami, Hedayat. "Deep Learning Based Electrocardiogram Delineation." University of Cincinnati / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1563525992210273.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Mansanet, Sandín Jorge. "Contributions to Deep Learning Models." Doctoral thesis, Universitat Politècnica de València, 2016. http://hdl.handle.net/10251/61296.

Full text
Abstract:
[EN] Deep Learning is a new area of Machine Learning research which aims to create computational models that learn several representations of the data using deep architectures. These methods have become very popular over the last few years due to the remarkable results obtained in speech recognition, visual object recognition, object detection, natural language processing, etc. The goal of this thesis is to present some contributions to the Deep Learning framework, particularly focused on computer vision problems dealing with images. These contributions can be summarized in two novel methods proposed: a new regularization technique for Restricted Boltzmann Machines called Mask Selective Regularization (MSR), and a powerful discriminative network called Local Deep Neural Network (Local-DNN). On the one hand, the MSR method is based on taking advantage of the benefits of the L2 and the L1 regularizations techniques. Both regularizations are applied dynamically on the parameters of the RBM according to the state of the model during training and the topology of the input space. On the other hand, the Local-DNN model is based on two key concepts: local features and deep architectures. Similar to the convolutional networks, the Local-DNN model learns from local regions in the input image using a deep neural network. The network aims to classify each local feature according to the label of the sample to which it belongs, and all of these local contributions are taken into account during testing using a simple voting scheme. The methods proposed throughout the thesis have been evaluated in several experiments using various image datasets. The results obtained show the great performance of these approaches, particularly on gender recognition using face images, where the Local-DNN improves other state-of-the-art results.
Mansanet Sandín, J. (2016). Contributions to Deep Learning Models [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/61296
THESIS
APA, Harvard, Vancouver, ISO, and other styles
37

Deselaers, Johannes. "Deep Learning Pupil Center Localization." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-287538.

Full text
Abstract:
This project strives to achieve high performance object localization with Convolutional Neural Networks (CNNs) - in particular for pupil centers in the context of remote eye tracking systems. Three different network architectures suitable to the task are developed, evaluated and compared - one based on regression using fully connected layers, one Fully Convolutional Network and one Deconvolutional Network. The best performing model achieves a mean error of only 0.52 pixel distance and a median error of 0.42 pixel distance compared to the ground truth annotations. The 95th percentile lies at 1.12 pixel error. This exceeds the performance of current state-of-the-art pupil center detection algorithms by an order of magnitude, a result that can be attributed both to the algorithm and to the dataset, which exceeds the datasets used for this purpose in prior publications in suitability, quality and size. Opportunities for further improvements of the computational cost based on recent model compression research are suggested.
APA, Harvard, Vancouver, ISO, and other styles
38

MBITI, JOHN N. "Deep learning for portfolio optimization." Thesis, Linnéuniversitetet, Institutionen för matematik (MA), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-104567.

Full text
Abstract:
In this thesis, an optimal investment problem is studied for an investor who can only invest in a financial market modelled by an Itô-Lévy process, with one risk-free (bond) and one risky (stock) investment possibility. We present the dynamic programming method and the associated Hamilton-Jacobi-Bellman (HJB) equation to explicitly solve this problem. It is shown that, with purification and simplification to the standard jump diffusion process, closed-form solutions for the optimal investment strategy and for the value function are attainable. It is also shown that an explicit solution can be obtained via finite training of a neural network using stochastic gradient descent (SGD) for a specific case.
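For the pure-diffusion special case (no jumps) of the market described above, dynamic programming gives the textbook closed form; the equations below state the classical Merton solution under CRRA utility as an assumed illustration of the kind of closed-form result the thesis derives.

    \[ dX_t = X_t\bigl[r + \pi_t(\mu - r)\bigr]\,dt + X_t\,\pi_t\,\sigma\,dW_t \]
    \[ \sup_{\pi}\Bigl\{ V_t + x\bigl[r + \pi(\mu - r)\bigr]V_x + \tfrac{1}{2}\,\pi^2\sigma^2 x^2\,V_{xx} \Bigr\} = 0, \qquad V(T, x) = \frac{x^{1-\gamma}}{1-\gamma} \]
    \[ \pi^{*} = \frac{\mu - r}{\gamma\,\sigma^{2}} \]

Here \pi_t is the fraction of wealth held in the stock, \mu and \sigma are the stock drift and volatility, r is the risk-free rate, and \gamma > 0, \gamma \neq 1 is the risk-aversion parameter.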
APA, Harvard, Vancouver, ISO, and other styles
39

Gerlach, Johanna, Alexander Riedel, Seyyid Uslu, Frank Engelmann, and Nico Brehm. "Montagegerechte Gestaltungsrichtlinien mittels Deep Learning." Thelem Universitätsverlag & Buchhandlung GmbH & Co. KG, 2021. https://tud.qucosa.de/id/qucosa%3A75857.

Full text
Abstract:
The application of deep learning in manual assembly holds great potential for reducing assembly times and avoiding assembly errors. By capturing the assembly process with a camera and analysing the recorded images with an object detection algorithm, the position, orientation, and type of the assembled components can be determined. From this, information about work steps, assembly errors, or the current state of the product can in turn be derived, so that workers can be supported during assembly with appropriate instructions. The question arises, however, to what extent current products are suitable for the use of deep learning. Deployment in manual assembly only makes sense if the components to be assembled are reliably recognised. Existing design guidelines do not yet address this aspect. The research project therefore investigated which properties products should have in order to enable optimal object detection. To this end, hypotheses about positive and negative component properties with respect to detection accuracy were formulated and tested in practical experiments. It was shown that all examined objects were detected very well by the object detection algorithm used. No restrictions on product design can therefore be derived from the present research results.
APA, Harvard, Vancouver, ISO, and other styles
40

Jaderberg, Maxwell. "Deep learning for text spotting." Thesis, University of Oxford, 2015. http://ora.ox.ac.uk/objects/uuid:e893c11e-6b6b-4d11-bb25-846bcef9b13e.

Full text
Abstract:
This thesis addresses the problem of text spotting - being able to automatically detect and recognise text in natural images. Developing text spotting systems, systems capable of reading and therefore better interpreting the visual world, is a challenging but wildly useful task to solve. We approach this problem by drawing on the successful developments in machine learning, in particular deep learning and neural networks, to present advancements using these data-driven methods. Deep learning based models, consisting of millions of trainable parameters, require a lot of data to train effectively. To meet the requirements of these data hungry algorithms, we present two methods of automatically generating extra training data without any additional human interaction. The first crawls a photo sharing website and uses a weakly-supervised existing text spotting system to harvest new data. The second is a synthetic data generation engine, capable of generating unlimited amounts of realistic looking text images, that can be solely relied upon for training text recognition models. While we define these new datasets, all our methods are also evaluated on standard public benchmark datasets. We develop two approaches to text spotting: character-centric and word-centric. In the character-centric approach, multiple character classifier models are developed, reinforcing each other through a feature sharing framework. These character models are used to generate text saliency maps to drive detection, and convolved with detection regions to enable text recognition, producing an end-to-end system with state-of-the-art performance. For the second, higher-level, word-centric approach to text spotting, weak detection models are constructed to find potential instances of words in images, which are subsequently refined and adjusted with a classifier and deep coordinate regressor. A whole word image recognition model recognises words from a huge dictionary of 90k words using classification, resulting in previously unattainable levels of accuracy. The resulting end-to-end text spotting pipeline advances the state of the art significantly and is applied to large scale video search. While dictionary based text recognition is useful and powerful, the need for unconstrained text recognition still prevails. We develop a two-part model for text recognition, with the complementary parts combined in a graphical model and trained using a structured output learning framework adapted to deep learning. The trained recognition model is capable of accurately recognising unseen and completely random text. Finally, we make a general contribution to improve the efficiency of convolutional neural networks. Our low-rank approximation schemes can be utilised to greatly reduce the number of computations required for inference. These are applied to various existing models, resulting in real-world speedups with negligible loss in predictive power.
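The low-rank approximation schemes mentioned at the end of the abstract can be illustrated by factorising a convolution's weight tensor with a truncated SVD; the sketch below splits a k x k convolution into a k x k convolution with few output channels followed by a 1 x 1 convolution. It is a generic construction assuming stride 1, not the specific scheme of the thesis.

    import torch
    import torch.nn as nn

    def low_rank_conv(conv: nn.Conv2d, rank: int):
        """Approximate a stride-1 conv with weight (out_c, in_c, k, k) by a
        (in_c -> rank, k x k) conv followed by a (rank -> out_c, 1 x 1) conv,
        via truncated SVD of the flattened weight matrix."""
        out_c, in_c, k, _ = conv.weight.shape
        W = conv.weight.detach().reshape(out_c, in_c * k * k)
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        first = nn.Conv2d(in_c, rank, k, padding=conv.padding, bias=False)
        second = nn.Conv2d(rank, out_c, 1, bias=True)
        first.weight.data = (torch.diag(S[:rank]) @ Vh[:rank]).reshape(rank, in_c, k, k)
        second.weight.data = U[:, :rank].reshape(out_c, rank, 1, 1)
        second.bias.data = conv.bias.detach().clone() if conv.bias is not None else torch.zeros(out_c)
        return nn.Sequential(first, second)

    # Toy usage: fewer multiply-accumulates at inference, at some approximation error.
    conv = nn.Conv2d(64, 128, 3, padding=1)
    approx = low_rank_conv(conv, rank=16)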
APA, Harvard, Vancouver, ISO, and other styles
41

Ma, Sihan. "Image Matting via Deep Learning." Thesis, The University of Sydney, 2020. https://hdl.handle.net/2123/22426.

Full text
Abstract:
Image matting aims to extract the accurate opacity of the foreground from the input RGB image, which is beneficial and essential for subsequent applications such as image editing, compositing, and film production. Unfortunately, this task is challenging because of its ill-posed nature. Specifically, with the corresponding foreground and background unknown, it is hard to predict the opacity of the foreground from the single RGB image. Recently, deep learning has been introduced to image matting to deal with this problem. However, there are still some issues to be addressed. First, the choice of simple base networks for the image matting task matters due to the higher demand for detail recovery in matting. The classic base networks used in other vision tasks, e.g. ResNet in classification, yield unsatisfactory results with blurry matte details when directly used in image matting. Therefore, it increases the difficulty in developing new methods and also overshadows further investigations. Second, prior matting methods suffer from fuzzy boundaries when the foreground and the background have similar appearance features. The main reason here is that previous approaches merely rely on local neighbor relationships depicted by color features for predictions on image details, but lack utilization of global contextual information and discriminative feature representation. To address these issues, we first devise a novel base network called DM+. It is based on ResNet-50 and specifically customized for image matting. Compared with the original structure, we make simple but effective changes by adding a max pooling instead of the strided convolution for down-sampling, which keeps important localization information of extreme feature values. In this way, it can learn low-level detail features while maintaining high-level semantic information, thus greatly increasing its ability to distinguish fine details of alpha mattes. Moreover, DM+ is easy to implement and follow, which provides more opportunities for future exploration. To further generate fine alpha mattes in the transition areas with tiny details and textures, we propose a novel global affinity module, which utilizes the discriminative alpha mattes for calculating the affinity matrix and enables information propagation in a global manner. By being integrated into the training objectives, this module is only used during training. Therefore, no extra computational cost is incurred during inference. Experimental results on the popular Composition-1k dataset demonstrate the effectiveness of our base structure DM+ and the global affinity module, and their superiority over representative state-of-the-art methods.
APA, Harvard, Vancouver, ISO, and other styles
42

He, Haoyu. "Deep learning based human parsing." Thesis, University of Sydney, 2020. https://hdl.handle.net/2123/24262.

Full text
Abstract:
Human parsing, or human body part semantic segmentation, has been an active research topic due to its wide potential applications. Although prior works have made significant progress by introducing large-scale datasets and deep learning to solve the problem, two challenges remain unsolved. Firstly, to better exploit the existing parsing annotations, prior methods learn a knowledge-sharing mechanism to improve semantic structures in cross-dataset human parsing. However, the modeling of such a mechanism remains inefficient because it does not consider the classes' granularity differences in different domains. Secondly, the trained models are limited to parsing humans into classes pre-defined in the training data, and lack the ability to generalize to unseen fashion classes. To exploit feature representations from multi-domain annotations more efficiently, in this thesis we propose a novel GRAph PYramid Mutual Learning (Grapy-ML) method to address the cross-dataset human parsing problem, where we model the granularity difference through a graph pyramid. Starting from the prior knowledge of the human body hierarchical structure, we devise a graph pyramid module (GPM) by stacking three levels of graph structures from coarse granularity to fine granularity subsequently. Specifically, the network weights of the first two levels are shared to exchange the learned coarse-granularity information across different datasets. At each level, GPM utilizes the self-attention mechanism to model the correlations between context nodes. Then, it adopts a top-down mechanism to progressively refine the hierarchical features through all the levels. GPM also enables efficient mutual learning. By making use of the multi-granularity labels, Grapy-ML learns a more discriminative feature representation and achieves state-of-the-art performance, which is demonstrated by extensive experiments on three popular benchmarks, e.g., the CIHP dataset. To bridge the generalizability gap, in this thesis we propose a new problem named one-shot human parsing (OSHP), which requires parsing humans into an open set of reference classes defined by any single reference example. During training, only base classes defined in the training set are exposed, which can overlap with part of the reference classes. In this thesis, we devise a novel Progressive One-shot Parsing network (POPNet) to address two critical challenges in this problem, i.e., testing bias and small size. POPNet consists of two collaborative metric learning modules named Attention Guidance Module (AGM) and Nearest Centroid Module (NCM), which can learn representative prototypes for base classes and quickly transfer the ability to unseen classes during testing, thereby reducing the testing bias. Moreover, POPNet adopts a progressive human parsing framework that can incorporate the learned knowledge of parent classes at the coarse granularity to help recognize unseen descendant classes at the fine granularity, thereby handling the small size issue. Experiments on the ATR-OS benchmark, tailored for OSHP, demonstrate that POPNet outperforms other representative one-shot segmentation models by large margins and establishes a strong baseline for the new problem.
APA, Harvard, Vancouver, ISO, and other styles
43

Zhao, Yang. "Person Retrieval with Deep Learning." Thesis, Griffith University, 2022. http://hdl.handle.net/10072/411526.

Full text
Abstract:
Person retrieval aims at matching person images across multiple non-overlapping camera views. It has facilitated a wide range of important applications in intelligent video analysis. The task of person retrieval remains challenging due to dramatic changes in visual appearance that are caused by large intra-class variations from human pose and camera viewpoint, misaligned person detection and occlusion. How to learn discriminative features under these challenging conditions becomes the core issue for the task of person retrieval. According to the input modality, person retrieval can be categorised into image-based retrieval and video-based retrieval. Despite decades of efforts, person retrieval is still very challenging and remains unsolved due to the following factors: 1) the large intra-class variations (e.g., pose variation) of pedestrian images, leading to a dramatic change in their appearances; 2) only heuristically coarse-grained region strips or pixel-level annotations directly borrowed from pretrained human parsing models are employed, impeding the efficacy and practicality of region representation learning; 3) the absence of useful temporal cues for boosting the video person retrieval system. This thesis reports a series of technical solutions towards addressing the above challenges in person retrieval. To address the large intra-class variations among the person images, we introduce an improved triplet loss such that the global feature representations from the same identity are closely clustered for person retrieval. To learn a discriminative region representation within fine-grained segments while avoiding expensive pixel-level annotations, we introduce a novel identity-guided human region segmentation method that can predict informative region segments, enabling discriminative region representation learning for person retrieval. To extract useful temporal cues for video person retrieval, we build a two-stream architecture, named the appearance-gait network, to jointly learn appearance features and gait features from RGB video clips and silhouette video clips. To further provide potentially useful information for person retrieval, we introduce a lightweight and effective knowledge distillation method for facial landmark detection. We believe that the proposed person retrieval approaches can serve as benchmark methods and provide new perspectives for the person retrieval task.
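The improved triplet loss mentioned above builds on the standard formulation, sketched below; the margin value and the random toy embeddings are assumptions, not the thesis's actual settings.

    import torch
    import torch.nn.functional as F

    def triplet_loss(anchor, positive, negative, margin=0.3):
        """Pull same-identity embeddings together and push different ones apart:
        mean over the batch of max(0, d(a, p) - d(a, n) + margin)."""
        d_ap = F.pairwise_distance(anchor, positive)
        d_an = F.pairwise_distance(anchor, negative)
        return F.relu(d_ap - d_an + margin).mean()

    # Toy usage with 128-dim embeddings for a batch of 8 triplets.
    a, p, n = (torch.randn(8, 128) for _ in range(3))
    loss = triplet_loss(a, p, n)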
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
School of Eng & Built Env
Science, Environment, Engineering and Technology
Full Text
APA, Harvard, Vancouver, ISO, and other styles
44

Ovidiu, Chelcea Vlad, and Björn Ståhl. "Deep Reinforcement Learning for Snake." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239362.

Full text
Abstract:
The world has recently seen a large increase both in research and development and in layman use of machine learning. Machine learning has a broad application domain, e.g., in marketing, production and finance. Although these applications have a predetermined set of rules or goals, this project deals with another aspect of machine learning, namely general intelligence. During the course of the project, a non-human player (known as an agent) learns how to play the game SNAKE without any outside influence or knowledge of the environment dynamics. After having the agent train for 66 hours and almost two million games, an average of 16 points per game out of a possible 35 was reached. This is realized by the use of reinforcement learning and deep convolutional neural networks (CNNs).
APA, Harvard, Vancouver, ISO, and other styles
45

Figué, Valentin. "Depth prediction by deep learning." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-240593.

Full text
Abstract:
Knowing the depth information is of critical importance in scene understanding for several industrial applications, such as self-driving cars. While depth inference from a single still image has taken a prominent place in recent studies thanks to the advent of deep learning methods, practical cases often offer useful additional information that should be considered early in the design of the architecture in order to improve the quality and robustness of the estimates. Hence, this thesis proposes a deep fully convolutional network that can exploit the information of either stereo or monocular temporal sequences, along with a novel training procedure that takes multi-scale optimization into account. Indeed, this thesis found that using multi-scale information all along the network is of prime importance for accurate depth estimation and greatly improves performance, allowing new state-of-the-art results to be obtained both on synthetic data using Virtual KITTI and on real images with the challenging KITTI dataset.
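The multi-scale supervision highlighted in this abstract can be sketched as follows; the L1 penalty on downsampled ground truth at each decoder scale and the equal weights are assumptions for illustration, not the thesis's exact loss.

# Hedged sketch: multi-scale depth supervision, with side predictions at
# several decoder resolutions compared against downsampled ground truth.
# The L1 form and equal weights are assumptions, not the thesis's exact loss.
import torch
import torch.nn.functional as F

def multi_scale_depth_loss(predictions, depth_gt, weights=None):
    """predictions: list of (B, 1, h_i, w_i) depth maps at different scales."""
    weights = weights or [1.0] * len(predictions)
    loss = 0.0
    for pred, w in zip(predictions, weights):
        gt = F.interpolate(depth_gt, size=pred.shape[-2:], mode='bilinear',
                           align_corners=False)
        loss = loss + w * F.l1_loss(pred, gt)
    return loss

# Example with three decoder scales of a fully convolutional network.
gt = torch.rand(2, 1, 128, 416)
preds = [torch.rand(2, 1, 128 // s, 416 // s) for s in (1, 2, 4)]
loss = multi_scale_depth_loss(preds, gt)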
APA, Harvard, Vancouver, ISO, and other styles
46

Hussain, Jabbar. "Deep Learning Black Box Problem." Thesis, Uppsala universitet, Institutionen för informatik och media, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-393479.

Full text
Abstract:
The application of neural networks in deep learning is growing rapidly due to their ability to outperform other machine learning algorithms on different kinds of problems. But one big disadvantage of deep neural networks is that the internal logic they use to reach the desired output or result is neither understandable nor explainable. This behavior of deep neural networks is known as the "black box". This leads to the first research question: how prevalent is the black box problem in the research literature during a specific period of time? Black box problems are usually addressed by so-called rule extraction, which leads to the second research question: what rule extraction methods have been proposed to solve such problems? To answer the research questions, a systematic literature review was conducted to collect data related to the topics of the black box and rule extraction. Printed and online articles published in highly ranked journals and conference proceedings were selected to investigate and answer the research questions; this set of journal and conference proceedings articles formed the unit of analysis. The results show that interest in black box problems has gradually increased over time, mainly because of new technological developments. The thesis also provides an overview of the different methodological approaches used for rule extraction.
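As a concrete illustration of rule extraction (not a method proposed in this thesis), a common pedagogical approach fits an interpretable surrogate, such as a shallow decision tree, to the predictions of the trained network and reads its branches out as if-then rules; the models and data below are placeholders.

# Hedged sketch: pedagogical rule extraction, where a decision tree is fit to
# the predictions of a trained "black box" model. Models and data are
# placeholders for illustration only.
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
black_box = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=500,
                          random_state=0).fit(X, y)

# Train the surrogate on the black box's labels, not the true labels.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

print(export_text(surrogate))        # human-readable if-then rules
fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
print(f"fidelity to the black box: {fidelity:.2%}")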
APA, Harvard, Vancouver, ISO, and other styles
47

Li, Shuai Ph D. Massachusetts Institute of Technology. "Computational imaging through deep learning." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/122070.

Full text
Abstract:
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Mechanical Engineering, 2019
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 143-154).
Computational imaging (CI) is a class of imaging systems that uses inverse algorithms to recover an unknown object from physical measurements. Traditional inverse algorithms in CI obtain an estimate of the object by minimizing the Tikhonov functional, which requires an explicit formulation of the forward operator of the physical system, as well as prior knowledge about the class of objects being imaged. In recent years, machine learning architectures, and deep learning (DL) in particular, have attracted increasing attention from CI researchers. Unlike traditional inverse algorithms in CI, the DL approach learns both the forward operator and the objects' prior implicitly from training examples. It is therefore especially attractive when the forward imaging model is uncertain (e.g. imaging through random scattering media), or when the prior about the class of objects is difficult to express analytically (e.g. natural images).
In this thesis, the application of DL approaches in two different CI scenarios is investigated: imaging through a glass diffuser and quantitative phase retrieval (QPR), where an Imaging through Diffuser Network (IDiffNet) and a Phase Extraction Neural Network (PhENN) are experimentally demonstrated, respectively. This thesis also studies the influence of the two main factors that determine the performance of a trained neural network: network architecture (connectivity, network depth, etc.) and training example quality (spatial frequency content in particular). Motivated by the analysis of the latter factor, two novel approaches, the spectral pre-modulation approach and the Learning Synthesis by DNN (LS-DNN) method, are successively proposed to improve the visual quality of the network outputs. Finally, the LS-DNN enhanced PhENN is applied to a phase microscope to recover the phase of a red blood cell (RBC) sample.
Furthermore, through simulation of the learned weak object transfer function (WOTF) and an experiment on a star-like phase target, we demonstrate that our network has indeed learned the correct physical model rather than doing something trivial such as pattern matching.
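The core idea of learning the inverse operator implicitly from example pairs can be sketched in a few lines; the box-blur forward model and the tiny CNN below are stand-ins for illustration, not the thesis's IDiffNet or PhENN architectures.

# Hedged sketch: learning an inverse map from measurement to object purely
# from example pairs. The box-blur forward model and the tiny CNN are
# illustrative stand-ins only.
import torch
import torch.nn as nn
import torch.nn.functional as F

def forward_model(obj, kernel):
    """Simulated measurement: convolution with a known blur plus noise."""
    meas = F.conv2d(obj, kernel, padding=kernel.shape[-1] // 2)
    return meas + 0.01 * torch.randn_like(meas)

net = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
optim = torch.optim.Adam(net.parameters(), lr=1e-3)
kernel = torch.ones(1, 1, 5, 5) / 25.0          # box blur as the forward operator

for step in range(100):                         # toy training loop
    obj = (torch.rand(16, 1, 32, 32) > 0.9).float()   # sparse, point-like objects
    meas = forward_model(obj, kernel)
    optim.zero_grad()
    F.mse_loss(net(meas), obj).backward()       # learn the inverse mapping
    optim.step()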
by Shuai Li.
Ph. D.
Ph.D. Massachusetts Institute of Technology, Department of Mechanical Engineering
APA, Harvard, Vancouver, ISO, and other styles
48

Dumas, Thierry. "Deep learning for image compression." Thesis, Rennes 1, 2019. http://www.theses.fr/2019REN1S029/document.

Full text
Abstract:
Over the last twenty years, the amount of transmitted images and videos has increased noticeably, driven mainly by Facebook and Netflix. Even though broadcast capacities are improving, this growing amount of transmitted images and videos requires increasingly efficient compression methods. This thesis aims at improving, via learning, two critical components of modern image compression standards: the transform and the intra prediction. More precisely, deep neural networks are used for this task as they exhibit a high power of approximation, which is needed for learning a reliable approximation of an optimal transform (or an optimal intra prediction filter) applied to image pixels. Regarding the learning of a transform for image compression via neural networks, a challenge is to learn a unique transform that is efficient in terms of rate-distortion while keeping this efficiency when compressing at different rates. Two approaches are therefore proposed to take on this challenge. In the first approach, the neural network architecture imposes sparsity on the transform coefficients. The level of sparsity gives direct control over the compression rate. To force the transform to adapt to different compression rates, the level of sparsity is driven stochastically during the training phase. In the second approach, rate-distortion efficiency is obtained by minimizing a rate-distortion objective function during the training phase. During the test phase, the quantization step sizes are gradually increased according to a schedule in order to compress at different rates using the single learned transform. Regarding the learning of an intra prediction filter for image compression via neural networks, the issue is to obtain a learned filter that is adaptive with respect to the size of the image block to be predicted, to missing information in the context of prediction, and to the variable quantization noise in this context. A set of neural networks is designed and trained so that the learned prediction filter has this adaptability.
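The rate-distortion training objective of the second approach can be sketched generically as follows; the additive-uniform-noise quantization proxy, the L1 rate proxy and the trade-off weight are assumptions for illustration rather than the thesis's exact formulation.

# Hedged sketch of a rate-distortion training objective for a learned
# transform: distortion plus a differentiable rate proxy on the (noisily
# quantised) coefficients. Proxy choices and trade-off weight are assumptions.
import torch

def rate_distortion_loss(x, x_hat, coeffs, lam=0.01):
    distortion = torch.mean((x - x_hat) ** 2)
    # Additive uniform noise is the usual differentiable stand-in for
    # quantization during training.
    noisy = coeffs + torch.empty_like(coeffs).uniform_(-0.5, 0.5)
    # Crude rate proxy: mean magnitude of the coefficients; a real codec
    # would use a learned entropy model here.
    rate = torch.mean(noisy.abs())
    return distortion + lam * rate

x = torch.rand(4, 1, 32, 32)                    # original image patches
coeffs = torch.randn(4, 64, 8, 8)               # output of a learned analysis transform
x_hat = torch.rand(4, 1, 32, 32)                # output of the synthesis transform
loss = rate_distortion_loss(x, x_hat, coeffs)

At test time, as the abstract describes, the quantization step sizes would then be scaled according to the target rate while reusing the single learned transform.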
APA, Harvard, Vancouver, ISO, and other styles
49

Ouyang, Wei. "Deep Learning for Advanced Microscopy." Thesis, Sorbonne Paris Cité, 2018. http://www.theses.fr/2018USPCC174/document.

Full text
Abstract:
Background: Microscopy has played an important role in biology for several centuries, but its resolution has long been limited to ~250 nm due to diffraction, leaving many important biological structures (e.g. viruses, vesicles, nuclear pores, synapses) unresolved. Over the last decade, several super-resolution methods have been developed that break this limit. Among the most powerful and popular super-resolution techniques are those based on single molecule localization (single molecule localization microscopy, or SMLM), such as PALM and STORM. By precisely localizing the positions of isolated fluorescent molecules in thousands or more of sequentially acquired diffraction-limited images, SMLM can achieve resolutions of 20-50 nm or better. However, SMLM is inherently slow due to the need to accumulate enough localizations to achieve high-resolution sampling of the fluorescent structures. The drawback in acquisition speed (typically ~30 minutes per super-resolution image) makes it difficult to use SMLM in high-throughput and live cell imaging. Many methods have been proposed to address this issue, mostly by improving the localization algorithms to localize overlapping spots, but most of them compromise spatial resolution and cause artifacts. Methods and results: In this work, we applied a deep learning based image-to-image translation framework to improve imaging speed and quality by restoring information from rapidly acquired, low-quality SMLM images. By utilizing recent advances in deep learning, including the U-net and Generative Adversarial Networks, we developed our method, Artificial Neural Network Accelerated PALM (ANNA-PALM), which is capable of learning structural information from training images and using the trained model to accelerate SMLM imaging by tens to hundreds of folds. With experimentally acquired images of different cellular structures (microtubules, nuclear pores and mitochondria), we demonstrated that deep learning can efficiently capture the structural information from fewer than 10 training samples and reconstruct high-quality super-resolution images from sparse, noisy SMLM images obtained with much shorter acquisitions than usual for SMLM. We also showed that ANNA-PALM is robust to possible variations between training and testing conditions, due either to changes in the biological structure or to changes in imaging parameters. Furthermore, we took advantage of the acceleration provided by ANNA-PALM to perform high-throughput experiments, showing acquisition of ~1000 cells at high resolution in ~3 hours. Additionally, we designed a tool to estimate and reduce possible artifacts by measuring the consistency between the reconstructed image and the experimental wide-field image. Our method enables faster and gentler imaging that can be applied at high throughput, and provides a novel avenue towards live cell high-resolution imaging. Deep learning methods rely on training data, and their performance can be improved even further with more training data. One cheap way to obtain more training data is through data sharing within the microscopy community. However, it is often difficult to exchange or share localization microscopy data, because localization tables alone are typically several gigabytes in size, and there is no dedicated platform for localization microscopy data that provides features such as rendering, visualization and filtering.
To address these issues, we developed a file format that can losslessly compress localization tables into smaller files, alongside a web platform called ShareLoc (https://shareloc.xyz) that allows users to easily visualize and share 2D or 3D SMLM data. We believe that this platform can greatly improve the performance of deep learning models, accelerate tool development, facilitate data re-analysis, and further promote reproducible research and open science.
APA, Harvard, Vancouver, ISO, and other styles
50

Lomonaco, Vincenzo <1991&gt. "Continual Learning with Deep Architectures." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2019. http://amsdottorato.unibo.it/9073/1/vincenzo_lomonaco_thesis.pdf.

Full text
Abstract:
Humans have the extraordinary ability to learn continually from experience. Not only can we apply previously learned knowledge and skills to new situations, we can also use these as the foundation for later learning. One of the grand goals of Artificial Intelligence (AI) is building an artificial "continual learning" agent that constructs a sophisticated understanding of the world from its own experience through the autonomous incremental development of ever more complex knowledge and skills. However, despite early speculations and a few pioneering works, very little research and effort has been devoted to addressing this vision. Current AI systems greatly suffer from exposure to new data or environments that differ even slightly from those they were trained on. Moreover, the learning process is usually constrained to fixed datasets within narrow and isolated tasks, which hardly leads to the emergence of more complex and autonomous intelligent behaviours. In essence, continual learning and adaptation capabilities, while more often than not thought of as fundamental pillars of every intelligent agent, have been mostly left out of the main AI research focus. In this dissertation, we study the application of these ideas in light of the more recent advances in machine learning research and in the context of deep architectures for AI. We propose a comprehensive and unifying framework for continual learning, new metrics, benchmarks and algorithms, as well as providing substantial experimental evaluations in different supervised, unsupervised and reinforcement learning tasks.
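As a minimal illustration of the continual-learning setting studied here (not an algorithm proposed in the dissertation), the sketch below implements naive rehearsal: a small memory of past examples is replayed while training on each new task. Model, tasks and memory size are placeholders.

# Hedged sketch: naive rehearsal, a common continual-learning baseline in
# which a small memory of past examples is replayed while training on each
# new task. Model, tasks and memory size are placeholders for illustration.
import random
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
optim = torch.optim.SGD(model.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()
memory, memory_size = [], 200                   # replay buffer of past (x, y) pairs

def make_task(seed):
    """Toy task: a random linear decision boundary over 20-d inputs."""
    g = torch.Generator().manual_seed(seed)
    w = torch.randn(20, generator=g)
    x = torch.randn(500, 20, generator=g)
    return x, (x @ w > 0).long()

for task_id in range(5):                        # tasks arrive sequentially
    x, y = make_task(task_id)
    for i in range(0, len(x), 32):
        xb, yb = x[i:i + 32], y[i:i + 32]
        if memory:                              # mix in replayed old examples
            xm, ym = zip(*random.sample(memory, min(32, len(memory))))
            xb = torch.cat([xb, torch.stack(xm)])
            yb = torch.cat([yb, torch.stack(ym)])
        optim.zero_grad()
        loss_fn(model(xb), yb).backward()
        optim.step()
    memory.extend(zip(x[:40], y[:40]))          # keep a few examples for replay
    memory = memory[-memory_size:]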
APA, Harvard, Vancouver, ISO, and other styles
