Theses on the topic "Representation learning (artificial intelligence)"
Consult the top 50 theses for your research on the topic "Representation learning (artificial intelligence)".
Li, Hao. « Towards Fast and Efficient Representation Learning ». Thesis, University of Maryland, College Park, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10845690.
The success of deep learning and convolutional neural networks in many fields is accompanied by a significant increase in computation cost. With increasing model complexity and the pervasive use of deep neural networks, there is a surge of interest in fast and efficient model training and inference on both cloud and embedded devices. Meanwhile, understanding the reasons for trainability and generalization is fundamental to further development. This dissertation explores approaches for fast and efficient representation learning, together with a better understanding of trainability and generalization. In particular, we ask the following questions and provide our solutions: 1) How can the computation cost be reduced for fast inference? 2) How can low-precision models be trained on resource-constrained devices? 3) What does the loss surface of neural nets look like, and how does it affect generalization?
To reduce the computation cost for fast inference, we propose to prune filters from CNNs that are identified as having a small effect on the prediction accuracy. By removing filters with small norms together with their connected feature maps, the computation cost can be reduced accordingly without special software or hardware. We show that a simple filter-pruning approach can reduce the inference cost while regaining close to the original accuracy by retraining the networks.
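The norm-based criterion described above can be sketched in a few lines (a minimal illustration, not the dissertation's actual implementation; the layer shape and pruning ratio are invented for the example): rank filters by their L1 norm and drop the smallest ones together with their output feature maps.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical conv layer: 8 filters over 3 input channels, 3x3 kernels
weights = rng.normal(size=(8, 3, 3, 3))

# L1 norm of each filter, used as a proxy for its effect on the output
norms = np.abs(weights.reshape(weights.shape[0], -1)).sum(axis=1)

# Keep the 6 filters with the largest norms (indices sorted to preserve order)
keep = np.sort(np.argsort(norms)[-6:])
pruned = weights[keep]
print(pruned.shape)  # (6, 3, 3, 3)
```

In a real network the matching input channels of the next layer's kernels must be removed as well, and accuracy is then recovered by retraining.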
To further reduce the inference cost, quantizing model parameters with low-precision representations has shown significant speedups, especially for edge devices with limited computing resources, memory capacity, and power budgets. To enable on-device learning on low-power systems, removing the dependency on a full-precision model during training is the key challenge. We study various quantized training methods with the goal of understanding the differences in behavior, and the reasons for success or failure. We address the question of why algorithms that maintain floating-point representations work so well, while fully quantized training methods stall before training is complete. We show that training algorithms that exploit high-precision representations have an important greedy search phase that purely quantized training methods lack, which explains the difficulty of training with low-precision arithmetic.
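A generic symmetric uniform quantizer illustrates the low-precision representations discussed above (a sketch with invented weights and bit-width, not the dissertation's training algorithms): small values round to zero, which hints at why a high-precision accumulator matters during training.

```python
import numpy as np

def quantize(x, bits):
    """Symmetric uniform quantization of an array to the given bit-width."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    return np.round(x / scale) * scale

w = np.array([0.31, -0.82, 0.05, 0.77])
w_q = quantize(w, bits=4)

# The smallest weight collapses to zero: with a high-precision accumulator,
# tiny gradient updates can add up before being rounded away; fully
# quantized training loses them immediately.
print(w_q)
```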
Finally, we explore the structure of neural loss functions and the effect of loss landscapes on generalization, using a range of visualization methods. We introduce a simple filter normalization method that helps us visualize loss function curvature and make meaningful side-by-side comparisons between loss functions. When this visualization is used, the sharpness of minimizers correlates well with generalization error. Then, using a variety of visualizations, we explore how training hyper-parameters affect the shape of minimizers, and how network architecture affects the loss landscape.
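The filter normalization idea can be sketched as follows (an illustrative reconstruction with invented shapes; the thesis applies it to full networks): each filter of a random direction is rescaled to the norm of the corresponding trained filter, so that loss plots along the direction are comparable across networks.

```python
import numpy as np

rng = np.random.default_rng(1)
theta = rng.normal(size=(4, 9))   # trained filters, flattened (4 filters)
d = rng.normal(size=(4, 9))       # random direction in parameter space

# Filter-wise normalization: match each direction filter's norm to the
# norm of the corresponding trained filter
d_hat = d / np.linalg.norm(d, axis=1, keepdims=True)
d_hat = d_hat * np.linalg.norm(theta, axis=1, keepdims=True)

# A 1-D loss slice would then evaluate loss(theta + alpha * d_hat)
alphas = np.linspace(-1.0, 1.0, 5)
slice_points = [theta + a * d_hat for a in alphas]
```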
Denize, Julien. « Self-supervised representation learning and applications to image and video analysis ». Electronic Thesis or Diss., Normandie, 2023. http://www.theses.fr/2023NORMIR37.
In this thesis, we develop approaches to self-supervised learning for image and video analysis. Self-supervised representation learning makes it possible to pretrain neural networks to learn general concepts without labels before specializing in downstream tasks faster and with few annotations. We present three contributions to self-supervised image and video representation learning. First, we introduce the theoretical paradigm of soft contrastive learning and its practical implementation, Similarity Contrastive Estimation (SCE), connecting contrastive and relational learning for image representation. Second, SCE is extended to global temporal video representation learning. Lastly, we propose COMEDIAN, a pipeline for local-temporal video representation learning for transformers. These contributions achieved state-of-the-art results on multiple benchmarks and led to several academic and technical publications.
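The soft-contrastive idea behind SCE can be caricatured in a few lines (a loose sketch with invented numbers, not the published formulation): the training target mixes a hard one-hot contrastive target with a relational similarity distribution over candidates.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Similarities between a query and candidate embeddings (positive at index 0)
sim = np.array([0.9, 0.7, 0.1, -0.2])
one_hot = np.array([1.0, 0.0, 0.0, 0.0])

lam, tau = 0.5, 0.1
relational = softmax(sim / tau)               # soft relational distribution
target = lam * one_hot + (1 - lam) * relational

pred = softmax(sim / tau)
loss = -(target * np.log(pred)).sum()         # cross-entropy vs. soft target
```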
Aboul-Enien, Hisham Abdel-Ghaffer. « Neural network learning and knowledge representation in a multi-agent system ». Thesis, Imperial College London, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.252040.
Carvalho, Micael. « Deep representation spaces ». Electronic Thesis or Diss., Sorbonne université, 2018. http://www.theses.fr/2018SORUS292.
In recent years, Deep Learning techniques have swept the state of the art in many applications of Machine Learning, becoming the new standard approach. The architectures issued from these techniques have been used for transfer learning, which extended the power of deep models to tasks that did not have enough data to train them from scratch. This thesis studies the representation spaces created by deep architectures. First, we study properties inherent to them, with particular interest in the dimensionality, redundancy, and precision of their features. Our findings reveal a strong degree of robustness, pointing the way to simple and powerful compression schemes. Then, we focus on refining these representations. We adopt a cross-modal multi-task problem and design a loss function capable of taking advantage of data coming from multiple modalities, while also taking into account different tasks associated with the same dataset. In order to correctly balance these losses, we also develop a new sampling scheme that only takes into account examples contributing to the learning phase, i.e. those having a positive loss. Finally, we test our approach on a large-scale dataset of cooking recipes and associated pictures. Our method achieves a 5-fold improvement over the state of the art, and we show that the multi-task aspect of our approach promotes a semantically meaningful organization of the representation space, allowing it to perform subtasks never seen during training, such as ingredient exclusion and selection. The results we present in this thesis open many possibilities, including feature compression for remote applications, robust multi-modal and multi-task learning, and feature space refinement.
For the cooking application in particular, many of our findings are directly applicable in a real-world context, especially for detecting allergens, finding alternative recipes under dietary restrictions, and menu planning.
Newman-Griffis, Denis R. « Capturing Domain Semantics with Representation Learning : Applications to Health and Function ». The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1587658607378958.
Cao, Xi Hang. « On Leveraging Representation Learning Techniques for Data Analytics in Biomedical Informatics ». Diss., Temple University Libraries, 2019. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/586006.
Ph.D.
Representation Learning is ubiquitous in the state-of-the-art machine learning workflow, including data exploration/visualization, data preprocessing, data model learning, and model interpretation. However, the majority of newly proposed Representation Learning methods are more suitable for problems with a large amount of data, and applying them to problems with limited data may lead to unsatisfactory performance. Therefore, there is a need for Representation Learning methods tailored to problems with “small data”, such as clinical and biomedical data analytics. In this dissertation, we describe our studies tackling challenging clinical and biomedical data analytics problems from four perspectives: data preprocessing, temporal data representation learning, output representation learning, and joint input-output representation learning. Data scaling is an important component of data preprocessing. The objective in data scaling is to scale/transform the raw features into reasonable ranges such that each feature of an instance is equally exploited by the machine learning model. For example, in a credit-fraud detection task, a machine learning model may use a person's credit score and annual income as features, but because the ranges of these two features differ, the model may weigh one more heavily than the other. In this dissertation, I thoroughly introduce the data scaling problem and describe an approach that intrinsically handles outliers and leads to better model prediction performance. Learning new representations for data in unstandardized form is a common task in data analytics and data science applications. Usually, data come in tabular form: the data is represented by a table in which each row is a feature vector of an instance.
However, it is also common that the data are not in this form, for example texts, images, and video/audio records. In this dissertation, I describe the challenge of analyzing imperfect multivariate time series data in healthcare and biomedical research, and show that the proposed method can learn a powerful representation that accommodates various imperfections and improves prediction performance. Learning output representations is a newer aspect of Representation Learning, and its applications have shown promising results in complex tasks, including computer vision and recommendation systems. The main objective of an output representation algorithm is to explore the relationships among the target variables, such that a prediction model can efficiently exploit the similarities and potentially improve prediction performance. In this dissertation, I describe a learning framework that incorporates output representation learning into time-to-event estimation. In particular, the approach learns the model parameters and time vectors simultaneously. Experimental results not only show the effectiveness of this approach but also its interpretability, through visualizations of the time vectors in 2-D space. Learning the input (feature) representation, the output representation, and the predictive model are closely related, so it is a natural extension of the state of the art to consider them together in a joint framework. In this dissertation, I describe a large-margin ranking-based learning framework for time-to-event estimation with joint input embedding learning, output embedding learning, and model parameter learning. In the framework, I cast the functional learning problem as a kernel learning problem and, adopting the theory of Multiple Kernel Learning, propose an efficient optimization algorithm. Empirical results also show its effectiveness on several benchmark datasets.
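The outlier-robust scaling idea mentioned above can be illustrated with a median/IQR scaler (a generic sketch, not the dissertation's proposed method; the income figures are invented):

```python
import numpy as np

def robust_scale(x):
    """Center by the median and scale by the interquartile range,
    so a few extreme values barely affect the transform."""
    med = np.median(x)
    q1, q3 = np.percentile(x, [25, 75])
    return (x - med) / (q3 - q1)

incomes = np.array([38_000, 42_000, 45_000, 51_000, 2_000_000.0])
scaled = robust_scale(incomes)

# The extreme outlier no longer dominates the scale of ordinary values
print(scaled)
```

A min-max or mean/std scaler on the same data would squash the four ordinary incomes into a tiny interval; the median/IQR version keeps them spread out.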
Temple University--Theses
Panesar, Kulvinder. « Conversational artificial intelligence - demystifying statistical vs linguistic NLP solutions ». Universitat Politècnica de València, 2020. http://hdl.handle.net/10454/18121.
This paper aims to demystify the hype and attention around chatbots and their association with conversational artificial intelligence. Both are slowly emerging as a real presence in our lives, driven by impressive technological developments in machine learning, deep learning and natural language understanding. However, what is under the hood, and how far and to what extent can chatbot and conversational-AI solutions work, is our question. Natural language is the knowledge representation most easily understood by people, but certainly not the best for computers, because of its inherently ambiguous, complex and dynamic nature. We critique the knowledge representation of heavily statistical chatbot solutions against linguistic alternatives. In order to react intelligently to the user, natural language solutions must critically consider other factors such as context, memory, intelligent understanding, previous experience, and personalized knowledge of the user. We delve into the spectrum of conversational interfaces and focus on a strong artificial intelligence concept. This is explored via a text-based conversational software agent with a deep strategic role: to hold a conversation, enable the mechanisms needed to plan and decide what to do next, and manage the dialogue to achieve a goal. To demonstrate this, a deep linguistically aware and knowledge-aware text-based conversational agent (LING-CSA) presents a proof of concept of a non-statistical conversational AI solution.
Tamaazousti, Youssef. « Vers l’universalité des représentations visuelle et multimodales ». Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLC038/document.
Because of its key societal, economic and cultural stakes, Artificial Intelligence (AI) is a hot topic. One of its main goals is to develop systems that facilitate the daily life of humans, with applications such as household robots, industrial robots, autonomous vehicles and much more. The rise of AI is largely due to the emergence of tools based on deep neural networks, which make it possible to simultaneously learn the representation of the data (traditionally hand-crafted) and the task to solve (traditionally learned with statistical models). This resulted from the conjunction of theoretical advances, growing computational capacity, and the availability of much annotated data. A long-standing goal of AI is to design machines inspired by humans, capable of perceiving the world and interacting with humans in an evolutionary way. In this thesis, we categorize work on AI into the two following learning approaches: (i) Specialization: learn representations from a few specific tasks with the goal of carrying out very specific tasks (specialized in a certain field) with a very good level of performance; (ii) Universality: learn representations from several general tasks with the goal of performing as many tasks as possible in different contexts. While specialization has been extensively explored by the deep-learning community, only a few implicit attempts have been made towards universality. Thus, the goal of this thesis is to explicitly address the problem of improving universality with deep-learning methods, for image and text data. We address this topic of universality in two forms: through the implementation of methods to improve universality (“universalizing methods”); and through the establishment of a protocol to quantify universality.
Concerning universalizing methods, we propose three technical contributions: (i) in the context of large semantic representations, a method to reduce redundancy between detectors through adaptive thresholding and the relations between concepts; (ii) in the context of neural-network representations, an approach that increases the number of detectors without increasing the amount of annotated data; (iii) in the context of multimodal representations, a method to preserve the semantics of unimodal representations in multimodal ones. Regarding the quantification of universality, we propose to evaluate universalizing methods in a transfer-learning scheme, since this scheme is well suited to assessing the universal ability of representations. This also led us to propose a new framework as well as new quantitative evaluation criteria for universalizing methods.
Liu, Xudong. « MODELING, LEARNING AND REASONING ABOUT PREFERENCE TREES OVER COMBINATORIAL DOMAINS ». UKnowledge, 2016. http://uknowledge.uky.edu/cs_etds/43.
Cleland, Benjamin George. « Reinforcement Learning for Racecar Control ». The University of Waikato, 2006. http://hdl.handle.net/10289/2507.
Kim, Seungyeon. « Novel document representations based on labels and sequential information ». Diss., Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/53946.
Kilinc, Ismail Ozsel. « Graph-based Latent Embedding, Annotation and Representation Learning in Neural Networks for Semi-supervised and Unsupervised Settings ». Scholar Commons, 2017. https://scholarcommons.usf.edu/etd/7415.
Koga, Marcelo Li. « Relational transfer across reinforcement learning tasks via abstract policies ». Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/3/3141/tde-04112014-103827/.
In building intelligent agents for solving sequential decision problems, reinforcement learning is necessary when the agent does not have enough knowledge to build a complete model of the problem. However, learning an optimal policy is in general very slow, since it must be achieved through trial and error and repeated interactions of the agent with the environment. One technique to accelerate this process is transfer learning, i.e., using the knowledge acquired in solving past tasks when learning new tasks. Thus, if the tasks share similarities, the prior knowledge guides the agent toward faster learning. This work explores the use of a relational representation, which makes relations between objects and their properties explicit. This representation makes it possible to exploit abstraction and structural similarities between tasks, enabling the generalization of action policies for use in different but related tasks. This work contributes two model-free algorithms for the online construction of abstract policies: AbsSarsa(λ) and AbsProb-RL. The first builds a deterministic abstract policy through value functions, while the second builds a stochastic abstract policy through direct search in policy space. We also propose the S2L-RL agent architecture, with two levels of learning: the abstract level and the concrete level. A concrete policy is built simultaneously with an abstract policy, which can be used both to guide the agent in the current problem and to guide it in a new, future problem. Experiments with robotic navigation tasks show that these techniques are effective in improving the agent's performance, especially in the early stages of learning, when the agent is completely unfamiliar with the new problem.
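The trial-and-error loop underlying these algorithms can be sketched with plain tabular Q-learning on a tiny chain task (a generic illustration, not the thesis's AbsSarsa(λ) or AbsProb-RL; states, rewards, and hyper-parameters are invented):

```python
import numpy as np

rng = np.random.default_rng(4)
n_states, n_actions = 5, 2          # chain: move left (0) or right (1)
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.5, 0.9

for _ in range(300):                # repeated interactions with the environment
    s = 0
    while s != n_states - 1:        # rightmost state is terminal
        a = int(rng.integers(n_actions))          # random exploration policy
        s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s2 == n_states - 1 else 0.0    # reward only at the goal
        target = r + gamma * Q[s2].max() * (s2 != n_states - 1)
        Q[s, a] += alpha * (target - Q[s, a])     # temporal-difference update
        s = s2

# The greedy policy learned from experience heads right in every state
policy = Q.argmax(axis=1)
```

An abstract policy in the thesis's sense would be defined over relational descriptions of states rather than over this concrete state index, which is what allows transfer between related tasks.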
Yaner, Patrick William. « From Shape to Function : Acquisition of Teleological Models from Design Drawings by Compositional Analogy ». Diss., Atlanta, Ga. : Georgia Institute of Technology, 2007. http://hdl.handle.net/1853/19791.
Committee Chair: Goel, Ashok; Committee Member: Eastman, Charles; Committee Member: Ferguson, Ronald; Committee Member: Glasgow, Janice; Committee Member: Nersessian, Nancy; Committee Member: Ram, Ashwin.
Azizpour, Hossein. « Visual Representations and Models : From Latent SVM to Deep Learning ». Doctoral thesis, KTH, Datorseende och robotik, CVAP, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-192289.
QC 20160908
Chennupati, Nikhil. « Recommending Collaborations Using Link Prediction ». Wright State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=wright1621899961924795.
El-Shaer, Mennat Allah. « An Experimental Evaluation of Probabilistic Deep Networks for Real-time Traffic Scene Representation using Graphical Processing Units ». The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1546539166677894.
Jones, Joshua K. « Empirically-based self-diagnosis and repair of domain knowledge ». Diss., Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/33931.
Texte intégralTerreau, Enzo. « Apprentissage de représentations d'auteurs et d'autrices à partir de modèles de langue pour l'analyse des dynamiques d'écriture ». Electronic Thesis or Diss., Lyon 2, 2024. http://www.theses.fr/2024LYO20001.
The recent and massive democratization of digital tools has empowered individuals to generate and share information on the web through various means such as blogs, social networks, and sharing platforms. The exponential growth of available information, mostly textual data, requires the development of Natural Language Processing (NLP) models to represent it mathematically and subsequently classify, sort, or recommend it. This is the essence of representation learning, which aims to construct a low-dimensional space where the distances between projected objects (words, texts) reflect real-world distances, whether semantic, stylistic, or otherwise. The proliferation of available data, coupled with the rise in computing power and deep learning, has led to the creation of highly effective language models for word and document embeddings. These models incorporate complex semantic and linguistic concepts while remaining accessible to everyone and easily adaptable to specific tasks or corpora. They can be used to create author embeddings. However, it is challenging to determine which aspects a model will focus on to bring authors closer together or move them apart. In a literary context, it is preferable for similarities to relate primarily to writing style, which raises several issues: the definition of literary style is vague, and assessing the stylistic difference between two texts and their embeddings is complex. In computational linguistics, approaches aiming to characterize style are mainly statistical, relying on language markers. In light of this, our first contribution is a framework to evaluate the ability of language models to grasp writing style. Beforehand, we elaborate on text embedding models in machine learning and deep learning, at the word, document, and author levels, and present the treatment of the notion of literary style in Natural Language Processing, which forms the basis of our method.
Transferring knowledge between black-box large language models and these methods derived from linguistics remains a complex task. Our second contribution aims to reconcile these approaches through a representation learning model focused on style, VADES (Variational Author and Document Embedding with Style). We compare our model to state-of-the-art ones and analyze their limitations in this context. Finally, we delve into dynamic author and document embeddings. Temporal information is crucial here, allowing for a more fine-grained representation of writing dynamics. After presenting the state of the art, we elaborate on our last contribution, B²ADE (Brownian Bridge Author and Document Embedding), which models authors as trajectories. We conclude by outlining several leads for improving our methods and highlighting potential research directions for the future.
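The trajectory view behind B²ADE can be sketched with a textbook Brownian bridge (a generic illustration, not the thesis's model; dimensions, endpoints, and noise scale are invented): a path pinned at two author states, with variance that vanishes at both ends.

```python
import numpy as np

rng = np.random.default_rng(3)

def brownian_bridge(x0, xT, T, sigma=0.1):
    """Sample a Brownian bridge pinned at x0 (t=0) and xT (t=T)."""
    ts = np.arange(T + 1)
    mean = np.outer(1 - ts / T, x0) + np.outer(ts / T, xT)
    var = sigma ** 2 * ts * (T - ts) / T      # zero at both endpoints
    return mean + rng.normal(size=mean.shape) * np.sqrt(var)[:, None]

# A 2-D "author" drifting from one embedding to another over 10 steps
path = brownian_bridge(np.zeros(2), np.ones(2), T=10)
```

Intermediate documents of an author would then be modeled as noisy observations along such a pinned path rather than as independent points.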
Martelli, Thérèse. « Modelisation objet pour la representation de connaissances complexes : application au decodage acoustico-phonetique de la parole continue ». Paris, ENST, 1988. http://www.theses.fr/1988ENST0005.
Mita, Graziano. « Toward interpretable machine learning, with applications to large-scale industrial systems data ». Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS112.
The contributions presented in this work are two-fold. We first provide a general overview of explanations and interpretable machine learning, making connections with different fields, including sociology, psychology, and philosophy, and introducing a taxonomy of popular explainability approaches and evaluation methods. We subsequently focus on rule learning, a specific family of transparent models, and propose a novel rule-based classification approach based on monotone Boolean function synthesis: LIBRE. LIBRE is an ensemble method that combines the candidate rules learned by multiple bottom-up learners with a simple union, in order to obtain a final interpretable rule set. Our method overcomes most of the limitations of state-of-the-art competitors: it successfully deals with both balanced and imbalanced datasets, efficiently achieving superior performance and higher interpretability on real datasets. Interpretability of data representations constitutes the second broad contribution of this work. We restrict our attention to disentangled representation learning and, in particular, VAE-based disentanglement methods that automatically learn representations consisting of semantically meaningful features. Recent contributions have demonstrated that disentanglement is impossible in purely unsupervised settings; nevertheless, incorporating inductive biases on models and data may overcome such limitations. We present a new disentanglement method, IDVAE, with theoretical guarantees on disentanglement, deriving from the use of an optimal exponential factorized prior, conditionally dependent on auxiliary variables complementing input observations. We additionally propose a semi-supervised version of our method. Our experimental campaign on well-established datasets in the literature shows that IDVAE often beats its competitors according to several disentanglement metrics.
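The union step in LIBRE-style rule ensembles can be caricatured as follows (a toy sketch with invented rules and data, not the actual synthesis algorithm): each bottom-up learner contributes candidate rules, the ensemble is their union, and an instance is classified positive if any rule fires.

```python
# Candidate rules from two hypothetical bottom-up learners, expressed as
# monotone conjunctions over feature dictionaries
learner_a = [lambda x: x["age"] > 60 and x["bp"] > 140]
learner_b = [lambda x: x["bmi"] > 35,
             lambda x: x["age"] > 60 and x["bmi"] > 30]

rule_set = learner_a + learner_b          # simple union of candidate rules

def predict(x):
    """Positive if any rule in the interpretable rule set fires."""
    return int(any(rule(x) for rule in rule_set))

print(predict({"age": 70, "bp": 150, "bmi": 24}))  # 1: first rule fires
print(predict({"age": 40, "bp": 120, "bmi": 22}))  # 0: no rule fires
```

The appeal of such a model is that every positive prediction comes with the specific rule that triggered it, which is the interpretability the abstract refers to.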
Trinh, Viet. « CONTEXTUALIZING OBSERVATIONAL DATA FOR MODELING HUMAN PERFORMANCE ». Doctoral diss., University of Central Florida, 2009. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/2747.
Texte intégralPh.D.
School of Electrical Engineering and Computer Science
Engineering and Computer Science
Computer Engineering PhD
Seroussi, Brigitte. « Alex : resolution de problemes par analogie basee sur un apprentissage de strategies par la construction dynamique d'une memoire indexee des exemples ». Paris 7, 1988. http://www.theses.fr/1988PA077153.
Sha, Long. « Representing and predicting multi-agent data in adversarial team sports ». Thesis, Queensland University of Technology, 2018. https://eprints.qut.edu.au/116506/1/Long_Sha_Thesis.pdf.
Leitner, Jürgen. « From vision to actions : Towards adaptive and autonomous humanoid robots ». Thesis, Università della Svizzera Italiana, 2014. https://eprints.qut.edu.au/90178/2/2014INFO020.pdf.
Jernite, Yacine. « Learning Representations of Text through Language and Discourse Modeling| From Characters to Sentences ». Thesis, New York University, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10680744.
In this thesis, we consider the problem of obtaining a representation of the meaning expressed in a text. How to do so correctly remains a largely open problem, combining a number of inter-related questions (e.g. what is the role of context in interpreting text? how should language understanding models handle compositionality?). In this work, after reflecting on the notion of meaning and describing the most common sequence modeling paradigms in use in recent work, we focus on two of these questions: at what level of granularity text should be read, and what training objectives can lead models to learn useful representations of a text's meaning.
In a first part, we argue for the use of sub-word information for that purpose, and present new neural network architectures which can either process words in a way that takes advantage of morphological information, or do away with word separations altogether while still being able to identify relevant units of meaning.
The second part starts by arguing for the use of language modeling as a learning objective, and provides algorithms that address its scalability issues and propose a globally rather than locally normalized probability distribution. It then explores what makes a good language learning objective, and introduces discriminative objectives, inspired by the notion of discourse coherence, which help learn a representation of the meaning of sentences.
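The language-modeling objective discussed here can be reduced to its simplest form (a toy bigram model over an invented corpus, far from the thesis's neural models): maximize the likelihood of each token given its context, i.e. minimize the average negative log-likelihood.

```python
import numpy as np

corpus = "abababab"
vocab = sorted(set(corpus))
idx = {c: i for i, c in enumerate(vocab)}

# Bigram counts with add-one smoothing form a minimal language model
counts = np.ones((len(vocab), len(vocab)))
for a, b in zip(corpus, corpus[1:]):
    counts[idx[a], idx[b]] += 1
probs = counts / counts.sum(axis=1, keepdims=True)

# Language-modeling objective: average negative log-likelihood of next char
nll = -np.mean([np.log(probs[idx[a], idx[b]])
                for a, b in zip(corpus, corpus[1:])])
```

Neural language models replace the count table with a parameterized distribution, but the training signal is this same per-token likelihood.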
Giusti, Rafael. « Classificação de séries temporais utilizando diferentes representações de dados e ensembles ». Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-05122017-170029/.
Temporal data are ubiquitous in nearly all areas of human knowledge. The research field known as machine learning has contributed to temporal data mining with algorithms for classification, clustering, anomaly or exception detection, and motif detection, among others. These algorithms often rely on a distance function that must be capable of expressing a similarity concept among the data. One of the most important classification models, 1-NN, employs a distance function when comparing a time series of interest against a reference set, and assigns to the former the label of the most similar reference time series. There are, however, several domains in which the temporal observations alone are insufficient to characterize neighbors according to the concepts associated with the classes. One possible approach to this problem is to transform the time series into a representation domain in which the attributes meaningful to the classifier are more clearly expressed. For instance, a time series may be decomposed into periodic components of different frequencies and amplitudes. For several applications, those components are much more meaningful in discriminating the classes than the temporal evolution of the original observations. In this work, we employ diversity of representations and distance functions for the classification of time series. By choosing a data representation that is better suited to expressing the discriminating characteristics of the domain, we are able to achieve classifications that are more faithful to the target concept. With this goal in mind, we promote a study of time series representation domains, and we evaluate how such domains can provide alternative decision spaces. Different models of the 1-NN classifier are evaluated both in isolation and combined in classification ensembles in order to construct more robust classifiers.
We also use distance functions and alternative representation domains to extract non-temporal attributes, known as distance features. Distance features reflect the neighborhood relations of instances to the training samples, and they may be used to induce classification models that are typically not as effective when trained on the original time series observations. We show that distance features allow for classification results compatible with the state of the art.
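The representation-diversity argument can be made concrete with a frequency-domain 1-NN classifier (a minimal sketch with synthetic signals, not the thesis's experimental setup): a phase shift that would confuse Euclidean 1-NN on the raw observations is invisible to Fourier magnitudes.

```python
import numpy as np

def fft_features(x):
    """Represent a series by the magnitudes of its Fourier coefficients."""
    return np.abs(np.fft.rfft(x))

t = np.linspace(0, 1, 64, endpoint=False)
train = np.stack([np.sin(2 * np.pi * 3 * t),      # class 0: 3 Hz signal
                  np.sin(2 * np.pi * 9 * t)])     # class 1: 9 Hz signal
labels = np.array([0, 1])

query = np.sin(2 * np.pi * 9 * t + 0.8)           # phase-shifted 9 Hz signal

# 1-NN in the frequency domain is invariant to the phase shift
dists = np.linalg.norm(fft_features(train) - fft_features(query), axis=1)
pred = labels[np.argmin(dists)]
print(pred)  # 1
```

The vector `dists` is itself a tiny example of distance features: the distances from an instance to the training samples can be fed to any standard classifier.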
Zhou, Bolei. « Interpretable representation learning for visual intelligence ». Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/117837.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 131-140).
Recent progress in deep neural networks for computer vision and machine learning has enabled transformative applications across robotics, healthcare, and security. However, despite their superior performance, it remains challenging to understand the inner workings of deep neural networks and explain their output predictions. This thesis investigates several novel approaches for opening up the "black box" of neural networks used in visual recognition tasks and understanding their inner working mechanism. I first show that objects and other meaningful concepts emerge as a consequence of recognizing scenes. A network-dissection approach is further introduced to automatically identify internal units as emergent concept detectors and quantify their interpretability. Then I describe an approach that can efficiently explain the output prediction for any given image, shedding light on the decision-making process of the networks and why their predictions succeed or fail. Finally, I show ongoing efforts toward learning efficient and interpretable deep representations for video event understanding, and some future directions.
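The per-image explanation approach can be sketched in the spirit of class activation mapping (a generic illustration with invented shapes and random values, not necessarily the thesis's exact method): a spatial map is obtained as the classifier-weighted sum of the final convolutional feature maps, highlighting the regions that drove the class score.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical final conv feature maps (channels, h, w) and the linear
# classifier weights for one output class
fmaps = rng.random(size=(4, 7, 7))
w_class = rng.random(size=4)

# Class activation map: channel-wise weighted sum of the feature maps
cam = np.tensordot(w_class, fmaps, axes=1)
print(cam.shape)  # (7, 7)
```

Upsampled to the input resolution, such a map can be overlaid on the image to show which regions support or contradict the prediction.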
by Bolei Zhou.
Ph. D.
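The unit-interpretability score in the network-dissection line of work is commonly an intersection-over-union between a unit's thresholded activation map and a concept's segmentation mask. The sketch below is a minimal, hedged reconstruction of that idea; the quantile threshold and all names are assumptions, not the thesis's exact procedure.

```python
import numpy as np

def dissection_iou(unit_activation, concept_mask, quantile=0.99):
    """Score how well a unit's top activations align with a labelled
    concept region: threshold the activation map at a high quantile,
    then take intersection-over-union with the concept mask."""
    threshold = np.quantile(unit_activation, quantile)
    unit_mask = unit_activation >= threshold
    union = np.logical_or(unit_mask, concept_mask).sum()
    if union == 0:
        return 0.0
    return np.logical_and(unit_mask, concept_mask).sum() / union

# Synthetic unit that fires exactly on a 5x5 'concept' region.
activation = np.zeros((10, 10))
activation[:5, :5] = 1.0
concept = activation.astype(bool)
iou = dissection_iou(activation, concept)  # -> 1.0 for a perfect detector
```

A unit whose IoU with some concept exceeds a chosen cutoff is then labelled an emergent detector for that concept.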
Boots, Byron. « Spectral Approaches to Learning Predictive Representations ». Research Showcase @ CMU, 2012. http://repository.cmu.edu/dissertations/131.
Cribier-Delande, Perrine. « Contexts and user modeling through disentangled representations learning ». Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS407.
The recent, sometimes very publicised, successes have drawn a lot of attention to Deep Learning (DL). Many questions are asked about the limitations of these techniques. The great strength of DL is its ability to learn representations of complex objects. Renault, as a car manufacturer, has a vested interest in discovering how their cars are used. Learning representations of drivers is one of their long-term goals. Renault's strength partly lies in their knowledge of cars and the data they use and produce. This data is almost entirely contained in the Controller Area Network (CAN). However, the CAN data only contains the inner workings of a car and not its surroundings. As many factors exterior to the driver and the car (such as weather, other road users, road condition...) can affect driving, we must find a way to disentangle them. Seeing the user (or driver) as just another context allowed us to use context modelling approaches. By transferring disentanglement approaches used in computer vision, we were able to develop models that learn disentangled representations of contexts. We tested these models with a few public datasets of time series with clearly labelled contexts. Using only forecasting as supervision during training, our models are able to generate data only from the learned representations of contexts. They even learn to represent new contexts, only seen after training. We then transferred the developed models to CAN data and were able to confirm that information about driving contexts (including driver's identity) is indeed contained in the CAN
Paudel, Subodh. « Methodology to estimate building energy consumption using artificial intelligence ». Thesis, Nantes, Ecole des Mines, 2016. http://www.theses.fr/2016EMNA0237/document.
High-energy-efficiency building standards (such as the Low Energy Building, LEB) intended to improve building consumption have drawn significant attention. Building standards basically focus on improving the thermal performance of the envelope and on high heat capacity, thus creating a higher thermal inertia. However, the LEB concept introduces a large time constant as well as a large heat capacity, resulting in a slower rate of heat transfer between the interior of the building and the outdoor environment. Therefore, it is challenging to estimate and predict thermal energy demand for such LEBs. This work focuses on artificial intelligence (AI) models to predict the energy consumption of LEBs. We consider two kinds of AI modeling approaches: “all data” and “relevant data”. The “all data” approach uses all available data, while “relevant data” uses a small representative day dataset and addresses the complexity of building non-linear dynamics by introducing the impact of past days' climatic behavior. This extraction is based either on simple physical understanding: Heating Degree Day (HDD) and modified HDD, or on pattern recognition methods: Fréchet Distance and Dynamic Time Warping (DTW). Four AI techniques have been considered: Artificial Neural Network (ANN), Support Vector Machine (SVM), Boosted Ensemble Decision Tree (BEDT) and Random Forest (RF). In a first part, numerical simulations for six buildings (heat demand in the range [25 – 85 kWh/m².yr]) have been performed. The “relevant data” approach with (DTW, SVM) shows the best results. Real data from the building “Ecole des Mines de Nantes” proves the approach is still relevant
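Of the physical day-selection indices mentioned above, the classical Heating Degree Day is the simplest to state. A minimal sketch, assuming the common 18 °C base temperature; the thesis's modified HDD will differ.

```python
def heating_degree_days(daily_mean_temps, base_temp=18.0):
    """Classical HDD index: sum of the positive gaps between a base
    (comfort) temperature and each daily mean outdoor temperature."""
    return sum(max(base_temp - t, 0.0) for t in daily_mean_temps)

# Days at 10, 20 and 15 degrees C against an 18 degree base give
# gaps of 8, 0 and 3 degree-days.
hdd = heating_degree_days([10.0, 20.0, 15.0])  # -> 11.0
```

Days with similar HDD values can then serve as the "relevant" representative days for training.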
Kim, Joo-Kyung. « Linguistic Knowledge Transfer for Enriching Vector Representations ». The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1500571436042414.
Zaiem, Mohamed Salah. « Informed Speech Self-supervised Representation Learning ». Electronic Thesis or Diss., Institut polytechnique de Paris, 2024. http://www.theses.fr/2024IPPAT009.
Feature learning has been driving machine learning advances, with recently proposed methods progressively getting rid of handcrafted parts within the transformations from inputs to desired labels. Self-supervised learning has emerged within this context, allowing the processing of unlabeled data towards better performance on low-labeled tasks. The first part of my doctoral work is aimed at motivating the choices in the speech self-supervised pipelines learning the unsupervised representations. In this thesis, I first show how conditional-independence-based scoring can be used to efficiently and optimally select pretraining tasks tailored for the best performance on a target task. The second part of my doctoral work studies the evaluation and usage of pretrained self-supervised representations. I explore, first, the robustness of current speech self-supervision benchmarks to changes in the downstream modeling choices. I propose, second, fine-tuning approaches for better efficiency and generalization
Ceccon, Stefano. « Extending Bayesian network models for mining and classification of glaucoma ». Thesis, Brunel University, 2013. http://bura.brunel.ac.uk/handle/2438/8051.
Hernández-Vela, Antonio. « From pixels to gestures : learning visual representations for human analysis in color and depth data sequences ». Doctoral thesis, Universitat de Barcelona, 2015. http://hdl.handle.net/10803/292488.
The visual analysis of people in images is a very important research topic, given its relevance to a large number of computer vision applications, such as pedestrian detection, monitoring and surveillance, human-computer interaction, e-health or content-based image retrieval systems, among others. In this thesis we want to learn different visual representations of the human body that are useful for the visual analysis of people in images and videos. To that end, we analyze different image modalities, namely RGB color images and depth images, and address the problem at different levels of abstraction, from pixels to gestures: person segmentation, human pose estimation and gesture recognition. First, we show how binary (object vs. background) segmentation of the human body in image sequences helps remove noise belonging to the background of the scene. The presented method, based on graph-cuts optimization, enforces spatio-temporal consistency on the segmentation masks obtained in consecutive frames. Second, we present a methodological framework for multi-class segmentation, with which we can obtain a more detailed description of the human body: instead of a simple binary representation separating the human body from the background, we can obtain more detailed segmentation masks, separating and categorizing the different parts of the body. At a higher level of abstraction, we aim to obtain simpler yet sufficiently descriptive representations of the human body. Human pose estimation methods are often based on skeletal models of the human body, formed by segments (or rectangles) that represent the body limbs, connected to one another following the kinematic constraints of the human body.
In practice, these skeletal models must satisfy certain constraints in order to apply inference methods that find the optimal solution efficiently, yet at the same time these constraints impose a strong limitation on the expressiveness the models are able to capture. To address this problem, we propose a top-down approach to predict the position of the body parts of the skeletal model, introducing a mid-level part representation based on Poselets. Finally, we propose a methodological framework for gesture recognition based on bag-of-visual-words. We exploit the advantages of RGB and depth images by combining modality-specific visual vocabularies through late fusion. We propose a new rotation-invariant depth descriptor that improves on the state of the art, and we use spatio-temporal pyramids to capture some of the spatial and temporal structure of the gestures. Additionally, we present a probabilistic reformulation of the Dynamic Time Warping method for gesture recognition in image sequences. More specifically, we model gestures with a Gaussian probabilistic model that implicitly encodes possible deformations in both the spatial and the temporal domains.
Ben-Younes, Hedi. « Multi-modal representation learning towards visual reasoning ». Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS173.
The quantity of images that populate the Internet is dramatically increasing. It becomes of critical importance to develop the technology for a precise and automatic understanding of visual contents. As image recognition systems are becoming more and more capable, researchers in artificial intelligence now seek next-generation vision systems that can perform high-level scene understanding. In this thesis, we are interested in Visual Question Answering (VQA), which consists in building models that answer any natural language question about any image. Because of its nature and complexity, VQA is often considered as a proxy for visual reasoning. Classically, VQA architectures are designed as trainable systems that are provided with images, questions about them and their answers. To tackle this problem, typical approaches involve modern Deep Learning (DL) techniques. In the first part, we focus on developing multi-modal fusion strategies to model the interactions between image and question representations. More specifically, we explore bilinear fusion models and exploit concepts from tensor analysis to provide tractable and expressive factorizations of parameters. These fusion mechanisms are studied under the widely used visual attention framework: the answer to the question is provided by focusing only on the relevant image regions. In the last part, we move away from the attention mechanism and build a more advanced scene understanding architecture where we consider objects and their spatial and semantic relations. All models are thoroughly experimentally evaluated on standard datasets and the results are competitive with the literature
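The bilinear-fusion-with-tensor-factorization idea can be made concrete. The sketch below is a generic low-rank factorization in the spirit of what the abstract describes, not the thesis's exact model; all dimensions are toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
dq, dv, dout, rank = 6, 5, 4, 3   # toy embedding sizes

# Low-rank factors standing in for a full (dout, dq, dv) bilinear tensor.
U = rng.normal(size=(dout, rank, dq))
V = rng.normal(size=(dout, rank, dv))

def bilinear_fuse(q, v):
    """Rank-constrained bilinear fusion of a question embedding q and an
    image embedding v: output dim k is sum_r (U[k,r] . q) * (V[k,r] . v),
    a factorized form of the full bilinear score q^T W_k v."""
    return ((U @ q) * (V @ v)).sum(axis=1)

q, v = rng.normal(size=dq), rng.normal(size=dv)
z = bilinear_fuse(q, v)

# Sanity check: the factorized form matches the full bilinear tensor.
W = np.einsum('krd,kre->kde', U, V)
z_full = np.einsum('kde,d,e->k', W, q, v)
assert np.allclose(z, z_full)
```

The factorized form needs `dout * rank * (dq + dv)` parameters instead of `dout * dq * dv` for the full tensor, which is what makes such fusions tractable.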
Boufoy-Bastick, Zacharyas Amaury. « Internet democracy : the political science and computer science of direct democracy at the large scale ». Thesis, Paris 4, 2014. http://www.theses.fr/2014PA040182.
Representative democracy suffers from numerous shortcomings that are so significant they bring into question the very legitimacy of modern democratic governments. While direct representation might theoretically eliminate these multiple defects, it has until now been considered unworkable due to limitations of space and of time. This thesis addresses these deficiencies by introducing Internet Democracy, which is distinct from existing e-democracy and e-government. Internet Democracy is an operational, computational formulation of democratic representation. To support this contribution, this thesis first derives the problems of democracy and indirect representation from first principles. It then proposes a new approach (the symbiotic structural approach) which applies the Internet to democracy. It then supports the proposition that Internet Democracy can operate through the analysis of passively collected data on information access and on information production (for instance, using sentiment analysis). Finally, it makes numerous topical contributions to computer science based on the observation that sentiment analysis hits a ceiling of accuracy which cannot currently be transcended. These contributions range from suggesting an Asymmetric Opinion Proposition (AOP) and applying this to a Sentiment Space describing the computational structure of sentiment; developing the first extremely fine-grained dataset for sentiment analysis; and applying Sentiment Space to develop the original ‘Split-Fit’ computing method which increases the accuracy of machine learning based Sentiment Analysis
Adapa, Supriya. « TensorFlow Federated Learning : Application to Decentralized Data ». Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.
Duminy, Willem H. « A learning framework for zero-knowledge game playing agents ». Pretoria : [s.n.], 2006. http://upetd.up.ac.za/thesis/available/etd-10172007-153836.
Texte intégralWilhelmi, Roca Francesc. « Towards spatial reuse in future wireless local area networks : a sequential learning approach ». Doctoral thesis, Universitat Pompeu Fabra, 2020. http://hdl.handle.net/10803/669970.
The Spatial Reuse (SR) operation is gaining momentum in the latest family of IEEE 802.11 standards, owing to the overwhelming requirements of next-generation wireless networks. In particular, the growing traffic demand and the number of concurrent devices compromise the efficiency of increasingly crowded wireless local area networks (WLANs) and call their decentralized nature into question. The SR operation, initially introduced by the IEEE 802.11ax-2021 standard and later studied in IEEE 802.11be-2024, aims to increase the number of concurrent transmissions in an overlapping basic service set (OBSS) by adjusting sensitivity and transmit power control, thereby improving spectral efficiency. Our study of the SR operation shows outstanding potential for increasing the number of simultaneous transmissions in crowded deployments, thus contributing to the development of low-latency next-generation applications. However, the potential benefits of SR are currently limited by the rigidity of the mechanism introduced for 11ax and the lack of coordination among the BSSs that implement it. The SR operation is evolving towards coordinated schemes in which different BSSs cooperate. Coordination, however, entails communication and synchronization overhead, which impacts WLAN performance. Moreover, the coordinated scheme is incompatible with devices using earlier IEEE 802.11 versions, which could degrade the performance of already deployed networks. For these reasons, this thesis assesses the feasibility of decentralized mechanisms for SR and thoroughly analyzes the main impediments and shortcomings that may arise.
Our goal is to shed light on the future shape of WLANs with respect to SR optimization: whether their decentralized nature should be kept, or whether it is preferable to evolve towards coordinated, centralized deployments. To address SR in a decentralized manner, we focus on Artificial Intelligence (AI) and propose using a class of sequential learning-based methods called Multi-Armed Bandits (MABs). The MAB framework suits the decentralized SR problem because it addresses the uncertainty caused by the concurrent operation of multiple devices (i.e., a multi-player setting) and the resulting lack of information. MABs can cope with the complexity behind the spatial interactions among devices that result from modifying their sensitivity and transmit power. In this regard, our results indicate significant performance gains (up to 100%) in highly dense deployments. Nevertheless, applying multi-agent machine learning raises several issues that may compromise the performance of the devices in a network (definition of joint objectives, convergence horizon, scalability aspects, or non-stationarity). In addition, our study of multi-agent learning for SR includes infrastructure aspects for next-generation networks that intrinsically integrate AI.
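A minimal sketch of the MAB setting described above, with epsilon-greedy action selection standing in for whichever bandit policies the thesis actually evaluates. Here each arm abstractly represents one (sensitivity, transmit power) configuration, and the reward function is a toy stand-in for observed throughput.

```python
import numpy as np

def epsilon_greedy_bandit(reward_fn, n_arms, n_rounds, eps=0.1, seed=0):
    """Minimal epsilon-greedy multi-armed bandit: with probability eps
    explore a random arm, otherwise exploit the best empirical mean."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(n_arms)
    means = np.zeros(n_arms)
    for _ in range(n_rounds):
        if rng.random() < eps:
            arm = int(rng.integers(n_arms))
        else:
            arm = int(np.argmax(means))
        r = reward_fn(arm, rng)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # running average
    return int(np.argmax(means))

# Toy environment: arm 2 yields the highest mean 'throughput'.
best = epsilon_greedy_bandit(
    lambda a, rng: rng.normal([0.2, 0.5, 0.9][a], 0.05), 3, 2000)
```

Each device running such a loop independently is what makes the scheme decentralized; the non-stationarity issues the abstract mentions arise because every device's reward depends on the others' current arms.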
Feutry, Clément. « Two sides of relevant information : anonymized representation through deep learning and predictor monitoring ». Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS479.
The work presented here lies, for its first part, at the intersection of deep learning and anonymization. A full framework was developed in order to identify and remove, to a certain extent and in an automated manner, the features linked to an identity in the context of image data. Two different kinds of data processing were explored. They both share the same Y-shaped network architecture, although components of this network vary according to the final purpose. The first one was about building from the ground up an anonymized representation that allowed a trade-off between keeping relevant features and tampering with private features. This framework led to a new loss. The second kind of data processing specified no relevant information about the data, only private information, meaning that everything not related to private features is assumed relevant. Therefore the anonymized representation shares the same nature as the initial data (e.g. an image is transformed into an anonymized image). This task led to another type of architecture (still Y-shaped) and provided results strongly dependent on the type of data. The second part of the work is relative to another kind of relevant information: it focuses on the monitoring of predictor behavior. In the context of black-box analysis, we only have access to the probabilities outputted by the predictor (without any knowledge of the type of structure/architecture producing these probabilities). This monitoring is done in order to detect abnormal behavior that is an indicator of a potential mismatch between the data statistics and the model statistics. Two methods are presented using different tools. The first one is based on comparing the empirical cumulative distributions of known data and of the data to be tested. The second one introduces two tools: one relying on the classifier uncertainty and the other relying on the confusion matrix. These methods produce conclusive results
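The first monitoring tool described above, comparing empirical cumulative distributions, is essentially a two-sample Kolmogorov-Smirnov distance. A hedged sketch; the thesis's exact statistic and decision thresholds may differ.

```python
import numpy as np

def ks_statistic(reference, observed):
    """Kolmogorov-Smirnov distance between the empirical CDFs of two
    score samples, e.g. a predictor's confidences on training-like
    data vs. on incoming data."""
    grid = np.sort(np.concatenate([reference, observed]))
    ecdf = lambda s, x: np.searchsorted(np.sort(s), x, side='right') / len(s)
    return float(np.max(np.abs(ecdf(reference, grid) - ecdf(observed, grid))))

rng = np.random.default_rng(0)
in_dist = rng.uniform(0.8, 1.0, 500)   # confident, in-distribution scores
same = rng.uniform(0.8, 1.0, 500)      # more of the same -> small distance
shifted = rng.uniform(0.3, 0.7, 500)   # drifted scores -> large distance
```

A large statistic on incoming scores flags a potential mismatch between data and model statistics without any access to the predictor's internals.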
Maxwell, Tricia Lesley. « Factors affecting the representation of objects in distributed attention ». Thesis, University of Sussex, 2011. http://sro.sussex.ac.uk/id/eprint/7478/.
Texte intégralMoulouel, Koussaila. « Hybrid AI approaches for context recognition : application to activity recognition and anticipation and context abnormalities handling in Ambient Intelligence environments ». Electronic Thesis or Diss., Paris Est, 2023. http://www.theses.fr/2023PESC0014.
Ambient Intelligence (AmI) systems aim to provide users with assistance services intended to improve the quality of their lives in terms of autonomy, safety, and well-being. The design of AmI systems capable of accurate, fine-grained and consistent recognition of the spatial and/or temporal user context, taking into account the uncertainty and partial observability of AmI environments, poses several challenges for better adapting the assistance services to the user's context. The purpose of this thesis is to propose a set of contributions that address these challenges. Firstly, a context ontology is proposed to model contextual knowledge in AmI environments. The purpose of this ontology is to model the user's context taking into account different context attributes and to define the commonsense reasoning axioms necessary to infer and update the user's context. The second contribution is an ontology-based hybrid framework that combines probabilistic commonsense reasoning and probabilistic planning to recognize the user's context, in particular context abnormalities, and provide context-aware assistance services in the presence of uncertainty and partial observability of the environments. This framework exploits context attribute predictions, namely the user's activity and the user's location, provided by deep learning models. In this framework, the probabilistic commonsense reasoning is based on the proposed context ontology to define the axiomatization of context inference and planning under uncertainty. Probabilistic planning is used to characterize abnormal contexts by coping with the incompleteness of contextual knowledge due to the partial observability of AmI environments. The proposed framework was evaluated using transformer and CNN-LSTM models on the Orange4Home and SIMADL datasets.
The results show the effectiveness of the framework to recognize user's contexts, in terms of user's activity and location, along with context abnormalities. Thirdly, a hybrid framework combining deep learning and probabilistic commonsense reasoning for anticipating human activities based on egocentric videos is proposed. The probabilistic commonsense reasoning exploited in this framework is based on abductive reasoning to anticipate both human atomic and composite activities, and temporal reasoning to capture context attribute changes. Deep learning models were exploited to recognize context attributes, such as objects, human hands, and human locations. The context ontology is used to model the relationships between atomic activities and composite activities. The evaluation of the framework shows its ability to anticipate composite activities over a time horizon of minutes, in contrast to state-of-the-art approaches that can only anticipate atomic activities over a time horizon of seconds. It also showed good performance in terms of accuracy of classification of anticipated activities and computation time. Lastly, a stream reasoning-based framework is proposed to anticipate atomic and composite human activities from data streams of context attributes collected on-the-fly. Deep learning models were used to recognize context attributes, such as objects used in activities, hands and user locations. The stream reasoning system performs causal, abductive and temporal reasoning with contextual knowledge obtained at run-time. Dynamic effect axioms were introduced to anticipate composite activities that can be subject to unforeseen events, such as skipping an atomic activity and delay an atomic activity. The proposed framework was validated through experiments conducted in a kitchen environment. 
The remarkably high performance in terms of the number of activity anticipations shows the ability of the framework to take into account the contextual knowledge of past episodes needed to anticipate composite activities
Feng, Qianli. « Modeling Action Intentionality in Humans and Machines ». The Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu1616769653536292.
Texte intégralSerban, Iulian Vlad. « Representation learning for dialogue systems ». Thèse, 2019. http://hdl.handle.net/1866/23440.
This thesis presents a series of steps taken towards investigating representation learning (e.g. deep learning) for building dialogue systems and conversational agents. The thesis is split into two general parts. The first part of the thesis investigates representation learning for generative dialogue models. Conditioned on a sequence of turns from a text-based dialogue, these models are tasked with generating the next, appropriate response in the dialogue. This part of the thesis focuses on sequence-to-sequence models, a class of generative deep neural networks. First, we propose the Hierarchical Recurrent Encoder-Decoder model, which is an extension of the vanilla sequence-to-sequence model incorporating the turn-taking structure of dialogues. Second, we propose the Multiresolution Recurrent Neural Network model, which is a stacked sequence-to-sequence model with an intermediate, stochastic representation (a "coarse representation") capturing the abstract semantic content communicated between the dialogue speakers. Third, we propose the Latent Variable Recurrent Encoder-Decoder model, which is a variant of the Hierarchical Recurrent Encoder-Decoder model with latent, stochastic normally-distributed variables. The latent, stochastic variables are intended for modelling the ambiguity and uncertainty occurring naturally in human language communication. The three models are evaluated and compared on two dialogue response generation tasks: a Twitter response generation task and the Ubuntu technical response generation task. The second part of the thesis investigates representation learning for a real-world reinforcement learning dialogue system. Specifically, this part focuses on the Milabot system built by the Quebec Artificial Intelligence Institute (Mila) for the Amazon Alexa Prize 2017 competition. Milabot is a system capable of conversing with humans on popular small talk topics through both speech and text.
The system consists of an ensemble of natural language retrieval and generation models, including template-based models, bag-of-words models, and variants of the models discussed in the first part of the thesis. This part of the thesis focuses on the response selection task. Given a sequence of turns from a dialogue and a set of candidate responses, the system must select an appropriate response to give the user. A model-based reinforcement learning approach, called the Bottleneck Simulator, is proposed for selecting the appropriate candidate response. The Bottleneck Simulator learns an approximate model of the environment based on observed dialogue trajectories and human crowdsourcing, while utilizing an abstract (bottleneck) state representing high-level discourse semantics. The learned environment model is then employed to learn a reinforcement learning policy through rollout simulations. The learned policy has been evaluated and compared to competing approaches through A/B testing with real-world users, where it was found to yield excellent performance.
« Multimodal Representation Learning for Visual Reasoning and Text-to-Image Translation ». Master's thesis, 2018. http://hdl.handle.net/2286/R.I.51644.
Dissertation/Thesis
Masters Thesis Computer Engineering 2018
« Knowledge Representation, Reasoning and Learning for Non-Extractive Reading Comprehension ». Doctoral diss., 2019. http://hdl.handle.net/2286/R.I.55482.
Dissertation/Thesis
Doctoral Dissertation Computer Science 2019
Racah, Evan. « Unsupervised representation learning in interactive environments ». Thèse, 2019. http://hdl.handle.net/1866/23788.
Extracting a representation of all the high-level factors of an agent's state from low-level sensory information is an important, but challenging task in machine learning. In this thesis, we will explore several unsupervised approaches for learning these state representations. We apply and analyze existing unsupervised representation learning methods in reinforcement learning environments, as well as contribute our own evaluation benchmark and our own novel state representation learning method. In the first chapter, we will overview and motivate unsupervised representation learning for machine learning in general and for reinforcement learning. We will then introduce a relatively new subfield of representation learning: self-supervised learning. We will then cover two core representation learning approaches, generative methods and discriminative methods. Specifically, we will focus on a collection of discriminative representation learning methods called contrastive unsupervised representation learning (CURL) methods. We will close the first chapter by detailing various approaches for evaluating the usefulness of representations. In the second chapter, we will present a workshop paper, where we evaluate a handful of off-the-shelf self-supervised methods in reinforcement learning problems. We discover that the performance of these representations depends heavily on the dynamics and visual structure of the environment. As such, we determine that a more systematic study of environments and methods is required. Our third chapter covers our second article, Unsupervised State Representation Learning in Atari, where we try to execute a more thorough study of representation learning methods in RL as motivated by the second chapter. To facilitate a more thorough evaluation of representations in RL we introduce a benchmark of 22 fully labelled Atari games.
In addition, we choose the representation learning methods for comparison in a more systematic way by focusing on comparing generative methods with contrastive methods, instead of the less systematically chosen off-the-shelf methods from the second chapter. Finally, we introduce a new contrastive method, ST-DIM, which excels at the 22 Atari games.
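Contrastive (CURL-style) methods of the kind compared above typically optimize an InfoNCE objective; below is a minimal numpy sketch of that loss, not the exact ST-DIM formulation, with all names illustrative.

```python
import numpy as np

def infonce_loss(anchors, positives, temperature=0.1):
    """InfoNCE objective behind contrastive representation learning:
    each anchor must pick out its own positive among the batch, with
    cosine similarity as the score; true pairs sit on the diagonal."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))           # cross-entropy on diag

rng = np.random.default_rng(0)
states = rng.normal(size=(8, 16))
# Positives close to their anchors yield a much lower loss than
# unrelated positives.
aligned = infonce_loss(states, states + 0.01 * rng.normal(size=(8, 16)))
mismatched = infonce_loss(states, rng.normal(size=(8, 16)))
```

In the RL setting, anchors and positives are typically embeddings of temporally adjacent frames or of different patches of the same observation.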
Dumoulin, Vincent. « Representation Learning for Visual Data ». Thèse, 2018. http://hdl.handle.net/1866/21140.
Hosseini, Seyedarian. « Towards learning sentence representation with self-supervision ». Thèse, 2019. http://hdl.handle.net/1866/23784.
In chapter 1, we introduce the basics of feed forward neural networks and recurrent neural networks. The chapter continues with the discussion of the backpropagation algorithm to train feed forward neural networks, and the backpropagation through time algorithm to train recurrent neural networks. We also discuss three different approaches to learning representations, namely supervised learning, unsupervised learning, and a relatively new approach called self-supervised learning. In chapter 2, we talk about the fundamentals of deep natural language processing. Specifically, we cover word representations, sentence representations, and language modelling. We focus on the evaluation and current state of the literature for these concepts. We close the chapter by discussing large-scale pre-training and transfer learning in language. In chapter 3, we investigate a set of self-supervised tasks that take advantage of noise contrastive estimation in order to learn sentence representations using unlabeled data. We train our model on a large corpus and evaluate our learned sentence representations on a set of downstream natural language tasks from the SentEval framework. Our model trained on the proposed tasks outperforms unsupervised methods on a subset of tasks from SentEval. In chapter 4, we introduce a memory-augmented model called Ordered Memory with several improvements over traditional stack-augmented recurrent neural networks. We introduce a new stick-breaking attention mechanism inspired by Ordered Neurons [Shen et al., 2019] to write into and erase from the memory. A new Gated Recursive Cell is also introduced to compose low-level representations into higher-level ones. We show that this model performs well on the logical inference task and the ListOps task, and it also shows strong generalization properties in these tasks.
Finally, we evaluate our model on the SST (Stanford Sentiment Treebank) tasks (binary and fine-grained) and report results that are comparable with state-of-the-art on these tasks.
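The stick-breaking attention mentioned above admits a compact illustration: slot i takes a fraction beta_i of whatever part of the probability "stick" the earlier slots left over. A hedged sketch of just the weight computation; the full Ordered Memory cell is far richer.

```python
import numpy as np

def stick_breaking_weights(betas):
    """Stick-breaking attention weights: p_i = beta_i * prod_{j<i} (1 - beta_j),
    so earlier slots claim their share of the stick before later ones."""
    betas = np.asarray(betas, dtype=float)
    # Fraction of the stick still available before each slot.
    leftover = np.concatenate([[1.0], np.cumprod(1.0 - betas[:-1])])
    return betas * leftover

w = stick_breaking_weights([0.5, 0.5, 1.0])
# -> [0.5, 0.25, 0.25]; a beta of 1 consumes everything that remains.
```

Because the weights are monotonically gated, the mechanism naturally mimics push and pop operations on a stack-like memory.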