Theses on the topic "Apprentissage de representation d'etats"
Listed below are the top 50 dissertations (master's and doctoral theses) on the research topic "Apprentissage de representation d'etats".
Hautot, Julien. "Représentation à base radiale pour l'apprentissage par renforcement visuel". Electronic Thesis or Diss., Université Clermont Auvergne (2021-...), 2024. http://www.theses.fr/2024UCFA0093.
This thesis work falls within the context of Reinforcement Learning (RL) from image data. Unlike supervised learning, which enables performing various tasks such as classification, regression, or segmentation from an annotated database, RL allows learning without a database through interactions with an environment. In these methods, an agent, such as a robot, performs different actions to explore its environment and gather training data. Training such an agent involves trial and error; the agent is penalized when it fails at its task and rewarded when it succeeds. The goal for the agent is to improve its behavior to obtain the most long-term rewards. We focus on visual extractions in RL scenarios using first-person view images. The use of visual data often involves deep convolutional networks that work directly on images. However, these networks have significant computational complexity, lack interpretability, and sometimes suffer from instability. To overcome these difficulties, we investigated the development of a network based on radial basis functions, which enable sparse and localized activations in the input space. Radial basis function networks (RBFNs) peaked in the 1990s but were later supplanted by convolutional networks due to their high computational cost on images. In this thesis, we developed a visual feature extractor inspired by RBFNs, reducing the computational cost on images. We used our network for solving first-person visual tasks and compared its results with various state-of-the-art methods, including end-to-end learning methods, state representation learning methods, and extreme learning machine methods. Different scenarios were tested from the VizDoom simulator and the Pybullet robotics physics simulator. In addition to comparing the rewards obtained after learning, we conducted various tests on noise robustness, parameter generation of our network, and task transfer to reality. The proposed network achieves the best performance in reinforcement learning on the tested scenarios while being easier to use and interpret. Additionally, our network is robust to various noise types, paving the way for the effective transfer of knowledge acquired in simulation to reality.
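As a purely illustrative sketch of the kind of radial-basis feature extraction this abstract describes, the following Python/NumPy fragment projects an image observation onto a bank of Gaussian units; the shapes, names and values are assumptions for the example, not the parametrization used in the thesis.

    import numpy as np

    def rbf_features(image, centers, widths):
        """Project a flattened image onto Gaussian radial basis units.

        image:   (H, W) grayscale observation with values in [0, 1]
        centers: (K, D) centers of the K radial basis units
        widths:  (K,)   per-unit Gaussian widths (standard deviations)
        Returns a (K,) vector of sparse, localized activations.
        """
        x = image.reshape(-1)                     # flatten to a D-dimensional vector
        d2 = ((centers - x) ** 2).sum(axis=1)     # squared distance to each center
        return np.exp(-d2 / (2.0 * widths ** 2))  # Gaussian activation per unit

    # Toy usage: random centers/widths playing the role of learned parameters.
    rng = np.random.default_rng(0)
    obs = rng.random((32, 32))                    # stand-in for a first-person view frame
    centers = rng.random((64, 32 * 32))           # 64 hypothetical RBF units
    widths = np.full(64, 5.0)
    state_features = rbf_features(obs, centers, widths)  # fed to an RL policy or value head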
Dos, Santos Ludovic. "Representation learning for relational data". Electronic Thesis or Diss., Paris 6, 2017. http://www.theses.fr/2017PA066480.
The increasing use of social and sensor networks generates a large quantity of data that can be represented as complex graphs. Many tasks can be imagined on such data, from information analysis to prediction and retrieval, in which the relations between graph nodes should be informative. In this thesis, we proposed different models for three different tasks: graph node classification, relational time series forecasting, and collaborative filtering. All the proposed models use the representation learning framework in its deterministic or Gaussian variant. First, we proposed two algorithms for the heterogeneous graph labeling task, one using deterministic representations and the other Gaussian representations. Contrary to other state-of-the-art models, our solution is able to learn edge weights while learning the representations and the classifiers simultaneously. Second, we proposed an algorithm for relational time series forecasting where the observations are not only correlated inside each series, but also across the different series. We use Gaussian representations in this contribution. This was an opportunity to see in which way using Gaussian representations instead of deterministic ones was profitable. Finally, we apply the Gaussian representation learning approach to the collaborative filtering task. This is preliminary work to see whether the properties of Gaussian representations found on the two previous tasks were also verified for the ranking one. The goal of this work was then to generalize the approach to more relational data and not only bipartite graphs between users and items.
Zaiem, Mohamed Salah. "Informed Speech Self-supervised Representation Learning". Electronic Thesis or Diss., Institut polytechnique de Paris, 2024. http://www.theses.fr/2024IPPAT009.
Feature learning has been driving machine learning advancement, with the recently proposed methods getting progressively rid of handcrafted parts within the transformations from inputs to desired labels. Self-supervised learning has emerged within this context, allowing the processing of unlabeled data towards better performance on low-labeled tasks. The first part of my doctoral work is aimed towards motivating the choices in the speech self-supervised pipelines learning the unsupervised representations. In this thesis, I first show how conditional-independence-based scoring can be used to efficiently and optimally select pretraining tasks tailored for the best performance on a target task. The second part of my doctoral work studies the evaluation and usage of pretrained self-supervised representations. I explore, first, the robustness of current speech self-supervision benchmarks to changes in the downstream modeling choices. I propose, second, fine-tuning approaches for better efficiency and generalization.
Carvalho, Micael. "Deep representation spaces". Electronic Thesis or Diss., Sorbonne université, 2018. http://www.theses.fr/2018SORUS292.
In recent years, Deep Learning techniques have swept the state-of-the-art of many applications of Machine Learning, becoming the new standard approach for them. The architectures issued from these techniques have been used for transfer learning, which extended the power of deep models to tasks that did not have enough data to fully train them from scratch. This thesis' subject of study is the representation spaces created by deep architectures. First, we study properties inherent to them, with particular interest in dimensionality redundancy and precision of their features. Our findings reveal a strong degree of robustness, pointing the path to simple and powerful compression schemes. Then, we focus on refining these representations. We choose to adopt a cross-modal multi-task problem, and design a loss function capable of taking advantage of data coming from multiple modalities, while also taking into account different tasks associated with the same dataset. In order to correctly balance these losses, we also develop a new sampling scheme that only takes into account examples contributing to the learning phase, i.e. those having a positive loss. Finally, we test our approach in a large-scale dataset of cooking recipes and associated pictures. Our method achieves a 5-fold improvement over the state-of-the-art, and we show that the multi-task aspect of our approach promotes a semantically meaningful organization of the representation space, allowing it to perform subtasks never seen during training, like ingredient exclusion and selection. The results we present in this thesis open many possibilities, including feature compression for remote applications, robust multi-modal and multi-task learning, and feature space refinement. For the cooking application, in particular, many of our findings are directly applicable in a real-world context, especially for the detection of allergens, finding alternative recipes due to dietary restrictions, and menu planning.
Le, Naour Étienne. "Learning neural representation for time series". Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUS211.
Time series analysis has become increasingly important in various fields, including industry, finance, and climate science. The proliferation of sensors and the data heterogeneity necessitate effective time series modeling techniques. While complex supervised machine learning models have been developed for specific tasks, representation learning offers a different approach by learning data representations in a new space without explicitly focusing on solving a supervised task. The learned representation is then reused to improve the performance of supervised tasks applied on top of it. Recently, deep learning has transformed time series modeling, with advanced models like convolutional and attention-based neural networks achieving state-of-the-art performance in classification, imputation, or forecasting. The fusion of representation learning and deep learning has given rise to the field of neural representation learning. Neural representations have a greater ability to extract intricate features and patterns compared to non-neural representations, making them more powerful and effective in handling complex time series data. Recent advances in the field have significantly improved the quality of time series representations, enhancing their usefulness for various downstream tasks. This thesis focuses on advancing the field of neural representation learning for time series, targeting both industrial and academic needs. This research addresses open problems in the domain, such as creating interpretable neural representations, developing continuous time series representations that handle irregular and unaligned time series, and creating adaptable models for distribution shifts. This manuscript offers multiple contributions to tackle the previously mentioned challenges in neural representation learning for time series. First, we propose an interpretable discrete neural representation model for time series based on a vector quantization encoder-decoder architecture, which facilitates interpretable classification. Secondly, we design a continuous implicit neural representation model, called TimeFlow, for time series imputation and forecasting that can handle unaligned and irregular samples. This model leverages time series data representation, enabling it to adapt to new samples and unseen contexts by adjusting the representations. Lastly, we demonstrate that TimeFlow learns relevant features, making the representation space effective for downstream tasks such as data generation. These contributions aim to advance the field of neural representation learning for time series and provide practical solutions to real-world industrial challenges.
Trottier, Ludovic. "Sparse, hierarchical and shared-factors priors for representation learning". Doctoral thesis, Université Laval, 2019. http://hdl.handle.net/20.500.11794/35777.
Feature representation is a central concern of today’s machine learning systems. A proper representation can facilitate a complex learning task. This is the case when, for instance, the representation has low dimensionality and consists of high-level characteristics. But how can we determine if a representation is adequate for a learning task? Recent work suggests that it is better to see the choice of representation as a learning problem in itself. This is called Representation Learning. This thesis presents a series of contributions aimed at improving the quality of the learned representations. The first contribution elaborates a comparative study of Sparse Dictionary Learning (SDL) approaches on the problem of grasp detection (for robotic grasping) and provides an empirical analysis of their advantages and disadvantages. The second contribution proposes a Convolutional Neural Network (CNN) architecture for grasp detection and compares it to SDL. Then, the third contribution elaborates a new parametric activation function and validates it experimentally. Finally, the fourth contribution details a new soft parameter sharing mechanism for multi-task learning.
Gerald, Thomas. "Representation Learning for Large Scale Classification". Electronic Thesis or Diss., Sorbonne université, 2020. http://www.theses.fr/2020SORUS316.
The past decades have seen the rise of new technologies that simplify information sharing. Today, a huge part of the data is accessible to most users. In this thesis, we propose to study the problems of document annotation to ease access to information thanks to retrieved annotations. We will be interested in extreme classification-related tasks, which characterize the task of automatic annotation when the number of labels is large. Many difficulties arise from the size and complexity of this data: prediction time, storage and the relevance of the annotations are the most representative. Recent research dealing with this issue is based on three classification schemes: "one against all" approaches learning as many classifiers as labels; "hierarchical" methods organizing a simple classifier structure; representation approaches embedding documents into small spaces. In this thesis, we study the representation classification scheme. Through our contributions, we study different approaches either to speed up prediction or to better structure representations. In a first part, we will study discrete representations such as "ECOC" methods to speed up the annotation process. In a second part, we will consider hyperbolic embeddings to take advantage of the qualities of this space for the representation of structured data.
Coria, Juan Manuel. "Continual Representation Learning in Written and Spoken Language". Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPASG025.
Although machine learning has recently witnessed major breakthroughs, today's models are mostly trained once on a target task and then deployed, rarely (if ever) revisiting their parameters. This problem affects performance after deployment, as task specifications and data may evolve with user needs and distribution shifts. To solve this, continual learning proposes to train models over time as new data becomes available. However, models trained in this way suffer from significant performance loss on previously seen examples, a phenomenon called catastrophic forgetting. Although many studies have proposed different strategies to prevent forgetting, they often rely on labeled data, which is rarely available in practice. In this thesis, we study continual learning for written and spoken language. Our main goal is to design autonomous and self-learning systems able to leverage scarce on-the-job data to adapt to the new environments they are deployed in. Contrary to recent work on learning general-purpose representations (or embeddings), we propose to leverage representations that are tailored to a downstream task. We believe the latter may be easier to interpret and exploit by unsupervised training algorithms like clustering, that are less prone to forgetting. Throughout our work, we improve our understanding of continual learning in a variety of settings, such as the adaptation of a language model to new languages for sequence labeling tasks, or even the adaptation to a live conversation in the context of speaker diarization. We show that task-specific representations allow for effective low-resource continual learning, and that a model's own predictions can be exploited for full self-learning.
Venkataramanan, Shashanka. "Metric learning for instance and category-level visual representation". Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS022.
The primary goal in computer vision is to enable machines to extract meaningful information from visual data, such as images and videos, and leverage this information to perform a wide range of tasks. To this end, substantial research has focused on developing deep learning models capable of encoding comprehensive and robust visual representations. A prominent strategy in this context involves pretraining models on large-scale datasets, such as ImageNet, to learn representations that can exhibit cross-task applicability and facilitate the successful handling of diverse downstream tasks with minimal effort. To facilitate learning on these large-scale datasets and encode good representations, complex data augmentation strategies have been used. However, these augmentations can be limited in their scope, either being hand-crafted and lacking diversity, or generating images that appear unnatural. Moreover, the focus of these augmentation techniques has primarily been on the ImageNet dataset and its downstream tasks, limiting their applicability to a broader range of computer vision problems. In this thesis, we aim to tackle these limitations by exploring different approaches to enhance the efficiency and effectiveness in representation learning. The common thread across the works presented is the use of interpolation-based techniques, such as mixup, to generate diverse and informative training examples beyond the original dataset. In the first work, we are motivated by the idea of deformation as a natural way of interpolating images rather than using a convex combination. We show that geometrically aligning the two images in the feature space allows for more natural interpolation that retains the geometry of one image and the texture of the other, connecting it to style transfer. Drawing from these observations, we explore the combination of mixup and deep metric learning. We develop a generalized formulation that accommodates mixup in metric learning, leading to improved representations that explore areas of the embedding space beyond the training classes. Building on these insights, we revisit the original motivation of mixup and generate a larger number of interpolated examples beyond the mini-batch size by interpolating in the embedding space. This approach allows us to sample on the entire convex hull of the mini-batch, rather than just along linear segments between pairs of examples. Finally, we investigate the potential of using natural augmentations of objects from videos. We introduce a "Walking Tours" dataset of first-person egocentric videos, which capture a diverse range of objects and actions in natural scene transitions. We then propose a novel self-supervised pretraining method called DoRA, which detects and tracks objects in video frames, deriving multiple views from the tracks and using them in a self-supervised manner.
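For illustration only, the following Python/NumPy sketch generates virtual examples by interpolating embeddings over the convex hull of a mini-batch, in the spirit of the embedding-space interpolation described above; all names, shapes and values are hypothetical and do not reproduce the thesis formulation.

    import numpy as np

    def embedding_mixup(embeddings, labels_onehot, n_virtual, alpha=0.2, rng=None):
        """Sample virtual examples on the convex hull of a mini-batch of embeddings.

        embeddings:    (B, D) mini-batch of embedding vectors
        labels_onehot: (B, C) one-hot (or soft) targets for the batch
        n_virtual:     number of interpolated examples to generate (can exceed B)
        alpha:         concentration of the Dirichlet distribution over the batch
        Returns (n_virtual, D) mixed embeddings and (n_virtual, C) mixed targets.
        """
        rng = rng or np.random.default_rng()
        B = embeddings.shape[0]
        # Convex weights over the whole batch, not just pairs of examples.
        w = rng.dirichlet(alpha * np.ones(B), size=n_virtual)   # (n_virtual, B)
        mixed_z = w @ embeddings                                # (n_virtual, D)
        mixed_y = w @ labels_onehot                             # (n_virtual, C)
        return mixed_z, mixed_y

    # Toy usage with random embeddings standing in for a network's output.
    rng = np.random.default_rng(0)
    z = rng.normal(size=(16, 128))                   # mini-batch of 16 embeddings
    y = np.eye(10)[rng.integers(0, 10, size=16)]     # one-hot labels, 10 classes
    virtual_z, virtual_y = embedding_mixup(z, y, n_virtual=64, rng=rng)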
Wauquier, Pauline. "Task driven representation learning". Thesis, Lille 3, 2017. http://www.theses.fr/2017LIL30005/document.
Machine learning proposes numerous algorithms to solve the different tasks that can be extracted from real-world prediction problems. To solve these different tasks, most machine learning algorithms somehow rely on relationships between instances. Pairwise instance relationships can be obtained by computing a distance between the vectorial representations of the instances. Considering the available vectorial representation of the data, none of the commonly used distances is ensured to be representative of the task to be solved. In this work, we investigate the gain of tuning the vectorial representation of the data to the distance to more optimally solve the task. We more particularly focus on an existing graph-based algorithm for the classification task. An algorithm to learn a mapping of the data in a representation space which allows an optimal graph-based classification is first introduced. By projecting the data in a representation space in which the predefined distance is representative of the task, we aim at outperforming the initial vectorial representation of the data when solving the task. A theoretical analysis of the introduced algorithm is performed to define the conditions ensuring an optimal classification. A set of empirical experiments allows us to evaluate the gain of the introduced approach and to temper the theoretical analysis.
Ben-Younes, Hedi. "Multi-modal representation learning towards visual reasoning". Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS173.
The quantity of images that populate the Internet is dramatically increasing. It becomes of critical importance to develop the technology for a precise and automatic understanding of visual contents. As image recognition systems are becoming more and more relevant, researchers in artificial intelligence now seek the next generation of vision systems that can perform high-level scene understanding. In this thesis, we are interested in Visual Question Answering (VQA), which consists in building models that answer any natural language question about any image. Because of its nature and complexity, VQA is often considered as a proxy for visual reasoning. Classically, VQA architectures are designed as trainable systems that are provided with images, questions about them and their answers. To tackle this problem, typical approaches involve modern Deep Learning (DL) techniques. In the first part, we focus on developing multi-modal fusion strategies to model the interactions between image and question representations. More specifically, we explore bilinear fusion models and exploit concepts from tensor analysis to provide tractable and expressive factorizations of parameters. These fusion mechanisms are studied under the widely used visual attention framework: the answer to the question is provided by focusing only on the relevant image regions. In the last part, we move away from the attention mechanism and build a more advanced scene understanding architecture where we consider objects and their spatial and semantic relations. All models are thoroughly experimentally evaluated on standard datasets and the results are competitive with the literature.
Dehouck, Mathieu. "Multi-lingual dependency parsing : word representation and joint training for syntactic analysis". Thesis, Lille 1, 2019. http://www.theses.fr/2019LIL1I019/document.
While modern dependency parsers have become as good as human experts, they still rely heavily on hand-annotated training examples which are available for a handful of languages only. Several methods such as model and annotation transfer have been proposed to make high-quality syntactic analysis available to low-resourced languages as well. In this thesis, we propose new approaches for sharing information across languages relying on their shared morphological features. First, we propose to use shared morphological features to induce cross-lingual delexicalised word representations that help learn syntactic analysis models. Then, we propose a new multi-task learning framework called phylogenetic learning which learns models for related tasks/languages guided by the tasks/languages evolutionary tree. Finally, with our new measure of morphosyntactic complexity we investigate the intrinsic role of morphological information for dependency parsing.
Prang, Mathieu. "Representation learning for symbolic music". Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS489.
A key part in the recent success of deep language processing models lies in the ability to learn efficient word embeddings. These methods provide structured spaces of reduced dimensionality with interesting metric relationship properties. These, in turn, can be used as efficient input representations for handling more complex tasks. In this thesis, we focus on the task of learning embedding spaces for polyphonic music in the symbolic domain. To do so, we explore two different approaches. We introduce an embedding model based on a convolutional network with a novel type of self-modulated hierarchical attention, which is computed at each layer to obtain a hierarchical vision of musical information. Then, we propose another system based on VAEs, a type of auto-encoder that constrains the data distribution of the latent space to be close to a prior distribution. As polyphonic music information is very complex, the design of the input representation is a crucial process. Hence, we introduce a novel representation of symbolic music data, which transforms a polyphonic score into a continuous signal. Finally, we show the potential of the resulting embedding spaces through the development of several creative applications used to enhance musical knowledge and expression, through tasks such as melody modification or composer identification.
Chameron, Stéphane. "Apprentissage et representation des informations spatiales chez la fourmi cataglyphis cursor (hymenoptera, formicidae)". Toulouse 3, 1999. http://www.theses.fr/1999TOU30081.
Testo completoDenize, Julien. "Self-supervised representation learning and applications to image and video analysis". Electronic Thesis or Diss., Normandie, 2023. http://www.theses.fr/2023NORMIR37.
In this thesis, we develop approaches to perform self-supervised learning for image and video analysis. Self-supervised representation learning allows pretraining neural networks to learn general concepts without labels before specializing in downstream tasks faster and with few annotations. We present three contributions to self-supervised image and video representation learning. First, we introduce the theoretical paradigm of soft contrastive learning and its practical implementation called Similarity Contrastive Estimation (SCE), connecting contrastive and relational learning for image representation. Second, SCE is extended to global temporal video representation learning. Lastly, we propose COMEDIAN, a pipeline for local-temporal video representation learning for transformers. These contributions achieved state-of-the-art results on multiple benchmarks and led to several published academic and technical contributions.
Dufumier, Benoit. "Representation learning in neuroimaging : transferring from big healthy data to small clinical cohorts". Electronic Thesis or Diss., université Paris-Saclay, 2022. http://www.theses.fr/2022UPASG093.
Psychiatry currently lacks objective quantitative measures to guide the clinician in choosing the proper therapeutic treatment. The physio-pathology of mental illnesses such as schizophrenia and bipolar disorder is still poorly understood, but the emergence of large-scale neuroimaging transdiagnostic datasets gives a unique opportunity for studying the neuroanatomical signatures of such diseases. While Deep Learning (DL) models for medical imaging unlocked unprecedented applications such as image segmentation, their applicability to single-subject prediction problems with neuroanatomical MRI remains limited. In this thesis, we first study the current performance and scaling trend of DL models, for several architectures representative of the recent progression in computer vision, as compared to regularized linear models and Kernel Support Vector Machines. We found a high over-fitting issue on clinical datasets and a similar scaling trend with linear models, for the sample sizes currently accessible in clinical research. This over-fitting effect was also due to the bias induced by MRI scanners and acquisition protocols. To tackle the sample size issue, we propose a new method to learn a representation of the healthy population brain anatomy on large multi-site cohorts with neural networks using contrastive learning, an innovative self-supervised framework. When transferring this knowledge to new datasets, we demonstrate an improvement in the classification performance of patients with mental illnesses. We provide a theoretical framework grounding these empirical results and we show good generalization properties of the model for downstream classification tasks with weaker hypotheses than in the literature. Moreover, as an advancement towards debiased deep models and reproducibility in neuroimaging, we introduce a new large-scale multi-site dataset, OpenBHB, for brain age prediction and site de-biasing as well as a permanent challenge focused on representation learning. We offer three pre-processing pipelines to study brain anatomical surface, geometry, and volume inside T1 images as well as a novel way to evaluate the bias in the model's representation.
Renard, Xavier. "Time series representation for classification : a motif-based approach". Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066593/document.
Our research described in this thesis is about the learning of a motif-based representation from time series to perform automatic classification. Meaningful information in time series can be encoded across time through trends, shapes or subsequences, usually with distortions. Approaches have been developed to overcome these issues, often paying the price of high computational complexity. Among these techniques, it is worth pointing out distance measures and time series representations. We focus on the representation of the information contained in the time series. We propose a framework to generate a new time series representation to perform classical feature-based classification based on the discovery of discriminant sets of time series subsequences (motifs). This framework proposes to transform a set of time series into a feature space, using subsequences enumerated from the time series, distance measures and aggregation functions. One particular instance of this framework is the well-known shapelet approach. The potential drawback of such an approach is the large number of subsequences to enumerate, inducing a very large feature space and a very high computational complexity. We show that most subsequences in a time series dataset are redundant. Therefore, a random sampling can be used to generate a very small fraction of the exhaustive set of subsequences, preserving the necessary information for classification and thus generating a much smaller feature space compatible with common machine learning algorithms with tractable computations. We also demonstrate that the number of subsequences to draw is not linked to the number of instances in the training set, which guarantees the scalability of the approach. The combination of the latter in the context of our framework enables us to take advantage of advanced techniques (such as multivariate feature selection techniques) to discover richer motif-based time series representations for classification, for example by taking into account the relationships between the subsequences. These theoretical results have been extensively tested on more than one hundred classical benchmarks of the literature with univariate and multivariate time series. Moreover, since this research has been conducted in the context of an industrial research agreement (CIFRE) with ArcelorMittal, our work has been applied to the detection of defective steel products based on production-line sensor measurements.
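As an illustration of the general framework described above (random sampling of subsequences, distance computation, aggregation by a minimum), a minimal Python/NumPy sketch could look as follows; function names and parameter values are assumptions, not the thesis implementation.

    import numpy as np

    def sample_subsequences(series_list, n_motifs, length, rng=None):
        """Draw random candidate subsequences (motifs) from a set of time series."""
        rng = rng or np.random.default_rng()
        motifs = []
        for _ in range(n_motifs):
            s = series_list[rng.integers(len(series_list))]
            start = rng.integers(0, len(s) - length + 1)
            motifs.append(s[start:start + length])
        return np.array(motifs)                       # (n_motifs, length)

    def motif_features(series, motifs):
        """Represent one series by its minimum Euclidean distance to each motif."""
        L = motifs.shape[1]
        windows = np.lib.stride_tricks.sliding_window_view(series, L)  # (n_windows, L)
        dists = np.linalg.norm(windows[None, :, :] - motifs[:, None, :], axis=2)
        return dists.min(axis=1)                      # (n_motifs,) feature vector

    # Toy usage: 20 random series of length 100, 50 random motifs of length 15.
    rng = np.random.default_rng(0)
    train = [rng.normal(size=100) for _ in range(20)]
    motifs = sample_subsequences(train, n_motifs=50, length=15, rng=rng)
    X = np.stack([motif_features(s, motifs) for s in train])   # (20, 50) feature matrix
    # X can then be fed to any standard classifier (e.g. a random forest or an SVM).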
Belharbi, Soufiane. "Neural networks regularization through representation learning". Thesis, Normandie, 2018. http://www.theses.fr/2018NORMIR10/document.
Neural network models and deep models are among the leading, state-of-the-art models in machine learning. They have been applied in many different domains. The most successful deep neural models are the ones with many layers, which greatly increases their number of parameters. Training such models requires a large number of training samples, which is not always available. One of the fundamental issues in neural networks is overfitting, which is the issue tackled in this thesis. This problem often occurs when the training of large models is performed using few training samples. Many approaches have been proposed to prevent the network from overfitting and improve its generalization performance, such as data augmentation, early stopping, parameter sharing, unsupervised learning, dropout, batch normalization, etc. In this thesis, we tackle the neural network overfitting issue from a representation learning perspective by considering the situation where few training samples are available, which is the case in many real-world applications. We propose three contributions. The first one, presented in chapter 2, is dedicated to dealing with structured output problems to perform multivariate regression when the output variable y contains structural dependencies between its components. Our proposal aims mainly at exploiting these dependencies by learning them in an unsupervised way. Validated on a facial landmark detection problem, learning the structure of the output data has been shown to improve the network's generalization and speed up its training. The second contribution, described in chapter 3, deals with the classification task, where we propose to exploit prior knowledge about the internal representation of the hidden layers in neural networks. This prior is based on the idea that samples within the same class should have the same internal representation. We formulate this prior as a penalty that we add to the training cost to be minimized. Empirical experiments over MNIST and its variants showed an improvement of the network's generalization when using only a few training samples. Our last contribution, presented in chapter 4, shows the interest of transfer learning in applications where only a few samples are available. The idea consists in re-using the filters of pre-trained convolutional networks that have been trained on large datasets such as ImageNet. Such pre-trained filters are plugged into a new convolutional network with new dense layers. Then, the whole network is trained on a new task. In this contribution, we provide an automatic system based on such a learning scheme with an application to the medical domain. In this application, the task consists in localizing the third lumbar vertebra in a 3D CT scan. A pre-processing of the 3D CT scan to obtain a 2D representation and a post-processing to refine the decision are included in the proposed system. This work has been done in collaboration with the clinic "Rouen Henri Becquerel Center", which provided us with the data.
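A minimal sketch, in Python/PyTorch, of one simple way such a same-class representation penalty could be written (distance of each hidden activation to its class centroid); this is an assumption-laden illustration, not the exact penalty proposed in the thesis.

    import torch

    def same_class_penalty(hidden, labels):
        """Encourage samples of the same class to share their hidden representation.

        hidden: (B, D) activations of a hidden layer for a mini-batch
        labels: (B,)   integer class labels
        Returns the mean squared distance of each sample to its class centroid.
        """
        penalty = hidden.new_zeros(())
        for c in labels.unique():
            h_c = hidden[labels == c]
            centroid = h_c.mean(dim=0, keepdim=True)
            penalty = penalty + ((h_c - centroid) ** 2).sum()
        return penalty / hidden.shape[0]

    # Sketch of how it would be combined with the supervised objective
    # (lambda_h is a hypothetical weighting hyperparameter):
    # loss = cross_entropy(logits, labels) + lambda_h * same_class_penalty(hidden, labels)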
Celikkanat, Abdulkadir. "Graph Representation Learning with Random Walk Diffusions". Electronic Thesis or Diss., université Paris-Saclay, 2021. http://www.theses.fr/2021UPASG030.
Graph Representation Learning aims to embed nodes in a low-dimensional space. In this thesis, we tackle various challenging problems arising in the field. Firstly, we study how to leverage the inherent local community structure of graphs while learning node representations. We learn enhanced community-aware representations by combining the latent information with the embeddings. Moreover, we concentrate on the expressiveness of node representations. We emphasize exponential family distributions to capture rich interaction patterns. We propose a model that combines random walks with kernelized matrix factorization. In the last part of the thesis, we study models balancing the trade-off between efficiency and accuracy. We propose a scalable embedding model which computes binary node representations.
Cadoret, Vincent. "Determination d'actes de dialogue : une approche combinant representation explicite des connaissances et apprentissage connexionniste". Rennes 1, 1995. http://www.theses.fr/1996REN10059.
Testo completoGermani, Élodie. "Exploring and mitigating analytical variability in fMRI results using representation learning". Electronic Thesis or Diss., Université de Rennes (2023-....), 2024. http://www.theses.fr/2024URENS031.
In this thesis, we focus on the variations induced by different analysis methods, also known as analytical variability, in brain imaging studies. This phenomenon is now well known in the community, and our aim is to better understand the factors leading to this variability and to find solutions to better account for it. To do so, I analyse data and explore the relationships between the results of different methods. At the same time, I study the constraints related to data reuse and I propose solutions based on artificial intelligence to build more robust studies.
Tinas, Jean-louis. "Apprentissage d’un concept scientifique : statut de l’hypothese dans la demarche d’investigation en sciences physiques". Thesis, Bordeaux 2, 2013. http://www.theses.fr/2013BOR22051/document.
Learning a scientific concept proceeds from a process of deconstruction and reconstruction. Teaching means helping the pupil through this process, which asks them to retrace, for themselves, the path of invention that allowed the concept to emerge. It is precisely to address the crisis affecting science education in France and worldwide, and because we consider the usual, still widespread teaching practices to be partly responsible for it, that the inquiry-based approach is being proposed everywhere. Inquiry-based teaching is a method that asks pupils to build their own knowledge, and it is presented as being more effective for learning. Reflecting on the reasons for this effectiveness leads us to focus on the stage of hypothesis formulation, which seems to constitute the pivot of the approach. Pupils' statements, which we take to be the expression of their representations, show that it is possible to explore their state of thought in a learning situation and, better, to follow their thought processes. The methods developed for this purpose appear effective: at the scale of a class, we show that thanks to the formulation of hypotheses all pupils succeed, at their own pace, in reaching the scientific knowledge. We thus observe that the hypothesis plays a structuring role for the knowledge under construction, shaping it as the process of deconstruction and reconstruction unfolds. These considerations lead us to think that the use of hypothesis formulation contributes to the effectiveness of the inquiry-based approach compared with a more traditional approach to learning scientific knowledge.
Dalens, Théophile. "Learnable factored image representation for visual discovery". Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLEE036.
This thesis proposes an approach for analyzing unpaired visual data annotated with time stamps by generating how images would have looked if they were from different times. To isolate and transfer time-dependent appearance variations, we introduce a new trainable bilinear factor separation module. We analyze its relation to classical factored representations and concatenation-based auto-encoders. We demonstrate that this new module has clear advantages compared to standard concatenation when used in a bottleneck encoder-decoder convolutional neural network architecture. We also show that it can be inserted in a recent adversarial image translation architecture, enabling the image transformation to multiple different target time periods using a single network.
Dagher, Antoine. "Environnement informatique et apprentissage de l'articulation entre registres graphiques et algebrique de representation des fonctions". Paris 7, 1993. http://www.theses.fr/1993PA077038.
Testo completoLiu, Jingshu. "Unsupervised cross-lingual representation modeling for variable length phrases". Thesis, Nantes, 2020. http://www.theses.fr/2020NANT4009.
Significant advances have been achieved in bilingual word-level alignment from comparable corpora, yet the challenge remains for phrase-level alignment. Traditional methods for phrase alignment can only handle phrases of equal length, while word-embedding-based approaches that learn phrase embeddings as individual vocabulary entries suffer from data sparsity and cannot handle out-of-vocabulary phrases. Since bilingual alignment is a vector comparison task, phrase representation plays a key role. In this thesis, we study the approaches for unified phrase modeling and cross-lingual phrase alignment, ranging from co-occurrence models to the most recent state-of-the-art neural approaches. We review supervised and unsupervised frameworks for modeling cross-lingual phrase representations. Two contributions are proposed in this work. First, a new architecture called tree-free recursive neural network (TF-RNN) for modeling phrases of variable length which, combined with a wrapped context prediction training objective, outperforms state-of-the-art approaches on the monolingual phrase synonymy task with only plain-text training data. Second, for cross-lingual modeling, we propose to incorporate an architecture derived from the TF-RNN in an encoder-decoder model with a pseudo back-translation mechanism inspired by unsupervised neural machine translation. Our proposition significantly improves the bilingual alignment of phrases of different lengths.
Sadok, Samir. "Audiovisual speech representation learning applied to emotion recognition". Electronic Thesis or Diss., CentraleSupélec, 2024. http://www.theses.fr/2024CSUP0003.
Emotions are vital in our daily lives, becoming a primary focus of ongoing research. Automatic emotion recognition has gained considerable attention owing to its wide-ranging applications across sectors such as healthcare, education, entertainment, and marketing. This advancement in emotion recognition is pivotal for fostering the development of human-centric artificial intelligence. Supervised emotion recognition systems have significantly improved over traditional machine learning approaches. However, this progress encounters limitations due to the complexity and ambiguous nature of emotions. Acquiring extensive emotionally labeled datasets is costly, time-intensive, and often impractical. Moreover, the subjective nature of emotions results in biased datasets, impacting the learning models' applicability in real-world scenarios. Motivated by how humans learn and conceptualize complex representations from an early age with minimal supervision, this approach demonstrates the effectiveness of leveraging prior experience to adapt to new situations. Unsupervised or self-supervised learning models draw inspiration from this paradigm. Initially, they aim to establish a general representation learned from unlabeled data, akin to the foundational prior experience in human learning. These representations should adhere to criteria like invariance, interpretability, and effectiveness. Subsequently, these learned representations are applied to downstream tasks with limited labeled data, such as emotion recognition. This mirrors the assimilation of new situations in human learning. In this thesis, we aim to propose unsupervised and self-supervised representation learning methods designed explicitly for multimodal and sequential data and to explore their potential advantages in the context of emotion recognition tasks. The main contributions of this thesis encompass: 1. Developing generative models via unsupervised or self-supervised learning for audiovisual speech representation learning, incorporating joint temporal and multimodal (audiovisual) modeling. 2. Structuring the latent space to enable disentangled representations, enhancing interpretability by controlling human-interpretable latent factors. 3. Validating the effectiveness of our approaches through both qualitative and quantitative analyses, in particular on the emotion recognition task. Our methods facilitate signal analysis, transformation, and generation.
Feutry, Clément. "Two sides of relevant information : anonymized representation through deep learning and predictor monitoring". Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS479.
The work presented here lies, for its first part, at the intersection of deep learning and anonymization. A full framework was developed in order to identify and remove, to a certain extent and in an automated manner, the features linked to an identity in the context of image data. Two different kinds of data processing were explored. They both share the same Y-shaped network architecture, although the components of this network vary according to the final purpose. The first one was about building from the ground up an anonymized representation that allowed a trade-off between keeping relevant features and tampering with private features. This framework has led to a new loss. The second kind of data processing specified no relevant information about the data, only private information, meaning that everything that is not related to private features is assumed relevant. Therefore the anonymized representation shares the same nature as the initial data (e.g. an image is transformed into an anonymized image). This task led to another type of architecture (still in a Y-shape) and provided results strongly dependent on the type of data. The second part of the work is relative to another kind of relevant information: it focuses on the monitoring of predictor behavior. In the context of black-box analysis, we only have access to the probabilities outputted by the predictor (without any knowledge of the type of structure/architecture producing these probabilities). This monitoring is done in order to detect abnormal behavior that is an indicator of a potential mismatch between the data statistics and the model statistics. Two methods are presented using different tools. The first one is based on comparing the empirical cumulative distributions of known data and of the data to be tested. The second one introduces two tools: one relying on the classifier's uncertainty and the other relying on the confusion matrix. These methods produce conclusive results.
Karpate, Yogesh. "Enhanced representation & learning of magnetic resonance signatures in multiple sclerosis". Thesis, Rennes 1, 2015. http://www.theses.fr/2015REN1S068/document.
Multiple Sclerosis (MS) is an acquired inflammatory disease which causes disability in young adults and is common in the northern hemisphere. This PhD work focuses on the characterization and modeling of multidimensional MRI signatures in MS Lesions (MSL). The objective is to improve image representation and learning for visual recognition, where high-level information such as MSL contained in MRI is automatically extracted. We propose a new longitudinal intensity normalization algorithm for multichannel MRI in the presence of MS lesions, which provides consistent and reliable longitudinal detections. This is primarily based on learning the tissue intensities from multichannel MRI using robust Gaussian Mixture Modeling. Further, we propose two MSL detection methods based on a statistical patient-to-population comparison framework and probabilistic one-class learning. We evaluated our proposed algorithms on multi-center databases to verify their efficacy.
Maraš, Mirjana. "Learning efficient signal representation in sparse spike-coding networks". Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLEE023.
The complexity of sensory input is paralleled by the complexity of its representation in the neural activity of biological systems. Starting from the hypothesis that biological networks are tuned to achieve maximal efficiency and robustness, we investigate how efficient representation can be accomplished in networks with experimentally observed local connection probabilities and synaptic dynamics. We develop a Lasso-regularized local synaptic rule, which optimizes the number and efficacy of recurrent connections. The connections that impact the efficiency the least are pruned, and the strength of the remaining ones is optimized for efficient signal representation. Our theory predicts that the local connection probability determines the trade-off between the number of population spikes and the number of recurrent synapses, which are developed and maintained in the network. The more sparsely connected networks represent signals with higher firing rates than those with denser connectivity. The variability of observed connection probabilities in biological networks could then be seen as a consequence of this trade-off, and related to different operating conditions of the circuits. The learned recurrent connections are structured, with most connections being reciprocal. The dimensionality of the recurrent weights can be inferred from the network’s connection probability and the dimensionality of the feedforward input. The optimal connectivity of a network with synaptic delays is somewhere at an intermediate level, neither too sparse nor too dense. Furthermore, when we add another biological constraint, adaptive regulation of firing rates, our learning rule leads to an experimentally observed scaling of the recurrent weights. Our work supports the notion that biological micro-circuits are highly organized and principled. A detailed examination of the local circuit organization can help us uncover the finer aspects of the principles which govern sensory representation.
Laugier, Catherine. "Apprentissage par observation en danse : rôle des processus représentatifs dans la reproduction de mouvements". Montpellier 1, 1995. http://www.theses.fr/1995MON14002.
Testo completoGainon, de Forsan de Gabriac Clara. "Deep Natural Language Processing for User Representation". Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS274.
Testo completoThe last decade has witnessed the impressive expansion of Deep Learning (DL) methods, both in academic research and the private sector. This success can be explained by the ability of DL to model ever more complex entities. In particular, Representation Learning methods focus on building latent representations from heterogeneous data that are versatile and re-usable, namely in Natural Language Processing (NLP). In parallel, the ever-growing number of systems relying on user data brings its own lot of challenges. This work proposes methods to leverage the representation power of NLP in order to learn rich and versatile user representations. Firstly, we detail the works and domains associated with this thesis, starting with Recommendation. We then go over recent NLP advances and how they can be applied to leverage user-generated texts, before detailing Generative models. Secondly, we present a Recommender System (RS) based on the combination of a traditional Matrix Factorization (MF) representation method and a sentiment analysis model. The association of those modules forms a dual model that is trained on user reviews for rating prediction. Experiments show that, on top of improving performances, the model allows us to better understand which aspects of a given item the user is really interested in, as well as to provide explanations for the suggestions made. Finally, we introduce a new task centered on user representation: Professional Profile Learning. We propose an NLP-based framework to learn and evaluate professional profiles on different tasks, including next job generation
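To make the second contribution concrete, here is a hedged sketch of the Matrix Factorization half of the dual model only (the sentiment-analysis module is omitted); the data, dimensions and hyperparameters are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 20, 15, 4
# sparse toy rating matrix: ratings in 1..5, zeros mark unobserved entries
R = rng.integers(1, 6, size=(n_users, n_items)) * (rng.random((n_users, n_items)) < 0.3)

U = 0.1 * rng.standard_normal((n_users, k))   # user latent factors
V = 0.1 * rng.standard_normal((n_items, k))   # item latent factors
lr, reg = 0.02, 0.05

for epoch in range(200):
    for u, i in zip(*np.nonzero(R)):          # SGD over observed ratings only
        err = R[u, i] - U[u] @ V[i]
        u_old = U[u].copy()
        U[u] += lr * (err * V[i] - reg * U[u])
        V[i] += lr * (err * u_old - reg * V[i])

pred = U @ V.T                                # dense matrix of predicted ratings
print("predicted rating for user 0, item 3:", round(pred[0, 3], 2))
```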
Moulouel, Koussaila. "Hybrid AI approaches for context recognition : application to activity recognition and anticipation and context abnormalities handling in Ambient Intelligence environments". Electronic Thesis or Diss., Paris Est, 2023. http://www.theses.fr/2023PESC0014.
Testo completoAmbient Intelligence (AmI) systems aim to provide users with assistance services intended to improve the quality of their lives in terms of autonomy, safety, and well-being. Designing AmI systems capable of accurate, fine-grained and consistent recognition of the user's spatial and/or temporal context, while taking into account the uncertainty and partial observability of AmI environments, poses several challenges for a better adaptation of the assistance services to the user's context. The purpose of this thesis is to propose a set of contributions that address these challenges. Firstly, a context ontology is proposed to model contextual knowledge in AmI environments. The purpose of this ontology is to model the user's context, taking into account different context attributes and defining the commonsense-reasoning axioms necessary to infer and update the user's context. The second contribution is an ontology-based hybrid framework that combines probabilistic commonsense reasoning and probabilistic planning to recognize the user's context, in particular context abnormalities, and to provide context-aware assistance services in the presence of uncertainty and partial observability of the environments. This framework exploits context attribute predictions, namely the user's activity and the user's location, provided by deep learning models. In this framework, the probabilistic commonsense reasoning is based on the proposed context ontology to define the axiomatization of context inference and planning under uncertainty. Probabilistic planning is used to characterize abnormal contexts by coping with the incompleteness of contextual knowledge due to the partial observability of AmI environments. The proposed framework was evaluated using transformer and CNN-LSTM models on the Orange4Home and SIMADL datasets. The results show the effectiveness of the framework in recognizing user contexts, in terms of user activity and location, along with context abnormalities. Thirdly, a hybrid framework combining deep learning and probabilistic commonsense reasoning for anticipating human activities from egocentric videos is proposed. The probabilistic commonsense reasoning exploited in this framework relies on abductive reasoning to anticipate both atomic and composite human activities, and on temporal reasoning to capture context attribute changes. Deep learning models were exploited to recognize context attributes, such as objects, human hands, and human locations. The context ontology is used to model the relationships between atomic and composite activities. The evaluation of the framework shows its ability to anticipate composite activities over a time horizon of minutes, in contrast to state-of-the-art approaches that can only anticipate atomic activities over a time horizon of seconds. It also shows good performance in terms of classification accuracy of the anticipated activities and computation time. Lastly, a stream reasoning-based framework is proposed to anticipate atomic and composite human activities from data streams of context attributes collected on the fly. Deep learning models were used to recognize context attributes, such as objects used in activities, hands and user locations. The stream reasoning system performs causal, abductive and temporal reasoning with contextual knowledge obtained at run time. Dynamic effect axioms were introduced to anticipate composite activities that can be subject to unforeseen events, such as skipping or delaying an atomic activity. The proposed framework was validated through experiments conducted in a kitchen environment. The remarkably high performance in terms of the number of activity anticipations shows the ability of the framework to take into account the contextual knowledge of past episodes needed to anticipate composite activities
Pineau, Edouard. "Contributions to representation learning of multivariate time series and graphs". Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT037.
Testo completoMachine learning (ML) algorithms are designed to learn models that have the ability to take decisions or make predictions from data, in a large panel of tasks. In general, the learned models are statistical approximations of the true/optimal unknown decision models. The efficiency of a learning algorithm depends on an equilibrium between model richness, complexity of the data distribution and complexity of the task to solve from data. Nevertheless, for computational convenience, statistical decision models often adopt simplifying assumptions about the data (e.g. linear separability, independence of the observed variables, etc.). However, when the data distribution is complex (e.g. high-dimensional with nonlinear interactions between observed variables), these simplifying assumptions can be counterproductive. In this situation, a solution is to feed the model with an alternative representation of the data. The objective of data representation is to separate the information relevant to the task to solve from the noise, in particular if the relevant information is hidden (latent), in order to help the statistical model. Until recently and the rise of modern ML, many standard representations consisted in an expert-based handcrafted preprocessing of data. Recently, a branch of ML called deep learning (DL) completely shifted the paradigm. DL uses neural networks (NNs), a family of powerful parametric functions, as learning data representation pipelines. These recent advances have outperformed handcrafted representations in many domains. In this thesis, we are interested in learning representations of multivariate time series (MTS) and graphs. MTS and graphs are particular objects that do not directly match the standard requirements of ML algorithms. They can have variable size and non-trivial alignment, such that comparing two MTS or two graphs with standard metrics is generally not relevant. Hence, particular representations are required for their analysis using ML approaches. The contributions of this thesis consist of practical and theoretical results presenting new MTS and graph representation learning frameworks. Two MTS representation learning frameworks are dedicated to the ageing detection of mechanical systems. First, we propose a model-based MTS representation learning framework called Sequence-to-graph (Seq2Graph). Seq2Graph assumes that the data we observe has been generated by a model whose graphical representation is a causality graph. It then represents, using an appropriate neural network, the sample on this graph. From this representation, when appropriate, we can extract interesting information about the state of the studied mechanical system. Second, we propose a generic trend detection method called Contrastive Trend Estimation (CTE). CTE learns to classify pairs of samples with respect to the monotony of the trend between them. We show that, under few assumptions, this method identifies the true state underlying the studied mechanical system, up to a monotone scalar transform. Two graph representation learning frameworks are dedicated to the classification of graphs. First, we propose to see graphs as sequences of nodes and create a framework based on recurrent neural networks to represent and classify them. Second, we analyze a simple baseline feature for graph classification: the Laplacian spectrum. We show that this feature matches minimal requirements to classify graphs when all the meaningful information is contained in the structure of the graphs
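The Laplacian-spectrum baseline mentioned at the end of this abstract can be sketched in a few lines; the toy graphs and classifier below are illustrative assumptions, not the thesis benchmarks.

```python
import numpy as np
import networkx as nx
from sklearn.svm import SVC

def laplacian_spectrum_feature(g, k=10):
    """Return the k smallest Laplacian eigenvalues, zero-padded to length k."""
    eig = np.sort(nx.laplacian_spectrum(g))
    feat = np.zeros(k)
    feat[:min(k, len(eig))] = eig[:k]
    return feat

# toy dataset: cycles vs. star graphs of varying sizes (purely structural classes)
graphs = [nx.cycle_graph(n) for n in range(5, 15)] + \
         [nx.star_graph(n) for n in range(5, 15)]
labels = [0] * 10 + [1] * 10

X = np.array([laplacian_spectrum_feature(g) for g in graphs])
clf = SVC(kernel="rbf", gamma="scale").fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```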
Louis, Thibault. "Implémentation et bénéfices des systèmes d'interaction haute-fidélité : d'un contrôle plus performant à un apprentissage d'objets 3D plus rapide". Thesis, Université Grenoble Alpes, 2020. http://www.theses.fr/2020GRALM047.
Testo completoInteracting with 3D virtual scenes is essential for numerous applications, among others: 3D data visualization, computer assisted design, training simulators and video games. Performing this task through 2D systems like desktop computers or multi-touch tablets can be tedious. To interact more efficiently with 3D contents, high fidelity interactive systems such as virtual reality head-mounted displays try to reproduce the interactive modalities available in real life. Such systems offer a stereoscopic head-coupled rendering and an isomorphic control of 3D objects. However, the literature lacks rigorous studies showing their benefits. This thesis has two purposes. We want to enrich the literature through controlled user studies that bring robust results on high fidelity systems' benefits. We also seek to provide the means to implement the most efficient high fidelity experiences. In this manuscript, we start by presenting a state of the art of existing high fidelity devices and their potential benefits. We especially introduce a promising approach called handheld perspective corrected displays (HPCD), which we particularly studied through this thesis. We then present two contributions that allowed us to quantify high fidelity systems' benefits. We studied two tasks involving very different cognitive processes in order to attest the variety of applications that could benefit from those systems. The first study concerns a 6D docking task. The two high fidelity systems that we tested, an HPCD and a virtual reality head mounted display, performed respectively 43% and 29% more efficiently than the status quo (an articulated arm used alongside a flat screen). The second study focuses on the task of learning the shape of an unknown 3D object. On this task, the two previously studied high fidelity systems enhanced the object's recognition performance by 27% when compared to the use of a multi-touch tablet. We then present two other contributions that bring solutions to ease both hardware and software implementation of high fidelity systems. We provide a method to evaluate the impact of several technical parameters on the presence felt during an interactive experience, a feeling that testifies to the experience's fidelity with regard to the simulated reality. Using this method in a user study allowed us to identify that, with the tested HPCD, the tracking stability and the rendering frame rate were the most critical parameters concerning presence. We finally suggest a suite of interaction techniques that enable the implementation of applications well suited to spherical HPCDs and any other device that provides a manipulable screen held with both hands. The proposed interactions take advantage of the efficient control of the device rotations and appeared to be both intuitive and efficient during a qualitative test in an anatomy learning application
Harrando, Ismail. "Representation, information extraction, and summarization for automatic multimedia understanding". Electronic Thesis or Diss., Sorbonne université, 2022. http://www.theses.fr/2022SORUS097.
Testo completoWhether on TV or on the internet, video content production is seeing an unprecedented rise. Not only is video the dominant medium for entertainment purposes, but it is also reckoned to be the future of education, information and leisure. Nevertheless, the traditional paradigm for multimedia management proves to be incapable of keeping pace with the scale brought about by the sheer volume of content created every day across the disparate distribution channels. Thus, routine tasks like archiving, editing, content organization and retrieval by multimedia creators become prohibitively costly. On the user side, too, the amount of multimedia content produced daily can be simply overwhelming; the need for shorter and more personalized content has never been more pronounced. To advance the state of the art on both fronts, a certain level of multimedia understanding has to be achieved by our computers. In this research thesis, we aim to address the multiple challenges facing automatic media content processing and analysis, gearing our exploration mainly toward three axes: 1. Representing multimedia: With all its richness and variety, modeling and representing multimedia content can be a challenge in itself. 2. Describing multimedia: The textual component of multimedia can be capitalized on to generate high-level descriptors, or annotations, for the content at hand. 3. Summarizing multimedia: we investigate the possibility of extracting highlights from media content, both for narrative-focused summarization and for maximising memorability
Tamaazousti, Youssef. "Vers l’universalité des représentations visuelle et multimodales". Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLC038/document.
Testo completoBecause of its key societal, economic and cultural stakes, Artificial Intelligence (AI) is a hot topic. One of its main goals is to develop systems that facilitate the daily life of humans, with applications such as household robots, industrial robots, autonomous vehicles and much more. The rise of AI is largely due to the emergence of tools based on deep neural networks, which make it possible to simultaneously learn the representation of the data (traditionally hand-crafted) and the task to solve (traditionally learned with statistical models). This resulted from the conjunction of theoretical advances, growing computational capacity and the availability of large amounts of annotated data. A long-standing goal of AI is to design machines inspired by humans, capable of perceiving the world and interacting with humans in an evolutionary way. In this thesis, we categorize works around AI into the two following learning approaches: (i) Specialization: learn representations from a few specific tasks with the goal of carrying out very specific tasks (specialized in a certain field) with a very good level of performance; (ii) Universality: learn representations from several general tasks with the goal of performing as many tasks as possible in different contexts. While specialization was extensively explored by the deep-learning community, only a few implicit attempts were made towards universality. Thus, the goal of this thesis is to explicitly address the problem of improving universality with deep-learning methods, for image and text data. We addressed this topic of universality in two different forms: through the implementation of methods that improve universality ("universalizing methods"), and through the establishment of a protocol to quantify universality. Concerning universalizing methods, we proposed three technical contributions: (i) in a context of large semantic representations, we proposed a method to reduce redundancy between detectors through adaptive thresholding and the relations between concepts; (ii) in the context of neural-network representations, we proposed an approach that increases the number of detectors without increasing the amount of annotated data; (iii) in a context of multimodal representations, we proposed a method to preserve the semantics of unimodal representations in multimodal ones. Regarding the quantification of universality, we proposed to evaluate universalizing methods in a transfer-learning scheme. Indeed, this scheme is relevant to assess the universal ability of representations. This also led us to propose a new framework as well as new quantitative evaluation criteria for universalizing methods
Bitton, Adrien. "Meaningful audio synthesis and musical interactions by representation learning of sound sample databases". Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS362.
Testo completoComputer-assisted music extensively relies on audio sample libraries and virtual instruments which provide users with an ever-increasing amount of content to produce music with. However, principled methods for large-scale interactions are lacking, so that browsing samples and presets with respect to a target sound idea is a tedious and arbitrary process. Indeed, library metadata can only describe coarse categories of sounds but do not meaningfully convey the underlying acoustic content and continuous variations in timbre which are key elements of music production and creativity. Recent advances in deep generative modelling show unprecedented successes at learning large-scale unsupervised representations which can be inverted back to data as diverse as images, text and audio. These probabilistic models can be refined for specific generative tasks such as unpaired image translation and semantic manipulation of visual features, demonstrating the ability to learn transformations and representations that are perceptually meaningful. In this thesis, we target efficient analysis and synthesis with auto-encoders to learn low-dimensional acoustic representations for timbre manipulation and intuitive interactions for music production. First, we adapt domain translation techniques to timbre transfer and propose alternatives to adversarial learning for many-to-many transfers. Then we develop models for explicit modelling of timbre variations and controllable audio sampling, using conditioning for semantic attribute manipulation and hierarchical learning to represent both acoustic and temporal variations
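A minimal sketch of the kind of auto-encoder referred to above, assuming magnitude-spectrogram frames as input; the layer sizes and data are placeholders, not the thesis architecture.

```python
import torch
import torch.nn as nn

n_bins, latent_dim = 128, 16          # spectrogram bins, latent code size

class FrameAutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_bins, 64), nn.ReLU(),
                                     nn.Linear(64, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                     nn.Linear(64, n_bins), nn.Softplus())
    def forward(self, x):
        z = self.encoder(x)           # low-dimensional acoustic code
        return self.decoder(z), z

model = FrameAutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(256, n_bins)           # stand-in for magnitude spectrogram frames

for step in range(200):
    recon, _ = model(x)
    loss = nn.functional.mse_loss(recon, x)   # reconstruction objective
    opt.zero_grad(); loss.backward(); opt.step()
print(float(loss))
```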
Cheikhrouhou, Ikram. "Contribution à l'étude du changement conceptuel : les concepts de fermeture de circuit et de conservation de l'intensité dans un circuit en série chez les adultes, avant et après une formation en électricité". Paris 8, 1998. http://www.theses.fr/1998PA081497.
Testo completoSeveral verbal terms that refer to scientific concepts are also those used in everyday life. Therefore, before beginning the formal curriculum, learners have already built their own representation of some of these concepts. Formal-concept learning comes up against the existence of these preconceptions. Conceptual change is a research area studying the nature of these preconceptions and their modification after instruction. In this work, we propose to examine conceptual change by studying changes in the comprehension of two interdependent electrical concepts: closed circuit and intensity conservation of electric current in a series circuit. These concepts were examined essentially in two situations. The first is familiar and aims to diagnose the subjects' representations. The second situation is unfamiliar and aims to reveal the kinds of representations the subjects use to explain novel situations not usually taught. These concepts were examined for 24 novice adults before and after a two-week vocational training session in electricity. The methodology used to identify subjects' representations consists of an analysis of verbalisations. These are collected with a "critical interview" method inspired by Piaget's critical interrogation method. Verbalisations are analysed by describing representations in graph form and as a series of symbols. Deduced from the graphs, every series of symbols gives the subject's representation of the intensity conservation and closed circuit notions. In the familiar situation, the results show an improvement of subjects' representations, especially for the closed circuit notion, which seems more "acquired" than intensity conservation. This latter notion is in contradiction with everyday experience (the materialization of electricity). This improvement suggests that conceptual change can be gradual. Nevertheless, the results in the non-familiar situation show the reappearance of some representations found before training. This reappearance suggests that the observed change was superficial and that the deep conceptual structures were not modified
Ouzir, Nora Leïla. "Cardiac motion estimation in ultrasound images using a sparse representation and dictionary learning". Thesis, Toulouse 3, 2018. http://www.theses.fr/2018TOU30149.
Testo completoCardiovascular diseases have become a major healthcare issue. Improving the diagnosis and analysis of these diseases has thus become a primary concern in cardiology. The heart is a moving organ that undergoes complex deformations. Therefore, the quantification of cardiac motion from medical images, particularly ultrasound, is a key part of the techniques used for diagnosis in clinical practice. Thus, significant research efforts have been directed toward developing new cardiac motion estimation methods. These methods aim at improving the quality and accuracy of the estimated motions. However, they are still facing many challenges due to the complexity of cardiac motion and the quality of ultrasound images. Recently, learning-based techniques have received a growing interest in the field of image processing. More specifically, sparse representations and dictionary learning strategies have shown their efficiency in regularizing different ill-posed inverse problems. This thesis investigates the benefits that such sparsity and learning-based techniques can bring to cardiac motion estimation. Three main contributions are presented, investigating different aspects and challenges that arise in echocardiography. Firstly, a method for cardiac motion estimation using a sparsity-based regularization is introduced. The motion estimation problem is formulated as an energy minimization, whose data fidelity term is built using the assumption that the images are corrupted by multiplicative Rayleigh noise. In addition to a classical spatial smoothness constraint, the proposed method exploits the sparse properties of the cardiac motion to regularize the solution via an appropriate dictionary learning step. Secondly, a fully robust optical flow method is proposed. The aim of this work is to take into account the limitations of ultrasound imaging and the violations of the regularization constraints. In this work, two regularization terms imposing spatial smoothness and sparsity of the motion field in an appropriate cardiac motion dictionary are also exploited. In order to ensure robustness to outliers, an iteratively re-weighted minimization strategy is proposed using weighting functions based on M-estimators. As a last contribution, we investigate a cardiac motion estimation method using a combination of sparse, spatial and temporal regularizations. The problem is formulated within a general optical flow framework. The proposed temporal regularization enforces smoothness of the motion trajectories between consecutive images. Furthermore, an iterative groupwise motion estimation allows us to incorporate the three regularization terms, while enabling the processing of the image sequence as a whole. Throughout this thesis, the proposed contributions are validated using synthetic and realistic simulated cardiac ultrasound images. These datasets with available ground truth are used to evaluate the accuracy of the proposed approaches and show their competitiveness with state-of-the-art algorithms. In order to demonstrate clinical feasibility, in vivo sequences of healthy and pathological subjects are considered for the first two methods. A preliminary investigation is conducted for the last contribution, i.e., exploiting temporal smoothness, using simulated data
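To illustrate the sparsity-based regularization idea, here is a hedged example of dictionary learning and sparse coding on toy motion-field patches (scikit-learn's generic DictionaryLearning, not the thesis algorithm; all data and parameters are invented).

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
# toy motion-field patches: 200 patches of 8x8 displacements, flattened to 64-d vectors
patches = rng.standard_normal((200, 64)) * 0.1
patches[:100] += np.sin(np.linspace(0, np.pi, 64))   # structured "cardiac-like" component

dico = DictionaryLearning(n_components=20, alpha=1.0, max_iter=200,
                          transform_algorithm="omp",
                          transform_n_nonzero_coefs=5, random_state=0)
codes = dico.fit_transform(patches)        # sparse code of each patch
recon = codes @ dico.components_           # regularized (denoised) patches

print("non-zeros per patch:", (codes != 0).sum(1).mean())
print("reconstruction error:", np.mean((recon - patches) ** 2))
```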
Bisot, Victor. "Apprentissage de représentations pour l'analyse de scènes sonores". Electronic Thesis or Diss., Paris, ENST, 2018. http://www.theses.fr/2018ENST0016.
Testo completoThis thesis work focuses on the computational analysis of environmental sound scenes and events. The objective of such tasks is to automatically extract information about the context in which a sound has been recorded. The interest in this area of research has been rapidly increasing in the last few years, leading to a constant growth in the number of works and proposed approaches. We explore and contribute to the main families of approaches to sound scene and event analysis, going from feature engineering to deep learning. Our work is centered on representation learning techniques based on nonnegative matrix factorization, which are particularly suited to analyse multi-source environments such as acoustic scenes. As a first approach, we propose a combination of image processing features with the goal of confirming that spectrograms contain enough information to discriminate sound scenes and events. From there, we leave the world of feature engineering to go towards automatically learning the features. The first step we take in that direction is to study the usefulness of matrix factorization for unsupervised feature learning techniques, especially by relying on variants of NMF. Several of the compared approaches indeed allow us to outperform feature engineering approaches on such tasks. Next, we propose to improve the learned representations by introducing the TNMF model, a supervised variant of NMF. The proposed TNMF models and algorithms are based on jointly learning nonnegative dictionaries and classifiers by minimising a target classification cost. The last part of our work highlights the links and the compatibility between NMF and certain deep neural network systems by proposing and adapting neural network architectures to the use of NMF as an input representation. The proposed models achieve state-of-the-art performance on scene classification and overlapping event detection tasks. Finally, we explore the possibility of jointly learning NMF and neural network parameters, grouping the different stages of our systems into one optimisation problem
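A minimal sketch of the unsupervised NMF feature-learning step discussed above, on a stand-in spectrogram (the dimensions and number of components are illustrative assumptions, and the supervised TNMF variant is not shown).

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# stand-in for a magnitude spectrogram: 128 frequency bins x 300 time frames
V = np.abs(rng.standard_normal((128, 300)))

model = NMF(n_components=16, init="nndsvda", max_iter=400, random_state=0)
W = model.fit_transform(V)      # 128 x 16 spectral templates (nonnegative dictionary)
H = model.components_           # 16 x 300 activations over time

# a simple clip-level feature: average activation of each component
clip_feature = H.mean(axis=1)
print(clip_feature.shape)       # (16,) -> input to an SVM or neural classifier
```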
Lagrange, Adrien. "From representation learning to thematic classification - Application to hierarchical analysis of hyperspectral images". Thesis, Toulouse, INPT, 2019. http://www.theses.fr/2019INPT0095.
Testo completoNumerous frameworks have been developed in order to analyze the increasing amount of available image data. Among those methods, supervised classification has received considerable attention, leading to the development of state-of-the-art classification methods. These methods aim at inferring the class of each observation given a specific class nomenclature by exploiting a set of labeled observations. Thanks to extensive research efforts of the community, classification methods have become very efficient. Nevertheless, the result of a classification remains a high-level interpretation of the scene since it only gives a single class to summarize all the information in a given pixel. Contrary to classification methods, representation learning methods are model-based approaches designed especially to handle high-dimensional data and extract meaningful latent variables. By using physics-based models, these methods allow the user to extract very meaningful variables and get a very detailed interpretation of the considered image. The main objective of this thesis is to develop a unified framework for classification and representation learning. These two methods provide complementary approaches, allowing the problem to be addressed using a hierarchical modeling approach. The representation learning approach is used to build a low-level model of the data, whereas classification is used to incorporate supervised information and may be seen as a high-level interpretation of the data. Two different paradigms, namely Bayesian models and optimization approaches, are explored to set up this hierarchical model. The proposed models are then tested in the specific context of hyperspectral imaging, where the representation learning task is specified as a spectral unmixing problem
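As a toy illustration of the spectral-unmixing view of representation learning mentioned above, the following sketch unmixes a synthetic hyperspectral pixel by nonnegative least squares, assuming the endmember spectra are known (which is not the thesis setting, where the low-level model is learned).

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
n_bands, n_endmembers = 50, 3
E = np.abs(rng.standard_normal((n_bands, n_endmembers)))   # endmember spectra (columns)

true_abund = np.array([0.6, 0.3, 0.1])                     # latent abundances
pixel = E @ true_abund + 0.01 * rng.standard_normal(n_bands)

abund, _ = nnls(E, pixel)          # nonnegative abundance estimates
abund /= abund.sum()               # optional sum-to-one normalization
print(np.round(abund, 2))          # close to [0.6, 0.3, 0.1]
```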
Barlaam, Fanny. "Maturation et apprentissage du contrôle postural anticipé au cours de l'adolescence : expressions motrice et cérébrale". Thesis, Aix-Marseille, 2013. http://www.theses.fr/2013AIXM4778.
Testo completoVoluntary action requires an anticipation, which predicts the consequence of action on posture. Anticipation rests on action and body representations. Adolescence is characterized by body modifications and cerebral maturation. This thesis explored the link between the anticipatory function, action and body representations, and cerebral maturation. The bimanual load-lifting task engages a postural arm, supporting the load, and a motor arm, lifting the load. In this task, the anticipation, expressed by anticipatory postural adjustments (APA), cancelled the destabilizing effect of the movement on posture. Kinematics, EMG and EEG were recorded. Although performances of postural stabilization were stable, APAs in adolescence were characterized by an earlier latency of inhibition on the postural flexors. In adults, APAs are expressed by a mu rhythm desynchronization and a positive wave over the M1 involved in posture, which presented different temporal characteristics in adolescents. Thus, the improvement of APAs would be underlain by a maturation of these EEG activities. Learning a new postural control was characterized by a rapid and then a slow improvement of postural stabilisation. This acquisition rested on the mastering of the temporal parameters of the flexor inhibition, which took more time in adolescence. Integration of proprioceptive feedback coming from action allowed an update of the sensorimotor representation. Expressed by the mastering of the temporal parameters, the update of body and action representations in adolescence would imply an enhancement of the integration of proprioceptive information. Maturation of the cerebral areas would be a key element
Nguyen, Dinh Quoc Dang. "Representation of few-group homogenized cross sections by polynomials and tensor decomposition". Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASP142.
Testo completoThis thesis focuses on studying the mathematical modeling of few-group homogenized cross sections, a critical element in the two-step scheme widely used in nuclear reactor simulations. As industrial demands increasingly require finer spatial and energy meshes to improve the accuracy of core calculations, the size of the cross section library can become excessive, hampering the performance of core calculations. Therefore, it is essential to develop a representation that minimizes memory usage while still enabling efficient data interpolation.Two approaches, polynomial representation and Canonical Polyadic decomposition of tensors, are presented and applied to few-group homogenized cross section data. The data is prepared using APOLLO3 on the geometry of two assemblies in the X2 VVER-1000 benchmark. The compression rate and accuracy are evaluated and discussed for each approach to determine their applicability to the standard two-step scheme.Additionally, GPU implementations of both approaches are tested to assess the scalability of the algorithms based on the number of threads involved. These implementations are encapsulated in a library called Merlin, intended for future research and industrial applications that involve these approaches.Both approaches, particularly the method of tensor decomposition, demonstrate promising results in terms of data compression and reconstruction accuracy. Integrating these methods into the standard two-step scheme would not only substantially reduce memory usage for storing cross sections, but also significantly decrease the computational effort required for interpolating cross sections during core calculations, thereby reducing overall calculation time for industrial reactor simulations
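A hedged illustration of the Canonical Polyadic idea on a synthetic three-parameter table (this assumes a recent version of the tensorly library and is unrelated to the APOLLO3 data or the Merlin library).

```python
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

rng = np.random.default_rng(0)
# toy cross-section table indexed by three parameter grids (e.g. burnup, fuel temperature, boron)
shape = (20, 15, 10)
grids = [np.linspace(0, 1, s) for s in shape]
table = np.exp(-grids[0][:, None, None]) * (1 + grids[1][None, :, None]) \
        * (2 - grids[2][None, None, :]) + 1e-3 * rng.standard_normal(shape)

cp = parafac(tl.tensor(table), rank=4)          # CP factors, one matrix per axis
approx = tl.cp_to_tensor(cp)                    # reconstruction from the factors

n_full = table.size
n_cp = sum(f.size for f in cp.factors)          # storage cost of the factors
print("compression ratio:", round(n_full / n_cp, 1))
print("max relative error:", float(np.max(np.abs(approx - table) / np.abs(table))))
```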
Gaudiello, Ilaria. "Learning robotics, with robotics, by robotics : a study on three paradigms of educational robotics, under the issues of robot representation, robot acceptance, and robot impact on learning". Thesis, Paris 8, 2015. http://www.theses.fr/2015PA080081.
Testo completoFrom a psychological perspective, the thesis concerns the three ER learning paradigms, which are distinguished according to the different hardware, software, and corresponding modes of interaction allowed by the robot. Learning robotics was investigated under the issue of robot representation. By robot representation, we mean its ontological and pedagogical status and how such status changes when users learn robotics. In order to answer this question, we carried out an experimental study based on pre- and post-inquiries, involving 79 participants. Learning with robotics was investigated under the issue of the robot's functional and social acceptance. Here, the underlying research questions were as follows: do students trust the robot's functional and social savvy? Is trust in functional savvy a pre-requisite for trust in social savvy? Which individual and contextual factors are more likely to influence this trust? In order to answer these questions, we carried out an experimental study with 56 participants and an iCub robot. Trust in the robot has been considered as a main indicator of acceptance in situations of perceptual and socio-cognitive uncertainty and was measured by participants' conformity to answers given by iCub. Learning by robotics was investigated under the issue of the robot's impact on learning. The research questions were the following: to what extent does the combined RBI & IBSE frame have a positive impact on cognitive, affective, social and meta-cognitive dimensions of learning? Does this combined educational frame improve both domain-specific and non-domain-specific knowledge and competences of students? In order to answer these questions, we carried out a one-year RBI & IBSE experimental study in the frame of RObeeZ, a research project conducted through the FP7 EU project Pri-Sci-Net. The longitudinal experiments involved 26 pupils and 2 teachers from a suburban Parisian primary school
Bourigault, Simon. "Apprentissage de représentations pour la prédiction de propagation d'information dans les réseaux sociaux". Thesis, Paris 6, 2016. http://www.theses.fr/2016PA066368/document.
Testo completoIn this thesis, we study information diffusion in online social networks. Websites like Facebook or Twitter have indeed become information media, on which users create and share a lot of data. Most existing models of the information diffusion phenomenon rely on strong hypotheses about the structure and dynamics of diffusion. In this document, we study the problem of diffusion prediction in the context where the social graph is unknown and only user actions are observed. - We propose a learning algorithm for the independent cascade model that does not take time into account. Experimental results show that this approach obtains better results than time-based learning schemes. - We then propose several representation learning methods for this task of diffusion prediction. This lets us define more compact and faster models. - Finally, we apply our representation learning approach to the source detection task, where it obtains much better results than graph-based approaches
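For reference, here is a minimal simulation of the independent cascade model mentioned in the first contribution, on a toy graph; the thesis works in the opposite direction, learning such a model from observed cascades when the graph is unknown.

```python
import random

# toy directed graph: edge -> activation probability
edges = {("a", "b"): 0.4, ("a", "c"): 0.3, ("b", "d"): 0.5,
         ("c", "d"): 0.2, ("d", "e"): 0.6}

def simulate_cascade(source, edges, seed=0):
    rng = random.Random(seed)
    active, frontier = {source}, [source]
    while frontier:
        u = frontier.pop()
        for (s, t), p in edges.items():
            # each edge gets a single activation trial when its source becomes active
            if s == u and t not in active and rng.random() < p:
                active.add(t)
                frontier.append(t)
    return active

print(simulate_cascade("a", edges))   # set of users reached by the diffusion
```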
Laforgue, Pierre. "Deep kernel representation learning for complex data and reliability issues". Thesis, Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT006.
Testo completoThe first part of this thesis aims at exploring deep kernel architectures for complex data. One of the known keys to the success of deep learning algorithms is the ability of neural networks to extract meaningful internal representations. However, the theoretical understanding of why these compositional architectures are so successful remains limited, and deep approaches are almost exclusively restricted to vectorial data. On the other hand, kernel methods provide functional spaces whose geometry is well studied and understood. Their complexity can be easily controlled by the choice of kernel or penalization. In addition, vector-valued kernel methods can be used to predict kernelized data. This allows making predictions in complex structured spaces, as soon as a kernel can be defined on them. The deep kernel architecture we propose consists in replacing the basic neural mappings with functions from vector-valued Reproducing Kernel Hilbert Spaces (vv-RKHSs). Although very different at first glance, the two functional spaces are actually very similar, and differ only by the order in which linear/nonlinear functions are applied. Apart from gaining understanding and theoretical control on layers, considering kernel mappings allows for dealing with structured data, both in input and output, broadening the applicability scope of networks. We finally present works that ensure a finite-dimensional parametrization of the model, opening the door to efficient optimization procedures for a wide range of losses. The second part of this thesis investigates alternatives to the sample mean as substitutes for the expectation in the Empirical Risk Minimization (ERM) paradigm. Indeed, ERM implicitly assumes that the empirical mean is a good estimate of the expectation. However, in many practical use cases (e.g. heavy-tailed distributions, presence of outliers, biased training data), this is not the case. The Median-of-Means (MoM) is a robust mean estimator constructed as follows: the original dataset is split into disjoint blocks, empirical means on each block are computed, and the median of these means is finally returned. We propose two extensions of MoM, to randomized blocks and/or U-statistics, with provable guarantees. By construction, MoM-like estimators exhibit interesting robustness properties. This is further exploited by the design of robust learning strategies. The (randomized) MoM minimizers are shown to be robust to outliers, while MoM tournament procedures are extended to the pairwise setting. We close this thesis by proposing an ERM procedure tailored to the sample bias issue. If training data comes from several biased samples, blindly computing the empirical mean yields a biased estimate of the risk. Alternatively, from the knowledge of the biasing functions, it is possible to reweight observations so as to build an unbiased estimate of the test distribution. We have then derived non-asymptotic guarantees for the minimizers of the debiased risk estimate thus created. The soundness of the approach is also empirically endorsed
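The Median-of-Means construction described above is simple enough to sketch directly; the heavy-tailed data below is synthetic and the number of blocks is an arbitrary choice.

```python
import numpy as np

def median_of_means(x, n_blocks=10, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.permutation(x)                    # shuffle, then split into disjoint blocks
    blocks = np.array_split(x, n_blocks)
    return np.median([b.mean() for b in blocks])

rng = np.random.default_rng(0)
data = rng.standard_t(df=1.5, size=10_000)    # heavy-tailed sample, true mean 0

print("empirical mean :", round(data.mean(), 3))
print("median-of-means:", round(median_of_means(data), 3))
```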
Vukotic, Verdran. "Deep Neural Architectures for Automatic Representation Learning from Multimedia Multimodal Data". Thesis, Rennes, INSA, 2017. http://www.theses.fr/2017ISAR0015/document.
Testo completoIn this dissertation, the thesis that deep neural networks are suited for the analysis of visual, textual, and fused visual and textual content is discussed. This work evaluates the ability of deep neural networks to learn automatic multimodal representations in either unsupervised or supervised manners and brings the following main contributions: 1) Recurrent neural networks for spoken language understanding (slot filling): different architectures are compared for this task with the aim of modeling both the input context and output label dependencies. 2) Action prediction from single images: we propose an architecture that allows us to predict human actions from a single image. The architecture is evaluated on videos, by utilizing solely one frame as input. 3) Bidirectional multimodal encoders: the main contribution of this thesis is a neural architecture that translates from one modality to the other and back, offering an improved multimodal representation space where the initially disjoint representations can be translated and fused. This enables improved fusion of multiple modalities. The architecture was extensively studied and evaluated in international benchmarks on the task of video hyperlinking, where it defined the current state of the art. 4) Generative adversarial networks for multimodal fusion: continuing on the topic of multimodal fusion, we evaluate the possibility of using conditional generative adversarial networks to learn multimodal representations; in addition to providing multimodal representations, generative adversarial networks allow visualizing the learned model directly in the image domain
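A hedged sketch in the spirit of the bidirectional multimodal encoders of contribution 3: two encoders trained with crossed reconstruction losses so that each modality can be translated into the other, with the concatenated latent codes as a fused representation (dimensions and data are placeholders, not the benchmark setup).

```python
import torch
import torch.nn as nn

d_vis, d_txt, d_latent = 512, 300, 128

class BiModalTranslator(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc_v = nn.Sequential(nn.Linear(d_vis, d_latent), nn.Tanh())
        self.enc_t = nn.Sequential(nn.Linear(d_txt, d_latent), nn.Tanh())
        self.dec_v = nn.Linear(d_latent, d_vis)   # latent -> visual
        self.dec_t = nn.Linear(d_latent, d_txt)   # latent -> textual
    def forward(self, v, t):
        zv, zt = self.enc_v(v), self.enc_t(t)
        # crossed reconstructions: each modality is predicted from the other one
        return self.dec_t(zv), self.dec_v(zt), zv, zt

model = BiModalTranslator()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
v, t = torch.randn(64, d_vis), torch.randn(64, d_txt)   # stand-in feature vectors

for step in range(100):
    t_hat, v_hat, zv, zt = model(v, t)
    loss = nn.functional.mse_loss(t_hat, t) + nn.functional.mse_loss(v_hat, v)
    opt.zero_grad(); loss.backward(); opt.step()

# fused multimodal representation: e.g. concatenation of the two latent codes
fused = torch.cat([zv, zt], dim=1).detach()
print(fused.shape)
```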