Doctoral dissertations on the topic "Apprentisage en profondeur"
Create an accurate reference in APA, MLA, Chicago, Harvard, and many other styles
Consult the top 50 doctoral dissertations on the topic "Apprentisage en profondeur".
An "Add to bibliography" button is available next to each work in the bibliography. Use it, and we will automatically create a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a ".pdf" file and read its abstract online, when the relevant details are available in the work's metadata.
Browse doctoral dissertations from a wide variety of disciplines and build your bibliography accordingly.
Palli, Thazha Vyshakh. "Using context-cues and interaction for traffic-agent trajectory prediction". Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAE001.
Autonomous vehicle navigation in urban areas involves interactions with the different road-users or traffic-agents, such as cars, bicycles and pedestrians, sharing the same road network. The ability of an autonomous vehicle to observe, understand and predict the behaviour of these traffic-agents is very important for gaining a good understanding of the situation before deciding which manoeuvre to follow. While this is achieved with varying degrees of success using model-based or data-driven methods, human drivers remain much more efficient at this task, instinctively inferring different agent motions even in previously unseen and challenging situations. Moreover, context plays a very important role in enabling us humans to understand what is being perceived and make finer predictions. The need to increase the situational awareness of autonomous vehicles, as well as of safety-related driving-assistance functions, motivates our goal of exploiting contextual information to predict the future trajectories of the observed traffic-agents under different conditions. Over the past years, machine learning has proven to be efficient at solving a wide variety of problems, particularly those associated with machine perception. This thesis therefore focuses on developing machine learning models that exploit contextual information in order to observe and learn the trajectories of different interacting traffic-agents as perceived from an autonomous vehicle. While most models proposed in the past rely on a single sensor and model-based techniques, current approaches often rely on multiple sensors and process their outputs using different machine learning methods.
The approach proposed in this thesis follows these trends by combining information from different sensors to predict the trajectories of the observed traffic-agents using machine learning, as well as by integrating contextual information and interactions into the prediction process. The thesis gradually builds a machine learning architecture based on theoretical formulation and experimentation. Our approach is based on an LSTM encoder-decoder model that accepts data from different inputs. Trajectory observations from 3D LiDAR point-cloud data and semantic information from map masks are used. Map masks represent, in a binary manner, the areas where traffic-agents can or cannot operate. Information on pedestrian attention to oncoming vehicles, obtained from camera images, is also exploited to enrich the sequence prediction system. The goal is to feed the model with context-cues and semantic information to enhance the prediction of traffic-agent trajectories, by knowing whether or not the agents are aware of the presence of the subject vehicle and by including knowledge of the areas where they are likely to navigate. Moreover, the interactions of the autonomous vehicle with traffic-agents often govern its behaviour as the vehicle navigates. A mechanism to incorporate this information into the machine learning model is also developed, yielding an interaction-aware trajectory prediction system enhanced by context-cues. The machine learning architectures are built using datasets acquired from the perception sensors of a vehicle navigating in the expected workspace. As datasets play an important role in solving machine learning problems, available annotated datasets for autonomous navigation were reviewed according to the availability of sensor data and contextual information. Experiments were performed to train our models and gradually build the resulting architecture. Their performance is demonstrated using the well-known NuScenes dataset acquired in urban settings.
The performance of the proposed approach was compared with model-based and data-driven approaches, demonstrating that incorporating multiple sources of contextual information and agent interactions provides a substantial performance increase.
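The binary map masks described above admit a very simple reading. As a minimal illustrative sketch (the grid, trajectory and function name are invented for this example and are not taken from the thesis), navigability can be looked up along an observed trajectory before it is fed to a sequence model such as an LSTM encoder-decoder:

```python
import numpy as np

# Hypothetical 10x10 binary map mask: 1 = navigable (road), 0 = off-road.
mask = np.zeros((10, 10), dtype=int)
mask[4:6, :] = 1  # a horizontal road band on rows 4 and 5

def mask_features(trajectory, mask):
    """For each (row, col) waypoint, look up whether the cell is navigable."""
    return np.array([mask[r, c] for r, c in trajectory])

traj = [(4, 0), (4, 1), (5, 2), (7, 3)]  # the last waypoint leaves the road
feats = mask_features(traj, mask)
print(feats.tolist())  # -> [1, 1, 1, 0]
```

Here the last waypoint falls outside the navigable band, which is exactly the kind of context-cue that can steer a trajectory predictor away from implausible futures.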
Goh, Hanlin. "Apprentissage de Représentations Visuelles Profondes". Phd thesis, Université Pierre et Marie Curie - Paris VI, 2013. http://tel.archives-ouvertes.fr/tel-00948376.
Moukari, Michel. "Estimation de profondeur à partir d'images monoculaires par apprentissage profond". Thesis, Normandie, 2019. http://www.theses.fr/2019NORMC211/document.
Computer vision is a branch of artificial intelligence whose purpose is to enable a machine to analyze, process and understand the content of digital images. Scene understanding in particular is a major issue in computer vision. It requires a semantic and structural characterization of the image: on the one hand to describe its content and, on the other hand, to understand its geometry. However, while real space is three-dimensional, the image representing it is two-dimensional. Part of the 3D information is thus lost during the process of image formation, and it is therefore non-trivial to describe the geometry of a scene from 2D images of it. There are several ways to retrieve the depth information lost in the image. In this thesis we are interested in estimating a depth map given a single image of the scene. In this case, the depth information corresponds, for each pixel, to the distance between the camera and the object represented in that pixel. The automatic estimation of a distance map of the scene from an image is indeed a critical algorithmic building block in a very large number of domains, in particular that of autonomous vehicles (obstacle detection, navigation aids). Although estimating depth from a single image is a difficult and inherently ill-posed problem, we know that humans can judge distances with one eye. This capacity is not innate but acquired, and is made possible mostly thanks to the identification of cues reflecting prior knowledge of the surrounding objects. Moreover, we know that learning algorithms can extract these cues directly from images. We are particularly interested in statistical learning methods based on deep neural networks, which have recently led to major breakthroughs in many fields, and we study the case of monocular depth estimation.
Resmerita, Diana. "Compression pour l'apprentissage en profondeur". Thesis, Université Côte d'Azur, 2022. http://www.theses.fr/2022COAZ4043.
Autonomous cars are complex applications that need powerful hardware to function properly. Tasks such as staying between the white lines, reading signs, or avoiding obstacles are solved by using convolutional neural networks (CNNs) to classify or detect objects. It is highly important that all the networks work in parallel in order to transmit all the necessary information and take a common decision. Nowadays, as networks improve, they have also become bigger and more computationally expensive. Deploying even one network becomes challenging. Compressing the networks can solve this issue. Therefore, the first objective of this thesis is to find deep compression methods in order to cope with the memory and computational power limitations present on embedded systems. The compression methods need to be adapted to a specific processor, Kalray's MPPA, for short-term implementations. Our contributions mainly focus on compressing the network post-training for storage purposes, which means compressing the parameters of the network without retraining and without changing the original architecture or the type of the computations. In the context of our work, we decided to focus on quantization. Our first contribution consists in comparing the performance of uniform quantization and non-uniform quantization, in order to identify which of the two has the better rate-distortion trade-off and could be quickly supported in the company. The company's interest is also directed towards finding innovative methods for future MPPA generations. Therefore, our second contribution focuses on comparing standard floating-point representations (FP32, FP16) to recently proposed alternative arithmetical representations such as BFloat16, msfp8 and Posit8. The results of this analysis were in favor of Posit8. This motivated the company Kalray to design a decompressor from FP16 to Posit8.
Finally, since many compression methods already exist, we moved to an adjacent topic, which aims to quantify theoretically the effects of quantization error on the network's accuracy. This is the second objective of the thesis. We observe that well-known distortion measures are not suited to predicting accuracy degradation when running inference with compressed neural networks. We define a new distortion measure with a closed form, which resembles a signal-to-noise ratio. A set of experiments was performed using simulated data and small networks, showing the potential of this distortion measure.
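A post-training uniform quantizer and an SNR-style distortion measure of the kind discussed above can be sketched in a few lines (a generic illustration on synthetic Gaussian weights, not the thesis's actual measure or the MPPA toolchain):

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 1.0, size=10_000)  # stand-in for trained CNN weights

def uniform_quantize(x, n_bits):
    """Uniform quantizer over the observed range of x (2**n_bits levels)."""
    lo, hi = x.min(), x.max()
    step = (hi - lo) / (2 ** n_bits - 1)
    return lo + np.round((x - lo) / step) * step

snr_db = {}
for bits in (8, 4, 2):
    err = weights - uniform_quantize(weights, bits)
    # Signal-to-quantization-noise ratio in decibels.
    snr_db[bits] = 10 * np.log10(np.sum(weights ** 2) / np.sum(err ** 2))
    print(f"{bits}-bit uniform quantization: SNR = {snr_db[bits]:.1f} dB")
```

As expected, each bit removed costs roughly 6 dB of signal-to-quantization-noise ratio, which is the classic rate-distortion trade-off the first contribution compares against non-uniform schemes.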
Peiffer, Elsa. "Implications des structures cérébrales profondes dans les apprentissages procéduraux". Lyon 1, 2000. http://www.theses.fr/2000LYO1T267.
Mordan, Taylor. "Conception d'architectures profondes pour l'interprétation de données visuelles". Electronic Thesis or Diss., Sorbonne université, 2018. http://www.theses.fr/2018SORUS270.
Nowadays, images are ubiquitous through the use of smartphones and social media. It then becomes necessary to have automatic means of processing them, in order to analyze and interpret the large amount of available data. In this thesis, we are interested in object detection, i.e. the problem of identifying and localizing all objects present in an image. This can be seen as a first step toward a complete visual understanding of scenes. It is tackled with deep convolutional neural networks, under the Deep Learning paradigm. One drawback of this approach is the need for labeled data to learn from. Since precise annotations are time-consuming to produce, bigger datasets can be built with partial labels. We design global pooling functions to work with them and to recover latent information in two cases: learning spatially localized and part-based representations from image- and object-level supervision, respectively. We address the issue of efficiency in end-to-end learning of these representations by leveraging fully convolutional networks. Besides, exploiting additional annotations on available images can be an alternative to having more images, especially in the data-deficient regime. We formalize this problem as a specific kind of multi-task learning with a primary objective to focus on, and design a way to learn effectively from this auxiliary supervision within this framework.
Trullo, Ramirez Roger. "Approche basées sur l'apprentissage en profondeur pour la segmentation des organes à risques dans les tomodensitométries thoraciques". Thesis, Normandie, 2018. http://www.theses.fr/2018NORMR063.
Radiotherapy is one of the treatment options currently available for patients affected by cancer, one of the leading causes of death worldwide. Before radiotherapy, organs at risk (OAR) located near the target tumor, such as the heart, the lungs and the esophagus in thoracic cancer, must be outlined in order to minimize the quantity of irradiation they receive during treatment. Today, segmentation of the OAR is performed mainly manually by clinicians on Computed Tomography (CT) images, despite some partial software support. It is a tedious task, prone to intra- and inter-observer variability. In this work, we present several frameworks using deep learning techniques to automatically segment the heart, trachea, aorta and esophagus. In particular, the esophagus is notably challenging to segment, due to the lack of surrounding contrast and its shape variability across different patients. As deep networks, and in particular fully convolutional networks, now offer state-of-the-art performance for semantic segmentation, we first show how a specific type of architecture based on skip connections can improve the accuracy of the results. As a second contribution, we demonstrate that context information can be of vital importance in the segmentation task, and we propose the use of two collaborative networks. Third, we propose a different, distance-aware representation of the data, which is then used in conjunction with adversarial networks as another way to constrain the anatomical context. All the proposed methods have been tested on 60 patients with 3D CT scans, showing good performance compared with other methods.
Chandra, Siddhartha. "Apprentissage Profond pour des Prédictions Structurées Efficaces appliqué à la Classification Dense en Vision par Ordinateur". Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLC033/document.
In this thesis we propose a structured prediction technique that combines the virtues of Gaussian Conditional Random Fields (G-CRFs) with Convolutional Neural Networks (CNNs). The starting point of this thesis is the observation that, while being of a limited form, G-CRFs allow us to perform exact Maximum-A-Posteriori (MAP) inference efficiently. We prefer exactness and simplicity over generality and advocate G-CRF-based structured prediction in deep learning pipelines. Our proposed structured prediction methods accommodate (i) exact inference, (ii) both short- and long-term pairwise interactions, (iii) rich CNN-based expressions for the pairwise terms, and (iv) end-to-end training alongside CNNs. We devise novel implementation strategies which allow us to overcome memory and computational challenges.
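The exactness claim has a simple concrete form: for a Gaussian CRF with energy E(x) = ½ xᵀAx − bᵀx and A symmetric positive definite, the MAP assignment is the unique solution of the linear system Ax = b. A toy numpy illustration (the matrix and vector are invented for the example; in the thesis A and b are produced by CNNs over dense image grids):

```python
import numpy as np

# Toy G-CRF over 4 variables with a chain-structured precision matrix A (SPD).
A = np.array([[ 2.0, -1.0,  0.0,  0.0],
              [-1.0,  2.0, -1.0,  0.0],
              [ 0.0, -1.0,  2.0, -1.0],
              [ 0.0,  0.0, -1.0,  2.0]])
b = np.array([1.0, 0.0, 0.0, 1.0])

x_map = np.linalg.solve(A, b)  # exact MAP: the minimizer of the quadratic energy
# Stationarity check: the energy gradient A x - b vanishes at the MAP.
print(np.allclose(A @ x_map - b, 0.0))  # -> True
```

This is why G-CRF inference can be embedded efficiently in a deep pipeline: it reduces to solving a (sparse, positive definite) linear system rather than running approximate message passing.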
Pinheiro, de Carvalho Marcela. "Deep Depth from Defocus : Neural Networks for Monocular Depth Estimation". Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS609.
Depth estimation from a single image is a key instrument for several applications, from robotics to virtual reality. Successful Deep Learning approaches in computer vision tasks such as object recognition and classification have also benefited the domain of depth estimation. In this thesis, we develop methods for monocular depth estimation with deep neural networks by exploring different cues: defocus blur and semantics. We conduct several experiments to understand the contribution of each cue in terms of generalization and model performance. At first, we propose an efficient convolutional neural network for depth estimation along with a conditional Generative Adversarial framework. Our method achieves performance among the best on standard datasets for depth estimation. Then, we propose to explore defocus blur cues, an optical information deeply related to depth. We show that deep models are able to implicitly learn and use this information to improve performance and overcome known limitations of classical Depth-from-Defocus. We also build a new dataset with real focused and defocused images, which we use to validate our approach. Finally, we explore the use of semantic information, which brings rich contextual information when learned jointly with depth in a multi-task approach. We validate our approaches on several datasets containing indoor, outdoor and aerial images.
M'Saad, Soumaya. "Détection de changement de comportement de vie chez la personne âgée par images de profondeur". Thesis, Rennes 1, 2022. http://www.theses.fr/2022REN1S039.
The number of elderly people in the world is constantly increasing, hence the challenge of helping them continue to live at home and age in good health. This PhD thesis addresses this public health issue and proposes detecting changes in a person's behavior based on the recording of activities in the home by low-cost depth sensors, which guarantee anonymity and operate autonomously day and night. After an initial study combining image classification with machine learning approaches, a method based on ResNet-18 deep neural networks was proposed for fall detection and posture recognition. This approach gave good results, with a global accuracy of 93.44% and a global sensitivity of 93.24%. The detection of postures makes it possible to follow the state of the person, and in particular behavior changes, which are understood as a loss of routine. Two strategies were deployed to monitor the routine. The first examines the succession of activities in the day by computing an edit distance or a dynamic time warping between days; the other classifies days into routine and non-routine by combining unsupervised (k-means and k-modes) and supervised (Random Forest) methods, or a priori knowledge about the person's routine. These strategies were evaluated both on real data recorded from two frail people in an EHPAD (a French residential care facility) and on simulated data created to compensate for the lack of real data. They have shown the possibility of detecting different behavioral change scenarios (abrupt, progressive, recurrent) and demonstrate that depth sensors can be used in an EHPAD or in the home of an elderly person.
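The first routine-monitoring strategy compares the sequence of activities in a day against a reference day. A minimal sketch of such an edit distance over activity labels (the labels and sequences are invented for illustration; the thesis works on postures detected from depth images):

```python
def edit_distance(a, b):
    """Levenshtein distance between two activity sequences (insert/delete/substitute)."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[m][n]

routine = ["sleep", "sit", "walk", "sit", "sleep"]
today   = ["sleep", "lie", "lie", "sit", "sleep"]
print(edit_distance(routine, today))  # -> 2
```

A day whose distance to the reference routine exceeds some threshold can then be flagged as non-routine, which is the signal used to detect a behavior change.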
Germain, Mathieu. "L’estimation de distribution à l'aide d'un autoencodeur". Mémoire, Université de Sherbrooke, 2015. http://hdl.handle.net/11143/6910.
Chen, Dexiong. "Modélisation de données structurées avec des machines profondes à noyaux et des applications en biologie computationnelle". Thesis, Université Grenoble Alpes, 2020. http://www.theses.fr/2020GRALM070.
Developing efficient algorithms to learn appropriate representations of structured data, including sequences and graphs, is a major and central challenge in machine learning. To this end, deep learning has become popular for modeling structured data. Deep neural networks have drawn particular attention in various scientific fields such as computer vision, natural language understanding and biology. For instance, they provide computational tools for biologists to understand and uncover biological properties and relationships among macromolecules within living organisms. However, most of the success of deep learning methods in these fields essentially relies on the guidance of empirical insights, as well as on huge amounts of annotated data. More data-efficient models are needed, as labeled data is often scarce. Another line of research is kernel methods, which provide a systematic and principled approach for learning non-linear models from data of arbitrary structure. In addition to their simplicity, they exhibit a natural way to control regularization and thus to avoid overfitting. However, the data representations provided by traditional kernel methods are defined only by simple hand-crafted features, which makes them perform worse than neural networks when enough labeled data are available. More complex kernels, inspired by prior knowledge used in neural networks, have thus been developed to build richer representations and bridge this gap. Yet, they are less scalable.
By contrast, neural networks are able to learn a compact representation for a specific learning task, which allows them to retain the expressivity of the representation while scaling to large sample sizes. Incorporating complementary views of kernel methods and deep neural networks to build new frameworks is therefore useful to benefit from both worlds. In this thesis, we build a general kernel-based framework for modeling structured data by leveraging prior knowledge from classical kernel methods and deep networks. Our framework provides efficient algorithmic tools for learning representations without annotations, as well as for learning more compact representations in a task-driven way. It can be used to efficiently model sequences and graphs with simple interpretation of predictions, and it offers new insights for designing more expressive kernels and neural networks for sequences and graphs.
Brédy, Jhemson. "Prévision de la profondeur de la nappe phréatique d'un champ de canneberges à l'aide de deux approches de modélisation des arbres de décision". Master's thesis, Université Laval, 2019. http://hdl.handle.net/20.500.11794/37875.
Integrated groundwater management is a major challenge for industrial, agricultural and domestic activities. In some agricultural production systems, optimized water table management is a significant factor in improving crop yields and water use. Predicting water table depth (WTD) therefore becomes an important means of enabling real-time planning and management of groundwater resources. This study proposes a decision-tree-based modelling approach for WTD forecasting as a function of precipitation, previous WTD values and evapotranspiration, with applications in groundwater resources management for cranberry farming. Firstly, two decision-tree-based models, namely Random Forest (RF) and Extreme Gradient Boosting (XGB), were parameterized and compared to predict the WTD up to 48 hours ahead for a cranberry farm located in Québec, Canada. Secondly, the importance of the predictor variables was analyzed to determine their influence on the WTD simulation results. WTD measurements at three observation wells within a cranberry field, for the growing period from July 8, 2017 to August 30, 2017, were used for training and testing the models. Statistical parameters such as the mean squared error, the coefficient of determination and the Nash-Sutcliffe efficiency coefficient were used to measure model performance. The results show that the XGB algorithm outperformed the RF model for WTD predictions and was selected as the optimal model. Among the predictor variables, the antecedent WTD was the most important for water table depth simulation, followed by precipitation. Based on the most important variables and the optimal model, the prediction error over the entire WTD range was within ± 5 cm for 1-, 12-, 24-, 36- and 48-hour predictions. The XGB model can provide useful information on the WTD dynamics and a rigorous simulation for irrigation planning and management in cranberry fields.
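The supervised setup described above, predicting WTD from lagged WTD and precipitation values, amounts to a feature-construction step before fitting the tree models. A small sketch with invented numbers (in the study these feature matrices would feed RF or XGB regressors):

```python
import numpy as np

# Hypothetical hourly series: water table depth (cm) and precipitation (mm).
wtd = np.array([50.0, 49.5, 48.0, 47.0, 47.5, 46.0, 45.0, 44.5])
rain = np.array([0.0, 2.0, 5.0, 1.0, 0.0, 3.0, 0.5, 0.0])

def make_supervised(wtd, rain, n_lags=2, horizon=1):
    """Stack lagged WTD and rainfall into (X, y) pairs for a tree-based regressor."""
    X, y = [], []
    for t in range(n_lags, len(wtd) - horizon + 1):
        X.append(np.concatenate([wtd[t - n_lags:t], rain[t - n_lags:t]]))
        y.append(wtd[t + horizon - 1])
    return np.array(X), np.array(y)

X, y = make_supervised(wtd, rain)
print(X.shape, y.shape)  # -> (6, 4) (6,)
```

Longer horizons (12, 24, 36, 48 hours) simply shift the target further ahead, which is how multi-step forecasts are usually framed for tree ensembles.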
Bach, Tran. "Algorithmes avancés de DCA pour certaines classes de problèmes en apprentissage automatique du Big Data". Electronic Thesis or Diss., Université de Lorraine, 2019. http://www.theses.fr/2019LORR0255.
Big Data has gradually become essential and ubiquitous in all aspects of life. There is therefore an urgent need to develop innovative and efficient techniques to deal with the rapid growth in the volume of data. This dissertation considers the following problems in Big Data: group variable selection in multi-class logistic regression, dimension reduction by t-SNE (t-distributed Stochastic Neighbor Embedding), and deep clustering. We develop advanced DCAs (Difference of Convex functions Algorithms) for these problems, based on DC Programming and DCA, powerful tools for non-smooth non-convex optimization problems. Firstly, we consider the problem of group variable selection in multi-class logistic regression. We tackle this problem using recently advanced DCAs: Stochastic DCA and DCA-Like. Specifically, Stochastic DCA targets the minimization of a large sum of DC functions and requires only a subset of the DC functions at each iteration. DCA-Like relaxes the convexity condition on the second DC component while guaranteeing convergence. Accelerated DCA-Like incorporates Nesterov's acceleration technique into DCA-Like to improve its performance. Numerical experiments on benchmark high-dimensional datasets show the effectiveness of the proposed algorithms in terms of running time and solution quality. The second part studies the t-SNE problem, an effective non-linear dimensionality reduction technique. Motivated by the novelty of DCA-Like and Accelerated DCA-Like, we develop two algorithms for the t-SNE problem. The superiority of the proposed algorithms over existing methods is illustrated through numerical experiments on visualization applications. Finally, the third part considers the problem of deep clustering. In the first application, we propose two DCA-based algorithms to combine t-SNE with MSSC (Minimum Sum-of-Squares Clustering), following two approaches: "tandem analysis" and joint clustering.
The second application considers clustering with an auto-encoder (a well-known type of neural network). We propose an extension of a class of joint-clustering algorithms to overcome the scaling problem, and apply it to the specific case of joint clustering with MSSC. Numerical experiments on several real-world datasets show the effectiveness of our methods, in terms of speed and clustering quality, compared to state-of-the-art methods.
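For reference, the MSSC objective minimized in these joint-clustering schemes is the classic k-means sum of squared distances to cluster centers. A small self-contained sketch (the data and the deterministic initialization are invented for the example; this Lloyd-style loop is not the DCA-based algorithm of the thesis):

```python
import numpy as np

def mssc_objective(X, centers, labels):
    """Minimum sum-of-squares clustering: total squared distance to assigned centers."""
    return float(np.sum((X - centers[labels]) ** 2))

def kmeans(X, k, n_iter=20):
    # Deterministic spread-out initialization; assumes each center attracts >= 1 point.
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)]
    for _ in range(n_iter):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        centers = np.array([X[labels == j].mean(0) for j in range(k)])
    return centers, labels

# Two well-separated synthetic clusters.
X = np.vstack([np.zeros((5, 2)), np.full((5, 2), 10.0)])
centers, labels = kmeans(X, 2)
print(mssc_objective(X, centers, labels))  # -> 0.0
```

In joint clustering, this objective is optimized together with the embedding (t-SNE or auto-encoder codes) rather than after it, which is what distinguishes the approach from "tandem analysis".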
Andrade, Valente da Silva Michelle. "SLAM and data fusion for autonomous vehicles : from classical approaches to deep learning methods". Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLEM079.
Self-driving cars have the potential to provoke a mobility transformation that will impact our everyday lives. They offer a novel mobility system that could provide more road safety, efficiency and accessibility to users. In order to reach this goal, the vehicles need to autonomously perform three main tasks: perception, planning and control. When it comes to urban environments, perception becomes a challenging task that needs to be reliable for the safety of the driver and others. It is extremely important to have a good understanding of the environment and its obstacles, along with a precise localization, so that the other tasks are performed well. This thesis explores approaches ranging from classical methods to Deep Learning techniques to perform mapping and localization for autonomous vehicles in urban environments. We focus on vehicles equipped with low-cost sensors, with the goal of maintaining a reasonable price for future autonomous vehicles. Accordingly, the proposed methods use sensors such as 2D laser scanners, cameras and standard IMUs. In the first part, we introduce model-based methods using evidential occupancy grid maps. First, we present an approach that performs sensor fusion between a stereo camera and a 2D laser scanner to improve the perception of the environment. Moreover, we add an extra layer to the grid maps to assign states to the detected obstacles. This state makes it possible to track an obstacle over time and to determine whether it is static or dynamic. Subsequently, we propose a localization system that uses this new layer along with classic image registration techniques to localize the vehicle while simultaneously creating the map of the environment. In the second part, we focus on the use of Deep Learning techniques for the localization problem. First, we introduce a learning-based algorithm that provides odometry estimation using only 2D laser scanner data.
This method shows the potential of neural networks to analyse this type of data for estimating the vehicle's displacement. Subsequently, we extend the previous method by fusing the 2D laser scanner with a camera in an end-to-end learning system. The addition of camera images increases the accuracy of the odometry estimation and proves that we can perform sensor fusion without any sensor modelling, using neural networks. Finally, we present a new hybrid algorithm to localize a vehicle inside a previously mapped region. This algorithm combines the advantages of evidential maps in dynamic environments with the ability of neural networks to process images. The results obtained in this thesis allowed us to better understand the challenges faced by vehicles equipped with low-cost sensors in dynamic environments. By adapting our methods to these sensors and fusing their information, we improved the general perception of the environment along with the localization of the vehicle. Moreover, our approaches allowed a comparison between the advantages and disadvantages of learning-based techniques and model-based ones. Finally, we proposed a way of combining these two types of approaches in a hybrid system, leading to a more robust solution.
Steff, Marion. "Apprentissage de tâches motrices problématiques chez quatre personnes déficientes intellectuelles sévères ou profondes, par une organisation variée de la pratique, en vue d'un transfert en milieu naturel". Mémoire, Université de Sherbrooke, 2002. http://savoirs.usherbrooke.ca/handle/11143/744.
Pessiglione, Mathias. "Dopamine, ganglions de la base et selection de l'action : du singe MPTP au patient parkinsonien approche électrophysiologique et comportementale". Paris 6, 2004. http://www.theses.fr/2004PA066264.
Mollaret, Sébastien. "Artificial intelligence algorithms in quantitative finance". Thesis, Paris Est, 2021. http://www.theses.fr/2021PESC2002.
Artificial intelligence has become more and more popular in quantitative finance, given the increase in computing capacity as well as the complexity of models, and has led to many financial applications. In this thesis, we explore three different applications to solve financial derivatives challenges, from model selection to model calibration and pricing. In Part I, we focus on a regime-switching model to price equity derivatives. The model parameters are estimated using the Expectation-Maximization (EM) algorithm, and a local volatility component is added to fit vanilla option prices using the particle method. In Part II, we then use deep neural networks to calibrate a stochastic volatility model, where the volatility is modelled as the exponential of an Ornstein-Uhlenbeck process, by approximating the mapping between model parameters and the corresponding implied volatilities offline. Once the expensive approximation has been performed offline, the calibration reduces to a standard, fast optimization problem. In Part III, we finally use deep neural networks to price American options on large baskets in order to overcome the curse of dimensionality. Different methods are studied: a Longstaff-Schwartz approach, where we approximate the continuation values, and a stochastic control approach, where we solve the pricing partial differential equation by reformulating the problem as a stochastic control problem using the non-linear Feynman-Kac formula.
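The Longstaff-Schwartz step this abstract refers to, regressing discounted future cashflows to estimate continuation values, can be sketched with a plain quadratic regression in place of the neural-network approximator studied in the thesis. All parameter values below (spot, strike, rate, volatility, path counts) are illustrative and not taken from the thesis.

```python
import numpy as np

def american_put_lsm(s0=36.0, k=40.0, r=0.06, sigma=0.2, t=1.0,
                     steps=50, n_paths=20000, seed=0):
    """Price an American put by Longstaff-Schwartz least-squares Monte Carlo.

    Continuation values are approximated by a quadratic regression on the
    in-the-money paths (a deep network would replace this regression step).
    """
    rng = np.random.default_rng(seed)
    dt = t / steps
    # Simulate geometric Brownian motion paths of the underlying.
    z = rng.standard_normal((n_paths, steps))
    log_paths = np.cumsum((r - 0.5 * sigma**2) * dt
                          + sigma * np.sqrt(dt) * z, axis=1)
    s = s0 * np.exp(np.hstack([np.zeros((n_paths, 1)), log_paths]))

    # Backward induction: cashflow under the estimated stopping rule.
    cashflow = np.maximum(k - s[:, -1], 0.0)
    for i in range(steps - 1, 0, -1):
        cashflow *= np.exp(-r * dt)          # discount one step back
        itm = k - s[:, i] > 0.0
        if itm.sum() < 3:
            continue
        # Regress discounted future cashflows on a quadratic in the spot.
        coeffs = np.polyfit(s[itm, i], cashflow[itm], 2)
        continuation = np.polyval(coeffs, s[itm, i])
        exercise = k - s[itm, i]
        stop = exercise > continuation
        cashflow[np.where(itm)[0][stop]] = exercise[stop]
    return float(np.exp(-r * dt) * cashflow.mean())

price = american_put_lsm()
```

With these classic test parameters the estimate exceeds the immediate exercise value of 4.0, reflecting the early-exercise premium captured by the regression.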
Wang, Xin. "Gaze based weakly supervised localization for image classification : application to visual recognition in a food dataset". Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066577/document.
In this dissertation, we discuss how human gaze data can be used to improve the performance of weakly supervised learning models in image classification. The background of this topic is the era of rapidly growing information technology. As a consequence, the data to analyze is also growing dramatically. Since the amount of data that can be annotated by humans cannot keep up with the amount of data itself, current well-developed supervised learning approaches may face bottlenecks in the future. In this context, the use of weak annotations for high-performance learning methods is worthy of study. Specifically, we try to solve the problem from two aspects. One is to propose a more time-saving annotation, human eye-tracking gaze, as an alternative to traditional time-consuming annotations such as bounding boxes. The other is to integrate gaze annotation into a weakly supervised learning scheme for image classification. This scheme benefits from the gaze annotation for inferring the regions containing the target object. A useful property of our model is that it only exploits gaze for training, while the test phase is gaze-free. This property further reduces the demand for annotations. The two aspects are connected together in our models, which achieve competitive experimental results.
Wang, Qiong. "Salient object detection and segmentation in videos". Thesis, Rennes, INSA, 2019. http://www.theses.fr/2019ISAR0003/document.
This thesis focuses on the problems of video salient object detection and video object instance segmentation, which aim to detect the most visually attracting objects or to assign consistent object IDs to each pixel in a video sequence. One approach, one overview and one extended model are proposed for video salient object detection, and one approach is proposed for video object instance segmentation. For video salient object detection, we propose: (1) a traditional approach to detect the whole salient object via the addition of virtual borders; a guided filter is applied to the temporal output to integrate the spatial edge information for a better detection of salient object edges, and a global spatio-temporal saliency map is obtained by combining the spatial and temporal saliency maps according to their entropy; (2) an overview of recent deep-learning based methods, including a classification of the state-of-the-art methods and their frameworks, and an experimental comparison of their performances; (3) an extended model that further improves the performance of the proposed traditional approach by integrating a deep-learning based image salient object detection method. For video object instance segmentation, we propose a deep-learning approach in which a warping confidence computation first judges the confidence of the warped mask map, then a semantic selection is introduced to optimize the warped map, where the object is re-identified using the semantic labels of the target object. The proposed approaches have been assessed on published large-scale and challenging datasets. The experimental results show that they outperform the state-of-the-art methods.
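The abstract describes combining the spatial and temporal saliency maps "according to the entropy" only at a high level. One plausible reading, sketched below under that assumption, weights each map by the inverse of its histogram entropy, so the more peaked (more confident) map dominates the fusion; this weighting rule is an illustration, not the thesis's exact formula.

```python
import numpy as np

def entropy(sal_map, bins=32):
    """Shannon entropy of a saliency map's intensity histogram."""
    hist, _ = np.histogram(sal_map, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def combine_saliency(spatial, temporal, bins=32):
    """Weight each map by the inverse of its histogram entropy:
    a low-entropy (peaked, confident) map receives a larger weight."""
    e_s, e_t = entropy(spatial, bins), entropy(temporal, bins)
    w_s, w_t = 1.0 / (e_s + 1e-6), 1.0 / (e_t + 1e-6)
    return (w_s * spatial + w_t * temporal) / (w_s + w_t)

rng = np.random.default_rng(0)
spatial = np.zeros((16, 16)); spatial[4:8, 4:8] = 1.0   # peaked, confident map
temporal = rng.random((16, 16))                          # diffuse, noisy map
fused = combine_saliency(spatial, temporal)
```

On this toy input the fused map stays close to the confident spatial map rather than the diffuse temporal one.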
Wei, Wen. "Apprentissage automatique des altérations cérébrales causées par la sclérose en plaques en neuro-imagerie multimodale". Thesis, Université Côte d'Azur, 2020. http://www.theses.fr/2020COAZ4021.
Multiple Sclerosis (MS) is the most common progressive neurological disease of young adults worldwide and thus represents a major public health issue, with about 90,000 patients in France and more than 500,000 people affected in Europe. In order to optimize treatments, it is essential to be able to measure and track brain alterations in MS patients. In fact, MS is a multi-faceted disease involving different types of alterations, such as myelin damage and repair. Given this, multimodal neuroimaging is needed to fully characterize the disease. Magnetic resonance imaging (MRI) has emerged as a fundamental imaging biomarker for multiple sclerosis because of its high sensitivity in revealing macroscopic tissue abnormalities in patients with MS. Conventional MR scanning provides a direct way to detect MS lesions and their changes, and plays a dominant role in the diagnostic criteria of MS. Moreover, positron emission tomography (PET) imaging, an alternative modality, can provide functional information and detect target tissue changes at the cellular and molecular level by using various radiotracers. For example, by using the radiotracer [11C]PIB, PET allows a direct pathological measure of myelin alteration. However, in clinical settings, not all modalities are available, for various reasons. In this thesis, we therefore focus on learning and predicting missing-modality-derived brain alterations in MS from multimodal neuroimaging data.
Pinard, Clément. "Robust Learning of a depth map for obstacle avoidance with a monocular stabilized flying camera". Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLY003/document.
Consumer unmanned aerial vehicles (UAVs) are mainly flying cameras. They have democratized aerial footage, but their success has brought security concerns. This work aims at improving UAV security with obstacle avoidance, while keeping a smooth flight. In this context, we use only one stabilized camera, because of weight and cost incentives. For their robustness in computer vision and their capacity to solve complex tasks, we chose to use convolutional neural networks (CNNs). Our strategy is based on incrementally learning tasks of increasing complexity, the first step of which is to construct a depth map from the stabilized camera. This thesis focuses on studying the ability of CNNs to be trained for this task. In the case of stabilized footage, the depth map is closely linked to optical flow. We thus adapt FlowNet, a CNN known for optical flow, to output depth directly from two stabilized frames. This network is called DepthNet. This experiment succeeded with synthetic footage, but is not robust enough to be used directly on real videos. Consequently, we consider self-supervised training on real videos, based on differentiably reprojecting images. This training method for CNNs being rather novel in the literature, a thorough study is needed in order not to depend too much on heuristics. Finally, we developed a depth fusion algorithm to use DepthNet efficiently on real videos: multiple frame pairs are fed to DepthNet to obtain a wide depth sensing range.
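The link between depth and optical flow for a stabilized camera can be sketched with the basic pinhole relation: for a lateral translation of known magnitude, the horizontal flow of a static point equals the stereo disparity f·B/Z, so depth follows by inversion. This is only the underlying geometry, with invented numbers; DepthNet learns the mapping end to end rather than applying this formula.

```python
import numpy as np

def depth_from_flow(flow_x, baseline, focal):
    """Recover depth Z = f * B / flow for a camera translating laterally by
    `baseline` (metres) with focal length `focal` (pixels), given the
    horizontal optical flow of a static point. Illustrative geometry only."""
    flow_x = np.asarray(flow_x, dtype=float)
    return focal * baseline / np.maximum(np.abs(flow_x), 1e-9)

# A point 10 m away, seen with f = 500 px and a 0.2 m displacement,
# produces f * B / Z = 10 px of flow; inverting the relation returns Z.
flow = 500.0 * 0.2 / 10.0
z = depth_from_flow(flow, baseline=0.2, focal=500.0)
```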
Panovski, Dancho. "Simulation, optimization and visualization of transportation data". Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAS016.
Today all major metropolises in France, Europe and the rest of the world suffer from severe problems of congestion and saturation of infrastructures, which concern both individual and public transport. Current transportation systems are reaching capacity limits and appear unable to absorb future increases in passenger flows. The transport of the future is part of the various so-called Smart City initiatives and should be "intelligent", that is to say not only react to demand but anticipate it, relying on the data exchanged between the end user and the information systems of transportation operators. Within this context, one of the main challenges is the creation of appropriate methodologies for the analysis of geo-localized transport data, for the instantaneous storage, analysis, management and dissemination of massive data flows (typically thousands of instant geo-localizations, with a refresh rate on the order of a few seconds). The related algorithms must be capable of managing event lists spanning several tens of minutes to compute real trajectories, instantaneous occupations, traffic light cycles as well as vehicular traffic flow forecasts. In this thesis, we address two different issues related to this topic. A first contribution concerns the optimization of traffic light systems. The objective is to minimize the total journey time of the vehicles present in a certain part of a city. To this end, we propose a PSO (Particle Swarm Optimization) technique. The experimental results show that such an approach makes it possible to obtain significant gains (5.37%-21.53%) in terms of global average journey time. The second part of the thesis is dedicated to the issue of traffic flow prediction. In particular, we focus on predicting bus arrival times at the various bus stations along a given itinerary.
Here, our contributions first concern the introduction of a novel data model, the so-called TDM (Traffic Density Matrix), which dynamically captures the situation of the traffic along a given bus itinerary. Then, we show how different machine learning (ML) techniques can exploit such a structure in order to perform efficient prediction. To this end, we first consider traditional ML techniques, including linear regression and support vector regression with various kernels. The analysis of the results shows that increasing the level of non-linearity can lead to superior results. Based on this observation, we propose various deep learning techniques with hand-crafted networks specifically adapted to our objectives. The proposed approaches include recurrent neural networks, LSTM (Long Short-Term Memory) approaches, and fully connected and convolutional networks. The analysis of the experimental results confirms our intuition and demonstrates that such highly non-linear techniques outperform the traditional approaches and are able to deal with the singularities of the data, which in this case correspond to localized traffic jams that globally affect the behavior of the system. Due to the lack of availability of such highly sensitive geo-localized information, all the data considered in our experiments has been produced with the help of the SUMO (Simulation of Urban Mobility) microscopic simulator. We notably show how SUMO can be exploited to construct realistic scenarios, close to real-life situations and exploitable for analysis purposes. Interpreting and understanding the data is of vital importance; nevertheless, an adequate visualization platform is needed to present the results in a visually pleasing and understandable manner. To this end, we finally propose two different visualization applications, the first dedicated to operators and the second to clients.
To ensure the deployment and compatibility of such applications on different devices (desktop PCs, laptops, smartphones, tablets…), a scalable solution is proposed.
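The PSO technique this entry applies to traffic-light timing can be sketched on a toy objective. The quadratic "journey time" surrogate and the ideal green-phase durations below are invented for illustration; the thesis optimizes journey times measured in simulation, not this closed-form function.

```python
import numpy as np

def pso_minimize(f, dim, n_particles=30, iters=200, bounds=(0.0, 120.0), seed=0):
    """Minimal particle swarm optimization: each particle tracks its personal
    best, the swarm shares a global best, and velocities blend inertia with
    cognitive and social pulls (standard PSO coefficients)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, (n_particles, dim))
    v = np.zeros((n_particles, dim))
    pbest, pbest_val = x.copy(), np.array([f(p) for p in x])
    g = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([f(p) for p in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        g = pbest[pbest_val.argmin()].copy()
    return g, float(pbest_val.min())

# Toy stand-in objective: journey time as a quadratic penalty around
# hypothetical ideal green-phase durations (seconds) for three intersections.
ideal = np.array([42.0, 55.0, 30.0])
best, best_val = pso_minimize(lambda d: float(((d - ideal) ** 2).sum()), dim=3)
```

On this smooth surrogate the swarm converges to the ideal durations; real traffic objectives are noisier, which is why the reported gains vary across scenarios.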
Corbat, Lisa. "Fusion de segmentations complémentaires d'images médicales par Intelligence Artificielle et autres méthodes de gestion de conflits". Thesis, Bourgogne Franche-Comté, 2020. http://www.theses.fr/2020UBFCD029.
Nephroblastoma is the most common kidney tumour in children and its diagnosis is based exclusively on imaging. This work, which is the subject of our research, is part of a larger project: the European project SAIAD (Automated Segmentation of Medical Images Using Distributed Artificial Intelligence). The aim of the project is to design a platform capable of performing different automatic segmentations from source images using Artificial Intelligence (AI) methods, and thus obtaining a faithful three-dimensional reconstruction. In this sense, work carried out in a previous thesis of the research team led to the creation of a segmentation platform. It allows the segmentation of several structures individually, by methods such as Deep Learning, and more particularly Convolutional Neural Networks (CNNs), as well as Case-Based Reasoning (CBR). However, it is then necessary to automatically fuse the segmentations of these different structures in order to obtain a complete, relevant segmentation. When aggregating these structures, contradictory pixels may appear. These conflicts can be resolved by various methods, based on AI or not, and are the subject of our research. First, we propose a fusion approach not based on AI, combining six different methods that rely on different imaging and segmentation criteria. In parallel, two other fusion methods are proposed: one using a CNN coupled with CBR, and the other a CNN trained with a specific existing segmentation learning method. These different approaches were tested on a set of 14 nephroblastoma patients and demonstrated their effectiveness in resolving conflicting pixels and their ability to improve the resulting segmentations.
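The conflicting-pixel problem the abstract describes can be made concrete with a minimal sketch: when several single-structure segmentations disagree on a pixel, resolve it by per-pixel majority vote. This is only the simplest baseline, not the thesis's six-method combination or its CNN/CBR fusions.

```python
import numpy as np

def fuse_segmentations(masks):
    """Fuse several label maps (0 = background) into one segmentation,
    resolving conflicting pixels by majority vote across segmentations."""
    masks = np.stack(masks)                       # (n_methods, H, W)
    n_labels = int(masks.max()) + 1
    # Per-pixel vote count for each label.
    votes = np.zeros((n_labels,) + masks.shape[1:], dtype=int)
    for lbl in range(n_labels):
        votes[lbl] = (masks == lbl).sum(axis=0)
    return votes.argmax(axis=0)

# Three toy 2x2 segmentations that disagree on two pixels.
a = np.array([[1, 1], [0, 2]])
b = np.array([[1, 2], [0, 2]])
c = np.array([[1, 1], [1, 2]])
fused = fuse_segmentations([a, b, c])
```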
Staerman, Guillaume. "Functional anomaly detection and robust estimation". Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAT021.
Enthusiasm for Machine Learning is spreading to nearly all fields, such as transportation, energy, medicine, banking and insurance, as the ubiquity of sensors through the IoT makes more and more data available at an ever finer granularity. The abundance of new applications for the monitoring of complex infrastructures (e.g. aircraft, energy networks), together with the availability of massive data samples, has put pressure on the scientific community to develop new reliable Machine Learning methods and algorithms. The work presented in this thesis focuses on two axes: unsupervised functional anomaly detection and robust learning, from both practical and theoretical perspectives. The first part of this dissertation is dedicated to the development of efficient functional anomaly detection approaches. More precisely, we introduce Functional Isolation Forest (FIF), an algorithm based on randomly splitting the functional space in a flexible manner in order to progressively isolate specific function types. We also propose a novel notion of functional depth based on the area of the convex hull of sampled curves, capturing gradual departures from centrality, even beyond the envelope of the data, in a natural fashion. Estimation and computational issues are addressed, and various numerical experiments provide empirical evidence of the relevance of the proposed approaches. In order to provide guidance for practitioners, the performance of recent functional anomaly detection techniques is evaluated on two real-world data sets related to the monitoring of helicopters in flight and to the spectrometry of construction materials. The second part describes the design and analysis of several robust statistical approaches relying on robust mean estimation and statistical data depth. The Wasserstein distance is a popular metric between probability distributions based on optimal transport.
Although the latter has shown promising results in many Machine Learning applications, it suffers from a high sensitivity to outliers. To that end, we investigate how to leverage Medians-of-Means (MoM) estimators to robustify the estimation of the Wasserstein distance with provable guarantees. Thereafter, a new statistical depth function, the Affine-Invariant Integrated Rank-Weighted (AI-IRW) depth, is introduced. Beyond the theoretical analysis carried out, numerical results are presented, providing strong empirical confirmation of the relevance of the proposed depth function. The upper-level sets of statistical depths, the depth-trimmed regions, give rise to a definition of multivariate quantiles. We propose a new discrepancy measure between probability distributions that relies on the average Hausdorff distance between the depth-based quantile regions of each distribution, and demonstrate that it benefits from attractive properties of data depths such as robustness and interpretability. All algorithms developed in this thesis are open-sourced and available online.
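The Medians-of-Means building block used above for robustification can be sketched in a few lines: split the shuffled sample into blocks, average within each block, and take the median of the block means. As long as fewer than half the blocks are contaminated, the estimate ignores gross outliers that would destroy the empirical mean. The data below are synthetic, for illustration only.

```python
import numpy as np

def median_of_means(x, n_blocks=50, seed=0):
    """Median-of-Means estimator of the mean: robust to a minority of
    outliers, unlike the plain empirical mean."""
    rng = np.random.default_rng(seed)
    x = rng.permutation(np.asarray(x, dtype=float))
    blocks = np.array_split(x, n_blocks)
    return float(np.median([b.mean() for b in blocks]))

rng = np.random.default_rng(1)
clean = rng.normal(0.0, 1.0, 1000)                       # true mean 0
corrupted = np.concatenate([clean, np.full(20, 1e6)])    # ~2% gross outliers
mom = median_of_means(corrupted, n_blocks=50)
```

Each of the 20 outliers can contaminate at most one of the 50 blocks, so at least 30 block means stay clean and the median remains near the true mean, while the empirical mean of the corrupted sample is on the order of 10^4.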
Bakkali, Souhail. "Multimodal Document Understanding with Unified Vision and Language Cross-Modal Learning". Electronic Thesis or Diss., La Rochelle, 2022. http://www.theses.fr/2022LAROS046.
The frameworks developed in this thesis are the outcome of an iterative process of analysis and synthesis between existing theories and our own studies. More specifically, we study cross-modality learning for contextualized comprehension of document components across language and vision. The main idea is to leverage multimodal information from document images into a common semantic space. This thesis focuses on advancing research on cross-modality learning and makes contributions on four fronts: (i) proposing a cross-modal approach with deep networks to jointly leverage visual and textual information into a common semantic representation space and automatically make predictions about multimodal documents (i.e., the subject matter they are about); (ii) investigating competitive strategies to address the tasks of cross-modal document classification, content-based retrieval and few-shot document classification; (iii) addressing data-related issues such as learning when data is not annotated, by proposing a network that learns generic representations from a collection of unlabeled documents; and (iv) exploiting few-shot learning settings when data contains only a few examples.
Khatib, Natasha al. "Intrusion detection with deep learning for in-vehicle networks". Electronic Thesis or Diss., Institut polytechnique de Paris, 2023. http://www.theses.fr/2023IPPAT009.
In-vehicle communication, which refers to the communication and exchange of data between embedded automotive devices, plays a crucial role in the development of intelligent transportation systems (ITS), which aim to improve the efficiency, safety, and sustainability of transportation. The proliferation of embedded sensor-centric communication and computing devices connected to the in-vehicle network (IVN) has enabled the development of safety and convenience features including vehicle monitoring, physical wiring reduction, and an improved driving experience. However, with the increasing complexity and connectivity of modern vehicles, the expanding threat landscape of the IVN is raising concerns. A range of potential security risks can compromise the safety and functionality of a vehicle, putting the lives of drivers and passengers in danger. Numerous approaches have thus been proposed and implemented to alleviate this issue, including firewalls, encryption, and secure authentication and access controls. As traditional mechanisms fail to fully counter intrusion attempts, a complementary defensive countermeasure is necessary. Intrusion Detection Systems (IDS) have thus been considered a fundamental component of every network security infrastructure, including the IVN. Intrusion detection can be particularly useful in detecting threats that may not be caught by other security measures, such as zero-day vulnerabilities or insider attacks. It can also provide an early warning of a potential attack, allowing car manufacturers to take preventive measures before significant damage occurs. The main objective of this thesis is to investigate the capability of deep learning techniques to detect in-vehicle intrusions. Deep learning algorithms can process large amounts of data and recognize complex patterns that may be difficult for humans to discern, making them well suited for detecting intrusions in the IVN.
However, since the E/E architecture of a vehicle is constantly evolving as new technologies and requirements emerge, we propose different deep learning-based solutions for different E/E architectures and for various tasks, including anomaly detection and classification.
Torres, Rivera Andrés. "Detección y extracción de neologismos semánticos especializados: un acercamiento mediante clasificación automática de documentos y estrategias de aprendizaje profundo". Doctoral thesis, Universitat Pompeu Fabra, 2019. http://hdl.handle.net/10803/667928.
In the field of neology, different methodological approaches for the detection and extraction of semantic neologisms have been developed using strategies such as word sense disambiguation and topic modeling, but a system for the detection of these units has yet to be proposed. Starting from a detailed study of the theoretical assumptions required to delimit and describe semantic neologisms, in this thesis we propose the development of an application to identify and extract these units using statistical, data mining and machine learning strategies. The proposed methodology is based on treating the detection and extraction process as a classification task, which consists in analyzing the concordance of topics between the semantic field of a word's main meaning and the text where it is found. To build the architecture of the proposed system, we analyzed five automatic classification methods and three deep-learning based word embedding models. Our analysis corpus is composed of the semantic neologisms of the computer science field belonging to the database of the Observatory of Neology of Pompeu Fabra University, registered from 1989 to 2015. We used this corpus to evaluate the different methods that our system implements: automatic classification, keyword extraction from short contexts, and similarity list generation. This first methodological approach aims to establish a framework of reference for the detection and extraction of semantic neologisms.
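The topic-concordance idea behind this classification task, comparing a word's established semantic field with the context in which it appears, can be caricatured with a bag-of-words cosine similarity. The field vocabulary, threshold and example sentences below are invented for illustration; the thesis uses supervised classifiers and word embeddings rather than this hand-built similarity.

```python
from collections import Counter
import math

def cosine(c1, c2):
    """Cosine similarity between two bag-of-words Counters."""
    common = set(c1) & set(c2)
    num = sum(c1[w] * c2[w] for w in common)
    den = (math.sqrt(sum(v * v for v in c1.values()))
           * math.sqrt(sum(v * v for v in c2.values())))
    return num / den if den else 0.0

def is_semantic_neologism_candidate(field_terms, context, threshold=0.1):
    """Flag a usage as a candidate semantic neologism when its context shows
    low lexical concordance with the word's established semantic field."""
    return cosine(Counter(field_terms), Counter(context.lower().split())) < threshold

# Hypothetical semantic field for the established computing sense of "cloud"-adjacent vocabulary.
field = ["disk", "memory", "file", "software", "server"]
novel = is_semantic_neologism_candidate(field, "the cloud stores photos remotely")
established = is_semantic_neologism_candidate(field, "the server writes the file to disk memory")
```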
Liu, Jiang. "Wireless Communications Assisted by Reconfigurable Intelligent Surfaces". Electronic Thesis or Diss., université Paris-Saclay, 2021. http://www.theses.fr/2021UPASG111.
Recently, the emergence of the reconfigurable intelligent surface (RIS) has attracted considerable attention from both industry and academia. An RIS is a planar surface consisting of a large number of low-cost passive reflecting elements. By carefully adjusting the phase shifts of the reflecting elements, an RIS can reshape the wireless environment for better communication. In this thesis, we focus on two subjects: (i) the modeling and optimization of RIS-aided communication systems; (ii) RIS-aided spatial modulation, especially detection using deep learning techniques. Chapter 1 introduces the concepts of smart radio environments and the RIS. In 5G and future communications, the RIS is a key technique for achieving seamless connectivity and lower energy consumption at the same time. Chapter 2 introduces RIS-aided communication systems. The reflection principle, the channel estimation problem and the system design problem are introduced in detail, and state-of-the-art research on channel estimation and system design is reviewed. Chapter 3 investigates the distribution of the signal-to-noise ratio (SNR) as a random variable in an RIS-aided multiple-input multiple-output (MIMO) system. Rayleigh fading and line-of-sight propagation are considered separately. The theoretical derivation and numerical simulation prove that the SNR is equivalent in distribution to the product of three (Rayleigh fading) or two (line-of-sight propagation) independent random variables. Chapter 4 studies the behavior of interference in an RIS-aided MIMO system, where each base station serves a user equipment (UE) through an RIS. The interference at a UE is caused by its non-serving RIS. It is proven that the interference-to-noise ratio is equivalent in distribution to the product of a Chi-squared random variable and a random variable that can be approximated with a Gamma distribution. Chapter 5 focuses on RIS-aided spatial modulation.
First, we introduce deep learning aided detection for MIMO systems. Then, by generalizing RIS-aided spatial modulation systems as a special case of traditional spatial modulation systems, we investigate deep learning based detection for RIS-aided spatial modulation systems. Numerical results validate the proposed data-based and model-based deep learning detection schemes for RIS-aided spatial modulation systems. Finally, Chapter 6 concludes the thesis and discusses possible future research directions.
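The product-of-random-variables characterization of the SNR (Chapter 3 above) can be checked numerically. The sketch below is illustrative only: it assumes two unit-scale Rayleigh-fading links in cascade and verifies that the moments of the resulting SNR factorize, as independence predicts; it is not the thesis derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Magnitudes of two i.i.d. Rayleigh-fading links (scale sigma = 1).
a = rng.rayleigh(scale=1.0, size=n)
b = rng.rayleigh(scale=1.0, size=n)

# The cascaded (transmitter -> RIS -> receiver) gain behaves like a product
# of independent fading magnitudes; the received SNR scales with its square.
snr = (a * b) ** 2

# Moments of a product of independent variables factorize, which is the
# property such equivalence-in-distribution results rest on.
lhs = np.mean(snr)
rhs = np.mean(a**2) * np.mean(b**2)
print(round(lhs, 2), round(rhs, 2))
```

With 200,000 samples the two estimates agree closely, consistent with the product decomposition of the SNR.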
Zhang, Wuming. "Towards non-conventional face recognition : shadow removal and heterogeneous scenario". Thesis, Lyon, 2017. http://www.theses.fr/2017LYSEC030/document.
In recent years, biometrics have received substantial attention due to the ever-growing need for automatic individual authentication. Among various physiological biometric traits, the face offers unmatched advantages over the others, such as fingerprints and iris, because it is natural, non-intrusive and easily understandable by humans. Nowadays, conventional face recognition techniques have attained quasi-perfect performance in a highly constrained environment wherein poses, illuminations, expressions and other sources of variation are strictly controlled. However, these approaches are always confined to restricted application fields because non-ideal imaging environments are frequently encountered in practical cases. To adaptively address these challenges, this dissertation focuses on the unconstrained face recognition problem, where face images exhibit more variability in illumination. Moreover, another major question is how to leverage limited 3D shape information to jointly work with 2D based techniques in a heterogeneous face recognition system. To deal with the problem of varying illuminations, we explicitly build the underlying reflectance model which characterizes interactions between the skin surface, lighting source and camera sensor, and elaborate the formation of face color. With this physics-based image formation model involved, an illumination-robust representation, namely the Chromaticity Invariant Image (CII), is proposed which can subsequently help reconstruct shadow-free and photo-realistic color face images. Because this shadow removal process is achieved in color space, the approach can be combined with existing gray-scale level lighting normalization techniques to further improve face recognition performance. The experimental results on two benchmark databases, CMU-PIE and FRGC Ver2.0, demonstrate the generalization ability and robustness of our approach to lighting variations.
We further explore the effective and creative use of 3D data in heterogeneous face recognition. In such a scenario, 3D faces are available only in the gallery set and not in the probe set, which is what one would encounter in real-world applications. Two Convolutional Neural Networks (CNNs) are constructed for this purpose. The first CNN is trained to extract discriminative features of 2D/3D face images for direct heterogeneous comparison, while the second CNN combines an encoder-decoder structure, namely U-Net, and a Conditional Generative Adversarial Network (CGAN) to reconstruct the depth face image from its 2D counterpart. Specifically, the recovered depth face images can also be fed to the first CNN for 3D face recognition, leading to a fusion scheme which achieves gains in recognition performance. We have evaluated our approach extensively on the challenging FRGC 2D/3D benchmark database. The proposed method compares favorably to the state of the art and shows significant improvement with the fusion scheme.
Klokov, Roman. "Deep learning pour la modélisation de formes 3D". Electronic Thesis or Diss., Université Grenoble Alpes, 2021. http://www.theses.fr/2021GRALM060.
Application of deep learning to geometric 3D data poses various challenges for researchers. The complex nature of geometric 3D data allows it to be represented in different forms: occupancy grids, point clouds, meshes, implicit functions, etc. Each of those representations has already spawned streams of deep neural network models capable of processing and predicting corresponding data samples for further use in various data recognition, generation, and modification tasks. Modern deep learning models force researchers to make various design choices associated with their architectures, learning algorithms and other specific aspects of the chosen applications. Often, these choices are made with the help of various heuristics and best-practice methods discovered through numerous costly experimental evaluations. Probabilistic modeling provides an alternative to these methods that allows machine learning tasks to be formalized in a meaningful manner and probability-based training objectives to be developed. This thesis explores combinations of deep learning based methods and probabilistic modeling in application to geometric 3D data. The first contribution explores how probabilistic modeling can be applied to the single-view 3D shape inference task. We propose a family of probabilistic models, Probabilistic Reconstruction Networks (PRNs), which treat the task as image-conditioned generation and introduce a global latent variable encoding shape geometry information. We explore different image conditioning options, and two different training objectives based on Monte Carlo and variational approximations of the model likelihood. Parameters of every distribution are predicted by multi-layered convolutional and fully-connected neural networks from the input images. All the options in the family of models are evaluated in the single-view 3D occupancy grid inference task on synthetic shapes and corresponding image renderings from randomized viewpoints.
We show that conditioning the latent variable prior on the input images is sufficient to achieve competitive and state-of-the-art single-view 3D shape inference performance for point cloud based and voxel based metrics, respectively. We additionally demonstrate that the probabilistic objective based on the variational approximation of the likelihood allows the model to obtain better results than the Monte Carlo based approximation. The second contribution proposes a probabilistic model for 3D point cloud generation. It treats point clouds as distributions over exchangeable variables and uses de Finetti's representation theorem to define a global latent variable model with conditionally independent distributions for the coordinates of each point. To model these point distributions, a novel type of conditional normalizing flow is proposed, based on discrete coupling of point coordinate dimensions. These flows update the coordinates of each point sample multiple times by dividing them into two groups and inferring the updates for one group of coordinates from the other group and, additionally, a global latent variable sample, by means of multi-layered fully-connected neural networks with parameters shared for all the points. We also extend our Discrete Point Flow Networks (DPFNs) from generation to the single-view inference task by conditioning the global latent variable prior in a manner similar to PRNs from the first contribution. The resulting generative performance demonstrates that DPFNs produce sets of samples of similar quality and diversity to the state of the art based on continuous normalizing flows, but are approximately 30 times faster both in training and sampling. Results in autoencoding and single-view inference tasks show competitive and state-of-the-art performance for the Chamfer distance, F-score and earth mover's distance similarity metrics for point clouds.
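The coordinate-coupling idea behind such flows can be sketched as a single invertible affine coupling step: one coordinate group of each point is updated from the other group plus a global latent sample, with network weights shared across points. The toy implementation below is a minimal illustration under assumed shapes and names, not the DPFN architecture itself.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    # Small shared fully-connected network (tanh hidden layer).
    h = np.tanh(x @ W1 + b1)
    return h @ W2 + b2

def coupling_forward(points, z, params):
    """One affine coupling step on a point cloud of shape (N, 3).

    The x-coordinate group is updated from the (y, z)-coordinate group and a
    global latent sample, with weights shared across all points.
    """
    fixed = points[:, 1:]                      # conditioning group (y, z)
    cond = np.concatenate([fixed, np.repeat(z[None, :], len(points), 0)], axis=1)
    out = mlp(cond, *params)                   # per-point (log-scale, shift)
    log_s, t = out[:, 0], out[:, 1]
    moved = points[:, 0] * np.exp(log_s) + t   # invertible affine update
    return np.column_stack([moved, fixed]), log_s

def coupling_inverse(points, z, params):
    """Exact inverse of coupling_forward (the fixed group is unchanged)."""
    fixed = points[:, 1:]
    cond = np.concatenate([fixed, np.repeat(z[None, :], len(points), 0)], axis=1)
    out = mlp(cond, *params)
    log_s, t = out[:, 0], out[:, 1]
    orig = (points[:, 0] - t) * np.exp(-log_s)
    return np.column_stack([orig, fixed])

rng = np.random.default_rng(1)
params = (rng.normal(size=(6, 8)) * 0.1, np.zeros(8),
          rng.normal(size=(8, 2)) * 0.1, np.zeros(2))
cloud = rng.normal(size=(128, 3))
latent = rng.normal(size=4)

pushed, _ = coupling_forward(cloud, latent, params)
recovered = coupling_inverse(pushed, latent, params)
```

Invertibility holds because the conditioning group passes through unchanged, so the inverse can recompute the same scale and shift.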
Nguyen, Thanh Hai. "Some contributions to deep learning for metagenomics". Electronic Thesis or Diss., Sorbonne université, 2018. http://www.theses.fr/2018SORUS102.
Metagenomic data from the human microbiome is a novel source of data for improving diagnosis and prognosis in human diseases. However, making a prediction based on individual bacterial abundance is a challenge, since the number of features is much larger than the number of samples. Hence, we face the difficulties related to high-dimensional data processing, as well as to the high complexity of heterogeneous data. Machine learning has obtained great achievements on important metagenomics problems linked to OTU-clustering, binning, taxonomic assignment, etc. The contribution of this PhD thesis is two-fold: 1) a feature selection framework for efficient heterogeneous biomedical signature extraction, and 2) a novel deep learning approach for predicting diseases using artificial image representations. The first contribution is an efficient feature selection approach based on the visualization capabilities of Self-Organizing Maps for heterogeneous data fusion. The framework is efficient on real and heterogeneous datasets containing metadata, genes of adipose tissue, and gut flora metagenomic data, with a reasonable classification accuracy compared to state-of-the-art methods. The second contribution is a method to visualize metagenomic data using a simple fill-up method, as well as various state-of-the-art dimensionality reduction learning approaches. The new metagenomic data representations can be considered as synthetic images and used as a novel data set for an efficient deep learning method such as Convolutional Neural Networks. The results show that the proposed methods either achieve state-of-the-art predictive performance, or outperform it on rich public metagenomic benchmarks.
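The fill-up idea described above can be sketched in a few lines: an abundance vector is laid out in a fixed (e.g. phylogenetic) order into a square array that a CNN can then treat as an image. This is a minimal sketch of the general idea; the function name and zero-padding choice are assumptions, not the thesis implementation.

```python
import numpy as np

def fill_up(abundances, side=None):
    """Arrange a 1-D species-abundance vector into a square 2-D 'image'.

    Features are placed left-to-right, top-to-bottom in a fixed order,
    padding the trailing cells with zeros.
    """
    n = len(abundances)
    side = side or int(np.ceil(np.sqrt(n)))
    img = np.zeros(side * side)
    img[:n] = abundances
    return img.reshape(side, side)

vec = np.random.default_rng(2).random(300)   # e.g. 300 bacterial abundances
image = fill_up(vec)
print(image.shape)
```

A 300-feature vector lands in an 18x18 synthetic image, ready to be batched into a convolutional network.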
Bachhav, Pramod. "Explicit memory inclusion for efficient artificial bandwidth extension". Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS492.
Most artificial bandwidth extension (ABE) algorithms exploit contextual information, or memory, captured via the use of static or dynamic features extracted from neighbouring speech frames. The use of memory leads to higher-dimensional features and increased computational complexity. When information from look-ahead frames is also utilised, latency increases as well. Past work points toward the benefit to ABE of exploiting memory in the form of dynamic features with a standard regression model. Even so, the literature is missing a quantitative analysis of the relative benefit of explicit memory inclusion. The research presented in this thesis assesses the degree to which explicit memory is of benefit and furthermore reports a number of different techniques that allow for its inclusion without significant increases in latency and computational complexity. Benefits are shown through both a quantitative analysis with an information-theoretic measure and subjective listening tests. Key contributions relate to the preservation of computational efficiency through the use of dimensionality reduction in the form of principal component analysis, semi-supervised stacked autoencoders and conditional variational autoencoders. The latter two techniques optimise dimensionality reduction to deliver superior ABE performance.
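The principal-component-analysis step among the key contributions can be sketched generically: features stacked over neighbouring speech frames are centred and projected onto a few principal axes so that the downstream regression stays cheap. The shapes and names below are assumptions for illustration, not the thesis pipeline.

```python
import numpy as np

def pca_reduce(frames, k):
    """Project high-dimensional memory features onto k principal axes."""
    mean = frames.mean(axis=0)
    centred = frames - mean
    # SVD of the centred data matrix; rows of vt are principal directions.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:k].T, vt[:k], mean

rng = np.random.default_rng(3)
# 500 frames, each a 10-dim feature stacked over 5 neighbouring frames.
feats = rng.normal(size=(500, 50))
reduced, axes, mu = pca_reduce(feats, k=8)
print(reduced.shape)
```

Reducing 50 stacked dimensions to 8 illustrates how memory can be included without a proportional increase in model complexity.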
Balikas, Georgios. "Explorer et apprendre à partir de collections de textes multilingues à l'aide des modèles probabilistes latents et des réseaux profonds". Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAM054/document.
Text is one of the most pervasive and persistent sources of information. Content analysis of text in its broad sense refers to methods for studying and retrieving information from documents. Nowadays, with ever-increasing amounts of text becoming available online in several languages and different styles, content analysis of text is of tremendous importance as it enables a variety of applications. To this end, unsupervised representation learning methods such as topic models and word embeddings constitute prominent tools. The goal of this dissertation is to study and address challenging problems in this area, focusing both on the design of novel text mining algorithms and tools, and on studying how these tools can be applied to text collections written in a single or several languages. In the first part of the thesis we focus on topic models and more precisely on how to incorporate prior information about text structure into such models. Topic models are built on the bag-of-words premise, and therefore words are exchangeable. While this assumption benefits the calculation of conditional probabilities, it results in a loss of information. To overcome this limitation we propose two mechanisms that extend topic models by integrating knowledge of text structure into them. We assume that the documents are partitioned into thematically coherent text segments. The first mechanism assigns the same topic to the words of a segment. The second capitalizes on the properties of copulas, a tool mainly used in the fields of economics and risk management to model the joint probability density distributions of random variables while having access only to their marginals. The second part of the thesis explores bilingual topic models for comparable corpora with explicit document alignments. Typically, a document collection for such models is in the form of comparable document pairs.
The documents of a pair are written in different languages and are thematically similar. Unless they are translations of each other, the documents of a pair are only similar to some extent. Meanwhile, representative topic models assume that the documents have identical topic distributions, which is a strong and limiting assumption. To overcome it, we propose novel bilingual topic models that incorporate the notion of cross-lingual similarity of the documents that constitute the pairs into their generative and inference processes. Calculating this cross-lingual document similarity is a task in itself, which we propose to address using cross-lingual word embeddings. The last part of the thesis concerns the use of word embeddings and neural networks for three text mining applications. First, we discuss polylingual document classification, where we argue that translations of a document can be used to enrich its representation. Using an auto-encoder to obtain these robust document representations, we demonstrate improvements in the task of multi-class document classification. Second, we explore multi-task sentiment classification of tweets, arguing that jointly training classification systems using correlated tasks can improve the obtained performance. To this end we show how to achieve state-of-the-art performance on a sentiment classification task using recurrent neural networks. The third application we explore is cross-lingual information retrieval. Given a document written in one language, the task consists in retrieving the most similar documents from a pool of documents written in another language. In this line of research, we show that by adapting the transportation problem to the task of estimating document distances one can achieve important improvements.
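A simple baseline for cross-lingual document similarity with word embeddings is to average each document's embeddings in a shared space and compare with cosine similarity. The sketch below uses a tiny hand-made embedding table; the vocabulary and vectors are invented for illustration and do not come from the thesis.

```python
import numpy as np

def doc_vector(tokens, embeddings):
    """Average the (cross-lingual) word embeddings of a document's tokens."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    return np.mean(vecs, axis=0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy shared embedding space; a real model would map both languages into it.
emb = {
    "economy":  np.array([0.9, 0.1, 0.0]),
    "économie": np.array([0.88, 0.12, 0.05]),
    "music":    np.array([0.0, 0.2, 0.95]),
    "musique":  np.array([0.02, 0.22, 0.9]),
}

en_doc = ["economy", "music"]
fr_doc = ["économie", "musique"]
sim = cosine(doc_vector(en_doc, emb), doc_vector(fr_doc, emb))
print(round(sim, 3))
```

Thematically aligned documents score close to 1, giving the kind of graded similarity signal the bilingual topic models above can condition on.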
Tong, Zheng. "Evidential deep neural network in the framework of Dempster-Shafer theory". Thesis, Compiègne, 2022. http://www.theses.fr/2022COMP2661.
Deep neural networks (DNNs) have achieved remarkable success in many real-world applications (e.g., pattern recognition and semantic segmentation) but still face the problem of managing uncertainty. Dempster-Shafer theory (DST) provides a well-founded and elegant framework to represent and reason with uncertain information. In this thesis, we propose a new framework using DST and DNNs to address these uncertainty problems. In the proposed framework, we first hybridize DST and DNNs by plugging a DST-based neural-network layer, followed by a utility layer, at the output of a convolutional neural network for set-valued classification. We also extend the idea to semantic segmentation by combining fully convolutional networks and DST. The proposed approach enhances the performance of DNN models by assigning ambiguous patterns with high uncertainty, as well as outliers, to multi-class sets. The learning strategy using soft labels further improves the performance of the DNNs by converting imprecise and unreliable label data into belief functions. We also propose a modular fusion strategy using this framework, in which a fusion module aggregates the belief-function outputs of evidential DNNs by Dempster's rule. We use this strategy to combine DNNs trained on heterogeneous datasets with different sets of classes while keeping a performance at least as good as that of the individual networks on their respective datasets. Further, we apply the strategy to combine several shallow networks and achieve a performance similar to that of an advanced DNN for a complicated task.
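The aggregation step of such a fusion module uses Dempster's rule of combination, which can be implemented directly on mass functions whose focal elements are sets of classes. A minimal sketch follows; the class names and mass values are invented for illustration, not outputs of the thesis networks.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions (dicts: frozenset -> mass) with
    Dempster's rule: conjunctive combination plus conflict normalization."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb          # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

# Two classifiers' belief outputs over the frame {cat, dog}.
m_a = {frozenset({"cat"}): 0.6, frozenset({"cat", "dog"}): 0.4}
m_b = {frozenset({"dog"}): 0.3, frozenset({"cat", "dog"}): 0.7}
fused = dempster_combine(m_a, m_b)
```

Note how mass on the full frame {cat, dog} encodes ignorance: a network that has never seen a class can leave its evidence uncommitted rather than guessing, which is what makes combining networks with different class sets possible.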
Singh, Praveer. "Processing high-resolution images through deep learning techniques". Thesis, Paris Est, 2018. http://www.theses.fr/2018PESC1172.
In this thesis, we discuss four different application scenarios that can be broadly grouped under the larger umbrella of analyzing and processing high-resolution images using deep learning techniques. The first three chapters encompass processing remote-sensing (RS) images, which are captured either from airplanes or from satellites hundreds of kilometers away from the Earth. We start by addressing a challenging problem related to improving the classification of complex aerial scenes through a deep weakly supervised learning paradigm. We show how, by only using image-level labels, we can effectively localize the most distinctive regions in complex scenes and thus remove ambiguities, leading to enhanced classification performance in highly complex aerial scenes. In the second chapter, we deal with refining segmentation labels of building footprints in aerial images. We do this effectively by first detecting errors in the initial segmentation masks and correcting only those segmentation pixels where we find a high probability of error. The next two chapters of the thesis are related to the application of Generative Adversarial Networks. In the first one, we build an effective Cloud-GAN model to remove thin films of clouds in Sentinel-2 imagery by adopting a cyclic consistency loss. This utilizes an adversarial loss function to map cloudy images to non-cloudy images in a fully unsupervised fashion, where the cyclic loss helps constrain the network to output a cloud-free image corresponding to the input cloudy image, and not just any image in the target domain. Finally, the last chapter addresses a different set of high-resolution images, coming not from the RS domain but from the High Dynamic Range Imaging (HDRI) application. These are 32-bit images which capture the full extent of luminance present in the scene.
Our goal is to quantize them to 8-bit Low Dynamic Range (LDR) images so that they can be projected effectively on our normal display screens while keeping the overall contrast and perception quality similar to that found in HDR images. We adopt a multi-scale GAN model that focuses on both the coarse- and fine-level information necessary for high-resolution images. The final tone-mapped outputs have a high subjective quality without any perceived artifacts.
Blanc, Beyne Thibault. "Estimation de posture 3D à partir de données imprécises et incomplètes : application à l'analyse d'activité d'opérateurs humains dans un centre de tri". Thesis, Toulouse, INPT, 2020. http://www.theses.fr/2020INPT0106.
In a context of studying stress and ergonomics at work for the prevention of musculoskeletal disorders, the company Ebhys wants to develop a tool for analyzing the activity of human operators in a waste sorting center by measuring ergonomic indicators. To cope with the uncontrolled environment of the sorting center, these indicators are measured from depth images. An ergonomic study allows us to define the indicators to be measured. These indicators are zones of movement of the operator's hands and zones of angulation of certain joints of the upper body. They are therefore indicators that can be obtained from an analysis of the operator's 3D pose. The software for calculating the indicators is thus composed of three steps: a first part segments the operator from the rest of the scene to ease the 3D pose estimation, a second part estimates the operator's 3D pose, and a third part uses the operator's 3D pose to compute the ergonomic indicators. First of all, we propose an algorithm that extracts the operator from the rest of the depth image. To do this, we use a first automatic segmentation based on static background removal and selection of a moving element given its position and size. This first segmentation allows us to train a neural network that improves the results. This neural network is trained using the segmentations obtained from the first automatic segmentation, from which the best-quality samples are automatically selected during training. Next, we build a neural network model to estimate the operator's 3D pose. We propose a study that allows us to find a light and optimal model for 3D pose estimation on synthetic depth images, which we generate numerically. However, while this network gives outstanding performance on synthetic depth images, it is not directly applicable to the real depth images that we acquired in an industrial context.
To overcome this issue, we finally build a module that allows us to transform the synthetic depth images into more realistic depth images. This image-to-image translation model modifies the style of the depth image without changing its content, keeping the 3D pose of the operator from the synthetic source image unchanged on the translated realistic depth frames. These more realistic depth images are then used to re-train the 3D pose estimation neural network, to finally obtain a convincing 3D pose estimation on the depth images acquired in real conditions, and to compute the ergonomic indicators.
Mazloomi, Khamseh Hamid. "Intention d’apprendre et diversité des partenaires : effets simples et combinés sur le transfert de connaissances entre alliés". Thesis, Vandoeuvre-les-Nancy, INPL, 2010. http://www.theses.fr/2010INPL017N/document.
Relying on the knowledge-based view, this study tests the effects of three concepts as prerequisites for interfirm learning: the intent to explore, the existence of novelty, and the approach of exploration. The study defines the existence of new knowledge to be learnt by the level of partner diversity, and addresses approaches of exploration through the interactive effect of explorative intent and partner diversity. The hypotheses are tested with a survey over a sample of 114 French companies. Determinants of knowledge transfer between partners, such as ambiguity of the partner's knowledge, knowledge protection and trust, are controlled for. Using Tobit regression models, the findings show that the intent to explore is positively related to interfirm knowledge transfer. Moreover, an inverted U-shaped relationship is observed between partner diversity and the effectiveness of interfirm knowledge transfer. Finally, the negative moderating effect of partner diversity on the relation between exploration and knowledge transfer highlights the effect of two approaches to exploration: depth and scope of exploration. In accordance with the concept of depth of search, we find that the interactive effect of partner similarity with explorative intent on interfirm learning is positive. We also find that a broad search scope, represented by the interactive effect of partner diversity and the intent to explore, has a negative impact on interfirm learning.
Mozafari, Marzieh. "Hate speech and offensive language detection using transfer learning approaches". Electronic Thesis or Diss., Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAS007.
The great promise of social media platforms (e.g., Twitter and Facebook) is to provide a safe place for users to communicate their opinions and share information. However, concerns are growing that they also enable abusive behaviors, e.g., threatening or harassing other users, cyberbullying, hate speech, and racial and sexual discrimination. In this thesis, we focus on hate speech as one of the most concerning phenomena in online social media. Given the high progression of online hate speech and its severe negative effects, institutions, social media platforms, and researchers have been trying to react as quickly as possible. The recent advancements in Natural Language Processing (NLP) and Machine Learning (ML) algorithms can be adapted to develop automatic methods for hate speech detection in this area. The aim of this thesis is to investigate the problem of hate speech and offensive language detection in social media, where we define hate speech as any communication criticizing a person or a group based on some characteristics, e.g., gender, sexual orientation, nationality, religion, or race. We propose different approaches in which we adapt advanced Transfer Learning (TL) models and NLP techniques to detect hate speech and offensive content automatically, in a monolingual and multilingual fashion. In the first contribution, we focus only on the English language. Firstly, we analyze user-generated textual content to gain a brief insight into the type of content by introducing a new framework able to categorize contents in terms of topical similarity based on different features. Furthermore, using the Perspective API from Google, we measure and analyze the toxicity of the content. Secondly, we propose a TL approach for the identification of hate speech by employing a combination of the unsupervised pre-trained model BERT (Bidirectional Encoder Representations from Transformers) and new supervised fine-tuning strategies.
Finally, we investigate the effect of unintended bias in our pre-trained BERT-based model and propose a new generalization mechanism in training data by reweighting samples and then changing the fine-tuning strategies in terms of the loss function to mitigate the racial bias propagated through the model. To evaluate the proposed models, we use two publicly available datasets from Twitter. In the second contribution, we consider a multilingual setting where we focus on low-resource languages in which little or no labeled data is available. First, we present the first corpus of Persian offensive language, consisting of 6k microblog posts from Twitter, to deal with offensive language detection in Persian as a low-resource language in this domain. After annotating the corpus, we perform extensive experiments to investigate the performance of transformer-based monolingual and multilingual pre-trained language models (e.g., ParsBERT, mBERT, XLM-R) on the downstream task. Furthermore, we propose an ensemble model to boost the performance of our model. Then, we expand our study into a cross-lingual few-shot learning problem, where we have a few labeled data points in the target language, and adapt a meta-learning based approach to address the identification of hate speech and offensive language in low-resource languages.
Souriau, Rémi. "machine learning for modeling dynamic stochastic systems : application to adaptive control on deep-brain stimulation". Electronic Thesis or Diss., université Paris-Saclay, 2021. http://www.theses.fr/2021UPASG004.
The past few years have been marked by the emergence of large amounts of data in many fields, such as health. The creation of many databases paves the way for new applications. The properties of these data are sometimes complex (nonlinearity, dynamics, high dimensionality) and require machine learning models. Among existing machine learning models, artificial neural networks have achieved great success over the last decades. The success of these models lies in the nonlinear behavior of neurons, the use of latent units and the flexibility of these models to adapt to many different problems. Boltzmann machines, presented in this thesis, are a family of generative neural networks. Introduced by Hinton in the 1980s, this family attracted wide interest at the beginning of the 21st century, and new extensions are regularly proposed. This thesis is divided into two parts. The first part explores Boltzmann machines and their applications; in particular, the unsupervised learning of intracranial electroencephalogram signals in rats with Parkinson's disease, for the control of symptoms, is studied. Boltzmann machines gave birth to diffusion networks, which are also generative models, based on learning a stochastic differential equation for dynamic and stochastic data. This model is also studied in this thesis and a new training algorithm is proposed. Its use is tested on toy data as well as on a real database.
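A binary restricted Boltzmann machine, the basic member of the family discussed above, is classically trained with contrastive divergence. The sketch below implements one CD-1 update under assumed layer sizes; it illustrates the mechanism only and is not the thesis's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(4)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.05):
    """One contrastive-divergence (CD-1) update of a binary RBM.

    v0: batch of visible vectors; W, b, c: weights, visible and hidden biases.
    """
    # Up: sample hidden units given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Down-up: one Gibbs step to obtain the 'model' statistics.
    pv1 = sigmoid(h0 @ W.T + b)
    ph1 = sigmoid(pv1 @ W + c)
    # Gradient approximation: data correlations minus model correlations.
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c

n_vis, n_hid = 6, 4
W = rng.normal(scale=0.1, size=(n_vis, n_hid))
b, c = np.zeros(n_vis), np.zeros(n_hid)
data = (rng.random((32, n_vis)) < 0.5).astype(float)
for _ in range(10):
    W, b, c = cd1_step(data, W, b, c)
```

Each update nudges the energy landscape so that the data configurations become more probable than the model's own one-step reconstructions.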
Jaritz, Maximilian. "2D-3D scene understanding for autonomous driving". Thesis, Université Paris sciences et lettres, 2020. https://pastel.archives-ouvertes.fr/tel-02921424.
In this thesis, we address the challenges of label scarcity and of the fusion of heterogeneous 3D point clouds and 2D images. We adopt the strategy of end-to-end race driving, where a neural network is trained to directly map sensor input (camera image) to control output, which makes this strategy independent of annotations in the visual domain. We employ deep reinforcement learning, where the algorithm learns from reward through interaction with a realistic simulator. We propose new training strategies and reward functions for better driving and faster convergence. However, training time is still very long, which is why we focus on perception to study point cloud and image fusion in the remainder of this thesis. We propose two different methods for 2D-3D fusion. First, we project 3D LiDAR point clouds into 2D image space, resulting in sparse depth maps. We propose a novel encoder-decoder architecture to fuse dense RGB and sparse depth for the task of depth completion, which enhances point cloud resolution to image level. Second, we fuse directly in 3D space to prevent information loss through projection. To do so, we compute image features with a 2D CNN over multiple views and then lift them all to a global 3D point cloud for fusion, followed by a point-based network to predict 3D semantic labels. Building on this work, we introduce the more difficult novel task of cross-modal unsupervised domain adaptation, where one is provided with multi-modal data in a labeled source and an unlabeled target dataset. We propose to perform 2D-3D cross-modal learning via mutual mimicking between image and point cloud networks to address the source-target domain shift. We further show that our method is complementary to the existing uni-modal technique of pseudo-labeling.
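The projection step of the first fusion method, mapping a LiDAR point cloud to a sparse depth map, can be sketched with a pinhole camera model. The intrinsics and image size below are placeholders, not values from the thesis; nearest return wins when two points land on the same pixel.

```python
import numpy as np

def project_to_depth_map(points, fx, fy, cx, cy, h, w):
    """Project 3D points (camera frame, z forward) into a sparse depth map.

    Pixels with no LiDAR return stay at 0; on collisions the closer point wins.
    """
    depth = np.zeros((h, w))
    pts = points[points[:, 2] > 0]                 # keep points in front
    u = np.round(fx * pts[:, 0] / pts[:, 2] + cx).astype(int)
    v = np.round(fy * pts[:, 1] / pts[:, 2] + cy).astype(int)
    ok = (0 <= u) & (u < w) & (0 <= v) & (v < h)   # inside the image
    for ui, vi, z in zip(u[ok], v[ok], pts[ok, 2]):
        if depth[vi, ui] == 0 or z < depth[vi, ui]:
            depth[vi, ui] = z
    return depth

rng = np.random.default_rng(5)
cloud = rng.uniform([-5, -2, 1], [5, 2, 30], size=(2000, 3))
dmap = project_to_depth_map(cloud, 200, 200, 160, 120, 240, 320)
print(dmap.shape, int((dmap > 0).sum()))
```

The resulting map is mostly zeros, which is exactly the sparsity the depth-completion network is designed to densify.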
Mohy, El Dine Kamal. "Control of robotic mobile manipulators : application to civil engineering". Thesis, Université Clermont Auvergne (2017-2020), 2019. http://www.theses.fr/2019CLFAC015/document.
Despite the advancements in industrial automation, robotic solutions are not yet commonly used in the civil engineering sector. More specifically, grinding tasks such as asbestos removal are still performed by human operators using conventional electrical and hydraulic tools. However, with the decrease in the relative cost of machinery with respect to human labor and with the strict health regulations on such risky jobs, robots are progressively becoming credible alternatives for automating these tasks and replacing humans. In this thesis, novel surface grinding control approaches are elaborated. The first controller is based on a hybrid position-force controller with a compliant wrist and a smooth switching strategy. In this controller, the impact force is reduced by the proposed smooth switching between free-space and contact modes. The second controller is based on a developed grinding model and an adaptive hybrid position-velocity-force controller. The controllers are validated experimentally on a 7-degrees-of-freedom robotic arm equipped with a camera and a force-torque sensor. The experimental results show good performance and the controllers are promising. Additionally, a new approach for controlling the stability of mobile manipulators in real time is presented. The controller is based on the zero-moment point; it is tested in simulation and is able to actively maintain the tip-over stability of the mobile manipulator while moving. Moreover, modeling and sensor uncertainties are taken into account in the aforementioned controllers, for which observers are proposed. The details of the development and evaluation of the several proposed controllers are presented, their merits and limitations are discussed, and future works are suggested.
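The smooth switching between free-space and contact modes can be illustrated with a scalar blend of position and force actions driven by a smoothstep ramp: instead of a hard switch that causes an impact spike, the force term is blended in gradually as the tool approaches the surface. This is a didactic sketch with invented gains and thresholds, not the controller developed in the thesis.

```python
def blended_command(x_err, f_err, kp, kf, alpha):
    """Blend position and force control actions during contact transition.

    alpha in [0, 1] ramps from 0 (free space, pure position control)
    to 1 (contact, pure force control).
    """
    u_pos = kp * x_err          # proportional position action
    u_force = kf * f_err        # proportional force action
    return (1.0 - alpha) * u_pos + alpha * u_force

def smooth_alpha(distance, d_switch=0.01):
    """Smoothstep ramp based on distance to the surface (metres)."""
    s = min(max(1.0 - distance / d_switch, 0.0), 1.0)
    return s * s * (3.0 - 2.0 * s)   # C1-continuous: no jump in the command

# Far from the wall: behaves like a position controller.
far = blended_command(0.05, 0.0, kp=100.0, kf=0.5, alpha=smooth_alpha(0.1))
# At contact: behaves like a force controller.
near = blended_command(0.0, -4.0, kp=100.0, kf=0.5, alpha=smooth_alpha(0.0))
```

Because the smoothstep has zero slope at both ends, the commanded action and its derivative stay continuous through the transition, which is the property that limits impact force.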
Hamel, Philippe. "Apprentissage de représentations musicales à l'aide d'architectures profondes et multiéchelles". Thèse, 2012. http://hdl.handle.net/1866/8678.
Machine learning (ML) is an important tool in the field of music information retrieval (MIR). Many MIR tasks can be solved by training a classifier over a set of features. For MIR tasks based on music audio, features can be extracted from the audio with signal-processing techniques. However, some musical aspects are hard to capture with simple heuristics. To obtain richer features, we can use ML to learn a representation from the audio; these learned features often improve performance on a given MIR task. In order to learn interesting musical representations, it is important to consider the particular aspects of music audio when building learning models. Given the temporal and spectral structure of music audio, deep and multi-scale representations are particularly well suited to represent music. This thesis focuses on learning representations from music audio. Deep and multi-scale models that improve the state of the art for tasks such as instrument recognition, genre recognition, and automatic annotation are presented.
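The multi-scale idea, analyzing the same audio at several time-frequency resolutions, can be sketched with plain magnitude spectrograms at different window sizes. This is a hypothetical illustration, not the thesis's feature pipeline:

```python
import numpy as np

def multiscale_spectrogram(signal, window_sizes=(256, 1024, 4096), hop=256):
    """Magnitude spectrograms of one signal at several analysis window sizes.

    Short windows give fine temporal resolution (onsets); long windows give
    fine spectral resolution (pitch) -- the multi-scale trade-off for music.
    Returns a dict {window_size: (num_frames, window_size // 2 + 1) array}.
    """
    out = {}
    for n in window_sizes:
        win = np.hanning(n)
        frames = [signal[i:i + n] * win
                  for i in range(0, len(signal) - n + 1, hop)]
        out[n] = np.abs(np.fft.rfft(np.array(frames), axis=1))
    return out
```

A deep model in the spirit described above could consume all three resolutions jointly rather than committing to a single analysis scale.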
Lajoie, Isabelle. "Apprentissage de représentations sur-complètes par entraînement d’auto-encodeurs". Thèse, 2009. http://hdl.handle.net/1866/3768.
Progress in machine learning allows computational systems to address increasingly complex tasks associated with vision, audio signal, or natural language processing. Among the existing models we find the artificial neural network (ANN), whose popularity surged with the recent breakthrough of Hinton et al. [22], which consists in using Restricted Boltzmann Machines (RBM) to perform an unsupervised, layer-by-layer pre-training initialization of a Deep Belief Network (DBN), enabling the subsequent successful supervised training of such an architecture. Since this discovery, researchers have studied the efficiency of other, similar pre-training strategies, such as stacking traditional auto-encoders (SAE) [5, 38] and stacking denoising auto-encoders (SDAE) [44]. This is the context in which the present study started. After a brief introduction to the basic machine learning principles and to the pre-training methods used until now with RBM, AE, and DAE modules, we performed a series of experiments to deepen our understanding of pre-training with SDAE, explored its different properties, and explored variations on the DAE algorithm as alternative strategies to initialize deep networks. We evaluated the sensitivity to the noise level, and the influence of the number of layers and the number of hidden units, on the generalization error obtained with SDAE. We experimented with other noise types and saw improved performance on the supervised task with salt-and-pepper noise (PS) or Gaussian noise (GS), noise types that are better justified than the one used until now, masking noise (MN). Moreover, modifying the algorithm by imposing an emphasis on the reconstruction of the corrupted components during the unsupervised training of each DAE showed encouraging performance improvements.
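The three corruption processes compared in the abstract (masking, salt-and-pepper, and Gaussian noise) can be sketched as follows; a denoising auto-encoder is then trained to reconstruct the clean input from the corrupted one. The function names are illustrative, not taken from the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

def masking_noise(x, p):
    """MN: set a fraction p of the components to zero."""
    keep = rng.random(x.shape) >= p
    return x * keep

def salt_and_pepper(x, p):
    """PS: force a fraction p of the components to the extremes 0 or 1."""
    flip = rng.random(x.shape) < p
    extreme = (rng.random(x.shape) < 0.5).astype(x.dtype)
    return np.where(flip, extreme, x)

def gaussian_noise(x, sigma):
    """GS: add isotropic Gaussian noise to every component."""
    return x + rng.normal(0.0, sigma, x.shape)
```

Note the qualitative difference that motivates the comparison: MN and PS destroy a subset of components entirely, while GS perturbs all components slightly.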
Our work also revealed that the DAE is capable of learning, on natural images, filters similar to those found in the V1 cells of the visual cortex, which are in essence edge detectors. In addition, we verified that the representations learned by the SDAE are very good features to feed to a linear or Gaussian support vector machine (SVM), considerably enhancing its generalization performance. We also observed that, like the DBN and unlike the SAE, the SDAE has the potential to be used as a good generative model. Finally, we opened the door to novel pre-training strategies and discovered the potential of one of them: the stacking of renoising auto-encoders (SRAE).
Diouf, Jean Noël Dibocor. "Classification, apprentissage profond et réseaux de neurones : application en science des données". Thèse, 2020. http://depot-e.uqtr.ca/id/eprint/9555/1/eprint9555.pdf.
Kahya, Emre Onur. "Identifying electrons with deep learning methods". Thesis, 2020. http://hdl.handle.net/1866/25101.
This thesis is about applying the tools of machine learning to an important problem in experimental particle physics: identifying signal electrons after proton-proton collisions at the Large Hadron Collider. In Chapter 1, we provide some information about the Large Hadron Collider and explain why it was built. We give further details about one of its biggest detectors, ATLAS. We then define the electron identification task and explain the importance of solving it. Finally, we describe in detail the dataset we use for the electron identification task. In Chapter 2, we give a brief introduction to the fundamental principles of machine learning. Starting with the definition and types of different learning tasks, we discuss various ways to represent inputs. We then present what to learn from the inputs and how to do it, and finally look at the problems that arise if we "overdo" learning. In Chapter 3, we motivate the choice of architecture for our task, especially for the parts that take sequential images as inputs. We then present the results of our experiments and show that our model performs much better than the existing algorithms the ATLAS collaboration currently uses. Finally, we discuss future directions to further improve our results. In Chapter 4, we discuss two concepts: out-of-distribution generalization and the flatness of the loss surface. We claim that algorithms that bring a model into a wide, flat minimum of its training loss surface generalize better on out-of-distribution tasks. We give the results of implementing two such algorithms on our dataset and show that they support our claim. Finally, we end with our conclusions.
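The notion of loss-surface flatness invoked in Chapter 4 can be illustrated by probing a minimum with random weight perturbations and averaging the resulting loss increase. This is a toy sharpness measure on assumed quadratic losses, not one of the algorithms evaluated in the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

def sharpness(loss_fn, weights, radius=0.1, num_samples=10):
    """Average loss increase under random perturbations of fixed norm.

    Small values indicate a wide, flat minimum -- the kind argued to
    generalize better out of distribution. (Illustrative measure only.)
    """
    base = loss_fn(weights)
    total = 0.0
    for _ in range(num_samples):
        d = rng.normal(size=weights.shape)
        d *= radius / np.linalg.norm(d)   # perturbation of norm `radius`
        total += loss_fn(weights + d) - base
    return total / num_samples

# Two minima with the same value but different curvature:
flat = lambda w: float(1.0 * (w @ w))     # wide basin
sharp = lambda w: float(100.0 * (w @ w))  # narrow basin
```

Probing both basins at their common minimum (the origin) shows the narrow one incurring a much larger average loss increase, which is the sense in which it is "sharper".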
Pezeshki, Mohammad. "Towards deep semi supervised learning". Thèse, 2016. http://hdl.handle.net/1866/18343.
Racah, Evan. "Unsupervised representation learning in interactive environments". Thèse, 2019. http://hdl.handle.net/1866/23788.
Extracting a representation of all the high-level factors of an agent's state from low-level sensory information is an important but challenging task in machine learning. In this thesis, we explore several unsupervised approaches for learning these state representations. We apply and analyze existing unsupervised representation learning methods in reinforcement learning environments, and contribute our own evaluation benchmark and our own novel state representation learning method. In the first chapter, we overview and motivate unsupervised representation learning for machine learning in general and for reinforcement learning in particular. We then introduce a relatively new subfield of representation learning, self-supervised learning, and cover two core representation learning approaches: generative methods and discriminative methods. Specifically, we focus on a collection of discriminative methods called contrastive unsupervised representation learning (CURL) methods. We close the first chapter by detailing various approaches for evaluating the usefulness of representations. In the second chapter, we present a workshop paper in which we evaluate a handful of off-the-shelf self-supervised methods on reinforcement learning problems. We discover that the performance of these representations depends heavily on the dynamics and visual structure of the environment; as such, we determine that a more systematic study of environments and methods is required. Our third chapter covers our second article, Unsupervised State Representation Learning in Atari, where we carry out the more thorough study of representation learning methods in RL motivated by the second chapter. To facilitate a more thorough evaluation of representations in RL, we introduce a benchmark of 22 fully labelled Atari games.
In addition, we choose the representation learning methods for comparison in a more systematic way, focusing on comparing generative methods with contrastive methods rather than the less systematically chosen off-the-shelf methods of the second chapter. Finally, we introduce a new contrastive method, ST-DIM, which excels on the 22 Atari games.
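The contrastive objective that CURL-family methods such as ST-DIM build on can be sketched as an InfoNCE-style loss: each anchor embedding must be matched to its positive against the other samples in the batch, which act as negatives. A minimal NumPy sketch, not the ST-DIM implementation:

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss.

    anchors, positives: (B, D) embeddings; row i of `positives` is the
    positive for row i of `anchors`, and the other rows act as negatives.
    Low loss means each anchor is most similar to its own positive.
    """
    logits = anchors @ positives.T / temperature   # (B, B) similarity scores
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_softmax))          # match row i to column i
```

In ST-DIM the positive pairs come from spatio-temporal structure (e.g. features of consecutive frames), which is what makes the learned representation capture the state factors the benchmark measures.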