Doctoral dissertations on the topic "Traitement des données multimodales"
Create an accurate citation in APA, MLA, Chicago, Harvard, and many other styles
Consult the top 50 doctoral dissertations for your research on the topic "Traitement des données multimodales".
An "Add to bibliography" button is available next to each work in the bibliography. Use it, and we will automatically create a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a ".pdf" file and read the work's abstract online, whenever such details are available in the metadata.
Browse doctoral dissertations from many different fields and build an appropriate bibliography.
Guislain, Maximilien. "Traitement joint de nuage de points et d'images pour l'analyse et la visualisation des formes 3D". Thesis, Lyon, 2017. http://www.theses.fr/2017LYSE1219/document.
Full text source
Recent years have seen rapid development of city digitization technologies. Acquisition campaigns covering entire cities are now performed using LiDAR (Light Detection And Ranging) scanners mounted on mobile vehicles. These campaigns yield point clouds, composed of millions of points, representing the buildings and the streets, and may also include a set of images of the scene. The subject developed here is the improvement of the point cloud using the information contained in the camera images, and this thesis introduces several contributions to this joint improvement. The position and orientation of the acquired images are usually estimated using devices embedded with the LiDAR scanner, but this information is inaccurate. To obtain a precise registration of an image on a point cloud, we propose a two-step algorithm that uses both Mutual Information and Histograms of Oriented Gradients. The proposed method yields an accurate camera pose even when the initial estimates are far from the true position and orientation. Once the images have been correctly registered, they can be used to color each point of the cloud while exploiting the variability of viewpoints. This is done by minimizing an energy that considers the different colors associated with a point and the potential colors of its neighbors. Illumination changes can also alter the color assigned to a point; notably, it can be affected by cast shadows. Because cast shadows move with the sun, they must be detected and corrected. We propose a new method that analyzes the joint variation of the reflectance value obtained by the LiDAR and the color of the points. By detecting enough interfaces between shadow and light, we can characterize the luminance of the scene and remove the cast shadows. The last point developed in this thesis is the densification of a point cloud: its local density varies and is sometimes insufficient in certain areas, so we propose a directly applicable approach to increase the density of a point cloud using multiple images.
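The registration step above scores candidate camera poses using Mutual Information (alongside HOG features). As a hedged illustration of that first ingredient, here is a minimal mutual-information score between two grayscale images, estimated from their joint intensity histogram (an illustrative sketch, not the thesis's actual two-step algorithm):

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Mutual information (nats) between two equally sized grayscale
    images, estimated from their joint intensity histogram."""
    hist, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = hist / hist.sum()              # joint distribution
    px = pxy.sum(axis=1, keepdims=True)  # marginal of img_a
    py = pxy.sum(axis=0, keepdims=True)  # marginal of img_b
    nz = pxy > 0                         # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

# An image is far more informative about itself than about a
# scrambled copy with the same intensity distribution.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64)).astype(float)
scrambled = rng.permutation(img.ravel()).reshape(64, 64)
assert mutual_information(img, img) > mutual_information(img, scrambled)
```

In a registration loop, such a score would be maximized over candidate poses: the better the pose, the more the rendered point-cloud intensities statistically agree with the camera image.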
Cavalcante, Aguilar Paulo Armando. "Réseaux Évidentiels pour la fusion de données multimodales hétérogènes : application à la détection de chutes". Phd thesis, Institut National des Télécommunications, 2012. http://tel.archives-ouvertes.fr/tel-00789773.
Full text source
Chlaily, Saloua. "Modèle d'interaction et performances du traitement du signal multimodal". Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAT026/document.
Full text source
The joint processing of multimodal measurements is expected to lead to better performance than that obtained using a single modality, or several modalities processed independently. However, the literature contains examples showing that this is not always true. In this thesis, we analyze, in terms of mutual information and estimation error, different situations of multimodal analysis in order to determine the conditions for achieving optimal performance. In the first part, we consider the simple case of two or three modalities, each associated with a noisy measurement of a signal. These modalities are linked through the correlations between the useful parts of the signals and the correlations between the noises. We show that performance improves when the links between the modalities are exploited. In the second part, we study the impact on performance of incorrect assumptions about the links between modalities. We show that such false assumptions degrade performance, which can become lower than that achieved with a single modality. In the general case, we model the multiple modalities as a noisy Gaussian channel. We then extend results from the literature by considering the impact of errors in the signal and noise probability densities on the information transmitted by the channel, and analyze this relationship for a simple two-modality model. Our results show, in particular, the unexpected fact that a double mismatch of the noise and the signal can sometimes compensate for each other and thus lead to very good performance.
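For the simplest version of the Gaussian model evoked above (each modality observing the same signal in independent Gaussian noise), the benefit of joint processing has a standard closed form, I(s; x) = ½ ln(1 + Σᵢ SNRᵢ). A small numeric check of this textbook Gaussian-channel result (the thesis's model is richer, allowing correlated signals and noises):

```python
import math

def mi_gaussian(snrs):
    """Mutual information (nats) between a Gaussian signal and noisy
    observations of it, with independent Gaussian noises of given SNRs."""
    return 0.5 * math.log(1.0 + sum(snrs))

single = mi_gaussian([2.0])       # one modality with SNR = 2
joint = mi_gaussian([2.0, 1.0])   # add a second, noisier modality
assert joint > single             # here, joint processing can only help
```

With independent noises the SNRs simply add, so a second modality always increases the transmitted information; the thesis shows this conclusion can fail once the inter-modality links are modelled incorrectly.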
Aron, Michaël. "Acquisition et modélisation de données articulatoires dans un contexte multimodal". Thesis, Nancy 1, 2009. http://www.theses.fr/2009NAN10097/document.
Full text source
No single technique can capture, spatially and temporally, all the relevant behaviour of the speech articulators (lips, tongue, palate...). This thesis therefore investigates the fusion of multimodal articulatory data. A framework is described for automatically acquiring and fusing a large database of articulatory data, including: 2D ultrasound (US) data to recover the dynamics of the tongue, stereovision data to recover the 3D dynamics of the lips, electromagnetic sensors that provide the 3D positions of points on the face and the tongue, and 3D Magnetic Resonance Imaging (MRI) that depicts the vocal tract for various sustained articulations. We investigate the problems of temporal synchronization and spatial registration between all these modalities, as well as the extraction of the articulators' shapes from the data (tongue tracking in US images). We evaluate the uncertainty of our system by quantifying the spatial and temporal inaccuracies of its components, both individually and in combination. Finally, the fused data are evaluated on an existing articulatory model to assess their quality for an application in speech production.
Chesnel, Anne-Lise. "Quantification de dégâts sur le bâti liés aux catastrophes majeures par images satellite multimodales très haute résolution". Phd thesis, École Nationale Supérieure des Mines de Paris, 2008. http://pastel.archives-ouvertes.fr/pastel-00004211.
Full text source
Boscaro, Anthony. "Analyse multimodale et multicritères pour l'expertise et la localisation de défauts dans les composants électriques modernes". Thesis, Bourgogne Franche-Comté, 2017. http://www.theses.fr/2017UBFCK014/document.
Full text source
This manuscript presents research work addressing the processing of data stemming from defect localization techniques. As this step is decisive in the failure-analysis process, experts have to harness data coming from light-emission and laser techniques. Nevertheless, the analysis process is sequential and depends solely on the expert's decision, which leads to an unquantified probability of localization. To solve these issues, a multimodal and multicriteria analysis has been developed, taking advantage of the heterogeneous and complementary nature of light-emission and laser-probing techniques. This process is based on advanced tools such as signal/image processing and data fusion, the final aim being to provide quantitative and qualitative decision support for the experts. The first part of this manuscript is dedicated to the description of the entire process for 1D and 2D data enhancement. Thereafter, the spatio-temporal analysis of laser-probing waveforms is tackled. Finally, the last part highlights the decision support brought by data fusion.
Wang, Xin. "Gaze based weakly supervised localization for image classification : application to visual recognition in a food dataset". Electronic Thesis or Diss., Paris 6, 2017. http://www.theses.fr/2017PA066577.
Full text source
In this dissertation, we discuss how human gaze data can be used to improve the performance of weakly supervised learning models in image classification. The context of this work is the era of rapidly growing information technology, in which the data to analyze is also growing dramatically. Since the amount of data that humans can annotate cannot keep up with the amount of data itself, current well-developed supervised learning approaches may face bottlenecks in the future. In this context, the use of weak annotations for high-performance learning methods is worth studying. Specifically, we approach the problem from two angles. The first is to propose a less time-consuming annotation, human eye-tracking gaze, as an alternative to traditional, laborious annotations such as bounding boxes. The second is to integrate gaze annotation into a weakly supervised learning scheme for image classification, which benefits from the gaze annotation when inferring the regions containing the target object. A useful property of our model is that it exploits gaze only for training; the test phase is gaze-free, which further reduces the demand for annotations. The two aspects are connected in our models, which achieve competitive experimental results.
Wang, Xin. "Gaze based weakly supervised localization for image classification : application to visual recognition in a food dataset". Thesis, Paris 6, 2017. http://www.theses.fr/2017PA066577/document.
Full text source
Chen, Jianan. "Deep Learning Based Multimodal Retrieval". Electronic Thesis or Diss., Rennes, INSA, 2023. http://www.theses.fr/2023ISAR0019.
Full text source
Multimodal tasks play a crucial role in the progression towards general artificial intelligence (AI). The primary goal of multimodal retrieval is to employ machine learning algorithms to extract relevant semantic information, bridging the gap between modalities such as visual images, linguistic text, and other data sources. It is worth noting that the information entropy associated with heterogeneous data carrying the same high-level semantics varies significantly, which poses a major challenge for multimodal models. Deep learning-based multimodal network models provide an effective way to tackle the difficulties arising from these substantial differences in information entropy. They exhibit impressive accuracy and stability in large-scale cross-modal information-matching tasks, such as image-text retrieval, and demonstrate strong transfer-learning capabilities, enabling a model well trained on one multimodal task to be fine-tuned and applied to a new multimodal task, even in few-shot or zero-shot scenarios. In our research, we develop a novel generative multimodal multi-view database specifically designed for the multimodal referring segmentation task. Additionally, we establish a state-of-the-art (SOTA) benchmark and a multi-view metric for referring expression segmentation models in the multimodal domain. The results of our comparative experiments are presented visually, providing clear and comprehensive insights.
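Image-text retrieval of the kind discussed above is typically implemented by embedding both modalities into a shared space and ranking gallery items by cosine similarity to the query. A minimal sketch with hypothetical, pre-computed embeddings (the encoders themselves, which the thesis builds with deep networks, are out of scope here):

```python
import numpy as np

def retrieve(query_emb, gallery_embs, k=2):
    """Rank gallery items by cosine similarity to the query embedding,
    returning the indices and scores of the top-k matches."""
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    sims = g @ q
    order = np.argsort(-sims)[:k]
    return order, sims[order]

# Hypothetical joint-space embeddings: 3 images, 1 text query.
images = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
text_query = np.array([0.6, 0.8])
idx, sims = retrieve(text_query, images)
assert idx[0] == 2   # the image closest in direction is ranked first
```

Real systems differ mainly in how the joint space is learned (contrastive or ranking losses over paired image-text data); the retrieval step itself reduces to this nearest-neighbour search.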
Guillaumin, Matthieu. "Données multimodales pour l'analyse d'image". Phd thesis, Grenoble, 2010. http://www.theses.fr/2010GRENM048.
Full text source
This dissertation delves into the use of textual metadata for image understanding. We seek to exploit this additional textual information as weak supervision to improve the learning of recognition models. There is recent and growing interest in methods that exploit such data because they can potentially alleviate the need for manual annotation, which is a costly and time-consuming process. We focus on two types of visual data with associated textual information. First, we exploit news images that come with descriptive captions to address several face-related tasks, including face verification, the task of deciding whether two images depict the same individual, and face naming, the problem of associating the faces in a data set with their correct names. Second, we consider data consisting of images with user tags. We explore models for automatically predicting tags for new images, i.e. image auto-annotation, which can also be used for keyword-based image search. We also study a multimodal semi-supervised learning scenario for image categorisation, in which the tags are assumed to be present in both labelled and unlabelled training data but absent from the test data. Our work builds on the observation that most of these tasks can be solved if perfectly adequate similarity measures are used. We therefore introduce novel approaches involving metric learning, nearest-neighbour models and graph-based methods to learn task-specific similarities from the visual and textual data. For faces, our similarities focus on the identities of the individuals, while for images they address more general semantic visual concepts. Experimentally, our approaches achieve state-of-the-art results on several standard and challenging data sets. On both types of data, we clearly show that learning with additional textual information improves the performance of visual recognition systems.
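The nearest-neighbour models mentioned above can be illustrated for tag prediction: a new image inherits tag scores from its visually closest training images, with closer neighbours weighing more. A generic, hedged sketch with toy data (the thesis learns the distance metric itself, which this sketch does not):

```python
import numpy as np

def predict_tags(x, train_feats, train_tags, k=3):
    """Score each tag for query features x as a distance-weighted vote
    over the k visually nearest training images."""
    d = np.linalg.norm(train_feats - x, axis=1)
    nn = np.argsort(d)[:k]
    w = np.exp(-d[nn])                          # closer neighbours weigh more
    scores = (w[:, None] * train_tags[nn]).sum(axis=0)
    return scores / w.sum()

# Toy example: 4 training images, 2 binary tags ("beach", "city").
feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
tags = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
scores = predict_tags(np.array([0.05, 0.02]), feats, tags, k=2)
assert scores[0] > scores[1]   # "beach" wins near the beach cluster
```

Metric learning replaces the plain Euclidean distance here with one trained so that images sharing tags are close, which is where most of the reported gains come from.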
Guillaumin, Matthieu. "Données multimodales pour l'analyse d'image". Phd thesis, Grenoble, 2010. http://tel.archives-ouvertes.fr/tel-00522278/en/.
Full text source
Guo, Yan. "Perception multimodale pour un robot mobile en milieu marin". Phd thesis, Université Pierre et Marie Curie - Paris VI, 2011. http://tel.archives-ouvertes.fr/tel-00637552.
Full text source
Istrate, Dan. "Contribution à l'analyse de l'environnement sonore et à la fusion multimodale pour l'identification d'activités dans le cadre de la télévigilance médicale". Habilitation à diriger des recherches, Université d'Evry-Val d'Essonne, 2011. http://tel.archives-ouvertes.fr/tel-00790339.
Full text source
Rabhi, Sara. "Optimized deep learning-based multimodal method for irregular medical timestamped data". Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAS003.
Full text source
The wide adoption of Electronic Health Records in hospital information systems has led to large databases grouping various types of data, such as textual notes, longitudinal medical events, and tabular patient information. However, the records are only filled in during consultations or hospital stays, whose frequency depends on the patient's state and on local habits. A system that can leverage the different types of data collected at different time scales is critical for reconstructing the patient's health trajectory, analyzing their history, and consequently delivering more suitable care. This thesis addresses two main challenges of medical data processing: learning to represent sequences of medical observations with irregular elapsed time between consecutive visits, and optimizing the extraction of medical events from clinical notes. Our main goal is to design a multimodal representation of the patient's health trajectory to solve clinical prediction problems. Our first work built a framework for modeling irregular medical time series, in order to evaluate the importance of considering the time gaps between medical episodes when representing a patient's health trajectory. To that end, we conducted a comparative study of sequential neural networks and irregular time-representation techniques. The clinical objective was to predict retinopathy complications for type 1 diabetes patients in the French database CaRéDIAB (Champagne Ardenne Réseau Diabetes), using their history of HbA1c measurements. The study showed that an attention-based model combined with a soft one-hot representation of time gaps led to an AUROC score of 88.65% (specificity of 85.56%, sensitivity of 83.33%), an improvement of 4.3% over the LSTM-based model. Motivated by these results, we extended our framework to shorter multivariate time series and predicted in-hospital mortality for critical-care patients of the MIMIC-III dataset. The proposed architecture, HiTT, improved the AUC score by 5% over the Transformer baseline. In a second step, we focused on extracting relevant medical information from clinical notes to enrich patients' health trajectories. Transformer-based architectures in particular have shown encouraging results in medical information-extraction tasks. However, these complex models require a large annotated corpus, a requirement that is hard to meet in the medical field since it necessitates access to private patient data and highly qualified expert annotators. To reduce annotation cost, we explored active learning strategies, which have been shown to be effective in tasks such as text classification, information extraction, and speech recognition. In addition to existing methods, we defined a Hybrid Weighted Uncertainty Sampling active learning strategy that takes advantage of the contextual embeddings learned by the Transformer-based approach to measure the representativeness of samples. A simulated study using the i2b2-2010 challenge dataset showed that our proposed metric reduces the annotation cost by 70% relative to passive learning for the same score. Lastly, we combined multivariate medical time series and medical concepts extracted from the clinical notes of the MIMIC-III database to train a multimodal Transformer-based architecture. Test results on the in-hospital mortality task showed an improvement of 5.3% when the additional text data were considered. This thesis thus contributes to patient health-trajectory representation by alleviating the burden of episodic medical records and of the manual annotation of free-text notes.
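The "soft one-hot" representation of time gaps mentioned above can be illustrated as follows: instead of assigning an elapsed time to a single bin, the gap receives graded membership in neighbouring bins. This is a sketch under our own binning and temperature assumptions, not the exact parametrisation used in the thesis:

```python
import numpy as np

def soft_one_hot(gap_days, centres, tau=30.0):
    """Encode an elapsed-time gap as soft memberships over bins whose
    centres (in days) are given; tau controls the softness."""
    d = -np.abs(gap_days - np.asarray(centres, dtype=float)) / tau
    w = np.exp(d - d.max())        # numerically stable softmax
    return w / w.sum()

centres = [7, 30, 90, 180, 365]    # hypothetical bin centres (days)
enc = soft_one_hot(45, centres)
assert enc.argmax() == 1           # 45 days is closest to the 30-day bin
assert abs(enc.sum() - 1.0) < 1e-9 # a proper distribution over bins
```

Compared with a hard one-hot encoding, nearby bins keep non-zero weight, so a 45-day gap and a 50-day gap produce similar vectors; this smoothness is what lets an attention-based model exploit irregular visit timing.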
Hannachi, Ammar. "Imagerie multimodale et planification interactive pour la reconstruction 3D et la métrologie dimensionnelle". Thesis, Strasbourg, 2015. http://www.theses.fr/2015STRAD024/document.
Full text source
The production of industrially manufactured parts generates very large amounts of data of various types, defining both the manufactured geometries and the quality of production. This PhD work was carried out within the framework of a cognitive vision system dedicated to the 3D evaluation of manufactured objects, possibly including free-form surfaces, taking geometric tolerances and uncertainties into account. The system allows comprehensive control of manufactured parts and provides the means for their automated 3D dimensional inspection. The implementation of a multi-sensor (passive and active) measuring system significantly improved assessment quality through an enriched three-dimensional reconstruction of the object to be evaluated. Specifically, we simultaneously used a stereoscopic vision system and a structured-light system in order to reconstruct the edges and surfaces of various 3D objects.
Meseguer, Brocal Gabriel. "Multimodal analysis : informed content estimation and audio source separation". Electronic Thesis or Diss., Sorbonne université, 2020. http://www.theses.fr/2020SORUS111.
Full text source
This dissertation studies multimodal learning in the context of musical signals. Throughout, we focus on the interaction between audio signals and text information. Among the many text sources related to music that could be used (e.g. reviews, metadata, or social network feedback), we concentrate on lyrics. The singing voice directly connects the audio signal and the text information in a unique way, combining melody and lyrics, where a linguistic dimension complements the abstraction of musical instruments. Our study focuses on the audio-lyrics interaction for targeted source separation and informed content estimation. Real-world stimuli are produced by complex phenomena and their constant interaction across various domains. Understanding them requires useful abstractions that fuse different modalities into a joint representation. Multimodal learning describes methods that analyse phenomena from different modalities, and their interaction, in order to tackle complex tasks; this results in better and richer representations that improve the performance of current machine learning methods. To develop our multimodal analysis, we first need to address the lack of data containing singing voice with aligned lyrics, which is indispensable for developing our ideas. We therefore investigate how to create such a dataset automatically by leveraging resources from the World Wide Web. Creating this type of dataset is a challenge in itself that raises many research questions, and we constantly face the classic "chicken or egg" problem: acquiring and cleaning the data requires accurate models, but it is difficult to train models without data. We propose to use the teacher-student paradigm to develop a method in which dataset creation and model learning are seen not as independent tasks but as complementary efforts. In this process, non-expert karaoke annotations describe the lyrics as a sequence of time-aligned notes with their associated textual information. We link each annotation to the correct audio track and globally align the annotations to it. For this purpose, we automatically compute the normalized cross-correlation between the voice annotation sequence and a singing-voice probability vector obtained using a deep convolutional neural network. Using the collected data, we progressively improve the model; every time we obtain an improved version, we can in turn correct and enhance the data.
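The global alignment step can be illustrated by a normalized cross-correlation lag search between a binary annotation sequence and a per-frame singing-voice probability vector. This is a toy sketch: the thesis obtains the probability vector from a deep convolutional network, whereas here both signals are synthetic:

```python
import numpy as np

def best_lag(annotation, voice_prob, max_lag=50):
    """Return the circular shift (in frames) that maximises the
    normalized cross-correlation between the two sequences."""
    a = (annotation - annotation.mean()) / annotation.std()
    v = (voice_prob - voice_prob.mean()) / voice_prob.std()
    lags = list(range(-max_lag, max_lag + 1))
    scores = [np.correlate(np.roll(a, lag), v)[0] for lag in lags]
    return lags[int(np.argmax(scores))]

# Toy signals: the annotation is the voice-activity curve shifted by 7 frames.
voice = np.zeros(200)
voice[60:120] = 1.0                  # singing-voice probability (idealised)
annot = np.roll(voice, 7)            # annotation arrives 7 frames late
assert best_lag(annot, voice) == -7  # shifting back by 7 frames realigns them
```

Standardising both sequences before correlating makes the score invariant to their scale and offset, which is what "normalized" buys over a raw cross-correlation.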
Harrando, Ismail. "Representation, information extraction, and summarization for automatic multimedia understanding". Electronic Thesis or Diss., Sorbonne université, 2022. http://www.theses.fr/2022SORUS097.
Full text source
Whether on TV or on the internet, video content production is seeing an unprecedented rise. Not only is video the dominant medium for entertainment, it is also reckoned to be the future of education, information and leisure. Nevertheless, the traditional paradigm of multimedia management is incapable of keeping pace with the sheer volume of content created every day across disparate distribution channels. Routine tasks such as archiving, editing, content organization and retrieval thus become prohibitively costly for multimedia creators. On the user side, too, the amount of multimedia content published daily can be simply overwhelming; the need for shorter and more personalized content has never been more pronounced. To advance the state of the art on both fronts, our computers must achieve a certain level of multimedia understanding. In this research thesis, we tackle the multiple challenges of automatic media content processing and analysis along three axes: 1. Representing multimedia: with all its richness and variety, modeling and representing multimedia content is a challenge in itself. 2. Describing multimedia: the textual component of multimedia can be capitalized on to generate high-level descriptors, or annotations, for the content at hand. 3. Summarizing multimedia: we investigate the possibility of extracting highlights from media content, both for narrative-focused summarization and for maximising memorability.
Ouenniche, Kaouther. "Multimodal deep learning for audiovisual production". Electronic Thesis or Diss., Institut polytechnique de Paris, 2023. http://www.theses.fr/2023IPPAS020.
Full text source
Within the dynamic landscape of television content, the need to automate the indexing and organization of archives has emerged as a paramount objective. In response, this research explores the use of deep learning techniques to automate the extraction of diverse metadata from television archives, improving their accessibility and reuse. The first contribution concerns the classification of camera-motion types, a crucial aspect of content indexing since it allows video content to be efficiently categorized and retrieved according to the visual dynamics it exhibits. The proposed approach employs 3D convolutional neural networks with residual blocks, a technique inspired by action-recognition methods. A semi-automatic procedure for constructing a reliable camera-motion dataset from publicly available videos is also presented, minimizing the need for manual intervention. Additionally, a challenging evaluation dataset, comprising real-life videos shot with professional cameras at varying resolutions, underlines the robustness and generalization power of the proposed technique, which achieves an average accuracy of 94%. The second contribution centers on the demanding task of Video Question Answering (VideoQA). In this context, we explore the effectiveness of attention-based transformers for grounded multimodal learning. The challenge lies in bridging the gap between the visual and textual modalities while mitigating the quadratic complexity of transformer models. To address these issues, a novel framework is introduced that incorporates a lightweight transformer and a cross-modality module; this module leverages cross-correlation to enable reciprocal learning between text-conditioned visual features and video-conditioned textual features. Furthermore, an adversarial testing scenario with rephrased questions highlights the model's robustness and real-world applicability. Experimental results on benchmark datasets such as MSVD-QA and MSRVTT-QA validate the proposed methodology, with average accuracies of 45% and 42%, respectively, notable improvements over existing approaches. The third contribution addresses multimodal video captioning, another critical aspect of content indexing. The introduced framework incorporates a modality-attention module that captures the intricate relationships between visual and textual data using cross-correlation, while the integration of temporal attention enhances the model's ability to produce meaningful captions that account for the temporal dynamics of video content. Our work also incorporates an auxiliary task employing a contrastive loss function, which promotes model generalization and a deeper understanding of inter-modal relationships and underlying semantics. The use of a transformer architecture for encoding and decoding significantly enhances the model's capacity to capture interdependencies between text and video data. The methodology is validated through rigorous evaluation on the MSRVTT benchmark, achieving BLEU4, ROUGE, and METEOR scores of 0.4408, 0.6291 and 0.3082, respectively, and consistently outperforming state-of-the-art methods, with gains ranging from 1.21% to 1.52% across the three metrics. In conclusion, this manuscript offers a holistic exploration of deep learning techniques for automating television content indexing, addressing the labor-intensive and time-consuming nature of manual indexing. The contributions encompass camera-motion classification, VideoQA, and multimodal video captioning, collectively advancing the state of the art and providing valuable insights for researchers in the field. These findings not only have practical applications for content retrieval and indexing but also contribute to the broader advancement of deep learning methodologies in the multimodal context.
Francis, Danny. "Représentations sémantiques d'images et de vidéos". Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS605.
Full text source
Recent research in deep learning has sent the quality of results in multimedia tasks rocketing: thanks to new big datasets of annotated images and videos, Deep Neural Networks (DNNs) have outperformed other models in most cases. In this thesis, we aim at developing DNN models for automatically deriving semantic representations of images and videos. In particular, we focus on two main tasks: vision-text matching and automatic image/video captioning. The matching task can be addressed by comparing visual objects and texts in a visual space, a textual space or a multimodal space. Building on recent work on capsule networks, we define two novel models for the vision-text matching problem: Recurrent Capsule Networks and Gated Recurrent Capsules. Image and video captioning is a challenging task in which a visual object has to be analyzed and translated into a textual description in natural language. For that purpose, we propose two novel curriculum learning methods. Regarding video captioning, moreover, analyzing videos requires not only parsing still images but also drawing correspondences through time. We therefore propose a novel Learned Spatio-Temporal Adaptive Pooling method for video captioning that combines spatial and temporal analysis. Extensive experiments on standard datasets assess the interest of our models and methods with respect to existing works.
Prévost, Clémence. "Multimodal data fusion by coupled low-rank tensor approximations". Electronic Thesis or Diss., Université de Lorraine, 2021. http://www.theses.fr/2021LORR0180.
Pełny tekst źródłaDue to the recent emergence of new modalities, the amount of signals collected daily has been increasing. As a result, it frequently occurs that various signals provide information about the same phenomenon. However, a single signal may only contain partial information about this phenomenon. Multimodal data fusion was proposed to overcome this issue. It is defined as joint processing of datasets acquired from different modalities. The aim of data fusion is to enhance the capabilities of each modality to express their specific information about the phenomenon of interest; it is also expected from data fusion that it brings out additional information that would be ignored by separate processing. However, due to the complex interactions between the modalities, understanding the advantages and limits of data fusion may not be straightforward.In a lot of applications such as biomedical imaging or remote sensing, the observed signals are three-dimensional arrays called tensors, thus tensor-based data fusion can be envisioned. Tensor low-rank modeling preserves the multidimensional structure of the observations and enjoys interesting uniqueness properties arising from tensor decompositions. In this work, we address the problem of recovering a high-resolution tensor from tensor observations with some lower resolutions.In particular, hyperspectral super-resolution (HSR) aims at reconstructing a tensor from two degraded versions. While one is degraded in two (spatial) modes, the second is degraded in the third (spectral) mode. Recently, tensor-based approaches were proposed for solving the problem at hand. These works are based on the assumption that the target tensor admits a given low-rank tensor decomposition. The first work addressing the problem of tensor-based HSR was based on a coupled canonical polyadic (CP) decomposition of the observations. 
This approach gave rise to numerous subsequent reconstruction methods based on coupled tensor models, including our work. The first part of this thesis is devoted to the design of tensor-based algorithms for solving the HSR problem. In Chapter 2, we propose to formulate the problem as a coupled Tucker decomposition. We introduce two simple but fast algorithms based on the higher-order singular value decomposition of the observations. Our experiments show that our algorithms achieve performance competitive with state-of-the-art tensor and matrix methods, at a lower computational cost. In Chapter 3, we consider spectral variability between the observations. We formulate the reconstruction problem as a coupled block-term decomposition. We impose non-negativity of the low-rank factors, so that they can be incorporated into a physically-informed mixing model. Thus the proposed approach provides a solution to the joint HSR and unmixing problems. The second part of this thesis addresses the performance analysis of the coupled tensor models. The aim of this part is to assess the efficiency of some of the algorithms introduced in the first part. In Chapter 4, we consider constrained Cramér-Rao lower bounds (CCRB) for coupled tensor CP models. We provide a closed-form expression for the constrained Fisher information matrix in two scenarios: i) where we only consider the fully-coupled reconstruction problem, or ii) where we are interested in comparing the performance of fully-coupled, partially-coupled and uncoupled approaches. We prove that the existing CP-based algorithms are asymptotically efficient. Chapter 5 addresses a non-standard estimation problem in which the constraints on the deterministic model parameters involve a random parameter. We show that in this case, the standard CCRB is a non-informative bound. As a result, we introduce a new randomly constrained Cramér-Rao bound (RCCRB). The relevance of the RCCRB is illustrated using a coupled block-term decomposition model accounting for random uncertainties.
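The HOSVD-based coupled Tucker idea summarized above (spatial subspaces estimated from the spatially sharp observation, the spectral subspace from the spectrally sharp one, then a core solved against the degradation model) can be illustrated with a minimal numpy sketch on noiseless toy data. All dimensions, ranks, and degradation operators below are invented for illustration; this is a sketch in the spirit of the Chapter 2 algorithms, not the thesis code:

```python
import numpy as np

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def mode_dot(T, M, mode):
    # multiply tensor T along `mode` by matrix M
    Tm = np.moveaxis(T, mode, 0)
    out = (M @ Tm.reshape(Tm.shape[0], -1)).reshape((M.shape[0],) + Tm.shape[1:])
    return np.moveaxis(out, 0, mode)

rng = np.random.default_rng(0)
# ground-truth tensor of multilinear rank (2, 2, 2)
G = rng.standard_normal((2, 2, 2))
A1 = rng.standard_normal((6, 2))
A2 = rng.standard_normal((6, 2))
A3 = rng.standard_normal((8, 2))
X = mode_dot(mode_dot(mode_dot(G, A1, 0), A2, 1), A3, 2)

# degradation operators: spatial (P1, P2) and spectral (Pm) responses
P1 = rng.standard_normal((3, 6))
P2 = rng.standard_normal((3, 6))
Pm = rng.standard_normal((3, 8))
Y_h = mode_dot(mode_dot(X, P1, 0), P2, 1)   # hyperspectral image: low spatial res
Y_m = mode_dot(X, Pm, 2)                    # multispectral image: low spectral res

# HOSVD-style factors: spatial subspaces from Y_m, spectral subspace from Y_h
U1 = np.linalg.svd(unfold(Y_m, 0))[0][:, :2]
U2 = np.linalg.svd(unfold(Y_m, 1))[0][:, :2]
U3 = np.linalg.svd(unfold(Y_h, 2))[0][:, :2]

# core tensor solved against the degraded hyperspectral observation
Gc = mode_dot(mode_dot(mode_dot(Y_h, np.linalg.pinv(P1 @ U1), 0),
                       np.linalg.pinv(P2 @ U2), 1), U3.T, 2)
X_hat = mode_dot(mode_dot(mode_dot(Gc, U1, 0), U2, 1), U3, 2)
rel_err = np.linalg.norm(X_hat - X) / np.linalg.norm(X)
```

In this noiseless toy setting the multilinear subspaces are recovered exactly, so the reconstruction is exact up to floating-point error; with noise, the truncated HOSVD factors become estimates and the core is solved in the least-squares sense.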
Haegelen, Claire. "Construction et validation d'une base de données multimodales pour la stimulation cérébrale profonde". Phd thesis, Université Rennes 1, 2014. http://tel.archives-ouvertes.fr/tel-01073108.
Pełny tekst źródłaDeep brain stimulation (DBS) is an effective treatment for patients with severely disabling Parkinson's disease (PD) refractory to medical treatments. DBS surgery consists of the accurate implantation of an electrode in a deep brain nucleus. The quality of the surgical planning can be improved by developing a multimodal database based on anatomical, clinical and electrophysiological data. The first step was to develop a specific magnetic resonance imaging (MRI) template of Parkinson's disease patients' anatomy, and to validate the segmentation of the 24 deep brain structures made on this template. Secondly, we focused on identifying optimum sites for subthalamic nucleus (STN) stimulation by studying symptomatic motor improvement along with neuropsychological side effects in 30 patients with PD. Each clinical score produced one anatomo-clinical atlas, associating the degree of improvement or worsening of the patient with its active contacts. We showed a discrepancy between good motor improvement and an inevitable deterioration of verbal fluencies when targeting the postero-superior region of the STN. Finally, we developed new statistical anatomo-clinical maps to better visualize the motor and neuropsychological consequences at 6 months of GPm stimulation in 20 patients with PD. These maps showed that GPm stimulation yields motor improvement without cognitive impairment. We also proposed a new, more lateral targeting of the GPm in PD, to account for the cortico-subcortical atrophy induced by the disease. Our goal is to use these statistical maps prospectively in further patients to improve their targeting, ensuring a shorter planning step on the day of surgery as well as better outcomes from both motor and neuropsychological points of view
Anzid, Hanan. "Fusion de données multimodales par combinaison de l’incertain et de modèles de perception". Thesis, Bourgogne Franche-Comté, 2019. http://www.theses.fr/2019UBFCK046.
Pełny tekst źródłaThe general idea is to jointly exploit multiple heterogeneous pieces of information about the same problem, tainted by imperfections and coming from several sources, in order to improve knowledge of a given situation. Appropriate visualization of the fused images, guided by the perceptual information carried by saliency maps, then aids decision making
Hafsi, Meriem. "Géo-détection des réseaux enterrés par fusion de données multimodales et raisonnement spatial". Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAA024/document.
Pełny tekst źródłaOur work aims to solve the problem of reliable detection of underground networks by aggregating the existing methods and reasoning at different abstraction levels. Four methods exist to identify underground pipelines, but each has limitations and depends on many factors. For that purpose, we must be able to provide an accurate geo-detection of underground networks regardless of their material, their function or the soil in which they are buried. The information collected in the field by these detection methods is merged in order to obtain a single accurate and reliable geo-detection result. To do so, we need to evaluate these distinct methods independently and then aggregate the information they provide. The first step thus consists of representing this information as symbolic knowledge. The second step is to overcome the limitations of current methods by providing a reliable and expressive reasoning system
Abdat, Faiza. "Reconnaissance automatique des émotions par données multimodales : expressions faciales et des signaux physiologiques". Thesis, Metz, 2010. http://www.theses.fr/2010METZ035S/document.
Pełny tekst źródłaThis thesis presents a generic method for automatic recognition of emotions from a bimodal system based on facial expressions and physiological signals. This approach leads to better extraction of information and is more reliable than a single modality. The proposed algorithm for facial expression recognition is based on the distance variation of facial muscles from the neutral state and on classification by means of Support Vector Machines (SVM). Emotion recognition from physiological signals is likewise based on the classification of statistical parameters by the same classifier. In order to obtain a more reliable recognition system, we have combined the facial expressions and physiological signals. The direct combination of such information is not trivial given the differences in characteristics (such as frequency, amplitude, variation, and dimensionality). To remedy this, we have merged the information at different levels of implementation. For feature-level fusion, we have tested the mutual information approach for selecting the most relevant features, and principal component analysis (PCA) to reduce their dimensionality. For decision-level fusion we have implemented two methods: the first based on a voting process and the second based on dynamic Bayesian networks. The optimal results were obtained with feature-level fusion based on PCA. These methods have been tested on a database developed in our laboratory from healthy subjects, with emotions induced using IAPS pictures. A self-assessment step was applied to all subjects in order to improve the annotation of the images used for induction. The obtained results have shown good performance even in the presence of inter-individual variability and day-to-day variability of emotional state
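The feature-level fusion retained in this work (concatenating the two modalities' features, then reducing with PCA before classification) can be sketched on synthetic data. Everything below is invented for illustration: the feature dimensions are arbitrary, and a nearest-centroid classifier stands in for the SVM:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 120
y = np.repeat([0, 1, 2], n // 3)                 # three emotion classes
# synthetic features: class-dependent shifts stand in for real measurements
face = rng.standard_normal((n, 12)) + y[:, None]          # facial-distance features
physio = rng.standard_normal((n, 30)) + 0.5 * y[:, None]  # physiological statistics

X = np.hstack([face, physio])                    # feature-level fusion by concatenation
X = (X - X.mean(0)) / X.std(0)                   # standardize before PCA
_, _, Vt = np.linalg.svd(X, full_matrices=False)
Z = X @ Vt[:10].T                                # keep 10 principal components

# nearest-centroid classifier as a lightweight stand-in for the SVM
centroids = np.stack([Z[y == c].mean(0) for c in range(3)])
pred = ((Z[:, None, :] - centroids) ** 2).sum(-1).argmin(1)
acc = (pred == y).mean()
```

The point of the sketch is the pipeline shape (concatenate, standardize, project, classify), not the classifier choice.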
Tochon, Guillaume. "Analyse hiérarchique d'images multimodales". Thesis, Université Grenoble Alpes (ComUE), 2015. http://www.theses.fr/2015GREAT100/document.
Pełny tekst źródłaThere is a growing interest in the development of adapted processing tools for multimodal images (several images acquired over the same scene with different characteristics). Allowing a more complete description of the scene, multimodal images are of interest in various image processing fields, but their optimal handling and exploitation raise several issues. This thesis extends hierarchical representations, a powerful tool for classical image analysis and processing, to multimodal images in order to better exploit the additional information brought by the multimodality and improve classical image processing techniques. This thesis focuses on three different multimodalities frequently encountered in the remote sensing field. We first investigate the spectral-spatial information of hyperspectral images. Based on an adapted construction and processing of the hierarchical representation, we derive a segmentation which is optimal with respect to the spectral unmixing operation. We then focus on the temporal multimodality and sequences of hyperspectral images. Using the hierarchical representation of the frames in the sequence, we propose a new method to achieve object tracking and apply it to chemical gas plume tracking in thermal infrared hyperspectral video sequences. Finally, we study the sensorial multimodality, i.e. images acquired with different sensors. Relying on the concept of braids of partitions, we propose a novel image segmentation methodology based on an energy minimization framework
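The kind of hierarchical representation this abstract builds on can be illustrated with a toy binary partition tree: starting from single-pixel regions, the two most similar adjacent regions are merged repeatedly, and the recorded merge sequence is the hierarchy. This is a generic sketch on an invented 4×4 image, not the thesis's construction:

```python
import numpy as np

img = np.array([[0, 0, 9, 9],
                [0, 1, 9, 8],
                [5, 5, 9, 9],
                [5, 4, 9, 9]], dtype=float)

h, w = img.shape
# each pixel starts as its own region; track pixel lists and region means
regions = {i: {"pixels": [divmod(i, w)], "mean": img.flat[i]} for i in range(h * w)}
adj = {i: set() for i in regions}
for r in range(h):
    for c in range(w):
        i = r * w + c
        if c + 1 < w:
            adj[i].add(i + 1); adj[i + 1].add(i)
        if r + 1 < h:
            adj[i].add(i + w); adj[i + w].add(i)

merges = []            # the binary partition tree, as a merge sequence
next_id = h * w
while len(regions) > 1:
    # greedily merge the most similar pair of adjacent regions
    a, b = min(((a, b) for a in adj for b in adj[a] if a < b),
               key=lambda p: abs(regions[p[0]]["mean"] - regions[p[1]]["mean"]))
    pix = regions[a]["pixels"] + regions[b]["pixels"]
    regions[next_id] = {"pixels": pix,
                        "mean": sum(img[p] for p in pix) / len(pix)}
    adj[next_id] = (adj[a] | adj[b]) - {a, b}
    for nbr in adj[next_id]:
        adj[nbr] -= {a, b}
        adj[nbr].add(next_id)
    merges.append((a, b, next_id))
    del regions[a], regions[b], adj[a], adj[b]
    next_id += 1
```

Cutting this tree at different levels yields coarser or finer partitions; the thesis's contribution is in how such hierarchies are constructed and processed for multimodal data, which this sketch does not attempt to reproduce.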
Muliukov, Artem. "Étude croisée des cartes auto-organisatrices et des réseaux de neurones profonds pour l'apprentissage multimodal inspiré du cerveau". Electronic Thesis or Diss., Université Côte d'Azur, 2024. https://intranet-theses.unice.fr/2024COAZ4008.
Pełny tekst źródłaCortical plasticity is one of the main features enabling our capability to learn and adapt to our environment. Indeed, the cerebral cortex has the ability to self-organize through two distinct forms of plasticity: structural plasticity and synaptic plasticity. These mechanisms are very likely at the basis of an extremely interesting characteristic of human brain development: multimodal association. The brain uses spatio-temporal correlations between several modalities to structure the data and make sense of observations. Moreover, biological observations show that one modality can activate the internal representation of another modality when both are correlated. To model such behavior, Edelman and Damasio respectively proposed the Reentry and the Convergence Divergence Zone frameworks, in which bi-directional neural communications can lead both to multimodal fusion (convergence) and to inter-modal activation (divergence). Nevertheless, these frameworks do not provide a computational model at the neuron level, and only a few works tackle this issue of bio-inspired multimodal association, which is nonetheless necessary for a complete representation of the environment, especially when targeting autonomous and embedded intelligent systems. In this doctoral project, we propose to pursue the exploration of brain-inspired computational models of self-organization for multimodal unsupervised learning in neuromorphic systems. These neuromorphic architectures derive their energy efficiency from the bio-inspired models they support, and for that reason we only consider learning rules based on local and distributed processing
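The self-organizing map named in this thesis's title ("cartes auto-organisatrices") can be sketched in a few lines of numpy. The map size, decay schedules, and data are arbitrary choices for illustration; the thesis's multimodal coupling of such maps is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(4)
data = rng.random((500, 3))              # stand-in feature vectors for one modality
grid = 8                                 # 8x8 map of neurons
W = rng.random((grid, grid, 3))          # neuron weight vectors (the "map")
ii, jj = np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij")

for t, x in enumerate(data):
    lr = 0.5 * np.exp(-t / 200)          # decaying learning rate
    sigma = 3.0 * np.exp(-t / 200)       # shrinking neighborhood radius
    d = ((W - x) ** 2).sum(-1)
    bi, bj = np.unravel_index(d.argmin(), d.shape)   # best-matching unit
    nb = np.exp(-((ii - bi) ** 2 + (jj - bj) ** 2) / (2 * sigma ** 2))
    W += lr * nb[..., None] * (x - W)    # pull the neighborhood toward the sample
```

Each update is purely local (a winner and its grid neighborhood), which is the property that makes this family of rules attractive for the neuromorphic architectures the abstract mentions.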
Leroy, Philippe. "Traitement des données en pharmacocinétique". Paris 5, 1988. http://www.theses.fr/1988PA05P177.
Pełny tekst źródłaBosc, Marcel. "Contribution à la détection de changements dans des séquences IRM 3D multimodales". Phd thesis, Université Louis Pasteur - Strasbourg I, 2003. http://tel.archives-ouvertes.fr/tel-00005163.
Pełny tekst źródłaLecomte, Gwenaële. "Analyse d'images radioscopiques et fusion d'informations multimodales pour l'amélioration du contrôle de pièces de fonderie". Lyon, INSA, 2005. http://theses.insa-lyon.fr/publication/2005ISAL0128/these.pdf.
Pełny tekst źródłaWithin the framework of the 5th European Framework Programme (PCRD), a non-destructive control machine was developed to inspect casting samples by merging three techniques: radioscopy, spectrometry and vibration analysis. We present in this report the image processing, based on morphological top-hat and hysteresis filters. Features are automatically extracted to classify detected objects as defects or false alarms. A defect confidence index is calculated from three features and gives good classification performance on the 684 analysed images. Thanks to the explicit geometric model developed for the X-ray control system, the detected objects are matched with objects from the three other images, taken with other sample orientations. The three non-destructive techniques are fused with the Dempster-Shafer theory, which accounts for ignorance in the information. The fusion is done in three steps: first between radioscopic detections, secondly between radioscopic and spectrometric objects, and finally at the sample level with the vibration analysis. For each control, confidence level estimations are presented at the detected object level, at the control volume level and at the sample level, respectively. The frame of discernment is adapted for each step. Results show that fusing radioscopic detections increases the confidence in the defect hypothesis. The spectrometry and vibration techniques must be improved to supply reliable information
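Dempster's rule of combination, the fusion operator used here, can be sketched directly: mass functions over a two-hypothesis frame {defect, false alarm}, with mass assigned to the whole frame encoding ignorance. The numerical masses below are invented for illustration:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule for two mass functions with frozenset focal elements."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb          # mass assigned to incompatible pairs
    # renormalize by the non-conflicting mass
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

D, F = frozenset({"defect"}), frozenset({"false_alarm"})
theta = D | F                            # frame of discernment: models ignorance
m_xray = {D: 0.6, F: 0.1, theta: 0.3}    # radioscopy: fairly confident defect
m_spec = {D: 0.4, F: 0.2, theta: 0.4}    # spectrometry: weakly supports defect
fused = dempster_combine(m_xray, m_spec)
```

Because both sources lean toward "defect", the fused mass on that hypothesis exceeds either input, mirroring the abstract's observation that fusing radioscopic detections raises the confidence in the defect hypothesis.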
Medjahed, Hamid. "Identification de situation de détresse par la fusion de données multimodales pour la télévigilance médicale à domicile". Phd thesis, Institut National des Télécommunications, 2010. http://tel.archives-ouvertes.fr/tel-00541876.
Pełny tekst źródłaXu, Hao. "Estimation statistique d'atlas probabiliste avec les données multimodales et son application à la segmentation basée sur l'atlas". Phd thesis, Ecole Polytechnique X, 2014. http://pastel.archives-ouvertes.fr/pastel-00969176.
Pełny tekst źródłaMerroun, Omar. "Traitement à grand échelle des données symboliques". Paris 9, 2011. http://www.theses.fr/2011PA090027.
Pełny tekst źródłaSymbolic Data Analysis (SDA) proposes a generalization of classical Data Analysis (AD) methods using complex data (intervals, sets, histograms). These methods define high level and complex operators for symbolic data manipulation. Furthermore, recent implementations of the SDA model are not able to process large data volumes. According to the classical design of massive data computation, we define a new data model to represent and process symbolic data using algebraic operators that are minimal and closed by composition. We give some query samples to emphasize the expressiveness of our model. We implement this algebraic model, called LS-SODAS, and we define the language XSDQL to express queries for symbolic data manipulation. Two cases of study are provided in order to show the potential of XSDQL langage expressiveness and the data processing scalability
Touati, Mustafa. "Contribution géostatistique au traitement des données sismiques". Paris, ENMP, 1996. http://www.theses.fr/1996ENMP0617.
Pełny tekst źródłaDujardin, Bénédicte. "Approximation rationnelle appliquée au traitement de données". Nice, 2005. http://www.theses.fr/2005NICE4106.
Pełny tekst źródłaIn this document, we are concerned with different problems arising from mathematics and date processing whose common point is to involve polynomials with random coefficients, the study of which composes exclusively the material of the first chapter. In spectral analysis, the use of linear parametric models of a signal leads to rational estimators of its power spectrum density. We are interested in the AR and ARMA estimators of certain stochastic processes and characterize their performance in terms of the statistics of their complex poles and zeros. Our understanding of the role played by the random component of the signal is made easier by a preliminary part devoted to rational Padé approximants of randomly perturbed formal series. This first part provides us with the opportunity to underline some recurring phenomena related to the perturbation such as the matching of poles and zeros or the formation of crystal structures
Franchi, Gianni. "Machine learning spatial appliquée aux images multivariées et multimodales". Thesis, Paris Sciences et Lettres (ComUE), 2016. http://www.theses.fr/2016PSLEM071/document.
Pełny tekst źródłaThis thesis focuses on multivariate spatial statistics and machine learning applied to hyperspectral and multimodal images in remote sensing and scanning electron microscopy (SEM). The following topics are considered. Fusion of images: SEM allows us to acquire images from a given sample using different modalities. The purpose of these studies is to analyze the value of information fusion for improving multimodal SEM image acquisition. We have modeled and implemented various image fusion techniques, based in particular on spatial regression theory, and assessed them on various datasets. Spatial classification of multivariate image pixels: we have proposed a novel approach for pixel classification in multi/hyper-spectral images. The aim of this technique is to represent and efficiently describe the spatial/spectral features of multivariate images. These multi-scale deep descriptors aim at representing the content of the image while considering invariances related to the texture and to its geometric transformations. Spatial dimensionality reduction: we have developed a technique to extract a feature space using morphological principal component analysis. Indeed, in order to take into account the spatial and structural information, we used mathematical morphology operators
Courtial, Nicolas. "Fusion d’images multimodales pour l’assistance de procédures d’électrophysiologie cardiaque". Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S015.
Pełny tekst źródłaCardiac electrophysiology procedures have proved effective in suppressing arrhythmia and heart failure symptoms. Their success rate depends on knowledge of the patient's heart condition, including the electrical and mechanical functions and tissue quality, which is a major clinical concern for these therapies. This work focuses on the development of patient-specific multimodal models to plan and assist radiofrequency ablation (RFA) and cardiac resynchronization therapy (CRT). First, segmentation, registration and fusion methods were developed to create these models, allowing these interventional procedures to be planned. For each therapy, specific means of integration within the surgical room were established for assistance purposes. Finally, a new multimodal descriptor was synthesized in a post-procedure analysis, aiming to predict the response to CRT depending on the left ventricular stimulation site. These studies were applied and validated on patients who were candidates for CRT and RFA. They showed the feasibility and interest of integrating such multimodal models into the clinical workflow to assist these procedures
Desseroit, Marie-Charlotte. "Caractérisation et exploitation de l'hétérogénéité intra-tumorale des images multimodales TDM et TEP". Thesis, Brest, 2016. http://www.theses.fr/2016BRES0129/document.
Pełny tekst źródłaPositron emission tomography (PET) / computed tomography (CT) multi-modality imaging is the most commonly used imaging technique to diagnose and monitor patients in oncology. PET/CT images provide a global tissue density description (CT images) and a characterization of tumor metabolic activity (PET images). Further analysis of these images, acquired in clinical routine, supplies additional data regarding patient survival or treatment response. All these new data allow the tumor phenotype to be described and are generally grouped under the generic name of radiomics. Nevertheless, the number of shape descriptors and texture features characterizing tumors has significantly increased in recent years, and these parameters can be sensitive to the extraction method and to the imaging modality. During this thesis, the variability of parameters computed on PET and CT images was assessed thanks to a test-retest cohort: for each patient, two sets of PET/CT images, acquired under the same conditions but a few minutes apart, were available. Parameters classified as reliable after this analysis were exploited for survival analysis of patients with non-small cell lung cancer (NSCLC). The construction of a prognostic model with those metrics first allowed the complementarity of PET and CT texture features to be studied. However, this nomogram was generated by simply adding risk factors, not with a robust multi-parametric analysis method. In the second part, the same data were exploited to build a prognostic model using the support vector machine (SVM) algorithm. The models thus generated were then tested on a prospective cohort currently being recruited, to obtain preliminary results regarding the robustness of those nomograms
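The test-retest reliability screening described above can be sketched with a Bland-Altman style check on one synthetic radiomic feature. The cohort size, noise levels, and the 10% reliability threshold are invented for illustration, not the thesis's actual criteria:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 40                                    # patients in the test-retest cohort
true_val = rng.normal(10.0, 2.0, n)       # underlying feature value per patient
test = true_val + rng.normal(0, 0.2, n)   # first acquisition
retest = true_val + rng.normal(0, 0.2, n) # second acquisition, minutes later

# Bland-Altman style reproducibility of one feature
diff = retest - test
mean_pair = (retest + test) / 2
bias = diff.mean()
loa = 1.96 * diff.std(ddof=1)             # limits of agreement
# e.g. call the feature reliable if the limits stay within +/-10% of the mean
reliable = loa / mean_pair.mean() < 0.10
```

Features failing such a check would be excluded before building any prognostic model, since their variation between two identical acquisitions already exceeds what a model could meaningfully exploit.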
Fliti, Tamim. "Le problème SAT : traitement dynamique et données minimales". Aix-Marseille 2, 1997. http://www.theses.fr/1997AIX22015.
Pełny tekst źródłaBaby, Jean-François. "Le traitement des données spatialisées par stations geomatiques". Aix-Marseille 2, 1991. http://www.theses.fr/1991AIX23005.
Pełny tekst źródłaThe development of up-to-date computer aids now enable us to write down a new territorial geography, just as the increasing development in data banks offer new prospects in diffusing geographical messages our approach was pragmatic. We tried to put into practice a new method to process spatial data to our first experiment with the town planning department at the town hal in nice, involved establishing a cartographical data bank. Our second experiment is a larger department scale and is being carried at the cci nice-cote-d'azur both experiments have provided confirmation of our choice in "geomatic" computer aids, but furthemore to have the necessary tools in hans, and thesefore a complete step-by-step range of our data, from acquisition to distribution, is of primary importance
Macina, Abdoul. "Traitement de requêtes SPARQL sur des données liées". Thesis, Université Côte d'Azur (ComUE), 2018. http://www.theses.fr/2018AZUR4230/document.
Pełny tekst źródłaDriven by the Semantic Web standards, an increasing number of RDF data sources are published and connected over the Web by data providers, leading to a large distributed linked data network. However, exploiting the wealth of these data sources is very challenging for data consumers considering the data distribution, their volume growth and data sources autonomy. In the Linked Data context, federation engines allow querying these distributed data sources by relying on Distributed Query Processing (DQP) techniques. Nevertheless, a naive implementation of the DQP approach may generate a tremendous number of remote requests towards data sources and numerous intermediate results, thus leading to costly network communications. Furthermore, the distributed query semantics is often overlooked. Query expressiveness, data partitioning, and data replication are other challenges to be taken into account. To address these challenges, we first proposed in this thesis a SPARQL and RDF compliant Distributed Query Processing semantics which preserves the SPARQL language expressiveness. Afterwards, we presented several strategies for a federated query engine that transparently addresses distributed data sources, while managing data partitioning, query results completeness, data replication, and query processing performance. We implemented and evaluated our approach and optimization strategies in a federated query engine to prove their effectiveness
Barhoumi, Mohamed Adel. "Traitement des données manquantes dans les données de panel : cas des variables dépendantes dichotomiques". Thesis, Université Laval, 2006. http://www.theses.ulaval.ca/2006/23619/23619.pdf.
Pełny tekst źródłaBuchholz, Bert. "Abstraction et traitement de masses de données 3D animées". Phd thesis, Télécom ParisTech, 2012. http://pastel.archives-ouvertes.fr/pastel-00958339.
Pełny tekst źródłaBuchholz, Bert. "Abstraction et traitement de masses de données 3D animées". Electronic Thesis or Diss., Paris, ENST, 2012. http://www.theses.fr/2012ENST0080.
Pełny tekst źródłaIn this thesis, we explore intermediate structures and their relationship to the employed algorithms in the context of photorealistic (PR) and non-photorealistic (NPR) rendering. We present new structures for rendering as well as new uses for existing structures, with three original contributions in the NPR and PR domains. First, we present binary shading, a method to generate stylized black-and-white images, inspired by comic artists, using appearance and geometry in a graph-based energy formulation. The user can control the algorithm to generate images of different styles and representations. The second work allows the temporally coherent parameterization of line animations for texturing purposes. We introduce a spatio-temporal structure over the input data and an energy formulation for a globally optimal parameterization. As in the work on binary shading, the energy formulation provides important yet simple control over the output. Finally, we present an extension to point-based global illumination, a method used extensively in movie production in recent years. Our work allows the data generated by the original algorithm to be compressed using quantization. It is memory-efficient and has only a negligible time overhead while enabling the rendering of larger scenes. The user can easily control the strength and quality of the compression. We also propose a number of possible extensions and improvements to the methods presented in the thesis
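The quantization-based compression contributed to point-based global illumination can be miniaturized as follows. The per-channel uniform quantizer below is a generic sketch, not the scheme actually used in the thesis:

```python
import numpy as np

rng = np.random.default_rng(3)
# stand-in for per-point shading data in a point-based GI cache
radiance = rng.gamma(2.0, 0.5, size=(10000, 3)).astype(np.float32)

def quantize(x, bits=8):
    """Uniform quantization to `bits` bits, storing the per-channel range."""
    lo, hi = x.min(0), x.max(0)
    q = np.round((x - lo) / (hi - lo) * (2 ** bits - 1)).astype(np.uint8)
    return q, lo, hi

def dequantize(q, lo, hi, bits=8):
    return q.astype(np.float32) / (2 ** bits - 1) * (hi - lo) + lo

q, lo, hi = quantize(radiance)
rec = dequantize(q, lo, hi)
rel_err = np.abs(rec - radiance).max() / (radiance.max() - radiance.min())
compression = radiance.nbytes / q.nbytes   # 4-byte floats -> 1 byte per channel
```

Trading 32-bit floats for 8-bit codes gives a fixed 4x memory reduction at a bounded, user-controllable error, which matches the memory/quality trade-off the abstract describes.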
Neumann, Markus. "Automatic multimodal real-time tracking for image plane alignment in interventional Magnetic Resonance Imaging". Phd thesis, Université de Strasbourg, 2014. http://tel.archives-ouvertes.fr/tel-01038023.
Pełny tekst źródłaVielzeuf, Valentin. "Apprentissage neuronal profond pour l'analyse de contenus multimodaux et temporels". Thesis, Normandie, 2019. http://www.theses.fr/2019NORMC229/document.
Pełny tekst źródłaOur perception is by nature multimodal, i.e. it appeals to many of our senses. To solve certain tasks, it is therefore relevant to use different modalities, such as sound or image. This thesis focuses on this notion in the context of deep learning, seeking to answer a particular question: how to merge the different modalities within a deep neural network? We first propose to study a concrete application problem: the automatic recognition of emotion in audio-visual content. This leads us to different considerations concerning the modeling of emotions, and more particularly of facial expressions; we thus propose an analysis of the representations of facial expression learned by a deep neural network. In addition, we observe that each multimodal problem appears to require a different fusion strategy. This is why we propose and validate two methods to automatically obtain an efficient fusion neural architecture for a given multimodal problem. The first is based on a central fusion network and aims at preserving an easy interpretation of the adopted fusion strategy, while the second adapts neural architecture search to multimodal fusion, exploring a greater number of strategies and therefore achieving better performance. Finally, we take a multimodal view of knowledge transfer: we detail a non-traditional method to transfer knowledge from several sources, i.e. from several pre-trained models. A more general neural representation is obtained from a single model, which brings together the knowledge contained in the pre-trained models and leads to state-of-the-art performance on a variety of facial analysis tasks
Moreau, Frédérique. "Méthodes de traitement de données géophysiques par transformée en ondelettes". Phd thesis, Université Rennes 1, 1995. http://tel.archives-ouvertes.fr/tel-00656040.
Pełny tekst źródłaGu, Co Weila Vila. "Méthodes statistiques et informatiques pour le traitement des données manquantes". Phd thesis, Conservatoire national des arts et metiers - CNAM, 1997. http://tel.archives-ouvertes.fr/tel-00808585.
Pełny tekst źródła