Doctoral dissertations on the topic "Apprentissage à partir de données d'intéraction"
Create accurate references in APA, MLA, Chicago, Harvard, and many other styles
Consult the 50 best doctoral dissertations for your research on the topic "Apprentissage à partir de données d'intéraction".
An "Add to bibliography" button is available next to each work in the list. Use it, and we will automatically create a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a ".pdf" file and read its abstract online, whenever these are available in the work's metadata.
Browse doctoral dissertations from a wide variety of disciplines and compile accurate bibliographies.
Sakhi, Otmane. "Offline Contextual Bandit : Theory and Large Scale Applications". Electronic Thesis or Diss., Institut polytechnique de Paris, 2023. http://www.theses.fr/2023IPPAG011.
This thesis presents contributions to the problem of learning from logged interactions using the offline contextual bandit framework. We are interested in two related topics: (1) offline policy learning with performance certificates, and (2) fast and efficient policy learning applied to large scale, real world recommendation. For (1), we first leverage results from the distributionally robust optimisation framework to construct asymptotic, variance-sensitive bounds to evaluate policies' performances. These bounds lead to new, more practical learning objectives thanks to their composite nature and straightforward calibration. We then analyse the problem from the PAC-Bayesian perspective, and provide tighter, non-asymptotic bounds on the performance of policies. Our results motivate new strategies, that offer performance certificates before deploying the policies online. The newly derived strategies rely on composite learning objectives that do not require additional tuning. For (2), we first propose a hierarchical Bayesian model, that combines different signals, to efficiently estimate the quality of recommendation. We provide proper computational tools to scale the inference to real world problems, and demonstrate empirically the benefits of the approach in multiple scenarios. We then address the question of accelerating common policy optimisation approaches, particularly focusing on recommendation problems with catalogues of millions of items. We derive optimisation routines, based on new gradient approximations, computed in logarithmic time with respect to the catalogue size. Our approach improves on common, linear time gradient computations, yielding fast optimisation with no loss on the quality of the learned policies.
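The offline evaluation setting above rests on importance-weighted estimates of a target policy's value from interactions logged under a different policy. The following is a minimal, illustrative sketch of a clipped inverse-propensity-scoring (IPS) estimator, a standard baseline in this literature, not the thesis's variance-sensitive bounds; the synthetic data and clip value are made up for the example:

```python
import numpy as np

def ips_estimate(rewards, logging_probs, target_probs, clip=10.0):
    """Clipped inverse-propensity-scoring estimate of a target policy's
    expected reward, computed from logged bandit feedback."""
    # Importance weights correct for the mismatch between the policy that
    # logged the data and the policy being evaluated; clipping trades a
    # little bias for much lower variance.
    weights = np.minimum(target_probs / logging_probs, clip)
    return float(np.mean(weights * rewards))

# Synthetic log: a uniform logging policy over two actions, where
# action 1 always yields reward 1 and action 0 yields reward 0.
rng = np.random.default_rng(0)
n = 10_000
actions = rng.integers(0, 2, size=n)
rewards = (actions == 1).astype(float)
logging_probs = np.full(n, 0.5)
# The target policy plays the good action with probability 0.9.
target_probs = np.where(actions == 1, 0.9, 0.1)

# The estimate should be close to the target policy's true value, 0.9.
print(round(ips_estimate(rewards, logging_probs, target_probs), 2))
```

Clipping is only one of several variance-control devices; the thesis instead derives variance-sensitive bounds, which serve a similar purpose with formal guarantees.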
Ferrandiz, Sylvain. "Apprentissage supervisé à partir de données séquentielles". Caen, 2006. http://www.theses.fr/2006CAEN2030.
In the data mining process, the main part of the data preparation step is devoted to feature construction and selection. The filter approach usually adopted requires evaluation methods for any kind of feature. We address the problem of the supervised evaluation of a sequential feature. We show that this problem is solved if a more general problem is tackled: that of the supervised evaluation of a similarity measure. We provide such an evaluation method. We first turn the problem into the search for a discriminating Voronoi partition. Then, we define a new supervised criterion evaluating such partitions and design a new optimised algorithm. The criterion automatically prevents overfitting of the data, and the algorithm quickly provides a good solution. In the end, the method can be interpreted as a robust non-parametric method for estimating the conditional density of a nominal target feature given a similarity measure defined from a descriptive feature. The method is evaluated on many datasets. It is useful for answering questions like: which day of the week or which hourly time segment is the most relevant for discriminating customers from their call detail records? Which series allows a better estimation of customer need for a new service?
Chevaleyre, Yann. "Apprentissage de règles à partir de données multi-instances". Paris 6, 2001. http://www.theses.fr/2001PA066502.
Dubois, Vincent. "Apprentissage approximatif et extraction de connaissances à partir de données textuelles". Nantes, 2003. http://www.theses.fr/2003NANT2001.
Jouve, Pierre-Emmanuel. "Apprentissage non supervisé et extraction de connaissances à partir de données". Lyon 2, 2003. http://theses.univ-lyon2.fr/documents/lyon2/2003/jouve_pe.
Guillouet, Brendan. "Apprentissage statistique : application au trafic routier à partir de données structurées et aux données massives". Thesis, Toulouse 3, 2016. http://www.theses.fr/2016TOU30205/document.
This thesis focuses on machine learning techniques for application to big data. We first consider trajectories defined as sequences of geolocalized data. A hierarchical clustering is then applied with a new distance between trajectories (the Symmetrized Segment-Path Distance), producing groups of trajectories which are then modeled with Gaussian mixtures in order to describe individual movements. This modeling can be used in a generic way to solve the following road traffic problems: final destination, trip time and next location prediction. These examples show that our model can be applied to different traffic environments and that, once learned, it can be applied to trajectories whose spatial and temporal characteristics are different. We also compare different technologies which enable the application of machine learning methods to massive volumes of data.
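The distance-based clustering step can be made concrete with a simplified, point-based stand-in for the Symmetrized Segment-Path Distance (the actual SSPD measures point-to-segment distances; the toy trajectories and names below are illustrative only):

```python
import numpy as np

def directed_dist(t1, t2):
    """Mean distance from each point of trajectory t1 to its closest
    point on trajectory t2 (a point-based simplification)."""
    d = np.linalg.norm(t1[:, None, :] - t2[None, :, :], axis=2)
    return d.min(axis=1).mean()

def symmetrized_dist(t1, t2):
    """Symmetrize the directed distance so that d(a, b) == d(b, a)."""
    return 0.5 * (directed_dist(t1, t2) + directed_dist(t2, t1))

# Two nearly identical routes and one far-away route.
a = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
b = a + 0.1
c = np.array([[0.0, 5.0], [1.0, 5.0], [2.0, 5.0]])

print(round(symmetrized_dist(a, b), 3))  # → 0.141 (same route)
print(round(symmetrized_dist(a, c), 3))  # → 5.0 (different route)
```

Feeding such a pairwise distance matrix to any hierarchical clustering routine then groups trajectories that follow similar routes.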
Elati, Mohamed. "Apprentissage de réseaux de régulation génétique à partir de données d'expression". Paris 13, 2007. http://www.theses.fr/2007PA132031.
Pełny tekst źródłaPradel, Bruno. "Evaluation des systèmes de recommandation à partir d'historiques de données". Paris 6, 2013. http://www.theses.fr/2013PA066263.
This thesis presents various experimental protocols leading to a better offline estimation of errors in recommender systems. As a first contribution, results from a case study of a recommender system based on purchase data are presented. Recommending items is a complex task that has mainly been studied considering ratings data alone. In this study, we put the stress on predicting the purchase a customer will make rather than the rating he will assign to an item. While ratings data are not available in many industries and purchase data are widely collected, very few studies have considered purchase data. In that setting, we compare the performances of various collaborative filtering models from the literature. We notably show that some changes in the training and testing phases, and the introduction of contextual information, lead to major changes in the relative performances of the algorithms. The following contributions focus on the study of ratings data. A second contribution presents our participation in the Challenge on Context-Aware Movie Recommendation. This challenge introduces two major changes to the standard rating prediction protocol: models are evaluated considering rating metrics and tested on two specific periods of the year, Christmas and the Oscars. We provide personalized recommendations modeling the short-term evolution of movie popularity. Finally, we study the impact of the observation process of ratings on ranking evaluation metrics. Users choose the items they want to rate and, as a result, ratings on items are not observed at random. First, some items receive many more ratings than others; secondly, high ratings are more likely to be observed than poor ones, because users mainly rate the items they like. We propose a formal analysis of these effects on evaluation metrics, together with experiments on the Yahoo!Music dataset, which gathers standard and randomly collected ratings. We show that considering missing ratings as negative during the training phase leads to good performance on the Top-K task, but that this performance can be misleading, favoring methods that model the popularity of items more than the real tastes of users.
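The Top-K caveat above can be made concrete: when offline evaluation treats every unobserved item as a negative, precision@K rewards models that rank globally popular items highly. A toy sketch with hypothetical items and a hypothetical user, not the thesis's actual protocol:

```python
def precision_at_k(ranked_items, relevant_items, k):
    """Precision@K where any item absent from `relevant_items`
    (unobserved or low-rated) counts as a negative."""
    top_k = ranked_items[:k]
    hits = sum(1 for item in top_k if item in relevant_items)
    return hits / k

# A user with two known liked items; everything else is treated as
# negative, including items the user simply never rated.
ranking = ["i3", "i7", "i1", "i9", "i2"]
liked = {"i7", "i2"}
print(precision_at_k(ranking, liked, 3))  # 1 hit ("i7") in the top 3 → 1/3
```

Because "not rated" and "disliked" are conflated, a model that simply ranks popular items can score well here even if it ignores individual taste, which is exactly the bias the thesis analyses.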
Liquière, Michel. "Apprentissage à partir d'objets structurés : conception et réalisation". Montpellier 2, 1990. http://www.theses.fr/1990MON20038.
Pełny tekst źródłaKhiali, Lynda. "Fouille de données à partir de séries temporelles d’images satellites". Thesis, Montpellier, 2018. http://www.theses.fr/2018MONTS046/document.
Nowadays, remotely sensed images constitute a rich source of information that can be leveraged to support several applications, including risk prevention, land use planning, land cover classification and many other tasks. In this thesis, Satellite Image Time Series (SITS) are analysed to depict the dynamics of natural and semi-natural habitats. The objective is to identify, organize and highlight the evolution patterns of these areas. We introduce an object-oriented method to analyse SITS that considers segmented satellite images. Firstly, we identify the evolution profiles of the objects in the time series. Then, we analyse these profiles using machine learning methods. To identify the evolution profiles, we explore all the objects to select a subset (spatio-temporal entities/reference objects) to be tracked. The evolution of the selected spatio-temporal entities is described using evolution graphs. To analyse these evolution graphs, we introduce three contributions. The first contribution explores annual SITS: it analyses the evolution graphs using clustering algorithms to identify similar evolutions among the spatio-temporal entities. In the second contribution, we perform a multi-annual cross-site analysis: we consider several study areas described by multi-annual SITS and use clustering algorithms to identify intra- and inter-site similarities. In the third contribution, we introduce a semi-supervised method based on constrained clustering, and propose a way to select the constraints that guide the clustering and adapt the results to the user's needs. Our contributions were evaluated on several study areas. The experimental results pinpoint relevant landscape evolutions in each study site, and identify the evolutions common to the different sites. In addition, the constraint selection method proposed for the constrained clustering identifies relevant entities. Thus, the results obtained using unsupervised learning were improved and adapted to meet the user's needs.
Le, Folgoc Loïc. "Apprentissage statistique pour la personnalisation de modèles cardiaques à partir de données d’imagerie". Thesis, Nice, 2015. http://www.theses.fr/2015NICE4098/document.
This thesis focuses on the calibration of an electromechanical model of the heart from patient-specific, image-based data, and on the related task of extracting the cardiac motion from 4D images. Long-term perspectives for personalized computer simulation of the cardiac function include aid to diagnosis, aid to therapy planning and risk prevention. To this end, we explore tools and possibilities offered by statistical learning. To personalize cardiac mechanics, we introduce an efficient framework coupling machine learning with an original statistical representation of shape and motion based on 3D+t currents. The method relies on a reduced mapping between the space of mechanical parameters and the space of cardiac motion. The second focus of the thesis is cardiac motion tracking, a key processing step in the calibration pipeline, with an emphasis on the quantification of uncertainty. We develop a generic sparse Bayesian model of image registration with three main contributions: an extended image similarity term, the automated tuning of registration parameters, and uncertainty quantification. We propose an approximate inference scheme that is tractable on 4D clinical data. Finally, we evaluate the quality of the uncertainty estimates returned by the approximate inference scheme: we compare its predictions with those of an inference scheme developed on the grounds of reversible jump MCMC. We provide more insight into the theoretical properties of the sparse structured Bayesian model and into the empirical behaviour of both inference schemes.
Renaux, Pierre. "Extraction d'informations à partir de documents juridiques : application à la contrefaçon de marques". Caen, 2006. http://www.theses.fr/2006CAEN2019.
Our research framework focuses on the extraction and analysis of knowledge induced from legal corpus databases describing nominative trade-mark infringement. This discipline deals with all the constraints arising from the different domains of knowledge discovery from documents: the electronic document, databases, statistics, artificial intelligence and human-computer interaction. Meanwhile, the accuracy of these methods is closely linked to the quality of the data used. In our research framework, each decision is supervised by an author (the magistrate) and relies on a contextual writing environment, thus limiting the information extraction process. Here we are interested in the decisions which direct the document learning process. We observe their surroundings, find their strategic capacity and offer adapted solutions in order to determine a better document representation. We suggest an explorative and supervised approach for assessing data quality by finding the properties which corrupt knowledge quality. We have developed an interactive and collaborative platform modelling all the processes leading to knowledge extraction, in order to efficiently integrate the expert's know-how and practices.
Pomorski, Denis. "Apprentissage automatique symbolique/numérique : construction et évaluation d'un ensemble de règles à partir des données". Lille 1, 1991. http://www.theses.fr/1991LIL10117.
Buchet, Samuel. "Vérification formelle et apprentissage logique pour la modélisation qualitative à partir de données single-cell". Thesis, Ecole centrale de Nantes, 2022. http://www.theses.fr/2022ECDN0011.
The understanding of the cellular mechanisms occurring inside human beings usually depends on the study of gene expression. However, genes are involved in complex regulatory processes and their measurement is difficult to perform. In this context, the qualitative modeling of gene regulatory networks intends to establish the function of each gene from the discrete modeling of a dynamical interaction network. In this thesis, our goal is to implement this modeling approach from single-cell sequencing data. These data prove to be interesting for qualitative modeling since they bring high precision and can be interpreted in a dynamical way. Thus, we develop a method for the inference of qualitative models based on the automatic learning of logic programs. This method is applied to a single-cell dataset, and we propose several approaches to interpret the resulting models by comparing them with existing knowledge.
Bouguelia, Mohamed-Rafik. "Classification et apprentissage actif à partir d'un flux de données évolutif en présence d'étiquetage incertain". Thesis, Université de Lorraine, 2015. http://www.theses.fr/2015LORR0034/document.
This thesis focuses on machine learning for data classification. To reduce the labelling cost, active learning queries a human labeller for the class labels of only the most important instances. We propose a new uncertainty measure that characterizes the importance of data and improves the performance of active learning compared to existing uncertainty measures. This measure determines the smallest instance weight to associate with new data such that the classifier changes its prediction concerning this data. We then consider a setting where the data arrive continuously from a stream of infinite length. We propose an adaptive uncertainty threshold that is suitable for active learning in the streaming setting and achieves a compromise between the number of classification errors and the number of required labels. Existing stream-based active learning methods are initialized with labelled instances that cover all possible classes. However, in many applications, the evolving nature of the stream implies that new classes can appear at any time. We propose an effective method for the active detection of novel classes in a multi-class data stream. This method incrementally maintains the area of the feature space covered by the known classes, and detects as novel classes those instances that are self-similar and external to that area. Finally, it is often difficult to obtain completely reliable labels, because the human labeller is subject to labelling errors that reduce the performance of the learned classifier. We address this problem by introducing a measure that reflects the degree of disagreement between the manually given class and the predicted class, and a new informativeness measure that expresses how necessary it is for a mislabelled instance to be re-labelled by an alternative labeller.
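Uncertainty-driven querying of this kind is often illustrated with a margin criterion: request a label only when the classifier's top two class probabilities are close. The sketch below is a generic illustration with made-up probabilities, not the thesis's instance-weight measure or its adaptive threshold:

```python
import numpy as np

def margin_uncertainty(probs):
    """Uncertainty as 1 minus the margin between the two most probable
    classes: close to 1 when the classifier hesitates."""
    top2 = np.sort(probs)[-2:]
    return 1.0 - (top2[1] - top2[0])

def should_query(probs, threshold):
    """Ask the human labeller only when uncertainty exceeds the threshold."""
    return margin_uncertainty(probs) > threshold

# Hypothetical posterior probabilities from a streaming classifier.
confident = np.array([0.95, 0.03, 0.02])
ambiguous = np.array([0.40, 0.35, 0.25])

threshold = 0.9
print(should_query(confident, threshold))  # False: no label requested
print(should_query(ambiguous, threshold))  # True: ask the labeller
```

In a streaming setting, the threshold would additionally be adapted over time to balance the labelling budget against the error rate, as the thesis proposes.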
Wolley, Chirine. "Apprentissage supervisé à partir des multiples annotateurs incertains". Thesis, Aix-Marseille, 2014. http://www.theses.fr/2014AIXM4070/document.
In supervised learning tasks, obtaining the ground-truth label for each instance of the training dataset can be difficult, time-consuming and/or expensive. With the advent of infrastructures such as the Internet, an increasing number of web services propose crowdsourcing as a way to collect a large enough set of labels from internet users. The use of these services provides an exceptional facility to collect labels from anonymous annotators, and thus considerably simplifies the process of building labelled datasets. Nonetheless, the main drawback of crowdsourcing services is their lack of control over the annotators and their inability to verify and control the accuracy of the labels and the level of expertise of each labeler. Hence, managing the annotators' uncertainty is key to learning from imperfect annotations. This thesis provides three algorithms for learning from multiple uncertain annotators. IGNORE generates a classifier that predicts the label of a new instance and evaluates the performance of each annotator according to their level of uncertainty. X-Ignore considers that the performance of the annotators depends both on their uncertainty and on the quality of the initial dataset to be annotated. Finally, ExpertS deals with the problem of annotator selection when generating the classifier: it identifies expert annotators and learns the classifier based only on their labels. In this thesis, we conducted a large set of experiments in order to evaluate our models, using both experimental and real-world medical data. The results demonstrate the performance and accuracy of our models compared to previous state-of-the-art solutions in this context.
Labernia, Fabien. "Algorithmes efficaces pour l’apprentissage de réseaux de préférences conditionnelles à partir de données bruitées". Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLED018/document.
The rapid growth of personal web data has motivated the emergence of learning algorithms well suited to capturing users' preferences. Among preference representation formalisms, conditional preference networks (CP-nets) have proven to be effective due to their compact and explainable structure. However, their learning is difficult due to their combinatorial nature. In this thesis, we tackle the problem of learning CP-nets from corrupted large datasets. Three new algorithms are introduced and studied on both synthetic and real datasets. The first algorithm is based on query learning and handles the contradictions between multiple users' preferences by searching, in a principled way, for the variables that affect the preferences. The second algorithm relies on information-theoretic measures defined over the induced preference rules, which allow us to deal with corrupted data. An online version of this algorithm is also provided, exploiting McDiarmid's bound to define an asymptotically optimal decision criterion for selecting the best conditioned variable, hence making it possible to deal with possibly infinite data streams.
Velcin, Julien. "Extraction automatique de stéréotypes à partir de données symboliques et lacunaires". Paris 6, 2005. http://www.theses.fr/2005PA066465.
Bourgeais, Victoria. "Interprétation de l'apprentissage profond pour la prédiction de phénotypes à partir de données d'expression de gènes". Electronic Thesis or Diss., université Paris-Saclay, 2022. http://www.theses.fr/2022UPASG069.
Deep learning has been a significant advance in artificial intelligence in recent years. Its main domains of interest are image analysis and natural language processing. One of the major future challenges of this approach is its application to precision medicine. This new form of medicine will make it possible to personalize each stage of a patient's care pathway according to his or her characteristics, in particular molecular characteristics such as gene expression data, which inform about the cellular state of a patient. However, deep learning models are considered black boxes, as their predictions are not accompanied by an explanation, which limits their use in clinical practice. The General Data Protection Regulation (GDPR), recently adopted by the European Union, imposes that machine learning algorithms must be able to explain their decisions to users. Thus, there is a real need to make neural networks more interpretable, and this is particularly true in the medical field, for several reasons. Understanding why a phenotype has been predicted is necessary to ensure that the prediction is based on reliable representations of the patients rather than on irrelevant artifacts present in the training data. Regardless of the model's effectiveness, this will affect any end user's decisions and confidence in the model. Finally, a neural network performing well for the prediction of a certain phenotype may have identified a signature in the data that could open up new research avenues. In the current state of the art, two general approaches exist for interpreting these black boxes: creating inherently interpretable models, or using a third-party method dedicated to the interpretation of the trained neural network. Whatever approach is chosen, the explanation provided generally consists of identifying the input variables and neurons that are important for the prediction. However, in the context of phenotype prediction from gene expression, these approaches generally do not provide an understandable explanation, as these data are not directly comprehensible by humans. Therefore, we propose novel and original deep learning methods that are interpretable by design. The architecture of these methods is defined from one or several knowledge databases: a neuron represents a biological object, and the connections between neurons correspond to the relations between biological objects. Three methods have been developed, listed below in chronological order. Deep GONet is based on a multilayer perceptron constrained by a biological knowledge database, the Gene Ontology (GO), through an adapted regularization term; the explanations of its predictions are provided by an a posteriori interpretation method. GraphGONet takes advantage of both a multilayer perceptron and a graph neural network to deal with the semantic richness of GO knowledge, and has the capacity to generate explanations automatically. BioHAN is built solely on a graph neural network and can easily integrate different knowledge databases and their semantics; interpretation is facilitated by an attention mechanism that enables the model to focus on the most informative neurons. These methods have been evaluated on diagnostic tasks using real gene expression datasets and have shown competitiveness with state-of-the-art machine learning methods. Our models provide intelligible explanations composed of the most contributive neurons and their associated biological concepts, a feature that allows experts to use our tools in a medical setting.
Braud, Chloé. "Identification automatique des relations discursives implicites à partir de corpus annotés et de données brutes". Sorbonne Paris Cité, 2015. https://hal.inria.fr/tel-01256884.
Building discourse parsers is currently a major challenge in Natural Language Processing. The identification of the relations (such as Explanation, Contrast...) linking spans of text in a document is the main difficulty. In particular, identifying the so-called implicit relations, that is, the relations that lack a discourse connective (such as but, because...), is known to be a hard task, since it requires taking into account various factors and leads to specific difficulties in a classification system. In this thesis, we use raw data to improve the automatic identification of implicit relations. First, we propose to use discourse markers in order to automatically annotate new data. We use domain adaptation methods to deal with the distributional differences between automatically and manually annotated data: we report improvements for systems built on the French corpus ANNODIS and on the English corpus Penn Discourse Treebank. Then, we propose to use word representations built from raw data, which may be automatically annotated with discourse markers, in order to feed a representation of the data based on the words found in the spans of text to be linked. We report improvements on the English corpus Penn Discourse Treebank and, especially, we show that this method alleviates the need for rich resources, which are available only for a few languages.
Sutton-Charani, Nicolas. "Apprentissage à partir de données et de connaissances incertaines : application à la prédiction de la qualité du caoutchouc". Thesis, Compiègne, 2014. http://www.theses.fr/2014COMP1835/document.
During the learning of predictive models, the quality of the available data is essential for the reliability of the obtained predictions. In practice, these learning data are very often imperfect or uncertain (imprecise, noisy, etc.). This PhD thesis focuses on this context, where the theory of belief functions is used in order to adapt standard statistical tools to uncertain data. The chosen predictive model is the decision tree, a basic classifier in Artificial Intelligence initially conceived to be built from precise data. The aim of the main methodology developed in this thesis is to generalise decision trees to uncertain data (fuzzy, probabilistic, missing, etc.), both in input and in output. To realise this extension to uncertain data, the main tool is a likelihood adapted to belief functions, recently presented in the literature, whose behaviour is studied here. The maximisation of this likelihood provides estimators of the trees' parameters. This maximisation is obtained via the E2M algorithm, an extension of the EM algorithm to belief functions. The presented methodology, E2M decision trees, is applied to a real case: the prediction of natural rubber quality. The learning data, mainly cultural and climatic, contain many uncertainties, which are modelled by belief functions adapted to those imperfections. After a simple descriptive statistical study of the data, E2M decision trees are built, evaluated and compared to standard decision trees. Taking data uncertainty into account slightly improves the predictive accuracy and, moreover, highlights the importance of some variables sparsely studied until now.
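To give a flavour of learning trees from uncertain labels, the sketch below scores candidate splits with an entropy computed over fractional class masses instead of hard counts. This is a deliberately simplified stand-in for the E2M approach described above (which maximises a belief-function likelihood); all data and names are hypothetical:

```python
import math

def soft_entropy(masses):
    """Entropy of a class distribution given fractional (uncertain)
    label masses rather than hard counts."""
    total = sum(masses)
    probs = [m / total for m in masses if m > 0]
    return -sum(p * math.log2(p) for p in probs)

def best_split(xs, label_probs):
    """Threshold on a single numeric feature minimising the weighted
    soft entropy of the two children; label_probs[i] is the probability
    that example i belongs to class 1."""
    def h(ps):
        # Fractional class-1 mass vs. remaining class-0 mass.
        return soft_entropy([sum(ps), len(ps) - sum(ps)])
    best_score, best_thr = float("inf"), None
    for thr in sorted(set(xs)):
        left = [p for x, p in zip(xs, label_probs) if x <= thr]
        right = [p for x, p in zip(xs, label_probs) if x > thr]
        if not left or not right:
            continue  # degenerate split
        score = (len(left) * h(left) + len(right) * h(right)) / len(xs)
        if score < best_score:
            best_score, best_thr = score, thr
    return best_thr

# Two clearly separated groups whose labels are known only with some
# probability (soft class-1 membership).
xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
probs = [0.9, 0.8, 0.95, 0.1, 0.2, 0.05]
print(best_split(xs, probs))  # → 3.0, the gap between the two groups
```

A full E2M tree would instead alternate expectation and maximisation steps over belief masses, but the principle of splitting on uncertainty-weighted class evidence is the same.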
Cerda, Reyes Patricio. "Apprentissage statistique à partir de variables catégorielles non-uniformisées Similarity encoding for learning with dirty categorical variables Encoding high-cardinality string categorical variables". Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLS470.
Tabular data often contain columns with categorical variables, usually considered as non-numerical entries with a fixed and limited number of unique elements or categories. As many statistical learning algorithms require numerical representations of features, an encoding step is necessary to transform categorical entries into feature vectors, using for instance one-hot encoding. This and other similar strategies work well, in terms of prediction performance and interpretability, in standard statistical analysis when the number of categories is small. However, non-curated data give rise to string categorical variables with a very high cardinality and redundancy: the string entries share semantic and/or morphological information, and several entries can reflect the same entity. Without any data cleaning or feature engineering step, common encoding methods break down, as they tend to lose information in their vectorial representation. Also, they can create high-dimensional feature vectors, which prevents their usage in large-scale settings. In this work, we study a series of categorical encodings that remove the need for preprocessing steps on high-cardinality string categorical variables. An ideal encoder should be scalable to many categories, interpretable to end users, and able to capture the morphological information contained in the string entries. Experiments on real and simulated data show that the methods we propose improve supervised learning, are adapted to large-scale settings and, in some cases, create feature vectors that are easily interpretable. Hence, they can be applied in Automated Machine Learning (AutoML) pipelines on the original string entries without any human intervention.
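The idea of encoding a dirty categorical value by its resemblance to reference categories, rather than by a one-hot indicator, can be sketched with a Jaccard similarity over character n-grams (one plausible choice of string similarity; the prototypes and entries below are made up):

```python
def ngrams(s, n=3):
    """Set of character n-grams, with padding so short strings
    still produce at least one n-gram."""
    s = f"  {s.lower()}  "
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def string_similarity(a, b, n=3):
    """Jaccard similarity on character n-grams: 1.0 for identical
    strings, near 0 for unrelated ones."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    return len(ga & gb) / len(ga | gb)

def similarity_encode(entries, prototypes):
    """Encode each string as its similarity to a set of prototype
    categories, instead of a sparse one-hot vector."""
    return [[string_similarity(e, p) for p in prototypes] for e in entries]

prototypes = ["police officer", "fire fighter"]
dirty = ["Police Officer II", "police offcer", "fire fighter"]
for row in similarity_encode(dirty, prototypes):
    print([round(v, 2) for v in row])  # the exact match scores 1.0
```

Misspelled or suffixed variants of the same entity thus receive close feature vectors, where one-hot encoding would assign them unrelated columns.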
El, Ahdab Ahmad. "Contribution à l'apprentissage de réseaux bayésiens à partir de données datées pour le dignostic des processus dynamiques continus". Aix-Marseille 3, 2010. http://www.theses.fr/2010AIX30018.
This thesis addresses the problem of learning a Dynamic Bayesian network from timed data without prior knowledge of the dynamic process that generated the data. One of the main difficulties of learning a Dynamic Bayesian network is building and orienting the edges of the network while avoiding loops, a problem that is even more difficult when the data are timed. The thesis proposes an algorithm, called BJT4BN, based on an adequate representation of a set of sequences of timed observations; it uses the BJ-Measure, an information-based measure adapted to timed data, to evaluate the quantity of information flowing along an edge. This algorithm and this measure have been designed in the framework of the TOM4L process (Timed Observation Mining for Learning), which is based on the Theory of Timed Observations. The thesis illustrates the algorithm with an application on a pedagogical example of the diagnosis of a vehicle. The operational value of the work is demonstrated by the results obtained on data provided by the Apache system, a real-world knowledge-based system developed by the Arcelor-Mittal Steel Group to diagnose its galvanization baths.
Temanni, Mohamed-Ramzi. "Combinaison de sources de données pour l'amélioration de la prédiction en apprentissage : une application à la prédiction de la perte de poids chez l'obèse à partir de données transcriptomiques et cliniques". Paris 6, 2009. https://tel.archives-ouvertes.fr/tel-00814513.
Dzogang, Fabon. "Représentation et apprentissage à partir de textes pour des informations émotionnelles et pour des informations dynamiques". Paris 6, 2013. http://www.theses.fr/2013PA066253.
Automatic knowledge extraction from texts consists in mapping low-level information, as carried by the words and phrases extracted from documents, to higher-level information. The choice of data representation for describing documents is thus essential, and the definition of a learning algorithm is subject to its specifics. This thesis addresses these two issues, in the context of emotional information on the one hand and dynamic information on the other. In the first part, we consider the task of emotion extraction, for which the semantic gap is wider than it is with more traditional thematic information. Therefore, we propose to study representations aimed at modeling the many nuances of natural language used for describing emotional, hence subjective, information. Furthermore, we propose to study the integration of semantic knowledge, which provides, from a characterization perspective, support for extracting the emotional content of documents and, from a prediction perspective, assistance to the learning algorithm. In the second part, we study information dynamics: any corpus of documents published over the Internet can be associated with sources in perpetual activity, which exchange information in a continuous movement. We explore three main lines of work: automatically identified sources; the communities they form in a dynamic and very sparse description space; and the noteworthy themes they develop. For each, we propose original extraction methods, which we apply to a corpus of real data collected from information streams over the Internet.
Bayoudh, Meriam. "Apprentissage de connaissances structurelles à partir d’images satellitaires et de données exogènes pour la cartographie dynamique de l’environnement amazonien". Thesis, Antilles-Guyane, 2013. http://www.theses.fr/2013AGUY0671/document.
Classical methods for satellite image analysis are inadequate for the current bulky data flow. Automating the interpretation of such images therefore becomes crucial for the analysis and management of phenomena changing in time and space that are observable by satellite. This work thus aims at automating land cover cartography from satellite images, through an expressive and easily interpretable mechanism, and by explicitly taking into account the structural aspects of geographic information. It is part of the object-based image analysis framework, and assumes that it is possible to extract useful contextual knowledge from maps. First, a supervised parameterization method for a segmentation algorithm is proposed. Secondly, a supervised classification of geographical objects is presented, which combines machine learning by inductive logic programming with a multi-class rule set intersection approach. These approaches are applied to the cartography of the French Guiana coastline. The results demonstrate the feasibility of the segmentation parameterization, but also its variability as a function of the reference map classes and of the input data. Yet, methodological developments make it possible to consider an operational implementation of such an approach. The results of the supervised object classification show that it is possible to induce expressive classification rules that convey consistent structural information in a given application context and lead to reliable predictions, with overall accuracy and Kappa values equal to 84.6% and 0.7, respectively. In conclusion, this work contributes to the automation of dynamic cartography from remotely sensed images and proposes original and promising perspectives
Fournier, Dominique. "Etude de la qualité de données à partir de l'apprentissage automatique : application aux arbres d'induction". Caen, 2001. http://www.theses.fr/2001CAEN2048.
Deschamps, Sébastien. "Apprentissage actif profond pour la reconnaissance visuelle à partir de peu d’exemples". Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS199.
Automatic image analysis has improved the exploitation of image sensors, with data coming from sources as varied as phone cameras, surveillance cameras, satellite imagers and drones. Deep learning achieves excellent results in image analysis applications where large amounts of annotated data are available, but learning a new image classifier from scratch is a difficult task. Most image classification methods are supervised and require annotations, which is a significant investment. Different frugal learning solutions (with few annotated examples) exist, including transfer learning, active learning, semi-supervised learning and meta-learning. The goal of this thesis is to study these frugal learning solutions for visual recognition tasks, namely image classification and change detection in satellite images. The classifier is trained iteratively, starting with only a few annotated samples and asking the user to annotate as little data as possible to obtain satisfactory performance. Among the solutions initially studied, deep active learning suited our operational problem best, so we chose it. In this thesis, we have developed an interactive approach in which we ask the most informative questions about the relevance of the data to an oracle (annotator). Based on its answers, a decision function is iteratively updated. We model the probability that the samples are relevant by minimizing an objective function capturing the representativeness, diversity and ambiguity of the data. Data with high probability are then selected for annotation. We have improved this approach by using reinforcement learning to dynamically and accurately weight the importance of representativeness, diversity and ambiguity in each active learning cycle.
Finally, our last approach consists of a display model that selects the most representative and diverse virtual examples, which adversarially challenge the learned model, in order to obtain a highly discriminative model in subsequent iterations of active learning. The good results obtained against the different baselines and the state of the art in satellite image change detection and image classification tasks have demonstrated the relevance of the proposed frugal learning models, and have led to various publications (Sahbi et al. 2021; Deschamps and Sahbi 2022b; Deschamps and Sahbi 2022a; Sahbi and Deschamps 2022)
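The selection criterion described in this abstract — scoring unlabeled data by representativeness, diversity and ambiguity, then querying the oracle — can be sketched as a simple score-and-select loop. All names, weights and the scoring formula below are illustrative assumptions, not taken from the thesis:

```python
import math

def entropy(p):
    # Ambiguity term: binary entropy of the predicted relevance probability.
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def select_for_annotation(pool, labeled, prob, dist, weights=(1.0, 1.0, 1.0)):
    """Pick the unlabeled sample maximizing a weighted sum of
    representativeness (central in the pool), diversity (far from already
    labeled samples) and ambiguity (uncertain prediction)."""
    w_rep, w_div, w_amb = weights
    best, best_score = None, float("-inf")
    for x in pool:
        rep = -sum(dist(x, z) for z in pool) / len(pool)       # representativeness
        div = min((dist(x, z) for z in labeled), default=0.0)  # diversity
        amb = entropy(prob(x))                                 # ambiguity
        score = w_rep * rep + w_div * div + w_amb * amb
        if score > best_score:
            best, best_score = x, score
    return best
```

In the thesis the relative importance of the three criteria is itself adjusted by reinforcement learning at each active learning cycle; here the weights are fixed constants.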
Durand, Maëva. "Alimentation sur mesure et estimation du bien-être des truies gestantes à partir de données hétérogènes". Electronic Thesis or Diss., Rennes, Agrocampus Ouest, 2023. http://www.theses.fr/2023NSARC169.
New technologies are increasingly being developed in pig farming to help farmers in their labour tasks. They allow the distribution of tailored diets for gestating sows and better monitoring of animal behaviour. The aim of this thesis is to improve the estimation of daily nutritional requirements and to estimate the individual welfare status of gestating sows using behavioural and environmental data collected automatically. The first aim was to evaluate experimentally the effects of environmental disturbances on behaviour and nutritional requirements. To achieve this, two groups of sows were followed during two consecutive gestations, during which several events were induced. A database containing a variety of sows’ behavioural data was built from these experiments. The results of the thesis highlighted the influence of environmental conditions on the behaviour and nutritional requirements of sows during gestation, as well as an important individual variability. The second part involved estimating individual daily requirements and welfare based on behavioural and environmental data recorded by sensors. The individual estimation of nutritional requirements and welfare status can be carried out accurately using machine learning algorithms and the data produced by the automatic feeder. Using these innovative methods, this thesis opens perspectives for the design of a decision-support tool aimed at adjusting feeding and improving the welfare of gestating sows
Mouillet, Laure. "Modélisation, reconnaissance et apprentissage de scénarios de conflits ethno-politiques". Paris 6, 2005. http://www.theses.fr/2005PA066031.
Magnan, Christophe Nicolas. "Apprentissage à partir de données diversement étiquetées pour l'étude du rôle de l'environnement local dans les interactions entre acides aminés". Aix-Marseille 1, 2007. http://www.theses.fr/2007AIX11022.
The 3D structure of proteins is constrained by some interactions between distant amino acids in the primary sequences. An accurate prediction of these bonds may be a step forward for the prediction of the 3D structure from sequences. A review of the literature raises questions about the role of the neighbourhood of bonded amino acids in the formation of these bonds. We show that we have to investigate uncommon learning frameworks to answer these questions. The first one is a particular case of semi-supervised learning, in which the only labelled data to learn from belong to one class, and the second one considers that the data are subject to class-conditional classification noise. We show that learning in these frameworks leads to ill-posed problems. We give some assumptions that make these problems well-posed. We propose adaptations of well-known methods to these learning frameworks. We apply them to try to answer the questions on the biological problem considered in this study
Fahlaoui, Tarik. "Réduction de modèles et apprentissage de solutions spatio-temporelles paramétrées à partir de données : application à des couplages EDP-EDO". Thesis, Compiègne, 2020. http://www.theses.fr/2020COMP2535.
In this thesis, an algorithm for learning an accurate reduced order model from data generated by a high fidelity solver (HF solver) is proposed. To achieve this goal, we use both Dynamic Mode Decomposition (DMD) and Proper Orthogonal Decomposition (POD). Anomaly detection during the learning process can easily be done by performing an a posteriori spectral analysis of the learnt reduced order model. Several extensions are presented to make the method as general as possible. Thus, we handle the case of coupled ODE/PDE systems and the case of second order hyperbolic equations. The method is also extended to the case of switched control systems, where the switching rule is learnt using an Artificial Neural Network (ANN). The learnt reduced order model allows the time evolution of the POD coefficients to be predicted. However, the POD coefficients have no interpretable meaning. To tackle this issue, we propose an interpretable reduction method using the Empirical Interpolation Method (EIM). This reduction method is then adapted to the case of third-order tensors and, combined with Kernel Ridge Regression (KRR), allows the solution manifold to be learnt in the case of parametrized PDEs. In this way, we can learn a parametrized reduced order model. The cases of non-linear PDEs and disturbed data are finally presented as openings
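The POD+DMD ingredient mentioned in this abstract can be illustrated with a textbook sketch: project snapshots onto a rank-r POD basis (via SVD) and fit a linear reduced operator by least squares. This is the generic recipe under stated assumptions, not the thesis's actual algorithm:

```python
import numpy as np

def learn_reduced_model(snapshots, r):
    """From a snapshot matrix (one state per column), build a rank-r POD
    basis and fit, in POD coordinates, the linear map a_{k+1} = A a_k
    (DMD-style least squares)."""
    U, s, Vt = np.linalg.svd(snapshots, full_matrices=False)
    basis = U[:, :r]                 # POD modes
    coeffs = basis.T @ snapshots     # reduced coordinates of each snapshot
    X, Y = coeffs[:, :-1], coeffs[:, 1:]
    A = Y @ np.linalg.pinv(X)        # reduced-order dynamics operator
    return basis, A

def rom_predict(basis, A, x0, n_steps):
    # Propagate the reduced model, then lift back to the full space.
    a = basis.T @ x0
    for _ in range(n_steps):
        a = A @ a
    return basis @ a
```

On snapshots generated by a linear dynamic of rank one, the learnt operator recovers the decay rate exactly; anomaly detection as described above would inspect the eigenvalues of A.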
Gauthier, Luc-Aurélien. "Inférence de liens signés dans les réseaux sociaux, par apprentissage à partir d'interactions utilisateur". Electronic Thesis or Diss., Paris 6, 2015. http://www.theses.fr/2015PA066639.
In this thesis, we study the semantics of relations between users and, in particular, the antagonistic forces we naturally observe in various social relationships, such as hostility or suspicion. The study of these relationships raises many problems, both technical - because the mathematical arsenal is not really adapted to negative ties - and practical, due to the difficulty of collecting such data (making a negative relationship explicit is perceived as intrusive and inappropriate by many users). We therefore focus on alternative solutions that infer these negative relationships from more widespread content. We use the common judgments about items that users share, which are the data used in recommender systems. We provide three contributions, described in three distinct chapters. In the first one, we discuss the case of agreements about items, which may not have the same semantics depending on whether or not the items involved are appreciated by the two users. We will see that disliking the same product does not imply similarity. In our second contribution, we consider the distributions of user ratings and item ratings in order to measure whether agreements or disagreements may happen by chance, in particular to avoid the user and item biases observed in this type of data. Our third contribution consists in using these results to predict the sign of the links between users from positive ties and common judgments about items alone, without any negative social information
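The second contribution — checking whether an agreement between two users could have happened by chance, given their individual rating distributions — can be illustrated with a kappa-style correction. The exact measure used in the thesis differs; everything below is a hypothetical sketch:

```python
from collections import Counter

def agreement_above_chance(ratings_u, ratings_v):
    """Kappa-style agreement between two users on their common items:
    the observed agreement rate is discounted by the rate expected under
    independence of the two users' rating distributions."""
    common = set(ratings_u) & set(ratings_v)
    n = len(common)
    observed = sum(ratings_u[i] == ratings_v[i] for i in common) / n
    pu = Counter(ratings_u[i] for i in common)   # user u's rating distribution
    pv = Counter(ratings_v[i] for i in common)   # user v's rating distribution
    expected = sum((pu[r] / n) * (pv[r] / n) for r in set(pu) | set(pv))
    if expected >= 1.0:
        return 0.0   # both users always give the same rating: uninformative
    return (observed - expected) / (1.0 - expected)
```

Two users who like everything agree on every item, yet the corrected score is 0: the agreement is fully explained by their rating biases, which is exactly the effect the abstract says must be avoided.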
Drago, Laetitia. "Analyse globale de la pompe à carbone biologique à partir de données en imagerie quantitative". Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS562.
The biological carbon pump (BCP) plays a central role in the global ocean carbon cycle, transporting carbon from the surface to the deep ocean and sequestering it for long periods. This work aims to analyse two key players of the BCP: zooplankton and particles. To this end, we use in situ imaging data from the Underwater Vision Profiler (UVP5) to investigate two primary axes: 1) the global distribution of zooplankton biomass and 2) carbon export in the context of a North Atlantic spring bloom. Our objectives include quantifying global zooplankton biomass, enhancing our comprehension of the BCP via morphological analysis of particles, and assessing and comparing the gravitational flux of detrital particles during the North Atlantic spring bloom using high-resolution UVP5 data. With the help of UVP5 imagery and machine learning, through habitat models using boosted regression trees, we investigate the global distribution of zooplankton biomass and its ecological implications. The results show maximum zooplankton biomass values around 60°N and 55°S and minimum values within the oceanic gyres, with a global biomass dominated by crustaceans and rhizarians. By employing machine learning techniques on globally homogeneous data, this study provides taxonomic insights into the distribution of 19 large zooplankton groups (1-50 mm equivalent spherical diameter). This first protocol estimates global, spatially resolved zooplankton biomass and community composition from in situ imaging observations of individual organisms. In addition, within the unique context of the EXPORTS 2021 campaign, we analyse UVP5 data obtained by deploying three instruments in a highly retentive eddy.
After clustering the 1,720,914 images using Morphocluster, a semi-automatic classification software, we delve into the characteristics of marine particles, studying their morphology through an oblique framework that follows a plume of detrital particles between the surface and 800 m depth. The results of the plume-following approach show that, contrary to expectations, aggregates become larger, denser, more circular and more complex with depth. In contrast, the evolution of fecal pellets is more heterogeneous and shaped by zooplankton activity. Such results may require a reassessment of our view of sinking aggregates and fecal pellets. We also studied concentration and carbon flux dynamics using a more traditional 1D framework, in which we explore the three key elements of flux estimation from in situ imaging data by comparing UVP5 and sediment trap flux estimates: the size range covered, the sinking rate and the carbon content. According to the current literature, neutrally buoyant sediment traps (NBST) and surface-tethered traps (STT) usually cover a size range from 10 µm to approximately 2 mm. In our study, we found that by expanding the UVP size range to 10 µm and limiting it to 2 mm, a more consistent comparison can be made between UVP5-generated fluxes and sediment trap fluxes (obtained by colleagues). However, it is worth noting that a large flux contribution remains above this size threshold, necessitating further investigation of its implications through complementary approaches such as sediment traps with larger openings. This manuscript not only advances our knowledge, but also addresses critical challenges in estimating zooplankton biomass and particle dynamics during export events. The findings of this study open up new avenues for future research on the biological carbon pump and deepen our understanding of marine ecosystems
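Boosted regression trees, used above for the habitat models, can be sketched in a few lines with depth-1 trees (stumps) fitted to residuals under squared loss. This toy version on a single feature only illustrates the principle, not the thesis's model:

```python
def fit_stump(x, residuals):
    # Best single threshold on a 1-D feature, minimizing squared error.
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = None
    for cut in range(1, len(x)):
        left = [residuals[order[i]] for i in range(cut)]
        right = [residuals[order[i]] for i in range(cut, len(x))]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            thr = (x[order[cut - 1]] + x[order[cut]]) / 2
            best = (err, thr, lm, rm)
    return best[1:]

def brt_predict(model, xi):
    base, stumps = model
    return base + sum(lv if xi <= thr else rv for thr, lv, rv in stumps)

def fit_brt(x, y, n_rounds=100, lr=0.1):
    """Gradient boosting under squared loss: each round fits a stump to the
    current residuals and adds it with a small learning rate."""
    base = sum(y) / len(y)
    stumps = []
    for _ in range(n_rounds):
        res = [yi - brt_predict((base, stumps), xi) for xi, yi in zip(x, y)]
        thr, lm, rm = fit_stump(x, res)
        stumps.append((thr, lr * lm, lr * rm))
    return base, stumps
```

Real habitat models use many environmental covariates and deeper trees; the residual-fitting loop is the same.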
Armand, Stéphane. "Analyse quantifiée de la marche : extraction de connaissances à partir de données pour l'aide à l'interprétation clinique de la marche digitigrade". Valenciennes, 2005. http://ged.univ-valenciennes.fr/nuxeo/site/esupversions/6cfbb62f-d5e4-4bd3-b7b3-96618bf3ceea.
Clinical Gait Analysis (CGA) is used to identify and quantify gait deviations from biomechanical data. Interpreting CGA, which provides the explanations for the identified gait deviations, is a complex task. Toe-walking is one of the most common gait deviations, and identifying its causes is difficult. The objective of this research was to provide a support tool for interpreting the CGA of toe-walkers. To reach this objective, a Knowledge Discovery in Databases (KDD) method combining unsupervised and supervised machine learning is used to objectively extract intrinsic and discriminant knowledge from CGA data. The unsupervised learning (fuzzy c-means) allowed three toe-walking patterns to be identified from ankle kinematics extracted from a database of more than 2500 CGA (Institut Saint-Pierre, Palavas, 34). The supervised learning was employed to explain these three gait patterns through clinical measurements, using rules induced by fuzzy decision trees. The most significant and interpretable rules (12) were selected to create a knowledge base that has been validated against the literature and by experts. These rules can be used to facilitate the interpretation of toe-walker CGA data. This research opens several prospective paths of investigation, ranging from the development of a generic method, based on the proposed one, for studying movement, to the creation of a pathological gait simulator
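Fuzzy c-means, the unsupervised step mentioned above, assigns each sample a graded membership to every cluster rather than a hard label, which suits patients whose gait pattern sits between groups. A compact sketch of the standard algorithm (fuzziness exponent m = 2; not the thesis's implementation):

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, n_iter=100, seed=0):
    """Alternate between computing membership-weighted cluster centers and
    updating memberships from inverse distances, as in standard FCM."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # rows sum to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-9
        U = 1.0 / d ** (2.0 / (m - 1.0))
        U /= U.sum(axis=1, keepdims=True)
    return centers, U
```

On the ankle-kinematics features, each gait trial would receive a membership degree in each of the three toe-walking patterns.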
Moscu, Mircea. "Inférence distribuée de topologie de graphe à partir de flots de données". Thesis, Université Côte d'Azur, 2020. http://www.theses.fr/2020COAZ4081.
The second decade of the current millennium can be summarized in one short phrase: the advent of data. There has been a surge in the number of data sources: from audio-video streaming, social networks and the Internet of Things to smartwatches, industrial equipment and personal vehicles, to name just a few. More often than not, these sources form networks in order to exchange information. As a direct consequence, the field of Graph Signal Processing has been thriving and evolving, its aim being to process and make sense of the surrounding data deluge. In this context, the main goal of this thesis is to develop methods and algorithms capable of using data streams, in a distributed fashion, to infer the underlying networks that link these streams. The estimated network topologies can then be used with tools developed for Graph Signal Processing in order to process and analyze data supported by graphs. After a brief introduction followed by motivating examples, we first develop and propose an online, distributed and adaptive algorithm for graph topology inference from data streams that are linearly dependent. An analysis of the method ensues, in order to establish relations between performance and the input parameters of the algorithm. We then run a set of experiments to validate the analysis and to compare the method's performance with that of another method proposed in the literature. The next contribution is an algorithm endowed with the same online, distributed and adaptive capacities, but adapted to inferring links between data that interact non-linearly. As such, we propose a simple yet effective additive model which makes use of reproducing kernel machinery to model said nonlinearities.
The results of its analysis are convincing, and experiments run on biomedical data yield estimated networks whose behavior matches that predicted by the medical literature. Finally, a third algorithm is proposed, which aims to improve the nonlinear model by allowing it to escape the constraints induced by additivity. The newly proposed model is as general as possible, and makes use of a natural and intuitive manner of imposing link sparsity, based on the concept of partial derivatives. We analyze this algorithm as well, in order to establish stability conditions and relations between its parameters and its performance. A set of experiments is run, showcasing how the general model is able to better capture nonlinear links in the data, while the estimated networks behave coherently with previous estimates
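The linear case of the topology-inference problem can be illustrated with an LMS sketch: each node adaptively regresses its stream on the other nodes' streams, and the learned weights estimate the influence graph. This is a simplified, centralized stand-in for the online distributed algorithm described above:

```python
import numpy as np

def infer_topology_lms(streams, mu=0.05):
    """Each node i runs an LMS filter predicting its sample x_i[t] from the
    other nodes' samples; the converged weights W[i, j] estimate how strongly
    node j drives node i (thresholding W then yields the graph)."""
    T, N = streams.shape
    W = np.zeros((N, N))
    for t in range(T):
        x = streams[t]
        for i in range(N):
            others = x.copy()
            others[i] = 0.0                  # node i never uses its own sample
            err = x[i] - W[i] @ others       # instantaneous prediction error
            W[i] += mu * err * others        # stochastic-gradient (LMS) update
    return W
```

Because each row update only needs the neighbors' current samples, this recursion is exactly the kind that can be distributed across the nodes of the network.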
Nguyen, Dang Tuan. "Extraction d'information à partir de documents Web multilingues : une approche d'analyses structurelles". Caen, 2006. http://www.theses.fr/2006CAEN2023.
Multilingual Web Document (MWD) processing has become one of the major interests of research and development in the area of information retrieval. We observed, however, that the structure of multilingual resources has not been sufficiently explored in most research works in this area. We consider that the link structure embeds crucial information for both hyperdocument retrieval and mining processes. Discarding multilingual information structures can affect processing performance and generate various problems: (i) redundancy, if the site simultaneously proposes translations in several languages; (ii) noisy information, from the labels used to switch from one language to another; (iii) loss of information, if the process does not consider the structural specificity of each language. In this context, recall that each Web site is considered as a hyperdocument containing a set of Web documents (pages, screens, messages) which can be explored through link paths. Detecting the dominant languages in a Web site can therefore be done in different ways. The framework of this experimental research thesis is structural analysis for information extraction from a great number of heterogeneous structured or semi-structured electronic documents (essentially Web documents). It covers the following aspects: enumerating the dominant languages; setting up (virtual) frontiers between those languages, enabling further processing; and recognizing the dominant languages. To experiment with and validate our approach, we have developed Hyperling, a formal, language-independent system dealing with Web documents. Hyperling proposes a multilingual structural analysis approach to cluster and retrieve Web documents. Hyperling's fundamental hypothesis is based on the notion of relation density: the monolingual relation density, i.e. links between Web documents written in the same language, and the interlingual relation density, i.e.
links between Web documents written in different languages. In a Web document representation we can encounter a high level of monolingual relation density and a low level of interlingual relation density. A MWD can therefore be considered as represented by a set of clusters and, depending on its density level, each cluster may represent a dominant language. This hypothesis has been the core of Hyperling and has been experimented with and validated on real multilingual Web documents (IMF, UNDP, UNFPA, UNICEF, WTO)
Asvatourian, Vahé. "Apports de la modélisation causale dans l’évaluation des immunothérapies à partir de données observationnelles". Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS427/document.
In oncology, new treatments such as immunotherapy have been proposed, which are based on regulation of the immune system. However, not all treated patients have a long-term benefit from the treatment. To identify the patients who benefit most, we measured markers of the immune system expressed at treatment initiation and across time. In an observational study, the lack of randomization means the groups are not comparable, and the effect measured is only an association. In this context, causal inference methods allow, in some cases, after all biases have been identified by constructing a directed acyclic graph (DAG), getting close to conditional exchangeability between exposed and non-exposed subjects and thus estimating causal effects. In the simplest cases, where the number of variables is low, it is possible to draw the DAG from experts' beliefs. When the number of variables rises, learning algorithms have been proposed to estimate the structure of the graphs. Nevertheless, these algorithms assume that no a priori information about the relations between markers is available, and they have mainly been developed in the setting in which covariates are measured only once. The objective of this thesis is to develop graph learning methods that take repeated measures into account and reduce the search space by using a priori expert knowledge. Based on these graphs, we estimate the causal effects of the repeated immune markers on treatment response and/or toxicity
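Once a DAG identifies a sufficient adjustment set, a causal effect can be estimated from observational data, for instance by inverse-probability weighting. The sketch below assumes a single discrete confounder z and stratum-frequency propensities; it is a generic illustration, not the estimator used in the thesis:

```python
from collections import defaultdict

def ipw_effect(records):
    """Average treatment effect by inverse-probability weighting.
    `records` holds (z, a, y) triples: confounder stratum z, binary
    treatment a, outcome y. Assumes z satisfies the backdoor criterion
    (conditional exchangeability given z)."""
    n_z = defaultdict(int)
    n_treated_z = defaultdict(int)
    for z, a, y in records:
        n_z[z] += 1
        n_treated_z[z] += a
    e = {z: n_treated_z[z] / n_z[z] for z in n_z}   # propensity P(a=1 | z)
    n = len(records)
    treated = sum(a * y / e[z] for z, a, y in records) / n
    control = sum((1 - a) * y / (1 - e[z]) for z, a, y in records) / n
    return treated - control
```

On confounded data this weighted contrast recovers the stratum-adjusted effect, whereas the naive treated-versus-untreated difference would be biased by the association between z and treatment.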
Haury, Anne-Claire. "Sélection de variables à partir de données d'expression : signatures moléculaires pour le pronostic du cancer du sein et inférence de réseaux de régulation génique". Phd thesis, Ecole Nationale Supérieure des Mines de Paris, 2012. http://pastel.archives-ouvertes.fr/pastel-00818345.
Pełny tekst źródłaGauthier, Luc-Aurélien. "Inférence de liens signés dans les réseaux sociaux, par apprentissage à partir d'interactions utilisateur". Thesis, Paris 6, 2015. http://www.theses.fr/2015PA066639/document.
Pełny tekst źródłaIn this thesis, we study the semantic of relations between users and, in particular, the antagonistic forces we naturally observe in various social relationships, such as hostility or suspicion. The study of these relationships raises many problems both techniques - because the mathematical arsenal is not really adapted to the negative ties - and practical, due to the difficulty of collecting such data (explaining a negative relationship is perceived as intrusive and inappropriate for many users). That’s why we focus on the alternative solutions consisting in inferring these negative relationships from more widespread content. We use the common judgments about items the users share, which are the data used in recommender systems. We provide three contributions, described in three distinct chapters. In the first one, we discuss the case of agreements about items that may not have the same semantics if they involve appreciated items or not by two users. We will see that disliking the same product does not mean similarity. Afterward, we consider in our second contribution the distributions of user ratings and items ratings in order to measure whether the agreements or disagreements may happen by chance or not, in particular to avoid the user and item biases observed in this type of data. Our third contribution consists in using these results to predict the sign of the links between users from the only positive ties and the common judgments about items, and then without any negative social information
Hedjazi, Lyamine. "Outil d'aide au diagnostic du cancer à partir d'extraction d'informations issues de bases de données et d'analyses par biopuces". Phd thesis, Toulouse 3, 2011. http://thesesups.ups-tlse.fr/1391/.
Cancer is one of the most common causes of death in the world. Currently, breast cancer is the most frequent female cancer. Despite the significant improvements made in recent decades, more accurate cancer management is still needed to help physicians take the necessary treatment decisions, thereby reducing adverse effects as well as expensive medical costs. This work addresses the use of machine learning techniques to develop such breast cancer management tools. Clinical factors, such as patient age and histo-pathological variables, are still the basis of day-to-day decisions in cancer management. However, with the emergence of high-throughput technology, gene expression profiling is gaining increasing attention for building more accurate predictive tools for breast cancer. Nevertheless, several challenges have to be faced in the development of such tools, mainly: (1) the high dimensionality of data issued from microarray technology; (2) the low signal-to-noise ratio in microarray measurements; (3) the membership uncertainty of patients with respect to cancer groups; and (4) the heterogeneous (or mixed-type) data usually present in clinical datasets. In this work we propose approaches to deal appropriately with these challenges. A first approach addresses the problem of high data dimensionality by making use of l1 learning capabilities to design an embedded feature selection algorithm for SVM (l1 SVM) based on a gradient descent technique. The main idea is to transform the initial constrained convex optimization problem into an unconstrained one through the use of an approximated loss function. A second approach handles all challenges simultaneously and therefore allows the integration of several data sources (clinical, microarray, etc.) to build more accurate predictive tools. To this end, a unified principle for dealing with the data heterogeneity problem is proposed.
This principle is based on mapping different types of data from initially heterogeneous spaces into a common space through an adequacy measure. To take membership uncertainty into account and increase model interpretability, this principle is proposed within a fuzzy logic framework. Besides, in order to alleviate the problem of high noise levels, a symbolic approach is proposed, suggesting the use of an interval representation to model the noisy measurements. Since all data are mapped into a common space, they can be processed in a unified way, whatever their initial type, for different data analysis purposes. Based on this principle, we designed in particular a supervised fuzzy feature weighting approach. The weighting process is mainly based on the definition of a membership margin for each sample. It then optimizes a membership-margin-based objective function using a classical optimization approach to avoid combinatorial search. An extension of this approach to the unsupervised case is performed to develop a weighted fuzzy rule-based clustering algorithm. The effectiveness of all approaches has been assessed through extensive experimental studies and compared with well-known state-of-the-art methods. Finally, some breast cancer applications have been performed based on the proposed approaches. In particular, predictive and prognostic models were derived from microarray and/or clinical data and compared with genetic- and clinical-based approaches
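The first approach above (an embedded-feature-selection l1 SVM trained by gradient descent on an approximated loss) can be sketched with a squared hinge standing in for the exact loss. The smoothing choice, step sizes and names below are illustrative assumptions, not the thesis's formulation:

```python
import numpy as np

def l1_svm(X, y, lam=0.01, lr=0.1, n_iter=500):
    """Linear SVM with an l1 penalty, trained by (sub)gradient descent on a
    squared-hinge data term; the l1 term drives the weights of irrelevant
    features (e.g. uninformative genes) toward zero."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        margins = y * (X @ w)
        active = margins < 1                      # samples violating the margin
        # (Sub)gradient of the squared-hinge data term plus the l1 penalty.
        grad = -(y[active, None] * X[active]).T @ (1 - margins[active]) / n
        grad += lam * np.sign(w)
        w -= lr * grad
    return w
```

On data where only the first feature is informative, the learnt weight vector concentrates on that feature, which is the embedded feature selection effect described in the abstract.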
Trépos, Ronan. "Apprentissage symbolique à partir de données issues de simulation pour l’aide à la décision : gestion d’un bassin versant pour une meilleure qualité de l’eau". Rennes 1, 2008. http://www.theses.fr/2008REN1S004.
It is often difficult to analyze the results of a simulation model that represents the behavior of an environmental system, due to the large number of input variables and the complexity of interactions between the simulated processes. We proposed to use symbolic learning techniques to perform this analysis, the goal of which is to learn classification rules for decision support. Two rule-learning methods have been developed and compared. In our context, the objects to be analyzed are tree structures whose nodes are labelled by attributes. Afterwards, we developed a system which, from the induced rules, suggests actions so that a situation proposed by a user can be improved. These contributions were motivated by the SACADEAU project, devoted to developing a decision support system for the management of catchment areas. The project relies on a model that combines a model of farming practices with a model of pesticide transfer
Ekhteraei, Toussi Mohammad Massoud. "Analyse et reconstitution des décisions thérapeutiques des médecins et des patients à partir des données enregistrées dans les dossiers patient informatisés". Paris 13, 2009. http://www.theses.fr/2009PA132029.
This thesis deals with the study of the agreement between therapeutic decisions and recommendations of best practice. We propose three methods for the analysis and reconstruction of physicians' and patients' therapeutic decisions from the information available in patient records. Our first method involves the analysis of the agreement between physicians' prescriptions and the recommendations of best practice. We present a typology of drug therapy, applicable to chronic diseases, allowing both prescriptions and recommendations to be formalized and compared at three levels of detail: the type of treatment, the pharmaco-therapeutic class, and the dose of each medication. Our second method involves the extraction of physicians' therapeutic decisions from patient records when the guidelines do not offer recommendations. We first present a method for discovering knowledge gaps in clinical practice guidelines. Then we apply a machine learning algorithm (Quinlan's C5.0) to a database of patient records to extract new rules that we graft onto the decision tree of the original guideline. Our third method involves the analysis of the compliance of patients' therapeutic decisions with physicians' recommendations concerning insulin dose adjustment. We present five indicators useful for verifying the level of patient compliance: absolute agreement (AA) and relative agreement (RA) show acceptable compliance, extreme disagreement (ED) shows dangerous behavior, and over-treatment (OT) and under-treatment (UT) show that the administered dose was respectively too high or too low
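The five compliance indicators can be sketched as a simple classification of one administered dose against the recommended one. The ±10% and ±50% thresholds below are purely hypothetical, chosen only for illustration:

```python
def compliance_indicator(administered, recommended, tol=0.10, extreme=0.50):
    """Classify one dose decision against the physician's recommendation.
    Hypothetical thresholds: relative agreement within +/-10%, extreme
    disagreement beyond +/-50% of the recommended dose."""
    if administered == recommended:
        return "AA"                          # absolute agreement
    ratio = (administered - recommended) / recommended
    if abs(ratio) <= tol:
        return "RA"                          # relative agreement
    if abs(ratio) > extreme:
        return "ED"                          # extreme disagreement
    return "OT" if ratio > 0 else "UT"       # over- / under-treatment
```

Aggregating these labels over a patient's record would give the compliance profile the abstract describes.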
Giffard-Roisin, Sophie. "Personnalisation non-invasive de modèles électrophysiologiques cardiaques à partir d'électrogrammes surfaciques". Thesis, Université Côte d'Azur (ComUE), 2017. http://www.theses.fr/2017AZUR4092/document.
The objective of this thesis is to use non-invasive data (body surface potential mapping, BSPM) to personalise the main parameters of a cardiac electrophysiological (EP) model for predicting the response to cardiac resynchronization therapy (CRT). CRT is a clinically proven treatment option for some heart failures. However, these therapies are ineffective in 30% of treated patients and involve significant morbidity and substantial cost. A precise understanding of the patient-specific cardiac function can help predict the response to therapy. Until now, such methods required measuring intra-cardiac electrical potentials through an invasive endovascular procedure that can put the patient at risk. We developed a non-invasive EP model personalisation based on a patient-specific simulated database and machine learning regressions. First, we estimated the onset activation location and a global conduction parameter. We extended this approach to multiple onsets and to ischemic patients by means of a sparse Bayesian regression. Moreover, we developed a reference ventricle-torso anatomy in order to perform a common offline regression, and we predicted the response to different pacing conditions from the personalised model. In a second part, we studied the adaptation of the proposed method to 12-lead electrocardiogram (ECG) input and its integration into an electro-mechanical model for clinical use. Our work was evaluated on a large dataset (more than 25 patients and 150 cardiac cycles). Besides yielding results comparable with state-of-the-art ECG imaging methods, the predicted BSPMs show good correlation coefficients with the real BSPMs.
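The sparse Bayesian regression mentioned above can be illustrated with automatic relevance determination (ARD), which drives the weights of irrelevant features toward zero. This is a minimal sketch under assumed conditions: `sklearn`'s `ARDRegression` stands in for the thesis's regression, and the synthetic matrix stands in for real BSPM descriptors.

```python
# Sketch: sparse Bayesian (ARD) regression on synthetic data where only
# the first 3 of 30 features actually influence the target.
import numpy as np
from sklearn.linear_model import ARDRegression

rng = np.random.default_rng(0)
n_samples, n_features = 200, 30
X = rng.normal(size=(n_samples, n_features))
true_w = np.zeros(n_features)
true_w[:3] = [2.0, -1.5, 1.0]          # only 3 features truly matter
y = X @ true_w + 0.05 * rng.normal(size=n_samples)

model = ARDRegression()                 # prunes irrelevant features
model.fit(X, y)

# ARD should leave only a few coefficients far from zero.
n_active = int(np.sum(np.abs(model.coef_) > 0.1))
print(n_active)  # expected: close to 3
```

The sparsity is what makes the approach attractive for EP personalisation: among many candidate signal features, only the informative ones keep non-zero weight.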
Ribeiro, Swen. "Induction non-supervisée de schémas d’évènements à partir de textes journalistiques". Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASS059.
Events are central to many Natural Language Processing tasks, despite the lack of a unified definition of the concept. The field of event processing took off with the MUC evaluation campaigns, which provided participants with reference structures called templates. These templates were composed of a title (the name of the event) and several slots, i.e. specific and atomic pieces of data about the event. Creating these templates is an expert task and therefore costly, painstaking, and hard to extend to new domains. Meanwhile, the amount of data produced by individuals and organizations has grown exponentially, opening unprecedented application perspectives. In the journalistic domain, it fueled the development of a new paradigm called data journalism. In this work, we aim at inducing synthetic representations of events from large journalistic text corpora. These representations would be comparable to MUC templates and used by data journalists to explore large textual news datasets. To this end, we propose a bottom-up approach composed of three main steps. The first step clusters several textual mentions of the same particular event (i.e. tied to a time and place) to identify distinct instances. The second step groups these instances together based on more abstract features to infer event types. Finally, the third and last step extracts the most salient elements of each type to produce the synthetic, template-like structure we are looking for.
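The first step of the pipeline, grouping mentions of the same event instance, can be sketched with a deliberately simple similarity measure. This is an illustration only: the thesis pipeline is far richer, and here a greedy Jaccard overlap on token sets (with an assumed similarity threshold) stands in for it.

```python
# Sketch: greedy clustering of event mentions by lexical (Jaccard) similarity.
def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b)

def cluster_mentions(mentions, threshold=0.2):
    """Attach each mention to the first cluster whose accumulated token
    set is similar enough, else start a new cluster."""
    clusters = []  # list of (token_set, [mention indices])
    for i, text in enumerate(mentions):
        tokens = set(text.lower().split())
        for rep, members in clusters:
            if jaccard(tokens, rep) >= threshold:
                members.append(i)
                rep |= tokens        # grow the cluster's token set
                break
        else:
            clusters.append((tokens, [i]))
    return [members for _, members in clusters]

mentions = [
    "earthquake strikes northern japan on tuesday",
    "strong earthquake hits japan coast",
    "french presidential election enters second round",
    "second round of the presidential election in france",
]
print(cluster_mentions(mentions))  # → [[0, 1], [2, 3]]
```

The second step (inferring event types) would then operate on these clusters using more abstract features than raw tokens.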
Boulfani, Fériel. "Caractérisation du comportement de systèmes électriques aéronautiques à partir d'analyses statistiques". Thesis, Toulouse 1, 2021. http://publications.ut-capitole.fr/43780/.
The characterization of electrical systems is an essential task in aeronautical design. It consists in particular of sizing the electrical components, defining maintenance frequency, and finding the root cause of aircraft failures. Nowadays, the computations are made using electrical engineering theory and simulated physical models. The aim of this thesis is to use statistical approaches based on flight data and machine learning models to characterize the behavior of aeronautic electrical systems. In the first part, we estimate the maximal electrical consumption that the generator should deliver, in order to optimize the generator size and to better understand its real margin. Using extreme value theory, we estimate quantiles that we compare to the theoretical values computed by the electrical engineers. In the second part, we compare different regularized procedures to predict the oil temperature of a generator in a functional data framework. In particular, this study makes it possible to understand the generator's behavior under extreme conditions that could not be reproduced physically. Finally, in the last part, we develop a predictive maintenance model that detects abnormal generator behavior in order to anticipate failures. This model is based on variants of "Invariant Coordinate Selection" adapted to functional data.
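The extreme-quantile estimation described above can be sketched with the standard peaks-over-threshold method: fit a Generalized Pareto Distribution (GPD) to exceedances over a high threshold, then extrapolate to a far quantile. This is an illustration under assumed conditions, not the thesis's estimator; the synthetic exponential sample stands in for real flight electrical-load data.

```python
# Sketch: peaks-over-threshold quantile estimation with a GPD fit.
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(42)
load = rng.exponential(scale=1.0, size=5000)   # synthetic "consumption" data

u = np.quantile(load, 0.95)                    # high threshold
exceedances = load[load > u] - u
xi, _, sigma = genpareto.fit(exceedances, floc=0)

# POT quantile formula for probability level p:
p = 0.999
n, n_u = len(load), len(exceedances)
q_hat = u + (sigma / xi) * ((n / n_u * (1 - p)) ** (-xi) - 1)
print(round(q_hat, 2))  # for exponential data, the true 99.9% quantile is ~6.91
```

The estimated quantile extrapolates beyond the bulk of the data, which is exactly what sizing a generator for rare peak loads requires.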
Madra, Anna. "Analyse et visualisation de la géométrie des matériaux composites à partir de données d’imagerie 3D". Thesis, Compiègne, 2017. http://www.theses.fr/2017COMP2387/document.
The subject of the thesis project between Laboratoire Roberval at Université de Technologie Compiègne and the Center for High-Performance Composites at Ecole Polytechnique de Montréal was the design of a deep learning architecture with semantics for the automatic generation of models of composite-material microstructure from X-ray microtomographic imagery. The thesis consists of three major parts. Firstly, the methods of microtomographic image processing are presented, with an emphasis on phase segmentation. Then, the geometric features of phase elements are extracted and used to classify and identify new morphologies. The method is presented for composites filled with short natural fibers. The classification approach is also demonstrated for the study of defects in composites, with spatial features added to the process. A high-level descriptor, the "defect genome", is proposed, which permits comparison of the state of defects between specimens. The second part of the thesis introduces structural segmentation, using the example of a woven reinforcement in a composite. The method relies on dual kriging, calibrated by the segmentation error of the learning algorithms. In the final part, a stochastic formulation of the kriging model based on Gaussian processes is presented, and the distribution of physical properties of a composite microstructure is retrieved, ready for numerical simulation of the manufacturing process or of the mechanical behavior.
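Kriging in its stochastic, Gaussian-process formulation can be sketched in a few lines. This is a minimal illustration, not the thesis's model: `sklearn`'s `GaussianProcessRegressor` stands in for the dual-kriging machinery, and a synthetic 1-D profile stands in for measured reinforcement geometry.

```python
# Sketch: noise-free kriging as Gaussian Process regression.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

X_train = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y_train = np.sin(X_train).ravel()               # stand-in for measured geometry

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0))
gp.fit(X_train, y_train)

X_test = np.array([[2.0], [2.5]])
mean, std = gp.predict(X_test, return_std=True)

# Noise-free kriging interpolates the training points exactly,
# and the predictive standard deviation grows between them.
print(std[0] < std[1])
```

The predictive standard deviation is what turns the interpolated geometry into a distribution of microstructures, ready to feed stochastic simulations.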
Cherfi, Hacène. "Etude et réalisation d'un système d'extraction de connaissances à partir de textes". Phd thesis, Université Henri Poincaré - Nancy I, 2004. http://tel.archives-ouvertes.fr/tel-00011195.
The use of a knowledge model supports and, above all, complements this first approach. Through the definition of a likelihood measure, we show the value of discovering new knowledge by setting aside the knowledge already recorded and described in a domain knowledge model. Association rules can therefore be used to feed a terminological knowledge model of the domain of the chosen texts. The thesis includes the implementation of a system called TAMIS: "Text Analysis by Mining Interesting ruleS", as well as an experiment and a validation on real data consisting of abstracts of texts in molecular biology.
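Association-rule mining of the kind TAMIS performs on texts can be sketched with support, confidence, and lift, where each "transaction" is the set of terms indexing one abstract. This is an illustration only; lift here is a simple stand-in for the likelihood measure defined in the thesis, and the toy term sets are invented.

```python
# Sketch: scoring an association rule over term-set "transactions".
transactions = [
    {"gene", "protein", "expression"},
    {"gene", "expression", "regulation"},
    {"protein", "folding"},
    {"gene", "protein", "expression", "regulation"},
]

def support(itemset):
    """Fraction of transactions containing every item of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def rule_metrics(lhs, rhs):
    """Confidence and lift of the rule lhs -> rhs."""
    conf = support(lhs | rhs) / support(lhs)
    lift = conf / support(rhs)
    return conf, lift

conf, lift = rule_metrics({"gene"}, {"expression"})
print(conf, lift)  # → 1.0 1.3333333333333333
```

A rule with high confidence but lift near 1 restates what the knowledge model already implies; the interesting rules are those that remain surprising after the model is taken into account.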