Theses on the topic "Génération à partir de données"
Cite a source in APA, MLA, Chicago, Harvard and many other citation styles.
See the top 50 dissertations / theses (master's or doctoral) for research on the topic "Génération à partir de données".
Next to every source in the list of references there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scientific publication in .pdf format and read the abstract (summary) of the work online if it is available in the metadata.
Browse theses from many scientific areas and compile a correct bibliography.
Baez Miranda, Belen. "Génération de récits à partir de données ambiantes". Thesis, Université Grenoble Alpes (ComUE), 2018. http://www.theses.fr/2018GREAM049/document.
Stories are a communication tool that allows people to make sense of the world around them. They provide a platform for understanding and sharing culture, knowledge and identity. Stories carry a series of real or imaginary events, causing a feeling, a reaction, or even triggering an action. For this reason, they have become a subject of interest for fields beyond Literature (Education, Marketing, Psychology, etc.) that seek to achieve a particular goal through them (persuade, reflect, learn, etc.). However, stories remain underdeveloped in Computer Science. There are works that focus on their analysis and automatic production, but those algorithms and implementations remain constrained to imitating the creative process behind literary texts from textual sources. Thus, there are no approaches that automatically produce stories in which 1) the source consists of raw material that happened in real life and 2) the content projects a perspective that seeks to convey a particular message. Working with raw data becomes relevant today as it increases exponentially each day through the use of connected devices. Given the context of Big Data, we present an approach to automatically generate stories from ambient data. The objective of this work is to bring out the lived experience of a person from the data produced during a human activity. Any area that uses such raw data could benefit from this work, for example Education or Health. It is an interdisciplinary effort that includes Automatic Language Processing, Narratology, Cognitive Science and Human-Computer Interaction. This approach is based on corpora and models and includes the formalization of what we call the activity récit as well as an adapted generation approach. It consists of 4 stages: the formalization of the activity récit, corpus constitution, construction of models of the activity and of the récit, and text generation. Each stage has been designed to overcome constraints related to the scientific questions raised, given the nature of the objective: manipulation of uncertain and incomplete data, valid abstraction according to the activity, and construction of models from which it is possible to transpose the reality collected through the data into a subjective perspective rendered in natural language. We used the activity récit as a case study, since practitioners use connected devices and need to share their experience. The results obtained are encouraging and provide leads that open up many prospects for research.
Uribe, Lobello Ricardo. "Génération de maillages adaptatifs à partir de données volumiques de grande taille". Thesis, Lyon 2, 2013. http://www.theses.fr/2013LYO22024.
In this document, we are interested in surface extraction from the volumetric representation of an object. With this objective in mind, we have studied spatial subdivision surface extraction algorithms. These approaches divide the volume in order to build a piecewise approximation of the surface; the general idea is to combine local, simple approximations to extract a complete representation of the object's surface. Methods based on the Marching Cubes (MC) algorithm have difficulty producing good-quality, adaptive surfaces. Even though many improvements to MC have been proposed, each of these approaches solves one or two problems but does not offer a complete solution to all the MC drawbacks. Dual methods are better suited to adaptive sampling over volumes. These methods generate surfaces that are dual to those generated by the Marching Cubes algorithm, or build dual grids in order to apply MC methods. Such solutions build adaptive meshes that represent the features of the object well; in addition, recent improvements guarantee that the produced meshes have good geometrical and topological properties. In this dissertation, we have studied the main topological and geometrical properties of volumetric objects. In a first stage, we explored the state of the art of spatial subdivision surface extraction methods in order to identify their advantages, their drawbacks, and the implications of applying them to volumetric objects. We concluded that a dual approach is the best option to obtain a good compromise between mesh quality and geometrical approximation. In a second stage, we developed a general pipeline for surface extraction based on a combination of dual methods and connected-component extraction to better capture the topology and geometry of the original object. In a third stage, we presented an out-of-core extension of our surface extraction pipeline in order to extract adaptive meshes from huge volumes: volumes are divided into smaller sub-volumes that are processed independently to produce surface patches, which are later combined into a unique and topologically correct surface. This approach can be implemented in parallel to speed up its performance. Tests carried out on a vast set of volumes have confirmed our results and the features of our solution.
Raschia, Guillaume. "SaintEtiq : une approche floue pour la génération de résumés à partir de bases de données relationnelles". Nantes, 2001. http://www.theses.fr/2001NANT2099.
Sridhar, Srivatsan. "Analyse statistique de la distribution des amas de galaxies à partir des grands relevés de la nouvelle génération". Thesis, Université Côte d'Azur (ComUE), 2016. http://www.theses.fr/2016AZUR4152/document.
I aim to study to what accuracy it is actually possible to recover the real-space two-point correlation function from cluster catalogues based on photometric redshifts. I make use of cluster sub-samples selected from a simulated light-cone catalogue. Photometric redshifts are assigned to each cluster by random draws from a Gaussian distribution with a dispersion varied in the range σ(z=0) = 0.005 to 0.050. The correlation function in real space is computed through a deprojection method. Four mass ranges and six redshift slices covering the redshift range 0
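To make the two ingredients of this setup concrete, here is a minimal sketch (illustrative only, not the thesis code): Gaussian photometric-redshift scatter added to true cluster redshifts, and the standard Landy-Szalay estimator applied to normalised pair counts. The dispersion value and the pair-count inputs are assumptions.

```python
import numpy as np

def add_photoz_scatter(z_true, sigma0=0.01, rng=None):
    """Draw photometric redshifts z_phot ~ N(z_true, sigma0 * (1 + z_true))."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = sigma0 * (1.0 + z_true)          # dispersion assumed to grow with redshift
    return rng.normal(loc=z_true, scale=sigma)

def landy_szalay(dd, dr, rr, n_d, n_r):
    """Landy-Szalay estimator xi(r) from raw pair counts per separation bin.

    dd, dr, rr : data-data, data-random and random-random pair counts
    n_d, n_r   : numbers of data and random points, used to normalise the counts
    """
    dd_n = dd / (n_d * (n_d - 1) / 2.0)
    dr_n = dr / (n_d * n_r)
    rr_n = rr / (n_r * (n_r - 1) / 2.0)
    return (dd_n - 2.0 * dr_n + rr_n) / rr_n
```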
Pentek, Quentin. "Contribution à la génération de cartes 3D-couleur de milieux naturels à partir de données d'un système multicapteur pour drone". Thesis, Montpellier, 2020. http://www.theses.fr/2020MONTS037.
This thesis constitutes preliminary work towards the construction of 3D-colour maps. It aims to solve the problem of combining LiDAR data and optical imagery acquired from a drone. Two prerequisites are identified: characterizing the measurement errors of the heterogeneous data produced by the sensors, and geometrically aligning the latter. First, we propose the development of a LiDAR measurement uncertainty prediction model that takes into account the influence of the laser footprint. A new reference-free method is introduced to validate this prediction model, and a second method using a reference plane validates the adequacy of the reference-free method. In a second step, we propose a new method for calibrating the multi-sensor system consisting of a LiDAR, a camera, an inertial navigation system and a global navigation satellite system. The performance of this method is evaluated on synthetic and real data. It has the advantage of being fully automatic, does not require a calibration object or ground control points, and can operate in either natural or urban environments. The flexibility of this method allows it to be deployed quickly before each acquisition. Finally, we propose a method to generate 3D-colour maps in the form of coloured point clouds. Our experiments show that geometric data alignment significantly improves the quality of the 3D-colour maps. Looking more closely at these maps, colourization errors remain, due mainly to measurement errors not being taken into account. Using the proposed LiDAR measurement uncertainty prediction model in the construction of 3D-colour maps would therefore be the logical continuation of this work.
Broseus, Lucile. "Méthodes d'étude de la rétention d'intron à partir de données de séquençage de seconde et de troisième générations". Thesis, Montpellier, 2020. http://www.theses.fr/2020MONTT027.
In eukaryotic cells, the roles of RNA transcripts are known to be varied. Besides their role as messengers, transferring information from DNA to protein synthesis, the usage of alternative transcripts appears as a means to control gene expression in a post-transcriptional manner. For example, the production of mature transcripts retaining introns (IRTs) was recently shown to take part in several distinct regulatory mechanisms. These observations benefited greatly from the development of the second generation of RNA sequencing (RNA-seq). However, these data do not allow the entire structure of IRTs to be identified, and their catalogue is still fragmented. The emerging third generation of RNA-seq, able to read RNA sequences in their full length, could help achieve this goal. Despite their respective drawbacks and biases, both technologies are, to some extent, complementary. It is therefore appealing to try to combine them through so-called hybrid methods, so as to perform analyses at the isoform level. In the present thesis, we aim to investigate the potential of these two types of data, alone or in combination, in order to study intron retention (IR) events more specifically. A growing number of studies harness the high coverage depths provided by second-generation data to detect and quantify IR. However, there exist few dedicated computational methods, and many studies rely on methods designed for other purposes, such as gene or exon expression analysis. In any case, their ability to accurately measure IR has not been certified. For this reason, we set up a benchmark of the various IR quantification methods. Our study reveals several biases prone to prejudice the interpretation of results, and prompted us to propose a novel method to estimate IR levels. Beyond event-centred analyses, Oxford Nanopore long-read data have the capability to reveal the full-length structure of IRTs, and thereby allow some of their features to be inferred. However, their high error rate and truncation events constitute inescapable impediments. Transcriptome-wide, the computational treatment of these data necessitates heuristics which favour specific transcript forms and generally overlook rare or unexpected ones. This results in a considerable loss of information and precludes meaningful interpretations. To address these issues, we developed a hybrid correction method and propose specific strategies to recover and characterize IRTs.
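For orientation, one common generic way of defining an intron-retention level from short-read counts (stated here as background; it is not necessarily the estimator proposed in the thesis, which is precisely designed to correct the biases mentioned above) is the ratio

```latex
\mathrm{IR} = \frac{c_{\text{intron}}}{c_{\text{intron}} + c_{\text{spliced}}}
```

where c_intron counts reads supporting retention of the intron and c_spliced counts reads spanning the corresponding spliced junction.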
Kersale, Marion. "Dynamique de processus océaniques de méso- et de subméso-échelle à partir de simulations numériques et de données in situ". Thesis, Aix-Marseille, 2013. http://www.theses.fr/2013AIXM4061.
The hydrodynamics around oceanic islands and in coastal areas is characterized by the presence of numerous meso- and submesoscale features. The aim of this PhD thesis is to study, from in situ data and numerical modelling, firstly the predominance of certain forcings on the generation of these features, and secondly their dynamics and their impacts on the dispersion of coastal waters. Firstly, a study based on a series of numerical simulations in the Hawaiian region allows us to examine the relative importance of wind, topographic and inflow current forcing on the generation of mesoscale eddies. Sensitivity tests have shown the importance of a high spatial resolution of the wind forcing. Secondly, the coastal dynamics of the Gulf of Lions (GoL), also subject to these forcings, has been investigated. A first part focuses on the physical characteristics and the dynamics of an eddy in the western part of the gulf, using data from the Latex09 campaign and results from a realistic hydrodynamic model of the GoL. Their combined analysis has allowed us to identify a new generation mechanism for the mesoscale eddies in this area and to understand the formation of a transient submesoscale structure. This work has shown the importance of these structures in modulating exchanges in this region. Based on the data of the Latex10 campaign, a second part then focused on the dispersion of coastal waters in the western area of the GoL. The tracking of the water masses in a Lagrangian reference frame (floats, tracer) has allowed us to determine the horizontal and vertical diffusion coefficients in this key area for coastal-offshore and interregional exchanges.
Thurin, Nicolas H. "Evaluation empirique d’approches basées sur les cas pour la génération d’alertes de pharmacovigilance à partir du Système National des Données de Santé (SNDS)". Thesis, Bordeaux, 2019. http://www.theses.fr/2019BORD0408.
France has a large nationwide longitudinal database with claims and hospital data, the Système National des Données de Santé (French national healthcare database – SNDS), which currently covers almost the complete French population, from birth or immigration to death or emigration, and includes all reimbursed medical and paramedical encounters. Since the SNDS systematically and prospectively captures drug dispensings, deaths, and events leading to hospital stays, it has a strong potential for drug assessment in real-life settings. Following the worldwide withdrawal of rofecoxib in 2004, several initiatives aiming to develop and evaluate methodologies for drug safety monitoring on healthcare databases emerged. The EU-ADR alliance (Exploring and Understanding Adverse Drug Reactions by integrative mining of clinical records and biomedical knowledge) and OMOP (Observational Medical Outcomes Partnership) were launched in Europe and in the United States, respectively. These experiments demonstrated the usefulness of pharmacoepidemiological approaches in drug safety signal detection. However, the SNDS had never been tested in this scope. The objective of this thesis was to empirically assess 3 case-based designs – case-population, case-control, and self-controlled case series – for drug-safety alert generation in the SNDS, taking as examples two health outcomes of interest: upper gastrointestinal bleeding (UGIB) and acute liver injury (ALI). The overall project consisted of 4 main stages: (1) preparation of the data to fit the OMOP common data model and selection of positive and negative drug controls for each outcome of interest; (2) analysis of the selected drug controls using the 3 case-based designs, testing several design variants (e.g. different risk windows, adjustment strategies, etc.); (3) comparison of the performances of the design variants through the calculation of the area under the receiver operating characteristic curve (AUC), the mean squared error (MSE) and the coverage probability; (4) selection of the best design variant and its calibration for each health outcome of interest. Self-controlled case series showed the best performances for both outcomes, ALI and UGIB, with AUCs reaching 0.80 and 0.94 and MSEs of 0.07 and 0.12, respectively. For UGIB, optimal performances were observed when adjusting for multiple drugs and using a risk window corresponding to the first 30 days of exposure. For ALI, optimal performances were also observed when adjusting for multiple drugs, but using a risk window corresponding to the overall period covered by drug dispensings. Negative drug control implementation highlighted that the optimal variants seemed to generate little systematic error in the SNDS, but that protopathic bias and confounding by indication remained unaddressed issues. These results showed that self-controlled case series are well suited to accurately detecting drug safety alerts associated with UGIB and ALI in the SNDS. A clinical perspective remains necessary to rule out potential false-positive signals arising from residual confounding. The routine application of such approaches, extended to other outcomes of interest, could result in substantial progress in pharmacovigilance in France.
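As an illustration of how design variants can be ranked against a reference set of drug controls, the sketch below computes the two headline metrics named above; the control labels, effect estimates and assumed true effects are made-up placeholders, not SNDS results.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical controls: 1 = positive control (known to cause the outcome), 0 = negative control.
labels = np.array([1, 1, 1, 0, 0, 0, 0])
# Effect estimates (e.g. incidence rate ratios) produced by one design variant for these drugs.
estimates = np.array([2.4, 1.8, 3.1, 1.1, 0.9, 1.3, 1.0])
# Assumed "true" effects: some positive value for positive controls, 1.0 for negative controls.
truth = np.array([2.0, 2.0, 2.0, 1.0, 1.0, 1.0, 1.0])

auc = roc_auc_score(labels, estimates)                     # discrimination between positive and negative controls
mse = np.mean((np.log(estimates) - np.log(truth)) ** 2)    # error of the effect estimates on the log scale

print(f"AUC = {auc:.2f}, MSE = {mse:.3f}")
```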
Nguyen, Trung Ky. "Génération d'histoires à partir de données de téléphone intelligentes : une approche de script Dealing with Imbalanced data sets for Human Activity Recognition using Mobile Phone sensors". Thesis, Université Grenoble Alpes (ComUE), 2019. http://www.theses.fr/2019GREAS030.
A script is a structure that describes an appropriate sequence of events or actions in our daily life. A story is a script invoked with one or more interesting deviations, which allows us to understand more deeply what happened in the routine behaviour of our daily life. It is therefore essential to many ambient intelligence applications such as health monitoring and emergency services. Fortunately, in recent years, advances in sensing technologies and embedded systems have made it possible for health-care systems to collect human activities continuously, by integrating sensors into wearable devices (e.g., smartphones, smartwatches, etc.). Hence, human activity recognition (HAR) has become a hot research topic over the past decades. To perform HAR, most studies use machine learning approaches such as neural networks, Bayesian networks, etc. The ultimate goal of this thesis is therefore to generate such stories or scripts from the activity data of wearable sensors using machine learning. To the best of our knowledge, this is not a trivial task, due to the very limited information carried by wearable-sensor activity data. Hence, there is still no approach that generates scripts or stories using machine learning, even though many machine learning approaches have been proposed for HAR in recent years (e.g., convolutional neural networks, deep neural networks, etc.) to enhance activity recognition accuracy. In order to achieve our goal, we first propose a novel framework, which addresses the problem of imbalanced data, based on active learning combined with an oversampling technique, so as to enhance the recognition accuracy of conventional machine learning models, i.e., the Multilayer Perceptron. Secondly, we introduce a novel scheme to automatically generate scripts from wearable-sensor human activity data using deep learning models, and evaluate its performance. Finally, we propose a neural event embedding approach that is able to benefit from semantic and syntactic information about the textual context of events. The approach is able to learn the stereotypical order of events from sets of narratives describing typical situations of everyday life.
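The imbalanced-data step can be pictured with the minimal sketch below. It uses plain random oversampling of minority activity classes before training a Multilayer Perceptron; this covers only one half of the framework described (the active-learning loop is omitted), and all array shapes and labels are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def random_oversample(X, y, rng=None):
    """Duplicate minority-class samples until every class matches the majority class size."""
    rng = np.random.default_rng() if rng is None else rng
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [], []
    for c in classes:
        idx = np.flatnonzero(y == c)
        extra = rng.choice(idx, size=target - idx.size, replace=True) if idx.size < target else []
        keep = np.concatenate([idx, extra]) if len(extra) else idx
        X_parts.append(X[keep])
        y_parts.append(y[keep])
    return np.vstack(X_parts), np.concatenate(y_parts)

# X: windows of wearable-sensor features, y: activity labels (assumed shapes and classes).
X = np.random.randn(500, 20)
y = np.random.choice([0, 0, 0, 1, 2], size=500)   # deliberately imbalanced toy labels
X_bal, y_bal = random_oversample(X, y)
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300).fit(X_bal, y_bal)
```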
Potes, Ruiz Paula Andrea. "Génération de connaissances à l’aide du retour d’expérience : application à la maintenance industrielle". Thesis, Toulouse, INPT, 2014. http://www.theses.fr/2014INPT0089/document.
The research work presented in this thesis relates to knowledge extraction from past experiences in order to improve the performance of industrial processes. Knowledge is nowadays considered an important strategic resource providing a decisive competitive advantage to organizations. Knowledge management (especially experience feedback) is used to preserve and enhance the information related to a company's activities in order to support decision-making and create new knowledge from the intangible heritage of the organization. In that context, advances in information and communication technologies play an essential role in gathering and processing knowledge. The generalised implementation of industrial information systems such as ERPs (Enterprise Resource Planning) makes available a large amount of data related to past events or historical facts, whose reuse is becoming a major issue. However, these fragments of knowledge (past experiences) are highly contextualized and require specific methodologies in order to be generalized. Taking into account the great potential of the information collected in companies as a source of new knowledge, we suggest in this work an original approach to generate new knowledge based on the analysis of past experiences, building on the complementarity of two scientific threads: Experience Feedback (EF) and Knowledge Discovery in Databases (KDD) techniques. The suggested EF-KDD combination focuses mainly on: i) modelling the collected experiences using a knowledge representation formalism in order to facilitate their future exploitation, and ii) applying data mining techniques in order to extract new knowledge in the form of rules. These rules must necessarily be evaluated and validated by experts of the industrial domain before their reuse and/or integration into the industrial system. Throughout this approach, we have given a privileged position to Conceptual Graphs (CGs), the knowledge representation formalism chosen to facilitate the storage, processing and understanding of the extracted knowledge by the user for future exploitation. This thesis is divided into four chapters. The first chapter is a state of the art addressing the generalities of the two scientific threads that contribute to our proposal: EF and KDD. The second chapter presents the suggested EF-KDD approach and the tools used for the generation of new knowledge, in order to exploit the available information describing past experiences. The third chapter suggests a structured methodology for interpreting and evaluating the usefulness of the extracted knowledge during the post-processing phase of the KDD process. Finally, the last chapter discusses real case studies dealing with the industrial maintenance domain, to which the proposed approach has been applied.
Maupetit, Julien. "Génération ab initio de modèles protéiques à partir de représentations discrètes des protéines et de critères d'énergie simplifiés". Paris 7, 2007. http://www.theses.fr/2007PA077194.
In a post-genomic context, many proteins identified by their sequence have no experimentally resolved structure and fall outside the range of application of comparative modelling methods. The goal of my PhD thesis was to explore a new de novo protein structure prediction approach. This approach is based on the concept of a structural alphabet, i.e., a local description of protein architecture using a small number of prototype conformations. Starting from the amino-acid sequence of the protein to model, we developed a candidate-fragment prediction method covering more than 98.6% of the protein structure with an average fragment length of 6.7 residues. This set of predicted fragments can approximate protein structures with a precision of less than 2.2 angstroms. A greedy algorithm has been developed in the laboratory to assemble fragments. The OPEP force field was optimized and then implemented in the greedy assembly procedure to evaluate the relevance of the generated models. Our participation in the CASP7 experiment revealed some weaknesses of the method. For now, the improvement of the OPEP force field and of the fragment assembly procedure leads us to generate, in some cases, models as relevant as or better than those of other well-known protein structure prediction servers.
Meghnoudj, Houssem. "Génération de caractéristiques à partir de séries temporelles physiologiques basée sur le contrôle optimal parcimonieux : application au diagnostic de maladies et de troubles humains". Electronic Thesis or Diss., Université Grenoble Alpes, 2024. http://www.theses.fr/2024GRALT003.
In this thesis, a novel methodology for feature generation from physiological signals (EEG, ECG) is proposed and used for the diagnosis of a variety of brain and heart diseases. Based on sparse optimal control, the generation of Sparse Dynamical Features (SDFs) is inspired by the functioning of the brain. The method's fundamental concept revolves around sparsely decomposing the signal into dynamical modes that can be switched on and off at the appropriate time instants with the appropriate amplitudes. This decomposition provides a new point of view on the data, giving access to informative features that are faithful to brain functioning. Nevertheless, the method remains generic and versatile, as it can be applied to a wide range of signals. The methodology's performance was evaluated on three use cases using openly accessible real-world data: (1) Parkinson's disease, (2) schizophrenia, and (3) various cardiac diseases. For all three applications, the results are highly conclusive, comparable to state-of-the-art methods while using only a few features (one or two for the brain applications) and a simple linear classifier, which supports the significance and reliability of the findings. It is worth highlighting that special attention has been paid to achieving significant and meaningful results with an underlying explainability.
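A generic way to write down such a sparse-input decomposition (a sketch of the idea under stated assumptions, not necessarily the exact cost used in the thesis) is the following optimal control problem, where the signal y is approximated by the output of a linear dynamical system (A, B, C are assumed system matrices, λ a regularisation weight) driven by an input u that is encouraged to be sparse in time:

```latex
\min_{u}\; \sum_{t=1}^{T} \lVert y_t - C x_t \rVert_2^2 \;+\; \lambda \sum_{t=1}^{T} \lVert u_t \rVert_1
\qquad \text{subject to} \qquad x_{t+1} = A x_t + B u_t
```

The instants at which u_t is nonzero and its amplitudes can then serve as the sparse dynamical features fed to a simple linear classifier.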
Shimorina, Anastasia. "Natural Language Generation : From Data Creation to Evaluation via Modelling". Electronic Thesis or Diss., Université de Lorraine, 2021. http://www.theses.fr/2021LORR0080.
Natural language generation is the process of generating natural language text from some input. This input can be texts, documents, images, tables, knowledge graphs, databases, dialogue acts, meaning representations, etc. Recent methods in natural language generation, mostly based on neural modelling, have yielded significant improvements in the field. Despite this recent success, numerous issues with generation prevail, such as faithfulness to the source, the development of multilingual models, and few-shot generation. This thesis explores several facets of natural language generation, from creating training datasets and developing models to evaluating proposed methods and model outputs. In this thesis, we address the issue of multilinguality and propose possible strategies to semi-automatically translate corpora for data-to-text generation. We show that named entities constitute a major stumbling block in translation, exemplified by the English-Russian translation pair. We proceed to handle rare entities in data-to-text modelling, exploring two mechanisms: copying and delexicalisation. We demonstrate that rare entities strongly impact performance and that the impact of these two mechanisms varies greatly depending on how datasets are constructed. Getting back to multilinguality, we also develop a modular approach for shallow surface realisation in several languages. Our approach splits the surface realisation task into three submodules: word ordering, morphological inflection and contraction generation. We show, via delexicalisation, that the word ordering component mainly depends on syntactic information. Along with the modelling, we also propose a framework for error analysis, focused on word order, for the shallow surface realisation task. The framework makes it possible to provide linguistic insights into model performance at the sentence level and to identify patterns where models underperform. Finally, we also touch upon the subject of evaluation design while assessing automatic and human metrics, highlighting the difference between sentence-level and system-level evaluation.
Papailiopoulou, Virginia. "Test automatique de programmes Lustre / SCADE". Phd thesis, Grenoble, 2010. http://www.theses.fr/2010GRENM005.
The work in this thesis addresses the improvement of the testing process with a view to automating test data generation as well as its quality evaluation, in the framework of reactive synchronous systems specified in Lustre/SCADE. On the one hand, we present a testing methodology using the Lutess tool, which automatically generates test input data based exclusively on the description of the environment of the system under test. On the other hand, building on the SCADE model of the program under test, we define structural coverage criteria taking into account two new aspects: the use of multiple clocks as well as integration testing, allowing coverage measurement for large-sized systems. These two strategies can have a positive impact on effectively testing real-world applications. Case studies extracted from the avionics domain are used to demonstrate the applicability of these methods and to empirically evaluate their complexity.
Papailiopoulou, Virginia. "Test automatique de programmes Lustre / SCADE". Phd thesis, Grenoble, 2010. http://tel.archives-ouvertes.fr/tel-00454409.
Messé, Arnaud. "Caractérisation de la relation structure-fonction dans le cerveau humain à partir de données d'IRM fonctionnelle et de diffusion : méthodes et applications cognitive et clinique". Phd thesis, Université de Nice Sophia-Antipolis, 2010. http://tel.archives-ouvertes.fr/tel-00845014.
Pazat, Jean-Louis. "Génération de code réparti par distribution de données". Habilitation à diriger des recherches, Université Rennes 1, 1997. http://tel.archives-ouvertes.fr/tel-00170867.
Morisse, Pierre. "Correction de données de séquençage de troisième génération". Thesis, Normandie, 2019. http://www.theses.fr/2019NORMR043/document.
The aims of this thesis are part of the vast problem of high-throughput sequencing data analysis. More specifically, this thesis deals with long reads from third-generation sequencing technologies. The aspects tackled mainly focus on error correction and on its impact on downstream analyses such as de novo assembly. As a first step, one of the objectives of this thesis is to evaluate and compare the quality of the error correction provided by state-of-the-art tools, whether they employ a hybrid strategy (using complementary short reads) or a self-correction strategy (relying only on the information contained in the long-read sequences). Such an evaluation makes it easy to identify which method is best suited to a given case, according to the genome complexity, the sequencing depth, or the error rate of the reads. Moreover, developers can thus identify the limiting factors of existing methods, in order to guide their work and propose new solutions to overcome these limitations. A new evaluation tool, providing a wide variety of metrics compared to the only tool previously available, was thus developed. This tool combines a multiple sequence alignment approach and a segmentation strategy, allowing the evaluation runtime to be drastically reduced. With the help of this tool, we present a benchmark of all the state-of-the-art error correction methods, on various datasets from several organisms, spanning from the A. baylyi bacterium to the human. This benchmark identified two major limiting factors of the existing tools: reads displaying error rates above 30%, and reads longer than 50,000 base pairs. The second objective of this thesis is thus the error correction of highly noisy long reads. To this aim, a hybrid error correction tool, combining different strategies from the state of the art, was developed in order to overcome the limiting factors of existing methods. More precisely, this tool combines a short-read alignment strategy with the use of a variable-order de Bruijn graph. This graph is used to link the aligned short reads, and thus correct the uncovered regions of the long reads. This method can process reads displaying error rates as high as 44%, scales better to larger genomes, and reduces the runtime of the error correction compared to the most efficient state-of-the-art tools. Finally, the third objective of this thesis is the error correction of extremely long reads. To this aim, a self-correction tool was developed by combining, once again, different methodologies from the state of the art. More precisely, an overlapping strategy and a two-phase error correction process, using multiple sequence alignment and local de Bruijn graphs, are used. In order to allow this method to scale to extremely long reads, the aforementioned segmentation strategy was generalized. This self-correction method can process reads reaching up to 340,000 base pairs and scales very well to complex organisms such as the human genome.
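To illustrate the kind of structure involved, here is a minimal sketch (illustrative only, far simpler than the variable-order graph used in the actual tools) that builds a fixed-order de Bruijn graph from short reads and walks it greedily to bridge a gap between two anchor k-mers; the toy reads and k value are assumptions.

```python
from collections import defaultdict

def build_debruijn(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes observed in the reads."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def bridge(graph, source, target, max_steps=100):
    """Greedy walk from source to target node; returns the spelled sequence or None."""
    path, node = [source], source
    for _ in range(max_steps):
        if node == target:
            return path[0] + "".join(p[-1] for p in path[1:])
        succs = graph.get(node)
        if not succs:
            return None
        node = sorted(succs)[0]          # naive choice; real correctors score alternatives
        path.append(node)
    return None

reads = ["ACGTACGTGA", "CGTGATTACA"]     # assumed toy short reads
g = build_debruijn(reads, k=4)
print(bridge(g, "ACG", "ACA"))
```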
Khalili, Malika. "Nouvelle approche de génération multi-site des données climatiques". Mémoire, École de technologie supérieure, 2007. http://espace.etsmtl.ca/580/1/KHALILI_Malika.pdf.
Genestier, Richard. "Vérification formelle de programmes de génération de données structurées". Thesis, Besançon, 2016. http://www.theses.fr/2016BESA2041/document.
The general problem of proving properties of imperative programs is undecidable. Some subproblems – restricting the languages of programs and properties – are known to be decidable. In practice, thanks to heuristics, program proving tools sometimes automate proofs for programs and properties living outside of the theoretical framework of known decidability results. We illustrate this fact by building a catalog of proofs, for similar programs and properties of increasing complexity. Most of these programs are combinatorial map generators. Thus, this work contributes to the research fields of enumerative combinatorics and software engineering. We distribute a C library of bounded exhaustive generators of structured arrays, formally specified in ACSL and verified with the WP plugin of the Frama-C analysis platform. We also propose a testing-based methodology to assist interactive proof in Coq, an original formal study of maps, and new results in enumerative combinatorics.
Caron, Maxime. "Données confidentielles : génération de jeux de données synthétisés par forêts aléatoires pour des variables catégoriques". Master's thesis, Université Laval, 2015. http://hdl.handle.net/20.500.11794/25935.
Confidential data are very common in statistics nowadays. One way to treat them is to create partially synthetic datasets for data sharing. We present an algorithm based on random forests to generate such datasets for categorical variables. We are interested in the formula used to make inference from multiple synthetic datasets. We show that the order of the synthesis has an impact on the estimation of the variance with this formula. We propose a variant of the algorithm inspired by differential privacy, and show that we are then unable to estimate a regression coefficient or its variance. We show the impact of synthetic datasets on structural equation modelling. One conclusion is that the synthetic dataset does not really affect the coefficients between latent variables and measured variables.
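For background, the combining rule usually applied to partially synthetic datasets (a standard formula recalled here, not a result of the thesis) pools the per-dataset point estimates q^(i) and variance estimates v^(i), i = 1, ..., m, as follows:

```latex
\bar{q}_m = \frac{1}{m}\sum_{i=1}^{m} q^{(i)}, \qquad
b_m = \frac{1}{m-1}\sum_{i=1}^{m}\bigl(q^{(i)} - \bar{q}_m\bigr)^2, \qquad
\bar{v}_m = \frac{1}{m}\sum_{i=1}^{m} v^{(i)}, \qquad
T_p = \bar{v}_m + \frac{b_m}{m}
```

where T_p is the total variance attached to the pooled estimate; the thesis studies how the order of synthesis affects the behaviour of this variance estimate.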
Effantin, dit Toussaint Brice. "Colorations de graphes et génération exhaustive d'arbres". Dijon, 2003. http://www.theses.fr/2003DIJOS021.
Lagrange, Jean-Philippe. "Ogre : un système expert pour la génération de requêtes relationnelles". Paris 9, 1992. https://portail.bu.dauphine.fr/fileviewer/index.php?doc=1992PA090035.
Embe, Jiague Michel. "Génération de graphes d'accessibilité à partir de structures réplicables". Mémoire, Université de Sherbrooke, 2009. http://savoirs.usherbrooke.ca/handle/11143/4780.
Smadja, Laurent. "Génération d'environnements 3D denses à partir d'images panoramiques cylindriques". Paris 6, 2003. http://www.theses.fr/2003PA066488.
Ferrandiz, Sylvain. "Apprentissage supervisé à partir de données séquentielles". Caen, 2006. http://www.theses.fr/2006CAEN2030.
In the data mining process, the main part of the data preparation step is devoted to feature construction and selection. The filter approach usually adopted requires evaluation methods for any kind of feature. We address the problem of the supervised evaluation of a sequential feature. We show that this problem is solved if a more general problem is tackled: that of the supervised evaluation of a similarity measure. We provide such an evaluation method. We first turn the problem into the search for a discriminating Voronoi partition. Then, we define a new supervised criterion evaluating such partitions and design a new optimised algorithm. The criterion automatically prevents overfitting the data and the algorithm quickly provides a good solution. In the end, the method can be interpreted as a robust non-parametric method for estimating the conditional density of a nominal target feature given a similarity measure defined from a descriptive feature. The method is experimented on many datasets. It is useful for answering questions like: which day of the week or which hourly time segment is the most relevant to discriminate customers from their call detail records? Which series allows a better estimate of the customer need for a new service?
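As an illustration of the central object (a sketch only: a plain class-purity score stands in for the thesis' actual supervised criterion, which additionally guards against overfitting), a discriminating Voronoi partition can be evaluated by assigning every instance to its nearest prototype under the chosen similarity measure and scoring how well the cells separate the target classes.

```python
import numpy as np

def voronoi_partition_score(X, y, prototypes, dist):
    """Assign each row of X to its closest prototype and return the mean cell purity."""
    cells = np.array([np.argmin([dist(x, p) for p in prototypes]) for x in X])
    purities = []
    for c in np.unique(cells):
        labels = y[cells == c]
        purities.append(np.bincount(labels).max() / labels.size)   # majority-class share in the cell
    return float(np.mean(purities))

# Toy usage with Euclidean distance standing in for the similarity measure to be evaluated.
X = np.random.randn(300, 4)
y = np.random.randint(0, 2, size=300)
prototypes = X[np.random.choice(len(X), size=6, replace=False)]
print(voronoi_partition_score(X, y, prototypes, dist=lambda a, b: np.linalg.norm(a - b)))
```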
Bounar, Boualem. "Génération automatique de programmes sur une base de données en réseau : couplage PROLOG-Base de données en réseau". Lyon 1, 1986. http://www.theses.fr/1986LYO11703.
Leroux (Zinovieva), Elena. "Méthodes symboliques pour la génération de tests de systèmes réactifs comportant des données". Phd thesis, Université Rennes 1, 2004. http://tel.archives-ouvertes.fr/tel-00142441.
[…] transition systems does not allow this to be done. This forces data values to be enumerated before building the transition-system model of a system, which can cause the state-space explosion problem. This enumeration also has the effect of producing test cases in which all the data are instantiated. However, this contradicts industrial practice, where test cases are real programs with variables and parameters. Generating such test cases requires new models and techniques. In this thesis, we achieve two objectives. On the one hand, we introduce a model called input/output symbolic transition systems, which explicitly includes all the data of a reactive system. On the other hand, we propose and implement a new test generation technique that handles the data of a system symbolically, by combining the test generation approach previously proposed by our research group with abstract interpretation techniques. The test cases automatically generated by our technique satisfy correctness properties: they always emit a correct verdict.
Xue, Xiaohui. "Génération et adaptation automatiques de mappings pour des sources de données XML". Phd thesis, Versailles-St Quentin en Yvelines, 2006. http://www.theses.fr/2006VERS0019.
The integration of information originating from multiple heterogeneous data sources is required by many modern information systems. In this context, the applications' needs are described by a target schema, and the way instances of the target schema are derived from the data sources is expressed through mappings. In this thesis, we address the problem of mapping generation for multiple XML data sources and the adaptation of these mappings when the target schema or the sources evolve. We propose an automatic generation approach that first decomposes the target schema into subtrees, then defines mappings, called partial mappings, for each of these subtrees, and finally combines these partial mappings to generate the mappings for the whole target schema. We also propose a mapping adaptation approach to keep existing mappings current if some changes occur in the target schema or in one of the sources. We have developed a prototype implementation of a tool to support these processes.
Xue, Xiaohui. "Génération et adaptation automatiques de mappings pour des sources de données XML". Phd thesis, Université de Versailles-Saint Quentin en Yvelines, 2006. http://tel.archives-ouvertes.fr/tel-00324429.
We propose a mapping generation approach in three phases: (i) the decomposition of the target schema into subtrees, (ii) the search for partial mappings for each of these subtrees, and finally (iii) the generation of mappings for the whole target schema from these partial mappings. The result of our approach is a set of mappings, each with its own semantics. When the information required by the target schema is not present in the sources, no mapping will be produced. In this case, we propose to relax certain constraints defined on the target schema in order to allow mappings to be generated. We have developed a tool to support our approach. We have also proposed an approach for adapting existing mappings when changes occur in the sources or in the target schema.
Gingras, François. "Prise de décision à partir de données séquentielles". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape9/PQDD_0019/NQ56697.pdf.
Rannou, Éric. "Modélisation explicative de connaissances à partir de données". Toulouse 3, 1998. http://www.theses.fr/1998TOU30290.
Santoso, Mas Simon. "Simulation d'écoulements fluides à partir de données réelles". Thesis, Ecole centrale de Nantes, 2018. http://www.theses.fr/2018ECDN0011.
Point clouds are mathematical objects that allow multivariable functions to be described discretely. They are mainly used in statistics but also to describe geometrical manifolds. Immersing such manifolds in finite element computations is nowadays a real challenge. Indeed, the immersion of these point clouds requires the reconstruction of the surface of the manifold and the generation of a surface mesh. As those operations are often based on iterative processes, they are extremely time-consuming, since point clouds are usually massive. The method developed in this thesis allows point clouds to be immersed in a meshed domain without the surface reconstruction and mesh generation steps. For that purpose, we use the Volume Immersion Method adapted to point clouds. We couple this method with an adapted mesh generation technique and are then able to generate a monolithic anisotropic mesh, refined around zones of interest. We also use the variational multi-scale method, an extension of the classical finite element method, to simulate fluid flow. The last part of this thesis introduces some application cases in the aerodynamic and urban domains.
Zhang, Bo. "Reconnaissance de stress à partir de données hétérogènes". Thesis, Université de Lorraine, 2017. http://www.theses.fr/2017LORR0113/document.
In modern society, the stress of an individual has been found to be a common problem. Continuous stress can lead to various mental and physical problems, especially for people who regularly face emergency situations (e.g., firefighters): it may alter their actions and put them in danger. Therefore, it is meaningful to provide an assessment of the stress of an individual. Based on this idea, the Psypocket project was proposed, which aims at making a portable system able to accurately analyse the stress state of an individual based on his or her physiological, psychological and behavioural modifications. It should then offer feedback solutions to regulate this state. The research of this thesis is an essential part of the Psypocket project. In this thesis, we discuss the feasibility and the interest of stress recognition from heterogeneous data. Not only physiological signals, such as electrocardiography (ECG), electromyography (EMG) and electrodermal activity (EDA), but also reaction time (RT) are adopted to recognize different stress states of an individual. For the stress recognition, we propose an approach based on an SVM (Support Vector Machine) classifier. The results obtained show that reaction time can be used to estimate the level of stress of an individual, whether or not in addition to the physiological signals. Besides, we discuss the feasibility of an embedded system that would perform the complete data processing. Therefore, the work of this thesis contributes to making a portable system able to recognize the stress of an individual in real time by using heterogeneous data such as physiological signals and RT.
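A minimal sketch of that recognition step (feature names, shapes and labels are assumptions; the real pipeline extracts many more descriptors from the ECG, EMG and EDA signals) would concatenate physiological features with reaction time and train an SVM classifier:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumed feature columns: [mean heart rate, heart-rate variability, EDA level, EMG power, reaction time]
X = np.random.randn(200, 5)
y = np.random.randint(0, 3, size=200)    # stress levels: 0 = low, 1 = medium, 2 = high (assumed labels)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X, y)
print(model.predict(X[:5]))
```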
Zhang, Bo. "Reconnaissance de stress à partir de données hétérogènes". Electronic Thesis or Diss., Université de Lorraine, 2017. http://www.theses.fr/2017LORR0113.
In modern society, the stress of an individual has been found to be a common problem. Continuous stress can lead to various mental and physical problems and especially for the people who always face emergency situations (e.g., fireman): it may alter their actions and put them in danger. Therefore, it is meaningful to provide the assessment of the stress of an individual. Based on this idea, the Psypocket project is proposed which is aimed at making a portable system able to analyze accurately the stress state of an individual based on his physiological, psychological and behavioural modifications. It should then offer solutions for feedback to regulate this state.The research of this thesis is an essential part of the Psypocket project. In this thesis, we discuss the feasibility and the interest of stress recognition from heterogeneous data. Not only physiological signals, such as Electrocardiography (ECG), Electromyography (EMG) and Electrodermal activity (EDA), but also reaction time (RT) are adopted to recognize different stress states of an individual. For the stress recognition, we propose an approach based on a SVM classifier (Support Vector Machine). The results obtained show that the reaction time can be used to estimate the level of stress of an individual in addition or not to the physiological signals. Besides, we discuss the feasibility of an embedded system which would realize the complete data processing. Therefore, the study of this thesis can contribute to make a portable system to recognize the stress of an individual in real time by adopting heterogeneous data like physiological signals and RT
Gaumer, Gaëtan. "Résumé de données en extraction de connaissances à partir des données (ECD) : application aux données relationnelles et textuelles". Nantes, 2003. http://www.theses.fr/2003NANT2025.
Zinovieva-Leroux, Eléna. "Méthodes symboliques pour la génération de tests de systèmes réactifs comportant des données". Rennes 1, 2004. https://tel.archives-ouvertes.fr/tel-00142441.
Pietrzyk-Nivau, Audrey. "Génération de plaquettes in vitro à partir de cellules souches hématopoïétiques". Thesis, Paris 5, 2014. http://www.theses.fr/2014PA05P626/document.
Megakaryopoiesis is the process by which hematopoietic stem cells (HSC) proliferate and differentiate into megakaryocytes (MK). It is followed by thrombopoiesis, which allows blood platelet production. These processes occur 1) in the three-dimensional (3D) structure of the bone marrow, 2) in the bone marrow sinusoid vessels and 3) in the blood flow. Our general aim was to decipher the mechanism associated with each process. The first objective was to study the effects of a porous 3D structure on MK differentiation and platelet production. This study demonstrated that the synergy between spatial organization and biological cues improves MK and platelet production. We also characterized platelets produced from mature MK under flow conditions, with respect to their in vitro and in vivo properties. We highlighted the capacity of flow-derived platelets to incorporate into a thrombus in vitro and in vivo, compared to static-derived platelets. This work represents new developments for mimicking the bone marrow structure and reproducing blood shear forces in order to improve and increase in vitro platelet production for therapeutic use.
Ramirez, Lis. "Production de bio-carburants de 3ème génération à partir de microalgues". Phd thesis, Université Claude Bernard - Lyon I, 2013. http://tel.archives-ouvertes.fr/tel-01070856.
Antoine, Elodie. "Génération automatique d'interfaces Web à partir de spécifications : l'outil DCI-Web". Mémoire, Université de Sherbrooke, 2008. http://savoirs.usherbrooke.ca/handle/11143/4746.
Bedini, Ivan. "Génération automatique d'ontologie à partir de fichiers XSD appliqué au B2B". Versailles-St Quentin en Yvelines, 2010. http://www.theses.fr/2010VERS0004.
Communication between enterprise information systems plays a central role in the evolution of business processes. Yet data integration remains complicated: it requires considerable human effort, especially for connecting applications belonging to different companies. In our research, we argue that Semantic Web technologies, and more particularly ontologies, can provide the necessary flexibility. Our system overcomes certain shortcomings in the current state of the art and implements a new approach for the automatic generation of ontologies from XML sources. We show the usefulness of the system by applying our theory to the B2B domain to automatically produce ontologies of appropriate quality and expressiveness.
Patoz, Evelyne. "Génération de représentations topologiques à partir de requêtes en langage naturel". Besançon, 2006. http://www.theses.fr/2006BESA1031.
Starting from a study of the reasoning and visual perception abilities that a human being uses to locate objects in space, we elaborate a theoretical model allowing a computing system to situate an object in space by means of linguistic signs. To this end, linguistic activity is studied in its constructive role in spatial representation, but another cognitive faculty also proves essential: visual perception. Since visual perception rests to a large extent on information produced according to an observer's knowledge of the world, its interpretation can lead to a mental representation. The notion of representation is thus tied to a reality of objects whose very existence depends on the perceptive aptitude of a particular individual. Representation is no longer examined as the construction of a given configuration, but relative to an environmental perception. We show that the dynamic generation of a spatial representation depends on several parameters, the most important of which is the identification of a reference point. We developed a software application, integrating a speech component, that allows a user to direct a robot in an area and thus to report on the state of the world as the system evaluates it.
Boudellal, Toufik. "Extraction de l'information à partir des flux de données". Saint-Etienne, 2006. http://www.theses.fr/2006STET4014.
The aim of this work is an attempt to solve a specific data stream mining problem: the adaptive analysis of data streams. The web generation poses new challenges due to the complexity of data structures, for example data issued from virtual galleries, credit card transactions, etc. Generally, such data are continuous in time and their sizes are dynamic. We propose a new algorithm based on measures applied to adaptive data streams; the interpretation of the results is possible thanks to such measures. We compare our algorithm experimentally to other adapted approaches that are considered fundamental in the field. A modified algorithm that is more useful in applications is also discussed. This thesis finishes with a set of suggestions for future work on noisy data streams and another set of suggestions about further work that is needed.
Guillouet, Brendan. "Apprentissage statistique : application au trafic routier à partir de données structurées et aux données massives". Thesis, Toulouse 3, 2016. http://www.theses.fr/2016TOU30205/document.
This thesis focuses on machine learning techniques for application to big data. We first consider trajectories defined as sequences of geolocalized data. Hierarchical clustering is then applied with a new distance between trajectories (the Symmetrized Segment-Path Distance), producing groups of trajectories which are then modeled with Gaussian mixtures in order to describe individual movements. This modeling can be used in a generic way to solve the following road-traffic problems: prediction of final destination, trip time, or next location. These examples show that our model can be applied to different traffic environments and that, once learned, it can be applied to trajectories whose spatial and temporal characteristics are different. We also compare different technologies which enable the application of machine learning methods to massive volumes of data.
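The clustering-then-modelling chain described above can be sketched as follows; the random distance matrix stands in for the Symmetrized Segment-Path Distance (assumed to have been computed separately), and the cluster and component counts are arbitrary placeholders.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.mixture import GaussianMixture

n = 100
D = np.abs(np.random.randn(n, n)); D = (D + D.T) / 2; np.fill_diagonal(D, 0.0)  # placeholder SSPD matrix

# Hierarchical clustering on the precomputed trajectory-to-trajectory distances.
Z = linkage(squareform(D, checks=False), method="average")
labels = fcluster(Z, t=5, criterion="maxclust")          # cut the dendrogram into 5 groups

# Model each group of trajectories with a Gaussian mixture over (assumed) 2-D trajectory points.
points_per_traj = {i: np.random.randn(50, 2) for i in range(n)}   # placeholder geolocated points
models = {}
for c in np.unique(labels):
    pts = np.vstack([points_per_traj[i] for i in np.flatnonzero(labels == c)])
    models[c] = GaussianMixture(n_components=3).fit(pts)
```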
Verdie, Yannick. "Modélisation de scènes urbaines à partir de données aeriennes". Phd thesis, Université Nice Sophia Antipolis, 2013. http://tel.archives-ouvertes.fr/tel-00881242.
Bernardes, Vieira Marcelo. "Reconstruction de surfaces à partir de données tridimensionnelles éparses". Cergy-Pontoise, 2002. http://biblioweb.u-cergy.fr/theses/02CERG0145.pdf.
This work approaches the problem of inferring the spatial organization of sparse data for surface reconstruction. We propose a variant of the voting method developed by Gideon Guy and extended by Mi-Suen Lee. Tensors representing orientations and spatial influence fields are the main mathematical instruments. These methods have been associated with perceptual grouping problems; however, we observe that their accumulation processes infer sparse data organization. From this point of view, we propose a new strategy for orientation inference focused on surfaces. In contrast with the original ideas, we argue that a dedicated method may enhance this inference. The mathematical instruments are adapted to estimate normal vectors: the orientation tensor represents surfaces and the influence fields encode elliptical trajectories. We also propose a new process for the initial orientation inference which effectively evaluates the sparse data organization. The presentation and critique of Guy's and Lee's works and the methodological development of this thesis are guided by epistemological studies. Objects of different shapes are used in a qualitative evaluation of the method. Quantitative comparisons were carried out with error estimation over several reconstructions. Results show that the proposed method is more robust to noise and to variable data density. A method to segment points structured on surfaces is also proposed; comparative evaluations show a better performance of the proposed method in this application.
Verdie, Yannick. "Modélisation de scènes urbaines à partir de données aériennes". Thesis, Nice, 2013. http://www.theses.fr/2013NICE4078.
Analysis and 3D reconstruction of urban scenes from physical measurements is a fundamental problem in computer vision and geometry processing. Within the last decades, an important demand has arisen for automatic methods generating urban scene representations. This thesis investigates the design of pipelines for solving the complex problem of reconstructing 3D urban elements from either aerial Lidar data or Multi-View Stereo (MVS) meshes. Our approaches generate accurate and compact mesh representations enriched with urban-related semantic labelling. In urban scene reconstruction, two important steps are necessary: an identification of the different elements of the scene, and a representation of these elements with 3D meshes. Chapter 2 presents two classification methods which yield a segmentation of the scene into semantic classes of interest. The benefit is twofold. First, this brings awareness of the scene for better understanding. Second, different reconstruction strategies are adopted for each type of urban element. Our idea of inserting both semantic and structural information within urban scenes is discussed and validated through experiments. In Chapter 3, a top-down approach to detect 'Vegetation' elements from Lidar data is proposed, using Marked Point Processes and a novel optimization method. In Chapter 4, bottom-up approaches are presented, reconstructing 'Building' elements from Lidar data and from MVS meshes. Experiments on complex urban structures illustrate the robustness and scalability of our systems.
Giraudot, Simon. "Reconstruction robuste de formes à partir de données imparfaites". Thesis, Nice, 2015. http://www.theses.fr/2015NICE4024/document.
Over the last two decades, a high number of reliable algorithms for surface reconstruction from point clouds has been developed. However, they often require additional attributes such as normals or visibility, and robustness to defect-laden data is often achieved through strong assumptions and remains a scientific challenge. In this thesis we focus on defect-laden, unoriented point clouds and contribute two new reconstruction methods designed for two specific classes of output surfaces. The first method is noise-adaptive and specialized to smooth, closed shapes. It takes as input a point cloud with variable noise and outliers, and comprises three main steps. First, we compute a novel noise-adaptive distance function to the inferred shape, which relies on the assumption that this shape is a smooth submanifold of known dimension. Second, we estimate the sign and confidence of the function at a set of seed points, through minimizing a quadratic energy expressed on the edges of a uniform random graph. Third, we compute a signed implicit function through a random walker approach with soft constraints chosen as the most confident seed points. The second method generates piecewise-planar surfaces, possibly non-manifold, represented by low complexity triangle surface meshes. Through multiscale region growing of Hausdorff-error-bounded convex planar primitives, we infer both shape and connectivity of the input and generate a simplicial complex that efficiently captures large flat regions as well as small features and boundaries. Imposing convexity of primitives is shown to be crucial to both the robustness and efficacy of our approach
Edwards, Jonathan. "Construction de modèles stratigraphiques à partir de données éparses". Thesis, Université de Lorraine, 2017. http://www.theses.fr/2017LORR0367/document.
All stratigraphic model building and analysis is based on stratigraphic correlations of sedimentary units observed in wells or outcrops. However, the geologist building these stratigraphic correlations faces two main problems. First, the available data are few and sparse. Second, the sedimentary processes leading to the deposition of the units are numerous, interdependent and poorly known. The construction of a stratigraphic correlation model may therefore be seen as an under-constrained problem with several possible solutions. The aim of this thesis is to create a numerical method to generate stochastic stratigraphic models that are locally constrained by observation data. Two steps are necessary: 1. The establishment of rules describing the spatial organization of the sedimentary units observed on outcrops and wells. For these rules, two axes are explored: the formulation as equations of rules defined in the sequence stratigraphy framework (these rules, presented qualitatively in the literature, are translated into quantitative terms to evaluate the probability that two sedimentary units are correlated), and the deduction of the probability that two sedimentary units are correlated from stratigraphic models built with forward stratigraphic methods. 2. The development of an algorithm to build possible stochastic stratigraphic models from the rules cited above and from observation data.
Monneret, Gilles. "Inférence de réseaux causaux à partir de données interventionnelles". Thesis, Sorbonne université, 2018. http://www.theses.fr/2018SORUS290/document.
The purpose of this thesis is the use of current transcriptomic data in order to infer a gene regulatory network. These data are often complex, and in particular intervention data may be present. The use of causality theory makes it possible to use these interventions to obtain acyclic causal networks. I question the notion of acyclicity and then, based on this theory, propose several algorithms and/or improvements to current techniques to use this type of data.