Theses on the topic "Data quality and noise"
Consult the top 50 theses for your research on the topic "Data quality and noise".
Azé, Jérôme. « Extraction de Connaissances à partir de Données Numériques et Textuelles ». PhD thesis, Université Paris Sud - Paris XI, 2003. http://tel.archives-ouvertes.fr/tel-00011196.
The analysis of such data is often constrained by the definition of a minimum support used to filter out uninteresting knowledge. Domain experts often find it difficult to set this support. We proposed a method, based on quality measures, that avoids fixing a minimum support. We focused on extracting knowledge in the form of association rules. To be considered interesting and presented to the expert, these rules must satisfy one or more quality criteria. We proposed two quality measures that combine different criteria and allow interesting rules to be extracted, and from them derived an algorithm that extracts these rules without the minimum-support constraint. The behaviour of our algorithm was studied in the presence of noisy data, highlighting how difficult it is to automatically extract reliable knowledge from noisy data. One solution we proposed is to evaluate each rule's resistance to noise and to report it to the expert during the analysis and validation of the extracted knowledge. Finally, a study on real data was carried out as part of a text-mining process. The knowledge sought in these texts consists of association rules between expert-defined concepts specific to the studied domain. We proposed a tool that extracts this knowledge and assists the expert in validating it. The results show that interesting knowledge can be obtained from textual data while minimising the expert's involvement in the rule-extraction phase.
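To make the rule-quality idea concrete, here is a minimal sketch (not Azé's actual measures or algorithm; the transactions, thresholds and choice of confidence/lift are invented) of scoring candidate rules with quality measures and filtering on them instead of on a minimum support:

```python
from itertools import permutations

# Toy transaction database (invented).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "coffee"},
    {"bread", "milk"},
    {"bread", "butter", "coffee"},
]

def support(itemset):
    """Fraction of transactions containing all items of the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def quality(antecedent, consequent):
    """Confidence and lift of the rule antecedent -> consequent."""
    s_a, s_c = support(antecedent), support(consequent)
    s_ac = support(antecedent | consequent)
    confidence = s_ac / s_a if s_a else 0.0
    lift = confidence / s_c if s_c else 0.0
    return confidence, lift

# Keep rules passing the quality criteria -- no minimum-support pruning.
items = set().union(*transactions)
for a, c in permutations(items, 2):
    conf, lift = quality({a}, {c})
    if conf >= 0.6 and lift > 1.0:
        print(f"{a} -> {c}: confidence={conf:.2f}, lift={lift:.2f}")
```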
Al Jurdi, Wissam. « Towards next generation recommender systems through generic data quality ». Electronic Thesis or Diss., Bourgogne Franche-Comté, 2024. http://www.theses.fr/2024UBFCD005.
Recommender systems are essential for filtering online information and delivering personalized content, thereby reducing the effort users need to find relevant information. They can be content-based, collaborative, or hybrid, each with a unique recommendation approach. These systems are crucial in various fields, including e-commerce, where they help customers find pertinent products, enhancing user experience and increasing sales. A significant aspect of these systems is the concept of unexpectedness, which involves discovering new and surprising items. This feature, while improving user engagement and experience, is complex and subjective, requiring a deep understanding of serendipitous recommendations for its measurement and optimization. Natural noise, an unpredictable data variation, can influence serendipity in recommender systems. It can introduce diversity and unexpectedness in recommendations, leading to pleasant surprises. However, it can also reduce recommendation relevance, causing user frustration. Therefore, it is crucial to design systems that balance natural noise and serendipity. Inconsistent user information due to natural noise can negatively impact recommender systems, leading to lower-quality recommendations. Current evaluation methods often overlook critical user-oriented factors, making noise detection a challenge. To provide powerful recommendations, it is important to consider diverse user profiles, eliminate noise in datasets, and effectively present users with relevant content from vast data catalogs. This thesis emphasizes the role of serendipity in enhancing recommender systems and preventing filter bubbles. It proposes serendipity-aware techniques to manage noise, identifies algorithm flaws, suggests a user-centric evaluation method, and proposes a community-based architecture for improved performance. It highlights the need for a system that balances serendipity and considers natural noise and other performance factors. The objectives, experiments, and tests aim to refine recommender systems and offer a versatile assessment approach.
Tiouchichine, Elodie. « Performance du calorimètre à argon liquide et recherche du boson de Higgs dans son canal de désintégration H → ZZ* → 4l avec l'expérience ATLAS auprès du LHC ». Thesis, Aix-Marseille, 2014. http://www.theses.fr/2014AIXM4058/document.
The work presented in this thesis within the ATLAS collaboration was performed in the context of the discovery of a new particle at the LHC in the search for the Standard Model Higgs boson. My contribution to the Higgs boson search is focused on the H → ZZ* → 4l channel at different levels, from data taking to physics analysis. After a theoretical introduction, the LHC and the ATLAS detector are presented, as well as their performance during the 2011 and 2012 runs. Particular consideration is given to the liquid argon calorimeters and to the data quality assessment of this system. The validation of the data recorded during non-nominal high-voltage conditions is presented; this study allowed the recovery of 2% of the collected data for physics analyses. This has a direct impact on the H → ZZ* → 4l channel, where the number of expected signal events is very low. In order to optimize the acceptance of the four-electron decay channel, novel electron reconstruction algorithms were introduced in 2012. The measurement of their efficiency is presented; the efficiency gain, which reaches 7% for low-transverse-energy electrons (15–20 GeV), directly benefits the H → ZZ* → 4l analysis presented using the data recorded in 2012. The reducible background estimation methods in the channels containing electrons in the final state, which were of primary importance after the discovery, are detailed. Finally, the measurements of the new boson's properties are presented, based on the data recorded in 2011 and 2012.
Choquet, Rémy. « Partage de données biomédicales : modèles, sémantique et qualité ». PhD thesis, Université Pierre et Marie Curie - Paris VI, 2011. http://tel.archives-ouvertes.fr/tel-00824931.
Ben Salem, Aïcha. « Qualité contextuelle des données : détection et nettoyage guidés par la sémantique des données ». Thesis, Sorbonne Paris Cité, 2015. http://www.theses.fr/2015USPCD054/document.
Nowadays, complex applications such as knowledge extraction, data mining, e-learning or web applications use heterogeneous and distributed data. The quality of any decision depends on the quality of the data used. The absence of rich, accurate and reliable data can potentially lead an organization to make bad decisions. The subject covered in this thesis aims at assisting the user in its quality approach. The goal is to better extract, mix, interpret and reuse data. For this, the data must be related to its semantic meaning, data types, constraints and comments. The first part deals with the semantic schema recognition of a data source. This enables the extraction of data semantics from all the available information, including the data and the metadata. It consists, firstly, of categorizing the data by assigning it to a category and possibly a sub-category, and secondly, of establishing relations between columns and possibly discovering the semantics of the manipulated data source. The links detected between columns offer a better understanding of the source and of the alternatives for correcting data. This approach allows automatic detection of a large number of syntactic and semantic anomalies. The second part is the data cleansing, using the reports on anomalies returned by the first part. It allows corrections to be made within a column itself (data homogenization), between columns (semantic dependencies), and between rows (eliminating duplicates and similar data). Throughout this process, recommendations and analyses are provided to the user.
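As a toy illustration of the categorization step only (the thesis's method is richer and uses both data and metadata; the regex patterns, category list and 80% threshold below are assumptions):

```python
import re

# Hypothetical detectors for a few semantic categories.
PATTERNS = {
    "email": re.compile(r"^[\w.+-]+@[\w-]+\.[\w.]+$"),
    "phone": re.compile(r"^\+?\d[\d .-]{7,}$"),
    "date":  re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}

def categorize_column(values, min_ratio=0.8):
    """Assign a category when at least min_ratio of non-empty values match it;
    the remaining values are then candidate syntactic anomalies."""
    values = [v for v in values if v]
    for name, pattern in PATTERNS.items():
        hits = sum(bool(pattern.match(v)) for v in values)
        if values and hits / len(values) >= min_ratio:
            return name
    return "unknown"

col = ["a@x.org", "b@x.org", "c@x.org", "d@x.org", "not-an-email"]
print(categorize_column(col))   # -> "email"; "not-an-email" is flagged for cleansing
```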
Weber-Baghdiguian, Lexane. « Santé, genre et qualité de l'emploi : une analyse sur données microéconomiques ». Thesis, Paris Sciences et Lettres (ComUE), 2017. http://www.theses.fr/2017PSLED014/document.
This thesis studies the influence of work on job and life quality, the latter being considered through the perception that individuals have of their own health. The first chapter focuses on the long-term effects of job losses due to plant closure on job quality. We show that job loss negatively affects wages, perceived job insecurity, the quality of the working environment and job satisfaction, including in the long run. The last two chapters investigate gender differences in self-reported health. The second chapter provides descriptive evidence on the relationships between self-assessed health, gender and mental health problems, i.e. depression and/or affective pains. Finally, in the last chapter, we study the influence of social norms, as proxied by the gender structure of the workplace environment, on gender differences in self-reported health. We show that both women and men working in female-dominated environments report more specific health problems than those who work in male-dominated environments. The overall findings of this thesis are twofold. First, losing a job has a negative impact on several dimensions of job quality and satisfaction in the long run. Second, mental diseases and social norms at work are important to understand gender-related differences in health perceptions.
Puricelli, Alain. « Réingénierie et Contrôle Qualité des Données en vue d'une Migration Technologique ». Lyon, INSA, 2000. http://theses.insa-lyon.fr/publication/2000ISAL0092/these.pdf.
The purpose of this thesis is to develop a methodology for logical consistency checking in a Geographical Information System (GIS), in order to ensure the migration of the data in the case of a technological change of system and restructuring. This methodology is then applied to a real GIS installed in the Urban Community of Lyon (the SUR). Logical consistency is one of the quality criteria commonly accepted within the community of producers and users of geographical data, alongside precision or exhaustiveness, for instance. After a presentation of the elements of quality and metadata in GIS, a state of the art is given concerning various standardization efforts in these fields. The different standards under development (those of the CEN, the ISO and the FGDC, among others) are analyzed and commented on. A methodology for the detection and correction of geometrical and topological errors is then detailed, within the framework of existing geographical vector databases. Three types of errors are identified, namely structural, geometrical and semantic errors. For each of these families of errors, detection methods based on established theories (integrity constraints, topology and computational geometry) are proposed, and ideas for their correction are detailed. This approach is then implemented in the context of the SUR databases. To complete this application, a specific mechanism was developed to also deal with errors in tessellations, which were not taken into account by the methodology (which uses binary topological relations). Finally, to ensure the consistency of the corrections, a method was set up to propagate corrections to the neighbourhood of the corrected objects, whether those objects are located inside a single layer of data, across different layers, or across different databases of the system.
Defréville, Boris. « Caractérisation de la qualité sonore de l'environnement urbain : une approche physique et perceptive basée sur l'identification des sources sonores ». Cergy-Pontoise, 2005. http://biblioweb.u-cergy.fr/theses/05CERG0275.pdf.
Noise in cities is perceived as a question of quality of life. It is generally evaluated by measuring its sound level. While this measure is representative of noisy environments characterised by a continuous flow of vehicles, it is insufficient to characterise the "colour" of an urban soundscape where different sources coexist. The first part reveals that sound sources are not all perceived in the same manner, and that their metrological evaluation should be adapted accordingly. The present work proposes an indicator linked to the unpleasantness of sound. Depending on the place it describes, this indicator sometimes uses the loudness of the sequence, but always takes into account the characteristics of emergent source sounds. The second part of the study proposes two methods for the automatic calculation of this indicator through the identification of sound sources. This tool ultimately provides an aid for managing an urban soundscape.
Feno, Daniel Rajaonasy. « Mesures de qualité des règles d'association : normalisation et caractérisation des bases ». PhD thesis, Université de la Réunion, 2007. http://tel.archives-ouvertes.fr/tel-00462506.
Bazin, Cyril. « Tatouage de données géographiques et généralisation aux données devant préserver des contraintes ». Caen, 2010. http://www.theses.fr/2010CAEN2006.
Digital watermarking is a fundamental process for intellectual property protection. It consists in inserting a mark into a digital document through slight modifications. The presence of this mark allows the owner of a document to prove the priority of his rights. The originality of our work is twofold. On the one hand, we use a local approach to ensure a priori that the quality of constrained documents is preserved during watermark insertion. On the other hand, we propose a generic watermarking scheme. The manuscript is divided into three parts. Firstly, we introduce the basic concepts of digital watermarking for constrained data and the state of the art of geographical data watermarking. Secondly, we present our watermarking scheme for digital vector maps, often used in geographic information systems. This scheme preserves certain topological and metric qualities of the document. The watermark is robust: it is resilient against geometric transformations and cropping. We give an efficient implementation that is validated by many experiments. Finally, we propose a generalization of the scheme for constrained data. This generic scheme will facilitate the design of watermarking schemes for new data types. We give a particular example of the application of the generic scheme to relational databases. In order to prove that it is possible to work directly on the generic scheme, we propose two detection protocols directly applicable to any implementation of it.
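The key idea, a local a-priori check before each modification is accepted, can be sketched generically as follows (the perturbation rule, key use and the sortedness constraint are invented stand-ins for the topological/metric constraints of a vector map):

```python
import random

def watermark(values, key, eps=1e-3, constraint=lambda vs: True):
    """Embed one key-dependent bit per value via a tiny perturbation,
    keeping a modification only if the document still satisfies the
    constraint (local a-priori quality check)."""
    rng = random.Random(key)
    marked = list(values)
    for i in range(len(marked)):
        bit = rng.randint(0, 1)
        candidate = list(marked)
        candidate[i] = marked[i] + (eps if bit else -eps)
        if constraint(candidate):
            marked = candidate        # accept: constraint preserved
    return marked

data = [1.0, 2.0, 3.0, 4.0]
print(watermark(data, key=42, constraint=lambda vs: vs == sorted(vs)))
```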
Paolino, Pierdomenico. « Bruit thermique et dissipation d'un microlevier ». PhD thesis, Ecole normale supérieure de lyon - ENS LYON, 2008. http://tel.archives-ouvertes.fr/tel-00423692.
Beyond the traditional angular-deflection measurement setup, we designed and built an AFM with differential interferometric detection (between the clamped base and the free end of the cantilever). The ultimate resolution is 10^-14 m/Hz^1/2, and the measurement is intrinsically calibrated, insensitive to slow thermal drifts, and unlimited in deflection amplitude range.
With this setup, we measure the thermal noise along the cantilever. A reconstruction of the spatial shape of the first four flexural eigenmodes shows excellent agreement with the Euler-Bernoulli beam model. A simultaneous fit of the four thermally excited resonances is performed with a single free parameter, the stiffness of the cantilever, which is thus measured with great precision and robustness.
The low-frequency thermal fluctuations of the deflection show that a harmonic-oscillator model with viscous dissipation is no longer pertinent off resonance. Moreover, substantial differences are observed between cantilevers with and without metallic coating. For the uncoated ones, Sader's hydrodynamic approach faithfully accounts for the behaviour of the fluctuations below resonance in air. The coating introduces a second source of dissipation, viscoelasticity, which manifests itself as 1/f noise at low frequency. Using the fluctuation-dissipation theorem (FDT) and the Kramers-Kronig relations, the response of the cantilever is fully characterised from the fluctuation spectra; in particular, a quantitative estimate of the viscoelasticity and of its frequency dependence is obtained.
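For reference, the standard textbook form of the fluctuation-dissipation theorem used in this kind of analysis (not quoted from the thesis), relating the one-sided deflection noise spectrum to the mechanical response χ(ω) of a cantilever of stiffness k, effective mass m and viscous damping γ:

```latex
S_x(\omega) = \frac{4 k_B T}{\omega}\,\operatorname{Im}\chi(\omega),
\qquad
\chi(\omega) = \frac{1}{k - m\omega^{2} + i\omega\gamma}
```

Modelling the coating's viscoelasticity by a complex stiffness k(1 + iφ) gives Im χ ≈ φ/k at low frequency, hence S_x ∝ 1/ω: the 1/f-like noise mentioned in the abstract.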
Ben Saad, Myriam. « Qualité des archives web : modélisation et optimisation ». Paris 6, 2011. http://www.theses.fr/2011PA066446.
Spill, Yannick. « Développement de méthodes d'échantillonnage et traitement bayésien de données continues : nouvelle méthode d'échange de répliques et modélisation de données SAXS ». Paris 7, 2013. http://www.theses.fr/2013PA077237.
The determination of protein structures and other macromolecular complexes is becoming more and more difficult. The simplest cases have already been determined, and today's research in structural bioinformatics focuses on ever more challenging targets. To successfully determine the structure of these complexes, it has become necessary to combine several kinds of experiments and to relax the quality standards during acquisition. In other words, structure determination makes increasing use of sparse, noisy and inconsistent data. It is therefore becoming essential to quantify the accuracy of a determined structure. This quantification is superbly achieved by statistical inference. In this thesis, I develop a new sampling algorithm, Convective Replica-Exchange, designed to find probable structures more robustly. I also propose a proper statistical treatment for continuous data, such as Small-Angle X-Ray Scattering data.
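For context, standard replica exchange (the baseline the thesis's Convective Replica-Exchange variant builds on) accepts a swap of configurations between replicas i and j, at inverse temperatures β_i and β_j with energies E_i and E_j, with the Metropolis probability

```latex
p_{\mathrm{swap}} = \min\!\bigl(1,\; e^{(\beta_i-\beta_j)(E_i-E_j)}\bigr)
```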
Maddi, Abdelghani. « La quantification de la recherche scientifique et ses enjeux : bases de données, indicateurs et cartographie des données bibliométriques ». Thesis, Sorbonne Paris Cité, 2018. http://www.theses.fr/2018USPCD020/document.
The issue of the productivity and "quality" of scientific research is one of the central questions of the 21st century in the economic and social world. Scientific research, a source of innovation in all fields, is considered the key to economic development and competitiveness. Science must also contribute to the societal challenges defined in the Framework Programmes for Research and Technological Development (H2020), for example, such as health, demography and well-being. In order to rationalize public spending on research and innovation and to guide the investment strategies of funders, several indicators have been developed to measure the performance of research entities. Now no one can escape evaluation, starting with research articles, researchers, institutions and countries (Pansu, 2013; Gingras, 2016). For lack of methodological understanding, quantitative indicators are sometimes misused, neglecting aspects related to their method of calculation/normalization, what they represent, or the inadequacies of the databases from which they are calculated. This situation can have disastrous scientific and social consequences. Our work examines the tools of evaluative bibliometrics (indicators and databases) in order to assess the issues related to the quantitative evaluation of scientific performance. We show through this research that quantitative indicators can never be used alone to measure the quality of research entities, given the disparities of results across analysis perimeters, the ex-ante problems related to the individual characteristics of researchers that directly affect the quantitative indicators, and the shortcomings of the databases from which they are calculated. For a responsible evaluation, it is imperative to accompany quantitative measures with a qualitative assessment by peers. In addition, we also examined the effectiveness of quantitative measures for the purpose of understanding the evolution of science and the formation of scientific communities. Our analysis, applied to a corpus of publications dealing with the economic crisis, allowed us to identify the dominant authors and currents of thought, as well as the temporal evolution of the terms used in this field.
Le Conte des Floris, Robin. « Effet des biais cognitifs et de l'environnement sur la qualité des données et des informations ». Electronic Thesis or Diss., Université Paris sciences et lettres, 2024. http://www.theses.fr/2024UPSLM004.
From the perspective of philosopher Friedrich Nietzsche, there is no reality that exists in itself, no raw fact, no absolute reality: everything that we define as reality is, in fact, only the result of interpretation processes that are unique to us. Moreover, the data stored in information systems is often nothing more than the coded representation of statements made by human beings, thereby inherently involving human interpretation and consequently being affected by the same biases and limitations that characterize the human psyche. This thesis introduces a new conceptual framework, the "Data binding and reification" (DBR) model, that describes the process of data interpretation, and then the reification of information, using a new approach that places human perception mechanisms at the heart of this process. By mobilizing cognitive and behavioral sciences, this approach allows us to identify to what extent human intervention and the structure of the environment to which one is subjected condition the emergence of cognitive biases affecting these processes. Experimental results partially validate this model by identifying the characteristics of the environment that affect, in an organizational context, the data-collection process and the quality of the information produced. This work opens up numerous perspectives, such as the development of a choice architecture in the sense of the economist Richard Thaler, which could improve the data-collection process itself by modifying the experience of users of the information system.
Boydens, Isabelle. « Evaluer et améliorer la qualité de l'information : herméneutique des bases de données administratives ». Doctoral thesis, Université Libre de Bruxelles, 1998. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/212039.
Legros, Diégo. « Innovation, formation, qualité et performances des entreprises : Une étude économétrique sur données d'entreprises ». Paris 2, 2005. http://www.theses.fr/2005PA020106.
Caron, Clément. « Provenance et Qualité dans les Workflows Orientés Données : application à la plateforme WebLab ». Thesis, Paris 6, 2015. http://www.theses.fr/2015PA066568/document.
The WebLab platform is an application used to define and execute media-mining workflows. It is an open-source platform, developed by the IPCC section of Airbus Defence and Space, for the integration of external components. A designer can create complex media-mining workflows using components whose operation is not always known (black-box services). These complex workflows can lead to data quality problems, however, and before this work no tool existed to analyse and improve the quality of WebLab workflows. To deal with black-box services, we chose to tackle this quality problem with a non-intrusive approach: we enhance the definition of the WebLab workflow with provenance and quality propagation rules. Provenance rules generate fine-grained data dependency links between data and services after the execution of a WebLab workflow. The quality propagation rules then use these links to reason on the influence that the quality of the data used by a component has on the quality of the output data…
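A minimal sketch of the propagation idea (the dependency graph, the per-service factors and the min-combination rule below are invented illustrations, not the WebLab API or the thesis's actual rules):

```python
# Quality propagation along fine-grained provenance links: each output's
# quality is taken as the minimum quality of its inputs, scaled by a
# per-service reliability factor.
provenance = {              # output item -> inputs it was derived from
    "summary": ["ocr_text"],
    "ocr_text": ["scan"],
}
service_factor = {"summary": 0.9, "ocr_text": 0.8}   # reliability of each step
quality = {"scan": 0.95}                             # quality of source data

def propagate(item):
    if item in quality:
        return quality[item]
    q_in = min(propagate(dep) for dep in provenance[item])
    quality[item] = q_in * service_factor[item]
    return quality[item]

print(propagate("summary"))   # 0.95 * 0.8 * 0.9 = 0.684
```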
Drouet, Julie. « Séparation des sources de bruit des moteurs Diesel : Application en hiérarchisation de source et qualité sonore ». Thesis, Lyon, INSA, 2013. http://www.theses.fr/2013ISAL0053/document.
The spectrofilter is a Wiener filter used to extract combustion noise. This filter requires substantial data processing and must be determined for every operating condition, which makes it difficult to carry out perceptual studies on combustion noise from various engine adjustments. To overcome this drawback, this PhD dissertation aims to define a common filter that can synthesize combustion noise in all operating conditions. A perceptual study showed that the conventional spectrofilter can be substituted by another Wiener filter while still allowing a combustion noise to be synthesized; the use of a common spectrofilter is thus possible. Experimental modal analysis allows the Wiener filter to be estimated from characteristic data of the engine structure. After a study on a synthetic signal, the ESPRIT method appears the most appropriate, but requires some optimizations to be adapted to the spectrofilter's peculiarities. The Wiener filters of several running speeds are estimated under different estimation conditions, defined by the ESTER criterion. A fictitious damping evolution with running speed is observed and linked to the temporal windowing applied in the spectrofilter computation. A perceptual experiment is then carried out to determine whether the ESTER criterion allows filters to be estimated accurately enough to synthesize combustion noises similar to conventional ones. The results suggest that the spectrofilter obtained at idle is a good common filter, from both a physical and a perceptual point of view.
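A Wiener-filter "spectrofilter" can be sketched as the ratio of the cross-spectrum to the input auto-spectrum, H(f) = S_xy(f)/S_xx(f); the signals below are synthetic stand-ins, not engine data:

```python
import numpy as np
from scipy.signal import csd, welch

fs = 8000
t = np.arange(0, 2.0, 1 / fs)
x = np.random.randn(t.size)                # stand-in excitation (e.g. cylinder pressure)
h = np.exp(-np.arange(64) / 8.0)           # unknown "structure" impulse response
y = np.convolve(x, h, mode="same") + 0.1 * np.random.randn(t.size)

f, Sxy = csd(x, y, fs=fs, nperseg=512)     # cross-spectral density
_, Sxx = welch(x, fs=fs, nperseg=512)      # input auto-spectral density
H = Sxy / Sxx                              # Wiener filter estimate
print(H.shape, abs(H[:5]))
```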
Berti-Équille, Laure. « La qualité des données et leur recommandation : modèle conceptuel, formalisation et application a la veille technologique ». Toulon, 1999. http://www.theses.fr/1999TOUL0008.
Technological watch activities are focused on information qualification and validation by human expertise. As a matter of fact, none of these systems can provide (nor assist) a critical and qualitative analysis of the data they store and manage. Most information systems store data (1) whose source is usually unique, unknown or not identified/authenticated, and (2) whose quality is unequal and/or ignored. In practice, several data items may describe the same real-world entity with contradictory values, and their relative quality may be comparatively evaluated. Many techniques for data cleansing and editing exist for detecting errors in databases, but it is crucial to know which data have bad quality and to benefit from a qualitative expert judgment on data, complementary to quantitative and statistical data analysis. My contribution is to provide a multi-source perspective on data quality and to introduce and define the concepts of multi-source database (MSDB) and multi-source data quality (MSDQ). My approach was to analyze the wide panorama of research in the literature whose problems have analogies with those of technological watch. The main objective of my work was to design and provide a storage environment for managing textual information sources, the (more or less contradictory) data extracted from their textual content, and their quality metadata. My work centered on proposing: a methodology to guide, step by step, a data quality project in a multi-source information context; the conceptual modeling of a multi-source database (MSDB) for managing data sources, multi-source data and their quality metadata, with mechanisms for multi-criteria data recommendation; the formalization of the QMSD data model (Quality of Multi-Source Data), which describes multi-source data, their quality metadata and the set of operations for manipulating them; and the development of the sQuaL prototype for implementing and validating my propositions. In the long term, the perspective is to develop a specific decisional information system extending classical functionalities for (1) managing multi-source data, (2) taking into account their quality metadata and (3) proposing data-quality-based recommendations as query results. The ambition is to develop the concept of an "introspective information system", that is to say, an information system that is active and reactive concerning the quality of its own data.
Troya-Galvis, Andrès. « Approche collaborative et qualité des données et des connaissances en analyse multi-paradigme d'images de télédétection ». Thesis, Strasbourg, 2016. http://www.theses.fr/2016STRAD040/document.
Automatic interpretation of very high spatial resolution remotely sensed images is a complex but necessary task. Object-based image analysis approaches are commonly used to deal with this kind of image. They consist in applying an image segmentation algorithm in order to construct the objects of interest, and then classifying them using data-mining methods. Most of the existing work in this domain considers the segmentation and the classification independently, although these two crucial steps are closely related. In this thesis, we propose two approaches based on data and knowledge quality in order to initialize, guide, and evaluate a collaborative segmentation and classification process. 1. The first approach is based on a mono-class extraction strategy allowing us to focus on the particular properties of a given thematic class in order to accurately label the objects of this class. 2. The second approach deals with multi-class extraction and offers two strategies to aggregate several mono-class extractors to get a final, completely labelled image.
Da Silva Carvalho, Paulo. « Plateforme visuelle pour l'intégration de données faiblement structurées et incertaines ». Thesis, Tours, 2017. http://www.theses.fr/2017TOUR4020/document.
We hear a lot about Big Data, Open Data, Social Data, Scientific Data, etc. The importance currently given to data is, in general, very high: we are living in the era of massive data. The analysis of these data is important if the objective is to successfully extract value from them so that they can be used. The work presented in this thesis relates to the understanding, assessment, correction/modification, management and, finally, the integration of data, in order to allow their exploitation and reuse. Our research focuses exclusively on Open Data and, more precisely, on Open Data organized in tabular form (CSV being one of the most widely used formats in the Open Data domain). The term Open Data first appeared in 1995, when the GCDIS group (Global Change Data and Information System, from the United States) used this expression to encourage entities having the same interests and concerns to share their data [Data et System, 1995]. However, the Open Data movement has only recently undergone a sharp increase, becoming a popular phenomenon all over the world. As the movement is recent, the field is still growing and its importance is considerable. The encouragement given by governments and public institutions to have their data published openly plays an important role at this level.
Ben Othmane, Zied. « Analyse et visualisation pour l'étude de la qualité des séries temporelles de données imparfaites ». Thesis, Reims, 2020. http://www.theses.fr/2020REIMS002.
This thesis focuses on the quality of the information collected by sensors on the web. These data form time series that are incomplete and imprecise, and that use quantitative scales that are hardly comparable. In this context, we are particularly interested in the variability and stability of these time series, and we propose two approaches to quantify them. The first is based on a representation using quantiles; the second is a fuzzy approach. Using these indicators, we propose an interactive visualization tool dedicated to analysing the quality of the data collection carried out by the sensors. This work is part of a CIFRE collaboration with Kantar.
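One way to make such indicators concrete: a quantile-based variability score, robust to outliers and tolerant of missing samples (the median normalisation and the stability formula below are illustrative assumptions, not the thesis's definitions):

```python
import numpy as np

def variability(series, low=0.25, high=0.75):
    """Interquantile spread normalised by the median: an outlier-robust
    variability indicator for an imperfect series (NaN marks missing data)."""
    s = np.asarray(series, dtype=float)
    s = s[~np.isnan(s)]                       # incomplete series: drop missing
    q1, med, q3 = np.quantile(s, [low, 0.5, high])
    return (q3 - q1) / med if med else np.inf

def stability(series, window=5):
    """One possible stability score: 1 / (1 + mean variability over windows)."""
    vals = [variability(series[i:i + window])
            for i in range(len(series) - window + 1)]
    return 1.0 / (1.0 + float(np.mean(vals)))

ts = [10, 11, np.nan, 10.5, 30, 10, 11, 10.2, np.nan, 10.8]
print(round(variability(ts), 3), round(stability(ts), 3))
```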
Vaillant, Benoît. « Mesurer la qualité des règles d'association : études formelles et expérimentales ». Télécom Bretagne, 2006. http://www.theses.fr/2006TELB0026.
Knowledge discovery in databases aims at extracting information contained in data warehouses. It is a complex process in which several experts (those acquainted with the data, analysts, processing specialists, etc.) must act together in order to reveal patterns, which are then evaluated according to several criteria: validity, novelty, understandability, exploitability, etc. Depending on the application field, these criteria may be related to differing concepts. In addition, constant improvements in the methodological and technical aspects of data mining allow one to deal with ever-increasing databases. The number of extracted patterns follows the same increasing trend, without all of them being valid, however. It is commonly assumed that the validation of the mined knowledge cannot be performed by the decision maker usually in charge of this step without some automated help. In order to carry out this final validation task, a typical approach relies on functions that numerically quantify the pertinence of the patterns. Since such functions, called interestingness measures, imply an order on the patterns, they highlight specific kinds of information. Many measures have been proposed, each of them related to a particular category of situations. We here address the issue of evaluating the objective interestingness of association rules through such measures. Considering that the selection of "good" rules requires appropriate measures, we propose a systematic study of the latter, based on formal properties expressed in the most straightforward terms. From this study, we obtain a clustering of many commonly used measures, which we confront with an experimental approach obtained by comparing the rankings induced by these measures on classical datasets. Analysing these properties enabled us to highlight some particularities of the measures, and we deduce a generalised framework that includes a large majority of them. We also apply two multicriteria decision-aiding methods to the issue of retaining pertinent rules. The first approach models the preferences expressed by an expert of the mined field about the previously defined properties; from this modelling, we establish which measures are most adapted to the specific context. The second approach addresses the problem of taking into account the potentially differing values that the measures take, and builds an aggregated view of the ordering of the rules by taking the differences in evaluations into account. These methods are applied to practical situations. This work also led us to develop powerful dedicated software, Herbs; we present the processing it allows for rule selection, for the analysis of the behaviour of measures, and for visualisation. Without any claim to exhaustiveness, the methodology we propose can be extended to new measures or properties, and is applicable to other data mining contexts.
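The experimental comparison of measures can be illustrated by computing the Kendall rank correlation between the rule rankings induced by two measures (the rule supports below are invented):

```python
from scipy.stats import kendalltau

# Each rule: (support of antecedent, of consequent, of their union) -- invented.
rules = [(0.4, 0.5, 0.30), (0.2, 0.6, 0.15), (0.5, 0.4, 0.25), (0.3, 0.3, 0.20)]

confidence = [sac / sa for sa, sc, sac in rules]
lift = [sac / (sa * sc) for sa, sc, sac in rules]

# tau close to 1: the two measures rank the rules almost identically.
tau, _ = kendalltau(confidence, lift)
print(confidence, lift, round(tau, 3))
```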
Au, Carine. « Acoustique des chaudières hybrides : optimisation et contrôle par une approche qualité sonore ». Thesis, Paris, ENSAM, 2016. http://www.theses.fr/2016ENAM0071/document.
Manufacturers increasingly care about the noise made by their products, whether in the automotive, aeronautics or, more recently, home-appliance industries, because of ever more restrictive noise regulations. Beyond the noise level itself, users' demand increasingly concerns sound quality. Among domestic equipment, the innovative hybrid boiler developed by e.l.m. leblanc is an interesting product from an energy standpoint, as a condensing gas boiler is combined with a heat pump in a limited space. The coefficient of performance of the heat pump is 3.7 (3.7 kWh of heat delivered for 1 kWh of electricity consumed), resulting in large savings on users' bills. However, its high noise level prevents its installation in living spaces such as kitchens, restricting the conquest of a larger market: its noise should be 40 times lower to meet the regulatory level. The design of a new generation of hybrid boiler was the opportunity to start this study, aiming not only to reduce the noise level, but also to carve the sound in order to make it less annoying. A new approach, called Projected Acoustics of Dynamic Systems (PADS), has been used and has acted as a guiding thread for our study. It takes acoustic, vibration and sound quality criteria into account well upstream of the product design stage, thus avoiding expensive modifications, or even rejection of the finished product if the latter is considered noisy. Acoustic and vibration measurements were carried out on existing hybrid boilers and analyzed in order to identify optimal modifications that may reduce noise. Solutions were suggested and their effects evaluated by measurements; the regulatory noise level is reached with the suggested measures. To integrate the sound quality component into design, listening tests were carried out to identify the sound target of the new hybrid boiler. Statistical analyses (PCA, FCA, variance analysis) brought out parameters impacting noise perception, sometimes very subtly. We tested the effect of integrating a musical tone, and a correlation between noise pleasantness and this parameter was observed. This observation focused the thesis on the development of an original method that can define the physical parameters of a fan so that it produces a given musical tone. To do so, we suggested a new semi-experimental approach that can predict the line spectrum of a fan, and we used an optimization algorithm to find the optimum geometrical parameters. This new method was validated on two test benches. It can be used to carve the tonal noise of a fan, a turbomachine or, more generally, a rotating machine.
Palan, Bohuslav. « Conception de microcapteurs pH-ISFET faible bruit et d'inductances intégrées suspendues à fort facteur de qualité Q ». Grenoble INPG, 2002. http://www.theses.fr/2002INPG0023.
Texte intégralGuillet, Fabrice. « Qualité, Fouille et Gestion des Connaissances ». Habilitation à diriger des recherches, Université de Nantes, 2006. http://tel.archives-ouvertes.fr/tel-00481938.
Mantel, Claire. « Bruits temporels de compression et perception de la qualité vidéo : mesure et correction ». PhD thesis, Université de Grenoble, 2011. http://tel.archives-ouvertes.fr/tel-00680787.
Ben Hassine, Soumaya. « Évaluation et requêtage de données multisources : une approche guidée par la préférence et la qualité des données : application aux campagnes marketing B2B dans les bases de données de prospection ». Thesis, Lyon 2, 2014. http://www.theses.fr/2014LYO22012/document.
In business-to-business (B-to-B) marketing campaigns, achieving "the highest volume of sales at the lowest cost" and the best return on investment (ROI) is a significant challenge. ROI performance depends on a set of subjective and objective factors such as dialogue strategy, invested budget, marketing technology and organisation, and above all data and, particularly, data quality. However, data issues in marketing databases are overwhelming, leading to insufficient target knowledge that handicaps B-to-B salespersons when interacting with prospects. B-to-B prospection data is indeed mainly structured through a set of independent, heterogeneous, separate and sometimes overlapping files that form a messy multisource prospect selection environment. Data quality thus appears as a crucial issue when dealing with prospection databases. Moreover, beyond data quality, the ROI metric mainly depends on campaign costs; given the vagueness of (direct and indirect) cost definitions, we limit our focus to price considerations. Price and quality thus define the fundamental constraints data marketers consider when designing a marketing campaign file, as they typically look for the "best-qualified selection at the lowest price". However, this goal is not always reachable, and compromises often have to be made. Compromise must first be modelled and formalized, and then deployed for multisource selection issues. In this thesis, we propose a preference-driven selection approach for multisource environments that aims at (1) modelling and quantifying decision makers' preferences, and (2) defining and optimizing a selection routine based on these preferences. Concretely, we first deal with the data marketer's quality preference modelling by appraising multisource data using robust evaluation criteria (quality dimensions) that are rigorously summarized into a global quality score. Based on this global quality score and on data price, we then exploit a preference-based selection algorithm to return "the best-qualified records bearing the lowest possible price". An optimisation algorithm, BrokerACO, is finally run to generate the best selection result.
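The "best-qualified selection at the lowest price" trade-off can be sketched as a weighted quality score combined with price (the dimension names, weights and sources below are invented, not the thesis's model):

```python
# Decision-maker weights over quality dimensions (sum to 1).
weights = {"completeness": 0.5, "freshness": 0.3, "accuracy": 0.2}

sources = {
    "file_A": {"completeness": 0.9, "freshness": 0.4, "accuracy": 0.8, "price": 120},
    "file_B": {"completeness": 0.7, "freshness": 0.9, "accuracy": 0.7, "price": 80},
}

def global_quality(src):
    """Weighted aggregation of the quality dimensions into one score."""
    return sum(w * src[d] for d, w in weights.items())

# Simple preference rule: maximise quality per unit price.
best = max(sources, key=lambda s: global_quality(sources[s]) / sources[s]["price"])
for name, src in sources.items():
    print(name, round(global_quality(src), 3), src["price"])
print("selected:", best)
```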
Etame Etame, Thierry. « Conception de signaux de référence pour l'évaluation de la qualité perçue des codeurs de la parole et du son ». Rennes 1, 2008. http://www.theses.fr/2008REN1S112.
Subjective assessment is the most reliable way to determine the overall perceived voice quality of network equipment such as digital codecs. Reference conditions are useful in subjective tests to provide anchors, so that results from different tests can be compared. The Modulated Noise Reference Unit (MNRU) provides a simulated and calibrated degradation qualitatively similar to the quantization distortion of waveform codecs. However, new technologies for telecommunications services introduce new types of distortions, so the MNRU is no longer representative of current degradations. The purpose of our work is to produce a reference system that can simulate and calibrate the current degradations of speech and audio codecs. The first step consists in deriving the multidimensional perceptual space underlying the perception of current degradations. The characterization of these perceptual dimensions should then help to simulate and calibrate similar degradations.
Lévesque, Johann. « Évaluation de la qualité des données géospatiales : approche top-down et gestion de la métaqualité ». Thesis, Université Laval, 2007. http://www.theses.ulaval.ca/2007/24759/24759.pdf.
Ubéda, Thierry. « Contrôle de la qualité spatiale des bases de données géographiques : cohérence topologique et corrections d'erreurs ». Lyon, INSA, 1997. http://theses.insa-lyon.fr/publication/1997ISAL0116/these.pdf.
This work concerns spatial data quality checking in geographical data sets, and especially existing geographical vector databases. The methods developed in this work are not dedicated to a particular data model, but can be adapted to any database fulfilling the two criteria given above. Concerning data quality enrichment, this study addresses two complementary levels, namely the conceptual level and the semantic level, and processes are developed for each. At the conceptual level, geometric properties applicable to geographical data types are defined, depending on the dimension of the shape that represents them (0, 1 or 2). This approach is based only on the objects that compose the database and not on the data model itself, so it can be adapted to any vector geographical data set. At the semantic level, spatial relations among objects of the database are taken into account by means of topological integrity constraints, which allow the definition of topological situations that should or should not occur.
Heguy, Xabier. « Extensions de BPMN 2.0 et méthode de gestion de la qualité pour l'interopérabilité des données ». Thesis, Bordeaux, 2018. http://www.theses.fr/2018BORD0375/document.
Business Process Model and Notation (BPMN) is becoming the most widely used standard for business process modelling. One of the important upgrades of BPMN 2.0 with respect to BPMN 1.2 is that Data Objects now carry semantic elements. Nevertheless, BPMN does not enable the representation of performance measurement in the case of interoperability problems in the exchanged data objects, which remains a limitation when using BPMN to express interoperability issues in enterprise processes. We propose to extend the Meta-Object Facility meta-model and the XML Schema Definition of BPMN, as well as the notation, in order to fill this gap. The extension, named performanceMeasurement, is defined using the BPMN Extension Mechanism. This new element allows the representation of performance measurements in the case of interoperability problems, as well as of interoperability concerns which have been solved. We illustrate the use of this extension with an example from a real industrial case.
Cochet, Caroline. « Bruit et urbanisme : Une approche juridique ». Thesis, Antilles-Guyane, 2014. http://www.theses.fr/2014AGUY0711/document.
Noise is considered a real pollution affecting the quality of life, and law has been called upon to respond to its many forms. The matter is firstly the concern of environmental law, where it is treated in a sectoral way. Town planning law also seizes the question: in a diffuse way, as an environmental issue, or in a specific way when noise pollution is directly caused by land use. However, under the influence of an increasingly pervasive environmental law, and following the new legislation resulting from the Grenelle de l'environnement, town planning law has undergone a deep transformation. It has been rewritten on the basis of new environmental objectives and of sustainable development, and it absorbs many other legal sectors. It therefore appears as a global law of space and living environment, allowing the sound context to be improved. The perception of noise has changed, as has its consideration in town planning law. Town planning law can thus be seen as a favourable vehicle for developing a more global and unified approach to the very composite legal framework against noise pollution. The study of the relationship between noise and town planning highlights new ways of considering noise in space and the living environment, different from the classic approach imposed by environmental law.
Barland, Rémi. « Évaluation objective sans référence de la qualité perçue : applications aux images et vidéos compressées ». Nantes, 2007. http://www.theses.fr/2007NANT2028.
The shift to all-digital technology and the development of multimedia communications produce an ever-increasing flow of information. This massive increase in the quantity of data exchanged generates a progressive saturation of the transmission networks. To deal with this situation, compression standards seek to exploit spatial and/or temporal correlation ever more aggressively in order to reduce the bit rate. The resulting reduction of information creates visual artefacts which can deteriorate the visual content of the scene and thus trouble the end-user. In order to offer the best broadcasting service, assessment of the perceived quality is then necessary. Subjective tests, the reference method for quantifying the perception of distortions, are expensive, difficult to implement, and inappropriate for on-line quality assessment. In this thesis, we consider the most widely used compression standards (image and video) and design no-reference quality metrics based on the most annoying visual artefacts, such as blocking, blurring and ringing effects. The proposed approach is modular and adapts to the considered coder and to the required trade-off between computational cost and performance. For low complexity, the metric quantifies the distortions specific to the considered coder, exploiting only the properties of the image signal. To improve performance, at the cost of some complexity, it additionally integrates cognitive models simulating the mechanisms of visual attention; the generated saliency maps are then used to refine the distortion measures based purely on the image signal.
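As a flavour of signal-only, no-reference metrics, here is a crude blockiness estimator that compares luminance jumps at 8×8 block boundaries with jumps elsewhere (this simple difference rule is an assumption for illustration, not the metric of the thesis):

```python
import numpy as np

def blockiness(img, block=8):
    """Mean absolute luminance jump across block boundaries minus the
    mean jump elsewhere; a positive value suggests blocking artefacts."""
    img = img.astype(float)
    dh = np.abs(np.diff(img, axis=1))            # horizontal gradients
    at_edges = dh[:, block - 1::block].mean()    # columns on block borders
    mask = np.ones(dh.shape[1], dtype=bool)
    mask[block - 1::block] = False
    elsewhere = dh[:, mask].mean()
    return at_edges - elsewhere

# Synthetic image made of flat 8-pixel-wide bands: strongly "blocky".
img = np.tile(np.repeat(np.arange(8), 8), (64, 1)) * 10.0
print(blockiness(img))
```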
Yildiz, Ustun. « Decentralisation des procédés métiers : qualité de services et confidentialité ». Phd thesis, Université Henri Poincaré - Nancy I, 2008. http://tel.archives-ouvertes.fr/tel-00437469.
Maffiolo, Valérie. « De la caractérisation sémantique et acoustique de la qualité sonore de l'environnement urbain : structuration des représentations mentales et influence sur l'appréciation qualitative : application aux ambiances sonores de Paris ». Le Mans, 1999. http://www.theses.fr/1999LEMA1012.
Devillers, Rodolphe. « Conception d'un système multidimensionnel d'information sur la qualité des données géospatiales ». PhD thesis, Université de Marne la Vallée, 2004. http://tel.archives-ouvertes.fr/tel-00008930.
Isambert, Aurélie. « Contrôle de qualité et optimisation de l'acquisition des données en imagerie multimodale pour la radiothérapie externe ». Paris 11, 2009. http://www.theses.fr/2009PA11T006.
Durand, Philippe. « Traitement des données radar VARAN et estimation de qualités en géologie, géomorphologie et occupation des sols ». Paris 7, 1988. http://www.theses.fr/1988PA077183.
Merino Laso, Pedro. « Détection de dysfonctionnements et d'actes malveillants basée sur des modèles de qualité de données multi-capteurs ». Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2017. http://www.theses.fr/2017IMTA0056/document.
Naval systems represent a strategic infrastructure for international commerce and military activity, and their protection is thus an issue of major importance. Naval systems are increasingly computerized in order to perform optimal and secure navigation: on-board sensor systems provide navigation information to be monitored and controlled from distant computers. Because of their importance and computerization, naval systems have become a target for hackers. Maritime vessels also work in harsh and uncertain operational environments that produce failures, and navigation decision-making based on wrongly understood anomalies can be potentially catastrophic. Due to the particular characteristics of naval systems, existing detection methodologies cannot be applied, and we propose quality evaluation and analysis as an alternative. The novelty of quality applications on cyber-physical systems shows the need for a general methodology, which is conceived and examined in this dissertation, to evaluate the quality of generated data streams. The identified quality elements allow the introduction of an original approach to detect malicious acts and failures. It consists of two processing stages: first an evaluation of quality, followed by the determination of agreement limits compliant with normal states, in order to identify and categorize anomalies. The case studies, 13 scenarios on a fuel-tank training simulator platform and 11 scenarios on two aerial drones, illustrate the interest and relevance of the obtained results.
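The two-stage scheme, quality evaluation followed by agreement limits, can be sketched as follows (the inverse-dispersion quality function and the k-sigma limits are illustrative assumptions, not the dissertation's model):

```python
import numpy as np

def window_quality(w):
    """Toy quality score for a sensor window: high dispersion -> low quality."""
    return 1.0 / (1.0 + np.std(w))

def agreement_limits(stream, size=20, k=3.0):
    """Learn quality limits compliant with a normal run (mean +/- k*std)."""
    qs = np.array([window_quality(stream[i:i + size])
                   for i in range(0, len(stream) - size, size)])
    return qs.mean() - k * qs.std(), qs.mean() + k * qs.std()

rng = np.random.default_rng(0)
normal = rng.normal(0, 1, 2000)          # normal-state training stream
lo, hi = agreement_limits(normal)

# A stream whose second half is corrupted (failure or injected noise).
attack = np.concatenate([rng.normal(0, 1, 100), rng.normal(0, 8, 100)])
for i in range(0, len(attack) - 20, 20):
    q = window_quality(attack[i:i + 20])
    print(i, round(q, 3), "anomaly" if not (lo <= q <= hi) else "ok")
```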
Claeyman, Marine. « Etude par modélisation et assimilation de données d'un capteur infrarouge géostationnaire pour la qualité de l'air ». Toulouse 3, 2010. http://thesesups.ups-tlse.fr/1216/.
The objective of this thesis is to define a geostationary infrared sensor to observe the atmospheric composition of the lowermost troposphere. We evaluate the potential added value of such an instrument for characterizing the variability of the main pollutants and improving air quality observations and forecasts. We focus on two key air quality pollutants: tropospheric ozone, because of its impact on human health, ecosystems and climate, and carbon monoxide (CO), which is a tracer of pollutant emissions. Firstly, a linear scheme for CO chemistry was evaluated over a year and a half against a detailed chemical scheme (RACMOBUS) and various tropospheric and stratospheric observations (satellite and aircraft data). The advantage of such a scheme is its low computational cost, which allows data assimilation of CO over long periods. Assimilation of CO data from the Measurements Of Pollution In The Troposphere (MOPITT) instrument allows us to evaluate the information brought by such infrared observations at the global scale. Secondly, the optimal configuration of a new infrared geostationary sensor was defined using retrieval studies of atmospheric spectra, with the objective of contributing to the monitoring of ozone and CO for air quality purposes; our constraints also keep the sensor's characteristics technically feasible and affordable. For reference, the information content of this instrument was compared, during summer, to that of another infrared geostationary instrument similar to MTG-IRS (Meteosat Third Generation - Infrared Sounder), optimized to monitor water vapour and temperature but with atmospheric composition monitoring as a secondary objective. Lastly, the potential added value of both instruments for air quality forecasts was compared using observing system simulation experiments (OSSEs) over two summer months (July-August 2009). The skill of the two instruments at correcting different error sources (atmospheric forcing, emissions, initial state, and the three together) affecting air quality simulations and forecasts was characterised. In the end, it is concluded that the proposed instrument configuration is effectively able to constrain ozone and CO fields in the mid-to-lower troposphere.
Pellay, François-Xavier. « Méthodes d'estimation statistique de la qualité et méta-analyse de données transcriptomiques pour la recherche biomédicale ». Thesis, Lille 1, 2008. http://www.theses.fr/2008LIL10058/document.
To understand the biological phenomena taking place in a cell under physiological or pathological conditions, it is essential to know the genes that it expresses. Genetic expression can be measured with DNA chip technology, on which thousands of probes are laid out to measure the relative abundance of the genes expressed in the cell. The microarrays called pangenomic are supposed to cover all existing protein-coding genes, currently around thirty thousand for humans. The measurement, analysis and interpretation of such data pose a number of problems, and the analytical methods used determine the reliability and accuracy of the information obtained with microarray technology. The aim of this thesis is to define methods to control measurements, improve analysis and deepen the interpretation of microarrays, and to apply these methods to the transcriptome analysis of juvenile myelomonocytic leukemia patients, in order to improve diagnosis and understand the biological mechanisms behind this rare disease. We thereby developed and validated, through several independent studies, a quality-control program for microarrays (ace.map QC), software that improves the biological interpretation of microarray data based on gene ontologies, and a visualization tool for global analysis of signaling pathways. Finally, combining the different approaches described, we developed a method to obtain reliable biological signatures for diagnostic purposes.
Andrieu, Pierre. « Passage à l'échelle, propriétés et qualité des algorithmes de classements consensuels pour les données biologiques massives ». Electronic Thesis or Diss., université Paris-Saclay, 2021. http://www.theses.fr/2021UPASG041.
Biologists and physicians regularly query public biological databases, for example when they are looking for the genes most associated with a given disease. The chosen keywords are particularly important: synonymous reformulations of the same disease (for example "breast cancer" and "breast carcinoma") may lead to very different rankings of (thousands of) genes. The genes, sorted by relevance, can be tied (equal importance towards the disease). Additionally, some genes returned when using a first synonym may be absent when using another synonym. The rankings are then called "incomplete rankings with ties". The challenge is to combine the information provided by these different rankings of genes. The problem of taking as input a list of rankings and returning as output a so-called consensus ranking, as close as possible to the input rankings, is called the "rank aggregation problem". This problem is known to be NP-hard. Whereas most works focus on complete rankings without ties, we considered incomplete rankings with ties. Our contributions are divided into three parts. First, we designed a graph-based heuristic able to divide the initial problem into independent sub-problems in the context of incomplete rankings with ties. Second, we designed an algorithm able to identify common points between all the optimal consensus rankings, providing information about the robustness of the returned consensus ranking. An experimental study on a huge number of massive biological datasets highlighted the biological relevance of these approaches. Our last contribution is the following: we designed a parameterized model able to consider various interpretations of missing data, several algorithms for this model, and an axiomatic study of the model based on social choice theory.
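The distance underlying such consensus problems is typically a generalised Kendall-tau for rankings with ties; a minimal sketch follows (bucket representation; the 0.5 tie penalty is one common convention, not necessarily the thesis's choice):

```python
from itertools import combinations

# Rankings with ties, as ordered lists of buckets of items; incomplete
# rankings are restricted to their common items here for simplicity.
r1 = [{"g1"}, {"g2", "g3"}, {"g4"}]
r2 = [{"g2"}, {"g1", "g3"}, {"g4"}]

def positions(ranking):
    return {x: i for i, bucket in enumerate(ranking) for x in bucket}

def kendall_with_ties(ra, rb, tie_penalty=0.5):
    """Generalised Kendall distance: 1 per strictly inverted pair,
    tie_penalty when one ranking ties a pair that the other orders."""
    pa, pb = positions(ra), positions(rb)
    common = pa.keys() & pb.keys()
    d = 0.0
    for x, y in combinations(sorted(common), 2):
        a, b = pa[x] - pa[y], pb[x] - pb[y]
        if a * b < 0:
            d += 1                    # opposite strict orders
        elif (a == 0) != (b == 0):
            d += tie_penalty          # tied in exactly one ranking
    return d

print(kendall_with_ties(r1, r2))      # 2.0 for these two rankings
```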
Bothorel, Gwenael. « Algorithmes automatiques pour la fouille visuelle de données et la visualisation de règles d’association : application aux données aéronautiques ». PhD thesis, Toulouse, INPT, 2014. http://oatao.univ-toulouse.fr/13783/1/bothorel.pdf.
Jallet, Roxane. « Splines de régression et splines de lissage en régression non paramétrique avec bruit processus ». Paris 6, 2008. http://www.theses.fr/2008PA066054.
In the present work, we are interested in methods for estimating a regular function observed with process noise, using smoothing splines and regression splines. Convergence-rate results for smoothing splines are presented in the case of process noise, and an extension to unbalanced data is proposed. In order to build the regression-spline estimators, we introduce two criteria: ordinary least squares and generalized least squares. For these two regression-spline estimators, convergence rates are studied and compared. Finally, the various estimators are compared through simulations.
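For reference, the standard smoothing-spline criterion (textbook form), balancing fidelity to the noisy observations against the roughness of the fitted function:

```latex
\hat f_{\lambda} = \arg\min_{f}\;
\sum_{i=1}^{n} \bigl(y_i - f(x_i)\bigr)^{2}
\;+\; \lambda \int \bigl(f''(t)\bigr)^{2}\,dt
```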
Bergès, Corinne. « Étude de systèmes d'acquisitions de données dans deux milieux contraignants : expérimentation spatiale et prospection sismique ». Toulouse, INPT, 1999. http://www.theses.fr/1999INPT026H.
Veron, Didier. « Utilisation des FADC pour la reconstruction et l'analyse des données de bruit de fond dans l'expérience neutrino de Chooz ». Lyon 1, 1997. http://www.theses.fr/1997LYO10074.
Petit, Laurent. « Etude de la qualité des données pour la représentation des réseaux techniques urbains : applications au réseau d'assainissement ». Artois, 1999. http://www.theses.fr/1999ARTO0203.