Academic literature on the topic "Imputation de données manquantes"
Browse the topical lists of articles, books, theses, conference proceedings and other scholarly sources on the topic "Imputation de données manquantes".
Journal articles on the topic "Imputation de données manquantes"
Galimard, J. E., S. Chevret and M. Resche-Rigon. "Imputation multiple en présence de données manquantes MNAR". Revue d'Épidémiologie et de Santé Publique 63 (May 2015): S42. http://dx.doi.org/10.1016/j.respe.2015.03.014.
Badisy, I. El, C. Nejjari, A. Naim, K. El Rhaz, M. Khalis and R. Giorgi. "CO10.6 - Imputation des données manquantes par un méta-algorithme (metaCART): étude de simulation". Revue d'Épidémiologie et de Santé Publique 71 (May 2023): 101632. http://dx.doi.org/10.1016/j.respe.2023.101632.
Aurélien, Njamen Kengdo Arsène and Kwatcho Kengdo Steve. "Gestion Des Donnees Manquantes Dans Les Bases De Donnees En Sciences Sociales : Algorithme Nipals Ou Imputation Multiple?" European Scientific Journal, ESJ 12, no. 35 (December 31, 2016): 390. http://dx.doi.org/10.19044/esj.2016.v12n35p390.
Soullier, N., E. de la Rochebrochard and J. Bouyer. "Imputation multiple et répartition des données manquantes dans les cohortes : exemple de la fécondation in vitro". Revue d'Épidémiologie et de Santé Publique 56, no. 5 (September 2008): 276. http://dx.doi.org/10.1016/j.respe.2008.06.077.
Legendre, Bruno, Damiano Cerasuolo, Olivier Dejardin and Annabel Boyer. "Comment gérer les données manquantes ? Imputation multiple par équations chaînées : recommandations et explications pour la pratique clinique". Néphrologie & Thérapeutique 19, no. 3 (June 1, 2023): 1–9. http://dx.doi.org/10.1684/ndt.2023.24.
De Keizer, J., J. Paul, M. Albouy, A. Dupuis, V. Migeot, S. Rabouan, N. Venisse and E. Gand. "Simulation et imputation de plusieurs variables corrélées dans un contexte de données manquantes de façon non aléatoires (MNAR)". Revue d'Épidémiologie et de Santé Publique 69 (June 2021): S32–S33. http://dx.doi.org/10.1016/j.respe.2021.04.052.
Caron, A., G. Clément, C. Heyman, E. Aernout, E. Chazard and A. Le Tertre. "Détermination de l’exposition de 394 979 nouveau-nés par imputation multiple de données manquantes dans une étude épidémiologique". Revue d'Épidémiologie et de Santé Publique 63 (March 2015): S9. http://dx.doi.org/10.1016/j.respe.2015.01.016.
Basham, C. Andrew. "Variations régionales de prévalence de la multimorbidité en Colombie-Britannique (Canada) : analyse transversale des données de l’Enquête sur la santé dans les collectivités canadiennes de 2015-2016". Promotion de la santé et prévention des maladies chroniques au Canada 40, no. 7/8 (July 2020): 251–61. http://dx.doi.org/10.24095/hpcdp.40.7/8.02f.
Trempe, Normand, Marie-Claude Boivin, Ernest Lo and Amadou Diogo Barry. "L’utilisation de la variable sur la langue d’usage à la maison du Registre des décès du Québec". Notes de recherche 43, no. 1 (June 4, 2014): 163–80. http://dx.doi.org/10.7202/1025494ar.
Doggett, Amanda, Ashok Chaurasia, Jean-Philippe Chaput and Scott T. Leatherdale. "Utilisation des arbres de classification et de régression pour modéliser les données manquantes sur l’IMC, la taille et la masse corporelle chez les jeunes". Promotion de la santé et prévention des maladies chroniques au Canada 43, no. 5 (May 2023): 257–69. http://dx.doi.org/10.24095/hpcdp.43.5.03f.
Theses on the topic "Imputation de données manquantes"
Bernard, Francis. "Méthodes d'analyse des données incomplètes incorporant l'incertitude attribuable aux valeurs manquantes". Master's thesis, Université de Sherbrooke, 2013. http://hdl.handle.net/11143/6571.
Audigier, Vincent. "Imputation multiple par analyse factorielle : Une nouvelle méthodologie pour traiter les données manquantes". Thesis, Rennes, Agrocampus Ouest, 2015. http://www.theses.fr/2015NSARG015/document.
This thesis proposes new multiple imputation methods that are based on principal component methods, which were initially used for exploratory analysis and visualisation of continuous, categorical and mixed multidimensional data. The study of principal component methods for imputation, never previously attempted, offers the possibility to deal with many types and sizes of data. This is because the number of estimated parameters is limited due to dimensionality reduction. First, we describe a single imputation method based on factor analysis of mixed data. We study its properties and focus on its ability to handle complex relationships between variables, as well as infrequent categories. Its high prediction quality is highlighted with respect to the state-of-the-art single imputation method based on random forests. Next, a multiple imputation method for continuous data using principal component analysis (PCA) is presented. This is based on a Bayesian treatment of the PCA model. Unlike standard methods based on Gaussian models, it can still be used when the number of variables is larger than the number of individuals and when correlations between variables are strong. Finally, a multiple imputation method for categorical data using multiple correspondence analysis (MCA) is proposed. The variability of prediction of missing values is introduced via a non-parametric bootstrap approach. This helps to tackle the combinatorial issues which arise from the large number of categories and variables. We show that multiple imputation using MCA outperforms the best current methods.
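The idea of imputing with a principal component model can be illustrated with a minimal sketch. It is not the thesis's method (which covers mixed data, a Bayesian treatment of PCA and multiple imputation); it is the plain, unregularised iterative PCA scheme for continuous data, and the function name `iterative_pca_impute`, its parameters and the toy data are assumptions made for illustration only.

```python
import numpy as np

def iterative_pca_impute(X, rank=2, n_iter=200, tol=1e-8):
    """Single imputation by iterating a rank-`rank` PCA fit (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from column means computed on the observed values.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        # Low-rank reconstruction of the centred, currently completed matrix.
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mean
        # Only the missing cells are updated; observed cells are never changed.
        new_filled = np.where(missing, low_rank, X)
        change = np.max(np.abs(new_filled - filled))
        filled = new_filled
        if change < tol:
            break
    return filled

# Toy data (hypothetical): a rank-1 matrix with three cells removed.
rng = np.random.default_rng(0)
X_full = rng.normal(size=(30, 1)) @ rng.normal(size=(1, 4))
X_miss = X_full.copy()
X_miss[[0, 5, 10], [1, 2, 3]] = np.nan
X_imp = iterative_pca_impute(X_miss, rank=1)
```

Because only a few components are estimated, the number of parameters stays small even when there are many variables, which is the property the abstract emphasises.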
Héraud, Bousquet Vanina. "Traitement des données manquantes en épidémiologie : application de l’imputation multiple à des données de surveillance et d’enquêtes". Thesis, Paris 11, 2012. http://www.theses.fr/2012PA11T017/document.
The management of missing values is a common and widespread problem in epidemiology. The most common technique restricts the data analysis to subjects with complete information on the variables of interest, which can substantially reduce statistical power and precision and may also result in biased estimates. This thesis investigates the application of multiple imputation methods to manage missing values in epidemiological studies and surveillance systems for infectious diseases. The study designs to which multiple imputation was applied were diverse: a risk analysis of HIV transmission through blood transfusion, a case-control study on risk factors for Campylobacter infection, and a capture-recapture study to estimate the number of new HIV diagnoses among children. We then performed multiple imputation analysis on data from a surveillance system for chronic hepatitis C (HCV) to assess risk factors for severe liver disease among HCV-infected patients who reported drug use. Within this study on HCV, we proposed guidelines for applying a sensitivity analysis in order to test the hypotheses underlying multiple imputation. Finally, we describe how we developed and applied an ongoing multiple imputation process for the French national HIV surveillance database, and how we evaluated and attempted to validate the multiple imputation procedures. Based on these practical applications, we worked out a strategy for handling missing data in surveillance databases, including the thorough examination of the incomplete database, the building of the imputation model, and the procedure to validate imputation models and examine the hypotheses underlying multiple imputation.
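Multiple imputation analyses such as those described above end with a pooling step, usually Rubin's rules: the pooled estimate is the mean of the per-dataset estimates, and the total variance is the mean within-imputation variance plus (1 + 1/m) times the between-imputation variance. The sketch below shows that step only; the function name `pool_rubin` and the numbers are hypothetical and do not come from the thesis.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules."""
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    q_bar = q.mean()            # pooled point estimate
    within = u.mean()           # average within-imputation variance
    between = q.var(ddof=1)     # between-imputation variance
    total = within + (1 + 1 / m) * between
    return q_bar, total

# Hypothetical estimates from m = 3 imputed datasets.
pooled_estimate, pooled_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries the uncertainty due to the missing values; an analysis of a single completed dataset omits it and so understates the variance.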
Croiseau, Pascal. "Influence et traitement des données manquantes dans les études d'association sur trios : application à des données sur la sclérose en plaques". Paris 11, 2008. http://www.theses.fr/2008PA112021.
To test for association between a set of markers and a disease, or to estimate disease risks, different methods have been developed. Several of these methods require all individuals to be genotyped for all markers. When this is not the case, individuals with missing data are discarded. We have shown that this solution, which leads to a strong decrease in sample size, can cause a loss of power to detect an association and can also lead to false conclusions. In this work, we adapted to genetic data a method of "multiple imputation", which consists in replacing missing data with plausible values. Results obtained from simulated data show that this approach is promising for the search for disease susceptibility genes. It is simple to use and very flexible in terms of the genetic models that can be tested. We applied our method to a sample of 450 multiple sclerosis family trios (an affected child and both parents). Recent work has detected an association between a polymorphism of the CTLA4 gene and multiple sclerosis. However, CTLA4 belongs to a cluster of three genes, CD28, CTLA4 and ICOS, all involved in the immune response. Consequently, this association could be due to another marker in linkage disequilibrium with CTLA4. Our method allows us to detect the association with the CTLA4 polymorphism and also provides a new candidate to explore: a CD28 polymorphism which could be involved in multiple sclerosis in interaction with the CTLA4 polymorphism.
Etourneau, Lucas. "Contrôle du FDR et imputation de valeurs manquantes pour l'analyse de données de protéomiques par spectrométrie de masse". Electronic Thesis or Diss., Université Grenoble Alpes, 2024. http://www.theses.fr/2024GRALS001.
Proteomics involves characterizing the proteome of a biological sample, that is, the set of proteins it contains, and doing so as exhaustively as possible. By identifying and quantifying protein fragments that are analyzable by mass spectrometry (known as peptides), proteomics provides access to the level of gene expression at a given moment. This is crucial information for improving the understanding of the molecular mechanisms at play within living organisms. These experiments produce large amounts of data, often complex to interpret and subject to various biases. They require reliable data processing methods that ensure a certain level of quality control, so as to guarantee the relevance of the resulting biological conclusions. The work of this thesis focuses on improving this data processing, and specifically on the following two major points. The first is controlling the false discovery rate (FDR) when identifying either (1) peptides or (2) quantitatively differential biomarkers between a tested biological condition and its negative control. Our contributions focus on establishing links between the empirical methods stemming from proteomic practice and other theoretically supported methods. This notably allows us to provide directions for the improvement of the FDR control methods used for peptide identification. The second point focuses on managing missing values, which are often numerous and complex in nature, making them impossible to ignore. Specifically, we have developed a new algorithm for imputing them that leverages the specificities of proteomics data. Our algorithm has been tested and compared to other methods on multiple datasets and according to various metrics, and it generally achieves the best performance. Moreover, it is the first algorithm that allows imputation following the trending paradigm of "multi-omics": if it is relevant to the experiment, it can impute more reliably by relying on transcriptomic information, which quantifies the level of messenger RNA expression present in the sample. Finally, Pirat is implemented in a freely available software package, making it easy to use for the proteomic community.
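As a point of reference for the FDR control discussed above, the following sketch implements the classical Benjamini-Hochberg step-up procedure, one well-known theoretically supported method. Whether and how the thesis uses it is not stated in the abstract; the function name `benjamini_hochberg` and the p-values are assumptions for illustration.

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of hypotheses rejected at FDR level `alpha`."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The i-th smallest p-value is compared with alpha * i / m.
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        # Step-up rule: reject every hypothesis up to the largest passing rank.
        k = np.max(np.nonzero(below)[0])
        rejected[order[: k + 1]] = True
    return rejected

# Hypothetical p-values: the smallest one misses its own threshold (0.0125),
# but the second passes (0.024 <= 0.025), so both are rejected.
mask = benjamini_hochberg([0.02, 0.024, 0.5, 0.9], alpha=0.05)
```

The example shows why the procedure is called step-up: a p-value can be rejected because a larger one passes its threshold.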
Héraud, Bousquet Vanina. "Traitement des données manquantes en épidémiologie : Application de l'imputation multiple à des données de surveillance et d'enquêtes". Phd thesis, Université Paris Sud - Paris XI, 2012. http://tel.archives-ouvertes.fr/tel-00713926.
Lorga, Da Silva Ana. "Tratamento de dados omissos e métodos de imputação em classificação". Doctoral thesis, Instituto Superior de Economia e Gestão, 2005. http://hdl.handle.net/10400.5/3849.
This work studies the effect of missing data on the classification of variables, mainly in ascending hierarchical classification but also in non-hierarchical classification (partitioning), according to the following factors: percentage of missing data, imputation method, similarity coefficient and classification criterion. The missing data are assumed to be MAR (missing at random): the presence of missing data does not depend on the missing values or on the variables that have missing data, but on values observed for other variables in the data matrix, so the data are missing at random but not completely at random. The missing data follow a mostly monotone pattern. As techniques without imputation we used the listwise and pairwise methods; as single imputation methods, the EM algorithm, the OLS regression model, the NIPALS algorithm and a PLS regression method. As multiple imputation methods, we adopted a method based on the OLS regression model combined with Bayesian techniques, and we proposed a new multiple imputation method based on PLS regression. To combine the classification structures resulting from multiple imputation, we proposed averaging the similarity matrices and two consensus methods, including an ordinal consensus. As hierarchical classification methods we used classical and probabilistic approaches, the latter based on the VL family (Vraisemblance du Lien, likelihood of the link): single, complete and average linkage, AVL and AVB. For the similarity matrices we used the basic affinity coefficient (for continuous data), which corresponds to the Ochiai index for binary data; Pearson's correlation coefficient; and the probabilistic approximation of the affinity coefficient, centered and reduced by the W-method. The study is based mainly on simulated data, complemented by applications to real data, and covers both continuous and binary data. The Spearman coefficient between the associated ultrametrics is used to compare the hierarchical structures obtained on complete matrices with those obtained from matrices in which data were "erased" and then imputed. The Rand index is used to compare the non-hierarchical structures. We also proposed a non-hierarchical method that "adapts" to missing data. On a real case, Ward's method is used under the same conditions as in the simulations, but also without satisfying a monotone pattern; a Markov chain Monte Carlo method is used for multiple imputation.
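The Rand index used above to compare non-hierarchical classifications can be computed directly from its definition: the share of pairs of objects on which two partitions agree, either by grouping the pair together in both or by separating it in both. The sketch below is a plain, unadjusted implementation; the function name `rand_index` and the example partitions are hypothetical.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of object pairs treated the same way by two partitions."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)

# Hypothetical partitions of four variables: one obtained on complete data,
# one obtained after deleting and imputing values.
complete_partition = [0, 0, 1, 1]
imputed_partition = [0, 1, 1, 1]
score = rand_index(complete_partition, imputed_partition)
```

The index depends only on the groupings, not on the label values, so relabelling the clusters of either partition leaves the score unchanged.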
Marti, soler Helena. "Modélisation des données d'enquêtes cas-cohorte par imputation multiple : Application en épidémiologie cardio-vasculaire". Phd thesis, Université Paris Sud - Paris XI, 2012. http://tel.archives-ouvertes.fr/tel-00779739.
Geronimi, Julia. "Contribution à la sélection de variables en présence de données longitudinales : application à des biomarqueurs issus d'imagerie médicale". Thesis, Paris, CNAM, 2016. http://www.theses.fr/2016CNAM1114/document.
Clinical studies enable us to measure many longitudinal variables. When the goal is to find a link between a response and some covariates, one can use regularisation methods such as the LASSO, which have been extended to Generalized Estimating Equations (GEE). They allow us to select a subgroup of variables of interest while taking intra-patient correlations into account. Databases often have unfilled fields and measurement problems, resulting in inevitable missing data. The objective of this thesis is to integrate missing data into variable selection in the presence of longitudinal data. We use multiple imputation and introduce a new imputation function for the specific case of variables below the detection limit. We provide a new variable selection method for correlated data that integrates missing data: the Multiple Imputation Penalized Generalized Estimating Equations (MI-PGEE). Our operator applies the group-LASSO penalty to the group of estimated regression coefficients of the same variable across multiply-imputed datasets. Our method provides a consistent selection across multiply-imputed datasets, where the optimal shrinkage parameter is chosen by minimizing a BIC-like criterion. We then present an application to knee osteoarthritis that aims to select the subset of biomarkers that best explains the differences in joint space width over time.
Mehanna, Souheir. "Data quality issues in mobile crowdsensing environments". Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPASG053.
Mobile crowdsensing has emerged as a powerful paradigm for harnessing the collective sensing capabilities of mobile devices to gather diverse data in real-world settings. However, ensuring the quality of the collected data in mobile crowdsensing (MCS) environments remains a challenge because low-cost nomadic sensors can be prone to malfunctions, faults, and points of failure. The quality of the collected data can significantly impact the results of the subsequent analyses. Therefore, monitoring the quality of sensor data is crucial for effective analytics. In this thesis, we have addressed some of the issues related to data quality in mobile crowdsensing environments. First, we have explored issues related to data completeness. The mobile crowdsensing context has specific characteristics that are not all captured by the existing factors and metrics. We have proposed a set of quality factors of data completeness suitable for mobile crowdsensing environments. We have also proposed a set of metrics to evaluate each of these factors. In order to improve data completeness, we have tackled the problem of generating missing values. Existing data imputation techniques generate missing values by relying on existing measurements without considering the disparate quality levels of these measurements. We propose a quality-aware data imputation approach that extends existing data imputation techniques by taking into account the quality of the measurements. In the second part of our work, we have focused on anomaly detection, which is another major problem that sensor data face. Existing anomaly detection approaches use available data measurements to detect anomalies, and are oblivious to the quality of the measurements. In order to improve the detection of anomalies, we propose an approach relying on clustering algorithms that detects pattern anomalies while integrating the quality of the sensor into the algorithm. Finally, we have studied the way data quality could be taken into account when analyzing sensor data. We have proposed some contributions that are a first step towards quality-aware sensor data analytics: quality-aware aggregation operators, and an approach that evaluates the quality of a given aggregate considering the data used in its computation.
Libros sobre el tema "Imputation de données manquantes"
Raghunathan, Trivellore, Patricia A. Berglund y Peter W. Solenberger. Multiple Imputation in Practice: With Examples Using IVEware. Taylor & Francis Group, 2018.
Buscar texto completoRaghunathan, Trivellore, Patricia A. Berglund y Peter W. Solenberger. Multiple Imputation in Practice: With Examples Using IVEware. Taylor & Francis Group, 2018.
Buscar texto completoRaghunathan, Trivellore, Patricia A. Berglund y Peter W. Solenberger. Multiple Imputation in Practice: With Examples Using IVEware. Taylor & Francis Group, 2018.
Buscar texto completoMultiple Imputation in Practice: With Examples Using IVEware. Taylor & Francis Group, 2018.
Buscar texto completoBuuren, Stef van. Flexible Imputation of Missing Data, Second Edition. Taylor & Francis Group, 2018.
Buscar texto completoBuuren, Stef van. Flexible Imputation of Missing Data, Second Edition. Taylor & Francis Group, 2018.
Buscar texto completoBuuren, Stef van. Flexible Imputation of Missing Data Second Edition. Taylor & Francis Group, 2021.
Buscar texto completoBuuren, Stef van. Flexible Imputation of Missing Data, Second Edition. Taylor & Francis Group, 2018.
Buscar texto completoFlexible Imputation of Missing Data, Second Edition. Taylor & Francis Group, 2018.
Buscar texto completoBuuren, Stef van. Flexible Imputation of Missing Data. Taylor & Francis Group, 2012.
Buscar texto completoCapítulos de libros sobre el tema "Imputation de données manquantes"
"Le traitement des données manquantes (Missing data)". En La modélisation par équations structurelles avec Mplus, 55–66. Presses de l'Université du Québec, 2018. http://dx.doi.org/10.2307/j.ctvt1sh9g.11.
Texto completo"Le traitement des données manquantes (Missing data)". En La modélisation par équations structurelles avec Mplus, 55–66. Presses de l'Université du Québec, 2018. http://dx.doi.org/10.1515/9782760549739-009.
Texto completo