Selected scholarly literature on the topic "Missing Value Imputation"

Author: Grafiati

Published on 7 July 2024

Last updated on 7 July 2024

Browse the lists of current articles, books, dissertations, reports, and other scholarly sources on the topic "Missing Value Imputation".

Journal articles on the topic "Missing Value Imputation"

Zhao, Yuxuan, Eric Landgrebe, Eliot Shekhtman, and Madeleine Udell. "Online Missing Value Imputation and Change Point Detection with the Gaussian Copula." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 9199–207. http://dx.doi.org/10.1609/aaai.v36i8.20906.

Annotation:

Missing value imputation is crucial for real-world data science workflows. Imputation is harder in the online setting, as it requires the imputation method itself to be able to evolve over time. For practical applications, imputation algorithms should produce imputations that match the true data distribution, handle data of mixed types, including ordinal, boolean, and continuous variables, and scale to large datasets. In this work we develop a new online imputation algorithm for mixed data using the Gaussian copula. The online Gaussian copula model meets all the desiderata: its imputations match the data distribution even for mixed data, and it improves over its offline counterpart in accuracy when the streaming data has a changing distribution and in speed (up to an order of magnitude), especially on large-scale datasets. By fitting the copula model to online data, we also provide a new method to detect change points in the multivariate dependence structure for mixed data with missing values. Experimental results on synthetic and real-world data validate the performance of the proposed methods.

Lu, Kaifeng. "Number of imputations needed to stabilize estimated treatment difference in longitudinal data analysis." Statistical Methods in Medical Research 26, no. 2 (October 10, 2014): 674–90. http://dx.doi.org/10.1177/0962280214554439.

Annotation:

Multiple imputation procedures replace each missing value with a set of plausible values based on the posterior predictive distribution of missing data given observed data. In many applications, as few as five imputations are adequate to achieve high efficiency relative to an infinite number of imputations. However, substantially more imputations are often needed to stabilize imputation-based inference at the analysis stage. Imputation-based inference at the analysis stage is considered stable if the conditional variability of the multiple imputation estimator, half-width of 95% confidence interval, test statistic, and estimated fraction of missing information given observed data is within specified thresholds for simulation error. For the estimation of treatment difference at study end for normally distributed responses in longitudinal trials, we calculate the multiple imputation quantities for an infinite number of imputations analytically and use simulations to assess the variability of the number of imputations needed at the analysis stage in repeated sampling.
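
The efficiency claim in this annotation can be made concrete with a small sketch. It assumes Rubin's classical approximation for the relative efficiency of m imputations, 1 / (1 + gamma / m), where gamma is the fraction of missing information, together with Rubin's rules for pooling. The function names and all numbers are illustrative assumptions and are not taken from the paper.

```python
from statistics import mean, variance


def relative_efficiency(gamma: float, m: int) -> float:
    """Efficiency of m imputations relative to infinitely many (Rubin's approximation)."""
    return 1.0 / (1.0 + gamma / m)


def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances with Rubin's rules."""
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    w = mean(variances)            # average within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    t = w + (1.0 + 1.0 / m) * b    # total variance of the pooled estimate
    return q_bar, t


if __name__ == "__main__":
    # Hypothetical fraction of missing information: 30%.
    for m in (5, 20, 100):
        print(m, round(relative_efficiency(0.3, m), 4))
    # Hypothetical treatment-difference estimates from five imputed datasets.
    print(pool_rubin([1.9, 2.1, 2.0, 2.2, 1.8], [0.04, 0.05, 0.04, 0.05, 0.04]))
```

With gamma = 0.3, five imputations already reach roughly 94% efficiency, which is the sense in which a handful of imputations is adequate. The between-imputation term b is itself estimated from only m values, so the pooled variance, interval width, and test statistic can still change noticeably between repeated runs at small m, which is the stability concern the annotation raises.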

Hameed, Wafaa Mustafa, and Nzar A. Ali. "Missing value imputation Techniques: A Survey." UHD Journal of Science and Technology 7, no. 1 (March 28, 2023): 72–81. http://dx.doi.org/10.21928/uhdjst.v7n1y2023.pp72-81.

Annotation:

Vast amounts of data are accumulated and stored every day. A large number of missing entries in a dataset can be a major problem for analysts because it can cause numerous issues in quantitative research. Numerous methods have been proposed to handle such missing values. This paper reviews different techniques available for imputing unknown information, such as median imputation, hot (cold) deck imputation, regression imputation, expectation maximization, support vector machine imputation, multivariate imputation using chained equations, the SICE method, reinforcement programming, non-parametric iterative imputation algorithms, and multilayer perceptrons. The paper also explores a few suitable choices of methods for estimating missing values that researchers in this field can use. Furthermore, it aims to help them determine which approaches are commonly used now. The overview also presents each technique alongside its advantages and limitations, to be taken into consideration in future studies in this area. It can serve as a baseline for answering the question of which techniques have been used and which is the most popular.

Das, Dipalika, Maya Nayak, and Subhendu Kumar Pani. "Missing Value Imputation-A Review." International Journal of Computer Sciences and Engineering 7, no. 4 (April 30, 2019): 548–58. http://dx.doi.org/10.26438/ijcse/v7i4.548558.

Seu, Kimseth, Mi-Sun Kang, and HwaMin Lee. "An Intelligent Missing Data Imputation Techniques: A Review." JOIV : International Journal on Informatics Visualization 6, no. 1-2 (May 31, 2022): 278. http://dx.doi.org/10.30630/joiv.6.1-2.935.

Annotation:

Incomplete datasets are an unavoidable problem in data preprocessing, since most machine learning algorithms cannot use them to train a model. Various data imputation approaches have been proposed, and compete with each other, to resolve this problem. These approaches predict the most appropriate value using different machine learning algorithms built on various concepts. Furthermore, accurate estimation by the imputation method is critical when completing missing values in some datasets, especially medical data. The purpose of this paper is to assess the power of prominent state-of-the-art benchmarks: the K-nearest Neighbors Imputation (KNNImputer) method, the Bayesian Principal Component Analysis (BPCA) imputation method, the Multiple Imputation by Chained Equations (MICE) method, and the Multiple Imputation with denoising autoencoder neural network (MIDAS) method. These methods offer workable ways to optimize and evaluate appropriate data points for imputing missing values. We run the experiment with all of these imputation techniques on the same four datasets, which were collected from a hospital. Both Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are used to measure the results and to compare the methods in order to identify a robust and appropriate method for missing data problems. In the experiment, KNNImputer and MICE performed better than BPCA and MIDAS, and BPCA performed better than MIDAS.
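
To make the evaluation protocol in this annotation concrete, here is a minimal sketch under stated assumptions: values are hidden from a complete toy table, filled in by a simplified k-nearest-neighbours rule, and scored with MAE and RMSE against the hidden truth. The `knn_impute` function is a hand-written stand-in written for this illustration, not scikit-learn's `KNNImputer`, and the data are invented rather than taken from the paper.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance between two rows is the root mean squared difference over the
    columns observed in both rows (a simplification chosen for this sketch).
    """
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_idx, other in enumerate(rows):
                if other_idx == i or other[j] is None:
                    continue  # a donor must have the column we need
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if shared:
                    dist = math.sqrt(sum(shared) / len(shared))
                    candidates.append((dist, other[j]))
            candidates.sort(key=lambda pair: pair[0])
            donors = [v for _, v in candidates[:k]]
            if donors:
                imputed[i][j] = sum(donors) / len(donors)
    return imputed


def mae(truth, pred):
    """Mean absolute error between true and imputed values."""
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)


def rmse(truth, pred):
    """Root mean square error between true and imputed values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(truth, pred)) / len(truth))


if __name__ == "__main__":
    complete = [[1.0, 2.0], [1.1, 2.2], [0.9, 1.8],
                [5.0, 10.0], [5.2, 10.4], [4.8, 9.6]]
    masked = [row[:] for row in complete]
    hidden = [(1, 1), (4, 1)]          # cells removed on purpose
    for r, c in hidden:
        masked[r][c] = None
    filled = knn_impute(masked, k=2)
    truth = [complete[r][c] for r, c in hidden]
    pred = [filled[r][c] for r, c in hidden]
    print("MAE:", mae(truth, pred), "RMSE:", rmse(truth, pred))
```

Scoring only the deliberately hidden cells is what makes the comparison between imputers fair: every method is judged on the same positions, whose true values are known.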

Huang, Min-Wei, Wei-Chao Lin, and Chih-Fong Tsai. "Outlier Removal in Model-Based Missing Value Imputation for Medical Datasets." Journal of Healthcare Engineering 2018 (2018): 1–9. http://dx.doi.org/10.1155/2018/1817479.

Annotation:

Many real-world medical datasets contain some proportion of missing (attribute) values. In general, missing value imputation can be performed to solve this problem, which is to provide estimations for the missing values by a reasoning process based on the (complete) observed data. However, if the observed data contain some noisy information or outliers, the estimations of the missing values may not be reliable or may even be quite different from the real values. The aim of this paper is to examine whether a combination of instance selection from the observed data and missing value imputation offers better performance than performing missing value imputation alone. In particular, three instance selection algorithms, DROP3, GA, and IB3, and three imputation algorithms, KNNI, MLP, and SVM, are used in order to find out the best combination. The experimental results show that performing instance selection can have a positive impact on missing value imputation over the numerical data type of medical datasets, and specific combinations of instance selection and imputation methods can improve the imputation results over the mixed data type of medical datasets. However, instance selection does not have a definitely positive impact on the imputation result for categorical medical datasets.

Kumar, Nishith, Md Aminul Hoque, Md Shahjaman, S. M. Shahinul Islam, and Md Nurul Haque Mollah. "A New Approach of Outlier-robust Missing Value Imputation for Metabolomics Data Analysis." Current Bioinformatics 14, no. 1 (December 6, 2018): 43–52. http://dx.doi.org/10.2174/1574893612666171121154655.

Annotation:

Background: Metabolomics data generation and quantification are different from other types of molecular “omics” data in bioinformatics. Mass spectrometry (MS) based metabolomics data (gas chromatography mass spectrometry (GC-MS), liquid chromatography mass spectrometry (LC-MS), etc.) frequently contain missing values that make some quantitative analyses complex. Typically, metabolomics datasets contain 10% to 20% missing values that arise for several reasons, including analytical, computational, and biological hazards. Imputation of missing values is a very important and interesting issue for further metabolomics data analysis. Objective: This paper introduces a new algorithm for missing value imputation in the presence of outliers for metabolomics data analysis. Method: Currently, the best-known missing value imputation techniques for metabolomics data are k-nearest neighbours (kNN), random forest (RF), and zero imputation. However, these techniques are sensitive to outliers. In this paper, we propose an outlier-robust missing value imputation technique that minimizes a two-way empirical mean absolute error (MAE) loss function for imputing missing values in metabolomics data. Results: We investigated the performance of the proposed technique in comparison with other traditional imputation techniques, using both simulated and real data analysis, in the absence and presence of outliers. Conclusion: Results of both simulated and real data analyses show that the proposed outlier-robust imputation technique performs better than the traditional imputation methods in both the absence and presence of outliers.

Zimmermann, Pavel, Petr Mazouch, and Klára Hulíková Tesárková. "Missing Categorical Data Imputation and Individual Observation Level Imputation." Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis 62, no. 6 (2014): 1527–34. http://dx.doi.org/10.11118/actaun201462061527.

Annotation:

Traditional imputation schemes for missing data focus on predicting the missing value from other observed values. For continuous missing data, imputation often relies on regression models. For categorical data, the usual techniques are classification techniques that set the missing value to the ‘most likely’ category. This, however, leads to overrepresentation of the categories that are in general observed more often, and hence can lead to biased results in many tasks, especially when dominant categories are present. We present an original methodology for imputing missing values that results in the most likely structure (distribution) of the missing data conditional on the observed values. The methodology is based on the assumption that the categorical variable containing the missing values has a multinomial distribution. The parameters of this distribution are then estimated using multinomial logistic regression. An illustrative example is described in which missing values of the highest education level of persons in a population are reconstructed.
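
The overrepresentation problem described in this annotation can be illustrated with a small sketch. It assumes that category probabilities for the records with missing values are already available (in the paper they would come from a multinomial logistic regression) and that all missing records share the same probabilities. The largest-remainder allocation used here is an assumption made for illustration, not the authors' algorithm, and the category names and probabilities are hypothetical.

```python
def impute_by_mode(probabilities, n_missing):
    """Assign every missing record to the single most probable category."""
    top = max(probabilities, key=probabilities.get)
    return {c: (n_missing if c == top else 0) for c in probabilities}


def impute_by_expected_counts(probabilities, n_missing):
    """Split the missing records across categories in proportion to their
    probabilities, using largest-remainder rounding to keep whole counts."""
    exact = {c: p * n_missing for c, p in probabilities.items()}
    counts = {c: int(v) for c, v in exact.items()}
    leftover = n_missing - sum(counts.values())
    # Hand the remaining records to the categories with the largest fractions.
    by_remainder = sorted(exact, key=lambda c: exact[c] - counts[c], reverse=True)
    for c in by_remainder[:leftover]:
        counts[c] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical probabilities of the highest education level.
    probs = {"primary": 0.2, "secondary": 0.5, "tertiary": 0.3}
    print(impute_by_mode(probs, 7))
    print(impute_by_expected_counts(probs, 7))
```

Mode imputation places all seven records in the dominant category, while the proportional allocation keeps the imputed distribution close to the estimated one, which is the bias the annotation warns about.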

H.Mohamed, Marghny, Abdel-Rahiem A. Hashem, and Mohammed M. Abdelsamea. "Scalable Algorithms for Missing Value Imputation." International Journal of Computer Applications 87, no. 11 (February 14, 2014): 35–42. http://dx.doi.org/10.5120/15255-4019.

Gashler, Michael S, Michael R Smith, Richard Morris, and Tony Martinez. "Missing Value Imputation with Unsupervised Backpropagation." Computational Intelligence 32, no. 2 (July 1, 2014): 196–215. http://dx.doi.org/10.1111/coin.12048.

Dissertations on the topic "Missing Value Imputation"

Aslan, Sipan. "Comparison Of Missing Value Imputation Methods For Meteorological Time Series Data." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12612426/index.pdf.

Annotation:

Dealing with missing data in spatio-temporal time series constitutes an important branch of the general missing data problem. Since the statistical properties of time-dependent data are characterized by the sequentiality of observations, any interruption of consecutiveness in a time series will cause severe problems. To make reliable analyses in this case, missing data must be handled cautiously, without disturbing the statistical properties of the series, mainly its temporal and spatial dependencies. In this study we aimed to compare several imputation methods for the appropriate completion of missing values in spatio-temporal meteorological time series. For this purpose, several imputation methods are assessed on their performance for artificially created missing data in monthly total precipitation and monthly mean temperature series obtained from the climate stations of the Turkish State Meteorological Service. Artificially created missing data are estimated using six methods. Single Arithmetic Average (SAA), Normal Ratio (NR), and NR Weighted with Correlations (NRWC) are the three simple methods used in the study. In addition, we used two computationally intensive methods for missing data imputation: the Multi Layer Perceptron type Neural Network (MLPNN) and Monte Carlo Markov Chain based on the Expectation-Maximization Algorithm (EM-MCMC). We also propose a modification of the EM-MCMC method in which the results of the simple imputation methods are used as auxiliary variables. Besides using an accuracy measure based on squared errors, we propose the Correlation Dimension (CD) technique for appropriate evaluation of imputation performance, which is also an important subject of Nonlinear Dynamic Time Series Analysis.
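
Two of the simple methods named in this annotation can be sketched as follows, assuming their usual textbook definitions: the arithmetic average takes the mean of the neighbouring stations' values for the same month, and the normal ratio method rescales each neighbour's value by the ratio of the target station's long-term normal to that neighbour's normal before averaging. The thesis may define or weight them differently, and the station normals and monthly values below are made up.

```python
def simple_average(neighbour_values):
    """Estimate the missing value as the mean of the neighbours' values."""
    return sum(neighbour_values) / len(neighbour_values)


def normal_ratio(target_normal, neighbour_normals, neighbour_values):
    """Estimate the missing value by scaling each neighbour's value with the
    ratio of long-term normals (target / neighbour), then averaging."""
    scaled = [target_normal / normal * value
              for normal, value in zip(neighbour_normals, neighbour_values)]
    return sum(scaled) / len(scaled)


if __name__ == "__main__":
    # Hypothetical long-term annual precipitation normals (mm).
    target_normal = 600.0
    neighbour_normals = [500.0, 750.0, 600.0]
    # Hypothetical precipitation at the neighbours in the missing month (mm).
    neighbour_values = [40.0, 60.0, 54.0]

    print("SAA:", simple_average(neighbour_values))
    print("NR: ", normal_ratio(target_normal, neighbour_normals, neighbour_values))
```

The two estimates differ whenever the neighbouring stations are systematically wetter or drier than the target station, which is the situation the normal ratio correction is meant for.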

Andersson, Joacim, and Henrik Falk. "Missing Data in Value-at-Risk Analysis: Conditional Imputation in Optimal Portfolios Using Regression." Thesis, KTH, Matematisk statistik, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-122276.

Annotation:

A regression-based method is presented in order to regenerate missing data points in stock return time series. The method uses only complete time series of assets in optimal portfolios, in which the returns of the underlying tend to correlate inadequately with each other. The study shows that the method is able to replicate empirical VaR-backtesting results where all data are available, even when up to 90% of the time series in half of the assets in the portfolios have been removed.

Bischof, Stefan, Andreas Harth, Benedikt Kämpgen, Axel Polleres, and Patrik Schneider. "Enriching integrated statistical open city data by combining equational knowledge and missing value imputation." Elsevier, 2017. http://dx.doi.org/10.1016/j.websem.2017.09.003.

Annotation:

Several institutions collect statistical data about cities, regions, and countries for various purposes. Yet, while access to high quality and recent such data is both crucial for decision makers and a means for achieving transparency to the public, all too often such collections of data remain isolated and not re-useable, let alone comparable or properly integrated. In this paper we present the Open City Data Pipeline, a focused attempt to collect, integrate, and enrich statistical data collected at city level worldwide, and re-publish the resulting dataset in a re-useable manner as Linked Data. The main features of the Open City Data Pipeline are: (i) we integrate and cleanse data from several sources in a modular and extensible, always up-to-date fashion; (ii) we use both Machine Learning techniques and reasoning over equational background knowledge to enrich the data by imputing missing values, (iii) we assess the estimated accuracy of such imputations per indicator. Additionally, (iv) we make the integrated and enriched data, including links to external data sources, such as DBpedia, available both in a web browser interface and as machine-readable Linked Data, using standard vocabularies such as QB and PROV. Apart from providing a contribution to the growing collection of data available as Linked Data, our enrichment process for missing values also contributes a novel methodology for combining rule-based inference about equational knowledge with inferences obtained from statistical Machine Learning approaches. While most existing works about inference in Linked Data have focused on ontological reasoning in RDFS and OWL, we believe that these complementary methods and particularly their combination could be fruitfully applied also in many other domains for integrating Statistical Linked Data, independent from our concrete use case of integrating city data.

Jagirdar, Suresh. "Investigation into Regression Analysis of Multivariate Additional Value and Missing Value Data Models Using Artificial Neural Networks and Imputation Techniques." Ohio University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1219343139.

Bala, Abdalla. "Impact analysis of a multiple imputation technique for handling missing value in the ISBSG repository of software projects." Mémoire, École de technologie supérieure, 2013. http://espace.etsmtl.ca/1236/1/BALA_Abdalla.pdf.

Annotation:

Until the early 2000s, most empirical studies for building software project estimation models were carried out with very small samples (fewer than 20 projects), while only a few studies used larger samples (between 60 and 90 projects). With the establishment of a repository of software projects by the International Software Benchmarking Standards Group (ISBSG), a larger data set is now available for building estimation models: release 12 of the ISBSG repository, from 2013, contains more than 6,000 projects, which provides a more adequate basis for statistical studies. However, in the ISBSG repository a large number of values are missing for a significant number of variables, which makes it rather difficult to use for research projects. To improve the development of estimation models, the aim of this research project is to tackle the new problems of access to larger databases in software engineering by using the multiple imputation technique to account for missing data and outliers in the analyses.

Etourneau, Lucas. "Contrôle du FDR et imputation de valeurs manquantes pour l'analyse de données de protéomiques par spectrométrie de masse." Electronic Thesis or Diss., Université Grenoble Alpes, 2024. http://www.theses.fr/2024GRALS001.

Annotation:

Proteomics involves characterizing the proteome of a biological sample, that is, the set of proteins it contains, and doing so as exhaustively as possible. By identifying and quantifying protein fragments that are analyzable by mass spectrometry (known as peptides), proteomics provides access to the level of gene expression at a given moment. This is crucial information for improving the understanding of molecular mechanisms at play within living organisms. These experiments produce large amounts of data, often complex to interpret and subject to various biases. They require reliable data processing methods that ensure a certain level of quality control, so as to guarantee the relevance of the resulting biological conclusions. The work of this thesis focuses on improving this data processing, and specifically on the following two major points. The first is controlling the false discovery rate (FDR) when identifying either (1) peptides or (2) quantitatively differential biomarkers between a tested biological condition and its negative control. Our contributions focus on establishing links between the empirical methods stemming from proteomic practice and other theoretically supported methods. This notably allows us to provide directions for the improvement of FDR control methods used for peptide identification. The second point focuses on managing missing values, which are often numerous and complex in nature, making them impossible to ignore. Specifically, we have developed a new algorithm for imputing them that leverages the specificities of proteomics data. Our algorithm has been tested and compared to other methods on multiple datasets and according to various metrics, and it generally achieves the best performance. Moreover, it is the first algorithm that allows imputation following the trending paradigm of "multi-omics": if it is relevant to the experiment, it can impute more reliably by relying on transcriptomic information, which quantifies the level of messenger RNA expression present in the sample. Finally, Pirat is implemented in a freely available software package, making it easy to use for the proteomics community.

Gheyas, Iffat A. "Novel computationally intelligent machine learning algorithms for data mining and knowledge discovery." Thesis, University of Stirling, 2009. http://hdl.handle.net/1893/2152.

Annotation:

This thesis addresses three major issues in data mining regarding feature subset selection in large dimensionality domains, plausible reconstruction of incomplete data in cross-sectional applications, and forecasting univariate time series. For the automated selection of an optimal subset of features in real time, we present an improved hybrid algorithm: SAGA. SAGA combines the ability to avoid being trapped in local minima of Simulated Annealing with the very high convergence rate of the crossover operator of Genetic Algorithms, the strong local search ability of greedy algorithms and the high computational efficiency of generalized regression neural networks (GRNN). For imputing missing values and forecasting univariate time series, we propose a homogeneous neural network ensemble. The proposed ensemble consists of a committee of Generalized Regression Neural Networks (GRNNs) trained on different subsets of features generated by SAGA and the predictions of base classifiers are combined by a fusion rule. This approach makes it possible to discover all important interrelations between the values of the target variable and the input features. The proposed ensemble scheme has two innovative features which make it stand out amongst ensemble learning algorithms: (1) the ensemble makeup is optimized automatically by SAGA; and (2) GRNN is used for both base classifiers and the top level combiner classifier. Because of GRNN, the proposed ensemble is a dynamic weighting scheme. This is in contrast to the existing ensemble approaches which belong to the simple voting and static weighting strategy. The basic idea of the dynamic weighting procedure is to give a higher reliability weight to those scenarios that are similar to the new ones. The simulation results demonstrate the validity of the proposed ensemble model.

Alarcon, Sergio Arciniegas. "Imputação de dados em experimentos multiambientais: novos algoritmos utilizando a decomposição por valores singulares." Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-10052016-130506/.

Annotation:

Biplot analyses that use the additive main effects and multiplicative interaction (AMMI) models require complete data matrices, but multi-environment trials often have missing values. This thesis proposes new methods of single and multiple imputation that can be used to analyze unbalanced data in experiments with genotype by environment interaction (G×E). The first is a new extension of the cross-validation method by eigenvector (Bro et al., 2008). The second is a new non-parametric algorithm obtained through modifications of the single imputation method developed by Yan (2013). Also included is a study that considers imputation systems recently reported in the literature and compares them with the classic procedure recommended for imputation in G×E trials, that is, the combination of the Expectation-Maximization (EM) algorithm with the AMMI models, or EM-AMMI. Finally, generalizations are provided of the single imputation described by Arciniegas-Alarcón et al. (2010), which combines regression with lower-rank approximation of a matrix. All the methodologies are based on singular value decomposition (SVD) and are therefore free of any distributional or structural assumptions. To determine the performance of the new imputation schemes, simulations were performed based on real data sets of different species, with values deleted randomly at different percentages, and the quality of the imputations was evaluated using different statistics. It was concluded that SVD provides a useful and flexible tool for the construction of efficient techniques that circumvent the problem of missing data in experimental matrices.

Bengtsson, Fanny, and Klara Lindblad. "Methods for handling missing values: A simulation study comparing imputation methods for missing values on a Poisson distributed explanatory variable." Thesis, Uppsala universitet, Statistiska institutionen, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-432467.

Huo, Zhao. “A Comparsion of Multiple Imputation Methods for Missing Covariate Values in Recurrent Event Data.” Thesis, Uppsala universitet, Statistiska institutionen, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-256602.

Annotation:

Multiple imputation (MI) is a commonly used approach to handling missing data. This thesis studies missing covariates in recurrent event data and discusses ways to include the survival outcomes in the imputation model. The MI methods under consideration combine the event indicator D with, respectively, the right-censored event times T, the logarithm of T, and the cumulative baseline hazard H0(T). After imputation, the analysis proceeds on the completed data. The Cox proportional hazards (PH) model and the Prentice-Williams-Peterson (PWP) model are chosen as the analysis models, and the coefficient estimates are of substantive interest. A Monte Carlo simulation study compares the MI methods, using relative bias and mean squared error as evaluation criteria. In addition, an empirical study is conducted on cardiovascular disease event data that contain missing values. Overall, the results show that MI based on the Nelson-Aalen estimate of H0(T) is preferred in most circumstances.
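
As a rough illustration of the quantity named in this annotation, the sketch below computes a Nelson-Aalen cumulative hazard estimate for right-censored data and evaluates it at each subject's own time, so that it could be passed to an imputation model together with the event indicator D. It assumes a single event per subject, which simplifies the recurrent-event setting of the thesis, and the function name `nelson_aalen` is a hypothetical choice.

```python
import numpy as np

def nelson_aalen(times, events):
    """Nelson-Aalen cumulative hazard evaluated at each subject's time.

    times  : observed (possibly right-censored) times T
    events : event indicator D (1 = event observed, 0 = censored)
    Simplified single-event illustration, not the thesis's procedure.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    # Distinct times at which at least one event occurred, sorted.
    event_times = np.unique(times[events == 1])
    # Number of subjects still at risk just before each event time.
    at_risk = np.array([(times >= t).sum() for t in event_times])
    # Number of events observed at each event time.
    n_events = np.array([((times == t) & (events == 1)).sum() for t in event_times])
    cum_hazard = np.cumsum(n_events / at_risk)
    # Step function: sum of increments over event times <= each subject's time.
    idx = np.searchsorted(event_times, times, side="right")
    return np.concatenate(([0.0], cum_hazard))[idx]

# Usage: four subjects, the second one censored at time 2.
H = nelson_aalen([1, 2, 3, 4], [1, 0, 1, 1])
```

The returned vector can then be added as a column next to D when building the predictor set for the covariate imputation model.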

Books on the topic "Missing Value Imputation"

Templ, Matthias. Visualization and Imputation of Missing Values. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-30073-8.

Subramanian, Rajesh. Transitioning to multiple imputation: A new method to impute missing blood alcohol concentration (BAC) values in FARS. Washington, D.C.: National Highway Traffic Safety Administration, National Center for Statistics and Analysis, 2002.

Missing Value Imputation. India: Starttech Educational Services LLP, 2020. http://dx.doi.org/10.4135/9781529630756.

Book chapters on the topic "Missing Value Imputation"

Raja, P. S., and K. Thangavel. “Soft Clustering Based Missing Value Imputation.” In Digital Connectivity – Social Impact, 119–33. Singapore: Springer Singapore, 2016. http://dx.doi.org/10.1007/978-981-10-3274-5_10.

Manna, Sweta, and Soumen Kumar Pati. “Missing Value Imputation Using Correlation Coefficient.” In Computational Intelligence in Pattern Recognition, 551–58. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-2449-3_47.

Sujatha, M., G. Lavanya Devi, K. Srinivasa Rao and N. Ramesh. “Rough Set Theory Based Missing Value Imputation.” In Cognitive Science and Health Bioinformatics, 97–106. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-6653-5_9.

Shi, Yi, Zhipeng Cai and Guohui Lin. “Classification Accuracy Based Microarray Missing Value Imputation.” In Bioinformatics Algorithms, 303–27. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2007. http://dx.doi.org/10.1002/9780470253441.ch14.

Rashid, Wajeeha, and Manoj Kumar Gupta. “A Perspective of Missing Value Imputation Approaches.” In Advances in Intelligent Systems and Computing, 307–15. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-1275-9_25.

Rashid, Wajeeha, Sakshi Arora and Manoj Kumar Gupta. “Missing Value Imputation Approach Using Cosine Similarity Measure.” In Advances in Intelligent Systems and Computing, 557–65. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-5113-0_44.

Wu, Jiahua, Xiangyan Tang, Guangxing Liu and Bofan Wu. “An Overview of Graph Data Missing Value Imputation.” In Communications in Computer and Information Science, 256–70. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-1280-9_20.

Gond, Vikesh Kumar, Aditya Dubey, Akhtar Rasool and Nilay Khare. “Missing Value Imputation Using Weighted KNN and Genetic Algorithm.” In ICT Analysis and Applications, 161–69. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-5224-1_18.

Cheng, Yu, Lan Wang and Jinglu Hu. “A Quasi-linear Approach for Microarray Missing Value Imputation.” In Neural Information Processing, 233–40. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-24955-6_28.

Singh, Ninni, Anum Javeed, Sheenu Chhabra and Pardeep Kumar. “Missing Value Imputation with Unsupervised Kohonen Self Organizing Map.” In Emerging Research in Computing, Information, Communication and Applications, 61–76. New Delhi: Springer India, 2015. http://dx.doi.org/10.1007/978-81-322-2550-8_7.

Conference papers on the topic "Missing Value Imputation"

Luo, Fei, Hangwei Qian, Di Wang, Xu Guo, Yan Sun, Eng Sing Lee, Hui Hwang Teong, Ray Tian Rui Lai and Chunyan Miao. “Missing Value Imputation for Diabetes Prediction.” In 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 2022. http://dx.doi.org/10.1109/ijcnn55064.2022.9892398.

He, Chong, Hui-Hui Li, Changbo Zhao, Guo-Zheng Li and Wei Zhang. “Triple imputation for microarray missing value estimation.” In 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2015. http://dx.doi.org/10.1109/bibm.2015.7359682.

Karanikola, Aikaterini, and Sotiris Kotsiantis. “A hybrid method for missing value imputation.” In PCI '19: 23rd Pan-Hellenic Conference on Informatics. New York, NY, USA: ACM, 2019. http://dx.doi.org/10.1145/3368640.3368653.

Aidos, Helena, and Pedro Tomas. “Neighborhood-aware autoencoder for missing value imputation.” In 2020 28th European Signal Processing Conference (EUSIPCO). IEEE, 2021. http://dx.doi.org/10.23919/eusipco47968.2020.9287580.

Rachmawan, Irene Erlyn Wina, and Ali Ridho Barakbah. “Optimization of missing value imputation using Reinforcement Programming.” In 2015 International Electronics Symposium (IES). IEEE, 2015. http://dx.doi.org/10.1109/elecsym.2015.7380828.

Lee, Namgil. “Block Tensor Train Decomposition for Missing Value Imputation.” In 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2018. http://dx.doi.org/10.23919/apsipa.2018.8659560.

Li, Hui-Hui, Feng-Feng Shao and Guo-Zheng Li. “Semi-supervised imputation for microarray missing value estimation.” In 2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2014. http://dx.doi.org/10.1109/bibm.2014.6999172.

Zhang, Chengqi, Yongsong Qin, Xiaofeng Zhu, Jilian Zhang and Shichao Zhang. “Clustering-based Missing Value Imputation for Data Preprocessing.” In 2006 IEEE International Conference on Industrial Informatics. IEEE, 2006. http://dx.doi.org/10.1109/indin.2006.275767.

Xu, Zhen, and Sargur N. Srihari. “Missing value imputation: with application to handwriting data.” In IS&T/SPIE Electronic Imaging, edited by Eric K. Ringger and Bart Lamiroy. SPIE, 2015. http://dx.doi.org/10.1117/12.2075842.

Bou, Savong, Toshiyuki Amagasa, Hiroyuki Kitagawa, Salman Ahmed Shaikh and Akiyoshi Matono. “Efficient Missing Value Imputation by Maximum Distance Likelihood.” In 2023 IEEE International Conference on Big Data (BigData). IEEE, 2023. http://dx.doi.org/10.1109/bigdata59044.2023.10386584.

Reports of organizations on the topic "Missing Value Imputation"

Sukasih, Amang S., and Victoria Scott. Cyclical Tree-Based Hot Deck Imputation. RTI Press, June 2023. http://dx.doi.org/10.3768/rtipress.2023.mr.0052.2307.

Annotation:

Hot deck imputation is a method for filling in a missing value in a survey item (item nonrespondent) with a valid reported value from a donor (item respondent) within the survey. Our paper presents a multivariate hot deck imputation method called Cyclical Tree-Based Hot Deck (CTBHD). This method was developed to handle missing values in complex survey data with many different types of variables and allows the user to customize imputation classes, use sorting variables, impute vectors and compositional variables, and even edit or recode data “on-the-fly.” Additionally, CTBHD employs a cycling approach to get more stable imputed values with less bias and variance. Our paper evaluates the performance of CTBHD imputation through a simulation study using publicly available survey data from the 2020 Residential Energy Consumption Survey. Developed as a system for imputation, the CTBHD system is proprietary to RTI International.
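
The basic hot deck idea defined at the start of this annotation can be sketched as follows. This is a plain random hot deck within imputation classes, not the proprietary CTBHD system: it has no tree-based classes, sorting variables or cycling, and the names `hot_deck`, `item`, `class_key` and `seed` are hypothetical.

```python
import random
from collections import defaultdict

def hot_deck(records, item, class_key, seed=0):
    """Fill missing `item` values with a random donor value from the same class.

    records   : list of dicts; a missing item is represented by None
    item      : name of the survey item to impute
    class_key : name of the variable that defines the imputation classes
    Plain random hot deck for illustration; not the CTBHD method.
    """
    rng = random.Random(seed)
    # Collect reported values (donors) separately for each imputation class.
    donors = defaultdict(list)
    for rec in records:
        if rec[item] is not None:
            donors[rec[class_key]].append(rec[item])
    imputed = []
    for rec in records:
        rec = dict(rec)  # copy so the input records are left unchanged
        pool = donors[rec[class_key]]
        # An item nonrespondent receives a value only if its class has a donor.
        if rec[item] is None and pool:
            rec[item] = rng.choice(pool)
        imputed.append(rec)
    return imputed

# Usage: impute "income" within "region" classes.
survey = [
    {"region": "A", "income": 10},
    {"region": "A", "income": None},
    {"region": "B", "income": 30},
    {"region": "B", "income": None},
]
completed = hot_deck(survey, item="income", class_key="region")
```

Drawing donors only within a class keeps imputed values plausible for similar respondents; a class with no donor is left missing here rather than borrowing from another class.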

Kott, Phillip S. The Role of Weights in Regression Modeling and Imputation. RTI Press, April 2022. http://dx.doi.org/10.3768/rtipress.2022.mr.0047.2203.

Annotation:

When fitting observations from a complex survey, the standard regression model assumes that the expected value of the difference between the dependent variable and its model-based prediction is zero, regardless of the values of the explanatory variables. A rarely failing extended regression model assumes only that the model error is uncorrelated with the model’s explanatory variables. When the standard model holds, it is possible to create alternative analysis weights that retain the consistency of the model-parameter estimates while increasing their efficiency by scaling the inverse-probability weights by an appropriately chosen function of the explanatory variables. When a regression model is used to impute for missing item values in a complex survey and when item missingness is a function of the explanatory variables of the regression model and not the item value itself, near unbiasedness of an estimated item mean requires that either the standard regression model for the item in the population holds or the analysis weights incorporate a correctly specified and consistently estimated probability of item response. By estimating the parameters of the probability of item response with a calibration equation, one can sometimes account for item missingness that is (partially) a function of the item value itself.

Huang, Lei, Meng Song, Hui Shen, Huixiao Hong, Ping Gong, Hong-Wen Deng and Chaoyang Zhang. Deep learning methods for omics data imputation. Engineer Research and Development Center (U.S.), February 2024. http://dx.doi.org/10.21079/11681/48221.

Annotation:

One common problem in omics data analysis is missing values, which can arise due to various reasons, such as poor tissue quality and insufficient sample volumes. Instead of discarding missing values and related data, imputation approaches offer an alternative means of handling missing data. However, the imputation of missing omics data is a non-trivial task. Difficulties mainly come from high dimensionality, non-linear or nonmonotonic relationships within features, technical variations introduced by sampling methods, sample heterogeneity, and the non-random missingness mechanism. Several advanced imputation methods, including deep learning-based methods, have been proposed to address these challenges. Due to its capability of modeling complex patterns and relationships in large and high-dimensional datasets, many researchers have adopted deep learning models to impute missing omics data. This review provides a comprehensive overview of the currently available deep learning-based methods for omics imputation from the perspective of deep generative model architectures such as autoencoder, variational autoencoder, generative adversarial networks, and Transformer, with an emphasis on multi-omics data imputation. In addition, this review also discusses the opportunities that deep learning brings and the challenges that it might face in this field.

Cao, Honggao. IMPUTE: A SAS Application System for Missing Value Imputations--With Special Reference to HRS Income/Assets. Institute for Social Research, University of Michigan, 2001. http://dx.doi.org/10.7826/isr-um.06.585031.001.05.0006.2001.
