A selection of scholarly literature on the topic "Missing Value Imputation"

Browse the lists of current articles, books, dissertations, conference abstracts, and other scholarly sources on the topic "Missing Value Imputation".

Journal articles on the topic "Missing Value Imputation":

1

Zhao, Yuxuan, Eric Landgrebe, Eliot Shekhtman, and Madeleine Udell. "Online Missing Value Imputation and Change Point Detection with the Gaussian Copula." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 8 (June 28, 2022): 9199–207. http://dx.doi.org/10.1609/aaai.v36i8.20906.

Abstract:
Missing value imputation is crucial for real-world data science workflows. Imputation is harder in the online setting, as it requires the imputation method itself to be able to evolve over time. For practical applications, imputation algorithms should produce imputations that match the true data distribution, handle data of mixed types, including ordinal, boolean, and continuous variables, and scale to large datasets. In this work we develop a new online imputation algorithm for mixed data using the Gaussian copula. The online Gaussian copula model meets all the desiderata: its imputations match the data distribution even for mixed data, and it improves over its offline counterpart in accuracy when the streaming data has a changing distribution and in speed (up to an order of magnitude), especially on large-scale datasets. By fitting the copula model to online data, we also provide a new method to detect change points in the multivariate dependence structure for mixed data with missing values. Experimental results on synthetic and real world data validate the performance of the proposed methods.
2

Lu, Kaifeng. "Number of imputations needed to stabilize estimated treatment difference in longitudinal data analysis." Statistical Methods in Medical Research 26, no. 2 (October 10, 2014): 674–90. http://dx.doi.org/10.1177/0962280214554439.

Abstract:
Multiple imputation procedures replace each missing value with a set of plausible values based on the posterior predictive distribution of missing data given observed data. In many applications, as few as five imputations are adequate to achieve high efficiency relative to an infinite number of imputations. However, substantially more imputations are often needed to stabilize imputation-based inference at the analysis stage. Imputation-based inference at the analysis stage is considered stable if the conditional variability of the multiple imputation estimator, half-width of 95% confidence interval, test statistic, and estimated fraction of missing information given observed data is within specified thresholds for simulation error. For the estimation of treatment difference at study end for normally distributed responses in longitudinal trials, we calculate the multiple imputation quantities for an infinite number of imputations analytically and use simulations to assess the variability of the number of imputations needed at the analysis stage in repeated sampling.
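As a rough illustration of the efficiency claim in the abstract above, the sketch below evaluates Rubin's standard relative-efficiency expression, 1 / (1 + gamma / m), for m imputations and a fraction of missing information gamma. The abstract does not state this formula, and the value gamma = 0.3 and the function name are assumptions chosen only for illustration.

```python
def relative_efficiency(m: int, gamma: float) -> float:
    """Efficiency of m imputations relative to infinitely many (Rubin's formula).

    gamma is the fraction of missing information, assumed known here.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return 1.0 / (1.0 + gamma / m)


if __name__ == "__main__":
    gamma = 0.3  # illustrative assumption, not a value from the paper
    for m in (3, 5, 10, 20, 100):
        print(f"m={m:3d}  relative efficiency={relative_efficiency(m, gamma):.4f}")
```

With gamma = 0.3, five imputations already give a relative efficiency of about 0.94, so it is the stability criteria studied in the paper, not efficiency alone, that can call for a larger m.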
3

Hameed, Wafaa Mustafa, and Nzar A. Ali. "Missing value imputation Techniques: A Survey." UHD Journal of Science and Technology 7, no. 1 (March 28, 2023): 72–81. http://dx.doi.org/10.21928/uhdjst.v7n1y2023.pp72-81.

Abstract:
A large amount of information is accumulated and stored every day. A large number of missing values in a dataset can be a major problem for analysts because it can cause numerous issues in quantitative investigations. To handle such missing values, numerous methods have been proposed. This paper offers a review of different techniques available for imputation of unknown information, such as median imputation, hot (cold) deck imputation, regression imputation, expectation maximization, support vector machine imputation, multivariate imputation using chained equations, SICE method, reinforcement programming, non-parametric iterative imputation algorithms, and multilayer perceptrons. This paper also explores a few satisfactory choices of methods for estimating missing values that researchers in this field can use. Furthermore, it aims to help them work out which approach is commonly used now; the overview also describes each technique along with its benefits and limitations, to be taken into consideration in future studies in this area. It can be taken as a baseline for answering the question of which techniques have been used and which is the most popular.
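Of the techniques listed in the abstract above, median imputation is the simplest to show in code. The following is a minimal sketch using only the Python standard library; the function name and the use of None as the missing-value marker are assumptions made for illustration, not conventions taken from the survey.

```python
from statistics import median
from typing import List, Optional


def impute_median(column: List[Optional[float]]) -> List[float]:
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed)
    return [fill if v is None else v for v in column]


if __name__ == "__main__":
    print(impute_median([3.0, None, 1.0, 7.0, None]))
```

Every missing entry receives the same value, which keeps the column's centre but shrinks its variance; that trade-off is one reason surveys such as this one also cover model-based alternatives.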
4

Das, Dipalika, Maya Nayak, and Subhendu Kumar Pani. "Missing Value Imputation-A Review." International Journal of Computer Sciences and Engineering 7, no. 4 (April 30, 2019): 548–58. http://dx.doi.org/10.26438/ijcse/v7i4.548558.

5

Seu, Kimseth, Mi-Sun Kang, and HwaMin Lee. "An Intelligent Missing Data Imputation Techniques: A Review." JOIV : International Journal on Informatics Visualization 6, no. 1-2 (May 31, 2022): 278. http://dx.doi.org/10.30630/joiv.6.1-2.935.

Abstract:
An incomplete dataset is an unavoidable problem in data preprocessing, since most machine learning algorithms cannot use it to train a model. Various data imputation approaches have been proposed to resolve this problem. These approaches predict the most appropriate value using different machine learning algorithms built on various concepts. Furthermore, accurate estimation by the imputation method is critical for completing the missing values in some datasets, especially medical data. The purpose of this paper is to compare established state-of-the-art benchmarks: the K-nearest Neighbors Imputation (KNNImputer) method, the Bayesian Principal Component Analysis (BPCA) imputation method, the Multiple Imputation by Chained Equations (MICE) method, and the Multiple Imputation with denoising autoencoder neural network (MIDAS) method. These methods estimate and evaluate appropriate data points for imputing the missing values. We run the experiment with all these imputation techniques on the same four datasets, which were collected from the hospital. Both Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are used to measure the results and compare the methods with each other, in order to identify a robust and appropriate method for handling missing data. In the experiment, KNNImputer and MICE performed better than BPCA and MIDAS, and BPCA performed better than MIDAS.
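The two error measures named in the abstract above can be sketched as follows. This assumes the usual evaluation setup, in which known values are hidden, imputed, and then compared with the truth; the function names and the sample numbers are illustrative and do not come from the paper.

```python
import math
from typing import Sequence


def mae(true_values: Sequence[float], imputed_values: Sequence[float]) -> float:
    """Mean absolute error over the artificially hidden entries."""
    if len(true_values) != len(imputed_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(true_values, imputed_values)) / len(true_values)


def rmse(true_values: Sequence[float], imputed_values: Sequence[float]) -> float:
    """Root mean square error over the artificially hidden entries."""
    if len(true_values) != len(imputed_values) or not true_values:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(true_values, imputed_values))
    return math.sqrt(squared / len(true_values))


if __name__ == "__main__":
    hidden_truth = [2.0, 4.0, 6.0]   # values that were masked on purpose
    imputations = [2.5, 3.0, 6.5]    # what a hypothetical imputer returned
    print(mae(hidden_truth, imputations), rmse(hidden_truth, imputations))
```

RMSE penalises large individual errors more heavily than MAE, so reporting both, as the paper does, shows whether a method's errors are evenly spread or dominated by a few bad imputations.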
6

Huang, Min-Wei, Wei-Chao Lin, and Chih-Fong Tsai. "Outlier Removal in Model-Based Missing Value Imputation for Medical Datasets." Journal of Healthcare Engineering 2018 (2018): 1–9. http://dx.doi.org/10.1155/2018/1817479.

Abstract:
Many real-world medical datasets contain some proportion of missing (attribute) values. In general, missing value imputation can be performed to solve this problem, which is to provide estimations for the missing values by a reasoning process based on the (complete) observed data. However, if the observed data contain some noisy information or outliers, the estimations of the missing values may not be reliable or may even be quite different from the real values. The aim of this paper is to examine whether a combination of instance selection from the observed data and missing value imputation offers better performance than performing missing value imputation alone. In particular, three instance selection algorithms, DROP3, GA, and IB3, and three imputation algorithms, KNNI, MLP, and SVM, are used in order to find out the best combination. The experimental results show that performing instance selection can have a positive impact on missing value imputation over the numerical data type of medical datasets, and specific combinations of instance selection and imputation methods can improve the imputation results over the mixed data type of medical datasets. However, instance selection does not have a definitely positive impact on the imputation result for categorical medical datasets.
7

Kumar, Nishith, Md Aminul Hoque, Md Shahjaman, S. M. Shahinul Islam, and Md Nurul Haque Mollah. "A New Approach of Outlier-robust Missing Value Imputation for Metabolomics Data Analysis." Current Bioinformatics 14, no. 1 (December 6, 2018): 43–52. http://dx.doi.org/10.2174/1574893612666171121154655.

Abstract:
Background: Metabolomics data generation and quantification are different from other types of molecular “omics” data in bioinformatics. Mass spectrometry (MS) based (gas chromatography mass spectrometry (GC-MS), liquid chromatography mass spectrometry (LC-MS), etc.) metabolomics data frequently contain missing values that make some quantitative analysis complex. Typically metabolomics datasets contain 10% to 20% missing values that originate from several reasons, like analytical, computational as well as biological hazard. Imputation of missing values is a very important and interesting issue for further metabolomics data analysis. Objective: This paper introduces a new algorithm for missing value imputation in the presence of outliers for metabolomics data analysis. Method: Currently, the most well known missing value imputation techniques in metabolomics data are k-nearest neighbours (kNN), random forest (RF) and zero imputation. However, these techniques are sensitive to outliers. In this paper, we have proposed an outlier robust missing imputation technique by minimizing a two-way empirical mean absolute error (MAE) loss function for imputing missing values in metabolomics data. Results: We have investigated the performance of the proposed missing value imputation technique in comparison with the other traditional imputation techniques using both simulated and real data analysis in the absence and presence of outliers. Conclusion: Results of both simulated and real data analyses show that the proposed outlier robust missing imputation technique performs better than the traditional missing imputation methods in both the absence and presence of outliers.
8

Zimmermann, Pavel, Petr Mazouch, and Klára Hulíková Tesárková. "Missing Categorical Data Imputation and Individual Observation Level Imputation." Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis 62, no. 6 (2014): 1527–34. http://dx.doi.org/10.11118/actaun201462061527.

Abstract:
Traditional imputation techniques for missing data focus on predicting the missing value based on other observed values. In the case of continuous missing data, the imputation of missing values often relies on regression models. In the case of categorical data, the usual techniques are classification techniques which set the missing value to the ‘most likely’ category. This, however, leads to overrepresentation of the categories which are in general observed more often and hence can lead to biased results in many tasks, especially in the presence of dominant categories. We present an original methodology for imputation of missing values which results in the most likely structure (distribution) of the missing data conditional on the observed values. The methodology is based on the assumption that the categorical variable containing the missing values has a multinomial distribution. Values of the parameters of this distribution are then estimated using multinomial logistic regression. An illustrative example of missing values and the reconstruction of the highest education level of persons in some population is described.
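The contrast drawn in the abstract above, between assigning every missing value to the most likely category and preserving the category distribution, can be sketched as below. This is not the authors' procedure: the category probabilities are simply assumed to be given (in the paper they would come from a multinomial logistic regression), and the largest-remainder rounding and all names are choices made for this illustration.

```python
from typing import Dict


def mode_imputation(probs: Dict[str, float], n_missing: int) -> Dict[str, int]:
    """Assign every missing value to the single most likely category."""
    top = max(probs, key=probs.get)
    return {c: (n_missing if c == top else 0) for c in probs}


def proportional_allocation(probs: Dict[str, float], n_missing: int) -> Dict[str, int]:
    """Split n_missing across categories in proportion to probs.

    Uses largest-remainder rounding so the counts sum to n_missing.
    """
    raw = {c: p * n_missing for c, p in probs.items()}
    counts = {c: int(v) for c, v in raw.items()}  # floor, since v >= 0
    leftover = n_missing - sum(counts.values())
    by_remainder = sorted(raw, key=lambda c: raw[c] - counts[c], reverse=True)
    for c in by_remainder[:leftover]:
        counts[c] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical education-level probabilities for records with a missing value.
    probs = {"primary": 0.25, "secondary": 0.5, "tertiary": 0.25}
    print(mode_imputation(probs, 8))
    print(proportional_allocation(probs, 8))
```

In this toy case mode imputation places all eight missing records in "secondary", while the proportional split keeps the 2 : 4 : 2 structure, which is the overrepresentation effect the abstract warns about.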
9

Mohamed, Marghny H., Abdel-Rahiem A. Hashem, and Mohammed M. Abdelsamea. "Scalable Algorithms for Missing Value Imputation." International Journal of Computer Applications 87, no. 11 (February 14, 2014): 35–42. http://dx.doi.org/10.5120/15255-4019.

10

Gashler, Michael S, Michael R Smith, Richard Morris, and Tony Martinez. "Missing Value Imputation with Unsupervised Backpropagation." Computational Intelligence 32, no. 2 (July 1, 2014): 196–215. http://dx.doi.org/10.1111/coin.12048.


Dissertations on the topic "Missing Value Imputation":

1

Aslan, Sipan. "Comparison Of Missing Value Imputation Methods For Meteorological Time Series Data." Master's thesis, METU, 2010. http://etd.lib.metu.edu.tr/upload/12612426/index.pdf.

Abstract:
Dealing with missing data in spatio-temporal time series constitutes an important branch of the general missing data problem. Since the statistical properties of time-dependent data are characterized by the sequentiality of observations, any interruption of consecutiveness in a time series will cause severe problems. In order to make reliable analyses in this case, missing data must be handled cautiously without disturbing the statistical properties of the series, mainly its temporal and spatial dependencies. In this study we aimed to compare several imputation methods for the appropriate completion of missing values of spatio-temporal meteorological time series. For this purpose, several imputation methods are assessed on their imputation performance for artificially created missing data in monthly total precipitation and monthly mean temperature series obtained from the climate stations of the Turkish State Meteorological Service. Artificially created missing data are estimated by using six methods. Single Arithmetic Average (SAA), Normal Ratio (NR) and NR Weighted with Correlations (NRWC) are the three simple methods used in the study. In addition, we used two computationally intensive methods for missing data imputation: Multi Layer Perceptron type Neural Network (MLPNN) and Monte Carlo Markov Chain based on Expectation-Maximization Algorithm (EM-MCMC). We also propose a modification of the EM-MCMC method in which the results of the simple imputation methods are used as auxiliary variables. Besides an accuracy measure based on squared errors, we propose the Correlation Dimension (CD) technique, which is also an important subject of Nonlinear Dynamic Time Series Analysis, for appropriate evaluation of imputation performance.
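Two of the simple methods named in the abstract above can be sketched in a few lines. The abstract gives no formulas, so the code below assumes the textbook forms used in hydrology: SAA averages the neighbouring stations' values, and NR rescales each neighbour's value by the ratio of long-term station means ("normals") before averaging. The thesis's exact variants may differ, and all names and numbers are illustrative.

```python
from typing import Sequence


def single_arithmetic_average(neighbor_values: Sequence[float]) -> float:
    """SAA: plain mean of the values observed at neighbouring stations."""
    if not neighbor_values:
        raise ValueError("at least one neighbouring station is required")
    return sum(neighbor_values) / len(neighbor_values)


def normal_ratio(
    target_normal: float,
    neighbor_normals: Sequence[float],
    neighbor_values: Sequence[float],
) -> float:
    """NR: mean of neighbour values, each scaled by target_normal / neighbour normal."""
    if len(neighbor_normals) != len(neighbor_values) or not neighbor_values:
        raise ValueError("inputs must be non-empty and of equal length")
    scaled = [target_normal / n_i * p_i
              for n_i, p_i in zip(neighbor_normals, neighbor_values)]
    return sum(scaled) / len(scaled)


if __name__ == "__main__":
    # Hypothetical monthly precipitation (mm) at two neighbouring stations.
    values = [40.0, 60.0]
    normals = [80.0, 120.0]   # long-term means of the neighbours
    print(single_arithmetic_average(values))
    print(normal_ratio(100.0, normals, values))
```

NR differs from SAA only when the stations have different long-term means, which is why it is usually preferred when station climates are not alike.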
2

Andersson, Joacim, and Henrik Falk. "Missing Data in Value-at-Risk Analysis : Conditional Imputation in Optimal Portfolios Using Regression." Thesis, KTH, Matematisk statistik, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-122276.

Abstract:
A regression-based method is presented in order to regenerate missing data points in stock return time series. The method uses only complete time series of assets in optimal portfolios, in which the returns of the underlying tend to correlate inadequately with each other. The study shows that the method is able to replicate empirical VaR-backtesting results where all data are available, even when up to 90% of the time series in half of the assets in the portfolios have been removed.
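The general idea of regression-based regeneration mentioned in the abstract above can be sketched with a single predictor. This is only an ordinary least squares fit of an incomplete return series on one complete series, not the conditional imputation procedure of the thesis; the function name, the None marker, and the sample returns are assumptions for illustration.

```python
from typing import List, Optional, Sequence


def regression_impute(
    complete: Sequence[float], incomplete: Sequence[Optional[float]]
) -> List[float]:
    """Fill None entries of `incomplete` from an OLS fit on `complete`."""
    pairs = [(x, y) for x, y in zip(complete, incomplete) if y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two jointly observed points")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Observed values are kept; only missing ones are replaced by predictions.
    return [intercept + slope * x if y is None else y
            for x, y in zip(complete, incomplete)]


if __name__ == "__main__":
    asset_a = [0.01, -0.02, 0.03, 0.00]   # complete return series
    asset_b = [0.02, -0.04, None, 0.00]   # one return missing
    print(regression_impute(asset_a, asset_b))
```

Because the prediction carries no residual noise, a deterministic fill like this understates volatility, which matters for risk measures such as VaR.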
3

Bischof, Stefan, Andreas Harth, Benedikt Kämpgen, Axel Polleres, and Patrik Schneider. "Enriching integrated statistical open city data by combining equational knowledge and missing value imputation." Elsevier, 2017. http://dx.doi.org/10.1016/j.websem.2017.09.003.

Abstract:
Several institutions collect statistical data about cities, regions, and countries for various purposes. Yet, while access to high quality and recent such data is both crucial for decision makers and a means for achieving transparency to the public, all too often such collections of data remain isolated and not re-useable, let alone comparable or properly integrated. In this paper we present the Open City Data Pipeline, a focused attempt to collect, integrate, and enrich statistical data collected at city level worldwide, and re-publish the resulting dataset in a re-useable manner as Linked Data. The main features of the Open City Data Pipeline are: (i) we integrate and cleanse data from several sources in a modular and extensible, always up-to-date fashion; (ii) we use both Machine Learning techniques and reasoning over equational background knowledge to enrich the data by imputing missing values, (iii) we assess the estimated accuracy of such imputations per indicator. Additionally, (iv) we make the integrated and enriched data, including links to external data sources, such as DBpedia, available both in a web browser interface and as machine-readable Linked Data, using standard vocabularies such as QB and PROV. Apart from providing a contribution to the growing collection of data available as Linked Data, our enrichment process for missing values also contributes a novel methodology for combining rule-based inference about equational knowledge with inferences obtained from statistical Machine Learning approaches. While most existing works about inference in Linked Data have focused on ontological reasoning in RDFS and OWL, we believe that these complementary methods and particularly their combination could be fruitfully applied also in many other domains for integrating Statistical Linked Data, independent from our concrete use case of integrating city data.
4

Jagirdar, Suresh. "Investigation into Regression Analysis of Multivariate Additional Value and Missing Value Data Models Using Artificial Neural Networks and Imputation Techniques." Ohio University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1219343139.

5

Bala, Abdalla. "Impact analysis of a multiple imputation technique for handling missing value in the ISBSG repository of software projects." Mémoire, École de technologie supérieure, 2013. http://espace.etsmtl.ca/1236/1/BALA_Abdalla.pdf.

Abstract:
Until the early 2000s, most empirical studies building estimation models for software projects were carried out with very small samples (fewer than 20 projects), while only a few studies used larger samples (between 60 and 90 projects). With the establishment of a repository of software projects by the International Software Benchmarking Standards Group (ISBSG), a larger set of data is now available for building estimation models: the 2013 release 12 of the ISBSG repository contains more than 6,000 projects, which provides a more adequate basis for statistical studies. However, in the ISBSG repository a large number of values are missing for a significant number of variables, which makes it rather difficult to use in research projects. To improve the development of estimation models, the goal of this research project is to tackle the new problems of working with larger software engineering datasets by using the multiple imputation technique to account for missing data and outliers in the analyses.
6

Etourneau, Lucas. "Contrôle du FDR et imputation de valeurs manquantes pour l'analyse de données de protéomiques par spectrométrie de masse." Electronic Thesis or Diss., Université Grenoble Alpes, 2024. http://www.theses.fr/2024GRALS001.

Abstract:
Proteomics involves characterizing the proteome of a biological sample, that is, the set of proteins it contains, and doing so as exhaustively as possible. By identifying and quantifying protein fragments that are analyzable by mass spectrometry (known as peptides), proteomics provides access to the level of gene expression at a given moment. This is crucial information for improving the understanding of molecular mechanisms at play within living organisms. These experiments produce large amounts of data, often complex to interpret and subject to various biases. They require reliable data processing methods that ensure a certain level of quality control, so as to guarantee the relevance of the resulting biological conclusions. The work of this thesis focuses on improving this data processing, and specifically on the following two major points. The first is controlling the false discovery rate (FDR) when identifying either (1) peptides or (2) quantitatively differential biomarkers between a tested biological condition and its negative control. Our contributions focus on establishing links between the empirical methods that stem from proteomic practice and other theoretically supported methods. This notably allows us to provide directions for the improvement of FDR control methods used for peptide identification. The second point focuses on managing missing values, which are often numerous and complex in nature, making them impossible to ignore. Specifically, we have developed a new algorithm for imputing them that leverages the specificities of proteomics data. Our algorithm has been tested and compared to other methods on multiple datasets and according to various metrics, and it generally achieves the best performance. Moreover, it is the first algorithm that allows imputation following the trending paradigm of "multi-omics": if it is relevant to the experiment, it can impute more reliably by relying on transcriptomic information, which quantifies the level of messenger RNA expression present in the sample. Finally, the algorithm, named Pirat, is implemented in a freely available software package, making it easy to use for the proteomic community.
7

Gheyas, Iffat A. "Novel computationally intelligent machine learning algorithms for data mining and knowledge discovery." Thesis, University of Stirling, 2009. http://hdl.handle.net/1893/2152.

Abstract:
This thesis addresses three major issues in data mining regarding feature subset selection in large dimensionality domains, plausible reconstruction of incomplete data in cross-sectional applications, and forecasting univariate time series. For the automated selection of an optimal subset of features in real time, we present an improved hybrid algorithm: SAGA. SAGA combines the ability to avoid being trapped in local minima of Simulated Annealing with the very high convergence rate of the crossover operator of Genetic Algorithms, the strong local search ability of greedy algorithms and the high computational efficiency of generalized regression neural networks (GRNN). For imputing missing values and forecasting univariate time series, we propose a homogeneous neural network ensemble. The proposed ensemble consists of a committee of Generalized Regression Neural Networks (GRNNs) trained on different subsets of features generated by SAGA and the predictions of base classifiers are combined by a fusion rule. This approach makes it possible to discover all important interrelations between the values of the target variable and the input features. The proposed ensemble scheme has two innovative features which make it stand out amongst ensemble learning algorithms: (1) the ensemble makeup is optimized automatically by SAGA; and (2) GRNN is used for both base classifiers and the top level combiner classifier. Because of GRNN, the proposed ensemble is a dynamic weighting scheme. This is in contrast to the existing ensemble approaches which belong to the simple voting and static weighting strategy. The basic idea of the dynamic weighting procedure is to give a higher reliability weight to those scenarios that are similar to the new ones. The simulation results demonstrate the validity of the proposed ensemble model.
8

Alarcon, Sergio Arciniegas. "Imputação de dados em experimentos multiambientais: novos algoritmos utilizando a decomposição por valores singulares." Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-10052016-130506/.

Abstract:
Biplot analyses using the additive main effects and multiplicative interaction (AMMI) models require complete data matrices, but multi-environment trials often have missing values. This thesis proposes new methods of single and multiple imputation that can be used to analyze unbalanced data in experiments with genotype by environment interaction (G×E). The first is a new extension of the eigenvector cross-validation method (Bro et al., 2008). The second is a new non-parametric algorithm obtained through modifications of the single imputation method developed by Yan (2013). Also included is a study that considers imputation systems recently reported in the literature and compares them with the classic procedure recommended for imputation in G×E trials, namely the combination of the Expectation-Maximization (EM) algorithm with the AMMI models, or EM-AMMI. Finally, generalizations are provided of the single imputation described by Arciniegas-Alarcón et al. (2010), which combines regression with a lower-rank approximation of a matrix. All the methodologies are based on the singular value decomposition (SVD) and are therefore free of distributional or structural assumptions. To determine the performance of the new imputation schemes, simulations were performed based on real datasets from different species, with values deleted randomly at different percentages, and the quality of the imputations was evaluated using different statistics. It was concluded that the SVD provides a useful and flexible tool for the construction of efficient techniques that circumvent the problem of missing data in experimental matrices.
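The low-rank idea behind the methods in the abstract above can be illustrated with a deliberately simplified stand-in. The sketch below is not one of the thesis's algorithms: it fits a rank-1 approximation by alternating least squares over the observed cells only (for a complete matrix this generally converges to the leading SVD component) and fills the missing cells from that fit. The function name, the None marker, and the toy matrix are assumptions.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def rank_one_impute(x: Matrix, n_iter: int = 100) -> List[List[float]]:
    """Fill None cells with u[i] * v[j] from a rank-1 fit to the observed cells."""
    n_rows, n_cols = len(x), len(x[0])
    u = [1.0] * n_rows   # row scores (e.g. genotypes)
    v = [1.0] * n_cols   # column scores (e.g. environments)
    for _ in range(n_iter):
        # Least squares update of each row score, using observed cells only.
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if x[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(x[i][j] * v[j] for j in obs) / den
        # Least squares update of each column score, using observed cells only.
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if x[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(x[i][j] * u[i] for i in obs) / den
    return [[u[i] * v[j] if x[i][j] is None else x[i][j]
             for j in range(n_cols)] for i in range(n_rows)]


if __name__ == "__main__":
    # Toy genotype-by-environment table that is exactly rank 1; true value is 4.
    table = [[1.0, 2.0, 4.0],
             [2.0, None, 8.0],
             [3.0, 6.0, 12.0]]
    print(rank_one_impute(table)[1][1])
```

Real G×E tables are not exactly rank 1, so practical methods choose the rank by cross-validation or combine the low-rank fit with additive main effects, as the thesis discusses.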
9

Bengtsson, Fanny, and Klara Lindblad. "Methods for handling missing values : A simulation study comparing imputation methods for missing values on a Poisson distributed explanatory variable." Thesis, Uppsala universitet, Statistiska institutionen, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-432467.

10

Huo, Zhao. "A Comparsion of Multiple Imputation Methods for Missing Covariate Values in Recurrent Event Data." Thesis, Uppsala universitet, Statistiska institutionen, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-256602.

Abstract:
Multiple imputation (MI) is a commonly used approach to impute missing data. This thesis studies missing covariates in recurrent event data, and discusses ways to include the survival outcomes in the imputation model. Some MI methods under consideration are the event indicator D combined with, respectively, the right-censored event times T, the logarithm of T and the cumulative baseline hazard H0(T). After imputation, we can then proceed to the complete data analysis. The Cox proportional hazards (PH) model and the PWP model are chosen as the analysis models, and the coefficient estimates are of substantive interest. A Monte Carlo simulation study is conducted to compare different MI methods; the relative bias and mean square error are used in the evaluation. Furthermore, an empirical study based on cardiovascular disease event data which contains missing values will be conducted. Overall, the results show that MI based on the Nelson-Aalen estimate of H0(T) is preferred in most circumstances.
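The Nelson-Aalen estimate mentioned in the abstract above is the running sum of (events at time t) / (number at risk at time t). The sketch below computes it for ordinary right-censored data with one event per subject; the thesis works with recurrent events, so this is a simplified setting, and the function name and sample data are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def nelson_aalen(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (t, H(t)) at each distinct event time.

    events[i] is 1 if subject i had the event at times[i], 0 if censored.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have equal length")
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    cumulative = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing this time; those censored at t still count as at risk at t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            cumulative += n_events / n_at_risk
            curve.append((t, cumulative))
        n_at_risk -= n_leaving
    return curve


if __name__ == "__main__":
    print(nelson_aalen([1, 2, 2, 3, 4], [1, 1, 0, 1, 0]))
```

In an imputation model of the kind the abstract describes, the value of this estimate at each subject's own time would be used as a covariate alongside the event indicator D.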

Books on the topic "Missing Value Imputation":

1

Templ, Matthias. Visualization and Imputation of Missing Values. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-30073-8.

2

Subramanian, Rajesh. Transitioning to multiple imputation: A new method to impute missing blood alcohol concentration (BAC) values in FARS. Washington, D.C.: National Highway Traffic Safety Administration, National Center for Statistics and Analysis, 2002.

3

Missing Value Imputation. India: Starttech Educational Services LLP, 2020. http://dx.doi.org/10.4135/9781529630756.

Book chapters on the topic "Missing Value Imputation":

1

Raja, P. S., and K. Thangavel. "Soft Clustering Based Missing Value Imputation." In Digital Connectivity – Social Impact, 119–33. Singapore: Springer Singapore, 2016. http://dx.doi.org/10.1007/978-981-10-3274-5_10.

2

Manna, Sweta, and Soumen Kumar Pati. "Missing Value Imputation Using Correlation Coefficient." In Computational Intelligence in Pattern Recognition, 551–58. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-2449-3_47.

3

Sujatha, M., G. Lavanya Devi, K. Srinivasa Rao, and N. Ramesh. "Rough Set Theory Based Missing Value Imputation." In Cognitive Science and Health Bioinformatics, 97–106. Singapore: Springer Singapore, 2017. http://dx.doi.org/10.1007/978-981-10-6653-5_9.

4

Shi, Yi, Zhipeng Cai, and Guohui Lin. "Classification Accuracy Based Microarray Missing Value Imputation." In Bioinformatics Algorithms, 303–27. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2007. http://dx.doi.org/10.1002/9780470253441.ch14.

5

Rashid, Wajeeha, and Manoj Kumar Gupta. "A Perspective of Missing Value Imputation Approaches." In Advances in Intelligent Systems and Computing, 307–15. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-1275-9_25.

6

Rashid, Wajeeha, Sakshi Arora, and Manoj Kumar Gupta. "Missing Value Imputation Approach Using Cosine Similarity Measure." In Advances in Intelligent Systems and Computing, 557–65. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-5113-0_44.

7

Wu, Jiahua, Xiangyan Tang, Guangxing Liu, and Bofan Wu. "An Overview of Graph Data Missing Value Imputation." In Communications in Computer and Information Science, 256–70. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-97-1280-9_20.

8

Gond, Vikesh Kumar, Aditya Dubey, Akhtar Rasool, and Nilay Khare. "Missing Value Imputation Using Weighted KNN and Genetic Algorithm." In ICT Analysis and Applications, 161–69. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-5224-1_18.

9

Cheng, Yu, Lan Wang, and Jinglu Hu. "A Quasi-linear Approach for Microarray Missing Value Imputation." In Neural Information Processing, 233–40. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-24955-6_28.

10

Singh, Ninni, Anum Javeed, Sheenu Chhabra, and Pardeep Kumar. "Missing Value Imputation with Unsupervised Kohonen Self Organizing Map." In Emerging Research in Computing, Information, Communication and Applications, 61–76. New Delhi: Springer India, 2015. http://dx.doi.org/10.1007/978-81-322-2550-8_7.

Conference papers on the topic "Missing Value Imputation":

1

Luo, Fei, Hangwei Qian, Di Wang, Xu Guo, Yan Sun, Eng Sing Lee, Hui Hwang Teong, Ray Tian Rui Lai, and Chunyan Miao. "Missing Value Imputation for Diabetes Prediction." In 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 2022. http://dx.doi.org/10.1109/ijcnn55064.2022.9892398.

2

He, Chong, Hui-Hui Li, Changbo Zhao, Guo-Zheng Li, and Wei Zhang. "Triple imputation for microarray missing value estimation." In 2015 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2015. http://dx.doi.org/10.1109/bibm.2015.7359682.

3

Karanikola, Aikaterini, and Sotiris Kotsiantis. "A hybrid method for missing value imputation." In PCI '19: 23rd Pan-Hellenic Conference on Informatics. New York, NY, USA: ACM, 2019. http://dx.doi.org/10.1145/3368640.3368653.

4

Aidos, Helena, and Pedro Tomas. "Neighborhood-aware autoencoder for missing value imputation." In 2020 28th European Signal Processing Conference (EUSIPCO). IEEE, 2021. http://dx.doi.org/10.23919/eusipco47968.2020.9287580.

5

Rachmawan, Irene Erlyn Wina, and Ali Ridho Barakbah. "Optimization of missing value imputation using Reinforcement Programming." In 2015 International Electronics Symposium (IES). IEEE, 2015. http://dx.doi.org/10.1109/elecsym.2015.7380828.

6

Lee, Namgil. "Block Tensor Train Decomposition for Missing Value Imputation." In 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC). IEEE, 2018. http://dx.doi.org/10.23919/apsipa.2018.8659560.

7

Li, Hui-Hui, Feng-Feng Shao, and Guo-Zheng Li. "Semi-supervised imputation for microarray missing value estimation." In 2014 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2014. http://dx.doi.org/10.1109/bibm.2014.6999172.

8

Zhang, Chengqi, Yongsong Qin, Xiaofeng Zhu, Jilian Zhang, and Shichao Zhang. "Clustering-based Missing Value Imputation for Data Preprocessing." In 2006 IEEE International Conference on Industrial Informatics. IEEE, 2006. http://dx.doi.org/10.1109/indin.2006.275767.

9

Xu, Zhen, and Sargur N. Srihari. "Missing value imputation: with application to handwriting data." In IS&T/SPIE Electronic Imaging, edited by Eric K. Ringger and Bart Lamiroy. SPIE, 2015. http://dx.doi.org/10.1117/12.2075842.

10

Bou, Savong, Toshiyuki Amagasa, Hiroyuki Kitagawa, Salman Ahmed Shaikh, and Akiyoshi Matono. "Efficient Missing Value Imputation by Maximum Distance Likelihood." In 2023 IEEE International Conference on Big Data (BigData). IEEE, 2023. http://dx.doi.org/10.1109/bigdata59044.2023.10386584.

Reports of organizations on the topic "Missing Value Imputation":

1

Sukasih, Amang S., and Victoria Scott. Cyclical Tree-Based Hot Deck Imputation. RTI Press, June 2023. http://dx.doi.org/10.3768/rtipress.2023.mr.0052.2307.

Abstract:
Hot deck imputation is a method for filling in a missing value in a survey item (item nonrespondent) with a valid reported value from a donor (item respondent) within the survey. Our paper presents a multivariate hot deck imputation method called Cyclical Tree-Based Hot Deck (CTBHD). This method was developed to handle missing values in complex survey data with many different types of variables and allows the user to customize imputation classes, use sorting variables, impute vectors and compositional variables, and even edit or recode data “on-the-fly.” Additionally, CTBHD employs a cycling approach to get more stable imputed values with less bias and variance. Our paper evaluates the performance of CTBHD imputation through a simulation study using publicly available survey data from the 2020 Residential Energy Consumption Survey. Developed as a system for imputation, the CTBHD system is proprietary to RTI International.
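The basic donor idea behind hot deck imputation can be sketched as follows. This is a plain random hot deck within imputation classes, not the proprietary CTBHD system, which the abstract describes as multivariate, tree-based, and cyclical. The function name, the field names, and the toy records are hypothetical.

```python
import random


def hot_deck_impute(records, item, class_key, seed=0):
    """Fill missing `item` values (None) with the reported value of a random donor
    drawn from the same imputation class. Returns new records; the input is not modified."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    donors = {}
    for rec in records:
        if rec[item] is not None:
            donors.setdefault(rec[class_key], []).append(rec[item])
    imputed = []
    for rec in records:
        new_rec = dict(rec)
        pool = donors.get(new_rec[class_key])
        if new_rec[item] is None and pool:
            new_rec[item] = rng.choice(pool)
        # A nonrespondent whose class has no donors stays missing.
        imputed.append(new_rec)
    return imputed


# Toy survey: energy use by region, with item nonresponse marked as None.
survey = [
    {"region": "N", "kwh": 100},
    {"region": "N", "kwh": None},
    {"region": "S", "kwh": 300},
    {"region": "S", "kwh": 320},
    {"region": "S", "kwh": None},
    {"region": "W", "kwh": None},
]
completed = hot_deck_impute(survey, item="kwh", class_key="region")
```

Because every imputed value is an actually reported value from the same class, the imputed data stay within the range of plausible responses, which is the usual argument for hot deck over model-based fills.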
2

Kott, Phillip S. The Role of Weights in Regression Modeling and Imputation. RTI Press, April 2022. http://dx.doi.org/10.3768/rtipress.2022.mr.0047.2203.

Abstract:
When fitting observations from a complex survey, the standard regression model assumes that the expected value of the difference between the dependent variable and its model-based prediction is zero, regardless of the values of the explanatory variables. A rarely failing extended regression model assumes only that the model error is uncorrelated with the model’s explanatory variables. When the standard model holds, it is possible to create alternative analysis weights that retain the consistency of the model-parameter estimates while increasing their efficiency by scaling the inverse-probability weights by an appropriately chosen function of the explanatory variables. When a regression model is used to impute for missing item values in a complex survey and when item missingness is a function of the explanatory variables of the regression model and not the item value itself, near unbiasedness of an estimated item mean requires that either the standard regression model for the item in the population holds or the analysis weights incorporate a correctly specified and consistently estimated probability of item response. By estimating the parameters of the probability of item response with a calibration equation, one can sometimes account for item missingness that is (partially) a function of the item value itself.
3

Huang, Lei, Meng Song, Hui Shen, Huixiao Hong, Ping Gong, Hong-Wen Deng, and Chaoyang Zhang. Deep learning methods for omics data imputation. Engineer Research and Development Center (U.S.), February 2024. http://dx.doi.org/10.21079/11681/48221.

Abstract:
One common problem in omics data analysis is missing values, which can arise for various reasons, such as poor tissue quality and insufficient sample volumes. Instead of discarding missing values and related data, imputation approaches offer an alternative means of handling missing data. However, the imputation of missing omics data is a non-trivial task. Difficulties mainly come from high dimensionality, non-linear or non-monotonic relationships within features, technical variations introduced by sampling methods, sample heterogeneity, and the non-random missingness mechanism. Several advanced imputation methods, including deep learning-based methods, have been proposed to address these challenges. Because deep learning can model complex patterns and relationships in large, high-dimensional datasets, many researchers have adopted deep learning models to impute missing omics data. This review provides a comprehensive overview of the currently available deep learning-based methods for omics imputation from the perspective of deep generative model architectures such as autoencoders, variational autoencoders, generative adversarial networks, and Transformers, with an emphasis on multi-omics data imputation. In addition, this review discusses the opportunities that deep learning brings and the challenges that it might face in this field.
4

Cao, Honggao. IMPUTE: A SAS Application System for Missing Value Imputations--With Special Reference to HRS Income/Assets. Institute for Social Research, University of Michigan, 2001. http://dx.doi.org/10.7826/isr-um.06.585031.001.05.0006.2001.
