Selected scholarly literature on the topic "Data missingness"

Browse the lists of recent articles, books, dissertations, reports, and other scholarly sources on the topic "Data missingness".

Journal articles on the topic "Data missingness"

1

Ghazali, Shamihah Muhammad, Norshahida Shaadan, and Zainura Idrus. „Missing data exploration in air quality data set using R-package data visualisation tools“. Bulletin of Electrical Engineering and Informatics 9, no. 2 (April 1, 2020): 755–63. http://dx.doi.org/10.11591/eei.v9i2.2088.

Annotation:
Missing values often occur in data sets across many research areas. This is recognized as a data quality problem because missing values can affect the performance of analysis results. To overcome the problem, the incomplete data set needs to be treated or the missing values replaced using an imputation method. Thus, the pattern of missing values must be explored beforehand to determine a suitable method. This paper discusses the application of data visualisation as a smart technique for missing data exploration, aiming to increase understanding of missing data behaviour, which includes the missing data mechanism (MCAR, MAR and MNAR) and the distribution pattern of missingness in terms of percentage as well as gap size. This paper presents the application of several data visualisation tools from five R packages (visdat, VIM, ggplot2, Amelia and UpSetR) for data missingness exploration. As an illustration, based on an air quality data set in Malaysia, several graphics were produced and discussed to illustrate the contribution of the visualisation tools in providing input and insight on the pattern of data missingness. The results show that missing values in the air quality data set of the chosen sites in Malaysia behave as missing at random (MAR), with a small percentage of missingness, and do contain long gaps of missingness.
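The two descriptive quantities named in the annotation above, the percentage of missingness and the gap size, can be illustrated with a minimal Python sketch. This is not the authors' R workflow; the function name `missingness_summary` and the use of `None` to mark a missing reading are assumptions made for illustration only.

```python
def missingness_summary(series):
    """Summarise one series (hypothetical helper, not from the cited paper).

    Missing readings are assumed to be encoded as None.
    Returns the percentage of missing values and the longest run
    of consecutive missing values (the largest "gap size").
    """
    n = len(series)
    missing = 0   # total number of missing readings
    longest = 0   # longest gap seen so far
    current = 0   # length of the gap currently being scanned
    for value in series:
        if value is None:
            missing += 1
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    percent = 100 * missing / n if n else 0.0
    return {"percent_missing": percent, "longest_gap": longest}


# Usage with made-up hourly readings
readings = [1.0, None, None, 3.0, None, 2.0, 5.0, None, None, None]
print(missingness_summary(readings))
```

A summary like this only describes the pattern; it does not by itself distinguish MCAR, MAR and MNAR, which requires relating missingness to other variables.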
2

Zhang, Wen, Ye Yang, and Qing Wang. „A Comparative Study of Absent Features and Unobserved Values in Software Effort Data“. International Journal of Software Engineering and Knowledge Engineering 22, no. 02 (March 2012): 185–202. http://dx.doi.org/10.1142/s0218194012400025.

Annotation:
Software effort data contains a large number of missing values of project attributes. The problem of absent features, which emerged recently in machine learning, is often neglected by software engineering researchers when handling the missingness in software effort data. In essence, absent features (structural missingness) and unobserved values (unstructured missingness) are different cases of missingness, although their appearance in the data set is the same. This paper attempts to clarify the root cause of missingness in software effort data. When regarding missingness as absent features, we develop Max-margin regression to predict the real effort of software projects. When regarding missingness as unobserved values, we use existing imputation techniques to impute missing values. Then, ε-SVR is used to predict the real effort of software projects with the input data sets. Experiments on the ISBSG (International Software Benchmarking Standard Group) and CSBSG (Chinese Software Benchmarking Standard Group) data sets demonstrate that, for effort prediction tasks, treating missingness in software effort data as unobserved values produces better performance than treating it as absent features. This paper is the first to introduce the concept of absent features to deal with missingness in software effort data.
3

De Raadt, Alexandra, Matthijs J. Warrens, Roel J. Bosker, and Henk A. L. Kiers. „Kappa Coefficients for Missing Data“. Educational and Psychological Measurement 79, no. 3 (January 16, 2019): 558–76. http://dx.doi.org/10.1177/0013164418823249.

Annotation:
Cohen’s kappa coefficient is commonly used for assessing agreement between classifications of two raters on a nominal scale. Three variants of Cohen’s kappa that can handle missing data are presented. Data are considered missing if one or both ratings of a unit are missing. We study how well the variants estimate the kappa value for complete data under two missing data mechanisms, namely missingness completely at random and a form of missingness not at random. The kappa coefficient considered in Gwet (Handbook of Inter-rater Reliability, 4th ed.) and the kappa coefficient based on listwise deletion of units with missing ratings were found to have virtually no bias and mean squared error if missingness is completely at random, and small bias and mean squared error if missingness is not at random. Furthermore, the kappa coefficient that treats missing ratings as a regular category appears to be rather heavily biased and has a substantial mean squared error in many of the simulations. Because it performs well and is easy to compute, we recommend using the kappa coefficient that is based on listwise deletion of missing ratings if it can be assumed that missingness is completely at random or not at random.
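The listwise-deletion variant recommended in the annotation above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function name `cohen_kappa_listwise` and the use of `None` as the missing-rating marker are assumptions. Units with one or both ratings missing are dropped, and the standard Cohen's kappa is then computed on the remaining units.

```python
from collections import Counter


def cohen_kappa_listwise(ratings_a, ratings_b):
    """Cohen's kappa after listwise deletion (hypothetical helper).

    A unit is dropped if either rater's rating is None.
    """
    pairs = [(a, b) for a, b in zip(ratings_a, ratings_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no units with two observed ratings")

    # Observed agreement: share of units where both raters agree
    p_observed = sum(a == b for a, b in pairs) / n

    # Expected chance agreement from the two raters' marginal counts
    counts_a = Counter(a for a, _ in pairs)
    counts_b = Counter(b for _, b in pairs)
    p_expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2

    if p_expected == 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Usage with made-up ratings; the last two units are dropped
rater_1 = ["a", "a", "b", "b", None, "a"]
rater_2 = ["a", "b", "b", "b", "a", None]
print(cohen_kappa_listwise(rater_1, rater_2))
```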
4

Arioli, Angelica, Arianna Dagliati, Bethany Geary, Niels Peek, Philip A. Kalra, Anthony D. Whetton, and Nophar Geifman. „OptiMissP: A dashboard to assess missingness in proteomic data-independent acquisition mass spectrometry“. PLOS ONE 16, no. 4 (April 15, 2021): e0249771. http://dx.doi.org/10.1371/journal.pone.0249771.

Annotation:
Background: Missing values are a key issue in the statistical analysis of proteomic data. Defining the strategy to address missing values is a complex task in each study, potentially affecting the quality of statistical analyses. Results: We have developed OptiMissP, a dashboard to visually and qualitatively evaluate missingness and guide decision making in the handling of missing values in proteomics studies that use data-independent acquisition mass spectrometry. It provides a set of visual tools to retrieve information about missingness through protein densities and topology-based approaches, and facilitates exploration of different imputation methods and missingness thresholds. Conclusions: OptiMissP provides support for researchers’ and clinicians’ qualitative assessment of missingness in proteomic datasets in order to define study-specific strategies for the handling of missing values. OptiMissP considers biases in protein distributions related to the choice of imputation method and helps analysts to balance the information loss caused by low missingness thresholds and the noise introduced by selecting high missingness thresholds. This is complemented by topological data analysis, which provides additional insight into the structure of the data and their missingness. We use an example in Chronic Kidney Disease to illustrate the main functionalities of OptiMissP.
5

Xie, Hui. „Analyzing longitudinal clinical trial data with nonignorable missingness and unknown missingness reasons“. Computational Statistics & Data Analysis 56, no. 5 (May 2012): 1287–300. http://dx.doi.org/10.1016/j.csda.2010.11.021.

6

Babcock, Ben, Peter E. L. Marks, Yvonne H. M. van den Berg, and Antonius H. N. Cillessen. „Implications of systematic nominator missingness for peer nomination data“. International Journal of Behavioral Development 42, no. 1 (August 19, 2016): 148–54. http://dx.doi.org/10.1177/0165025416664431.

Annotation:
Missing data are a persistent problem in psychological research. Peer nomination data present a unique missing data problem, because a nominator’s nonparticipation results in missing data for other individuals in the study. This study examined the range of effects of systematic nonparticipation on the correlations between peer nomination data when nominators with various levels of popularity and social preference are missing. Results showed that, compared to completely random nominator missingness, systematic missingness of raters based on popularity had a significant impact on the correlations between various peer nomination variables. Systematic missingness based on social preference had a smaller impact. These results demonstrate varying (and potentially large) effects of systematically missing nominators on studies using nomination data. It is important that researchers using peer nomination data explore whether nominators are missing in any sort of systematic way and include these results as part of each study. Future research into the nature of systematic nominator missingness could make it possible to use advanced methodologies, such as multiple imputation, in an attempt to minimize the issues associated with systematic missingness.
7

Spineli, Loukia M., Chrysostomos Kalyvas, and Katerina Papadimitropoulou. „Continuous(ly) missing outcome data in network meta-analysis: A one-stage pattern-mixture model approach“. Statistical Methods in Medical Research 30, no. 4 (January 6, 2021): 958–75. http://dx.doi.org/10.1177/0962280220983544.

Annotation:
Appropriate handling of aggregate missing outcome data is necessary to minimise bias in the conclusions of systematic reviews. The two-stage pattern-mixture model has already been proposed to address aggregate missing continuous outcome data. While this approach is more appropriate than excluding missing continuous outcome data or using simple imputation methods, it does not offer flexible modelling of missing continuous outcome data to investigate their implications for the conclusions thoroughly. Therefore, we propose a one-stage pattern-mixture model approach under the Bayesian framework to address missing continuous outcome data in a network of interventions and gain knowledge about the missingness process in different trials and interventions. We extend the hierarchical network meta-analysis model for one aggregate continuous outcome to incorporate a missingness parameter that measures the departure from the missing at random assumption. We consider various effect size estimates for continuous data, and two informative missingness parameters, the informative missingness difference of means and the informative missingness ratio of means. We incorporate our prior belief about the missingness parameters while allowing for several possibilities of prior structures to account for the fact that the missingness process may differ in the network. The method is exemplified in two networks from published reviews comprising different amounts of missing continuous outcome data.
8

McGurk, Kathryn A., Arianna Dagliati, Davide Chiasserini, Dave Lee, Darren Plant, Ivona Baricevic-Jones, Janet Kelsall et al. „The use of missing values in proteomic data-independent acquisition mass spectrometry to enable disease activity discrimination“. Bioinformatics 36, no. 7 (December 2, 2019): 2217–23. http://dx.doi.org/10.1093/bioinformatics/btz898.

Annotation:
Motivation: Data-independent acquisition mass spectrometry allows for more comprehensive peptide detection and relative quantification than standard data-dependent approaches. While less prone to missing values, these still exist. Current approaches for handling the so-called missingness have challenges. We hypothesized that non-random missingness is a useful biological measure and demonstrate the importance of analysing missingness for proteomic discovery within a longitudinal study of disease activity. Results: The magnitude of missingness did not correlate with mean peptide concentration. The magnitude of missingness for each protein strongly correlated between collection time points (baseline, 3 months, 6 months; R = 0.95–0.97, confidence interval = 0.94–0.97), indicating little time-dependent effect. This allowed for the identification of proteins with outlier levels of missingness that differentiate between the patient groups characterized by different patterns of disease activity. The association of these proteins with disease activity was confirmed by machine learning techniques. Our novel approach complements analyses on complete observations and other missing value strategies in biomarker prediction of disease activity. Supplementary information: Supplementary data are available at Bioinformatics online.
9

Elleman, Lorien G., Sarah K. McDougald, David M. Condon, and William Revelle. „That Takes the BISCUIT“. European Journal of Psychological Assessment 36, no. 6 (November 2020): 948–58. http://dx.doi.org/10.1027/1015-5759/a000590.

Annotation:
The predictive accuracy of personality-criterion regression models may be improved with statistical learning (SL) techniques. This study introduced a novel SL technique, BISCUIT (Best Items Scale that is Cross-validated, Unit-weighted, Informative, and Transparent). The predictive accuracy and parsimony of BISCUIT were compared with three established SL techniques (the lasso, elastic net, and random forest) and regression using two sets of scales, for five criteria, across five levels of data missingness. BISCUIT’s predictive accuracy was competitive with other SL techniques at higher levels of data missingness. BISCUIT most frequently produced the most parsimonious SL model. In terms of predictive accuracy, the elastic net and lasso dominated other techniques in the complete data condition and in conditions with up to 50% data missingness. Regression using 27 narrow traits was an intermediate choice for predictive accuracy. For most criteria and levels of data missingness, regression using the Big Five had the worst predictive accuracy. Overall, loss in predictive accuracy due to data missingness was modest, even at 90% data missingness. Findings suggest that personality researchers should consider incorporating planned data missingness and SL techniques into their designs and analyses.
10

Rhemtulla, Mijke, Fan Jia, Wei Wu, and Todd D. Little. „Planned missing designs to optimize the efficiency of latent growth parameter estimates“. International Journal of Behavioral Development 38, no. 5 (January 23, 2014): 423–34. http://dx.doi.org/10.1177/0165025413514324.

Annotation:
We examine the performance of planned missing (PM) designs for correlated latent growth curve models. Using simulated data from a model where latent growth curves are fitted to two constructs over five time points, we apply three kinds of planned missingness. The first is item-level planned missingness using a three-form design at each wave such that 25% of data are missing. The second is wave-level planned missingness such that each participant is missing up to two waves of data. The third combines both forms of missingness. We find that three-form missingness results in high convergence rates, little parameter estimate or standard error bias, and high efficiency relative to the complete data design for almost all parameter types. In contrast, wave missingness and the combined design result in dramatically lowered efficiency for parameters measuring individual variability in rates of change (e.g., latent slope variances and covariances), and bias in both estimates and standard errors for these same parameters. We conclude that wave missingness should not be used except with large effect sizes and very large samples.
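The item-level three-form design mentioned in the annotation above can be illustrated with a short Python sketch. It assumes the usual textbook layout, which the abstract does not spell out: items are split into four equal blocks (a common block X plus A, B and C), and each of three forms omits exactly one of A, B or C, so that 25% of item responses are missing by design. The function name `three_form_mask` and the random assignment of forms are illustrative assumptions, not the authors' simulation code.

```python
import random


def three_form_mask(n_participants, items_per_block, seed=0):
    """Build an observed/missing mask for a three-form design (sketch).

    Returns one row per participant; True means the item is administered,
    False means it is missing by design. Block order is X, A, B, C.
    """
    rng = random.Random(seed)
    blocks = ["X", "A", "B", "C"]
    # Assumed layout: form 1 = X+A+B, form 2 = X+A+C, form 3 = X+B+C
    omitted_by_form = {1: "C", 2: "B", 3: "A"}
    rows = []
    for _ in range(n_participants):
        form = rng.choice([1, 2, 3])  # random form assignment (assumption)
        row = []
        for block in blocks:
            observed = block != omitted_by_form[form]
            row.extend([observed] * items_per_block)
        rows.append(row)
    return rows


# Usage: 30 participants, 5 items per block (20 items in total)
mask = three_form_mask(30, 5)
missing_share = sum(not cell for row in mask for cell in row) / (30 * 20)
print(missing_share)
```

With four equal blocks, every participant skips exactly one block, so the missing share is one quarter regardless of how forms are assigned.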

Dissertations on the topic "Data missingness"

1

Cao, Yu. „Bayesian nonparametric analysis of longitudinal data with non-ignorable non-monotone missingness“. VCU Scholars Compass, 2019. https://scholarscompass.vcu.edu/etd/5750.

Annotation:
In longitudinal studies, outcomes are measured repeatedly over time, but in reality clinical studies are full of missing data points of a monotone and non-monotone nature. Often this missingness is related to the unobserved data, so that it is non-ignorable. In such contexts, the pattern-mixture model (PMM) is a popular tool for analyzing the joint distribution of outcome and missingness patterns. The unobserved outcomes are then imputed using the distribution of observed outcomes, conditioned on missing patterns. However, the existing methods suffer from model identification issues if data are sparse in specific missing patterns, which is very likely to happen with a small sample size or a large number of repetitions. We extend the existing methods using latent class analysis (LCA) and a shared-parameter PMM. The LCA groups patterns of missingness with similar features, and the shared-parameter PMM allows a subset of parameters to differ among latent classes when fitting a model, thus restoring model identifiability. A novel imputation method is also developed using the distribution of observed data conditioned on latent classes. We develop this model for continuous response data and extend it to handle ordinal rating scale data. Our model performs better than existing methods for data with small sample size. The method is applied to two datasets: one from a phase II clinical trial that studies the quality of life of patients with prostate cancer receiving radiation therapy, and another used to study the relationship between perceived neighborhood condition in adolescence and drinking habits in adulthood.
2

Deng, Wei. „Multiple imputation for marginal and mixed models in longitudinal data with informative missingness“. Ohio State University, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1126890027.

Annotation:
Thesis (Ph. D.)--Ohio State University, 2005.
Title from first page of PDF file. Document formatted into pages; contains xiii, 108 p.; also includes graphics. Includes bibliographical references (p. 104-108). Available online via OhioLINK's ETD Center
3

Hafez, Mai. „Analysis of multivariate longitudinal categorical data subject to nonrandom missingness : a latent variable approach“. Thesis, London School of Economics and Political Science (University of London), 2015. http://etheses.lse.ac.uk/3184/.

Annotation:
Longitudinal data are collected for studying changes across time. In social sciences, interest is often in theoretical constructs, such as attitudes, behaviour or abilities, which cannot be directly measured. In that case, multiple related manifest (observed) variables, for example survey questions or items in an ability test, are used as indicators for the constructs, which are themselves treated as latent (unobserved) variables. In this thesis, multivariate longitudinal data are considered where multiple observed variables, measured at each time point, are used as indicators for theoretical constructs (latent variables) of interest. The observed items and the latent variables are linked together via statistical latent variable models. A common problem in longitudinal studies is missing data, where missingness can be classified into one of two forms. Dropout occurs when subjects exit the study prematurely, while intermittent missingness takes place when subjects miss one or more occasions but show up on a subsequent wave of the study. Ignoring the missingness mechanism can lead to biased estimates, especially when the missingness is nonrandom. The approach proposed in this thesis uses latent variable models to capture the evolution of a latent phenomenon over time, while incorporating a missingness mechanism to account for possibly nonrandom forms of missingness. Two model specifications are presented, the first of which incorporates dropout only in the missingness mechanism, while the other accounts for both dropout and intermittent missingness, allowing them to be informative by being modelled as functions of the latent variables and possibly observed covariates. Models developed in this thesis consider ordinal and binary observed items, because such variables are often met in social surveys, while the underlying latent variables are assumed to be continuous.
The proposed models are illustrated by analysing people's perceptions on women's work using three questions from five waves of the British Household Panel Survey.
4

Andersson, Oscar, and Tim Andersson. „AI applications on healthcare data“. Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-44752.

Annotation:
The purpose of this research is to get a better understanding of how different machine learning algorithms work with different amounts of data corruption. This is important since data corruption is a pervasive issue within data collection and thus, by extension, in any work that relies on the collected data. The questions we were looking at were: What feature is the most important? How significant is the correlation of features? What algorithms should be used given the data available? And how much noise (inaccurate or unhelpful captured data) is acceptable? The study is structured to introduce AI in healthcare, data missingness, and the machine learning algorithms we used in the study. In the method section, we give a recommended workflow for handling data with machine learning in mind. The results show us that when a dataset is filled with random values, the run-time of algorithms increases since many patterns are lost. Randomly removing values also caused less of a problem than first anticipated since we ran multiple trials, evening out any problems caused by the lost values. Lastly, imputation is a preferred way of handling missing data since it retained many dataset structures. One has to keep in mind whether the imputation is done on categories or numerical values. However, there is no easy "best-fit" for any dataset. It is hard to give a concrete answer when choosing a machine learning algorithm that fits any dataset. Nevertheless, since it is easy to simply plug-and-play with many algorithms, we would recommend any user try different ones before deciding which one fits a project the best.
5

Bishop, Brenden. „Examining Random-Coefficient Pattern-Mixture Models for Longitudinal Data with Informative Dropout“. The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu150039066582153.

6

Lee, Amra. „Why do some civilian lives matter more than others? Exploring how the quality, timeliness and consistency of data on civilian harm affects the conduct of hostilities for civilians caught in conflict“. Thesis, Uppsala universitet, Institutionen för freds- och konfliktforskning, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-387653.

Annotation:
Normatively, protecting civilians from the conduct of hostilities is grounded in the Geneva Conventions and the UN Security Council protection of civilians agenda, which celebrate their 70th and 20th anniversaries, respectively, in 2019. Previous research focuses heavily on protection of civilians through peacekeeping, whereas this research focuses on ‘non-armed’ approaches to enhancing civilian protection in conflict. Prior research and experience reveal a high level of missingness and variation in the level of available data on civilian harm in conflict. Where civilian harm is considered in the peace and conflict literature, it is predominantly from a securitized lens of understanding insurgent recruitment strategies and more recent counter-insurgent strategies aimed at winning ‘hearts and minds’. Through a structured focused comparison of four case studies, the correlation between the level of quality, timely and consistent data on civilian harm and its effect on the conduct of hostilities will be reviewed and potential confounders identified. Following this, the hypothesized causal mechanism will be process traced through the pathway case of Afghanistan. The findings and analysis from both methods identify support for the theory and its refinement, with important nuances in the factors conducive to quality, timely and consistent data collection on civilian harm in armed conflict.
7

Poleto, Frederico Zanqueta. „Análise de dados categorizados com omissão em variáveis explicativas e respostas“. Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-09052011-000104/.

Annotation:
We present methodological developments to conduct analyses with missing data and also studies designed to understand the results of such analyses. We examine Bayesian and classical sensitivity analyses for data with missing categorical responses and show that the subjective components of each approach can influence results in non-trivial ways, irrespective of the sample size, concluding that they need to be carefully evaluated. Specifically, we show that prior distributions commonly regarded as slightly informative or non-informative may actually be too informative for non-identifiable parameters, and that the choice of over-parameterized models may drastically impact the results. When there is missingness in explanatory variables, we also need to consider a marginal model for the covariates even if the interest lies only in the conditional model. An incorrect specification of either the model for the covariates or of the model for the missingness mechanism leads to biased inferences for the parameters of interest. Previously published works are commonly divided into two streams: either they use semi-/non-parametric flexible distributions for the covariates and identify the model via a non-informative missingness mechanism, or they employ parametric distributions for the covariates and allow a more general informative missingness mechanism. We consider the analysis of binary responses, combining an informative missingness model with a non-parametric model for the continuous covariates via a Dirichlet process mixture. When the interest lies only in moments of the response distribution, we consider a new classical sensitivity analysis for incomplete responses that avoids distributional assumptions and employs easily interpreted sensitivity parameters. The procedure is particularly useful for analyses of missing continuous data, an area where normality is traditionally assumed and/or relies on hard-to-interpret sensitivity parameters.
We illustrate all analyses with real data sets.
8

Park, Soomin. „Analysis of longitudinal data with informative missingness“. 2001. http://www.library.wisc.edu/databases/connect/dissertations.html.

9

Chang, Yu-Ping, and 張育萍. „Genome-wide pattern of informative missingness using HapMap data“. Thesis, 2013. http://ndltd.ncl.edu.tw/handle/11567638822940451863.

Annotation:
Master's thesis, National Yang-Ming University, Institute of Public Health, academic year 101 (ROC calendar).
Objectives: This dissertation aims to explore the genome-wide pattern of informative missingness among parent genotypes due to various qualities of genotyping. Methods: Genotype, quality score, and pedigree of HapMap data were merged together and genotype scores below 10000, 9000, 8000, and 7000 were assigned to be missing values. Therefore, four sets of trio data with partial missing parental genotypes were implemented by the TIMBD (Guo, 2012), which determines whether parental genotypes are missing informatively or not. SNPs that are significant in the four sets of trio data were studied and 20 of them were matched with RS numbers. Using the NCBI (The National Center for Biotechnology Information) data base, the SNP regional map was used to identify published associations and nearby SNPs in LD with the 20 SNPs. Haploview was used to find linkage disequilibrium information between the 20 SNPs and nearby SNPs. Results: Among the 20 SNPs where parental genotypes are missing informatively due to various genotyping qualities, only one SNP was reported to be associated with obesity and cardiovascular disease. No replications were reported by the 20 SNPs. It is likely that these significant SNPs are false positives.
10

Costa, Adriana Isabel Fonseca. „A study on missing data: handling missingness using Denoising Autoencoders“. Master's thesis, 2018. http://hdl.handle.net/10316/86262.

Annotation:
Project work for the Integrated Master's in Biomedical Engineering, presented to the Faculty of Sciences and Technology
The evolution of technology led to an exponential increase in the amount of data being collected and stored, thus creating the need to develop automatic mechanisms to extract knowledge from data. These automatic mechanisms, known as Machine Learning techniques, were mostly designed for complete data, a requirement that is not always fulfilled. In this context, data imputation (replacement of missing values by plausible estimates) arises as a possible solution, ensuring the quality of data for later analysis. Over the years, several studies presented alternative imputation strategies, among which Stacked Denoising Autoencoders stand out. Given their ability to recover corrupted data, Stacked Denoising Autoencoders are promising in the area of data imputation, generating great interest in the scientific community. However, their application is an understudied topic, still presenting challenging aspects for research; namely, their suitability for different missing data mechanisms (Missing Completely At Random, Missing At Random and Missing Not At Random). This thesis presents a thorough study of data imputation via Stacked Denoising Autoencoders, considering different missing data mechanisms and missing rates. In comparison to state-of-the-art imputation methods, Stacked Denoising Autoencoders proved to be robust for imputing high missing rates, especially when the mechanism underlying their generation is Missing Not At Random.
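The three missing data mechanisms named in this annotation can be made concrete with a small simulation. The following is a minimal sketch and not the thesis's experimental setup: the function name `missing_mask`, the two-variable layout, and the deterministic "largest values go missing" rule for MAR and MNAR are illustrative assumptions.

```python
import random


def missing_mask(x, y, mechanism, rate, seed=0):
    """Return a list of booleans where True marks y[i] as missing.

    x is a fully observed covariate, y is the variable that loses values.
    Hypothetical helper for illustration only.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(y)
    k = round(rate * n)  # number of y values to blank out
    if mechanism == "MCAR":
        # Missingness is unrelated to x and y: choose k positions uniformly.
        chosen = set(random.Random(seed).sample(range(n), k))
    elif mechanism == "MAR":
        # Missingness depends only on the observed covariate x:
        # here, the k rows with the largest x lose their y value.
        chosen = set(sorted(range(n), key=lambda i: x[i], reverse=True)[:k])
    elif mechanism == "MNAR":
        # Missingness depends on the value that goes missing:
        # here, the k largest y values are the ones removed.
        chosen = set(sorted(range(n), key=lambda i: y[i], reverse=True)[:k])
    else:
        raise ValueError("mechanism must be 'MCAR', 'MAR' or 'MNAR'")
    return [i in chosen for i in range(n)]


x = list(range(10))
y = [9 - v for v in x]
for name in ("MCAR", "MAR", "MNAR"):
    print(name, missing_mask(x, y, name, rate=0.3))
```

In practice the MAR and MNAR rules are usually probabilistic rather than hard cut-offs; the cut-offs are used here only to keep the dependence on x or y easy to see.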

Books on the topic "Data missingness"

1

Benstead, Lindsay J. Survey Research in the Arab World. Edited by Lonna Rae Atkeson and R. Michael Alvarez. Oxford University Press, 2017. http://dx.doi.org/10.1093/oxfordhb/9780190213299.013.14.

Annotation:
Since the first surveys were conducted there in the late 1980s, survey research has expanded rapidly in the Arab world. Almost every country in the region is now included in the Arab Barometer, Afrobarometer, or World Values Survey. Moreover, the Arab Spring marked a watershed, with the inclusion of Tunisia and Libya and the addition of many topics, such as voting behavior, that were previously considered too sensitive. As a result, political scientists have dozens of largely untapped data sets to answer theoretical and policy questions. To make progress toward measuring and reducing total survey error, discussion is needed about quality issues, such as high rates of missingness and sampling challenges. Ongoing attention to ethics is also critical. This chapter discusses these developments and frames a substantive and methodological research agenda for improving data quality and survey practice in the Arab world.

Book chapters on the topic "Data missingness"

1

Laaksonen, Seppo. „Missingness, Its Reasons and Treatment“. In Survey Methodology and Missing Data, 99–110. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-79011-4_7.

2

Laaksonen, Seppo. „Sampling Principles, Missingness Mechanisms, and Design Weighting“. In Survey Methodology and Missing Data, 49–76. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-79011-4_4.

3

Rodrigues de Morais, Sérgio, and Alex Aussem. „Exploiting Data Missingness in Bayesian Network Modeling“. In Advances in Intelligent Data Analysis VIII, 35–46. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-03915-7_4.

4

„Case Studies: Ignorable Missingness“. In Missing Data in Longitudinal Studies, 165–84. Chapman and Hall/CRC, 2008. http://dx.doi.org/10.1201/9781420011180-11.

5

„Case Studies: Nonignorable Missingness“. In Missing Data in Longitudinal Studies, 253–87. Chapman and Hall/CRC, 2008. http://dx.doi.org/10.1201/9781420011180-14.

6

„Models for Handling Nonignorable Missingness“. In Missing Data in Longitudinal Studies, 185–235. Chapman and Hall/CRC, 2008. http://dx.doi.org/10.1201/9781420011180-12.

7

Wu, Margaret, and Paul Albert. „Analysis of Longitudinal Data with Missingness“. In Advances in Clinical Trial Biostatistics. CRC Press, 2003. http://dx.doi.org/10.1201/9780203912881.ch11.

8

Daniels, Michael J., and Dandan Xu. „Bayesian Methods for Longitudinal Data with Missingness“. In Bayesian Methods in Pharmaceutical Research, 185–205. Chapman and Hall/CRC, 2020. http://dx.doi.org/10.1201/9781315180212-9.

9

„Imputing Proficiency Data under Planned Missingness in Population Models“. In Handbook of International Large-Scale Assessment, 189–216. Chapman and Hall/CRC, 2013. http://dx.doi.org/10.1201/b16061-13.

10

Collins, Tim, Sandra I. Woolley, Salome Oniani, and Anand Pandyan. „Quantifying Missingness in Wearable Heart Rate Recordings“. In Studies in Health Technology and Informatics. IOS Press, 2021. http://dx.doi.org/10.3233/shti210352.

Annotation:
Wrist-worn photoplethysmography (PPG) heart rate monitoring devices are increasingly used in clinical applications despite the potential for data missingness and inaccuracy. This paper provides an analysis of the intermittency of experimental wearable data recordings. Devices recorded heart rate with gaps of 5 or more minutes 41.6% of the time and 15 or more minutes 3.8% of the time.
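One way to read figures such as "gaps of 5 or more minutes 41.6% of the time" is as the share of total recording time that falls inside gaps of at least a given length. The sketch below computes that share from a sorted list of sample timestamps. It is an illustration under that assumed reading, not the paper's method; the function name `gap_time_fraction` and the example timestamps are hypothetical.

```python
def gap_time_fraction(timestamps, min_gap):
    """Fraction of the recording span spent in gaps of at least min_gap.

    timestamps: sorted sample times in seconds (hypothetical input format).
    min_gap: shortest interval, in seconds, that counts as a gap.
    """
    if len(timestamps) < 2:
        return 0.0
    total = timestamps[-1] - timestamps[0]
    if total <= 0:
        return 0.0
    # Sum the lengths of all intervals between consecutive samples
    # that are at least as long as the gap threshold.
    gap_time = sum(
        later - earlier
        for earlier, later in zip(timestamps, timestamps[1:])
        if later - earlier >= min_gap
    )
    return gap_time / total


# Made-up example: samples at these second offsets, with two long pauses.
samples = [0, 60, 120, 480, 540, 1500]
print(gap_time_fraction(samples, min_gap=5 * 60))   # gaps of 5+ minutes
print(gap_time_fraction(samples, min_gap=15 * 60))  # gaps of 15+ minutes
```

Other definitions are possible, for example counting the share of gaps rather than the share of time, so the threshold and denominator should be chosen to match the study being compared against.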

Conference papers on the topic "Data missingness"

1

Ghorbani, Amirata, and James Y. Zou. „Embedding for Informative Missingness: Deep Learning With Incomplete Data“. In 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 2018. http://dx.doi.org/10.1109/allerton.2018.8636008.

2

Mohan, Karthika, Felix Thoemmes, and Judea Pearl. „Estimation with Incomplete Data: The Linear Case“. In Twenty-Seventh International Joint Conference on Artificial Intelligence (IJCAI-18). California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/705.

Annotation:
Traditional methods for handling incomplete data, including Multiple Imputation and Maximum Likelihood, require that the data be Missing At Random (MAR). In most cases, however, missingness in a variable depends on the underlying value of that variable. In this work, we devise model-based methods to consistently estimate mean, variance and covariance given data that are Missing Not At Random (MNAR). While previous work on MNAR data requires variables to be discrete, we extend the analysis to continuous variables drawn from Gaussian distributions. We demonstrate the merits of our techniques by comparing them empirically to state-of-the-art software packages.