Dissertations / Theses on the topic 'Generalised linear mixed-effects models'

Consult the top 50 dissertations / theses for your research on the topic 'Generalised linear mixed-effects models.'

1

Sima, Adam. "Accounting for Model Uncertainty in Linear Mixed-Effects Models." VCU Scholars Compass, 2013. http://scholarscompass.vcu.edu/etd/2950.

Abstract:
Standard statistical decision-making tools, such as inference, confidence intervals and forecasting, are contingent on the assumption that the statistical model used in the analysis is the true model. In linear mixed-effect models, ignoring model uncertainty results in an underestimation of the residual variance, contributing to hypothesis tests that demonstrate larger than nominal Type-I errors and confidence intervals with smaller than nominal coverage probabilities. A novel utilization of the generalized degrees of freedom developed by Zhang et al. (2012) is used to adjust the estimate of the residual variance for model uncertainty. Additionally, the general global linear approximation is extended to linear mixed-effect models to adjust the standard errors of the parameter estimates for model uncertainty. Both of these methods use a perturbation method for estimation, where random noise is added to the response variable and, conditional on the observed responses, the corresponding estimate is calculated. A simulation study demonstrates that when the proposed methodologies are utilized, both the variance and standard errors are inflated for model uncertainty. However, when a data-driven strategy is employed, the proposed methodologies show limited usefulness. These methods are evaluated with a trial assessing the performance of cervical traction in the treatment of cervical radiculopathy.
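The perturbation idea described in this abstract can be sketched in a few lines. The sketch below is not the thesis's procedure: it assumes a generic scheme in which Gaussian noise is added to the response, the model is refitted, and the sensitivity of each fitted value to its own perturbation is estimated by a regression slope; the sum of the slopes is taken as the generalized degrees of freedom. The function names (`fit_mean`, `perturbation_gdf`), the noise scale `tau` and the toy data are all hypothetical.

```python
import random

def fit_mean(y):
    """Toy 'model': every fitted value is the sample mean (one degree of freedom)."""
    m = sum(y) / len(y)
    return [m] * len(y)

def perturbation_gdf(y, fit, n_perturb=400, tau=0.5, seed=0):
    """Estimate generalized degrees of freedom by perturbing the response.

    For each observation i, the perturbed fitted value is regressed on the
    noise added to y[i]; the slope estimates d(fitted_i)/d(y_i), and the
    slopes are summed over i.
    """
    rng = random.Random(seed)
    n = len(y)
    deltas, fits = [], []
    for _ in range(n_perturb):
        d = [rng.gauss(0.0, tau) for _ in range(n)]
        deltas.append(d)
        fits.append(fit([yi + di for yi, di in zip(y, d)]))
    gdf = 0.0
    for i in range(n):
        d_i = [d[i] for d in deltas]
        f_i = [f[i] for f in fits]
        d_bar = sum(d_i) / n_perturb
        f_bar = sum(f_i) / n_perturb
        sxy = sum((a - d_bar) * (b - f_bar) for a, b in zip(d_i, f_i))
        sxx = sum((a - d_bar) ** 2 for a in d_i)
        gdf += sxy / sxx  # least-squares slope for observation i
    return gdf

# Hypothetical data; for the mean-only model the estimate should be close to 1.
y = [2.1, 1.9, 3.2, 2.8, 2.5, 3.0, 1.7, 2.2]
gdf = perturbation_gdf(y, fit_mean)
print(round(gdf, 2))
```

Replacing `fit_mean` with a procedure that first selects a model and then fits it would typically give a larger value, which is the sense in which model uncertainty costs extra degrees of freedom.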
2

Overstall, Antony Marshall. "Default Bayesian model determination for generalised linear mixed models." Thesis, University of Southampton, 2010. https://eprints.soton.ac.uk/170229/.

Abstract:
In this thesis, an automatic, default, fully Bayesian model determination strategy for GLMMs is considered. This strategy must address the two key issues of default prior specification and computation. Default prior distributions for the model parameters, that are based on a unit information concept, are proposed. A two-phase computational strategy, that uses a reversible jump algorithm and implementation of bridge sampling, is also proposed. This strategy is applied to four examples throughout this thesis.
3

Gory, Jeffrey J. "Marginally Interpretable Generalized Linear Mixed Models." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1497966698387606.

4

Min, Min. "Asymptotic normality in generalized linear mixed models." College Park, Md.: University of Maryland, 2007. http://hdl.handle.net/1903/7758.

Abstract:
Thesis (Ph. D.) -- University of Maryland, College Park, 2007.
Thesis research directed by: Dept. of Mathematics. Title from t.p. of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.
5

Richardson, Troy E. "Treatment heterogeneity and potential outcomes in linear mixed effects models." Diss., Kansas State University, 2013. http://hdl.handle.net/2097/15950.

Abstract:
Doctor of Philosophy
Department of Statistics
Gary L. Gadbury
Studies commonly focus on estimating a mean treatment effect in a population. However, in some applications the variability of treatment effects across individual units may help to characterize the overall effect of a treatment across the population. Consider a set of treatments, {T, C}, where T denotes some treatment that might be applied to an experimental unit and C denotes a control. For each of N experimental units, the duplet {r_Ti, r_Ci}, i = 1, 2, …, N, represents the potential response of the ith experimental unit if treatment were applied and the response of the experimental unit if control were applied, respectively. The causal effect of T compared to C is the difference between the two potential responses, r_Ti - r_Ci. Much work has been done to elucidate the statistical properties of a causal effect, given a set of particular assumptions. Gadbury and others have reported on this for some simple designs and primarily focused on finite population randomization based inference. When designs become more complicated, the randomization based approach becomes increasingly difficult. Since linear mixed effects models are particularly useful for modeling data from complex designs, their role in modeling treatment heterogeneity is investigated. It is shown that an individual treatment effect can be conceptualized as a linear combination of fixed treatment effects and random effects. The random effects are assumed to have variance components specified in a mixed effects “potential outcomes” model when both potential outcomes, r_T and r_C, are variables in the model. The variance of the individual causal effect is used to quantify treatment heterogeneity. Post treatment assignment, however, only one of the two potential outcomes is observable for a unit. It is then shown that the variance component for treatment heterogeneity becomes non-estimable in an analysis of observed data.
Furthermore, estimable variance components in the observed data model are demonstrated to arise from linear combinations of the non-estimable variance components in the potential outcomes model. Mixed effects models are considered in context of a particular design in an effort to illuminate the loss of information incurred when moving from a potential outcomes framework to an observed data analysis.
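The non-estimability described in this abstract can be illustrated with a standard variance identity. The symbols \sigma_T^2, \sigma_C^2 and \rho (the variances of the two potential outcomes and their correlation) are introduced here for illustration and do not appear in the abstract; the thesis's mixed-model parameterization may well differ.

```latex
\operatorname{Var}(r_{Ti} - r_{Ci})
  = \sigma_T^2 + \sigma_C^2 - 2\rho\,\sigma_T\sigma_C ,
\qquad
\sigma_T^2 = \operatorname{Var}(r_{Ti}),\quad
\sigma_C^2 = \operatorname{Var}(r_{Ci}),\quad
\rho = \operatorname{Corr}(r_{Ti}, r_{Ci}).
```

In the simplest randomized setting, \sigma_T^2 and \sigma_C^2 can be learned from the treated and control groups respectively, but \rho involves both potential outcomes of the same unit. Because only one of them is ever observed per unit, the observed data carry no direct information about \rho, and hence none about the variance of the individual effect.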
6

Yam, Ho-kwan, and 任浩君. "On a topic of generalized linear mixed models and stochastic volatility model." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2002. http://hub.hku.hk/bib/B29913342.

7

Ogden, Helen E. "Inference for generalised linear mixed models with sparse structure." Thesis, University of Warwick, 2014. http://wrap.warwick.ac.uk/60467/.

Abstract:
The likelihood for the parameters of a generalised linear mixed model involves an integral which may be of very high dimension. Because of this apparent intractability, many alternative methods have been proposed for inference in these models, but it is shown that all can fail when the model is sparse, in that there is only a small amount of information available on each random effect. The sequential reduction method developed in this thesis seeks to fill in this gap, by exploiting the dependence structure of the posterior distribution of the random effects to reduce dramatically the cost of approximating the likelihood in models with sparse structure. Examples are given to demonstrate the high quality of the new approximation relative to the available alternatives. Finally, robustness of various estimators to misspecification of the random effect distribution is considered. It is found that certain marginal composite likelihood estimators are not robust to such misspecification in situations in which the full maximum likelihood estimator is robust, providing a counterexample to the notion that composite likelihood estimators will always be at least as robust as the maximum likelihood estimator under model misspecification.
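For orientation, the integral this abstract refers to can be written out in generic form. The notation below (design matrices X and Z, fixed effects \beta, random effects u of dimension q with covariance \Sigma(\theta)) is the usual textbook one and is an assumption here, not taken from the thesis.

```latex
L(\beta, \theta)
  = \int_{\mathbb{R}^q}
      \prod_{i=1}^{n} f\bigl(y_i \mid \eta_i\bigr)\,
      \phi_q\bigl(u;\, 0,\, \Sigma(\theta)\bigr)\, du ,
\qquad
\eta = X\beta + Zu .
```

Here f is the conditional density or mass function of a response given its linear predictor and \phi_q is the q-variate normal density. When each random effect touches only a few observations, q grows with the sample size and each coordinate of u is weakly informed by the data, which matches the "sparse" regime the abstract describes.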
8

Tang, On-yee, and 鄧安怡. "Estimation for generalized linear mixed model via multiple imputations." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2005. http://hub.hku.hk/bib/B30687652.

9

Ma, Renjun. "An orthodox BLUP approach to generalized linear mixed models." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape10/PQDD_0024/NQ38934.pdf.

10

Tang, On-yee. "Estimation for generalized linear mixed model via multiple imputations." Click to view the E-thesis via HKUTO, 2005. http://sunzi.lib.hku.hk/hkuto/record/B30687652.

11

Sepato, Sandra Moepeng. "Generalized linear mixed model and generalized estimating equation for binary longitudinal data." Diss., University of Pretoria, 2014. http://hdl.handle.net/2263/43143.

Abstract:
The most common analysis for binary data is the generalised linear model (GLM) with either a binomial or Bernoulli distribution and a logit, probit, complementary log-log or other link function. However, such analyses violate the independence assumption if the binary data are measured repeatedly over time on the same subject or site. Failure to take the correlation into account can lead to incorrect estimation of the regression parameters, and the estimates are less efficient, particularly when the correlations are large. Therefore, to obtain the most efficient estimates that are also unbiased, methods that incorporate correlations (McCullagh and Nelder, 1989) should be used. Two statistical methodologies that can account for this correlation in longitudinal data are generalized linear mixed models (GLMMs) and generalized estimating equations (GEE). The GLMM method extends the fixed-effects GLM to include random effects and covariance patterns. Unlike the GLM and GLMM methods, the GEE method is based on quasi-likelihood theory and makes no assumption about the distribution of the response observations (Liang and Zeger, 1986). The main objective of the study is to investigate the statistical properties and limitations of these three approaches (GLM, GLMM and GEE) for analysing longitudinal data, using binary data from an entomology study. The results reaffirm the point made by these authors that misspecification of the working correlation in the GEE approach still gives consistent regression parameter estimates. Further, the results of this study suggest that, even with small correlation, ignoring random effects in a binary model can lead to inconsistent estimation.
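The contrast between the two approaches can be stated compactly. The formulation below is the generic textbook one, written with notation chosen here for illustration (subject i, occasion j, link g, random-effects covariance G, working correlation R(\alpha)); it is not quoted from the dissertation.

```latex
% GLMM: subject-specific (conditional) model
g\bigl(\mathrm{E}[y_{ij} \mid b_i]\bigr) = x_{ij}^{\top}\beta + z_{ij}^{\top} b_i ,
\qquad b_i \sim N(0, G).

% GEE: population-averaged (marginal) model, solved from estimating equations
\sum_{i=1}^{m} D_i^{\top} V_i^{-1}\bigl(y_i - \mu_i(\beta)\bigr) = 0 ,
\qquad
D_i = \frac{\partial \mu_i}{\partial \beta^{\top}} ,
\quad
V_i = \phi\, A_i^{1/2} R(\alpha)\, A_i^{1/2} .
```

Here A_i is the diagonal matrix of variance functions and \phi a dispersion parameter. Only the mean model has to be correct for the GEE estimate of \beta to be consistent, which is why a misspecified R(\alpha) costs efficiency rather than consistency. Note also that with a logit link the two \beta vectors are not the same quantity: the GLMM coefficients are conditional on b_i, the GEE coefficients are averaged over subjects.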
Dissertation (MSc)--University of Pretoria, 2014.
Statistics
MSc
Unrestricted
12

Hercz, Daniel. "Flexible modeling with generalized additive models and generalized linear mixed models: comprehensive simulation and case studies." Thesis, McGill University, 2013. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=114300.

Abstract:
This thesis compares GAMs and GLMMs in the context of modeling nonlinear curves. The study contains a comprehensive simulation and a few real-life data analyses. The simulation uses thousands of generated datasets to compare and contrast the fit, extent of nonlinearity, and shape of the resulting curve for the two models (with linear models as a benchmark). The data analyses extend the results of the simulation to GLMM/GAM curves of lung function with measures of smoking as the independent variable. An additional, larger real-life data analysis with dichotomous outcomes rounds out the study and allows for more representative results.
13

Chen, Jinsong. "Semiparametric Methods for the Generalized Linear Model." Diss., Virginia Tech, 2010. http://hdl.handle.net/10919/28012.

Abstract:
The generalized linear model (GLM) is a popular model in many research areas. In the GLM, each outcome of the dependent variable is assumed to be generated from a particular distribution function in the exponential family. The mean of the distribution depends on the independent variables. The link function provides the relationship between the linear predictor and the mean of the distribution function. In this dissertation, two semiparametric extensions of the GLM will be developed. In the first part of this dissertation, we have proposed a new model, called a semiparametric generalized linear model with a log-concave random component (SGLM-L). In this model, the estimate of the distribution of the random component has a nonparametric form while the estimate of the systematic part has a parametric form. In the second part of this dissertation, we have proposed a model, called a generalized semiparametric single-index mixed model (GSSIMM). A nonparametric component with a single index is incorporated into the mean function in the generalized linear mixed model (GLMM) assuming that the random component is following a parametric distribution. In the first part of this dissertation, since most of the literature on the GLM deals with the parametric random component, we relax the parametric distribution assumption for the random component of the GLM and impose a log-concave constraint on the distribution. An iterative numerical algorithm for computing the estimators in the SGLM-L is developed. We construct a log-likelihood ratio test for inference. In the second part of this dissertation, we use a single index model to generalize the GLMM to have a linear combination of covariates enter the model via a nonparametric mean function, because the linear model in the GLMM is not complex enough to capture the underlying relationship between the response and its associated covariates. The marginal likelihood is approximated using the Laplace method. 
A penalized quasi-likelihood approach is proposed to estimate the nonparametric function and the parameters, including the single-index coefficients, in the GSSIMM. We estimate variance components using marginal quasi-likelihood. Asymptotic properties of the estimators are developed using an idea similar to that of Yu (2008). A simulation example is carried out to compare the performance of the GSSIMM with that of the GLMM. We demonstrate the advantage of our approach using a study of the association between daily air pollutants and daily mortality, adjusted for temperature and wind speed, in various counties of North Carolina.
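The abstract mentions approximating the marginal likelihood with the Laplace method. The sketch below shows that technique on a much simpler model than the GSSIMM: one cluster of a random-intercept logistic model, with the random intercept integrated out. The model, the function names and the numbers are assumptions made for illustration, and a brute-force trapezoid rule is included only as a reference value.

```python
import math

def log_joint(u, successes, trials, beta, sigma):
    """log of p(y | u) * p(u) for one cluster (binomial coefficient omitted)."""
    eta = beta + u
    loglik = successes * eta - trials * math.log1p(math.exp(eta))
    logprior = -0.5 * (u / sigma) ** 2 - 0.5 * math.log(2 * math.pi * sigma ** 2)
    return loglik + logprior

def laplace_marginal(successes, trials, beta, sigma, iters=50):
    """Laplace approximation: exp(h(u_hat)) * sqrt(2*pi / -h''(u_hat))."""
    u = 0.0
    for _ in range(iters):  # Newton steps towards the mode of the log joint
        p = 1.0 / (1.0 + math.exp(-(beta + u)))
        grad = successes - trials * p - u / sigma ** 2
        hess = -trials * p * (1.0 - p) - 1.0 / sigma ** 2
        u -= grad / hess
    p = 1.0 / (1.0 + math.exp(-(beta + u)))
    hess = -trials * p * (1.0 - p) - 1.0 / sigma ** 2
    return math.exp(log_joint(u, successes, trials, beta, sigma)) * math.sqrt(
        2 * math.pi / -hess
    )

def brute_force_marginal(successes, trials, beta, sigma, lo=-10.0, hi=10.0, steps=20000):
    """Reference value from a fine trapezoid rule over the random intercept."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        w = 0.5 if k in (0, steps) else 1.0
        total += w * math.exp(log_joint(lo + k * h, successes, trials, beta, sigma))
    return total * h

# Hypothetical cluster: 3 successes out of 10 trials, beta = 0, sigma = 1.
lap = laplace_marginal(3, 10, 0.0, 1.0)
ref = brute_force_marginal(3, 10, 0.0, 1.0)
print(lap, ref)
```

The two values should agree closely for a cluster of this size; the approximation generally degrades as the amount of data per random effect shrinks.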
Ph. D.
14

Hossain, Mohammad Zakir. "A small-sample randomization-based approach to semi-parametric estimation and misspecification in generalized linear mixed models." Thesis, Queen Mary, University of London, 2017. http://qmro.qmul.ac.uk/xmlui/handle/123456789/24641.

Abstract:
In a generalized linear mixed model (GLMM), the random effects are typically uncorrelated and assumed to follow a normal distribution. However, findings from recent studies on how the misspecification of the random effects distribution affects the estimated model parameters are inconclusive. In this thesis, we extend the randomization approach for deriving linear models to the GLMM framework. Based on this approach, we develop an algorithm for estimating the model parameters of the randomization-based GLMM (RB-GLMM) for the completely randomized design (CRD) which does not require normally distributed random effects. Instead, the discrete uniform distribution on the symmetric group of permutations is used for the random effects. Our simulation results suggest that the randomization-based algorithm may be an alternative when the assumption of normality is violated. In the second part of the thesis, we consider an RB-GLMM for the randomized complete block design (RCBD) with random block effects. We investigate the effect of misspecification of the correlation structure and of the random effects distribution via simulation studies. In the simulation, we use the variance-covariance matrices derived from the randomization approach. The misspecified model with uncorrelated random effects is fitted to data generated from the model with correlated random effects. We also fit the model with normally distributed random effects to data simulated from models with different random effects distributions. The simulation results show that misspecification of both the correlation structure and the random effects distribution has hardly any effect on the estimates of the fixed effects parameters. However, the estimated variance components are frequently severely biased and the standard errors of these estimates are substantially higher.
15

Shrewsbury, John Stephen. "Calibration of trip distribution by generalised linear models." Thesis, University of Canterbury. Department of Civil and Natural Resources Engineering, 2012. http://hdl.handle.net/10092/7685.

Abstract:
Generalised linear models (GLMs) provide a flexible and sound basis for calibrating gravity models for trip distribution, for a wide range of deterrence functions (from steps to splines), with K factors and geographic segmentation. The Tanner function fitted Wellington Transport Strategy Model data as well as more complex functions did, and was insensitive to the formulation of intrazonal and external costs. Weighting from variable expansion factors and interpretation of the deviance under sparsity are addressed. An observed trip matrix is disaggregated and fitted at the household, person and trip levels with consistent results. Hierarchical GLMs (HGLMs) are formulated to fit mixed logit models, but were unable to reproduce the coefficients of simple nested logit models. Geospatial analysis by HGLM showed no evidence of spatial error patterns, either as random K factors or as correlations between them. Equivalence with hierarchical mode choice, duality with trip distribution, regularisation, lorelograms, and the modifiable areal unit problem are considered. Trip distribution is calibrated from aggregate data by the MVESTM matrix estimation package, incorporating period and direction factors in the intercepts. Counts across four screenlines showed a significance similar to that of a thousand-household travel survey. Calibration was possible only in conjunction with trip end data. Criteria for validation against screenline counts were met, but only if allowance was made for error in the trip end data.
16

Nelson, Kerrie P. "Generalized linear mixed models : development and comparison of different estimation methods /." Thesis, Connect to this title online; UW restricted, 2002. http://hdl.handle.net/1773/8960.

17

Chen, Yin. "Quasi-Monte Carlo methods in generalized linear mixed model with correlated and non-normal random effects." Thesis, University of Manchester, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.516829.

18

Barbosa, Luciano [UNESP]. "Metodologias estatísticas na análise de germinação de sementes de mamona." Universidade Estadual Paulista (UNESP), 2010. http://hdl.handle.net/11449/101848.

Abstract:
Experiments whose response variables are counts or proportions are very common in agriculture. For this type of data, if the observational units are independent, the methodology of generalized linear models can be appropriate. On the other hand, when responses are dependent or clustered, there is a correlation between the observations, and it has to be taken into consideration in the analysis to avoid incorrect inferences about the regression coefficients. The literature offers techniques for modeling and analyzing this type of data, the models being extensions of generalized linear models. The present study explores the use of: 1) generalized estimating equations, which include a correlation matrix to obtain a better fit; 2) generalized linear mixed models, which introduce a correlation between clustered observations through the addition of random effects to the model. These two methodologies are applied to a data set obtained from an experiment evaluating certain conditions on the germination of seeds of the castor bean cultivar AL Guarany 2002, with the objective of determining the best estimation model for such data.
19

Evangelou, Evangelos A. Smith Richard L. "Bayesian and frequentist methods for approximate inference in generalized linear mixed models." Chapel Hill, N.C. : University of North Carolina at Chapel Hill, 2009. http://dc.lib.unc.edu/u?/etd,2607.

Abstract:
Thesis (Ph. D.)--University of North Carolina at Chapel Hill, 2009.
Title from electronic title page (viewed Oct. 5, 2009). "... in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Statistics and Operations Research Statistics." Discipline: Statistics and Operations Research; Department/School: Statistics and Operations Research.
20

Jung, Jungah. "Using generalized linear models with a mixed random component to analyze count data." Fogler Library, University of Maine, 2001. http://www.library.umaine.edu/theses/pdf/JungJX2001.pdf.

21

Kurusu, Ricardo Salles. "Avaliação de técnicas de diagnóstico para a análise de dados com medidas repetidas." Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-21062013-202727/.

Abstract:
Conditional and marginal models are among the possibilities in statistical literature to analyze data from studies with correlated observations. Several techniques have been proposed for diagnostic analysis in these models. The objective of this work is to present some of the diagnostic techniques available for both modeling approaches and to evaluate them by simulation studies. The presented techniques were also applied in a real dataset.
22

Maekawa, Eduardo Shigueiti. "Estimativa do custo da colheita mecanizada de cana-de-açúcar utilizando modelos de regressão." Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/11/11152/tde-30092016-101059/.

Abstract:
The mechanized harvesting of sugarcane is one of the most significant and costly operations of the production process, so it is important to understand the relationships that drive its cost. Currently, methods for estimating these costs start from the concept of fixed and variable cost. However, considering the complexity of the harvesting process, it is necessary to evaluate techniques that relate the operating parameters to the final cost. In this context, statistical modeling by regression makes it possible to treat such relationships and predict trends. The objective of this study was to develop an empirical model to calculate the cost of mechanical harvesting of sugarcane. A generalized linear model (GLM) and a generalized linear mixed model (GLMM), both with a gamma distribution, were developed using operational indicators and cost data from 20 mills in the sugarcane industry. The GLMM achieved a satisfactory fit when compared with the GLM, the null model (mean) and the linear model (assuming normality). The indicators that explained the cost were: productivity (t machine^-1), fuel consumption (L t^-1), hour meter reading (h) and number of operators per harvester (nop).
23

Barbosa, Luciano 1971. "Metodologias estatísticas na análise de germinação de sementes de mamona /." Botucatu : [s.n.], 2010. http://hdl.handle.net/11449/101848.

Abstract:
Advisor: Luiza Aparecida Trinca
Committee: Liciana Vaz da Arruda
Committee: Osmar Delmanto Junior
Committee: Célia Regina Lopes Zimback
Committee: Marli Teixeira de A. Minhoni
Abstract: Experiments whose response variables are counts or proportions are very common in agriculture. For this type of data, if the observational units are independent, the methodology of generalized linear models can be appropriate. On the other hand, when responses are dependent or clustered, there is a correlation between the observations, and it has to be taken into consideration in the analysis to avoid incorrect inferences about the regression coefficients. The literature offers techniques for modeling and analyzing this type of data, the models being extensions of generalized linear models. The present study explores the use of: 1) generalized estimating equations, which include a correlation matrix to obtain a better fit; 2) generalized linear mixed models, which introduce a correlation between clustered observations through the addition of random effects to the model. These two methodologies are applied to a data set obtained from an experiment evaluating certain conditions on the germination of seeds of the castor bean cultivar AL Guarany 2002, with the objective of determining the best estimation model for such data.
Doctorate
24

Codd, Casey. "A Review and Comparison of Models and Estimation Methods for Multivariate Longitudinal Data of Mixed Scale Type." The Ohio State University, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=osu1398686513.

25

CHEN, JUNLIANG. "A MONTE CARLO EM ALGORITHM FOR GENERALIZED LINEAR MIXED MODELS WITH FLEXIBLE RANDOM EFFECTS DISTRIBUTION." NCSU, 2001. http://www.lib.ncsu.edu/theses/available/etd-20011025-112332.

Abstract:

CHEN, JUNLIANG. A Monte Carlo EM algorithm for generalized linear mixed models with flexible random effects distribution. (Under the direction of Daowen Zhang and Marie Davidian)
A popular way to model correlated binary, count, or other data arising in clinical trials and epidemiological studies of cancer and other diseases is by using generalized linear mixed models (GLMMs), which acknowledge correlation through incorporation of random effects. A standard model assumption is that the random effects follow a parametric family such as the normal distribution. However, this may be unrealistic or too restrictive to represent the data, raising concern over the validity of inferences both on fixed and random effects if it is violated.
Here we use the seminonparametric (SNP) approach (Davidian and Gallant 1992, 1993) to model the random effects, which relaxes the normality assumption and just requires that the distribution of random effects belong to a class of "smooth" densities given by Gallant and Nychka (1987). This representation allows the density of random effects to be very flexible, including densities that are skewed, multi-modal, fat- or thin-tailed relative to the normal, and the normal as a special case. We also provide a reparameterization of this representation to avoid numerical instability in estimating the polynomial coefficients.
Because an efficient algorithm to sample from a SNP density is available, we propose a Monte Carlo expectation maximization (MCEM) algorithm using a rejection sampling scheme (Booth and Hobert, 1999) to estimate the fixed parameters of the linear predictor, variance components and the SNP density. A strategy of choosing the degree of flexibility required for the SNP density is also proposed. We illustrate the methods by application to two data sets from the Framingham and Six Cities Studies, and present simulations demonstrating performance of the approach.
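To make the SNP idea concrete, the sketch below draws from a one-dimensional density of the form (a0 + a1 z)^2 φ(z), where φ is the standard normal density and a0^2 + a1^2 = 1 so that the density integrates to one. This is only an illustration under assumptions: the degree-1 polynomial, the N(0, 2) proposal, the envelope constant and the function names are choices made here, and the sampler is a plain rejection sampler rather than the algorithm the thesis relies on.

```python
import math
import random

def normal_pdf(z, sd):
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def snp_density(z, a0, a1):
    """Degree-1 SNP density: squared polynomial times the standard normal density."""
    return (a0 + a1 * z) ** 2 * normal_pdf(z, 1.0)

def sample_snp(n, a0, a1, seed=0):
    """Rejection sampling with a N(0, 2) proposal.

    With a0^2 + a1^2 = 1, Cauchy-Schwarz gives (a0 + a1 z)^2 <= 1 + z^2, so
    target / proposal <= sqrt(2) * (1 + z^2) * exp(-z^2 / 4) <= sqrt(2) * 4 * exp(-3/4).
    """
    if abs(a0 * a0 + a1 * a1 - 1.0) > 1e-9:
        raise ValueError("coefficients must satisfy a0^2 + a1^2 = 1")
    rng = random.Random(seed)
    prop_sd = math.sqrt(2.0)
    bound = prop_sd * 4.0 * math.exp(-0.75)
    draws = []
    while len(draws) < n:
        z = rng.gauss(0.0, prop_sd)
        if rng.random() * bound * normal_pdf(z, prop_sd) <= snp_density(z, a0, a1):
            draws.append(z)
    return draws

# Hypothetical coefficients giving a right-skewed density with mean 2 * a0 * a1 = 1.
a0 = a1 = 1.0 / math.sqrt(2.0)
draws = sample_snp(4000, a0, a1)
sample_mean = sum(draws) / len(draws)
print(round(sample_mean, 2))
```

Setting a1 = 0 recovers the standard normal, which is the "normal as a special case" property the abstract mentions.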

26

Beauchamp, Marie-Eve. "Generalized linear mixed models for binary outcome data with a low proportion of occurrences." Thesis, McGill University, 2010. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=86709.

Abstract:
Many studies in epidemiology and other fields such as econometrics and social sciences give rise to correlated outcome data (e.g., longitudinal studies, meta-analyses, and multi-centre studies). Parameter estimation of generalized linear mixed models (GLMMs), which are frequently used to perform inference on correlated binary outcomes, is complicated by intractable integrals in the marginal likelihood. Penalized quasi-likelihood (PQL) and maximum likelihood estimation in conjunction with numerical integration via adaptive Gauss-Hermite quadrature (AGHQ) are estimation methods that are commonly used in practice. However, the assessment of the performance of these estimation methods in settings found in practice is incomplete, particularly for binary outcome data with a low proportion of occurrences.
To begin with, I considered graphical representations of the distributions of cluster-specific log odds of outcome ensuing from random intercepts logistic models (RILMs) converted to the probability scale with the inverse logit transformation. RILMs are special cases of GLMMs. These representations are helpful to comprehend the implications of RILM parameter values for the distributions of cluster-specific probabilities of outcome. The correspondence of these distributions with beta distributions, also used for random effects models for binary outcomes, was graphically assessed and a generally good agreement was found.
Afterwards, I evaluated via a simulation study the performance of the PQL and AGHQ methods in several realistic settings of binary outcome data with a low proportion of occurrences. Different features determining the number of occurrences were considered (number of clusters, cluster size, and probabilities of outcome). The AGHQ method produced nearly unbiased fixed effects estimates, even in challenging settings with low proportions of occurrences or a small sample size, but mean square errors tended to be larger than with PQL for small datasets. Both methods produced biased variance component estimates when the number of clusters was moderate, especially with rarer occurrences.
Finally, through further analysis of the simulation results, I assessed if a number of indicators quantifying different aspects of the rarity of the events in a dataset, all measurable in practice, could explain patterns of bias in the parameter estimates. The selected rarity indicators quantify the overall number of events and their distribution across the clusters.
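The conversion of cluster-specific log odds to the probability scale described in this abstract can be illustrated with a minimal sketch. It assumes a random intercepts logistic model of the form logit(p_i) = beta0 + b_i with b_i drawn from Normal(0, sigma^2); the function names and the parameter values below are hypothetical and are not taken from the thesis.

```python
import math
import random


def inverse_logit(x):
    """Map a log odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def cluster_probabilities(beta0, sigma, n_clusters, seed=0):
    """Draw cluster-specific outcome probabilities from a random intercepts
    logistic model: logit(p_i) = beta0 + b_i, with b_i ~ Normal(0, sigma^2)."""
    rng = random.Random(seed)
    return [inverse_logit(beta0 + rng.gauss(0.0, sigma)) for _ in range(n_clusters)]


# Hypothetical parameter values chosen to mimic a rare outcome.
probs = cluster_probabilities(beta0=-4.0, sigma=1.0, n_clusters=1000)
mean_p = sum(probs) / len(probs)
print(f"average cluster-specific probability: {mean_p:.4f}")
print(f"probability at the mean intercept:    {inverse_logit(-4.0):.4f}")
```

For a rare outcome the average of the cluster-specific probabilities tends to exceed the probability evaluated at the mean intercept, because the inverse logit is convex in that region.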
27

Cho, Jang Ik. "Partial EM Procedure for Big-Data Linear Mixed Effects Model, and Generalized PPE for High-Dimensional Data in Julia." Case Western Reserve University School of Graduate Studies / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=case152845439167999.

28

Zhan, Tingting. "The Generalized Linear Mixed Model for Finite Normal Mixtures with Application to Tendon Fibrilogenesis Data." Diss., Temple University Libraries, 2012. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/171613.

Abstract:
Statistics
Ph.D.
We propose the generalized linear mixed model for finite normal mixtures (GLMFM), as well as estimation procedures for the GLMFM model, which are widely applicable to hierarchical datasets with a small number of individual units and multi-modal distributions at the lowest level of clustering. The modeling task is two-fold: (a) to model the lowest-level cluster as a finite mixture of normal distributions; and (b) to model the properly transformed mixture proportions, means and standard deviations of the lowest-level cluster as a linear hierarchical structure. We propose the robust generalized weighted likelihood estimators and the new cubic-inverse weight for the estimation of the finite mixture model (Zhan et al., 2011). We propose two robust methods for estimating the GLMFM model that accommodate contamination at all clustering levels: the standard two-stage approach (Chervoneva et al., 2011, co-authored) and a robust joint estimation. Our research was motivated by the data obtained from the tendon fibril experiment reported in Zhang et al. (2006). Our statistical methodology is quite general and has potential application in a variety of relatively complex statistical modeling situations.
Temple University--Theses
29

Hewson, Paul James. "On the uses of generalised linear mixed models for the simultaneous investigation of multiple performance indicators." Thesis, University of Exeter, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.418464.

30

Policastro, Catherine. "The Effects of Ecological Context and Individual Characteristics on Stereotyped Displays in Male Anolis carolinensis." ScholarWorks@UNO, 2013. http://scholarworks.uno.edu/td/1757.

Abstract:
Displays are ubiquitous throughout the animal kingdom. While many have been thoroughly documented, the factors affecting the expression of such displays are still not fully understood. We tested the hypotheses that display production would be affected by ecological context (i.e. the identity of the receiver) and intrinsic qualities of the signaler (i.e. heavyweight and lightweight size class) in the green anole lizard, Anolis carolinensis. Our results supported these predictions and show that a) ecological context, specifically displaying to conspecifics, has the greatest impact on display production; b) size class influenced display rate with heavyweight males displaying more to green females and lightweight males displaying more to green males in similar frequency between the two size classes to their respective target stimuli. Furthermore, our results provide empirical support for differential use of the three major display types (A, B and C displays), and uncover unexpected complexity in green anole display production.
31

Nuthmann, Antje, Wolfgang Einhäuser, and Immo Schütz. "How Well Can Saliency Models Predict Fixation Selection in Scenes Beyond Central Bias? A New Approach to Model Evaluation Using Generalized Linear Mixed Models." Universitätsbibliothek Chemnitz, 2018. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-qucosa-232614.

Abstract:
Since the turn of the millennium, a large number of computational models of visual salience have been put forward. How best to evaluate a given model's ability to predict where human observers fixate in images of real-world scenes remains an open research question. Assessing the role of spatial biases is a challenging issue; this is particularly true when we consider the tendency for high-salience items to appear in the image center, combined with a tendency to look straight ahead (“central bias”). This problem is further exacerbated in the context of model comparisons, because some—but not all—models implicitly or explicitly incorporate a center preference to improve performance. To address this and other issues, we propose to combine a-priori parcellation of scenes with generalized linear mixed models (GLMM), building upon previous work. With this method, we can explicitly model the central bias of fixation by including a central-bias predictor in the GLMM. A second predictor captures how well the saliency model predicts human fixations, above and beyond the central bias. By-subject and by-item random effects account for individual differences and differences across scene items, respectively. Moreover, we can directly assess whether a given saliency model performs significantly better than others. In this article, we describe the data processing steps required by our analysis approach. In addition, we demonstrate the GLMM analyses by evaluating the performance of different saliency models on a new eye-tracking corpus. To facilitate the application of our method, we make the open-source Python toolbox “GridFix” available.
32

Shannon, Carlie. "A case study in applying generalized linear mixed models to proportion data from poultry feeding experiments." Kansas State University, 2013. http://hdl.handle.net/2097/15519.

Abstract:
Master of Science
Department of Statistics
Leigh Murray
This case study was motivated by the need for effective statistical analysis for a series of poultry feeding experiments conducted in 2006 by Kansas State University researchers in the Department of Animal Science. Some of these experiments involved an automated auger feed line system commonly used in commercial broiler houses and continuous proportion response data. Two of the feed line experiments are considered in this case study to determine whether a statistical model using a non-normal response offers a better fit for these data than a model utilizing a normal approximation. The two experiments involve fixed as well as multiple random effects. In this case study, the data from these experiments are analyzed using a linear mixed model and generalized linear mixed models (GLMMs) with the SAS GLIMMIX procedure. Comparisons are made between a linear mixed model and GLMMs using the beta and binomial responses. Since the response data are not count data, a quasi-binomial approximation to the binomial is used to convert continuous proportions to the ratio of successes over total number of trials, N, for a variety of possible N values. Results from these analyses are compared on the basis of point estimates, confidence intervals and confidence interval widths, as well as p-values for tests of fixed effects. The investigation concludes that a GLMM may offer a better fit than models using a normal approximation for these data when sample sizes are small or response values are close to zero. It also finds that these same conditions can cause GLMMs utilizing the beta response to behave poorly in the GLIMMIX procedure, because convergence failures prevent valid results from being obtained. In such a case, a GLMM using a quasi-binomial response distribution with a high value of N can offer a reasonable and well-behaved alternative to the beta distribution.
33

Nati, Lilian. "Superdispersão em dados binomiais hierárquicos." Universidade de São Paulo, 2008. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-19062008-132744/.

Abstract:
A common alternative when analyzing binary data from a two-level hierarchical structure (for instance, student and school) is to assume a binomial distribution for the first-level experimental units (student), conditional on a normal random effect for the second-level units (school). In this work, we propose including a second normal random effect at the first level to account for possible extra-binomial variability due to dependence among the Bernoulli trials of the same individual. We obtain the maximum likelihood estimation procedure for this hierarchical model from the marginal likelihood of the data, after a double application of adaptive Gauss-Hermite quadrature to approximate the integrals over the random effects. We conduct a simulation study to compare the inferential properties of the proposed model with those of the generalized linear (binomial) model, a quasi-likelihood model and the usual two-level hierarchical generalized linear model.
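To make the quadrature step concrete, here is a minimal sketch of how a normal random intercept can be integrated out of a binomial likelihood. It is a simplification and an assumption on my part: it uses a fixed three-point, non-adaptive Gauss-Hermite rule and a single random intercept, whereas the thesis applies adaptive quadrature twice. The function names are hypothetical.

```python
import math

# Three-point Gauss-Hermite rule for integrals of the form
#   integral of exp(-x^2) * f(x) dx  ~  sum_k w_k * f(x_k)
GH_NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
GH_WEIGHTS = (
    math.sqrt(math.pi) / 6.0,
    2.0 * math.sqrt(math.pi) / 3.0,
    math.sqrt(math.pi) / 6.0,
)


def normal_expectation(f, sigma):
    """Approximate E[f(b)] for b ~ Normal(0, sigma^2),
    using the substitution b = sqrt(2) * sigma * x."""
    total = sum(w * f(math.sqrt(2.0) * sigma * x) for x, w in zip(GH_NODES, GH_WEIGHTS))
    return total / math.sqrt(math.pi)


def cluster_marginal_likelihood(successes, trials, beta0, sigma):
    """Marginal likelihood of one cluster's binomial count, with the
    normal random intercept integrated out by quadrature."""

    def conditional(b):
        p = 1.0 / (1.0 + math.exp(-(beta0 + b)))
        return math.comb(trials, successes) * p**successes * (1.0 - p) ** (trials - successes)

    return normal_expectation(conditional, sigma)


# Hypothetical values: 2 successes in 10 trials, intercept -1, random-effect SD 0.8.
print(cluster_marginal_likelihood(2, 10, beta0=-1.0, sigma=0.8))
```

Three nodes are far too few for serious work; the adaptive variant recentres and rescales the nodes for each cluster, which is what makes a small number of nodes adequate in practice.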
34

Fatoretto, Maíra Blumer. "Modelos para dados categorizados ordinais com efeito aleatório: uma aplicação à análise sensorial." Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-16032016-170135/.

Abstract:
Models for ordinal categorical data are extensions of generalized linear models, and their assumptions and inferences are based on this class of models. Cumulative logit models, in which the link function consists of accumulated probabilities, are widely used for this type of variable. One of their simplifications is the proportional odds model, in which there is a linear growth in odds ratios for all covariates in the model; in this case, however, the parallelism assumption must be checked. Other models, such as the partial proportional odds model, the adjacent-categories logit model and the continuation-ratio logit model, can also be used. In several such studies, mixed models are required, either because of the type of a factor or because of dependence between observations of the response variable. The aim of this work is to study models for an ordinal response variable with the inclusion of one or more random effects. These models are illustrated using real sensory analysis data in which the response variable is an ordinal scale and the goal is to determine which of two varieties of dried tomatoes, Italian and Sweet Grape, was better accepted by consumers. In this experiment, the panelists evaluated each variety once, and the replicates consisted of the ratings given by different tasters. In this case, a random effect per taster is required so that the model can capture the differences between these untrained tasters. The proportional odds model fitted the data satisfactorily, so the estimated probabilities and odds ratios can be used to interpret the results, leading to the conclusion that the taste of the Sweet Grape variety was the one that most pleased the tasters, regardless of sex.
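As an illustration of the proportional odds structure described above, the following minimal sketch computes category probabilities under the parameterization logit P(Y <= j) = theta_j - eta, where eta would collect the covariate effects and, in a mixed model, the taster's random effect. The sign convention, function names and numeric values are assumptions for illustration; software packages differ on the sign of eta.

```python
import math


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def category_probabilities(thresholds, eta):
    """Category probabilities under a cumulative logit model with
    logit P(Y <= j) = theta_j - eta, for increasing thresholds theta_j."""
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Hypothetical thresholds for a four-point scale and two linear predictors.
thresholds = [-1.0, 0.0, 1.5]
print(category_probabilities(thresholds, eta=0.0))
print(category_probabilities(thresholds, eta=1.0))
```

Under this form the cumulative odds ratio between two values of eta is the same at every threshold, which is the parallelism assumption the abstract says must be checked.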
35

Hao, Chengcheng. "Explicit Influence Analysis in Crossover Models." Doctoral thesis, Stockholms universitet, Statistiska institutionen, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-107703.

Abstract:
This dissertation develops influence diagnostics for crossover models. Mixed linear models and generalised mixed linear models are utilised to investigate continuous and count data from crossover studies, respectively. For both types of models, changes in the maximum likelihood estimates of parameters, particularly in the estimated treatment effect, due to minor perturbations of the observed data, are assessed. The novelty of this dissertation lies in the analytical derivation of influence diagnostics using decompositions of the perturbed mixed models. Consequently, the suggested influence diagnostics, referred to as the delta-beta and variance-ratio influences, provide new findings about how the constructed residuals affect the estimation in terms of different parameters of interest. The delta-beta and variance-ratio influence in three different crossover models are studied in Chapters 5-6, respectively. Chapter 5 analyses the influence of subjects in a two-period continuous crossover model. Possible problems with observation-level perturbations in crossover models are discussed. Chapter 6 extends the approach to higher-order crossover models. Furthermore, not only the individual delta-beta and variance-ratio influences of a subject are derived, but also the joint influences of two subjects from different sequences. Chapters 5-6 show that the delta-beta and variance-ratio influences of a particular parameter are decided by the special linear combination of the constructed residuals. In Chapter 7, explicit delta-beta influence on the estimated treatment effect in the two-period count crossover model is derived. The influence is related to the Pearson residuals of the subject. Graphical tools are developed to visualise information of influence concerning crossover models for both continuous and count data. Illustrative examples are provided in each chapter.
36

Menezes, Renee Xavier de. "More useful standard errors for group and factor effects in generalized linear models." Thesis, University of Oxford, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.302362.

37

Hu, Shuwen. "Statistical modeling and machine learning in longitudinal data analysis." Thesis, Queensland University of Technology, 2021. https://eprints.qut.edu.au/211253/1/Shuwen_Hu_Thesis.pdf.

Abstract:
This thesis mainly concerns the statistical modelling and machine learning methods for the analysis of longitudinal data. As a contribution to this area, this thesis provides theoretical discussion and empirical illustrations of longitudinal data analysis. The first contribution is developing methods to obtain robust and efficient variance estimators when the cluster size is large. The second one is comparing a traditional parametric approach, the linear mixed model with machine learning methods in longitudinal data analysis. The last one is extracting new features to improve sheep behaviour classification accuracy of different machine learning algorithms.
38

Pinto, João Pedro Senhorães Senra. "New credibility approaches in workers compensation insurance." Master's thesis, Instituto Superior de Economia e Gestão, 2015. http://hdl.handle.net/10400.5/10853.

Abstract:
Master's in Actuarial Science
In our report, several interpretations of Bühlmann credibility are applied to the workers compensation portfolio of a Portuguese insurance company. We begin with classical implementations of the Bühlmann-Straub and Jewell models, and then present a more recent reading of those models as linear mixed models. We end by presenting two approaches that show how Bühlmann credibility can enhance the performance of generalized linear models.
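For readers unfamiliar with credibility theory, a minimal sketch of the classical Bühlmann premium follows. It assumes equal exposure in every period and known structural parameters, so it is simpler than the Bühlmann-Straub and Jewell models the report implements; the function name and the numbers are hypothetical.

```python
def buhlmann_premium(claims, collective_mean, expected_process_variance,
                     variance_hypothetical_means):
    """Classical Buhlmann credibility premium for one risk.

    Z = n / (n + k), with k = EPV / VHM, and
    premium = Z * individual mean + (1 - Z) * collective mean.
    """
    n = len(claims)
    k = expected_process_variance / variance_hypothetical_means
    z = n / (n + k)
    individual_mean = sum(claims) / n
    premium = z * individual_mean + (1.0 - z) * collective_mean
    return premium, z


# Hypothetical five-year claims history for one policyholder.
premium, z = buhlmann_premium(
    claims=[100.0, 120.0, 80.0, 100.0, 100.0],
    collective_mean=150.0,
    expected_process_variance=400.0,
    variance_hypothetical_means=100.0,
)
print(f"credibility factor Z = {z:.3f}, premium = {premium:.2f}")
```

The credibility factor grows with the number of observed periods, so a longer claims history pulls the premium toward the risk's own experience and away from the collective mean.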
39

Lee, Min Cherng. "Multiple imputation for missing data and statistical disclosure control for mixed-mode data using a sequence of generalised linear models." Thesis, University of Southampton, 2014. https://eprints.soton.ac.uk/366481/.

Abstract:
Multiple imputation is a commonly used approach to deal with missing data and to protect confidentiality of public use data sets. The basic idea is to replace the missing values or sensitive values with multiple imputations and then release the multiply imputed data sets to the public. Users can analyze the multiply imputed data sets and obtain valid inferences by using simple combining rules, which take the uncertainty due to the presence of missing values and synthetic values into account. It is crucial that imputations are drawn from the posterior predictive distribution to preserve relationships present in the data and allow valid conclusions to be made from any analysis. In data sets with different types of variables, e.g. some categorical and some continuous variables, multivariate imputation by chained equations (MICE) (Van Buuren, 2011) is a commonly used multiple imputation method. However, imputations from such an approach are not necessarily drawn from a proper posterior predictive distribution. We propose a method, called the factored regression model (FRM), to multiply impute missing values in such data sets by modelling the joint distribution of the variables in the data through a sequence of generalised linear models. We use data augmentation methods to connect the categorical and continuous variables, and this allows us to draw imputations from a proper posterior distribution. We compare the performance of our method with MICE using simulation studies and a breastfeeding data set. We also extend our modelling strategies to incorporate different informative priors for the FRM to explore robust regression modelling and the sparse relationships between the predictors. We then apply our model to protect confidentiality of the Current Population Survey (CPS) data by generating multiply imputed, partially synthetic data sets. These data sets comprise a mix of original data and synthetic data, where values chosen for synthesis are based on an approach that considers unique and sensitive units in the survey. Valid inference can then be made using the combining rules described by Reiter (2003). An extension to the modelling strategy is also introduced to deal with the presence of spikes at zero in some of the continuous variables in the CPS data.
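The "simple combining rules" mentioned in this abstract can be sketched for the missing-data case using the standard rules usually attributed to Rubin: average the point estimates, then add the within-imputation variance to an inflated between-imputation variance. This is an illustrative assumption about which rules are meant; the rules described by Reiter (2003) for partially synthetic data use a different total variance and are not reproduced here. The function name and inputs are hypothetical.

```python
from statistics import mean, variance


def rubin_combine(estimates, variances):
    """Combine m completed-data analyses of a scalar quantity.

    Returns the pooled point estimate and its total variance
    T = W + (1 + 1/m) * B, where W is the average within-imputation
    variance and B is the between-imputation variance.
    """
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)
    between = variance(estimates)  # sample variance, divisor m - 1
    total = within + (1.0 + 1.0 / m) * between
    return pooled_estimate, total


# Hypothetical results from three imputed data sets.
q_bar, t_var = rubin_combine(estimates=[1.0, 2.0, 3.0], variances=[0.5, 0.5, 0.5])
print(f"pooled estimate = {q_bar:.3f}, total variance = {t_var:.4f}")
```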
40

Wang, Yu. "A study on the type I error rate and power for generalized linear mixed model containing one random effect." Kansas State University, 2017. http://hdl.handle.net/2097/35301.

Abstract:
Master of Science
Department of Statistics
Christopher Vahl
In animal health research, it is quite common for a clinical trial to be designed to demonstrate the efficacy of a new drug where a binary response variable is measured on an individual experimental animal (i.e., the observational unit). However, the investigational treatments are applied to groups of animals instead of an individual animal. This means the experimental unit is the group of animals and the response variable could be modeled with the binomial distribution. Also, the responses of animals within the same experimental unit may then be statistically dependent on each other. The usual logit model for a binary response assumes that all observations are independent. In this report, a logit model with a random error term representing the group of animals is considered. This model belongs to a class of models referred to as generalized linear mixed models and is commonly fit using the SAS System procedure PROC GLIMMIX. Furthermore, practitioners often adjust the denominator degrees of freedom of the test statistic produced by PROC GLIMMIX using one of several different methods. In this report, a simulation study was performed over a variety of different parameter settings to compare the effects on the type I error rate and power of two methods for adjusting the denominator degrees of freedom, namely “DDFM = KENWARDROGER” and “DDFM = NONE”. Despite its reputation for fine performance in linear mixed models with normally distributed errors, the “DDFM = KENWARDROGER” option tended to perform poorly more often than the “DDFM = NONE” option in the logistic regression model with one random effect.
41

Costa, Silvano Cesar da. "Modelos lineares generalizados mistos para dados longitudinais." Universidade de São Paulo, 2003. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-09052003-164143/.

Abstract:
Experiments whose response variables are proportions or counts are very common in several research areas, especially in agriculture. The well-established theory of generalized linear models (McCullagh & Nelder, 1989; Demetrio, 2001) is used to analyze these experiments when the responses are independent. If the estimated variance is greater than the variance expected under the model, a dispersion parameter is included in the parameter estimation process. When the response variable is observed over time, correlation among observations may occur and should be taken into account in the parameter estimation. One way of dealing with this correlation is the methodology of generalized estimating equations (GEEs) discussed by Liang & Zeger (1986); in this case the interest is in the estimates of the fixed effects, and the inclusion of a working correlation matrix is useful for obtaining more accurate estimates. Another alternative is to include a latent effect in the linear predictor to explain variability not considered in the model that might influence the results. In this work the random effect and the dispersion parameter are combined and included together in the parameter estimation. This methodology is applied to a data set from an experiment with camu-camu that used the proportion of successful seedling grafts to evaluate which kinds of grafting and understock are suitable. Several models are fitted, from the split-plot model (with the independence assumption) to the model in which the dispersion parameter and the random effect are considered together. There is evidence that the model including the random effect and the dispersion parameter together produces better parameter estimates. Another longitudinal data set used here comes from an experiment with MON810 transgenic corn in which the response variable is the number of caterpillars (Spodoptera frugiperda). In this case, because of the excessive number of zeros, the zero-inflated Poisson (ZIP) regression model is used in addition to the standard Poisson model, in which observations are considered independent, and the zero-inflated Poisson regression model with a random effect. The results show that the random effect included in the linear predictor was not significant and, therefore, the adopted model is the zero-inflated Poisson regression model. The results were obtained using the procedures NLMIXED, GENMOD and GPLOT available in SAS (Statistical Analysis System), version 8.2.
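The zero-inflated Poisson model named in this abstract mixes a point mass at zero with an ordinary Poisson count. The following is a minimal Python sketch of that probability function and its log-likelihood, assuming a constant rate `lam` and a constant zero-inflation probability `pi`; in a ZIP regression both would depend on covariates (typically through log and logit links). The function names are hypothetical and this is not the SAS code used in the thesis.

```python
import math


def zip_pmf(y, lam, pi):
    """P(Y = y) for a zero-inflated Poisson variable.

    With probability pi the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean lam.
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        # Zeros arise from both the structural part and the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_loglik(counts, lam, pi):
    """Log-likelihood of independent counts under the ZIP model."""
    return sum(math.log(zip_pmf(y, lam, pi)) for y in counts)
```

Setting `pi = 0` recovers the standard Poisson model, which is why the two can be compared as nested alternatives when many zeros are observed.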
42

Bautista, Ezequiel Abraham López. "Modelos lineares mistos e generalizados mistos em estudos de adaptação local e plasticidade fenotípica de Euterpe edulis." Universidade de São Paulo, 2014. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-11092014-170903/.

Abstract:
The aim of this work was to evaluate the presence of phenotypic plasticity and local adaptation in three provenances of the palm species Euterpe edulis (Atlantic Rainforest, Seasonally Dry Forest and Restinga Forest) in permanent plots located in three forest types of São Paulo State, Brazil: Parque Estadual da Ilha do Cardoso, Parque Estadual de Carlos Botelho and Estação Ecológica dos Caetetus. Two kinds of trial were run: adaptation at establishment (sowing) and adaptation in juveniles (growth). The data sets were analyzed using structures of groups of experiments, with crossed and nested effects. The variables related to plant dry matter mass in both trials were analyzed with the linear mixed models (LMM) approach, through the incorporation of random-effect factors, using the restricted maximum likelihood (REML) method to estimate the variance components associated with these factors with less bias. The proportion of germinated seeds in the establishment trial, on the other hand, was analyzed with the generalized linear mixed models (GLMM) approach, under the assumption that the variable follows a binomial distribution, with a logit link function. The pseudo-likelihood (PL) method was used to obtain the solution of the likelihood equations. The results showed that plants grown from seeds of the three biomes evaluated behaved plastically for all traits assessed in the establishment trial. In the juvenile adaptation trial, plasticity was observed only for leaf dry matter mass in plants from the Seasonally Dry Forest biome. Local adaptation was clearly observed in the establishment trial. These results showed that, at each site evaluated, plants originating from seeds of different provenances behaved differently in the traits related to dry matter mass, which in some cases may reflect local adaptation. It was concluded that the Carlos Botelho and Ilha do Cardoso sites are the most favorable for the germination of seeds of their own provenance.
43

Eldridge, James Vincent. "Landscape ecology of the lesser grain borer, Rhyzopertha dominica." Thesis, Queensland University of Technology, 2014. https://eprints.qut.edu.au/69538/2/James_Eldridge_Thesis.pdf.

Abstract:
The Lesser Grain Borer is a major pest of stored grain with a global distribution. This project has, for the first time, recorded this pest across broad spatial areas, tens of kilometres from grain production or storage. Statistical analysis revealed that factors such as ambient temperature and the availability of food resources affect R. dominica differently in different habitats. This suggests that, contrary to the prevailing view, this pest is not solely dependent on stored wheat and can persist across a range of habitats. These findings have important management implications for Australia's wheat industry.
44

Rodríguez, Sanz Maica 1974. "Evolución de las desigualdades socioeconómicas en la mortalidad prematura en los barrios de Barcelona." Doctoral thesis, Universitat Pompeu Fabra, 2017. http://hdl.handle.net/10803/664848.

Abstract:
The objective is to analyze trends in socioeconomic inequalities in mortality in the neighborhoods of Barcelona, taking into account the population changes that have occurred. Three studies were carried out. The first is a review of the use of area-level socioeconomic indicators in Spain and their association with health and health inequalities. The second is an analysis of twenty years of trends in socioeconomic inequalities in premature mortality in the neighborhoods of Barcelona, accounting for immigration into the neighborhoods. The third is an analysis of ten years of trends in socioeconomic inequalities in premature mortality in the neighborhoods of Barcelona, in the native and foreign-born populations. In Barcelona, socioeconomic inequalities between neighborhoods persist, and there is an excess of premature mortality in the most disadvantaged neighborhoods. In recent years these inequalities have tended to diminish, in part because of the arrival of immigrants in disadvantaged neighborhoods. The foreign-born population has lower premature mortality than the native population and shows no inequalities between neighborhoods.
45

Baker, Jannah F. "Bayesian spatiotemporal modelling of chronic disease outcomes." Thesis, Queensland University of Technology, 2017. https://eprints.qut.edu.au/104455/1/Jannah_Baker_Thesis.pdf.

Abstract:
This thesis contributes to Bayesian spatial and spatiotemporal methodology by investigating techniques for spatial imputation and joint disease modelling, and identifies high-risk individual profiles and geographic areas for type II diabetes mellitus (DMII) outcomes. DMII and related chronic conditions including hypertension, coronary arterial disease, congestive heart failure and chronic obstructive pulmonary disease are examples of ambulatory care sensitive conditions for which hospitalisation for complications is potentially avoidable with quality primary care. Bayesian spatial and spatiotemporal studies are useful for identifying small areas that would benefit from additional services to detect and manage these conditions early, thus avoiding costly sequelae.
46

Shen, Xia. "Novel Statistical Methods in Quantitative Genetics : Modeling Genetic Variance for Quantitative Trait Loci Mapping and Genomic Evaluation." Doctoral thesis, Uppsala universitet, Beräknings- och systembiologi, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-170091.

Abstract:
This thesis develops and evaluates statistical methods for different types of genetic analyses, including quantitative trait loci (QTL) analysis, genome-wide association study (GWAS), and genomic evaluation. The main contribution of the thesis is to provide novel insights into modeling genetic variance, especially via random effects models. In variance component QTL analysis, a full likelihood model accounting for uncertainty in the identity-by-descent (IBD) matrix was developed. It was found to correct the bias in genetic variance component estimation and to gain power in QTL mapping in terms of precision. Double hierarchical generalized linear models, and a non-iterative simplified version, were implemented and applied to fit data from an entire genome. These whole-genome models were shown to perform well in both QTL mapping and genomic prediction. A re-analysis of a publicly available GWAS data set identified significant loci in Arabidopsis that control phenotypic variance instead of the mean, which validated the idea of variance-controlling genes. The work in the thesis is accompanied by R packages available online, including a general statistical tool for fitting random effects models (hglm), an efficient generalized ridge regression for high-dimensional data (bigRR), a double-layer mixed model for genomic data analysis (iQTL), a stochastic IBD matrix calculator (MCIBD), a computational interface for QTL mapping (qtl.outbred), and a GWAS analysis tool for mapping variance-controlling loci (vGWAS).
47

Sabangan, Rainier Monteclaro. "Identification and Estimation of Location and Dispersion Effects in Unreplicated 2k-p Designs Using Generalized Linear Models." Bowling Green State University / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1269014397.

48

Letsoalo, Marothi Peter. "Assessing variance components of multilevel models pregnancy data." Thesis, University of Limpopo, 2019. http://hdl.handle.net/10386/2873.

Abstract:
Thesis (M. Sc. (Statistics))
Most social and health science data are longitudinal and also multilevel in nature, which means that response data are grouped by attributes of some cluster. Ignoring the differences and similarities generated by these clusters results in misleading estimates, which motivates the need to assess variance components (VCs) using multilevel models (MLMs) or generalised linear mixed models (GLMMs). This study explored and fitted teenage pregnancy census data gathered from 2011 to 2015 by the Africa Centre in Kwa-Zulu Natal, South Africa. Exploration of these data revealed a two-level pure hierarchy data structure, with teenage pregnancy status for some years nested within female teenagers. To fit these data, the effects on teenage pregnancy of census year (year) and three female characteristics, namely age (age), number of household memberships (idhhms) and number of children before the observation year (nch), were examined. Model building first fitted a logit generalised linear model (GLM) under the assumption that teenage pregnancy measurements are independent between females, and then fitted a GLMM or MLM with a female random effect. The better-fitting GLMM indicated that each additional census year decreased the log odds of teenage pregnancy by 0.203, whereas the GLM suggested a 0.21 decrease and a 0.557 increase for each additional year of age and census year, respectively. A GLM with only a year effect produced a fixed-effect estimate 0.04 higher than that of the better-fitting GLMM. The inconsistency in the effect of year was caused by a significant female cluster variance of approximately 0.35, which was used to compute the VCs. Given the effect of year, the VCs suggested that 9.5% of the variation in teenage pregnancy lies between females, and that the similarity between measurements on the same female is 0.095 (on a scale from 0 to 1). It was also revealed that year does not vary within females. Apart from the small differences between the estimates of the fitted GLM and GLMM, this work produced evidence that accounting for the cluster effect improves the accuracy of estimates.
Keywords: Multilevel Model, Generalised Linear Mixed Model, Variance Components, Hierarchical Data Structure, Social Science Data, Teenage Pregnancy
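The variance components quoted in this abstract can be checked with a short calculation. The following is a minimal sketch, assuming the common latent-variable definition of the intraclass correlation for a random-intercept logit model, in which the level-1 residual variance is fixed at π²/3. The abstract does not state which formula the thesis used, and `logistic_icc` is a hypothetical helper name.

```python
import math


def logistic_icc(cluster_variance):
    """Intraclass correlation for a random-intercept logistic model.

    Assumes the latent-variable formulation, where the level-1 residual
    follows a standard logistic distribution with variance pi^2 / 3.
    """
    residual_variance = math.pi ** 2 / 3
    return cluster_variance / (cluster_variance + residual_variance)


# Female-level variance of roughly 0.35, as quoted in the abstract.
icc = logistic_icc(0.35)
print(round(icc, 3))
```

With the rounded variance of 0.35 this gives roughly 0.096, close to the 9.5% reported, which would be consistent with an unrounded variance estimate slightly below 0.35.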
49

Johnson, Nels Gordon. "Semiparametric Regression Methods with Covariate Measurement Error." Diss., Virginia Tech, 2012. http://hdl.handle.net/10919/49551.

Abstract:
In public health, biomedical, epidemiological, and other applications, data collected are often measured with error. When mismeasured data are used in a regression analysis, not accounting for the measurement error can lead to incorrect inference about the relationships between the covariates and the response. We investigate measurement error in the covariates of two types of regression models. For each we propose a fully Bayesian approach that treats the variable measured with error as a latent variable to be integrated over, and a semi-Bayesian approach that uses a first-order Laplace approximation to marginalize the variable measured with error out of the likelihood.

The first model is the matched case-control study for analyzing clustered binary outcomes. We develop low-rank thin plate splines for the case where a variable measured with error has an unknown, nonlinear relationship with the response. In addition to the semi- and fully Bayesian approaches, we propose another using expectation-maximization to detect both parametric and nonparametric relationships between the covariates and the binary outcome. We assess the performance of each method via simulation in terms of mean squared error and mean bias. We illustrate each method on a perturbed example of a 1-4 matched case-control study.

The second regression model is the generalized linear model (GLM) with unknown link function. Usually, the link function is chosen by the user based on the distribution of the response variable, often to be the canonical link. However, when covariates are measured with error, incorrect inference as a result of the error can be compounded by incorrect choice of link function. We assess performance via simulation of the semi- and fully Bayesian methods in terms of mean squared error. We illustrate each method on the Framingham Heart Study dataset.

The simulation results for both regression models indicate that the fully Bayesian approach is at least as good as the semi-Bayesian approach for adjusting for measurement error, particularly when the distribution of the variable measured with error and the distribution of the measurement error are misspecified.

Ph. D.
50

Carrico, Robert. "Unbiased Estimation for the Contextual Effect of Duration of Adolescent Height Growth on Adulthood Obesity and Health Outcomes via Hierarchical Linear and Nonlinear Models." VCU Scholars Compass, 2012. http://scholarscompass.vcu.edu/etd/2817.

Abstract:
This dissertation has multiple aims in studying hierarchical linear models in biomedical data analysis. In Chapter 1, the novel idea of studying the durations of adolescent growth spurts as a predictor of adulthood obesity is defined, established, and illustrated. The concept of contextual effects modeling is introduced in this first chapter as we study the secular trend of adulthood obesity and how this trend is mitigated by the durations of individual adolescent growth spurts and the secular average length of adolescent growth spurts. It is found that individuals with longer periods of fast height growth in adolescence are more prone to having favorable BMI profiles in adulthood. In Chapter 2 we study the estimation of contextual effects in a hierarchical generalized linear model (HGLM). We simulate data and compare using the higher-level group sample mean as the estimate of the true mean with using an Empirical Bayes (EB) approach (Shin and Raudenbush 2010). We make this comparison for logistic, probit, log-linear, ordinal and nominal regression models. We find that in general the EB estimate yields a parameter estimate much closer to the true value, except for cases with very small variability in the upper level, where the situation is more complicated and there is likely no need for contextual effects analysis. In Chapter 3 the HGLM studies are made clearer with large-scale simulations. These large-scale simulations are shown for logistic regression and probit regression models for binary outcome data. With repetition we are able to establish coverage percentages of the confidence intervals of the true contextual effect. Coverage percentages show the percentage of simulations that have confidence intervals containing the true parameter values. Results confirm observations from the preliminary simulations in the previous chapter, and an accompanying example of adulthood hypertension shows how these results can be used in an application.
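The coverage percentage described in this abstract is the share of simulated data sets whose confidence interval contains the true parameter. The following is a minimal Python sketch of that idea, assuming a much simpler setting than the dissertation's HGLM: a 95% interval for a normal mean with known standard deviation. The function name and all default values are hypothetical.

```python
import random


def ci_coverage(true_mean=1.0, sigma=2.0, n=30, reps=2000, seed=0):
    """Fraction of simulated 95% confidence intervals that contain true_mean."""
    rng = random.Random(seed)
    z = 1.959964  # 97.5th percentile of the standard normal distribution
    half_width = z * sigma / n ** 0.5  # known-sigma interval half-width
    hits = 0
    for _ in range(reps):
        # Simulate one data set and compute its sample mean.
        sample_mean = sum(rng.gauss(true_mean, sigma) for _ in range(n)) / n
        # The interval covers the truth when the mean is within half_width.
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / reps
```

A well-calibrated interval should give a value near 0.95; coverage clearly below the nominal level signals biased estimates or understated standard errors.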