Dissertations / Theses on the topic 'Gibbs sampler'

Consult the top 50 dissertations / theses for your research on the topic 'Gibbs sampler.'

1

Chimisov, Cyril. "Adapting the Gibbs sampler." Thesis, University of Warwick, 2018. http://wrap.warwick.ac.uk/108829/.

Abstract:
In the present thesis, we close a methodological gap in optimising basic Markov Chain Monte Carlo algorithms. Similarly to the straightforward and computationally efficient optimisation criteria for the Metropolis algorithm acceptance rate (and, equivalently, proposal scale), we develop criteria for optimising the selection probabilities of the Random Scan Gibbs Sampler. We develop a general-purpose Adaptive Random Scan Gibbs Sampler that adapts the selection probabilities gradually, as further information is accrued by the sampler. We argue that Adaptive Random Scan Gibbs Samplers can be routinely implemented and that substantial computational gains will be observed across many typical Gibbs sampling problems. Additionally, motivated by the need for theory to analyse the convergence properties of the Adaptive Gibbs Sampler, we introduce a class of Adapted Increasingly Rarely Markov Chain Monte Carlo (AirMCMC) algorithms, where the underlying Markov kernel is allowed to change based on the whole available chain output, but only at specific time points separated by an increasing number of iterations. The main motivation is the ease of analysis of such algorithms. Under regularity assumptions, we prove Mean Square Error convergence, Weak and Strong Laws of Large Numbers, and a Central Limit Theorem, and discuss how our approach extends the existing results. We argue that many of the known Adaptive MCMC algorithms may be transformed into corresponding Air versions, and provide empirical evidence that the performance of the Air versions remains virtually the same.
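
To make the idea concrete, here is a minimal sketch (not the thesis's actual algorithm) of a random-scan Gibbs sampler whose coordinate-selection probabilities are re-tuned at increasingly rare time points. The bivariate normal target, the doubling adaptation schedule and the variance-based tuning heuristic are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
s = np.array([1.0, 3.0])       # marginal std devs of an assumed bivariate normal target
rho = 0.9                      # correlation of the target
p = np.array([0.5, 0.5])       # random-scan selection probabilities (to be adapted)
x = np.zeros(2)
chain, next_adapt = [], 100

for t in range(1, 20001):
    i = int(rng.choice(2, p=p))            # pick a coordinate at random
    j = 1 - i
    # exact full conditional of x_i given x_j for a zero-mean bivariate normal
    cond_mean = rho * (s[i] / s[j]) * x[j]
    cond_sd = s[i] * np.sqrt(1.0 - rho**2)
    x[i] = cond_mean + cond_sd * rng.standard_normal()
    chain.append(x.copy())
    if t == next_adapt:                    # adapt increasingly rarely ("Air" flavour)
        v = np.var(np.array(chain), axis=0)
        p = v / v.sum()                    # heuristic: visit higher-variance coordinates more
        next_adapt *= 2                    # gaps between adaptation times grow
```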
2

Pang, Wan-Kai. "Modelling ordinal categorical data : a Gibbs sampler approach." Thesis, University of Southampton, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.323876.

3

Fair, Shannon Marie. "A Bayesian Meta-Analysis Using the Gibbs Sampler." UNF Digital Commons, 1998. http://digitalcommons.unf.edu/etd/87.

Abstract:
A meta-analysis is the combination of results from several similar studies, conducted by different scientists, in order to arrive at a single, overall conclusion. Unlike common experimental procedures, the data used in a meta-analysis are the descriptive statistics from the distinct individual studies. In this thesis, we consider two regression studies performed by two scientists. These studies have one common dependent variable, Y, and one or more common independent variables, X. A regression of Y on X together with other independent variables is carried out in both studies. We estimate the regression coefficients of X meta-analytically. After combining the two studies, we derive a single regression model. Some observations are available to one scientist but not to the other; these missing observations are treated as parameters and estimated using a method called Gibbs sampling.
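
As a hedged illustration of the general device (not the thesis's exact model), the sketch below alternates between imputing missing covariate values and updating a regression coefficient. The standard-normal covariate model, known error variance and flat slope prior are simplifying assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, b_true, sigma = 100, 2.0, 1.0
x = rng.standard_normal(n)                 # covariate, assumed N(0, 1)
y = b_true * x + sigma * rng.standard_normal(n)
miss = rng.random(n) < 0.3                 # 30% of the x-values unobserved
x_cur = np.where(miss, 0.0, x)             # start imputations at 0
b = 0.0

for it in range(2000):
    # 1) impute each missing x_i from its full conditional: the N(0, 1) prior
    #    combined with the likelihood N(y_i; b * x_i, sigma^2)
    prec = 1.0 + b**2 / sigma**2
    mean = (b * y[miss] / sigma**2) / prec
    x_cur[miss] = mean + rng.standard_normal(miss.sum()) / np.sqrt(prec)
    # 2) draw the slope b from its full conditional given the completed data
    #    (flat prior on b, error variance sigma^2 treated as known)
    sxx = np.sum(x_cur**2)
    b = rng.normal(np.sum(x_cur * y) / sxx, sigma / np.sqrt(sxx))
```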
4

Zhang, Zuoshun. "Proper posterior distributions for some hierarchical models and roundoff effects in the Gibbs sampler." Digital version accessible at http://wwwlib.umi.com/cr/utexas/main, 2000.

5

Tan, Aixin. "Convergence rates and regeneration of the block Gibbs sampler for Bayesian random effects models." [Gainesville, Fla.] : University of Florida, 2009. http://purl.fcla.edu/fcla/etd/UFE0024910.

6

Al-Hamzawi, Rahim Jabbar Thaher. "Prior elicitation and variable selection for bayesian quantile regression." Thesis, Brunel University, 2013. http://bura.brunel.ac.uk/handle/2438/7501.

Abstract:
Bayesian subset selection suffers from three important difficulties: assigning priors over model space, assigning priors to all components of the regression coefficient vector given a specific model, and Bayesian computational efficiency (Chen et al., 1999). These difficulties become more challenging in the Bayesian quantile regression framework when one is interested in assigning priors that depend on different quantile levels. The objective of Bayesian quantile regression (BQR), a relatively new tool, is to deal with unknown parameters and model uncertainty in quantile regression (QR). However, Bayesian subset selection in quantile regression models is usually a difficult issue due to the computational challenges and the non-availability of conjugate prior distributions that depend on the quantile level. These challenges are rarely addressed via either penalised likelihood functions or stochastic search variable selection (SSVS). These methods typically use symmetric prior distributions for regression coefficients, such as the Gaussian and Laplace, which may be suitable for median regression. However, an extreme quantile regression should have different regression coefficients from the median regression, and thus the priors for quantile regression coefficients should depend on the quantile. This thesis focuses on three challenges: assigning standard quantile-dependent prior distributions for the regression coefficients, assigning suitable quantile-dependent priors over model space, and achieving computational efficiency. The first of these challenges is studied in Chapter 2, in which a quantile-dependent prior elicitation scheme is developed. In particular, an extension of Zellner's prior which allows for a conditionally conjugate and quantile-dependent prior for Bayesian quantile regression is proposed. The prior is generalised in Chapter 3 by introducing a ridge parameter to address important challenges that may arise in some applications, such as multicollinearity and overfitting. The proposed prior is also used in Chapter 4 for subset selection of the fixed and random coefficients in a linear mixed-effects QR model. In Chapter 5 we specify normal-exponential prior distributions for the regression coefficients, which can provide adaptive shrinkage and represent an alternative to the Bayesian Lasso quantile regression model. For the second challenge, we assign a quantile-dependent prior over model space in Chapter 2. The prior is based on the percentage bend correlation, which depends on the quantile level. This prior is novel and is used in Bayesian regression for the first time. For the third challenge of computational efficiency, Gibbs samplers are derived and set up to facilitate the computation of the proposed methods. In addition to the three major challenges above, this thesis also addresses other important issues such as regularisation in quantile regression and selecting both random and fixed effects in mixed quantile regression models.
7

Yankovskyy, Yevhen. "Application of a Gibbs Sampler to estimating parameters of a hierarchical normal model with a time trend and testing for existence of the global warming." Manhattan, Kan. : Kansas State University, 2008. http://hdl.handle.net/2097/1010.

8

Xu, Zhiqing. "Bayesian Inference of a Finite Population under Selection Bias." Digital WPI, 2014. https://digitalcommons.wpi.edu/etd-theses/621.

Abstract:
Length-biased sampling yields samples from a weighted distribution. Given the underlying distribution of the population, one can estimate the attributes of the population by converting the weighted samples. In this thesis, the generalized gamma distribution is considered as the underlying distribution of the population, and inference for the weighted distribution is made. Models with both known and unknown finite population size are considered. In the model with known finite population size, maximum likelihood estimation and bootstrapping methods are used to derive the distributions of the parameters and the population mean. For the sake of comparison, models both with and without the selection bias are built. The computer simulation results show that the model with selection bias gives better predictions for the population mean. In the model with unknown finite population size, the distributions of the population size as well as the sample complements are derived. Bayesian analysis is performed using numerical methods. Both the Gibbs sampler and a random sampling method are employed to generate the parameters from their joint posterior distribution. The fit of the size-biased samples is checked using the conditional predictive ordinate.
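
The mechanics of length-biased (size-biased) sampling can be illustrated with a small simulation. The plain gamma population and the harmonic-mean correction below are illustrative stand-ins, not the thesis's generalized gamma analysis:

```python
import numpy as np

rng = np.random.default_rng(2)
# stand-in population: gamma, a special case of the generalized gamma
pop = rng.gamma(shape=2.0, scale=1.5, size=100_000)
w = pop / pop.sum()                        # length-biased: inclusion probability ∝ x
sample = rng.choice(pop, size=5_000, p=w)
print(pop.mean(), sample.mean())           # the biased sample overestimates the mean
# under the weighted density x f(x) / mu, E[1/X] = 1/mu, so the
# harmonic mean of the biased sample recovers the population mean
print(1.0 / np.mean(1.0 / sample))
```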
9

Plassmann, Florenz. "The Impact of Two-Rate Taxes on Construction in Pennsylvania." Diss., Virginia Tech, 1997. http://hdl.handle.net/10919/30622.

Abstract:
The evaluation of policy-relevant economic research requires an ethical foundation. Classical liberal theory provides the requisite foundation for this dissertation, which uses various econometric tools to estimate the effects of shifting some of the property tax from buildings to land in 15 cities in Pennsylvania. Economic theory predicts that such a shift will lead to higher building activity; so far, however, this prediction has received little empirical support.

The first part of the dissertation examines the effect of the land-building tax differential on the number of building permits that were issued in 219 municipalities in Pennsylvania between 1972 and 1994. For such count data a conventional analysis based on a continuous distribution leads to incorrect results; a discrete maximum likelihood analysis with a negative binomial distribution is more appropriate. Two models, a non-linear and a fixed effects model, are developed to examine the influence of the tax differential. Both models suggest that this influence is positive, albeit not statistically significant.

Application of maximum likelihood techniques is computationally cumbersome if the assumed distribution of the data cannot be written in closed form. The negative binomial distribution is the only discrete distribution with a variance that is larger than its mean that can easily be applied, although it might not be the best approximation of the true distribution of the data. The second part of the dissertation uses a Markov Chain Monte Carlo method to examine the influence of the tax differential on the number of building permits, under the assumption that building permits are generated by a Poisson process whose parameter varies lognormally. Contrary to the analysis in the first part, the tax is shown to have a strong and significantly positive impact on the number of permits.

The third part of the dissertation uses a fixed-effects weighted least squares method to estimate the effect of the tax differential on the value per building permit. The tax coefficient is not significantly different from zero. Still, the overall impact of the tax differential on the total value of construction is shown to be positive and statistically significant.


10

Cao, Jun. "A Random-Linear-Extension Test Based on Classic Nonparametric Procedures." Diss., Temple University Libraries, 2009. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/48271.

Abstract:
Most distribution-free nonparametric methods depend on the ranks or orderings of the individual observations. This dissertation develops methods for situations in which only partial information about the ranks is available. A random-linear-extension exact test and an empirical version of the random-linear-extension test are proposed as a new way to compare groups of data with partial orders. The basic computational procedure is to generate all possible permutations constrained by the known partial order, using a randomization method similar in nature to multiple imputation. This random-linear-extension test can be implemented simply using a Gibbs sampler to generate a random sample of complete orderings. Given a complete ordering, standard nonparametric methods, such as the Wilcoxon rank-sum test, can be applied, and the corresponding test statistics and rejection regions can be calculated. As a direct result of the new method, a single p-value is replaced by a distribution of p-values. This is related to recent work on fuzzy p-values, introduced by Geyer and Meeden in Statistical Science in 2005. A special case is the comparison of two groups when only two objects can be compared at a time. Three matching schemes (random matching, ordered matching and reverse matching) are introduced and compared. The results described in this dissertation provide some surprising insights into the statistical information in partial orderings.
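
The following sketch conveys the flavour of the procedure under stated assumptions: it draws random linear extensions of a toy partial order by repeatedly choosing a random minimal element (a simple randomization, not the dissertation's Gibbs sampler, and not exactly uniform over extensions) and computes a Wilcoxon rank-sum statistic for each completed ordering, yielding a distribution of statistics rather than a single value:

```python
import numpy as np

rng = np.random.default_rng(3)
# toy partial order on items 0..5: an edge v -> s means "v must come before s"
prec = {0: [2], 1: [2, 3], 2: [4], 3: [4], 4: [], 5: []}

def random_linear_extension(prec, rng):
    """Random topological sort: repeatedly pick a uniformly random minimal element."""
    indeg = {v: 0 for v in prec}
    for v, succs in prec.items():
        for s in succs:
            indeg[s] += 1
    remaining, order = set(prec), []
    while remaining:
        minimal = [v for v in remaining if indeg[v] == 0]
        v = int(rng.choice(minimal))
        order.append(v)
        remaining.remove(v)
        for s in prec[v]:
            indeg[s] -= 1
    return order

# each extension yields a complete ranking; a rank-sum statistic between
# groups A = {0, 1, 2} and B = {3, 4, 5} is computed per extension
A = {0, 1, 2}
stats = []
for _ in range(1000):
    order = random_linear_extension(prec, rng)
    ranks = {v: r + 1 for r, v in enumerate(order)}
    stats.append(sum(ranks[v] for v in A))   # Wilcoxon rank-sum W for group A
print(np.mean(stats), np.std(stats))
```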
11

Sari, Ilkay. "Joint synchronization of clock phase offset, skew and drift in reference broadcast synchronization (RBS) protocol." College Station, Tex.: Texas A&M University, 2006. http://hdl.handle.net/1969.1/ETD-TAMU-1781.

12

Ellefsrød, Martin Belgau. "The Betting Machine : Using in-depth match statistics to compute future probabilities of football match outcomes using the Gibbs sampler." Thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for datateknikk og informasjonsvitenskap, 2013. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-22996.

Abstract:
Football is one of the most popular sporting games in the world, if not the most popular, played and watched by millions of people almost daily, certainly weekly. Though most of those who place weekly bets on match outcomes have made up their minds about the abilities of the competing teams, many have nevertheless attempted to assess the abilities of sporting teams using different statistical approaches, assigning objective, quantitative values to each team. From that starting point, one can then try to predict the future results of games. This thesis reviews the existing methods of Maher (1982) and Dixon & Coles (1997) for modelling team strengths, and how these models are used for prediction. The study then compares the two methods by experimenting with the models, finding that the latter seems to provide the more promising results. Tests are run by constructing the models and collecting empirical evidence on their accuracy when using them to bet on matches. We then construct our own model, which utilizes more detailed data from the current season's football matches, retrieved from several football and betting sites on the internet, and compare our results with how the older models performed on the same season. Our study finds that the data we were able to retrieve does not significantly increase the return on investment when betting on matches over the course of a season. Though our model performs slightly better than the methods of Maher (1982) and Dixon & Coles (1997), it is not able to outperform the bookmakers it is betting against. The study concludes with a section on further work to improve the models, focusing on using extensive match data that we did not manage to find, such as where on the pitch most passes were made, where shots were fired from, and whether important players were available.
13

Hunter, Tina D. "Gibbs Sampling and Expectation Maximization Methods for Estimation of Censored Values from Correlated Multivariate Distributions." University of Cincinnati / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1212157899.

14

Kousha, Termeh. "Topics in Random Matrices: Theory and Applications to Probability and Statistics." Thèse, Université d'Ottawa / University of Ottawa, 2011. http://hdl.handle.net/10393/20480.

Abstract:
In this thesis, we discuss some topics in random matrix theory which have applications to probability, statistics and quantum information theory. In Chapter 2, by relying on the spectral properties of an associated adjacency matrix, we find the distribution of the maximum of a Dyck path and show that it has the same distribution function as the unsigned Brownian excursion which was first derived in 1976 by Kennedy. We obtain a large and moderate deviation principle for the law of the maximum of a random Dyck path. Our result extends the results of Chung, Kennedy and Khorunzhiy and Marckert. In Chapter 3, we discuss a method of sampling called the Gibbs-slice sampler. This method is based on Neal's slice sampling combined with Gibbs sampling. In Chapter 4, we discuss several examples which have applications in physics and quantum information theory.
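A generic slice-sampling-within-Gibbs step can be sketched as follows; this is a minimal illustration assuming a toy banana-shaped target, not the construction studied in the thesis. Each coordinate is updated with Neal's stepping-out and shrinkage procedure applied to its full conditional:

```python
import numpy as np

rng = np.random.default_rng(4)

def slice_update(x, i, logpdf, w=1.0):
    """One slice-sampling move on coordinate i (stepping-out, then shrinkage)."""
    def f(v):
        z = x.copy()
        z[i] = v
        return logpdf(z)
    y = logpdf(x) + np.log(rng.random())    # log-height of the auxiliary slice
    L = x[i] - w * rng.random()             # random initial bracket of width w
    R = L + w
    while f(L) > y:                         # step out until both ends leave the slice
        L -= w
    while f(R) > y:
        R += w
    while True:                             # sample uniformly, shrinking on rejection
        v = L + (R - L) * rng.random()
        if f(v) > y:
            x[i] = v
            return x
        if v < x[i]:
            L = v
        else:
            R = v

# toy banana-shaped target, known only up to a normalizing constant
logpdf = lambda z: -0.5 * (z[0]**2 + (z[1] - z[0]**2)**2)
x = np.zeros(2)
draws = []
for t in range(5000):
    for i in (0, 1):                        # deterministic Gibbs scan with slice moves
        x = slice_update(x, i, logpdf)
    draws.append(x.copy())
```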
15

Chouchane, Mathieu. "Optimisation spatio-temporelle d’efforts de recherche pour cibles manoeuvrantes et intelligentes." Thesis, Aix-Marseille, 2013. http://www.theses.fr/2013AIXM4318.

Abstract:
In this work, we propose a solution to a problem issued by the DGA Techniques navales in order to survey a strategic area: determining the optimal spatio-temporal deployment of sensors that will maximize the detection probability of a mobile and smart target. The target is said to be smart because it is capable of detecting the threat of the sensors under certain conditions and then of adapting its behaviour to avoid it. The cost of a deployment is known to be very expensive and therefore it has to be taken into account. It is important to note that the wide spectrum of applications within this field of research also reflects the need for a highly complex theoretical framework based on stochastic mono or multi-objective optimisation. Until now, none of the existing works have dealt with the cost of the deployments. Moreover, the majority only treat one type of constraint at a time. Current works mostly rely on operational research algorithms which commonly model the constraints in both discrete space and time.In the first part, we present an algorithm which computes the most efficient spatio-temporal deployment of sensors, but without taking its cost into account. This optimisation method is based on an application of the generalised splitting method.In the second part, we first use a linear combination of the two criteria. For our second approach, we use the evolutionary multiobjective optimisation framework to adapt the generalised splitting method to multiobjective optimisation. Finally, we compare our results with the results of the NSGA-II algorithm
16

Smith, Michael Ross. "Modeling the Performance of a Baseball Player's Offensive Production." Diss., 2006. http://contentdm.lib.byu.edu/ETD/image/etd1189.pdf.

17

Donkor, Simon. "Performance Measurement in the eCommerce Industry." Digital WPI, 2003. https://digitalcommons.wpi.edu/etd-theses/487.

Abstract:
The eCommerce industry introduced new business principles, as well as new strategies for achieving these principles, and as a result some traditional measures of success are no longer valid. We classified and ranked the performance of twenty business-to-consumer eCommerce companies by developing critical benchmarks using the Balanced Scorecard methodology. We applied a latent class model, a statistical model within the Bayesian framework, to facilitate the determination of the best and worst performing companies. An eCommerce site's greatest asset is its customers, which is why some of the most valued and sophisticated metrics used today revolve around customer behavior. The results from our classification and ranking procedure showed that companies that ranked high overall also ranked comparatively well in the customer analysis ranking. For example, Amazon.com, one of the highest rated eCommerce companies with a large customer base, ranked second in the critical benchmark developed for measuring customer analysis. The results from our simulation also showed that the latent class model is a good fit for the classification procedure, with a high classification rate for the worst and best performing companies. The resulting work offers a practical tool to identify profitable investment opportunities for financial managers and analysts.
18

Helali, Amine. "Vitesse de convergence de l'échantillonneur de Gibbs appliqué à des modèles de la physique statistique." Thesis, Brest, 2019. http://www.theses.fr/2019BRES0002/document.

Abstract:
Markov chain Monte Carlo (MCMC) methods are mathematical tools used to simulate probability measures π defined on state spaces of high dimension. The speed of convergence of the Markov chain to its invariant measure π is a natural question to study in this context. To measure the convergence rate of a Markov chain we use the total variation distance. It is well known that the convergence rate of a reversible Markov chain depends on its second largest eigenvalue in absolute value, denoted by β∗. An important part of estimating β∗ is the estimation of the second largest eigenvalue, denoted by β1. Diaconis and Stroock (1991) introduced a method based on the Poincaré inequality to obtain a bound for β1 for general finite-state reversible Markov chains. In this thesis we use the approach of Shiu and Chen (2015) to study the Gibbs sampler for the 1-D Ising model with three or more states, also called the Potts model. Then, we generalize the result of Shiu and Chen to the case of the 2-D Ising model with two states. The results we obtain improve those of Ingrassia (1994). Finally, we introduce a method to perturb the Gibbs sampler in order to improve its convergence rate to equilibrium.
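
For concreteness, a single-site heat-bath (Gibbs) update for a 1-D Potts chain looks roughly like the sketch below. The chain length, three states, inverse temperature and free boundary conditions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
n, q, beta = 50, 3, 0.7                    # sites, number of states, inverse temperature
s = rng.integers(q, size=n)

def gibbs_sweep(s):
    """One systematic sweep of single-site heat-bath updates, free boundaries."""
    for i in range(n):
        agree = np.zeros(q)
        if i > 0:
            agree += (np.arange(q) == s[i - 1])
        if i < n - 1:
            agree += (np.arange(q) == s[i + 1])
        w = np.exp(beta * agree)           # P(s_i = k) ∝ exp(beta * #agreeing neighbours)
        s[i] = rng.choice(q, p=w / w.sum())
    return s

for sweep in range(1000):
    s = gibbs_sweep(s)
```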
19

Zhu, Qingyun. "Product Deletion and Supply Chain Management." Digital WPI, 2019. https://digitalcommons.wpi.edu/etd-dissertations/527.

Abstract:
One of the most significant changes in the evolution of modern business management is that organizations no longer compete as individual entities in the market, but as interlocking supply chains. Markets are no longer simply trading desks but dynamic ecosystems where people, organizations and the environment interact. Products and associated materials and resources are links that bridge supply chains from upstream (sourcing and manufacturing) to downstream (delivering and consuming). The lifecycle of a product plays a critical role in supply chains. Supply chains may be composed by, designed around, and modified for products. Product-related issues greatly impact supply chains. Existing studies have advanced the product management and product lifecycle management literature along the dimensions of product innovation, product growth, product line extensions, product efficiencies, and product acquisition. Research on product deletion, rationalization, or reduction is limited, but the topic is critical for many reasons; sustainability is an important one for this managerial decision. This study, grounded in multiple literature streams in both the marketing and supply chain fields, identified relations and propositions to form a firm-level analysis of the role of supply chains in organizational product deletion decisions. Interviews and observational and archival data from international companies (in Australia, China, India, and Iran) contributed empirical support as case studies through a grounded theory approach. Bayesian analysis, an underused empirical tool, was utilized to provide insights into this underdeveloped research stream, and its relationship to qualitative research enhances broader methodological understanding. A Gibbs sampler and reversible jump Markov chain Monte Carlo (MCMC) simulation were used for the Bayesian analysis of the collected data. The integrative findings are exploratory but provide insights for a number of research propositions.
20

Ozturk, Olcay. "Bayesian Semiparametric Models For Nonignorable Missing Datamechanisms In Logistic Regression." Master's thesis, METU, 2011. http://etd.lib.metu.edu.tr/upload/12613241/index.pdf.

Abstract:
In this thesis, Bayesian semiparametric models for the missing-data mechanisms of nonignorably missing covariates in logistic regression are developed. In the missing-data literature, a fully parametric approach is used to model nonignorable missing-data mechanisms: a probit or logit link of the conditional probability of the covariate being missing is modeled as a linear combination of all variables, including the missing covariate itself. However, nonignorably missing covariates may not be linearly related to the probit (or logit) of this conditional probability. In our study, the relationship between the probit of the probability of the covariate being missing and the missing covariate itself is modeled using a penalized-spline-based semiparametric approach. An efficient Markov chain Monte Carlo (MCMC) sampling algorithm to estimate the parameters is established. A WinBUGS code is constructed to sample from the full conditional posterior distributions of the parameters using Gibbs sampling. Monte Carlo simulation experiments under different true missing-data mechanisms are run to compare the bias and efficiency properties of the resulting estimators with those from the fully parametric approach. These simulations show that estimators for logistic regression using semiparametric missing-data models have better bias and efficiency properties than those using fully parametric missing-data models when the true relationship between the missingness and the missing covariate is nonlinear, and that the two are comparable when this relationship is linear.
21

Junqueira, Vinícius Silva. "Qualidade das informações de parentesco na avaliação genética de bovinos de corte." Universidade Federal de Viçosa, 2014. http://locus.ufv.br/handle/123456789/4812.

Abstract:
The aim of this study was to evaluate the effect of the quality of pedigree information on genetic parameter estimates and the accuracy of breeding values. Pedigree quality was evaluated by making corrections based on SNP markers. Variance components were estimated under a Bayesian approach via Gibbs sampling. The evaluation of correct relationships was performed using markers of 3,591 individuals in a population consisting of 12,668 animals. Mendelian conflicts were defined as conflicts between the markers of progeny and parents. In this way, 460 changes were performed, of which 54% involved a bull or cow identified in the pedigree. We observed that, on average, matched parents were defined with 75 conflicting markers, while parentage was rejected with 2,700 conflicting markers. The annealing algorithm of the Molecular Coancestry program provided 2,174 new half-sib relationships. We observed that higher quality of pedigree information increased the accuracy of breeding values and gave higher heritability estimates (0.22 ± 0.0286), suggesting the possibility of direct selection for tick resistance. We used a 5-fold cross-validation strategy, using K-means and random methods to group the animals. The cross-validation results indicate that higher quality of the pedigree relationships provides higher accuracy. Weaning weight was used to evaluate progeny of multiple sires (MS) and animals from embryo transfer and in vitro fertilization (TEF). We used three strategies to include MS information: genetic groups (GG), a Bayesian hierarchical method (HIER) and the average numerator relationship matrix (ANRM). The deviance information criterion (DIC) was used as a Bayesian measure of fit; it suggested that the HIER strategy provided a better fit when MS information is considered, whereas ANRM provided the best fit for including TEF animals. We used foster dam information to estimate maternal genetic and maternal permanent environmental effects when considering TEF animals. We did not observe statistical differences in variance components and genetic parameters when considering information on TEF animals, but the breeding values were of greater magnitude. Spearman correlations between breeding values were high for base animals, animals with known paternity and MS progeny. However, the use of information on TEF animals significantly changed the predicted breeding values. The use of appropriate methodologies to include information on MS progeny and TEF animals provides more accurate prediction of breeding values and supports higher rates of genetic gain.
22

Deng, Wei. "Multiple imputation for marginal and mixed models in longitudinal data with informative missingness." Connect to resource, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1126890027.

23

Yamaki, Marcos. "Estimação de parâmetros genéticos de produção de leite e de gordura da raça Pardo-suíça, utilizando metodologias freqüentista e bayesiana." Universidade Federal de Viçosa, 2006. http://locus.ufv.br/handle/123456789/5720.

Abstract:
First-lactation data of 6,262 Brown Swiss cows from 311 herds, daughters of 803 sires, with calving between 1980 and 2003, were used to estimate genetic parameters for milk and fat production traits. The components of variance were estimated by restricted maximum likelihood (REML) and Bayesian methods, using an animal model with single- and two-trait analyses. Estimation by REML was carried out with the software MTDFREML (BOLDMAN et al., 1995), testing single-trait models with different effects for the covariables and considering contemporary group and season as fixed effects. The best fits obtained in the single-trait analyses were used in the two-trait analysis. The estimate of the additive variance was reduced when lactation length was included in the model, suggesting that the animals were being adjusted to the same base regarding their capacity to transmit a longer or shorter lactation length to the progeny; adjusting for this covariable is therefore not recommended. Age at calving, on the other hand, linearly influenced milk and fat production. The heritability estimates were 0.26 and 0.25 for milk and fat yield, respectively, with a genetic correlation of 0.95. The high correlation between these traits suggests that part of the genes that act on milk yield also affect fat yield, in such a way that selection for milk yield results, indirectly, in an increase in fat yield. Estimation by Bayesian inference was carried out with the software MTGSAM (VAN TASSELL AND VAN VLECK, 1995). Several chain lengths were tested to obtain the marginal posterior densities of the single-trait analyses, and the best choice of chain length, burn-in and sampling interval was used in the two-trait analysis. The burn-in periods were tested with the software GIBANAL (VAN KAAM, 1998), whose analyses provide a sampling interval for each burn-in tested; the sampling interval was chosen according to the serial correlation resulting from the burn-in and sampling process. The heritability estimates were 0.33 ± 0.05 for both traits, with a genetic correlation of 0.95. Similar results were obtained in studies using the same methodology on first-lactation records. The stationary phase was adequately reached with a chain of 500,000 iterations and a burn-in of 30,000 iterations.
24

Franzén, Jessica. "Bayesian Cluster Analysis : Some Extensions to Non-standard Situations." Doctoral thesis, Stockholm University, Department of Statistics, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-7686.

Abstract:

The Bayesian approach to cluster analysis is presented. We assume that all data stem from a finite mixture model, where each component corresponds to one cluster and is given by a multivariate normal distribution with unknown mean and variance. The method produces posterior distributions of all cluster parameters and proportions as well as associated cluster probabilities for all objects. We extend this method in several directions to some common but non-standard situations. The first extension covers the case with a few deviant observations not belonging to one of the normal clusters. An extra component/cluster is created for them, which has a larger variance or a different distribution, e.g. is uniform over the whole range. The second extension is clustering of longitudinal data. All units are clustered at all time points separately and the movements between time points are modeled by Markov transition matrices. This means that the clustering at one time point will be affected by what happens at the neighbouring time points. The third extension handles datasets with missing data, e.g. item non-response. We impute the missing values iteratively in an extra step of the Gibbs sampler estimation algorithm. The Bayesian inference of mixture models has many advantages over the classical approach. However, it is not without computational difficulties. A software package, written in Matlab for Bayesian inference of mixture models is introduced. The programs of the package handle the basic cases of clustering data that are assumed to arise from mixture models of multivariate normal distributions, as well as the non-standard situations.
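The core of such an estimation algorithm can be sketched for the simplest case: a univariate two-component normal mixture with known unit variances. All priors and settings below are illustrative assumptions, not the thesis's actual models; the Gibbs sampler alternates between allocations, cluster means and mixture weights:

```python
import numpy as np

rng = np.random.default_rng(6)
# toy data from two univariate normal clusters (common known variance 1, for brevity)
data = np.concatenate([rng.normal(-2, 1, 150), rng.normal(3, 1, 100)])
K, n = 2, data.size
mu, pi = np.array([-1.0, 1.0]), np.ones(K) / K

for it in range(2000):
    # 1) allocations: P(z_i = k) ∝ pi_k * N(y_i; mu_k, 1)
    logw = np.log(pi) - 0.5 * (data[:, None] - mu[None, :])**2
    w = np.exp(logw - logw.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    z = (rng.random(n)[:, None] > np.cumsum(w, axis=1)).sum(axis=1)
    # 2) means: conjugate N(0, 10^2) prior gives a normal full conditional per cluster
    for k in range(K):
        yk = data[z == k]
        prec = 1 / 100 + yk.size
        mu[k] = rng.normal(yk.sum() / prec, 1 / np.sqrt(prec))
    # 3) weights: Dirichlet(1, ..., 1) prior gives a Dirichlet full conditional
    pi = rng.dirichlet(1 + np.bincount(z, minlength=K))
```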

25

Severgnini, Battista. "Essays in Total Factor Productivity measurement." Doctoral thesis, Humboldt-Universität zu Berlin, Wirtschaftswissenschaftliche Fakultät, 2010. http://dx.doi.org/10.18452/16195.

Abstract:
This dissertation consists of theoretical and empirical contributions to the study of Total Factor Productivity (TFP) measurement. The first chapter surveys the literature on the techniques most used in measuring TFP and discusses the limits of these frameworks. The second chapter considers data generated from a Real Business Cycle model and studies the quantitative extent of measurement error in the Solow residual as a measure of TFP growth when the capital stock is measured with error and when capacity utilization and depreciation are endogenous. Furthermore, it proposes two alternative measurements of TFP growth which do not require capital stocks. The third chapter proposes a new methodology based on state-space models in a Bayesian framework. Applying the Kalman filter to artificial data, it proposes a computation of the initial condition for productivity growth based on the properties of the Malmquist index. The fourth chapter introduces a new approach for identifying possible spillovers from new technologies on productivity, combining a counterfactual decomposition derived from the main properties of the Malmquist index with the econometric technique introduced by Machado and Mata (2005).
26

Lu, Qunfang Flora. "Bayesian forecasting of stock prices via the Ohlson model." Link to electronic thesis, 2005. http://www.wpi.edu/Pubs/ETD/Available/etd-050605-155155/.

27

Figueiredo, Cléber da Costa. "Calibração linear assimétrica." Universidade de São Paulo, 2009. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-08032013-141153/.

Abstract:
This thesis focuses on theoretical and applied aspects of estimation for the linear calibration model with skew-normal (Azzalini, 1985) and skew-t-normal (Gómez, Venegas and Bolfarine, 2007) error distributions. With asymmetrically distributed errors, it is not necessary to transform the variables in order to obtain symmetrical errors. Both frequentist and Bayesian solutions are presented. Parameter estimation and variance estimation were studied using the EM algorithm and the Gibbs sampler, respectively, in each approach. The main point in the frequentist approach is the presentation of a new parameterization to avoid singularity of the information matrix under the skew-normal calibration model in a neighborhood of lambda = 0. Another interesting aspect is that the reparameterization developed to make the information matrix nonsingular, when the skewness parameter is near zero, leaves the parameter of interest unchanged. The main point in the Bayesian framework is the presentation of two measures of goodness of fit: ADIC (Asymmetric Deviance Information Criterion) and EDIC (Evident Deviance Information Criterion). They are natural extensions of the ordinary DIC developed by Spiegelhalter et al. (2002), which should only be used for symmetric models.
28

Bai, Yan. "A Bayesian approach to detect the onset of activity limitation among adults in NHIS." Link to electronic thesis, 2005. http://www.wpi.edu/Pubs/ETD/Available/etd-050605-155002/.

29

Ozkan, Pelin. "Analysis Of Stochastic And Non-stochastic Volatility Models." Master's thesis, METU, 2004. http://etd.lib.metu.edu.tr/upload/3/12605421/index.pdf.

Abstract:
Changes in variance, or volatility, over time can be modeled as deterministic by using autoregressive conditional heteroscedastic (ARCH) type models, or as stochastic by using stochastic volatility (SV) models. This study compares these two kinds of models, estimated on Turkish/USA exchange rate data. First, a GARCH(1,1) model is fitted to the data using the package EViews, and then a Bayesian estimation procedure is used for estimating an appropriate SV model with the help of Ox code. In order to compare these models, the LR test statistic for non-nested hypotheses is calculated.
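
The contrast between the two model classes can be made concrete with a short simulation (all parameter values are illustrative assumptions): in a GARCH(1,1) model tomorrow's variance is a deterministic function of today's return, whereas in an SV model the log-variance carries its own random shock:

```python
import numpy as np

rng = np.random.default_rng(7)
T = 1000
omega, alpha, beta = 0.1, 0.1, 0.85        # illustrative GARCH(1,1) parameters

# GARCH(1,1): h_{t+1} = omega + alpha * r_t^2 + beta * h_t (no extra noise)
h, r_garch = np.empty(T), np.empty(T)
h[0] = omega / (1 - alpha - beta)          # start at the stationary variance
for t in range(T):
    r_garch[t] = np.sqrt(h[t]) * rng.standard_normal()
    if t + 1 < T:
        h[t + 1] = omega + alpha * r_garch[t]**2 + beta * h[t]

# SV: log-variance follows an AR(1) with its own independent shock
mu, phi, tau = -1.0, 0.95, 0.2
logh = np.empty(T)
logh[0] = mu
for t in range(T - 1):
    logh[t + 1] = mu + phi * (logh[t] - mu) + tau * rng.standard_normal()
r_sv = np.exp(logh / 2) * rng.standard_normal(T)
```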
30

Luo, Yuqun. "Incorporation of Genetic Marker Information in Estimating Modelparameters for Complex Traits with Data From Large Complex Pedigrees." The Ohio State University, 2002. http://rave.ohiolink.edu/etdc/view?acc_num=osu1039109696.

31

Cerqueira, Pedro Henrique Ramos. "Structural equation models applied to quantitative genetics." Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-05112015-145419/.

Abstract:
Causal models have been used in different areas of knowledge in order to understand the causal associations between variables. Over the past decades, the number of studies using these models has grown considerably, especially those related to biological systems, where studying and learning causal relationships among traits is essential for predicting the consequences of interventions in such a system. Graph analysis (GA) and structural equation modeling (SEM) are tools used to explore such associations. While GA allows searching for causal structures that express qualitatively how variables are causally connected, fitting an SEM with a known causal structure allows one to infer the magnitude of causal effects. An SEM can also be viewed as a multiple regression model in which response variables can be explanatory variables for others. In quantitative genetics, SEM has been used to study the direct and indirect genetic effects associated with individuals through information related to them, beyond the observed characteristics, such as kinship relations. Such studies typically assume linear relationships among traits. However, in some scenarios nonlinear relationships can be observed, which makes those assumptions unsuitable. To overcome this limitation, this thesis proposes a mixed-effects polynomial structural equation model, of second or higher degree, to model such nonlinear relationships. Two studies were developed: a simulation study and an application to real data. The first involved simulation of 50 data sets with a fully recursive causal structure involving three traits, in which linear and nonlinear causal relations between them were allowed. The second involved the analysis of traits related to dairy cows of the Holstein breed; the phenotypes considered were calving difficulty, gestation length and the proportion of perinatal death. We compare the multiple-trait model with polynomial structural equation models of different degrees in order to assess the benefits of an SEM of second or higher degree. For some situations the inappropriate assumption of linearity results in poor predictions of the direct, indirect and total genetic variances and covariances, either overestimating, underestimating, or even assigning opposite signs to covariances. We therefore conclude that including a polynomial degree increases the expressive power of the SEM.
32

Gama, Manuela Pires Monteiro da [UNESP]. "Parâmetros genéticos para desempenho em corridas de cavalos puro sangue inglês utilizando procedimentos Bayesiano e Thurstoniano." Universidade Estadual Paulista (UNESP), 2012. http://hdl.handle.net/11449/92594.

Full text
Abstract:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
The objective of this study was to estimate genetic parameters for racing time and final rank in Thoroughbred horses using Bayesian and Thurstonian procedures, in order to provide data that contribute to selection and the consequent genetic improvement of the breed in Brazil. Data were provided by the company Turf Total Ltda. and consisted of 251,754 racing time records and 272,277 final rank records obtained from 34,316 Thoroughbred races (distances of 1,000, 1,300, 1,600 and 2,000 m) that occurred between 1992 and 2011 on six race tracks. Fixed effects included age, sex, post-position and race for the analysis of racing time, and sex, age, post-position, race and level of difficulty for the final rank analysis. The heritabilities for racing time and final rank and the genetic correlations between racing times were estimated by Bayesian inference. In addition, a Thurstonian model was used to estimate the heritability for final rank. The heritability estimates for racing time in one-trait analyses were similar to those reported in the literature and ranged from 0.31 to 0.04. Repeatability estimates tended to decrease with increasing race distance (0.61 to 0.22). The heritability estimates for final rank ranged from 0.57 to 0.21 and showed the same trend as the heritabilities for time. The heritability estimates for racing time obtained by multi-trait analysis ranged from 0.34 to 0.15, with repeatabilities of 0.63 to 0.36 at the distances studied. The multi-trait analysis also showed that selection for racing time was more efficient at shorter distances, where heritabilities were higher. The genetic correlations were all positive and ranged from 0.47 to 0.97. In conclusion, selection for both racing time and final rank is more efficient at shorter distances... (Complete abstract click electronic access below)
APA, Harvard, Vancouver, ISO, and other styles
33

Gama, Manuela Pires Monteiro da. "Parâmetros genéticos para desempenho em corridas de cavalos puro sangue inglês utilizando procedimentos Bayesiano e Thurstoniano /." Jaboticabal : [s.n.], 2012. http://hdl.handle.net/11449/92594.

Full text
Abstract:
Advisor: Marcílio Dias Silveira da Mota
Co-advisor: Henrique Nunes de Oliveira
Committee member: Humberto Tonhati
Committee member: Evaldo Antonio Lencioni Titto
Abstract: The objective of this study was to estimate genetic parameters for racing time and final rank in Thoroughbred horses using Bayesian and Thurstonian procedures, in order to provide data that contribute to selection and the consequent genetic improvement of the breed in Brazil. Data were provided by the company Turf Total Ltda. and consisted of 251,754 racing time records and 272,277 final rank records obtained from 34,316 Thoroughbred races (distances of 1,000, 1,300, 1,600 and 2,000 m) that occurred between 1992 and 2011 on six race tracks. Fixed effects included age, sex, post-position and race for the analysis of racing time, and sex, age, post-position, race and level of difficulty for the final rank analysis. The heritabilities for racing time and final rank and the genetic correlations between racing times were estimated by Bayesian inference. In addition, a Thurstonian model was used to estimate the heritability for final rank. The heritability estimates for racing time in one-trait analyses were similar to those reported in the literature and ranged from 0.31 to 0.04. Repeatability estimates tended to decrease with increasing race distance (0.61 to 0.22). The heritability estimates for final rank ranged from 0.57 to 0.21 and showed the same trend as the heritabilities for time. The heritability estimates for racing time obtained by multi-trait analysis ranged from 0.34 to 0.15, with repeatabilities of 0.63 to 0.36 at the distances studied. The multi-trait analysis also showed that selection for racing time was more efficient at shorter distances, where heritabilities were higher. The genetic correlations were all positive and ranged from 0.47 to 0.97. In conclusion, selection for both racing time and final rank is more efficient at shorter distances... (Complete abstract click electronic access below)
Master's
APA, Harvard, Vancouver, ISO, and other styles
34

Wu, Yi-Fang. "Accuracy and variability of item parameter estimates from marginal maximum a posteriori estimation and Bayesian inference via Gibbs samplers." Diss., University of Iowa, 2015. https://ir.uiowa.edu/etd/5879.

Full text
Abstract:
Item response theory (IRT) uses a family of statistical models for estimating stable characteristics of items and examinees and for defining how these characteristics interact in describing item and test performance. With a focus on the three-parameter logistic IRT model (Birnbaum, 1968; Lord, 1980), the current study examines the accuracy and variability of the item parameter estimates from marginal maximum a posteriori estimation via an expectation-maximization algorithm (MMAP/EM) and from the Markov chain Monte Carlo Gibbs sampling (MCMC/GS) approach. The various factors that affect the accuracy and variability of the item parameter estimates are discussed, and then further evaluated through a large-scale simulation. The factors of interest include the composition and length of tests, the distribution of the underlying latent traits, the size of samples, and the prior distributions of the discrimination, difficulty, and pseudo-guessing parameters. The results of the two estimation methods are compared to determine the lower limit--in terms of test length, sample size, test characteristics, and prior distributions of item parameters--at which the methods can satisfactorily recover item parameters and function efficiently in practice. For practitioners, the results help to define limits on the appropriate use of BILOG-MG (which implements MMAP/EM) and assist in assessing the utility of OpenBUGS (which carries out MCMC/GS) for item parameter estimation in practice.
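For reference, the three-parameter logistic model named above specifies the probability of a correct response in terms of the examinee ability theta and the item's discrimination a, difficulty b, and pseudo-guessing c. A one-line sketch (the values in the example call are arbitrary):

```python
import numpy as np

def p_correct_3pl(theta, a, b, c):
    """Three-parameter logistic IRT model:
    P(correct | theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))."""
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - b)))

# An average-ability examinee on a moderately easy item with guessing floor 0.2.
print(round(p_correct_3pl(theta=0.0, a=1.2, b=-0.5, c=0.2), 3))
```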
APA, Harvard, Vancouver, ISO, and other styles
35

Araujo, Neto Francisco Ribeiro de [UNESP]. "Estimativas de componentes de (CO) variância de características de crescimentos na raça Nelore, utilizando inferência bayesiana." Universidade Estadual Paulista (UNESP), 2008. http://hdl.handle.net/11449/92632.

Full text
Abstract:
The objective of this work was to evaluate the application of Bayesian methods in multiple-trait animal models. Records from 54,182 Nellore males were used to estimate variance components for: a) weights standardized to ages of 120 (W120), 210 (W210), 365 (W365), 450 (W450) and 550 (W550) days; b) scrotal circumference at standard ages of 365 (EC365), 450 (EC450) and 550 (EC550) days; c) weight gains between the standardized ages of 120/210 (WG1), 210/365 (WG2), 365/450 (WG3) and 450/550 (WG4); and d) scrotal growth in the intervals 365/450 (EG1) and 450/550 (EG2). Analyses were performed using the GIBBS2F90 software, assuming an animal model for eight traits (W120, EC365, and the weight gain and scrotal growth traits); the remaining traits were obtained through properties of sums of variances. The direct heritability estimates were 0.17, 0.19, 0.13, 0.15, 0.33, 0.19, 0.23, 0.24, 0.35, 0.37, 0.39, 0.52, 0.56 and 0.48 for WG1, WG2, WG3, WG4, EG1, EG2, W120, W210, W365, W450, W550, EC365, EC450 and EC550, respectively. Genetic correlations ranged from -0.025 (between WG3 and EC550) to 0.97 (between EC450 and EC550). Traits measured at standard ages showed greater genetic variability than the interval measures, indicating them as selection criteria that would allow a better response to selection. The Bayesian implementation of the multiple-trait model proved efficient despite the small effective sample sizes obtained for some parameters, indicating the need for a larger number of iterations.
APA, Harvard, Vancouver, ISO, and other styles
36

Araújo, Ronyere Olegário de. "COMPONENTES DE COVARIÂNCIAS ESTIMADOS POR METODOLOGIA BAYESIANA PARA PARÂMETROS BIOLÓGICOS OBTIDOS POR MODELOS NÃO LINEARES PARA BUBALINOS DA RAÇA MURRAH." Universidade Federal de Santa Maria, 2009. http://repositorio.ufsm.br/handle/1/10746.

Full text
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
The aim of this work was to study the fit of the classical non-linear models Von Bertalanffy, Brody, Gompertz and Logistic to growth records of Murrah buffaloes raised on lowlands in the State of Rio Grande do Sul, and to estimate covariance components, under a Bayesian approach, for the growth-curve parameters with biological interpretation. Paper 1 studied the fit of the models mentioned above to growth data from a group of 66 buffalo females born from 1982 to 1989, the offspring of three sires and 38 dams. The traits evaluated were the asymptotic weight (A) and the maturity rate (K). The weight-age records totalled 26 weighings per female, 1,638 observations in all. The criteria used to select the best-fitting growth-curve model were the asymptotic standard deviation (DPA), the coefficient of determination (R2), the mean absolute residual deviation (DMA) and the asymptotic index (AI). All models overestimated birth weight (PN) to a greater or lesser degree: in increasing order, the Von Bertalanffy, Gompertz, Logistic and Brody models overestimated PN by 28.55, 32.74, 42.70 and 43.45 kg, respectively. The Logistic model underestimated A (-2.09 kg), while the Von Bertalanffy, Gompertz and Brody models overestimated it by, in increasing order, 8.04, 17.7 and 280.33 kg, respectively. Based on the fit criteria and on the behaviour of the predicted curves, the Gompertz model, followed by the Logistic and Von Bertalanffy models, gave the best fit. Paper 2 studied the fit of the same models, for the same traits as in Paper 1, to growth data from a group of 67 buffaloes born from 1982 to 1989, the offspring of three sires and 42 dams. All models overestimated PN. The Von Bertalanffy and Brody models overestimated A, while the Gompertz and Logistic models underestimated it. The smallest DPA was obtained with the Brody model, which accordingly showed the largest R2, but this model also presented the largest DMA. Considering all the criteria, the Gompertz model showed the best fit, followed by the Logistic and Von Bertalanffy models. It is suggested that the Brody model not be used to describe the growth curve of Murrah animals raised under the conditions of this work. In Paper 3, covariance components and genetic parameters were estimated under a Bayesian approach, using the BLUPF90 family of programs, for the parameters A and K estimated by the Gompertz model, adopting an animal model. The heritability coefficients were high for both A and K (0.57 and 0.34, respectively), indicating that selection can be used as a tool to change the shape of the growth curve in this population. However, this information must be used with great caution, since these traits are negatively correlated; in this case, a restricted selection index could be used with greater success.
APA, Harvard, Vancouver, ISO, and other styles
37

Araujo, Neto Francisco Ribeiro de. "Estimativas de componentes de (CO) variância de características de crescimentos na raça Nelore, utilizando inferência bayesiana /." Jaboticabal : [s.n.], 2008. http://hdl.handle.net/11449/92632.

Full text
Abstract:
Advisor: Henrique Nunes de Oliveira
Committee member: Lúcia Galvão de Albuquerque
Committee member: Maria Eugênia Zerlotti Mercadante
Abstract: The objective of this work was to evaluate the application of Bayesian methods in multiple-trait animal models. Records from 54,182 Nellore males were used to estimate variance components for: a) weights standardized to ages of 120 (W120), 210 (W210), 365 (W365), 450 (W450) and 550 (W550) days; b) scrotal circumference at standard ages of 365 (EC365), 450 (EC450) and 550 (EC550) days; c) weight gains between the standardized ages of 120/210 (WG1), 210/365 (WG2), 365/450 (WG3) and 450/550 (WG4); and d) scrotal growth in the intervals 365/450 (EG1) and 450/550 (EG2). Analyses were performed using the GIBBS2F90 software, assuming an animal model for eight traits (W120, EC365, and the weight gain and scrotal growth traits); the remaining traits were obtained through properties of sums of variances. The direct heritability estimates were 0.17, 0.19, 0.13, 0.15, 0.33, 0.19, 0.23, 0.24, 0.35, 0.37, 0.39, 0.52, 0.56 and 0.48 for WG1, WG2, WG3, WG4, EG1, EG2, W120, W210, W365, W450, W550, EC365, EC450 and EC550, respectively. Genetic correlations ranged from -0.025 (between WG3 and EC550) to 0.97 (between EC450 and EC550). Traits measured at standard ages showed greater genetic variability than the interval measures, indicating them as selection criteria that would allow a better response to selection. The Bayesian implementation of the multiple-trait model proved efficient despite the small effective sample sizes obtained for some parameters, indicating the need for a larger number of iterations.
Master's
APA, Harvard, Vancouver, ISO, and other styles
38

Rivas, Cruz Manuel A. "Medical relevance and functional consequences of protein truncating variants." Thesis, University of Oxford, 2015. http://ora.ox.ac.uk/objects/uuid:a042ca18-7b35-4a62-aef0-e3ba2e8795f7.

Full text
Abstract:
Genome-wide association studies have greatly improved our understanding of the contribution of common variants to the genetic architecture of complex traits. However, two major limitations have been highlighted. First, common variant associations typically do not identify the causal variant and/or the gene through which it exerts its effect on a trait. Second, common variant associations usually consist of variants with small effects; as a consequence, it is more challenging to harness their translational impact. Association studies of rare variants and complex traits may help address these limitations. Empirical population genetic data show that deleterious variants are rare. More specifically, there is a very strong depletion of common protein truncating variants (PTVs, commonly referred to as loss-of-function variants) in the genome, a group of variants that have been shown to have large effects on gene function and are enriched for severe disease-causing mutations, but that in other instances may actually protect against disease. This thesis is divided into three parts dedicated to the study of protein truncating variants, their medical relevance, and their functional consequences. First, I present statistical, bioinformatic, and computational methods developed for the study of protein truncating variants, their association with complex traits, and their functional consequences. Second, I present applications of the methods to a number of case-control and quantitative trait studies, discovering new variants and genes associated with breast and ovarian cancer, type 1 diabetes, lipids, and metabolic traits measured by NMR spectroscopy. Third, I present work on improving the annotation of protein truncating variants by studying their functional consequences. Taken together, these results highlight the utility of interrogating protein truncating variants in medical and functional genomic studies.
APA, Harvard, Vancouver, ISO, and other styles
39

Hsu, Yu-Chin, and 徐鈺欽. "Identifying Tool Combinations Critical to Semiconductor Manufacturing Yield with Gibbs Sampler." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/13904671629559340090.

Full text
Abstract:
Master's
National Taiwan University
Graduate Institute of Industrial Engineering
100
In semiconductor manufacturing, the soundness of the tool commonality analysis (TCA) technique has a high impact on the effectiveness of product yield diagnosis. However, all up-to-date TCA algorithms are based on greedy search strategies, which are inherently poor at identifying combinational root causes. When the root cause of wafer yield loss is a tool combination instead of a single tool, a greedy-search-oriented TCA algorithm usually results in both high false and high missed identification rates. As the feature size of semiconductor devices continues to shrink, the problem induced by greedy-search-oriented TCA algorithms becomes more severe, because the total number of tools grows large and wafer yield loss is more likely to be caused by a specific tool combination. To cope with the tool combination problem, a new TCA algorithm based on the Gibbs sampler, a Markov chain Monte Carlo (MCMC) stochastic search technique, is proposed. Specifically, a binary tool health indicator is defined for each tool to determine whether it should be included in the tool combination identified as the root cause. With the Gibbs sampler, the computational complexity is reduced from O(2^n) to about O(n^2), where n is the number of tools. Simulation and field data validation results show that the proposed TCA algorithm performs well in identifying the problematic tool combination.
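The abstract sketches the approach only at a high level; below is a minimal illustration of how a Gibbs scan over binary tool-health indicators can search for a faulty tool combination with O(n) conditional updates per sweep instead of enumerating all 2^n subsets. The linear yield model, priors, and simulated data are assumptions of this sketch, not the thesis's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative data: n wafers, m tools; X[i, j] = 1 if wafer i visited tool j.
n, m = 200, 12
X = rng.integers(0, 2, size=(n, m)).astype(float)
# Hypothetical faulty combination: yield drops only when tools 2 AND 7 are both used.
y = 0.9 - 0.15 * X[:, 2] * X[:, 7] + 0.02 * rng.standard_normal(n)

def log_marginal(gamma, X, y, tau2=1.0, sigma2=0.02**2):
    """Log marginal likelihood of yield y under a linear model restricted to the
    tools flagged in gamma, coefficients integrated out under a N(0, tau2) prior."""
    r = y - y.mean()
    idx = np.flatnonzero(gamma)
    if idx.size == 0:
        return -0.5 * r @ r / sigma2
    Z = X[:, idx]
    A = Z.T @ Z / sigma2 + np.eye(idx.size) / tau2
    b = Z.T @ r / sigma2
    mu = np.linalg.solve(A, b)
    return 0.5 * b @ mu - 0.5 * r @ r / sigma2 - 0.5 * np.linalg.slogdet(A)[1]

def gibbs_tca(X, y, n_iter=2000, prior_incl=0.2):
    """Single-site Gibbs scan over tool-health indicators."""
    m = X.shape[1]
    gamma = np.zeros(m)
    trace = np.zeros(m)
    for it in range(n_iter):
        for j in range(m):
            g1, g0 = gamma.copy(), gamma.copy()
            g1[j], g0[j] = 1, 0
            l1 = log_marginal(g1, X, y) + np.log(prior_incl)
            l0 = log_marginal(g0, X, y) + np.log(1 - prior_incl)
            gamma[j] = rng.random() < 1.0 / (1.0 + np.exp(l0 - l1))
        if it >= n_iter // 2:
            trace += gamma          # accumulate post burn-in inclusion counts
    return trace / (n_iter - n_iter // 2)

print(np.round(gibbs_tca(X, y), 2))   # tools 2 and 7 should score highest
```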
APA, Harvard, Vancouver, ISO, and other styles
40

Lee, Kuan-Chun, and 李冠俊. "Compositional Vision and Inference: Perturbation Formulation for Context Sensitivity with Gibbs Sampler." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/b7nb9u.

Full text
Abstract:
Master's
National Tsing Hua University
Department of Mathematics
102
In this thesis, we discuss a generative model for Bayesian image analysis. The model focuses on building a prior over parses of an image, with context information based on compositionality, together with a conditional model of the image pixels given a particular interpretation. An MCMC inference algorithm, the Gibbs sampler, is also introduced. Finally, the Gibbs sampler and our model are applied to a facial pose estimation experiment.
APA, Harvard, Vancouver, ISO, and other styles
41

Huang, Wei-ting, and 黃瑋婷. "Motif Finding in DNA Sequences Using a Genetic Algorithm with Gibbs Sampler." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/01156091168278589174.

Full text
Abstract:
Master's
National Cheng Kung University
Department of Electrical Engineering (Master's and Doctoral Program)
95
In molecular biology, consensus sequences often represent conserved functional domains. Motif finding in DNA sequences is therefore a fundamental problem in computational biology, with important applications in locating such domains. In this thesis, we evaluate several methods for the motif finding problem, including a genetic algorithm (GA), the structured genetic algorithm (S-GA), and the Gibbs sampler method. Unlike the other two methods, the structured genetic algorithm can search for motifs of varying lengths without requiring a fixed motif length as input. In our simulations, the GA performs better than the other two approaches, while the Gibbs sampler method is computationally more efficient. To obtain both good performance and efficiency, we combine the GA with the Gibbs sampler method, and we validate the effectiveness of this approach through extensive simulations on diverse benchmark examples.
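For reference, here is a compact sketch of the site-sampler component that such a hybrid combines with the GA (the GA wrapper and variable-length search are omitted). The Lawrence-style leave-one-out update, uniform background, and toy planted motif are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
BASES = "ACGT"

def gibbs_motif(seqs, w, n_iter=500, pseudo=0.5):
    """Gibbs site sampler for one motif of fixed width w."""
    codes = [np.array([BASES.index(c) for c in s]) for s in seqs]
    pos = [rng.integers(0, len(s) - w + 1) for s in codes]
    for _ in range(n_iter):
        for i in range(len(codes)):            # leave sequence i out
            counts = np.full((w, 4), pseudo)
            for k, sk in enumerate(codes):
                if k != i:
                    counts[np.arange(w), sk[pos[k]:pos[k] + w]] += 1
            pwm = counts / counts.sum(axis=1, keepdims=True)
            bg = np.full(4, 0.25)              # uniform background assumption
            s = codes[i]
            n_sites = len(s) - w + 1
            logp = np.array([
                np.log(pwm[np.arange(w), s[t:t + w]]).sum()
                - np.log(bg[s[t:t + w]]).sum()
                for t in range(n_sites)
            ])
            p = np.exp(logp - logp.max())
            pos[i] = rng.choice(n_sites, p=p / p.sum())   # resample site i
    return pos

# Toy usage: plant the motif "TATAAT" in random 40-base sequences.
motif = "TATAAT"
seqs = []
for _ in range(8):
    s = "".join(rng.choice(list(BASES), size=40))
    t = rng.integers(0, 40 - len(motif))
    seqs.append(s[:t] + motif + s[t + len(motif):])
print(gibbs_motif(seqs, w=len(motif)))   # recovered motif start positions
```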
APA, Harvard, Vancouver, ISO, and other styles
42

YU, Hsin-Hsuan, and 余信萱. "The Application of Gibbs Sampler Method to the Distribution of Asset Return Correlation in the New Basel Accord." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/20149968963515633765.

Full text
Abstract:
Master's
National Taipei University
Department of Statistics
95
The asset correlation is the key variable for calculating regulatory capital in the IRB approach of the New Basel Accord. However, the range of the asset correlation has been the subject of much controversy since the Second Consultative Paper (CP2) in 2001. The Third Consultative Paper (CP3), announced in April 2003, formally introduced the asset return correlation as a decreasing function of the probability of default. CP3 not only set the range of the asset correlation at 0.12 to 0.24 for corporate exposures but also imposed a negative relationship between asset correlation and probability of default. Because CP3 did not explain the theoretical basis of the asset correlation formula, many studies have discussed its appropriateness. This study applies a Bayesian method to the asymptotic single risk factor (ASRF) model to estimate the posterior distribution of the asset correlation, using data on United States firms from 2001 to 2005. The empirical results suggest that the asset correlation is a decreasing function of the probability of default and an increasing function of firm size, and indicate that some important factors affecting the asset correlation may have been ignored.
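For context, the corporate asset-correlation function introduced in CP3 and retained in Basel II interpolates between the 0.24 and 0.12 bounds as the probability of default (PD) grows; a small sketch:

```python
import numpy as np

def basel_corporate_rho(pd):
    """Basel II IRB asset-correlation formula for corporate exposures:
    interpolates from 0.24 (low PD) down to 0.12 (high PD)."""
    w = (1 - np.exp(-50 * pd)) / (1 - np.exp(-50))
    return 0.12 * w + 0.24 * (1 - w)

for pd in (0.001, 0.01, 0.05, 0.20):
    print(f"PD={pd:>5.3f}  rho={basel_corporate_rho(pd):.3f}")
```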
APA, Harvard, Vancouver, ISO, and other styles
43

Kagoda, Paulo Abuneeri. "A comprehensive analysis of extreme rainfall." Thesis, 2008. http://hdl.handle.net/10539/5337.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Chen, Zhu 1985. "The effects of three different priors for variance parameters in the normal-mean hierarchical model." Thesis, 2010. http://hdl.handle.net/2152/ETD-UT-2010-05-1516.

Full text
Abstract:
Many prior distributions have been suggested for variance parameters in hierarchical models. The “non-informative” limit of the conjugate inverse-gamma prior might cause problems. I consider three priors for the variance parameters (conjugate inverse-gamma, log-normal and truncated normal) and carry out a numerical analysis on Gelman's 8-schools data. Using the posterior draws, I compare the Bayesian credible intervals of the parameters under the three priors. I then use predictive distributions to make predictions and discuss the differences among the three suggested priors.
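The abstract leaves the exact hyperparameters unstated. Because the 8-schools posterior for the between-school scale tau is one-dimensional once the school effects and overall mean are integrated out, the effect of the three prior families can be sketched on a grid without any MCMC; the hyperparameter values below are illustrative assumptions, not those of the thesis.

```python
import numpy as np

# Gelman's 8-schools data (BDA): estimated effects and standard errors.
y  = np.array([28., 8., -3., 7., -1., 1., 18., 12.])
se = np.array([15., 10., 16., 11., 9., 11., 10., 18.])

def log_marginal_tau(tau):
    """log p(y | tau) with the school effects and the overall mean mu
    integrated out (flat prior on mu)."""
    v = se**2 + tau**2
    w = 1.0 / v
    mu_hat = np.sum(w * y) / np.sum(w)
    return (-0.5 * np.log(np.sum(w))
            - 0.5 * np.sum(np.log(v))
            - 0.5 * np.sum((y - mu_hat)**2 / v))

# Three priors on tau, each written as a log-density on tau
# (densities specified on tau^2 carry a 2*tau Jacobian term).
def lp_inv_gamma(tau, a=0.001, b=0.001):   # "non-informative" IG on tau^2
    t2 = tau**2
    return -(a + 1) * np.log(t2) - b / t2 + np.log(2 * tau)

def lp_log_normal(tau, m=0.0, s=2.0):      # log-normal on tau^2
    t2 = tau**2
    return -np.log(t2) - (np.log(t2) - m)**2 / (2 * s**2) + np.log(2 * tau)

def lp_trunc_normal(tau, s=25.0):          # normal on tau truncated to tau > 0
    return -tau**2 / (2 * s**2)

taus = np.linspace(1e-3, 40.0, 4000)
loglik = np.array([log_marginal_tau(t) for t in taus])
for name, lp in [("inverse-gamma", lp_inv_gamma),
                 ("log-normal", lp_log_normal),
                 ("truncated normal", lp_trunc_normal)]:
    lpost = loglik + lp(taus)
    post = np.exp(lpost - lpost.max())
    post /= post.sum()                      # normalise on the grid
    print(f"{name:>16}: posterior mean of tau ~ {(taus * post).sum():.2f}")
```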
APA, Harvard, Vancouver, ISO, and other styles
45

Maldonado, Hernan. "Bayesian Regression Inference Using a Normal Mixture Model." 2012. http://digital.library.duq.edu/u?/etd,156427.

Full text
Abstract:
In this thesis we develop a two-component mixture model to perform Bayesian regression. We implement the model computationally using the Gibbs sampler algorithm and apply it to a dataset of differences in time measurement between two clocks. The dataset contains "good" time measurements and "bad" time measurements, which we associate with the two components of our mixture model. Our theoretical work shows that latent variables are a useful tool for implementing a Bayesian normal mixture model with two components. After applying the model to the data, we found that it reasonably assigned probabilities of occurrence to the two states of the phenomenon under study; it also identified two processes with the same slope, different intercepts and different variances.
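A runnable sketch of the kind of two-component Gibbs sampler the abstract describes: latent labels, a shared slope, component-specific intercepts and variances, and conjugate-style updates. The priors, synthetic clock-style data, and initialisation are assumptions; label switching and empty components are ignored for simplicity.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic clock-difference-style data: one slope, two intercepts, two noise levels.
n = 300
x = np.linspace(0.0, 10.0, n)
good = rng.random(n) < 0.7
y = 1.5 * x + np.where(good, 0.0, 4.0) + rng.standard_normal(n) * np.where(good, 0.3, 2.0)

def gibbs_mixreg(x, y, n_iter=3000):
    n = len(y)
    beta, a0, a1 = 0.0, 0.0, 0.0       # shared slope, two intercepts
    s2 = np.array([1.0, 1.0])          # component noise variances
    w = 0.5                            # mixing weight of component 0
    keep = []
    for it in range(n_iter):
        # 1) latent labels from their Bernoulli full conditionals
        mu = np.stack([beta * x + a0, beta * x + a1])
        logp = (np.log([w, 1.0 - w])[:, None]
                - 0.5 * np.log(s2)[:, None]
                - 0.5 * (y - mu) ** 2 / s2[:, None])
        p0 = 1.0 / (1.0 + np.exp(logp[1] - logp[0]))
        z = (rng.random(n) >= p0).astype(int)
        # 2) (slope, intercepts) from their joint normal conditional (flat prior)
        W = 1.0 / s2[z]
        X = np.column_stack([x, (z == 0).astype(float), (z == 1).astype(float)])
        A = X.T @ (X * W[:, None])
        mean = np.linalg.solve(A, X.T @ (W * y))
        beta, a0, a1 = mean + np.linalg.cholesky(np.linalg.inv(A)) @ rng.standard_normal(3)
        # 3) component variances from inverse-gamma conditionals, IG(1, 1) prior
        for k in (0, 1):
            r = y[z == k] - (beta * x[z == k] + (a0 if k == 0 else a1))
            s2[k] = 1.0 / rng.gamma(1.0 + r.size / 2.0, 1.0 / (1.0 + 0.5 * r @ r))
        # 4) mixing weight from its Beta conditional
        n0 = int(np.sum(z == 0))
        w = rng.beta(1 + n0, 1 + n - n0)
        if it >= n_iter // 2:
            keep.append([beta, a0, a1, s2[0], s2[1], w])
    return np.asarray(keep).mean(axis=0)

print(np.round(gibbs_mixreg(x, y), 2))  # slope, intercepts, variances, weight
```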
McAnulty College and Graduate School of Liberal Arts
Computational Mathematics
MS
Thesis
APA, Harvard, Vancouver, ISO, and other styles
46

Erar, Bahar. "Mixture model cluster analysis under different covariance structures using information complexity." 2011. http://trace.tennessee.edu/utk_gradthes/968.

Full text
Abstract:
In this thesis, a mixture-model cluster analysis technique under different covariance structures of the component densities is developed and presented. The technique captures the compactness, orientation, shape, and volume of component clusters in one expert system for handling Gaussian high-dimensional heterogeneous data sets, adding flexibility to currently practiced cluster analysis techniques. Two approaches to parameter estimation are considered and compared: one using the Expectation-Maximization (EM) algorithm and another following a Bayesian framework using the Gibbs sampler. We develop and score several forms of the ICOMP criterion of Bozdogan (1994, 2004) as our fitness function: to choose the number of component clusters, to choose the correct component covariance matrix structure among nine candidate covariance structures, and to select the optimal parameters and the best-fitting mixture model. We demonstrate our approach on simulated datasets and on a large real data set, focusing on early detection of breast cancer, and show that our approach improves on the classification error probability of existing methods.
APA, Harvard, Vancouver, ISO, and other styles
47

"Bayesian analysis for time series of count data." Thesis, 2014. http://hdl.handle.net/10388/ETD-2014-07-1589.

Full text
Abstract:
Time series involving count data arise in a wide variety of applications. In many applications the observed counts are small and dependent, and failure to take these facts into account can lead to misleading inferences and to the detection of false relationships. To tackle such issues, a Poisson parameter-driven model is assumed for the time series at hand. This model accounts for the time dependence between observations by introducing an autoregressive latent process. In this thesis, we consider Bayesian approaches for estimating the Poisson parameter-driven model. The main challenge is that the likelihood function for the observed counts involves a high-dimensional integral after integrating out the latent variables. The contributions of this thesis are threefold. First, I develop a new single-move (SM) Markov chain Monte Carlo (MCMC) method to sample the latent variables one by one. Second, I adapt the particle Gibbs sampler (PGS) method of Andrieu et al. to our model setting and compare its performance with the SM method. Third, I consider Bayesian composite likelihood methods and compare three different adjustment methods with the unadjusted method and the SM method. The comparisons provide a practical guide to which method to use. Simulation studies comparing the latter two methods with the SM method show that the SM method outperforms the PGS method for small sample sizes, while the two perform almost the same for large sample sizes; the SM method is, however, much faster than the PGS method. The adjusted Bayesian composite likelihood methods provide results closer to the SM method than the unadjusted one. The PGS method and the adjustment method selected from the simulation studies are also compared with the SM method on a real data example, with similar results: the PGS method provides results very close to those of the SM method, and the adjusted composite likelihood methods provide results closer to the SM method than the unadjusted one.
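The abstract does not spell out the form of the single-move update; one standard choice is a random-walk Metropolis-within-Gibbs step for each latent state in turn. In this sketch the static parameters are held fixed at their true values purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated Poisson parameter-driven series: a latent AR(1) drives the log-intensity.
T, phi, s_eta = 200, 0.8, 0.4
x = np.zeros(T)
for t in range(1, T):
    x[t] = phi * x[t - 1] + s_eta * rng.standard_normal()
y = rng.poisson(np.exp(0.5 + x))

def single_move_sweep(x, y, beta, phi, s2, step=0.5):
    """One sweep of single-move Metropolis-within-Gibbs over the latent states."""
    T = len(x)
    for t in range(T):
        prop = x[t] + step * rng.standard_normal()
        def log_target(v):
            lp = y[t] * (beta + v) - np.exp(beta + v)      # Poisson log-likelihood
            if t > 0:
                lp += -(v - phi * x[t - 1])**2 / (2 * s2)  # AR(1) link to the past
            if t < T - 1:
                lp += -(x[t + 1] - phi * v)**2 / (2 * s2)  # AR(1) link to the future
            return lp
        if np.log(rng.random()) < log_target(prop) - log_target(x[t]):
            x[t] = prop
    return x

# Usage: run sweeps at fixed static parameters to sample the latent path.
x_draw = np.zeros(T)
for _ in range(500):
    x_draw = single_move_sweep(x_draw, y, beta=0.5, phi=phi, s2=s_eta**2)
print(np.round(x_draw[:10], 2))
```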
APA, Harvard, Vancouver, ISO, and other styles
48

Groiez, Assia. "Recyclage des candidats dans l'algorithme Metropolis à essais multiples." Thèse, 2014. http://hdl.handle.net/1866/10853.

Full text
Abstract:
Markov chain Monte Carlo (MCMC) algorithms are methods for sampling from probability distributions. These tools are based on the path of a Markov chain whose stationary distribution is the distribution to be sampled. Given their relative ease of application, they are one of the most popular approaches in the statistical community, especially in Bayesian analysis, and are widely used for sampling from complex and/or high-dimensional probability distributions. Since the appearance of the first MCMC method in 1953 (the Metropolis algorithm, see [10]), interest in these methods, as well as the range of algorithms available, has continued to grow from one year to the next. Although the Metropolis-Hastings algorithm (see [8]) can be considered one of the most general Markov chain Monte Carlo algorithms, it is also one of the easiest to understand and explain, making it an ideal algorithm for beginners, and it has been studied and developed by several researchers. The multiple-try Metropolis (MTM) algorithm, proposed by [9], is considered an interesting development in this field, but unfortunately its implementation is quite expensive in terms of computing time. Recently, a new algorithm was developed by [1]: the revisited multiple-try Metropolis algorithm (MTM revisited), obtained by expressing the MTM method as a Metropolis-Hastings algorithm on an extended space. The objective of this work is first to present MCMC methods, and subsequently to study and analyze the Metropolis-Hastings and standard MTM algorithms, to give readers a better perspective on the implementation of these methods. A second objective is to explore the advantages and disadvantages of the revisited MTM algorithm, to see whether it meets the expectations of the statistical community. We finally attempt to counteract the tendency of the revisited MTM algorithm to remain at its current state, which leads to a new algorithm. The latter performs efficiently when the number of candidates generated at each iteration is small, but its performance deteriorates as the number of candidates increases.
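For concreteness, here is a compact sketch of one standard MTM step, in the form usually attributed to Liu, Liang and Wong (2000), with a symmetric random-walk proposal and weights w(y, x) = pi(y); the Cauchy target and tuning constants are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(4)

def log_pi(v):                         # example target: standard Cauchy
    return -np.log1p(v * v)

def mtm_step(x, k=5, scale=2.0):
    """One multiple-try Metropolis step: draw k trials, select one in
    proportion to pi, then accept against a reference set drawn from y."""
    ys = x + scale * rng.standard_normal(k)          # k trial points
    lw = log_pi(ys)
    wy = np.exp(lw - lw.max())
    y = ys[rng.choice(k, p=wy / wy.sum())]           # select one trial
    xs = np.append(y + scale * rng.standard_normal(k - 1), x)  # reference set
    num = np.sum(np.exp(log_pi(ys)))
    den = np.sum(np.exp(log_pi(xs)))
    return y if rng.random() < num / den else x      # generalised MH ratio

chain = np.empty(5000)
chain[0] = 0.0
for i in range(1, len(chain)):
    chain[i] = mtm_step(chain[i - 1])
print("sample median:", round(float(np.median(chain)), 2))  # should be near 0
```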
APA, Harvard, Vancouver, ISO, and other styles
49

Livingston, Jr Glen. "A Bayesian analysis of a regime switching volatility model." Thesis, 2017. http://hdl.handle.net/1959.13/1342483.

Full text
Abstract:
Research Doctorate - Doctor of Philosophy (PhD)
Non-linear time series data is often generated by complex systems. While linear models provide a good first approximation of a system, often a more sophisticated non-linear model is required to properly account for the features of such data. Correctly accounting for these features should lead to the fitting of a more appropriate model. Determining the features exhibited by a particular data set is a difficult task, particularly for inexperienced modellers. Therefore, it is important to move towards a modelling paradigm where little to no user input is required, in order to open statistical modelling to users less experienced in MCMC. This sort of modelling process requires a general class of models that is able to account for the features found in most linear and non-linear data sets. One such class is the STAR-GARCH class of models. These are reasonably general models that permit regime changes in the conditional mean and allow for changes in the conditional covariance. In this thesis, we develop original algorithms that combine the tasks of parameter estimation and model selection for univariate and multivariate STAR-GARCH models. The model order of the conditional mean and the model index of the conditional covariance equation are included as parameters for the model requiring estimation. Combining the tasks of parameter estimation and model selection is facilitated through the Reversible Jump MCMC methodology. Other MCMC algorithms employed for the posterior distribution simulators are the Gibbs sampler, Metropolis-Hastings, Multiple-Try Metropolis and Delayed Rejection Metropolis-Hastings algorithms. The posterior simulation algorithms are successfully implemented in the statistical software program R, and their performance is tested in both extensive simulation studies and practical applications to real world data. The current literature on multivariate extensions of STAR, GARCH, and STAR-GARCH models is quite limited from a Bayesian perspective. The implementation of a set of estimation algorithms that not only provide parameter estimates but is also able to automatically fit the model with highest posterior probability is a significant and original contribution. The impact of such a contribution will hopefully be a step forward on the path towards the automation of time series modelling.
APA, Harvard, Vancouver, ISO, and other styles
50

Thyer, Mark Andrew. "Modelling Long-Term Persistence in Hydrological Time Series." Thesis, 2001. http://hdl.handle.net/1959.13/24891.

Full text
Abstract:
The hidden state Markov (HSM) model is introduced as a new conceptual framework for modelling long-term persistence in hydrological time series. Unlike the stochastic models currently used, the conceptual basis of the HSM model can be related to the physical processes that influence long-term hydrological time series in the Australian climatic regime. A Bayesian approach was used for model calibration. This enabled rigorous evaluation of parameter uncertainty, which proved crucial for the interpretation of the results. Applying the single-site HSM model to rainfall data from selected Australian capital cities provided some revealing insights. In eastern Australia, where there is a significant influence from the tropical Pacific weather systems, the results showed that a weak wet and medium dry state persistence was likely to exist. In southern Australia the results were inconclusive; however, they suggested that a weak wet and strong dry persistence structure may exist, possibly due to the infrequent incursion of tropical weather systems into southern Australia. This led to the postulate that the tropical weather systems are the primary cause of two-state long-term persistence. The single- and multi-site HSM model results for the Warragamba catchment rainfall data supported this hypothesis: a strong two-state persistence structure was likely to exist in the rainfall regime of this important water supply catchment. In contrast, the single- and multi-site results for the Williams River catchment rainfall data were inconsistent, illustrating that further work is required to understand the application of the HSM model. Comparisons with the lag-one autoregressive [AR(1)] model showed that it was not able to reproduce the same long-term persistence as the HSM model; however, with record lengths typical of real data, the difference between the two approaches was not statistically significant. Nevertheless, it was concluded that the HSM model provides a conceptually richer framework than the AR(1) model.
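As a sketch of the kind of conditional draw that Bayesian calibration of a two-state hidden-state model requires, the block below forward-filters and backward-samples the hidden wet/dry path of a toy series. The Gaussian emission model and fixed transition parameters are assumptions of the sketch, not the thesis's actual likelihood.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy annual-rainfall-style series from a two-state (dry/wet) hidden Markov
# model with state-dependent normal means, standing in for the HSM idea.
T = 120
P = np.array([[0.9, 0.1], [0.2, 0.8]])      # persistence in each state
means, sd = np.array([600.0, 900.0]), 80.0  # dry and wet mean rainfall
s = np.zeros(T, dtype=int)
for t in range(1, T):
    s[t] = rng.choice(2, p=P[s[t - 1]])
rain = rng.normal(means[s], sd)

def sample_states(y, P, means, sd):
    """Forward-filter, backward-sample draw of the hidden state path,
    one conditional Gibbs block inside a larger sampler."""
    T, K = len(y), len(means)
    like = np.exp(-0.5 * ((y[:, None] - means[None, :]) / sd) ** 2)
    f = np.zeros((T, K))
    f[0] = like[0] / like[0].sum()           # uniform initial distribution
    for t in range(1, T):                    # forward filtering
        f[t] = like[t] * (f[t - 1] @ P)
        f[t] /= f[t].sum()
    states = np.zeros(T, dtype=int)
    states[-1] = rng.choice(K, p=f[-1])
    for t in range(T - 2, -1, -1):           # backward sampling
        p = f[t] * P[:, states[t + 1]]
        states[t] = rng.choice(K, p=p / p.sum())
    return states

draw = sample_states(rain, P, means, sd)
print("fraction of years in the wet state:", round(float(draw.mean()), 2))
```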
PhD Doctorate
APA, Harvard, Vancouver, ISO, and other styles
