Dissertations / Theses: 'Hierarchical variables'

1

Auyang, Arick Gin-Yu. "Robustness and hierarchical control of performance variables through coordination during human locomotion." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/42837.

Abstract:

The kinematic motor redundancy of the human legs provides more local degrees of freedom than are necessary to achieve low degree-of-freedom performance variables such as leg length and orientation. The purpose of this dissertation is to investigate how the neuromuscular skeletal system simplifies control of a kinematically redundant system to achieve stable locomotion under different conditions. I propose that the neuromuscular skeletal system minimizes step-to-step variance of leg length and orientation while allowing segment angles to vary within the set of acceptable angle combinations that achieve the desired leg length and orientation. I find that during human hopping, control of the locomotor system is organized hierarchically such that leg length and orientation are achieved by structuring segment angle variance. I also find that the variance of leg length and leg orientation is minimized across a variety of conditions and perturbations, including frequency, constrained foot placement, and different speeds. The results of this study provide information on the interjoint compensation strategies used when the locomotor system is perturbed. This work also provides evidence for the strategies the neuromuscular system uses in adapting to novel, difficult tasks. This information can be extended to suggest new areas of focus in gait rehabilitation for people with motor control deficits in movement and gait.

2

RIGGI, DANIELE. "Mixture factor model for hierarchical data structure and applications to the italian educational school system." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2011. http://hdl.handle.net/10281/19465.

Abstract:

Nowadays, the educational context is one of the most important and interesting fields of application in social science. The object of interest is often the relation between student ability and motivation. Multilevel structures and latent variable models are often encountered in social science. In this work, we present an extension of multilevel mixture factor models (MMFM) (Riggi & Vermunt, in press, 2011; Varriale & Vermunt, in press, 2009), with an application to the Italian school system. These models combine factor analysis and latent class analysis. The purpose of the MMFM is twofold: to classify teachers according to class motivation structure, and to analyze home and teacher influences on pupil reading motivation.

3

Chastaing, Gaëlle. "Indices de Sobol généralisés par variables dépendantes." Thesis, Grenoble, 2013. http://www.theses.fr/2013GRENM046.

Abstract:

In a model that may be complex and strongly nonlinear, the input parameters, sometimes very numerous, can cause substantial variability in the output. Global sensitivity analysis is a stochastic approach for locating the main sources of uncertainty in the model, that is, for identifying and ranking the most influential input variables. In this way, it is possible to reduce the dimension of a problem and to decrease the uncertainty of the inputs. Sobol indices, whose construction rests on a decomposition of the global variance of the model, are very frequently used measures for reaching such objectives. However, these indices are based on the functional decomposition of the output, also known as the Hoeffding decomposition, and this decomposition is unique only if the input variables are assumed independent. In this thesis, we are interested in extending Sobol indices to models with dependent input variables. First, we propose a generalization of the Hoeffding decomposition to the case where the form of the input distribution is more general than a product distribution. From this generalized decomposition, with its specific orthogonality constraints, follows the construction of generalized sensitivity indices able to measure the variability of one or more correlated factors in the model. Second, we propose two methods for estimating these indices. The first is suited to models with inputs that are dependent in pairs. It relies on the numerical solution of a functional linear system involving projection operators. The second method, which can be applied to much more general models, relies on the recursive construction of a system of functions that satisfy the orthogonality constraints tied to the generalized decomposition. In parallel, we apply these methods to various test cases.
A mathematical model aims to characterize a complex system or process that is too expensive to study experimentally. However, in such a model, which is often strongly nonlinear, the input parameters can be affected by large uncertainty, including measurement errors or lack of information. Global sensitivity analysis is a stochastic approach whose objective is to identify and rank the input variables that drive the uncertainty of the model output. Through this analysis, it is then possible to reduce the model dimension and the variation in the model output. To reach this objective, the Sobol indices are commonly used. Based on the functional ANOVA decomposition of the output, also called the Hoeffding decomposition, they rest on the assumption that the inputs are independent. Our contribution is the extension of Sobol indices to models with non-independent inputs. On the one hand, we propose a generalized functional decomposition whose components are subject to specific orthogonality constraints. This decomposition leads to the definition of generalized sensitivity indices able to quantify the contribution of dependent inputs to the model variability. On the other hand, we propose two numerical methods to estimate these indices. The first is well suited to models with independent pairs of dependent input variables. It proceeds by solving a linear system involving suitable projection operators. The second method can be applied to more general models. It relies on the recursive construction of functional systems satisfying the orthogonality properties of the summands of the generalized decomposition. In parallel, we illustrate the two methods on numerical examples to test the efficiency of the techniques.
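
For orientation, the classical first-order Sobol index that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration under the independence assumption that the thesis sets out to relax, not the generalized indices it proposes. The test model `Y = X1 + 2*X2`, the standard normal inputs, and all function names are assumptions made for this example. The estimator is the standard "pick-freeze" scheme: two samples that share only input `i` have covariance `Var(E[Y | X_i])`.

```python
import random


def model(x):
    """Hypothetical test model Y = X1 + 2*X2 (not from the thesis)."""
    return x[0] + 2.0 * x[1]


def first_order_sobol(f, dim, n, seed=0):
    """Pick-freeze estimate of first-order Sobol indices.

    Assumes independent standard normal inputs; this is exactly the
    assumption that fails for dependent inputs.
    """
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    b = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
    ya = [f(row) for row in a]
    mean_a = sum(ya) / n
    var_a = sum((y - mean_a) ** 2 for y in ya) / (n - 1)
    indices = []
    for i in range(dim):
        # Fresh sample b, except input i, which is "frozen" from sample a.
        yc = [f(rb[:i] + [ra[i]] + rb[i + 1:]) for ra, rb in zip(a, b)]
        mean_c = sum(yc) / n
        cov = sum((u - mean_a) * (v - mean_c) for u, v in zip(ya, yc)) / (n - 1)
        indices.append(cov / var_a)
    return indices


if __name__ == "__main__":
    # Analytically, Var(Y) = 1 + 4 = 5, so S1 = 0.2 and S2 = 0.8.
    print(first_order_sobol(model, dim=2, n=40_000))
```

With independent inputs the first-order indices of this additive model sum to one; once the inputs are correlated, the variance no longer splits this cleanly, which is the gap the generalized decomposition addresses.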

4

Pfister, Mark. "Distribution of a Sum of Random Variables when the Sample Size is a Poisson Distribution." Digital Commons @ East Tennessee State University, 2018. https://dc.etsu.edu/etd/3459.

Abstract:

A probability distribution is a function that describes the probabilities of the possible outcomes of an experiment or occurrence. Many probability distributions give the probability of an event for a given sample size n. An important question in statistics is to determine the distribution of the sum of independent random variables when the sample size n is fixed. For example, it is known that the sum of n independent Bernoulli random variables with success probability p follows a binomial distribution with parameters n and p. However, this is no longer true when the sample size is not fixed but is itself a random variable. The goal of this thesis is to determine the distribution of the sum of independent random variables when the sample size follows a Poisson distribution. We also discuss the mean and the variance of this unconditional distribution.
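
The Bernoulli case mentioned in the abstract can be checked by simulation. The sketch below is an illustration, not the thesis's derivation: it relies on the standard result (Poisson thinning) that a sum of N independent Bernoulli(p) variables with N ~ Poisson(λ), independent of the summands, is Poisson(λp), so its mean and variance both equal λp. The parameter values and function names are assumptions chosen for the example.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method.

    Adequate for small lam; not meant for large rates.
    """
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def random_sum_bernoulli(lam, p, rng):
    """Sum of N Bernoulli(p) draws, where N ~ Poisson(lam)."""
    n = sample_poisson(lam, rng)
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate(lam=4.0, p=0.3, draws=50_000, seed=0):
    """Return the sample mean and sample variance of the random sum."""
    rng = random.Random(seed)
    samples = [random_sum_bernoulli(lam, p, rng) for _ in range(draws)]
    mean = sum(samples) / draws
    var = sum((s - mean) ** 2 for s in samples) / (draws - 1)
    return mean, var


if __name__ == "__main__":
    mean, var = simulate()
    # Both should be close to lam * p = 1.2 if the sum is Poisson(lam * p).
    print(mean, var)
```

A fixed-n binomial sum would instead have variance n·p·(1−p), smaller than its mean, so the equality of mean and variance here is a quick sign that the unconditional distribution has changed.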

5

Hay, John Leslie. "Statistical modelling for non-Gaussian time series data with explanatory variables." Thesis, Queensland University of Technology, 1999.

6

Gebremeskel, Haftu Gebrehiwot. "Implementing hierarchical bayesian model to fertility data: the case of Ethiopia." Doctoral thesis, Università degli studi di Padova, 2016. http://hdl.handle.net/11577/3424458.

Abstract:

Background: Ethiopia is a country with 9 ethnically based administrative regions and 2 city administrations, and it is often cited for, among other things, its high fertility rates and rapid population growth. Despite the country's efforts to reduce them, both remain high, especially at the regional level. The study of fertility in Ethiopia, and particularly in its regions, where fertility variation and its repercussions are most acute, is therefore of paramount importance. An easy way to find the characteristics of a fertility distribution is to build a suitable model of the fertility pattern using mathematical curves, and the age-specific fertility rate (ASFR) is useful in this regard. In general, the age-specific fertility pattern is said to have a typical shape common to all human populations over the years, although many countries, some of them in Africa, have already started to deviate from this classical bell-shaped curve. Some existing models are therefore inadequate to describe the patterns of many African countries, including Ethiopia. A number of parametric and non-parametric functions have been used to describe the shape of the age-specific fertility (ASF) curve in the developed world, but these models have not yet been fitted to data from Africa in general or Ethiopia in particular. To model fertility patterns in Ethiopia accurately, a new mathematical model is required that is both easy to use and fits the data well.
Objective: The principal goals of this thesis are fourfold: (1) to examine the pattern of ASFRs at country and regional level in Ethiopia; (2) to propose a model that best captures the various shapes of ASFRs at both country and regional level, and then to compare its performance with that of some existing models; (3) to fit the proposed model using hierarchical Bayesian techniques and show that this method is flexible enough for local estimates vis-à-vis the traditional formula, where estimates may be very imprecise because of small sample sizes; and (4) to compare the resulting estimates with those obtained from non-hierarchical procedures, namely their Bayesian and maximum likelihood counterparts.
Methodology: In this study, we propose a four-parameter parametric model, the Skew Normal model, to fit the fertility schedules, and show that it is flexible enough to capture the fertility patterns observed at country level and in most regions of Ethiopia. To assess the performance of the proposed model, we conducted a preliminary analysis comparing it with ten other parametric and non-parametric models commonly used in the demographic literature, namely the Quadratic Spline function, Cubic Splines, the Coale-Trussell function, Beta, Gamma, the Hadwiger distribution, Polynomial models, the Adjusted Error Model, the Gompertz curve, and the Peristera & Kostaki model. The models were fitted by nonlinear regression with nonlinear least squares (nls) estimation, and the Akaike Information Criterion (AIC) was used as the model selection criterion. For many demographers, however, estimating a region-specific ASFR model and the associated uncertainty can be difficult, especially when sample sizes vary greatly across regions. It has recently been proposed that hierarchical procedures may provide more reliable parameter estimates for local or regional-level analyses than non-hierarchical procedures such as complete pooling and independence. In this study, a hierarchical Bayesian procedure was therefore formulated to explore the posterior distribution of the model parameters (to generate region-specific ASFR point estimates and uncertainty bounds). In addition, non-hierarchical approaches, namely Bayesian and maximum likelihood methods, were used to estimate the parameters, and their results were compared with the hierarchical Bayesian counterparts. Gibbs sampling together with the Metropolis-Hastings algorithm in R (R Development Core Team, 2005) was applied to draw posterior samples for each parameter. A data augmentation method was also implemented to ease the sampling process. Sensitivity analysis, convergence diagnosis and model checking were conducted thoroughly to verify the robustness of the results. In all cases, non-informative prior distributions were used for all regional parameter vectors in order to reflect the lack of knowledge about these random variables.
Result: The preliminary analysis showed that the Akaike Information Criterion (AIC) value of the proposed Skew Normal (SN) model is the lowest in the capital Addis Ababa, Dire Dawa, Harari, Affar, Gambela and Benshangul-Gumuz, as well as for the country-level data. In the remaining regions, namely Tigray, Oromiya, Amhara, Somali and SNNP, its AIC was higher than that of some models and lower than that of the rest. The proposed model was therefore able to capture the fertility pattern in the empirical data for Ethiopia and its regions better than the other existing models considered in 6 of the 11 regions. The results from the HBA indicate that most of the posterior means were much closer to the true fixed fertility values. They were also more precise, with lower uncertainty and narrower credible intervals, than their ML and Bayesian analogues.
Conclusion: From the preliminary analysis, it can be concluded that the proposed model captured the ASFR pattern at national level and in the regions better than the other common models considered. Following this result, we conducted inference and prediction for the model parameters using three approaches: the HBA, BA and ML methods. The overall results suggest several points. One is that HBA was the best approach for such data, as it gave more consistent and more precise (lower uncertainty) estimates than the other approaches. In general, both the ML method and the Bayesian method can be used to analyze our model, but they apply under different conditions. The ML method can be applied when precise values of the model parameters are known and a large sample can be obtained; the Bayesian method can be applied when there is uncertainty about the model parameters, prior knowledge of the parameters is available, and little data is available in the study.
Background: Ethiopia is a country divided into 9 administrative regions (defined on an ethnic basis) and two cities. It is often cited as an example of high fertility and rapid population growth. Despite the government's efforts, fertility and population growth remain high, especially at the regional level. The study of fertility in Ethiopia and its regions, which are characterized by high variability, is therefore of vital importance. A simple way to capture the different characteristics of the fertility distribution is to build a suitable model by specifying different mathematical functions. In this respect, it is worth focusing on age-specific fertility rates, which show a well-defined shape common to all populations. However, many countries show a "symmetrization" that many models fail to capture adequately. Some parametric models have therefore been used to capture the shape of the age-specific rates, but the use of such models is still very limited in Africa, and in Ethiopia in particular.
Objective: This work uses a new model of fertility in Ethiopia with four specific objectives: (1) to examine the shape of Ethiopia's age-specific rates at national and regional level; (2) to propose a model that best captures the various shapes of the age-specific rates at both national and regional level, and to compare its performance with that of other existing models; (3) to fit the proposed fertility function through a hierarchical Bayesian model and show that this model is flexible enough to estimate the fertility of individual regions, where estimates can be imprecise because of small sample sizes; (4) to compare the resulting estimates with those given by non-hierarchical methods (maximum likelihood or simple Bayesian).
Methodology: In this study, we propose a 4-parameter model, the Skew Normal, to model age-specific fertility rates. We show that this model is flexible enough to capture adequately the shapes of the age-specific rates at both national and regional level. To assess the model's performance, a preliminary analysis compared it with ten other parametric and non-parametric models used in the demographic literature: the quadratic spline function, the cubic spline, the Coale-Trussell, Beta, Gamma, Hadwiger, polynomial, Gompertz and Peristera-Kostaki models, and the Adjusted Error Model. The models were estimated by nonlinear least squares (nls), and the Akaike Information Criterion was used to assess their performance. However, estimation for individual regions can be difficult where sample sizes vary widely. We therefore propose hierarchical procedures, which give more reliable estimates than non-hierarchical models (complete "pooling" or "unpooling") for regional-level analysis. In this study a hierarchical Bayesian model is formulated, yielding the posterior distribution of the parameters for the regional age-specific fertility rates and the associated uncertainty estimates. Other non-hierarchical methods (simple Bayesian and maximum likelihood) are also used for comparison. The Gibbs sampling and Metropolis-Hastings algorithms are used to sample from the posterior distribution of each parameter. The "data augmentation" method is also used to obtain the estimates. The robustness of the results is checked through a sensitivity analysis, and appropriate convergence diagnostics for the algorithms are reported in the text. In all cases, non-informative prior distributions were used.
Results: The results of the preliminary analysis show that the Skew Normal model has the lowest AIC in the regions Addis Ababa, Dire Dawa, Harari, Affar, Gambela and Benshangul-Gumuz, and also for the national estimates. In the other regions (Tigray, Oromiya, Amhara, Somali and SNNP) the Skew Normal model is not the best, but it still fits the data well. The Skew Normal model is thus the best in 6 of the 11 regions and for the age-specific rates of the whole country.
Conclusions: The Skew Normal model is thus the best overall. Starting from this initial result, the hierarchical Bayesian, simple Bayesian and maximum likelihood models were built. The comparison of these three approaches shows that the hierarchical model gives more precise estimates than the others.
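
The contrast the abstract above draws between hierarchical estimation, complete pooling, and independent per-region estimation can be illustrated with a toy normal-normal model. This is a simplified sketch, not the thesis's Skew Normal model or its MCMC scheme: it assumes a known within-region variance `sigma2`, a known between-region variance `tau2`, and a known grand mean, all of which are hypothetical inputs for illustration.

```python
def partial_pooling_mean(region_mean, n_obs, sigma2, grand_mean, tau2):
    """Posterior mean of one region's mean under a normal-normal model.

    The estimate is a precision-weighted average of the region's own
    sample mean (no pooling) and the grand mean (complete pooling).
    """
    data_precision = n_obs / sigma2   # information from the region's data
    prior_precision = 1.0 / tau2      # information shared across regions
    weight = data_precision / (data_precision + prior_precision)
    return weight * region_mean + (1.0 - weight) * grand_mean


if __name__ == "__main__":
    # Same observed regional mean, very different sample sizes.
    small = partial_pooling_mean(6.0, n_obs=4, sigma2=4.0, grand_mean=4.0, tau2=1.0)
    large = partial_pooling_mean(6.0, n_obs=400, sigma2=4.0, grand_mean=4.0, tau2=1.0)
    print(small, large)
```

A region with few observations is pulled strongly toward the grand mean, while a well-sampled region keeps an estimate close to its own data, which is why hierarchical estimates tend to be more stable when sample sizes vary widely across regions.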

7

Gardiner, Robert B. "The relationship between teacher qualifications and chemistry achievement in the context of other student and teacher/school variables : application of hierarchical linear modelling /." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape11/PQDD_0003/MQ42384.pdf.

8

Han, Gang. "Modeling the output from computer experiments having quantitative and qualitative input variables and its applications." Columbus, Ohio : Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1228326460.

9

Saves, Paul. "High dimensional multidisciplinary design optimization for eco-design aircraft." Electronic Thesis or Diss., Toulouse, ISAE, 2024. http://www.theses.fr/2024ESAE0002.

Abstract:

Nowadays, significant and growing interest in improving vehicle design processes can be observed in the field of multidisciplinary optimization, thanks to the development of new tools and techniques. Concretely, in aerostructural design, aerodynamic and structural variables influence one another and jointly affect quantities of interest such as weight or fuel consumption. Multidisciplinary optimization thus emerges as a powerful tool that can perform interdisciplinary trade-offs. In aeronautical design, the multidisciplinary process generally involves mixed design variables, both continuous and categorical. For example, the size of an aircraft's structural parts can be described with continuous variables, the number of panels is associated with an integer, and the list of cross sections or the choice of materials corresponds to categorical choices. The objective of this thesis is to propose an efficient approach for optimizing a black-box multidisciplinary model when the optimization problem is constrained and involves a large number of mixed design variables (typically 100 variables). The Bayesian optimization approach used consists of an adaptive sequential enrichment of a metamodel to approach the optimum of the objective function while satisfying the constraints. Gaussian process surrogate models are among the most widely used in engineering problems to replace computationally expensive high-fidelity models. Efficient global optimization is a heuristic Bayesian optimization method designed for the global solution of expensive-to-evaluate optimization problems, and it yields good-quality results quickly. However, like any other global optimization method, it suffers from the curse of dimensionality, meaning that its performance is satisfactory for low-dimensional problems but deteriorates rapidly as the dimension of the search space increases. This is all the more true because design problems for complex systems involve both continuous and categorical variables, which further increases the size of the search space. In this thesis, we propose methods to significantly reduce the number of design variables, for example active learning techniques such as partial least squares regression. This work thus adapts Bayesian optimization to discrete variables and to high dimension in order to reduce the number of evaluations when optimizing innovative, less polluting aircraft concepts such as the "DRAGON" hybrid-electric configuration.
There is significant and growing interest in improving the efficiency of vehicle design processes through the development of tools and techniques in the field of multidisciplinary design optimization (MDO). When optimizing both aerodynamics and structures, one needs to consider the effect of the aerodynamic shape variables and structural sizing variables on the weight, which in turn affects fuel consumption. MDO arises as a powerful tool that can perform this trade-off automatically. The objective of the PhD project is to propose an efficient approach for solving an aero-structural wing optimization problem at the conceptual design level. The problem is formulated as a constrained optimization problem that involves a large number of design variables (typically 700 variables). The targeted optimization approach is based on sequential enrichment (typically efficient global optimization (EGO)) using an adaptive surrogate model. Kriging surrogate models are among the most widely used in engineering problems as substitutes for time-consuming high-fidelity models. EGO is a heuristic method designed for global optimization problems, and it has performed well in terms of the quality of the solutions computed. However, like any other method for global optimization, EGO suffers from the curse of dimensionality, meaning that its performance is satisfactory on low-dimensional problems but deteriorates as the dimensionality of the search space increases. For realistic aircraft wing design problems, the number of design variables typically exceeds 700, so solving the problems directly with EGO is ruled out. In practical test cases, high-dimensional MDO problems may possess a lower intrinsic dimensionality, which can be exploited for optimization. In this context, a feature mapping can be used to map the original high-dimensional design variables onto a sufficiently small design space. Most existing approaches in the literature use a random linear mapping to reduce the dimension; sometimes active learning is used to build this linear embedding. Generalizations to nonlinear subspaces have also been proposed using the so-called variational autoencoder. For instance, a composition of Gaussian processes (GP), referred to as a deep GP, can be very useful. In this PhD thesis, we will investigate efficient parameterization tools that significantly reduce the number of design variables by using active learning techniques. An extension of the method could also be proposed to handle mixed continuous and categorical inputs, building on previous work on low-dimensional problems. Practical implementations within the OpenMDAO framework (an open-source MDO framework developed by NASA) are expected.
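
The sequential enrichment step in EGO is usually driven by the expected improvement (EI) criterion, which can be sketched as below for a minimization problem. This is a generic illustration, not the thesis's implementation and not OpenMDAO code: the surrogate's predictive mean `mu` and standard deviation `sigma` at each candidate are taken as given, and the candidate values and function names are made up for the example.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """EI of a candidate whose prediction is N(mu, sigma^2), minimizing.

    `best` is the lowest objective value observed so far.
    """
    if sigma <= 0.0:
        return max(best - mu, 0.0)  # no predictive uncertainty left
    z = (best - mu) / sigma
    return (best - mu) * normal_cdf(z) + sigma * normal_pdf(z)


def pick_next(candidates, best):
    """Return the label of the candidate (label, mu, sigma) with largest EI."""
    return max(candidates, key=lambda c: expected_improvement(c[1], c[2], best))[0]


if __name__ == "__main__":
    # Hypothetical surrogate predictions at two candidate designs.
    candidates = [("near_optimum", 0.9, 0.01), ("unexplored", 1.2, 1.0)]
    print(pick_next(candidates, best=1.0))
```

The criterion trades off a low predicted mean against high predictive uncertainty, so a poorly explored design can be preferred over one predicted to be only slightly better than the current best. Maximizing EI over hundreds of design variables is where the dimensionality problem described in the abstract becomes acute.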

10

Guin, Ophélie. "Méthodes bayésiennes semi-paramétriques d'extraction et de sélection de variables dans le cadre de la dendroclimatologie." Phd thesis, Université Paris Sud - Paris XI, 2011. http://tel.archives-ouvertes.fr/tel-00636704.

Abstract:

According to the Intergovernmental Panel on Climate Change (IPCC), it is important to know the past climate in order to place current climate change in context. Many researchers have therefore worked on procedures for reconstructing past temperatures or precipitation from indirect climate indicators. These procedures are generally based on statistical methods, but estimating the uncertainties associated with these reconstructions remains a major difficulty. The main objective of this thesis is therefore to propose new statistical methods that allow precise estimation of the errors made, in particular for reconstructions from tree-ring data. In general, climate reconstructions from tree-ring measurements proceed in two steps: estimation of a hidden variable, common to a set of tree-ring measurement series and assumed to be climatic, and then estimation of the relationship between this hidden variable and certain climate variables. In both cases, we developed a new procedure based on semi-parametric Bayesian models. First, for the extraction of the common signal, we propose a semi-parametric hierarchical model that makes it possible to capture both the high and the low frequencies contained in tree rings, which was difficult in past dendroclimatological studies. Next, we developed a generalized additive model to model the link between the extracted signal and certain climate variables, thereby allowing nonlinear relationships, unlike the classical methods of dendrochronology. Each of these new methods is compared with the methods traditionally used by dendrochronologists in order to understand what they can offer them.

11

Rockwood, Nicholas John. "Estimating Multilevel Structural Equation Models with Random Slopes for Latent Covariates." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1554478681581538.

12

Julião, Heloise Pavanato. "Abundância e distribuição da baleia jubarte (Megaptera novaeangliae) na costa do Brasil." reponame:Repositório Institucional da FURG, 2013. http://repositorio.furg.br/handle/1/4023.

Abstract:

Dissertation (master's) - Universidade Federal do Rio Grande, Graduate Program in Biological Oceanography, Institute of Oceanography, 2013.
The population is the fundamental unit of conservation, and the simplest form of monitoring it involves regular sampling over time to determine population status. One of the Southern Hemisphere humpback whale populations uses the coast of Brazil between May and December for breeding and calving. This population, designated "breeding stock A" by the International Whaling Commission, has shown signs of recovery after a marked decline due to whaling and a long moratorium period. It is concentrated mainly on the Abrolhos Bank (BA), where calm, warm waters appear to provide an ideal habitat. This study aimed to estimate the size of the humpback population for the year 2011 and to predict the distribution of groups along the Brazilian coast. The distance sampling method was implemented, and hierarchical Bayesian models were proposed to estimate abundance. Conditional autoregressive models were applied to predict density in cells of 0.5° of latitude and longitude. Population size was estimated at 10,160 whales (Cr.I.95%=6,607-17,692). The highest densities were found between the Abrolhos Bank and Baía de Todos os Santos (BA). The results suggest that population growth leads to expansion of the population beyond the Abrolhos Bank.
The population is the fundamental unit of conservation, and its simplest monitoring tool involves regular sampling over time to assess population status. One of the Southern Hemisphere humpback whale populations winters off the Brazilian coast, typically from May to December, where breeding and calving occur. This population, labeled "breeding stock A" by the International Whaling Commission, has shown signs of recovery after a long period of whaling. The goal of this study was to estimate the population size of humpback whales for 2011 and to predict group distribution along the Brazilian coast. Distance sampling methods were implemented, and hierarchical Bayesian models were proposed to estimate abundance. Conditional autoregressive models were used to predict density on a lattice of 0.5° of latitude and longitude. Population size was estimated at 10,160 whales (Cr.I.95%=6,607-17,692). The highest densities were predicted to occur between Abrolhos Bank and Todos os Santos Bay (BA). The results suggest that the population increase leads to an expansion of the population beyond Abrolhos Bank.

13

Frühwirth-Schnatter, Sylvia, and Regina Tüchler. "Bayesian parsimonious covariance estimation for hierarchical linear mixed models." Institut für Statistik und Mathematik, WU Vienna University of Economics and Business, 2004. http://epub.wu.ac.at/774/1/document.pdf.

Abstract:

We considered a non-centered parameterization of the standard random-effects model, which is based on the Cholesky decomposition of the variance-covariance matrix. The regression type structure of the non-centered parameterization allows us to choose a simple, conditionally conjugate normal prior on the Cholesky factor. Based on the non-centered parameterization, we search for a parsimonious variance-covariance matrix by identifying the non-zero elements of the Cholesky factors using Bayesian variable selection methods. With this method we are able to learn from the data for each effect, whether it is random or not, and whether covariances among random effects are zero or not. An application in marketing shows a substantial reduction of the number of free elements of the variance-covariance matrix. (author's abstract)
Series: Research Report Series / Department of Statistics and Mathematics
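
The link between zero elements of the Cholesky factor and "non-random" effects in the abstract above can be illustrated with a minimal sketch. It assumes the random effects are written as b_i = C z_i with z_i ~ N(0, I), so that the variance-covariance matrix is Q = C C'. The matrix values and the function name `covariance_from_cholesky` are hypothetical, and the Bayesian variable selection step itself is not shown.

```python
import numpy as np

def covariance_from_cholesky(C):
    """Return Q = C C^T for a lower-triangular factor C."""
    return C @ C.T

# Hypothetical 3x3 factor (illustrative values only).
# The third row is all zeros, so the third effect has zero variance.
C = np.array([
    [1.0, 0.0, 0.0],
    [0.5, 2.0, 0.0],
    [0.0, 0.0, 0.0],
])
Q = covariance_from_cholesky(C)

# Non-centered draw: sample standard normal z_i, then map to b_i = C z_i.
rng = np.random.default_rng(0)
z = rng.standard_normal((1000, 3))
b = z @ C.T  # row i holds (C z_i)^T

print(Q)
print(np.allclose(b[:, 2], 0.0))  # third effect never varies across draws
```

In this sketch a zero row in the factor removes both the variance of that effect and all of its covariances, which is why selecting elements of the factor can decide whether an effect is random at all.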

14

Chao, Yi. "Bayesian Hierarchical Latent Model for Gene Set Analysis." Thesis, Virginia Tech, 2009. http://hdl.handle.net/10919/32060.

Abstract:

A pathway is a predefined set of genes that serves a particular cellular or physiological function. Ranking pathways relevant to a particular phenotype can help researchers focus on a few sets of genes in pathways. In this thesis, a Bayesian hierarchical latent model was proposed using a generalized linear random effects model. The advantage of the approach was that it can easily incorporate prior knowledge when the sample size was small and the number of genes was large. For the covariance matrix of a set of random variables, two Gaussian random processes were considered to construct the dependencies among genes in a pathway. One was based on the polynomial kernel and the other was based on the Gaussian kernel. Then these two kernels were compared with a constant covariance matrix of the random effect by using the ratio, which was based on the joint posterior distribution with respect to each model. For mixture models, log-likelihood values were computed at different values of the mixture proportion and compared among mixtures of selected kernels and point-mass density (or constant covariance matrix). The approach was applied to a data set (Mootha et al., 2003) containing the expression profiles of type II diabetes, where the motivation was to identify pathways that can discriminate between normal patients and patients with type II diabetes.
Master of Science

15

Robbins, Donald H. "Hierarchical modeling of laminated composite plates using variable kinematic finite elements and mesh superposition." Diss., Virginia Tech, 1993. http://hdl.handle.net/10919/40117.

16

Stone, Elizabeth Anne. "Multilevel Model Selection: A Regularization Approach Incorporating Heredity Constraints." Diss., Temple University Libraries, 2013. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/234414.

Abstract:

Statistics
Ph.D.
This dissertation focuses on estimation and selection methods for a simple linear model with two levels of variation. This model provides a foundation for extensions to more levels. We propose new regularization criteria for model selection, subset selection, and variable selection in this context. Regularization is a penalized-estimation approach that shrinks the estimate and selects variables for structured data. This dissertation introduces a procedure (HM-ALASSO) that extends regularized multilevel-model estimation and selection to enforce principles of fixed heredity (e.g., including main effects when their interactions are included) and random heredity (e.g., including fixed effects when their random terms are included). The goals in developing this method were to create a procedure that provided reasonable estimates of all parameters, adhered to fixed and random heredity principles, resulted in a parsimonious model, was theoretically justifiable, and was able to be implemented and used in available software. The HM-ALASSO incorporates heredity-constrained selection directly into the estimation process. HM-ALASSO is shown to enjoy the properties of consistency, sparsity, and asymptotic normality. The ability of HM-ALASSO to produce quality estimates of the underlying parameters while adhering to heredity principles is demonstrated using simulated data. The performance of HM-ALASSO is illustrated using a subset of the High School and Beyond (HS&B) data set that includes math-achievement outcomes modeled via student- and school-level predictors. The HM-ALASSO framework is flexible enough that it can be adapted for various rule sets and parameterizations.
Temple University--Theses

17

Jiang, Bo. "Partition Models for Variable Selection and Interaction Detection." Thesis, Harvard University, 2013. http://dissertations.umi.com/gsas.harvard:10911.

Abstract:

Variable selection methods play important roles in modeling high-dimensional data and are key to data-driven scientific discoveries. In this thesis, we consider the problem of variable selection with interaction detection. Instead of building a predictive model of the response given combinations of predictors, we start by modeling the conditional distribution of predictors given partitions based on responses. We use this inverse modeling perspective as motivation to propose a stepwise procedure for effectively detecting interaction with few assumptions on parametric form. The proposed procedure is able to detect pairwise interactions among \(p\) predictors with a computational time of \(O(p)\) instead of \(O(p^2)\) under moderate conditions. We establish consistency of the proposed procedure in variable selection under a diverging number of predictors and sample size. We demonstrate its excellent empirical performance in comparison with some existing methods through simulation studies as well as real data examples. Next, we combine the forward and inverse modeling perspectives under the Bayesian framework to detect pleiotropic and epistatic effects in expression quantitative trait loci (eQTL) studies. We augment the Bayesian partition model proposed by Zhang et al. (2010) to capture complex dependence structure among gene expression and genetic markers. In particular, we propose a sequential partition prior to model the asymmetric roles played by the response and the predictors, and we develop an efficient dynamic programming algorithm for sampling latent individual partitions. The augmented partition model significantly improves the power in detecting eQTLs compared to previous methods in both simulations and real data examples pertaining to yeast. Finally, we study the application of Bayesian partition models in the unsupervised learning of transcription factor (TF) families based on protein binding microarray (PBM). 
The problem of TF subclass identification can be viewed as the clustering of TFs with variable selection on their binding DNA sequences. Our model provides simultaneous identification of TF families and their shared sequence preferences, as well as DNA sequences bound preferentially by individual members of TF families. Our analysis may aid in deciphering cis regulatory codes and determinants of protein-DNA binding specificity.
Statistics

18

Šulc, Zdeněk. "Similarity Measures for Nominal Data in Hierarchical Clustering." Doctoral thesis, Vysoká škola ekonomická v Praze, 2013. http://www.nusl.cz/ntk/nusl-261939.

Abstract:

This dissertation deals with similarity measures for nominal data in hierarchical clustering, which can cope with variables with more than two categories, and which aspire to replace the simple matching approach commonly used in this area. These similarity measures take into account additional characteristics of a dataset, such as the frequency distribution of categories or the number of categories of a given variable. The thesis has three main aims. The first one is an examination and clustering performance evaluation of selected similarity measures for nominal data in hierarchical clustering of objects and variables. To achieve this goal, four experiments dealing with both object and variable clustering were performed. They examine the clustering quality of the examined similarity measures for nominal data in comparison with the commonly used similarity measures using a binary transformation, and moreover, with several alternative methods for nominal data clustering. The comparison and evaluation are performed on real and generated datasets. The outputs of these experiments show which similarity measures can generally be used, which ones perform well in a particular situation, and which ones are not recommended for object or variable clustering. The second aim is to propose a theory-based similarity measure, evaluate its properties, and compare it with the other examined similarity measures. Based on this aim, two novel similarity measures, Variable Entropy and Variable Mutability, are proposed; the former in particular performs very well in datasets with a lower number of variables. The third aim of this thesis is to provide a convenient software implementation based on the examined similarity measures for nominal data, which covers the whole clustering process from the computation of a proximity matrix to the evaluation of resulting clusters. 
This goal was also achieved by creating the nomclust package for the software R, which covers this issue, and which is freely available.
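
The simple matching approach that the abstract above uses as its baseline can be sketched in a few lines. This is an illustrative sketch only, not the nomclust package's API: the function names `simple_matching` and `proximity_matrix` and the toy data are hypothetical, and the Variable Entropy and Variable Mutability measures proposed in the thesis are not reproduced here.

```python
def simple_matching(x, y):
    """Share of variables on which two objects have the same category."""
    if len(x) != len(y):
        raise ValueError("objects must have the same number of variables")
    matches = sum(1 for a, b in zip(x, y) if a == b)
    return matches / len(x)

def proximity_matrix(rows):
    """Dissimilarity matrix (1 - similarity) for a list of nominal objects."""
    n = len(rows)
    return [[1.0 - simple_matching(rows[i], rows[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical toy data: three objects described by three nominal variables.
rows = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
]
D = proximity_matrix(rows)
for line in D:
    print(line)
```

Simple matching treats every match equally, regardless of how common a category is. The measures examined in the thesis differ precisely in weighting matches by dataset characteristics such as category frequencies.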

19

Pirathiban, Ramethaa. "Improving species distribution modelling: Selecting absences and eliciting variable usefulness for input into standard algorithms or a Bayesian hierarchical meta-factor model." Thesis, Queensland University of Technology, 2019. https://eprints.qut.edu.au/134401/1/Ramethaa_Pirathiban_Thesis.pdf.

Abstract:

This thesis explores and proposes methods to improve species distribution models. Throughout this thesis, a rich class of statistical modelling techniques has been developed to address crucial and interesting issues related to the data input into these models. The overall contribution of this research is the advancement of knowledge on species distribution modelling through an increased understanding of extraneous zeros, quality of the ecological data, variable selection that incorporates ecological theory and evaluating performance of the fitted models. Though motivated by the challenge of species distribution modelling from ecology, this research is broadly relevant to many fields, including bio-security and medicine. Specifically, this research is of potential significance to researchers seeking to: identify and explain extraneous zeros; assess the quality of their data; or employ expert-informed variable selection.

20

Charalambous, Christiana. "Variable selection in joint modelling of mean and variance for multilevel data." Thesis, University of Manchester, 2011. https://www.research.manchester.ac.uk/portal/en/theses/variable-selection-in-joint-modelling-of-mean-and-variance-for-multilevel-data(cbe5eb08-1e77-4b44-b7df-17bd4bf4937f).html.

Abstract:

We propose to extend the use of penalized likelihood based variable selection methods to hierarchical generalized linear models (HGLMs) for jointly modelling both the mean and variance structures. We are interested in applying these new methods on multilevel structured data, hence we assume a two-level hierarchical structure, with subjects nested within groups. We consider a generalized linear mixed model (GLMM) for the mean, with a structured dispersion in the form of a generalized linear model (GLM). In the first instance, we model the variance of the random effects which are present in the mean model, or in other words the variation between groups (between-level variation). In the second scenario, we model the dispersion parameter associated with the conditional variance of the response, which could also be thought of as the variation between subjects (within-level variation). To do variable selection, we use the smoothly clipped absolute deviation (SCAD) penalty, a penalized likelihood variable selection method, which shrinks the coefficients of redundant variables to 0 and at the same time estimates the coefficients of the remaining important covariates. Our methods are likelihood based and so in order to estimate the fixed effects in our models, we apply iterative procedures such as the Newton-Raphson method, in the form of the LQA algorithm proposed by Fan and Li (2001). We carry out simulation studies for both the joint models for the mean and variance of the random effects, as well as the joint models for the mean and dispersion of the response, to assess the performance of our new procedures against a similar process which excludes variable selection. The results show that our method increases both the accuracy and efficiency of the resulting penalized MLEs and has a 100% success rate in identifying the zero and non-zero components over 100 simulations. For the main real data analysis, we use the Health Survey for England (HSE) 2004 dataset. 
We investigate how obesity is linked to several factors such as smoking, drinking, exercise, long-standing illness, to name a few. We also discover whether there is variation in obesity between individuals and between households of individuals, as well as test whether that variation depends on some of the factors affecting obesity itself.
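
For readers unfamiliar with the SCAD penalty mentioned in the abstract above, the following minimal sketch evaluates the penalty function as defined by Fan and Li (2001). The function name `scad_penalty` is hypothetical, and a = 3.7 is the value those authors suggest as a default. The sketch shows only the penalty, not the LQA algorithm or the HGLM fitting procedure of the thesis.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty of Fan and Li (2001), evaluated elementwise.

    Linear (lasso-like) near zero, quadratic in a transition zone,
    and constant for large |theta|, so large coefficients are not shrunk.
    """
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t
    quadratic = (2.0 * a * lam * t - t**2 - lam**2) / (2.0 * (a - 1.0))
    constant = lam**2 * (a + 1.0) / 2.0
    return np.where(t <= lam, linear,
                    np.where(t <= a * lam, quadratic, constant))

# Evaluate the penalty on a few coefficient values with lam = 1.
values = np.array([0.0, 0.5, 1.0, 2.0, 3.7, 10.0])
print(scad_penalty(values, lam=1.0))
```

Because the penalty flattens out beyond a * lam, large coefficients incur no additional penalty as they grow, which is the property behind the reduced bias for important covariates relative to a pure L1 penalty.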

21

Raeli, Alice. "Solution of the variable coefficients Poisson equation on Cartesian hierarchical meshes in parallel : applications to phase changing materials." Thesis, Bordeaux, 2017. http://www.theses.fr/2017BORD0669/document.

Abstract:

On s'interesse aux problèmes elliptiques avec coéficients variables à travers des interfaces intérieures. La solution et ses dérivées normales peuvent subir des variations significatives à travers les frontières intérieures. On présente une méthode compacte aux différences finies sur des maillages adaptés de type octree conçues pour une résolution en parallèle. L'idée principale est de minimiser l'erreur de troncature sur la discretisation locale, en fonction de la configuration du maillage, en rapprochant une convergence à l'ordre deux. On montrera des cas 2D et 3D des résultat liés à des applications concrètes
We consider problems governed by a linear elliptic equation with varying coefficients across internal interfaces. The solution and its normal derivative can undergo significant variations through these internal boundaries. We present a compact finite-difference scheme on a tree-based adaptive grid that can be efficiently solved using a natively parallel data structure. The main idea is to optimize the truncation error of the discretization scheme as a function of the local grid configuration to achieve second order accuracy. Numerical illustrations relevant for actual applications are presented in two and three-dimensional configurations.

22

Kwon, Hyukje. "A Monte Carlo Study of Missing Data Treatments for an Incomplete Level-2 Variable in Hierarchical Linear Models." The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1303846627.

23

Huo, Shuning. "Bayesian Modeling of Complex High-Dimensional Data." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/101037.

Abstract:

With the rapid development of modern high-throughput technologies, scientists can now collect high-dimensional complex data in different forms, such as medical images and genomics measurements. However, acquisition of more data does not automatically lead to better knowledge discovery. One needs efficient and reliable analytical tools to extract useful information from complex datasets. The main objective of this dissertation is to develop innovative Bayesian methodologies to enable effective and efficient knowledge discovery from complex high-dimensional data. It contains two parts: the development of computationally efficient functional mixed models and the modeling of data heterogeneity via Dirichlet Diffusion Trees. The first part focuses on tackling the computational bottleneck in Bayesian functional mixed models. We propose a computational framework called variational functional mixed model (VFMM). This new method facilitates efficient data compression and high-performance computing in basis space. We also propose a new multiple testing procedure in basis space, which can be used to detect significant local regions. The effectiveness of the proposed model is demonstrated through two datasets, a mass spectrometry dataset in a cancer study and a neuroimaging dataset in an Alzheimer's disease study. The second part is about modeling data heterogeneity by using Dirichlet Diffusion Trees. We propose a Bayesian latent tree model that incorporates covariates of subjects to characterize the heterogeneity and uncover the latent tree structure underlying data. This innovative model may reveal the hierarchical evolution process through branch structures and estimate systematic differences between groups of samples. We demonstrate the effectiveness of the model through a simulation study and a real brain tumor dataset.
Doctor of Philosophy
With the rapid development of modern high-throughput technologies, scientists can now collect high-dimensional data in different forms, such as engineering signals, medical images, and genomics measurements. However, acquisition of such data does not automatically lead to efficient knowledge discovery. The main objective of this dissertation is to develop novel Bayesian methods to extract useful knowledge from complex high-dimensional data. It has two parts: the development of an ultra-fast functional mixed model and the modeling of data heterogeneity via Dirichlet Diffusion Trees. The first part focuses on developing approximate Bayesian methods in functional mixed models to estimate parameters and detect significant regions. Two datasets demonstrate the effectiveness of the proposed method: a mass spectrometry dataset in a cancer study and a neuroimaging dataset in an Alzheimer's disease study. The second part focuses on modeling data heterogeneity via Dirichlet Diffusion Trees. The method helps uncover the underlying hierarchical tree structures and estimate systematic differences between groups of samples. We demonstrate the effectiveness of the method through brain tumor imaging data.

24

Trahan, Patrick. "Classification of Carpiodes Using Fourier Descriptors: A Content Based Image Retrieval Approach." ScholarWorks@UNO, 2009. http://scholarworks.uno.edu/td/1085.

Abstract:

Taxonomic classification has always been important to the study of any biological system. Many biological species will go unclassified and become lost forever at the current rate of classification. The current state of computer technology makes image storage and retrieval possible on a global level. As a result, computer-aided taxonomy is now possible. Content based image retrieval techniques utilize visual features of the image for classification. By utilizing image content and computer technology, the gap between taxonomic classification and species destruction is shrinking. This content based study utilizes the Fourier Descriptors of fifteen known landmark features on three Carpiodes species: C. carpio, C. velifer, and C. cyprinus. Classification analysis involves both unsupervised and supervised machine learning algorithms. Fourier Descriptors of the fifteen known landmarks provide strong classification power on image data. Feature reduction analysis indicates that feature reduction is possible. This proves useful for increasing the generalization power of classification.

25

Huang, Huei-Ching, and 黃慧青. "Latent Class Model with Two Hierarchical Latent Variables." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/12163370355322189586.

26

"Bayesian analysis of generalized latent variable models with hierarchical data." Thesis, 2009. http://library.cuhk.edu.hk/record=b6075429.

Abstract:

Pan, Junhao.
Thesis (Ph.D.)--Chinese University of Hong Kong, 2009.
Includes bibliographical references (leaves 121-135).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstract also in Chinese.

27

BIONDI, LUIGI. "Identifiability of Discrete Hierarchical Models with One Latent Variable." Doctoral thesis, 2016. http://hdl.handle.net/2158/1028810.

Abstract:

In the thesis we discuss in depth the problem of identifiability for discrete models, showing many concrete situations in which it may fail. After this survey we concentrate on hierarchical log-linear models for binary variables with one hidden variable. These are more general than the LC models because they may include some higher-order interactions. These models may sometimes be interpreted as discrete undirected graphical models (also called concentration graph models), but they are more general.

28

"Multiple Imputation for Two-Level Hierarchical Models with Categorical Variables and Missing at Random Data." Doctoral diss., 2016. http://hdl.handle.net/2286/R.I.40705.

Abstract:

Accurate data analysis and interpretation of results may be influenced by many potential factors. The factors of interest in the current work are the chosen analysis model(s), the presence of missing data, and the type(s) of data collected. If analysis models are used which a) do not accurately capture the structure of relationships in the data such as clustered/hierarchical data, b) do not allow or control for missing values present in the data, or c) do not accurately compensate for different data types such as categorical data, then the assumptions associated with the model have not been met and the results of the analysis may be inaccurate. In the presence of clustered/nested data, hierarchical linear modeling or multilevel modeling (MLM; Raudenbush & Bryk, 2002) has the ability to predict outcomes for each level of analysis and across multiple levels (accounting for relationships between levels), providing a significant advantage over single-level analyses. When multilevel data contain missingness, multilevel multiple imputation (MLMI) techniques may be used to model both the missingness and the clustered nature of the data. With categorical multilevel data with missingness, categorical MLMI must be used. Two such routines for MLMI with continuous and categorical data were explored with missing at random (MAR) data: a formal Bayesian imputation and analysis routine in JAGS (R/JAGS) and a common MLM procedure of imputation via Bayesian estimation in BLImP with frequentist analysis of the multilevel model in Mplus (BLImP/Mplus). Manipulated variables included interclass correlations, number of clusters, and the rate of missingness. Results showed that with continuous data, R/JAGS returned more accurate parameter estimates than BLImP/Mplus for almost all parameters of interest across levels of the manipulated variables. 
Both R/JAGS and BLImP/Mplus encountered convergence issues and returned inaccurate parameter estimates when imputing and analyzing dichotomous data. Follow-up studies showed that JAGS and BLImP returned similar imputed datasets but the choice of analysis software for MLM impacted the recovery of accurate parameter estimates. Implications of these findings and recommendations for further research will be discussed.
Dissertation/Thesis
Doctoral Dissertation Educational Psychology 2016

29

"Type I and type II error in hierarchical analysis of variance using logistic regression for dichotomous dependent variables." Tulane University, 1992.

Abstract:

Hierarchical models seek to explain relations between individual level and aggregate level data. A new method of analysis for hierarchical data in which the outcome variable is dichotomous has been developed (Pullum, 1991). This method partitions the variance into a within-cluster component and a between-cluster component, and then uses logistic regression models to explain the variance in the two components in a unified equation. The question arises as to how the number of contexts and the number of individuals within those contexts affect the power of Pullum's model when there are different degrees of dependence between the outcome variable and the predictors at both the individual level and the cluster level of the model. Also, how does the power of Pullum's model differ from the power obtained with conventional logistic regression models? To answer these questions, this study uses simulation methodology in which random data sets are generated with constant total samples of 1000 and 500 but varying numbers of individuals per cluster and number of clusters. Situations with varying degrees of dependence between a dichotomous dependent variable and dichotomous individual level and cluster level predictors are simulated. These data sets are then analyzed using both Pullum's technique and conventional logistic regression. The improvement Chi-square statistics for these models are determined for each term in the two models. Each randomly generated data set is simulated 1000 times, and the probability of rejecting the null hypothesis (there is no association between outcome and predictor) is calculated. These probabilities are used to determine the Type I and Type II error rates for the two models. Results show that the two methods do not differ when there are adequate numbers of individuals per cluster and adequate numbers of clusters. 
However, the hierarchical analysis is more conservative than the conventional method in the detection of individual level and interaction effects for situations in which there are a small number of individuals per cluster and in the detection of cluster level effects for the individual level variable when there are a small number of clusters.
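
The simulation logic described in the abstract above (generate data under the null hypothesis many times and count rejections) can be sketched as follows. This is a deliberately simplified, single-level illustration: it uses a Pearson chi-square test on a 2x2 table rather than Pullum's hierarchical method or the improvement chi-square of a logistic regression, and the function names, seed, and default sample sizes are assumptions made for the example.

```python
import numpy as np

CRITICAL_CHI2_DF1 = 3.841  # 95th percentile of chi-square with 1 df

def pearson_chi2_2x2(x, y):
    """Pearson chi-square statistic for two binary (0/1) arrays."""
    table = np.zeros((2, 2))
    for i in (0, 1):
        for j in (0, 1):
            table[i, j] = np.sum((x == i) & (y == j))
    row = table.sum(axis=1, keepdims=True)   # shape (2, 1)
    col = table.sum(axis=0, keepdims=True)   # shape (1, 2)
    expected = row @ col / table.sum()       # shape (2, 2)
    return float(np.sum((table - expected) ** 2 / expected))

def rejection_rate(n_obs=500, n_reps=1000, seed=0):
    """Share of replications in which the null hypothesis is rejected."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_reps):
        x = rng.integers(0, 2, size=n_obs)
        y = rng.integers(0, 2, size=n_obs)  # independent of x: null is true
        if pearson_chi2_2x2(x, y) > CRITICAL_CHI2_DF1:
            rejections += 1
    return rejections / n_reps

rate = rejection_rate()
print(rate)  # empirical Type I error rate; should be near the nominal 0.05
```

Generating `y` with a dependence on `x` instead would turn the same loop into a power estimate, from which the Type II error rate follows as one minus the rejection rate.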

30

Yurecko, Michele. "Investigating the relationship between reading achievement, and state-level ecological variables and educational reform: a hierarchical analysis of item difficulty variation." 2009. http://hdl.rutgers.edu/1782.2/rucore10001600001.ETD.000051083.

31

Lemyre, Gabriel. "Modèles de Markov à variables latentes : matrice de transition non-homogène et reformulation hiérarchique." Thesis, 2021. http://hdl.handle.net/1866/25476.

Abstract:

Ce mémoire s’intéresse aux modèles de Markov à variables latentes, une famille de modèles dans laquelle une chaîne de Markov latente régit le comportement d’un processus stochastique observable à travers duquel transparaît une version bruitée de la chaîne cachée. Pouvant être vus comme une généralisation naturelle des modèles de mélange, ces processus stochastiques bivariés ont entre autres démontré leur faculté à capter les dynamiques variables de maintes séries chronologiques et, plus spécifiquement en finance, à reproduire la plupart des faits stylisés des rendements financiers. Nous nous intéressons en particulier aux chaînes de Markov à temps discret et à espace d’états fini, avec l’objectif d’étudier l’apport de leurs reformulations hiérarchiques et de la relaxation de l’hypothèse d’homogénéité de la matrice de transition à la qualité de l’ajustement aux données et des prévisions, ainsi qu’à la reproduction des faits stylisés. Nous présentons à cet effet deux structures hiérarchiques, la première permettant une nouvelle interprétation des relations entre les états de la chaîne, et la seconde permettant de surcroît une plus grande parcimonie dans la paramétrisation de la matrice de transition. Nous nous intéressons de plus à trois extensions non-homogènes, dont deux dépendent de variables observables et une dépend d’une autre variable latente. Nous analysons pour ces modèles la qualité de l’ajustement aux données et des prévisions sur la série des log-rendements du S&P 500 et du taux de change Canada-États-Unis (CADUSD). Nous illustrons de plus la capacité des modèles à reproduire les faits stylisés, et présentons une interprétation des paramètres estimés pour les modèles hiérarchiques et non-homogènes. Les résultats obtenus semblent en général confirmer l’apport potentiel de structures hiérarchiques et des modèles non-homogènes. 
Ces résultats semblent en particulier suggérer que l’incorporation de dynamiques non-homogènes aux modèles hiérarchiques permette de reproduire plus fidèlement les faits stylisés—même la lente décroissance de l’autocorrélation des rendements centrés en valeur absolue et au carré—et d’améliorer la qualité des prévisions obtenues, tout en conservant la possibilité d’interpréter les paramètres estimés.
This master’s thesis is centered on Hidden Markov Models, a family of models in which an unobserved Markov chain dictates the behaviour of an observable stochastic process through which a noisy version of the latent chain is observed. These bivariate stochastic processes, which can be seen as a natural generalization of mixture models, have shown their ability to capture the varying dynamics of many time series and, more specifically in finance, to reproduce the stylized facts of financial returns. In particular, we are interested in discrete-time Markov chains with finite state spaces, with the objective of studying the contribution of their hierarchical formulations and the relaxation of the homogeneity hypothesis for the transition matrix to the quality of the fit and predictions, as well as the capacity to reproduce the stylized facts. We therefore present two hierarchical structures, the first allowing for new interpretations of the relationships between states of the chain, and the second allowing for a more parsimonious parameterization of the transition matrix. We also present three non-homogeneous models, two of which have transition probabilities dependent on observed explanatory variables, and the third in which the probabilities depend on another latent variable. We first analyze the goodness of fit and the predictive power of our models on the series of log returns of the S&P 500 and the exchange rate between Canadian and American currencies (CADUSD). We also illustrate their capacity to reproduce the stylized facts, and present interpretations of the estimated parameters for the hierarchical and non-homogeneous models. In general, our results seem to confirm the contribution of hierarchical and non-homogeneous models to these measures of performance. 
In particular, these results seem to suggest that the incorporation of non-homogeneous dynamics to a hierarchical structure may allow for a more faithful reproduction of the stylized facts—even the slow decay of the autocorrelation functions of squared and absolute returns—and better predictive power, while still allowing for the interpretation of the estimated parameters.

32

CASSESE, ALBERTO. "A Hierarchical Bayesian Modeling Approach To Genetical Genomics." Doctoral thesis, 2013. http://hdl.handle.net/2158/794601.

Abstract:

The proposed method reflects my continuing interest in the development of novel Bayesian methodologies for the analysis of data that arise in genomics. Novel methodological questions are now being generated in bioinformatics and require the integration of different concepts, methods, tools and data types. The proposed modeling approach is general and can be readily applied to high-throughput data of different types, and to data from different cancers and diseases. A single mutation is not enough to trigger cancer, as this is the result of a number of complex biological events. Thus, discovering the amplification of oncogenes or the deletion of tumor suppressors is an important step in elucidating tumor genesis. Delineating the association between gene expression and CGH data is particularly useful in cancer studies, where copy number aberrations are widespread due to genomic instability. This project focuses on the development of an innovative statistical model that integrates gene expression and genetics data. Our approach explicitly models the relationship between these two types of data, allowing for the quantification of the effect of the genetic aberrations on the gene expression levels. The proposed model assumes that gene expression levels are affected by copy number aberrations in corresponding and adjacent segments and also allows for the possibility that changes in gene expression may be due to extraneous causes other than copy number aberrations. At the same time, it allows array CGH data to be modeled in order to learn about genome-wide changes in copy number, using information from all the samples simultaneously.

33

Lin, Lin. "Bayesian Variable Selection in Clustering and Hierarchical Mixture Modeling." Diss., 2012. http://hdl.handle.net/10161/5846.

Abstract:

Clustering methods are designed to separate heterogeneous data into groups of similar objects such that objects within a group are similar, and objects in different groups are dissimilar. From the machine learning perspective, clustering can also be viewed as one of the most important topics within the unsupervised learning problem, which involves finding structures in a collection of unlabeled data. Various clustering methods have been developed under different problem contexts. Specifically, high dimensional data has stimulated a high level of interest in combining clustering algorithms and variable selection procedures; large data sets with expanding dimension have provoked an increasing need for relevant, customized clustering algorithms that offer the ability to detect low probability clusters.

This dissertation focuses on the model-based Bayesian approach to clustering. I first develop a new Bayesian Expectation-Maximization algorithm in fitting Dirichlet process mixture models and an algorithm to identify clusters under mixture models by aggregating mixture components. These two algorithms are used extensively throughout the dissertation. I then develop the concept and theory of a new variable selection method that is based on an evaluation of subsets of variables for the discriminatory evidence they provide in multivariate mixture modeling. This new approach to discriminative information analysis uses a natural measure of concordance between mixture component densities. The approach is both effective and computationally attractive for routine use in assessing and prioritizing subsets of variables according to their roles in the discrimination of one or more clusters. I demonstrate that the approach is useful for providing an objective basis for including or excluding specific variables in flow cytometry data analysis. These studies demonstrate how ranked sets of such variables can be used to optimize clustering strategies and selectively visualize identified clusters of the data of interest.

Next, I create a new approach to Bayesian mixture modeling with large data sets for a specific, important class of problems in biological subtype identification. The context, that of combinatorial encoding in flow cytometry, naturally introduces the hierarchical structure that these new models are designed to incorporate. I describe these novel classes of Bayesian mixture models with hierarchical structures that reflect the underlying problem context. The Bayesian analysis involves structured priors and computations using customized Markov chain Monte Carlo methods for model fitting that exploit a distributed GPU (graphics processing unit) implementation. The hierarchical mixture model is applied in the novel use of automated flow cytometry technology to measure levels of protein markers on thousands to millions of cells.

Finally, I develop a new approach to clustering high dimensional data based on Kingman's coalescent tree modeling ideas. Under traditional clustering models, the number of parameters required to construct the model increases exponentially with the number of dimensions. This phenomenon can lead to model overfitting and an enormous computational search challenge. The approach addresses these issues by learning the data structure in each individual dimension and combining these dimensions in a flexible tree-based model class. The new tree-based mixture model is studied extensively in various simulation studies, in which its superiority over traditional mixture models is demonstrated.

Dissertation

34

Li, Yingbo. "Bayesian Hierarchical Models for Model Choice." Diss., 2013. http://hdl.handle.net/10161/8063.

Abstract:

With the development of modern data collection approaches, researchers may collect hundreds to millions of variables, yet may not need to utilize all explanatory variables available in predictive models. Hence, choosing models that consist of a subset of variables often becomes a crucial step. In linear regression, variable selection not only reduces model complexity, but also prevents over-fitting. From a Bayesian perspective, prior specification of model parameters plays an important role in model selection as well as parameter estimation, and often prevents over-fitting through shrinkage and model averaging.

We develop two novel hierarchical priors for selection and model averaging, for Generalized Linear Models (GLMs) and normal linear regression, respectively. They can be considered as "spike-and-slab" prior distributions or more appropriately "spike-and-bell" distributions. Under these priors we achieve dimension reduction, since their point masses at zero allow predictors to be excluded with positive posterior probability. In addition, these hierarchical priors have heavy tails to provide robustness when MLEs are far from zero.

Zellner's g-prior is widely used in linear models. It preserves the correlation structure among predictors in its prior covariance, and yields closed-form marginal likelihoods, which lead to huge computational savings by avoiding sampling in the parameter space. Mixtures of g-priors avoid fixing g in advance, and can resolve consistency problems that arise with fixed g. For GLMs, we show that the mixture of g-priors using a Compound Confluent Hypergeometric distribution unifies existing choices in the literature and maintains their good properties, such as tractable (approximate) marginal likelihoods and asymptotic consistency for model selection and parameter estimation under specific values of the hyperparameters.

While the g-prior is invariant under rotation within a model, a potential problem with the g-prior is that it inherits the instability of ordinary least squares (OLS) estimates when predictors are highly correlated. We build a hierarchical prior based on scale mixtures of independent normals, which incorporates invariance under rotations within models like ridge regression and the g-prior, but has heavy tails like the Zellner-Siow Cauchy prior. We find that this method outperforms the gold standard mixture of g-priors and other methods in the case of highly correlated predictors in Gaussian linear models. We incorporate a non-parametric structure, the Dirichlet Process (DP), as a hyperprior to allow more flexibility and adaptivity to the data.

Dissertation

35

Huang, Xin-Han, and 黃信翰. "Application of Spatial Bayesian Hierarchical Model with Variable Selection to fMRI data." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/18746413115126691555.

Abstract:

Master's
National Cheng Kung University
Department of Statistics
103
We propose a spatial Bayesian hierarchical model to analyze functional magnetic resonance imaging data with complex spatial and temporal structures. Several studies have found that spatial dependence appears not only in signal changes but also in temporal correlations among voxels. However, existing statistical approaches ignore the spatial dependence of temporal correlations for the sake of computational efficiency. We consider spatial random effect models that simultaneously model spatial dependence in both signal changes and temporal correlations while remaining computationally feasible. In simulations, the proposed approach improves the accuracy of identifying activations. We study the properties of the model through its performance on simulations and a real event-related fMRI data set.

36

CAVICCHIA, CARLO. "Hierarchical latent variable models for dimensionality reduction: an application on composite indicators." Doctoral thesis, 2020. http://hdl.handle.net/11573/1363237.

Abstract:

This thesis is devoted to the development of new hierarchical latent variable models for Dimensionality Reduction with a specific focus on the construction of Composite Indicators (CIs). Since our society is producing a huge quantity of data, the construction of model-based CIs represents an interesting and still open methodological challenge. This dissertation is motivated by the need to provide model-based CIs that are built according to a statistical approach avoiding subjective choices (e.g., normative weights); the new insights and proposals are therefore intended as a contribution to the current literature. This thesis provides an introduction to CIs, a brief review of the most used methods in the Multidimensional Data Analysis framework, a discussion of measurement models, and methodological proposals to model latent concepts. Factor Analysis and its hierarchical extensions are introduced in order to set the starting point of the analysis. A first proposal is a new latent factor model that could be used for building CIs; it aims to investigate the hierarchical structure of the data in order to define two levels of CIs. The model, named Hierarchical Disjoint Non-Negative Factor Analysis, is composed of two novelties: a model which is the two-level hierarchical extension of FA, and its disjoint extension with non-negative loadings. The latter model is enriched by considerations about the CIs used for tracking coherent policy conclusions. A set of features, properties and rules useful to build "good" CIs is presented and explained. The last proposal in the thesis is a new model for positive data correlation matrices which aims to detect reliable concepts and to build the hierarchy from them to the most general one. The proposed models are illustrated both via simulation studies and real data applications, to analyze their performance and abilities. 
In particular, the main application in this thesis concerns the construction of a hierarchically aggregated index for the multidimensional phenomenon of Waste Management in the European Union. Waste Management is becoming ever more important for its impact on human lives, and many data have been produced about it; therefore, the construction of a CI able to reduce its dimensionality and to highlight its main dimensions is extraordinarily useful in providing support to EU countries' actions and policies.

37

Minto, Cóilín. "Ecological Inference from Variable Recruitment Data." 2011. http://hdl.handle.net/10222/13881.

Abstract:

To understand the processes affecting the abundance of wild populations is a fundamental goal of ecology and a prerequisite for the management of living resources. Variable abundance, however, makes the investigation of ecological processes challenging. Recruitment, the process whereby new individuals enter a given stage of a fish population, is a highly variable entity. I have confronted this issue by developing methodologies specifically designed to account for, and ecologically interpret, patterns of variability in recruitment. To provide the necessary context, Chapter 2 begins with a review of the history of recruitment science. I focus on the major achievements as well as present limitations, particularly regarding environmental drivers. Approaches that include explicit environmental information are contrasted with time-varying parameter techniques. In Chapter 3, I ask what patterns of variability in pre-recruit survival can tell us about the strength of density-dependent mortality. I provide methods to investigate the presence of density-dependent mortality where this has previously been hindered by highly variable data. Stochastic density-independent variability is found to be attenuated via density dependence. Sources of recruitment variability are further partitioned in Chapter 4. Using time-varying parameter techniques, significant temporal variation in the annual reproductive rate is found to have occurred in many Atlantic cod populations. Multivariate state space models suggest that populations in close proximity typically have a shared response to environmental change whereas marked differences occur across latitude. Hypotheses that could result in consistent changes in productivity of cod populations are tested in Chapter 5. I focus on a meta-analytical investigation of potential interactions between Atlantic cod and small pelagic species, testing aspects of the cultivation-depensation hypothesis. 
The findings suggest that predation or competition by herring and mackerel on egg and larval cod could delay recovery of depleted cod populations. Chapter 6 concludes with a critical reflection on: the suitability of the theories employed, the underlying assumptions of the empirical approaches, and the quality of the data used in my thesis. Application of ecological insights to fisheries management is critically evaluated. I then propose future work on recruitment processes based on methods presented herein.

38

Chen, Ting-Shiang, and 陳庭祥. "Target Human Following Using Neural-Network-Based RFID Localization System with Hierarchical Variable Structure Control." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/azy8r5.

Abstract:

Master's
National Taiwan University of Science and Technology
Department of Electrical Engineering
104
First, the received signal strength indicators (RSSIs) of three tags arranged in a triangular pattern are read by two perpendicular antennas. These six RSSIs are recorded together with the corresponding pose and azimuth angle of the target human (TH) with respect to the automatic guided vehicle (AGV). Since the relations between these input-output pairs are nonlinear, coupled, and stochastic, it is difficult to obtain an effective model. A first-order low-pass filter with unit DC gain is first employed to remove unnecessary high-frequency components of the RSSIs. Owing to the advantageous features of neural network modeling (e.g., stochastic approximation, insensitivity to noise, and support for different numbers of inputs and outputs), a multilayer neural network (MLNN) with the Levenberg-Marquardt back-propagation (LMBP) learning law is employed to model the relation between the six filtered RSSIs and the three outputs (i.e., the pose and the azimuth angle of the TH). The trajectory for tracking the TH is then planned and predicted on-line from the output of the multilayer perceptron network (MLPN). Hierarchical variable structure control (HVSC) is employed to track the planned trajectory on-line so that TH following is achieved. For an effective implementation, a software/hardware platform is employed to develop the software for the MLPN modeling, the trajectory planning algorithm, and the HVSC algorithm, and the hardware for the control signal (e.g., the PWM for driving the motor) and for the sensor inputs (e.g., the decoder for obtaining the position or velocity of the motor, and the USB interface for receiving the RFID signal). Finally, experiments on TH following with the proposed NN-based RFID localization system and HVSC algorithm confirm the effectiveness, efficiency, and robustness of the proposed method.
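
The filtering step mentioned in this abstract can be pictured with a minimal discrete-time sketch. The abstract does not give the filter's discretization or time constant, so the exponential-smoothing form below, the `alpha` parameter, and the sample RSSI values are all assumptions made for illustration.

```python
def low_pass(samples, alpha, initial=None):
    """First-order low-pass filter with unit DC gain (exponential smoothing).

    Implements y[k] = y[k-1] + alpha * (x[k] - y[k-1]) with 0 < alpha <= 1.
    A constant input is reproduced exactly in steady state, which is what
    unit DC gain means. Smaller alpha gives stronger smoothing.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if not samples:
        return []
    # Start from the first sample unless an explicit initial state is given.
    y = samples[0] if initial is None else initial
    out = []
    for x in samples:
        y = y + alpha * (x - y)
        out.append(y)
    return out


# Hypothetical noisy RSSI readings in dBm.
rssi = [-61.0, -58.5, -63.2, -60.1, -59.4, -62.7, -60.3]
print(low_pass(rssi, alpha=0.3))
```

In a setup like the one described, each of the six RSSI channels would be filtered separately before being passed to the network.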

39

Goldstein, Leigh Ann. "Relationships among quality of life, self-care, and affiliated individuation in persons on chronic warfarin therapy." 2013. http://hdl.handle.net/2152/21865.

Abstract:

This descriptive, correlational, cross-sectional study explored the relationships among the variables self-care action, self-care knowledge, affiliated individuation, and quality of life for persons on chronic warfarin therapy. This study also explored the moderating effects of self-care knowledge and affiliated individuation on quality of life. This research was guided by a theoretical framework based on modeling and role-modeling theory (Erickson, Tomlin, & Swain, 1983). The sample consisted of 83 adults between the ages of 30 and 91 years. The majority of participants were Caucasian, educated, and retired, and the sample was almost evenly distributed between male and female. Each subject completed the following instruments: the Oral Anticoagulation Knowledge (OAK) test, the Duke Anticoagulation Satisfaction Scale (DASS), the Basic Needs Satisfaction Inventory (BNSI), and the generic quality of life survey (SF36v2). Data were analyzed using correlation and hierarchical multiple regression analysis. Results indicated significant correlations among most of the study variables. Self-care action significantly explained variance in all but two quality of life variables. Self-care knowledge and affiliated individuation had statistically significant moderating effects on the DASS negative impact and hassles/burdens subscales. Self-care knowledge also demonstrated a significant moderating effect on the SF36v2 physical function subscale. These findings support the concepts proposed by the study's theoretical framework. This research serves as validation of Acton's (1997) study findings for the concept of affiliated individuation and its value as a self-care resource in a specific clinical population.

40

Kaplan, Andrea Jean. "An overview of multilevel regression." Thesis, 2010. http://hdl.handle.net/2152/ETD-UT-2010-12-2462.

Abstract:

Due to the inherently hierarchical nature of many natural phenomena, collected data often reside in nested entities. As an example, students are nested in schools, schools are nested in districts, districts are nested in counties, and counties are nested within states. Multilevel models provide a statistical framework for investigating and drawing conclusions regarding the influence of factors at differing hierarchical levels of analysis. The work in this paper serves as an introduction to multilevel models and their comparison to Ordinary Least Squares (OLS) regression. We give an overview of three basic model structures (the variable intercept model, the variable slope model, and the hierarchical linear model) and illustrate each model with an example of student data. Then, we contrast the three multilevel models with the OLS model and present a method for producing confidence intervals for the regression coefficients.
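
To make the variable intercept idea concrete, here is a small sketch of partial pooling for students nested in schools. It is not taken from the thesis: it has no predictors, and it assumes the within-school variance `sigma2` and between-school variance `tau2` are known, whereas in practice they would be estimated. The school names and scores are invented.

```python
def partial_pooling(groups, sigma2, tau2):
    """Shrunken group intercepts for a random-intercept model without predictors.

    groups : dict mapping group name -> list of observations
    sigma2 : within-group variance (assumed known, >= 0)
    tau2   : between-group variance (assumed known, > 0)
    Returns (mu, intercepts), where mu is the precision-weighted grand mean
    and intercepts[g] = mu + w_g * (mean_g - mu),
    with w_g = tau2 / (tau2 + sigma2 / n_g).
    """
    if tau2 <= 0 or sigma2 < 0:
        raise ValueError("need tau2 > 0 and sigma2 >= 0")
    means = {g: sum(v) / len(v) for g, v in groups.items()}
    # Groups with more observations get weights closer to 1 (less shrinkage).
    weights = {g: tau2 / (tau2 + sigma2 / len(v)) for g, v in groups.items()}
    total = sum(weights.values())
    mu = sum(weights[g] * means[g] for g in groups) / total
    intercepts = {g: mu + weights[g] * (means[g] - mu) for g in groups}
    return mu, intercepts


# Hypothetical test scores for students nested in two schools.
scores = {"school_A": [70.0, 72.0, 74.0], "school_B": [90.0]}
mu, intercepts = partial_pooling(scores, sigma2=4.0, tau2=4.0)
print(mu, intercepts)
```

A single pooled OLS intercept would ignore the school grouping entirely, while separate per-school means would ignore what the schools share; the shrunken intercepts sit between those two extremes.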

41

Xu, Lizhen. "Bayesian Methods for Genetic Association Studies." Thesis, 2012. http://hdl.handle.net/1807/34972.

Abstract:

We develop statistical methods for tackling two important problems in genetic association studies. First, we propose a Bayesian approach to overcome the winner's curse in genetic studies. Second, we consider a Bayesian latent variable model for analyzing longitudinal family data with pleiotropic phenotypes. Winner's curse in genetic association studies refers to the estimation bias of the reported odds ratios (OR) for an associated genetic variant from the initial discovery samples. It is a consequence of the sequential procedure in which the estimated effect of an associated genetic marker must first pass a stringent significance threshold. We propose a hierarchical Bayes method in which a spike-and-slab prior is used to account for the possibility that the significant test result may be due to chance. We examine the robustness of the method using different priors corresponding to different degrees of confidence in the testing results and propose a Bayesian model averaging procedure to combine estimates produced by different models. The Bayesian estimators yield smaller variance compared to the conditional likelihood estimator and outperform the latter in low-power studies. We investigate the performance of the method with simulations and applications to four real data examples. Pleiotropy occurs when a single genetic factor influences multiple quantitative or qualitative phenotypes, and it is present in many genetic studies of complex human traits. Longitudinal family studies combine the features of longitudinal studies in individuals and cross-sectional studies in families. Therefore, they provide more information about the genetic and environmental factors associated with the trait of interest. We propose a Bayesian latent variable modeling approach to model multiple phenotypes simultaneously in order to detect the pleiotropic effect and allow for longitudinal and/or family data. 
An efficient MCMC algorithm is developed to obtain the posterior samples by using hierarchical centering and parameter expansion techniques. We apply spike-and-slab prior methods to test whether the phenotypes are significantly associated with the latent disease status. We compute Bayes factors using path sampling and discuss their application in testing the significance of factor loadings and the indirect fixed effects. We examine the performance of our methods via extensive simulations and apply them to the blood pressure data from a genetic study of type 1 diabetes (T1D) complications.
