Dissertations / Theses on the topic 'Latent'

1

Anaya, Leticia H. "Comparing Latent Dirichlet Allocation and Latent Semantic Analysis as Classifiers." Thesis, University of North Texas, 2011. https://digital.library.unt.edu/ark:/67531/metadc103284/.

Abstract:
In the Information Age, unstructured electronic text documents have proliferated. Processing these documents is a daunting task for humans, who have limited cognitive capacity for processing large volumes of documents that can often be extremely lengthy. To address this problem, text data computer algorithms are being developed. Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) are two text data computer algorithms that have received much attention individually in the text data literature for topic extraction studies but not for document classification or for comparison studies. Since classification is considered an important human function and has been studied in the areas of cognitive science and information science, in this dissertation a research study was performed to compare LDA, LSA and humans as document classifiers. The research questions posed in this study are: R1: How accurate are LDA and LSA in classifying documents in a corpus of textual data over a known set of topics? R2: How accurate are humans in performing the same classification task? R3: How does LDA classification performance compare to LSA classification performance? To address these questions, a classification study involving human subjects was designed where humans were asked to generate and classify documents (customer comments) at two levels of abstraction for a quality assurance setting. Then two computer algorithms, LSA and LDA, were used to perform classification on these documents. The results indicate that humans outperformed all computer algorithms and had an accuracy rate of 94% at the higher level of abstraction and 76% at the lower level of abstraction. At the higher level of abstraction, the accuracy rates were 84% for both LSA and LDA, and at the lower level, the accuracy rates were 67% for LSA and 64% for LDA. The findings of this research have many strong implications for the improvement of information systems that process unstructured text.
Document classifiers have many potential applications in many fields (e.g., fraud detection, information retrieval, national security, and customer management). Development and refinement of algorithms that classify text is a fruitful area of ongoing research and this dissertation contributes to this area.
2

Xiong, Hao. "Diversified Latent Variable Models." Thesis, The University of Sydney, 2018. http://hdl.handle.net/2123/18512.

Abstract:
The latent variable model is a common probabilistic framework which aims to estimate the hidden states of observations. More specifically, the hidden states can be the position of a robot or the low-dimensional representation of an observation. Various latent variable models have been explored, such as hidden Markov models (HMM), Gaussian mixture models (GMM), and the Bayesian Gaussian process latent variable model (BGPLVM). Moreover, these latent variable models have been successfully applied to a wide range of fields, such as robotic navigation, image and video compression, and natural language processing. To make the learning of latent variables more efficient and robust, some approaches seek to integrate latent variables with related priors. For instance, a dynamic prior can be incorporated so that the learned latent variables take the time sequence into account. In addition, some methods introduce inducing points, a small set representing the large set of latent variables, to speed up the optimization of the model. Though those priors are effective in improving the robustness of latent variable models, the learned latent variables tend to be dense rather than diverse. That is to say, there is significant overlap between the generated latent variables. Consequently, the latent variable model will be ambiguous after optimization. Clearly, a proper diversity prior plays a pivotal role in having latent variables capture more diverse features of the observed data. In this thesis, we propose diversified latent variable models that incorporate different types of diversity priors, such as a single/dual diversity-encouraging prior, a multi-layered DPP prior, and a shared diversity prior. Furthermore, we illustrate how to formulate the diversity priors in different latent variable models and how to perform learning and inference on the reformulated latent variable models.
3

Etessami, Pantea. "Mutagenesis studies on the genome of cassava latent virus : (African cassava latent virus)." Thesis, University of East Anglia, 1989. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.235620.

4

Murphy, Sean Michael. "Disease management and latent choices." Online access for everyone, 2008. http://www.dissertations.wsu.edu/Dissertations/Summer2008/S_Murphy_062608.pdf.

5

Cartmill, Ian. "Builders' liability for latent defects." Thesis, University of Oxford, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.302694.

6

Shafia, Aminath. "Latent infection of Botrytis cinerea." Thesis, University of Reading, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.499372.

Abstract:
Latent B. cinerea was detected in nine symptomless wild host species from the families Asteraceae and Brassicaceae, in addition to greenhouse-grown lettuce. Conventional testing methods revealed that latent B. cinerea was as prevalent in the root system as in the above-ground parts. Incidence of latent infection was moderate in some species (Achillea millefolium, Arabidopsis thaliana, Centaurea nigra, Cirsium vulgare, Senecio jacobaea, Senecio vulgaris and Taraxacum agg.) and rare in others (Tussilago farfara and Bellis perennis). In greenhouse lettuce, latent infection was activated by prolonged water stress and artificial inoculation. Despite inoculation, unstressed, vigorously growing lettuce and Arabidopsis plants remained asymptomatic throughout the growing period. Fungicide seed treatment did not significantly affect the amount of latent B. cinerea recovered from the lettuce plants. Introduction of the antagonistic micro-organism Trichoderma harzianum T-39 into the soil decreased the amount of latent infection recovered from lettuce leaves but increased it in the stem. A weak negative correlation was found between photosynthesis and the amount of B. cinerea recovered from the leaves. Weight of the plants was reduced due to inoculation of B. cinerea even though latent infection was unaltered. There was no relation between plant weight and total endophytic B. cinerea. A marginal increase in the phenolic content of the leaf was observed due to inoculation, but no changes in antioxidant activity, chlorophyll content or carotenoids were found. The high incidence of latent infection found in greenhouse-grown lettuce plants with or without successful inoculation may have been due to the presence of several genetically distinct isolates of B. cinerea. Eight different haplotypes were identified among the 32 isolates assessed.
A single very common haplotype presumably originated from seed borne infection, because it was rare in plants grown from fungicide treated seed. Latency may be attributed to a mild strain defence response by the presence of several genetically different strains of the pathogen present within the plant as endophytes.
7

Ponweiser, Martin. "Latent Dirichlet Allocation in R." WU Vienna University of Economics and Business, 2012. http://epub.wu.ac.at/3558/1/main.pdf.

Abstract:
Topic models are a new research field within the computer science areas of information retrieval and text mining. They are generative probabilistic models of text corpora inferred by machine learning and they can be used for retrieval and text mining tasks. The most prominent topic model is latent Dirichlet allocation (LDA), which was introduced in 2003 by Blei et al. and has since then sparked off the development of other topic models for domain-specific purposes. This thesis focuses on LDA's practical application. Its main goal is the replication of the data analyses from the 2004 LDA paper "Finding scientific topics" by Thomas Griffiths and Mark Steyvers within the framework of the R statistical programming language and the R package topicmodels by Bettina Grün and Kurt Hornik. The complete process, including extraction of a text corpus from the PNAS journal's website, data preprocessing, transformation into a document-term matrix, model selection, model estimation, as well as presentation of the results, is fully documented and commented. The outcome closely matches the analyses of the original paper, therefore the research by Griffiths/Steyvers can be reproduced. Furthermore, this thesis proves the suitability of the R environment for text mining with LDA. (author's abstract)
Series: Theses / Institute for Statistics and Mathematics
8

Creagh-Osborne, Jane. "Latent variable generalized linear models." Thesis, University of Plymouth, 1998. http://hdl.handle.net/10026.1/1885.

Abstract:
Generalized Linear Models (GLMs) (McCullagh and Nelder, 1989) provide a unified framework for fixed effect models where response data arise from exponential family distributions. Much recent research has attempted to extend the framework to include random effects in the linear predictors. Different methodologies have been employed to solve different motivating problems, for example Generalized Linear Mixed Models (Clayton, 1994) and Multilevel Models (Goldstein, 1995). A thorough review and classification of this and related material is presented. In Item Response Theory (IRT) subjects are tested using banks of pre-calibrated test items. A useful model is based on the logistic function with a binary response dependent on the unknown ability of the subject. Item parameters contribute to the probability of a correct response. Within the framework of the GLM, a latent variable, the unknown ability, is introduced as a new component of the linear predictor. This approach affords the opportunity to structure intercept and slope parameters so that item characteristics are represented. A methodology for fitting such GLMs with latent variables, based on the EM algorithm (Dempster, Laird and Rubin, 1977) and using standard Generalized Linear Model fitting software GLIM (Payne, 1987) to perform the expectation step, is developed and applied to a model for binary response data. Accurate numerical integration to evaluate the likelihood functions is a vital part of the computational process. A study of the comparative benefits of two different integration strategies is undertaken and leads to the adoption, unusually, of Gauss-Legendre rules. It is shown how the fitting algorithms are implemented with GLIM programs which incorporate FORTRAN subroutines. Examples from IRT are given. A simulation study is undertaken to investigate the sampling distributions of the estimators and the effect of certain numerical attributes of the computational process. 
Finally a generalized latent variable model is developed for responses from any exponential family distribution.
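As a hedged illustration of the kind of model this abstract describes (the notation is assumed here, not taken from the thesis): for subject $i$ with latent ability $z_i$ and binary response $y_{ij}$ to item $j$, a logistic model with item intercept $\alpha_j$ and slope $\beta_j$ adds the latent variable to the linear predictor, and the marginal likelihood integrates the ability out. The integral can then be approximated by a $K$-point quadrature rule (for example Gauss-Legendre on a finite interval) with nodes $z_k$ and weights $w_k$. The ability density $h(z)$ and the integration limits are assumptions; the abstract does not state them.

```latex
\begin{aligned}
\pi_j(z) &= \Pr(y_{ij}=1 \mid z_i = z)
          = \frac{\exp(\alpha_j + \beta_j z)}{1 + \exp(\alpha_j + \beta_j z)}, \\
L(\alpha,\beta) &= \prod_{i=1}^{n} \int \prod_{j=1}^{p}
          \pi_j(z)^{y_{ij}} \{1-\pi_j(z)\}^{1-y_{ij}} \, h(z)\, dz \\
      &\approx \prod_{i=1}^{n} \sum_{k=1}^{K} w_k \, h(z_k)
          \prod_{j=1}^{p} \pi_j(z_k)^{y_{ij}} \{1-\pi_j(z_k)\}^{1-y_{ij}} .
\end{aligned}
```

In an EM scheme of this general type, the quadrature nodes stand in for the unobserved abilities, so each iteration reduces to fitting a weighted binomial GLM.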
9

Mao, Cheng. "Matrix estimation with latent permutations." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/117863.

Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Mathematics, 2018.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 151-167).
Motivated by various applications such as seriation, network alignment and ranking from pairwise comparisons, we study the problem of estimating a structured matrix with rows and columns shuffled by latent permutations, given noisy and incomplete observations of its entries. This problem is at the intersection of shape constrained estimation which has a long history in statistics, and latent permutation learning which has driven a recent surge of interest in the machine learning community. Shape constraints on matrices, such as monotonicity and smoothness, are generally more robust than parametric assumptions, and often allow for adaptive and efficient estimation in high dimensions. On the other hand, latent permutations underlie many graph matching and assignment problems that are computationally intractable in the worst-case and not yet well-understood in the average-case. Therefore, it is of significant interest to both develop statistical approaches and design efficient algorithms for problems where shape constraints meet latent permutations. In this work, we consider three specific models: the statistical seriation model, the noisy sorting model and the strong stochastic transitivity model. First, statistical seriation consists in permuting the rows of a noisy matrix in such a way that all its columns are approximately monotone, or more generally, unimodal. We study both global and adaptive rates of estimation for this model, and introduce an efficient algorithm for the monotone case. Next, we move on to ranking from pairwise comparisons, and consider the noisy sorting model. We establish the minimax rates of estimation for noisy sorting, and propose a near-linear time multistage algorithm that achieves a near-optimal rate. Finally, we study the strong stochastic transitivity model that significantly generalizes the noisy sorting model for estimation from pairwise comparisons. 
Our efficient algorithm achieves the rate $\tilde{O}(n^{-3/4})$, narrowing a gap between the statistically optimal rate $\tilde{O}(n^{-1})$ and the state-of-the-art computationally efficient rate $\Theta(n^{-1/2})$. In addition, we consider the scenario where a fixed subset of pairwise comparisons is given. A dichotomy exists between the worst-case design, where consistent estimation is often impossible, and an average-case design, where we show that the optimal rate of estimation depends on the degree sequence of the comparison topology.
10

Dallaire, Patrick. "Bayesian nonparametric latent variable models." Doctoral thesis, Université Laval, 2016. http://hdl.handle.net/20.500.11794/26848.

Abstract:
One of the important problems in machine learning is determining the complexity of the model to learn. Too much complexity leads to overfitting, which finds structures that do not actually exist in the data, while too little complexity leads to underfitting, which means that the expressiveness of the model is insufficient to capture all the structures present in the data. For some probabilistic models, the complexity depends on the introduction of one or more latent variables whose role is to explain the generative process of the data. There are various approaches to identify the appropriate number of latent variables of a model. This thesis covers various Bayesian nonparametric methods capable of determining the number of latent variables to be used and their dimensionality. The popularization of Bayesian nonparametric statistics in the machine learning community is fairly recent. Their main attraction is that they offer highly flexible models whose complexity scales appropriately with the amount of available data. In recent years, research on Bayesian nonparametric learning methods has focused on three main aspects: the construction of new models, the development of inference algorithms and new applications. This thesis presents our contributions to these three topics of research in the context of learning latent variable models. Firstly, we introduce the Pitman-Yor process mixture of Gaussians, a model for learning infinite mixtures of Gaussians. We also present an inference algorithm to discover the latent components of the model and we evaluate it on two practical robotics applications. Our results demonstrate that the proposed approach outperforms, both in performance and flexibility, the traditional learning approaches. Secondly, we propose the extended cascading Indian buffet process, a Bayesian nonparametric probability distribution on the space of directed acyclic graphs.
In the context of Bayesian networks, this prior is used to identify the presence of latent variables and the network structure among them. A Markov Chain Monte Carlo inference algorithm is presented and evaluated on structure identification problems as well as density estimation problems. Lastly, we propose the Indian chefs process, a model more general than the extended cascading Indian buffet process for learning graphs and orders. The advantage of the new model is that it accepts connections among observable variables and it takes into account the order of the variables. We also present a reversible jump Markov Chain Monte Carlo inference algorithm which jointly learns graphs and orders. Experiments are conducted on density estimation problems and on testing independence hypotheses. This model is the first Bayesian nonparametric model capable of learning Bayesian networks with completely arbitrary graph structures.
11

Tiwari, Puneet. "Exploring latent structures in innovation." Thesis, University of Bristol, 2016. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.705470.

12

Morciano, Marcello. "Latent factor modelling of disability." Thesis, University of Essex, 2016. http://repository.essex.ac.uk/16224/.

Abstract:
This PhD thesis uses survey data and involves the application of latent factor structural equation methods to the study of the economics of disability and disability policy in later life, a topic which is currently very high on the policy agenda. It comprises four studies. The first chapter investigates the presence of health-related sample attrition (the drop-out of eligible sample members over time) in the English Longitudinal Study of Ageing (ELSA). The second chapter examines whether different indicators of disability, collected in three widely-used household surveys, are consistent with a common set of findings relating to the targeting of disability benefits. In the third chapter we estimate the additional personal costs experienced by disabled older people to achieve the same material standard of living as similar people living without disability. Chapter 4 assesses the presence of socio-economic disparities in birth-cohort trends in later life physical and cognitive disability and in the receipt of non-means-tested cash disability benefits.
13

Hood, Steven Brian. "Latent variable realism in psychometrics." [Bloomington, Ind.] : Indiana University, 2008. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3319901.

Abstract:
Thesis (Ph.D.)--Indiana University, Dept. of History and Philosophy of Science, 2008.
Title from PDF t.p. (viewed on May 11, 2009). Source: Dissertation Abstracts International, Volume: 69-08, Section: A, page: 3173. Adviser: Colin F. Allen.
14

Wegelin, Jacob A. "Latent models for cross-covariance." Thesis, 2001. http://hdl.handle.net/1773/8982.

15

Mena-Chavez, Ramses H. "Stationary models using latent structures." Thesis, University of Bath, 2003. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.425643.

16

Ödling, David, and Arvid Österlund. "Factorisation of Latent Variables in Word Space Models : Studying redistribution of weight on latent variables." Thesis, KTH, Skolan för teknikvetenskap (SCI), 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-153776.

Abstract:
The ultimate goal of any distributional semantic model (DSM) is a scalable and accurate representation of lexical semantics. Recent developments due to Bullinaria & Levy (2012) and Caron (2001) indicate that the accuracy of such models can be improved by redistribution of weight on the principal components. However, this method is poorly understood and rarely replicated, owing to the computationally expensive dimension reduction and the puzzling nature of the results. This thesis aims to explore the nature of these results. Beginning by reproducing the results in Bullinaria & Levy (2012), we move on to deepen the understanding of these results, quantitatively as well as qualitatively, using various forms of the BLESS test, and juxtapose these with previous results. The main result of this thesis is the verification of the 100% score on the TOEFL test and a score of 91.5% on a paradigmatic version of the BLESS test. Our qualitative tests indicate that the redistribution of weight away from the first principal components differs slightly between word categories, which partly explains the improvement in the TOEFL and BLESS results. While we do not find any significant relation between word frequencies and weight distribution, we find an empirical relation for the optimal weight distribution. Based on these results, we suggest a range of further studies to better understand these phenomena.
17

Pennoni, Fulvia. "Issues on the Estimation of Latent Variable and Latent Class Models with Social Science Applications." Doctoral thesis, Università degli Studi di Firenze, 2004. http://hdl.handle.net/10281/46004.

Abstract:
This Ph.D. work comprises different research problems which have in common the presence of latent variables. Chapters 1 and 2 provide an accessible primer on the models developed in the subsequent chapters. Chapters 3 and 4 are written in the form of articles. A list of references is provided at the end of each chapter, and a general bibliography is reported as the last part of the work. The first chapter introduces the models of dependence and association and their interpretation using graphical models, which have proved useful for displaying in graphical form the essential relationships between variables. The structure of the graph yields direct information about various aspects related to the statistical analysis. At first we provide the necessary notation and background on graph theory. We describe the Markov properties that associate a set of conditional independence assumptions to an undirected and directed graph. Such definitions do not depend on any particular distributional form and hence can be applied to models with both discrete and continuous random variables. In particular we consider models for Gaussian continuous variables where the structure is assumed to be adequately described via a vector of means and by a covariance matrix. The concentration and covariance graph models are illustrated. The specification of the complex multivariate distribution through univariate regressions induced by a Directed Acyclic Graph (DAG) can be regarded as a simplification, as the single regression models typically involve considerably fewer variables than the whole multivariate vector. In the present work it is shown that such models are a subclass of the models developed for linear analysis known as Structural Equation Models (SEM). The chapter concludes with some bibliographical notes. Chapter 2 takes into account the latent class model for measuring one or more latent categorical variables by means of a set of observed categorical variables.
After some notes on model identifiability and estimation, we consider the extension of the model to study latent changes over time when longitudinal studies are used. The hidden Markov model is presented, consisting of hidden state variables and observed variables both varying over time. In Chapter 3 we consider in detail the DAG Gaussian models in which one of the variables is not observed. Once the condition for global identification has been satisfied, we show how the incomplete log-likelihood of the observed data can be maximized using the EM algorithm. As the EM algorithm does not provide the matrix of the second derivatives, we propose a method for obtaining an explicit formula of the observed information matrix using the missing information principle. We illustrate the models with two examples on real data concerning educational attainment and criminological research. The first appendix of the chapter reports details on the calculations of the quantities necessary for the E-step of the EM algorithm. The second appendix reports the R code to get the estimated standard errors, which may be implemented in the R package ggm. Chapter 4 starts from the practical problem of classifying criminal activity. The latent class cluster model is extended by proposing a latent class model that also incorporates the longitudinal structure of data using a method similar to a local likelihood approach. The data set is taken from the Home Office Offenders Index of England and Wales. It contains the complete criminal histories of a sample of those born in 1953 and followed for forty years.
18

Stares, Sally Rebecca. "Latent trait and latent class models in survey analysis : case studies in public perceptions of biotechnology." Thesis, London School of Economics and Political Science (University of London), 2008. http://etheses.lse.ac.uk/2970/.

Abstract:
In latent variable models the existence of one or more unobserved (latent) variables is posited to explain the associations between a set of observed (manifest) variables. These models are useful for analysing attitudinal survey data, where multiple items are used to capture complex constructs such as attitudes, which cannot be directly observed. In such research they are most commonly applied in the form of factor analyses based on linear regression models. However, these are inappropriate when observed items are categorical, which is often the case with attitudinal surveys. Latent trait and latent class models, based on logistic models, are then more suitable. In this thesis I demonstrate how they can be employed to address common challenges in attitudinal survey research. The case study data illustrating these challenges are from the Eurobarometer survey on public perceptions of biotechnology, fielded in 2002 in fifteen European countries. Using these data I investigate the viability of cross-nationally comparable measures of three central constructs in studies of public perceptions of biotechnology: attitudes towards applications of biotechnology, knowledge of biology and genetics, and engagement with science and with biotechnology. The analyses aim to capture these complex constructs, taking account of 'don't know' responses by including them as categories of nominal observed items, and exploring the comparability of measures of these constructs cross-nationally by assessing the similarity of measurement models between countries. The results of these analyses are informative in three ways: substantively, adding to our knowledge of people's representations of biotechnology; methodologically, increasing our understanding of how the survey items function; and practically, informing future questionnaire design. 
I also formulate a taxonomy of issues and choices in attitudinal survey research as a conceptual framework through which to discuss more broadly the potential value of latent trait and latent class models in survey research in social psychology.
19

Sheikha, Hassan. "Text mining Twitter social media for Covid-19 : Comparing latent semantic analysis and latent Dirichlet allocation." Thesis, Högskolan i Gävle, Avdelningen för datavetenskap och samhällsbyggnad, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hig:diva-32567.

Abstract:
In this thesis, Twitter is mined for information about the Covid-19 outbreak during the month of March, from the 3rd to the 31st. 100,000 tweets were collected from Harvard's open-source data and recreated using Hydrate. The data are then analyzed using different Natural Language Processing (NLP) methodologies, such as term frequency-inverse document frequency (TF-IDF), lemmatizing, tokenizing, Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA). The LSA and LDA algorithms produce dimension-reduced data, which are clustered using the clustering algorithms HDBSCAN and K-Means for later comparison. Different methodologies are used to determine the optimal parameters for the algorithms. All of this is done in the Python programming language, as there are libraries supporting this research, the most important being scikit-learn. The frequent words of each cluster are then displayed and compared with factual data regarding the outbreak to discover whether there are any correlations. The factual data are collected by the World Health Organization (WHO) and visualized in graphs on ourworldindata.org. Correlations with the results are also sought in news articles, to find significant moments and see whether they affected the top words in the clustered data. The news articles with good timelines used for correlating incidents are those of NBC News and the New York Times. The results show no direct correlations with the data reported by WHO; however, looking at the timelines reported by news sources, some correlation can be seen with the clustered data. Also, the combination of LDA and HDBSCAN yielded the most desirable results in comparison to the other combinations of dimension reduction and clustering. 
This was largely due to the use of GridSearchCV on LDA to determine the ideal parameters for the LDA models on each dataset, as well as to how well HDBSCAN clusters its data in comparison to K-Means.
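As a rough illustration of the TF-IDF weighting step named in this abstract, here is a minimal pure-Python sketch. The thesis reports using scikit-learn; the function name `tf_idf`, the toy tweets, and the unsmoothed logarithmic IDF variant below are all assumptions made for illustration, not the author's setup.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Weight = (term count / document length) * log(N / document frequency).
    This is one common TF-IDF variant, assumed here for illustration.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return weights


# Hypothetical toy corpus standing in for tokenized, lemmatized tweets.
tweets = [
    ["covid", "outbreak", "italy"],
    ["covid", "lockdown", "italy"],
    ["covid", "vaccine", "trial"],
]
weights = tf_idf(tweets)
```

A term that occurs in every document (here `covid`) receives weight zero, which is why TF-IDF tends to push corpus-wide vocabulary out of the top words of any cluster.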
20

Larsson, Patrik. "Automatisk FAQ med Latent Semantisk Analys." Thesis, Linköping University, Department of Computer and Information Science, 2009. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-53672.

Abstract:

This thesis presents techniques for automatically answering questions written in natural language, given access to a collection of previously asked questions and their respective answers.

I build a prototype system based on a database of email conversations from the HP Help Desk. The system combines Latent Semantic Analysis with a density-based clustering algorithm and a simple classification algorithm to identify frequent answers and to answer new questions.

The automatically generated answers are evaluated automatically, and the results are compared with those previously presented for the same dataset. The influence of various parameters is also studied in detail.

The study shows that this approach gives good results without requiring any linguistic preprocessing whatsoever.

21

Ozsoy, Makbule Gulcin. "Text Summarization Using Latent Semantic Analysis." Master's thesis, METU, 2011. http://etd.lib.metu.edu.tr/upload/12612988/index.pdf.

Abstract:
Text summarization solves the problem of presenting the information needed by a user in a compact form. There are different approaches to creating well-formed summaries in the literature. One of the newest methods in text summarization is the Latent Semantic Analysis (LSA) method. In this thesis, different LSA-based summarization algorithms are explained and two new LSA-based summarization algorithms are proposed. The algorithms are evaluated on Turkish and English documents, and their performance is compared using their ROUGE scores.
22

Arnekvist, Isac, and Ludvig Ericson. "Finding competitors using Latent Dirichlet Allocation." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-186386.

Abstract:
Identifying business competitors is of interest to many, but is becoming increasingly hard in an expanding global market. The aim of this report is to investigate whether Latent Dirichlet Allocation (LDA) can be used to identify and rank competitors based on distances between LDA representations of company descriptions. The performance of the LDA model was compared to that of bag-of-words and random ordering by evaluating each on a handful of common information retrieval metrics and comparing the results. Several different distance metrics were evaluated to determine which metric had the best correspondence between representation distance and companies being competitors. Cosine similarity was found to outperform the other distance metrics. While both LDA and bag-of-words representations were found to be significantly better than random ordering, LDA was found to perform worse than bag-of-words. However, computation of distance metrics was considerably faster for LDA representations. The LDA representations capture features that are not helpful for identifying competitors, and it is suggested that LDA representations could be used together with some other data source or heuristic.
There is an interest in being able to identify business competitors, but this is becoming increasingly difficult in a constantly growing and increasingly global market. The aim of this report is to investigate whether Latent Dirichlet Allocation (LDA) can be used to identify and rank competitors, by comparing the distances between the LDA representations of their company descriptions. The effectiveness of LDA for this purpose was compared with that of bag-of-words and random ordering, using some common information-theoretic measures. Several different distance measures were evaluated to determine which of them best places competing companies close to one another. In this case, cosine similarity was found to outperform the other distance measures. While both LDA and bag-of-words were found to be significantly better than random ordering, LDA was found to perform qualitatively worse than bag-of-words. Computation of distance measures was, however, considerably faster with LDA representations. Converting web content to LDA representations does, however, capture certain unspecific similarities that do not necessarily describe competitors. It may be advantageous to use LDA representations together with some additional data source and/or heuristic.
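The ranking idea described in this abstract, ordering candidate companies by cosine similarity between topic representations, can be sketched as follows. This is a minimal illustration only: the function names, the company names, and the three-topic vectors are hypothetical, and the report's actual LDA pipeline is not reproduced here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_competitors(target, candidates):
    """Return candidate names sorted by decreasing cosine similarity to target."""
    scored = [(cosine_similarity(target, vec), name) for name, vec in candidates.items()]
    return [name for _, name in sorted(scored, key=lambda s: s[0], reverse=True)]


# Hypothetical topic-proportion vectors, as an LDA model might produce
# for company descriptions (values invented for illustration).
target = [0.7, 0.2, 0.1]
candidates = {
    "company_a": [0.6, 0.3, 0.1],
    "company_b": [0.1, 0.1, 0.8],
    "company_c": [0.3, 0.4, 0.3],
}
ranking = rank_competitors(target, candidates)
```

Because topic vectors are short and dense, each comparison costs only a few multiplications, which is consistent with the abstract's remark that distance computation was faster for LDA representations than for bag-of-words.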
23

Hamzah, Hazilawati. "Latent Equine Herpesvirus Infections in Horses." Murdoch University, 2008. http://wwwlib.murdoch.edu.au/adt/browse/view/adt-MU20081022.131528.

Abstract:
A significant characteristic of the herpesviruses is that they form latent infections in infected hosts, and can be reactivated by stressors to again induce lytic infections. This thesis deals with an epidemiological investigation of equine herpesvirus infection, particularly gammaherpesvirus infections, in foals, and of whether there was evidence of reactivation of latent virus infections by stressors such as those associated with weaning. A longitudinal study of EHV infections in young foals and the effect of stressors such as weaning on the prevalence of virus infection was undertaken by the detection of DNA and mRNA of EHV2, EHV5, EHV1 and EHV4 in peripheral blood leukocytes (PBL) and nasal swabs (NS) from 13 mares and 46 foals in 4 stables. EHV2 and EHV5 infections were detected commonly in the study population but infections with the alphaherpesviruses EHV1 and EHV4 were not detected, although lytic infections by the alphaherpesviruses may have been missed due to the frequency of sampling. Age differences in the prevalence of EHV2 and EHV5 infection were detected: the prevalence of EHV2 was higher in young foals than in older foals and adult animals; the prevalence of EHV5 was higher in older foals and adults than in younger foals. The prevalence of EHV2 and EHV5 infection increased in association with weaning, presumably in association with stressors associated with weaning, but was not clearly associated with disease in the weaned animals. It was also observed that EHV5 produced a transient lytic infection in PBL of young foals but tended to produce a persistent lytic infection of PBL in older foals and adults. Persistent lytic EHV5 infections of ≥37 weeks duration were also detected in 2 of 13 adult mares and this has not been reported previously. The persistent lytic infection of PBL was not associated with the detection of virus in NS and the mares with persistent lytic infection of PBL with EHV5 did not transfer the infection to their foals. 
To determine if any of the animals examined were latently infected with the alphaherpesviruses, an examination for transcripts of genes 63 and 64 of EHV1 and EHV4, putative latency-associated transcripts (LAT) of EHV, was undertaken. Evidence of these transcripts was detected in PBL and bronchiolar lymph nodes in the absence of transcripts of the structural gB, supporting previous studies indicating that transcripts of genes 63 and 64 may represent LAT. In PBL, EHV1 gene 64 RNA transcripts, but not gene 63 transcripts, were detected. In bronchiolar lymph nodes, EHV1 gene 64 (but not gene 63) RNA transcripts were detected. In contrast, EHV4 infection was detected in the trigeminal ganglia only and there was no evidence of EHV4 infection in lymph nodes or PBL. In the trigeminal ganglion, EHV4 gB DNA and gene 63 RNA transcripts were detected. The presence of RNA transcripts of EHV1 gene 64 in PBL and lymph node in the absence of any evidence of the replication of structural proteins suggests PBL and lymph node are sites of EHV1 latent infections. The presence of EHV4 gene 63 transcripts in trigeminal ganglia in the absence of any evidence of replication of structural proteins suggests the trigeminal ganglion is the major site of latency of EHV4. As a potential means of detecting latency of the gammaherpesvirus EHV2, four EHV2 genes (ORF74, E4, E8 and E10) were selected as having possible roles during EHV2 latency based on sequence analysis and comparison with gene products identified or postulated as having roles in latency in other gammaherpesviruses. Kinetic transcription of these genes was evaluated in an in vitro time course study using a non-neuronal cell line (equine kidney [EK] cells). 
While the gB and gH genes encoding structural glycoproteins were abundantly transcribed in vitro, the 4 putative EHV2 latency-associated genes were minimally transcribed during lytic infection in EK cells, a result analogous to results obtained for the expression of LATs in other gammaherpesviruses. Attempts to demonstrate transcription products of these genes in PBL or other tissues of horses presumed to be latently infected with EHV2 (in which gB transcripts had been detected previously) and actively infected with EHV2 (in which gB transcripts had been detected at the time of sampling) were unsuccessful.
24

Christmas, Jacqueline. "Robust spatio-temporal latent variable models." Thesis, University of Exeter, 2011. http://hdl.handle.net/10036/3051.

Abstract:
Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) are widely-used mathematical models for decomposing multivariate data. They capture spatial relationships between variables, but ignore any temporal relationships that might exist between observations. Probabilistic PCA (PPCA) and Probabilistic CCA (ProbCCA) are versions of these two models that explain the statistical properties of the observed variables as linear mixtures of an alternative, hypothetical set of hidden, or latent, variables and explicitly model noise. Both the noise and the latent variables are assumed to be Gaussian distributed. This thesis introduces two new models, named PPCA-AR and ProbCCA-AR, that augment PPCA and ProbCCA respectively with autoregressive processes over the latent variables to additionally capture temporal relationships between the observations. To make PPCA-AR and ProbCCA-AR robust to outliers and able to model leptokurtic data, the Gaussian assumptions are replaced with infinite scale mixtures of Gaussians, using the Student-t distribution. Bayesian inference calculates posterior probability distributions for each of the parameter variables, from which we obtain a measure of confidence in the inference. It avoids the pitfalls associated with the maximum likelihood method: integrating over all possible values of the parameter variables guards against overfitting. For these new models the integrals required for exact Bayesian inference are intractable; instead a method of approximation, the variational Bayesian approach, is used. This enables the use of automatic relevance determination to estimate the model orders. PPCA-AR and ProbCCA-AR can be viewed as linear dynamical systems, so the forward-backward algorithm, also known as the Baum-Welch algorithm, is used as an efficient method for inferring the posterior distributions of the latent variables. 
The exact algorithm is tractable because Gaussian assumptions are made regarding the distribution of the latent variables. This thesis introduces a variational Bayesian forward-backward algorithm based on Student-t assumptions. The new models are demonstrated on synthetic datasets and on real remote sensing and EEG data.
25

Johnson, George Anthony. "Luminescence studies of latent fingerprint residue." Thesis, University of East Anglia, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.257556.

26

Baffi, Giuseppe. "Non-linear projection to latent structures." Thesis, University of Newcastle Upon Tyne, 1998. http://hdl.handle.net/10443/893.

Abstract:
This thesis focuses on the study of multivariate statistical regression techniques which have been used to produce non-linear empirical models of chemical processes, and on the development of a novel approach to non-linear Projection to Latent Structures regression. Empirical modelling relies on the availability of process data and sound empirical regression techniques which can handle variable collinearities, measurement noise, unknown variable and noise distributions and high data set dimensionality. Projection based techniques, such as Principal Component Analysis (PCA) and Projection to Latent Structures (PLS), have been shown to be appropriate for handling such data sets. The multivariate statistical projection based techniques of PCA and linear PLS are described in detail, highlighting the benefits which can be gained by using these approaches. However, many chemical processes exhibit severely non-linear behaviour and non-linear regression techniques are required to develop empirical models. The derivation of an existing quadratic PLS algorithm is described in detail. The procedure for updating the model parameters which is required by the quadratic PLS algorithms is explored and modified. A new procedure for updating the model parameters is presented and is shown to perform better than the existing algorithm. The two procedures have been evaluated on the basis of the performance of the corresponding quadratic PLS algorithms in modelling data generated with a strongly non-linear mathematical function and data generated with a mechanistic model of a benchmark pH neutralisation system. Finally, a novel approach to non-linear PLS modelling is presented, combining the general approximation properties of sigmoid neural networks and radial basis function networks with the new weights updating procedure within the PLS framework. These algorithms are shown to outperform existing neural network PLS algorithms and the quadratic PLS approaches. 
The new neural network PLS algorithms have been evaluated on the basis of their performance in modelling the same data used to compare the quadratic PLS approaches.
27

Nugent, Lianne Karen. "Latent invasion by selected Xylariaceous fungi." Thesis, Liverpool John Moores University, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.402948.

28

Eriksson, Jenny, and Pia Eskola. "Automatisk tesauruskonstruktion med latent semantisk indexering." Thesis, Högskolan i Borås, Institutionen Biblioteks- och informationsvetenskap / Bibliotekshögskolan, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:hb:diva-18226.

Abstract:
The aim of this thesis is to examine how thesauri constructed with latent semantic indexing (LSI) perform when used for query expansion. There is a well-known problem with synonymy in information retrieval, and one solution to this problem is to use a thesaurus. In this thesis, thesauri are created automatically to find statistically related words and not only synonyms. LSI is a method that uses singular value decomposition (SVD) to reduce dimensions in a matrix and find latent relationships between words. We constructed nine thesauri and used them for query expansion in a Swedish database, GP_HDINF. To evaluate the performance of the thesauri, precision and recall were used. We found some interesting results in how the thesauri performed, even though the results from this study did not show improvements in retrieval effectiveness when using the thesauri for query expansion. It is interesting to note that when recall for a topic improved, precision also improved or was unchanged.
Thesis level: D
29

Chen, George H. "Latent source models for nonparametric inference." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/99774.

Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 95-101).
Nearest-neighbor inference methods have been widely and successfully used in numerous applications such as forecasting which news topics will go viral, recommending products to people in online stores, and delineating objects in images by looking at image patches. However, there is little theoretical understanding of when, why, and how well these nonparametric inference methods work in terms of key problem-specific quantities relevant to practitioners. This thesis bridges the gap between theory and practice for these methods in the three specific case studies of time series classification, online collaborative filtering, and patch-based image segmentation. To do so, for each of these problems, we prescribe a probabilistic model in which the data appear generated from unknown "latent sources" that capture salient structure in the problem. These latent source models naturally lead to nearest-neighbor or nearest-neighbor-like inference methods similar to ones already used in practice. We derive theoretical performance guarantees for these methods, relating inference quality to the amount of training data available and the problem-specific structure modeled by the latent sources.
by George H. Chen.
Ph. D.
30

Choudhury, Charisma Farheen 1978. "Modeling driving decisions with latent plans." Thesis, Massachusetts Institute of Technology, 2007. http://hdl.handle.net/1721.1/42220.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2007.
Includes bibliographical references (p. 227-238).
Driving is a complex task that includes a series of interdependent decisions. In many situations, these decisions are based on a specific plan. The plan is, however, unobserved or latent, and only the manifestations of the plan through actions are observed. Examples include selection of a target lane before execution of the lane change and choice of a merging tactic before execution of the merge. Changes in circumstances (e.g. reaction of the neighboring drivers, delay in execution) can lead to updates to the initially chosen plan. These latent plans are ignored in the state-of-the-art driving behavior models. Use of these myopic models in traffic simulators often leads to unrealistic traffic flow characteristics and incorrect representation of congestion. A modeling methodology has been formulated to address the effects of unobserved plans in the decisions of the drivers and hence overcome the deficiency of the existing driving behavior models and simulation tools. The actions of the driver are conditional on the current plan. The current plan can depend on previous plans and be influenced by anticipated future conditions. A Hidden Markov Model is used to address the effect of previous plans in the choice of the current plan and to capture the state-dependence among decisions. Effects of anticipated future circumstances in the current plan are captured through predicted conditions based on current information. The heterogeneity in decision making and planning capabilities of drivers is explicitly addressed. The methodology has been applied in developing driving behavior models for four traffic scenarios: freeway lane changing, freeway merging, urban intersection lane choice and urban arterial lane changing. In all applications, the models are estimated with disaggregate trajectory data using the maximum likelihood technique.
Estimation results show that the latent plan models have a significantly better goodness-of-fit compared to the 'reduced form' models where the latent plans are ignored and only the choice of actions is modeled. The justifications for using the latent plan modeling approach are further strengthened by validation case studies within the microscopic traffic simulator MITSIMLab where the simulation capabilities of the latent plan models are compared against the reduced form models. In all cases, the latent plan models better replicate the observed traffic conditions.
by Charisma Farheen Choudhury.
Ph.D.
31

Wanigasekara, Prashan. "Latent state space models for prediction." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/106269.

Abstract:
Thesis: S.M. in Engineering and Management, Massachusetts Institute of Technology, School of Engineering, System Design and Management Program, Engineering and Management Program, 2016.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 95-98).
In this thesis, I explore a novel algorithm to model the joint behavior of multiple correlated signals. Our chosen example is the ECG (electrocardiogram) and ABP (arterial blood pressure) signals from patients in the ICU (intensive care unit). I then use the generated models to predict blood pressure levels of ICU patients based on their historical ECG and ABP signals. The algorithm used is a variant of a Hidden Markov Model. The new extension is termed the Latent State Space Copula Model. In the novel Latent State Space Copula Model, the ECG and ABP signals are considered to be correlated and are modeled using a bivariate Gaussian copula with Weibull marginals generated by a hidden state. We assume that there are hidden patient "states" that transition from one hidden state to another, driving a joint ECG-ABP behavior. We estimate the parameters of the model using a novel Gibbs sampling approach. Using this model, we generate predictors that are the state probabilities at any given time step and use them to predict a patient's future health condition. The predictions made by the model are binary and detect whether the mean arterial pressure (MAP) is going to be above or below a certain threshold at a future time step. Towards the end of the thesis I compare the new Latent State Space Copula Model with a state-of-the-art Classical Discrete HMM. The Latent State Space Copula Model achieves an area under the ROC curve (AUROC) of .7917 for 5 states, while the Classical Discrete HMM achieves an AUROC of .7609 for 5 states.
by Prashan Wanigasekara.
S.M. in Engineering and Management
32

Paquet, Ulrich. "Bayesian inference for latent variable models." Thesis, University of Cambridge, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.613111.

33

O'Sullivan, Aidan Michael. "Bayesian latent variable models with applications." Thesis, Imperial College London, 2013. http://hdl.handle.net/10044/1/19191.

Abstract:
The massive increases in computational power that have occurred over the last two decades have contributed to the increasing prevalence of Bayesian reasoning in statistics. The often intractable integrals required as part of the Bayesian approach to inference can be approximated or estimated using intensive sampling or optimisation routines. This has extended the realm of applications beyond simple models for which fully analytic solutions are possible. Latent variable models are ideally suited to this approach as it provides a principled method for resolving one of the more difficult issues associated with this class of models, the question of the appropriate number of latent variables. This thesis explores the use of latent variable models in a number of different settings employing Bayesian methods for inference. The first strand of this research focusses on the use of a latent variable model to perform simultaneous clustering and latent structure analysis of multivariate data. In this setting the latent variables are of key interest, providing information on the number of sub-populations within a heterogeneous data set and also the differences in latent structure that define them. In the second strand latent variable models are used as a tool to study relational or network data. The analysis of this type of data, which describes the interconnections between different entities or nodes, is complicated by the dependencies between nodes induced by these connections. The conditional independence assumptions of the latent variable framework provide a means of taking these dependencies into account: the nodes are independent conditional on an associated latent variable. This allows us to perform model-based clustering of a network, making inference on the number of clusters. 
Finally the latent variable representation of the network, which captures the structure of the network in a different form, can be studied as part of a latent variable framework for detecting differences between networks. Approximation schemes are required as part of the Bayesian approach to model estimation. The two methods that are considered in this thesis are stochastic Markov chain Monte Carlo methods and deterministic variational approximations. Where possible these are extended to incorporate model selection over the number of latent variables and a comparison, the first of its kind in this setting, of their relative performance in unsupervised model selection for a range of different settings is presented. The findings of the study help to ascertain in which settings one method may be preferred to the other.
34

Ernste, Huib, and Manfred M. Fischer. "Latent Class Modeling and Typological Analysis." WU Vienna University of Economics and Business, 1991. http://epub.wu.ac.at/4222/1/WSG_DP_1191.pdf.

35

Choubey, Rahul. "Tag recommendation using Latent Dirichlet Allocation." Thesis, Kansas State University, 2011. http://hdl.handle.net/2097/9785.

Abstract:
Master of Science
Department of Computing and Information Sciences
Doina Caragea
The vast amount of data present on the internet calls for ways to label and organize this data according to specific categories, in order to facilitate search and browsing activities. This can be easily accomplished by making use of folksonomies and user provided tags. However, it can be difficult for users to provide meaningful tags. Tag recommendation systems can guide the users towards informative tags for online resources such as websites, pictures, etc. The aim of this thesis is to build a system for recommending tags to URLs available through a bookmark sharing service called BibSonomy. We assume that the URLs for which we recommend tags do not have any prior tags assigned to them. Two approaches are proposed to address the tagging problem, both of them based on Latent Dirichlet Allocation (LDA) (Blei et al. [2003]). LDA is a generative and probabilistic topic model which aims to infer the hidden topical structure in a collection of documents. According to LDA, documents can be seen as mixtures of topics, while topics can be seen as mixtures of words (in our case, tags). The first approach that we propose, called the topic words based approach, recommends the top words in the top topics representing a resource as tags for that particular resource. The second approach, called the topic distance based approach, uses the tags of the most similar training resources (identified using the KL-divergence, Kullback and Leibler [1951]) to recommend tags for an untagged test resource. The dataset used in this work was made available through the ECML/PKDD Discovery Challenge 2009. We construct the documents that are provided as input to LDA in two ways, thus producing two different datasets. In the first dataset, we use only the description and the tags (when available) corresponding to a URL. In the second dataset, we crawl the URL content and use it to construct the document. 
Experimental results show that the LDA approach is not very effective at recommending tags for new untagged resources. However, using the resource content gives better results than using the description only. Furthermore, the topic distance based approach is better than the topic words based approach, when only the descriptions are used to construct documents, while the topic words based approach works better when the contents are used to construct documents.
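The topic distance based approach described in this abstract can be sketched roughly as below: compute the KL-divergence between the topic distribution of an untagged resource and those of tagged training resources, then reuse the tags of the closest ones. Everything named here (`kl_divergence`, `recommend_tags`, the `eps` smoothing, the toy topic distributions and tags) is a hypothetical illustration, not the thesis's implementation.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    eps guards against division by zero and log(0); this smoothing is an
    assumption of the sketch, not something stated in the abstract.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0
    )


def recommend_tags(test_topics, training, k=1):
    """Collect tags from the k training resources with the smallest KL-divergence."""
    nearest = sorted(
        training, key=lambda r: kl_divergence(test_topics, r["topics"])
    )[:k]
    tags = []
    for resource in nearest:
        for tag in resource["tags"]:
            if tag not in tags:  # keep first occurrence, preserve order
                tags.append(tag)
    return tags


# Hypothetical per-resource topic distributions (as LDA might infer) and tags.
training = [
    {"topics": [0.8, 0.1, 0.1], "tags": ["python", "programming"]},
    {"topics": [0.1, 0.8, 0.1], "tags": ["cooking", "recipes"]},
]
suggested = recommend_tags([0.7, 0.2, 0.1], training)
```

Note that KL-divergence is asymmetric, so the direction chosen here (test resource first) is itself an assumption; a symmetrized variant would be an equally plausible reading of the abstract.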
36

Zhang, Cheng. "Structured Representation Using Latent Variable Models." Doctoral thesis, KTH, Datorseende och robotik, CVAP, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-191455.

Abstract:
Over the past two centuries the industrial revolution automated a great part of work that involved human muscles. Recently, since the beginning of the 21st century, the focus has shifted towards automating work that involves our brain to further improve our lives. This is accomplished by establishing human-level intelligence through machines, which has led to the growth of the field of artificial intelligence. Machine learning is a core component of artificial intelligence. While artificial intelligence focuses on constructing an entire intelligence system, machine learning focuses on the learning ability and the ability to further use the learned knowledge for different tasks. This thesis targets the field of machine learning, especially structured representation learning, which is key for various machine learning approaches. Humans sense the environment, extract information and make action decisions based on abstracted information. Similarly, machines receive data, abstract information from data through models and make decisions about the unknown through inference. Thus, models provide a mechanism for machines to abstract information. This commonly involves learning useful representations which are desirably compact, interpretable and useful for different tasks. In this thesis, the contribution relates to the design of efficient representation models with latent variables. To make the models useful, efficient inference algorithms are derived to fit the models to data. We apply our models to various applications from different domains, namely E-health, robotics, text mining, computer vision and recommendation systems. The main contribution of this thesis relates to advancing latent variable models and deriving associated inference schemes for representation learning. This is pursued in three different directions. Firstly, through supervised models, where better representations can be learned knowing the tasks, corresponding to situated knowledge of humans. 
Secondly, through structured representation models, with which different structures, such as factorized ones, are used for latent variable models to form more efficient representations. Finally, through non-parametric models, where the representation is determined completely by the data. Specifically, we propose several new models combining supervised learning and factorized representation as well as a further model combining non-parametric modeling and supervised approaches. Evaluations show that these new models provide generally more efficient representations and a higher degree of interpretability. Moreover, this thesis contributes by applying these proposed models in different practical scenarios, demonstrating that these models can provide efficient latent representations. Experimental results show that our models improve the performance for classical tasks, such as image classification and annotation, and robotic scene and action understanding. Most notably, one of our models is applied to a novel problem in E-health, namely diagnostic prediction using discomfort drawings. Experimental investigation shows here that our model can achieve significant results in automatic diagnosis and provides profound understanding of typical symptoms. This motivates novel decision support systems for healthcare personnel.


37

Zhang, Dengfeng. "Latent Class Model in Transportation Study." Diss., Virginia Tech, 2015. http://hdl.handle.net/10919/51203.

Abstract:
Statistics, as a critical component in transportation research, has been widely used to analyze driver safety, travel time, traffic flow and numerous other problems. Many of these popular topics can be interpreted as establishing statistical models for the latent structure of data. Over the past several years, the interest in latent class models has continuously increased due to their great potential in solving practical problems. In this dissertation, I developed several latent class models to quantitatively analyze the hidden structure of transportation data and addressed related application issues. The first model is focused on the uncertainty of travel time, which is critical for assessing the reliability of transportation systems. Travel time is random in nature, and contains substantial variability, especially under congested traffic conditions. A Bayesian mixture model, with the ability to incorporate the influence from covariates such as traffic volume, has been proposed. This model advances the previous multi-state travel time reliability model in which the relationship between response and predictors was lacking. The Bayesian mixture travel time model, however, lacks the power to accurately predict future travel time. The analysis indicates that the independence assumption, which is difficult to justify in real data, could be a potential issue. Therefore, I proposed a Hidden Markov model to accommodate dependency structure, and the modeling results were significantly improved. The second and third parts of the dissertation focus on driver safety identification. Given the demographic information and crash history, the number of crashes, as a type of count data, is commonly modeled by Poisson regression. However, the over-dispersion issue within the data implies that a single Poisson distribution is insufficient to depict the substantial variability. A Poisson mixture model is proposed and applied to identify risky and safe drivers. 
The lower bound of the estimated misclassification rate is evaluated using the concept of overlap probability. Several theoretical results have been discussed regarding the overlap probability. I also introduced quantile regression based on discrete data to specifically model the high-risk drivers. In summary, the major objective of my research is to develop latent class methods and explore the hidden structure within the transportation data, and the approaches I employed can also be implemented for similar research questions in other areas.
Ph. D.
38

McCarthy, Catherine M. "Latent Vulnerability Among Low-Risk Adolescents." Diss., Temple University Libraries, 2010. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/95153.

Abstract:
Educational Psychology
Ph.D.
This longitudinal study assessed education achievement outcomes among a cohort of eighth graders for whom future college-level academic success would be predicted. The sample was drawn from the NELS:88 database and was comprised of students who scored in the top quintile on a mathematics achievement test and who were identified as representing the top two quartiles of a measurement of socio-economic status. This group, identified as low-risk for academic failure, was predicted to attain a bachelor's degree by the age of twenty-six. A subgroup from among this sample did not attain a bachelor's degree by age twenty-six. In the interest of illuminating features of latent vulnerability, differences between the two groups were explored. Data from the nationally representative sample of 2,355 students was analyzed using several approaches. Results suggest that certain vulnerabilities which may be considered to be dormant (e.g., negative self-concept), eventually have negative effects on academic outcomes for the non-graduating group despite predictions to the contrary. These adolescents exhibit features of latent vulnerability.
Temple University--Theses
39

Surian, Didi. "Novel Applications Using Latent Variable Models." Thesis, The University of Sydney, 2015. http://hdl.handle.net/2123/14014.

Abstract:
Latent variable models have achieved great success in many research communities, including machine learning, information retrieval, data mining, natural language processing, etc. Latent variable models use an assumption that the data, which is observable, has an affinity to some hidden/latent variables. In this thesis, we present a suite of novel applications using latent variable models. In particular, we (i) extend topic models using directional distributions, (ii) propose novel solutions using latent variable models to detect outliers (anomalies) and (iii) answer the cross-modal retrieval problem. We present a study of directional distributions in modeling data. Specifically, we implement the von Mises-Fisher (vMF) distribution and develop latent variable models which are based on directed graphical models. Directed graphical models are commonly used to represent the conditional dependency among the variables. Under Bayesian treatment, we propose approximate posterior inference algorithms using variational methods for the models. We show that incorporating the vMF distribution improves the quality of clustering compared with word count-based topic models. Furthermore, with the properties of directional distributions in hand, we extend the applications to detect outliers in various data sets and settings. Finally, we present latent variable models that are based on supervised learning to answer the cross-modal retrieval problem. In the cross-modal retrieval problem, the objective is to find matching content across different modalities such as text and image. We explore various approaches such as using one-class learning methods, generating negative instances and using ranking methods. We show that our models outperform generic approaches such as Canonical Correlation Analysis (CCA) and its variants.
40

Hamzah, Hazilawati. "Latent equine herpesvirus infections in horses." PhD thesis, Murdoch University, 2008. https://researchrepository.murdoch.edu.au/id/eprint/731/.

Abstract:
A significant characteristic of the herpesviruses is that they form latent infections in infected hosts, and can be reactivated to again induce lytic infections by stressors. This thesis deals with an epidemiological investigation of equine herpesvirus infection, particularly gammaherpesvirus infections, in foals, and whether there was evidence of reactivation of latent virus infections by stressors such as those associated with weaning. A longitudinal study of EHV infections in young foals and the effect of stressors such as weaning on the prevalence of virus infection was undertaken by the detection of DNA and mRNA of EHV2, EHV5, EHV1 and EHV4 in peripheral blood leukocytes (PBL) and nasal swabs (NS) from 13 mares and 46 foals in 4 stables. EHV2 and EHV5 infections were detected commonly in the study population but infections with the alphaherpesviruses EHV1 and EHV4 were not detected, although lytic infections by the alphaherpesviruses may have been missed due to the frequency of sampling. Age differences in the prevalence of EHV2 and EHV5 infection were detected: the prevalence of EHV2 was higher in young foals than in older foals and adult animals; the prevalence of EHV5 was higher in older foals and adults than in younger foals. The prevalence of EHV2 and EHV5 infection increased in association with weaning, presumably in association with stressors associated with weaning, but was not clearly associated with disease in the weaned animals. It was also observed that EHV5 produced a transient lytic infection in PBL of young foals but tended to produce a persistent lytic infection of PBL in older foals and adults. Persistent lytic EHV5 infections of ≥37 weeks duration were also detected in 2 of 13 adult mares and this has not been reported previously. The persistent lytic infection of PBL was not associated with the detection of virus in NS and the mares with persistent lytic infection of PBL with EHV5 did not transfer the infection to their foals. 
To determine if any of the animals examined were latently infected with the alphaherpesviruses, an examination for transcripts of genes 63 and 64 of EHV1 and EHV4, putative latency-associated transcripts (LAT) of EHV, was undertaken. Evidence of these transcripts was detected in PBL and bronchiolar lymph nodes in the absence of transcripts of the structural gB, supporting previous studies indicating that transcripts of genes 63 and 64 may represent LAT. In PBL, EHV1 gene 64 RNA transcripts, but not gene 63 transcripts, were detected. In bronchiolar lymph nodes, EHV1 gene 64 (but not gene 63) RNA transcripts were detected. In contrast, EHV4 infection was detected in the trigeminal ganglia only and there was no evidence of EHV4 infection in lymph nodes or PBL. In the trigeminal ganglion, EHV4 gB DNA and gene 63 RNA transcripts were detected. The presence of RNA transcripts of EHV1 gene 64 in PBL and lymph node in the absence of any evidence of the replication of structural proteins suggests PBL and lymph node are sites of EHV1 latent infections. The presence of EHV4 gene 63 transcripts in trigeminal ganglia in the absence of any evidence of replication of structural proteins suggests the trigeminal ganglion is the major site of latency of EHV4. As a potential means of detecting latency of the gammaherpesvirus EHV2, 4 EHV2 genes (ORF74, E4, E8 and E10) were selected as having possible roles during EHV2 latency based on sequence analysis and comparison with gene products identified or postulated as having roles in latency in other gammaherpesviruses. Kinetic transcription of these genes was evaluated in an in vitro time course study using a non-neuronal cell line (equine kidney [EK] cells). 
While the gB and gH genes encoding structural glycoproteins were abundantly transcribed in vitro, the 4 putative EHV2 latency-associated genes were minimally transcribed during lytic infection in EK cells, a result analogous to results obtained for the expression of LATs in other gammaherpesviruses. Attempts to demonstrate transcription products of these genes in PBL or other tissues of horses presumed to be latently infected with EHV2 (in which gB transcripts had been detected previously) and actively infected with EHV2 (in which gB transcripts had been detected at the time of sampling) were unsuccessful.
41

Hamzah, Hazilawati. "Latent equine herpesvirus infections in horses." PhD thesis, Murdoch University, 2008. http://researchrepository.murdoch.edu.au/731/.

42

Parsons, S. "Approximation methods for latent variable models." Thesis, University College London (University of London), 2016. http://discovery.ucl.ac.uk/1513250/.

Abstract:
Modern statistical models are often intractable, and approximation methods can be required to perform inference on them. Many different methods can be employed in most contexts, but not all are fully understood. The current thesis is an investigation into the use of various approximation methods for performing inference on latent variable models. Composite likelihoods are used as surrogates for the likelihood function of state space models (SSM). In chapter 3, variational approximations to their evaluation are investigated, and the interaction of biases as composite structure changes is observed. The bias effect of increasing the block size in composite likelihoods is found to balance the statistical benefit of including more data in each component. Predictions and smoothing estimates are made using approximate Expectation-Maximisation (EM) techniques. Variational EM estimators are found to produce predictions and smoothing estimates of a lesser quality than stochastic EM estimators, but at a massively reduced computational cost. Surrogate latent marginals are introduced in chapter 4 into a non-stationary SSM with i.i.d. replicates. They are cheap to compute, and break functional dependencies on parameters for previous time points, giving estimation algorithms linear computational complexity. Gaussian variational approximations are integrated with the surrogate marginals to produce an approximate EM algorithm. Using these Gaussians as proposal distributions in importance sampling is found to offer a positive trade-off in terms of the accuracy of predictions and smoothing estimates made using estimators. A cheap-to-compute, model-based hierarchical clustering algorithm is proposed in chapter 5. A cluster dissimilarity measure based on method of moments estimators is used to avoid likelihood function evaluation. 
Computation time for hierarchical clustering sequences is further reduced with the introduction of short-lists that are linear in the number of clusters at each iteration. The resulting clustering sequences are found to have plausible characteristics in both real and synthetic datasets.
43

Oldmeadow, Christopher. "Latent variable models in statistical genetics." Thesis, Queensland University of Technology, 2009. https://eprints.qut.edu.au/31995/1/Christopher_Oldmeadow_Thesis.pdf.

Abstract:
Understanding the complexities that are involved in the genetics of multifactorial diseases is still a monumental task. In addition to environmental factors that can influence the risk of disease, there are also a number of other complicating factors. Genetic variants associated with age of disease onset may be different from those variants associated with overall risk of disease, and variants may be located in positions that are not consistent with the traditional protein coding genetic paradigm. Latent Variable Models are well suited for the analysis of genetic data. A latent variable is one that we do not directly observe, but which is believed to exist or is included for computational or analytic convenience in a model. This thesis presents a mixture of methodological developments utilising latent variables, and results from case studies in genetic epidemiology and comparative genomics. Epidemiological studies have identified a number of environmental risk factors for appendicitis, but the disease aetiology of this oft thought useless vestige remains largely a mystery. The effects of smoking on other gastrointestinal disorders are well documented, and in light of this, the thesis investigates the association between smoking and appendicitis through the use of latent variables. By utilising data from a large Australian twin study questionnaire as both cohort and case-control, evidence is found for the association between tobacco smoking and appendicitis. Twin and family studies have also found evidence for the role of heredity in the risk of appendicitis. Results from previous studies are extended here to estimate the heritability of age-at-onset and account for the effect of smoking. This thesis presents a novel approach for performing a genome-wide variance components linkage analysis on transformed residuals from a Cox regression. 
This method finds evidence for a different subset of genes responsible for variation in age at onset than those associated with overall risk of appendicitis. Motivated by increasing evidence of functional activity in regions of the genome once thought of as evolutionary graveyards, this thesis develops a generalisation to the Bayesian multiple changepoint model on aligned DNA sequences for more than two species. This sensitive technique is applied to evaluating the distributions of evolutionary rates, with the finding that they are much more complex than previously apparent. We show strong evidence for at least 9 well-resolved evolutionary rate classes in an alignment of four Drosophila species and at least 7 classes in an alignment of four mammals, including human. A pattern of enrichment and depletion of genic regions in the profiled segments suggests they are functionally significant, and most likely consist of various functional classes. Furthermore, a method of incorporating alignment characteristics representative of function such as GC content and type of mutation into the segmentation model is developed within this thesis. Evidence of fine-structured segmental variation is presented.
44

Marshall, Neil A. "The role of EBV latent membrane protein 1 induced regulatory T-cells in latent infection and Hodgkin lymphoma." Thesis, University of Aberdeen, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.430397.

Abstract:
Healthy EBV seropositive donors tested for the ability to mount Th responses against LMP1 responded with secretion of high levels of the immunosuppressive cytokine IL-10, which was secreted from cells phenotypically analogous to the Tr1 class of regulatory T-cells. The epitopes inducing this IL-10 secretion were clustered in the hydrophobic, transmembrane half of the protein that corresponded to a cluster of high affinity MHC class II binding domains. Since the LMP1 induced Tr cells could effectively suppress immune responses in an IL-10 dependent manner, it seems likely that the induction of regulatory T-cells serves to prevent anti-EBV immune responses and aid viral persistence. PBMC and HLIL from HL patients also responded to stimulation with LMP1 by predominantly secreting IL-10. However, these patients appeared to mount weaker responses compared to healthy donors. In addition, a defect in their ability to mount Th1 responses against a range of control stimuli was documented. HL patient HLIL were highly enriched for populations of both Tr1 and CD4CD25 regulatory T-cells, which were strongly suppressive. Such enriched populations of regulatory T-cells may be induced by LMP1 in EBV positive cases of HL and will act to prevent anti-tumour immune responses and aid tumour growth and persistence. Thus the Th cell response to LMP1 is dominated by the induction of regulatory T-cells in both latently infected donors and HL patients. This strategy of induction of Tr cells should serve to protect EBV infected cells (either latently infected B-cells or H-RS cells) from clearance by the host. This strategy of immune evasion via the induction of Tr cells may well be found with other persistent viral infections and tumours.
45

Esmail, Hanif. "How latent is 'latent' tuberculosis? : the radiographic, transcriptional and immunological characterisation of subclinical tuberculosis in HIV infected adults." Thesis, Imperial College London, 2014. http://hdl.handle.net/10044/1/30658.

Abstract:
Background: The central hypothesis of the thesis is that the neat division of tuberculosis (TB) into states of active disease and latent infection is an oversimplification and that the transition between latent and active TB involves passage through a subclinical phase of disease, which may be prolonged, during which pathology evolves. The primary aim of this thesis is to utilise [18F]-fluoro-2-deoxy-D-glucose positron emission tomography combined with computed tomography (FDG-PET/CT) to identify and define intra-thoracic pathology consistent with subclinical TB in a cohort of asymptomatic adults diagnosed with latent TB at high risk of developing active TB (due to HIV co-infection) and then to identify transcriptional and immunological biomarkers that distinguish those with radiographic evidence of subclinical pathology. Such biomarkers may have future translational potential as tests more predictive of active TB compared to the currently available tests (tuberculin skin testing (TST) and interferon gamma release assays (IGRA)) and may also aid our understanding of the biology of TB. Methodology: Healthy HIV infected, ART naïve, adult outpatients living in an area with very high TB burden (Khayelitsha township, Cape Town, South Africa) were screened to identify 35 participants that were asymptomatic, with CD4 count ≥ 350/mm3, evidence of latent TB (by QuantiFERON Gold in tube (QFGIT)) and with no history of previous tuberculosis or evidence of current active TB. These participants had FDG-PET/CT performed and were then commenced on isoniazid preventive therapy (IPT) or standard TB therapy if clinically indicated and had repeat FDG-PET/CT following treatment. A number of additional groups of HIV infected and uninfected control participants with and without active TB were also recruited for blood sampling. 
Microarray, carried out on RNA extracted from whole blood, was used to identify differentially abundant transcripts between those with and without subclinical pathology. A 38-plex assay and ELISA covering a total of 45 analytes were then used to identify serological or QFGIT plasma biomarkers that distinguish those with and without subclinical pathology. Main Results: Parenchymal abnormalities in the 35 participants were evaluated in detail and interpreted in relation to the historical autopsy data and 28.6% were categorised as having evidence of subclinical TB pathology. Analysis of the whole blood microarray for these 35 participants along with 15 age, sex and CD4 count matched controls with clinical active TB identified 82 transcripts that clustered 80% of those with subclinical TB with active TB. Those with more metabolically active subclinical pathology, as determined by FDG uptake, clustered more effectively with clinical active TB. This signature was confirmed as specific to TB in HIV uninfected controls. Transcripts relating to the classical complement pathway and Fcγ receptor were found to be overabundant in subclinical and active TB in relation to those with latent TB with no evidence of subclinical pathology. Neutrophil related transcripts were over abundant only in clinical active TB, particularly in those that were smear positive. Network analysis of the 82 transcript signature, informed the selection of 45 soluble protein analytes. 10 analytes showed a significant difference in concentration between the 3 groups (active, subclinical and latent TB). IL-1α with a cut-off of 16.9 pg/mL and circulating immune complex (CIC) with a cut-off of 100.9 μg Eq/mL individually classified 50% and together 70% of those with subclinical TB as active TB. In addition when assessed across 5 stages of increasing disease activity by PET findings and smear status all 10 analytes showed a significant increasing trend. 
Conclusion: The utility of FDG-PET/CT, a novel research tool in the study of latent TB in humans, has been systematically evaluated for the first time in this thesis. It has allowed for the identification of pathology within the lungs consistent with subclinical TB not reliably identified on CXR. Microarray analysis of whole blood has contributed to our understanding of which biological processes may be pertinent from the early subclinical stages of disease, suggesting that the classical complement pathway and overabundance of Fcγ receptor may be important. Furthermore, the approach has led to the identification of transcriptional and serological biomarkers that distinguish those with subclinical pathology from those without. These biomarkers may have translational potential as more predictive diagnostic tests for active TB.
46

Kao, Ling-Jing. "Data augmentation for latent variables in marketing." Columbus, Ohio : Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1155653751.

47

Martino, Sara. "Approximate Bayesian Inference for Latent Gaussian Models." Doctoral thesis, Norwegian University of Science and Technology, Department of Mathematical Sciences, 2007. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-1949.

Abstract:

This thesis consists of five papers, presented in chronological order. Their content is summarised in this section.

Paper I introduces the approximation tool for latent GMRF models and discusses, in particular, the approximation for the posterior of the hyperparameters θ in equation (1). It is shown that this approximation is indeed very accurate, as even long MCMC runs cannot detect any error in it. A Gaussian approximation to the density of χi|θ, y is also discussed. This appears to give reasonable results and it is very fast to compute. However, slight errors are detected when comparing the approximation with long MCMC runs. These are mostly due to the fact that a possibly skewed density is approximated via a symmetric one. Paper I also presents some details about sparse matrix algorithms.

The core of the thesis is presented in Paper II. Here most of the remaining issues present in Paper I are solved. Three different approximations for χi|θ, y with different degrees of accuracy and computational costs are described. Moreover, ways to assess the approximation error and considerations about the asymptotic behaviour of the approximations are also discussed. Through a series of examples covering a wide range of commonly used latent GMRF models, the approximations are shown to give extremely accurate results in a fraction of the computing time used by MCMC algorithms.

Paper III applies the same ideas as Paper II to generalised linear mixed models where χ represents a latent variable at n spatial sites on a two dimensional domain. Out of these n sites k, with n >> k, are observed through data. The n sites are assumed to be on a regular grid and wrapped on a torus. For the class of models described in Paper III the computations are based on the discrete Fourier transform instead of sparse matrices. Paper III also illustrates how the marginal likelihood π(y) can be approximated, provides approximate strategies for Bayesian outlier detection and performs approximate evaluation of spatial experimental design.

Paper IV presents yet another application of the ideas in Paper II. Here approximate techniques are used to do inference on multivariate stochastic volatility models, a class of models widely used in financial applications. Paper IV also discusses problems deriving from the increased dimension of the parameter vector θ, a condition which makes all numerical integration more computationally intensive. Different approximations for the posterior marginals of the parameters θ, π(θi|y), are also introduced. Approximations to the marginal likelihood π(y) are used in order to perform model comparison.

Finally, Paper V is a manual for a program, named inla, which implements all approximations described in Paper II. A large series of worked-out examples, covering many well known models, illustrates the use and the performance of the inla program. This program is a valuable instrument since it makes most of the Bayesian inference techniques described in this thesis easily available for everyone.

48

Jasiak, Joanna. "Three essays on econometrics of latent variables." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1996. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/nq21473.pdf.

49

Burnham, Alison J. "Multivariate latent variable regression, modelling and estimation." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape11/PQDD_0006/NQ42728.pdf.

50

Burnham, Alison J. "Multivariate latent variable regression : modelling and estimation /." *McMaster only, 1997.

