
Theses on the subject « Modèle latent »



Consult the 50 best theses for your research on the subject « Modèle latent ».


You can also download the full text of the scholarly publication in PDF format and read its abstract online when this information is included in the metadata.

Browse theses on a wide range of disciplines and organize your bibliography correctly.

1

Brault, Vincent. « Estimation et sélection de modèle pour le modèle des blocs latents ». Thesis, Paris 11, 2014. http://www.theses.fr/2014PA112238/document.

Full text
Abstract:
Le but de la classification est de partager des ensembles de données en sous-ensembles les plus homogènes possibles, c'est-à-dire que les membres d'une classe doivent plus se ressembler entre eux qu'aux membres des autres classes. Le problème se complique lorsque le statisticien souhaite définir des groupes à la fois sur les individus et sur les variables. Le modèle des blocs latents définit une loi pour chaque croisement de classe d'objets et de classe de variables, et les observations sont supposées indépendantes conditionnellement au choix de ces classes. Toutefois, il est impossible de factoriser la loi jointe des labels, empêchant le calcul de la log-vraisemblance et l'utilisation de l'algorithme EM. Plusieurs méthodes et critères existent pour retrouver ces partitions, certains fréquentistes, d'autres bayésiens, certains stochastiques, d'autres non. Dans cette thèse, nous avons d'abord proposé des conditions suffisantes pour obtenir l'identifiabilité. Dans un second temps, nous avons étudié deux algorithmes proposés pour contourner le problème de l'algorithme EM : VEM de Govaert et Nadif (2008) et SEM-Gibbs de Keribin, Celeux et Govaert (2010). En particulier, nous avons analysé la combinaison des deux et mis en évidence des raisons pour lesquelles les algorithmes dégénèrent (terme utilisé pour dire qu'ils renvoient des classes vides). En choisissant des lois a priori judicieuses, nous avons ensuite proposé une adaptation bayésienne permettant de limiter ce phénomène. Nous avons notamment utilisé un échantillonneur de Gibbs dont nous proposons un critère d'arrêt basé sur la statistique de Brooks-Gelman (1998). Nous avons également proposé une adaptation de l'algorithme Largest Gaps (Channarond et al. (2012)). En reprenant leurs démonstrations, nous avons démontré que les estimateurs des labels et des paramètres obtenus sont consistants lorsque le nombre de lignes et de colonnes tend vers l'infini. De plus, nous avons proposé une méthode pour sélectionner le nombre de classes en ligne et en colonne, dont l'estimation est également consistante à condition que le nombre de lignes et de colonnes soit très grand. Pour estimer le nombre de classes, nous avons étudié le critère ICL (Integrated Completed Likelihood) dont nous avons proposé une forme exacte. Après avoir étudié l'approximation asymptotique, nous avons proposé un critère BIC (Bayesian Information Criterion), puis nous conjecturons que les deux critères sélectionnent les mêmes résultats et que ces estimations seraient consistantes, conjecture appuyée par des résultats théoriques et empiriques. Enfin, nous avons comparé les différentes combinaisons et proposé une méthodologie pour faire une analyse croisée de données.
Classification aims at partitioning data sets into subsets that are as homogeneous as possible; the observations in a class are more similar to each other than to observations from other classes. The problem is compounded when the statistician wants to obtain a cross-classification of the individuals and the variables. The latent block model uses a law for each crossing of an object class and a variable class, and observations are assumed to be independent conditionally on the choice of these classes. However, factorizing the joint distribution of the labels is impossible, obstructing the calculation of the log-likelihood and the use of the EM algorithm. Several methods and criteria exist to find these partitions, some frequentist, some Bayesian, some stochastic, some not. In this thesis, we first proposed sufficient conditions to obtain the identifiability of the model. In a second step, we studied two algorithms proposed to counteract the problem of the EM algorithm: the VEM algorithm (Govaert and Nadif (2008)) and the SEM-Gibbs algorithm (Keribin, Celeux and Govaert (2010)). In particular, we analyzed the combination of both and highlighted why the algorithms degenerate (a term used to say that they return empty classes). By choosing judicious priors, we then proposed a Bayesian adaptation to limit this phenomenon. In particular, we used a Gibbs sampler and we proposed a stopping criterion based on the statistic of Brooks-Gelman (1998). We also proposed an adaptation of the Largest Gaps algorithm (Channarond et al. (2012)). Following their proofs, we showed that the label and parameter estimators obtained are consistent when the numbers of rows and columns tend to infinity. Furthermore, we proposed a method to select the number of classes in rows and columns, whose estimate is also consistent when the numbers of rows and columns are very large. To estimate the number of classes, we studied the ICL criterion (Integrated Completed Likelihood), for which we proposed an exact form. After studying the asymptotic approximation, we proposed a BIC criterion (Bayesian Information Criterion), and we conjecture that the two criteria select the same results and that these estimates are consistent; this conjecture is supported by theoretical and empirical results. Finally, we compared the different combinations and proposed a methodology for co-clustering.
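
To make the latent block model concrete, here is a minimal illustrative sketch (not code from the thesis): row and column labels are drawn independently from assumed class proportions, and each cell of a binary data matrix is Bernoulli with a parameter indexed by the crossing of its row class and column class. All dimensions and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions and class proportions
n_rows, n_cols = 200, 150
pi = np.array([0.5, 0.3, 0.2])        # row-class proportions
rho = np.array([0.6, 0.4])            # column-class proportions
alpha = np.array([[0.9, 0.1],         # Bernoulli parameter for each
                  [0.2, 0.7],         # (row class, column class) block
                  [0.5, 0.5]])

# Latent labels, drawn independently as in the latent block model
z = rng.choice(len(pi), size=n_rows, p=pi)     # row labels
w = rng.choice(len(rho), size=n_cols, p=rho)   # column labels

# Observations are independent given the labels
X = rng.binomial(1, alpha[z][:, w])            # n_rows x n_cols binary matrix

# Complete-data log-likelihood (labels known); the *observed* likelihood
# would require summing over all label configurations, which does not
# factorize -- this is exactly what blocks the EM algorithm.
p = alpha[z][:, w]
complete_loglik = (np.sum(X * np.log(p) + (1 - X) * np.log(1 - p))
                   + np.sum(np.log(pi[z])) + np.sum(np.log(rho[w])))
print(complete_loglik)
```
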
2

Mura, Thibault. « Prévention des démences : analyse du déclin cognitif à l’aide d’un modèle longitudinal non linéaire à variable latente ». Thesis, Montpellier 1, 2012. http://www.theses.fr/2012MON1T018/document.

Full text
Abstract:
Ce travail doctoral a pour premier objectif de replacer les démences dans leur contexte de santé publique en estimant des projections de nombre de cas de démences en France et en Europe jusqu'en 2050. La sensibilité de ces projections aux changements d'hypothèses sur les valeurs d'incidence ou de mortalité des sujets déments, sur le scénario démographique utilisé, et sur la mise en place d'une intervention de prévention, a également été évaluée. Dans ce contexte de forte augmentation du nombre de cas à venir, la prévention des démences, qu'elle soit primaire ou secondaire, sera amenée à tenir une place primordiale dans la prise en charge sociétale de ce problème. Pour pouvoir aboutir à des résultats, les recherches en prévention primaire et secondaire ont besoin de s'appuyer sur une méthodologie adaptée et de sélectionner des critères de jugement pertinents. Le déclin cognitif semble être un critère de jugement de choix, mais son utilisation doit éviter un certain nombre d'écueils et de biais. Nous avons dans un premier temps illustré l'analyse de ce critère dans le cadre d'un questionnement de prévention primaire à l'aide d'un modèle non linéaire à variable latente pour données longitudinales. Nous avons pour cela étudié la relation entre consommation chronique de benzodiazépines et déclin cognitif, et montré l'absence d'association sur un large échantillon. Dans un second temps, nous avons utilisé ce type de modèle pour décrire et comparer les propriétés métrologiques d'un large ensemble de tests neuropsychologiques dans une cohorte clinique de sujets atteints de déficit cognitif modéré (MCI), et pour étudier la sensibilité de ces tests aux changements cognitifs liés aux prodromes de la maladie d'Alzheimer. Nos travaux ont ainsi permis de fournir des arguments permettant de sélectionner des tests neuropsychologiques susceptibles d'être utilisés dans le cadre de recherches de prévention secondaire pour identifier et/ou suivre les patients présentant un déficit cognitif modéré (MCI) lié à une maladie d'Alzheimer.
The first aim of this doctoral work is to place dementia in its public health context by estimating the number of dementia cases expected to occur in France and Europe over the next few decades, until 2050. The sensitivity of these projections to hypotheses made on dementia incidence and mortality, the demographic scenario used, and the implementation of a prevention intervention was also assessed. In this context of an increasing number of future cases, the primary and secondary prevention of dementia will take a prominent place in the societal management of this problem. Relevant research in the field of primary and secondary prevention requires an appropriate methodology and the use of relevant outcomes. Cognitive decline seems to be an appropriate outcome, but a number of biases must be avoided. First, we illustrated the use of this criterion in the context of primary prevention, using a nonlinear model with a latent variable for longitudinal data to investigate the association between chronic use of benzodiazepines and cognitive decline. We showed the absence of an association in a large population-based cohort. Secondly, we used this model to describe and compare the metrological properties of a broad range of neuropsychological tests in a clinical cohort of patients with mild cognitive impairment (MCI). We also investigated the sensitivity of these tests to cognitive changes associated with prodromal Alzheimer's disease. Our work provides arguments for selecting neuropsychological tests which can be used in secondary prevention research to identify and/or follow patients with mild cognitive impairment (MCI) due to Alzheimer's disease.
3

Samuth, Benjamin. « Hybrid models combining deep neural representations and non-parametric patch-based methods for photorealistic image generation ». Electronic Thesis or Diss., Normandie, 2024. http://www.theses.fr/2024NORMC249.

Full text
Abstract:
Le domaine de la génération d'images a récemment connu de fortes avancées grâce aux rapides évolutions des modèles neuronaux profonds. Leur succès ayant atteint une portée au-delà de la sphère scientifique, de multiples inquiétudes et questionnements se sont légitimement soulevés quant à leur fonctionnement et notamment l'usage de leurs données d'entraînement. En effet, ces modèles sont si volumineux en paramètres et coûteux en énergie qu'il en devient difficile d'offrir des garanties et des explications concrètes. À l'inverse, des modèles légers et explicables seraient souhaitables pour répondre à ces nouvelles problématiques, mais au coût d'une qualité et flexibilité de génération moindre. Cette thèse explore l'idée de construire des « modèles hybrides », qui combineraient intelligemment les qualités des méthodes légères ou frugales avec les performances des réseaux profonds. Nous étudions d'abord le cas du transfert de style artistique à l'aide d'une méthode contrainte, multi-échelle, et à patchs. Nous déterminons alors qualitativement l'intérêt d'une métrique perceptuelle dans cette opération. Par ailleurs, nous développons deux méthodes hybrides de génération de visages photoréalistes, à l'aide d'un auto-encodeur pré-entraîné. La première s'attaque à la génération de visages avec peu d'échantillons à l'aide de patchs latents, montrant une notable robustesse et des résultats convaincants avec un simple algorithme séquentiel à patchs. La seconde offre une solution à la généralisation de la tâche à une plus grande variété de visages grâce à des modèles de mixtures de gaussiennes. En particulier, nous montrons que ces modèles offrent des performances similaires à d'autres modèles neuronaux, tout en s'affranchissant d'une quantité importante de paramètres et d'étapes de calcul.
Image generation has encountered great progress thanks to the quick evolution of deep neural models. Their reach went beyond the scientific domain, and thus multiple legitimate concerns and questions have been raised, in particular about how the training data are treated. Conversely, lightweight and explainable models would be a fitting answer to these emerging problematics, but their quality and range of applications are limited. This thesis strives to build “hybrid models”. They would efficiently combine the qualities of lightweight or frugal methods with the performance of deep networks. We first study the case of artistic style transfer with a multiscale and constrained patch-based method. We qualitatively find out the potential of perceptual metrics in the process. Besides, we develop two hybrid models for photorealistic face generation, each built around a pretrained auto-encoder. The first model tackles the problem of few-shot face generation with the help of latent patches. Results show a notable robustness and convincing synthesis with a simple patch-based sequential algorithm. The second model uses Gaussian mixture models as a way to generalize the previous method to wider varieties of faces. In particular, we show that these models perform similarly to other neural methods, while removing a non-negligible number of parameters and computing steps at the same time.
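
As a rough illustration of the second hybrid idea (a sketch under assumed shapes, not the thesis's implementation), one can fit a Gaussian mixture to latent patch vectors produced by some pretrained auto-encoder and then sample new latent patches from the mixture; the random `latent_patches` array below merely stands in for real encoder outputs.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Stand-in for latent patches extracted from training faces with a
# pretrained auto-encoder; here it is random data, for illustration only.
rng = np.random.default_rng(0)
latent_patches = rng.normal(size=(5000, 64))

# Fit a Gaussian mixture model in the latent patch space
gmm = GaussianMixture(n_components=10, covariance_type="full", random_state=0)
gmm.fit(latent_patches)

# Sample new latent patches; a decoder (not shown) would map them back to
# image space and assemble the patches into a face.
new_patches, _ = gmm.sample(n_samples=256)
print(new_patches.shape)
```
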
4

Dantan, Etienne. « Modèles conjoints pour données longitudinales et données de survie incomplètes appliqués à l'étude du vieillissement cognitif ». Thesis, Bordeaux 2, 2009. http://www.theses.fr/2009BOR21658/document.

Full text
Abstract:
Dans l'étude du vieillissement cérébral, le suivi des personnes âgées est soumis à une forte sélection avec un risque de décès associé à de faibles performances cognitives. La modélisation de l'histoire naturelle du vieillissement cognitif est complexe du fait de données longitudinales et données de survie incomplètes. Par ailleurs, un déclin accru des performances cognitives est souvent observé avant le diagnostic de démence sénile, mais le début de cette accélération n'est pas facile à identifier. Les profils d'évolution peuvent être variés et associés à des risques différents de survenue d'un événement; cette hétérogénéité des déclins cognitifs de la population des personnes âgées doit être prise en compte. Ce travail a pour objectif d'étudier des modèles conjoints pour données longitudinales et données de survie incomplètes afin de décrire l'évolution cognitive chez les personnes âgées. L'utilisation d'approches à variables latentes a permis de tenir compte de ces phénomènes sous-jacents au vieillissement cognitif que sont l'hétérogénéité et l'accélération du déclin. Au cours d'un premier travail, nous comparons deux approches pour tenir compte des données manquantes dans l'étude d'un processus longitudinal. Dans un second travail, nous proposons un modèle conjoint à état latent pour modéliser simultanément l'évolution cognitive et son accélération pré-démentielle, le risque de démence et le risque de décès
In cognitive ageing studies, older people are highly selected by a risk of death associated with poor cognitive performance. Modeling the natural history of cognitive decline is difficult in the presence of incomplete longitudinal and survival data. Moreover, the unobserved acceleration of cognitive decline that begins before the dementia diagnosis is difficult to evaluate. Cognitive decline is highly heterogeneous, i.e. there are various patterns associated with different risks of survival events. The objective is to study joint models for incomplete longitudinal and survival data to describe cognitive evolution in older people. Latent variable approaches were used to take into account these non-observed mechanisms, namely heterogeneity and decline acceleration. First, we compared two approaches to account for missing data in longitudinal data analysis. Second, we proposed a joint model with a latent state to model cognitive evolution and its pre-dementia acceleration, the risk of dementia and the risk of death.
5

Robert, Valérie. « Classification croisée pour l'analyse de bases de données de grandes dimensions de pharmacovigilance ». Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLS111/document.

Full text
Abstract:
Cette thèse regroupe des contributions méthodologiques à l'analyse statistique des bases de données de pharmacovigilance. Les difficultés de modélisation de ces données résident dans le fait qu'elles produisent des matrices souvent creuses et de grandes dimensions. La première partie des travaux de cette thèse porte sur la classification croisée du tableau de contingence de pharmacovigilance à l’aide du modèle des blocs latents de Poisson normalisé. L'objectif de la classification est d'une part de fournir aux pharmacologues des zones intéressantes plus réduites à explorer de manière plus précise, et d'autre part de constituer une information a priori utilisable lors de l'analyse des données individuelles de pharmacovigilance. Dans ce cadre, nous détaillons une procédure d'estimation partiellement bayésienne des paramètres du modèle et des critères de sélection de modèles afin de choisir le modèle le plus adapté aux données étudiées. Les données étant de grandes dimensions, nous proposons également une procédure pour explorer de manière non exhaustive mais pertinente, l'espace des modèles en coclustering. Enfin, pour mesurer la performance des algorithmes, nous développons un indice de classification croisée calculable en pratique pour un nombre de classes élevé. Les développements de ces outils statistiques ne sont pas spécifiques à la pharmacovigilance et peuvent être utile à toute analyse en classification croisée. La seconde partie des travaux de cette thèse porte sur l'analyse statistique des données individuelles, plus nombreuses mais également plus riches en information. L'objectif est d'établir des classes d'individus selon leur profil médicamenteux et des sous-groupes d'effets et de médicaments possiblement en interaction, palliant ainsi le phénomène de coprescription et de masquage que peuvent présenter les méthodes existantes sur le tableau de contingence. De plus, l'interaction entre plusieurs effets indésirables y est prise en compte. Nous proposons alors le modèle des blocs latents multiple qui fournit une classification croisée simultanée des lignes et des colonnes de deux tableaux de données binaires en leur imposant le même classement en ligne. Nous discutons des hypothèses inhérentes à ce nouveau modèle et nous énonçons des conditions suffisantes de son identifiabilité. Ensuite, nous présentons une procédure d'estimation de ses paramètres et développons des critères de sélection de modèles associés. De plus, un modèle de simulation numérique des données individuelles de pharmacovigilance est proposé et permet de confronter les méthodes entre elles et d'étudier leurs limites. Enfin, la méthodologie proposée pour traiter les données individuelles de pharmacovigilance est explicitée et appliquée à un échantillon de la base française de pharmacovigilance entre 2002 et 2010
This thesis gathers methodological contributions to the statistical analysis of large datasets in pharmacovigilance. Pharmacovigilance datasets produce sparse and large matrices, and these two characteristics are the main statistical challenges for modelling them. The first part of the thesis is dedicated to the coclustering of the pharmacovigilance contingency table thanks to the normalized Poisson latent block model. The objective is, on the one hand, to provide pharmacologists with some interesting and reduced areas to explore more precisely. On the other hand, this coclustering provides useful background information for dealing with the individual database. Within this framework, a parameter estimation procedure for this model is detailed and objective model selection criteria are developed to choose the best-fit model. The datasets are so large that we propose a procedure to explore the model space in coclustering in a non-exhaustive but relevant way. Additionally, to assess the performance of the methods, a convenient coclustering index is developed to compare partitions with high numbers of clusters. The developments of these statistical tools are not specific to pharmacovigilance and can be used for any coclustering issue. The second part of the thesis is devoted to the statistical analysis of the individual data, which are more numerous but also provide even more valuable information. The aim is to produce clusters of individuals according to their drug profiles and subgroups of drugs and adverse effects with possible links, which overcomes the coprescription and masking phenomena, common contingency table issues in pharmacovigilance. Moreover, the interaction between several adverse effects is taken into account. For this purpose, we propose a new model, the multiple latent block model, which enables two binary tables to be coclustered by imposing the same row clustering. Assumptions inherent to the model are discussed and sufficient identifiability conditions for the model are presented. Then a parameter estimation algorithm is studied and objective model selection criteria are developed. Moreover, a numeric simulation model of the individual data is proposed to compare existing methods and study their limits. Finally, the proposed methodology to deal with individual pharmacovigilance data is presented and applied to a sample of the French pharmacovigilance database between 2002 and 2010.
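
The following sketch, with entirely hypothetical parameter values, shows how a drug-by-adverse-event contingency table could be simulated under a Poisson latent block model in which each cell mean combines row and column margins with a block interaction term; it only illustrates the model structure, not the partially Bayesian estimation procedure described above.

```python
import numpy as np

rng = np.random.default_rng(1)

n_drugs, n_events, K, L = 300, 200, 4, 3
mu = rng.gamma(2.0, 1.0, size=n_drugs)      # drug marginal effects (assumed)
nu = rng.gamma(2.0, 1.0, size=n_events)     # adverse-event marginal effects
gamma = rng.gamma(1.0, 1.0, size=(K, L))    # block interaction terms

z = rng.integers(K, size=n_drugs)           # latent drug classes
w = rng.integers(L, size=n_events)          # latent adverse-event classes

# Cell counts: Poisson with mean (row effect) x (column effect) x (block term)
lam = np.outer(mu, nu) * gamma[z][:, w]
counts = rng.poisson(lam)
print(counts.shape, counts.sum())
```
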
6

Georgescu, Vera. « Classification de données multivariées multitypes basée sur des modèles de mélange : application à l'étude d'assemblages d'espèces en écologie ». Phd thesis, Université d'Avignon, 2010. http://tel.archives-ouvertes.fr/tel-00624382.

Full text
Abstract:
En écologie des populations, les distributions spatiales d'espèces sont étudiées afin d'inférer l'existence de processus sous-jacents, tels que les interactions intra- et interspécifiques et les réponses des espèces à l'hétérogénéité de l'environnement. Nous proposons d'analyser les données spatiales multi-spécifiques sous l'angle des assemblages d'espèces, que nous considérons en termes d'abondances absolues et non de diversité des espèces. Les assemblages d'espèces sont une des signatures des interactions spatiales locales des espèces entre elles et avec leur environnement. L'étude des assemblages d'espèces peut permettre de détecter plusieurs types d'équilibres spatialisés et de les associer à l'effet de variables environnementales. Les assemblages d'espèces sont définis ici par classification non spatiale des observations multivariées d'abondances d'espèces. Les méthodes de classification basées sur les modèles de mélange ont été choisies afin d'avoir une mesure de l'incertitude de la classification et de modéliser un assemblage par une loi de probabilité multivariée. Dans ce cadre, nous proposons : 1. une méthode d'analyse exploratoire de données spatiales multivariées d'abondances d'espèces, qui permet de détecter des assemblages d'espèces par classification, de les cartographier et d'analyser leur structure spatiale. Des lois usuelles, telle que la Gaussienne multivariée, sont utilisées pour modéliser les assemblages, 2. un modèle hiérarchique pour les assemblages d'abondances lorsque les lois usuelles ne suffisent pas. Ce modèle peut facilement s'adapter à des données contenant des variables de types différents, qui sont fréquemment rencontrées en écologie, 3. une méthode de classification de données contenant des variables de types différents basée sur des mélanges de lois à structure hiérarchique (définies en 2.). Deux applications en écologie ont guidé et illustré ce travail : l'étude à petite échelle des assemblages de deux espèces de pucerons sur des feuilles de clémentinier et l'étude à large échelle des assemblages d'une plante hôte, le plantain lancéolé, et de son pathogène, l'oïdium, sur les îles Aland en Finlande
7

Arzamendia, Lopez Juan Pablo. « Métholodogie de conception des matériaux architecturés pour le stockage latent dans le domaine du bâtiment ». Thesis, Lyon, INSA, 2013. http://www.theses.fr/2013ISAL0060/document.

Full text
Abstract:
L'utilisation de systèmes de stockage par chaleur latente constitue une solution permettant l'effacement du chauffage d'un bâtiment résidentiel pendant les périodes de forte demande. Une telle stratégie peut avoir pour objectif le lissage des pics d'appel en puissance du réseau électrique. Cependant, la faible conductivité des matériaux à changement de phase (MCP) qui constituent ces systèmes et le besoin d'une puissance de décharge importante imposent l'utilisation de matériaux dits "architecturés" afin d'optimiser la conductivité équivalente des matériaux stockeurs. Nos travaux s'intéressent plus particulièrement à la méthodologie pour la conception de matériaux pour ces systèmes afin de satisfaire aux exigences de stockage d'énergie et de puissance de restitution. La méthodologie proposée dans ces travaux de thèse est dénommé « Top-down methodology ». Cette méthodologie comporte trois échelles : l'échelle bâtiment (top), l'échelle système et l'échelle matériau (down). L'échelle bâtiment a comme objectif de spécifier le cahier des charges. A l'échelle système, des indicateurs de performance sont définis. Enfin, à l'échelle matériau, l'architecture du matériau solution est proposée. Un outil numérique modélisant le système de stockage par chaleur latente de type échangeur de chaleur air/MCP à été développé pour évaluer les indicateurs de performance. Ce modèle numérique est vérifié avec un cas analytique et validé par comparaison avec des données expérimentales. La méthodologie développée est mise en œuvre dans un deuxième cas d'étude pour le même type de système de stockage. L'analyse du système via les nombres adimensionnels permet d'obtenir des indicateurs de performance du système. A l'issue de cette étape, les propriétés matériaux et fonctionnelles optimales du système sont donc connues. Enfin, un matériau architecturé est alors proposé afin de satisfaire les exigences du système de stockage. Nous montrons alors que par l'intermédiaire d'une plaque sandwich contenant des clous et du MCP les propriétés matériaux nécessaires sont obtenues. De plus, afin de satisfaire aux exigences en termes de propriétés fonctionnelles, le design du système est modifié en ajoutant des ailettes sur les surfaces d'échange. Nous montrons que avec 20 ailettes de 3mm d'épaisseur sur la surface d'échange de la planche à clous, le chauffage est effacé pendant 2h lors de la période de forte demande journalière pendant l'hiver
The use of energy storage systems that exploit latent heat represents a promising solution to erase the heating demand of residential buildings during periods of peak demand. Equipping a building with such components can contribute to the goal of peak shaving in terms of public electricity grid supply. Significant drawbacks, however, are the low thermal conductivity of the Phase Change Materials (PCM) that typically constitute such systems, and the requirement for a high rate of discharge. Consequently, the use of so-called architectured materials has been put forward as a means to optimize the effective conductivity of storage materials. Our work is focused upon the development of a methodology to design optimal materials for such systems that meet the criteria of energy storage and energy output. A so-called “top-down methodology” was implemented for the present work. This approach includes three scales of interest: building (top), system and material (down). The aim of the building-scale analysis is to formulate a set of general design requirements. These are complemented by performance indicators, which are defined at the scale of the system. Finally, at the scale of the material, the architecture of the identified material is elaborated. A numerical simulation tool was developed to determine performance indicators for a latent heat energy storage system comprising an air/PCM heat exchanger. This model was tested against a benchmark analytical solution and validated through comparison to experimental data. The developed methodology is applied to the specific case of an air/PCM exchanger latent-heat energy storage system. The system is analysed through the study of dimensionless numbers, which provide a set of design indicators for the system. As a result of this stage, the optimal material and functional properties are thus identified. Finally, an architectured material is proposed that would satisfy the design requirements of the storage system. We demonstrate that an arrangement composed of a sandwich of planar layers with nails and PCM can offer the required material properties. Furthermore, in order to meet the desired functional properties, the system design is modified by the addition of fins at the exchange surfaces. With the addition of 20 fins of 3 mm thickness attached to the exchange surface of the sandwich panel, the storage system eliminated the heating demand for 2 hours during the period of high daily demand in winter.
8

Diallo, Alhassane. « Recherche de sous-groupes évolutifs et leur impact sur la survie : application à des patients atteints d'ataxie spinocérébelleuse (SCA) Body Mass Index Decline Is Related to Spinocerebellar Ataxia Disease Progression Survival in patients with spinocerebellar ataxia types 1, 2, 3, and 6 (EUROSCA) : a longitudinal cohort study ». Thesis, Sorbonne université, 2018. http://www.theses.fr/2018SORUS447.

Full text
Abstract:
Dans les études de cohorte, le plus souvent les modèles utilisés supposent que la population d’étude suit un profil moyen d’évolution. Cependant dans de nombreux cas, comme pour les ataxies spinocérébelleuses (SCA), il n’est pas rare qu’une hétérogénéité soit suspectée. Cette hétérogénéité pourrait aussi être influencée par d’autres évènements intercurrents : évolution conjointe d’un second phénotype ou survenue d’un évènement tel la sortie d’étude ou le décès. Dans la première partie de cette thèse, nous avons analysé l’évolution de l’IMC des patients SCA et identifié des profils d’évolution différente de l’IMC. Nous avons identifié 3 sous-groupes d’évolution de l’IMC: diminution (23% des patients), augmentation (18%) et stable (59%); et que les patients qui baissent leur IMC progressent plus rapidement. Dans la seconde partie, nous avons étudié la survie des patients SCA. Nous avons montré qu’elle est différente selon le génotype. La survie est plus courte chez les SCA1, intermédiaire chez les SCA2 et SCA3, et plus longue chez les SCA6. Enfin, nous avons évalué l’impact à long-terme de la progression de l’ataxie sur la survie. Nous avons montré que la progression de l’ataxie est associée à une survie plus courte quel que soit le génotype. Uniquement chez les patients SCA1, nous avons identifié trois sous-groupes de patients homogènes en termes de progression de la maladie et de risque de décès
In cohort studies, the models used most often assume that the study population follows an average pattern of evolution. However, in many cases, such as for spinocerebellar ataxias (SCA), it is not uncommon for heterogeneity to be suspected. This heterogeneity could also be influenced by other intercurrent events, for example the joint evolution of a second phenotype or the occurrence of an event such as dropout or death. In the first part of this thesis, we analyzed the evolution of the BMI of SCA patients and looked for different BMI evolution profiles. We identified 3 subgroups of BMI evolution: decrease (23% of patients), increase (18%) and stable (59%); and we showed that patients whose BMI decreases have faster disease progression. In the second part, we studied the survival of SCA patients and developed a prognostic nomogram. We showed that survival differs according to the genotype: survival is shorter in SCA1, intermediate in SCA2 and SCA3, and longer in SCA6. Finally, we assessed the long-term impact of ataxia progression on survival. We showed that progression of ataxia is associated with shorter survival regardless of genotype. Only in SCA1 patients did we identify three subgroups of patients homogeneous in terms of disease progression and risk of death.
9

Corneli, Marco. « Dynamic stochastic block models, clustering and segmentation in dynamic graphs ». Thesis, Paris 1, 2017. http://www.theses.fr/2017PA01E012/document.

Full text
Abstract:
Cette thèse porte sur l’analyse de graphes dynamiques, définis en temps discret ou continu. Nous introduisons une nouvelle extension dynamique du modèle a blocs stochastiques (SBM), appelée dSBM, qui utilise des processus de Poisson non homogènes pour modéliser les interactions parmi les paires de nœuds d’un graphe dynamique. Les fonctions d’intensité des processus ne dépendent que des classes des nœuds comme dans SBM. De plus, ces fonctions d’intensité ont des propriétés de régularité sur des intervalles temporels qui sont à estimer, et à l’intérieur desquels les processus de Poisson redeviennent homogènes. Un récent algorithme d’estimation pour SBM, qui repose sur la maximisation d’un critère exact (ICL exacte) est ici adopté pour estimer les paramètres de dSBM et sélectionner simultanément le modèle optimal. Ensuite, un algorithme exact pour la détection de rupture dans les séries temporelles, la méthode «pruned exact linear time» (PELT), est étendu pour faire de la détection de rupture dans des données de graphe dynamique selon le modèle dSBM. Enfin, le modèle dSBM est étendu ultérieurement pour faire de l’analyse de réseau textuel dynamique. Les réseaux sociaux sont un exemple de réseaux textuels: les acteurs s’échangent des documents (posts, tweets, etc.) dont le contenu textuel peut être utilisé pour faire de la classification et détecter la structure temporelle du graphe dynamique. Le modèle que nous introduisons est appelé «dynamic stochastic topic block model» (dSTBM)
This thesis focuses on the statistical analysis of dynamic graphs, defined either in discrete or continuous time. We introduce a new extension of the stochastic block model (SBM) for dynamic graphs. The proposed approach, called dSBM, adopts non-homogeneous Poisson processes to model the interaction times between pairs of nodes in dynamic graphs, either in discrete or continuous time. The intensity functions of the processes only depend on the node clusters, in a block modelling perspective. Moreover, all the intensity functions share some regularity properties on hidden time intervals that need to be estimated. A recent estimation algorithm for SBM, based on the greedy maximization of an exact criterion (exact ICL), is adopted for inference and model selection in dSBM. Moreover, an exact algorithm for change point detection in time series, the "pruned exact linear time" (PELT) method, is extended to deal with dynamic graph data modelled via dSBM. The approach we propose can be used for change point analysis in graph data. Finally, a further extension of dSBM is developed to analyse dynamic networks with textual edges (like social networks, for instance). In this context, the graph edges are associated with documents exchanged between the corresponding vertices. The textual content of the documents can provide additional information about the dynamic graph topological structure. The new model we propose is called the "dynamic stochastic topic block model" (dSTBM). Graphs are mathematical structures well suited to modelling interactions between objects or actors of interest. Several real networks, such as communication networks, financial transaction networks, mobile telephone networks and social networks (Facebook, Linkedin, etc.), can be modelled via graphs. When observing a network, the time variable comes into play in two different ways: we can study the dates at which the interactions occur and/or the interaction time spans. This thesis only focuses on the first time dimension and each interaction is assumed to be instantaneous, for simplicity. Hence, the network evolution is given by the interaction time dates only. In this framework, graphs can be used in two different ways to model networks. Discrete time […] Continuous time […]. In this thesis both these perspectives are adopted, alternatively. We consider new unsupervised methods to cluster the vertices of a graph into groups of homogeneous connection profiles. In this manuscript, the node groups are assumed to be time invariant to avoid possible identifiability issues. Moreover, the approaches that we propose aim to detect structural changes in the way the node clusters interact with each other. The building block of this thesis is the stochastic block model (SBM), a probabilistic approach initially used in social sciences. The standard SBM assumes that the nodes of a graph belong to hidden (disjoint) clusters and that the probability of observing an edge between two nodes only depends on their clusters. Since no further assumption is made on the connection probabilities, SBM is a very flexible model able to detect different network topologies (hubs, stars, communities, etc.).
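
A minimal simulation sketch of the dSBM idea (assumed clusters, change points and rates; not the inference code of the thesis): each ordered pair of nodes interacts according to a Poisson process whose piecewise-constant intensity depends only on the two node clusters and on the current time segment.

```python
import numpy as np

rng = np.random.default_rng(2)

n_nodes, K = 50, 3
breaks = np.array([0.0, 4.0, 7.0, 10.0])    # assumed hidden change points
# Piecewise-constant intensities: one rate per (cluster pair, time segment)
rates = rng.gamma(1.0, 0.3, size=(K, K, len(breaks) - 1))

z = rng.integers(K, size=n_nodes)           # latent node clusters

events = []  # (sender, receiver, time) interaction events
for i in range(n_nodes):
    for j in range(n_nodes):
        if i == j:
            continue
        for s in range(len(breaks) - 1):
            lam = rates[z[i], z[j], s]
            length = breaks[s + 1] - breaks[s]
            # Homogeneous Poisson process within each segment:
            # draw the count, then place the times uniformly.
            n_ev = rng.poisson(lam * length)
            times = breaks[s] + rng.uniform(0.0, length, size=n_ev)
            events.extend((i, j, t) for t in times)

print(len(events))
```
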
10

Boucquemont, Julie. « Modèles statistiques pour l'étude de la progression de la maladie rénale chronique ». Thesis, Bordeaux, 2014. http://www.theses.fr/2014BORD0411/document.

Full text
Abstract:
Cette thèse avait pour but d'illustrer l'intérêt de méthodes statistiques avancées lorsqu'on s'intéresse aux associations entre différents facteurs et la progression de la maladie rénale chronique (MRC). Dans un premier temps, une revue de la littérature a été effectuée afin d'identifier les méthodes classiquement utilisées pour étudier les facteurs de progression de la MRC ; leurs limites et des méthodes permettant de mieux prendre en compte ces limites ont été discutées. Notre second travail s'est concentré sur les analyses de données de survie et la prise en compte de la censure par intervalle, qui survient lorsque l'évènement d'intérêt est la progression vers un stade spécifique de la MRC, et le risque compétitif avec le décès. Une comparaison entre des modèles de survie standards et le modèle illness-death pour données censurées par intervalle nous a permis d'illustrer l'impact de la modélisation choisie sur les estimations à la fois des effets des facteurs de risque et des probabilités d'évènements, à partir des données de la cohorte NephroTest. Les autres travaux ont porté sur les analyses de données longitudinales de la fonction rénale. Nous avons illustré l'intérêt du modèle linéaire mixte dans ce contexte et présenté son extension pour la prise en compte de sous-populations de trajectoires de la fonction rénale différentes. Nous avons ainsi identifié cinq classes, dont une avec un déclin très rapide et une autre avec une amélioration de la fonction rénale au cours du temps. Des perspectives de travaux liés à la prédiction permettent enfin de lier les deux types d'analyses présentées dans la thèse.
The objective of this thesis was to illustrate the benefit of using advanced statistical methods to study associations between risk factors and chronic kidney disease (CKD) progression. First, we conducted a literature review of statistical methods used to investigate risk factors of CKD progression, identified important methodological issues, and discussed solutions. In our second work, we focused on survival analyses and issues with interval censoring, which occurs when the event of interest is the progression to a specific CKD stage, and competing risk with death. A comparison between standard survival models and the illness-death model for interval-censored data allowed us to illustrate the impact of modeling on the estimates of both the effects of risk factors and the probabilities of events, using data from the NephroTest cohort. Other works focused on the analysis of longitudinal data on renal function. We illustrated the interest of the linear mixed model in this context and presented its extension to account for sub-populations with different trajectories of renal function. We identified five classes, including one with a strong decline and one with an improvement of renal function over time. Several perspectives on prediction link the two types of analyses presented in this thesis.
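
As an illustration of the linear mixed model mentioned above, here is a short sketch on synthetic longitudinal data (simulated renal-function measurements, not the NephroTest data) using the statsmodels mixed-model interface with a random intercept and slope per subject; all variable names and values are invented for the example.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic longitudinal data: 100 subjects, 5 visits each
rng = np.random.default_rng(6)
n, n_visits = 100, 5
ids = np.repeat(np.arange(n), n_visits)
time = np.tile(np.arange(n_visits), n)
b0 = rng.normal(0, 5, size=n)               # random intercepts
b1 = rng.normal(0, 1, size=n)               # random slopes
y = 60 + b0[ids] + (-2 + b1[ids]) * time + rng.normal(0, 3, size=n * n_visits)
df = pd.DataFrame({"id": ids, "time": time, "gfr": y})

# Linear mixed model with subject-specific random intercept and slope
model = smf.mixedlm("gfr ~ time", df, groups=df["id"], re_formula="~time")
result = model.fit()
print(result.summary())
```
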
11

Xiong, Hao. « Diversified Latent Variable Models ». Thesis, The University of Sydney, 2018. http://hdl.handle.net/2123/18512.

Full text
Abstract:
The latent variable model is a common probabilistic framework which aims to estimate the hidden states of observations. More specifically, the hidden state can be the position of a robot or the low-dimensional representation of an observation. Various latent variable models have been explored, such as hidden Markov models (HMM), the Gaussian mixture model (GMM), and the Bayesian Gaussian process latent variable model (BGPLVM). Moreover, these latent variable models have been successfully applied to a wide range of fields, such as robotic navigation, image and video compression, and natural language processing. To make the learning of latent variables more efficient and robust, some approaches seek to integrate latent variables with related priors. For instance, a dynamic prior can be incorporated so that the learned latent variables take the time sequence into account. Besides, some methods introduce inducing points as a small set representing the large set of latent variables to enhance the optimization speed of the model. Though these priors are effective in improving the robustness of latent variable models, the learned latent variables tend to be dense rather than diverse. That is to say, there is significant overlap between the generated latent variables. Consequently, the latent variable model will be ambiguous after optimization. Clearly, a proper diversity prior plays a pivotal role in having latent variables capture more diverse features of the observed data. In this thesis, we propose diversified latent variable models incorporating different types of diversity priors, such as single/dual diversity-encouraging priors, a multi-layered DPP prior, and a shared diversity prior. Furthermore, we also illustrate how to formulate the diversity priors in different latent variable models and perform learning and inference on the reformulated latent variable models.
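
To give a concrete flavour of a determinantal point process (DPP) diversity prior, the sketch below computes the log-probability of a subset under an L-ensemble, log det(L_S) - log det(L + I); the kernel is built from random feature vectors and is purely illustrative of how such a prior penalizes overlapping (similar) latent components.

```python
import numpy as np

def dpp_log_prob(L, subset):
    """Log-probability of `subset` under the DPP with L-ensemble kernel L:
    log det(L_S) - log det(L + I)."""
    S = np.asarray(subset)
    _, logdet_sub = np.linalg.slogdet(L[np.ix_(S, S)])
    _, logdet_full = np.linalg.slogdet(L + np.eye(L.shape[0]))
    return logdet_sub - logdet_full

# Toy kernel from random feature vectors: items with similar features are
# strongly correlated, so subsets mixing them receive a lower probability.
rng = np.random.default_rng(3)
feats = rng.normal(size=(8, 4))
L = feats @ feats.T + 1e-6 * np.eye(8)

print(dpp_log_prob(L, [0, 3, 6]))
print(dpp_log_prob(L, [0, 1, 2]))
```
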
12

Podosinnikova, Anastasia. « Sur la méthode des moments pour l'estimation des modèles à variables latentes ». Thesis, Paris Sciences et Lettres (ComUE), 2016. http://www.theses.fr/2016PSLEE050/document.

Full text
Abstract:
Les modèles linéaires latents sont des modèles statistiques puissants pour extraire la structure latente utile à partir de données non structurées par ailleurs. Ces modèles sont utiles dans de nombreuses applications telles que le traitement automatique du langage naturel et la vision artificielle. Pourtant, l'estimation et l'inférence sont souvent impossibles en temps polynomial pour de nombreux modèles linéaires latents et on doit utiliser des méthodes approximatives pour lesquelles il est difficile de récupérer les paramètres. Plusieurs approches, introduites récemment, utilisent la méthode des moments. Elles permettent de retrouver les paramètres dans le cadre idéalisé d'un échantillon de données infini tiré selon certains modèles, et elles viennent souvent avec des garanties théoriques dans les cas où ce n'est pas exactement satisfait. Dans cette thèse, nous nous concentrons sur les méthodes d'estimation fondées sur l'appariement de moments pour différents modèles linéaires latents. En utilisant un lien étroit avec l'analyse en composantes indépendantes, qui est un outil bien étudié par la communauté du traitement du signal, nous présentons plusieurs modèles semi-paramétriques pour la modélisation thématique et dans un contexte multi-vues. Nous présentons des méthodes à base de moments ainsi que des algorithmes pour l'estimation dans ces modèles, et nous prouvons pour ces méthodes des résultats de complexité améliorée par rapport aux méthodes existantes. Nous donnons également des garanties d'identifiabilité, contrairement à d'autres modèles actuels. C'est une propriété importante pour assurer leur interprétabilité.
Latent linear models are powerful probabilistic tools for extracting useful latent structure from otherwise unstructured data and have proved useful in numerous applications such as natural language processing and computer vision. However, estimation and inference are often intractable for many latent linear models and one has to make use of approximate methods, often with no recovery guarantees. An alternative approach, which has become popular lately, is the use of methods based on the method of moments. These methods often have guarantees of exact recovery in the idealized setting of an infinite data sample and well-specified models, but they also often come with theoretical guarantees in cases where this is not exactly satisfied. In this thesis, we focus on moment matching-based estimation methods for different latent linear models. Using a close connection with independent component analysis, which is a well-studied tool from the signal processing literature, we introduce several semiparametric models in the topic modeling context and for multi-view models, and develop moment matching-based methods for estimation in these models. These methods come with improved sample complexity results compared to previously proposed methods. The models are supplemented with identifiability guarantees, which is a necessary property to ensure their interpretability. This is opposed to some other widely used models, which are unidentifiable.
13

Creagh-Osborne, Jane. « Latent variable generalized linear models ». Thesis, University of Plymouth, 1998. http://hdl.handle.net/10026.1/1885.

Full text
Abstract:
Generalized Linear Models (GLMs) (McCullagh and Nelder, 1989) provide a unified framework for fixed effect models where response data arise from exponential family distributions. Much recent research has attempted to extend the framework to include random effects in the linear predictors. Different methodologies have been employed to solve different motivating problems, for example Generalized Linear Mixed Models (Clayton, 1994) and Multilevel Models (Goldstein, 1995). A thorough review and classification of this and related material is presented. In Item Response Theory (IRT) subjects are tested using banks of pre-calibrated test items. A useful model is based on the logistic function with a binary response dependent on the unknown ability of the subject. Item parameters contribute to the probability of a correct response. Within the framework of the GLM, a latent variable, the unknown ability, is introduced as a new component of the linear predictor. This approach affords the opportunity to structure intercept and slope parameters so that item characteristics are represented. A methodology for fitting such GLMs with latent variables, based on the EM algorithm (Dempster, Laird and Rubin, 1977) and using standard Generalized Linear Model fitting software GLIM (Payne, 1987) to perform the expectation step, is developed and applied to a model for binary response data. Accurate numerical integration to evaluate the likelihood functions is a vital part of the computational process. A study of the comparative benefits of two different integration strategies is undertaken and leads to the adoption, unusually, of Gauss-Legendre rules. It is shown how the fitting algorithms are implemented with GLIM programs which incorporate FORTRAN subroutines. Examples from IRT are given. A simulation study is undertaken to investigate the sampling distributions of the estimators and the effect of certain numerical attributes of the computational process. Finally a generalized latent variable model is developed for responses from any exponential family distribution.
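
The following sketch illustrates the kind of computation involved: the marginal likelihood of a binary response pattern under a two-parameter logistic IRT model, integrating a standard-normal latent ability with Gauss-Legendre quadrature. It is a toy example with hypothetical item parameters, not the GLIM/FORTRAN implementation developed in the thesis.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss

def irt_prob(theta, a, b):
    """Two-parameter logistic item response function."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def marginal_likelihood(responses, a, b, n_nodes=40, bound=6.0):
    """Marginal likelihood of one response pattern, integrating the
    standard-normal latent ability over [-bound, bound] with
    Gauss-Legendre quadrature."""
    x, w = leggauss(n_nodes)          # nodes and weights on [-1, 1]
    theta = bound * x                 # rescale nodes to [-bound, bound]
    w = bound * w                     # rescale weights accordingly
    p = irt_prob(theta[:, None], a, b)                    # (nodes, items)
    lik = np.prod(np.where(responses, p, 1 - p), axis=1)  # item likelihoods
    prior = np.exp(-0.5 * theta**2) / np.sqrt(2 * np.pi)  # N(0, 1) density
    return np.sum(w * prior * lik)

# Hypothetical item parameters and one response pattern
a = np.array([1.0, 1.5, 0.8])       # discriminations
b = np.array([-0.5, 0.0, 1.0])      # difficulties
resp = np.array([1, 1, 0], dtype=bool)
print(marginal_likelihood(resp, a, b))
```
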
14

Dallaire, Patrick. « Bayesian nonparametric latent variable models ». Doctoral thesis, Université Laval, 2016. http://hdl.handle.net/20.500.11794/26848.

Full text
Abstract:
L’un des problèmes importants en apprentissage automatique est de déterminer la complexité du modèle à apprendre. Une trop grande complexité mène au surapprentissage, ce qui correspond à trouver des structures qui n’existent pas réellement dans les données, tandis qu’une trop faible complexité mène au sous-apprentissage, c’est-à-dire que l’expressivité du modèle est insuffisante pour capturer l’ensemble des structures présentes dans les données. Pour certains modèles probabilistes, la complexité du modèle se traduit par l’introduction d’une ou plusieurs variables cachées dont le rôle est d’expliquer le processus génératif des données. Il existe diverses approches permettant d’identifier le nombre approprié de variables cachées d’un modèle. Cette thèse s’intéresse aux méthodes Bayésiennes nonparamétriques permettant de déterminer le nombre de variables cachées à utiliser ainsi que leur dimensionnalité. La popularisation des statistiques Bayésiennes nonparamétriques au sein de la communauté de l’apprentissage automatique est assez récente. Leur principal attrait vient du fait qu’elles offrent des modèles hautement flexibles et dont la complexité s’ajuste proportionnellement à la quantité de données disponibles. Au cours des dernières années, la recherche sur les méthodes d’apprentissage Bayésiennes nonparamétriques a porté sur trois aspects principaux : la construction de nouveaux modèles, le développement d’algorithmes d’inférence et les applications. Cette thèse présente nos contributions à ces trois sujets de recherches dans le contexte d’apprentissage de modèles à variables cachées. Dans un premier temps, nous introduisons le Pitman-Yor process mixture of Gaussians, un modèle permettant l’apprentissage de mélanges infinis de Gaussiennes. Nous présentons aussi un algorithme d’inférence permettant de découvrir les composantes cachées du modèle que nous évaluons sur deux applications concrètes de robotique. Nos résultats démontrent que l’approche proposée surpasse en performance et en flexibilité les approches classiques d’apprentissage. Dans un deuxième temps, nous proposons l’extended cascading Indian buffet process, un modèle servant de distribution de probabilité a priori sur l’espace des graphes dirigés acycliques. Dans le contexte de réseaux Bayésien, ce prior permet d’identifier à la fois la présence de variables cachées et la structure du réseau parmi celles-ci. Un algorithme d’inférence Monte Carlo par chaîne de Markov est utilisé pour l’évaluation sur des problèmes d’identification de structures et d’estimation de densités. Dans un dernier temps, nous proposons le Indian chefs process, un modèle plus général que l’extended cascading Indian buffet process servant à l’apprentissage de graphes et d’ordres. L’avantage du nouveau modèle est qu’il admet les connections entres les variables observables et qu’il prend en compte l’ordre des variables. Nous présentons un algorithme d’inférence Monte Carlo par chaîne de Markov avec saut réversible permettant l’apprentissage conjoint de graphes et d’ordres. L’évaluation est faite sur des problèmes d’estimations de densité et de test d’indépendance. Ce modèle est le premier modèle Bayésien nonparamétrique permettant d’apprendre des réseaux Bayésiens disposant d’une structure complètement arbitraire.
One of the important problems in machine learning is determining the complexity of the model to learn. Too much complexity leads to overfitting, which finds structures that do not actually exist in the data, while too low a complexity leads to underfitting, which means that the expressiveness of the model is insufficient to capture all the structures present in the data. For some probabilistic models, the complexity depends on the introduction of one or more latent variables whose role is to explain the generative process of the data. There are various approaches to identify the appropriate number of latent variables of a model. This thesis covers various Bayesian nonparametric methods capable of determining the number of latent variables to be used and their dimensionality. The popularization of Bayesian nonparametric statistics in the machine learning community is fairly recent. Their main attraction is the fact that they offer highly flexible models and their complexity scales appropriately with the amount of available data. In recent years, research on Bayesian nonparametric learning methods has focused on three main aspects: the construction of new models, the development of inference algorithms and new applications. This thesis presents our contributions to these three topics of research in the context of learning latent variable models. Firstly, we introduce the Pitman-Yor process mixture of Gaussians, a model for learning infinite mixtures of Gaussians. We also present an inference algorithm to discover the latent components of the model and we evaluate it on two practical robotics applications. Our results demonstrate that the proposed approach outperforms, both in performance and flexibility, the traditional learning approaches. Secondly, we propose the extended cascading Indian buffet process, a Bayesian nonparametric probability distribution on the space of directed acyclic graphs. In the context of Bayesian networks, this prior is used to identify the presence of latent variables and the network structure among them. A Markov chain Monte Carlo inference algorithm is presented and evaluated on structure identification problems as well as density estimation problems. Lastly, we propose the Indian chefs process, a model more general than the extended cascading Indian buffet process for learning graphs and orders. The advantage of the new model is that it accepts connections among observable variables and it takes into account the order of the variables. We also present a reversible jump Markov chain Monte Carlo inference algorithm which jointly learns graphs and orders. Experiments are conducted on density estimation problems and on testing independence hypotheses. This model is the first Bayesian nonparametric model capable of learning Bayesian networks with completely arbitrary graph structures.
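
As a small illustration of the nonparametric machinery involved (a sketch, not the thesis's inference code), the stick-breaking construction of the Pitman-Yor process can be simulated as follows; the discount and concentration values are arbitrary.

```python
import numpy as np

def pitman_yor_weights(n, d, theta, rng):
    """First `n` stick-breaking weights of a Pitman-Yor process with
    discount d in [0, 1) and concentration theta > -d."""
    # k-th stick proportion: V_k ~ Beta(1 - d, theta + k * d)
    betas = rng.beta(1.0 - d, theta + d * np.arange(1, n + 1))
    # Length of stick remaining before the k-th break
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    return betas * remaining

rng = np.random.default_rng(4)
w = pitman_yor_weights(20, d=0.3, theta=1.0, rng=rng)
print(w)
print(w.sum())   # approaches 1 as the number of sticks grows
```
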
15

Wegelin, Jacob A. « Latent models for cross-covariance / ». Thesis, Connect to this title online ; UW restricted, 2001. http://hdl.handle.net/1773/8982.

Full text
16

Mena-Chavez, Ramses H. « Stationary models using latent structures ». Thesis, University of Bath, 2003. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.425643.

Full text
17

Amoualian, Hesam. « Modélisation et apprentissage de dépendances à l'aide de copules dans les modèles probabilistes latents ». Thesis, Université Grenoble Alpes (ComUE), 2017. http://www.theses.fr/2017GREAM078/document.

Full text
Abstract:
Ce travail de thèse a pour objectif de s'intéresser à une classe de modèles hiérarchiques bayésiens, appelés topic models, servant à modéliser de grands corpus de documents, en particulier dans le cas où ces documents arrivent séquentiellement. Pour cela, nous introduisons au Chapitre 3 trois nouveaux modèles prenant en compte les dépendances entre les thèmes relatifs à chaque document pour deux documents successifs. Le premier modèle s'avère être une généralisation directe du modèle LDA (Latent Dirichlet Allocation). On utilise une loi de Dirichlet pour prendre en compte l'influence sur un document des paramètres relatifs aux thèmes sous-jacents du document précédent. Le deuxième modèle utilise les copules, outil générique servant à modéliser les dépendances entre variables aléatoires. La famille de copules utilisée est la famille des copules archimédiennes, et plus précisément la famille des copules de Frank, qui vérifient de bonnes propriétés (symétrie, associativité) et qui sont donc adaptées à la modélisation de variables échangeables. Enfin, le dernier modèle est une extension non paramétrique du deuxième. On intègre cette fois-ci les copules dans la construction stick-breaking des processus de Dirichlet hiérarchiques (HDP). Nos expériences numériques, réalisées sur cinq collections standard, mettent en évidence les performances de notre approche par rapport aux approches existantes dans la littérature, comme les dynamic topic models, le temporal LDA et les Evolving Hierarchical Processes, et ceci à la fois sur le plan de la perplexité et en termes de performances lorsqu'on cherche à détecter des thèmes similaires dans des flux de documents. Notre approche, comparée aux autres, se révèle être capable de modéliser un plus grand nombre de situations, allant d'une dépendance forte entre les documents à une totale indépendance. Par ailleurs, l'hypothèse d'échangeabilité sous-jacente à tous les topic models du type LDA amène souvent à estimer des thèmes différents pour des mots relevant pourtant du même segment de phrase, ce qui n'est pas cohérent. Dans le Chapitre 4, nous introduisons copulaLDA (copLDA), qui généralise LDA en intégrant la structure du texte dans le modèle et en relâchant l'hypothèse d'indépendance conditionnelle. Pour cela, nous supposons que les groupes de mots dans un texte sont reliés thématiquement entre eux. Nous modélisons cette dépendance avec les copules. Nous montrons de manière empirique l'efficacité du modèle copLDA pour effectuer à la fois des tâches de nature intrinsèque et extrinsèque sur différents corpus accessibles publiquement. Pour compléter le modèle précédent (copLDA), le Chapitre 5 présente un modèle de type LDA qui génère des segments dont les thèmes sont cohérents à l'intérieur de chaque document, en faisant de manière simultanée la segmentation des documents et l'affectation des thèmes à chaque mot. La cohérence entre les différents thèmes internes à chaque groupe de mots est assurée grâce aux copules qui relient les thèmes entre eux. De plus, ce modèle s'appuie tout à la fois sur des distributions spécifiques pour les thèmes reliés à chaque document et à chaque groupe de mots, ce qui permet de capturer les différents degrés de granularité. Nous montrons que le modèle proposé généralise naturellement plusieurs modèles de type LDA qui ont été introduits pour des tâches similaires. Par ailleurs, nos expériences, effectuées sur six bases de données différentes, mettent en évidence les performances de notre modèle, mesurées de différentes manières : à l'aide de la perplexité, de la Pointwise Mutual Information normalisée, qui capture la cohérence entre les thèmes, et de la mesure Micro F1 utilisée en classification de texte.
This thesis focuses on scaling latent topic models to big data collections, especially when documents arrive as streams. Although the main goal of probabilistic modeling is to find word topics, an equally interesting objective is to examine topic evolutions and transitions. To accomplish this task, we propose, in Chapter 3, three new models for modeling topic and word-topic dependencies between consecutive documents in document streams. The first model is a direct extension of the Latent Dirichlet Allocation (LDA) model and makes use of a Dirichlet distribution to balance the influence of the LDA prior parameters with respect to topic and word-topic distributions of the previous document. The second extension makes use of copulas, which constitute a generic tool to model dependencies between random variables. We rely here on Archimedean copulas, and more precisely on the Franck copula, as they are symmetric and associative and are thus appropriate for exchangeable random variables. Lastly, the third model is a non-parametric extension of the second one through the integration of copulas in the stick-breaking construction of Hierarchical Dirichlet Processes (HDP). Our experiments, conducted on five standard collections that have been used in several studies on topic modeling, show that our proposals outperform previous ones, such as dynamic topic models, temporal LDA and the Evolving Hierarchical Processes, both in terms of perplexity and for tracking similar topics in document streams. Compared to previous proposals, our models have extra flexibility and can adapt to situations where there are no dependencies between the documents. On the other hand, the "Exchangeability" assumption in topic models like LDA often results in inferring inconsistent topics for the words of text spans like noun-phrases, which are usually expected to be topically coherent. In Chapter 4, we propose copulaLDA (copLDA), which extends LDA by integrating part of the text structure into the model and relaxes the conditional independence assumption between the word-specific latent topics given the per-document topic distributions. To this end, we assume that the words of text spans like noun-phrases are topically bound and we model this dependence with copulas. We demonstrate empirically the effectiveness of copLDA on both intrinsic and extrinsic evaluation tasks on several publicly available corpora. To complete the previous model (copLDA), Chapter 5 presents an LDA-based model that generates topically coherent segments within documents by jointly segmenting documents and assigning topics to their words. The coherence between topics is ensured through a copula, binding the topics associated to the words of a segment. In addition, this model relies on both document and segment specific topic distributions so as to capture fine-grained differences in topic assignments. We show that the proposed model naturally encompasses other state-of-the-art LDA-based models designed for similar tasks. Furthermore, our experiments, conducted on six different publicly available datasets, show the effectiveness of our model in terms of perplexity, Normalized Pointwise Mutual Information, which captures the coherence between the generated topics, and the Micro F1 measure for text classification.
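As a concrete complement to the Franck (Frank) copula mentioned in both abstracts, the following is a minimal, hypothetical sketch of how dependent uniform variates can be drawn from a bivariate Frank copula by inverting its conditional CDF; the function name, parameter value and sample size are illustrative only and are not taken from the thesis.

    import numpy as np

    def frank_copula_sample(theta, n, rng=None):
        """Draw n pairs (u, v) from a bivariate Frank copula with parameter theta."""
        rng = np.random.default_rng(rng)
        u = rng.uniform(size=n)
        p = rng.uniform(size=n)
        # Invert the conditional CDF C(v | u) = p, which is available in closed form.
        g_v = p * (np.exp(-theta) - 1.0) / (np.exp(-theta * u) - p * (np.exp(-theta * u) - 1.0))
        v = -np.log1p(g_v) / theta
        return u, v

    u, v = frank_copula_sample(theta=5.0, n=10_000, rng=0)
    # A large positive theta induces strong positive dependence between u and v,
    # which could then drive dependent topic proportions for successive documents.
    print("empirical correlation:", np.corrcoef(u, v)[0, 1].round(2))

Larger values of theta move the pairs towards perfect positive dependence, while theta close to zero recovers independence, which is the flexibility highlighted above.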
Styles APA, Harvard, Vancouver, ISO, etc.
18

Dupuy, Christophe. « Inference and applications for topic models ». Thesis, Paris Sciences et Lettres (ComUE), 2017. http://www.theses.fr/2017PSLEE055/document.

Texte intégral
Résumé :
La plupart des systèmes de recommandation actuels se basent sur des évaluations sous forme de notes (i.e., un chiffre entre 0 et 5) pour conseiller un contenu (film, restaurant...) à un utilisateur. Ce dernier a souvent la possibilité de commenter ce contenu sous forme de texte en plus de l'évaluer. Il est difficile d'extraire de l'information d'un texte brut tandis qu'une simple note contient peu d'information sur le contenu et l'utilisateur. Dans cette thèse, nous tentons de suggérer à l'utilisateur un texte lisible personnalisé pour l'aider à se faire rapidement une opinion à propos d'un contenu. Plus spécifiquement, nous construisons d'abord un modèle thématique prédisant une description de film personnalisée à partir de commentaires textuels. Notre modèle sépare les thèmes qualitatifs (i.e., véhiculant une opinion) des thèmes descriptifs en combinant des commentaires textuels et des notes sous forme de nombres dans un modèle probabiliste joint. Nous évaluons notre modèle sur une base de données IMDB et illustrons ses performances à travers la comparaison de thèmes. Nous étudions ensuite l'inférence de paramètres dans des modèles à variables latentes à grande échelle, incluant la plupart des modèles thématiques. Nous proposons un traitement unifié de l'inférence en ligne pour les modèles à variables latentes à partir de familles exponentielles non-canoniques et faisons explicitement apparaître les liens existants entre plusieurs méthodes fréquentistes et bayésiennes proposées auparavant. Nous proposons aussi une nouvelle méthode d'inférence pour l'estimation fréquentiste des paramètres qui adapte les méthodes MCMC à l'inférence en ligne des modèles à variables latentes en utilisant proprement un échantillonnage de Gibbs local. Pour le modèle thématique d'allocation de Dirichlet latente, nous fournissons une vaste série d'expériences et de comparaisons avec des travaux existants dans laquelle notre nouvelle approche est plus performante que les méthodes proposées auparavant. Enfin, nous proposons une nouvelle classe de processus ponctuels déterminantaux (PPD) qui peut être manipulée pour l'inférence et l'apprentissage de paramètres en un temps potentiellement sous-linéaire en le nombre d'objets. Cette classe, basée sur une factorisation spécifique de faible rang du noyau marginal, est particulièrement adaptée à une sous-classe de PPD continus et de PPD définis sur un nombre exponentiel d'objets. Nous appliquons cette classe à la modélisation de documents textuels comme échantillons d'un PPD sur les phrases et proposons une formulation du maximum de vraisemblance conditionnel pour modéliser les proportions de thèmes, ce qui est rendu possible sans aucune approximation avec notre classe de PPD. Nous présentons une application à la synthèse de documents avec un PPD sur 2 à la puissance 500 objets, où les résumés sont composés de phrases lisibles.
Most current recommendation systems are based on ratings (i.e. numbers between 0 and 5) and try to suggest a content (movie, restaurant...) to a user. These systems usually allow users to provide a text review for this content in addition to ratings. It is hard to extract useful information from raw text while a rating does not contain much information on the content and the user. In this thesis, we tackle the problem of suggesting personalized readable text to users to help them make a quick decision about a content. More specifically, we first build a topic model that predicts personalized movie descriptions from text reviews. Our model extracts distinct qualitative (i.e., which convey opinion) and descriptive topics by combining text reviews and movie ratings in a joint probabilistic model. We evaluate our model on an IMDB dataset and illustrate its performance through comparison of topics. We then study parameter inference in large-scale latent variable models, which include most topic models. We propose a unified treatment of online inference for latent variable models from a non-canonical exponential family, and draw explicit links between several previously proposed frequentist or Bayesian methods. We also propose a novel inference method for the frequentist estimation of parameters, that adapts MCMC methods to online inference of latent variable models with the proper use of local Gibbs sampling. For the specific latent Dirichlet allocation topic model, we provide an extensive set of experiments and comparisons with existing work, where our new approach outperforms all previously proposed methods. Finally, we propose a new class of determinantal point processes (DPPs) which can be manipulated for inference and parameter learning in potentially sublinear time in the number of items. This class, based on a specific low-rank factorization of the marginal kernel, is particularly suited to a subclass of continuous DPPs and DPPs defined on exponentially many items. We apply this new class to modelling text documents as samples from a DPP over sentences, and propose a conditional maximum likelihood formulation to model topic proportions, which is made possible with no approximation for our class of DPPs. We present an application to document summarization with a DPP on 2 to the power 500 items, where the summaries are composed of readable sentences.
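Since both abstracts refer to Gibbs sampling for the latent Dirichlet allocation model, the following is a small reference sketch of the standard batch collapsed Gibbs sampler for LDA; it is illustrative only (toy corpus, made-up hyperparameters) and is not the online local-Gibbs scheme proposed in the thesis.

    import numpy as np

    def lda_collapsed_gibbs(docs, V, K, alpha=0.1, beta=0.01, n_iter=200, seed=0):
        """Collapsed Gibbs sampling for LDA; docs is a list of lists of word ids."""
        rng = np.random.default_rng(seed)
        ndk = np.zeros((len(docs), K))      # document-topic counts
        nkw = np.zeros((K, V))              # topic-word counts
        nk = np.zeros(K)                    # topic totals
        z = [rng.integers(K, size=len(d)) for d in docs]
        for d, doc in enumerate(docs):      # initialise the count matrices
            for i, w in enumerate(doc):
                k = z[d][i]
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
        for _ in range(n_iter):
            for d, doc in enumerate(docs):
                for i, w in enumerate(doc):
                    k = z[d][i]             # remove the current assignment
                    ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                    p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                    k = rng.choice(K, p=p / p.sum())
                    z[d][i] = k             # resample and restore the counts
                    ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
        return (nkw + beta) / (nkw.sum(axis=1, keepdims=True) + V * beta)

    # Toy usage: four "documents" over a vocabulary of six word ids, two topics.
    docs = [[0, 1, 0, 2], [1, 0, 0, 1], [3, 4, 5, 4], [4, 5, 3, 3]]
    print(lda_collapsed_gibbs(docs, V=6, K=2).round(2))

The online variant studied in the thesis, by contrast, processes documents sequentially and applies such local resampling steps only to the current document.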
Styles APA, Harvard, Vancouver, ISO, etc.
19

Sagara, Issaka. « Méthodes d'analyse statistique pour données répétées dans les essais cliniques : intérêts et applications au paludisme ». Thesis, Aix-Marseille, 2014. http://www.theses.fr/2014AIXM5081/document.

Texte intégral
Résumé :
De nombreuses études cliniques ou interventions de lutte ont été menées ou sont en cours en Afrique pour lutter contre le fléau du paludisme. En zone d'endémie, le paludisme est une maladie récurrente. La revue de littérature indique une application limitée des outils statistiques appropriés existants pour l'analyse des données récurrentes de paludisme. Nous avons mis en oeuvre des méthodes statistiques appropriées pour l'analyse des données répétées d'essais thérapeutiques de paludisme. Nous avons également étudié les mesures répétées d'hémoglobine lors du suivi de traitements antipaludiques, en vue d'évaluer la tolérance ou la sécurité des médicaments, en regroupant les données de 13 essais cliniques. Pour l'analyse du nombre d'épisodes de paludisme, la régression binomiale négative a été mise en oeuvre. Pour modéliser la récurrence des épisodes de paludisme, quatre modèles ont été utilisés : i) les équations d'estimation généralisées (GEE) utilisant la distribution de Poisson ; et trois modèles qui sont une extension du modèle de Cox : ii) le modèle de processus de comptage d'Andersen-Gill (AG-CP), iii) le modèle de processus de comptage de Prentice-Williams-Peterson (PWP-CP) ; et iv) le modèle de fragilité partagée à distribution gamma. Pour l'analyse de sécurité, c'est-à-dire l'évaluation de l'impact de traitements antipaludiques sur le taux d'hémoglobine ou la survenue de l'anémie, les modèles linéaires et latents généralisés mixtes (« GLLAMM : generalized linear and latent mixed models ») ont été mis en oeuvre. Les perspectives sont l'élaboration de guides de bonnes pratiques de préparation et d'analyse ainsi que la création d'un entrepôt des données de paludisme.
Numerous clinical studies or control interventions have been conducted or are ongoing in Africa for malaria control. For an efficient control of this disease, the strategies should be closer to the reality of the field and the data should be analyzed appropriately. In endemic areas, malaria is a recurrent disease, and repeated malaria episodes are common in Africa. However, the literature review indicates a limited application of appropriate statistical tools for the analysis of recurrent malaria data. We implemented appropriate statistical methods for the analysis of these data. We have also studied the repeated measurements of hemoglobin during the follow-up of malaria treatments in order to assess the safety of the study drugs, by pooling data from 13 clinical trials. For the analysis of the number of malaria episodes, negative binomial regression has been implemented. To model the recurrence of malaria episodes, four models were used: i) the generalized estimating equations (GEE) using the Poisson distribution; and three models that are extensions of the Cox model: ii) the Andersen-Gill counting process (AG-CP), iii) the Prentice-Williams-Peterson counting process (PWP-CP); and iv) the shared gamma frailty model. For the safety analysis, i.e. the assessment of the impact of malaria treatment on hemoglobin levels or the onset of anemia, generalized linear and latent mixed models (GLLAMM) have been implemented. We have shown how to properly apply the existing statistical tools in the analysis of these data. The prospects of this work lie in the development of good-practice guides for the preparation and analysis of such data, as well as the creation of a data warehouse for malaria data.
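To make the count-data analyses named above more concrete, here is a hedged sketch, on simulated data, of a negative binomial regression and a Poisson GEE fitted with statsmodels; the variable names, simulated effects and parameter values are purely illustrative and do not come from the pooled trials.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 400
    df = pd.DataFrame({
        "child_id": np.arange(n) % 100,            # repeated observations per child
        "treatment": rng.integers(0, 2, n),
        "age": rng.uniform(1, 10, n),
    })
    df["episodes"] = rng.poisson(np.exp(0.5 - 0.3 * df["treatment"] + 0.05 * df["age"]))

    # Negative binomial regression for the number of malaria episodes.
    nb = smf.glm("episodes ~ treatment + age", data=df,
                 family=sm.families.NegativeBinomial()).fit()
    print(nb.params)

    # GEE with a Poisson family and an exchangeable working correlation,
    # accounting for episodes clustered within the same child.
    gee = smf.gee("episodes ~ treatment + age", groups="child_id", data=df,
                  family=sm.families.Poisson(),
                  cov_struct=sm.cov_struct.Exchangeable()).fit()
    print(gee.params)

The Andersen-Gill, PWP and shared frailty models mentioned above are survival-type models for recurrent event times and would typically be fitted with dedicated survival tools rather than with this sketch.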
Styles APA, Harvard, Vancouver, ISO, etc.
20

Christmas, Jacqueline. « Robust spatio-temporal latent variable models ». Thesis, University of Exeter, 2011. http://hdl.handle.net/10036/3051.

Texte intégral
Résumé :
Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) are widely-used mathematical models for decomposing multivariate data. They capture spatial relationships between variables, but ignore any temporal relationships that might exist between observations. Probabilistic PCA (PPCA) and Probabilistic CCA (ProbCCA) are versions of these two models that explain the statistical properties of the observed variables as linear mixtures of an alternative, hypothetical set of hidden, or latent, variables and explicitly model noise. Both the noise and the latent variables are assumed to be Gaussian distributed. This thesis introduces two new models, named PPCA-AR and ProbCCA-AR, that augment PPCA and ProbCCA respectively with autoregressive processes over the latent variables to additionally capture temporal relationships between the observations. To make PPCA-AR and ProbCCA-AR robust to outliers and able to model leptokurtic data, the Gaussian assumptions are replaced with infinite scale mixtures of Gaussians, using the Student-t distribution. Bayesian inference calculates posterior probability distributions for each of the parameter variables, from which we obtain a measure of confidence in the inference. It avoids the pitfalls associated with the maximum likelihood method: integrating over all possible values of the parameter variables guards against overfitting. For these new models the integrals required for exact Bayesian inference are intractable; instead a method of approximation, the variational Bayesian approach, is used. This enables the use of automatic relevance determination to estimate the model orders. PPCA-AR and ProbCCA-AR can be viewed as linear dynamical systems, so the forward-backward algorithm, also known as the Baum-Welch algorithm, is used as an efficient method for inferring the posterior distributions of the latent variables. The exact algorithm is tractable because Gaussian assumptions are made regarding the distribution of the latent variables. This thesis introduces a variational Bayesian forward-backward algorithm based on Student-t assumptions. The new models are demonstrated on synthetic datasets and on real remote sensing and EEG data.
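As a small illustration of the generative structure described above, the following hypothetical sketch simulates a PPCA model whose latent variables follow an AR(1) process; it mimics the assumed model only (a simulation, not the author's variational inference code), and all dimensions and noise scales are invented.

    import numpy as np

    rng = np.random.default_rng(1)
    T, D, Q = 300, 10, 2           # time steps, observed dimension, latent dimension
    phi = 0.9                      # AR(1) coefficient over the latent variables
    W = rng.normal(size=(D, Q))    # loading matrix mixing latents into observations
    mu = rng.normal(size=D)        # observation mean
    sigma_x, sigma_y = 1.0, 0.5    # latent innovation and observation noise scales

    x = np.zeros((T, Q))
    for t in range(1, T):
        x[t] = phi * x[t - 1] + sigma_x * rng.normal(size=Q)   # temporal latent dynamics
    y = x @ W.T + mu + sigma_y * rng.normal(size=(T, D))       # linear Gaussian observations

    # A robust variant in the spirit of the thesis would replace the Gaussian draws
    # by Student-t draws, e.g. rng.standard_t(df=3, size=...), to model leptokurtic data.
    print(y.shape)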
Styles APA, Harvard, Vancouver, ISO, etc.
21

Chen, George H. « Latent source models for nonparametric inference ». Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/99774.

Texte intégral
Résumé :
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 95-101).
Nearest-neighbor inference methods have been widely and successfully used in numerous applications such as forecasting which news topics will go viral, recommending products to people in online stores, and delineating objects in images by looking at image patches. However, there is little theoretical understanding of when, why, and how well these nonparametric inference methods work in terms of key problem-specific quantities relevant to practitioners. This thesis bridges the gap between theory and practice for these methods in the three specific case studies of time series classification, online collaborative filtering, and patch-based image segmentation. To do so, for each of these problems, we prescribe a probabilistic model in which the data appear generated from unknown "latent sources" that capture salient structure in the problem. These latent source models naturally lead to nearest-neighbor or nearest-neighbor-like inference methods similar to ones already used in practice. We derive theoretical performance guarantees for these methods, relating inference quality to the amount of training data available and problem-specific structure modeled by the latent sources.
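The latent source viewpoint sketched above can be illustrated with a toy nearest-neighbor time series classifier: noisy copies of a few unknown "latent sources" are labelled, and a test series is classified by its nearest labelled neighbor. The sources, noise level and sample sizes below are invented for illustration.

    import numpy as np

    rng = np.random.default_rng(2)
    t = np.linspace(0, 1, 50)
    sources = {0: np.sin(2 * np.pi * t),               # latent source for class 0
               1: np.sign(np.sin(2 * np.pi * t))}      # latent source for class 1

    def noisy(label, sigma=0.3):
        """Observed series = latent source of its class plus Gaussian noise."""
        return sources[label] + sigma * rng.normal(size=t.size)

    X_train = np.array([noisy(l) for l in [0, 1] * 20])
    y_train = np.array([0, 1] * 20)
    x_test, y_test = noisy(1), 1

    # 1-nearest-neighbor classification under Euclidean distance.
    pred = y_train[np.argmin(((X_train - x_test) ** 2).sum(axis=1))]
    print("predicted:", pred, "true:", y_test)

In the latent source model, performance guarantees for this kind of rule are expressed in terms of quantities such as the number of training examples per latent source and the noise level.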
by George H. Chen.
Ph. D.
Styles APA, Harvard, Vancouver, ISO, etc.
22

Wanigasekara, Prashan. « Latent state space models for prediction ». Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/106269.

Texte intégral
Résumé :
Thesis: S.M. in Engineering and Management, Massachusetts Institute of Technology, School of Engineering, System Design and Management Program, Engineering and Management Program, 2016.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 95-98).
In this thesis, I explore a novel algorithm to model the joint behavior of multiple correlated signals. Our chosen example is the ECG (Electrocardiogram) and ABP (Arterial Blood Pressure) signals from patients in the ICU (Intensive Care Unit). I then use the generated models to predict blood pressure levels of ICU patients based on their historical ECG and ABP signals. The algorithm used is a variant of a Hidden Markov model. The new extension is termed the Latent State Space Copula Model. In this model, the ECG and ABP signals are considered to be correlated and are modeled using a bivariate Gaussian copula with Weibull marginals generated by a hidden state. We assume that there are hidden patient "states" that transition from one hidden state to another, driving a joint ECG-ABP behavior. We estimate the parameters of the model using a novel Gibbs sampling approach. Using this model, we generate predictors that are the state probabilities at any given time step and use them to predict a patient's future health condition. The predictions made by the model are binary and indicate whether the mean arterial pressure (MAP) is going to be above or below a certain threshold at a future time step. Towards the end of the thesis, I compare the new Latent State Space Copula Model with a state-of-the-art classical discrete HMM. The Latent State Space Copula Model achieves an Area Under the ROC curve (AUROC) of 0.7917 for 5 states while the classical discrete HMM achieves an AUROC of 0.7609 for 5 states.
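As a concrete illustration of the observation model named above, the following hypothetical sketch draws correlated (ECG, ABP) features from a bivariate Gaussian copula with Weibull marginals, as one hidden state of such a model might emit; the correlation and Weibull parameters are made up for illustration.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    rho = 0.7                                   # copula correlation for this hidden state
    cov = np.array([[1.0, rho], [rho, 1.0]])

    z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=5000)
    u = stats.norm.cdf(z)                       # correlated uniforms (the Gaussian copula)
    ecg = stats.weibull_min.ppf(u[:, 0], c=1.5, scale=1.0)    # Weibull marginal for signal 1
    abp = stats.weibull_min.ppf(u[:, 1], c=2.0, scale=80.0)   # Weibull marginal for signal 2

    print("rank correlation of the two signals:",
          round(stats.spearmanr(ecg, abp).correlation, 2))

In the full model, a hidden Markov chain over patient states would switch between several such emission distributions, and the filtered state probabilities would serve as predictors of future MAP threshold crossings.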
by Prashan Wanigasekara.
S.M. in Engineering and Management
Styles APA, Harvard, Vancouver, ISO, etc.
23

Paquet, Ulrich. « Bayesian inference for latent variable models ». Thesis, University of Cambridge, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.613111.

Texte intégral
Styles APA, Harvard, Vancouver, ISO, etc.
24

O'Sullivan, Aidan Michael. « Bayesian latent variable models with applications ». Thesis, Imperial College London, 2013. http://hdl.handle.net/10044/1/19191.

Texte intégral
Résumé :
The massive increases in computational power that have occurred over the last two decades have contributed to the increasing prevalence of Bayesian reasoning in statistics. The often intractable integrals required as part of the Bayesian approach to inference can be approximated or estimated using intensive sampling or optimisation routines. This has extended the realm of applications beyond simple models for which fully analytic solutions are possible. Latent variable models are ideally suited to this approach as it provides a principled method for resolving one of the more difficult issues associated with this class of models, the question of the appropriate number of latent variables. This thesis explores the use of latent variable models in a number of different settings employing Bayesian methods for inference. The first strand of this research focusses on the use of a latent variable model to perform simultaneous clustering and latent structure analysis of multivariate data. In this setting the latent variables are of key interest providing information on the number of sub-populations within a heterogeneous data set and also the differences in latent structure that define them. In the second strand latent variable models are used as a tool to study relational or network data. The analysis of this type of data, which describes the interconnections between different entities or nodes, is complicated due to the dependencies between nodes induced by these connections. The conditional independence assumptions of the latent variable framework provide a means of taking these dependencies into account, the nodes are independent conditioned on an associated latent variable. This allows us to perform model based clustering of a network making inference on the number of clusters. Finally the latent variable representation of the network, which captures the structure of the network in a different form, can be studied as part of a latent variable framework for detecting differences between networks. Approximation schemes are required as part of the Bayesian approach to model estimation. The two methods that are considered in this thesis are stochastic Markov chain Monte Carlo methods and deterministic variational approximations. Where possible these are extended to incorporate model selection over the number of latent variables and a comparison, the first of its kind in this setting, of their relative performance in unsupervised model selection for a range of different settings is presented. The findings of the study help to ascertain in which settings one method may be preferred to the other.
Styles APA, Harvard, Vancouver, ISO, etc.
25

Zhang, Cheng. « Structured Representation Using Latent Variable Models ». Doctoral thesis, KTH, Datorseende och robotik, CVAP, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-191455.

Texte intégral
Résumé :
Over the past two centuries the industrial revolution automated a great part of work that involved human muscles. Recently, since the beginning of the 21st century, the focus has shifted towards automating work that involves our brain to further improve our lives. This is accomplished by establishing human-level intelligence through machines, which led to the growth of the field of artificial intelligence. Machine learning is a core component of artificial intelligence. While artificial intelligence focuses on constructing an entire intelligence system, machine learning focuses on the learning ability and the ability to further use the learned knowledge for different tasks. This thesis targets the field of machine learning, especially structured representation learning, which is key for various machine learning approaches. Humans sense the environment, extract information and make action decisions based on abstracted information. Similarly, machines receive data, abstract information from data through models and make decisions about the unknown through inference. Thus, models provide a mechanism for machines to abstract information. This commonly involves learning useful representations which are desirably compact, interpretable and useful for different tasks. In this thesis, the contribution relates to the design of efficient representation models with latent variables. To make the models useful, efficient inference algorithms are derived to fit the models to data. We apply our models to various applications from different domains, namely E-health, robotics, text mining, computer vision and recommendation systems. The main contribution of this thesis relates to advancing latent variable models and deriving associated inference schemes for representation learning. This is pursued in three different directions. Firstly, through supervised models, where better representations can be learned knowing the tasks, corresponding to situated knowledge of humans. Secondly, through structured representation models, with which different structures, such as factorized ones, are used for latent variable models to form more efficient representations. Finally, through non-parametric models, where the representation is determined completely by the data. Specifically, we propose several new models combining supervised learning and factorized representation as well as a further model combining non-parametric modeling and supervised approaches. Evaluations show that these new models provide generally more efficient representations and a higher degree of interpretability. Moreover, this thesis contributes by applying these proposed models in different practical scenarios, demonstrating that these models can provide efficient latent representations. Experimental results show that our models improve the performance for classical tasks, such as image classification and annotation, robotic scene and action understanding. Most notably, one of our models is applied to a novel problem in E-health, namely diagnostic prediction using discomfort drawings. Experimental investigations show that our model can achieve significant results in automatic diagnosis and provides a profound understanding of typical symptoms. This motivates novel decision support systems for healthcare personnel.

Styles APA, Harvard, Vancouver, ISO, etc.
26

Surian, Didi. « Novel Applications Using Latent Variable Models ». Thesis, The University of Sydney, 2015. http://hdl.handle.net/2123/14014.

Texte intégral
Résumé :
Latent variable models have achieved a great success in many research communities, including machine learning, information retrieval, data mining, natural language processing, etc. Latent variable models use an assumption that the data, which is observable, has an affinity to some hidden/latent variables. In this thesis, we present a suite of novel applications using latent variable models. In particular, we (i) extend topic models using directional distributions, (ii) propose novel solutions using latent variable models to detect outliers (anomalies), and (iii) answer the cross-modal retrieval problem. We present a study of directional distributions in modeling data. Specifically, we implement the von Mises-Fisher (vMF) distribution and develop latent variable models which are based on directed graphical models. Directed graphical models are commonly used to represent the conditional dependency among the variables. Under Bayesian treatment, we propose approximate posterior inference algorithms using variational methods for the models. We show that incorporating the vMF distribution improves the quality of clustering compared with word count-based topic models. Furthermore, with the properties of directional distributions in hand, we extend the applications to detect outliers in various data sets and settings. Finally, we present latent variable models that are based on supervised learning to answer the cross-modal retrieval problem. In the cross-modal retrieval problem, the objective is to find matching content across different modalities such as text and image. We explore various approaches such as using one-class learning methods, generating negative instances and using ranking methods. We show that our models outperform generic approaches such as Canonical Correlation Analysis (CCA) and its variants.
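As a small illustration of the directional building block mentioned above, the following hedged sketch draws unit vectors from a von Mises-Fisher distribution; it requires SciPy 1.11 or later, where scipy.stats.vonmises_fisher was introduced, and the mean direction and concentration are illustrative values only.

    import numpy as np
    from scipy.stats import vonmises_fisher

    mu = np.array([0.0, 0.0, 1.0])     # mean direction on the unit sphere
    kappa = 50.0                       # concentration: larger values give tighter clusters
    samples = vonmises_fisher(mu, kappa).rvs(1000, random_state=0)

    resultant = samples.mean(axis=0)
    print("estimated mean direction:", (resultant / np.linalg.norm(resultant)).round(2))

In a vMF-based topic or clustering model, each cluster would carry its own mean direction and concentration, and normalised document vectors would be assigned to clusters through the latent variable posterior.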
Styles APA, Harvard, Vancouver, ISO, etc.
27

Parsons, S. « Approximation methods for latent variable models ». Thesis, University College London (University of London), 2016. http://discovery.ucl.ac.uk/1513250/.

Texte intégral
Résumé :
Modern statistical models are often intractable, and approximation methods can be required to perform inference on them. Many different methods can be employed in most contexts, but not all are fully understood. The current thesis is an investigation into the use of various approximation methods for performing inference on latent variable models. Composite likelihoods are used as surrogates for the likelihood function of state space models (SSM). In chapter 3, variational approximations to their evaluation are investigated, and the interaction of biases as composite structure changes is observed. The bias effect of increasing the block size in composite likelihoods is found to balance the statistical benefit of including more data in each component. Predictions and smoothing estimates are made using approximate Expectation-Maximisation (EM) techniques. Variational EM estimators are found to produce predictions and smoothing estimates of a lesser quality than stochastic EM estimators, but at a massively reduced computational cost. Surrogate latent marginals are introduced in chapter 4 into a non-stationary SSM with i.i.d. replicates. They are cheap to compute, and break functional dependencies on parameters for previous time points, giving estimation algorithms linear computational complexity. Gaussian variational approximations are integrated with the surrogate marginals to produce an approximate EM algorithm. Using these Gaussians as proposal distributions in importance sampling is found to offer a positive trade-off in terms of the accuracy of predictions and smoothing estimates made using estimators. A cheap-to-compute, model-based hierarchical clustering algorithm is proposed in chapter 5. A cluster dissimilarity measure based on method of moments estimators is used to avoid likelihood function evaluation. Computation time for hierarchical clustering sequences is further reduced with the introduction of short-lists that are linear in the number of clusters at each iteration. The resulting clustering sequences are found to have plausible characteristics in both real and synthetic datasets.
Styles APA, Harvard, Vancouver, ISO, etc.
28

Oldmeadow, Christopher. « Latent variable models in statistical genetics ». Thesis, Queensland University of Technology, 2009. https://eprints.qut.edu.au/31995/1/Christopher_Oldmeadow_Thesis.pdf.

Texte intégral
Résumé :
Understanding the complexities that are involved in the genetics of multifactorial diseases is still a monumental task. In addition to environmental factors that can influence the risk of disease, there is also a number of other complicating factors. Genetic variants associated with age of disease onset may be different from those variants associated with overall risk of disease, and variants may be located in positions that are not consistent with the traditional protein coding genetic paradigm. Latent Variable Models are well suited for the analysis of genetic data. A latent variable is one that we do not directly observe, but which is believed to exist or is included for computational or analytic convenience in a model. This thesis presents a mixture of methodological developments utilising latent variables, and results from case studies in genetic epidemiology and comparative genomics. Epidemiological studies have identified a number of environmental risk factors for appendicitis, but the disease aetiology of this oft thought useless vestige remains largely a mystery. The effects of smoking on other gastrointestinal disorders are well documented, and in light of this, the thesis investigates the association between smoking and appendicitis through the use of latent variables. By utilising data from a large Australian twin study questionnaire as both cohort and case-control, evidence is found for the association between tobacco smoking and appendicitis. Twin and family studies have also found evidence for the role of heredity in the risk of appendicitis. Results from previous studies are extended here to estimate the heritability of age-at-onset and account for the effect of smoking. This thesis presents a novel approach for performing a genome-wide variance components linkage analysis on transformed residuals from a Cox regression. This method finds evidence for a different subset of genes responsible for variation in age at onset than those associated with overall risk of appendicitis. Motivated by increasing evidence of functional activity in regions of the genome once thought of as evolutionary graveyards, this thesis develops a generalisation to the Bayesian multiple changepoint model on aligned DNA sequences for more than two species. This sensitive technique is applied to evaluating the distributions of evolutionary rates, with the finding that they are much more complex than previously apparent. We show strong evidence for at least 9 well-resolved evolutionary rate classes in an alignment of four Drosophila species and at least 7 classes in an alignment of four mammals, including human. A pattern of enrichment and depletion of genic regions in the profiled segments suggests they are functionally significant, and most likely consist of various functional classes. Furthermore, a method of incorporating alignment characteristics representative of function such as GC content and type of mutation into the segmentation model is developed within this thesis. Evidence of fine-structured segmental variation is presented.
Styles APA, Harvard, Vancouver, ISO, etc.
29

Laclau, Charlotte. « Hard and fuzzy block clustering algorithms for high dimensional data ». Thesis, Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB014.

Texte intégral
Résumé :
Notre capacité grandissante à collecter et stocker des données a fait de l’apprentissage non supervisé un outil indispensable qui permet la découverte de structures et de modèles sous-jacents aux données, sans avoir à étiqueter les individus manuellement. Parmi les différentes approches proposées pour aborder ce type de problème, le clustering est très certainement le plus répandu. Le clustering suppose que chaque groupe, également appelé cluster, est distribué autour d’un centre défini en fonction des valeurs qu’il prend pour l’ensemble des variables. Cependant, dans certaines applications du monde réel, et notamment dans le cas de données de dimension importante, cette hypothèse peut être invalidée. Aussi, les algorithmes de co-clustering ont-ils été proposés : ils décrivent les groupes d’individus par un ou plusieurs sous-ensembles de variables au regard de leur pertinence. La structure des données finalement obtenue est composée de blocs communément appelés co-clusters. Dans les deux premiers chapitres de cette thèse, nous présentons deux approches de co-clustering permettant de différencier les variables pertinentes du bruit en fonction de leur capacité à révéler la structure latente des données, d’une part dans un cadre probabiliste et, d’autre part, dans un cadre basé sur la notion de métrique. L’approche probabiliste utilise le principe des modèles de mélanges, et suppose que les variables non pertinentes sont distribuées selon une loi de probabilité dont les paramètres sont indépendants de la partition des données en clusters. L’approche métrique est fondée sur l’utilisation d’une distance adaptative permettant d’affecter à chaque variable un poids définissant sa contribution au co-clustering. D’un point de vue théorique, nous démontrons la convergence des algorithmes proposés en nous appuyant sur le théorème de convergence de Zangwill. Dans les deux chapitres suivants, nous considérons un cas particulier de structure en co-clustering, qui suppose que chaque sous-ensemble d’individus est décrit par un unique sous-ensemble de variables. La réorganisation de la matrice originale selon les partitions obtenues sous cette hypothèse révèle alors une structure de blocs homogènes diagonaux. Comme pour les deux contributions précédentes, nous nous plaçons dans le cadre probabiliste et métrique. L’idée principale des méthodes proposées est d’imposer deux types de contraintes : (1) nous fixons le même nombre de clusters pour les individus et les variables ; (2) nous cherchons une structure de la matrice de données d’origine qui possède les valeurs maximales sur sa diagonale (par exemple pour le cas des données binaires, on cherche des blocs diagonaux majoritairement composés de valeurs 1, et de 0 à l’extérieur de la diagonale). Les approches proposées bénéficient des garanties de convergence issues des résultats des chapitres précédents. Enfin, pour chaque chapitre, nous dérivons des algorithmes permettant d’obtenir des partitions dures et floues. Nous évaluons nos contributions sur un large éventail de données simulées et liées à des applications réelles telles que le text mining, dont les données peuvent être binaires ou continues. Ces expérimentations nous permettent également de mettre en avant les avantages et les inconvénients des différentes approches proposées. Pour conclure, nous pensons que cette thèse couvre explicitement une grande majorité des scénarios possibles découlant du co-clustering flou et dur, et peut être vue comme une généralisation de certaines approches de biclustering populaires.
With the increasing amount of data available, unsupervised learning has become an important tool used to discover underlying patterns without the need to label instances manually. Among different approaches proposed to tackle this problem, clustering is arguably the most popular one. Clustering is usually based on the assumption that each group, also called cluster, is distributed around a center defined in terms of all features, while in some real-world applications dealing with high-dimensional data this assumption may be false. To this end, co-clustering algorithms were proposed to describe clusters by subsets of features that are the most relevant to them. The obtained latent structure of data is composed of blocks usually called co-clusters. In the first two chapters, we describe two co-clustering methods that proceed by differentiating the relevance of features calculated with respect to their capability of revealing the latent structure of the data, in both a probabilistic and a distance-based framework. The probabilistic approach uses the mixture model framework where the irrelevant features are assumed to have a different probability distribution that is independent of the co-clustering structure. On the other hand, the distance-based (also called metric-based) approach relies on an adaptive metric where each variable is assigned a weight that defines its contribution to the resulting co-clustering. From the theoretical point of view, we show the global convergence of the proposed algorithms using Zangwill's convergence theorem. In the last two chapters, we consider a special case of co-clustering where, contrary to the original setting, each subset of instances is described by a unique subset of features, resulting in a diagonal structure of the initial data matrix. As for the first two contributions, we consider both probabilistic and metric-based approaches. The main idea of the proposed contributions is to impose two different kinds of constraints: (1) we fix the number of row clusters to the number of column clusters; (2) we seek a structure of the original data matrix that has the maximum values on its diagonal (for instance for binary data, we look for diagonal blocks composed of ones with zeros outside the main diagonal). The proposed approaches enjoy the convergence guarantees derived from the results of the previous chapters. Finally, we present both hard and fuzzy versions of the proposed algorithms. We evaluate our contributions on a wide variety of synthetic and real-world benchmark binary and continuous data sets related to text mining applications and analyze the advantages and drawbacks of each approach. To conclude, we believe that this thesis covers explicitly a vast majority of possible scenarios arising in hard and fuzzy co-clustering and can be seen as a generalization of some popular biclustering approaches.
Styles APA, Harvard, Vancouver, ISO, etc.
30

Martino, Sara. « Approximate Bayesian Inference for Latent Gaussian Models ». Doctoral thesis, Norwegian University of Science and Technology, Department of Mathematical Sciences, 2007. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-1949.

Texte intégral
Résumé :

This thesis consists of five papers, presented in chronological order. Their content is summarised in this section.

Paper I introduces the approximation tool for latent GMRF models and discusses, in particular, the approximation for the posterior of the hyperparameters θ in equation (1). It is shown that this approximation is indeed very accurate, as even long MCMC runs cannot detect any error in it. A Gaussian approximation to the density of χ_i | θ, y is also discussed. This appears to give reasonable results and it is very fast to compute. However, slight errors are detected when comparing the approximation with long MCMC runs. These are mostly due to the fact that a possibly skewed density is approximated via a symmetric one. Paper I also presents some details about sparse matrix algorithms.

The core of the thesis is presented in Paper II. Here most of the remaining issues present in Paper I are solved. Three different approximations for χ_i | θ, y, with different degrees of accuracy and computational cost, are described. Moreover, ways to assess the approximation error and considerations about the asymptotic behaviour of the approximations are also discussed. Through a series of examples covering a wide range of commonly used latent GMRF models, the approximations are shown to give extremely accurate results in a fraction of the computing time used by MCMC algorithms.

Paper III applies the same ideas as Paper II to generalised linear mixed models where χ represents a latent variable at n spatial sites on a two-dimensional domain. Out of these n sites, k, with n >> k, are observed through data. The n sites are assumed to be on a regular grid and wrapped on a torus. For the class of models described in Paper III the computations are based on the discrete Fourier transform instead of sparse matrices. Paper III also illustrates how the marginal likelihood π(y) can be approximated, provides approximate strategies for Bayesian outlier detection and performs approximate evaluation of spatial experimental design.

Paper IV presents yet another application of the ideas in Paper II. Here approximate techniques are used to do inference on multivariate stochastic volatility models, a class of models widely used in financial applications. Paper IV also discusses problems arising from the increased dimension of the parameter vector θ, a condition which makes all numerical integration more computationally intensive. Different approximations for the posterior marginals of the parameters θ, π(θ_i | y), are also introduced. Approximations to the marginal likelihood π(y) are used in order to perform model comparison.

Finally, Paper V is a manual for a program, named inla, which implements all the approximations described in Paper II. A large series of worked-out examples, covering many well-known models, illustrates the use and the performance of the inla program. This program is a valuable instrument since it makes most of the Bayesian inference techniques described in this thesis easily available to everyone.
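The Gaussian/Laplace approximation idea that runs through these papers can be illustrated on a toy univariate posterior: approximate it by a Gaussian centred at the mode, with variance given by the inverse curvature there. This is a generic, hypothetical sketch and not the inla program itself.

    import numpy as np
    from scipy import optimize

    def neg_log_post(theta, y):
        # Toy target: Gaussian likelihood with mean theta and a standard normal prior.
        return 0.5 * np.sum((y - theta) ** 2) + 0.5 * theta ** 2

    y = np.array([1.2, 0.8, 1.5, 0.9])
    mode = optimize.minimize_scalar(neg_log_post, args=(y,)).x

    h = 1e-4                                     # numerical second derivative at the mode
    curvature = (neg_log_post(mode + h, y) - 2 * neg_log_post(mode, y)
                 + neg_log_post(mode - h, y)) / h ** 2

    print("Laplace approximation: mean %.3f, sd %.3f" % (mode, 1 / np.sqrt(curvature)))

The papers summarised above combine such Gaussian approximations for the latent field with numerical integration over the low-dimensional hyperparameters θ.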

Styles APA, Harvard, Vancouver, ISO, etc.
31

Dominicus, Annica. « Latent variable models for longitudinal twin data ». Doctoral thesis, Stockholm : Mathematical statistics, Dept. of mathematics, Stockholm university, 2006. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-848.

Texte intégral
Styles APA, Harvard, Vancouver, ISO, etc.
32

Saba, Laura M. « Latent pattern mixture models for binary outcomes / ». Connect to full text via ProQuest. Limited to UCD Anschutz Medical Campus, 2007.

Trouver le texte intégral
Résumé :
Thesis (Ph.D. in Biostatistics) -- University of Colorado Denver, 2007.
Typescript. Includes bibliographical references (leaves 70-71). Free to UCD affiliates. Online version available via ProQuest Digital Dissertations;
Styles APA, Harvard, Vancouver, ISO, etc.
33

Jung, Sunho. « Regularized structural equation models with latent variables ». Thesis, McGill University, 2009. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=66858.

Texte intégral
Résumé :
In structural equation models with latent variables, maximum likelihood (ML) estimation is currently the most prevailing estimation method. However, the ML method fails to provide accurate solutions in a number of situations including those involving small sample sizes, nonnormality, and model misspecification. To overcome these difficulties, regularized extensions of two-stage least squares estimation are proposed that incorporate a ridge type of regularization in the estimation of parameters. Two simulation studies and two empirical applications demonstrate that the proposed method is a promising alternative to both the maximum likelihood and non-regularized two-stage least squares estimation methods. An optimal value of the regularization parameter is found by the K-fold cross validation technique. A nonparametric bootstrap method is used to evaluate the stability of solutions. A goodness-of-fit measure is used for assessing the overall fit.
Dans les modèles d’équations structurales avec des variables latentes, l’estimation de maximum de vraisemblance est la méthode d’estimation la plus utilisée. Par contre, la méthode de maximum de vraisemblance ne réussit souvent pas à fournir des solutions exactes, par exemple lorsque les échantillons sont petits, lorsque les données ne sont pas normales, ou lorsque le modèle est mal spécifié. L’estimation des moindres carrés à deux phases est asymptotiquement sans hypothèse de distribution et robuste aux mauvaises spécifications, mais elle manque de robustesse quand les échantillons sont petits. Afin de surmonter les trois difficultés mentionnées ci-dessus et d’obtenir une estimation plus exacte, des extensions régularisées des moindres carrés à deux phases sont proposées, qui incorporent directement un type de régularisation dans les modèles d’équations structurales avec des variables latentes. Deux études de simulation et deux applications empiriques démontrent que la méthode proposée est une alternative prometteuse aux méthodes de maximum de vraisemblance et des moindres carrés à deux phases non régularisés. Une valeur optimale du paramètre de régularisation est trouvée par la technique de validation croisée d’ordre K. Une méthode bootstrap non paramétrique est utilisée afin d’évaluer la stabilité des solutions. Une mesure d’adéquation est utilisée pour évaluer l’adéquation globale.
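The ridge-type regularisation of two-stage least squares described above can be illustrated with a small, hypothetical numpy sketch; this toy applies a ridge penalty at both stages of 2SLS on simulated data and is not the exact estimator developed in the thesis (in particular, the regularisation parameter would be chosen by K-fold cross-validation rather than fixed).

    import numpy as np

    def ridge_solve(X, y, lam):
        """Ridge coefficients (X'X + lam * I)^{-1} X'y."""
        p = X.shape[1]
        return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

    def regularized_2sls(y, X_endog, Z, lam=1.0):
        """Stage 1: ridge-regress the endogenous regressors on the instruments Z.
        Stage 2: ridge-regress y on the fitted values from stage 1."""
        X_hat = Z @ ridge_solve(Z, X_endog, lam)
        return ridge_solve(X_hat, y, lam)

    rng = np.random.default_rng(4)
    n = 200
    Z = rng.normal(size=(n, 3))                            # instruments
    X = Z @ np.array([[0.8], [0.5], [0.2]]) + rng.normal(size=(n, 1))
    y = X[:, 0] * 1.5 + rng.normal(size=n)
    print(regularized_2sls(y, X, Z, lam=0.5))

A small ridge penalty mainly stabilises the stage-one projection when instruments are weak or the sample is small, which is the setting motivating the thesis.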
Styles APA, Harvard, Vancouver, ISO, etc.
34

Moustaki, Irini. « Latent variable models for mixed manifest variables ». Thesis, London School of Economics and Political Science (University of London), 1996. http://etheses.lse.ac.uk/78/.

Texte intégral
Résumé :
Latent variable models are widely used in social sciences in which interest is centred on entities such as attitudes, beliefs or abilities for which there exist no direct measuring instruments. Latent modelling tries to extract these entities, here described as latent (unobserved) variables, from measurements on related manifest (observed) variables. Methodology already exists for fitting a latent variable model to manifest data that is either categorical (latent trait and latent class analysis) or continuous (factor analysis and latent profile analysis). In this thesis a latent trait and a latent class model are presented for analysing the relationships among a set of mixed manifest variables using one or more latent variables. The set of manifest variables contains metric (continuous or discrete) and binary items. The latent dimension is continuous for the latent trait model and discrete for the latent class model. Scoring methods for allocating individuals on the identified latent dimensions based on their responses to the mixed manifest variables are discussed. Item nonresponse is also discussed in attitude scales with a mixture of binary and metric variables using the latent trait model. The estimation and the scoring methods for the latent trait model have been generalized for conditional distributions of the observed variables given the vector of latent variables other than the normal and the Bernoulli in the exponential family. To illustrate the use of the mixed model, four data sets have been analyzed. Two of the data sets contain five memory questions, the first on Thatcher's resignation and the second on the Hillsborough football disaster; these five questions were included in BMRBI's August 1993 face-to-face omnibus survey. The third and the fourth data sets are from the 1990 and 1991 British Social Attitudes surveys; the questions which have been analyzed are from the sexual attitudes sections and the environment section respectively.
Styles APA, Harvard, Vancouver, ISO, etc.
35

White, S. A. « Latent structure models for repeated measurements experiments ». Thesis, University of Nottingham, 1986. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.376163.

Texte intégral
Styles APA, Harvard, Vancouver, ISO, etc.
36

Burridge, C. Y. « Latent variable models for genotype-environment interaction ». Thesis, University of Reading, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.383469.

Texte intégral
Styles APA, Harvard, Vancouver, ISO, etc.
37

Challis, E. A. L. « Variational approximate inference in latent linear models ». Thesis, University College London (University of London), 2013. http://discovery.ucl.ac.uk/1414228/.

Texte intégral
Résumé :
Latent linear models are core to much of machine learning and statistics. Specific examples of this model class include Bayesian generalised linear models, Gaussian process regression models and unsupervised latent linear models such as factor analysis and principal components analysis. In general, exact inference in this model class is computationally and analytically intractable. Approximations are thus required. In this thesis we consider deterministic approximate inference methods based on minimising the Kullback-Leibler (KL) divergence between a given target density and an approximating `variational' density. First we consider Gaussian KL (G-KL) approximate inference methods where the approximating variational density is a multivariate Gaussian. Regarding this procedure we make a number of novel contributions: sufficient conditions for which the G-KL objective is differentiable and convex are described, constrained parameterisations of Gaussian covariance that make G-KL methods fast and scalable are presented, the G-KL lower-bound to the target density's normalisation constant is proven to dominate those provided by local variational bounding methods. We also discuss complexity and model applicability issues of G-KL and other Gaussian approximate inference methods. To numerically validate our approach we present results comparing the performance of G-KL and other deterministic Gaussian approximate inference methods across a range of latent linear model inference problems. Second we present a new method to perform KL variational inference for a broad class of approximating variational densities. Specifically, we construct the variational density as an affine transformation of independently distributed latent random variables. The method we develop extends the known class of tractable variational approximations for which the KL divergence can be computed and optimised and enables more accurate approximations of non-Gaussian target densities to be obtained.
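As a toy illustration of the G-KL idea described above, the following hedged sketch maximises a Monte Carlo estimate of the KL lower bound for a tiny Bayesian logistic regression, using a diagonal Gaussian variational density; the data, sample sizes and optimiser choice are invented, and this ignores the constrained covariance parameterisations developed in the thesis.

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(5)
    X = rng.normal(size=(100, 2))
    w_true = np.array([1.0, -2.0])
    y = (rng.uniform(size=100) < 1 / (1 + np.exp(-X @ w_true))).astype(float)
    eps = rng.normal(size=(256, 2))               # fixed Monte Carlo noise (common random numbers)

    def neg_bound(params):
        m, log_s = params[:2], params[2:]
        w = m + np.exp(log_s) * eps               # samples from q(w) = N(m, diag(s^2))
        logits = w @ X.T
        log_lik = (y * logits - np.logaddexp(0.0, logits)).sum(axis=1)
        log_prior = -0.5 * (w ** 2).sum(axis=1)   # standard normal prior, up to a constant
        entropy = log_s.sum()                     # Gaussian entropy, up to a constant
        return -(log_lik.mean() + log_prior.mean() + entropy)

    res = minimize(neg_bound, x0=np.zeros(4), method="L-BFGS-B")
    print("variational posterior mean:", res.x[:2].round(2), "(true weights:", w_true, ")")

Maximising this bound is equivalent to minimising the KL divergence from the Gaussian q to the exact posterior, which is the G-KL procedure discussed in the abstract.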
Styles APA, Harvard, Vancouver, ISO, etc.
38

Albanese, Maria Teresinha. « Latent variable models for binary response data ». Thesis, London School of Economics and Political Science (University of London), 1990. http://etheses.lse.ac.uk/1220/.

Texte intégral
Résumé :
Most of the results in this thesis are obtained for the logit/probit model for binary response data given by Bartholomew (1980), which is sometimes called the two-parameter logistic model. In most of the cases the results also hold for other common binary response models. By profiling and an approximation, we investigate the behaviour of the likelihood function, to see if it is suitable for ML estimation. Particular attention is given to the shape of the likelihood around the maximum point in order to see whether the information matrix will give a good guide to the variability of the estimates. The adequacy of the asymptotic variance-covariance matrix is investigated through jackknife and bootstrap techniques. We obtain the marginal ML estimators for the Rasch model and compare them with those obtained from conditional ML estimation. We also test the fit of the Rasch model against a logit/probit model with a likelihood ratio test, and investigate the behaviour of the likelihood function for the Rasch model and its bootstrap estimates together with approximate methods. For both fixed and decreasing sample size, we investigate the stability of the discrimination parameter estimates a_{i,1} when the number of items is reduced. We study the conditions which give rise to large discrimination parameter estimates. This leads to a method for the generation of a (p+1)th item with any fixed a_{p+1,1} and a_{p+1,0}. In practice it is important to measure the latent variable and this is usually done by using the posterior mean or the component scores. We give some theoretical and applied results for the relation between the linearity of the plot of the posterior mean latent variable values, the component scores and the normality of those posterior distributions.
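A minimal sketch of the two-parameter logistic (logit/probit-type) model discussed above, with posterior-mean scoring of the latent variable on a quadrature grid; the item parameters and sample size below are invented for illustration.

    import numpy as np

    rng = np.random.default_rng(6)
    a0 = np.array([-0.5, 0.0, 0.5, 1.0])      # item intercepts
    a1 = np.array([1.0, 1.5, 0.8, 2.0])       # item discrimination parameters
    z_true = rng.normal(size=500)              # latent variable for 500 respondents
    p = 1 / (1 + np.exp(-(a0 + np.outer(z_true, a1))))
    x = (rng.uniform(size=p.shape) < p).astype(int)   # simulated binary response matrix

    # Posterior mean of z for a given response pattern, under a standard normal prior,
    # computed on a quadrature grid.
    grid = np.linspace(-4, 4, 201)
    probs = 1 / (1 + np.exp(-(a0 + np.outer(grid, a1))))

    def posterior_mean(resp):
        lik = np.prod(probs ** resp * (1 - probs) ** (1 - resp), axis=1)
        post = lik * np.exp(-0.5 * grid ** 2)
        return (grid * post).sum() / post.sum()

    print("posterior mean score for the first simulated respondent:",
          round(posterior_mean(x[0]), 2))

Plotting such posterior means against the component scores is one way to examine the linearity relation mentioned at the end of the abstract.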
Styles APA, Harvard, Vancouver, ISO, etc.
39

Basbug, Mehmet Emin. « Integrating Exponential Dispersion Models to Latent Structures ». Thesis, Princeton University, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10254057.

Texte intégral
Résumé :

Latent variable models have two basic components: a latent structure encoding a hypothesized complex pattern and an observation model capturing the data distribution. With the advancements in machine learning and increasing availability of resources, we are able to perform inference in deeper and more sophisticated latent variable models. In most cases, these models are designed with a particular application in mind; hence, they tend to have restrictive observation models. The challenge, surfaced with the increasing diversity of data sets, is to generalize these latent models to work with different data types. We aim to address this problem by utilizing exponential dispersion models (EDMs) and proposing mechanisms for incorporating them into latent structures. (Abstract shortened by ProQuest.)

Styles APA, Harvard, Vancouver, ISO, etc.
40

Wenzel, Florian. « Scalable Inference in Latent Gaussian Process Models ». Doctoral thesis, Humboldt-Universität zu Berlin, 2020. http://dx.doi.org/10.18452/20926.

Texte intégral
Résumé :
Latente Gauß-Prozess-Modelle (latent Gaussian process models) werden von Wissenschaftlern benutzt, um verborgene Muster in Daten zu erkennen, Expertenwissen in probabilistische Modelle einfließen zu lassen und um Vorhersagen über die Zukunft zu treffen. Diese Modelle wurden erfolgreich in vielen Gebieten wie Robotik, Geologie, Genetik und Medizin angewendet. Gauß-Prozesse definieren Verteilungen über Funktionen und können als flexible Bausteine verwendet werden, um aussagekräftige probabilistische Modelle zu entwickeln. Dabei ist die größte Herausforderung, eine geeignete Inferenzmethode zu implementieren. Inferenz in probabilistischen Modellen bedeutet, die A-Posteriori-Verteilung der latenten Variablen, gegeben der Daten, zu berechnen. Die meisten interessanten latenten Gauß-Prozess-Modelle haben zurzeit nur begrenzte Anwendungsmöglichkeiten auf großen Datensätzen. In dieser Doktorarbeit stellen wir eine neue effiziente Inferenzmethode für latente Gauß-Prozess-Modelle vor. Unser neuer Ansatz, den wir augmented variational inference nennen, basiert auf der Idee, eine erweiterte (augmented) Version des Gauß-Prozess-Modells zu betrachten, welche bedingt konjugiert (conditionally conjugate) ist. Wir zeigen, dass Inferenz in dem erweiterten Modell effektiver ist und dass alle Schritte des variational inference Algorithmus in geschlossener Form berechnet werden können, was mit früheren Ansätzen nicht möglich war. Unser neues Inferenzkonzept ermöglicht es, neue latente Gauß-Prozess-Modelle zu studieren, die zu innovativen Ergebnissen im Bereich der Sprachmodellierung, genetischen Assoziationsstudien und Quantifizierung der Unsicherheit in Klassifikationsproblemen führen.
Latent Gaussian process (GP) models help scientists to uncover hidden structure in data, express domain knowledge and form predictions about the future. These models have been successfully applied in many domains including robotics, geology, genetics and medicine. A GP defines a distribution over functions and can be used as a flexible building block to develop expressive probabilistic models. The main computational challenge of these models is to make inference about the unobserved latent random variables, that is, computing the posterior distribution given the data. Currently, most interesting Gaussian process models have limited applicability to big data. This thesis develops a new efficient inference approach for latent GP models. Our new inference framework, which we call augmented variational inference, is based on the idea of considering an augmented version of the intractable GP model that renders the model conditionally conjugate. We show that inference in the augmented model is more efficient and, unlike in previous approaches, all updates can be computed in closed form. The ideas around our inference framework facilitate novel latent GP models that lead to new results in language modeling, genetic association studies and uncertainty quantification in classification tasks.
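One widely used route to the conditional conjugacy mentioned above, for logit-link GP models, is the Pólya-Gamma representation of the logistic likelihood shown below; the abstract does not spell out which augmentation is used, so treat this as a plausible illustration rather than the thesis construction.

```latex
\frac{(e^{\psi})^{a}}{(1 + e^{\psi})^{b}}
= 2^{-b}\, e^{\kappa \psi} \int_{0}^{\infty} e^{-\omega \psi^{2}/2}\, p_{\mathrm{PG}}(\omega \mid b, 0)\, d\omega,
\qquad \kappa = a - \tfrac{b}{2}.
```

Conditionally on the auxiliary variable ω the likelihood is Gaussian in ψ, hence conjugate to a GP prior, which is what allows every variational update in the augmented model to be written in closed form.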
Styles APA, Harvard, Vancouver, ISO, etc.
41

Chen, Tao. « Search-based learning of latent tree models / ». View abstract or full-text, 2009. http://library.ust.hk/cgi/db/thesis.pl?CSED%202009%20CHEN.

Texte intégral
Styles APA, Harvard, Vancouver, ISO, etc.
42

Fusi, Nicolo. « Probabilistic latent variable models in statistical genomics ». Thesis, University of Sheffield, 2015. http://etheses.whiterose.ac.uk/8326/.

Texte intégral
Résumé :
In this thesis, we propose different probabilistic latent variable models to identify and capture the hidden structure present in commonly studied genomics datasets. We start by investigating how to correct for unwanted correlations due to hidden confounding factors in gene expression data. This is particularly important in expression quantitative trait loci (eQTL) studies, where the goal is to identify associations between genetic variants and gene expression levels. We start with a naïve approach, which estimates the latent factors from the gene expression data alone, ignoring the genetics, and we show that it leads to a loss of signal in the data. We then highlight how, thanks to the formulation of our model as a probabilistic model, it is straightforward to modify it in order to take into account the specific properties of the data. In particular, we show that in the naïve approach the latent variables "explain away" the genetic signal, and that this problem can be avoided by jointly inferring these latent variables while taking into account the genetic information. We then extend this, so far additive, model to additionally detect interactions between the latent variables and the genetic markers. We show that this leads to a better reconstruction of the latent space and that it helps dissecting latent variables capturing general confounding factors (such as batch effects) from those capturing environmental factors involved in genotype-by-environment interactions. Finally, we investigate the effects of misspecifications of the noise model in genetic studies, showing how the probabilistic framework presented so far can be easily extended to automatically infer non-linear monotonic transformations of the data such that the common assumption of Gaussian distributed residuals is respected.
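As a point of reference for the discussion above, a minimal sketch (illustrative only, not the thesis code) of the naïve correction that estimates hidden factors from the expression matrix alone could look like this:

```python
import numpy as np

def naive_confounder_correction(Y, k):
    """Illustrative 'naive' correction: estimate k hidden factors by PCA on the
    expression matrix Y (samples x genes) alone and regress them out."""
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    F = U[:, :k] * s[:k]                         # estimated latent factors
    beta, *_ = np.linalg.lstsq(F, Yc, rcond=None)
    return Yc - F @ beta                         # residual expression for eQTL scans
```

The abstract's point is precisely that regressing out factors estimated this way can also remove genuine genetic signal, which motivates jointly inferring the factors together with the genetics.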
Styles APA, Harvard, Vancouver, ISO, etc.
43

Ridall, Peter Gareth. « Bayesian Latent Variable Models for Biostatistical Applications ». Thesis, Queensland University of Technology, 2004. https://eprints.qut.edu.au/16164/1/Peter_Ridall_Thesis.pdf.

Texte intégral
Résumé :
In this thesis we develop several kinds of latent variable models in order to address three types of bio-statistical problem. The three problems are the treatment effect of carcinogens on tumour development, spatial interactions between plant species and motor unit number estimation (MUNE). The three types of data looked at are: highly heterogeneous longitudinal count data, quadrat counts of species on a rectangular lattice and lastly, electrophysiological data consisting of measurements of compound muscle action potential (CMAP) area and amplitude. Chapter 1 sets out the structure and the development of ideas presented in this thesis from the point of view of: model structure, model selection, and efficiency of estimation. Chapter 2 is an introduction to the relevant literature that has influenced the development of this thesis. In Chapter 3 we use the EM algorithm for an application of an autoregressive hidden Markov model to describe longitudinal counts. The data is collected from experiments to test the effect of carcinogens on tumour growth in mice. Here we develop forward and backward recursions for calculating the likelihood and for estimation. Chapter 4 is the analysis of a similar kind of data using a more sophisticated model, incorporating random effects, but estimation this time is conducted from the Bayesian perspective. Bayesian model selection is also explored. In Chapter 5 we move to the two dimensional lattice and construct a model for describing the spatial interaction of tree types. We also compare the merits of directed and undirected graphical models for describing the hidden lattice. Chapter 6 is the application of a Bayesian hierarchical model (MUNE), where the latent variable this time is multivariate Gaussian and dependent on a covariate, the stimulus. Model selection is carried out using the Bayes Information Criterion (BIC). In Chapter 7 we approach the same problem by using the reversible jump methodology (Green, 1995) where this time we use a dual Gaussian-Binary representation of the latent data. We conclude in Chapter 8 with suggestions for the direction of new work. In this thesis, all of the estimation carried out on real data has only been performed once we have been satisfied that estimation is able to retrieve the parameters from simulated data. Keywords: Amyotrophic lateral sclerosis (ALS), carcinogens, hidden Markov models (HMM), latent variable models, longitudinal data analysis, motor unit disease (MND), partially ordered Markov models (POMMs), the pseudo auto-logistic model, reversible jump, spatial interactions.
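The forward recursion mentioned for Chapter 3 can be sketched as follows; this is a generic log-space implementation under simplifying assumptions (known transition matrix, per-time emission log-probabilities already computed), not the autoregressive variant developed in the thesis.

```python
from scipy.special import logsumexp

def hmm_log_likelihood(log_emission, log_A, log_pi):
    """Forward recursion for a hidden Markov model log-likelihood.
    log_emission: (T, K) array of log p(y_t | state k);
    log_A: (K, K) transition log-probabilities (row = from, column = to);
    log_pi: (K,) initial state log-probabilities. Returns log p(y_1..T)."""
    T, K = log_emission.shape
    alpha = log_pi + log_emission[0]
    for t in range(1, T):
        # alpha_j(t) = log sum_i exp(alpha_i(t-1) + log A_ij) + log p(y_t | j)
        alpha = logsumexp(alpha[:, None] + log_A, axis=0) + log_emission[t]
    return logsumexp(alpha)
```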
Styles APA, Harvard, Vancouver, ISO, etc.
44

Ridall, Peter Gareth. « Bayesian Latent Variable Models for Biostatistical Applications ». Queensland University of Technology, 2004. http://eprints.qut.edu.au/16164/.

Texte intégral
Résumé :
In this thesis we develop several kinds of latent variable models in order to address three types of bio-statistical problem. The three problems are the treatment effect of carcinogens on tumour development, spatial interactions between plant species and motor unit number estimation (MUNE). The three types of data looked at are: highly heterogeneous longitudinal count data, quadrat counts of species on a rectangular lattice and lastly, electrophysiological data consisting of measurements of compound muscle action potential (CMAP) area and amplitude. Chapter 1 sets out the structure and the development of ideas presented in this thesis from the point of view of: model structure, model selection, and efficiency of estimation. Chapter 2 is an introduction to the relevant literature that has influenced the development of this thesis. In Chapter 3 we use the EM algorithm for an application of an autoregressive hidden Markov model to describe longitudinal counts. The data is collected from experiments to test the effect of carcinogens on tumour growth in mice. Here we develop forward and backward recursions for calculating the likelihood and for estimation. Chapter 4 is the analysis of a similar kind of data using a more sophisticated model, incorporating random effects, but estimation this time is conducted from the Bayesian perspective. Bayesian model selection is also explored. In Chapter 5 we move to the two dimensional lattice and construct a model for describing the spatial interaction of tree types. We also compare the merits of directed and undirected graphical models for describing the hidden lattice. Chapter 6 is the application of a Bayesian hierarchical model (MUNE), where the latent variable this time is multivariate Gaussian and dependent on a covariate, the stimulus. Model selection is carried out using the Bayes Information Criterion (BIC). In Chapter 7 we approach the same problem by using the reversible jump methodology (Green, 1995) where this time we use a dual Gaussian-Binary representation of the latent data. We conclude in Chapter 8 with suggestions for the direction of new work. In this thesis, all of the estimation carried out on real data has only been performed once we have been satisfied that estimation is able to retrieve the parameters from simulated data. Keywords: Amyotrophic lateral sclerosis (ALS), carcinogens, hidden Markov models (HMM), latent variable models, longitudinal data analysis, motor unit disease (MND), partially ordered Markov models (POMMs), the pseudo auto-logistic model, reversible jump, spatial interactions.
Styles APA, Harvard, Vancouver, ISO, etc.
45

Pegoraro, Fulvio <1974&gt. « Discrete time pricing : models with latent variables ». Doctoral thesis, Università Ca' Foscari Venezia, 2004. http://hdl.handle.net/10579/197.

Texte intégral
Résumé :
L'obbiettivo della presente Tesi é di considerare la specificazione di mod­elli di pricing in tempo discreto (in generale, incompleti) con variabili latenti, al fine di sfruttare i vantaggi derivanti da tale contesto a tempo discreto e al fine di fornire una descrizione completa degli aspetti storici e neutrali al rischio dei prezzi dei titoli. Negli ultimi anni osserviamo un importante sviluppo di modelli di pric­ing in tempo discreto, dove la modellizzazione secondo il principio dello Stochastic Discount Factor (SDF) e la caratterizzazione della distribuzione condizionale delle variabili di stato tramite la trasformata di Laplace sem­brano fornire risultati promettenti. Più precisamente, la caratterizzazione generale di modelli di pricing in tempo discreto, usando questo tipo di approccio, e dove é assunta una speci­ficazione Compound Autoregressive (CAR ovvero affine) per le variabili di stato [vedi Darolles, Gourieroux, Jasiak (2002)], é stata proposta da Gourier­oux e Monfort (2003) e Gourieroux, Monfort e Polimenis (2002, 2003); in questi articoli viene presentata la metodologia generale di pricing e vengono specificati modelli per la Struttura a Termine e per il Rischio di Credito. Il tempo discreto é un contesto naturale per sviluppare modelli di valoriz­zazione volti a future implementazioni econometriche; infatti, i dati storici sono campionati con frequenza discreta, le transazioni finanziarie sono tipi­camente registrate a intervalli temporali discreti, la stima di parametri e i test statistici implicano dati a tempo discreto e le previsione sono fatte a orizzonti discreti. Un secondo e importante vantaggio che si ha nel lavorare in tempo dis­creto emerge quando consideriamo la classe di processi affini per applicazioni finanziarie. La classe di processi affini in tempo discreto (processi CAR) [pro­posti, come indicato sopra, da Darolles, Gourieroux, Jasiak (2002)] é molto più ampia della classe equivalente in tempo continuo proposta da Duffie, Fil­ipovic and Schachermayer (2003) : tutti i processi affini in tempo continuo campionati a istanti temporali discreti sono CAR, mentre esiste un ampio numero di processi CAR senza un processo equivalente in tempo continuo. Questa é una conseguenza del problema di embedding che caratterizza la classe affine in tempo continuo : tali processi devono essere infinitamente decomponibili, mentre tale condizione non é necessaria in tempo discreto [vedi Darolles, Gourieroux and Jasiak (2002) and Gourieroux, Monfort and Polimenis (2002)]. Nella Tesi sfrutteremo il contesto a tempo discreto anche per introdurre processi Non-Gaussiani e Non-Markoviani come le Misture di Processi Con­dizionatamente Gaussiani. Per quanto riguarda l'utilizzo della trasformata di Laplace condizionale per descrivere la distribuzione storica e neutrale al rischio delle variabili di stato, si osservi come in molte applicazione economico-finanziarie siamo por­tati in modo naturale a dover calcolare la trasformata di Laplace di tali variabili di stato. Alcuni esempi possibili sono i seguenti : (a) optimal port­folio problems (CARA utility functions, Markowitz), (b) asset pricing by thè certainty equivalence principle (CARA utility functions), (c) discrete time derivative pricing and term strueture models with exponential-affine SDFs, (d) panel duration models, (e) extreme risk [see Darolles, Gourieroux and Jasiak (2002) for details]. 
Vedremo, inoltre, che la trasformata di Laplace é uno strumento molto utile per caratterizzare anche la distribuzione storica e neutrale al rischio di Misture di Processi Condizionatamente Gaussiani. Per finire, la necessità di prendere in considerazione le fonti di rischio ril­evanti nell'influenzare il titolo da valorizzare, porta a considerare lo Stochas­tic Discount Factor (SDF) come strumento per caratterizzare la procedura di pricing : lo SDF é una variabile casuale (chiamata anche Pricing Kernel o State Price Deflator) che sintetizza sia l'attualizzazione temporale che la correzione per il rischio, e che porta a specificare, conseguentemente, una procedura di valorizzazione che fornisce una modellizzazione completa degli aspetti storici e neutrali al rischio. Il tempo discreto implica in generale un contesto a mercato incompleto e una molteplicità di formule di pricing; il problema della molteplicità viene ridotto imponendo una forma particolare allo SDF; il Pricing Kernel viene specificato, infatti, secondo una forma esponenziale-affine che si é dimostrata utile in molte circostanze e che troviamo sovente in letteratura [vedi Lu­cas (1978), Gerber e Shiu (1994), Stutzer (1995, 1996), Buchen e Kelly (1996), Buhlmann et al. (1997, 1998), Polimenis (2001), Gourieroux e Mon­fort (2002)]. Inoltre, uno SDF con una forma esponenziale-affine presenta proprietà tecniche interessanti : tale approccio infatti, che é basato sulla trasformata di Esscher in un contesto dinamico a tempo discreto, perme­tte di selezionare una misura martingale (di pricing) equivalente che riflette, nella formula di pricing, le diverse fonti di rischio da valorizzare. Ora, il contesto a tempo discreto, assieme ai principi di modellizzazione dello SDF esponenziale-affine e della transformata di Laplace, costituiscono gli strumenti usati nei tre capitoli fondamentali della Tesi. La Tesi analizza il ruolo che l'introduzione di variabili latenti può avere, in questa classe di modelli di pricing a tempo discreto, nello specificare metodologie di valorizzazione complete e coerenti rispetto alle indicazioni empiriche. Nei Capitoli 2 e 3 l'obbiettivo, infatti, é quello di specificare metodologie per la valorizzazione di prodotti derivati in grado di prendere in considerazione i tipici fenomeni di skewness e excess kurtosis che osserviamo nella distribuzione dei rendimenti di titoli azionari, e di riuscire quindi a repli­care le volatilità implicite di Black e Scholes (BS) e superfici di volatilità im­plicita con forme di smile a volatility skew coerenti con l'evidenza empirica1. Qui, le variabili latenti introducono cambiamenti di regime nella dinamica del titolo sottostante (cambiamenti, per esempio, tra un mercato a regime di alta e bassa volatilità) ovvero, introducono nella distribuzione storica del rendimento rischioso fenomeni come medie e varianze stocastiche. Nel Capi­tolo 4, vengono proposti modelli affini bifattoriali per la struttura a termine dei tassi di interesse (in tempo discreto) con variabili latenti; l'obbiettivo é quello di ottenere famiglie di possibili strutture a termine con forme più prossime (rispetto ai modelli unifattoriali in tempo discreto e continuo) a quelle osservate. In questo caso, le variabili latenti introducono parametri stocastici (continui e discreti) nella dinamica del fattore (tasso di interesse a breve scadenza) responsabile della forma della struttura a termine nei modelli unifattoriale. 
In altre parole, si vogliono definire procedure di pricing capaci di pren­dere in considerazione, in modo coerente e utile, le fonti di rischio descritte dai cambiamenti di regime e dai parametri stocastici; vogliamo specificare metodologie di valorizzazione non basate su ipotesi arbitrarie (spesso us­ate in letterature) come, per esempio, la neutralità al rischio degli investi­tori o la natura idiosincratica del rischio, e vogliamo derivare formule di pricing che hanno una forma analitica o che sono facili da implementare. L'organizzazione della Tesi é indicata nel prossimo paragrafo. 1 Questi due capitoli corrispondono a due articoli scritti con Henri Bertholon e Alain Monfort. Sintesi dei capitoli Nel CAPITOLO 1 viene inizialmente presentato il principio di modelliz­zazione dello SDF, e come é legato alla Law of One Price e al principio di Absence of Arbitrage Opportunity; successivamente, vengono descritti gli strumenti base che caratterizzano i modelli di pricing a tempo discreto sviluppati nella tesi : lo SDF esponenziale-affine, e la rappresentazione della distribuzione condizionale delle variabili di stato tramite la trasformata di Laplace considerando come esempio i processi CAR. Nel CAPITOLO 2 proponiamo una nuova procedura di valorizzazione di opzioni Europee che porta ad una generalizzazione della formula di Black e Scholes [utile, quindi, dal punto di vista delle istituzioni finanziarie]; in particolare, ci focalizziamo sulle due fonti fondamentali di cattiva specifi­cazione dell'approccio BS, ovvero l'assenza di Gaussianità e la dinamica. Gli strumenti utilizzati sono le misture in tempo discreto di processi condizion­atamente gaussiani, cioè processi {yt} tali che yt+1 é gaussiano condizion­atamente ai propri valori passati e al valore presente zt+ì di un white noise non osservabile a valori discreti. Forniamo (in un semplice caso statico) le simulazioni di volatilità implicita di BS e di superfici di volatilità implicita, e osserviamo l'abilità delle procedure di pricing che proponiamo nel repli­care smiles e volatility skews coerenti con l'evidenza empirica. Per quanto riguarda le superfici di volatilità implicita, il modello statico mostra qualche limite che é superato, con una dinamica di tipo Regime-Switching attribuita a zt+i, nel Capitolo 3. Il CAPITOLO 3 presenta una naturale evoluzione del precedente capitolo; infatti, prende in considerazione il caso in cui la variabile latente zt+\ non sia più un white noise ma, tipicamente, una Catena di Markov. Più precisa­mente, presentiamo il modello General Switching Regime per il pricing di derivati, applicato ai casi di opzioni Europee e path dependent. Studiamo inoltre le condizioni sotto le quali c'è una trasmissione di causalità (assenza di causalità istantanea, assenza di causalità, indipendenza), esistente tra la madia e la varianza stocastica, dal mondo storico al mondo neutrale al ris­chio. A questo scopo separiamo la dinamica della media e della varianza (in un caso di Hidden Markov Chain) usando due distinte variabili latenti (zit+i , Z2t+i), dove sia zu+i che Zit+\ possono prendere J possibili valori, e dove la prima variabile latente descrive la dinamica della media mentre la seconda quella della varianza. 
Lo scopo del CAPITOLO 4 é di introdurre parametri stocastici e cambi­amenti di regime nei modelli affini unifattoriali per la struttura a termine presentati da Gourieroux, Monfort and Polimenis (2002) [GMP (2002)], al fine di estendere la dinamica del tasso a breve termine e di ampliare, con­seguentemente, la ricchezza di curve della struttura a termine che tali modelli sono in grado di riprodurre. Vengono studiati diversi modelli alternativi e vengono presentate le simulazioni sulle possibili struttura a termine che essi sono in grado di replicare; in particolare, le strutture a termine ottenute mostrano forme con gobbe verso l'alto e verso il basso, forme con diversi gradi curvatura e con due mode. Per finire, presentiamo in problema dell' in­dividuazione di mimicking factors [un vettore Rt = (rf;i+2,..., rt,t+n) di tassi di interesse con differenti maturity] per i parametri stocastici e i cambiamenti di regime : questo é un problema interessante, dal punto di vista statistico, data l'osservabilità dei tassi di interesse. The aim of the thesis is to consider, as a new research direction, the specification of discrete time pricing models (in general incomplete) with latent variables, in order to exploit the advantages coming from the discrete time framework and in order to give a complete description of historical and risk-neutral aspects of asset prices. In the last years, we observe an important development of asset pric­ing models in discrete time, where the use of the Stochastic Discount Factor (SDF) modeling principle and the characterization of the state variables con­ditional distributions by means of the Laplace transform seem promising. More precisely, the general discrete time characterization of asset pricing models, using this kind of approach, and where a compound autoregressive (affine or CAR) specification for the state variables is assumed [see Darolles, Gourieroux, Jasiak (2002)], has been proposed by Gourieroux and Monfort (2003) and Gourieroux, Monfort and Polimenis (2002, 2003); in these pa­pers the general pricing methodology and the specifications of Affine Term Structure models, along with the Credit Risk Analysis, are presented. The discrete time is a natural framework to develop pricing models for fu­ture econometric implementations, given that all historical data are sampled discretely, financial transactions are typically recorded at discrete intervals, parameter estimation and hypothesis testing involve discrete data records, and forecasts are produced at discrete horizons. A second important advantage to work in discrete time emerges when we consider the class of affine processes for financial applications. The class of discrete time affine (CAR) processes [proposed, as indicated above, by Darolles, Gourieroux and Jasiak (2002)] is much larger than the equivalent continuous time class proposed by Duffie, Filipovic and Schachermayer (2003) : all continuous time affine processes sampled at discrete points are CAR, while there exists a large number of CAR processes without a continuous time counterpart. This is a consequence of the embedding problem that characterizes the continuous time class : these processes have to be infinitely decomposable, and this decomposition condition is not necessary in discrete time [see Darolles, Gourieroux and Jasiak (2002) and Gourieroux, Monfort and Polimenis (2002) for details]. 
In this Thesis, we will also exploit the discrete time framework in order to introduce non-Gaussian and non-Markovian processes like, for instance, the Mixtures of Conditionally Normal Processes. With regard to the use of the conditional Laplace transform to describe the historical and risk-neutral (pricing) distribution of the state variables, we observe that in many financial and economic applications we are naturally lead to determine the Laplace transform of the processes of interest; possible examples are the followings : (a) optimal portfolio problems (CARA utility functions, Markowitz), (b) asset pricing by the certainty equivalence princi­ple (CARA utility functions), (c) discrete time derivative pricing and term structure models with exponential-affine SDFs, (d) panel duration models, (e) extreme risk [see Darolles, Gourieroux and Jasiak (2002) for details]. In this Thesis we will see that the Laplace transform is also very convenient for the class of Mixtures of Conditionally Normal Processes. Finally, the need to take into account the relevant sources of risk that influence the asset one wants to price, lead to consider a Stochastic Discount Factor (SDF) approach to characterize the pricing procedure : the SDF is a random variable (called also Pricing Kernel or State Price Deflator) which summarizes both the time discounting and the risk correction, and which specifies, consequently, a pricing procedure that gives a complete modelisa­tion of the historical and risk-neutral (pricing) aspects. Given that discrete time implies in general an incomplete market frame­work and a multiplicity of asset pricing formulas, the multiplicity problem is reduced by imposing a special structure on the SDF; in particular, it is pos­sible to consider for the pricing kernel an exponential-affine function of the state variables which has proved useful in many circumstances and that we find frequently in the literature [see Lucas (1978), Gerber and Shiu (1994), Stutzer (1995, 1996), Buchen and Kelly (1996), Buhlmann et al. (1997, 1998), Polimenis (2001), Gourieroux and Monfort (2002)]. Moreover, a SDF with an exponential-affine form presents interesting technical properties : it is the Esscher transform approach, in a dynamic discrete time framework, which gives the possibility to select an equivalent martingale (pricing) mea­sure that reflects, in the pricing formula, the different sources of risks to be priced. Now, the discrete time framework, along with the exponential-affine SDF modeling principle and the Laplace transform approach, constitute the in­struments used in the three core chapters of the Thesis. The Thesis analyzes the role that the introduction of latent variables could play, in this class of discrete time pricing models, for the specification of com­plete and coherent, with respect to the empirical evidence, pricing methodolo­gies. In Chapters 2 and 3 the purpose is, indeed, to specify derivative pricing methodologies able to take into account time-varying stock returns skewness and excess kurtosis, that is, pricing procedure able to replicate phenomena like implied Black and Scholes volatilities and implied volatility surfaces with smile and volatility skew shapes coherent with empirical studies2. 
Here, the latent variables are regimes of the underlying risky asset (switches, for in­stance, between a low volatility and a high volatility regime of the market), that is, they introduce phenomena like stochastic means and variances in the historical dynamics of the stock return underlying the derivative product. In Chapter 4, the Thesis proposes discrete time two-factor affine term structure models with latent variables, able to obtain families of possible term struc­tures with shapes closer (with respect to one-factor continuous and discrete time models) to the observed ones. In this case the latent variables intro­duce discrete and continuous stochastic parameters in the dynamics of the factor (the short term interest rate) that explains the term structure of the univariate models. In other words, we want to define pricing procedure able to take into account in a coherent and useful way the sources of risk described by the switching of regimes and by the stochastic parameters; we want to spec­ify pricing procedure not characterized by arbitrary assumptions, frequently used in the literature, like, for instance, the risk-neutrality of the investors or the idiosyncratic nature of the risk. In addition, we want to provide pricing formulas which have an analytical form or which are simple to implement. The organization of the Thesis is detailed below. 2They correspond to two papers written with Henri Bertholon and Alain Monfort. Outline of the chapters In CHAPTER 1 first we present the SDF modeling principle, and its rela­tions with the Law of One Price and the Absence of Arbitrage Opportunity principle, then we consider the basic tools characterizing the discrete time pricing models developed in this Thesis : the exponential-affine SDF, the conditional Laplace transform description of the future uncertainty and the CAR processes. In CHAPTER 2 we propose a new European option pricing procedure which lead to a generalization of the Black and Scholes pricing formula [and, therefore, useful for financial institutions]. We focus on two impor­tant sources of misspecification for the Black-Scholes approach, namely the lack of normality and the dynamics. The basic tools are the mixtures of discrete time conditionally normal processes, that is to say processes {yt} such that yt+i is gaussian conditionally to its past values and the present value zt+i of a discrete value unobservable white noise process. We provide (in a static framework) simulations of implied Black-Scholes volatilities and implied volatilities surfaces, and we observe the ability of the proposed as­set pricing methodology to replicate smiles and volatility skews coherent with empirical results. With regard to implied volatility surfaces, the static model shows some limit which is overcome, with a Regime-Switching dynamics for Zt+i, in Chapter 3. CHAPTER 3 presents a natural evolution of the previous chapter, that is, it considers the case where the latent variable zt+1 is no more a white noise but, typically a Markov chain. More precisely, we present the derivative pricing General Switching Regime model applied to the cases of European and path dependent options. We also study the conditions under which there is a transmission of causality relations (absence of instantaneous casuality, absence of causality, independence), existing between the stochastic mean and variance, from the historical to the risk-neutral world. 
For this purpose we separate the dynamics of these two moments (in the case of a Hidden Markov Chain) using two distinct latent variables (z_{1,t+1}, z_{2,t+1}), where both z_{1,t+1} and z_{2,t+1} can take J values and where the first latent variable describes the dynamics of the mean while the second one describes the dynamics of the variance. The aim of CHAPTER 4 is to introduce stochastic parameters and switching regimes in the one-factor Affine Term Structure Models proposed by Gourieroux, Monfort and Polimenis (2002) [GMP (2002) hereafter], in order to extend the dynamics of the short rate and to improve, consequently, the richness of shapes of the term structure they are able to replicate. Different models are studied and simulations of the possible term structures we are able to replicate are presented; in particular, the provided term structures show shapes with bumps both upwards and downwards, shapes with different degrees of curvature and with two modes. Finally, we present the problem of finding mimicking factors [a vector R_t = (r_{t,t+2}, ..., r_{t,t+n}) of interest rates at different maturities] for stochastic parameters and switching regimes: this is an interesting problem, from a statistical point of view, because of the observability of the interest rates.
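To fix notation for the ingredients this abstract relies on, a schematic rendering (symbols chosen here, not taken from the thesis) of a CAR state process and an exponential-affine SDF is:

```latex
\mathbb{E}\!\left[ e^{u' y_{t+1}} \mid y_t \right] = \exp\!\big( a(u)' y_t + b(u) \big),
\qquad
M_{t,t+1} = \exp\!\big( \alpha_t + \beta' y_{t+1} \big).
```

With these two pieces, the date-t price of an exponential payoff e^{u' y_{t+1}} is E[M_{t,t+1} e^{u' y_{t+1}} | y_t] = exp(α_t + a(u+β)' y_t + b(u+β)), again exponential-affine in the state, which is why pricing stays in closed form once the conditional Laplace transform is known.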
Styles APA, Harvard, Vancouver, ISO, etc.
46

Gao, Sheng. « Latent factor models for link prediction problems ». Paris 6, 2012. http://www.theses.fr/2012PA066056.

Texte intégral
Résumé :
Avec la croissance d'Internet et celle des médias sociaux, les données relationnelles, qui décrivent un ensemble d'objets liés entre eux par différentes relations, sont devenues courantes. En conséquence, une grande variété d'applications, telles que les systèmes de recommandation, l'analyse de réseaux sociaux, la fouille de données Web ou la bioinformatique, ont motivé l'étude de techniques d'apprentissage relationnel. Parmi le large éventail de ces techniques, nous traitons dans cette thèse le problème de prédiction de liens. Le problème de la prédiction de liens est une tâche fondamentale de l'apprentissage relationnel, consistant à prédire la présence ou l'absence de liens entre objets, à partir de la topologie du réseau et/ou des attributs des objets. Cependant, la complexité et la sparsité des réseaux font de cette tâche un problème ardu. Dans cette thèse, nous proposons des solutions pour faciliter l'apprentissage dans le cas de différentes applications. Dans le chapitre 3, nous présentons un cadre unifié afin de traiter le problème générique de prédiction de liens. Nous discutons les différentes caractéristiques des modèles des points de vue probabiliste et computationnel. Ensuite, en nous focalisant sur les applications traitées dans cette thèse, nous proposons des modèles à facteurs latents pour deux types de tâches de prédiction de liens : (i) prédiction structurelle de liens et (ii) prédiction temporelle de liens. Concernant la prédiction structurelle de liens, nous proposons dans le chapitre 4 une nouvelle application que nous appelons Prédiction de Motifs de Liens (PML). Nous introduisons un facteur latent spécifique pour différents types de relations en plus de facteurs latents pour caractériser les objets. Nous présentons un modèle de factorisation tensorielle dans un cadre bayésien pour révéler la causalité intrinsèque de l'interaction sociale dans les réseaux multi-relationnels. De plus, étant donné la structure complexe des données relationnelles, nous proposons dans le chapitre 5 un modèle qui incorpore simultanément l'effet des facteurs de caractéristiques latentes et l'impact de la structure en blocs du réseau. Concernant la prédiction temporelle de liens dans les réseaux dynamiques, nous proposons dans le chapitre 6 un modèle latent unifié qui intègre des sources d'information multiples, la topologie globale du réseau, les attributs des noeuds et les informations de proximité du réseau afin de capturer les motifs d'évolution temporelle des liens. Ce modèle joint repose sur la factorisation latente de matrices et sur une technique de régularisation pour graphes. Chaque modèle proposé dans cette thèse a des performances comparables ou supérieures aux méthodes existantes. Des évaluations complètes sont conduites sur des jeux de données réels pour démontrer leurs performances supérieures aux méthodes de base. La quasi-totalité d'entre eux ont fait l'objet d'une publication dans des conférences nationales ou internationales.
With the rising of Internet as well as modern social media, relational data has become ubiquitous, which consists of those kinds of data where the objects are linked to each other with various relation types. Accordingly, various relational learning techniques have been studied in a large variety of applications with relational data, such as recommender systems, social network analysis, Web mining or bioinformatic. Among a wide range of tasks encompassed by relational learning, we address the problem of link prediction in this thesis. Link prediction has arisen as a fundamental task in relational learning, which considers to predict the presence or absence of links between objects in the relational data based on the topological structure of the network and/or the attributes of objects. However, the complexity and sparsity of network structure make this a great challenging problem. In this thesis, we propose solutions to reduce the difficulties in learning and fit various models into corresponding applications. Basically, in Chapter 3 we present a unified framework of latent factor models to address the generic link prediction problem, in which we specifically discuss various configurations in the models from computational perspective and probabilistic view. Then, according to the applications addressed in this dissertation, we propose different latentfactor models for two classes of link prediction problems: (i) structural link prediction. (ii) temporal link prediction. In terms of structural link prediction problem, in Chapter 4 we define a new task called Link Pattern Prediction (LPP) in multi-relational networks. By introducing a specific latent factor for different relation types in addition to using latent feature factors to characterize objects, we develop a computational tensor factorization model, and the probabilistic version with its Bayesian treatment to reveal the intrinsic causality of interaction patterns in multi-relational networks. Moreover, considering the complex structural patterns in relational data, in Chapter 5 we propose a novel model that simultaneously incorporates the effect of latent feature factors and the impact from the latent cluster structures in the network, and also develop an optimization transfer algorithm to facilitate the model learning procedure. In terms of temporal link prediction problem in time-evolving networks, in Chapter 6 we propose a unified latent factor model which integrates multiple information sources in the network, including the global network structure, the content of objects and the graph proximity information from the network to capture the time-evolving patterns of links. This joint model is constructed based on matrix factorization and graph regularization technique. Each model proposed in this thesis achieves state-of-the-art performances, extensive experiments are conducted on real world datasets to demonstrate their significant improvements over baseline methods. Almost all of themhave been published in international or national peer-reviewed conference proceedings
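As a toy illustration of the latent-factor principle behind these models (not the tensor or block-structured variants developed in the thesis), a logistic matrix factorisation for a single-relation network might be sketched as:

```python
import numpy as np
from scipy.special import expit

def logistic_mf_link_prediction(A, k=8, lr=0.05, reg=0.1, n_iter=200, seed=0):
    """Minimal latent-factor sketch for link prediction: model
    p(A_ij = 1) = sigmoid(u_i . v_j) and fit U, V by gradient ascent
    on the regularised Bernoulli log-likelihood."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    U = 0.1 * rng.standard_normal((n, k))
    V = 0.1 * rng.standard_normal((n, k))
    for _ in range(n_iter):
        E = A - expit(U @ V.T)             # residual between observed links and predictions
        U += lr * (E @ V - reg * U)
        V += lr * (E.T @ U - reg * V)
    return expit(U @ V.T)                  # predicted link probabilities
```

Scoring the entries of the returned matrix that correspond to unobserved pairs gives a ranking of candidate links, which is the basic task the richer models in the thesis address.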
Styles APA, Harvard, Vancouver, ISO, etc.
47

Pasquiou, Alexandre. « Deciphering the neural bases of language comprehension using latent linguistic representations ». Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPASG041.

Texte intégral
Résumé :
Au cours des dernières décennies, les modèles de langage (MLs) ont atteint des performances équivalentes à celles de l'homme sur plusieurs tâches. Ces modèles peuvent générer des représentations vectorielles qui capturent diverses propriétés linguistiques des mots d'un texte, telles que la sémantique ou la syntaxe. Les neuroscientifiques ont donc mis à profit ces progrès et ont commencé à utiliser ces modèles pour explorer les bases neurales de la compréhension du langage. Plus précisément, les représentations des ML calculées à partir d'une histoire sont utilisées pour modéliser les données cérébrales d'humains écoutant la même histoire, ce qui permet l'examen de plusieurs niveaux de traitement du langage dans le cerveau. Si les représentations du ML s'alignent étroitement avec une région cérébrale, il est probable que le modèle et la région codent la même information. En utilisant les données cérébrales d'IRMf de participants américains écoutant l'histoire du Petit Prince, cette thèse 1) examine les facteurs influant l'alignement entre les représentations des MLs et celles du cerveau, ainsi que 2) les limites de telles alignements. La comparaison de plusieurs MLs pré-entraînés et personnalisés (GloVe, LSTM, GPT-2 et BERT) a révélé que les Transformers s'alignent mieux aux données d'IRMf que LSTM et GloVe. Cependant, aucun d'entre eux n'est capable d'expliquer tout le signal IRMf, suggérant des limites liées au paradigme d'encodage ou aux MLs. En étudiant l'architecture des Transformers, nous avons constaté qu'aucune région cérébrale n'est mieux expliquée par une couche ou une tête d'attention spécifique. Nos résultats montrent que la nature et la quantité de données d'entraînement affectent l'alignement. Ainsi, les modèles pré-entraînés sur de petits ensembles de données ne sont pas efficaces pour capturer les activations cérébrales. Nous avons aussi montré que l'entraînement des MLs influence leur capacité à s'aligner aux données IRMf et que la perplexité n'est pas un bon prédicteur de leur capacité à s'aligner. Cependant, entraîner les MLs améliore particulièrement leur performance d'alignement dans les régions coeur de la sémantique, indépendamment de l'architecture et des données d'entraînement. Nous avons également montré que les représentations du cerveau et des MLs convergent d'abord pendant l'entraînement du modèle avant de diverger l'une de l'autre. Cette thèse examine en outre les bases neurales de la syntaxe, de la sémantique et de la sensibilité au contexte en développant une méthode qui peut sonder des dimensions linguistiques spécifiques. Cette méthode utilise des MLs restreints en information, c'est-à-dire des architectures entraînées sur des espaces de représentations contenant un type spécifique d'information. Tout d'abord, l'entraînement de MLs sur des représentations sémantiques et syntaxiques a révélé un bon alignement dans la plupart du cortex mais avec des degrés relatifs variables. La quantification de cette sensibilité relative à la syntaxe et à la sémantique a montré que les régions cérébrales les plus sensibles à la syntaxe sont plus localisées, contrairement au traitement de la sémantique qui reste largement distribué dans le cortex. Une découverte notable de cette thèse est que l'étendue des régions cérébrales sensibles à la syntaxe et à la sémantique est similaire dans les deux hémisphères. Cependant, l'hémisphère gauche a une plus grande tendance à distinguer le traitement syntaxique et sémantique par rapport à l'hémisphère droit. 
Dans un dernier ensemble d'expériences, nous avons conçu une méthode qui contrôle les mécanismes d'attention dans les Transformers afin de générer des représentations qui utilisent un contexte de taille fixe. Cette approche fournit des preuves de la sensibilité au contexte dans la plupart du cortex. De plus, cette analyse a révélé que les hémisphères gauche et droit avaient tendance à traiter respectivement des informations contextuelles plus courtes et plus longues
In the last decades, language models (LMs) have reached human-level performance on several tasks. They can generate rich representations (features) that capture various linguistic properties such as semantics or syntax. Following these improvements, neuroscientists have increasingly used them to explore the neural bases of language comprehension. Specifically, LM features computed from a story are used to fit the brain data of humans listening to the same story, allowing the examination of multiple levels of language processing in the brain. If LM features closely align with a specific brain region, then it suggests that both the model and the region are encoding the same information. LM-brain comparisons can then teach us about language processing in the brain. Using the fMRI brain data of fifty US participants listening to "The Little Prince" story, this thesis 1) investigates the reasons why LMs' features fit brain activity and 2) examines the limitations of such comparisons. The comparison of several pre-trained and custom-trained LMs (GloVe, LSTM, GPT-2 and BERT) revealed that Transformers fit fMRI brain data better than LSTM and GloVe. Yet, none are able to explain all the fMRI signal, suggesting either limitations related to the encoding paradigm or to the LMs. Focusing specifically on Transformers, we found that no brain region is better fitted by a specific attentional head or layer. Our results caution that the nature and the amount of training data greatly affect the outcome, indicating that using off-the-shelf models trained on small datasets is not effective in capturing brain activations. We showed that LMs' training influences their ability to fit fMRI brain data, and that perplexity is not a good predictor of brain score. Still, training LMs particularly improves their fitting performance in core semantic regions, irrespective of the architecture and training data. Moreover, we showed a partial convergence between the brain's and the LMs' representations: they first converge during model training before diverging from one another. This thesis further investigates the neural bases of syntax, semantics and context-sensitivity by developing a method that can probe specific linguistic dimensions. This method makes use of "information-restricted LMs", that is, customized LM architectures trained on feature spaces containing a specific type of information, in order to fit brain data. First, training LMs on semantic and syntactic features revealed a good fitting performance in a widespread network, albeit with varying relative degrees. The quantification of this relative sensitivity to syntax and semantics showed that brain regions most attuned to syntax tend to be more localized, while semantic processing remains widely distributed over the cortex. One notable finding from this analysis was that the extent of semantic and syntactic sensitive brain regions was similar across hemispheres. However, the left hemisphere had a greater tendency to distinguish between syntactic and semantic processing compared to the right hemisphere. In a last set of experiments we designed "masked-attention generation", a method that controls the attention mechanisms in transformers, in order to generate latent representations that leverage fixed-size context. This approach provides evidence of context-sensitivity across most of the cortex. Moreover, this analysis found that the left and right hemispheres tend to process shorter and longer contextual information respectively.
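The model-to-brain fitting described here is typically implemented as voxel-wise regularised regression; the sketch below is a generic version under simplifying assumptions (LM features already aligned to the fMRI sampling, plain 5-fold splits), not the exact pipeline of the thesis.

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold

def fit_encoding_model(X, Y, alphas=np.logspace(-1, 4, 10)):
    """Voxel-wise encoding sketch: map LM features X (n_scans x n_features)
    to fMRI responses Y (n_scans x n_voxels) with cross-validated ridge
    regression and report mean out-of-fold correlations ('brain scores')."""
    scores = np.zeros(Y.shape[1])
    for train, test in KFold(n_splits=5).split(X):
        model = RidgeCV(alphas=alphas).fit(X[train], Y[train])
        pred = model.predict(X[test])
        for v in range(Y.shape[1]):
            scores[v] += np.corrcoef(pred[:, v], Y[test][:, v])[0, 1] / 5
    return scores
```

Comparing such scores across feature spaces (full LM features vs. information-restricted ones) is the kind of contrast the thesis uses to dissect syntax, semantics and context-sensitivity.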
Styles APA, Harvard, Vancouver, ISO, etc.
48

Cuesta, Ramirez Jhouben Janyk. « Optimization of a computationally expensive simulator with quantitative and qualitative inputs ». Thesis, Lyon, 2022. http://www.theses.fr/2022LYSEM010.

Texte intégral
Résumé :
Dans cette thèse, les problèmes mixtes couteux sont abordés par le biais de processus gaussiens où les variables discrètes sont relaxées en variables latentes continues. L'espace continu est plus facilement exploité par les techniques classiques d'optimisation bayésienne que ne le serait un espace mixte. Les variables discrètes sont récupérées soit après l'optimisation continue, soit simultanément avec une contrainte supplémentaire de compatibilité continue-discrète qui est traitée avec des lagrangiens augmentés. Plusieurs implémentations possibles de ces optimiseurs mixtes bayésiens sont comparées. En particulier, la reformulation du problème avec des variables latentes continues est mise en concurrence avec des recherches travaillant directement dans l'espace mixte. Parmi les algorithmes impliquant des variables latentes et un lagrangien augmenté, une attention particulière est consacrée aux multiplicateurs de lagrange pour lesquels des techniques d'estimation locale et globale sont étudiées. Les comparaisons sont basées sur l'optimisation répétée de trois fonctions analytiques et sur une application mécanique concernant la conception d'une poutre. Une étude supplémentaire pour l'application d'une stratégie d'optimisation mixte proposée dans le domaine de l'auto-calibrage mixte est faite. Cette analyse s'inspire d'une application de quantification des radionucléides, qui définit une fonction inverse spécifique nécessitant l'étude de ses multiples propriétés dans le scenario continu. une proposition de différentes stratégies déterministes et bayésiennes a été faite en vue d'une définition complète dans un contexte de variables mixtes
In this thesis, costly mixed problems are approached through Gaussian processes where the discrete variables are relaxed into continuous latent variables. The continuous space is more easily harvested by classical Bayesian optimization techniques than a mixed space would be. Discrete variables are recovered either subsequently to the continuous optimization, or simultaneously with an additional continuous-discrete compatibility constraint that is handled with augmented Lagrangians. Several possible implementations of such Bayesian mixed optimizers are compared. In particular, the reformulation of the problem with continuous latent variables is put in competition with searches working directly in the mixed space. Among the algorithms involving latent variables and an augmented Lagrangian, particular attention is devoted to the Lagrange multipliers, for which local and global estimation techniques are studied. The comparisons are based on the repeated optimization of three analytical functions and a mechanical application regarding a beam design. An additional study applies the proposed mixed optimization strategy to mixed self-calibration. This analysis was inspired by an application in radionuclide quantification, which defined a specific inverse function requiring the study of its multiple properties in the continuous scenario. Different deterministic and Bayesian strategies are proposed towards a complete definition in a mixed-variable setup.
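For the simultaneous recovery strategy mentioned above, the continuous-discrete compatibility constraint can be folded into an augmented Lagrangian of the generic form below; the notation (projection onto the admissible discrete levels D, penalty ρ) is chosen here for illustration and is not taken from the thesis.

```latex
L_{\rho}(x, z; \lambda) = f\big(x, d(z)\big) + \lambda^{\top} c(z) + \tfrac{\rho}{2}\, \lVert c(z) \rVert^{2},
\qquad
c(z) = z - \operatorname{proj}_{\mathcal{D}}(z),
```

where z are the continuous latent variables relaxing the discrete inputs, d(z) maps them back to admissible levels, and the multipliers λ are the quantities whose local and global estimation the thesis compares.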
Styles APA, Harvard, Vancouver, ISO, etc.
49

Ödling, David, et Arvid Österlund. « Factorisation of Latent Variables in Word Space Models : Studying redistribution of weight on latent variables ». Thesis, KTH, Skolan för teknikvetenskap (SCI), 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-153776.

Texte intégral
Résumé :
The ultimate goal of any DSM is a scalable and accurate representation of lexical semantics. Recent developments due to Bullinaria & Levy (2012) and Caron (2001) indicate that the accuracy of such models can be improved by redistribution of weight on the principal components. However, this method is poorly understood and seldom replicated, owing to the computationally expensive dimension reduction and the puzzling nature of the results. This thesis aims to explore the nature of these results. Beginning by reproducing the results in Bullinaria & Levy (2012), we move on to deepen the understanding of these results, quantitatively as well as qualitatively, using various forms of the BLESS test, and juxtapose these with previous results. The main result of this thesis is the verification of the 100% score on the TOEFL test and 91.5% on a paradigmatic version of the BLESS test. Our qualitative tests indicate that the redistribution of weight away from the first principal components differs slightly between word categories, hence the improvement in the TOEFL and BLESS results. While we do not find any significant relation between word frequencies and weight distribution, we find an empirical relation for the optimal weight distribution. Based on these results, we suggest a range of further studies to better understand these phenomena.
Målet med alla semantiska fördelningsmodeller (DSMs) är en skalbar och precis representation av semantiska relationer. Nya rön från Bullinaria & Levy (2012) och Caron (2001) indikerar att man kan förbättra prestandan avsevärt genom att omfördela vikten ifrån principalkomponenterna med störst varians mot de lägre. Varför metoden fungerar är dock fortfarande oklart, delvis på grund av höga beräkningskostnader för PCA men även på grund av att resultaten strider mot tidigare praxis. Vi börjar med att replikera resultaten i Bullinaria & Levy (2012) för att sedan fördjupa oss i resultaten, både kvantitativt och kvalitativt, genom att använda oss av BLESS-testet. Huvudresultaten av denna studie är verifiering av 100% på TOEFL-testet och ett nytt resultat på en paradigmatisk variant av BLESS-testet på 91.5%. Våra resultat tyder på att en omfördelning av vikten ifrån de första principalkomponenterna leder till en förändring i fördelningen sinsemellan de semantiska relationerna, vilket delvis förklarar förbättringen i TOEFL-resultaten. Vidare finner vi i enlighet med tidigare resultat ingen signifikant relation mellan ordfrekvenser och viktomfördelning. Utifrån dessa resultat föreslår vi en rad experiment som kan ge vidare insikt till dessa intressanta resultat.
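The weight-redistribution scheme studied here (a Caron-style exponent on the singular values) can be sketched as follows; the PPMI preprocessing, dimensionality and exponent value are placeholder choices, not the settings used in the thesis.

```python
import numpy as np

def caron_weighted_vectors(M, k=300, p=0.25):
    """Reduce a (e.g. PPMI-weighted) word-context co-occurrence matrix M with a
    truncated SVD and raise the singular values to an exponent p; p < 1 shifts
    weight away from the first (highest-variance) principal components."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] * (s[:k] ** p)   # one re-weighted vector per row word of M
```

Sweeping p and evaluating the resulting vectors on TOEFL- or BLESS-style tests is the kind of experiment the thesis reports.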
Styles APA, Harvard, Vancouver, ISO, etc.
50

PENNONI, FULVIA. « Issues on the Estimation of Latent Variable and Latent Class Models with Social Science Applications ». Doctoral thesis, Università degli Studi di Firenze, 2004. http://hdl.handle.net/10281/46004.

Texte intégral
Résumé :
This Ph.D. work is made of different research problems which have in common the presence of latent variables. Chapters 1 and 2 provide an accessible primer on the models developed in the subsequent chapters. Chapters 3 and 4 are written in the form of articles. A list of references is provided at the end of each chapter and a general bibliography is also reported as the last part of the work. The first chapter introduces the models of dependence and association and their interpretation using graphical models, which have proved useful to display in graphical form the essential relationships between variables. The structure of the graph yields direct information about various aspects related to the statistical analysis. At first we provide the necessary notation and background on graph theory. We describe the Markov properties that associate a set of conditional independence assumptions to an undirected and directed graph. Such definitions do not depend on any particular distributional form and hence can be applied to models with both discrete and continuous random variables. In particular we consider models for Gaussian continuous variables where the structure is assumed to be adequately described via a vector of means and by a covariance matrix. The concentration and covariance graph models are illustrated. The specification of the complex multivariate distribution through univariate regressions induced by a Directed Acyclic Graph (DAG) can be regarded as a simplification, as the single regression models typically involve considerably fewer variables than the whole multivariate vector. In the present work it is shown that such models are a subclass of the models developed for linear analysis known as Structural Equation Models (SEM). The chapter is concluded by some bibliographical notes. Chapter 2 takes into account the latent class model for measuring one or more latent categorical variables by means of a set of observed categorical variables. After some notes on model identifiability and estimation we consider the model extension to study latent changes over time when longitudinal studies are used. The hidden Markov model is presented, consisting of hidden state variables and observed variables both varying over time. In Chapter 3 we consider in detail the DAG Gaussian models in which one of the variables is not observed. Once the condition for global identification has been satisfied, we show how the incomplete log-likelihood of the observed data can be maximized using the EM algorithm. As the EM does not provide the matrix of the second derivatives, we propose a method for obtaining an explicit formula for the observed information matrix using the missing information principle. We illustrate the models with two examples on real data concerning educational attainment and criminological research. The first appendix of the chapter reports details on the calculations of the quantities necessary for the E-step of the EM algorithm. The second appendix reports the code in the statistical software R used to obtain the estimated standard errors, which may be implemented in the R package called ggm. Chapter 4 starts from the practical problem of classifying criminal activity. The latent class cluster model is extended by proposing a latent class model that also incorporates the longitudinal structure of the data, using a method similar to a local likelihood approach. The data set is taken from the Home Office Offenders Index of England and Wales.
It contains the complete criminal histories of a sample of those born in 1953 and followed for forty years.
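As a minimal illustration of the latent class machinery this chapter extends (plain EM for a latent class model with binary indicators, not the longitudinal local-likelihood version proposed in the thesis):

```python
import numpy as np

def latent_class_em(X, K=2, n_iter=200, seed=0):
    """EM sketch for a latent class model with binary items:
    p(x_i) = sum_k pi_k * prod_j theta_kj^x_ij * (1 - theta_kj)^(1 - x_ij)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    pi = np.full(K, 1.0 / K)
    theta = rng.uniform(0.3, 0.7, size=(K, p))
    for _ in range(n_iter):
        # E-step: posterior probability of each latent class for each subject
        log_lik = X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T + np.log(pi)
        post = np.exp(log_lik - log_lik.max(axis=1, keepdims=True))
        post /= post.sum(axis=1, keepdims=True)
        # M-step: update class proportions and conditional response probabilities
        pi = post.mean(axis=0)
        theta = np.clip((post.T @ X) / post.sum(axis=0)[:, None], 1e-6, 1 - 1e-6)
    return pi, theta, post
```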
Styles APA, Harvard, Vancouver, ISO, etc.