Dissertations / Theses on the topic 'Chain graph models'


Consult the top 23 dissertations / theses for your research on the topic 'Chain graph models.'


1

Drton, Mathias. "Maximum likelihood estimation in Gaussian AMP chain graph models and Gaussian ancestral graph models /." Thesis, Connect to this title online; UW restricted, 2004. http://hdl.handle.net/1773/8952.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Levitz, Michael. "Separation, completeness, and Markov properties for AMP chain graph models /." Thesis, Connect to this title online; UW restricted, 2000. http://hdl.handle.net/1773/9564.

Full text
3

Gastaldello, Mattia. "Enumeration Algorithms and Graph Theoretical Models to Address Biological Problems Related To Symbiosis." Thesis, Lyon, 2018. http://www.theses.fr/2018LYSE1019/document.

Full text
Abstract:
In this thesis, we address two graph theoretical problems connected to two different biological problems, both related to symbiosis (two organisms live in symbiosis if they have a close and long-term interaction). The first problem is related to the size of a minimum cover by "chain subgraphs" of a bipartite graph. A chain graph is a bipartite graph whose nodes can be ordered by neighbourhood inclusion. In biological terms, the size of a minimum cover by chain subgraphs represents the number of genetic factors involved in the phenomenon of Cytoplasmic Incompatibility (CI) induced by some parasitic bacteria in their insect hosts. CI results in the impossibility of giving birth to a healthy offspring when an infected male mates with an uninfected female. In the first half of the thesis we address three related problems. One is the enumeration of all the maximal edge-induced chain subgraphs of a bipartite graph G, for which we provide a polynomial delay algorithm with a delay of O(n^2 m), where n is the number of nodes and m the number of edges of G. Furthermore, we show that (n/2)! and 2^(\sqrt{m} \log m) bound the number of maximal chain subgraphs of G and use them to establish the input-sensitive complexity of the algorithm. The second problem we treat is finding the minimum number of chain subgraphs needed to cover all the edges of a bipartite graph. To solve this NP-hard problem, we provide an exact exponential algorithm which runs in time O^*((2+c)^m), for every c > 0, by a procedure which uses our algorithm and an inclusion-exclusion technique (by O^* we denote standard big-O notation but omitting polynomial factors). Notice that, since a cover by chain subgraphs is a family of subsets of edges, the existence of an algorithm whose complexity is close to 2^m is not obvious. Indeed, the basic search space would have size 2^(2^m), which corresponds to all families of subsets of edges of a graph on m edges.
The third problem is the enumeration of all minimal covers by chain subgraphs. We show that it is possible to enumerate all such minimal covers of G in time O([(M+1)|S|]^[\log((M+1)|S|)]), where |S| is the number of minimal covers of G and M the maximum number of chain graphs in a minimal cover. We then present the relation between the second problem and the computation of the interval order dimension of a bipartite poset. We give an interpretation of our results in the context of poset and interval poset dimension... [etc]
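The neighbourhood-inclusion characterization of chain graphs quoted in the abstract lends itself to a direct test: a bipartite graph is a chain graph exactly when the neighbourhoods of one side are totally ordered by inclusion. A minimal illustrative sketch (not code from the thesis):

```python
def is_chain_graph(adj):
    """Test whether a bipartite graph is a chain graph, i.e. whether the
    neighbourhoods of the left-side nodes are totally ordered by inclusion.

    adj maps each left node to the set of its right-side neighbours.
    """
    neighbourhoods = sorted(adj.values(), key=len)
    return all(small <= big for small, big in zip(neighbourhoods, neighbourhoods[1:]))

# Nested "staircase" neighbourhoods form a chain graph:
print(is_chain_graph({"u1": {1}, "u2": {1, 2}, "u3": {1, 2, 3}}))  # True
# Incomparable neighbourhoods do not:
print(is_chain_graph({"u1": {1}, "u2": {2}}))  # False
```

Sorting by neighbourhood size and checking adjacent pairs suffices, since inclusion is transitive.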
4

NICOLUSSI, FEDERICA. "Marginal parametrizations for conditional independence models and graphical models for categorical data." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2013. http://hdl.handle.net/10281/43679.

Full text
Abstract:
Graphical models (GMs) for categorical data are useful for representing conditional independencies through graphs. Parametric marginal models for categorical data have useful properties for the asymptotic theory. This work is focused on finding which GMs can be represented by marginal parametrizations. Following Theorem 1 of Bergsma, Rudas and Németh [9], we have proposed a method to identify when a GM is parametrizable according to a marginal model. We have applied this method to the four types of GMs for chain graphs, summarized by Drton [22]. In particular, with regard to the so-called GMs of type II and type III, we have found the subclasses of these models which are parametrizable with marginal models and are therefore smooth. As for the so-called GMs of type I and type IV, it is known in the literature that these models are smooth, and we have provided a new proof of this result. Finally, we have applied the main results concerning the GM of type II to the EVS data-set.
5

Sonntag, Dag. "Chain Graphs : Interpretations, Expressiveness and Learning Algorithms." Doctoral thesis, Linköpings universitet, Databas och informationsteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-125921.

Full text
Abstract:
Probabilistic graphical models are currently one of the most commonly used architectures for modelling and reasoning with uncertainty. The most widely used subclass of these models is directed acyclic graphs, also known as Bayesian networks, which are used in a wide range of applications both in research and industry. Directed acyclic graphs do, however, have a major limitation, which is that only asymmetric relationships, namely cause and effect relationships, can be modelled between their variables. A class of probabilistic graphical models that tries to address this shortcoming is chain graphs, which include two types of edges in the models representing both symmetric and asymmetric relationships between the variables. This allows for a wider range of independence models to be modelled and depending on how the second edge is interpreted, we also have different so-called chain graph interpretations. Although chain graphs were first introduced in the late eighties, most research on probabilistic graphical models naturally started in the least complex subclasses, such as directed acyclic graphs and undirected graphs. The field of chain graphs has therefore been relatively dormant. However, due to the maturity of the research field of probabilistic graphical models and the rise of more data-driven approaches to system modelling, chain graphs have recently received renewed interest in research. In this thesis we provide an introduction to chain graphs where we incorporate the progress made in the field. More specifically, we study the three chain graph interpretations that exist in research in terms of their separation criteria, their possible parametrizations and the intuition behind their edges. In addition to this we also compare the expressivity of the interpretations in terms of representable independence models as well as propose new structure learning algorithms to learn chain graph models from data.
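A chain graph mixes directed and undirected edges, and its undirected part partitions the variables into "chain components", the unit at which the different interpretations and parametrizations mentioned above operate. As an illustrative sketch (assumptions mine, not code from the thesis), chain components are the connected components of the subgraph that keeps only the undirected edges:

```python
from collections import defaultdict

def chain_components(nodes, undirected_edges):
    """Chain components of a chain graph: the connected components of the
    subgraph containing only the undirected edges (the directed edges of
    the chain graph only run between components)."""
    adj = defaultdict(set)
    for a, b in undirected_edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first search over undirected edges only
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(adj[v] - component)
        seen |= component
        components.append(component)
    return components

# A -- B undirected; any directed edges (e.g. B -> C, D -> C) are ignored here:
print(chain_components(["A", "B", "C", "D"], [("A", "B")]))
```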
6

Di, Natale Anna. "Stochastic models and graph theory for Zipf's law." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/17065/.

Full text
Abstract:
In this work we study Zipf's law from both an applied and a theoretical point of view. This empirical law states that the rank-frequency (RF) distribution of the words of a text follows a power law with exponent -1. On the theoretical side we treat two classes of models capable of reproducing power laws in their probability distributions: generalizations of Pólya urns and SSR (Sample Space Reducing) processes. For the latter we give a formalization in terms of Markov chains. We also propose a population-dynamics model capable of unifying and reproducing the results of the three SSR processes present in the literature. We then move to a quantitative analysis of the RF behaviour of the words of a corpus of texts. In this case the RF does not follow a pure power law but shows a double regime, which can be represented by a power law with a changing exponent. We investigated whether the analysis of the RF behaviour can be linked to the topological properties of a graph: starting from a corpus of texts we built an adjacency network in which each word is linked to the word that follows it. A topological analysis of the graph structure produced results that seem to confirm the hypothesis that its structure is related to the change of slope of the RF. This result may lead to developments in the study of language and the human mind. Moreover, since the graph structure appears to contain components grouping words by meaning, a deeper study could lead to developments in automatic text comprehension (text mining).
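The SSR (Sample Space Reducing) process mentioned in the abstract admits a very short simulation: start at the top state, jump to a uniformly chosen strictly lower state until state 1 is reached, then restart; the visit frequencies of the states follow Zipf's law. A hedged sketch (parameter values are illustrative):

```python
import random
from collections import Counter

def ssr_visits(n_states=100, n_runs=20000, seed=0):
    """Simulate the Sample Space Reducing (SSR) process.  Each run starts
    at the top state and jumps to a uniformly chosen strictly lower state
    until state 1 is reached.  Visit frequencies follow Zipf's law ~ 1/i."""
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(n_runs):
        state = n_states
        while state > 1:
            visits[state] += 1
            state = rng.randint(1, state - 1)  # sample space shrinks
        visits[1] += 1
    return visits

v = ssr_visits()
# Zipf check: visit counts should roughly halve each time the state doubles.
print(v[1], v[2], v[4], v[8])
```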
7

Moghadasin, Babak. "An Approach on Learning Multivariate Regression Chain Graphs from Data." Thesis, Linköpings universitet, Databas och informationsteknik, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-94019.

Full text
Abstract:
The necessity of modeling is vital for the purpose of reasoning and diagnosing in complex systems, since the human mind might sometimes have a limited capacity and an inability to be objective. The chain graph (CG) class is a powerful and robust tool for modeling real-world applications. It is a type of probabilistic graphical model (PGM) and has multiple interpretations, each with a distinct Markov property. This thesis deals with the multivariate regression chain graph (MVR-CG) interpretation. The main goal of this thesis is to implement and evaluate the results of the MVR-PC-algorithm proposed by Sonntag and Peña in 2012. This algorithm uses a constraint-based approach to learn a MVR-CG from data. In this study the MVR-PC-algorithm is implemented and tested to see whether the implementation is correct. For this purpose, it is run on several different independence models that can be perfectly represented by MVR-CGs. The learned CG and the independence model of the given probability distribution are then compared to ensure that they are in the same Markov equivalence class. Additionally, to check how accurately the algorithm learns a MVR-CG from data, a large number of samples are passed to the algorithm. The results are analyzed based on the number of nodes and the average number of adjacents per node. The accuracy of the algorithm is measured by the precision and recall of independencies and dependencies. In general, the higher the number of samples given to the algorithm, the more accurate the learned MVR-CGs become. In addition, when the graph is sparse, the result becomes significantly more accurate. The number of nodes can affect the results slightly: when the number of nodes increases it can lead to better results, if the average number of adjacents is fixed.
On the other hand, if the number of nodes is fixed and the average number of adjacents increases, the effect is more considerable and the accuracy of the results dramatically declines. Moreover, the type of the random variables can affect the results. Given samples with discrete variables, the recall of independencies is higher and the precision of independencies is lower. Conversely, given samples with continuous variables, the recall of independencies is lower but the precision of independencies is higher.
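The precision and recall of independencies used above as accuracy measures reduce to a standard set comparison between the (in)dependencies read off the learned graph and those of the true model. An illustrative helper (not code from the thesis; the independence statements are shown as opaque labels):

```python
def precision_recall(learned, true):
    """Precision and recall of the independencies (or dependencies) read
    off a learned graph, against those of the true independence model.
    Both arguments are sets of independence statements."""
    true_positives = len(learned & true)
    precision = true_positives / len(learned) if learned else 1.0
    recall = true_positives / len(true) if true else 1.0
    return precision, recall

# Two of three learned independencies are correct; one true one is missed:
print(precision_recall({"A_|_B", "A_|_C", "B_|_C"}, {"A_|_B", "A_|_C", "A_|_D"}))
```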
8

PENNONI, FULVIA. "Metodi statistici multivariati applicati all'analisi del comportamento dei titolari di carta di credito di tipo revolving." Bachelor's thesis, Universita' degli studi di Perugia, 2000. http://hdl.handle.net/10281/50024.

Full text
Abstract:
In this thesis the use of graphical models is proposed for the analysis of credit scoring. In particular the application relates to behavioural scoring, which Thomas (1999) defines as 'the systems and models that allow lenders to make better decisions in managing existing clients by forecasting their future performance'. The multivariate statistical models proposed for the application, named chain graph models, allow us to model in a proper way the relations between the variables describing the behaviour of the credit card holders. They are based on a log-linear expansion of the density function of the variables. They allow one to depict oriented associations between subsets of variables, to detect the structure which gives the most parsimonious description of the relations between variables, and to model simultaneously more than one response variable. They are useful in particular when there is a partial ordering between the variables such that they can be divided into purely exogenous, intermediate and response variables. In graphical models the independence structure is represented by a graph. The variables are represented by nodes, joined by edges showing the dependences in probability among the variables. A missing edge means that two nodes are independent given the other nodes. This class of models is very useful for the theory which combines them with expert systems: once the model has been selected, it is possible to link it to an expert system to model the joint and marginal probability distributions of the variables. The first chapter introduces the statistical models most used for credit scoring analysis. The second chapter considers categorical variables, since the information related to the credit card holders is stored in a contingency table. It also illustrates the notions of independence between two variables and of conditional independence among more than two variables.
The odds ratio is introduced as a measure of association between two variables; it is the basis of the model formulation. The third chapter introduces the log-linear and logistic models belonging to the family of generalized linear models. They are multivariate methods allowing one to study the association between variables by considering them simultaneously. A log-linear parameterization is described in detail. Its advantage is that it allows us to take into account the ordinal scale on which some of the categorical variables are measured. This is also useful to find the best categorization of the continuous variables. The results related to the maximum likelihood estimation of the model parameters are mentioned, as well as the iterative numerical algorithms used to solve the likelihood equations with respect to the unknown parameters. The score test is illustrated to evaluate the goodness of fit of the model to the data. Chapter 4 introduces some main concepts of graph theory in connection with the properties which allow us to depict the model through a graph, showing the interpretative advantages. The sparsity of the contingency table is also mentioned, which arises when there are many cells, and the collapsibility conditions are considered as well. Finally, Chapter 5 illustrates the application of the proposed methodology on a sample composed of 70000 revolving credit card holders. The data are released by one of the biggest Italian financial companies operating in this sector. The variables are the socioeconomic characteristics of the credit card holder, taken from the form filled in by the customer when applying for credit. Every month the company classifies the customers as active, inactive or asleep according to the balance of the account.
The application of the proposed method was devoted to finding the conditional independences existing between the variables, in particular with respect to the two responses, which are the balance of the account at two subsequent dates, and thereby to define the profiles of the most frequent users of the revolving credit card. The chapter ends with some concluding remarks. The appendix reports the code for the statistical software used.
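The odds ratio mentioned above as the building block of the model formulation is, for a 2x2 contingency table [[a, b], [c, d]], simply (a*d)/(b*c). A one-function sketch (the example counts are made up):

```python
def odds_ratio(table):
    """Odds ratio of a 2x2 contingency table [[a, b], [c, d]]: (a*d)/(b*c).
    A value of 1 indicates no association between the two binary variables;
    values far from 1 indicate strong association."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# E.g. card holders cross-classified by status (active/inactive) at two dates:
print(odds_ratio([[10, 5], [2, 8]]))  # 8.0 -> strong positive association
print(odds_ratio([[3, 3], [3, 3]]))   # 1.0 -> no association
```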
9

Weng, Huibin. "A Social Interaction Model with Endogenous Network Formation." University of Cincinnati / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ucin159317152899108.

Full text
10

Wang, Yan-Jiang, and yanjiang_wang@tmmu edu cn. "Clearance of amyloid-beta in Alzheimer's disease: To understand the pathogenesis and develop potential therapies in animal models." Flinders University. School of Medicine, 2010. http://catalogue.flinders.edu.au./local/adt/public/adt-SFU20100419.124325.

Full text
Abstract:
Alzheimer's disease (AD) is the most common cause of dementia. No strong disease-modifying treatments are currently available. Amyloid-beta peptide (Abeta) appears to play a pivotal role in the pathogenesis of AD. We focused our interest on revealing the pathogenesis of the disease and developing novel therapeutic modalities. The thesis consists of three projects: 1. Prevention of AD by intramuscular delivery of an anti-Abeta single chain antibody (scFv) gene: Immunotherapy is effective in removing brain Abeta but was associated with detrimental effects. In the present study, the gene of an anti-Abeta scFv was delivered in the hind leg muscles of APPSwe/PS1dE9 mice with adeno-associated virus at three months of age. Six months later, we found that brain Abeta accumulation, AD-type pathologies and cognitive impairment were significantly attenuated in scFv-treated mice relative to enhanced green fluorescence protein (EGFP)-treated mice. Intramuscular delivery of the scFv gene was well tolerated by the animals. These findings suggest that peripheral application of scFv is effective and safe in preventing the development of AD, and would be a promising non-inflammatory immunological modality for prevention and treatment of AD. 2. Prevention of AD with grape seed derived polyphenols: Polyphenols extracted from grape seeds are able to inhibit Abeta aggregation, reduce Abeta production and protect against Abeta neurotoxicity in vitro. We investigated the therapeutic effects of a polyphenol-rich grape seed extract (GSE) in vivo. APPSwe/PS1dE9 transgenic mice were fed with normal AIN-93G diet (control diet), AIN-93G diet with 0.07% curcumin, or diet with 2% GSE beginning at 3 months of age for 9 months. Total phenolic content of GSE was 592.5 mg/g dry weight, including gallic acid, catechin, epicatechin and proanthocyanidins. Long-term feeding of the GSE diet was well tolerated.
The Abeta levels in the brain and serum of the mice fed with GSE were reduced by 33% and 44% respectively compared with the mice fed with the control diet. Amyloid plaques and microgliosis in the brain of mice fed with GSE were also reduced by 49% and 70% respectively. In conclusion, polyphenol-rich GSE is promising as a safe and effective drug to prevent the development of AD. 3. Roles of p75NTR in the development of AD: p75NTR has been suggested to mediate Abeta induced neurotoxicity. However, its role in the development of AD is undetermined. APPSwe/PS1dE9 transgenic mice were crossed with p75NTR knockout mice to generate APPSwe/PS1dE9 mice with the p75NTR gene deleted. p75NTR is mainly expressed in the basal forebrain neurons and degenerative neurites in neocortex and hippocampus. Genetic deletion of the p75NTR gene in APPSwe/PS1dE9 mice reduced soluble Abeta levels, but increased insoluble Abeta accumulation and Abeta plaque formation in the brain. p75NTR deletion decreased Abeta production of cortical neurons in vitro. Recombinant extracellular domain of p75NTR attenuated the oligomerization and fibrillation of synthetic Abeta42 peptide in vitro, and reduced local Abeta plaques after hippocampus injection in vivo. Our data suggest that p75NTR plays an important role in AD development and may be a valid therapeutic target for the treatment of AD.
11

Sadeghi, Kayvan. "Graphical representation of independence structures." Thesis, University of Oxford, 2012. http://ora.ox.ac.uk/objects/uuid:86ff6155-a6b9-48f9-9dac-1ab791748072.

Full text
Abstract:
In this thesis we describe subclasses of a class of graphs with three types of edges, called loopless mixed graphs (LMGs). The class of LMGs contains almost all known classes of graphs used in the literature of graphical Markov models. We focus in particular on the subclass of ribbonless graphs (RGs), which as special cases include undirected graphs, bidirected graphs, and directed acyclic graphs, as well as ancestral graphs and summary graphs. We define a unifying interpretation of independence structure for LMGs and pairwise and global Markov properties for RGs, discuss their maximality, and, in particular, prove the equivalence of pairwise and global Markov properties for graphoids defined over the nodes of RGs. Three subclasses of LMGs (MC, summary, and ancestral graphs) capture the modified independence model after marginalisation over unobserved variables and conditioning on selection variables of variables satisfying independence restrictions represented by a directed acyclic graph (DAG). We derive algorithms to generate these graphs from a given DAG or from a graph of a specific subclass, and we study the relationships between these classes of graphs. Finally, a manual and codes are provided that explain methods and functions in R for implementing and generating various graphs studied in this thesis.
12

Kreacic, Eleonora. "Some problems related to the Karp-Sipser algorithm on random graphs." Thesis, University of Oxford, 2017. http://ora.ox.ac.uk/objects/uuid:3b2eb52a-98f5-4af8-9614-e4909b8b9ffa.

Full text
Abstract:
We study certain questions related to the performance of the Karp-Sipser algorithm on the sparse Erdős-Rényi random graph. The Karp-Sipser algorithm, introduced by Karp and Sipser [34], is a greedy algorithm which aims to obtain a near-maximum matching on a given graph. The algorithm evolves through a sequence of steps. In each step, it picks an edge according to a certain rule, adds it to the matching and removes it from the remaining graph. The algorithm stops when the remaining graph is empty. In [34], the performance of the Karp-Sipser algorithm on the Erdős-Rényi random graphs G(n, M = [cn/2]) and G(n, p = c/n), c > 0, is studied. It is proved there that the algorithm behaves near-optimally, in the sense that the difference between the size of a matching obtained by the algorithm and a maximum matching is at most o(n), with high probability as n → ∞. The main result of [34] is a law of large numbers for the size of a maximum matching in G(n, M = cn/2) and G(n, p = c/n), c > 0. Aronson, Frieze and Pittel [2] further refine these results. In particular, they prove that for c < e, the Karp-Sipser algorithm obtains a maximum matching, with high probability as n → ∞; for c > e, the difference between the size of a matching obtained by the algorithm and the size of a maximum matching of G(n, M = cn/2) is of order Θ(n^{1/5}) up to logarithmic factors, with high probability as n → ∞. They further conjecture a central limit theorem for the size of a maximum matching of G(n, M = cn/2) and G(n, p = c/n) for all c > 0. As noted in [2], the central limit theorem for c < 1 is a consequence of the result of Pittel [45]. In this thesis, we prove a central limit theorem for the size of a maximum matching of both G(n, M = cn/2) and G(n, p = c/n) for c > e. (We do not analyse the case 1 ≤ c ≤ e). Our approach is based on the further analysis of the Karp-Sipser algorithm. We use the results from [2] and refine them.
For c > e, the difference between the size of a matching obtained by the algorithm and the size of a maximum matching is of order Θ(n^{1/5}) up to logarithmic factors, with high probability as n → ∞, and the study [2] suggests that this difference is accumulated at the very end of the process. The question of how the Karp-Sipser algorithm evolves in its final stages for c > e motivated us to consider the following problem in this thesis. We study a model for the destruction of a random network by fire. Let us assume that we have a multigraph with minimum degree at least 2 with real-valued edge-lengths. We first choose a uniform random point along the total edge-length and set it alight. The edges burn at speed 1. If the fire reaches a node of degree 2, it is passed on to the neighbouring edge. On the other hand, a node of degree at least 3 passes the fire either to all its neighbours or none, each with probability 1/2. If the fire extinguishes before the graph is burnt, we again pick a uniform point and set it alight. We study this model in the setting of a random multigraph with N nodes of degree 3 and α(N) nodes of degree 4, where α(N)/N → 0 as N → ∞. We assume the edges to have i.i.d. standard exponential lengths. We are interested in the asymptotic behaviour of the number of fires we must set alight in order to burn the whole graph, and the number of points which are burnt from two different directions. Depending on whether α(N) ≫ √N or not, we prove that after suitable rescaling these quantities converge jointly in distribution to either a pair of constants or to (complicated) functionals of Brownian motion. Our analysis supports the conjecture that the difference between the size of a matching obtained by the Karp-Sipser algorithm and the size of a maximum matching of the Erdős-Rényi random graph G(n, M = cn/2) for c > e, rescaled by n^{1/5}, converges in distribution.
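The Karp-Sipser rule described above — match an edge at a degree-1 (pendant) node whenever one exists, otherwise pick a random edge, then delete both matched endpoints — can be sketched directly. This is an illustrative implementation, not the thesis's code:

```python
import random

def karp_sipser(adj, seed=0):
    """Greedy Karp-Sipser matching.  adj maps each node to the set of its
    neighbours.  Pendant (degree-1) edges are matched first -- such edges
    always belong to some maximum matching -- otherwise a uniform random
    edge is picked.  Returns the matching as a set of frozenset pairs."""
    rng = random.Random(seed)
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy

    def delete(v):
        for u in adj.pop(v, set()):
            adj[u].discard(v)

    matching = set()
    while any(adj.values()):
        pendants = sorted(v for v, ns in adj.items() if len(ns) == 1)
        if pendants:
            v = rng.choice(pendants)            # pendant rule
        else:
            v = rng.choice(sorted(v for v, ns in adj.items() if ns))
        u = rng.choice(sorted(adj[v]))          # the edge (v, u) is matched
        matching.add(frozenset((v, u)))
        delete(v)
        delete(u)
    return matching

# On a path a-b-c-d the pendant rule always yields a perfect matching:
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(karp_sipser(path))  # {frozenset({'a', 'b'}), frozenset({'c', 'd'})}
```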
APA, Harvard, Vancouver, ISO, and other styles
13

Pace, Bruno. "O modelo de Axelrod com tensão superficial." Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/43/43134/tde-26042012-123155/.

Full text
Abstract:
Nesta dissertação foram estudados alguns modelos vetoriais que pretendem modelar e descrever alguns aspectos de sistemas sociais e de sua organização cultural. Partimos do modelo de Axelrod, um processo estocástico definido em uma rede, e introduzimos uma pequena alteração no modelo que desencadeou mudanças qualitativas interessantes, especialmente o surgimento de uma tensão superficial, que leva ao aparecimento de estados metaestáveis e de regiões culturais mais fixamente localizadas no espaço. Através da ótica da mecânica estatística e de extensas simulações computacionais, exploramos alguns dos aspectos que julgamos mais importantes na caracterização desse rico modelo.
Axelrod's model for cultural dissemination is a discrete vector representation for modeling social and cultural systems. In this work we have studied it and other related models, and a subtle change in the model's rule was proposed. Our slight alterations to the model yielded significant qualitative changes, specifically the emergence of surface tension, driving the system to metastable states. Using concepts from statistical mechanics and extensive numerical simulations, we explored some of the aspects that better describe the rich model devised, such as its transient and stationary behaviour.
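For readers unfamiliar with the dynamics, the standard Axelrod update (not the modified surface-tension variant studied in the dissertation) can be sketched as follows, assuming an L x L periodic lattice whose sites carry F-feature cultural vectors:

```python
import random

def axelrod_step(lattice, L, F, rng):
    """One update of the standard Axelrod model: pick a random site and a
    random lattice neighbour; with probability equal to their cultural
    overlap, the site copies one feature on which they still disagree."""
    i, j = rng.randrange(L), rng.randrange(L)
    di, dj = rng.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
    ni, nj = (i + di) % L, (j + dj) % L          # periodic boundaries
    a, b = lattice[i][j], lattice[ni][nj]
    shared = sum(x == y for x, y in zip(a, b))
    if 0 < shared < F and rng.random() < shared / F:
        k = rng.choice([f for f in range(F) if a[f] != b[f]])
        a[k] = b[k]                              # copy one differing feature
```

Note that sites with full overlap (or none) never change, which is what makes monocultural and fragmented configurations absorbing states.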
APA, Harvard, Vancouver, ISO, and other styles
14

Curtis, Andrew B. "Path Planning for Unmanned Air and Ground Vehicles in Urban Environments." Diss., CLICK HERE for online access, 2008. http://contentdm.lib.byu.edu/ETD/image/etd2270.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

CARLI, FEDERICO. "Stratified Staged Trees: Modelling, Software and Applications." Doctoral thesis, Università degli studi di Genova, 2021. http://hdl.handle.net/11567/1057653.

Full text
Abstract:
The thesis is focused on Probabilistic Graphical Models (PGMs), which are a rich framework for encoding probability distributions over complex domains. In particular, joint multivariate distributions over large numbers of random variables that interact with each other can be investigated through PGMs, and conditional independence statements can be succinctly represented graphically. These representations sit at the intersection of statistics and computer science, relying on concepts mainly from probability theory, graph algorithms and machine learning. They are applied in a wide variety of fields, such as medical diagnosis, image understanding, speech recognition, natural language processing, and many more. Over the years, theory and methodology have developed and been extended in a multitude of directions. In particular, in this thesis different aspects of new classes of PGMs called Staged Trees and Chain Event Graphs (CEGs) are studied. In some sense, Staged Trees are a generalization of Bayesian Networks (BNs). Indeed, BNs provide a transparent graphical tool to define a complex process in terms of conditional independence structures. Despite their strengths in allowing for a reduction in the dimensionality of the joint probability distributions of the statistical model and in providing a transparent framework for causal inference, BNs are not the optimal graphical models in all situations. The biggest problems with their usage mainly occur when the event space is not a simple product of the sample spaces of the random variables of interest, and when conditional independence statements are true only under certain values of variables, that is, when there are context-specific conditional independence structures. Some extensions to the BN framework have been proposed to handle these issues: context-specific BNs, Bayesian Multinets, and Similarity Networks (Geiger and Heckerman, 1996). 
These adopt a hypothesis variable to encode the context-specific statements over a particular set of random variables. For each value taken by the hypothesis variable, the graphical modeller has to construct a particular BN model called a local network; the collection of these local networks constitutes a Bayesian Multinet. It has been shown that Chain Event Graph (CEG) models encompass all discrete BN models and the discrete variants described above as a special subclass, and that they are also richer than Probabilistic Decision Graphs, whose semantics is actually somewhat distinct. Unlike most of their competitors, CEGs can capture all (including context-specific) conditional independences in a unique graph, obtained by a coalescence over the vertices of an appropriately constructed probability tree, called a Staged Tree. CEGs have been developed for categorical variables and have been used for cohort studies, causal analysis and case-control studies. The user's toolbox to efficiently and effectively perform uncertainty reasoning with CEGs further includes methods for inference and probability propagation, the exploration of equivalence classes and robustness studies. The main contributions of this thesis to the literature on Staged Trees concern Stratified Staged Trees, with a keen eye on applications; a few observations on non-Stratified Staged Trees are made in the last part of the thesis. A core output of the thesis is an R software package which efficiently implements a host of functions for learning and estimating Staged Trees from data, relying on likelihood principles. Structural learning algorithms are also developed, based on the distance or divergence between pairs of categorical probability distributions and on the clustering of probability distributions into a fixed number of stages for each stratum of the tree. 
A new class of Directed Acyclic Graphs is also introduced, named Asymmetric-labeled DAGs (ALDAGs), which give a BN representation of a given Staged Tree. The ALDAG is a minimal DAG such that the statistical model embedded in the Staged Tree is contained in the one associated with the ALDAG. This is possible thanks to the use of colored edges, where each color indicates a different type of conditional dependence: total, context-specific, partial or local. Staged Trees are also adopted in this thesis as a statistical tool for classification purposes. Staged Tree Classifiers are introduced, which exhibit predictive results comparable, in terms of accuracy, with state-of-the-art machine learning algorithms such as neural networks and random forests. Finally, algorithms to obtain an ordering of variables for the construction of the Staged Tree are designed.
APA, Harvard, Vancouver, ISO, and other styles
16

Herman, Joseph L. "Multiple sequence analysis in the presence of alignment uncertainty." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:88a56d9f-a96e-48e3-b8dc-a73f3efc8472.

Full text
Abstract:
Sequence alignment is one of the most intensely studied problems in bioinformatics, and is an important step in a wide range of analyses. An issue that has gained much attention in recent years is the fact that downstream analyses are often highly sensitive to the specific choice of alignment. One way to address this is to jointly sample alignments along with other parameters of interest. In order to extend the range of applicability of this approach, the first chapter of this thesis introduces a probabilistic evolutionary model for protein structures on a phylogenetic tree; since protein structures typically diverge much more slowly than sequences, this allows for more reliable detection of remote homologies, improving the accuracy of the resulting alignments and trees, and reducing sensitivity of the results to the choice of dataset. In order to carry out inference under such a model, a number of new Markov chain Monte Carlo approaches are developed, allowing for more efficient convergence and mixing on the high-dimensional parameter space. The second part of the thesis presents a directed acyclic graph (DAG)-based approach for representing a collection of sampled alignments. This DAG representation allows the initial collection of samples to be used to generate a larger set of alignments under the same approximate distribution, enabling posterior alignment probabilities to be estimated reliably from a reasonable number of samples. If desired, summary alignments can then be generated as maximum-weight paths through the DAG, under various types of loss or scoring functions. The acyclic nature of the graph also permits various other types of algorithms to be easily adapted to operate on the entire set of alignments in the DAG. In the final part of this work, methodology is introduced for alignment-DAG-based sequence annotation using hidden Markov models, and RNA secondary structure prediction using stochastic context-free grammars. 
Results on test datasets indicate that the additional information contained within the DAG allows for improved predictions, resulting in substantial gains over simply analysing a set of alignments one by one.
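As a generic illustration of the summary-alignment idea mentioned above, a maximum-weight path through a DAG can be extracted by memoised recursion; the edge-list format and function names here are illustrative assumptions, not the thesis's data structures.

```python
from functools import lru_cache

def max_weight_path(edges, source, sink):
    """Heaviest source-to-sink path in a DAG, given (u, v, weight) edges.
    Returns (total weight, path as a tuple of nodes)."""
    out = {}
    for u, v, w in edges:
        out.setdefault(u, []).append((v, w))

    @lru_cache(maxsize=None)
    def best(u):
        if u == sink:
            return 0.0, (sink,)
        # extend the best continuation of each out-neighbour
        options = [(w + best(v)[0], (u,) + best(v)[1]) for v, w in out.get(u, [])]
        return max(options, default=(float('-inf'), (u,)))

    return best(source)
```

In the alignment-DAG setting, edge weights would come from a chosen loss or scoring function over the sampled alignments, so different scores yield different summary paths through the same DAG.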
APA, Harvard, Vancouver, ISO, and other styles
17

Mercier, Fabien. "Cinq essais dans le domaine monétaire, bancaire et financier." Thesis, Paris 2, 2014. http://www.theses.fr/2014PA020065.

Full text
Abstract:
La thèse étudie plusieurs problématiques centrales et actuelles de la finance moderne : la rationalité limitée des agents et leurs biais comportementaux vis-à-vis des valeurs nominales, le problème de la juste évaluation du prix des actions, la refonte du paysage de l'industrie post-négociation en Europe suite à l'introduction du projet de l'Eurosystème Target 2 Securities, ainsi que les modèles de défaut et les méthodes d'estimation des cycles de défaut pour un secteur donné. Les techniques employées sont variées : enquêtes sur données individuelles, économétrie, théorie des jeux, théorie des graphes, simulations de Monte-Carlo, chaînes de Markov cachées. Concernant l'illusion monétaire, les résultats confirment la robustesse des résultats d'études précédentes tout en dévoilant de nouvelles perspectives de recherche, par exemple tenter d'expliquer la disparité des réponses selon les caractéristiques individuelles des répondants, en particulier leur formation universitaire. L'étude du modèle de la Fed montre que la relation de long terme entre taux nominal des obligations d'État et rendement des actions n'est ni robuste, ni utile à la prédiction sur des horizons temporels réduits. L'étude sur Target 2 Securities a été confirmée par les faits. Enfin, le modèle d'estimation des défauts à partir de chaînes de Markov cachées fait preuve de bonnes performances dans un contexte européen, malgré la relative rareté des données disponibles pour sa calibration.
The thesis studies various themes that are central to modern finance: economic agents' rationality and behavioural biases with respect to nominal values, the problem of asset fundamental valuation, the changing landscape of the European post-trade industry catalysed by the Eurosystem project Target 2 Securities, and models of defaults and methods to estimate default cycles for a given sector. The techniques employed vary: studies on individual data, econometrics, game theory, graph theory, Monte-Carlo simulations and hidden Markov chains. Concerning monetary illusion, the results confirm those of previous studies while highlighting new areas for investigation concerning the interplay between individual characteristics, such as university education, and money illusion. The study of the Fed model shows that the long-term relationship assumed between the nominal government bond yield and the dividend yield is neither robust nor useful for prediction over reduced time horizons. The default model based on hidden Markov chain estimation gives satisfactory results in a European context, despite the relative scarcity of the data used for its calibration.
APA, Harvard, Vancouver, ISO, and other styles
18

Todeschini, Adrien. "Probabilistic and Bayesian nonparametric approaches for recommender systems and networks." Thesis, Bordeaux, 2016. http://www.theses.fr/2016BORD0237/document.

Full text
Abstract:
Nous proposons deux nouvelles approches pour les systèmes de recommandation et les réseaux. Dans la première partie, nous donnons d’abord un aperçu sur les systèmes de recommandation avant de nous concentrer sur les approches de rang faible pour la complétion de matrice. En nous appuyant sur une approche probabiliste, nous proposons de nouvelles fonctions de pénalité sur les valeurs singulières de la matrice de rang faible. En exploitant une représentation de modèle de mélange de cette pénalité, nous montrons qu’un ensemble de variables latentes convenablement choisi permet de développer un algorithme espérance-maximisation afin d’obtenir un maximum a posteriori de la matrice de rang faible complétée. L’algorithme résultant est un algorithme à seuillage doux itératif qui adapte de manière itérative les coefficients de réduction associés aux valeurs singulières. L’algorithme est simple à mettre en œuvre et peut s’adapter à de grandes matrices. Nous fournissons des comparaisons numériques entre notre approche et de récentes alternatives montrant l’intérêt de l’approche proposée pour la complétion de matrice à rang faible. Dans la deuxième partie, nous présentons d’abord quelques prérequis sur l’approche bayésienne non paramétrique et en particulier sur les mesures complètement aléatoires et leur extension multivariée, les mesures complètement aléatoires composées. Nous proposons ensuite un nouveau modèle statistique pour les réseaux creux qui se structurent en communautés avec chevauchement. Le modèle est basé sur la représentation du graphe comme un processus ponctuel échangeable, et généralise naturellement des modèles probabilistes existants à structure en blocs avec chevauchement au régime creux. Notre construction s’appuie sur des vecteurs de mesures complètement aléatoires, et possède des paramètres interprétables, chaque nœud étant associé un vecteur représentant son niveau d’affiliation à certaines communautés latentes. 
Nous développons des méthodes pour simuler cette classe de graphes aléatoires, ainsi que pour effectuer l'inférence a posteriori. Nous montrons que l'approche proposée peut récupérer une structure interprétable à partir de deux réseaux du monde réel et peut gérer des graphes avec des milliers de nœuds et des dizaines de milliers de connexions.
We propose two novel approaches for recommender systems and networks. In the first part, we first give an overview of recommender systems and concentrate on the low-rank approaches for matrix completion. Building on a probabilistic approach, we propose novel penalty functions on the singular values of the low-rank matrix. By exploiting a mixture model representation of this penalty, we show that a suitably chosen set of latent variables enables us to derive an expectation-maximization algorithm to obtain a maximum a posteriori estimate of the completed low-rank matrix. The resulting algorithm is an iterative soft-thresholding algorithm which iteratively adapts the shrinkage coefficients associated with the singular values. The algorithm is simple to implement and can scale to large matrices. We provide numerical comparisons between our approach and recent alternatives showing the interest of the proposed approach for low-rank matrix completion. In the second part, we first introduce some background on Bayesian nonparametrics and in particular on completely random measures (CRMs) and their multivariate extension, the compound CRMs. We then propose a novel statistical model for sparse networks with overlapping community structure. The model is based on representing the graph as an exchangeable point process, and naturally generalizes existing probabilistic models with overlapping block-structure to the sparse regime. Our construction builds on vectors of CRMs, and has interpretable parameters, each node being assigned a vector representing its level of affiliation to some latent communities. We develop methods for simulating this class of random graphs, as well as for performing posterior inference. We show that the proposed approach can recover interpretable structure from two real-world networks and can handle graphs with thousands of nodes and tens of thousands of edges.
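The shrinkage step at the heart of such an iterative soft-thresholding scheme is, in sketch form (in the abstract the adaptive coefficients come from the EM updates; here they are simply passed in as an assumed argument):

```python
def soft_threshold(singular_values, shrinkage):
    """Adaptive soft-thresholding of singular values: each value is
    shrunk towards zero by its own coefficient and clipped at zero."""
    return [max(s - lam, 0.0) for s, lam in zip(singular_values, shrinkage)]
```

A full matrix-completion iteration would compute an SVD of the current estimate, apply this operator to the spectrum, and reassemble the matrix; only the scalar shrinkage step is shown here.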
APA, Harvard, Vancouver, ISO, and other styles
19

Rusch, Thomas, Marcus Wurzer, and Reinhold Hatzinger. "Chain Graph Models in R: Implementing the Cox-Wermuth Procedure." 2013. http://epub.wu.ac.at/3781/1/psychoco2013_(2).pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

GASTALDELLO, MATTIA. "Enumeration algorithms and graph theoretical models to address biological problems related to symbiosis." Doctoral thesis, 2018. http://hdl.handle.net/11573/1072500.

Full text
Abstract:
In this thesis, we address two graph theoretical problems connected to two different biological problems, both related to symbiosis (two organisms live in symbiosis if they have a close and long-term interaction). The first problem is related to the size of a minimum cover by "chain subgraphs" of a bipartite graph. A chain graph is a bipartite graph whose nodes can be ordered by neighbourhood inclusion. In biological terms, the size of a minimum cover by chain subgraphs represents the number of genetic factors involved in the phenomenon of Cytoplasmic Incompatibility (CI) induced by some parasitic bacteria in their insect hosts. CI results in the impossibility to give birth to a healthy offspring when an infected male mates with an uninfected female. In the first half of the thesis we address three related problems. The first is the enumeration of all the maximal edge-induced chain subgraphs of a bipartite graph G, for which we provide a polynomial delay algorithm with a delay of O(n^2 m), where n is the number of nodes and m the number of edges of G. Furthermore, we show that (n/2)! and 2^(\sqrt{m} \log m) bound the number of maximal chain subgraphs of G, and use these bounds to establish the input-sensitive complexity of the algorithm. The second problem we treat is finding the minimum number of chain subgraphs needed to cover all the edges of a bipartite graph. To solve this NP-hard problem, we provide an exact exponential algorithm which runs in time O^*((2+c)^m), for every c > 0, by a procedure which uses our algorithm and an inclusion-exclusion technique (by O^* we denote standard big O notation but omitting polynomial factors). Notice that, since a cover by chain subgraphs is a family of subsets of edges, the existence of an algorithm whose complexity is close to 2^m is not obvious. Indeed, the basic search space would have size 2^(2^m), which corresponds to all families of subsets of edges of a graph on m edges. 
The third problem is the enumeration of all minimal covers by chain subgraphs. We show that it is possible to enumerate all such minimal covers of G in time O([(M+1)|S|]^{log((M+1)|S|)}), where |S| is the number of minimal covers of G and M the maximum number of chain graphs in a minimal cover. We then present the relation between the second problem and the computation of the interval order dimension of a bipartite poset, and give an interpretation of our results in the context of poset and interval poset dimension. Indeed, we can compute the interval dimension of a bipartite poset P in O^*((2+c)^p), where p is the number of incomparable pairs of P. Finally, we extend these results further to the problem of computing the poset dimension. By a classical result in poset theory (Trotter's "split" operation), we then obtain a procedure which solves this problem in O^*((2+c)^(p/2)). To improve our results on the poset dimension and to perform better than O(2^{p/2}), i.e. the minimum time to run the inclusion-exclusion formula on which these results are based, we introduce for each poset an associated graph GCP, called the graph of critical pairs. In this way we obtain two algorithms, one using exponential space and one using polynomial space. These algorithms compute the poset dimension in O(2^q) and O(2.9977^q) time respectively, where q is the number of critical pairs of P (intuitively, critical pairs are the fundamental incomparable pairs to consider). In the second part of the thesis, we deal with the Reconciliation Model of two phylogenetic trees and the exploration of its space of optimal solutions. Phylogenetic tree reconciliation is the approach commonly used to investigate the coevolution of sets of organisms such as hosts and symbionts. 
Given a phylogenetic tree for each such set, respectively denoted by H and S, together with a mapping phi of the leaves of S to the leaves of H, a reconciliation is a mapping rho of the internal nodes of S to the nodes of H which extends phi under some constraints. Depending on the mapping of a node and its children, four types of events can be identified: "cospeciation" (when host and symbiont evolve together), "duplication" (when the symbiont evolves into different species but not the host), "loss" (when the host evolves into two new species but not the symbiont, leading to the loss of the symbiont in one of the two new host species) and "host switch" (when the symbiont evolves into two new species, with one species remaining with its current host while the other switches, that is, jumps to another host species). Given a cost for each kind of event, c_c, c_d, c_l, and c_h respectively, we can assign a total cost to each reconciliation. The set of reconciliations with minimum cost is denoted by Rec(H,P,phi,C), where C = (c_c,c_d,c_l,c_h), and its elements are said to represent "parsimonious" reconciliations. However, their number can often be huge. Without further information, any biological interpretation of the underlying coevolution would require that all the parsimonious reconciliations be enumerated and examined; the latter is however impossible without providing some sort of high-level view of the situation. In this thesis, we approach this problem by introducing two equivalence relations to collect similar reconciliations and reduce the set of optimal solutions to a smaller set of representatives of these equivalence classes. We also introduce a new distance DH between optimal reconciliations and compare it with the distances already present in the literature. We show that we can embed the set of parsimonious reconciliations Rec(H,P,phi,C) into the discrete k-dimensional hypercube H^k = {0,1}^k and that DH coincides with the "Hamming distance" on H^k. 
Finally, we present a series of results on reconciliations based on the conditions c_c <= c_d and c_l > 0, which lead to a proof that a reconciliation is characterized by its set of host switches. The equivalence relations and the distance DH are all based on the host-switch events. We performed experiments on some real datasets and present some of the results obtained to show the efficacy of the two equivalence relations, commenting on these results in the light of the chosen cost vector C. The most outstanding result we obtain is in the case of the dataset related to the parasite Wolbachia, where we pass from ~4.08 x 10^{42} parsimonious reconciliations to ~1.15 x 10^{3} representatives.
Dans cette thèse, nous abordons deux problèmes de théorie des graphes liés à deux problèmes biologiques de symbiose (deux organismes vivent en symbiose s'ils ont une interaction étroite et à long terme). Le premier problème est lié au phénomène de l'Incompatibilité cytoplasmique (IC) induit par certaines bactéries parasites chez leurs hôtes. L'IC se traduit par l'impossibilité de donner naissance à une progéniture saine lorsqu'un mâle infecté s'accouple avec une femelle non infectée. En termes de graphe ce problème peut s’interpréter comme la recherche d'une couverture minimum par des "sous-graphes des chaînes" d'un graphe biparti. Un graphe des chaînes est un graphe biparti dont les nœuds peuvent être ordonnés selon leur voisinage. En terme biologique, la taille minimale représente le nombre de facteurs génétiques impliqués dans le phénomène de l'IC. Dans la première moitié de la thèse, nous abordons trois problèmes connexes à ce modèle de la théorie des graphes. Le premier est l'énumération de tous les graphes des chaînes maximaux arêtes induits d'un graphe biparti G, pour lequel nous fournissons un algorithme en delai polynomial avec un retard de O(n^2m) où n est le nombre de noeuds et m le nombre d'arêtes de G. Dans la même section, nous montrons que (n/2)! et 2^(\sqrt{m}\log m) bornent le nombre de sous-graphes de chaînes maximales de G et nous les utilisons pour établir la complexité "input-sensitive" de notre algorithme. Le deuxième problème que nous traitons est de trouver le nombre minimum de graphes des chaînes nécessaires pour couvrir tous les bords d'un graphe biparti. Pour résoudre ce problème NP-hard, en combinant notre algorithme avec la technique d'inclusion-exclusion, nous fournissons un algorithme exponentiel exact en O^*((2+c)^m), pour chaque c > 0 (par O^* on entend la notation O standard mais en omettant les facteurs polynomiaux). Le troisième problème est l'énumération de toutes les couvertures minimales par des sous-graphes des chaînes. 
Nous montrons qu'il est possible d'énumérer toutes les couvertures minimales de G en temps O([(M+1)|S|]^{log((M+1)|S|)}), où |S| est le nombre de couvertures minimales de G et M le nombre maximum de sous-graphes des chaînes dans une couverture minimale. Nous présentons ensuite la relation entre le second problème et le calcul de la dimension intervallaire d'un poset biparti. Nous donnons une interprétation de nos résultats dans le contexte de la dimension d'ordre et de la dimension intervallaire. En effet, nous pouvons calculer la dimension intervallaire d'un poset biparti P en O^*((2+c)^p), où p est le nombre de paires incomparables de P. Enfin, nous étendons ces résultats au problème du calcul de la dimension d'ordre. Par un résultat classique en théorie des ordres (l'opération "split" de Trotter), nous obtenons alors une procédure qui résout ce problème en O^*((2+c)^(p/2)). Pour améliorer nos résultats sur la dimension d'ordre et faire mieux que O(2^{p/2}), i.e. le temps minimum pour exécuter la formule d'inclusion-exclusion sur laquelle ces résultats sont basés, nous introduisons pour chaque poset un graphe associé GCP, dit "graphe des paires critiques". De cette façon, nous obtenons deux algorithmes, l'un en espace exponentiel et l'autre en espace polynomial. Ces algorithmes calculent la dimension d'ordre en temps O(2^q) et O(2.9977^q) respectivement, où q est le nombre de paires critiques de P (intuitivement, les paires critiques sont les paires incomparables fondamentales à considérer). Dans la deuxième partie de la thèse, nous traitons du modèle de réconciliation des arbres phylogénétiques et de l'exploration de l'espace de solutions optimales de ces réconciliations. La réconciliation d'arbres phylogénétiques est l'approche couramment utilisée pour étudier la coévolution d'ensembles d'organismes tels que les hôtes et les symbiotes. 
Cette approche consiste en une fonction de l'arbre phylogénétique des symbiotes P sur celui des hôtes H en respectant quelques contraintes. Selon la fonction, quatre types d'événements peuvent être identifiés : "cospéciation" (quand l'hôte et le symbiote évoluent ensemble), "duplication" (quand le symbiote évolue en différentes espèces mais pas l'hôte), "perte" (lorsque l'hôte évolue en deux nouvelles espèces mais pas le symbiote, entraînant la perte du symbiote chez l'une des deux nouvelles espèces d'hôtes) et "host switch" (lorsque le symbiote évolue en deux nouvelles espèces dont l'une infecte une autre espèce hôte). Compte tenu d'un coût pour chaque type d'événement, c_c, c_d, c_l, et c_h respectivement, nous pouvons affecter un coût total à chaque réconciliation. Les réconciliations de coût minimum sont dites "parcimonieuses", et l'ensemble de ces réconciliations est noté Rec(H, P, C), où C = (c_c, c_d, c_l, c_h). Le nombre de réconciliations parcimonieuses peut souvent être énorme et, sans autre information, toute interprétation biologique de la coévolution sous-jacente exigerait que toutes les réconciliations parcimonieuses soient énumérées et examinées. Cela est toutefois impossible sans fournir une sorte de vue de haut niveau de la situation. Dans cette thèse, nous avons abordé ce problème en introduisant deux relations d'équivalence pour regrouper des réconciliations similaires et réduire l'ensemble des solutions optimales à un plus petit ensemble de représentants de ces classes d'équivalence. Nous avons ensuite introduit une nouvelle distance DH entre les réconciliations optimales et nous la comparons aux distances déjà présentes dans la littérature. Nous montrons que nous pouvons projeter l'ensemble des réconciliations parcimonieuses Rec(H, P, C) dans l'hypercube discret H^k = {0,1}^k et que DH coïncide avec la "distance de Hamming" sur H^k. 
Dans cette thèse nous présentons une série de résultats sur les réconciliations basées sur les conditions c_c <= c_d et c_l > 0 qui sont raisonnables et conduisent à prouver que l'ensemble des "host switch" d'une réconciliation caractérise celle-ci. Les relations d'équivalence et la distance introduites sont toutes les trois basées sur les événements de "host switch". Nous présentons aussi quelques résultats expérimentaux pour montrer l'efficacité des deux relations d'équivalence et nous rapportons ces résultats au vecteur de coût choisi. Les meilleurs résultats ont été obtenus dans le cas de l'ensemble de données lié au parasite Wolbachia (une bactérie très présente parmi les insectes avec de nombreuses applications dans le contrôle des épidémies et la reproduction des insectes) où nous passons d'un nombre de réconciliations parcimonieuses de ~ 4.08 x 10^{42} réconciliations à ~ 1.15 x 10^{3} représentants.
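As a small illustration of the chain graph definition used in this entry (a bipartite graph whose nodes can be ordered by neighbourhood inclusion), a direct check of the property on one side of the bipartition might look as follows; the dict-of-sets representation is an assumption for the sketch:

```python
def is_chain_graph(left_neighbourhoods):
    """A bipartite graph is a chain graph iff the neighbourhoods of the
    nodes on one side form a chain under inclusion: sort them by size
    and verify that each one is contained in the next."""
    nbhds = sorted(left_neighbourhoods.values(), key=len)
    return all(a <= b for a, b in zip(nbhds, nbhds[1:]))
```

The enumeration and covering problems of the thesis are considerably harder than this membership test, but the nested-neighbourhood structure it checks is exactly what a chain subgraph of the cover must satisfy.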
APA, Harvard, Vancouver, ISO, and other styles
21

Jin, Ick Hoon. "Statistical Inference for Models with Intractable Normalizing Constants." Thesis, 2011. http://hdl.handle.net/1969.1/150938.

Full text
Abstract:
In this dissertation, we have proposed two new algorithms for statistical inference for models with intractable normalizing constants: the Monte Carlo Metropolis-Hastings (MCMH) algorithm and the Bayesian Stochastic Approximation Monte Carlo (BSAMC) algorithm. The MCMH algorithm is a Monte Carlo version of the Metropolis-Hastings algorithm. At each iteration, it replaces the unknown normalizing constant ratio by a Monte Carlo estimate. Although the algorithm violates the detailed balance condition, it still converges, as shown in the paper, to the desired target distribution under mild conditions. The BSAMC algorithm works by simulating from a sequence of approximated distributions using the SAMC algorithm. A strong law of large numbers has been established for BSAMC estimators under mild conditions. One significant advantage of our algorithms over the auxiliary-variable MCMC methods is that they avoid the requirement for perfect samples, and thus they can be applied to many models for which perfect sampling is not available or is very expensive. In addition, although a normalizing constant approximation is also involved in BSAMC, BSAMC can perform very robustly with respect to initial guesses of parameters, due to the powerful ability of SAMC in sample space exploration. BSAMC has also provided a general framework for approximate Bayesian inference for models whose likelihood function is intractable: sampling from a sequence of approximated distributions whose average converges to the target distribution. With these two algorithms, we have demonstrated how the SAMCMC method can be applied to estimate the parameters of exponential random graph models (ERGMs), which are a typical example of statistical models with intractable normalizing constants. We showed that the resulting estimate is consistent, asymptotically normal and asymptotically efficient. Compared to the MCMLE and SSA methods, a significant advantage of SAMCMC is that it overcomes the model degeneracy problem. 
The strength of SAMCMC comes from its varying truncation mechanism, which enables SAMCMC to avoid the model degeneracy problem through re-initialization. MCMLE and SSA do not possess the re-initialization mechanism, and tend to converge to a solution near the starting point, so they often fail for the models which suffer from the model degeneracy problem.
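The MCMH step described in the abstract can be sketched on a toy exponential-family model in which auxiliary samples are cheap to draw (independent Bernoullis, flat prior). The Monte Carlo estimate of the normalizing constant ratio replaces the intractable term in the Metropolis-Hastings acceptance probability. The model, step size, and sample sizes below are illustrative assumptions, not the dissertation's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 20                       # dimension of the binary data vector
theta_true = 0.5
x = (rng.random(n) < 1.0 / (1.0 + np.exp(-theta_true))).astype(int)  # observed data
s_x = x.sum()                # sufficient statistic of the observed data

def sample_model(theta, m):
    """Draw m auxiliary samples y ~ p_theta and return their sufficient stats.
    Here p_theta factorizes into independent Bernoullis, so sampling is exact."""
    p = 1.0 / (1.0 + np.exp(-theta))
    return (rng.random((m, n)) < p).sum(axis=1)

def mcmh(n_iter=2000, m=50, step=0.5):
    """Monte Carlo Metropolis-Hastings with an estimated Z-ratio (flat prior)."""
    theta = 0.0
    chain = []
    for _ in range(n_iter):
        prop = theta + step * rng.normal()
        # Importance-sampling estimate of the unknown ratio:
        # Z(prop) / Z(theta) = E_{y ~ p_theta}[ exp((prop - theta) * s(y)) ]
        s_y = sample_model(theta, m)
        z_ratio = np.mean(np.exp((prop - theta) * s_y))
        # MH log-acceptance with the exact ratio replaced by its estimate
        log_alpha = (prop - theta) * s_x - np.log(z_ratio)
        if np.log(rng.random()) < log_alpha:
            theta = prop
        chain.append(theta)
    return np.array(chain)

chain = mcmh()
print(chain[500:].mean())    # posterior mean, roughly near theta_true
```

Because the Bernoulli model here has a closed-form normalizing constant, the sketch can be checked against the exact sampler; in genuinely intractable models (such as ERGMs) the auxiliary draws would come from an inner MCMC run instead.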
APA, Harvard, Vancouver, ISO, and other styles
22

Che, Xuan. "Spatial graphical models with discrete and continuous components." Thesis, 2012. http://hdl.handle.net/1957/33644.

Full text
Abstract:
Graphical models use Markov properties to establish associations among dependent variables. To estimate spatial correlation and other parameters in graphical models, the conditional independences and the joint probability distribution of the graph need to be specified. We can rely on Gaussian multivariate models to derive the joint distribution when all the nodes of the graph are assumed to be normally distributed. However, when some of the nodes are discrete, the Gaussian model no longer affords an appropriate joint distribution function. We develop methods for specifying the joint distribution of a chain graph with both discrete and continuous components, with spatial dependencies assumed among all variables on the graph. We propose a new class of chain graphs known as generalized tree networks. Constructing the chain graph as a generalized tree network, we partition its joint distribution according to the maximal cliques. Copula models help us model correlation among the discrete variables in the cliques. We examine the method by analyzing datasets with simulated Gaussian and Bernoulli Markov random fields, as well as a real dataset involving household income and election results. Estimates from the graphical models are compared with those from spatial random effects models and multivariate regression models.
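The copula idea above can be illustrated with a minimal sketch: a Gaussian copula couples one continuous and one Bernoulli margin through a latent bivariate normal, which is one standard way to induce dependence between mixed discrete and continuous variables. The correlation, margins, and sample size are hypothetical choices for illustration, not the thesis's model:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)

def gaussian_copula_mixed(n, rho, p=0.3):
    """Sample (continuous, binary) pairs whose dependence comes from a
    Gaussian copula with latent correlation rho; the continuous margin
    is N(0, 1) and the binary margin is Bernoulli(p)."""
    cov = [[1.0, rho], [rho, 1.0]]
    z = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    x_cont = z[:, 0]
    # Threshold the second latent coordinate so that P(x_bin = 1) = p
    x_bin = (z[:, 1] > NormalDist().inv_cdf(1.0 - p)).astype(int)
    return x_cont, x_bin

xc, xb = gaussian_copula_mixed(100_000, rho=0.8)
print(xb.mean())                   # close to p = 0.3
print(np.corrcoef(xc, xb)[0, 1])   # positive, but attenuated below rho
```

Note that the observed point-biserial correlation is smaller than the latent rho; the dichotomization of one margin always attenuates the measured correlation, which is why copula parameters are estimated on the latent scale.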
Graduation date: 2013
APA, Harvard, Vancouver, ISO, and other styles
23

Al-Mohannadi, Hamad, Qublai K. A. Mirza, Anitta P. Namanya, Irfan U. Awan, Andrea J. Cullen, and Disso Jules F. Pagna. "Cyber-Attack Modeling Analysis Techniques: An Overview." 2016. http://hdl.handle.net/10454/10703.

Full text
Abstract:
Cyber attacks are a sensitive issue in the world of Internet security. Governments and business organisations around the world are investing enormous effort to secure their data. They use various types of tools and techniques to keep the business running, while adversaries try to breach security and send malicious software such as botnets, viruses, trojans, etc., to access valuable data. Every day the situation gets worse because new types of malware emerge to attack networks. It is important to understand these attacks both before and after they happen in order to provide better security for our systems. Understanding attack models provides more insight into network vulnerabilities, which in turn can be used to protect the network from future attacks. In the cyber security world, it is difficult to predict a potential attack without understanding the vulnerabilities of the network. It is therefore important to analyse the network to identify its most likely vulnerabilities, which gives an intuitive basis for protecting it. Also, handling an ongoing attack poses a significant risk to the network and its valuable data, where prompt action is necessary. Proper utilisation of attack modelling techniques provides advance planning, which can be implemented rapidly during an ongoing attack event. This paper aims to analyse various existing attack modelling techniques in order to understand the vulnerabilities of the network and the behaviour and goals of the adversary. The ultimate goal is to handle cyber attacks in an efficient manner using attack modelling techniques.
APA, Harvard, Vancouver, ISO, and other styles
