Dissertations / Theses on the topic 'Inferenza'

To see the other types of publications on this topic, follow the link: Inferenza.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Inferenza.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

AGOSTINELLI, Claudio. "Inferenza statistica robusta basata sulla funzione di verosimiglianza pesata: alcuni sviluppi." Doctoral thesis, Italy, 1998. http://hdl.handle.net/10278/25831.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Zanotto, Davide <1991>. "Inferenza Bayesiana applicata alla teoria di portafoglio: il Modello Black-Litterman." Master's Degree Thesis, Università Ca' Foscari Venezia, 2017. http://hdl.handle.net/10579/9595.

Full text
Abstract:
This thesis analyses one of the most recent developments in portfolio theory, offering a theoretical and practical analysis of the Black-Litterman model. The first chapter illustrates the earlier theories on which the model is based: the foundations of Modern Portfolio Theory are examined, analysing the insights of Markowitz and Sharpe. The second chapter presents the Black-Litterman model for optimal asset allocation. The model is analysed in detail from a theoretical point of view, with attention to the different computation methods that can be used to determine the values of its component variables. The third chapter proposes a practical application of the model, showing all the steps needed to determine the asset allocation choices. Assuming the case of an investor who wants to build an optimal equity portfolio, the differences are highlighted between the allocation choices obtained when moving from the traditional theory to the implementation of the technique formalised by Black and Litterman.
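For readers of this entry, the posterior expected returns at the heart of the Black-Litterman model are usually written as follows (the standard textbook formulation, not a formula taken from the thesis itself):

\[
\mu_{BL} = \left[(\tau\Sigma)^{-1} + P^{\top}\Omega^{-1}P\right]^{-1}\left[(\tau\Sigma)^{-1}\pi + P^{\top}\Omega^{-1}q\right],
\]

where \(\pi = \delta\,\Sigma\, w_{mkt}\) are the equilibrium returns implied by the market weights \(w_{mkt}\) and risk aversion \(\delta\), \(\Sigma\) is the covariance matrix of returns, \(P\) and \(q\) encode the investor's views, \(\Omega\) is the view-uncertainty matrix, and \(\tau\) is a scaling constant.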
APA, Harvard, Vancouver, ISO, and other styles
3

Monda, Anna. "Inferenza non parametrica nel contesto di dati dipendenti: polinomi locali e verosimiglianza empirica." Doctoral thesis, Università degli Studi di Salerno, 2013. http://hdl.handle.net/10556/1285.

Full text
Abstract:
2010 - 2011
This work belongs to the most recent research on non-parametric analysis tools and, in particular, analyses the use of Local Polynomials and Empirical Likelihood in the case of dependent data. The main forms of dependence treated are those satisfying the definition of alpha-mixing; our work is an attempt to reconcile, in this setting, non-parametric techniques, represented by Local Polynomials, with the Empirical Likelihood approach, seeking to combine and emphasise the strengths of both methodologies: Local Polynomials provide a more accurate estimate to be placed within the definition of Empirical Likelihood given by Owen (1988). The advantages are easy to appreciate in terms of the immediacy and practical use of this technique. The results are analysed from a theoretical point of view and then confirmed empirically, also extracting from the data useful information on the effective sensitivity to the most crucial and delicate parameter to be set for Local Polynomial estimators: the bandwidth. Throughout the thesis we present, in order: first the context in which we operate, specifying the forms of dependence treated; in the second chapter we state the characteristics and properties of local polynomials; then, in the third chapter, we analyse empirical likelihood in detail, again with particular attention to its theoretical properties; finally, in the fourth chapter we present original theoretical results, obtained from the preceding theoretical treatment. The concluding chapter proposes a simulation study based on the theoretical properties obtained in the previous chapter. The closing remarks discuss the outcomes of the simulations, which not only confirm the validity of the theoretical results presented throughout the thesis, but also provide evidence in favour of a further analysis, for the proposed tests, of the sensitivity to the smoothing parameter employed. [edited by the author]
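As context for this entry, here is a minimal sketch of the local linear estimator the abstract refers to; the data, the Gaussian kernel and the bandwidth h are illustrative choices, not the thesis's:

```python
import numpy as np

def local_linear(x, y, x0, h):
    """Local linear estimate of E[Y | X = x0] with a Gaussian kernel and bandwidth h."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)           # kernel weights
    X = np.column_stack([np.ones_like(x), x - x0])   # local design matrix
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y) # weighted least squares
    return beta[0]                                   # intercept = fitted value at x0

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 200)
grid = np.linspace(0.05, 0.95, 10)
fit = [local_linear(x, y, x0, h=0.08) for x0 in grid]
```

The bandwidth h is exactly the smoothing parameter whose sensitivity the abstract highlights as the most delicate choice.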
APA, Harvard, Vancouver, ISO, and other styles
4

PENNACCHIO, Roberto. "L'ermeneutica triadica sistemica. Analisi dei campi di inferenza nel senso comune e in psicoterapia." Doctoral thesis, Università degli studi di Bergamo, 2011. http://hdl.handle.net/10446/862.

Full text
Abstract:
Systemic-narrative theory of therapeutic change assumes, but does not prove, that people: a) do not normally use triadic explanatory schemes; b) are nevertheless able to access triadic hermeneutics in session, thanks to the therapist's interviewing techniques. Two studies were carried out to test these assumptions.
The first study analyses the explanations, provided by 400 undergraduates, of an unexpected behaviour framed in 4 stimulus situations in which the breadth of the observation field was manipulated. The results show that triadic explanations are rather unusual, but not completely extraneous to common sense, and that they increase significantly as the observation field widens from the monad to the triad.
The second study analyses the breadth of the inference field of the explanations introduced by 12 clients and their therapist during the first two sessions of individual systemic-relational consultation, with reference to two distinct classes of behaviour: 1) symptoms; 2) behaviours, emotions or events concerning a significant relationship of the client. The results show that in a non-artificial and highly motivating context such as psychotherapy, clients access triadic hermeneutics more easily, although triadic explanations remain infrequent as accounts of symptomatic behaviour. The absence of differences in the breadth of the inference field between clients and therapist is explained by the fact that the activity of the systemic-relational therapist during the consultation sessions, unlike in later stages of the therapeutic process, is mainly directed at widening the observation field rather than the inference field.
APA, Harvard, Vancouver, ISO, and other styles
5

Mancini, Martina. "Teorema di Cochran e applicazioni." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2015. http://amslaurea.unibo.it/9145/.

Full text
Abstract:
Statistics is a branch of mathematics that studies methods for collecting, organising and analysing sets of numerical data, whose variation is influenced by different causes, with the aim both of describing the characteristics of the phenomenon to which the data refer and of deducing, where possible, the general laws that govern it. Statistics is divided into descriptive (or deductive) statistics and inductive statistics, also called statistical inference. We focus on the latter, which studies the conditions under which the conclusions drawn from the statistical analysis of a sample are valid in more general cases. In particular, statistical inference aims to induce, or infer, the properties of a population (its parameters) on the basis of the known data of a sample. The main purpose of this thesis is to analyse Cochran's theorem and to illustrate its possible applications to estimation problems in a Gaussian sample. In particular, Cochran's theorem concerns an important property of multivariate normal distributions, which is fundamental in the construction of confidence intervals for the unknown parameters.
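For readers unfamiliar with the result, the classic application of Cochran's theorem mentioned in the abstract can be summarised as follows (standard textbook material, not specific to the thesis). For \(X_1,\dots,X_n\) i.i.d. \(N(\mu,\sigma^2)\), applying the theorem to the decomposition

\[
\sum_{i=1}^{n}\left(\frac{X_i-\mu}{\sigma}\right)^{2} = \frac{n(\bar{X}-\mu)^{2}}{\sigma^{2}} + \frac{(n-1)S^{2}}{\sigma^{2}}
\]

shows that \(\bar{X}\) and \(S^2\) are independent and \((n-1)S^{2}/\sigma^{2}\sim\chi^{2}_{n-1}\), hence

\[
\frac{\bar{X}-\mu}{S/\sqrt{n}} \sim t_{n-1},
\]

which yields the usual confidence interval \(\bar{X} \pm t_{n-1,\,1-\alpha/2}\, S/\sqrt{n}\) for \(\mu\).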
APA, Harvard, Vancouver, ISO, and other styles
6

CAMPAGNER, ANDREA. "Robust Learning Methods for Imprecise Data and Cautious Inference." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2023. https://hdl.handle.net/10281/404829.

Full text
Abstract:
The representation, quantification and proper management of uncertainty is one of the central problems in Artificial Intelligence, and particularly so in Machine Learning, where uncertainty is intrinsically tied to the inductive nature of the learning problem. Among the different forms of uncertainty, the modelling of imprecision, that is, the problem of dealing with data or knowledge that are imperfect and incomplete, has recently attracted interest in the research community for its theoretical and application-oriented implications for the practice and use of Machine Learning-based tools and methods. This work focuses on the problem of dealing with imprecision in Machine Learning from two different perspectives. On the one hand, imprecision may affect the input data of a Machine Learning pipeline, leading to the problem of learning from imprecise data. On the other hand, imprecision can be used as a way to implement uncertainty quantification for Machine Learning methods, by allowing them to provide set-valued predictions, leading to so-called cautious inference methods. The aim of this work is therefore to investigate theoretical as well as empirical issues related to the two settings above. Within the context of learning from imprecise data, the focus is on the learning from fuzzy labels setting, from both a learning-theoretical and an algorithmic point of view. The main contributions include: a learning-theoretical characterisation of the hardness of the learning from fuzzy labels problem; the proposal of a novel pseudo-label-based ensemble learning algorithm, together with its theoretical study and an empirical analysis showing promising results compared with the state of the art; the application of this algorithm to three relevant real-world medical problems, in which imprecision arises, respectively, from conflicting expert opinions, vague technical vocabulary, and individual variability in biochemical parameters; and the proposal of feature selection algorithms that may help reduce the computational complexity of the task and limit the curse of dimensionality. Within the context of cautious inference, the focus is on the theoretical study of three popular cautious inference frameworks, as well as on the development of novel algorithms and approaches that extend the application of cautious inference to relevant settings. The main contributions here include the study of the theoretical properties of, and the relationships among, decision-theoretic, selective prediction and conformal prediction methods; the proposal of novel cautious inference techniques drawing on the interaction between decision-theoretic and conformal prediction methods, and their evaluation in medical settings; and the study of ensembles of cautious inference models, from both an empirical and a theoretical point of view, showing that such ensembles can improve robustness and generalisation, and facilitate the application of cautious inference methods to multi-source and multi-modal data.
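As background for the set-valued predictions mentioned above, here is a minimal split conformal regression sketch; this is generic textbook material, not the thesis's algorithms, and the data are simulated:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Split conformal regression: under exchangeability, the returned interval
# contains the true y with probability >= 1 - alpha, whatever the base model.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 5))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=1000)

X_tr, X_cal, y_tr, y_cal = train_test_split(X, y, test_size=0.5, random_state=1)
model = RandomForestRegressor(random_state=1).fit(X_tr, y_tr)

alpha = 0.1
scores = np.abs(y_cal - model.predict(X_cal))        # nonconformity scores
k = int(np.ceil((len(scores) + 1) * (1 - alpha)))    # conformal quantile index
q = np.sort(scores)[k - 1]

x_new = rng.normal(size=(1, 5))
pred = model.predict(x_new)[0]
interval = (pred - q, pred + q)                      # set-valued prediction
```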
APA, Harvard, Vancouver, ISO, and other styles
7

Capriati, Paola Bianca Martina. "L'utilizzo del metodo Bootstrap nella statistica inferenziale." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2015. http://amslaurea.unibo.it/8715/.

Full text
Abstract:
This work introduces the bootstrap method, developed from 1979 onwards by Bradley Efron. The bootstrap is a statistical resampling technique based on intensive computation, and is therefore also described as computer-intensive. In particular, the advantages and disadvantages of the method are analysed through examples with real data sets implemented in the statistical software R. These analyses concern two of the main uses of the bootstrap, point estimation and the construction of confidence intervals, both based on the possibility of approximating the sampling distribution of any estimator, regardless of its computational complexity.
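The two uses named in the abstract can be illustrated with a short sketch; the thesis works in R, but the idea carries over directly, shown here in Python on a toy sample:

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.exponential(scale=2.0, size=100)   # toy sample (the thesis uses real data sets in R)

# Resample with replacement B times and recompute the statistic each time.
B = 5000
boot_medians = np.array([
    np.median(rng.choice(data, size=data.size, replace=True))
    for _ in range(B)
])

estimate = np.median(data)                       # point estimate
se = boot_medians.std(ddof=1)                    # bootstrap standard error
ci = np.percentile(boot_medians, [2.5, 97.5])    # percentile 95% confidence interval
```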
APA, Harvard, Vancouver, ISO, and other styles
8

BOLZONI, MATTIA. "Variational inference and semi-parametric methods for time-series probabilistic forecasting." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2021. http://hdl.handle.net/10281/313704.

Full text
Abstract:
Probabilistic forecasting is a common task. The usual approach assumes a fixed structure for the outcome distribution, often called a model, that depends on unseen quantities called parameters, and uses data to infer a reasonable distribution over these latent values. The inference step is not always straightforward, because a single-value estimate can lead to poor performance and overfitting, while handling a proper posterior distribution with MCMC can be challenging. Variational Inference (VI) is emerging as a viable optimisation-based alternative that models the target posterior with instrumental variables called variational parameters. However, VI usually imposes a parametric structure on the proposed posterior. The thesis's first contribution is Hierarchical Variational Inference (HVI), a methodology that uses neural networks to create semi-parametric posterior approximations with the same minimal requirements as Metropolis-Hastings or Hamiltonian MCMC. The second contribution is a Python package for VI on time-series models for mean-covariance estimation, using HVI and standard VI techniques combined with neural networks. Results on econometric and financial data show a consistent improvement using VI over point estimates, in particular yielding forecasts with lower variance.
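As background, a minimal reparameterisation-trick VI sketch for a toy Gaussian-mean posterior; this is generic VI with PyTorch as an assumed tool, not the thesis's HVI method or its package:

```python
import torch

# Approximate the posterior of a Gaussian mean: prior mu ~ N(0, 10^2), data y ~ N(mu, 1).
y = torch.randn(50) + 3.0

m = torch.zeros(1, requires_grad=True)        # variational mean
log_s = torch.zeros(1, requires_grad=True)    # variational log-std
opt = torch.optim.Adam([m, log_s], lr=0.05)

for _ in range(2000):
    opt.zero_grad()
    eps = torch.randn(64, 1)
    mu = m + eps * log_s.exp()                # reparameterisation trick
    log_lik = torch.distributions.Normal(mu, 1.0).log_prob(y).sum(dim=1)
    log_prior = torch.distributions.Normal(0.0, 10.0).log_prob(mu).squeeze(1)
    log_q = torch.distributions.Normal(m, log_s.exp()).log_prob(mu).squeeze(1)
    elbo = (log_lik + log_prior - log_q).mean()   # Monte Carlo ELBO estimate
    (-elbo).backward()                            # maximise ELBO = minimise -ELBO
    opt.step()
```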
APA, Harvard, Vancouver, ISO, and other styles
9

MASPERO, DAVIDE. "Computational strategies to dissect the heterogeneity of multicellular systems via multiscale modelling and omics data analysis." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2022. http://hdl.handle.net/10281/368331.

Full text
Abstract:
Heterogeneity pervades biological systems and manifests itself in the structural and functional differences observed both among different individuals of the same group (e.g., organisms or disease systems) and among the constituent elements of a single individual (e.g., cells). The study of the heterogeneity of biological systems, and in particular of multicellular systems, is fundamental for the mechanistic understanding of complex physiological and pathological phenomena (e.g., cancer), as well as for the definition of effective prognostic, diagnostic and therapeutic strategies. This work focuses on developing and applying computational methods and mathematical models for characterising the heterogeneity of multicellular systems and, especially, the cancer cell subpopulations underlying the evolution of neoplastic pathology. Similar methodologies have been developed to effectively characterise viral evolution and heterogeneity. The research is divided into two complementary parts, the first aimed at defining methods for the analysis and integration of omics data generated by sequencing experiments, the second at the modelling and multiscale simulation of multicellular systems. Regarding the first strand, next-generation sequencing technologies allow us to generate vast amounts of omics data, for example related to the genome or transcriptome of a given individual, through bulk or single-cell sequencing experiments. One of the main challenges in computer science is to define computational methods to extract useful information from such data, taking into account the high levels of data-specific errors, mainly due to technological limitations. In the context of this work, we focused on developing methods for the analysis of gene expression and genomic mutation data. In detail, an exhaustive comparison of machine-learning methods for the denoising and imputation of single-cell RNA-sequencing data has been performed. Moreover, methods for mapping expression profiles onto metabolic networks have been developed through an innovative framework that makes it possible to stratify cancer patients according to their metabolism. A subsequent extension of the method allowed us to analyse the distribution of metabolic fluxes within a population of cells via a flux balance analysis approach. Regarding the analysis of mutational profiles, the first method for reconstructing phylogenomic models from longitudinal data at single-cell resolution has been designed and implemented, exploiting a framework that combines a Markov Chain Monte Carlo with a novel weighted likelihood function. Similarly, a framework that exploits low-frequency mutation profiles to reconstruct robust phylogenies and likely chains of infection has been developed by analysing sequencing data from viral samples. The same mutational profiles also allow us to deconvolve the signal into the signatures associated with the specific molecular mechanisms that generate such mutations, through an approach based on non-negative matrix factorisation.
The research conducted on computational simulation has led to the development of a multiscale model in which the simulation of cell population dynamics, represented through a Cellular Potts Model, is coupled with the optimisation of a metabolic model associated with each synthetic cell. Using this model, it is possible to represent assumptions in mathematical terms and observe the properties emerging from them. Finally, we present a first attempt to combine the two methodological approaches, which led to the integration of single-cell RNA-seq data within the multiscale model, allowing data-driven hypotheses to be formulated on the emerging properties of the system.
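The signature-deconvolution step mentioned above is commonly done with non-negative matrix factorisation; a hedged sketch with scikit-learn, on simulated counts and with illustrative matrix sizes, not the thesis's pipeline:

```python
import numpy as np
from sklearn.decomposition import NMF

# Factorise a mutational catalogue V (contexts x samples) into
# signatures W (contexts x signatures) and exposures H (signatures x samples).
rng = np.random.default_rng(0)
V = rng.poisson(5, size=(96, 30)).astype(float)  # 96 trinucleotide contexts x 30 samples

model = NMF(n_components=4, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(V)   # signatures: 96 x 4
H = model.components_        # exposures:  4 x 30

W_norm = W / W.sum(axis=0)   # each signature as a probability profile over contexts
```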
APA, Harvard, Vancouver, ISO, and other styles
10

PINNA, ANDREA. "Simulation and identification of gene regulatory networks." Doctoral thesis, Università degli Studi di Cagliari, 2014. http://hdl.handle.net/11584/266423.

Full text
Abstract:
Gene regulatory networks are a well-established model to represent the functioning, at gene level, of highly elaborate biological networks. Studying and understanding such models of gene communication might enable researchers to properly target costly laboratory experiments, e.g. by selecting a small set of genes deemed to be responsible for a particular disease, or by indicating with confidence which molecule is supposed to be susceptible to certain drug treatments. This thesis explores two main aspects regarding gene regulatory networks: (i) the simulation of realistic perturbative and systems genetics experiments in gene networks, and (ii) the inference of gene networks from simulated and real data measurements. In detail, the following themes will be discussed: (i) SysGenSIM, an open source software to produce gene networks with realistic topology and simulate systems genetics or targeted perturbative experiments; (ii) two state-of-the-art algorithms for the structural identification of gene networks from single-gene knockout measurements; (iii) an approach to reverse-engineering gene networks from heterogeneous compendia; (iv) a methodology to infer gene interactions from systems genetics datasets. These works have been positively recognized by the scientific community. In particular, SysGenSIM has been used, in addition to providing valuable test benches for the development of the above inference algorithms, to generate benchmark datasets for international competitions such as the DREAM5 Systems Genetics challenge and the StatSeq workshop. The identification methodologies proved their worth by accurately reverse-engineering gene networks at established contests, namely the DREAM Network Inference challenges. Results are explained and discussed thoroughly in the thesis.
APA, Harvard, Vancouver, ISO, and other styles
11

HAMMAD, AHMED TAREK. "Tecniche di valutazione degli effetti dei Programmi e delle Politiche Pubbliche. L'approccio di apprendimento automatico causale." Doctoral thesis, Università Cattolica del Sacro Cuore, 2022. http://hdl.handle.net/10280/110705.

Full text
Abstract:
The analysis of causal mechanisms has been considered in various disciplines, such as sociology, epidemiology, political science, psychology and economics. These approaches allow causal relations and mechanisms to be uncovered by studying the role of a treatment variable (such as a policy or a programme) on a set of outcomes of interest, or of different intermediate variables on the causal path between the treatment and the outcome variables. This thesis first focuses on reviewing and exploring alternative strategies to investigate causal effects and multiple mediation effects using Machine Learning algorithms, which have been shown to be particularly suited to assessing research questions in complex settings with non-linear relations. Second, the thesis provides two empirical examples in which two Machine Learning algorithms, namely the Generalized Random Forest and Multiple Additive Regression Trees, are used to account for important control variables in causal inference in a data-driven way. By bridging a fundamental gap between causality and advanced data modelling, this work combines state-of-the-art theories and modelling techniques.
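To give a flavour of tree-based, data-driven effect estimation of the kind discussed above, here is a simple T-learner sketch; this is not the Generalized Random Forest used in the thesis, and the data are simulated:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# T-learner: fit separate outcome models for treated and control units,
# then estimate the effect as the difference of their predictions.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 3))
t = rng.binomial(1, 0.5, size=n)                      # randomised treatment
y = X[:, 0] + t * (1.0 + 0.5 * X[:, 1]) + rng.normal(size=n)

m1 = GradientBoostingRegressor().fit(X[t == 1], y[t == 1])
m0 = GradientBoostingRegressor().fit(X[t == 0], y[t == 0])

cate = m1.predict(X) - m0.predict(X)                  # per-unit effect estimates
print("ATE estimate:", cate.mean())                   # true ATE here is 1.0
```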
APA, Harvard, Vancouver, ISO, and other styles
12

ROMIO, SILVANA ANTONIETTA. "Modelli marginali strutturali per lo studio dell'effetto causale di fattori di rischio in presenza di confondenti tempo dipendenti." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2010. http://hdl.handle.net/10281/8048.

Full text
Abstract:
One of the most important goals of epidemiological research is to analyse the relationship between one or more risk factors and an event. These relationships are often complicated by the presence of confounders, a concept that is extremely complex to formalise. From the point of view of causal analysis, confounding is present when the measure of association does not coincide with the corresponding measure of effect, that is, for example, when the relative risk does not coincide with the causal relative risk. The problem is therefore to identify the designs and assumptions under which the causal effect of interest can be computed. For example, randomised controlled clinical trials were conceived to minimise the influence of systematic errors in measuring the effect of a risk factor on an outcome; moreover, in these studies the measures of association coincide with the (causal) measures of effect. In observational studies the scenario becomes more complex, owing to the presence of one or more variables that can alter or 'confound' the relationship of interest, since the investigator cannot intervene in any way on the observed covariates or on the outcome. The identification of methods to overcome confounding is therefore of particular interest. The problem is especially complex when studying the causal effect of a risk factor in the presence of time-dependent confounders, that is, variables that, conditionally on the past exposure history, predict both the outcome and the subsequent exposure. In this work an important public health question was studied, namely whether there is a causal relationship between smoking and a decrease in body mass index (BMI), considering BMI measured at the previous time point as a time-dependent confounder, using a marginal structural model for repeated measures on data from a cohort of Swedish students (the BROMS cohort). The large size of this cohort and the accuracy and type of the data collected make it particularly suitable for studying the dynamic behavioural phenomena characteristic of adolescence. The study shows that the cumulative causal effect of cigarette smoking on BMI reduction is significant only in women, with an estimated parameter for the interaction between smoking exposure and gender of 0.322 (p-value < 0.001), while the estimated parameter for cumulative cigarette consumption in males is non-significant and equal to 0.053 (p-value 0.464). The results obtained are consistent with previous studies.
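Marginal structural models of this kind are typically estimated with inverse-probability-of-treatment weighting; a hedged single-time-point sketch follows (the thesis handles repeated measures, and the variable names smoke, bmi_prev and w are illustrative, not the BROMS variables):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate confounded data: previous BMI drives both smoking and current BMI.
rng = np.random.default_rng(0)
n = 4000
bmi_prev = rng.normal(22, 2, n)
p_smoke = 1 / (1 + np.exp(-0.3 * (bmi_prev - 22)))
smoke = rng.binomial(1, p_smoke)
bmi = bmi_prev - 0.4 * smoke + rng.normal(0, 1, n)
df = pd.DataFrame({"smoke": smoke, "bmi_prev": bmi_prev, "bmi": bmi})

# Stabilised inverse-probability weights from two treatment models.
denom = smf.logit("smoke ~ bmi_prev", data=df).fit(disp=0).predict(df)
num = smf.logit("smoke ~ 1", data=df).fit(disp=0).predict(df)
df["w"] = np.where(df.smoke == 1, num / denom, (1 - num) / (1 - denom))

# Weighted outcome regression: the MSM estimate of the causal effect (~ -0.4 here).
msm = smf.wls("bmi ~ smoke", data=df, weights=df["w"]).fit()
print(msm.params["smoke"])
```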
APA, Harvard, Vancouver, ISO, and other styles
13

VALENTINI, FRANCESCO. "Three Essays on the Conditional Inference Approach for Binary Panel Data Models." Doctoral thesis, Università Politecnica delle Marche, 2020. http://hdl.handle.net/11566/274074.

Full text
Abstract:
This thesis is a collection of three essays concerning the Conditional Inference approach applied to the estimation of binary panel data models with fixed effects. The work is organised in three chapters. A detailed literature review of the main theoretical proposals for the estimation of these models is given in Chapter 1, where Conditional Inference estimators and bias-corrected estimators are described and their finite-sample performance is evaluated by a Monte Carlo experiment. The second and third chapters focus on practical problems affecting Conditional Inference estimators: (i) testing for endogenous self-selection and sample-selection mechanisms, and (ii) the computational burden of the conditional likelihood function involved in the parameter estimation procedure. The first issue is addressed with a methodology that relies on an approximation of a fixed-effects logit model, estimated by conditional maximum likelihood in a two-step procedure, and that admits a very simple variable-addition test. The test is able to identify idiosyncratic endogeneity, since the Conditional Inference approach makes it possible to handle endogeneity arising from unobserved heterogeneity and, at the same time, to overcome the incidental parameters problem. The test is applied to SHARE data on a problem concerning health and retirement. Finally, Conditional Inference estimators require the maximisation of peculiar likelihood functions whose computational burden limits the applicability of these techniques when the number of time occasions (T) in the panel becomes large, so that parameter estimation is no longer feasible when the likelihood function is computed by standard matrix algebra operations. This work proposes a novel way to recursively compute the conditional likelihood function of dynamic models, focusing on the Quadratic Exponential model. A Monte Carlo simulation shows how the recursive algorithm removes the computational burden due to large T.
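The recursive idea can be illustrated on the static fixed-effects logit, whose conditional likelihood denominator is an elementary symmetric polynomial computable in O(Tk) time instead of enumerating all C(T, k) outcome sequences; the thesis's algorithm targets the more involved dynamic Quadratic Exponential model, so the following sketch is only an analogy:

```python
import numpy as np

def cond_logit_denominator(u, k):
    """Sum of prod(u[t] for t in S) over all subsets S of {0,...,T-1} of size k,
    via the standard recursion for elementary symmetric polynomials.
    Here u[t] = exp(x_t' beta)."""
    T = len(u)
    f = np.zeros(k + 1)
    f[0] = 1.0
    for t in range(T):
        # update in reverse so each u[t] enters at most once
        for j in range(min(k, t + 1), 0, -1):
            f[j] += u[t] * f[j - 1]
    return f[k]

beta = np.array([0.5, -0.2])
X = np.random.default_rng(0).normal(size=(12, 2))   # T = 12 time occasions
u = np.exp(X @ beta)
print(cond_logit_denominator(u, k=5))
```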
APA, Harvard, Vancouver, ISO, and other styles
14

Gullà, Francesca. "Study and development of method and tool to support the design of adaptive and adaptable user interface." Doctoral thesis, Università Politecnica delle Marche, 2019. http://hdl.handle.net/11566/263144.

Full text
Abstract:
In recent decades, people's daily lives have been transformed by the rapid development of information and communication technologies (ICT). The introduction of ICT into every area of human activity has made computer technology increasingly oriented towards supporting communication with users. For this reason, there has been a growing need to understand how to design the interaction of computer systems with users in order to obtain easy-to-use systems. A good interface design lets the user interact with and manage the system without problems. However, the richness of modern computer technology allows many uses of interactive systems, and it often becomes necessary for user interfaces to be adapted to the context of use; indeed, Adaptive User Interfaces (AUIs) are becoming one of the major objectives addressed by Human-Computer Interaction research. In the literature there are many articles that focus on methods to design an adaptive interface; however, there is a lack of a single effective methodology for the design of adaptive interfaces, able to support the designer in the UI design phase. Furthermore, the adaptive-interfaces research context is very heterogeneous and closely linked to each application domain. In this context, the goal of the present research work consists of the study and development of a method and tool to support the design of adaptive and adaptable user interfaces. The proposed method and tool have been used: (i) to support the development of a new adaptive system able to assist the user in performing daily life activities; (ii) to re-design the existing Hoover Wizard application in order to make it adaptive. Experimental results demonstrate that the proposed method and tool effectively support the design of an adaptive and adaptable user interface. In addition, tests with users have shown that the adaptive and adaptable system was the best in terms of both performance and satisfaction; in particular, most users found the adaptive interface beneficial.
APA, Harvard, Vancouver, ISO, and other styles
15

Flügge, Sebastian, Sandra Zimmer, and Uwe Petersohn. "Wissensrepräsentation und diagnostische Inferenz mittels Bayesscher Netze im medizinischen Diskursbereich." Technische Universität Dresden, 2019. https://tud.qucosa.de/id/qucosa%3A35133.

Full text
Abstract:
Bayesian networks are investigated for diagnostic inference under uncertainty. The basis for this is an adequate, uniform representation of the necessary knowledge: both generic knowledge and experience-based specific knowledge, stored in a knowledge base. For knowledge processing, a combination of the problem-solving methods of concept-based and case-based reasoning is used. Concept-based reasoning is used for diagnosis, therapy and medication recommendation and evaluation over generic knowledge. Special cases, in the form of specific patient cases, are handled by case-based reasoning. In addition, the use of Bayesian networks makes it possible to deal with uncertainty, vagueness and incompleteness, so that the valid general concepts can be returned ranked by probability. Various inference mechanisms are presented and then evaluated in the course of the development of a prototype. Tests are used to assess the network's classification of diagnoses. Contents: 1 Introduction; 2 Representation and Inference; 3 Inference Mechanisms; 4 Prototype Software Architecture; 5 Evaluation; 6 Summary
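In the simplest case, the diagnostic-inference step in a Bayesian network reduces to Bayes' rule; a toy illustration with invented probabilities, not values from the paper's knowledge base:

```python
# One disease node D and one symptom node S: compute P(D | S = true).
p_d = 0.01                 # prior prevalence of the disease
p_s_given_d = 0.90         # sensitivity: P(symptom | disease)
p_s_given_not_d = 0.05     # false-positive rate: P(symptom | no disease)

p_s = p_s_given_d * p_d + p_s_given_not_d * (1 - p_d)   # total probability
p_d_given_s = p_s_given_d * p_d / p_s                    # Bayes' rule
print(f"P(disease | symptom) = {p_d_given_s:.3f}")       # ~0.154
```

Even with a 90% sensitive symptom, the low prior keeps the posterior modest, which is exactly the kind of ranking-by-probability behaviour the abstract describes.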
APA, Harvard, Vancouver, ISO, and other styles
16

Zeller, Camila Borelli. "Modelo de Grubbs em grupos." [s.n.], 2006. http://repositorio.unicamp.br/jspui/handle/REPOSIP/307093.

Full text
Abstract:
Advisor: Filidor Edilfonso Vilca Labra
Master's dissertation, Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica
In this work we present a study of statistical inference in the Grubbs model with subgroups, which extends the model proposed by Grubbs (1948, 1973), frequently used to compare instruments or measurement methods. We consider the parametrisation proposed by Bedrick (2001). The study is based on the maximum likelihood method. Hypothesis tests are considered, based on the Wald, score and likelihood ratio statistics. The maximum likelihood estimates of the Grubbs model with subgroups are obtained using the EM algorithm, under the assumption that the observations follow a normal distribution. We also present a diagnostic analysis for the Grubbs model with subgroups, aimed at evaluating the impact that a given subgroup has on the parameter estimates, using the local influence methodology proposed by Cook (1986) under the case-weights perturbation scheme. Finally, we present some simulation studies and illustrate the theoretical results using data from the literature.
Master's degree in Statistics
APA, Harvard, Vancouver, ISO, and other styles
17

Chaves, Nathalia Lima 1989. "Modelos de regressão Birnbaum-Saunders baseados na distribuição normal assimétrica centrada." [s.n.], 2015. http://repositorio.unicamp.br/jspui/handle/REPOSIP/306793.

Full text
Abstract:
Advisors: Caio Lucidius Naberezny Azevedo, Filidor Edilfonso Vilca Labra
Master's dissertation, Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica
The class of Birnbaum-Saunders (BS) models was developed from problems that arose in the field of material reliability, generally related to the study of material fatigue. In recent years, however, this class of models has been applied in areas outside that context, such as health, environmental, forestry, demographic, actuarial and financial sciences, among others, owing to its great versatility. In this work we develop the skew-normal Birnbaum-Saunders distribution under the centred parameterisation (BSNAC), which extends the usual BS distribution and presents several advantages over the BS distribution based on the skew-normal distribution under the usual parameterisation. We also develop a log-Birnbaum-Saunders linear regression model. We present several properties of both the BSNAC distribution and the related regression model. We develop estimation procedures under the frequentist and Bayesian approaches, as well as diagnostic tools for the proposed models, including residual analysis and measures of influence. We conducted simulation studies under different scenarios in order to compare the frequentist and Bayesian estimates and to evaluate the performance of the diagnostic measures. The proposed methodology is illustrated with data sets from both simulation studies and real applications.
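For reference, the usual construction of the Birnbaum-Saunders law that this and the following entry build on (the standard definition, not the thesis's BSNAC extension): for \(\alpha, \beta > 0\) and \(Z \sim N(0,1)\),

\[
T = \beta\left(\frac{\alpha Z}{2} + \sqrt{\left(\frac{\alpha Z}{2}\right)^{2} + 1}\,\right)^{2} \sim \mathrm{BS}(\alpha, \beta),
\]

or equivalently

\[
Z = \frac{1}{\alpha}\left(\sqrt{\frac{T}{\beta}} - \sqrt{\frac{\beta}{T}}\right) \sim N(0,1).
\]

The skew-normal variants discussed in the abstract replace \(Z\) with a skew-normal variable, here under the centred parameterisation.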
Master's degree in Statistics
APA, Harvard, Vancouver, ISO, and other styles
18

Benites, Sánchez Luis Enrique 1983. "Modelos Birnbaum-Saunders bivariados." [s.n.], 2014. http://repositorio.unicamp.br/jspui/handle/REPOSIP/307089.

Full text
Abstract:
Advisor: Filidor Edilfonso Vilca Labra
Master's dissertation, Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica
Several works have dealt with the univariate BS distribution and its extensions; the bivariate version of this model was presented only recently, by Kundu et al. (2010). They proposed a bivariate BS distribution with a dependence structure and established several attractive properties of that distribution, which has a close relationship with the bivariate normal distribution, just as the univariate BS distribution has with the univariate normal. This work presents a study of some aspects of inference, diagnostic analysis and lifetime analysis based on the failure rate function of bivariate BS distributions: hypothesis tests are considered using the Wald, score and likelihood ratio statistics; the diagnostic analysis is based on the approach of Cook (1986); and the discussion of lifetime analysis is based on the idea of Basu (1971). Finally, numerical examples are given to illustrate the proposed methodology, and the properties of the statistics are investigated through Monte Carlo simulations.
Master's degree in Statistics
APA, Harvard, Vancouver, ISO, and other styles
19

Elling, Eva. "Effects of MIFID II on Stock Trade Volumes of Nasdaq Stockholm." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-257510.

Full text
Abstract:
Introducing new financial legislation to financial markets require caution to achieve the intended outcome. This thesis aims to investigate whether or not the newly installed revised Markets in Financial Instruments Directive- the MIFID II regulation - temporally influenced the trading stock volume levels of Nasdaq Stockholm during its introduction to the Swedish stock market. A first approach of a generalized Negative Binomial model is carried out on aggregated data, followed by an individual Fixed Effects model in an attempt to eliminate omitted variable bias caused by missing unobserved variables for the individual stocks. The aggregated data is attained by taking the equally weighted average of the trading volume and adjusting for seasonality through Seasonal and Trend decomposition using Loess in combination with a regression model with ARIMA errors to mitigate calendar effects. Due to robustness of the aggregated data, the Negative Binomial model manage to capture significant effects of the regulation on the Small Cap. segment, even though clusters of the data show signs of divergent reactions to MIFID II. Since the Fixed Effects model operate on non-aggregated TSCS data and because of the varying effects on each stock the Fixed Effect model fails in its attempt to do the same.
Implementering av nya finansiella regelverk på finansmarknaden kräver aktsamhet för att uppnå de tilltänkta målen. Det här arbetet undersöker huruvida MIFID II-regleringen orsakade en temporär medelvärdesskiftning av de handlade aktievolymerna på Nasdaq Stockholm under regelverkets introduktion på den svenska marknaden. Först testas en generaliserad Negative Binomial-regression applicerad på aggregerad data, därefter en individuell Fixed Effects-modell för att försöka eliminera fel på grund av saknade, okända variabler. Det aggregerade datasetet erhålls genom att ta genomsnittet av handelsvolymerna och justera dessa för säsongsmässiga mönster med metoden STL i kombination med regression med ARIMA-residualer för att även ta hänsyn till kalenderrelaterade effekter. Eftersom den aggregerade datan är robust lyckas Negative Binomial-regressionen fånga signifikanta effekter av regleringen för Small Cap.-segmentet, trots att datat uppvisar tecken på att subgrupper inom segmentet reagerat väldigt olika på den nya regleringen. Eftersom Fixed Effects-modellen är applicerad på icke-aggregerad TSCS-data och på grund av den varierande effekten på de individuella aktierna lyckas inte denna modell med detta.
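The pipeline's shape (STL seasonal adjustment, then a Negative Binomial regression with a post-regulation dummy) can be sketched with statsmodels; all series, dates and parameters below are simulated, not the thesis's data:

```python
# Sketch: STL seasonal adjustment of a daily volume series, then a
# Negative Binomial GLM testing for a mean shift after MIFID II
# (effective 2018-01-03). Data are simulated for illustration.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.seasonal import STL

rng = np.random.default_rng(0)
n = 500
t = pd.date_range("2017-01-02", periods=n, freq="B")
post = (t >= "2018-01-03").astype(int)               # regulation dummy
seasonal = 0.2 * np.sin(2 * np.pi * np.arange(n) / 5)
lam = np.exp(6.0 + seasonal - 0.15 * post)
volume = rng.negative_binomial(n=10, p=10 / (10 + lam))

# Seasonal-and-trend decomposition using Loess (weekly pattern, period=5).
stl = STL(pd.Series(np.log1p(volume), index=t), period=5).fit()
adjusted = np.expm1(np.log1p(volume) - stl.seasonal)

# Negative Binomial GLM: does the post-regulation dummy shift the mean?
X = sm.add_constant(post)
model = sm.GLM(adjusted, X, family=sm.families.NegativeBinomial(alpha=1.0))
print(model.fit().summary())
```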
APA, Harvard, Vancouver, ISO, and other styles
20

Baldoni, Pedro Luiz 1989. "Modelos lineares generalizados mistos multivariados para caracterização genética de doenças." [s.n.], 2014. http://repositorio.unicamp.br/jspui/handle/REPOSIP/307180.

Full text
Abstract:
Orientador: Hildete Prisco Pinheiro
Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica
Resumo: Os Modelos Lineares Generalizados Mistos (MLGM) são uma generalização natural dos Modelos Lineares Mistos (MLM) e dos Modelos Lineares Generalizados (MLG). A classe dos MLGM estende a suposição de normalidade dos dados permitindo o uso de várias outras distribuições bem como acomoda a superdispersão frequentemente observada e também a correlação existente entre observações em estudos longitudinais ou com medidas repetidas. Entretanto, a teoria de verossimilhança para MLGM não é imediata uma vez que a função de verossimilhança marginal não possui forma fechada e envolve integrais de alta dimensão. Para solucionar este problema, diversas metodologias foram propostas na literatura, desde técnicas clássicas como quadraturas numéricas, por exemplo, até métodos sofisticados envolvendo algoritmo EM, métodos MCMC e quase-verossimilhança penalizada. Tais metodologias possuem vantagens e desvantagens que devem ser avaliadas em cada tipo de problema. Neste trabalho, o método de quase-verossimilhança penalizada (Breslow e Clayton, 1993) foi utilizado para modelar dados de ocorrência de doença em uma população de vacas leiteiras, pois demonstrou ser robusto aos problemas encontrados na teoria de verossimilhança deste conjunto de dados. Além disto, os demais métodos não se mostram calculáveis frente à complexidade dos problemas existentes em genética quantitativa. Adicionalmente, estudos de simulação são apresentados para verificar a robustez de tal metodologia. A estabilidade dos estimadores e a teoria de robustez para este problema não estão completamente desenvolvidas na literatura.
Abstract: Generalized Linear Mixed Models (GLMM) are a generalization of Linear Mixed Models (LMM) and of Generalized Linear Models (GLM). The GLMM class extends the normality assumption of the data and allows the use of several other probability distributions, accommodating, for example, the overdispersion often observed and also the correlation among observations in longitudinal or repeated measures studies. However, the likelihood theory of the GLMM class is not straightforward, since its marginal likelihood function has no closed form and involves a high-dimensional integral. In order to solve this problem, several methodologies have been proposed in the literature, from classical techniques such as numerical quadratures, for example, up to sophisticated methods involving the EM algorithm, MCMC methods and penalized quasi-likelihood. These methods have advantages and disadvantages that must be evaluated in each problem. In this work, the penalized quasi-likelihood method (Breslow and Clayton, 1993) was used to model infection data in a population of dairy cattle because it proved robust to the problems found in the likelihood theory of these data. Moreover, the other methods proved intractable given the complexity of the problems existing in quantitative genetics. Additionally, simulation studies are presented in order to verify the robustness of this methodology. The stability of these estimators and the robustness theory for this problem are not completely developed in the literature.
Mestrado
Estatistica
Mestre em Estatística
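The abstract's estimator is Breslow and Clayton's penalized quasi-likelihood, which statsmodels does not implement; as a stand-in, the sketch below fits the same random-intercept logistic GLMM structure with statsmodels' variational Bayes routine, on invented herd data:

```python
# Random-intercept logistic GLMM for disease occurrence in dairy herds.
# statsmodels has no PQL routine, so this sketch uses its variational
# Bayes mixed GLM instead; data, effect sizes and names are simulated.
import numpy as np
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

rng = np.random.default_rng(1)
n_herds, cows_per_herd = 30, 40
herd = np.repeat(np.arange(n_herds), cows_per_herd)
herd_effect = rng.normal(0, 0.8, n_herds)[herd]   # random intercepts
parity = rng.integers(1, 6, herd.size)            # fixed covariate
logit = -1.0 + 0.25 * parity + herd_effect
disease = rng.binomial(1, 1 / (1 + np.exp(-logit)))

df = pd.DataFrame({"disease": disease, "parity": parity, "herd": herd})
model = BinomialBayesMixedGLM.from_formula(
    "disease ~ parity", {"herd": "0 + C(herd)"}, df)
result = model.fit_vb()                           # mean-field VB fit
print(result.summary())
```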
APA, Harvard, Vancouver, ISO, and other styles
21

Calabrese, Chris M. Eng Massachusetts Institute of Technology. "Distributed inference : combining variational inference with distributed computing." Thesis, Massachusetts Institute of Technology, 2013. http://hdl.handle.net/1721.1/85407.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 95-97).
The study of inference techniques and their use for solving complicated models has taken off in recent years, but as the models we attempt to solve become more complex, there is a worry that our inference techniques will be unable to produce results. Many problems are difficult to solve using current approaches because it takes too long for our implementations to converge on useful values. While coming up with more efficient inference algorithms may be the answer, we believe that an alternative approach to solving this complicated problem involves leveraging the computational power of multiple processors or machines with existing inference algorithms. This thesis describes the design and implementation of such a system by combining a variational inference implementation (Variational Message Passing) with a high-level distributed framework (GraphLab) and demonstrates that inference is performed faster on a few large graphical models when using this system.
by Chris Calabrese.
M. Eng.
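The thesis couples Variational Message Passing with GraphLab; as a single-machine illustration of the closed-form updates that VMP iterates, here is a minimal coordinate-ascent variational inference loop for a Gaussian with unknown mean and precision (a standard conjugate example, not the thesis's system):

```python
# Coordinate-ascent variational inference for x_i ~ N(mu, 1/tau) with
# priors mu ~ N(mu0, 1/(lam0*tau)) and tau ~ Gamma(a0, b0). Each step
# is a closed-form local update, the property VMP exploits.
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(2.0, 1.5, size=200)
n, xbar = x.size, x.mean()

mu0, lam0, a0, b0 = 0.0, 1.0, 1e-3, 1e-3   # vague conjugate priors
E_tau = 1.0                                 # initialization

for _ in range(50):                         # alternate the two updates
    # q(mu) = N(mu_n, 1/lam_n), given the current E[tau]
    mu_n = (lam0 * mu0 + n * xbar) / (lam0 + n)
    lam_n = (lam0 + n) * E_tau
    # q(tau) = Gamma(a_n, b_n), given the current moments of q(mu)
    E_mu, E_mu2 = mu_n, mu_n**2 + 1 / lam_n
    a_n = a0 + (n + 1) / 2
    b_n = b0 + 0.5 * (np.sum(x**2) - 2 * E_mu * np.sum(x) + n * E_mu2
                      + lam0 * (E_mu2 - 2 * mu0 * E_mu + mu0**2))
    E_tau = a_n / b_n

print(f"E[mu] = {mu_n:.3f}, E[sigma] ~ {1 / np.sqrt(E_tau):.3f}")
```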
APA, Harvard, Vancouver, ISO, and other styles
22

Wenzel, Florian. "Scalable Inference in Latent Gaussian Process Models." Doctoral thesis, Humboldt-Universität zu Berlin, 2020. http://dx.doi.org/10.18452/20926.

Full text
Abstract:
Latente Gauß-Prozess-Modelle (latent Gaussian process models) werden von Wissenschaftlern benutzt, um verborgene Muster in Daten zu erkennen, Expertenwissen in probabilistische Modelle einfließen zu lassen und um Vorhersagen über die Zukunft zu treffen. Diese Modelle wurden erfolgreich in vielen Gebieten wie Robotik, Geologie, Genetik und Medizin angewendet. Gauß-Prozesse definieren Verteilungen über Funktionen und können als flexible Bausteine verwendet werden, um aussagekräftige probabilistische Modelle zu entwickeln. Dabei ist die größte Herausforderung, eine geeignete Inferenzmethode zu implementieren. Inferenz in probabilistischen Modellen bedeutet, die A-Posteriori-Verteilung der latenten Variablen gegeben der Daten zu berechnen. Die meisten interessanten latenten Gauß-Prozess-Modelle haben zurzeit nur begrenzte Anwendungsmöglichkeiten auf großen Datensätzen. In dieser Doktorarbeit stellen wir eine neue effiziente Inferenzmethode für latente Gauß-Prozess-Modelle vor. Unser neuer Ansatz, den wir augmented variational inference nennen, basiert auf der Idee, eine erweiterte (augmented) Version des Gauß-Prozess-Modells zu betrachten, welche bedingt konjugiert (conditionally conjugate) ist. Wir zeigen, dass Inferenz in dem erweiterten Modell effektiver ist und dass alle Schritte des Variational-Inference-Algorithmus in geschlossener Form berechnet werden können, was mit früheren Ansätzen nicht möglich war. Unser neues Inferenzkonzept ermöglicht es, neue latente Gauß-Prozess-Modelle zu studieren, die zu innovativen Ergebnissen im Bereich der Sprachmodellierung, genetischen Assoziationsstudien und Quantifizierung der Unsicherheit in Klassifikationsproblemen führen.
Latent Gaussian process (GP) models help scientists to uncover hidden structure in data, express domain knowledge and form predictions about the future. These models have been successfully applied in many domains including robotics, geology, genetics and medicine. A GP defines a distribution over functions and can be used as a flexible building block to develop expressive probabilistic models. The main computational challenge of these models is to make inference about the unobserved latent random variables, that is, computing the posterior distribution given the data. Currently, most interesting Gaussian process models have limited applicability to big data. This thesis develops a new efficient inference approach for latent GP models. Our new inference framework, which we call augmented variational inference, is based on the idea of considering an augmented version of the intractable GP model that renders the model conditionally conjugate. We show that inference in the augmented model is more efficient and, unlike in previous approaches, all updates can be computed in closed form. The ideas around our inference framework facilitate novel latent GP models that lead to new results in language modeling, genetic association studies and uncertainty quantification in classification tasks.
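The abstract does not spell out the augmentation; a standard device for making a logistic Gaussian-process likelihood conditionally conjugate, and the kind of identity this line of work builds on, is the Pólya-Gamma integral identity of Polson, Scott and Windle (2013):

```latex
% Polya-Gamma augmentation: conditioning on the auxiliary variable
% omega ~ PG(b, 0) turns the logistic factor into a Gaussian in psi.
\frac{\left(e^{\psi}\right)^{a}}{\left(1+e^{\psi}\right)^{b}}
  = 2^{-b}\, e^{\kappa\psi} \int_{0}^{\infty}
    e^{-\omega\psi^{2}/2}\; p_{\mathrm{PG}}(\omega \mid b, 0)\, \mathrm{d}\omega ,
\qquad \kappa = a - \frac{b}{2}.
```

Conditioned on the auxiliary variable, every update of the variational algorithm becomes available in closed form, which is the property the abstract refers to.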
APA, Harvard, Vancouver, ISO, and other styles
23

Miller, J. Glenn (James). "Predictive inference." Diss., Georgia Institute of Technology, 2002. http://hdl.handle.net/1853/24294.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Cleave, Nancy. "Ecological inference." Thesis, University of Liverpool, 1992. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.304826.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Henke, Joseph D. "Visualizing inference." Thesis, Massachusetts Institute of Technology, 2014. http://hdl.handle.net/1721.1/91826.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 75-76).
Common Sense Inference is an increasingly attractive technique for making computer interfaces more in touch with how human users think. However, the results of the inference process are often hard to interpret and evaluate. Visualization has been successful in many other fields of science, but to date it has not been used much for visualizing the results of inference. This thesis presents Alar, an interface which allows dynamic exploration of the results of the inference process. It enables users to detect errors in the input data and fine-tune how liberal or conservative the inference should be. It accomplishes this through novel extensions to the AnalogySpace framework for inference, and by visualizing concepts and even assertions as nodes in a graph, clustered by their semantic relatedness. A usability study was performed, and the results show users were able to successfully use Alar to determine the cause of an incorrect inference.
by Joseph D. Henke.
M. Eng.
APA, Harvard, Vancouver, ISO, and other styles
26

Machado, Marco Antônio Rosa. "O papel do processo inferencial na compreensão de textos escritos." [s.n.], 2005. http://repositorio.unicamp.br/jspui/handle/REPOSIP/270429.

Full text
Abstract:
Orientador: Anna Christina Bentes da Silva
Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Estudos da Linguagem
Resumo: Esta dissertação investiga o processo de compreensão de textos escritos, focalizando mais especificamente o processo de geração de inferências realizado pelos leitores. Para isso investigamos as inferências que alunas do curso de Letras fazem (em comentários escritos) a partir de um texto literário narrativo. O corpus analisado neste trabalho é constituído por comentários escritos produzidos por alunas do segundo ano do curso de Letras de uma universidade pública do interior de Goiás, feitos a partir de três contatos (audição, leitura e comentário/leitura) com um texto literário narrativo: o conto “A gaivota”, de Augusta Faro. Partimos do pressuposto de que a compreensão de textos depende tanto dos processos de decodificação como da realização de inferências, pois acreditamos ser muito difícil haver compreensão sem inferências. Inferência é tomada aqui como uma estratégia cognitiva pela qual o leitor gera uma informação semântica nova, a partir de uma informação semântica dada, em um determinado contexto. Tendo isto em mente, buscamos investigar (i) quais inferências são produzidas ao longo dos comentários escritos, considerando os diferentes contextos - audição/leitura - de recepção de um texto literário escrito; (ii) de que forma a exibição desta “competência inferencial” por parte destes sujeitos relaciona-se aos diferentes tipos de contexto. Além disso, buscamos relacionar o processo de compreensão dos elementos constitutivos da narrativa e as inferências realizadas nos comentários dos sujeitos. Percebemos que o processo inferencial está relacionado tanto aos esquemas mentais dos sujeitos como ao seu contexto pessoal, de modo que, utilizando-se destas duas fontes de informação extratextual, os sujeitos buscam estabelecer o sentido do texto com a realização de inferências lógicas, informativas e elaborativas. E, no caso específico do conto utilizado em nossa pesquisa, percebemos que as inferências giraram em torno dos elementos constitutivos desta narrativa, especialmente da personagem e da ação.
Abstract: This dissertation examines the process of written text comprehension, focusing specifically on the readers' production of inferences. We investigate the inferences that Languages undergraduate students make (in terms of written comments) from a literary narrative text. The analyzed corpus comprises written comments made by undergraduate students of the second year of the Languages course at a public university in the countryside of Goiás. The comments were produced from three types of contact (listening, reading and commenting/reading) with a literary narrative text: the short story “A gaivota”, by Augusta Faro. We start from the assumption that the comprehension of texts depends both on processes of decoding and of inferring, since we believe there can hardly be comprehension without inferences. Inference is conceived here as a cognitive strategy by which the reader generates new semantic information, from a given piece of semantic information, in a given context. Thus, we intend to depict (i) which inferences are made (in terms of written comments) in different contexts (listening/reading), after reading a literary written text, and (ii) how these subjects' display of “inferential competence” relates to the different kinds of context. Besides, we intend to relate the process of comprehension of the constitutive elements of the narrative to the inferences made in the subjects' comments. We found that the inferential process relates both to the subjects' mental schemes and to their personal context, so that the subjects, drawing on these two sources of extra-textual information, engage in giving meaning to the text by making logical, informative and elaborative inferences. And, in the specific case of the short story used in the research, we found that the inferences revolve around the constitutive elements of the narrative, especially the character and the action.
Mestrado
Mestre em Linguística
APA, Harvard, Vancouver, ISO, and other styles
27

Ferreira, Matheus Kingeski. "Modelos hierárquicos de ocupação para Pontoporia blainvillei (Cetacea: Pontoporiidae) na costa do Brasil." Biblioteca Digital de Teses e Dissertações da UFRGS, 2018. http://hdl.handle.net/10183/180700.

Full text
Abstract:
Conhecer a distribuição geográfica das espécies é primordial para a tomada de ações efetivas de conservação. Modelos de ocupação são ferramentas importantes para estimar a distribuição das espécies, especialmente quando as informações são incompletas, como é o caso de muitas espécies ameaçadas ou em áreas ainda insuficientemente amostradas. O objetivo deste estudo é ampliar e refinar o conhecimento sobre a distribuição geográfica da toninha, Pontoporia blainvillei, um pequeno cetáceo ameaçado de extinção restrito às águas costeiras do Atlântico Sul ocidental, através de modelos de ocupação. Foram realizadas amostragens aéreas com 4 observadores independentes, em 2058 sítios de 4x4 km na distribuição da espécie no Brasil. Foram utilizadas cinco covariáveis de detecção (transparência da água, escala Beaufort, reflexo solar, posição dos amostradores e número de amostradores) e três covariáveis de ocupação (batimetria, temperatura média e produtividade primária), com índices de correlação de Pearson menores que 0,7. Todas as covariáveis contínuas foram estandardizadas com média zero e desvio padrão igual a um. Os modelos de ocupação com autocorrelação espacial foram estimados com Inferência Bayesiana utilizando priors 'vagos' (média zero e variância 1.0E6). Em apenas 75 sítios foram detectadas toninhas. A probabilidade de detecção média foi de 0,23 (CRI 0,006 a 0,51), onde as covariáveis Beaufort (efeito negativo), reflexo solar (efeito negativo) e transparência da água (efeito positivo) apresentaram efeitos significativos. A média estimada de ocupação foi de 0,066 (CRI 0,01 a 0,31). As covariáveis batimetria e temperatura média apresentaram efeitos positivos e negativos sobre o processo de ocupação, respectivamente. Espacialmente o modelo prevê três áreas com altas probabilidades de ocupação aparentemente disjuntas: a) costa norte do Rio de Janeiro; b) costa norte de Santa Catarina até São Paulo; c) costa do Rio Grande do Sul. Assim, agregamos importantes informações para a conservação da espécie e a realização de novos estudos, apontando onde podemos encontrar maiores probabilidades de ocupação na costa do Brasil e quais covariáveis determinam a ocupação e a detecção da espécie.
Knowing the geographic distribution of a species is essential for taking effective conservation actions. Occupancy models are important tools for estimating species distributions, especially when information is incomplete, as is the case with many endangered species or in under-sampled areas. The aim of this study is to expand and refine the knowledge about the geographic distribution of the franciscana, Pontoporia blainvillei, a threatened small cetacean restricted to the coastal waters of the western South Atlantic, through occupancy models. Aerial samplings were carried out with 4 independent observers, in 2058 sites of 4x4 km across the distribution of the species in Brazilian waters. Five detection covariates were used (water transparency, Beaufort scale, solar reflectance, observer position and number of observers) and three occupancy covariates (bathymetry, mean temperature and primary productivity), all with Pearson correlation indices less than 0.7. All continuous covariates were standardized with mean zero and standard deviation equal to one. Occupancy models with spatial autocorrelation were estimated by Bayesian inference using 'vague' priors (zero mean and variance 1.0E6). Franciscanas were detected in only 75 sites. The average detection probability was 0.23 (CRI 0.006 to 0.51), where the Beaufort (negative effect), solar reflectance (negative effect) and water transparency (positive effect) covariates had significant effects. The estimated mean occupancy was 0.066 (CRI 0.01 to 0.31). The bathymetry and mean temperature covariates had positive and negative effects on the occupancy process, respectively. Spatially, the model predicts three apparently disjunct areas with high probability of occupancy: a) the north coast of Rio de Janeiro; b) the north coast of Santa Catarina to São Paulo; c) the coast of Rio Grande do Sul. Thus, we add important information for the conservation of the species and for new studies, pointing out where the probability of occupancy is highest on the coast of Brazil and which covariates determine the occupancy and detection of the species.
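Stripped of covariates and spatial autocorrelation, the core of such an occupancy model is a two-part likelihood separating sites detected at least once from never-detected ones; a maximum-likelihood sketch on simulated detection histories (the ψ and p values echo the abstract's estimates, but the data are invented):

```python
# Single-season occupancy model with constant occupancy psi and
# detection p, fitted by maximum likelihood on simulated data.
# Binomial coefficients are omitted (constant in the parameters).
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(3)
n_sites, n_visits = 2058, 4                  # sites x independent observers
psi_true, p_true = 0.07, 0.23
z = rng.binomial(1, psi_true, n_sites)       # latent occupancy state
y = rng.binomial(n_visits, p_true * z)       # detections per site

def neg_loglik(theta):
    psi, p = expit(theta)                    # map R^2 -> (0, 1)^2
    det = y > 0
    # sites with at least one detection: occupied for sure
    ll_det = (np.log(psi) + y[det] * np.log(p)
              + (n_visits - y[det]) * np.log1p(-p)).sum()
    # sites never detected: occupied-but-missed or truly empty
    ll_nodet = (~det).sum() * np.log(psi * (1 - p) ** n_visits + (1 - psi))
    return -(ll_det + ll_nodet)

fit = minimize(neg_loglik, x0=np.zeros(2), method="BFGS")
psi_hat, p_hat = expit(fit.x)
print(f"psi_hat = {psi_hat:.3f}, p_hat = {p_hat:.3f}")
```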
APA, Harvard, Vancouver, ISO, and other styles
28

Silva, Ana Roberta dos Santos 1989. "Modelos de regressão beta retangular heteroscedásticos aumentados em zeros e uns." [s.n.], 2015. http://repositorio.unicamp.br/jspui/handle/REPOSIP/306787.

Full text
Abstract:
Orientador: Caio Lucidius Naberezny Azevedo
Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Matemática Estatística e Computação Científica
Resumo: Neste trabalho desenvolvemos a distribuição beta retangular aumentada em zero e um, bem como um correspondente modelo de regressão beta retangular aumentado em zero e um para analisar dados limitados-aumentados (representados por variáveis aleatórias mistas com suporte limitado), que apresentam valores discrepantes. Desenvolvemos ferramentas de inferência sob as abordagens bayesiana e frequentista. No que diz respeito à inferência bayesiana, devido à impossibilidade de obtenção analítica das posteriores de interesse, utilizaram-se algoritmos MCMC. Com relação à estimação frequentista, utilizamos o algoritmo EM. Desenvolvemos técnicas de análise de resíduos, utilizando o resíduo quantil aleatorizado, tanto sob o enfoque frequentista quanto bayesiano. Desenvolvemos, também, medidas de influência, somente sob o enfoque bayesiano, utilizando a medida de Kullback-Leibler. Além disso, adaptamos métodos de checagem preditiva a posteriori existentes na literatura ao nosso modelo, utilizando medidas de discrepância apropriadas. Para a comparação de modelos, utilizamos os critérios usuais na literatura, como AIC, BIC e DIC. Realizamos diversos estudos de simulação, considerando algumas situações de interesse prático, com o intuito de comparar as estimativas bayesianas com as frequentistas, bem como avaliar o comportamento das ferramentas de diagnóstico desenvolvidas. Um conjunto de dados da área psicométrica foi analisado para ilustrar o potencial do ferramental desenvolvido.
Abstract: In this work we develop the zero-one augmented rectangular beta distribution, as well as a corresponding zero-one augmented rectangular beta regression model, to analyze limited-augmented data (represented by mixed random variables with limited support) which present outliers. We develop inference tools under the Bayesian and frequentist approaches. Regarding Bayesian inference, since the posterior distributions of interest cannot be obtained analytically, we used MCMC algorithms. Concerning frequentist estimation, we use the EM algorithm. We develop residual analysis techniques using randomized quantile residuals, under both the frequentist and Bayesian approaches. We also developed influence measures, only under the Bayesian approach, using the Kullback-Leibler divergence. In addition, we adapt posterior predictive checking methods available in the literature to our model, using appropriate discrepancy measures. For model selection, we use the criteria commonly employed in the literature, such as AIC, BIC and DIC. We performed several simulation studies, considering some situations of practical interest, in order to compare the Bayesian and frequentist estimates, as well as to evaluate the behavior of the developed diagnostic tools. A real psychometric data set was analyzed to illustrate the performance of the developed tools.
Mestrado
Estatistica
Mestra em Estatística
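A sketch of the density being described, under an illustrative parameterization (the thesis's exact regression structure is not reproduced): point masses at 0 and 1, plus a uniform/beta ("rectangular beta") mixture on the open interval.

```python
# Zero-one augmented rectangular beta density: P(Y=0)=pi0, P(Y=1)=pi1,
# and on (0,1) a mixture of a uniform (weight theta) and a beta(a, b).
# Parameter names and values are illustrative only.
import numpy as np
from scipy import stats

def zoarb_logpdf(y, pi0, pi1, theta, a, b):
    y = np.asarray(y, dtype=float)
    cont = (1 - pi0 - pi1) * (theta + (1 - theta) * stats.beta.pdf(y, a, b))
    dens = np.where(y == 0, pi0, np.where(y == 1, pi1, cont))
    return np.log(dens)

y = np.array([0.0, 0.12, 0.5, 0.93, 1.0])
print(zoarb_logpdf(y, pi0=0.1, pi1=0.05, theta=0.2, a=2.0, b=5.0))
```

The uniform component is what makes the continuous part "rectangular": it fattens the tails of the beta law so that interior outliers do not distort the fit.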
APA, Harvard, Vancouver, ISO, and other styles
29

Peres, Tarcisio de Souza. "Avaliação de transcritos diferencialmente expressos em neoplasias humanas com ORESTES." [s.n.], 2006. http://repositorio.unicamp.br/jspui/handle/REPOSIP/309348.

Full text
Abstract:
Orientador: Fernando Lopes Alberto
Dissertação (mestrado) - Universidade Estadual de Campinas, Faculdade de Ciencias Medicas
Resumo: Durante todo o século XX, a pesquisa do câncer se desenvolveu de maneira sistemática, porém os últimos 25 anos foram notadamente caracterizados por rápidos avanços que geraram uma rica e complexa base de conhecimentos, evidenciando a doença dentro de um conjunto dinâmico de alterações no genoma. Desta forma, o entendimento completo dos fenômenos moleculares envolvidos na fisiopatologia das neoplasias depende do conhecimento dos diversos processos celulares e bioquímicos característicos da célula tumoral e que, porventura, a diferenciem da célula normal (GOLUB e SLONIM, 1999). Nesse trabalho buscamos o melhor entendimento das vias moleculares no processo neoplásico por meio da análise dos dados do Projeto Genoma Humano do Câncer (CAMARGO, 2001) com vistas à identificação de genes diferencialmente expressos nas neoplasias dos seguintes tecidos: mama, cólon, cabeça e pescoço, pulmão, sistema nervoso central, próstata, estômago, testículo e útero. A metodologia de geração dos transcritos utilizada pelo Projeto Genoma Humano do Câncer é conhecida como ORESTES (DIAS et al, 2000). Inicialmente, os dados de seqüenciamento (fragmentos ORESTES) foram agrupados por meio de uma técnica conhecida em Bioinformática como “montagem”, utilizando o pacote de programas de computador PHRED/PHRAP (EWING e GREEN, 1998). A comparação de cada agrupamento com seqüências conhecidas (depositadas em bases públicas) foi realizada por meio do algoritmo BLAST (ALTSCHUL et al, 1990). Um subconjunto de genes foi selecionado com base em critérios específicos e submetido à avaliação de seus níveis de expressão em diferentes tecidos com base em abordagem de inferência Bayesiana (CHEN et al, 1998), em contraposição às abordagens mais clássicas, como testes de hipótese nula (AUDIC e CLAVERIE, 1997). A inferência Bayesiana foi viabilizada pelo desenvolvimento de uma ferramenta computacional escrita em linguagem PERL (PERES et al, 2005). Com o apoio da literatura, foi criada uma lista de genes relacionados ao fenômeno neoplásico. Esta lista foi confrontada com as informações de expressão gênica, constituindo-se em um dos parâmetros de um sistema de classificação (definido para a seleção dos genes de interesse). Desta forma, parte da base de conhecimento sobre câncer foi utilizada em conjunto com os dados de expressão gênica inferidos a partir dos fragmentos ORESTES. Para contextualização biológica da informação gerada, os genes foram classificados segundo as nomenclaturas GO (ASHBURNER et al, 2000) e KEGG (OGATA et al, 1999). Parte dos genes apontados como diferencialmente expressos em pelo menos um tecido tumoral, em relação ao seu equivalente normal, integram vias relacionadas ao fenômeno neoplásico (HAHN e WEINBERG, 2002). Dos genes associados a estas vias, 52% possuíam fator de expressão diferencial (em módulo) superior a cinco. Finalmente, dez entre os genes classificados foram escolhidos para confirmação experimental dos achados. Os resultados de qPCR em amostras de tecido gástrico normal e neoplásico foram compatíveis com os dados de expressão gênica inferidos a partir dos fragmentos ORESTES.
Abstract: Cancer research developed in a systematic way throughout the 20th century, most notably in the last 25 years, which were characterized by rapid advances that generated a rich and complex body of knowledge, highlighting the disease within a dynamic set of changes in the genome. The complete understanding of the molecular phenomena involved in the physiopathology of neoplasia is based upon knowledge of the varied cellular and biochemical processes which are characteristic of the tumor and which make it different from the normal cell (GOLUB and SLONIM, 1999). In this work, we investigated the molecular pathways in the neoplasic process through data analyses of the cDNA sequences generated by the Human Cancer Genome Project (CAMARGO, 2001). The following neoplasias were included: breast, colon, head and neck, lungs, central nervous system, prostate gland, stomach, testicle and womb. The methodology for generating transcripts used by the Human Cancer Genome Project is known as ORESTES (DIAS et al, 2000). Initially, the sequence data (ORESTES fragments) were grouped and assembled according to similarity scores. For this purpose, we used the PHRED/PHRAP package of computer programs (EWING and GREEN, 1998). The resulting consensus sequences, each representing a cluster, were compared to known sequences (deposited in public databanks) through the BLAST algorithm (ALTSCHUL et al, 1990). A subgroup of genes was selected based on specific criteria, and their levels of expression in different tissues were evaluated by a Bayesian inference approach (CHEN et al, 1998), as compared to more classical approaches such as null hypothesis tests (AUDIC and CLAVERIE, 1997). The Bayesian inference tool was implemented as a PERL script developed for this work. A list of genes putatively related to the neoplasic phenotype was created with the support of the literature. This list was compared to the gene expression information, becoming one of the parameters of a ranking system (defined for the selection of genes of interest). Therefore, part of the knowledge related to cancer was used together with the gene expression data inferred from ORESTES fragments. For a more accurate understanding of the molecular pathways involved, the genes were classified according to the Gene Ontology (ASHBURNER et al, 2000) and KEGG (OGATA et al, 1999) nomenclatures. Additional global analyses of pathways related to the neoplasic phenomenon (HAHN and WEINBERG, 2002) demonstrated differential expression of the selected genes. About 52% of the genes in these pathways were differentially expressed in tumor tissue with at least a 5-fold change. Finally, ten genes were selected for experimental validation (in vitro) of the findings with real-time quantitative PCR, confirming the in silico results.
Mestrado
Ciencias Biomedicas
Mestre em Ciências Médicas
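In the spirit of the Bayesian alternative to null-hypothesis tests that the abstract describes, here is a Gamma-Poisson sketch comparing one gene's tag counts across two libraries; counts, library sizes and the prior are invented:

```python
# Bayesian comparison of a gene's tag counts in two cDNA libraries:
# with a Gamma prior on each Poisson rate, the posteriors are Gamma,
# and P(lambda_tumor > lambda_normal | data) follows by Monte Carlo.
import numpy as np

rng = np.random.default_rng(5)
x_normal, n_normal = 3, 50_000       # tags for the gene / library size
x_tumor, n_tumor = 14, 60_000

a0, b0 = 0.5, 1e-6                   # vague Gamma(shape, rate) prior
lam_n = rng.gamma(a0 + x_normal, 1 / (b0 + n_normal), size=100_000)
lam_t = rng.gamma(a0 + x_tumor, 1 / (b0 + n_tumor), size=100_000)

prob = (lam_t > lam_n).mean()
fold = np.median(lam_t / lam_n)
print(f"P(tumor rate > normal rate | data) = {prob:.4f}, "
      f"median fold change = {fold:.1f}")
```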
APA, Harvard, Vancouver, ISO, and other styles
30

Bettini, Humberto Filipe de Andrade Januario. "Inferências de condutas em um oligopólio diferenciado : estudos sobre o comportamento do entrante em transporte aéreo no Brasil." [s.n.], 2013. http://repositorio.unicamp.br/jspui/handle/REPOSIP/286088.

Full text
Abstract:
Orientador: José Maria Ferreira Jardim da Silveira
Tese (doutorado) - Universidade Estadual de Campinas, Instituto de Economia
Resumo: Esta Tese debruça-se sobre temas teóricos e empíricos organizados em três frentes, cada qual correspondendo a um Capítulo. A primeira frente apresenta uma revisão de ocorrências do conceito de capacidade produtiva instalada em algumas vertentes de estudos microeconômicos, exprimindo aspectos relativos à sua caracterização técnica e também aspectos "estratégicos", cuja importância competitiva para as firmas que a detêm será ressaltada. Na sequência, esta primeira frente conceitua capacidade produtiva no transporte aéreo e apresenta uma tipologia para suas fases decisórias, destacando implicações competitivas pertinentes em relação a temas como reversibilidade de ações e perecibilidade dos bens. Na segunda frente, o tema de investigação são os motivadores concretos por trás de entradas em mercados aéreos no Brasil recente, optando-se pela seleção de um caso. Esta frente também se desdobra em dois esforços: inicia-se por meio de uma revisão da literatura acerca de aspectos variados em Organização Industrial teórica e empírica que se mostram pertinentes à replicação de um exercício de identificação econométrica, tarefa a que se dedica a segunda parte do Capítulo. Assim, o Capítulo 2 realiza o primeiro exercício empírico presente nesta Tese, buscando estabelecer uma leitura das estratégias que cercaram a ação de entrada de uma nova empresa - a Azul Linhas Aéreas - no cenário do transporte aéreo doméstico brasileiro. Na terceira frente, repetem-se o objeto de estudo - trata-se novamente da Azul - e a metodologia empregada, a abordagem econométrica, embora se trate de um foco diferente. Após uma revisão de teorias que destacam aspectos como cognição, rotinas, tempo para reação e a distinção entre estratégias complementares e substitutas, parte-se para a especificação e a estimação do segundo exercício econométrico. Aqui o interesse são as reações em capacidade que foram postas em curso por empresas rivais já estabelecidas, atendo-se às principais incumbentes, ou seja, GOL e TAM. Por meio da utilização de um modelo que decompõe a identificação das reações em termos espaciais e temporais, teorias quanto ao uso estratégico da capacidade produtiva e a extensão dos mercados relevantes poderão ser apreciadas à luz das inferências obtidas. Dentre as justificativas mais gerais para o estudo, destacam-se dois aspectos em matéria de regulação econômica: o primeiro se traduz na compreensão das condições capazes de romper com uma situação de extrema concentração de mercado. Ao longo da década de 2000, GOL e TAM paulatinamente alcançaram um patamar de dominância quase absoluta do mercado doméstico em termos agregados, configurando uma estrutura que analistas, o público e os meios de comunicação (jornalismos convencional e econômico) rotularam de um "virtual duopólio". Assim, a compreensão acerca da interação estratégica que surge entre uma entrante e as duas principais incumbentes como meio para se manter ou se romper uma determinada configuração de mercado é matéria de grande relevância. Uma segunda justificativa se associa à tentativa empírica de se verificar quão substitutos entre si são os aeroportos de Viracopos, em Campinas, e os paulistanos de Congonhas e Guarulhos. Este tema, no contexto de um país com recursos escassos, déficits históricos em matéria de infraestrutura e em pleno curso de implementar um processo de concessão de aeroportos - Campinas / Viracopos e São Paulo / Guarulhos inclusos - reveste-se de grande importância.
Dentre os achados, corrobora-se a hipótese de que a Azul privilegia adensar operações a partir de aeroportos já integrados em sua rede, e também opta por adicionar ligações que contribuam para a conectividade de passageiros dentro do seu sistema de operações. Ademais, encontraram-se indícios que corroboram a hipótese de haver alguma substituição e alguma complementaridade entre o aeroporto campineiro e os terminais que atendem a Região Metropolitana de São Paulo, resultado que nos ampara na enunciação de recomendações em termos de políticas públicas, e também indícios quanto à diferença nas reações de GOL e TAM, o que sugere diferentes nichos de mercado e/ou diferentes utilizações de afiliadas regionais
Abstract: This Thesis is made of theoretical and empirical themes organized into three fronts, each corresponding to a Chapter. The first front reviews the occurrences of the installed productive capacity concept in some branches of Microeconomics, covering aspects related to its technical character and also strategic aspects whose importance for competition between firms is then highlighted. In sequence, this first front presents the concept of productive capacity in air transport and a typology for its decision steps, putting special focus on the relevant competitive consequences of aspects such as action reversibility and good perishability. In the second front, we investigate the concrete drivers of entries into Brazilian airline markets in the current period, opting for a case study for this task. This front is also divided into two separate efforts: we begin with a literature review of selected aspects of both theoretical and empirical Industrial Organization studies that are relevant for backing up the econometric exercise that we then develop. Therefore, Chapter 2 ends with the presentation and discussion of the first empirical exercise of this Thesis, aimed at identifying the strategies that surrounded market entries by Azul Airlines. In the third front, we repeat the study subject - Azul Airlines - and also the methodology employed - econometrics - but we establish a different focus: after another selected literature review, devoted to themes such as economic cognition, routines, time to react and the distinction between complementary and substitute strategies, we specify and estimate the second econometric exercise of the Thesis. Now the interest lies in the capacity reactions that rival airlines (specifically GOL and TAM) carried out in answer to Azul's market entries. Making use of a model that decomposes reactions in both time and space dimensions, theories regarding the strategic use of productive capacity and the extension of relevant markets are appreciated in light of the obtained inferences. Among the most general justifications for this study, we highlight some themes in economic regulation: first, understanding the conditions behind the breakup of severe market concentration is of utmost importance. In the 2000s, GOL and TAM reached such a degree of market concentration that analysts, the flying public and the media forged the term "virtual duopoly" to designate the structure: together, GOL and TAM had nearly a 90% market share in the domestic segment. In such a scenario, understanding the strategic interaction that emerges between an entrant and the two main incumbents, as a means of keeping or breaking a given market structure, is a subject of great relevance. A second justification is associated with the empirical effort devoted to identifying how substitutable the airport of Campinas (Viracopos) and those located in the metropolitan region of São Paulo, namely Congonhas and Guarulhos, are. This subject is a matter of great importance in a country with resource scarcity and historical infrastructure deficits, where an airport concession initiative is under way, in fact including the Viracopos and Guarulhos airports. Among the main findings, we could see that Azul privileges the strategy of making its network denser by adding destinations from airports already present in its network, and also prefers to add new links that contribute to system-wide passenger connectivity.
Moreover, some elements point to the validity of the airport relation (both complementarity and substitutability) between the Campinas airport and those serving the metropolitan region of São Paulo, which backs some public policy recommendations, and also to the notion that the main incumbents - GOL and TAM - seem to belong to different market niches, as they react differently, and/or the hypothesis that regional airlines may have been used to compete against the new entrant.
Doutorado
Teoria Economica
Doutor em Ciências Econômicas
APA, Harvard, Vancouver, ISO, and other styles
31

Zhai, Yongliang. "Stochastic processes, statistical inference and efficient algorithms for phylogenetic inference." Thesis, University of British Columbia, 2016. http://hdl.handle.net/2429/59095.

Full text
Abstract:
Phylogenetic inference aims to reconstruct the evolutionary history of populations or species. With the rapid expansion of genetic data available, statistical methods play an increasingly important role in phylogenetic inference by analyzing genetic variation of observed data collected at current populations or species. In this thesis, we develop new evolutionary models, statistical inference methods and efficient algorithms for reconstructing phylogenetic trees at the level of populations using single nucleotide polymorphism data and at the level of species using multiple sequence alignment data. At the level of populations, we introduce a new inference method to estimate evolutionary distances for any two populations to their most recent common ancestral population using single-nucleotide polymorphism allele frequencies. Our method is based on a new evolutionary model for both drift and fixation. To scale this method to large numbers of populations, we introduce the asymmetric neighbor-joining algorithm, an efficient method for reconstructing rooted bifurcating trees. Asymmetric neighbor-joining provides a scalable rooting method applicable to any non-reversible evolutionary modelling setup. We explore the statistical properties of asymmetric neighbor-joining, and demonstrate its accuracy on synthetic data. We validate our method by reconstructing rooted phylogenetic trees from the Human Genome Diversity Panel data. Our results are obtained without using an outgroup, and are consistent with the prevalent recent single-origin model of human migration. At the level of species, we introduce a continuous time stochastic process, the geometric Poisson indel process, that allows indel rates to vary across sites. We design an efficient algorithm for computing the probability of a given multiple sequence alignment based on our new indel model. We describe a method to construct phylogeny estimates from a fixed alignment using neighbor-joining. Using simulation studies, we show that ignoring indel rate variation may have a detrimental effect on the accuracy of the inferred phylogenies, and that our proposed method can sidestep this issue by inferring latent indel rate categories. We also show that our phylogenetic inference method may be more stable to taxa subsampling in a real data experiment compared to some existing methods that either ignore indels or ignore indel rate variation.
Science, Faculty of
Statistics, Department of
Graduate
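The asymmetric neighbor-joining algorithm is the thesis's own contribution; for context, here is a minimal implementation of the classical (unrooted) neighbor-joining it builds on:

```python
# Classical neighbor-joining (Saitou & Nei, 1987) on a distance matrix,
# returning a nested-tuple tree with branch lengths. The thesis's
# asymmetric, rooted variant is not reproduced here.
import numpy as np

def neighbor_joining(D, names):
    D = np.asarray(D, dtype=float)
    nodes = list(names)
    while len(nodes) > 2:
        n = len(nodes)
        r = D.sum(axis=1)
        # Q-criterion; mask the diagonal before taking the argmin
        Q = (n - 2) * D - r[:, None] - r[None, :]
        np.fill_diagonal(Q, np.inf)
        i, j = np.unravel_index(np.argmin(Q), Q.shape)
        # branch lengths from the new internal node u to i and j
        d_iu = 0.5 * D[i, j] + (r[i] - r[j]) / (2 * (n - 2))
        d_ju = D[i, j] - d_iu
        joined = (nodes[i], nodes[j], round(d_iu, 3), round(d_ju, 3))
        # distances from u to every remaining node k
        d_uk = 0.5 * (D[i] + D[j] - D[i, j])
        keep = [k for k in range(n) if k not in (i, j)]
        D = np.vstack([np.column_stack([D[np.ix_(keep, keep)], d_uk[keep]]),
                       np.append(d_uk[keep], 0.0)])
        nodes = [nodes[k] for k in keep] + [joined]
    return (nodes[0], nodes[1], round(D[0, 1], 3))

# additive example: recovers branch lengths a:2, b:3, c:4, d:4, internal:3
D = [[0, 5, 9, 9], [5, 0, 10, 10], [9, 10, 0, 8], [9, 10, 8, 0]]
print(neighbor_joining(D, ["a", "b", "c", "d"]))
```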
APA, Harvard, Vancouver, ISO, and other styles
32

Pérez, García Nora. "Inferencia espacial y predicción de la distribución de plantas: un estudio a diferentes escalas = Spatial inference and prediction of plant species distribution: a multiscale study." Doctoral thesis, Universitat de Barcelona, 2013. http://hdl.handle.net/10803/129083.

Full text
Abstract:
La pérdida de biodiversidad se ha acentuado en los últimos años debido a numerosas amenazas que incluyen la destrucción y degradación del hábitat, el cambio climático, la propagación de especies invasoras y la sobreexplotación. Por tanto, uno de los principales retos de la biología de la conservación es detener su continuo y acelerado declive. Sin embargo, mientras fenómenos como el calentamiento global, la contaminación y los cambios en los usos del suelo operan en áreas muy grandes o durante largos períodos de tiempo, los datos de campo que caracterizan la investigación ecológica se recogen normalmente en áreas relativamente pequeñas durante estudios de corta duración. Por tanto, los científicos necesitan cada vez más la utilización de medidas y datos locales para evaluar estos cambios a escala paisajística, regional y global, y modelos o simulaciones estadísticas para extrapolar estos datos ambientales al espacio geográfico. La presente Tesis doctoral estudia la aplicabilidad del modelado de la distribución de especies (MDE) en el análisis de la diversidad vegetal, mediante el uso de distintas técnicas de modelado, escalas geográficas y niveles de organización de plantas. Es por tanto, una investigación aplicada que pretende mostrar la utilidad de nuevas herramientas de análisis espacial para la gestión y conservación de la diversidad vegetal. El estudio se ha realizado a través de diferentes aproximaciones, con el objetivo de: i) describir patrones de riqueza de plantas en Cataluña y detectar puntos calientes de biodiversidad en la región, mediante la utilización de tres técnicas distintas de modelado y en base a la distribución conocida de toda su flora (Capítulo 1); ii) evaluar el efecto del cambio climático en las regiones alpinas y subalpinas del Pirineo Oriental, a través del estudio de la distribución potencial de unidades de vegetación (enfoque top-down) y valorar su aplicabilidad en relación al modelado de especies (enfoque bottom-up) (Capítulo 2); iii) construir un modelo dinámico para evaluar los efectos del cambio climático en la viabilidad a largo plazo de un taxón endémico y amenazado de la Península Ibérica, Vella pseudocytisus subsp. paui Gómez Campo. La generación de un modelo dinámico proporciona una aproximación más realista al incorporar la dinámica demográfica de la especie estudiada, y nos permite estimar su riesgo de extinción (Capítulo 3); iv) seleccionar e implementar un algoritmo de modelado en línea en el servidor de datos Sistema de la Vegetación Ibérica y Macaronésica (SIVIM) (Capítulo 4). La aplicación de MDE nos ha permitido obtener mejores patrones de la distribución espacial de la riqueza florística para Cataluña, frente a los patrones previamente proyectados basándose en los datos recogidos por el BDBC. Por otro lado, el uso de un enfoque top-down para predecir cambios futuros en la distribución de las unidades de vegetación puede ser tan útil como el utilizado en estudios previos para especies, con el objetivo de obtener mejores herramientas para la planificación de políticas relacionadas con la conservación de la biodiversidad. 
Finalmente, los resultados obtenidos en la presente Tesis doctoral nos permiten afirmar que los modelos de distribución de especies son herramientas de gran valor en la investigación de la biología de la conservación vegetal, al proporcionar alternativas precisas para la descripción de patrones biogeográficos, predicción de los efectos del cambio climático, evaluación del potencial dispersivo de las especies, así como para el diseño de reservas y planes de conservación. Así mismo, es importante destacar el papel de la escala en el modelado de la distribución de especies y comunidades vegetales, ya que ésta determinará tanto la aplicabilidad de los resultados como su significación biológica.
The loss of biodiversity has been accentuated in recent years due to numerous threats, including the destruction and degradation of habitat, climate change, the spread of invasive species and over-exploitation. However, while phenomena such as global warming, pollution and changes in land use operate over very large areas or for long periods of time, the field data that characterize ecological research are normally gathered in relatively small areas during studies of short duration. Scientists are therefore increasingly required to use local measurements and data to evaluate changes at the landscape, regional and global scales, using models or statistical simulations to extrapolate these environmental data to the geographical space. This doctoral thesis examines the applicability of species distribution modelling (SDM) in the analysis of plant diversity through the use of various modelling techniques, geographical scales and levels of plant organization. It is, therefore, an applied research effort that aims to show the utility of new tools of spatial analysis for the management and conservation of plant diversity. The study was conducted through different approaches, aiming to: i) describe plant richness patterns in Catalonia and detect biodiversity hotspots in the region, through the use of three different modelling techniques and based on the known distribution of its entire flora (Chapter 1); ii) evaluate the effect of climate change in the alpine and subalpine regions of the Oriental Pyrenees through the study of the potential distribution of vegetation units (top-down approach), and evaluate their applicability in relation to species modelling (bottom-up approach) (Chapter 2); iii) assemble a dynamic model to assess the effects of climate change on the long-term viability of an endemic and threatened taxon of the Iberian Peninsula, Vella pseudocytisus subsp. paui Gómez Campo; the generation of a dynamic model provides a more realistic approach by incorporating the demographic dynamics of the studied species, and allows us to estimate its risk of extinction (Chapter 3); iv) select and implement an online modelling algorithm on the server of the Iberian and Macaronesian Vegetation Information System (SIVIM) (Chapter 4). The application of SDMs has allowed us to obtain better spatial distribution patterns of the floristic richness of Catalonia compared to the previously projected patterns based on data collected by the BDBC. On the other hand, the use of a top-down approach to predict future changes in the distribution of vegetation units can be as useful as those used in previous studies for species, with the goal of obtaining better tools for the planning of policies related to the conservation of biodiversity. Finally, the results obtained through this doctoral thesis allow us to state that species distribution models are tools of great value in plant conservation biology research, providing precise alternatives for the description of biogeographical patterns, the prediction of the effects of climate change and the evaluation of the dispersive potential of species, as well as for the design of reserves and conservation plans. In addition, it is important to highlight the role of scale in the modelling of the distribution of species and vegetation communities, since it determines both the applicability of the results and their biological significance.
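A toy version of the SDM workflow the thesis applies: presence/absence regressed on climate covariates, then projected onto new cells. Data, covariates and coefficients are all simulated:

```python
# Minimal species distribution model: logistic regression of simulated
# presence/absence on two invented climate covariates, then prediction
# of habitat suitability for new grid cells.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(11)
n = 1000
temp = rng.normal(12, 4, n)                 # mean annual temperature (C)
precip = rng.normal(900, 250, n)            # annual precipitation (mm)
logit = -1.0 + 0.5 * (temp - 12) / 4 + 0.8 * (precip - 900) / 250
presence = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = np.column_stack([temp, precip])
sdm = LogisticRegression().fit(X, presence)

# habitat suitability for two hypothetical grid cells
cells = np.array([[16.0, 1200.0], [7.0, 600.0]])
print(sdm.predict_proba(cells)[:, 1])
```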
APA, Harvard, Vancouver, ISO, and other styles
33

Wu, Jianrong. "Asymptotic likelihood inference." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/nq41050.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Morris, Quaid Donald Jozef 1972. "Practical probabilistic inference." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/29989.

Full text
Abstract:
Thesis (Ph. D. in Computational Neuroscience)--Massachusetts Institute of Technology, Dept. of Brain and Cognitive Sciences, 2003.
Includes bibliographical references (leaves 157-163).
The design and use of expert systems for medical diagnosis remains an attractive goal. One such system, the Quick Medical Reference, Decision Theoretic (QMR-DT), is based on a Bayesian network. This very large-scale network models the appearance and manifestation of disease and has approximately 600 unobservable nodes and 4000 observable nodes that represent, respectively, the presence and measurable manifestation of disease in a patient. Exact inference of posterior distributions over the disease nodes is extremely intractable using generic algorithms. Inference can be made much more efficient by exploiting the QMR-DT's unique structure. Indeed, tailor-made inference algorithms for the QMR-DT efficiently generate exact disease posterior marginals for some diagnostic problems and accurate approximate posteriors for others. In this thesis, I identify a risk with using the QMR-DT disease posteriors for medical diagnosis. Specifically, I show that patients and physicians conspire to preferentially report findings that suggest the presence of disease. Because the QMR-DT does not contain an explicit model of this reporting bias, its disease posteriors may not be useful for diagnosis. Correcting these posteriors requires augmenting the QMR-DT with additional variables and dependencies that model the diagnostic procedure. I introduce the diagnostic QMR-DT (dQMR-DT), a Bayesian network containing both the QMR-DT and a simple model of the diagnostic procedure. Using diagnostic problems sampled from the dQMR-DT, I show the danger of doing diagnosis using disease posteriors from the unaugmented QMR-DT.
I introduce a new class of approximate inference methods, based on feed-forward neural networks, for both the QMR-DT and the dQMR-DT. I show that these methods, recognition models, generate accurate approximate posteriors on the QMR-DT, on the dQMR-DT, and on a version of the dQMR-DT specified only indirectly through a set of presolved diagnostic problems.
by Quaid Donald Jozef Morris.
Ph. D. in Computational Neuroscience
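The QMR-DT's finding layer is conventionally modeled as a noisy-OR, which is the structure its tailor-made inference algorithms exploit; a sketch with invented probabilities:

```python
# Noisy-OR finding layer: each present disease independently gets a
# chance q_i of turning the finding on, plus a leak term for "no
# disease present". Probabilities below are invented.
import numpy as np

q_leak = 0.01                       # finding appears with no disease
q = np.array([0.8, 0.3, 0.05])      # activation prob. per parent disease
d = np.array([1, 0, 1])             # which diseases are present

p_absent = (1 - q_leak) * np.prod((1 - q) ** d)
print(f"P(finding = 1 | diseases) = {1 - p_absent:.3f}")
```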
APA, Harvard, Vancouver, ISO, and other styles
35

Levine, Daniel S. Ph D. Massachusetts Institute of Technology. "Focused active inference." Thesis, Massachusetts Institute of Technology, 2014. http://hdl.handle.net/1721.1/95559.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Aeronautics and Astronautics, 2014.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 91-99).
In resource-constrained inferential settings, uncertainty can be efficiently minimized with respect to a resource budget by incorporating the most informative subset of observations - a problem known as active inference. Yet despite the myriad recent advances in both understanding and streamlining inference through probabilistic graphical models, which represent the structural sparsity of distributions, the propagation of information measures in these graphs is less well understood. Furthermore, active inference is an NP-hard problem, thus motivating investigation of bounds on the suboptimality of heuristic observation selectors. Prior work in active inference has considered only the unfocused problem, which assumes all latent states are of inferential interest. Often one learns a sparse, high-dimensional model from data and reuses that model for new queries that may arise. As any particular query involves only a subset of relevant latent states, this thesis explicitly considers the focused problem where irrelevant states are called nuisance variables. Marginalization of nuisances is potentially computationally expensive and induces a graph with less sparsity; observation selectors that treat nuisances as notionally relevant may fixate on reducing uncertainty in irrelevant dimensions. This thesis addresses two primary issues arising from the retention of nuisances in the problem and representing a gap in the existing observation selection literature. The interposition of nuisances between observations and relevant latent states necessitates the derivation of nonlocal information measures. This thesis presents propagation algorithms for nonlocal mutual information (MI) on universally embedded paths in Gaussian graphical models, as well as algorithms for estimating MI on Gaussian graphs with cycles via embedded substructures, engendering a significant computational improvement over existing linear algebraic methods. The presence of nuisances also undermines application of a technical diminishing returns condition called submodularity, which is typically used to bound the performance of greedy selection. This thesis introduces the concept of submodular relaxations, which can be used to generate online-computable performance bounds, and analyzes the class of optimal submodular relaxations providing the tightest such bounds.
by Daniel S. Levine.
Ph. D.
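The mutual information whose nonlocal propagation the thesis studies has a closed form in the Gaussian case; a sketch with an invented covariance, splitting the variables into a relevant set A and an observed set B:

```python
# Mutual information between two blocks of a jointly Gaussian vector:
# I(x_A; x_B) = 0.5 * log( det(S_AA) * det(S_BB) / det(S) ).
# The covariance matrix is invented for illustration.
import numpy as np

S = np.array([[2.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.5]])
A, B = [0], [1, 2]                   # relevant vs. observed index sets
S_AA = S[np.ix_(A, A)]
S_BB = S[np.ix_(B, B)]

mi = 0.5 * np.log(np.linalg.det(S_AA) * np.linalg.det(S_BB)
                  / np.linalg.det(S))
print(f"I(x_A; x_B) = {mi:.4f} nats")
```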
APA, Harvard, Vancouver, ISO, and other styles
36

Olšarová, Nela. "Inference propojení komponent." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236505.

Full text
Abstract:
The Master's Thesis deals with the design of a hardware component interconnection inference algorithm to be used in the FPGA schema editor integrated into the educational integrated development environment VLAM IDE. The aim of the algorithm is to support the user by finding an optimal interconnection of two given components. The editor and the development environment are implemented as an Eclipse plugin using the GMF framework. A brief description of these technologies and of embedded systems design is followed by the design of the inference algorithm. This problem is a topic of combinatorial optimization, related to bipartite matching and the assignment problem. After this, the implementation of the algorithm is described, followed by tests and a summary of achieved results.
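The abstract's reduction to bipartite matching can be made concrete with the Hungarian algorithm as shipped in SciPy; the port-compatibility costs below are invented:

```python
# Port interconnection as an assignment problem: given a cost for wiring
# output port i of component A to input port j of component B (e.g. a
# penalty for width mismatch or name dissimilarity), pick the
# minimum-cost matching. Costs are invented for illustration.
import numpy as np
from scipy.optimize import linear_sum_assignment

cost = np.array([[0, 4, 8],     # rows: output ports of component A
                 [4, 0, 6],     # cols: input ports of component B
                 [9, 7, 1]])
rows, cols = linear_sum_assignment(cost)
for i, j in zip(rows, cols):
    print(f"A.out[{i}] -> B.in[{j}] (cost {cost[i, j]})")
print("total cost:", cost[rows, cols].sum())
```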
APA, Harvard, Vancouver, ISO, and other styles
37

MacCartney, Bill. "Natural language inference /." May be available electronically:, 2009. http://proquest.umi.com/login?COPT=REJTPTU1MTUmSU5UPTAmVkVSPTI=&clientId=12498.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Amjad, Muhammad Jehangir. "Sequential data inference via matrix estimation : causal inference, cricket and retail." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/120190.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2018.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 185-193).
This thesis proposes a unified framework to capture the temporal and longitudinal variation across multiple instances of sequential data. Examples of such data include sales of a product over a period of time across several retail locations; trajectories of scores across cricket games; and annual tobacco consumption across the United States over a period of decades. A key component of our work is the latent variable model (LVM) which views the sequential data as a matrix where the rows correspond to multiple sequences while the columns represent the sequential aspect. The goal is to utilize information in the data within the sequence and across different sequences to address two inferential questions: (a) imputation or "filling missing values" and "de-noising" observed values, and (b) forecasting or predicting "future" values, for a given sequence of data. Using this framework, we build upon the recent developments in "matrix estimation" to address the inferential goals in three different applications. First, a robust variant of the popular "synthetic control" method used in observational studies to draw causal statistical inferences. Second, a score trajectory forecasting algorithm for the game of cricket using historical data. This leads to an unbiased target resetting algorithm for shortened cricket games which is an improvement upon the biased incumbent approach (Duckworth-Lewis-Stern). Third, an algorithm which leads to a consistent estimator for the time- and location-varying demand of products using censored observations in the context of retail. As a final contribution, the algorithms presented are implemented and packaged as a scalable open-source library for the imputation and forecasting of sequential data with applications beyond those presented in this work.
by Muhammad Jehangir Amjad.
Ph. D.
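One standard matrix-estimation subroutine for inferential goal (a), imputation and de-noising, is iterated SVD truncation ("hard impute"). This is a sketch of the generic technique, not the thesis's exact algorithm; the synthetic data are ours.

```python
import numpy as np

def svd_impute(M, mask, rank, n_iters=100):
    """Alternately fill missing entries (mask == False) and project
    the matrix onto its best rank-`rank` approximation."""
    low_rank = np.where(mask, M, 0.0)
    for _ in range(n_iters):
        filled = np.where(mask, M, low_rank)   # keep observed entries
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return low_rank

rng = np.random.default_rng(1)
truth = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 30))
mask = rng.random(truth.shape) > 0.3           # ~70% of entries observed
est = svd_impute(truth, mask, rank=2)
print("mean error on missing entries:", np.abs(est - truth)[~mask].mean())
```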
APA, Harvard, Vancouver, ISO, and other styles
39

Schwaller, Loïc. "Exact Bayesian Inference in Graphical Models : Tree-structured Network Inference and Segmentation." Thesis, Université Paris-Saclay (ComUE), 2016. http://www.theses.fr/2016SACLS210/document.

Full text
Abstract:
This dissertation investigates the problem of network inference. The statistical framework tailored to this task is that of graphical models, in which the (in)dependence relationships satisfied by a multivariate distribution are represented through a graph. We consider the problem from a Bayesian perspective and focus on a subset of graphs making structure inference possible in an exact and efficient manner, namely spanning trees. Indeed, the integration of a function defined on spanning trees can be performed with cubic complexity with respect to the number of variables under a factorisation assumption on the edges, in spite of the super-exponential cardinality of this set. A careful choice of prior distributions on both graphs and distribution parameters allows this result to be used for network inference in tree-structured graphical models, for which we provide a complete and formal framework. We also consider the situation in which observations are organised in a multivariate time series. We assume that the underlying graph describing the dependence structure of the distribution is affected by an unknown number of abrupt changes throughout time. Our goal is then to retrieve the number and locations of these change-points, therefore dealing with a segmentation problem. Using spanning trees and assuming that segments are independent from one another, we show that this can be achieved with polynomial complexity with respect to both the number of variables and the length of the series.
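The cubic-time integration over spanning trees that this abstract relies on is the weighted Matrix-Tree theorem: the sum over all spanning trees of an edge-factorizing function reduces to a single determinant. A minimal sketch, in our notation:

```python
import numpy as np

def spanning_tree_sum(W):
    """Sum over all spanning trees T of prod_{(i,j) in T} W[i, j],
    computed as one determinant via the weighted Matrix-Tree theorem.
    W is a symmetric edge-weight matrix with zero diagonal."""
    L = np.diag(W.sum(axis=1)) - W     # weighted graph Laplacian
    return np.linalg.det(L[1:, 1:])    # any principal cofactor works

# Sanity check: the triangle with unit weights has exactly 3 spanning trees.
W = np.ones((3, 3)) - np.eye(3)
print(spanning_tree_sum(W))            # ~3.0
```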
APA, Harvard, Vancouver, ISO, and other styles
40

Presutto, Matteo. "Squeezing and Accelerating Neural Networks on Resource Constrained Hardware for Real Time Inference." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-247911.

Full text
Abstract:
As the internet user base grows, so do the logistical difficulties of handling ever higher volumes of data. This large amount of information is now being exploited by artificial intelligence algorithms to deliver value to society on a global scale. Among these algorithms, the widespread adoption of neural networks in industrial settings is automating tasks previously unsolvable by computers. As of today, efficiency limits the applicability of such technology to big data, and efforts are being made to develop new acceleration solutions. In this project, we analyze the computational capabilities of a multicore digital signal processor called the EMCA (Ericsson Many-Core Architecture) when it comes to executing neural networks. The EMCA is a proprietary chip used for real-time processing of data in the pipeline of a radio base station. We developed an inference engine to run neural networks on the EMCA. The engine's software was built on a proprietary operating system called Flake OS, which runs on the EMCA. On top of the inference engine, we wrote a neural-network squeezing pipeline based on quantization. On MNIST, the quantization algorithm can reduce the size of the networks by a factor of 4 with sub-1% accuracy degradation. The inference engine has been optimized to exploit the quantization utility and can run quantized neural networks. Tests were done to understand the direct implications of using such an algorithm, and we show that quantization is indeed beneficial for inference on DSPs. Finally, the EMCA demonstrated state-of-the-art computational capabilities for neural network inference.
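The reported 4x size reduction is consistent with the standard float32-to-int8 scheme. This sketch shows generic symmetric post-training quantization, not the thesis's specific pipeline:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor post-training quantization: float32
    weights to int8 values plus a single scale factor."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(2).standard_normal(1000).astype(np.float32)
q, scale = quantize_int8(w)
print("max abs rounding error:", np.abs(dequantize(q, scale) - w).max())
```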
APA, Harvard, Vancouver, ISO, and other styles
41

Thouin, Frédéric. "Bayesian inference in networks." Thesis, McGill University, 2011. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=104476.

Full text
Abstract:
Bayesian inference is a method that can be used to estimate an unknown and/or unobservable parameter based on evidence accumulated over time. In this thesis, we apply Bayesian inference techniques to two network-based problems. First, we consider multi-target tracking in networks with superpositional sensors, i.e., sensors that generate measurements equal to the sum of the individual contributions of each target. We derive a tractable form for a novel moment-based multi-target filter called the Additive Likelihood Moment (ALM) filter. We show, through simulations, that our particle approximation of the ALM filter is more accurate and computationally efficient than Markov chain Monte Carlo-based particle methods for radio-frequency (RF) tomographic tracking of multiple targets. The second problem we study is multi-path available bandwidth estimation in computer networks. We propose a probabilistic-rate-based definition of available bandwidth, probabilistic available bandwidth (PAB), that addresses flaws in the classical utilization-based definition and in existing estimation tools. We design a network-wide estimation tool that uses factor graphs, belief propagation, and adaptive sampling to minimize overhead. We deploy our tool on the PlanetLab network and show that it produces accurate estimates of the PAB while achieving significant gains (over 70%) in measurement overhead and latency over a popular estimation tool (Pathload). We extend our tool to (i) track PAB over time and (ii) use chirps to further reduce the number of required measurements by over 80%. Our simulations and online experiments demonstrate that our tracking algorithm is more accurate than block-based approaches without any significant additional complexity.
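The "evidence accumulated over time" idea in this abstract is sequential Bayesian updating. A minimal conjugate illustration (Beta-Bernoulli, our toy example, not the ALM filter or the PAB estimator):

```python
import numpy as np

# Each observation turns the posterior into the prior for the next step.
alpha, beta = 1.0, 1.0                 # uniform Beta prior on the rate
rng = np.random.default_rng(3)
true_rate = 0.3
for _ in range(200):
    x = rng.random() < true_rate       # one new piece of evidence
    alpha += x
    beta += 1 - x
print("posterior mean:", alpha / (alpha + beta))   # close to 0.3
```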
APA, Harvard, Vancouver, ISO, and other styles
42

Anderson, Christopher Lyon. "Type inference for JavaScript." Thesis, Imperial College London, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.429404.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Adami, K. Z. "Bayesian inference and deconvolution." Thesis, University of Cambridge, 2004. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.595341.

Full text
Abstract:
This thesis is concerned with the development of Bayesian methods for inference and deconvolution. We compare and contrast different Bayesian methods for model selection, specifically Markov chain Monte Carlo (MCMC) methods and variational methods, and their application to medical and industrial problems. In chapter 1, the Bayesian framework is outlined. In chapter 2, the different methods for Bayesian model selection are introduced and assessed in turn. Problems with MCMC and variational methods are highlighted before a new method is developed that combines the strengths of both. Chapter 3 applies the inferential methods described in chapter 2 to the problem of interpolation, before a regression neural network is implemented and tested on a set of data from the microelectronics industry. Chapter 4 applies the interpolation methods developed in chapter 3 to characterise the electrical nature of the testing site in the integrated circuit (IC) manufacturing process. Chapter 5 describes independent component analysis (ICA) as a solution to the bilinear decomposition problem and its application to magnetic resonance imaging; it also compares and contrasts various Bayesian algorithms for the bilinear problem with a non-Bayesian MUSIC algorithm. Chapter 6 describes various models for the deconvolution of images, including a regression network. The ICA model of chapter 5 is then extended to the deconvolution and blind deconvolution problems with the addition of intrinsic correlation functions.
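As a point of reference for the MCMC side of the comparison, here is the basic random-walk Metropolis scheme on a toy posterior; a generic sketch, not Adami's implementation:

```python
import numpy as np

def metropolis(log_post, x0, n_samples, step=0.5, seed=0):
    """Random-walk Metropolis: propose a local move, accept it with
    probability min(1, posterior ratio)."""
    rng = np.random.default_rng(seed)
    x, lp = x0, log_post(x0)
    samples = []
    for _ in range(n_samples):
        prop = x + step * rng.standard_normal()
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:   # accept/reject step
            x, lp = prop, lp_prop
        samples.append(x)
    return np.array(samples)

# Toy posterior: a standard normal.
draws = metropolis(lambda x: -0.5 * x * x, 0.0, 5000)
print(draws.mean(), draws.std())   # approximately 0 and 1
```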
APA, Harvard, Vancouver, ISO, and other styles
44

Frühwirth-Schnatter, Sylvia. "On Fuzzy Bayesian Inference." Department of Statistics and Mathematics, WU Vienna University of Economics and Business, 1990. http://epub.wu.ac.at/384/1/document.pdf.

Full text
Abstract:
In the paper at hand we apply fuzzy set theory to Bayesian statistics to obtain "Fuzzy Bayesian Inference". In the subsequent sections we discuss a fuzzy-valued likelihood function, Bayes' theorem for both fuzzy data and fuzzy priors, a fuzzy Bayes estimator, fuzzy predictive densities and distributions, and fuzzy H.P.D. regions. (author's abstract)
Series: Forschungsberichte / Institut für Statistik
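For reference, the crisp Bayes' theorem that the paper extends to fuzzy data D and fuzzy priors (notation ours, not the paper's):

```latex
\pi(\theta \mid D) \;=\;
  \frac{L(D \mid \theta)\,\pi(\theta)}
       {\int L(D \mid \theta')\,\pi(\theta')\,\mathrm{d}\theta'}
```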
APA, Harvard, Vancouver, ISO, and other styles
45

Upsdell, M. P. "Bayesian inference for functions." Thesis, University of Nottingham, 1985. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.356022.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Davies, Winton H. E. "Communication of inductive inference." Thesis, University of Aberdeen, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.400670.

Full text
Abstract:
This thesis addresses the question: "How can knowledge learnt through inductive inference be communicated in a multi-agent system?" Existing agent communication languages, such as KQML, assume logically sound inference methods. Unfortunately, induction is logically unsound. In general, machine learning techniques infer knowledge (or hypotheses) consistent with the locally available facts. In a multi-agent system, however, hypotheses learnt by one agent can directly contradict knowledge held by another. If an agent communicates induced knowledge as though it were logically sound, then the knowledge held by other agents in the community may become inconsistent. The answer we present in this thesis is that agents must, in general, communicate the bounds of such induced knowledge. The Version Space framework characterises inductive inference as a process which identifies the set of hypotheses consistent with both the observable facts and the constraints of the hypothesis description language. A Version Space can be expressed by two boundary sets, representing the most general and most specific hypotheses. We thus propose that, when communicating an induced hypothesis, the hypothesis be bounded by descriptions of the most general and most specific hypotheses. To allow agents to integrate induced hypotheses with their own facts or their own induced hypotheses, the technique of Version Space Intersection can be used. We have investigated how boundary-set descriptions can be generated for the common case of machine learning algorithms which learn hypotheses from unrestricted Version Spaces. This is a hard computational problem, equivalent to finding the minimal DNF description of a set of logical sentences. We consider four alternative approaches: exact minimization using the Quine-McCluskey algorithm; a naive, information-theoretic hill-climbing search; Espresso II, a sophisticated heuristic logic minimization algorithm; and unsound approximation techniques. We demonstrate that none of these techniques scales to realistic machine learning problems.
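A minimal sketch of boundary sets and Version Space Intersection, assuming Mitchell-style conjunctive hypotheses and positive-only examples (the unrestricted case the thesis tackles is much harder):

```python
# Hypotheses are attribute tuples whose slots are either a concrete
# value or the wildcard '?'. With positive-only examples the S-boundary
# collapses to a single most specific hypothesis (Find-S).

def generalize(h, example):
    """Minimally generalize hypothesis h so that it covers `example`."""
    return tuple(a if a == b else '?' for a, b in zip(h, example))

def s_boundary(examples):
    h = examples[0]
    for e in examples[1:]:
        h = generalize(h, e)
    return h

def intersect_version_spaces(s1, s2):
    """The S-boundary of the intersection of two agents' version
    spaces is the least general hypothesis covering both S-boundaries."""
    return generalize(s1, s2)

agent_a = s_boundary([('red', 'round', 'small'), ('red', 'round', 'large')])
agent_b = s_boundary([('red', 'square', 'small')])
print(agent_a)                                      # ('red', 'round', '?')
print(intersect_version_spaces(agent_a, agent_b))   # ('red', '?', '?')
```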
APA, Harvard, Vancouver, ISO, and other styles
47

Feldman, Jacob 1965. "Perceptual decomposition as inference." Thesis, Massachusetts Institute of Technology, 1990. http://hdl.handle.net/1721.1/13693.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Witty, Carl Roger. "The ontic inference language." Thesis, Massachusetts Institute of Technology, 1995. http://hdl.handle.net/1721.1/35027.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

De, León Eduardo Enrique. "Medical abstract inference dataset." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/119516.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (page 35).
In this thesis, I built a dataset for predicting clinical outcomes from medical abstracts and their titles. Medical Abstract Inference consists of 1,794 data points. Titles were filtered to include the abstract's reported medical intervention and clinical outcome. Data points were annotated with the intervention's effect on the outcome; the resulting labels were one of the following: increased, decreased, or no significant difference. In addition, rationale sentences were marked; these sentences supply the supporting evidence necessary for the overall prediction. Preliminary modeling was also done to evaluate the corpus, using top-performing natural language inference models as well as rationale-based models and linear classifiers.
by Eduardo Enrique de León.
M. Eng.
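A linear-classifier baseline of the kind the abstract mentions can be sketched as below. The miniature data, label names, and field choices are ours, not the thesis's:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical miniature of the task: title+abstract text -> outcome label.
texts = ["aspirin reduced all-cause mortality in trial patients",
         "drug X showed no significant difference versus placebo"]
labels = ["decreased", "no_difference"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                    LogisticRegression(max_iter=1000))
clf.fit(texts, labels)
print(clf.predict(["beta blockers increased survival after infarction"]))
```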
APA, Harvard, Vancouver, ISO, and other styles
50

Eaton, Frederik. "Combining approximations for inference." Thesis, University of Cambridge, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.609868.

Full text
APA, Harvard, Vancouver, ISO, and other styles