Dissertations / Theses on the topic 'Inferenza statistica'

To see the other types of publications on this topic, follow the link: Inferenza statistica.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Inferenza statistica.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Agostinelli, Claudio. "Inferenza statistica robusta basata sulla funzione di verosimiglianza pesata: alcuni sviluppi." Doctoral thesis, Italy, 1998. http://hdl.handle.net/10278/25831.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Capriati, Paola Bianca Martina. "L'utilizzo del metodo Bootstrap nella statistica inferenziale." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2015. http://amslaurea.unibo.it/8715/.

Full text
Abstract:
This work introduces the bootstrap method, developed by Bradley Efron starting in 1979. The bootstrap is a statistical resampling technique based on computer calculation, and is therefore also described as computer-intensive. The advantages and disadvantages of the method are analysed through examples on real data sets implemented in the statistical software R. These analyses cover two of the main uses of the bootstrap, point estimation and the construction of confidence intervals, both based on the possibility of approximating the sampling distribution of any estimator, regardless of its computational complexity.
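As a rough illustration of the resampling idea described in this abstract (the thesis itself works in R; the sketch below uses Python, and the toy data set and sample size are invented for the example), a percentile bootstrap for a point estimate and confidence interval can be written as:

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.exponential(scale=2.0, size=100)  # toy data standing in for a real data set

def percentile_bootstrap(x, stat=np.mean, n_boot=5000, alpha=0.05):
    """Approximate the sampling distribution of `stat` by resampling with replacement."""
    boot = np.array([stat(rng.choice(x, size=len(x), replace=True)) for _ in range(n_boot)])
    lower, upper = np.quantile(boot, [alpha / 2, 1 - alpha / 2])
    return stat(x), (lower, upper)

estimate, (lo, hi) = percentile_bootstrap(sample)
print(f"point estimate: {estimate:.3f}, 95% percentile CI: ({lo:.3f}, {hi:.3f})")
```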
APA, Harvard, Vancouver, ISO, and other styles
3

Monda, Anna. "Inferenza non parametrica nel contesto di dati dipendenti: polinomi locali e verosimiglianza empirica." Doctoral thesis, Universita degli studi di Salerno, 2013. http://hdl.handle.net/10556/1285.

Full text
Abstract:
2010 - 2011
This work is situated in the context of recent research on nonparametric analysis tools, and in particular it examines the use of local polynomials and empirical likelihood in the case of dependent data. The main forms of dependence treated here are those satisfying the definition of alpha-mixing; in this setting, our work attempts to reconcile nonparametric techniques, represented by local polynomials, with the empirical likelihood approach, seeking to combine and emphasise the strengths of both methodologies: local polynomials provide a more accurate estimate to be placed within the definition of empirical likelihood given by Owen (1988). The advantages are easy to appreciate in terms of the immediacy and practical use of this technique. The results are analysed from a theoretical point of view and then confirmed empirically, also drawing from the data useful information on the effective sensitivity to the most crucial and delicate parameter to be chosen for local polynomial estimators: the bandwidth parameter. Throughout the thesis we first present the context in which we operate, specifying the forms of dependence treated; in the second chapter we state the characteristics and properties of local polynomials; in the third chapter we analyse empirical likelihood in detail, again with particular attention to its theoretical properties; finally, in the fourth chapter we present original theoretical results obtained from the preceding theoretical treatment. The concluding chapter proposes a simulation study based on the theoretical properties obtained in the previous chapter. The closing remarks discuss the outcomes of the simulations, which not only confirm the validity of the theoretical results presented in the thesis, but also provide evidence in favour of a further analysis, for the proposed tests, of the sensitivity to the smoothing parameter employed. [by the author]
X n.s.
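For readers unfamiliar with the local polynomial estimators the abstract refers to, the following is a minimal sketch (not the author's code; the Gaussian kernel, the bandwidth value and the simulated data are illustrative assumptions) of a local linear regression estimate at a single point:

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local linear estimate of E[Y | X = x0] with a Gaussian kernel and bandwidth h."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)          # kernel weights centred at x0
    X = np.column_stack([np.ones_like(x), x - x0])  # local design: intercept + slope
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta[0]                                  # intercept = fitted value at x0

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 2 * np.pi, 200))
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)  # toy signal with noise
print(local_linear(np.pi, x, y, h=0.4))             # the bandwidth h drives the bias/variance trade-off
```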
APA, Harvard, Vancouver, ISO, and other styles
4

Mancini, Martina. "Teorema di Cochran e applicazioni." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2015. http://amslaurea.unibo.it/9145/.

Full text
Abstract:
Statistics is a branch of mathematics that studies methods for collecting, organising and analysing sets of numerical data, whose variation is influenced by different causes, with the aim both of describing the characteristics of the phenomenon to which the data refer and of deducing, where possible, the general laws that govern it. Statistics is divided into descriptive (or deductive) statistics and inductive statistics, also called statistical inference. Here we focus on the latter, which studies the conditions under which the conclusions drawn from the statistical analysis of a sample are valid in more general cases. In particular, statistical inference aims to induce, or infer, the properties of a population (its parameters) on the basis of the known data of a sample. The main purpose of this thesis is to analyse Cochran's theorem and to illustrate its possible applications to estimation problems in a Gaussian sample. In particular, Cochran's theorem concerns an important property of multivariate normal distributions, which is fundamental for the construction of confidence intervals for the unknown parameters.
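For reference, the standard consequence of Cochran's theorem used for confidence intervals in a Gaussian sample (stated here from general knowledge, not quoted from the thesis) is:

```latex
% X_1, \dots, X_n i.i.d. N(\mu, \sigma^2); Cochran's theorem gives independence of the two pieces
\[
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad
S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2, \qquad
\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1} \ \text{independently of } \bar{X}.
\]
% Hence the Studentised mean is t-distributed and yields the usual interval:
\[
\frac{\sqrt{n}\,(\bar{X} - \mu)}{S} \sim t_{n-1}, \qquad
\bar{X} \pm t_{n-1,\,1-\alpha/2}\,\frac{S}{\sqrt{n}} \ \text{covers } \mu \text{ with probability } 1-\alpha.
\]
```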
APA, Harvard, Vancouver, ISO, and other styles
5

HAMMAD, AHMED TAREK. "Tecniche di valutazione degli effetti dei Programmi e delle Politiche Pubbliche. L' approccio di apprendimento automatico causale." Doctoral thesis, Università Cattolica del Sacro Cuore, 2022. http://hdl.handle.net/10280/110705.

Full text
Abstract:
The analysis of causal mechanisms has been considered in various disciplines such as sociology, epidemiology, political science, psychology and economics. These approaches allow uncovering causal relations and mechanisms by studying the role of a treatment variable (such as a policy or a program) on a set of outcomes of interest or on intermediate variables on the causal path between the treatment and the outcome variables. This thesis first focuses on reviewing and exploring alternative strategies to investigate causal effects and multiple mediation effects using Machine Learning algorithms, which have been shown to be particularly suited to research questions in complex settings with non-linear relations. Second, the thesis provides two empirical examples where two Machine Learning algorithms, namely the Generalized Random Forest and Multiple Additive Regression Trees, are used to account for important control variables in causal inference in a data-driven way. By bridging a fundamental gap between causality and advanced data modelling, this work combines state-of-the-art theories and modelling techniques.
APA, Harvard, Vancouver, ISO, and other styles
6

BOLZONI, MATTIA. "Variational inference and semi-parametric methods for time-series probabilistic forecasting." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2021. http://hdl.handle.net/10281/313704.

Full text
Abstract:
Probabilistic forecasting is a common task. The usual approach assumes a fixed structure for the outcome distribution, often called a model, which depends on unseen quantities called parameters, and uses data to infer a reasonable distribution over these latent values. The inference step is not always straightforward, because a single-value estimate can lead to poor performance and overfitting, while handling a proper posterior distribution with MCMC can be challenging. Variational Inference (VI) is emerging as a viable optimisation-based alternative that models the target posterior with instrumental variables called variational parameters. However, VI usually imposes a parametric structure on the proposed posterior. The thesis's first contribution is Hierarchical Variational Inference (HVI), a methodology that uses Neural Networks to create semi-parametric posterior approximations with the same minimum requirements as Metropolis-Hastings or Hamiltonian MCMC. The second contribution is a Python package for VI on time-series models for mean-covariance estimation, using HVI and standard VI techniques combined with Neural Networks. Results on econometric and financial data show a consistent improvement of VI over point estimates, in particular yielding forecasts with lower variance.
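As background for the variational approach described above (a textbook formulation, not taken from the thesis), VI replaces posterior sampling with the optimisation of the evidence lower bound (ELBO) over a family of distributions q indexed by variational parameters λ:

```latex
% Evidence lower bound maximised over the variational parameters \lambda
\[
\log p(y) \;\ge\; \mathcal{L}(\lambda)
  = \mathbb{E}_{q_\lambda(\theta)}\!\left[\log p(y,\theta) - \log q_\lambda(\theta)\right]
  = \log p(y) - \mathrm{KL}\!\left(q_\lambda(\theta)\,\|\,p(\theta \mid y)\right),
\]
% so maximising the ELBO over \lambda minimises the KL divergence from q_\lambda to the posterior.
```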
APA, Harvard, Vancouver, ISO, and other styles
7

ROMIO, SILVANA ANTONIETTA. "Modelli marginali strutturali per lo studio dell'effetto causale di fattori di rischio in presenza di confondenti tempo dipendenti." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2010. http://hdl.handle.net/10281/8048.

Full text
Abstract:
One of the most important goals of epidemiological research is to analyse the relationship between one or more risk factors and an event. Such relationships are often complicated by the presence of confounders, a concept that is extremely complex to formalise. From the point of view of causal analysis, confounding is said to exist when the measure of association does not coincide with the corresponding measure of effect, for example when the relative risk does not coincide with the causal relative risk. The problem is therefore to identify the designs and the assumptions under which the causal effect of interest can be computed. Randomised controlled clinical trials, for example, were designed to minimise the influence of systematic errors in measuring the effect of a risk factor on an outcome; moreover, in these studies the measures of association equal the (causal) measures of effect. In observational studies the scenario becomes more complex because of the presence of one or more variables that can alter or 'confound' the relationship of interest, since the researcher cannot intervene on the observed covariates or on the outcome. The identification of methods that solve the confounding problem is therefore of particular interest. The problem is especially complex when studying the causal effect of a risk factor in the presence of time-dependent confounders, that is, variables that, conditionally on past exposure history, are predictors of both the outcome and subsequent exposure. In this work an important public health problem was studied, namely whether a causal relationship exists between smoking habit and a decrease in body mass index (BMI), considering as a time-dependent confounder the BMI measured at the previous time point, using a marginal structural model for repeated measures with data from a cohort of Swedish students (the BROMS cohort). The large size of this cohort and the accuracy and type of the data collected make it particularly suitable for studying the dynamic behavioural phenomena characteristic of adolescence. The study shows that the cumulative causal effect of cigarette smoking on the reduction of BMI is significant only in women, with an estimated parameter for the interaction between smoking exposure and gender of 0.322 (p-value < 0.001), whereas the estimated parameter for cumulative cigarette consumption in males is not significant and equal to 0.053 (p-value 0.464). The results obtained are consistent with those reported in previous studies.
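For context, marginal structural models of this kind are typically fitted by inverse-probability-of-treatment weighting; the standard stabilised weight for subject i over follow-up times t (a textbook expression, not quoted from the thesis) is:

```latex
% Stabilised inverse-probability-of-treatment weight with time-dependent confounders L_t
\[
sw_i \;=\; \prod_{t=0}^{T}
  \frac{P\!\left(A_{it} \mid \bar{A}_{i,t-1}\right)}
       {P\!\left(A_{it} \mid \bar{A}_{i,t-1}, \bar{L}_{it}\right)},
\]
% \bar{A}_{i,t-1}: exposure (smoking) history; \bar{L}_{it}: time-dependent covariate history (e.g. lagged BMI).
% The weighted regression of the outcome on exposure then estimates the causal parameters of the MSM.
```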
APA, Harvard, Vancouver, ISO, and other styles
8

MASPERO, DAVIDE. "Computational strategies to dissect the heterogeneity of multicellular systems via multiscale modelling and omics data analysis." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2022. http://hdl.handle.net/10281/368331.

Full text
Abstract:
Heterogeneity pervades biological systems and manifests itself in the structural and functional differences observed both among different individuals of the same group (e.g., organisms or disease systems) and among the constituent elements of a single individual (e.g., cells). The study of the heterogeneity of biological systems and, in particular, of multicellular systems is fundamental for the mechanistic understanding of complex physiological and pathological phenomena (e.g., cancer), as well as for the definition of effective prognostic, diagnostic, and therapeutic strategies. This work focuses on developing and applying computational methods and mathematical models for characterising the heterogeneity of multicellular systems and, especially, of the cancer cell subpopulations underlying the evolution of neoplastic pathology. Similar methodologies have been developed to effectively characterise viral evolution and heterogeneity. The research is divided into two complementary parts, the first aimed at defining methods for the analysis and integration of omics data generated by sequencing experiments, the second at the modelling and multiscale simulation of multicellular systems. Regarding the first strand, next-generation sequencing technologies allow us to generate vast amounts of omics data, for example related to the genome or transcriptome of a given individual, through bulk or single-cell sequencing experiments. One of the main challenges in computer science is to define computational methods to extract useful information from such data, taking into account the high levels of data-specific errors, mainly due to technological limitations. In the context of this work, we focused on developing methods for the analysis of gene expression and genomic mutation data. In detail, an exhaustive comparison of machine-learning methods for denoising and imputation of single-cell RNA-sequencing data has been performed. Moreover, methods for mapping expression profiles onto metabolic networks have been developed through an innovative framework that allows one to stratify cancer patients according to their metabolism. A subsequent extension of the method allowed us to analyse the distribution of metabolic fluxes within a population of cells via a flux balance analysis approach. Regarding the analysis of mutational profiles, the first method for reconstructing phylogenomic models from longitudinal data at single-cell resolution has been designed and implemented, exploiting a framework that combines a Markov Chain Monte Carlo with a novel weighted likelihood function. Similarly, a framework that exploits low-frequency mutation profiles to reconstruct robust phylogenies and likely chains of infection has been developed by analysing sequencing data from viral samples. The same mutational profiles also allow us to deconvolve the signal into the signatures associated with specific molecular mechanisms that generate such mutations, through an approach based on non-negative matrix factorisation. The research conducted on computational simulation has led to the development of a multiscale model in which the simulation of cell population dynamics, represented through a Cellular Potts Model, is coupled to the optimisation of a metabolic model associated with each synthetic cell. Using this model, it is possible to represent assumptions in mathematical terms and observe the properties emerging from these assumptions.
Finally, we present a first attempt to combine the two methodological approaches, which led to the integration of single-cell RNA-seq data within the multiscale model, allowing data-driven hypotheses to be formulated on the emerging properties of the system.
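As a toy illustration of the non-negative matrix factorisation step mentioned for mutational signatures (this is not the authors' pipeline; the matrix dimensions, signature count and simulated counts are arbitrary assumptions), one can factor a samples-by-mutation-categories count matrix into exposures and signatures:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# toy count matrix: 50 samples x 96 trinucleotide mutation categories
counts = rng.poisson(lam=5.0, size=(50, 96)).astype(float)

model = NMF(n_components=4, init="nndsvda", max_iter=500, random_state=0)
exposures = model.fit_transform(counts)   # per-sample signature activities (50 x 4)
signatures = model.components_            # per-signature category profiles (4 x 96)

# rows of `signatures` are usually rescaled to sum to one for interpretation
signatures /= signatures.sum(axis=1, keepdims=True)
print(exposures.shape, signatures.shape)
```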
APA, Harvard, Vancouver, ISO, and other styles
9

Zeller, Camila Borelli. "Modelo de Grubbs em grupos." [s.n.], 2006. http://repositorio.unicamp.br/jspui/handle/REPOSIP/307093.

Full text
Abstract:
Advisor: Filidor Edilfonso Vilca Labra
Master's dissertation - Universidade Estadual de Campinas, Instituto de Matematica, Estatistica e Computação Cientifica
Abstract: In this work, we present a study of statistical inference in the Grubbs model with subgroups, which extends the model proposed by Grubbs (1948, 1973), frequently used to compare instruments or measurement methods. We consider the parametrisation proposed by Bedrick (2001). The study is based on the maximum likelihood method. Hypothesis tests based on the Wald, score and likelihood ratio statistics are considered. The maximum likelihood estimates of the Grubbs model with subgroups are obtained using the EM algorithm, assuming that the observations follow a normal distribution. We also present a diagnostic analysis of the Grubbs model with subgroups, with the aim of evaluating the impact that a given subgroup has on the parameter estimates. We use the local influence methodology proposed by Cook (1986), considering the case-weight perturbation scheme. Finally, we present some simulation studies and illustrate the theoretical results using data found in the literature.
Master's degree in Statistics
APA, Harvard, Vancouver, ISO, and other styles
10

Filiasi, Mario. "Applications of Large Deviations Theory and Statistical Inference to Financial Time Series." Doctoral thesis, Università degli studi di Trieste, 2015. http://hdl.handle.net/10077/10940.

Full text
Abstract:
2013/2014
The correct evaluation of financial risk is one of the most active domains of financial research, and has become even more relevant after the latest financial crisis. Recent developments in econophysics show that the dynamics of financial markets can be successfully investigated by means of physical models borrowed from statistical physics. The fluctuations of stock prices are continuously recorded at very high frequencies (up to 1 ms), and this generates a huge amount of data which can be statistically analysed in order to validate and calibrate the theoretical models. The present work moves in this direction, and is the result of a close interaction between the Physics Department of the University of Trieste and List S.p.A., in collaboration with the International Centre for Theoretical Physics (ICTP). In this work we analyse the time series, over the last two years, of the prices of the 20 most traded stocks on the Italian market. We investigate the statistical properties of price returns and verify some stylized facts about stock prices. Price returns follow a heavy-tailed distribution and therefore, according to Large Deviations Theory, they are frequently subject to extreme events which produce abrupt price jumps; we refer to this phenomenon as the condensation of the large deviations. We investigate condensation phenomena within the framework of statistical physics and show the emergence of a phase transition in heavy-tailed distributions. In addition, we empirically analyse condensation phenomena in stock prices: we show that extreme returns are generated by non-trivial price fluctuations, which reduce the effects of sharp price jumps but amplify the diffusive movements of prices. Moving beyond the statistical analysis of single-stock prices, we investigate the structure of the market as a whole. In the financial literature it is often assumed that price changes are due to exogenous events, e.g. the release of economic and political news. Yet it is reasonable to suppose that stock prices could also be driven by endogenous events, such as the price changes of related financial instruments. The large amount of available data allows us to test this hypothesis and to investigate the structure of the market by means of statistical inference. In this work we propose a market model based on interacting prices: we study an integrate-and-fire model, inspired by the dynamics of neural networks, in which each stock price depends on the other stock prices through a threshold-passing mechanism. Using a maximum likelihood algorithm, we apply the model to the empirical data and try to infer the information network that underlies the financial market.
XXVII Cycle
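As a small illustration of the heavy-tail analysis described in the abstract (the simulated returns and the Student-t model below are assumed stand-ins; they are not the market data or the model used in the thesis), tail behaviour can be compared by maximum likelihood:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
returns = stats.t.rvs(df=3, scale=0.01, size=5000, random_state=rng)  # toy heavy-tailed returns

# Fit Student-t and Gaussian models by maximum likelihood and compare log-likelihoods
df, loc, scale = stats.t.fit(returns)
ll_t = np.sum(stats.t.logpdf(returns, df, loc, scale))
mu, sigma = stats.norm.fit(returns)
ll_norm = np.sum(stats.norm.logpdf(returns, mu, sigma))

print(f"fitted tail index (df) = {df:.2f}")
print(f"log-likelihood: Student-t {ll_t:.1f} vs Gaussian {ll_norm:.1f}")  # the heavier-tailed fit wins
```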
APA, Harvard, Vancouver, ISO, and other styles
11

Frey, Jesse C. "Inference procedures based on order statistics." Connect to this title online, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1122565389.

Full text
Abstract:
Thesis (Ph. D.)--Ohio State University, 2005.
Title from first page of PDF file. Document formatted into pages; contains xi, 148 p.; also includes graphics. Includes bibliographical references (p. 146-148). Available online via OhioLINK's ETD Center
APA, Harvard, Vancouver, ISO, and other styles
12

Wiberg, Marie H. "Computerized achievement tests : sequential and fixed length tests." Doctoral thesis, Umeå universitet, Statistiska institutionen, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-148.

Full text
Abstract:
The aim of this dissertation is to describe how a computerized achievement test can be constructed and used in practice. Throughout this dissertation the focus is on classifying the examinees into masters and non-masters depending on their ability; however, no attempt is made to estimate their ability. In paper I, a criterion-referenced computerized test with a fixed number of items is expressed as a statistical inference problem. The theory of optimal design is used to find the test that has the strongest power. A formal proof is provided showing that all items should have the same item characteristics, viz. high discrimination, low guessing and difficulty near the cutoff score, in order to give the most powerful statistical test. An efficiency study shows how many times more non-optimal items are needed to achieve the same power as a test built from optimal items. In paper II, a computerized mastery sequential test is examined using sequential analysis. The focus is on the sequential probability ratio test and on minimizing the number of items in a test, i.e. minimizing the average sample number (ASN) function. Conditions under which the ASN function decreases are examined. Further, it is shown that the optimal values are the same for item discrimination and item guessing, but differ for item difficulty, compared with tests with a fixed number of items. Paper III presents three simulation studies of sequential computerized mastery tests. Three cases are considered, viz. the examinees' responses are either identically distributed, not identically distributed, or not identically distributed together with estimation errors in the item characteristics. The simulations indicate that the observed results from the operating characteristic function differ significantly from the theoretical results. The mean number of items in a test, the distribution of test length and the variance depend on whether the true values of the item characteristics are known and whether they are iid or not. In paper IV, computerized tests with both pretested items with known item parameters and try-out items with unknown item parameters are considered. The aim is to study how the item parameters for try-out items can be estimated in a computerized test. Although the unknown examinees' abilities may act as nuisance parameters, the asymptotic variance of the item parameter estimators can be calculated. Examples show that a more reliable variance estimator yields much larger estimates of the variance than commonly used variance estimators.
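To make the sequential idea in Paper II concrete, here is a minimal sketch of a sequential probability ratio test for classifying an examinee as master or non-master (the Bernoulli item model, the two hypothesised success probabilities and the error rates are illustrative assumptions, not values from the dissertation):

```python
import math
import random

def sprt_classify(responses, p0=0.5, p1=0.7, alpha=0.05, beta=0.05):
    """Wald's SPRT on item responses: H0 success prob p0 (non-master) vs H1 p1 (master)."""
    lower, upper = math.log(beta / (1 - alpha)), math.log((1 - beta) / alpha)
    llr = 0.0
    for n, correct in enumerate(responses, start=1):
        llr += math.log((p1 if correct else 1 - p1) / (p0 if correct else 1 - p0))
        if llr >= upper:
            return "master", n        # classify early, using fewer items
        if llr <= lower:
            return "non-master", n
    return "undecided", len(responses)

random.seed(1)
simulated = [random.random() < 0.72 for _ in range(60)]  # a fairly able examinee
print(sprt_classify(simulated))
```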
APA, Harvard, Vancouver, ISO, and other styles
13

Follestad, Turid. "Stochastic Modelling and Simulation Based Inference of Fish Population Dynamics and Spatial Variation in Disease Risk." Doctoral thesis, Norwegian University of Science and Technology, Faculty of Information Technology, Mathematics and Electrical Engineering, 2003. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-41.

Full text
Abstract:

We present a non-Gaussian and non-linear state-space model for the population dynamics of cod along the Norwegian Skagerak coast, embedded in the framework of a Bayesian hierarchical model. The model takes into account both process error, representing natural variability in the dynamics of a population, and observational error, reflecting the sampling process relating the observed data to true abundances. The data set on which our study is based consists of samples of two juvenile age-groups of cod taken by beach seine hauls at a set of sample stations within several fjords along the coast. The age-structured population dynamics model, constituting the prior of the Bayesian model, is specified in terms of the recruitment process and the survival processes for these two juvenile age-groups and the mature population, for which we have no data. The population dynamics is specified on abundances at the fjord level, and an explicit down-scaling from the fjord level to the level of the monitored stations is included in the likelihood, modelling the sampling process relating the observed counts to the underlying fjord abundances.

We take a sampling-based approach to parameter estimation using Markov chain Monte Carlo methods. The properties of the model in terms of mixing and convergence of the MCMC algorithm are explored empirically on the basis of a simulated data set, and we show how the mixing properties can be improved by re-parameterisation. Estimation of the model parameters, and not the abundances, is the primary aim of the study, and we also propose an alternative approach to the estimation of the model parameters based on the marginal posterior distribution integrating over the abundances.

Based on the estimated model we illustrate how we can simulate the release of juvenile cod, imitating an experiment conducted in the early 20th century to resolve a controversy between a fisherman and a scientist who could not agree on the effect of releasing cod larvae on the mature abundance of cod. This controversy initiated the monitoring programme generating the data used in our study.

APA, Harvard, Vancouver, ISO, and other styles
14

Lee, Yun-Soo. "On some aspects of distribution theory and statistical inference involving order statistics." Virtual Press, 1991. http://liblink.bsu.edu/uhtbin/catkey/834141.

Full text
Abstract:
Statistical methods based on nonparametric and distribution-free procedures require the use of order statistics. Order statistics are also used in many parametric estimation and testing problems. With the introduction of modern high speed computers, order statistics have gained more importance in recent years in statistical inference - the main reason being that ranking a large number of observations manually was difficult and time consuming in the past, which is no longer the case because of the availability of high speed computers. Also, applications of order statistics often require the use of numerical tables, and a computer is needed to construct these tables. In this thesis, some basic concepts and results involving order statistics are provided. In particular, applications of the theory of permanents to the distribution of order statistics are discussed. Further, the correlation coefficient between the smallest observation (Y1) and the largest observation (Yn) of a random sample of size n from two gamma populations, where (n-1) observations of the sample are from one population and the remaining observation is from the other population, is presented.
Department of Mathematical Sciences
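For background on the distribution theory the abstract refers to, the standard density formulas for order statistics from an i.i.d. sample with density f and cdf F are recalled below (general results, not taken from the thesis; the permanent-based expressions mentioned in the abstract generalise these to samples that are not identically distributed):

```latex
% Joint density of the full order-statistics vector Y_1 < Y_2 < ... < Y_n
\[
f_{Y_1,\dots,Y_n}(y_1,\dots,y_n) = n!\,\prod_{i=1}^{n} f(y_i), \qquad y_1 < y_2 < \cdots < y_n,
\]
% Marginal density of the k-th order statistic
\[
f_{Y_k}(y) = \frac{n!}{(k-1)!\,(n-k)!}\, F(y)^{k-1}\,\bigl[1-F(y)\bigr]^{n-k}\, f(y).
\]
```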
APA, Harvard, Vancouver, ISO, and other styles
15

Kim, Woosuk. "Statistical Inference on Dual Generalized Order Statistics for Burr Type III Distribution." University of Cincinnati / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1396533232.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Jones, Lee K., and Richard C. Larson. "Efficient Computation of Probabilities of Events Described by Order Statistics and Application to a Problem of Queues." Massachusetts Institute of Technology, Operations Research Center, 1991. http://hdl.handle.net/1721.1/5159.

Full text
Abstract:
Consider a set of N i.i.d. random variables in [0, 1]. When the experimental values of the random variables are arranged in ascending order from smallest to largest, one has the order statistics of the set of random variables. In this note an O(N^3) algorithm is developed for computing the probability that the order statistics vector lies in a given rectangle. The new algorithm is then applied to a problem of statistical inference in queues. Illustrative computational results are included.
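The quantity the algorithm computes can be illustrated (though not with the authors' O(N^3) recursion, which is not reproduced here) by a brute-force Monte Carlo estimate of the probability that the order-statistics vector falls in a given rectangle; the uniform sample and the rectangle bounds below are arbitrary assumptions:

```python
import numpy as np

def rectangle_probability_mc(a, b, n_sims=200_000, rng=None):
    """Monte Carlo estimate of P(a_i <= Y_(i) <= b_i for all i), Y_(i) order stats of N U(0,1) draws."""
    rng = rng or np.random.default_rng(0)
    a, b = np.asarray(a), np.asarray(b)
    samples = np.sort(rng.random((n_sims, len(a))), axis=1)   # order statistics of each replicate
    inside = np.all((samples >= a) & (samples <= b), axis=1)
    return inside.mean()

a = [0.0, 0.2, 0.4]      # lower bounds for Y_(1), Y_(2), Y_(3)
b = [0.5, 0.8, 1.0]      # upper bounds
print(rectangle_probability_mc(a, b))
```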
APA, Harvard, Vancouver, ISO, and other styles
17

Ho, Man Wai. "Bayesian inference for models with monotone densities and hazard rates /." View Abstract or Full-Text, 2002. http://library.ust.hk/cgi/db/thesis.pl?ISMT%202002%20HO.

Full text
Abstract:
Thesis (Ph. D.)--Hong Kong University of Science and Technology, 2002.
Includes bibliographical references (leaves 110-114). Also available in electronic version. Access restricted to campus users.
APA, Harvard, Vancouver, ISO, and other styles
18

Villalobos, Isadora Antoniano. "Bayesian inference for models with infinite-dimensionally generated intractable components." Thesis, University of Kent, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.594106.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Bohlin, Lars. "Inferens på rangordningar - En Monte Carlo-analys." Thesis, Örebro universitet, Handelshögskolan vid Örebro Universitet, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-46322.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Asif, Muneeb. "Bayesian Inference for the Global Minimum Variance Portfolio." Thesis, Örebro universitet, Handelshögskolan vid Örebro Universitet, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-68929.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Blomberg, Per. "Informell Statistisk Inferens i modelleringssituationer : En studie om utveckling av ett ramverk för att analysera hur elever uttrycker inferenser." Licentiate thesis, Linnéuniversitetet, Institutionen för matematikdidaktik (MD), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-45572.

Full text
Abstract:
The purpose of this study is to improve our knowledge about the teaching and learning of informal statistical inference. A qualitative research strategy is used that focuses on the testing and generation of theories, inspired by grounded theory. The knowledge focus of the study is aimed at the characterisation of statistical processes and concepts, where systems of conceptual frameworks about informal statistical inference and modelling represent an essential part of the research. In order to obtain adequate empirical data, a teaching situation was devised whereby students were involved in planning and implementing an investigation. The study was conducted in a normal classroom situation where the teaching focused on an area of probability and statistics that included the introduction of box plots and the normal distribution with related concepts. The empirical material was collected through video recordings and written reports and was analysed using a combined framework of informal statistical inference and modelling. The results of the analysis highlight examples of how students can be expected to express aspects of informal statistical inference within the context of statistical inquiry. A framework was also developed to theoretically depict informal statistical inference in modelling situations. The study suggests that this framework has the potential to be used to analyse how students' informal statistical inferences are expressed and to identify potential learning opportunities for students to develop their ability to express inferences.
APA, Harvard, Vancouver, ISO, and other styles
22

Veraart, Almut Elisabeth Dorothea. "Volatility estimation and inference in the presence of jumps." Thesis, University of Oxford, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.670107.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Lundin, Mathias. "Sensitivity Analysis of Untestable Assumptions in Causal Inference." Doctoral thesis, Umeå universitet, Statistiska institutionen, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-43239.

Full text
Abstract:
This thesis contributes to the research field of causal inference, which is concerned with the effect of a treatment on an outcome. Many such effects cannot be estimated through randomised experiments. For example, the effect of higher education on future income needs to be estimated using observational data. In the estimation, assumptions are made to make individuals that get higher education comparable with those not getting higher education, so that the effect becomes estimable. Another assumption often made in causal inference (both in randomised and nonrandomised studies) is that the treatment received by one individual has no effect on the outcome of others. If this assumption is not met, the meaning of the causal effect of the treatment may be unclear. In the first paper the effect of college choice on income is investigated using Swedish register data, by comparing graduates from old and new Swedish universities. A semiparametric method of estimation is used, thereby relaxing functional assumptions for the data. One assumption often made in causal inference in observational studies is that individuals in different treatment groups are comparable, given that a set of pretreatment variables has been adjusted for in the analysis. This so-called unconfoundedness assumption is in principle not possible to test and, therefore, in the second paper we propose a Bayesian sensitivity analysis of the unconfoundedness assumption. This analysis is then performed on the results from the first paper. In the third paper of the thesis, we study profile likelihood as a tool for semiparametric estimation of a causal effect of a treatment. A semiparametric version of the Bayesian sensitivity analysis of the unconfoundedness assumption proposed in Paper II is also performed using profile likelihood. The last paper of the thesis is concerned with the estimation of direct and indirect causal effects of a treatment where interference between units is present, i.e., where the treatment of one individual affects the outcome of other individuals. We give unbiased estimators of these direct and indirect effects for situations where treatment probabilities vary between individuals. We also illustrate in a simulation study how direct and indirect causal effects can be estimated when treatment probabilities need to be estimated using background information on individuals.
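For reference, the unconfoundedness assumption discussed in the thesis and the identification of the average treatment effect it licenses can be written in standard potential-outcome notation (a textbook statement, not the author's own formulas) as:

```latex
% Unconfoundedness given pretreatment covariates X, plus overlap
\[
\bigl(Y(1), Y(0)\bigr) \perp\!\!\!\perp T \mid X, \qquad 0 < P(T = 1 \mid X) < 1,
\]
% Identification of the average treatment effect from observational data
\[
\mathrm{ATE} = \mathbb{E}\bigl[Y(1) - Y(0)\bigr]
            = \mathbb{E}_X\!\bigl[\,\mathbb{E}[Y \mid T=1, X] - \mathbb{E}[Y \mid T=0, X]\,\bigr].
\]
```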
APA, Harvard, Vancouver, ISO, and other styles
24

Jinn, Nicole Mee-Hyaang. "Toward Error-Statistical Principles of Evidence in Statistical Inference." Thesis, Virginia Tech, 2014. http://hdl.handle.net/10919/48420.

Full text
Abstract:
The context for this research is statistical inference, the process of making predictions or inferences about a population from observation and analyses of a sample. In this context, many researchers want to grasp what inferences can be made that are valid, in the sense of being able to uphold or justify by argument or evidence. Another pressing question among users of statistical methods is: how can spurious relationships be distinguished from genuine ones? Underlying both of these issues is the concept of evidence. In response to these (and similar) questions, two questions I work on in this essay are: (1) what is a genuine principle of evidence? and (2) do error probabilities have more than a long-run role? Concisely, I propose that felicitous genuine principles of evidence should provide concrete guidelines on precisely how to examine error probabilities, with respect to a test's aptitude for unmasking pertinent errors, which leads to establishing sound interpretations of results from statistical techniques. The starting point for my definition of genuine principles of evidence is Allan Birnbaum's confidence concept, an attempt to control misleading interpretations. However, Birnbaum's confidence concept is inadequate for interpreting statistical evidence, because using only pre-data error probabilities would not pick up on a test's ability to detect a discrepancy of interest (e.g., even if the discrepancy exists) with respect to the actual outcome. Instead, I argue that Deborah Mayo's severity assessment is the most suitable characterization of evidence based on my definition of genuine principles of evidence.
Master of Arts
APA, Harvard, Vancouver, ISO, and other styles
25

Koskinen, Johan. "Essays on Bayesian Inference for Social Networks." Doctoral thesis, Stockholm : Department of Statistics [Statistiska institutionen], Univ, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-128.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

DENTI, FRANCESCO. "Bayesian Mixtures for Large Scale Inference." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2020. http://hdl.handle.net/10281/262923.

Full text
Abstract:
I modelli mistura bayesiani sono onnipresenti in statistica per la loro semplicità e flessibilità e possono essere facilmente impiegati in un'ampia varietà di contesti. In questa tesi, miriamo a fornire alcuni contributi agli attuali metodi bayesiani di analisi dei dati, spesso motivati ​​da domande di ricerca provenienti da applicazioni biologiche. In particolare, ci concentriamo sullo sviluppo di nuovi modelli mistura bayesiani, tipicamente in un ambiente non parametrico, per migliorare ed estendere aree di ricerca che coinvolgono dati caratterizzati da grande dimensioni: la modellazione di dati nested, test di ipotesi simultaneo e la riduzione della dimensionalità. \\ Pertanto, il nostro obiettivo è duplice: sviluppare metodi statistici robusti motivati da un solido background teorico e proporre algoritmi efficienti, scalabili e trattabili per le loro applicazioni. \\ La tesi è organizzata come segue. Nel capitolo 1 esamineremo brevemente il background metodologico e discuteremo i concetti necessari che appartengono alle diverse aree a cui contribuiremo con questa tesi. \\ Nel capitolo 2 proponiamo un modello di atomi comuni (CAM) per nested data, che supera le limitazioni del processo del nested Dirichlet Process, come discusso in \ citep {Camerlenghi2018}. Deriviamo le sue proprietà teoriche e sviluppiamo uno slice sampler per dati nested al fine di ottenere un algoritmo efficiente per la simulazione della posterior. Abbiamo poi incorporato il modello in un framework di Rounded mixture of Gaussian Kernels, così da applicare il nostro metodo a una abundance table derivante da uno studio di microbioma. \\ Nel capitolo \ref {BNPT} sviluppiamo una versione BNP del two-group model, modellando sia $ f_0 $ che $ f_1 $ con Pitman-Yor mixtures models. Proponiamo di fissare i due parametri $ \sigma_0 $ e $ \sigma_1 $ in modo che $ \sigma_0> \sigma_1 $, in base alla logica secondo cui il PY che modella la distribuzione nulla dovrebbe essere più vicino alla sua misura di base (opportunamente scelta Gaussiana standard), mentre il PY alternativo dovrebbe avere meno vincoli. Per indurre la separazione, impieghiamo una non-local prior sul parametro location della misura base del PY collocato su $ f_1 $. Mostriamo come il modello si comporta in diversi scenari e applichiamo questa metodologia a un set di dati del microbioma. \\ Il capitolo \ref{Peluso} presenta una seconda proposta per il two-group model. Qui, utilizziamo non-local distributions per modellare la densità alternativa direttamente nella formulazione della Likelihood. Abbiamo proposto una formulazione sia parametrica che non parametrica del modello. Forniamo poi una giustificazione teorica per l'adozione di questo approccio e, dopo aver confrontato le prestazioni del nostro modello con diversi concorrenti, presentiamo tre applicazioni su set di dati genomici reali pubblicamente disponibili. \\ Nel capitolo \ref {CRIME} ci concentriamo sul miglioramento del modello per la stima delle dimensioni intrinseche (ID) discusso in \citet {Allegra}, dove gli autori stimano gli IDs modellando il rapporto delle distanze da un punto dal suo primo e secondo vicino più vicino (NN). Innanzitutto, proponiamo di includere distribuzioni a priori più adatte nel loro modello mistura finita. Quindi, estendiamo la metodologia teorica esistente derivando distribuzioni in forma chiusa per i rapporti di distanze da un punto a due NNs di ordine generico. 
Proponiamo poi un semplice modello di mistura nonparametrica usando il processo di Dirichlet, in cui sfruttiamo le distribuzioni derivate per estrarre più informazioni dai dati. Il capitolo si conclude quindi con studi di simulazione e l'applicazione a dati reali. \\ Infine, il capitolo \ref {Conclusions} presenta le direzioni future e le conclusioni.
Bayesian mixture models are ubiquitous in statistics due to their simplicity and flexibility and can be easily employed in a wide variety of contexts. In this dissertation, we aim at providing a few contributions to current Bayesian data analysis methods, often motivated by research questions from biological applications. In particular, we focus on the development of novel Bayesian mixture models, typically in a nonparametric setting, to improve and extend active research areas that involve large-scale data: the modeling of nested data, multiple hypothesis testing, and dimensionality reduction. Our goal is therefore twofold: to develop robust statistical methods motivated by a solid theoretical background, and to propose efficient, scalable and tractable algorithms for their application.

The thesis is organized as follows. In Chapter 1 we briefly review the methodological background and discuss the necessary concepts from the different areas to which this dissertation contributes.

In Chapter 2 we propose a Common Atoms model (CAM) for nested datasets, which overcomes the limitations of the nested Dirichlet process discussed in Camerlenghi et al. (2018). We derive its theoretical properties and develop a slice sampler for nested data to obtain an efficient algorithm for posterior simulation. We then embed the model in a rounded mixture of Gaussian kernels framework to apply our method to an abundance table from a microbiome study.

In Chapter 3 we develop a BNP version of the two-group model (Efron, 2004), modeling both the null density $f_0$ and the alternative density $f_1$ with Pitman-Yor (PY) process mixture models. We propose to fix the two discount parameters $\sigma_0$ and $\sigma_1$ so that $\sigma_0 > \sigma_1$, according to the rationale that the null PY should be closer to its base measure (appropriately chosen to be a standard Gaussian), while the alternative PY should have fewer constraints. To induce separation, we employ a non-local prior (Johnson) on the location parameter of the base measure of the PY placed on $f_1$. We show how the model performs in different scenarios and apply this methodology to a microbiome dataset.

Chapter 4 presents a second proposal for the two-group model. Here, we use non-local distributions to model the alternative density directly in the likelihood formulation. We propose both a parametric and a nonparametric formulation of the model. We provide a theoretical justification for the adoption of this approach and, after comparing the performance of our model with several competitors, we present three applications to real, publicly available genomic datasets.

In Chapter 5 we focus on improving the model for intrinsic dimension (ID) estimation discussed in Allegra et al. In particular, the authors estimate the IDs by modeling the ratio of the distances from a point to its first and second nearest neighbors (NNs). First, we propose more suitable priors for their parametric, finite mixture model. Then, we extend the existing theoretical methodology by deriving closed-form distributions for the ratios of distances from a point to two NNs of generic order. We propose a simple Dirichlet process mixture model in which we exploit the novel theoretical results to extract more information from the data. The chapter concludes with simulation studies and an application to real data.

Finally, Chapter 6 presents future directions and concludes.
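To make the recurring building block concrete, here is a minimal Python sketch (assuming only NumPy) of sampling from a truncated Dirichlet process mixture of Gaussians via stick-breaking. It illustrates the generic construction these chapters build on, not the CAM, two-group or intrinsic-dimension models themselves, and every parameter value is illustrative.

    import numpy as np

    rng = np.random.default_rng(0)

    def sample_dp_mixture(n, alpha=1.0, truncation=50,
                          base_loc=0.0, base_scale=3.0, kernel_sd=0.5):
        """Draw n observations from a truncated DP mixture of Gaussians.

        Stick-breaking: v_k ~ Beta(1, alpha), w_k = v_k * prod_{j<k}(1 - v_j);
        atoms theta_k ~ N(base_loc, base_scale^2); data ~ N(theta_{z_i}, kernel_sd^2).
        """
        v = rng.beta(1.0, alpha, size=truncation)
        w = v * np.concatenate(([1.0], np.cumprod(1.0 - v)[:-1]))
        w /= w.sum()                                   # renormalise the truncated weights
        atoms = rng.normal(base_loc, base_scale, size=truncation)
        z = rng.choice(truncation, size=n, p=w)        # latent cluster labels
        return rng.normal(atoms[z], kernel_sd), z

    x, labels = sample_dp_mixture(500, alpha=2.0)
    print("first observations:", np.round(x[:5], 2))
    print("occupied clusters :", len(np.unique(labels)))

The Pitman-Yor discount parameters mentioned above would enter through a modified stick-breaking law, v_k ~ Beta(1 - sigma, alpha + k * sigma), which this sketch omits for brevity.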
APA, Harvard, Vancouver, ISO, and other styles
27

Huh, Ji Young. "Applications of Monte Carlo Methods in Statistical Inference Using Regression Analysis." Scholarship @ Claremont, 2015. http://scholarship.claremont.edu/cmc_theses/1160.

Full text
Abstract:
This paper studies the use of Monte Carlo simulation techniques in the field of econometrics, specifically statistical inference. First, I examine several estimators by deriving their properties explicitly and generate their distributions through simulations; here, simulations are used to illustrate and support the analytical results. Then, I look at test statistics whose critical values are costly to derive analytically because of their sensitivity to the data-generating process; here, simulations are essential for drawing statistical inference. Overall, the paper examines when and how simulations are needed in studying econometric theories.
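As a hedged illustration of this kind of Monte Carlo exercise (not code from the thesis), the following sketch generates the sampling distribution of the OLS slope estimator by repeated simulation, assuming NumPy; the sample size, replication count and data-generating process are arbitrary choices.

    import numpy as np

    rng = np.random.default_rng(1)

    def ols_slope(x, y):
        """Closed-form OLS slope for the simple regression y = a + b*x + error."""
        xc = x - x.mean()
        return np.dot(xc, y - y.mean()) / np.dot(xc, xc)

    def simulate_slopes(n_obs=50, n_rep=10_000, true_slope=2.0, noise_sd=1.0):
        """Monte Carlo sampling distribution of the OLS slope estimator."""
        slopes = np.empty(n_rep)
        for r in range(n_rep):
            x = rng.uniform(0.0, 1.0, n_obs)
            y = 1.0 + true_slope * x + rng.normal(0.0, noise_sd, n_obs)
            slopes[r] = ols_slope(x, y)
        return slopes

    slopes = simulate_slopes()
    print("mean of simulated slopes:", slopes.mean())   # near the true slope (unbiasedness)
    print("std of simulated slopes :", slopes.std())    # Monte Carlo standard error estimate

The same scheme extends to test statistics: simulate under the null data-generating process and read off empirical critical values instead of deriving them analytically.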
APA, Harvard, Vancouver, ISO, and other styles
28

Thabane, Lehana. "Contributions to Bayesian statistical inference." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/nq31133.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Yang, Liqiang. "Statistical Inference for Gap Data." NCSU, 2000. http://www.lib.ncsu.edu/theses/available/etd-20001110-173900.

Full text
Abstract:

This thesis research is motivated by a special type of missing data - gap data - which was first encountered in a cardiology study conducted at Duke Medical School. This type of data includes multiple observations of a certain event time (in this medical study the event is the reopening of a certain artery), some of which may have one or more missing periods, called "gaps", before the "first" observed event. For those observations, the observed first event may not be the true first event, because the true first event might have happened in one of the missing gaps. Due to this kind of missing information, estimating the survival function of the true first event becomes very difficult, and no research or discussion of this type of data existed before this work. In this thesis, the author introduces a new nonparametric estimating method to solve this problem, currently called the Imputed Empirical Estimating (IEE) method. According to the simulation studies, the IEE method provides a very good estimate of the survival function of the true first event and significantly outperforms all the existing estimating approaches considered. Besides the new IEE method, this thesis also explores the maximum likelihood estimate in the gap data case. Gap data are introduced as a special type of interval censored data for the first time; the dependence between the censoring interval (in the gap data case, the observed first event time point) and the event (the true first event) makes gap data different from the well studied regular interval censored data. The thesis points out this difference and provides an MLE for gap data under certain assumptions. The third estimating method discussed is the Weighted Estimating Equation (WEE) method, a popular nonparametric approach used in many survival analysis studies; the consistency and asymptotic properties of the WEE estimate applied to gap data are discussed, and in the gap data case the WEE estimate is shown to be equivalent to the Kaplan-Meier estimate. Numerical examples illustrate the algorithms of the IEE and MLE approaches, and an IEE estimate of the survival function is provided based on real-life data from Duke Medical School. A series of simulation studies assesses the goodness-of-fit of the new IEE estimate; plots and tables of the simulation results are presented in the second chapter of this thesis.
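Since the abstract points out that, for gap data, the WEE estimate reduces to the Kaplan-Meier estimate, the sketch below implements only the standard Kaplan-Meier estimator for ordinary right-censored data; it is not the IEE method and does not handle gaps, and the example data are invented (NumPy assumed).

    import numpy as np

    def kaplan_meier(times, events):
        """Kaplan-Meier survival estimate for right-censored data.

        times  : observed times (event or censoring)
        events : 1 if the event was observed, 0 if censored
        Returns the distinct event times and the estimated survival curve.
        """
        times = np.asarray(times, dtype=float)
        events = np.asarray(events, dtype=int)
        order = np.argsort(times)
        times, events = times[order], events[order]
        n_at_risk, surv = len(times), 1.0
        event_times, survival = [], []
        for t in np.unique(times):
            at_t = times == t
            d = int((at_t & (events == 1)).sum())   # observed events at time t
            if d > 0:
                surv *= 1.0 - d / n_at_risk
                event_times.append(t)
                survival.append(surv)
            n_at_risk -= int(at_t.sum())            # remove events and censorings from the risk set
        return np.array(event_times), np.array(survival)

    t, s = kaplan_meier([2, 3, 3, 5, 7, 8, 8, 10], [1, 1, 0, 1, 0, 1, 1, 0])
    print(np.column_stack([t, s]))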

APA, Harvard, Vancouver, ISO, and other styles
30

Sun, Xiaohai. "Causal inference from statistical data /." Berlin : Logos-Verl, 2008. http://d-nb.info/988947331/04.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Czogiel, Irina. "Statistical inference for molecular shapes." Thesis, University of Nottingham, 2010. http://eprints.nottingham.ac.uk/12217/.

Full text
Abstract:
This thesis is concerned with developing statistical methods for evaluating and comparing molecular shapes. Techniques from statistical shape analysis serve as a basis for our methods. However, as molecules are fuzzy objects of electron clouds which constantly undergo vibrational motions and conformational changes, these techniques should be modified to be more suitable for the distinctive features of molecular shape. The first part of this thesis is concerned with the continuous nature of molecules. Based on molecular properties which have been measured at the atom positions, a continuous field-based representation of a molecule is obtained using methods from spatial statistics. Within the framework of reproducing kernel Hilbert spaces, a similarity index for two molecular shapes is proposed which can then be used for the pairwise alignment of molecules. The alignment is carried out using Markov chain Monte Carlo methods and posterior inference. In the Bayesian setting, it is also possible to introduce additional parameters (mask vectors) which allow for the fact that only part of the molecules may be similar. We apply our methods to a dataset of 31 steroid molecules which fall into three activity classes with respect to their binding activity to a common receptor protein. To investigate which molecular features distinguish the activity classes, we also propose a generalisation of the pairwise method to the simultaneous alignment of several molecules. The second part of this thesis is concerned with the dynamic aspect of molecular shapes. Here, we consider a dataset containing time series of DNA configurations which have been obtained using molecular dynamics simulations. For each considered DNA duplex, both a damaged and an undamaged version are available, and the objective is to investigate whether or not the damage induces a significant difference to the mean shape of the molecule. To do so, we consider bootstrap hypothesis tests for the equality of mean shapes. In particular, we investigate the use of a computationally inexpensive algorithm which is based on the Procrustes tangent space. Two versions of this algorithm are proposed. The first version is designed for independent configuration matrices, while the second is specifically designed to accommodate temporal dependence of the configurations within each group and is hence more suitable for the DNA data.
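A minimal sketch, assuming NumPy, of the ordinary Procrustes alignment from which tangent-space methods such as the one above start; the SVD solution for the rotation and scale is the textbook one, and the toy ten-atom "molecule" is invented for illustration.

    import numpy as np

    def procrustes_align(X, Y):
        """Align configuration Y onto X by translation, rotation and scaling.

        X, Y : (k, m) landmark/atom coordinate matrices in correspondence.
        Returns the aligned copy of Y and the residual sum of squares.
        """
        Xc = X - X.mean(axis=0)
        Yc = Y - Y.mean(axis=0)
        Xc = Xc / np.linalg.norm(Xc)
        Yc = Yc / np.linalg.norm(Yc)
        U, s, Vt = np.linalg.svd(Yc.T @ Xc)
        R = U @ Vt                           # optimal orthogonal map (may include a reflection)
        Y_aligned = s.sum() * (Yc @ R)       # s.sum() is the optimal scale for unit-norm configurations
        rss = float(np.sum((Xc - Y_aligned) ** 2))
        return Y_aligned, rss

    rng = np.random.default_rng(2)
    X = rng.normal(size=(10, 3))                         # a toy "molecule" of 10 atoms in 3D
    angle = np.pi / 5
    Rz = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
    Y = X @ Rz.T + 0.05 * rng.normal(size=X.shape)       # rotated, slightly noisy copy
    _, rss = procrustes_align(X, Y)
    print("Procrustes residual sum of squares:", rss)

Bootstrap tests of mean shape then resample such aligned configurations (or their tangent-space coordinates) and recompute a distance between group means.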
APA, Harvard, Vancouver, ISO, and other styles
32

方以德 and Yee-tak Daniel Fong. "Statistical inference on biomedical models." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1993. http://hub.hku.hk/bib/B31210788.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Liu, Fei, and 劉飛. "Statistical inference for banding data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2008. http://hub.hku.hk/bib/B41508701.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Junklewitz, Henrik. "Statistical inference in radio astronomy." Diss., Ludwig-Maximilians-Universität München, 2014. http://nbn-resolving.de/urn:nbn:de:bvb:19-177457.

Full text
Abstract:
This thesis unifies several studies, all dedicated to statistical data analysis in radio astronomy and radio astrophysics. Radio astronomy, like astronomy as a whole, has undergone a remarkable development in the past twenty years in introducing new instruments and technologies. New telescopes like the upgraded VLA, LOFAR, or the SKA and its pathfinder missions offer unprecedented sensitivities, previously uncharted frequency domains and unmatched survey capabilities. Many of these have the potential to significantly advance the science of radio astrophysics and cosmology on all scales, from solar and stellar physics, Galactic astrophysics and cosmic magnetic fields, to galaxy cluster astrophysics and signals from the epoch of reionization. Radio data analysis, calibration and imaging techniques have since entered a similar phase of new development to push the boundaries and adapt the field to the new instruments and scientific opportunities. This thesis contributes to these greater developments in two specific subjects: radio interferometric imaging and cosmic magnetic field statistics. Throughout this study, different data analysis techniques are presented and employed in various settings, but all can be summarized under the broad term of statistical inference. This subject encompasses a huge variety of statistical techniques developed to solve problems in which deductions have to be made from incomplete knowledge, data or measurements. This study focuses especially on Bayesian inference methods, which make use of a subjective definition of probabilities, allowing for the expression of probabilities and statistical knowledge prior to an actual measurement. The thesis contains two different sets of applications for such techniques. First, situations where a complicated and generally ill-posed measurement problem can be approached by assuming a statistical signal model prior in order to infer the desired variable. Such a problem is very often met when the measurement device takes fewer data than needed to constrain all degrees of freedom of the problem. The principal case investigated in this thesis is the measurement problem of a radio interferometer, which takes incomplete samples of the Fourier-transformed intensity of the radio emission in the sky, such that it is impossible to exactly recover the signal. The new imaging algorithm RESOLVE is presented, optimal for extended radio sources, and a first showcase demonstrates the performance of the new technique on real data. Further, a new Bayesian approach to multi-frequency radio interferometric imaging is presented and integrated into RESOLVE. The second field of application is astrophysical problems in which the inherent stochastic nature of a physical process demands a description where properties of physical quantities can only be statistically estimated. Astrophysical plasmas, for instance, are very often in a turbulent state, and thus governed by statistical hydrodynamical laws. Two studies are presented that show how properties of turbulent plasma magnetic fields can be inferred from radio observations.
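As a toy illustration of why the interferometric measurement problem is ill-posed, the sketch below (NumPy only, invented one-dimensional "sky") keeps a random subset of Fourier coefficients and inverts them naively, producing the classical "dirty image"; it is not the RESOLVE algorithm, only the problem RESOLVE addresses with a Bayesian signal prior.

    import numpy as np

    rng = np.random.default_rng(3)

    # Toy sky: two point sources on top of a smooth extended component.
    n = 256
    sky = np.zeros(n)
    sky[60], sky[200] = 5.0, 3.0
    sky += np.exp(-0.5 * ((np.arange(n) - 128) / 12.0) ** 2)

    # An interferometer samples only part of the Fourier plane (the "uv-coverage").
    visibilities = np.fft.fft(sky)
    mask = rng.random(n) < 0.3                    # keep roughly 30% of the Fourier samples
    measured = np.where(mask, visibilities, 0.0)

    # Zero-filled inverse transform: the naive "dirty image".
    dirty = np.real(np.fft.ifft(measured))
    rel_err = np.linalg.norm(dirty - sky) / np.linalg.norm(sky)
    print("relative error of the naive reconstruction:", float(rel_err))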
APA, Harvard, Vancouver, ISO, and other styles
35

Bell, Paul W. "Statistical inference for multidimensional scaling." Thesis, University of Newcastle Upon Tyne, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.327197.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Covarrubias, Carlos Cuevas. "Statistical inference for ROC curves." Thesis, University of Warwick, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.399489.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Oe, Bianca Madoka Shimizu. "Statistical inference in complex networks." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-28032017-095426/.

Full text
Abstract:
Complex network theory has been extensively used to understand various natural and artificial phenomena made of interconnected parts. This representation enables the study of dynamical processes running on complex systems, such as epidemic and rumor spreading. The evolution of these dynamical processes is influenced by the organization of the network. The size of some real-world networks makes it prohibitive to analyse the whole network computationally, so it is necessary to represent it by a set of topological measures or to reduce its size by means of sampling. In addition, most networks are samples of larger networks whose structure may not be captured and thus needs to be inferred from samples. In this work, we study both problems: the influence of the structure of the network on spreading processes and the effects of sampling on the structure of the network. Our results suggest that it is possible to predict the final fraction of infected individuals and the final fraction of individuals that came across a rumor by modeling them with a beta regression model and using topological measures as regressors. The most influential measure in both cases is the average search information, which quantifies the ease or difficulty of navigating a network. We have also shown that the structure of a sampled network differs from that of the original network and that the type of change depends on the sampling method. Finally, we apply four sampling methods to study the behaviour of the epidemic threshold of a network when sampled at different sampling rates, and we find that breadth-first search sampling is the most appropriate of the studied methods for estimating the epidemic threshold.
Various natural and artificial phenomena composed of interconnected parts have been studied through complex network theory. This representation allows the study of dynamical processes that take place on complex networks, such as the spreading of epidemics and rumors. The evolution of these processes is influenced by the organization of the network's connections. The size of real-world networks makes analysing the entire network computationally prohibitive, so it becomes necessary to represent it by topological measures or to sample it to reduce its size. Moreover, many networks are samples of larger networks whose structure is difficult to capture and must be inferred from samples. In this work, both problems are studied: the influence of the network structure on spreading processes and the effects of sampling on the network structure. The results suggest that it is possible to predict the size of an epidemic or rumor using a beta regression model with variable dispersion, with topological measures as regressors. The most influential measure for both dynamics is the average search information, which quantifies how easily one navigates a network. It is also shown that the structure of a sampled network differs from the original and that the type of change depends on the sampling method used. Finally, four sampling methods were applied to study the behaviour of the epidemic threshold of a network when sampled at different sampling rates; the results suggest that breadth-first search sampling is the most suitable of the compared methods for estimating the epidemic threshold.
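A rough sketch (NumPy assumed, all sizes illustrative) of the common heuristic that the SIS epidemic threshold is the reciprocal of the adjacency matrix's largest eigenvalue, and of how sampling can distort it; it uses simple random node sampling on an Erdős–Rényi graph rather than the breadth-first search scheme recommended in the thesis.

    import numpy as np

    rng = np.random.default_rng(4)

    def erdos_renyi_adjacency(n, p):
        """Symmetric adjacency matrix of an Erdos-Renyi random graph G(n, p)."""
        upper = rng.random((n, n)) < p
        A = np.triu(upper, k=1)
        return (A | A.T).astype(float)

    def epidemic_threshold(A):
        """Heuristic SIS threshold: 1 / largest eigenvalue of the adjacency matrix."""
        return 1.0 / np.max(np.linalg.eigvalsh(A))

    A = erdos_renyi_adjacency(500, 0.02)
    print("threshold on the full graph :", epidemic_threshold(A))

    # Induced subgraph on a simple random sample of nodes: the sampled threshold is
    # generally biased, which is the kind of distortion a sampling study quantifies.
    nodes = rng.choice(A.shape[0], size=200, replace=False)
    print("threshold on the node sample:", epidemic_threshold(A[np.ix_(nodes, nodes)]))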
APA, Harvard, Vancouver, ISO, and other styles
38

ZHAO, SHUHONG. "STATISTICAL INFERENCE ON BINOMIAL PROPORTIONS." University of Cincinnati / OhioLINK, 2005. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1115834351.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Liu, Fei. "Statistical inference for banding data." Click to view the E-thesis via HKUTO, 2008. http://sunzi.lib.hku.hk/hkuto/record/B41508701.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Fong, Yee-tak Daniel. "Statistical inference on biomedical models /." [Hong Kong] : University of Hong Kong, 1993. http://sunzi.lib.hku.hk/hkuto/record.jsp?B13456921.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Peiris, Thelge Buddika. "Constrained Statistical Inference in Regression." OpenSIUC, 2014. https://opensiuc.lib.siu.edu/dissertations/934.

Full text
Abstract:
Regression analysis constitutes a large portion of the statistical repertoire in applications. In cases where such analysis is used for exploratory purposes, with no previous knowledge of the structure, one would not wish to impose any constraints on the problem. But in many applications we are interested in a simple parametric model to describe the structure of a system, with some prior knowledge of that structure. An important example occurs when the experimenter strongly believes that the regression function changes monotonically in some or all of the predictor variables in a region of interest. The analyses needed for statistical inference under such constraints are nonstandard. The specific aim of this study is to introduce a technique which can be used for statistical inference in a multivariate simple regression with some non-standard constraints.
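The simplest instance of monotonicity-constrained regression is one-dimensional isotonic least squares, solved by the pool adjacent violators algorithm; the sketch below (NumPy only, invented data) shows that building block, not the multivariate constrained-inference technique introduced in the thesis.

    import numpy as np

    def pava(y):
        """Pool Adjacent Violators: least-squares fit under a non-decreasing constraint."""
        y = np.asarray(y, dtype=float)
        levels, weights = [], []          # block means and block sizes
        for value in y:
            levels.append(float(value))
            weights.append(1.0)
            # merge adjacent blocks while the monotonicity constraint is violated
            while len(levels) > 1 and levels[-2] > levels[-1]:
                w_new = weights[-2] + weights[-1]
                m_new = (levels[-2] * weights[-2] + levels[-1] * weights[-1]) / w_new
                levels = levels[:-2] + [m_new]
                weights = weights[:-2] + [w_new]
        return np.concatenate([np.full(int(w), m) for m, w in zip(levels, weights)])

    rng = np.random.default_rng(5)
    x = np.linspace(0.0, 1.0, 30)
    y = np.sin(1.5 * x) + rng.normal(0.0, 0.15, size=x.size)   # monotone trend plus noise
    fit = pava(y)
    print("fitted values are non-decreasing:", bool(np.all(np.diff(fit) >= -1e-12)))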
APA, Harvard, Vancouver, ISO, and other styles
42

FANIZZA, MARCO. "Quantum statistical inference and communication." Doctoral thesis, Scuola Normale Superiore, 2021. http://hdl.handle.net/11384/109209.

Full text
Abstract:
This thesis studies the limits on the performance of inference tasks with quantum data and quantum operations. Our results can be divided into two main parts.

In the first part, we study how to infer relative properties of sets of quantum states, given a certain number of copies of the states. We investigate the performance of optimal inference strategies according to several figures of merit which quantify the precision of the inference. Since we are not interested in obtaining a complete reconstruction of the states, optimal strategies do not require performing quantum tomography. In particular, we address the following problems:

- We evaluate the asymptotic error probabilities of optimal learning machines for quantum state discrimination. Here, a machine receives a number of copies of a pair of unknown states, which can be seen as training data, together with a test system which is initialized in one of the states of the pair with equal probability. The goal is to implement a measurement that discriminates which state the test system is in, minimizing the error probability. We analyze the optimal strategies for a number of different settings, differing in the prior incomplete information on the states available to the agent.

- We evaluate the limits on the precision of the estimation of the overlap between two unknown pure states, given N and M copies of each state. We find an asymptotic expansion of a Fisher information associated with the estimation problem, which gives a lower bound on the mean square error of any estimator. We compute the minimum average mean square error for random pure states, and we evaluate the effect of depolarizing noise on qubit states. We compare the performance of the optimal estimation strategy with that of other intuitive strategies, such as the swap test and measurements based on estimating the states.

- We evaluate how many samples from a collection of N d-dimensional states are necessary to decide with high probability whether the collection is made of identical states or the states differ by more than a threshold according to a motivated closeness measure. Access to copies of the states in the collection is given as follows: each time the agent asks for a copy, the agent receives one of the states with some fixed probability, together with a different label for each state in the collection. We prove that the problem can be solved with O(√N d/ε²) copies, and that this scaling is optimal up to a constant independent of d, N and ε.

In the second part, we study optimal classical and quantum communication rates for several physically motivated noise models.

- The quantum and private capacities of most realistic channels cannot be evaluated from their regularized expressions. We design several degradable extensions for notable channels, obtaining upper bounds on the quantum and private capacities of the original channels. We obtain sufficient conditions for the degradability of flagged extensions of channels which are convex combinations of other channels. These sufficient conditions are easy to verify and simplify the construction of degradable extensions.

- We consider the problem of transmitting classical information with continuous-variable systems under an energy constraint, when it is impossible to maintain a shared reference frame and in the presence of losses. In contrast with phase-insensitive noise models, we show that, in some regimes, squeezing improves the communication rates with respect to coherent state sources and with respect to sources producing up to two-photon Fock states. We give upper and lower bounds on the optimal coherent state rate and show that using part of the energy to repeatedly restore a phase reference is strictly suboptimal at high energies.
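As a small illustration of one of the "intuitive strategies" mentioned above, the sketch below (NumPy assumed) simulates swap-test outcomes and estimates the squared overlap of two Haar-random pure states from the outcome frequency; it is the baseline against which optimal strategies are compared, not the optimal estimator derived in the thesis.

    import numpy as np

    rng = np.random.default_rng(6)

    def random_pure_state(d):
        """Haar-random pure state of dimension d, as a complex unit vector."""
        v = rng.normal(size=d) + 1j * rng.normal(size=d)
        return v / np.linalg.norm(v)

    def swap_test_estimate(psi, phi, shots=10_000):
        """Estimate |<psi|phi>|^2 from simulated swap-test outcomes.

        The ancilla reads 0 with probability (1 + |<psi|phi>|^2) / 2, so an
        unbiased estimator of the squared overlap is 2 * p_hat - 1.
        """
        overlap_sq = abs(np.vdot(psi, phi)) ** 2
        p0 = 0.5 * (1.0 + overlap_sq)
        outcomes = rng.random(shots) < p0            # simulated ancilla measurements
        return 2.0 * outcomes.mean() - 1.0, overlap_sq

    psi, phi = random_pure_state(4), random_pure_state(4)
    estimate, truth = swap_test_estimate(psi, phi)
    print("true squared overlap:", truth, " swap-test estimate:", estimate)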
APA, Harvard, Vancouver, ISO, and other styles
43

Borgos, Hilde Grude. "Stochastic Modeling and Statistical Inference of Geological Fault Populations and Patterns." Doctoral thesis, Norwegian University of Science and Technology, Department of Mathematical Sciences, 2000. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-503.

Full text
Abstract:

The focus of this work is on faults, and the main issue is statistical analysis and stochastic modeling of faults and fault patterns in petroleum reservoirs. The thesis consists of Parts I-V and Appendices A-C, and the units can be read independently. Part III is written for a geophysical audience, and its topic is fault and fracture size-frequency distributions. The remaining parts are written for a statistical audience, but can also be read by people with an interest in quantitative geology. The topic of Parts I and II is statistical model choice for fault size distributions, with a sampling algorithm for estimating Bayes factors. Part IV describes work on spatial modeling of fault geometry, and Part V is a short note on line partitioning. Parts I, II and III constitute the main part of the thesis. The appendices are conference abstracts and papers based on Parts I and IV.


Paper III: reprinted with kind permission of the American Geophysical Union. An edited version of this paper was published by AGU. Copyright [2000] American Geophysical Union
APA, Harvard, Vancouver, ISO, and other styles
44

Bruce, Daniel. "Optimal Design and Inference for Correlated Bernoulli Variables using a Simplified Cox Model." Doctoral thesis, Stockholm : Department of Statistics, Stockholm University, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-7512.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Westerborn, Johan. "On particle-based online smoothing and parameter inference in general state-space models." Doctoral thesis, KTH, Matematisk statistik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-215292.

Full text
Abstract:
This thesis consists of four papers, presented as Papers A-D, on particle-based online smoothing and parameter inference in general state-space hidden Markov models. In Paper A a novel algorithm, the particle-based, rapid incremental smoother (PaRIS), aimed at efficiently performing online approximation of smoothed expectations of additive state functionals in general hidden Markov models, is presented. The algorithm has, under weak assumptions, linear computational complexity and very limited memory requirements, and is furnished with a number of convergence results, including a central limit theorem. In Paper B the problem of marginal smoothing in general hidden Markov models is tackled. A novel, PaRIS-based algorithm is presented in which the marginal smoothing distributions are approximated using a lagged estimator where the lag is set adaptively. In Paper C an estimator of the tangent filter is constructed, yielding in turn an estimator of the score function. The resulting algorithm is furnished with theoretical results, including a central limit theorem with a uniformly bounded variance, and the estimator is applied to online parameter estimation via recursive maximum likelihood. Paper D focuses on the problem of online estimation of parameters in general hidden Markov models; the algorithm is based on a forward implementation of the classical expectation-maximization algorithm and uses PaRIS to achieve efficiency.
This dissertation consists of four papers, presented as Papers A-D, which treat particle-based online smoothing and parameter estimation in general hidden Markov models. Paper A presents a new algorithm, PaRIS, whose goal is to efficiently compute particle-based online estimates of smoothed expectations of additive state functionals. Under weak conditions, the algorithm has a computational complexity that grows only linearly with the number of particles, as well as highly limited memory requirements. In addition, a number of convergence results are derived for the algorithm, including a central limit theorem, and the algorithm is tested in a simulation study. Paper B studies the problem of estimating the marginal smoothing distribution in hidden Markov models. This is achieved by running the PaRIS algorithm in a marginal mode; a mixing argument for Markov chains motivates truncating the update after a delay determined by a stopping criterion, yielding an adaptive-lag smoother. Paper C studies the problem of computing derivatives of the filter distribution, which are used to compute the gradient of the log-likelihood function. The algorithm, which contains an updating mechanism similar to that of PaRIS, is furnished with a number of convergence results, including a central limit theorem with a uniformly bounded variance, and the resulting estimator is used to construct a recursive parameter estimation algorithm. Paper D focuses on online estimation of model parameters in general hidden Markov models; the presented algorithm can be seen as a combination of the PaRIS algorithm and a recently proposed online implementation of the classical EM algorithm.
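To fix ideas about particle-based inference in state-space models, here is a hedged sketch (NumPy only, invented parameters) of a plain bootstrap particle filter on a linear-Gaussian toy model; it is not the PaRIS smoother or the online EM scheme of Papers A-D.

    import numpy as np

    rng = np.random.default_rng(7)

    # Toy linear-Gaussian model: x_t = a * x_{t-1} + v_t,  y_t = x_t + w_t.
    a, sigma_v, sigma_w, T = 0.9, 1.0, 0.5, 100
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = a * x[t - 1] + rng.normal(0.0, sigma_v)
    y = x + rng.normal(0.0, sigma_w, size=T)

    def bootstrap_filter(y, n_particles=1000):
        """Bootstrap particle filter returning the filtered means E[x_t | y_0..y_t]."""
        particles = rng.normal(0.0, 1.0, n_particles)
        means = np.zeros(len(y))
        for t in range(len(y)):
            if t > 0:
                particles = a * particles + rng.normal(0.0, sigma_v, n_particles)
            logw = -0.5 * ((y[t] - particles) / sigma_w) ** 2     # Gaussian observation weights
            w = np.exp(logw - logw.max())
            w /= w.sum()
            means[t] = np.sum(w * particles)
            idx = rng.choice(n_particles, size=n_particles, p=w)  # multinomial resampling
            particles = particles[idx]
        return means

    filtered = bootstrap_filter(y)
    print("last filtered mean:", filtered[-1], " last observation:", y[-1])

Smoothing algorithms such as PaRIS reuse these particle systems, propagating auxiliary statistics forward so that smoothed expectations of additive functionals can be updated online.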


APA, Harvard, Vancouver, ISO, and other styles
46

Shen, Gang. "Bayesian predictive inference under informative sampling and transformation." Link to electronic thesis, 2004. http://www.wpi.edu/Pubs/ETD/Available/etd-0429104-142754/.

Full text
Abstract:
Thesis (M.S.) -- Worcester Polytechnic Institute.
Keywords: Ignorable Model; Transformation; Poisson Sampling; PPS Sampling; Gibbs Sampler; Inclusion Probabilities; Selection Bias; Nonignorable Model; Bayesian Inference. Includes bibliographical references (p. 34-35).
APA, Harvard, Vancouver, ISO, and other styles
47

Edin, Moa. "Outcome regression methods in causal inference : The difference LASSO and selection of effect modifiers." Thesis, Umeå universitet, Statistik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-149423.

Full text
Abstract:
In causal inference, a central aim of covariate selection is to provide a subset of covariates that is sufficient for confounding adjustment. One approach is to construct a subset of covariates associated with the outcome; this is sometimes referred to as the outcome approach, which is the subject of this thesis. Apart from confounding, there may exist effect modification. This occurs when a treatment has a different effect on the outcome in different subgroups, defined by effect modifiers. We describe how the outcome approach, implemented by regression models, can be used for estimating the ATE, and how sufficient subsets of covariates may be constructed for these models. We also describe a novel method, called the difference LASSO, which results in identification of effect modifiers rather than determination of sufficient subsets. The method is defined by an algorithm in which, in the first step, an incorrectly specified model is fitted. We investigate the bias arising from this misspecification, analytically and numerically, for OLS. The difference LASSO is also compared with a regression estimator. The comparison is done in a simulation study where the identification of effect modifiers is evaluated by analyzing the proportion of times a selection procedure results in a set of covariates containing only the effect modifiers, or a set in which the effect modifiers are included as a subset. The results show that the difference LASSO works relatively well for identification of effect modifiers: among four designs, a set containing only the true effect modifiers was selected at least 83.2% of the time, while the corresponding result for the regression estimator was 27.9%. However, the difference LASSO builds on biased estimation, and the method is therefore not suitable for interpreting treatment effects.
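As a generic illustration (not the difference LASSO itself), the sketch below selects candidate effect modifiers by running an L1-penalized regression on main effects plus treatment-covariate interactions; scikit-learn is assumed available, and the data-generating design, penalty level and signal strengths are invented.

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(8)

    n, p = 2000, 10
    X = rng.normal(size=(n, p))                 # covariates
    treat = rng.integers(0, 2, size=n)          # randomised binary treatment
    tau = 1.0 + 1.5 * X[:, 0] - 2.0 * X[:, 3]   # treatment effect modified by covariates 0 and 3
    y = X @ (0.5 * rng.normal(size=p)) + treat * tau + rng.normal(size=n)

    # Design: main effects, treatment indicator, and treatment-covariate interactions.
    design = np.hstack([X, treat[:, None], treat[:, None] * X])
    model = Lasso(alpha=0.05).fit(design, y)

    # Non-zero interaction coefficients point at candidate effect modifiers.
    interaction_coefs = model.coef_[p + 1:]
    print("selected effect modifiers:", np.flatnonzero(np.abs(interaction_coefs) > 1e-6))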
APA, Harvard, Vancouver, ISO, and other styles
48

BENAZZO, Andrea. "LE SIMULAZIONI DEL PROCESSO COALESCENTE IN GENETICA DI POPOLAZIONI: INFERENZE DEMOGRAFICHE ED EVOLUTIVE." Doctoral thesis, Università degli studi di Ferrara, 2012. http://hdl.handle.net/11392/2389456.

Full text
Abstract:
The main goal of population genetics is to understand the factors that affect genetic variation within a species. Mathematical models are used to predict the effects on genetic variation of processes such as mutation, recombination, selection, migration and population size changes, but analytical results are difficult to obtain when these processes interact and when equilibrium conditions are not met. In these situations, common in real biological systems, especially when recent human activities (e.g., stocking, urbanization, overhunting) perturb natural populations, computer simulations can be very useful. A computer simulation is a virtual experiment in which a model is used to mimic a biological process on a computer in order to study its properties; it is an excellent tool for understanding the functioning of complex systems. Simulations are generally used to make predictions about populations, validate statistical methods, study the properties of different sampling strategies, and estimate parameters from real data. In this thesis, I applied genetic simulations to address questions intractable with other methods. First, I analyzed the effects of violating the assumption of panmixia made by the Extended Bayesian Skyline Plot (EBSP) method, showing that migration can influence the inferred demographic history of a population, suggesting wrong dynamics. Second, I used genetic simulations to analyse the performance of the EBSP method in reconstructing a population decline and to compare sampling schemes with different proportions of modern and ancient DNA; I identified some properties of the sampling scheme which clearly improve the demographic reconstruction, providing simple hints for planning a genetic study when both modern and ancient samples are available. Third, I familiarized myself with the Approximate Bayesian Computation (ABC) methodology and contributed to a review article presenting the main features, with pros and cons, of this approach. Fourth, I applied the ABC procedure to analyze the hybridization history within the genus Chionodraco and to evaluate the power of ABC in this context; realistic demographic models were defined and compared, and evidence was found that hybridization occurred only in interglacial periods. Taken together, the results presented in this thesis confirm the importance of genetic simulations in evolutionary biology. Considering the increasing availability of simulation packages, along with the increasing speed and storage capacity of personal computers and clusters, it is easy to predict that simulations of genetic and genomic data will spread to many fields to explore ever more realistic, and consequently complex, models.
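A minimal sketch of Approximate Bayesian Computation by rejection, with a cheap Normal toy model standing in for the coalescent simulators used in the thesis; the prior, tolerance and summary statistics are illustrative choices (NumPy assumed).

    import numpy as np

    rng = np.random.default_rng(9)

    observed = rng.normal(3.0, 1.0, size=100)                  # "observed" data
    obs_summary = np.array([observed.mean(), observed.std()])  # observed summary statistics

    def abc_rejection(n_draws=50_000, tolerance=0.1):
        """Keep prior draws whose simulated summaries fall within the tolerance."""
        accepted = []
        for _ in range(n_draws):
            mu = rng.uniform(-10.0, 10.0)                      # draw from the prior
            sim = rng.normal(mu, 1.0, size=observed.size)      # simulate data from the model
            summary = np.array([sim.mean(), sim.std()])
            if np.linalg.norm(summary - obs_summary) < tolerance:
                accepted.append(mu)
        return np.array(accepted)

    posterior = abc_rejection()
    print("accepted draws:", posterior.size, " approximate posterior mean:", posterior.mean())

In realistic population-genetic applications, the simulator is a coalescent model, the summaries are statistics of genetic variation, and regression adjustments or sequential schemes replace plain rejection to keep the number of simulations manageable.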
APA, Harvard, Vancouver, ISO, and other styles
49

Ren, Sheng. "New Methods of Variable Selection and Inference on High Dimensional Data." University of Cincinnati / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1511883302569683.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Stattin, Oskar. "Large scale inference under sparse and weak alternatives: non-asymptotic phase diagram for CsCsHM statistics." Thesis, KTH, Matematisk statistik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209963.

Full text
Abstract:
High-throughput measurement technology makes it possible to generate and store huge numbers of features, of which very few may be useful for any single problem at hand. Examples include genomics, proteomics and astronomy, where massive multiple testing often needs to be performed, expecting a few significant effects against an essentially null background. A number of new test procedures have been developed for detecting these so-called sparse and weak effects in large-scale statistical inference. The most widely used is Higher Criticism, HC (see e.g. Donoho and Jin (2004)). A new class of goodness-of-fit test statistics, called CsCsHM, has recently been derived (see Stepanova and Pavlenko (2017)) for the same type of multiple testing and is shown to achieve better asymptotic properties than the traditional HC approach. This report empirically investigates the behavior of both test procedures in the neighborhood of the detection boundary, i.e. the threshold for the detectability of sparse and weak effects. This theoretical boundary sharply separates the phase space, spanned by the sparsity and weakness parameters, into two subregions: the region of detectability and the region of undetectability. The statistics of both methodologies are also applied and compared for feature selection in high-dimensional binary classification problems. Besides the study of the methods and simulations, both methods are applied to realistic data. It is found that the statistics are comparable in performance accuracy.
Modern measurement technology makes it possible to generate and store gigantic amounts of data, a large share of which are redundant and of which only a few are useful for a given problem. This is common, for example, in genomics, proteomics and astronomy, where large multiple tests often need to be performed while expecting only a few significant effects. A number of new test procedures have been developed to test these so-called weak and sparse effects in large-scale statistical inference; the most popular is probably Higher Criticism, HC (see Donoho and Jin (2004)). A new class of goodness-of-fit test statistics, named CsCsHM, has recently been derived (see Stepanova and Pavlenko (2017)) for the same type of multiple-testing scenario and has been shown to have better asymptotic properties than the traditional HC method. This report explores the empirical behavior of both test methodologies in the neighborhood of the detection boundary, the threshold for detection of sparse and weak effects. This sharp theoretical boundary divides the phase space, spanned by the sparsity and weakness parameters, into two subregions: the detectable and the undetectable region. The test statistics are also applied to variable selection for large-scale binary classification. In addition to simulations, they are applied to real data. The results indicate that the test statistics are comparable in performance.
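A small sketch of one common form of the Higher Criticism statistic computed from p-values, contrasted under the global null and under a sparse, weak alternative; it implements the Donoho-Jin HC, not the CsCsHM statistics studied in the report, and assumes NumPy and SciPy are available.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(10)

    def higher_criticism(pvalues, alpha0=0.5):
        """One common variant of the Donoho-Jin Higher Criticism statistic."""
        p = np.sort(np.asarray(pvalues, dtype=float))
        n = p.size
        i = np.arange(1, n + 1)
        hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1.0 - p) + 1e-12)  # guard against p near 0 or 1
        keep = i <= int(alpha0 * n)        # use only the smallest alpha0-fraction of p-values
        return float(hc[keep].max())

    z_null = rng.normal(size=10_000)                   # z-scores under the global null
    z_alt = z_null.copy()
    z_alt[:100] += 2.5                                 # sparse (1%) and weak (shift 2.5) signal
    p_null = 2.0 * norm.sf(np.abs(z_null))
    p_alt = 2.0 * norm.sf(np.abs(z_alt))
    print("HC under the global null       :", higher_criticism(p_null))
    print("HC under the sparse alternative:", higher_criticism(p_alt))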
APA, Harvard, Vancouver, ISO, and other styles