Dissertations / Theses on the topic 'Empirical Bayes'


Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Empirical Bayes.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Benhaddou, Rida. "Nonparametric and Empirical Bayes Estimation Methods." Doctoral diss., University of Central Florida, 2013. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5765.

Full text
Abstract:
In the present dissertation, we investigate two different nonparametric models: the empirical Bayes model and the functional deconvolution model. In the case of nonparametric empirical Bayes estimation, we carry out a complete minimax study. In particular, we derive minimax lower bounds for the risk of the nonparametric empirical Bayes estimator for a general conditional distribution. This result has never been obtained previously. In order to attain optimal convergence rates, we use a wavelet series based empirical Bayes estimator constructed in Pensky and Alotaibi (2005). We propose an adaptive version of this estimator using Lepski's method and show that it attains optimal convergence rates. The theory is supplemented by numerous examples. Our study of the functional deconvolution model expands the results of Pensky and Sapatinas (2009, 2010, 2011) to the cases of estimating an (r+1)-dimensional function and of dependent errors. In both cases, we derive minimax lower bounds for the integrated square risk over a wide set of Besov balls and construct adaptive wavelet estimators that attain those optimal convergence rates. In particular, in the case of estimating a periodic (r+1)-dimensional function, we show that by choosing Besov balls of mixed smoothness we can avoid the "curse of dimensionality" and hence obtain higher than usual convergence rates when r is large. The study of deconvolution of a multivariate function is motivated by seismic inversion, which can be reduced to the solution of noisy two-dimensional convolution equations that allow one to draw inferences on underground layer structures along the chosen profiles. The common practice in seismology is to recover layer structures separately for each profile and then to combine the derived estimates into a two-dimensional function.
By studying the two-dimensional version of the model, we demonstrate that this strategy usually leads to estimators that are less accurate than those obtained as two-dimensional functional deconvolutions. Finally, we consider a multichannel deconvolution model with long-range dependent Gaussian errors. We do not limit our consideration to a specific type of long-range dependence; rather, we assume that the eigenvalues of the covariance matrix of the errors are bounded above and below. We show that the convergence rates of the estimators depend on a balance between the smoothness parameters of the response function, the smoothness of the blurring function, the long-memory parameters of the errors, and how the total number of observations is distributed among the channels.
Ph.D. (Doctorate)
Mathematics, Sciences
APA, Harvard, Vancouver, ISO, and other styles
2

Brandel, John. "Empirical Bayes methods for missing data analysis." Thesis, Uppsala University, Department of Mathematics, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-121408.

Full text
3

Lönnstedt, Ingrid. "Empirical Bayes Methods for DNA Microarray Data." Doctoral thesis, Uppsala University, Department of Mathematics, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-5865.

Full text
Abstract:

cDNA microarrays are one of the first high-throughput gene expression technologies to have emerged within molecular biology for the purpose of functional genomics. cDNA microarrays compare gene expression levels between cell samples for thousands of genes simultaneously.

The microarray technology offers new challenges when it comes to data analysis, since thousands of genes are examined in parallel but with very few replicates, yielding noisy estimates of gene effects and variances. Even when careful image analysis and normalisation of the data are applied, traditional methods for inference such as Student's t- or Fisher's F-statistic fail to work.

In this thesis, four papers on the topics of empirical Bayes and full Bayesian methods for two-channel microarray data (such as cDNA) are presented. These contribute to proving that empirical Bayes methods are useful for overcoming these specific data problems. The sample distributions of all the genes involved in a microarray experiment are summarized into prior distributions, which improve the inference for each single gene.

The first part of the thesis includes biological and statistical background of cDNA microarrays, with an overview of the different steps of two-channel microarray analysis, including experimental design, image analysis, normalisation, cluster analysis, discrimination and hypothesis testing. The second part of the thesis consists of the four papers. Paper I presents the empirical Bayes statistic B, which corresponds to a t-statistic. Paper II is based on a version of B that is extended for linear model effects. Paper III assesses the performance of empirical Bayes models by comparisons with full Bayes methods. Paper IV provides extensions of B to what corresponds to F-statistics.
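The moderated-statistic idea behind the B-statistic of Papers I and IV (shrink gene-wise variances toward a common prior value before forming a t-like statistic) can be sketched as follows. This is not Lönnstedt's exact B; the prior degrees of freedom `d0` and the crude moment-based prior variance are illustrative assumptions.

```python
import math, random

def moderated_t(means, variances, n, d0=4.0):
    # Shrink each gene-wise variance toward a common prior value s0_sq,
    # then form a t-like statistic with the shrunken variance.
    # d0 and the moment-based s0_sq are illustrative, not fitted as in the thesis.
    s0_sq = sum(variances) / len(variances)
    d = n - 1                                   # residual df per gene
    stats = []
    for m, s2 in zip(means, variances):
        s2_mod = (d0 * s0_sq + d * s2) / (d0 + d)
        stats.append(m / math.sqrt(s2_mod / n))
    return stats

random.seed(0)
n = 4                                           # replicates per gene
genes = [[random.gauss(0, 1) for _ in range(n)] for _ in range(1000)]
means = [sum(g) / n for g in genes]
variances = [sum((x - m) ** 2 for x in g) / (n - 1) for g, m in zip(genes, means)]
t_mod = moderated_t(means, variances, n)
```

The shrinkage tames genes whose tiny sample variances would otherwise inflate an ordinary t-statistic, which is exactly the few-replicates failure mode described above.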

4

Farrell, Patrick John. "Empirical Bayes estimation of small area proportions." Thesis, McGill University, 1991. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=70301.

Full text
Abstract:
Due to the nature of survey design, the estimation of parameters associated with small areas is extremely problematic. In this study, techniques for the estimation of small area proportions are proposed and implemented. More specifically, empirical Bayes estimation methodologies, in which random effects reflecting the complex structure of a multi-stage sample design are incorporated into logistic regression models, are derived and studied.
The proposed techniques are applied to data from the 1950 United States Census to predict local labor force participation rates of females. Results are compared with those obtained using unbiased and synthetic estimation approaches.
Using the proposed methodologies, a sensitivity analysis concerning the prior distribution assumption, conducted with a view toward outlier detection, is performed. The use of bootstrap techniques to correct measures of uncertainty is also studied.
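The thesis builds random effects into logistic regression models; a much simpler conjugate beta-binomial sketch conveys the same empirical Bayes shrinkage of small-area proportions. The moment-matching fit and the toy counts below are illustrative assumptions, not Farrell's methodology.

```python
def eb_proportions(y, n):
    # Fit a Beta(a, b) prior to the raw area proportions by the method of
    # moments, then return conjugate posterior means as EB estimates.
    p = [yi / ni for yi, ni in zip(y, n)]
    m = sum(p) / len(p)
    v = sum((pi - m) ** 2 for pi in p) / (len(p) - 1)
    common = m * (1 - m) / v - 1        # assumes v < m * (1 - m)
    a, b = m * common, (1 - m) * common
    return [(yi + a) / (ni + a + b) for yi, ni in zip(y, n)]

y = [2, 5, 40, 1, 9]                    # toy per-area success counts
n = [10, 20, 100, 8, 30]                # toy per-area sample sizes
est = eb_proportions(y, n)
```

Each estimate is pulled from the raw proportion toward the overall mean, with small areas (small n) shrunk the most.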
5

Lönnstedt, Ingrid. "Empirical Bayes methods for DNA microarray data /." Uppsala : Matematiska institutionen, Univ. [distributör], 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-5865.

Full text
6

Wang, Xue. "Empirical Bayes block shrinkage for wavelet regression." Thesis, University of Nottingham, 2006. http://eprints.nottingham.ac.uk/13516/.

Full text
Abstract:
There has been great interest in recent years in the development of wavelet methods for estimating an unknown function observed in the presence of noise, following the pioneering work of Donoho and Johnstone (1994, 1995) and Donoho et al. (1995). In this thesis, a novel empirical Bayes block (EBB) shrinkage procedure is proposed, and the performance of this approach with both independent identically distributed (IID) noise and correlated noise is thoroughly explored. The first part of this thesis develops a Bayesian methodology involving the non-central χ² distribution to simultaneously shrink the wavelet coefficients in a block, based on the block sum of squares. A useful (and, to the best of our knowledge, new) identity satisfied by the non-central χ² density is exploited. This identity leads to tractable posterior calculations for suitable families of prior distributions. Moreover, the families of prior distributions we work with are sufficiently flexible to represent various forms of prior knowledge. Furthermore, an efficient method for finding the hyperparameters is implemented, and simulations show that this method offers a substantial computational advantage. The second part relaxes the assumption of IID noise considered in the first part of the thesis. A semi-parametric model comprising a parametric component and a nonparametric component is presented to deal with correlated-noise situations. In the parametric component, attention is paid to the covariance structure of the noise. Two distinct parametric methods (maximum likelihood estimation and time series model identification techniques) for estimating the parameters in the covariance matrix are investigated. Both methods have been successfully implemented and are believed to be new additions to smoothing methods.
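The thesis's procedure rests on a non-central χ² identity; as a rough illustration of blockwise shrinkage driven by the block sum of squares, a James-Stein-style rule can be sketched. The rule and the threshold `lam` are simplifications for illustration, not the EBB procedure itself.

```python
def block_shrink(coeffs, block_size, sigma, lam=1.0):
    # James-Stein-style block rule: scale every coefficient in a block by
    # max(0, 1 - lam * m * sigma^2 / S2), where S2 is the block sum of
    # squares and m the block length. lam = 1 is an illustrative choice.
    out = []
    for i in range(0, len(coeffs), block_size):
        block = coeffs[i:i + block_size]
        s2 = sum(c * c for c in block)
        factor = max(0.0, 1.0 - lam * len(block) * sigma ** 2 / s2) if s2 > 0 else 0.0
        out.extend(c * factor for c in block)
    return out

# two blocks of four wavelet coefficients: pure noise, then a strong signal
noisy = [0.1, -0.2, 0.05, 0.15, 5.0, 4.8, -5.2, 4.9]
shrunk = block_shrink(noisy, block_size=4, sigma=1.0)
```

Blocks whose energy is consistent with noise alone are killed outright, while signal-bearing blocks are only lightly shrunk; this shared-fate behaviour within a block is the point of block (rather than coefficient-wise) shrinkage.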
7

Fletcher, Douglas. "Generalized Empirical Bayes: Theory, Methodology, and Applications." Diss., Temple University Libraries, 2019. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/546485.

Full text
Abstract:
Statistics
Ph.D.
The two key issues of modern Bayesian statistics are: (i) establishing a principled approach for distilling a statistical prior distribution that is consistent with the given data from an initial believable scientific prior; and (ii) the development of a consolidated Bayes-frequentist data analysis workflow that is more effective than either of the two separately. In this thesis, we propose generalized empirical Bayes as a new framework for exploring these fundamental questions, along with a wide range of applications spanning fields as diverse as clinical trials, metrology, insurance, medicine, and ecology. Our research marks a significant step towards bridging the "gap" between the Bayesian and frequentist schools of thought that has plagued statisticians for over 250 years. Chapters 1 and 2, based on Mukhopadhyay (2018), introduce the core theory and methods of our proposed generalized empirical Bayes (gEB) framework, which solves a long-standing puzzle of modern Bayes originally posed by Herbert Robbins (1980). One of the main contributions of this research is to introduce and study a new class of nonparametric priors DS(G, m) that allows exploratory Bayesian modeling. At a practical level, however, the major advantages of our proposal are: (i) computational ease (it does not require Markov chain Monte Carlo (MCMC), variational methods, or any other sophisticated computational techniques); (ii) the simplicity and interpretability of the underlying theoretical framework, which is general enough to include almost all commonly encountered models; and (iii) easy integration with mainstream Bayesian analysis, which makes it readily applicable to a wide range of problems. Connections with other Bayesian cultures are also presented in the chapter. Chapter 3 deals with the topic of measurement uncertainty from a new angle by introducing the foundation of nonparametric meta-analysis.
We have applied the proposed methodology to real data examples from astronomy, physics, and medical disciplines. Chapter 4 discusses some further extensions and application of our theory to distributed big data modeling and the missing species problem. The dissertation concludes by highlighting two important areas of future work: a full Bayesian implementation workflow and potential applications in cybersecurity.
Temple University--Theses
8

Mariotto, Angela Bacellar. "Empirical Bayes inference and the linear model." Thesis, Imperial College London, 1989. http://hdl.handle.net/10044/1/47557.

Full text
9

KWON, YEIL. "NONPARAMETRIC EMPIRICAL BAYES SIMULTANEOUS ESTIMATION FOR MULTIPLE VARIANCES." Diss., Temple University Libraries, 2018. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/495491.

Full text
Abstract:
Statistics
Ph.D.
Shrinkage estimation has proven to be very useful when dealing with a large number of mean parameters. In this dissertation, we consider the problem of simultaneous estimation of multiple variances and construct a shrinkage-type, non-parametric estimator. We take the non-parametric empirical Bayes approach by starting with an arbitrary prior on the variances. Under an invariant loss function, the resultant Bayes estimator relies on the marginal cumulative distribution function of the sample variances. Replacing the marginal cdf by the empirical distribution function, we obtain a Non-parametric Empirical Bayes estimator for multiple Variances (NEBV). The proposed estimator converges to the corresponding Bayes version uniformly over a large set. Consequently, the NEBV works well in a post-selection setting. We then apply the NEBV to construct confidence intervals for mean parameters in a post-selection setting. It is shown that the intervals based on the NEBV are the shortest among all intervals that guarantee a desired coverage probability. Through real data analysis, we further show that the NEBV-based intervals lead to the smallest number of discordances, a desirable property when we are faced with the current "replication crisis".
Temple University--Theses
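The "replace the marginal by its empirical counterpart" move described in the abstract is the classical nonparametric empirical Bayes recipe; Robbins' Poisson estimator is the textbook instance and can be sketched as follows. This illustrates the general idea only, not the NEBV formula, and the exponential prior is an assumption chosen so the sketch has a known answer.

```python
import math, random
from collections import Counter

def robbins(xs):
    # Robbins' nonparametric EB rule for Poisson counts:
    # E[theta | x] = (x + 1) * f(x + 1) / f(x), with the unknown marginal
    # pmf f replaced by empirical frequencies.
    freq = Counter(xs)
    return {x: (x + 1) * freq.get(x + 1, 0) / freq[x] for x in freq}

def poisson(lam):
    # simple inverse-transform Poisson sampler, adequate for small lam
    l, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= l:
            return k
        k += 1

random.seed(1)
thetas = [random.expovariate(1.0) for _ in range(5000)]   # theta ~ Exp(1)
xs = [poisson(t) for t in thetas]                         # x ~ Poisson(theta)
est = robbins(xs)
# for this prior the true posterior mean is (x + 1) / 2
```

No prior is ever specified to `robbins`; the 5000 parallel observations stand in for it, which is the sense in which the data "summarize" the prior.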
10

Stein, Nathan Mathes. "Advances in Empirical Bayes Modeling and Bayesian Computation." Thesis, Harvard University, 2013. http://dissertations.umi.com/gsas.harvard:11051.

Full text
Abstract:
Chapter 1 of this thesis focuses on accelerating perfect sampling algorithms for a Bayesian hierarchical model. A discrete data augmentation scheme together with two different parameterizations yields two Gibbs samplers for sampling from the posterior distribution of the hyperparameters of the Dirichlet-multinomial hierarchical model under a default prior distribution. The finite-state space nature of this data augmentation permits us to construct two perfect samplers using bounding chains that take advantage of monotonicity and anti-monotonicity in the target posterior distribution, but both are impractically slow. We demonstrate, however, that a composite algorithm that strategically alternates between the two samplers' updates can be substantially faster than either individually. We theoretically bound the expected time until coalescence for the composite algorithm, and show via simulation that the theoretical bounds can be close to actual performance. Chapters 2 and 3 introduce a strategy for constructing scientifically sensible priors in complex models. We call these priors catalytic priors to suggest that adding such prior information catalyzes our ability to use richer, more realistic models. Because they depend on observed data, catalytic priors are a tool for empirical Bayes modeling. The overall perspective is data-driven: catalytic priors have a pseudo-data interpretation, and the building blocks are alternative plausible models for observations, yielding behavior similar to hierarchical models but with a conceptual shift away from distributional assumptions on parameters. The posterior under a catalytic prior can be viewed as an optimal approximation to a target measure, subject to a constraint on the posterior distribution's predictive implications. In Chapter 3, we apply catalytic priors to several familiar models and investigate the performance of the resulting posterior distributions.
We also illustrate the application of catalytic priors in a preliminary analysis of the effectiveness of a job training program, which is complicated by the need to account for noncompliance, partially defined outcomes, and missing outcome data.
Statistics
11

Wu, Ying-keh. "Empirical Bayes procedures in time series regression models." Diss., Virginia Polytechnic Institute and State University, 1986. http://hdl.handle.net/10919/76089.

Full text
Abstract:
In this dissertation, empirical Bayes estimators for the coefficients in time series regression models are presented. Because time series observations cannot be controlled, the explanatory variables do not remain unchanged from stage to stage. A generalization of the results of O'Bryan and Susarla is established and shown to be an extension of the results of Martz and Krutchkoff. Alternatively, since the distribution function of the sample observations is hard to obtain except asymptotically, the results of Griffin and Krutchkoff on empirical linear Bayes estimation are extended and then applied to estimating the coefficients in time series regression models. Comparisons between the performance of these two approaches are also made. Finally, predictions in time series regression models using empirical Bayes estimators and empirical linear Bayes estimators are discussed.
Ph. D.
12

Zhang, Shunpu. "Some contributions to empirical Bayes theory and functional estimation." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1997. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/nq23100.pdf.

Full text
13

Connell, Matthew Aaron. "Generalized Laguerre Series for Empirical Bayes Estimation: Calculations and Proofs." Ohio University Honors Tutorial College / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=ouhonors1619179966891297.

Full text
14

Liu, Ka-yee. "Bayes and empirical Bayes estimation for the panel threshold autoregressive model and non-Gaussian time series." Click to view the E-thesis via HKUTO, 2005. http://sunzi.lib.hku.hk/hkuto/record/B30706166.

Full text
15

Liu, Ka-yee, and 廖家怡. "Bayes and empirical Bayes estimation for the panel threshold autoregressive model and non-Gaussian time series." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2005. http://hub.hku.hk/bib/B30706166.

Full text
16

Wang, Xiaomu. "Robust Bayes in Hierarchical Modeling and Empirical Bayes Analysis in Multivariate Estimation." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1449069220.

Full text
17

Delaney, James Dillon. "Contributions to the Analysis of Experiments Using Empirical Bayes Techniques." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/11590.

Full text
Abstract:
Specifying a prior distribution for the large number of parameters in the linear statistical model is a difficult step in the Bayesian approach to the design and analysis of experiments. Here we address this difficulty by proposing the use of functional priors and then by working out important details for three- and higher-level experiments. One of the challenges presented by higher-level experiments is that a factor can be either qualitative or quantitative. We propose appropriate correlation functions and coding schemes so that the prior distribution is simple and the results easily interpretable. The prior incorporates well-known experimental design principles such as effect hierarchy and effect heredity, which helps to automatically resolve the aliasing problems experienced in fractional designs. The second part of the thesis focuses on the analysis of optimization experiments. Designed experiments whose primary purpose is to determine optimal settings for all of the factors in some predetermined set are not uncommon. Here we distinguish between the two concepts of statistical significance and practical significance. We perform estimation via an empirical Bayes data analysis methodology that has been detailed in the recent literature, but we then propose an alternative to the usual next step in determining optimal factor-level settings. Instead of implementing variable or model selection techniques, we propose an objective function that assists in our goal of finding the ideal settings for all factors over which we experimented. The usefulness of the new approach is illustrated through the analysis of some real experiments as well as through simulation.
18

Jakimauskas, Gintautas. "Analysis and application of empirical Bayes methods in data mining." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2014. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2014~D_20140423_090853-72998.

Full text
Abstract:
The research object is empirical Bayes methods and algorithms for data mining, applied in the analysis of large populations of large dimensions. The aim and objectives of the research are to create methods and algorithms for testing nonparametric hypotheses for large populations and for estimating the parameters of data models. The following problems are solved to reach these objectives: 1. To create an efficient partitioning algorithm for large dimensional data. 2. To apply the partitioning algorithm for large dimensional data in testing nonparametric hypotheses. 3. To apply the empirical Bayes method in testing the independence of components of large dimensional data vectors. 4. To develop an algorithm for estimating the probabilities of rare events in large populations, using the empirical Bayes method and comparing the Poisson-gamma and Poisson-Gaussian mathematical models, by selecting an optimal model and a respective empirical Bayes estimator. 5. To create an algorithm for logistic regression of rare events using the empirical Bayes method. The results obtained enable us to perform very fast and efficient partitioning of large dimensional data; to test the independence of selected components of large dimensional data; and to select the optimal model in the estimation of probabilities of rare events, using the Poisson-gamma and Poisson-Gaussian mathematical models and empirical Bayes estimators. The nonsingularity condition in the case of the Poisson-gamma model is presented.
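The Poisson-gamma empirical Bayes model mentioned in problem 4 can be sketched in a few lines: fit the gamma prior by the method of moments and use conjugacy for the posterior mean. The moment-matching fit and the toy counts are illustrative assumptions, not the thesis's algorithm.

```python
def poisson_gamma_eb(xs):
    # Fit a Gamma(shape=a, rate=b) prior to Poisson counts by the method of
    # moments (marginal mean m = a/b, marginal variance m + a/b**2), then
    # return posterior means (x + a) / (1 + b) as EB estimates of the rates.
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / (n - 1)
    b = m / (v - m)                     # assumes overdispersion: v > m
    a = m * b
    return [(x + a) / (1 + b) for x in xs]

counts = [0, 0, 1, 0, 2, 0, 0, 5, 1, 0, 0, 3]   # toy rare-event counts
rates = poisson_gamma_eb(counts)
```

Zero counts receive a small positive rate estimate and the largest count is pulled down toward the overall mean, which is the behaviour that makes EB estimators attractive for rare events.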
19

Saeed, Awat Abdulla. "Empirical evaluation of semi-supervised naïve Bayes for active learning." Thesis, University of East Anglia, 2018. https://ueaeprints.uea.ac.uk/67655/.

Full text
Abstract:
This thesis describes an empirical evaluation of semi-supervised and active learning, individually and in combination, for the naïve Bayes classifier. Active learning aims to minimise the amount of labelled data required to train the classifier by using the model to direct the labelling of the most informative unlabelled examples. The key difficulty with active learning is that the initial model often gives a poor direction for labelling the unlabelled data in the early stages. However, using both labelled and unlabelled data with semi-supervised learning might achieve a better initial model, because the limited labelled data are augmented by the information in the unlabelled data. In this thesis, a suite of benchmark datasets is used to evaluate the benefit of semi-supervised learning, and learning curves are presented for experiments comparing the performance of each approach. First, we show that semi-supervised naïve Bayes does not significantly improve the performance of the naïve Bayes classifier. Subsequently, a down-weighting technique is used to control the influence of the unlabelled data, but again this does not improve performance. In the next experiment, a novel algorithm is proposed that uses a sigmoid transformation to recalibrate the overly confident naïve Bayes classifier. This algorithm does not significantly improve on the naïve Bayes classifier, but it does at least improve the semi-supervised naïve Bayes classifier. In the final experiment, we investigate the effectiveness of the combination of active and semi-supervised learning and empirically illustrate when the combination does work and when it does not.
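The down-weighting of unlabelled data described above can be sketched with a toy presence-feature naïve Bayes and a single EM step. The model, the weight `w`, and the two-feature data are illustrative assumptions, not the thesis's experimental setup.

```python
import math

def fit(docs, resp, n_feats):
    # Presence-feature naive Bayes with (possibly fractional) class
    # responsibilities and Laplace smoothing.
    cls = [1.0, 1.0]
    feat = [[1.0] * n_feats, [1.0] * n_feats]
    for d, r in zip(docs, resp):
        for c in (0, 1):
            cls[c] += r[c]
            for f in d:
                feat[c][f] += r[c]
    total = cls[0] + cls[1]
    return ([math.log(cls[c] / total) for c in (0, 1)],
            [[math.log(feat[c][f] / (cls[c] + 1)) for f in range(n_feats)]
             for c in (0, 1)])

def posterior(model, d):
    prior, loglik = model
    s = [prior[c] + sum(loglik[c][f] for f in d) for c in (0, 1)]
    m = max(s)
    e = [math.exp(x - m) for x in s]
    return [x / (e[0] + e[1]) for x in e]

# toy corpus: feature 0 marks class 0, feature 1 marks class 1
labeled = [([0], 0), ([1], 1)]
unlabeled = [[0], [0], [1]]
w = 0.3                                 # down-weight for unlabeled soft counts

docs = [d for d, _ in labeled]
resp = [[1.0 - y, float(y)] for _, y in labeled]
m0 = fit(docs, resp, 2)                 # purely supervised model
for d in unlabeled:                     # one EM step with down-weighting
    p = posterior(m0, d)
    docs.append(d)
    resp.append([w * p[0], w * p[1]])
m1 = fit(docs, resp, 2)
pred = posterior(m1, [1])
```

With `w` close to 0 the unlabelled documents barely perturb the supervised model; with `w = 1` they count as full (soft-labelled) examples, which is the trade-off the down-weighting experiment probes.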
20

Wang, Junyan. "Empirical Bayes Model Averaging in the Presence of Model Misfit." The Ohio State University, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=osu1469437723.

Full text
21

Everitt, Niklas. "Module identification in dynamic networks: parametric and empirical Bayes methods." Doctoral thesis, KTH, Reglerteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-208920.

Full text
Abstract:
The purpose of system identification is to construct mathematical models of dynamical systems from experimental data. With dynamical systems encountered in engineering growing ever more complex, an important task is to build models of these systems efficiently. Modelling the complete dynamics of these systems is in general not possible or even desired. Often, however, these systems can be modelled as simpler linear systems interconnected in a dynamic network. Then, the task of estimating the whole network or a subset of the network can be broken down into subproblems of estimating one simple system, called a module, embedded within the dynamic network. The prediction error method (PEM) is a benchmark in parametric system identification. The main advantage of PEM is that, for Gaussian noise, it corresponds to the so-called maximum likelihood (ML) estimator and is asymptotically efficient. One drawback is that the cost function is in general nonconvex and a gradient-based search over the parameters has to be carried out, rendering a good starting point crucial. Therefore, other methods such as subspace or instrumental variable methods are required to initialize the search. In this thesis, an alternative method, called model order reduction Steiglitz-McBride (MORSM), is proposed. As MORSM is also motivated by ML arguments, it may be used on its own and will in some cases provide asymptotically efficient estimates. The method is computationally attractive since it is composed of a sequence of least squares steps. It also treats the part of the network of no direct interest nonparametrically, simplifying model order selection for the user. A different approach is taken in the second proposed method to identify a module embedded in a dynamic network. Here, the impulse response of the part of the network of no direct interest is modelled as a realization of a Gaussian process.
The mean and covariance of the Gaussian process are parameterized by a set of parameters called hyperparameters that need to be estimated together with the parameters of the module of interest. Using an empirical Bayes approach, all parameters are estimated by maximizing the marginal likelihood of the data. The maximization is carried out using an iterative expectation/conditional-maximization scheme, which alternates so-called expectation steps with a series of conditional-maximization steps. When only the module input and output sensors are used, the expectation step admits an analytical expression. The conditional-maximization steps reduce to solving smaller optimization problems, which either admit a closed-form solution or can be efficiently solved using gradient descent strategies. Therefore, the overall optimization turns out to be computationally efficient. Using Markov chain Monte Carlo techniques, the method is extended to incorporate additional sensors. Apart from the choice of identification method, the set of signals chosen for use in the identification will determine the covariance of the estimated modules. To choose these signals, well-known expressions for the covariance matrix could, together with signal constraints, be formulated as an optimization problem and solved. However, this approach neither tells us why a certain choice of signals is optimal nor what will happen if some properties change. The expressions developed in this part of the thesis have a different flavor in that they aim to reformulate the covariance expressions into a form amenable to interpretation. These expressions illustrate how different properties of the identification problem affect the achievable accuracy; in particular, how the power of the input and noise signals, as well as the model structure, affect the covariance.
System identification is used to estimate a model of a dynamical system by fitting the model's parameters to experimental data collected from the system to be modeled. The systems being modeled tend to grow so large in scale and so complex that direct modeling is neither feasible nor desired. In many cases the complex system can be described as a composition of simpler linear systems (modules) interconnected in what we call a dynamic network. The task of modeling the whole network, or parts of it, can thereby be broken down into the subtask of modeling one module in the dynamic network. The most common way to estimate the parameters of a model is to minimize the so-called prediction error. This type of method has recently been adapted to identify modules in dynamic networks. The method enjoys good properties with respect to the model error arising from stochastic disturbances during the experiment, and in cases where the disturbances are normally distributed the method coincides with the maximum likelihood method. One drawback of the method is that the function being minimized is usually non-convex, so the method risks getting stuck in a local minimum. A good starting point is therefore essential, and other methods are required to find one; for example, instrumental variable methods can be used. In this thesis an alternative method called MORSM is proposed. MORSM is motivated by arguments from maximum likelihood and is also asymptotically efficient in some cases. MORSM consists of steps that can be solved with the least squares method and is therefore computationally attractive. The part of the network that is not of interest is estimated only nonparametrically, which simplifies the user's choice of model order. A different starting point is taken in the second method proposed for estimating a module embedded in a dynamic network.
The impulse response of the part of the network that is not of interest is modeled as a realization of a Gaussian process. The mean and covariance of the Gaussian process are parameterized by a set of parameters called hyperparameters, which are estimated together with the parameters of the module. The parameters are estimated by maximizing the marginal likelihood function. The optimization is carried out iteratively with ECM, a variant of the expectation-maximization (EM) algorithm. The algorithm has two steps: the E-step has an analytical solution, while the CM-step reduces to subproblems that either have an analytical solution or are of low dimensionality and can therefore be solved with gradient-based methods. The overall optimization is thus computationally attractive. Using MCMC techniques, the method is generalized to include additional sensors whose impulse responses are also modeled as Gaussian processes. Apart from the choice of method, the choice of signals affects the accuracy, or covariance, of the estimated module. Classical expressions for the covariance matrix can be used to optimize the choice of signals. However, these expressions give no insight into why a certain choice of signals is optimal or what would happen if the conditions were different. The expressions developed in this part of the thesis have a different purpose: they instead express the covariance in terms that can give insight into what affects the achievable accuracy. More specifically, the covariance is expressed in terms of, among other things, the spectra of the input signals, the spectra of the noise signals, and the model structure.
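The empirical Bayes recipe described in this abstract (estimate hyperparameters by maximizing the marginal likelihood, then plug them back in) can be illustrated in the simplest normal-means setting. This is a toy stand-in for the thesis's Gaussian-process model, not its actual algorithm; the prior and noise variances below are illustrative assumptions:

```python
def eb_normal_means(x, sigma2=1.0):
    """Empirical Bayes for x_i ~ N(theta_i, sigma2), theta_i ~ N(0, tau2).

    The marginal is x_i ~ N(0, sigma2 + tau2), so maximizing the marginal
    likelihood over tau2 gives tau2_hat = max(0, mean(x_i^2) - sigma2).
    The posterior mean then shrinks each x_i toward the prior mean 0.
    """
    n = len(x)
    s2 = sum(xi * xi for xi in x) / n      # second moment of the data
    tau2_hat = max(0.0, s2 - sigma2)       # marginal-likelihood maximizer
    w = tau2_hat / (tau2_hat + sigma2)     # shrinkage weight in [0, 1)
    return tau2_hat, [w * xi for xi in x]
```

When the spread of the data barely exceeds the noise level, `tau2_hat` is zero and everything is shrunk to the prior mean; as the spread grows, the weight approaches 1 and the estimates approach the raw observations.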


APA, Harvard, Vancouver, ISO, and other styles
22

Duan, Xiuwen. "Revisiting Empirical Bayes Methods and Applications to Special Types of Data." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42340.

Full text
Abstract:
Empirical Bayes methods have been around for a long time and have a wide range of applications. They provide a way to aggregate historical data into estimates of the posterior mean. This thesis revisits some of the empirical Bayes methods and develops new applications. We first look at a linear empirical Bayes estimator and apply it to ranking and symbolic data. Next, we consider Tweedie's formula and show how it can be applied to analyze a microarray dataset. The application of the formula is simplified with the Pearson system of distributions. Saddlepoint approximations enable us to generalize several results in this direction. The results show that the proposed methods perform well in applications to real data sets.
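Tweedie's formula, which the abstract applies to microarray data, states that for x | θ ~ N(θ, σ²) the posterior mean is E[θ | x] = x + σ² · d/dx log f(x), where f is the marginal density of x. A minimal numerical sketch; the closed-form normal marginal below is an illustrative assumption, not the thesis's Pearson-system or saddlepoint fit:

```python
import math

def tweedie_posterior_mean(x, marginal_pdf, sigma2=1.0, h=1e-5):
    """E[theta | x] = x + sigma2 * d/dx log f(x), via a central difference."""
    dlogf = (math.log(marginal_pdf(x + h))
             - math.log(marginal_pdf(x - h))) / (2 * h)
    return x + sigma2 * dlogf

def normal_pdf(x, var):
    return math.exp(-x * x / (2 * var)) / math.sqrt(2 * math.pi * var)

# Check against the conjugate case: theta ~ N(0, 1) and x | theta ~ N(theta, 1)
# give marginal x ~ N(0, 2) and exact posterior mean x / 2.
marginal = lambda x: normal_pdf(x, 2.0)
```

In practice the marginal density is unknown and is estimated from all observations at once, which is what makes the formula an empirical Bayes device.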
23

Liu, Benmei. "Hierarchical Bayes estimation and empirical best prediction of small-area proportions." College Park, Md.: University of Maryland, 2009. http://hdl.handle.net/1903/9149.

Full text
Abstract:
Thesis (Ph.D.) -- University of Maryland, College Park, 2009.
Thesis research directed by: Joint Program in Survey Methodology. Title from t.p. of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.
24

Volfson, Alexander. "Exploring the optimal Transformation for Volatility." Digital WPI, 2010. https://digitalcommons.wpi.edu/etd-theses/472.

Full text
Abstract:
This paper explores the fit of a stochastic volatility model, in which the Box-Cox transformation of the squared volatility follows an autoregressive Gaussian distribution, to the continuously compounded daily returns of the Australian stock index. Estimation was difficult, and over-fitting likely, because more variables are present than data points. We developed a revised model that held a couple of these variables fixed and then, further, a model that reduced the number of variables significantly by grouping trading days. A Metropolis-Hastings algorithm was used to simulate the joint density and derive estimated volatilities. Though autocorrelations were higher with a smaller Box-Cox transformation parameter, the fit of the distribution was much better.
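The Metropolis-Hastings step used above (propose, then accept with probability given by the ratio of target densities) can be sketched for a generic one-dimensional target. This is a random-walk sampler for a standard normal, an illustrative stand-in rather than the thesis's stochastic-volatility posterior:

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis: with symmetric Gaussian proposals, the
    acceptance probability reduces to min(1, target(prop) / target(x))."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        # Accept on the log scale to avoid overflow in the density ratio.
        if math.log(rng.random()) < log_target(proposal) - log_target(x):
            x = proposal
        samples.append(x)
    return samples

# Target: standard normal, log density up to an additive constant.
samples = metropolis_hastings(lambda x: -0.5 * x * x, x0=0.0, n_samples=20000)
```

The sample mean and variance of the chain should approach 0 and 1; in a real application `log_target` would be the joint log-posterior of the volatilities and model parameters.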
25

Jordaan, Aletta Gertruida. "Empirical Bayes estimation of the extreme value index in an ANOVA setting." Thesis, Stellenbosch : Stellenbosch University, 2014. http://hdl.handle.net/10019.1/86216.

Full text
Abstract:
Thesis (MComm)-- Stellenbosch University, 2014.
ENGLISH ABSTRACT: Extreme value theory (EVT) involves the development of statistical models and techniques in order to describe and model extreme events. In order to make inferences about extreme quantiles, it is necessary to estimate the extreme value index (EVI). Numerous estimators of the EVI exist in the literature. However, these estimators are only applicable in the single sample setting. The aim of this study is to obtain an improved estimator of the EVI that is applicable to an ANOVA setting. An ANOVA setting lends itself naturally to empirical Bayes (EB) estimators, which are the main estimators under consideration in this study. EB estimators have not received much attention in the literature. The study begins with a literature study, covering the areas of application of EVT, Bayesian theory and EB theory. Different estimation methods of the EVI are discussed, focusing also on possible methods of determining the optimal threshold. Specifically, two adaptive methods of threshold selection are considered. A simulation study is carried out to compare the performance of different estimation methods, applied only in the single sample setting. First order and second order estimation methods are considered. In the case of second order estimation, possible methods of estimating the second order parameter are also explored. With regards to obtaining an estimator that is applicable to an ANOVA setting, a first order EB estimator and a second order EB estimator of the EVI are derived. A case study of five insurance claims portfolios is used to examine whether the two EB estimators improve the accuracy of estimating the EVI, when compared to viewing the portfolios in isolation. The results showed that the first order EB estimator performed better than the Hill estimator. 
However, the second order EB estimator did not perform better than the “benchmark” second order estimator, namely fitting the perturbed Pareto distribution to all observations above a pre-determined threshold by means of maximum likelihood estimation.
AFRIKAANSE OPSOMMING: Extreme value theory (EVT) involves the development of statistical models and techniques used to describe and model extreme events. In order to make inferences about extreme quantiles, it is necessary to estimate the extreme value index (EVI). Numerous estimators of the EVI exist in the literature, but these estimators are only applicable in the single sample setting. The aim of this study is to obtain a more accurate estimator of the EVI that is applicable in an ANOVA setting. An ANOVA setting lends itself to the use of empirical Bayes (EB) estimators, which are the focus of this study. These estimators have not yet been investigated in the literature. The study begins with a literature review, covering the areas of application of EVT, Bayesian theory and EB theory. Different methods of EVI estimation are discussed, including a discussion of how the optimal threshold can be determined. Specifically, two adaptive methods of threshold selection are considered. A simulation study was carried out to compare the estimation accuracy of different estimation methods in the single sample setting. First order and second order estimation methods are considered. In the case of second order estimation, possible estimation methods of the second order parameter are also investigated. A first order and a second order EB estimator of the EVI were derived with the aim of obtaining an estimator applicable in the ANOVA setting. A case study of five insurance portfolios is used to investigate whether the two EB estimators improve the estimation accuracy of the EVI, compared with the EVI estimators obtained by analysing the portfolios separately. The results show that the first order EB estimator performed better than the Hill estimator.
The second order EB estimator, however, performed worse than the second order estimator used as a benchmark, namely fitting the perturbed Pareto distribution (PPD) to all observations above a given threshold by means of maximum likelihood estimation.
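The Hill estimator that serves as the single-sample benchmark above averages log-excesses over the k largest order statistics: γ̂ = (1/k) Σ_{i=1..k} [log X_(n−i+1) − log X_(n−k)]. A minimal sketch; the exact-Pareto test data and the choice of k are illustrative assumptions:

```python
import math
import random

def hill_estimator(data, k):
    """Hill estimator of the extreme value index from the k largest values."""
    xs = sorted(data)
    n = len(xs)
    threshold = xs[n - k - 1]    # the (k+1)-th largest order statistic
    return sum(math.log(xs[n - 1 - i] / threshold) for i in range(k)) / k

# Exact Pareto data with tail index alpha = 2, so the true EVI is 1/2.
rng = random.Random(42)
sample = [rng.paretovariate(2.0) for _ in range(10000)]
gamma_hat = hill_estimator(sample, k=1000)
```

The choice of k is the threshold-selection problem the abstract discusses: too small a k inflates the variance, too large a k introduces bias when the tail is only approximately Pareto.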
26

Chen, Zhao. "Bayesian and Empirical Bayes approaches to power law process and microarray analysis." [Tampa, Fla.] : University of South Florida, 2004. http://purl.fcla.edu/fcla/etd/SFE0000430.

Full text
27

Ali, Abdunnabi M. Carleton University Dissertation Mathematics. "Interface of preliminary test approach and empirical Bayes approach to shrinkage estimation." Ottawa, 1990.

Find full text
28

Vila, Jeremy P. "Empirical-Bayes Approaches to Recovery of Structured Sparse Signals via Approximate Message Passing." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1429191048.

Full text
29

Wu, Hao. "An Empirical Bayesian Approach to Misspecified Covariance Structures." The Ohio State University, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=osu1282058097.

Full text
30

Hort, Molly. "A comparison of hypothesis testing procedures for two population proportions." Manhattan, Kan. : Kansas State University, 2008. http://hdl.handle.net/2097/725.

Full text
31

Devarasetty, Prem Chand. "SAFETY IMPROVEMENTS ON MULTILANE ARTERIALS A BEFORE AND AFTER EVALUATION USING THE EMPIRICAL BAYES METHOD." Master's thesis, Orlando, Fla. : University of Central Florida, 2009. http://purl.fcla.edu/fcla/etd/CFE0002723.

Full text
32

Lloyd, Holly. "A Comprehensive Safety Analysis of Diverging Diamond Interchanges." DigitalCommons@USU, 2016. https://digitalcommons.usu.edu/etd/5073.

Full text
Abstract:
As the population grows and the travel demands increase, alternative interchange designs are becoming increasingly popular. The diverging diamond interchange is one alternative design that has been implemented in the United States. This design can accommodate higher flow and unbalanced flow as well as improve safety at the interchange. As the diverging diamond interchange is increasingly considered as a possible solution to problematic interchange locations, it is imperative to investigate the safety effects of this interchange configuration. This report describes the selection of a comparison group of urban diamond interchanges, crash data collection, calibration of functions used to estimate the predicted crash rate in the before and after periods and the Empirical Bayes before and after analysis technique used to determine the safety effectiveness of the diverging diamond interchanges in Utah. A discussion of pedestrian and cyclist safety is also included. The analysis results demonstrated statistically significant decreases in crashes at most of the locations studied. This analysis can be used by UDOT and other transportation agencies as they consider the implementation of the diverging diamond interchanges in the future.
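The Empirical Bayes before-after technique mentioned above combines a model-predicted crash frequency with the site's observed count, weighting by the overdispersion parameter of the negative binomial safety performance function. A minimal sketch of the standard (Hauer-style) calculation; the numbers are illustrative, not the study's data:

```python
def eb_expected_crashes(predicted, observed, overdispersion_k):
    """EB estimate of a site's long-run crash frequency.

    For a negative binomial SPF with Var = mu + mu^2 / k, the weight on
    the model prediction is w = k / (k + mu); the remaining weight goes
    to the site's observed count.
    """
    w = overdispersion_k / (overdispersion_k + predicted)
    return w * predicted + (1 - w) * observed

# A site predicted to have 4 crashes per period that recorded 10: the EB
# estimate falls between the two, correcting for regression to the mean.
estimate = eb_expected_crashes(predicted=4.0, observed=10.0, overdispersion_k=2.0)
```

Comparing this EB estimate for the after period with the observed after-period count is what yields the safety-effectiveness index used in studies like this one.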
33

Llau, Anthoni. "The Impact of Red Light Cameras on Injury Crashes within Miami-Dade County, Florida." FIU Digital Commons, 2015. http://digitalcommons.fiu.edu/etd/2240.

Full text
Abstract:
Previous red light camera (RLC) studies have shown reductions in violations and in overall and right-angle collisions; however, they may also result in increases in rear-end crashes (Retting & Kyrychenko, 2002; Retting & Ferguson, 2003). Despite their apparent effectiveness, many RLC studies have produced imprecise findings due to inappropriate study designs and/or statistical techniques to control for biases (Retting & Kyrychenko, 2002); therefore, a more comprehensive approach is needed to accurately assess whether they reduce motor vehicle injury collisions. The objective of this proposal is to assess whether RLCs improve safety at signalized intersections within Miami-Dade County, Florida. Twenty signalized intersections with RLCs initiating enforcement on January 1st, 2011 were matched to two comparison sites located at least two miles from camera sites to minimize spillover effects. An Empirical Bayes analysis was used to account for regression to the mean. Incidences of all injury, red light running-related injury, right-angle/turning, and rear-end collisions were examined. An index of effectiveness along with 95% CIs was calculated. During the first year of camera enforcement, RLC sites experienced a marginal decrease in right-angle/turn collisions, a significant increase in rear-end collisions, and significant decreases in all-injury and red light running-related injury collisions. An increase in right-angle/turning and rear-end collisions at the RLC sites was observed after two years despite camera enforcement. A significant reduction in red light running-related injury crashes, however, was still observed after two years. A non-significant decline in all-injury collisions was also noted. Findings of this research indicate RLCs reduced red light running-related injury collisions at camera sites, but the tradeoff was a large increase in rear-end collisions.
Further, there was inconclusive evidence as to whether RLCs affected right-angle/turning and all-injury collisions. Statutory changes in crash reporting during the second year of camera enforcement affected the incidence of right-angle and rear-end collisions; nevertheless, a novelty effect could not be ruled out. A limitation of this study was the small number of injury crashes at each site. In conclusion, future research should consider events such as low frequencies of severe injury/fatal collisions and changes in crash reporting requirements when conducting RLC analyses.
34

Cross, Richard J. (Richard John). "Efficient Tools For Reliability Analysis Using Finite Mixture Distributions." Thesis, Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/4853.

Full text
Abstract:
The complexity of many failure mechanisms and variations in component manufacture often make standard probability distributions inadequate for reliability modeling. Finite mixture distributions provide the necessary flexibility for modeling such complex phenomena but add considerable difficulty to the inference. This difficulty is overcome by drawing an analogy to neural networks. With appropriate modifications, a neural network can represent a finite mixture CDF or PDF exactly. Training with Bayesian regularization gives an efficient empirical Bayesian inference of the failure time distribution. Training also yields an effective number of parameters from which the number of components in the mixture can be estimated. Credible sets for functions of the model parameters can be estimated using a simple closed-form expression. Complete, censored, and inspection samples can be considered by appropriate choice of the likelihood function. In this work, architectures for Exponential, Weibull, Normal, and Log-Normal mixture networks have been derived. The capabilities of mixture networks have been demonstrated for complete, censored, and inspection samples from Weibull and Log-Normal mixtures. Furthermore, mixture networks' ability to model arbitrary failure distributions has been demonstrated. A sensitivity analysis has been performed to determine how mixture network estimator errors are affected by mixture component spacing and sample size. It is shown that mixture network estimators are asymptotically unbiased and that errors decay with sample size at least as well as with MLE.
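Inference for finite mixtures, which this abstract tackles by training a mixture network, is classically done with expectation-maximization. A minimal EM sketch for a two-component exponential mixture; the data and starting values are illustrative assumptions, and this is the textbook algorithm rather than the thesis's neural-network approach:

```python
import math
import random

def em_exponential_mixture(data, rates, weight, n_iter=50):
    """EM for the two-component mixture w*Exp(r1) + (1-w)*Exp(r2).

    Returns the fitted parameters and the log-likelihood trace, which EM
    guarantees to be non-decreasing across iterations.
    """
    r1, r2 = rates
    w = weight
    trace = []
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each observation,
        # plus the log-likelihood at the current parameters.
        resp, loglik = [], 0.0
        for x in data:
            p1 = w * r1 * math.exp(-r1 * x)
            p2 = (1 - w) * r2 * math.exp(-r2 * x)
            resp.append(p1 / (p1 + p2))
            loglik += math.log(p1 + p2)
        trace.append(loglik)
        # M-step: weighted exponential MLEs (rate = weighted count / weighted sum).
        s1 = sum(resp)
        r1 = s1 / sum(g * x for g, x in zip(resp, data))
        r2 = (len(data) - s1) / sum((1 - g) * x for g, x in zip(resp, data))
        w = s1 / len(data)
    return (r1, r2), w, trace

rng = random.Random(7)
data = ([rng.expovariate(1.0) for _ in range(200)]
        + [rng.expovariate(10.0) for _ in range(200)])
rates, weight, trace = em_exponential_mixture(data, rates=(0.5, 5.0), weight=0.5)
```

Censored and inspection samples, which the thesis handles through the likelihood, would change only the per-observation terms inside the E-step.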
35

Huang, Zhengyan. "Differential Abundance and Clustering Analysis with Empirical Bayes Shrinkage Estimation of Variance (DASEV) for Proteomics and Metabolomics Data." UKnowledge, 2019. https://uknowledge.uky.edu/epb_etds/24.

Full text
Abstract:
Mass spectrometry (MS) is widely used for proteomic and metabolomic profiling of biological samples. Data obtained by MS are often zero-inflated; the zero values are called point mass values (PMVs). PMVs can be further grouped into biological PMVs, caused by the true absence of a component, and technical PMVs, caused by the detection limit. There is no simple way to separate the two types. Mixture models were developed to separate the two types of zeros and to perform differential abundance analysis. However, we notice that the mixture model can be unstable when the number of non-zero values is small. In this dissertation, we propose a new differential abundance (DA) analysis method, DASEV, which applies empirical Bayes shrinkage estimation to the variance. We hypothesized that variance estimation could be made more robust and thus enhance the accuracy of differential abundance analysis. Despite the stability issue, the mixture models offer a promising strategy for separating the two types of PMVs. We adapted the mixture distribution proposed in the original mixture model design and assumed that the variances of all components follow a certain distribution. We propose to calculate the estimated variances by borrowing information across components via the assumed distribution of the variance, and then to re-estimate the other parameters using the estimated variances. We obtained better and more stable estimates of the variances, mean abundances, and proportions of biological PMVs, especially where the proportion of zeros is large. The proposed method therefore achieves clear improvements in DA analysis. We also propose to extend the method to clustering analysis. To our knowledge, the clustering methods commonly used for MS omics data are K-means and hierarchical clustering, both of which have limitations when applied to zero-inflated data.
Model-based clustering methods are widely used by researchers for various data types, including zero-inflated data. We propose the extension (DASEV.C) as a model-based clustering method and compared its clustering performance with K-means and hierarchical clustering. Under certain scenarios, the proposed method returned more accurate clusters than the standard methods. We also developed an R package, dasev, for the methods presented in this dissertation. The major functions DASEV.DA and DASEV.C in this package implement the Bayes shrinkage estimation of the variance and then conduct the differential abundance and cluster analysis. The functions are designed to give researchers the flexibility to specify certain input options.
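The core of the proposed shrinkage is the classical EB idea of pooling variance information across features: each feature's sample variance is pulled toward a common prior value, with strength set by prior degrees of freedom. A minimal sketch of that idea in the limma style; this is the generic moderated-variance formula, not the DASEV-specific estimator, and the numbers are illustrative:

```python
def moderated_variance(s2, df, prior_s2, prior_df):
    """EB-shrunk variance: a degrees-of-freedom-weighted blend of the
    feature's own sample variance s2 and the pooled prior variance."""
    return (prior_df * prior_s2 + df * s2) / (prior_df + df)

# A feature measured with few degrees of freedom is pulled strongly
# toward the pooled value, stabilizing downstream test statistics.
shrunk = moderated_variance(s2=9.0, df=2, prior_s2=1.0, prior_df=4)
```

In a full pipeline `prior_s2` and `prior_df` would themselves be estimated from the distribution of all feature-level variances, which is what makes the procedure empirical Bayes.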
36

McCarthy, Ross James. "Performing Network Level Crash Evaluation Using Skid Resistance." Thesis, Virginia Tech, 2015. http://hdl.handle.net/10919/56576.

Full text
Abstract:
Evaluation of crash count data as a function of roadway characteristics allows Departments of Transportation to predict expected average crash risks in order to assist in identifying segments that could benefit from various treatments. Currently, the evaluation is performed using negative binomial regression, as a function of average annual daily traffic (AADT) and other variables. For this thesis, a crash study was carried out for the interstate, primary and secondary routes, in the Salem District of Virginia. The data used in the study included the following information obtained from Virginia Department of Transportation (VDOT) records: 2010 to 2012 crash data, 2010 to 2012 AADT, and horizontal radius of curvature (CV). Additionally, tire-pavement friction or skid resistance was measured using a continuous friction measurement, fixed-slip device called a Grip Tester. In keeping with the current practice, negative binomial regression was used to relate the crash data to the AADT, skid resistance and CV. To determine which of the variables to include in the final models, the Akaike Information Criterion (AIC) and Log-Likelihood Ratio Tests were performed. By mathematically combining the information acquired from the negative binomial regression models and the information contained in the crash counts, the parameters of each network's true average crash risks were empirically estimated using the Empirical Bayes (EB) approach. The new estimated average crash risks were then used to rank segments according to their empirically estimated crash risk and to prioritize segments according to their expected crash reduction if a friction treatment were applied.
Master of Science
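The Akaike Information Criterion and log-likelihood ratio tests used above to select variables reduce to two small formulas: AIC = 2p − 2·lnL, and, for nested models, the statistic 2(lnL_full − lnL_reduced) compared against a chi-square with df equal to the number of extra parameters. A sketch for one extra parameter; the log-likelihood values are made up for illustration:

```python
import math

def aic(log_lik, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_lik

def lr_test_1df(loglik_reduced, loglik_full):
    """Likelihood ratio test for one extra parameter.

    For a chi-square with 1 df the survival function has the closed form
    P(X > x) = erfc(sqrt(x / 2)), so no special library is needed.
    """
    stat = 2 * (loglik_full - loglik_reduced)
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Adding one variable (e.g. skid resistance) improves the log-likelihood
# from -520.3 to -514.1 at the cost of one parameter (illustrative numbers).
stat, p = lr_test_1df(-520.3, -514.1)
```

Both criteria agree here: the test rejects the reduced model and the fuller model also has the lower AIC.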
37

Brimley, Bradford Keith. "Calibration of the Highway Safety Manual Safety Performance Function and Development of Jurisdiction-Specific Models for Rural Two-Lane Two-Way Roads in Utah." BYU ScholarsArchive, 2011. https://scholarsarchive.byu.edu/etd/2611.

Full text
Abstract:
This thesis documents the results of the calibration of the Highway Safety Manual (HSM) safety performance function (SPF) for rural two-lane two-way roadway segments in Utah and the development of new SPFs using negative binomial and hierarchical Bayesian modeling techniques. SPFs estimate the safety of a roadway entity, such as a segment or intersection, in terms of number of crashes. The new SPFs were developed for comparison to the calibrated HSM SPF. This research was performed for the Utah Department of Transportation (UDOT). The study area was the state of Utah. Crash data from 2005-2007 on 157 selected study segments provided a 3-year observed crash frequency to obtain a calibration factor for the HSM SPF and develop new SPFs. The calibration factor for the HSM SPF for rural two-lane two-way roads in Utah is 1.16. This indicates that the HSM underpredicts the number of crashes on rural two-lane two-way roads in Utah by sixteen percent. The new SPFs were developed from the same data that were collected for the HSM calibration, with the addition of new data variables that were hypothesized to have a significant effect on crash frequencies. Negative binomial regression was used to develop four new SPFs, and one additional SPF was developed using hierarchical (or full) Bayesian techniques. The empirical Bayes (EB) method can be applied with each negative binomial SPF because the models include an overdispersion parameter used with the EB method. The hierarchical Bayesian technique is a newer, more mathematically-intense method that accounts for high levels of uncertainty often present in crash modeling. Because the hierarchical Bayesian SPF produces a density function of a predicted crash frequency, a comparison of this density function with an observed crash frequency can help identify segments with significant safety concerns. Each SPF has its own strengths and weaknesses, which include its data requirements and predicting capability.
This thesis recommends that UDOT use Equation 5-11 (a new negative binomial SPF) for predicting crashes, because it predicts crashes with reasonable accuracy while requiring much less data than other models. The hierarchical Bayesian process should be used for evaluating observed crash frequencies to identify segments that may benefit from roadway safety improvements.
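The HSM calibration factor reported above is, in essence, the ratio of total observed crashes on the calibration segments to the total predicted by the uncalibrated model. A minimal sketch; the crash counts below are illustrative, not the study's 157 segments:

```python
def hsm_calibration_factor(observed, predicted):
    """C = sum(observed) / sum(predicted); C > 1 means the base model
    underpredicts crashes for the jurisdiction being calibrated."""
    return sum(observed) / sum(predicted)

# Observed crash counts per segment vs. uncalibrated SPF predictions.
observed = [3, 0, 5, 2, 1, 4]
predicted = [2.1, 0.8, 3.9, 1.6, 1.2, 3.3]
C = hsm_calibration_factor(observed, predicted)
```

The factor then multiplies every SPF prediction in the jurisdiction; a value of 1.16, as found for Utah, scales predictions up by sixteen percent.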
38

Zerbeto, Ana Paula. "Melhor preditor empírico aplicado aos modelos beta mistos." Universidade de São Paulo, 2014. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-09042014-132109/.

Full text
Abstract:
Mixed beta models are widely used in the analysis of data that have a hierarchical structure and take values in a known restricted interval. In order to propose a method for predicting their random components, results previously obtained in the literature for the empirical Bayes predictor were extended to beta regression models with a normally distributed random intercept. The proposed so-called empirical best predictor (EBP) applies in two different situations: when one wishes to predict the individual effects of new elements of groups that were already part of the fitting data, and when the groups were not part of that data. Simulation studies were designed, and their results indicated that the EBP performed efficiently and satisfactorily in several scenarios. When the proposal was used to analyse two health databases, the same results obtained in the simulations were observed in both cases considered. Good performance was observed both in the simulations and in the real data analyses. The proposed methodology therefore shows promise for use in mixed beta models in which predictions are desired.
The mixed beta regression models are extensively used to analyse data with a hierarchical structure that take values in a restricted and known interval. In order to propose a prediction method for their random components, the results previously obtained in the literature for the empirical Bayes predictor were extended to beta regression models with a normally distributed random intercept. The proposed predictor, called the empirical best predictor (EBP), can be applied in two situations: when the interest is to predict individual effects for new elements of groups that were already analysed by the fitted model, and for elements of new groups. Simulation studies were designed, and their results indicated that the performance of the EBP was efficient and satisfactory in most scenarios. When the proposed method was used to analyse two health databases, the same results as in the simulations were observed in both applications, with good performance. The proposed method is therefore promising for use in predictions for mixed beta regression models.
39

Kisamore, Jennifer L. "Validity Generalization and Transportability: An Investigation of Distributional Assumptions of Random-Effects Meta-Analytic Methods." [Tampa, Fla.] : University of South Florida, 2003. http://purl.fcla.edu/fcla/etd/SFE0000060.

Full text
40

Jakimauskas, Gintautas. "Duomenų tyrybos empirinių Bajeso metodų tyrimas ir taikymas." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2014. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2014~D_20140423_090834-67696.

Full text
Abstract:
The research object of this work is data mining empirical Bayes methods and algorithms applied in the analysis of high-dimensional data from large populations. The aim of the research is to construct methods and algorithms for testing nonparametric hypotheses for large populations and for estimating the parameters of data models. To reach this aim the following problems are solved: 1. To construct a partitioning algorithm for high-dimensional data. 2. To apply the high-dimensional data partitioning algorithm to testing nonparametric hypotheses. 3. To apply the empirical Bayes method to testing the hypothesis of independence of components of multivariate data under different mathematical models, determining the optimal model and the corresponding empirical Bayes estimator. 4. To construct an algorithm for estimating the frequencies of rare events in large populations using the empirical Bayes method, comparing the Poisson-gamma and Poisson-Gaussian mathematical models. 5. To construct a rare-event logistic regression algorithm using the empirical Bayes method. The new results obtained in this work make it possible to partition high-dimensional data; to test the independence of selected components of high-dimensional uncorrelated data; and to select the optimal model for rare events in large populations together with the corresponding empirical Bayes estimator. A nonsingularity condition is presented for the Poisson-gamma model.
The research object is data mining empirical Bayes methods and algorithms applied in the analysis of large populations of large dimensions. The aim and objectives of the research are to create methods and algorithms for testing nonparametric hypotheses for large populations and for estimating the parameters of data models. The following problems are solved to reach these objectives: 1. To create an efficient data partitioning algorithm for large dimensional data. 2. To apply the data partitioning algorithm of large dimensional data in testing nonparametric hypotheses. 3. To apply the empirical Bayes method in testing the independence of components of large dimensional data vectors. 4. To develop an algorithm for estimating probabilities of rare events in large populations, using the empirical Bayes method and comparing Poisson-gamma and Poisson-Gaussian mathematical models, by selecting an optimal model and a respective empirical Bayes estimator. 5. To create an algorithm for logistic regression of rare events using the empirical Bayes method. The results obtained enable us to perform very fast and efficient partitioning of large dimensional data; to test the independence of selected components of large dimensional data; and to select the optimal model in the estimation of probabilities of rare events, using the Poisson-gamma and Poisson-Gaussian mathematical models and empirical Bayes estimators. The nonsingularity condition in the case of the Poisson-gamma model is presented.
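The Poisson-gamma model for rare-event frequencies admits a closed-form EB estimator: with x_i ~ Poisson(λ_i) and λ_i ~ Gamma(a, b) in the rate parametrization, the posterior mean of λ_i is (x_i + a)/(1 + b), and a, b can be estimated from the marginal moments. A minimal moment-matching sketch, not the thesis's algorithm; the counts are illustrative:

```python
def poisson_gamma_eb(counts):
    """EB rate estimates under x ~ Poisson(lam), lam ~ Gamma(a, b).

    Marginally E[x] = a/b and Var[x] = a/b + a/b^2, so method of moments
    gives b = mean / (var - mean) and a = mean * b (requires var > mean).
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / n
    if var <= mean:                 # no evidence of rate heterogeneity
        return [mean] * n
    b = mean / (var - mean)
    a = mean * b
    return [(c + a) / (1 + b) for c in counts]

# Small counts are shrunk toward the overall mean rate.
rates = poisson_gamma_eb([0, 0, 1, 0, 2, 0, 0, 5, 0, 1])
```

Cells with zero observed events still receive a positive rate estimate, which is exactly what makes the EB approach attractive for rare events.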
41

Mao, Yi. "Domain knowledge, uncertainty, and parameter constraints." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/37295.

Full text
42

Sengupta, Aritra. "Empirical Hierarchical Modeling and Predictive Inference for Big, Spatial, Discrete, and Continuous Data." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1350660056.

Full text
43

Fredette, Marc. "Prediction of recurrent events." Thesis, University of Waterloo, 2004. http://hdl.handle.net/10012/1142.

Full text
Abstract:
In this thesis, we study issues related to prediction problems, with an emphasis on those arising when recurrent events are involved. The first chapter defines the basic concepts of frequentist and Bayesian statistical prediction. In the second chapter, we study frequentist prediction intervals and their associated predictive distributions, and present an approach based on asymptotically uniform pivotals that is shown to dominate the plug-in approach under certain conditions. The following three chapters consider the prediction of recurrent events. The third chapter presents different prediction models for the case where these events can be modeled using homogeneous Poisson processes; amongst these models, those using random effects are shown to possess interesting features. In the fourth chapter, the time homogeneity assumption is relaxed and we present prediction models for non-homogeneous Poisson processes, whose behavior is then studied for prediction problems with a finite horizon. In the fifth chapter, we apply the concepts discussed previously to a warranty dataset from the automobile industry; since the number of processes in this dataset is very large, we focus on methods providing computationally rapid prediction intervals. Finally, we discuss possibilities for future research in the last chapter.
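For a homogeneous Poisson process, the simplest prediction of the number of future events is the plug-in approach the thesis improves upon: estimate the rate from the observed history and read off Poisson quantiles. A minimal sketch with illustrative data:

```python
import math

def poisson_prediction_interval(n_events, t_observed, t_future, coverage=0.95):
    """Plug-in predictive interval for the event count in a future window
    of length t_future, given n_events observed over t_observed."""
    mu = (n_events / t_observed) * t_future   # plug-in predictive mean
    # Walk the Poisson(mu) pmf until each tail holds (1 - coverage) / 2.
    lower_tail = (1 - coverage) / 2
    cdf, k, lo, hi = 0.0, 0, None, None
    while hi is None:
        cdf += math.exp(-mu) * mu ** k / math.factorial(k)
        if lo is None and cdf > lower_tail:
            lo = k
        if cdf >= 1 - lower_tail:
            hi = k
        k += 1
    return lo, hi

# 30 events observed over 10 time units (rate 3); predict a window of
# length 5, so the plug-in predictive mean is 15.
lo, hi = poisson_prediction_interval(30, 10.0, 5.0)
```

Because the plug-in interval ignores the uncertainty in the estimated rate, it tends to undercover; the pivotal-based intervals studied in the thesis are designed to correct exactly that.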
44

SARTOR, MAUREEN A. "TESTING FOR DIFFERENTIALLY EXPRESSED GENES AND KEY BIOLOGICAL CATEGORIES IN DNA MICROARRAY ANALYSIS." University of Cincinnati / OhioLINK, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1195656673.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Olufowobi, Oluwaseun Temitope. "The Safety Impact of Raising Speed Limit on Rural Freeways In Ohio." University of Dayton / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1597014805133206.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Viktorova, Elena [Author], Heike [Academic Supervisor] Bickeböller, Tim [Academic Supervisor] Beissbarth, Dieter [Academic Supervisor] Kube, and Tim [Academic Supervisor] Friede. "Gene-Environment Interaction and Extension to Empirical Hierarchical Bayes Models in Genome-Wide Association Studies / Elena Viktorova. Reviewers: Tim Beissbarth ; Dieter Kube ; Tim Friede. Supervisor: Heike Bickeböller." Göttingen : Niedersächsische Staats- und Universitätsbibliothek Göttingen, 2014. http://d-nb.info/105419176X/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Sohns, Melanie [Author], Heike [Academic Supervisor] Bickeböller, and Martin [Academic Supervisor] Schlather. "The Empirical Hierarchical Bayes Approach for Pathway Integration and Gene-Environment Interactions in Genome-Wide Association Studies / Melanie Sohns. Reviewers: Martin Schlather ; Heike Bickeböller. Supervisor: Heike Bickeböller." Göttingen : Niedersächsische Staats- und Universitätsbibliothek Göttingen, 2012. http://d-nb.info/1043029478/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Carroll, James Lamond. "A Bayesian Decision Theoretical Approach to Supervised Learning, Selective Sampling, and Empirical Function Optimization." Diss., CLICK HERE for online access, 2010. http://contentdm.lib.byu.edu/ETD/image/etd3413.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Knecht, Casey Scott. "Crash Prediction Modeling for Curved Segments of Rural Two-Lane Two-Way Highways in Utah." BYU ScholarsArchive, 2014. https://scholarsarchive.byu.edu/etd/4352.

Full text
Abstract:
This thesis contains the results of the development of crash prediction models for curved segments of rural two-lane two-way highways in the state of Utah. The modeling effort included the calibration of the predictive model found in the Highway Safety Manual (HSM) as well as the development of Utah-specific models developed using negative binomial regression. The data for these models came from randomly sampled curved segments in Utah, with crash data coming from years 2008-2012. The total number of randomly sampled curved segments was 1,495. The HSM predictive model for rural two-lane two-way highways consists of a safety performance function (SPF), crash modification factors (CMFs), and a jurisdiction-specific calibration factor. For this research, two sample periods were used: a three-year period from 2010 to 2012 and a five-year period from 2008 to 2012. The calibration factor for the HSM predictive model was determined to be 1.50 for the three-year period and 1.60 for the five-year period. These factors are to be used in conjunction with the HSM SPF and all applicable CMFs. A negative binomial model was used to develop Utah-specific crash prediction models based on both the three-year and five-year sample periods. A backward stepwise regression technique was used to isolate the variables that would significantly affect highway safety. The independent variables used for negative binomial regression included the same set of variables used in the HSM predictive model along with other variables such as speed limit and truck traffic that were considered to have a significant effect on potential crash occurrence. The significant variables at the 95 percent confidence level were found to be average annual daily traffic, segment length, total truck percentage, and curve radius. 
The main benefit of the Utah-specific crash prediction models is that they provide a reasonable level of accuracy for crash prediction yet only require four variables, thus requiring much less effort in data collection compared to using the HSM predictive model.
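The HSM predictive model described in the abstract multiplies a base SPF estimate by the applicable CMFs and by a jurisdiction-specific calibration factor, where the calibration factor is the ratio of total observed crashes to total HSM-predicted crashes over the sample sites. A minimal sketch of that arithmetic follows; the function names are illustrative, not taken from the thesis or the HSM.

```python
def calibration_factor(observed, predicted):
    """Jurisdiction-specific calibration factor: ratio of total observed
    crashes to total HSM-predicted crashes across the sample sites."""
    return sum(observed) / sum(predicted)

def hsm_predicted_crashes(spf_value, cmfs, calibration):
    """HSM-style prediction: base SPF estimate adjusted by the product
    of the applicable crash modification factors and the local
    calibration factor."""
    adjustment = 1.0
    for cmf in cmfs:
        adjustment *= cmf
    return spf_value * adjustment * calibration
```

For example, sites with 50% more observed crashes than the uncalibrated model predicts yield a calibration factor of 1.50, matching the three-year factor reported in the abstract.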
APA, Harvard, Vancouver, ISO, and other styles
50

Piaseckienė, Karolina. "The statistical methods in the analysis of the Lithuanian language complexity." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2014. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2014~D_20140922_141231-96020.

Full text
Abstract:
The aim of this work is to apply mathematical and statistical methods to the analysis of the Lithuanian language, identifying and taking into account the peculiarities of the language, its heterogeneity, complexity, and variability.
APA, Harvard, Vancouver, ISO, and other styles