Dissertations / Theses on the topic 'Empirical Bayes methods'
Consult the top 26 dissertations / theses for your research on the topic 'Empirical Bayes methods.'
Benhaddou, Rida. "Nonparametric and Empirical Bayes Estimation Methods." Doctoral diss., University of Central Florida, 2013. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5765.
Ph.D. (Doctorate), Mathematics, Sciences
Brandel, John. "Empirical Bayes methods for missing data analysis." Thesis, Uppsala University, Department of Mathematics, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-121408.
Lönnstedt, Ingrid. "Empirical Bayes Methods for DNA Microarray Data." Doctoral thesis, Uppsala University, Department of Mathematics, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-5865.
cDNA microarrays are one of the first high-throughput gene expression technologies to have emerged within molecular biology for the purpose of functional genomics. cDNA microarrays compare the gene expression levels between cell samples, for thousands of genes simultaneously.
The microarray technology poses new challenges for data analysis, since thousands of genes are examined in parallel but with very few replicates, yielding noisy estimates of gene effects and variances. Although careful image analysis and normalisation of the data are applied, traditional methods for inference like the Student t or Fisher's F-statistic fail to work.
In this thesis, four papers on the topics of empirical Bayes and full Bayesian methods for two-channel microarray data (e.g. cDNA) are presented. These contribute to showing that empirical Bayes methods are useful for overcoming the specific data problems. The sample distributions of all the genes involved in a microarray experiment are summarized into prior distributions, which improve the inference for each single gene.
The first part of the thesis includes biological and statistical background of cDNA microarrays, with an overview of the different steps of two-channel microarray analysis, including experimental design, image analysis, normalisation, cluster analysis, discrimination and hypothesis testing. The second part of the thesis consists of the four papers. Paper I presents the empirical Bayes statistic B, which corresponds to a t-statistic. Paper II is based on a version of B that is extended for linear model effects. Paper III assesses the performance of empirical Bayes models by comparisons with full Bayes methods. Paper IV provides extensions of B to what corresponds to F-statistics.
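The idea of borrowing strength across genes can be illustrated with a small sketch. This is not the thesis's B statistic: it is a generic t-like statistic in which a gene's sample variance is shrunk toward a prior value. The function names and the hyperparameters `prior_var` and `prior_df` are hypothetical and are assumed to have been estimated beforehand from all genes.

```python
import math
from statistics import mean, variance


def ordinary_t(replicates):
    """Classical one-sample t-statistic for one gene's log-ratios."""
    n = len(replicates)
    return mean(replicates) / math.sqrt(variance(replicates) / n)


def moderated_t(replicates, prior_var, prior_df):
    """t-like statistic whose variance is shrunk toward a prior value.

    prior_var and prior_df are hypothetical hyperparameters, assumed to
    have been estimated from the whole set of genes (not done here).
    """
    n = len(replicates)
    df = n - 1
    s2 = variance(replicates)  # per-gene sample variance
    # Weighted average of the prior variance and the gene's own variance.
    shrunk = (prior_df * prior_var + df * s2) / (prior_df + df)
    return mean(replicates) / math.sqrt(shrunk / n)


# A gene with a small effect and, by chance, a tiny variance.
gene = [0.10, 0.11, 0.09]
t_plain = ordinary_t(gene)
t_mod = moderated_t(gene, prior_var=0.04, prior_df=4.0)
print(t_plain, t_mod)
```

With only three replicates, the ordinary statistic is driven almost entirely by the accidentally small variance; the shrunken version is far less extreme.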
Lönnstedt, Ingrid. "Empirical Bayes methods for DNA microarray data /." Uppsala : Matematiska institutionen, Univ. [distributör], 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-5865.
Jakimauskas, Gintautas. "Analysis and application of empirical Bayes methods in data mining." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2014. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2014~D_20140423_090853-72998.
The object of the research is empirical Bayes methods and algorithms in data mining, applied to the analysis of high-dimensional data from large populations. The aim of the research is to develop methods and algorithms for testing nonparametric hypotheses for large populations and for estimating the parameters of data models. To achieve this aim, the following problems are solved: 1. To create a partitioning algorithm for high-dimensional data. 2. To apply the high-dimensional data partitioning algorithm to testing nonparametric hypotheses. 3. To apply the empirical Bayes method to testing the hypothesis of independence of the components of multivariate data with different mathematical models, determining the optimal model and the corresponding empirical Bayes estimator. 4. To create an algorithm for estimating the frequencies of rare events in large populations using the empirical Bayes method, comparing the Poisson-gamma and Poisson-Gaussian mathematical models. 5. To create an algorithm for logistic regression of rare events using the empirical Bayes method. The new results obtained make it possible to partition high-dimensional data; to test the independence of selected components of high-dimensional uncorrelated data; and to select the optimal model for rare events in large populations and the corresponding empirical Bayes estimator. A nonsingularity condition for the Poisson-gamma model is presented.
Everitt, Niklas. "Module identification in dynamic networks: parametric and empirical Bayes methods." Doctoral thesis, KTH, Reglerteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-208920.
System identification is used to estimate a model of a dynamic system by fitting the model's parameters to experimental measurement data collected from the system to be modelled. The systems being modelled tend to grow so large in scale and so complex that direct modelling is neither feasible nor desirable. In many cases the complex system can be described as a composition of simpler linear systems (modules) interconnected in what we call dynamic networks. The task of modelling all or part of the network can thus be broken down into the subtask of modelling one module in the dynamic network. The most common way to estimate the parameters of a model is to minimize the so-called prediction error. This type of method has recently been adapted to identify modules in dynamic networks. The method enjoys good properties with respect to the model error that stems from stochastic disturbances during the experiment, and when the disturbances are normally distributed the method coincides with the maximum likelihood method. A drawback of the method is that the function being minimized is usually not convex, so the method risks getting stuck in a local minimum. A good starting point is therefore essential. Other methods are thus needed to find a starting point; for example, instrumental variable methods can be used. In this thesis an alternative method called MORSM is proposed. MORSM is motivated by arguments drawn from maximum likelihood and is also asymptotically efficient in certain cases. MORSM consists of steps that can be solved with the least-squares method and is therefore computationally attractive. The part of the network that is of no interest is estimated only nonparametrically, which simplifies the choice of model order for the user. A different starting point is taken in the second method proposed for estimating a module embedded in a dynamic network.
The impulse response of the part of the network that is of no interest is modelled as a realization of a Gaussian process. The mean and covariance of the Gaussian process are parametrized by a set of parameters called hyperparameters, which are estimated together with the parameters of the module. The parameters are estimated by maximizing the marginal likelihood function. The optimization is carried out iteratively with ECM, a variant of the expectation-maximization (EM) algorithm. The algorithm has two steps. The E-step has an analytical solution, while the CM-step reduces to subproblems that either have an analytical solution or are low-dimensional and can therefore be solved with gradient-based methods. The overall optimization is thus computationally attractive. Using MCMC techniques, the method is generalized to include additional sensors whose impulse responses are also modelled as Gaussian processes. Besides the choice of method, the choice of signals affects the accuracy, or covariance, of the estimated module. Classical expressions for the covariance matrix can be used to optimize the choice of signals. However, these expressions give no insight into why the choice of certain signals is optimal or what would happen if the conditions were different. The expressions derived in this part of the thesis serve a different purpose. They instead attempt to express the covariance in terms that can give insight into what affects the achievable accuracy. More specifically, the covariance is expressed in terms of, among other things, the input signals' spectra, the noise signals' spectra, and the model structure.
Duan, Xiuwen. "Revisiting Empirical Bayes Methods and Applications to Special Types of Data." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42340.
Hort, Molly. "A comparison of hypothesis testing procedures for two population proportions." Manhattan, Kan. : Kansas State University, 2008. http://hdl.handle.net/2097/725.
Kisamore, Jennifer L. "Validity Generalization and Transportability: An Investigation of Distributional Assumptions of Random-Effects Meta-Analytic Methods." [Tampa, Fla.] : University of South Florida, 2003. http://purl.fcla.edu/fcla/etd/SFE0000060.
Jakimauskas, Gintautas. "Duomenų tyrybos empirinių Bajeso metodų tyrimas ir taikymas." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2014. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2014~D_20140423_090834-67696.
The research object is data mining empirical Bayes methods and algorithms applied in the analysis of large populations of large dimensions. The aim and objectives of the research are to create methods and algorithms for testing nonparametric hypotheses for large populations and for estimating the parameters of data models. The following problems are solved to reach these objectives: 1. To create an efficient data partitioning algorithm for large dimensional data. 2. To apply the data partitioning algorithm for large dimensional data in testing nonparametric hypotheses. 3. To apply the empirical Bayes method in testing the independence of components of large dimensional data vectors. 4. To develop an algorithm for estimating probabilities of rare events in large populations, using the empirical Bayes method and comparing Poisson-gamma and Poisson-Gaussian mathematical models, by selecting an optimal model and a respective empirical Bayes estimator. 5. To create an algorithm for logistic regression of rare events using the empirical Bayes method. The results obtained enable us to perform very fast and efficient partitioning of large dimensional data; to test the independence of selected components of large dimensional data; and to select the optimal model in the estimation of probabilities of rare events, using the Poisson-gamma and Poisson-Gaussian mathematical models and empirical Bayes estimators. The nonsingularity condition in the case of the Poisson-gamma model is presented.
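To make the Poisson-gamma model mentioned in this abstract concrete, here is a minimal sketch of empirical Bayes rate estimation. It is not the thesis's algorithm: the prior is fitted by a simple method of moments, and the function name, the sample data, and the fallback to the pooled rate are illustrative assumptions.

```python
from statistics import mean


def poisson_gamma_eb(counts, exposures):
    """Empirical Bayes rates under y_i ~ Poisson(n_i * lam_i), lam_i ~ Gamma(a, b).

    The gamma prior has mean a / b and variance a / b**2. Its parameters are
    fitted by a method of moments (an assumption of this sketch).
    """
    raw = [y / n for y, n in zip(counts, exposures)]
    pooled = sum(counts) / sum(exposures)  # estimate of the prior mean a / b
    # Var(y_i / n_i) = pooled / n_i + prior variance, so subtract the Poisson part.
    prior_var = mean((r - pooled) ** 2 - pooled / n for r, n in zip(raw, exposures))
    if prior_var <= 0:
        # No detectable heterogeneity: every unit gets the pooled rate.
        return [pooled] * len(counts)
    b = pooled / prior_var
    a = pooled * b
    # Posterior mean of lam_i given y_i is (a + y_i) / (b + n_i).
    return [(a + y) / (b + n) for y, n in zip(counts, exposures)]


# Hypothetical rare-event counts and population sizes for five groups.
counts = [0, 2, 1, 30, 3]
exposures = [1000, 1500, 800, 2000, 1200]
estimates = poisson_gamma_eb(counts, exposures)
for y, n, est in zip(counts, exposures, estimates):
    print(f"raw={y / n:.5f}  eb={est:.5f}")
```

Each estimate is a weighted average of the group's raw rate and the pooled rate, with groups that have small exposure pulled more strongly toward the pooled value.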
Piaseckienė, Karolina. "The statistical methods in the analysis of the Lithuanian language complexity." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2014. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2014~D_20140922_141231-96020.
Full textPagrindinis darbo tikslas – pritaikyti matematinius ir statistinius metodus lietuvių kalbos analizėje, identifikuojant ir atsižvelgiant į lietuvių kalbos ypatumus, jos heterogeniškumą, sudėtingumą ir variabilumą.
Fredette, Marc. "Prediction of recurrent events." Thesis, University of Waterloo, 2004. http://hdl.handle.net/10012/1142.
Yu, Xue Qin. "Comparing survival from cancer using population-based cancer registry data - methods and applications." Thesis, The University of Sydney, 2007. http://hdl.handle.net/2123/1774.
Yu, Xue Qin. "Comparing survival from cancer using population-based cancer registry data - methods and applications." University of Sydney, 2007. http://hdl.handle.net/2123/1774.
Over the past decade, population-based cancer registry data have been used increasingly worldwide to evaluate and improve the quality of cancer care. The utility of the conclusions from such studies relies heavily on the data quality and the methods used to analyse the data. Interpretation of comparative survival from such data, examining either temporal trends or geographical differences, is generally not easy. The observed differences could be due to methodological and statistical approaches or to real effects. For example, geographical differences in cancer survival could be due to a number of real factors, including access to primary health care, the availability of diagnostic and treatment facilities and the treatment actually given, or to artefact, such as lead-time bias, stage migration, sampling error or measurement error. Likewise, a temporal increase in survival could be the result of earlier diagnosis and improved treatment of cancer; it could also be due to artefact after the introduction of screening programs (adding lead time), changes in the definition of cancer, stage migration or several of these factors, producing both real and artefactual trends. In this thesis, I report methods that I modified and applied, some technical issues in the use of such data, and an analysis of data from the State of New South Wales (NSW), Australia, illustrating their use in evaluating and potentially improving the quality of cancer care, showing how data quality might affect the conclusions of such analyses. This thesis describes studies of comparative survival based on population-based cancer registry data, with three published papers and one accepted manuscript (subject to minor revision). In the first paper, I describe a modified method for estimating spatial variation in cancer survival using empirical Bayes methods (which was published in Cancer Causes and Control 2004). 
I demonstrate in this paper that the empirical Bayes method is preferable to standard approaches and show how it can be used to identify cancer types where a focus on reducing area differentials in survival might lead to important gains in survival. In the second paper (published in the European Journal of Cancer 2005), I apply this method to a more complete analysis of spatial variation in survival from colorectal cancer in NSW and show that estimates of spatial variation in colorectal cancer can help to identify subgroups of patients for whom better application of treatment guidelines could improve outcome. I also show how estimates of the numbers of lives that could be extended might assist in setting priorities for treatment improvement. In the third paper, I examine time trends in survival from 28 cancers in NSW between 1980 and 1996 (published in the International Journal of Cancer 2006) and conclude that for many cancers, falls in excess deaths in NSW from 1980 to 1996 are unlikely to be attributable to earlier diagnosis or stage migration; thus, advances in cancer treatment have probably contributed to them. In the accepted manuscript, I described an extension of the work reported in the second paper, investigating the accuracy of staging information recorded in the registry database and assessing the impact of error in its measurement on estimates of spatial variation in survival from colorectal cancer. The results indicate that misclassified registry stage can have an important impact on estimates of spatial variation in stage-specific survival from colorectal cancer. Thus, if cancer registry data are to be used effectively in evaluating and improving cancer care, the quality of stage data might have to be improved. 
Taken together, the four papers show that creative, informed use of population-based cancer registry data, with appropriate statistical methods and acknowledgement of the limitations of the data, can be a valuable tool for evaluating and possibly improving cancer care. Use of these findings to stimulate evaluation of the quality of cancer care should enhance the value of the investment in cancer registries. They should also stimulate improvement in the quality of cancer registry data, particularly that on stage at diagnosis. The methods developed in this thesis may also be used to improve estimation of geographical variation in other count-based health measures when the available data are sparse.
Rahal, Abbas. "Bayesian Methods Under Unknown Prior Distributions with Applications to The Analysis of Gene Expression Data." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42408.
Liley, Albert James. "Statistical co-analysis of high-dimensional association studies." Thesis, University of Cambridge, 2017. https://www.repository.cam.ac.uk/handle/1810/270628.
Devarasetty, Prem Chand. "SAFETY IMPROVEMENTS ON MULTILANE ARTERIALS A BEFORE AND AFTER EVALUATION USING THE EMPIRICAL BAYES METHOD." Master's thesis, Orlando, Fla. : University of Central Florida, 2009. http://purl.fcla.edu/fcla/etd/CFE0002723.
Filho, Diógenes Ferreira. "Estudo de expressão gênica em citros utilizando modelos lineares." Universidade de São Paulo, 2010. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-16032010-111945/.
This work reviews the methodology of microarray experiments, covering their set-up and the statistical analysis of the resulting data. The methodology is then applied to the analysis of gene expression data in citrus generated by a macroarray experiment, using fixed-effects linear models that include or exclude different effects and that are fitted both to each gene separately and to all genes simultaneously. Macroarray experiments are similar to microarray experiments but use a smaller number of genes; in general, they are used because of economic restrictions. Because only a few arrays were used in the experiment analyzed in this study, an empirical Bayes approach was used that provides more stable variance estimates and takes into account the correlation among replicates of a gene within an array. A nonparametric analysis method was also used to get around the problem of non-normality for some genes. The results obtained with each of the described methods of analysis were then compared.
Wahl, Jean-Baptiste. "The Reduced basis method applied to aerothermal simulations." Thesis, Strasbourg, 2018. http://www.theses.fr/2018STRAD024/document.
In this thesis we present our work on model order reduction for aerothermal simulations. We consider the coupling between the incompressible Navier-Stokes equations and an advection-diffusion equation for the temperature. Since the physical parameters induce high Reynolds and Péclet numbers, we have to introduce stabilization operators in the formulation to deal with the well-known numerical stability issue. The chosen stabilization, applied to both the fluid and heat equations, is the usual Streamline-Upwind/Petrov-Galerkin (SUPG) method, which adds artificial diffusivity in the direction of the convection field. We also introduce our order reduction strategy for this model, based on the Reduced Basis Method (RBM). To recover an affine decomposition for this complex model, we implemented a discrete variant of the Empirical Interpolation Method (EIM). This variant allows building an approximate affine decomposition for complex operators such as those arising from SUPG. We also use this method for the non-linear operators induced by the shock-capturing method. The construction of an EIM basis for non-linear operators involves a potentially huge number of non-linear FEM solves, depending on the size of the sampling. Even if this basis is built during an offline phase, we usually cannot afford such an expensive computational cost. We took advantage of the recent development of the Simultaneous EIM Reduced basis algorithm (SER) to tackle this issue.
Laurent, Philippe. "Méthodes d'accéleration pour la résolution numérique en électrolocation et en chimie quantique." Thesis, Nantes, Ecole des Mines, 2015. http://www.theses.fr/2015EMNA0122/document.
This thesis tackles two different topics. We first design and analyze algorithms related to the electric sense for applications in robotics. We consider in particular the method of reflections, which, like the Schwarz method, allows linear problems to be solved using simpler sub-problems. These are obtained by decomposing the boundaries of the original problem. We give proofs of convergence and applications. In order to implement an electrolocation simulator of the direct problem in an autonomous robot, we build a reduced basis method devoted to electrolocation problems. In this way, we obtain algorithms which satisfy the constraints of limited memory and time resources. The second topic is an inverse problem in quantum chemistry. Here, we want to determine some features of a quantum system. To this end, the system is illuminated by a known and fixed laser field. In this framework, the data of the inverse problem are the states before and after the laser illumination. A local existence result is given, together with numerical methods for solving the problem.
Yang, L., and Daniel Neagu. "Integration strategies for toxicity data from an empirical perspective." 2014. http://hdl.handle.net/10454/10814.
The recent development of information techniques, especially state-of-the-art "big data" solutions, enables the extraction, gathering, and processing of large amounts of toxicity information from multiple sources. Facilitated by this technological advance, a framework named integrated testing strategies (ITS) has been proposed in the predictive toxicology domain, in an effort to make intelligent joint use of multiple heterogeneous toxicity data records (through data fusion, grouping, interpolation/extrapolation etc.) for toxicity assessment. This will ultimately contribute to accelerating the development cycle of chemical products, reducing animal use, and decreasing development costs. Most current studies of ITS are based on a group of consensus processes, termed weight of evidence (WoE), which quantitatively integrate all the relevant data instances towards the same endpoint into an integrated decision supported by data quality. Several WoE implementations for the particular case of toxicity data fusion have been presented in the literature, and they are studied collectively in this paper. Noting that these uncertainty-handling methodologies are usually not simply developed from conventional probability theory, owing to the unavailability of big datasets, this paper first investigates the mathematical foundations of these approaches. Then, the investigated data integration models are applied to a representative case in the predictive toxicology domain, and the experimental results are compared and analysed.
Zhang, Pengyue. "Study designs and statistical methods for pharmacogenomics and drug interaction studies." Diss., 2016. http://hdl.handle.net/1805/11300.
Adverse drug events (ADEs) are injuries resulting from drug-related medical interventions. ADEs can be induced either by a single drug or by a drug-drug interaction (DDI). In order to prevent unnecessary ADEs, many regulatory agencies in public health maintain pharmacovigilance databases for detecting novel drug-ADE associations. However, pharmacovigilance databases usually contain a significant portion of false associations due to their inherent structure (i.e. false drug-ADE associations caused by co-medications). Besides pharmacovigilance studies, the risks of ADEs can be minimized by understanding their mechanisms, which include abnormal pharmacokinetics/pharmacodynamics due to genetic factors and synergistic effects between drugs. During the past decade, pharmacogenomics studies have successfully identified several predictive markers to reduce ADE risks. However, pharmacogenomics studies are usually limited by sample size and budget. In this dissertation, we develop statistical methods for pharmacovigilance and pharmacogenomics studies. Firstly, we propose an empirical Bayes mixture model to identify significant drug-ADE associations. The proposed approach can be used for both signal generation and ranking. Following this approach, the portion of false associations among the detected signals can be well controlled. Secondly, we propose a mixture dose response model to investigate the functional relationship between increased dimensionality of drug combinations and the ADE risks. Moreover, this approach can be used to identify high-dimensional drug combinations that are associated with escalated ADE risks at significantly low local false discovery rates. Finally, we propose a cost-efficient design for pharmacogenomics studies. In order to pursue further cost-efficiency, the proposed design involves both DNA pooling and a two-stage design approach. 
Compared to the traditional design, the cost under the proposed design is reduced dramatically with an acceptable compromise on statistical power. The proposed methods are examined by extensive simulation studies. Furthermore, the proposed methods for analyzing pharmacovigilance databases are applied to the FDA's Adverse Event Reporting System database and a local electronic medical record (EMR) database. For different scenarios of pharmacogenomics studies, optimized designs to detect a functioning rare allele are given as well.
Yang-YuCheng and 鄭暘諭. "Estimation of False Discovery Rate Using Empirical Bayes Method." Thesis, 2016. http://ndltd.ncl.edu.tw/handle/78t3ye.
National Cheng Kung University
Department of Statistics
104
In multiple testing problems, if the individual type I error rate is not adjusted and each test is still carried out at significance level α, the overall type I error rate across m hypotheses can inflate to as much as mα. This study assumes that the gene data follow a normal mixture distribution and that the parameters have prior distributions. We use the Bayesian posterior distribution and the EM algorithm to estimate the proportion of true null hypotheses, and from it the number of true null hypotheses and the FDR. We compare the performance of these estimators for different parameter settings through Monte Carlo simulation. The estimator based on the McNemar test proposed by Ma & Chao (2011) can have too large an estimation error when the significance level is set to α = 0.05. The estimator proposed by Benjamini & Hochberg (2000) is unstable when the gene mutation ratio is set to be random, and the estimator based on the Friedman test proposed by Ma & Tsai (2011) behaves in the same way. When the number of genes and the number of patients are both large and the proportion of true null hypotheses is high, the proposed EBay estimator has the smaller RMSE and is therefore more accurate.
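The step of estimating the proportion of true null hypotheses with EM can be sketched as follows. This is a simplified stand-in, not the thesis's estimator: it fits a two-component normal mixture to simulated z-scores with the null component fixed at N(0, 1) and places no priors on the parameters. The function names, the simulation settings, and the 0.2 rejection cut-off are illustrative assumptions.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def fit_null_proportion(z, iterations=200):
    """EM for the mixture pi0 * N(0, 1) + (1 - pi0) * N(mu, sigma^2).

    Returns the estimates and each test's posterior null probability.
    """
    pi0, mu, sigma = 0.5, max(z) / 2, 1.0  # crude starting values
    for _ in range(iterations):
        # E-step: posterior probability that each z-score comes from the null.
        w = []
        for x in z:
            f0 = pi0 * normal_pdf(x, 0.0, 1.0)
            f1 = (1 - pi0) * normal_pdf(x, mu, sigma)
            w.append(f0 / (f0 + f1))
        # M-step: update the null proportion and the alternative component.
        pi0 = sum(w) / len(z)
        alt_weight = sum(1 - wi for wi in w)
        mu = sum((1 - wi) * x for wi, x in zip(w, z)) / alt_weight
        var = sum((1 - wi) * (x - mu) ** 2 for wi, x in zip(w, z)) / alt_weight
        sigma = max(math.sqrt(var), 1e-3)
    return pi0, mu, sigma, w


# Simulated z-scores: 80% true nulls, 20% shifted alternatives (assumed settings).
rng = random.Random(0)
z_scores = [rng.gauss(0, 1) if rng.random() < 0.8 else rng.gauss(3, 1)
            for _ in range(2000)]
pi0_hat, mu_hat, sigma_hat, null_prob = fit_null_proportion(z_scores)

# Reject tests with a small posterior null probability; the mean of those
# probabilities among the rejections estimates the false discovery rate.
rejected = [p for p in null_prob if p < 0.2]
fdr_hat = sum(rejected) / len(rejected)
print(pi0_hat, mu_hat, sigma_hat, fdr_hat)
```

The estimated number of true nulls is `pi0_hat * len(z_scores)`, which is the quantity the abstract derives before estimating the FDR.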
Lin, I.-Chin, and 林義欽. "Some Applications of Empirical Bayes Method for Selecting Exponential Distributions." Thesis, 1994. http://ndltd.ncl.edu.tw/handle/95597143003863223702.
Lin, Tzu-Yin, and 林姿吟. "An Empirical Bayes Process Monitoring Technique for Categorical Data Utilizing the Likelihood Ratio Method." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/338ub5.
National Chiao Tung University
Institute of Statistics
92
The purpose of the paper is to develop an empirical Bayes process monitoring technique for manufacturing categorical data utilizing the likelihood ratio method. First, assuming the normal-binomial or -multinomial model, an empirical Bayes inference for manufacturing categorical data is discussed. Next, utilizing the likelihood ratio method, an empirical Bayes process monitoring technique for manufacturing categorical data is proposed. Finally, the average run length behavior of the proposed process monitoring scheme is investigated.
Kuo, Pei-Fen. "Examining the Effects of Site-Selection Criteria for Evaluating the Effectiveness of Traffic Safety Improvement Countermeasures." Thesis, 2012. http://hdl.handle.net/1969.1/ETD-TAMU-2012-05-10841.