Dissertations / Theses on the topic 'Statistical analysis'

To see the other types of publications on this topic, follow the link: Statistical analysis.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Statistical analysis.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.

1

Wang, Tao. "Statistical design and analysis of microarray experiments." Connect to this title online, 2005. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1117201363.

Abstract:
Thesis (Ph. D.)--Ohio State University, 2005.
Title from first page of PDF file. Document formatted into pages; contains ix, 146 p.; also includes graphics (some col.). Includes bibliographical references (p. 145-146). Available online via OhioLINK's ETD Center.
2

Whitehead, Andile. "Statistical-thermodynamical analysis, using Tsallis statistics, in high energy physics." Master's thesis, University of Cape Town, 2014. http://hdl.handle.net/11427/13391.

Abstract:
Includes bibliographical references.
Obtained via the maximisation of a modified entropy, the Tsallis distribution has been used to fit the transverse momentum distributions of identified particles from several high energy experiments. We propose a form of the distribution described in Cleymans and Worku, 2012, and show it to be thermodynamically consistent. Transverse momentum distributions and fits from ALICE, ATLAS, and CMS, using both Tsallis and Boltzmann distributions, are presented.
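For reference, the thermodynamically consistent Tsallis form referred to above is commonly quoted, at mid-rapidity (y = 0) and zero chemical potential, as follows (a sketch of the standard parameterisation; the thesis may write it differently):

```latex
\frac{d^{2}N}{dp_T\,dy}\bigg|_{y=0}
  = g V \,\frac{p_T\, m_T}{(2\pi)^{2}}
    \left[ 1 + (q-1)\,\frac{m_T}{T} \right]^{-\frac{q}{q-1}},
\qquad m_T = \sqrt{p_T^{2} + m^{2}},
```

where g is the degeneracy factor, V the volume, T the temperature, and q the Tsallis parameter; the Boltzmann distribution is recovered in the limit q → 1.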
3

尹再英 and Choi-ying Wan. "Statistical analysis for capture-recapture experiments in discrete time." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2001. http://hub.hku.hk/bib/B31225287.

4

Jung, Andreas. "Statistical analysis of biomedical data." [S.l.] : [s.n.], 2004. http://deposit.ddb.de/cgi-bin/dokserv?idn=970139543.

5

Fabini, Claudia. "Statistical Analysis of Commodity Prices." St. Gallen, 2009. http://www.biblio.unisg.ch/org/biblio/edoc.nsf/wwwDisplayIdentifier/04602710001/$FILE/04602710001.pdf.

6

Perez, Melo Sergio. "Statistical Analysis of Meteorological Data." FIU Digital Commons, 2014. http://digitalcommons.fiu.edu/etd/1527.

Abstract:
Some of the more significant effects of global warming are manifested in the rise of temperatures and the increased intensity of hurricanes. This study analyzed data on annual, January and July temperatures in Miami over the period 1949 to 2011, as well as data on the central pressures and radii of maximum winds of hurricanes from 1944 to the present. Annual average, maximum and minimum temperatures were found to be increasing with time, as were July average, maximum and minimum temperatures. On the other hand, no significant trend could be detected for January average, maximum and minimum temperatures. No significant trend was detected in the central pressures and radii of maximum winds of hurricanes overall, while the radii of maximum winds for the largest hurricane of each year showed an increasing trend.
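The abstract does not name the trend test used; below is a minimal sketch of the kind of least-squares trend test commonly applied to annual temperature series (all data are synthetic stand-ins):

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for an annual average temperature series (deg F), 1949-2011.
years = np.arange(1949, 2012)
temps = 75.0 + 0.02 * (years - 1949) + np.random.default_rng(0).normal(0.0, 0.5, years.size)

# Least-squares trend: the slope estimate and a t-test of H0: slope = 0.
res = stats.linregress(years, temps)
print(f"trend = {res.slope:.4f} deg/year, p-value = {res.pvalue:.4g}")
# A small p-value indicates a statistically significant trend in the series.
```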
7

Zaykin, Dmitri V. "STATISTICAL ANALYSIS OF GENETIC ASSOCIATIONS." NCSU, 1999. http://www.lib.ncsu.edu/theses/available/etd-19990914-043001.

Abstract:

Zaykin, Dmitri V. Statistical Analysis of Genetic Associations. Advisor: Bruce S. Weir. There is an increasing need for a statistical treatment of genetic data prompted by recent advances in molecular genetics and molecular technology. Study of associations between genes is one of the most important aspects in applications of population genetics theory and statistical methodology to genetic data. Developments of these methods are important for conservation biology, experimental population genetics, forensic science, and for mapping human disease genes. Over the next several years, genotypic data will be collected to attempt locating positions of multiple genes affecting disease phenotype. Adequate statistical methodology is required to analyze these data. Special attention should be paid to multiple testing issues resulting from searching through many genetic markers and the high risk of false associations. In this research we develop theory and methods needed to treat some of these problems. We introduce exact conditional tests for analyzing associations within and between genes in samples of multilocus genotypes, and efficient algorithms to perform them. These tests are formulated for the general case of multiple alleles at arbitrary numbers of loci and lead to multiple testing adjustments based on the closed testing principle, thus providing strong protection of the family-wise error rate. We discuss an application of the closed testing method to testing for Hardy-Weinberg equilibrium, and computationally efficient shortcuts arising from methods for combining p-values that allow us to deal with large numbers of loci. We also discuss efficient Bayesian tests for heterozygote excess and deficiency, as a special case of testing for Hardy-Weinberg equilibrium, and the frequentist properties of a p-value type of quantity resulting from them. We further develop new methods for validation of experiments and for combining and adjusting independent and correlated p-values, and apply them to simulated as well as to actual gene expression data sets. These methods prove to be especially useful in situations with large numbers of statistical tests, such as in whole-genome screens for associations of genetic markers with disease phenotypes and in analyzing gene expression data obtained from DNA microarrays.
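As one concrete instance of the p-value combination methods discussed above, Fisher's method pools k independent p-values; a minimal sketch follows (an illustration only, not the thesis's adjusted procedures for correlated p-values):

```python
import numpy as np
from scipy import stats

pvalues = np.array([0.01, 0.20, 0.03, 0.50])  # hypothetical per-locus p-values

# Fisher's method: -2 * sum(log p_i) ~ chi-square with 2k df under the global null.
statistic = -2.0 * np.sum(np.log(pvalues))
combined_p = stats.chi2.sf(statistic, df=2 * pvalues.size)
print(f"chi2 = {statistic:.3f}, combined p = {combined_p:.4g}")
```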

8

Mihailovici, Manuela. "Statistical analysis of electrocardiogram data." Thesis, McGill University, 1995. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=22860.

Abstract:
An overview of the statistical procedures used in the analysis of electrocardiogram traces is presented in this thesis.
The purpose of these procedures is twofold: (i) they may suggest underlying mechanisms that influence heart rate; and (ii) they may be used as a means of classifying one or more patients into disease categories, using objective criteria rather than the subjective approaches prevalent in current practice.
In an attempt to apply the methods discussed in this thesis, a selected group of patients was analyzed using spectral analysis.
Lack of information and of control of the patients' activities while they were being monitored precluded the possibility of obtaining definitive results.
9

McClelland, Robyn L. (Robyn Leagh). "Statistical analysis of DNA profiles." Thesis, McGill University, 1994. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=68215.

Abstract:
DNA profiles have become an extremely important tool in forensic investigations, and a match between a suspect and a crime scene specimen is highly incriminating. Presentation of this evidence in court, however, requires a statistical interpretation, one which reflects the uncertainty in the results due to measurement imprecision and sampling variability. No consensus has been reached about how to quantify this uncertainty, and the literature to date is lacking an objective review of possible methods.
This thesis provides a survey of approaches to statistical analysis of DNA profile data currently in use, as well as proposed methods which seem promising. A comparison of frequentist and Bayesian approaches is made, as well as a careful examination of the assumptions required for each method.
10

Sneddon, Duncan J. M. "Statistical analysis of crystallographic data." Thesis, University of Glasgow, 2010. http://theses.gla.ac.uk/1683/.

Abstract:
The Cambridge Structural Database (CSD) is a vast resource for crystallographic information; as of 1st January 2009 there are more than 469,611 crystal structures available in the CSD. This work is centred on dSNAP, a program developed at the University of Glasgow that uses statistical methods to sort fragments of molecules into groups with similar conformations. The work is aimed at applying methods to reduce the number of variables required to describe the geometry of the fragments mined from the CSD. To this end, the geometric definition employed by dSNAP was investigated. The default definition is total geometries, which is made up of all angles and all distances, including all non-bonded distances and angles. This definition was compared with four others: all angles; all distances; bonded angles and distances; and bonded angles, distances and torsion angles. These comparisons show that non-bonded information is critical to the formation of groups of fragments with similar conformations. The remainder of this work focused on reducing the number of variables required to sort fragments with similar conformations into distinct groups. Initially, a method was developed to calculate the areas of triangles formed between three atoms of the fragment; this was employed systematically as a means of reducing the total number of variables required to describe the geometry of the fragments. Multivariate statistical methods, namely factor analysis and sparse principal components analysis, were also applied with the aim of reducing the number of variables in a systematic manner. Both methods were used to extract important variables from the original default geometric definition, total geometries; the extracted variables were then used as input for dSNAP and the results compared with the original output. Biplots, multivariate analogues of scatter plots, were used to visualise how the fragments are related to the variables describing them. Owing to the large number of variables making up the definition, factor analysis was applied to extract the important variables before the biplot was calculated. The biplots give an overview of the correlation matrix, and using these plots it is possible to select variables that influence the formation of clusters in dSNAP.
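The triangle-area descriptor mentioned above is easy to illustrate; a minimal sketch computing the area spanned by three atoms from hypothetical Cartesian coordinates:

```python
import numpy as np

def triangle_area(a: np.ndarray, b: np.ndarray, c: np.ndarray) -> float:
    """Area of the triangle with vertices a, b, c in 3D (half the cross-product norm)."""
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a))

# Hypothetical atom positions (Angstroms).
atom1 = np.array([0.0, 0.0, 0.0])
atom2 = np.array([1.5, 0.0, 0.0])
atom3 = np.array([0.0, 1.2, 0.8])
print(f"area = {triangle_area(atom1, atom2, atom3):.3f} A^2")
```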
11

Moody, Stephen James. "Statistical analysis of galaxy surveys." Thesis, University of Cambridge, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.619673.

12

溫達偉 and Tat-wai David Wan. "Statistical analysis on counterfeit currency." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1996. http://hub.hku.hk/bib/B31267737.

13

Bai, Yang, and 柏楊. "Statistical analysis for longitudinal data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2009. http://hub.hku.hk/bib/B42841756.

14

Adhikari, Kaustubh. "Statistical Methodology for Sequence Analysis." Thesis, Harvard University, 2012. http://dissertations.umi.com/gsas.harvard:10178.

Abstract:
Rare disease variants have received increasing attention in the past few years as potential causes of many complex diseases, after common disease variants failed to explain a large part of the missing heritability. With advances in sequencing techniques as well as computational capabilities, statistical methodology for analyzing rare variants is now a hot topic, especially in case-control association studies. In this thesis, we initially present two related statistical methodologies designed for case-control studies to predict the number of common and rare variants in a particular genomic region underlying a complex disease. Genome-wide association studies (GWAS) are nowadays routinely performed to identify a few putative marker loci or a candidate region for further analysis. These methods are designed to work with SNP data on such a genomic region highlighted by GWAS for potential disease variants. The fundamental idea is to use Bayesian methodology to obtain bivariate posterior distributions on counts of common and rare variants. While the first method uses randomly generated (minimal) ancestral recombination graphs, the second uses an ensemble clustering method to explore the space of genealogical trees that represent the inherent structure in the test subjects. In contrast to the aforesaid methods, which work with SNP data, the third chapter deals with next-generation sequencing data to detect the presence of rare variants in a genomic region. We present a non-parametric statistical methodology for rare variant association testing, using the well-known Kolmogorov-Smirnov framework adapted for genetic data. It is a fast, model-free, robust statistic, designed for situations where both deleterious and protective variants are present. It is also unique in utilizing the variant locations in the test statistic.
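As a rough illustration of the Kolmogorov-Smirnov framework named in the last paragraph, here is a two-sample KS comparison of per-subject rare-variant burdens (synthetic counts; the thesis's statistic additionally uses variant locations):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical rare-variant counts per subject in a genomic region.
cases = rng.poisson(2.0, size=200)     # cases carry slightly more rare alleles
controls = rng.poisson(1.5, size=200)

# Two-sample Kolmogorov-Smirnov test comparing the two burden distributions.
ks = stats.ks_2samp(cases, controls)
print(f"D = {ks.statistic:.3f}, p = {ks.pvalue:.4g}")
```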
15

Walshaw, David. "Statistical analysis of wind speeds." Thesis, University of Sheffield, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.358279.

16

Alfahad, Mai F. A. M. "Statistical shape analysis of helices." Thesis, University of Leeds, 2018. http://etheses.whiterose.ac.uk/21675/.

Abstract:
Consider a sequence of equally spaced points along a helix in three-dimensional space, observed subject to statistical noise. In this thesis, a maximum likelihood (ML) method is developed to estimate the parameters of the helix. Statistical properties of the estimator are studied and comparisons are made to other estimators found in the literature. Methods are established for the fitting of unkinked and kinked helices. For an unkinked helix, an initial estimate of the helix axis is obtained by a modified eigen-decomposition or a method from the literature. The Mardia-Holmes model can be used to estimate the initial helix axis, but it is often not very successful since it requires initial parameters; a better method for initial axis estimation is the Rotfit method. If the axis is known, we minimize the residual sum of squares (RSS) to estimate the helix parameters and then optimize the axis estimate. For a kinked helix, we specify a test statistic by simulating the null distribution of unkinked helices. If the kink position is known, the test statistic approximately follows an F-distribution. If the null hypothesis is rejected, i.e. the helix has a change point, the helix is cut into two sub-helices at the change point where the statistic is maximal. Test statistics are studied to assess how these two sub-helices differ from each other, and a parametric bootstrap procedure is used to study these statistics. The shapes of protein alpha-helices are used to illustrate the procedure.
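A minimal sketch of an F statistic for comparing a single-helix fit against two sub-helix fits at a known change point (all numbers are hypothetical; the thesis's exact statistic may differ):

```python
from scipy import stats

# Hypothetical fit summaries: residual sums of squares and parameter counts
# for one full-helix fit versus two sub-helix fits split at a known point.
rss_full, p_full = 12.4, 7      # single unkinked helix
rss_split, p_split = 8.1, 14    # two sub-helices fitted separately
n = 60                          # number of observed points

f_stat = ((rss_full - rss_split) / (p_split - p_full)) / (rss_split / (n - p_split))
p_value = stats.f.sf(f_stat, p_split - p_full, n - p_split)
print(f"F = {f_stat:.3f}, p = {p_value:.4g}")  # a small p suggests a kink
```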
17

Van, Dyck Jozef Frans Maria. "Statistical analysis of earthquake catalogs." Thesis, Massachusetts Institute of Technology, 1985. http://hdl.handle.net/1721.1/42969.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Civil Engineering, 1986.
Microfiche copy available in Archives and Engineering. Bibliography: leaves 262-269.
18

Bruno, Rexanne Marie. "Statistical Analysis of Survival Data." UNF Digital Commons, 1994. http://digitalcommons.unf.edu/etd/150.

Abstract:
The terminology and ideas involved in the statistical analysis of survival data are explained, including the survival function, the probability density function, the hazard function, censored observations, parametric and nonparametric estimation of these functions, the product-limit estimation of the survival function, and the proportional hazards estimation of the hazard function with explanatory variables. In Appendix A these ideas are applied to the actual analysis of the survival data for 54 cervical cancer patients.
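A minimal sketch of the product-limit (Kaplan-Meier) estimator described above, using hypothetical survival times with censoring:

```python
import numpy as np

# Hypothetical survival times in months; event=1 for death, 0 for censored.
times = np.array([5, 8, 8, 12, 15, 20, 22, 30])
event = np.array([1, 1, 0, 1, 0, 1, 1, 0])

# Product-limit estimate: S(t) is the product over event times t_i <= t
# of (1 - d_i / n_i), with d_i deaths among n_i subjects still at risk.
surv = 1.0
for t in np.unique(times[event == 1]):
    n_at_risk = np.sum(times >= t)
    n_deaths = np.sum((times == t) & (event == 1))
    surv *= 1.0 - n_deaths / n_at_risk
    print(f"t = {t:>2}: S(t) = {surv:.3f}")
```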
19

Keating, Karen. "Statistical analysis of pyrosequence data." Diss., Kansas State University, 2012. http://hdl.handle.net/2097/14026.

Abstract:
Doctor of Philosophy. Department of Statistics. Gary L. Gadbury.
Since their commercial introduction in 2005, DNA sequencing technologies have become widely available and are now cost-effective tools for determining the genetic characteristics of organisms. While the biomedical applications of DNA sequencing are apparent, these technologies have been applied to many other research areas. One such area is community ecology, in which DNA sequence data are used to identify the presence and abundance of microscopic organisms that inhabit an environment. This is currently an active area of research, since it is generally believed that a change in the composition of microscopic species in a geographic area may signal a change in the overall health of the environment. An overview of DNA pyrosequencing, as implemented by the Roche/Life Science 454 platform, is presented and aspects of the process that can introduce variability in data are identified. Four ecological data sets that were generated by the 454 platform are used for illustration. Characteristics of these data include high dimensionality, a large proportion of zeros (usually in excess of 90%), and nonzero values that are strongly right-skewed. A nonparametric method to standardize these data is presented and effects of standardization on outliers and skewness are examined. Traditional statistical methods for analyzing macroscopic species abundance data are discussed, and the applicability of these methods to microscopic species data is examined. One objective that receives focus is the classification of microscopic species as either rare or common species. This is an important distinction since there is much evidence to suggest that the biological and environmental mechanisms that govern common species are distinctly different than the mechanisms that govern rare species. This indicates that the abundance patterns for common and rare species may follow different probability models, and the suitability of the Pareto distribution for rare species is examined. Techniques for classifying macroscopic species are shown to be ill-suited for microscopic species, and an alternative technique is presented. Recognizing that the structure of the data is similar to that of financial applications (such as insurance claims and the distribution of wealth), the Gini index and other statistics based on the Lorenz curve are explored as potential test statistics for distinguishing rare versus common species.
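The Lorenz-curve statistics mentioned at the end are simple to compute; here is a sketch of the Gini index for a hypothetical abundance vector with many rare and a few common species:

```python
import numpy as np

def gini(abundances: np.ndarray) -> float:
    """Gini index from the Lorenz curve of sorted abundances (0 = even, near 1 = concentrated)."""
    x = np.sort(abundances.astype(float))
    n = x.size
    # Standard formula: G = 2 * sum(i * x_(i)) / (n * sum(x)) - (n + 1) / n
    return 2.0 * np.sum(np.arange(1, n + 1) * x) / (n * x.sum()) - (n + 1) / n

counts = np.array([0, 0, 0, 1, 1, 2, 3, 150])  # a few common, many rare species
print(f"Gini = {gini(counts):.3f}")
```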
20

Sargsyan, Alex. "Test Validity and Statistical Analysis." Digital Commons @ East Tennessee State University, 2018. https://dc.etsu.edu/etsu-works/8472.

21

Bai, Yang. "Statistical analysis for longitudinal data." Click to view the E-thesis via HKUTO, 2009. http://sunzi.lib.hku.hk/hkuto/record/B42841756.

22

Wan, Tat-wai David. "Statistical analysis on counterfeit currency /." Hong Kong : University of Hong Kong, 1996. http://sunzi.lib.hku.hk/hkuto/record.jsp?B18024464.

23

Buu, Yuh-Pey Anne. "Statistical analysis of rater effects." [Gainesville, Fla.] : University of Florida, 2003. http://purl.fcla.edu/fcla/etd/UFE0001244.

24

Kim, Yangjin. "Statistical analysis of longitudinal data /." free to MU campus, to others for purchase, 2003. http://wwwlib.umi.com/cr/mo/fullcit?p3100054.

25

Pola, Tommaso. "Statistical analysis of written languages." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2013. http://amslaurea.unibo.it/6307/.

Abstract:
The thesis presents a series of results from the quantitative analysis of language. First, two of the most famous empirical laws in this field, Zipf's and Heaps' laws, are studied, and various models of language development are presented. The second part moves to more specific results on the presence of burstiness phenomena and long-range correlations in texts. All of these theoretical studies are accompanied by experimental analyses, carried out on various translations of Leo Tolstoy's "War and Peace" and focused mainly on the differences that can be observed between the different languages.
26

Crafford, Gretel. "Statistical analysis of grouped data." Thesis, University of Pretoria, 2007. http://hdl.handle.net/2263/25968.

Abstract:
The maximum likelihood (ML) estimation procedure of Matthews and Crowther (1995: A maximum likelihood estimation procedure when modelling in terms of constraints. South African Statistical Journal, 29, 29-51) is utilized to fit a continuous distribution to a grouped data set. This grouped data set may be a single frequency distribution or various frequency distributions that arise from a cross classification of several factors in a multifactor design. It will also be shown how to fit a bivariate normal distribution to a two-way contingency table where the two underlying continuous variables are jointly normally distributed. This thesis is organized in three parts, each playing a vital role in the explanation of analysing grouped data with the ML estimation procedure of Matthews and Crowther. In Part I the ML estimation procedure is formulated; it plays an integral role and is implemented in all three parts of the thesis. In Part I the exponential distribution is fitted to a grouped data set to explain the technique, as sketched after this entry. Two different formulations of the constraints are employed in the ML estimation procedure and provide identical results. The justification of the method is further motivated by a simulation study. Similar to the exponential distribution, the estimation of the normal distribution is also explained in detail. Part I is summarized in Chapter 5, where a general method is outlined to fit continuous distributions to a grouped data set. Distributions such as the Weibull, log-logistic and Pareto distributions can be fitted very effectively by formulating the vector of constraints in terms of a linear model. In Part II it is explained how to model a grouped response variable in a multifactor design. This multifactor design arises from a cross classification of the various factors or independent variables to be analysed. The cross classification of the factors results in a total of T cells, each containing a frequency distribution. Distribution fitting is done simultaneously in each of the T cells of the multifactor design, under the additional constraints that the parameters of the underlying continuous distributions satisfy a certain structure or design. The effect of the factors on the grouped response variable may be evaluated from this fitted design. Applications of a single-factor and a two-factor model are considered to demonstrate the versatility of the technique. A two-way contingency table where the two variables have an underlying bivariate normal distribution is considered in Part III. The estimation of the bivariate normal distribution reveals the complete underlying continuous structure between the two variables. The ML estimate of the correlation coefficient ρ is used to great effect to describe the relationship between the two variables. Apart from an application, a simulation study is also provided to support the proposed method.
Thesis (PhD (Mathematical Statistics))--University of Pretoria, 2007.
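As a simplified stand-in for the constrained-ML machinery, Part I's running example (an exponential distribution fitted to grouped data) can be sketched by direct maximisation of the multinomial log-likelihood; the bin edges and counts below are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical grouped data: interval edges and observed frequencies.
edges = np.array([0.0, 1.0, 2.0, 4.0, np.inf])
freq = np.array([40, 25, 22, 13])

def neg_loglik(rate: float) -> float:
    # Multinomial log-likelihood with cell probabilities from the exponential CDF.
    cdf = 1.0 - np.exp(-rate * edges)
    cdf[-1] = 1.0                       # upper edge is +infinity
    probs = np.diff(cdf)
    return -np.sum(freq * np.log(probs))

fit = minimize_scalar(neg_loglik, bounds=(1e-6, 10.0), method="bounded")
print(f"ML rate estimate = {fit.x:.4f}")
```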
27

Wu, Ling. "Stochastic Modeling and Statistical Analysis." Scholar Commons, 2010. https://scholarcommons.usf.edu/etd/1813.

Abstract:
The objective of the present study is to investigate option pricing and forecasting problems in finance. This is achieved by developing stochastic models in the framework of the classical modeling approach. In this study, utilizing stock price data, we examine the correctness of the existing Geometric Brownian Motion (GBM) model under standard statistical tests. Recognizing the problems, we demonstrate the development of modified linear models under different data partitioning processes, with or without jumps. Empirical comparisons between the constructed and GBM models are outlined. By analyzing the residual errors, we observed nonlinearity in the data set. In order to incorporate this nonlinearity, we further employed the classical model building approach to develop nonlinear stochastic models. Based on the nature of the problems and the knowledge of existing nonlinear models, three different nonlinear stochastic models are proposed. Furthermore, under different data partitioning processes with equal and unequal intervals, a few modified nonlinear models are developed. Again, empirical comparisons between the constructed nonlinear stochastic and GBM models are outlined in the context of three data sets. Stochastic dynamic models are also used to predict the future dynamic state of processes. This is achieved by modifying the nonlinear stochastic models from constant to time varying coefficients, from which time series models are constructed. Using these constructed time series models, prediction and comparison with the existing time series models are analyzed in the context of three data sets. The study shows that nonlinear stochastic model 2 with time varying coefficients is robust with respect to different data sets. We derive the option pricing formula in the context of the three nonlinear stochastic models with time varying coefficients. The option pricing formulas in the framework of hybrid systems, namely the Hybrid GBM (HGBM) and hybrid nonlinear stochastic models, are also initiated. Finally, based on our initial investigation of the significance of the presented nonlinear stochastic models in forecasting and option pricing problems, we propose to continue and further explore our study in the context of the nonlinear stochastic hybrid modeling approach.
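For context, the GBM model examined in the first part has an exact discretisation that is straightforward to simulate; a minimal sketch with hypothetical parameters:

```python
import numpy as np

# Geometric Brownian Motion: dS = mu*S*dt + sigma*S*dW, with exact solution
# S_{t+dt} = S_t * exp((mu - sigma^2/2)*dt + sigma*sqrt(dt)*Z),  Z ~ N(0,1).
rng = np.random.default_rng(42)
mu, sigma, s0 = 0.08, 0.25, 100.0   # hypothetical drift, volatility, start price
dt, n_steps = 1.0 / 252, 252        # daily steps over one trading year

z = rng.standard_normal(n_steps)
path = s0 * np.exp(np.cumsum((mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z))
print(f"terminal price = {path[-1]:.2f}")
```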
28

Gustafson, Helen May. "Statistical analysis of symmetric ciphers." Thesis, Queensland University of Technology, 1996.

29

SALA, SARA. "Statistical analysis of brain network." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2013. http://hdl.handle.net/10281/43723.

Abstract:
Recent developments in complex network analysis, based largely on graph theory, have been used to study brain network organization. The brain is a complex system that can be represented by a graph, a mathematical representation which is useful for studying the connectivity of the brain. Nodes in the brain can be identified by dividing its volume into regions of interest, and links can be identified by calculating a measure of dependence between pairs of regions whose activation signal, measured by functional magnetic resonance imaging (fMRI) techniques, represents the strength of the connection between regions. A graph can be synthesized by the so-called adjacency matrix, which, in its simplest form, is an undirected, binary, and symmetric matrix, whose entries are set to one if a link exists between a pair of brain areas and zero otherwise. The adjacency matrix is particularly useful because it allows the calculation of several measures which summarize global and local characteristics of functional brain connectivity, such as centrality, efficiency, density and the small-world property. In this work, we consider global measures, such as the clustering coefficient, the characteristic path length and the global efficiency, and local measures, such as centrality measures and local efficiency, in order to represent global and local dynamics and changes between networks. This is achieved by studying resting state (rs) fMRI data of healthy subjects and patients with neurodegenerative diseases. Furthermore, we illustrate an original methodology to construct the adjacency matrix. Its entries, containing the information about the existence of links, are identified by testing the correlation between the time series that characterize the dynamic behavior of the nodes. This involves the problem of multiple comparisons in order to control the error rates. A method based on the estimation of the positive false discovery rate (pFDR) has been used. A similar measure involving false negatives (type II errors), called the positive false nondiscovery rate (pFNR), is then considered, proposing new point and interval estimators for pFNR and a method for balancing the two types of error. This approach is demonstrated using both simulations and fMRI data, providing finite sample as well as large sample results for pFDR and pFNR estimators. In addition, a ranking of the most central nodes in the networks is proposed using q-values, the pFDR analog of p-values. The differences in inter-regional connectivity between cases and controls are studied. Finally, network models are discussed: in order to gain deeper insights into complex neurobiological interactions, exponential random graph models (ERGMs) are applied to assess several network properties simultaneously and to compare case/control brain networks.
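A minimal sketch of the global network measures named above (clustering coefficient, characteristic path length, global efficiency), computed with networkx from a hypothetical binary adjacency matrix:

```python
import numpy as np
import networkx as nx

# Hypothetical binary adjacency matrix for a small brain network
# (symmetric, zero diagonal), thresholded from a random "correlation" matrix.
rng = np.random.default_rng(0)
corr = rng.uniform(-1, 1, size=(10, 10))
corr = (corr + corr.T) / 2
adj = (np.abs(corr) > 0.5).astype(int)
np.fill_diagonal(adj, 0)

g = nx.from_numpy_array(adj)
print("clustering coefficient:", nx.average_clustering(g))
print("global efficiency:    ", nx.global_efficiency(g))
# Characteristic path length is defined on the largest connected component.
giant = g.subgraph(max(nx.connected_components(g), key=len))
print("char. path length:    ", nx.average_shortest_path_length(giant))
```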
30

Liu, Fei, and 劉飛. "Statistical inference for banding data." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2008. http://hub.hku.hk/bib/B41508701.

31

Liu, Fei. "Statistical inference for banding data." Click to view the E-thesis via HKUTO, 2008. http://sunzi.lib.hku.hk/hkuto/record/B41508701.

32

Ounpraseuth, Songthip T. Young Dean M. "Selected topics in statistical discriminant analysis." Waco, Tex. : Baylor University, 2006. http://hdl.handle.net/2104/4883.

33

何志興 and Chi-hing Ho. "The statistical analysis of multivariate counts." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1991. http://hub.hku.hk/bib/B31232218.

34

Ho, Chi-hing. "The statistical analysis of multivariate counts /." [Hong Kong] : University of Hong Kong, 1991. http://sunzi.lib.hku.hk/hkuto/record.jsp?B12922602.

35

Young, G. A. "Data-based statistical methods." Thesis, University of Cambridge, 1986. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.383307.

36

Vaitkevičius, Robertas. "Duomenų kompiuterinės statistinės analizės technologijos." Master's thesis, Lithuanian Academic Libraries Network (LABT), 2008. http://vddb.library.lt/obj/LT-eLABa-0001:E.02~2008~D_20080929_140053-98826.

Abstract:
The aim of the work "The technologies of computer-based statistical analysis of data" is to analyse and compare the capabilities of various popular statistical packages and to offer recommendations to the user. The work analyses the SPSS 8.0 for Windows, STATISTICA 7 and Minitab 15 English statistical packages. Using these packages, statistical calculations were carried out on data from the questionnaire "About that, how do you live", the capabilities of the packages were assessed, and comparative analysis tables were compiled. Recommendations are given to help the user make a well-founded choice of the most suitable statistical package, taking the user's needs and resources into account. For the statistical package STATISTICA 7, two macros were created using the VISUAL BASIC programming language integrated into the package: the first calculates the degree of completeness of the filled-in questionnaires, and the second filters the data of a chosen variable according to a chosen criterion. The work is innovative in that these two macros extend the capabilities of the statistical package STATISTICA 7.
37

Zhou, Diwei. "Statistical analysis of diffusion tensor imaging." Thesis, University of Nottingham, 2010. http://eprints.nottingham.ac.uk/11430/.

Abstract:
This thesis considers the statistical analysis of diffusion tensor imaging (DTI). DTI is an advanced magnetic resonance imaging (MRI) method that provides a unique insight into biological microstructure in vivo by directionally describing the water molecular diffusion. We firstly develop a Bayesian multi-tensor model with reparameterisation for capturing water diffusion at voxels with one or more distinct fibre orientations. Our model substantially alleviates the non-identifiability issue present in the standard multi-tensor model. A Markov chain Monte Carlo (MCMC) algorithm is then developed to study the uncertainty of the model parameters based on the posterior distribution. We apply the Bayesian method to Monte Carlo (MC) simulated datasets as well as a healthy human brain dataset. A region containing crossing fibre bundles is investigated using our multi-tensor model with automatic model selection. A diffusion tensor, a covariance matrix related to the molecular displacement at a particular voxel in the brain, is in the non-Euclidean space of 3x3 positive semidefinite symmetric matrices. We define the sample mean of tensor data to be the Fréchet mean. We carry out the non-Euclidean statistical analysis of diffusion tensor data. The primary focus is on the use of Procrustes size-and-shape space. Comparisons are made with other non-Euclidean techniques, including the log-Euclidean, Riemannian, Cholesky, root Euclidean and power Euclidean methods. The weighted generalised Procrustes analysis has been developed to efficiently interpolate and smooth an arbitrary number of tensors with the flexibility of controlling individual contributions. A new anisotropy measure, Procrustes Anisotropy, is defined and compared with other widely used anisotropy measures. All methods are illustrated through synthetic examples as well as white matter tractography of a healthy human brain. Finally, we use Giné’s statistic to design uniformly distributed diffusion gradient direction schemes with different numbers of directions. MC simulation studies are carried out to compare the effects of Giné’s and the widely used Jones' schemes on tensor estimation. We conclude by discussing potential areas for further research.
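One of the non-Euclidean approaches compared above, the log-Euclidean mean, is easy to sketch for a set of diffusion tensors (hypothetical SPD matrices; this is an illustration, not the thesis's Procrustes machinery):

```python
import numpy as np
from scipy.linalg import expm, logm

def log_euclidean_mean(tensors):
    """Log-Euclidean mean of SPD matrices: expm of the average of matrix logs."""
    logs = [logm(t) for t in tensors]
    return expm(np.mean(logs, axis=0))

# Two hypothetical diffusion tensors (3x3 symmetric positive definite, mm^2/s).
d1 = np.diag([1.7e-3, 0.4e-3, 0.3e-3])   # strongly anisotropic
d2 = np.diag([0.9e-3, 0.8e-3, 0.8e-3])   # nearly isotropic
print(log_euclidean_mean([d1, d2]))
```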
38

Chung, Yuk-ka, and 鍾玉嘉. "On the evaluation and statistical analysis of forensic evidence in DNAmixtures." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2011. http://hub.hku.hk/bib/B45983586.

39

Benfenati, Francesco Maria. "Statistical analysis of oceanographic extreme events." Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/19885/.

Abstract:
Extreme environmental sea states can have a strong impact on navigation and/or on the success of rescue operations. Statistical techniques are crucial for quantifying the occurrence of extreme events and for monitoring variations in their frequency and intensity. Extreme events "live" in the tail of a probability density function (PDF), which is why it is important to study the PDF at points several standard deviations away from the mean. Significant wave height (SWH) is the parameter usually used to assess the intensity of sea states. The analysis of extremes in the tail of a distribution requires long time series for reasonable estimates of their intensity and frequency. Observational data (i.e. historical buoy records) are often absent, and numerical wave reconstructions are used instead, with the advantage that the analysis of extreme events becomes possible over a wide area. This thesis carries out a preliminary analysis of the spatial variations of extreme SWH values in the Mediterranean. Hourly data are used from the Med-MFC model (from the CMEMS portal), a numerical wave reconstruction for the Mediterranean based on the "WAM Cycle 4.5.4" model, covering the period 2006-2018 with a spatial resolution of 0.042° (~4 km). In particular, we consider 11 years of data (2007 to 2017), focusing on the Ionian Sea and Iberian Sea regions. The PDF of SWH is followed rather well by a 2-parameter Weibull curve both in winter (January) and in summer (July), with shortcomings at the peak and in the tail of the distribution. By comparison, the 3-parameter Exponentiated Weibull curve appears more appropriate, although no method was found to demonstrate that it is a better fit. Finally, a risk-estimation method is proposed based on the daily return period of waves higher than a given threshold value considered dangerous.
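A minimal sketch of the two-parameter Weibull fit described above, applied to a synthetic significant-wave-height sample (all values hypothetical):

```python
import numpy as np
from scipy import stats

# Synthetic significant-wave-height sample (metres).
rng = np.random.default_rng(3)
swh = stats.weibull_min.rvs(c=1.6, scale=0.9, size=5000, random_state=rng)

# Two-parameter Weibull fit (location fixed at zero).
shape, loc, scale = stats.weibull_min.fit(swh, floc=0)
print(f"shape = {shape:.3f}, scale = {scale:.3f} m")

# Return level: SWH exceeded on average once every n observations.
n = 24 * 365  # hourly data over one year, for example
print(f"1-year return level = {stats.weibull_min.ppf(1 - 1/n, shape, loc, scale):.2f} m")
```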
40

Kilby, James W. "Performance management systems : a statistical analysis /." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from the National Technical Information Service, 1993. http://handle.dtic.mil/100.2/ADA272989.

Abstract:
Thesis (M.S. in Information Technology Management)--Naval Postgraduate School, September 1993.
Thesis advisor(s): Euske, Kenneth J.; Haga, William James. Bibliography: p. 69-71. Also available online.
41

Ruan, Lingyan. "Statistical analysis of high dimensional data." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/37135.

Abstract:
This century is surely the century of data (Donoho, 2000). Data analysis has been an emerging activity over the last few decades, and high dimensional data are ever more pervasive with the advance of massive data collection systems, such as microarrays, satellite imagery, and financial data. However, the analysis of high dimensional data is challenging because of the so-called curse of dimensionality (Bellman, 1961). This dissertation presents several methodologies for high dimensional data analysis. The first part discusses a joint analysis of multiple microarray gene expression data sets. Microarray analysis dates back to Golub et al. (1999) and has drawn much attention since. One common goal of microarray analysis is to determine which genes are differentially expressed, that is, behave significantly differently between groups of individuals. However, in microarray analysis there are thousands of genes but few arrays (samples, individuals), and thus reproducibility remains relatively low. It is natural to consider joint analyses that combine microarrays from different experiments effectively in order to achieve improved accuracy. In particular, we present a model-based approach for better identification of differentially expressed genes by incorporating data from different studies. The model can accommodate, in a seamless fashion, a wide range of studies, including those performed on different platforms and/or under different but overlapping biological conditions. Model-based inferences can be made in an empirical Bayes fashion. Because of the information sharing among studies, the joint analysis dramatically improves inferences based on individual analysis. Simulation studies and real data examples are presented to demonstrate the effectiveness of the proposed approach under a variety of complications that often arise in practice. The second part concerns covariance matrix estimation for high dimensional data. First, we propose a penalised likelihood estimator for the high dimensional t-distribution. The Student t-distribution is of increasing interest in mathematical finance, education and many other applications, but its use is limited by the difficulty of estimating the covariance matrix for high dimensional data. We show that by imposing a LASSO penalty on the Cholesky factors of the covariance matrix, an EM algorithm can efficiently compute the estimator, and it performs much better than other popular estimators. Secondly, we propose an estimator for high dimensional Gaussian mixture models. Finite Gaussian mixture models are widely used in statistics thanks to their great flexibility; however, parameter estimation for Gaussian mixture models with high dimensionality can be rather challenging because of the huge number of parameters that need to be estimated. For such purposes, we propose a penalized likelihood estimator to specifically address such difficulties. The LASSO penalty we impose on the inverse covariance matrices encourages sparsity in their entries and therefore helps reduce the dimensionality of the problem. We show that the proposed estimator can be efficiently computed via an Expectation-Maximization algorithm. To illustrate the practical merits of the proposed method, we consider its application in model-based clustering and mixture discriminant analysis. Numerical experiments with both simulated and real data show that the new method is a valuable tool in handling high dimensional data.
Finally, we present structured estimators for high dimensional Gaussian mixture models. The graphical representation of every cluster in a Gaussian mixture model may have the same or a similar structure, which is an important feature in many applications, such as image processing, speech recognition and gene network analysis. Failure to account for this shared structure would deteriorate the estimation accuracy. To address such issues, we propose two structured estimators, a hierarchical Lasso estimator and a group Lasso estimator. An EM algorithm can be applied to conveniently solve the estimation problem. We show that when clusters share similar structures, the proposed estimators perform much better than the separate Lasso estimator.
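The LASSO-penalised inverse-covariance idea in the second part is closely related to the graphical lasso; a minimal sketch using scikit-learn (an illustration only, not the thesis's EM estimators):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

# Hypothetical high dimensional data: 50 samples, 20 variables.
rng = np.random.default_rng(7)
x = rng.standard_normal((50, 20))

model = GraphicalLasso(alpha=0.2).fit(x)   # alpha = L1 penalty strength
precision = model.precision_               # sparse inverse covariance estimate
print("nonzero off-diagonal entries:",
      np.count_nonzero(precision - np.diag(np.diag(precision))))
```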
42

Nosova, Olga. "Statistical analysis of regional integration effects." Universität Potsdam, 2008. http://opus.kobv.de/ubp/volltexte/2009/2910/.

Abstract:
The paper studies regional integration as a unique process which depends on the degree of cooperation and interchange among regions. The generalisation of existing approaches to regional integration has been classified by criteria, and data on the main economic indicators have been analysed. The economic analysis demonstrates the differences in production endowments, the asymmetry in fixed capital investment, and the disproportional distribution of income and foreign direct investment across Ukrainian regions in 2001-2005. Econometric modelling reveals a division between highly urbanised industrial regions and backward agrarian regions in Ukraine; industrial development disparities among regions; insufficient infrastructure (telecommunications, roads, hotels, services, etc.); low labour productivity in the industrial sector; and insufficient regional trade.
43

Lien, Tonje Gulbrandsen. "Statistical Analysis of Quantitative PCR Data." Thesis, Norges teknisk-naturvitenskapelige universitet, Institutt for matematiske fag, 2011. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-13094.

Abstract:
This thesis seeks to develop a better understanding of the analysis of gene expression to find the amount of transcript in a sample. The mainstream method used is called the Polymerase Chain Reaction (PCR), and it exploits DNA's ability to replicate. The comparative CT method estimates the starting fluorescence level f0 by assuming constant amplification in each PCR cycle, using the cycle at which the fluorescence level rises above a certain threshold. We present a generalization of this method in which different threshold values can be used. The main aim of this thesis is to evaluate a new method called the Enzymological method, which estimates f0 by considering cycle-dependent amplification and uses a larger part of the fluorescence curves than the two CT methods. All methods are tested on dilution series where the dilution factors are known. In one of the datasets studied, the Clusterin dilution dataset, we get better estimates from the Enzymological method than from the two CT methods.
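Under the constant-amplification assumption described above, the comparative CT estimate of the starting fluorescence f0 reduces to a one-line computation; a sketch with hypothetical values:

```python
# Comparative CT method: with constant per-cycle amplification efficiency E,
# fluorescence grows as f_c = f0 * E**c, so f0 = f_threshold / E**CT.
f_threshold = 0.2   # fluorescence threshold (arbitrary units)
ct = 24.5           # cycle at which the curve crosses the threshold
efficiency = 2.0    # perfect doubling each cycle

f0 = f_threshold / efficiency**ct
print(f"estimated starting fluorescence f0 = {f0:.3e}")
```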
44

Mangisa, Siphumlile. "Statistical analysis of electricity demand profiles." Thesis, Nelson Mandela Metropolitan University, 2013. http://hdl.handle.net/10948/d1011548.

Abstract:
An electricity demand profile is a graph showing the amount of electricity used by customers over a unit of time. It shows the variation in electricity demand versus time, and the shape of the graph is of utmost importance. The variations in demand profiles are caused by many factors, such as economic and environmental factors, and may also be due to changes in the electricity use behaviours of electricity users. This study seeks to model daily profiles of energy demand in South Africa with a model which is a composition of two de Moivre type models. The model has seven parameters, each with a natural interpretation (one parameter representing minimum demand in a day, two parameters representing the times of the morning and afternoon peaks, two parameters representing the shape of each peak, and two parameters representing the total energy per peak). With the help of this model, we trace changes in the demand profile over a number of years. The proposed model will be helpful for short- to long-term electricity demand forecasting.
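As a rough illustration of a seven-parameter, two-peak daily profile of the kind described (baseline, two peak times, two shapes, two energies), the sketch below uses Gaussian-shaped peaks as a stand-in for the thesis's de Moivre type components; all parameter values are hypothetical:

```python
import numpy as np

def demand_profile(t, base, t1, s1, e1, t2, s2, e2):
    """Two-peak daily demand sketch: baseline plus Gaussian-shaped morning and
    afternoon peaks (a stand-in for the de Moivre type components)."""
    peak1 = e1 / (s1 * np.sqrt(2 * np.pi)) * np.exp(-0.5 * ((t - t1) / s1) ** 2)
    peak2 = e2 / (s2 * np.sqrt(2 * np.pi)) * np.exp(-0.5 * ((t - t2) / s2) ** 2)
    return base + peak1 + peak2

hours = np.linspace(0, 24, 97)
# Hypothetical parameters: 20 GW base load, peaks at 07:30 and 19:00.
profile = demand_profile(hours, 20.0, 7.5, 1.5, 30.0, 19.0, 2.0, 45.0)
print(f"peak demand ~ {profile.max():.1f} GW at hour {hours[np.argmax(profile)]:.1f}")
```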
45

Golumbeanu, Monica. "Statistical Analysis of PAR-CLIP data." Thesis, KTH, Beräkningsbiologi, CB, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-124347.

Abstract:
From its creation to its degradation, the RNA molecule is the field of action of many binding proteins with different roles in regulation and RNA metabolism. Since these proteins are involved in a large number of processes, a variety of diseases are related to abnormalities in the binding mechanisms. One of the experimental methods for detecting the binding sites of these proteins is PAR-CLIP, built on next-generation sequencing technology. Due to its size and intrinsic noise, PAR-CLIP data analysis requires appropriate pre-processing and thorough statistical analysis. The present work has two main goals: first, to develop a modular pipeline for pre-processing PAR-CLIP data and extracting the signals necessary for further analysis; second, to devise a novel statistical model to carry out inference about the presence of protein binding sites based on the signals extracted in the pre-processing step.
46

Czarn, Andrew Simon Timothy. "Statistical exploratory analysis of genetic algorithms." University of Western Australia. School of Computer Science and Software Engineering, 2008. http://theses.library.uwa.edu.au/adt-WU2008.0030.

Abstract:
[Truncated abstract] Genetic algorithms (GAs) have been extensively used and studied in computer science, yet there is no generally accepted methodology for exploring which parameters significantly affect performance, whether there is any interaction between parameters and how performance varies with respect to changes in parameters. This thesis presents a rigorous yet practical statistical methodology for the exploratory study of GAs. This methodology addresses the issues of experimental design, blocking, power and response curve analysis. It details how statistical analysis may assist the investigator along the exploratory pathway. The statistical methodology is demonstrated in this thesis using a number of case studies with a classical genetic algorithm with one-point crossover and bit-replacement mutation. In doing so we answer a number of questions about the relationship between the performance of the GA and the operators and encoding used. The methodology is suitable, however, to be applied to other adaptive optimization algorithms not treated in this thesis. In the first instance, as an initial demonstration of our methodology, we describe case studies using four standard test functions. It is found that the effect upon performance of crossover is predominantly linear while the effect of mutation is predominantly quadratic. Higher order effects are noted but contribute less to overall behaviour. In the case of crossover both positive and negative gradients are found which suggests using rates as high as possible for some problems while possibly excluding it for others. .... This is illustrated by showing how the use of Gray codes impedes the performance on a lower modality test function compared with a higher modality test function. Computer animation is then used to illustrate the actual mechanism by which this occurs. Fourthly, the traditional concept of a GA is that of selection, crossover and mutation. However, a limited amount of data from the literature has suggested that the niche for the beneficial effect of crossover upon GA performance may be smaller than has traditionally been held. Based upon previous results on not-linear-separable problems an exploration is made by comparing two test problem suites, one comprising non-rotated functions and the other comprising the same functions rotated by 45 degrees in the solution space rendering them not-linear-separable. It is shown that for the difficult rotated functions the crossover operator is detrimental to the performance of the GA. It is conjectured that what makes a problem difficult for the GA is complex and involves factors such as the degree of optimization at local minima due to crossover, the bias associated with the mutation operator and the Hamming Distances present in the individual problems due to the encoding. Furthermore, the GA was tested on a real world landscape minimization problem to see if the results obtained would match those from the difficult rotated functions. It is demonstrated that they match and that the features which make certain of the test functions difficult are also present in the real world problem. Overall, the proposed methodology is found to be an effective tool for revealing relationships between a randomized optimization algorithm and its encoding and parameters that are difficult to establish from more ad-hoc experimental studies alone.
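A minimal sketch of the classical GA named above, with one-point crossover and bit-replacement mutation on the standard OneMax test function (the selection scheme and operator rates here are hypothetical):

```python
import random

random.seed(0)
LENGTH, POP, GENS = 20, 30, 50
P_CROSS, P_MUT = 0.6, 0.02   # hypothetical crossover and mutation rates

def fitness(bits):            # OneMax: count of ones (a standard test function)
    return sum(bits)

def crossover(a, b):          # classical one-point crossover
    p = random.randrange(1, LENGTH)
    return a[:p] + b[p:], b[:p] + a[p:]

def mutate(bits):             # bit-replacement mutation
    return [random.randrange(2) if random.random() < P_MUT else x for x in bits]

pop = [[random.randrange(2) for _ in range(LENGTH)] for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    parents = pop[:POP // 2]                 # simple truncation selection
    children = []
    while len(children) < POP:
        a, b = random.sample(parents, 2)
        if random.random() < P_CROSS:
            a, b = crossover(a, b)
        children += [mutate(a), mutate(b)]
    pop = children[:POP]
print("best fitness:", max(map(fitness, pop)))
```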
47

Guo, Tong. "Statistical analysis of reliability-validity studies." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape10/PQDD_0028/MQ50780.pdf.

48

Perry, Martin Andrew. "Statistical linkage analysis and association studies." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ57208.pdf.

49

Baah, George Kofi. "Statistical causal analysis for fault localization." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/45762.

Abstract:
The ubiquitous nature of software demands that software is released without faults. However, software developers inadvertently introduce faults into software during development. To remove the faults in software, one of the tasks developers perform is debugging. However, debugging is a difficult, tedious, and time-consuming process. Several semi-automated techniques have been developed to reduce the burden on the developer during debugging. These techniques consist of experimental, statistical, and program-structure based techniques. Most of the debugging techniques address the part of the debugging process that relates to finding the location of the fault, which is referred to as fault localization. The current fault-localization techniques have several limitations, including (1) problems with program semantics, (2) the requirement for automated oracles, which in practice are difficult if not impossible to develop, and (3) the lack of a theoretical basis for addressing the fault-localization problem. The thesis of this dissertation is that statistical causal analysis combined with program analysis is a feasible and effective approach to finding the causes of software failures. The overall goal of this research is to significantly extend the state of the art in fault localization. To that end, a novel probabilistic model that combines program-analysis information with statistical information in a principled manner is developed. The model, known as the probabilistic program dependence graph (PPDG), is applied to the fault-localization problem. The insights gained from applying the PPDG to fault localization fuel the development of a novel theoretical framework for fault localization based on established causal inference methodology. The development of the framework enables current statistical fault-localization metrics to be analyzed from a causal perspective. The analysis shows that the metrics are related to each other, thereby allowing their unification. The analysis also reveals that the current statistical techniques do not find the causes of program failures; instead, the techniques find the program elements most associated with failures. However, the fault-localization problem is a causal problem, and statistical association does not imply causation. Empirical studies are conducted on several software subjects, and the results (1) confirm our analytical results and (2) demonstrate the efficacy of our causal technique for fault localization. The results demonstrate that the research in this dissertation significantly improves on the state of the art in fault localization.
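For context, one classic statistical fault-localization metric of the kind analyzed in the dissertation is Tarantula; a minimal sketch from pass/fail coverage counts (an illustration only, not the causal estimators developed here):

```python
# Tarantula suspiciousness for one program element, from test coverage counts.
failed_cov, total_failed = 8, 10   # failing tests covering the element / all failing
passed_cov, total_passed = 5, 90   # passing tests covering the element / all passing

fail_ratio = failed_cov / total_failed
pass_ratio = passed_cov / total_passed
suspiciousness = fail_ratio / (fail_ratio + pass_ratio)
print(f"Tarantula suspiciousness = {suspiciousness:.3f}")  # higher = more suspect
```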
50

Hosseini, Mohamadreza. "Statistical models for agroclimate risk analysis." Thesis, University of British Columbia, 2009. http://hdl.handle.net/2429/16019.

Abstract:
In order to model the binary process of precipitation and the dichotomized temperature process, we use the conditional probability of the present given the past. We find necessary and sufficient conditions for a collection of functions to correspond to the conditional probabilities of a discrete-time categorical stochastic process X₁, X₂, .... Moreover, we find parametric representations for such processes, and in particular for rth-order Markov chains. To dichotomize the temperature process, quantiles are often used in the literature. We propose using a two-state definition of the quantiles by considering the "left quantile" and "right quantile" functions instead of the traditional definition. This has various advantages, such as a symmetry relation between the quantiles of random variables X and -X. We show that the left (right) sample quantile tends to the left (right) distribution quantile at p ∈ [0,1] if and only if the left and right distribution quantiles are identical at p, and diverges almost surely otherwise. In order to measure the loss from estimating (or approximating) a quantile, we introduce a loss function that is invariant under strictly monotonic transformations and call it the "probability loss function". Using this loss function, we introduce measures of distance among random variables that are invariant under continuous strictly monotonic transformations, and we use these distance measures to show that optimal overall fits to a random variable are not necessarily optimal in the tails. This loss function is also used to find equivariant estimators of the parameters of distribution functions. We develop an algorithm to approximate quantiles of large datasets which works by partitioning the data or using existing partitions (possibly of non-equal size); we show the deterministic precision of this algorithm and how it can be adjusted to obtain customized precisions. We then develop a framework to optimally summarize very large datasets using quantiles and to combine such summaries in order to make inferences about the original dataset. Finally, we show how these higher-order Markov models can be used to construct confidence intervals for the probability of frost-free periods.
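A minimal sketch of the left and right empirical quantiles defined above, for p strictly between 0 and 1 (hypothetical data; ties and edge cases are glossed over):

```python
import numpy as np

def left_quantile(x, p):
    """Left quantile: inf{ t : F(t) >= p } for the empirical distribution."""
    s = np.sort(x)
    return s[int(np.ceil(p * len(s))) - 1] if p > 0 else s[0]

def right_quantile(x, p):
    """Right quantile: inf{ t : F(t) > p } for the empirical distribution."""
    s = np.sort(x)
    return s[int(np.floor(p * len(s)))] if p < 1 else s[-1]

x = np.array([1, 2, 3, 4])   # the empirical CDF jumps by 1/4 at each point
# At p = 0.5 the CDF sits exactly at 0.5 on [2, 3), so the two quantiles split:
print(left_quantile(x, 0.5), right_quantile(x, 0.5))   # -> 2 3
```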
