Doctoral dissertations on the topic "Applied Statistics"
Create a correct reference in APA, MLA, Chicago, Harvard, and many other styles
Consult the top 50 doctoral dissertations for research on the topic "Applied Statistics".
Next to every entry in the bibliography there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a ".pdf" file and read the abstract of the work online, whenever such details are available in the metadata.
Browse doctoral dissertations from many different disciplines and organise your bibliography correctly.
Binter, Roman. "Applied probabilistic forecasting". Thesis, London School of Economics and Political Science (University of London), 2012. http://etheses.lse.ac.uk/559/.
Zhang, Bo. "Machine Learning on Statistical Manifold". Scholarship @ Claremont, 2017. http://scholarship.claremont.edu/hmc_theses/110.
Bynum, Lucius. "Modeling Subset Behavior: Prescriptive Analytics for Professional Basketball Data". Scholarship @ Claremont, 2018. https://scholarship.claremont.edu/hmc_theses/117.
Dodson, Huey D. "Applied statistics experience & certification in quality assurance /". Click here to view, 2010. http://digitalcommons.calpoly.edu/statsp/3/.
Project advisor: Heather Smith. Title from PDF title page; viewed on Apr. 20, 2010. Includes bibliographical references. Also available on microfiche.
Lochner, Michelle Aileen Anne. "New applications of statistics in astronomy and cosmology". Doctoral thesis, University of Cape Town, 2014. http://hdl.handle.net/11427/12864.
Over the last few decades, astronomy and cosmology have become data-driven fields. The parallel increase in computational power has naturally led to the adoption of more sophisticated statistical techniques for data analysis in these fields, in particular Bayesian methods. As the next generation of instruments comes online, this trend should continue, since previously ignored effects must be considered rigorously in order to avoid biases and incorrect scientific conclusions being drawn from the ever-improving data. In the context of supernova cosmology, an example of this is the challenge of contamination, as supernova datasets will become too large to spectroscopically confirm the types of all objects. The technique known as BEAMS (Bayesian Estimation Applied to Multiple Species) handles this contamination with a fully Bayesian mixture-model approach, which allows unbiased estimates of the cosmological parameters. Here, we extend the original BEAMS formalism to deal with correlated systematics in supernova data, which we test extensively on thousands of simulated datasets using numerical marginalization and Markov Chain Monte Carlo (MCMC) sampling over the unknown type of the supernova, showing that it recovers unbiased cosmological parameters with good coverage. We then apply Bayesian statistics to the field of radio interferometry. This is particularly relevant in light of the SKA telescope, where the data will be of such high quantity and quality that current techniques will not be adequate to fully exploit them. We show that the current approach to deconvolution of radio interferometric data is susceptible to biases induced by ignored and unknown instrumental effects such as pointing errors, which in general are correlated with the science parameters. We develop an alternative approach, Bayesian Inference for Radio Observations (BIRO), which is able to determine the joint posterior for all scientific and instrumental parameters.
We test BIRO on several simulated datasets and show that it is superior to the standard CLEAN and source extraction algorithms. BIRO fits all parameters simultaneously while providing unbiased estimates - and errors - for the noise, beam width, pointing errors and the fluxes and shapes of the sources.
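The mixture-likelihood idea at the heart of BEAMS can be sketched in a few lines. The snippet below is an illustrative toy with made-up Gaussian densities and parameter names, not the dissertation's code: each observation of unknown type contributes a likelihood term that mixes the type-Ia and non-Ia densities, weighted by the prior type probability.

```python
import math

def norm_pdf(x, mu, sigma):
    """Gaussian density, used here as a stand-in for each type's likelihood."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def mixture_loglike(data, mu_ia, mu_non, p_ia, sigma=0.2):
    """Log-likelihood marginalized over the unknown type: each point is
    type Ia with prior probability p_ia, otherwise non-Ia."""
    ll = 0.0
    for x in data:
        ll += math.log(p_ia * norm_pdf(x, mu_ia, sigma)
                       + (1 - p_ia) * norm_pdf(x, mu_non, sigma))
    return ll
```

Maximizing (or MCMC-sampling) this marginalized likelihood yields parameter estimates that are not biased by the contaminating population, which is the essence of the BEAMS approach.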
Tiani, John P. "Using applied statistics to study a pharmaceutical manufacturing process". Worcester, Mass. : Worcester Polytechnic Institute, 2004. http://www.wpi.edu/Pubs/ETD/Available/etd-0430104-125344/.
Fitzgerald, Damon. "Household Preferences for Financing Hurricane Risk Mitigation: A Survey Based Empirical Analysis". FIU Digital Commons, 2014. http://digitalcommons.fiu.edu/etd/1725.
Liu, Xiang. "A Multi-Indexed Logistic Model for Time Series". Digital Commons @ East Tennessee State University, 2016. https://dc.etsu.edu/etd/3140.
Brännström, Anton. "A Comparison of Three Methods of Estimation Applied to Contaminated Circular Data". Thesis, Umeå universitet, Statistik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-149426.
Brody-Moore, Peter. "Bayesian Hierarchical Meta-Analysis of Asymptomatic Ebola Seroprevalence". Scholarship @ Claremont, 2019. https://scholarship.claremont.edu/cmc_theses/2228.
Lesser, Elizabeth Rochelle. "A New Right Tailed Test of the Ratio of Variances". UNF Digital Commons, 2016. http://digitalcommons.unf.edu/etd/719.
Andersson, Carl. "Deep learning applied to system identification : A probabilistic approach". Licentiate thesis, Uppsala universitet, Avdelningen för systemteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-397563.
Theisen, Benjamin. "Predicting Turnover Cognition in Applied Behavior Analysis Supervisors". Thesis, The Chicago School of Professional Psychology, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10934198.
Pełny tekst źródłaThe study looked for predictors of turnover cognition among program supervisors at various applied behavior analysis organizations. A hierarchical regression model (n = 248) tested whether burnout moderated effects of training hours on turnover cognition, and whether burnout moderated effects of training procedures on turnover cognition. The best model (R 2 = .719, delta F (2, 239) = 3.22, p = .042) did not detect burnout. Results were interpreted using the Conservation of Resources theory. Recommendations for researchers and organizations planning supervisor retention programs were provided.
Melbourne, Davayne A. "A New Method for Testing Normality Based upon a Characterization of the Normal Distribution". FIU Digital Commons, 2014. http://digitalcommons.fiu.edu/etd/1248.
Saket, Munther Musa. "Cost-significance applied to estimating and control of construction projects". Thesis, University of Dundee, 1986. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.276578.
Rosa, Joao Miguel Feu. "Mathematical programming applied to diet problems in a Brazilian region". Thesis, Lancaster University, 1992. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.332375.
Chun, So Yeon. "Hybrid is good: stochastic optimization and applied statistics for OR". Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/44717.
Risk, James Kenneth. "Three Applications of Gaussian Process Modeling in Evaluation of Longevity Risk Management". Thesis, University of California, Santa Barbara, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10620897.
Pełny tekst źródłaLongevity risk, the risk associated with people living too long, is an emerging issue in financial markets. Two major factors related to this are with regards to mortality modeling and pricing of life insurance instruments. We propose use of Gaussian process regression, a technique recently populuarized in machine learning, to aid in both of these problems. In particular, we present three works using Gaussian processes in longevity risk applications. The first is related to pricing, where Gaussian processes can serve as a surrogate for conditional expectation needed for Monte Carlo simulations. Second, we investigate value-at-risk calculations in a related framework, introducing a sequential algorithm allowing Gaussian processes to search for the quantile. Lastly, we use Gaussian processes as a spatial model to model mortality rates and improvement.
Shafie, H. Khalil. "The geometry of Gaussian rotation space random fields /". Thesis, McGill University, 1998. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=35614.
Zhou, Xiaojie. "Optimal designs for change point problems". Thesis, McGill University, 1997. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=35667.
Pełny tekst źródłaIn designing a longitudinal study, the decision as to when to collect data can have a large impact on the quality of the final inferences. If a change may occur in the distribution of one or more variables under study, the timing of observations can greatly influence the chances of detecting any effects.
Two classes of problems are considered. First, optimal design for the mixture of densities is investigated. Here, a finite sequence of random variables is available for observation. Each observation may come from one of two distributions with a given probability, which may differ from observation to observation. Such a problem may also be regarded as an application of the multi-path change point problem. Assume subjects may each undergo a single change at random change points with common before and after change point distributions, and at any instant a known proportion of the ensemble of paths will have changed. In either case, the goal is to select which data points to observe, in order to provide the most accurate estimates of the means of both distributions.
Second, we study optimal designs for more classical change point problems. We consider three cases: (i) when only the means of the before and after change point distributions are of interest, (ii) when only the location of the change point is of interest, and (iii) when both the change point and the means of the before and after change point distribution are of interest.
In addressing these problems, both analytic closed form solutions and modern statistical computing algorithms such as Monte Carlo integration and simulated annealing are used to find the optimal designs. Examples that concern human growth patterns and changes in CFC-12 concentrations in the atmosphere are used to illustrate the methods.
Peng, Yuanyuan. "On Singular Values of Random Matrices". Kent State University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=kent1438253068.
Yin, Kai. "Bayesian Uncertainty Quantification for Differential Equation Models Related to Financial Volatility and Disease Transmission". Case Western Reserve University School of Graduate Studies / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=case1623863667837324.
Elkadry, Alaa. "Statistical Analyses of "Randomly Sourced Data"". Thesis, Oakland University, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10262484.
Pełny tekst źródłaWarner in 1965 introduced randomized response, and since then many extensions and improvements to the Warner model have been done. In this study a randomized response model applicable to continuous data that considers a mixture of two normal distributions is considered and analyzed. This includes a study of the efficiency, an estimation of some unknown parameters and a discussion of contaminated data issues and an application of this method to the problem of estimating Oakland University student income is presented and discussed. Also, this study includes inference for two or more populations of the same structure as the randomized response model introduced.
The impact of this randomized response model on ranking and selection methods is quantified for an indifference-zone procedure and a subset selection procedure. A study of how to choose the best population among k distinct populations using an indifference-zone procedure is presented, and tables are provided for the sample size required so that the probability of correct selection exceeds a specified value in the preference zone for the randomized response model considered. An application of the subset selection procedure to the considered randomized response model is discussed. The subset selection study covers two configurations, the slippage configuration and the equi-spaced configuration, and tables are provided for both.
Finally, we discuss the use of data obtained from the Bayesian Improved Surname and Geocoding (BISG) tool in hypothesis testing for disparity between different populations. Two approaches are provided for using the information arising from the BISG.
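The indifference-zone idea mentioned above can be illustrated with a short Monte Carlo simulation (a toy with hypothetical parameter values, unrelated to the dissertation's tables): estimate the probability of correctly selecting the best of k normal populations when one mean is shifted by delta, the slippage configuration.

```python
import random

def prob_correct_selection(k=4, delta=0.5, n=30, reps=2000, seed=1):
    """Monte Carlo estimate of P(correct selection) for picking the
    population with the largest mean: k-1 populations at mean 0 and one
    at mean delta (slippage configuration), n observations each."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(reps):
        means = [sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
                 for _ in range(k - 1)]
        best = sum(rng.gauss(delta, 1.0) for _ in range(n)) / n
        if best > max(means):
            correct += 1
    return correct / reps
```

Tables like those in the dissertation tabulate the smallest n that pushes this probability above a target value for a given delta.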
Bahuguna, Manoj. "Analytics of Asymmetry and Transformation to Multivariate Normality Through Copula Functions with Applications in Biomedical Sciences and Finance". Thesis, Oakland University, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10263461.
In this work, we study and develop certain aspects of the analytics of asymmetry for univariate and multivariate data. Accordingly, the work consists of three separate parts.
In the first part of our work, we introduce a new approach to measuring univariate and multivariate skewness based on quantiles and the properties of odd and even functions. We illustrate through numerous examples and simulations that in the multivariate case Mardia's measure of skewness fails to provide consistent and meaningful interpretations. Our new measure, however, appears to provide a more reasonable index.
In the second part of our work, our emphasis is on moderating or eliminating the asymmetry of multivariate data when the interest is in the study of dependence. The copula transformation has been used as an all-purpose transformation to induce multivariate normality. Using this approach, even though information about the marginal distributions is lost, we are still able to study dependence-based modeling problems for asymmetric data using techniques developed for multivariate normal data. We illustrate a variety of applications in areas such as multiple regression, principal component analysis, factor analysis, partial least squares, and structural equation models. The results are promising in that our approach shows improvement over results obtained when asymmetry is ignored.
The last part of this work is based on applications of our copula transformation to financial data. Specifically, we consider the problem of estimating the "beta risk" associated with a particular financial asset. Taking the S&P 500 index as a proxy for the market, we suggest three versions of "beta estimates" which are useful when the returns of the asset and the market proxy do not have the most ideal probability distribution, namely bivariate normal, or when the data may contain some very extreme (high or low) returns. Using the copula-based methods developed earlier in this dissertation, together with winsorization, we obtain estimates which in high-skewness scenarios perform better than the traditional least-squares estimate of market beta.
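Two of the ingredients mentioned in this abstract, a rank-based Gaussian (copula-style) transform and winsorization, can be sketched as follows. This is an illustrative simplification with a hypothetical trim fraction, not the dissertation's estimators, and it ignores ties.

```python
from statistics import NormalDist

def normal_scores(xs):
    """Rank-based Gaussian transform: map each value to the standard
    normal quantile of its rank, a nonparametric route to normal margins."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    z = [0.0] * n
    for rank, i in enumerate(order, start=1):
        z[i] = NormalDist().inv_cdf(rank / (n + 1))
    return z

def winsorize(xs, p=0.1):
    """Clamp the lower and upper p-fraction of values, blunting the
    influence of extreme returns on a subsequent beta regression."""
    s = sorted(xs)
    lo, hi = s[int(p * len(s))], s[int((1 - p) * len(s)) - 1]
    return [min(max(x, lo), hi) for x in xs]
```

Either transformed series can then feed an ordinary least-squares regression of asset returns on market returns to estimate beta.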
Rialland, P. C. R. P. "Three essays in applied microeconomics". Thesis, University of Essex, 2018. http://repository.essex.ac.uk/23688/.
McIntosh, Alasdair. "Interpretable models of genetic drift applied especially to human populations". Thesis, University of Glasgow, 2018. http://theses.gla.ac.uk/30690/.
Donnelly, James P. "NFL Betting Market: Using Adjusted Statistics to Test Market Efficiency and Build a Betting Model". Scholarship @ Claremont, 2013. http://scholarship.claremont.edu/cmc_theses/721.
Wardrop, Daniel M. "Optimality criteria applied to certain response surface designs". Diss., Virginia Polytechnic Institute and State University, 1985. http://hdl.handle.net/10919/49960.
Pełny tekst źródłaPh. D.
Ay, Belit, and Nabiel Efrem. "Benford’s law applied to sale prices on the Swedish housing market". Thesis, Stockholms universitet, Statistiska institutionen, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-194865.
Pełny tekst źródłaHoward, Marylesa Marie. "Computational methods for support vector machine classification and large-scale Kalman filtering". Thesis, University of Montana, 2013. http://pqdtopen.proquest.com/#viewpdf?dispub=3568109.
The first half of this dissertation focuses on computational methods for solving the constrained quadratic program (QP) within the support vector machine (SVM) classifier. One of the SVM formulations requires the solution of bound- and equality-constrained QPs. We begin by describing an augmented Lagrangian approach which incorporates the equality constraint into the objective function, resulting in a bound-constrained QP. Furthermore, all constraints may be incorporated into the objective function to yield an unconstrained quadratic program, allowing us to apply the conjugate gradient (CG) method. Lastly, we adapt the scaled gradient projection method to the SVM QP and compare the performance of these methods with the state-of-the-art sequential minimal optimization algorithm and MATLAB's built-in constrained QP solver, quadprog. The augmented Lagrangian method outperforms the other state-of-the-art methods on three image test cases.
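The box-constrained QP targeted by the gradient-projection approach can be illustrated with a plain (unscaled) projected-gradient sketch. The matrix, bounds, and step-size rule below are illustrative assumptions, not the dissertation's formulation or solvers.

```python
import numpy as np

def projected_gradient_qp(Q, b, C, steps=500, lr=None):
    """Minimize 0.5 x^T Q x - b^T x subject to 0 <= x <= C by projected
    gradient descent, the box-constrained QP shape that appears in the
    SVM dual. Each step descends, then projects back onto the box."""
    if lr is None:
        lr = 1.0 / np.linalg.norm(Q, 2)  # step below 1/L, L = largest eigenvalue
    x = np.zeros(len(b))
    for _ in range(steps):
        grad = Q @ x - b
        x = np.clip(x - lr * grad, 0.0, C)  # projection onto [0, C]^n
    return x
```

Scaled gradient projection refines this basic scheme with a per-iteration scaling matrix and step-length rules.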
The second half of this dissertation focuses on computational methods for large-scale Kalman filtering applications. The Kalman filter (KF) is a method for solving a dynamic, coupled system of equations. While the method requires only linear algebra, the standard KF is often infeasible in large-scale implementations due to the storage requirements and inverse calculations of large, dense covariance matrices. We introduce the use of the CG and Lanczos methods into various forms of the Kalman filter for low-rank, low-storage approximations of the covariance matrices. We also use CG for efficient Gaussian sampling within the ensemble Kalman filter method. The CG-based KF methods perform similarly to the standard KF methods in root-mean-square error, when the standard implementations are feasible, and outperform the limited-memory Broyden-Fletcher-Goldfarb-Shanno approximation method.
Raissi, Maziar. "Multi-fidelity Stochastic Collocation". Thesis, George Mason University, 2013. http://pqdtopen.proquest.com/#viewpdf?dispub=3591697.
Over the last few years there have been dramatic advances in our understanding of mathematical and computational models of complex systems in the presence of uncertainty. This has led to growth in the area of uncertainty quantification, as well as the need to develop efficient, scalable, stable and convergent computational methods for solving differential equations with random inputs. Stochastic Galerkin methods based on polynomial chaos expansions have shown superiority to other non-sampling and many sampling techniques. However, for complicated governing equations, numerical implementations of stochastic Galerkin methods can become non-trivial. On the other hand, Monte Carlo and other traditional sampling methods are straightforward to implement, but they do not offer convergence rates as fast as stochastic Galerkin. A further numerical approach is the stochastic collocation (SC) method, which inherits both the ease of implementation of Monte Carlo and, to a great degree, the robustness of stochastic Galerkin. However, stochastic collocation and its powerful extensions, e.g. sparse grid stochastic collocation, can simply fail to handle more levels of complication. The seemingly innocent Burgers equation driven by Brownian motion is such an example. In this work we propose a novel enhancement to stochastic collocation methods using deterministic model reduction techniques that can handle this pathological example and, we hope, other more complicated equations such as the stochastic Navier-Stokes equations. Our numerical results show the efficiency of the proposed technique. We also perform a mathematically rigorous study of linear parabolic partial differential equations with random forcing terms. Justified by truncated Karhunen-Loève expansions, the input data are assumed to be represented by a finite number of random variables.
A rigorous convergence analysis of our method applied to parabolic partial differential equations with random forcing terms, supported by numerical results, shows that the proposed technique is not only reliable and robust but also very efficient.
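A minimal non-intrusive stochastic collocation step, for a scalar model with a single standard normal input, evaluates the deterministic model at Gauss-Hermite quadrature nodes instead of at random Monte Carlo draws. This is a toy sketch of the basic SC idea, unrelated to the thesis's model-reduction enhancement.

```python
import numpy as np

def collocate_mean(f, n_nodes=10):
    """Estimate E[f(Z)] for Z ~ N(0, 1) by running the deterministic
    model f at Gauss-Hermite nodes and combining with quadrature weights."""
    t, w = np.polynomial.hermite.hermgauss(n_nodes)
    z = np.sqrt(2.0) * t                 # map physicists' nodes to N(0, 1)
    return float(np.sum(w * f(z)) / np.sqrt(np.pi))
```

For smooth f this converges far faster in the number of model runs than plain Monte Carlo sampling, which is the appeal of collocation.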
Ruffin, Michael. "User retention and classification in a mobile gaming environment". Thesis, California State University, Long Beach, 2014. http://pqdtopen.proquest.com/#viewpdf?dispub=1527021.
Game analytics is a fast-growing field in which game studios allocate valuable resources to developing sophisticated statistical models of user behavior and monetization habits in order to optimize game play and performance. Game developers' ability to understand user retention allows for game features that generate high engagement, leading to stronger overall monetization and increased player lifetimes.
One important industry-adopted metric is the percentage of users who log back into the game one day after installation, otherwise known as one-day retention. Although this is an important metric, game studios typically allocate few resources to determining which user transactions conducted on the day of installation drive one-day retention.
In this project, we first conduct a cluster analysis in an attempt to uncover meaningful subgroups based on players' transaction history on their first day of installation. Second, we use various classification methods, including decision trees, logistic regression, and the k-Nearest Neighbor algorithm, to determine which behaviors are important in identifying whether a new user will return the following day.
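One of the classifiers named above, k-Nearest Neighbor, is simple enough to sketch directly. The features and labels below are hypothetical toys, not the project's data.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """k-Nearest Neighbors: label a new user's first-day behavior by
    majority vote among the k closest labeled players.
    `train` is a list of (feature_tuple, label) pairs."""
    dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    nearest = sorted(train, key=lambda row: dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

With features such as first-day purchases and session counts, the vote among similar historical players predicts whether the new user returns the next day.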
Eiland, E. Earl. "A Coherent Classifier/Prediction/Diagnostic Problem Framework and Relevant Summary Statistics". Thesis, New Mexico Institute of Mining and Technology, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10617960.
Classification is a ubiquitous decision activity. Regardless of whether it is predicting the future, e.g., a weather forecast, determining an existing state, e.g., a medical diagnosis, or some other activity, classifier outputs drive future actions. Because of their importance, classifier research and development is an active field.
Regardless of whether one is a classifier developer or an end user, evaluating and comparing classifier output quality is important. Intuitively, classifier evaluation may seem simple; however, it is not. There is a plethora of classifier summary statistics, and new summary statistics seem to surface regularly. Summary statistic users appear not to be satisfied with the existing summary statistics. For end users, many existing summary statistics do not provide actionable information. This dissertation addresses the end user's quandary.
The work consists of four parts: 1. Considering eight summary statistics with regard to their purpose (what questions they quantitatively answer) and efficacy (as defined by measurement theory). 2. Characterizing the classification problem from the end user's perspective and identifying four axioms for end-user-efficacious classifier evaluation summary statistics. 3. Applying the axioms and measurement theory to evaluate eight summary statistics and create two compliant (end-user-efficacious) summary statistics. 4. Using the compliant summary statistics to show the actionable information they generate.
By applying the recommendations in this dissertation, both end users and researchers benefit. Researchers have summary statistic selection and classifier evaluation protocols that generate the most usable information. End users can also generate information that facilitates tool selection and optimal deployment, if classifier test reports provide the necessary information.
McCants, Michael. "Efficacy of robust regression applied to fractional factorial treatment structures". Thesis, Kansas State University, 2011. http://hdl.handle.net/2097/9260.
Department of Statistics
James J. Higgins
Completely random and randomized block designs involving n factors at each of two levels are used to screen for the effects of a large number of factors. With such designs it may not be possible, either because of cost or because of time, to run each treatment combination more than once. In some cases, only a fraction of all the treatments may be run. With a large number of factors and limited observations, even one outlier can adversely affect the results. Robust regression methods are designed to down-weight the adverse effects of outliers. However, to our knowledge practitioners do not routinely apply robust regression methods in the context of fractional replication of 2^n factorial treatment structures. The purpose of this report is to examine how robust regression methods perform in this context.
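A standard robust-regression sketch in this spirit is iteratively reweighted least squares with Huber weights, which down-weights large residuals so a single outlier cannot dominate a factorial fit. The design matrix and tuning constant below are generic illustrations, not the report's analysis.

```python
import numpy as np

def huber_regression(X, y, delta=1.345, iters=50):
    """Iteratively reweighted least squares with Huber weights:
    residuals beyond delta (in robust-scale units) are down-weighted,
    limiting the influence of outliers on the fitted effects."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # ordinary LS start
    for _ in range(iters):
        r = y - X @ beta
        scale = np.median(np.abs(r)) / 0.6745 + 1e-12  # robust scale (MAD)
        u = np.abs(r / scale)
        w = np.where(u <= delta, 1.0, delta / u)       # Huber weights
        W = np.diag(w)
        beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta
```

On a two-level factorial with one corrupted run, the weighted fit recovers the true factor effects far better than ordinary least squares.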
Jacobson, Daniel A. "Networks and multivariate statistics as applied to biological datasets and wine-related omics". Thesis, Stellenbosch : Stellenbosch University, 2013. http://hdl.handle.net/10019.1/85630.
ENGLISH ABSTRACT: Introduction: Wine production is a complex biotechnological process aiming at productively coordinating the interactions and outputs of several biological systems, including grapevine and many microorganisms such as wine yeast and wine bacteria. High-throughput data generating tools in the fields of genomics, transcriptomics, proteomics, metabolomics and microbiomics are being applied both locally and globally in order to better understand complex biological systems. As such, the datasets available for analysis and mining include de novo datasets created by collaborators as well as publicly available datasets which one can use to get further insight into the systems under study. In order to model the complexity inherent in and across these datasets it is necessary to develop methods and approaches based on network theory and multivariate data analysis as well as to explore the intersections between these two approaches to data modelling, mining and interpretation. Networks: The traditional reductionist paradigm of analysing single components of a biological system has not provided tools with which to adequately analyse data sets that are attempting to capture systems-level information. Network theory has recently emerged as a new discipline with which to model and analyse complex systems and has arisen from the study of real and often quite large networks derived empirically from the large volumes of data that have been collected from communications, internet, financial and biological systems. This is in stark contrast to previous theoretical approaches to understanding complex systems such as complexity theory, synergetics, chaos theory, self-organised criticality, and fractals, which were all sweeping theoretical constructs based on small toy models that proved unable to address the complexity of real world systems.
Multivariate Data Analysis: Principal component analysis (PCA) and Partial Least Squares (PLS) regression are commonly used to reduce the dimensionality of a matrix (and among matrices in the case of PLS) in which there are a considerable number of potentially related variables. PCA and PLS are variance-focused approaches where components are ranked by the amount of variance they each explain. Components are, by definition, orthogonal to one another and, as such, uncorrelated. Aims: This thesis explores the development of computational biology tools that are essential to fully exploit the large data sets being generated by systems-based approaches, in order to gain a better understanding of wine-related organisms such as grapevine (and tobacco as a laboratory-based plant model), plant pathogens, microbes and their interactions. The broad aim of this thesis is therefore to develop computational methods that can be used in an integrated systems-based approach to model and describe different aspects of the wine making process from a biological perspective. To achieve this aim, computational methods have been developed and applied in the areas of transcriptomics, phylogenomics, chemiomics and microbiomics. Summary: The primary approaches taken in this thesis have been the use of networks and multivariate data analysis methods to analyse highly dimensional data sets. Furthermore, several of the approaches have started to explore the intersection between networks and multivariate data analysis. This would seem to be a logical progression, as both networks and multivariate data analysis are focused on matrix-based data modelling and therefore have many of their roots in linear algebra.
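The PCA step described above can be sketched via an eigendecomposition of the covariance matrix: components come out orthogonal and are ranked by the variance they explain. This is a generic illustration, not the thesis's analysis pipelines.

```python
import numpy as np

def pca(X, n_components=2):
    """Principal component analysis: center the data, eigendecompose the
    covariance matrix, and project onto the top-variance directions.
    Returns (scores, explained_variances), variances in decreasing order."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    order = np.argsort(vals)[::-1][:n_components]
    return Xc @ vecs[:, order], vals[order]
```

Because the eigenvectors are orthonormal, the resulting component scores are uncorrelated, matching the property noted in the abstract.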
AFRIKAANSE OPSOMMING: Inleiding: Wynproduksie is 'n komplekse biotegnologiese proses wat mik op die produktiewe koördinering van verskeie interaksies en uitsette van verskeie biologiese sisteme. Hierdie sisteme sluit in die wingerd, wat van besondere belang is, asook die wyn gis en wyn bakterieë. Hoë-deurset data generasie word huidiglik beide globaal en plaaslik toegepas in die velde van genomika, transkriptomika, proteomika, metabolomika en mikrobiomika. As sulks is hierdie tipe datastelle beskikbaar vir ontleding, bemyning en verkening. Die datastelle kan de novo gegenereer word, met behulp van medewerkers, of dit kan vanuit die publieke databasisse gewerf word waar sulke datastelle dikwels beskikbaar gemaak word sodat verdere insig verkry kan word met betrekking tot die sisteem onder studie. Die hoë-deurset datastelle onder bespreking bevat 'n hoë mate van inherente kompleksiteit, beide ten opsigte van ditself asook tussen verskeie datastelle. Om ten einde hierdie datastelle en hul inherente kompleksiteit te modelleer is dit nodig om metodes en benaderings te ontwikkel wat gesetel is in netwerk teorie en meerveranderlike statistiek. Verdermeer is dit ook nodig om die kruisings tussen netwerk teorie en meerveranderlike statistiek te verken om sodoende die modellering, bemyning, verkening en interpretasie van data te verbeter. Netwerke: Die tradisionele reduksionistiese paradigma, waarby enkele komponente van 'n biologiese sisteem geontleed word, het tot dusver nie voldoende metodes en gereedskap gelewer waarmee datastelle, wat streef om sisteemvlak informasie te bekom, geontleed kan word nie. Netwerk teorie het na vore gekom as 'n nuwe dissipline wat toegepas kan word vir die model-skepping en ontleding van komplekse sisteme. Dit stem uit die studie van egte, dikwels groot netwerke wat empiries afgelei word uit die groot volumes data wat tans na vore kom vanuit kommunikasie-, internet-, nansiële- en biologiese sisteme. 
Dit is in skrille kontras met vorige teoretiese benaderings wat gestreef het om komplekse sisteme te verstaan met konsepte soos kompleksiteits teorie, synergetics , chaos teorie, self-georganiseerde kritikaliteit en fraktale. Al die bogeneomde is breë teoretiese konstrukte, gebasseer op relatief kleinskaal modelle, wat nie instaat was om oplossings vir die kompleksiteit van egte-wêreld sisteme te bied nie. Meerveranderlike Data-analise: Hoofkomponente-ontleding (PCA) en Partial Least Squares (PLS) regressie word dikwels gebruik om die dimensionaliteit van 'n matriks (en tussen matrikse in die geval van PLS) te verminder. Hierdie matrikse bevat dikwels 'n aansienlike groot hoeveelheid moontlikverwante veranderlikes. PCA en PLS is variansie gedrewe metodes en behels dat komponente gerang word deur die hoeveelheid variansie wat elke component verduidelik. Komponente is by de nisie ortogonaal ten opsigte van mekaar en as sulks ongekorreleerd. Doelwitte: Hierdie tesis verken die ontwikkeling van verskeie Computational Biology metodes wat noodsaaklik is om ten volle die groot skaal datastelle te benut wat tans deur sisteem-gebasseerde benaderings gegenereer word. Die doel is om beter begrip en kennis van wyn verwante organismes te kry, hierdie organismes sluit in die wingerd (met tabak as laboratorium-gebasseerde plant model), plant patogene en microbes sowel as hulle interaksies. Die breë mikpunt van hierdie tesis is dus om gerekenaardiseerde metodes te ontwikkel wat gebruik kan word in 'n geintergreerde sisteem-gebaseerde benadering tot die modellering en beskrywing van verskillende aspekte van die wynmaak proses vanuit 'n biologiese standpunt. Om die mikpunt te bereik is gerekenaardiseerde metodes ontwikkel en toegepas in die velde van transkriptomika, logenomika, chemiomika en mikrobiomika. Opsomming: Die primêre benadering geneem in hierdie tesis is die gebruik van netwerke en meerveranderlike data-ontleding metodes om hoë-dimensie datastelle te ontleed. 
Furthermore, several of the methods begin to explore the common ground between networks and multivariate data analysis. This appears to be a logical progression, since both networks and multivariate data analysis are focused on matrix-based data modelling and are thus rooted in linear algebra.
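The variance-driven, orthogonal-component PCA described in the summary above can be sketched in a few lines via the SVD of the centered data matrix. This is a generic illustration, not the thesis's own code; the toy "omics" matrix and its dimensions are invented for the example.

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD of the mean-centered data matrix.

    Components are mutually orthogonal and ranked by the variance
    they explain. Returns scores, loadings, explained-variance ratios.
    """
    Xc = X - X.mean(axis=0)                          # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # sample coordinates
    loadings = Vt[:n_components].T                   # variable weights
    var_ratio = (s**2) / np.sum(s**2)
    return scores, loadings, var_ratio[:n_components]

rng = np.random.default_rng(0)
# toy "omics" matrix: 50 samples x 10 correlated variables (rank ~2 + noise)
base = rng.normal(size=(50, 2))
X = base @ rng.normal(size=(2, 10)) + 0.1 * rng.normal(size=(50, 10))
scores, loadings, ratio = pca(X, 2)
```

Because the toy data are essentially rank-2, the first two components capture nearly all of the variance, illustrating the dimensionality reduction the summary refers to.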
Araújo, Daniel Costa. "Channel estimation techniques applied to massive MIMO systems using sparsity and statistics approaches". Repositório Institucional da UFC, 2016. http://www.repositorio.ufc.br/handle/riufc/23478.
Full text available
Massive MIMO has the potential to greatly increase system spectral efficiency by employing many individually steerable antenna elements at the base station (BS). This potential can only be achieved if the BS has sufficient channel state information (CSI). How the CSI is acquired depends on the duplexing mode employed by the communication system. Currently, frequency division duplexing (FDD) is the mode most widely used in wireless communication systems. However, the amount of overhead necessary to estimate the channel scales with the number of antennas, which poses a major challenge to implementing massive MIMO systems with an FDD protocol. To enable the two to operate together, this thesis tackles the channel estimation problem by proposing methods that exploit a compressed version of the massive MIMO channel. Two main approaches are used to achieve such compression: sparsity and second-order statistics. To derive the sparsity-based techniques, this thesis uses a compressive sensing (CS) framework to extract a sparse representation of the channel. This is investigated first for a flat channel and then for a frequency-selective one. In the former case, we show that the Cramer-Rao lower bound (CRLB) for the problem is a function of pilot sequences that lead to a Grassmannian matrix. In the frequency-selective case, a novel estimator that combines CS and tensor analysis is derived. This new method uses the measurements obtained from the pilot subcarriers to estimate a sparse tensor representation of the channel. Assuming a Tucker3 model, the proposed solution maps the estimated sparse tensor to a full one that describes the spatial-frequency channel response. Furthermore, this thesis investigates the problem of updating the sparse basis that arises when the user is moving. In this study, an algorithm is proposed to track the arrival and departure directions using very few pilots.
Besides the sparsity-based techniques, this thesis investigates channel estimation performance using a statistical approach. In this case, a new hybrid beamforming (HB) architecture is proposed to spatially multiplex the pilot sequences and reduce the overhead. More specifically, the new solution creates a set of beams that is jointly calculated with the channel estimator and the pilot power allocation using the minimum mean square error (MMSE) criterion. We show that this provides enhanced performance for the estimation process in low signal-to-noise ratio (SNR) scenarios.
Research on massive MIMO (multiple-input multiple-output) systems has attracted great attention from the scientific community due to its potential to increase the spectral efficiency of wireless communication systems by using hundreds of antenna elements at the base station (BS). However, this potential can only be realized if the BS has sufficient channel state information. How to acquire it depends on how the time-frequency communication resources are employed. Currently, the most widely used solution in wireless communication systems is frequency division duplexing (FDD) of the pilots. However, the great challenge in implementing this type of solution is that the number of pilot tones required to estimate the channel grows with the number of antennas. This results in the loss of the spectral efficiency promised by the massive system. This thesis presents channel estimation methods that demand a reduced number of pilot tones while maintaining high accuracy in the channel estimate. This reduction in pilot tones is achieved because the proposed estimators exploit the structure of the channel to reduce its dimensionality. In this thesis, essentially two approaches are used to achieve such dimensionality reduction: one through sparsity and the other through second-order statistics. To derive the solutions that exploit channel sparsity, the channel estimator is obtained using compressive sensing (CS) theory to extract a sparse representation of the channel. The theory is applied to the problem of estimating frequency-flat and frequency-selective channels. In the first case, it is shown that the Cramer-Rao lower bound (CRLB) is defined as a function of the pilot sequences that generate a Grassmannian matrix. In the second case, CS and tensor analysis are combined to derive a new estimation algorithm based on sparse tensor decomposition for frequency-selective channels.
Using the Tucker3 model, the proposed solution maps the sparse tensor to a full tensor that describes the channel response in space and frequency. In addition, the thesis investigates the optimization of the sparse representation basis, proposing a method to estimate and correct the variations of the arrival and departure angles caused by user mobility. Besides the sparsity-based techniques, this thesis investigates those that use statistical knowledge of the channel. In this case, a new hybrid beamforming architecture is proposed to multiplex the pilot sequences. The new solution consists of creating a set of beams, which are calculated jointly with the channel estimator and the pilot power allocation, using the minimum mean square error criterion. It is shown that this solution reduces the pilot sequence length and performs well in low signal-to-noise ratio (SNR) scenarios.
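As a hedged illustration of the sparse-recovery idea behind the CS-based estimators described in the entry above (not the thesis's own algorithm), the sketch below recovers a sparse angular-domain channel from a small number of pilot measurements using orthogonal matching pursuit. The antenna count, pilot count, path count, and DFT angular dictionary are all assumptions made for the example.

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: recover a k-sparse x from y ≈ A x."""
    residual, support = y.copy(), []
    for _ in range(k):
        # pick the dictionary atom most correlated with the residual
        j = int(np.argmax(np.abs(A.conj().T @ residual)))
        support.append(j)
        As = A[:, support]
        coef, *_ = np.linalg.lstsq(As, y, rcond=None)  # refit on support
        residual = y - As @ coef
    x = np.zeros(A.shape[1], dtype=complex)
    x[support] = coef
    return x

rng = np.random.default_rng(1)
n_ant, n_pilots, k = 64, 20, 3          # antennas, pilot measurements, paths
F = np.fft.fft(np.eye(n_ant)) / np.sqrt(n_ant)   # angular (DFT) dictionary
h_sparse = np.zeros(n_ant, dtype=complex)
idx = rng.choice(n_ant, k, replace=False)
h_sparse[idx] = rng.normal(size=k) + 1j * rng.normal(size=k)
h = F @ h_sparse                                 # physical channel (few paths)
# random pilot combining matrix: far fewer measurements than antennas
P = (rng.normal(size=(n_pilots, n_ant))
     + 1j * rng.normal(size=(n_pilots, n_ant))) / np.sqrt(2 * n_pilots)
y = P @ h                                        # noiseless pilot observations
h_hat = F @ omp(P @ F, y, k)                     # recover via sparse basis
```

With only 20 pilot measurements for 64 antennas, the channel is recovered because it is sparse in the angular basis, which is the compression the abstract describes.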
Gard, Rikard. "Design-based and Model-assisted estimators using Machine learning methods : Exploring the k-Nearest Neighbor metod applied to data from the Recreational Fishing Survey". Thesis, Örebro universitet, Handelshögskolan vid Örebro Universitet, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:oru:diva-72488.
Full text available
Chernoff, Parker. "Sabermetrics - Statistical Modeling of Run Creation and Prevention in Baseball". FIU Digital Commons, 2018. https://digitalcommons.fiu.edu/etd/3663.
Full text available
Book, Emil, and Linus Ekelöf. "A Multiple Linear Regression Model To Assess The Effects of Macroeconomic Factors On Small and Medium-Sized Enterprises". Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254298.
Full text available
Small and medium-sized enterprises (SMEs) have long been regarded as one of the most important components of a country's economy, chiefly for their contribution to growth and prosperity. It is therefore very important that governments and legislators pursue policies that promote the optimal growth of SMEs. Several years of economic boom and concern about a coming recession have made this topic highly relevant, as small businesses are the ones that will be hit hardest by a harsher economic climate. This report uses multiple linear regression to evaluate the effects of various macroeconomic factors on SMEs in Sweden. Data were collected monthly over a 10-year period between 2009 and 2010. The result was a model with five variables and a coefficient of determination of 98%.
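The multiple linear regression analysis summarized above can be illustrated with a minimal ordinary-least-squares fit and its coefficient of determination. The "macro factor" design matrix and coefficients below are synthetic stand-ins invented for the example, not the thesis's data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 120                                   # e.g. monthly observations
# hypothetical standardized macro factors (interest rate, inflation, FX rate)
X = rng.normal(size=(n, 3))
beta_true = np.array([1.5, -0.8, 0.3])
y = 2.0 + X @ beta_true + 0.2 * rng.normal(size=n)

A = np.column_stack([np.ones(n), X])      # prepend intercept column
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

resid = y - A @ beta_hat
r_squared = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
```

`beta_hat` recovers the intercept and slopes, and `r_squared` plays the role of the "coefficient of determination" the abstract reports.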
Zhi, Tianchen. "Maximum Likelihood Estimation of Parameters in Exponential Power Distribution with Upper Record Values". FIU Digital Commons, 2017. http://digitalcommons.fiu.edu/etd/3211.
Full text available
Williams, Ulyana P. "On Some Ridge Regression Estimators for Logistic Regression Models". FIU Digital Commons, 2018. https://digitalcommons.fiu.edu/etd/3667.
Full text available
Chu, Chi-Yang. "Applied Nonparametric Density and Regression Estimation with Discrete Data: Plug-In Bandwidth Selection and Non-Geometric Kernel Functions". Thesis, The University of Alabama, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10262364.
Full text available
Bandwidth selection plays an important role in kernel density estimation. Least-squares cross-validation and plug-in methods are commonly used as bandwidth selectors in the continuous-data setting. The former is a data-driven approach, while the latter requires a priori assumptions about the unknown distribution of the data. A benefit of the plug-in method is its relatively quick computation, and hence it is often used for preliminary analysis. However, we find that much less is known about the plug-in method in the discrete-data setting, and this motivates us to propose a plug-in bandwidth selector. A related issue is undersmoothing in kernel density estimation. Least-squares cross-validation is a popular bandwidth selector, but in many applied situations it tends to select a relatively small bandwidth, i.e., it undersmooths. The literature suggests several methods to solve this problem, but most of them are modifications of existing error criteria for continuous variables. Here we discuss this problem in the discrete-data setting and propose non-geometric discrete kernel functions as a possible solution. The same issue also occurs in kernel regression estimation. Our proposed bandwidth selector and kernel functions perform well on simulated and real data.
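For the continuous-data setting that the abstract contrasts with, the classic quick-to-compute plug-in choice is the normal-reference (Silverman) rule of thumb. The sketch below shows that continuous analogue, not the discrete-data plug-in selector the thesis proposes.

```python
import numpy as np

def silverman_bandwidth(x):
    """Normal-reference plug-in bandwidth for a Gaussian kernel:
    h = 0.9 * min(std, IQR/1.349) * n^(-1/5)."""
    n = x.size
    sigma = min(x.std(ddof=1),
                (np.quantile(x, 0.75) - np.quantile(x, 0.25)) / 1.349)
    return 0.9 * sigma * n ** (-1 / 5)

def kde(x_grid, data, h):
    """Gaussian kernel density estimate evaluated on x_grid."""
    u = (x_grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (data.size * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(3)
data = rng.normal(size=500)
h = silverman_bandwidth(data)           # plug-in: no cross-validation loop
grid = np.linspace(-4, 4, 201)
density = kde(grid, data, h)
```

Unlike least-squares cross-validation, no optimization over candidate bandwidths is needed, which is the computational advantage the abstract mentions; the price is the a priori normality assumption baked into the rule.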
Olsen, Jessica Lyn. "An Applied Investigation of Gaussian Markov Random Fields". BYU ScholarsArchive, 2012. https://scholarsarchive.byu.edu/etd/3273.
Full text available
Albaqshi, Amani Mohammed H. "Generalized Partial Least Squares Approach for Nominal Multinomial Logit Regression Models with a Functional Covariate". Thesis, University of Northern Colorado, 2017. http://pqdtopen.proquest.com/#viewpdf?dispub=10599676.
Full text available
Functional Data Analysis (FDA) has attracted substantial attention over the last two decades. Within FDA, classifying curves into two or more categories is consistently of interest to scientists, but multi-class prediction within FDA is challenging in that most classification tools have been limited to binary-response applications. The functional logistic regression (FLR) model was developed to forecast a binary response variable in the functional case. In this study, a functional nominal multinomial logit regression (F-NM-LR) model was developed that extends the FLR model into a multiple logit model. However, the model generates inaccurate parameter function estimates due to multicollinearity in the design matrix. A generalized partial least squares (GPLS) approach with cubic B-spline basis expansions was developed to address the multicollinearity and high-dimensionality problems that preclude accurate estimates and curve discrimination with the F-NM-LR model. The GPLS method extends partial least squares (PLS) and improves upon current methodology by introducing a component selection criterion that reconstructs the parameter function with fewer predictors. The GPLS regression estimates are derived via Iteratively ReWeighted Partial Least Squares (IRWPLS), defining a set of uncorrelated latent variables to use as predictors for the F-GPLS-NM-LR model. This methodology was compared to the classic alternative estimation method of principal component regression (PCR) in a simulation study. The performance of the proposed methodology was tested via simulations and an application to a spectrometric dataset. The results indicate that the GPLS method performs well in multi-class prediction with respect to the F-NM-LR model. The main difference between the two approaches was that PCR usually requires more components than GPLS to achieve similar accuracy in the parameter function estimates of the F-GPLS-NM-LR model.
The results of this research imply that the GPLS method is preferable to the F-NM-LR model, and it is a useful contribution to FDA techniques. This method may be particularly appropriate for practical situations where accurate prediction of a response variable with fewer components is a priority.
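A minimal PLS1 (NIPALS) sketch illustrates the partial least squares idea that GPLS extends: latent scores chosen for their covariance with the response, which stabilizes estimation on a collinear design that would trouble ordinary least squares. This is generic PLS under invented toy data, not the IRWPLS/GPLS estimator of the thesis.

```python
import numpy as np

def pls1(X, y, n_components):
    """Minimal PLS1 via NIPALS: extract uncorrelated latent scores that
    maximize covariance with y, then map back to coefficients in the
    original predictor space."""
    Xk, yk = X - X.mean(axis=0), y - y.mean()
    W, P_load, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                     # weight: covariance direction
        w /= np.linalg.norm(w)
        t = Xk @ w                        # latent score
        p = Xk.T @ t / (t @ t)            # X loading
        qk = yk @ t / (t @ t)             # y loading
        Xk = Xk - np.outer(t, p)          # deflate X
        yk = yk - qk * t                  # deflate y
        W.append(w); P_load.append(p); q.append(qk)
    W, P_load, q = np.array(W).T, np.array(P_load).T, np.array(q)
    return W @ np.linalg.solve(P_load.T @ W, q)

rng = np.random.default_rng(4)
# collinear design: 100 samples, 30 predictors driven by only 3 latent factors
Z = rng.normal(size=(100, 3))
X = Z @ rng.normal(size=(3, 30)) + 0.05 * rng.normal(size=(100, 30))
y = Z[:, 0] - 2 * Z[:, 1] + 0.1 * rng.normal(size=100)
beta = pls1(X, y, 3)
pred = (X - X.mean(axis=0)) @ beta + y.mean()
```

Three components suffice here because the 30 predictors share only three underlying factors, mirroring the "fewer components than PCR" comparison in the abstract.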
Paciencia, Todd J. "Improving non-linear approaches to anomaly detection, class separation, and visualization". Thesis, Air Force Institute of Technology, 2015. http://pqdtopen.proquest.com/#viewpdf?dispub=3667806.
Full text available
Linear approaches to multivariate data analysis are popular due to their lower complexity, reduced computational time, and easier interpretation. In many cases, linear approaches produce adequate results; however, non-linear methods may generate more robust transformations, features, and decision boundaries. Of course, these non-linear methods present their own unique challenges that often inhibit their use.
In this research, improvements to existing non-linear techniques are investigated for the purposes of providing better, timely class separation and improved anomaly detection on various multivariate datasets, culminating in application to anomaly detection in hyperspectral imagery. Primarily, kernel-based methods are investigated, with some consideration towards other methods. Improvements to existing linear-based algorithms are also explored. Here, it is assumed that classes in the data have minimal overlap in the originating space or can be made to have minimal overlap in a transformed space, and that class information is unknown a priori. Further, improvements are demonstrated for global anomaly detection on a variety of hyperspectral imagery, utilizing fusion of spatial and spectral information, factor analysis, clustering, and screening. Additionally, new approaches for n-dimensional visualization of data and decision boundaries are developed.
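A standard global baseline for the hyperspectral anomaly detection discussed above, against which improved methods are typically compared, is the Reed-Xiaoli (RX) detector: the Mahalanobis distance of each pixel spectrum from the scene's mean and covariance. The sketch below uses synthetic "spectra"; the band count and anomaly strength are assumptions made for the example, and this is not the thesis's improved approach.

```python
import numpy as np

def rx_scores(pixels):
    """Global RX anomaly detector: Mahalanobis distance of each pixel
    spectrum from the background mean and covariance."""
    mu = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False)
    delta = pixels - mu
    sol = np.linalg.solve(cov, delta.T)          # cov^{-1} (x - mu)
    return np.einsum('ij,ji->i', delta, sol)     # (x-mu)^T cov^{-1} (x-mu)

rng = np.random.default_rng(5)
bands = 10
background = rng.normal(size=(1000, bands))      # nominal pixel spectra
anomaly = rng.normal(loc=6.0, size=(5, bands))   # injected anomalous pixels
pixels = np.vstack([background, anomaly])
scores = rx_scores(pixels)
top5 = np.argsort(scores)[-5:]                   # highest anomaly scores
```

Under a Gaussian background, the RX score follows roughly a chi-squared distribution with one degree of freedom per band, which gives a principled detection threshold; kernel variants replace the linear covariance model with a non-linear one, as the research above investigates.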
Fregosi, Anna. "Calibration of Thermal Soil Properties in the Shallow Subsurface". Thesis, North Carolina State University, 2016. http://pqdtopen.proquest.com/#viewpdf?dispub=10110538.
Full text available
We use nonlinear least squares methods and Bayesian inference to calibrate soil properties using models for heat and groundwater transport in the shallow subsurface. We first assume a constant saturation in our domain and use the analytic solution of the heat equation as a model for heat transport. We compare our results to those obtained using the finite element code Adaptive Hydrology (ADH). We then use ADH to simulate heat and groundwater transport in an unsaturated domain. We use the Model-Independent Parameter Estimation (PEST) software to solve the least squares problem with ADH as our model. For Bayesian inference, we employ the Delayed Rejection Adaptive Metropolis (DRAM) Markov chain Monte Carlo algorithm to sample from the posterior densities of the parameters in both models. We find that our results are consistent with those found using soil samples and empirical methods.
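The Bayesian half of the calibration workflow above can be illustrated with a plain random-walk Metropolis sampler on a toy exponential-decay model; DRAM adds delayed rejection and adaptive proposal scaling on top of this basic scheme. The model, noise level, and proposal scale below are assumptions invented for the example.

```python
import numpy as np

rng = np.random.default_rng(6)
# synthetic "cooling" data: y = exp(-k t) + noise, with true k = 0.5
t = np.linspace(0, 10, 50)
y_obs = np.exp(-0.5 * t) + 0.01 * rng.normal(size=t.size)
sigma = 0.01                               # known measurement noise

def log_post(k):
    """Log posterior: Gaussian likelihood, flat prior on k > 0."""
    if k <= 0:
        return -np.inf
    resid = y_obs - np.exp(-k * t)
    return -0.5 * np.sum(resid**2) / sigma**2

# random-walk Metropolis: propose, then accept with prob min(1, ratio)
samples, k = [], 1.0                       # deliberately poor starting value
lp = log_post(k)
for _ in range(5000):
    k_prop = k + 0.05 * rng.normal()
    lp_prop = log_post(k_prop)
    if np.log(rng.uniform()) < lp_prop - lp:
        k, lp = k_prop, lp_prop            # accept the move
    samples.append(k)
post = np.array(samples[1000:])            # discard burn-in
```

The retained samples approximate the posterior density of the parameter, from which point estimates and credible intervals follow; in the thesis the analytic decay model is replaced by the ADH forward simulation.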
Pipher, Brandon. "Comparison of Regression Methods with Non-Convex Penalties". Kent State University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=kent1573056251025985.
Full text available
Crk, Vladimir 1958. "Component and system reliability assessment from degradation data". Diss., The University of Arizona, 1998. http://hdl.handle.net/10150/282820.
Full text available
Smith, Laura. "A Numerical Simulation and Statistical Modeling of High Intensity Radiated Fields Experiment Data". W&M ScholarWorks, 2001. https://scholarworks.wm.edu/etd/1539626330.
Full text available
Zhao, Yang, and Min Zhang. "The Ising Model on a Heavy Gravity Portfolio Applied to Default Contagion". Thesis, Högskolan i Halmstad, Tillämpad matematik och fysik (MPE-lab), 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-16459.
Full text available