Dissertations / Theses on the topic 'Model selection'


Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Model selection.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses across a wide variety of disciplines and organise your bibliography correctly.

1

Selén, Yngve. "Model selection /." Uppsala : Univ. : Dept. of Information Technology, Univ, 2004. http://www.it.uu.se/research/reports/lic/2004-003/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Selén, Yngve. "Model Selection." Licentiate thesis, Uppsala universitet, Avdelningen för systemteknik, 2004. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-86308.

Full text
Abstract:
Before using a parametric model one has to be sure that it offers a reasonable description of the system to be modeled. If a bad model structure is employed, the obtained model will also be bad, no matter how good the parameter estimation method is. There exist many possible ways of validating candidate models. This thesis focuses on one of the most common ways, i.e., the use of information criteria. First, some common information criteria are presented, and in the later chapters, various extensions and implementations are shown. An important extension, which is advocated in the thesis, is the multi-model (or model averaging) approach to model selection. This multi-model approach consists of forming a weighted sum of several candidate models, which can then be used for inference.
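For context on the weighted-sum idea in the abstract above, here is a minimal sketch of information-criterion-based model averaging using Akaike weights; it is an editorial illustration, not the thesis's own implementation, and the data and candidate models are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

def akaike_weights(aics):
    """Turn AIC values into normalized model weights (smaller AIC gives larger weight)."""
    delta = np.asarray(aics, dtype=float) - np.min(aics)
    w = np.exp(-0.5 * delta)
    return w / w.sum()

# Hypothetical data and three nested linear candidate models for the same response.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=100)

fits = [sm.OLS(y, sm.add_constant(X[:, :k])).fit() for k in (1, 2, 3)]
weights = akaike_weights([f.aic for f in fits])

# Model-averaged fitted values: a weighted sum over the candidate models.
y_avg = sum(w * f.fittedvalues for w, f in zip(weights, fits))
```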
APA, Harvard, Vancouver, ISO, and other styles
3

Evers, Ludger. "Model fitting and model selection for 'mixture of experts' models." Thesis, University of Oxford, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.445776.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Billah, Baki 1965. "Model selection for time series forecasting models." Monash University, Dept. of Econometrics and Business Statistics, 2001. http://arrow.monash.edu.au/hdl/1959.1/8840.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Yoshimura, Arihiro. "Essays on Semiparametric Model Selection and Model Averaging." Kyoto University, 2015. http://hdl.handle.net/2433/199059.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

PENG, SISI. "Evaluating Automatic Model Selection." Thesis, Uppsala universitet, Statistiska institutionen, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-154449.

Full text
Abstract:
In this paper, we briefly describe the automatic model selection provided by Autometrics in the PcGive program. The modeler only needs to specify the initial model and the significance level at which to reduce the model; the algorithm then does the rest. The properties of Autometrics are discussed. We also explain its background concepts and try to see whether the model selected by Autometrics can perform well. For a given data set, we use Autometrics to find a "new" model, and then compare the "new" model with one previously selected by another modeler. It is an interesting question whether Autometrics can also find models that fit the given data better. As an illustration, we choose three examples. Autometrics is labor saving and always gives us a parsimonious model, and it is an invaluable instrument for social science. However, we still need more examples to strongly support the idea that Autometrics can find a model which fits the data better; the few examples in this paper are far from enough.
APA, Harvard, Vancouver, ISO, and other styles
7

Bello, Bernardo. "PROCESS MANUFACTURING SELECTION MODEL." Thesis, KTH, Industriell produktion, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-218031.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Aasberg, Pipirs Freddy, and Patrik Svensson. "Tenancy Model Selection Guidelines." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-235716.

Full text
Abstract:
Software as a Service (SaaS) is a subset of cloud services where a vendor provides software as a service to customers. The SaaS application is installed on the SaaS provider's servers and is often accessed via the web browser. In the context of SaaS, a customer is called a tenant; this is often an organization accessing the SaaS application, but it could also be a single individual. A SaaS application can be classified into tenancy models. A tenancy model describes how a tenant's data is mapped to the storage on the server side of the SaaS application. Based on their research, the authors have drawn the conclusion that there is a lack of guidance for selecting tenancy models. The purpose of this thesis is to provide guidance for selecting tenancy models. The short-term goal is to create a tenancy selection guide. The long-term goal is to provide researchers and students with research material. This thesis provides a guidance model for the selection of tenancy models, called Tenancy Model Selection Guidelines (TMSG). TMSG was evaluated by interviewing two professionals from the software industry. The criteria used for evaluating TMSG were interviewee credibility, syntactic correctness, semantic correctness, usefulness and model flexibility. In the interviews, both interviewees said that TMSG was in need of further refinement; still, they were positive about the achieved result.
APA, Harvard, Vancouver, ISO, and other styles
9

Belitz, Christiane. "Model Selection in Generalised Structured Additive Regression Models." Diss., lmu, 2007. http://nbn-resolving.de/urn:nbn:de:bvb:19-78896.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Sommer, Julia. "Regularized estimation and model selection in compartment models." Diss., Ludwig-Maximilians-Universität München, 2013. http://nbn-resolving.de/urn:nbn:de:bvb:19-157673.

Full text
Abstract:
Dynamic imaging series acquired in medical and biological research are often analyzed with the help of compartment models. Compartment models provide a parametric, nonlinear function of interpretable, kinetic parameters describing how some concentration of interest evolves over time. Aiming to estimate the kinetic parameters, this leads to a nonlinear regression problem. In many applications, the number of compartments needed in the model is not known from biological considerations but should be inferred from the data along with the kinetic parameters. As data from medical and biological experiments are often available in the form of images, the spatial data structure of the images has to be taken into account. This thesis addresses the problem of parameter estimation and model selection in compartment models. Besides a penalized maximum likelihood based approach, several Bayesian approaches-including a hierarchical model with Gaussian Markov random field priors and a model state approach with flexible model dimension-are proposed and evaluated to accomplish this task. Existing methods are extended for parameter estimation and model selection in more complex compartment models. However, in nonlinear regression and, in particular, for more complex compartment models, redundancy issues may arise. This thesis analyzes difficulties arising due to redundancy issues and proposes several approaches to alleviate those redundancy issues by regularizing the parameter space. The potential of the proposed estimation and model selection approaches is evaluated in simulation studies as well as for two in vivo imaging applications: a dynamic contrast enhanced magnetic resonance imaging (DCE-MRI) study on breast cancer and a study on the binding behavior of molecules in living cell nuclei observed in a fluorescence recovery after photobleaching (FRAP) experiment.
APA, Harvard, Vancouver, ISO, and other styles
11

Smith, Peter William Frederick. "Edge exclusion and model selection in graphical models." Thesis, Lancaster University, 1990. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.315138.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Guo, Yixuan. "Bayesian Model Selection for Poisson and Related Models." University of Cincinnati / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1439310177.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Dey, Tanujit. "Prediction and Variable Selection." Cleveland, Ohio : Case Western Reserve University, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=case1212581055.

Full text
APA, Harvard, Vancouver, ISO, and other styles
14

Selén, Yngve. "Model selection and sparse modeling /." Uppsala : Department of Information Technology, Uppsala University, 2007. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-8202.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

Vinciotti, Veronica. "Model selection in supervised classification." Thesis, Imperial College London, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.397929.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Sassoon, Isabel Karen. "Argumentation for statistical model selection." Thesis, King's College London (University of London), 2018. https://kclpure.kcl.ac.uk/portal/en/theses/argumentation-for-statistical-model-selection(79168e3a-2903-43dc-ac60-97a7c87f94f0).html.

Full text
Abstract:
The increased availability of clinical data, in particular case data collected routinely, provides a valuable opportunity for analysis with a view to supporting evidence-based decision making. In order to confidently leverage this data in support of decision making, it is essential to analyse it with rigour by employing the most appropriate statistical method. It can be difficult for a clinician to choose the appropriate statistical method, and indeed the choice is not always straightforward, even for a statistician. The considerations as to which model to use depend on the research question, the data and at times background information from the clinician, and will vary from model to model. This thesis develops an intelligent decision support method that supports the clinician by recommending the most appropriate statistical modelling approach given the research question and the available data. The main contributions of this thesis are: identification of the requirements from real-world collaboration with clinicians; development of an argumentation-based approach to recommend statistical models based on a research question and data features; an argumentation scheme for proposing possible models; a statistical knowledge base designed to support the argumentation scheme, critical questions and preferences; and a method of reasoning with the generated arguments and preference arguments. The approach is evaluated through case studies and a prototype.
APA, Harvard, Vancouver, ISO, and other styles
17

Grosse, Roger Baker. "Model selection in compositional spaces." Thesis, Massachusetts Institute of Technology, 2014. http://hdl.handle.net/1721.1/87789.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2014.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 172-181).
We often build complex probabilistic models by composing simpler models, using one model to generate parameters or latent variables for another model. This allows us to express complex distributions over the observed data and to share statistical structure between different parts of a model. In this thesis, we present a space of matrix decomposition models defined by the composition of a small number of motifs of probabilistic modeling, including clustering, low rank factorizations, and binary latent factor models. This compositional structure can be represented by a context-free grammar whose production rules correspond to these motifs. By exploiting the structure of this grammar, we can generically and efficiently infer latent components and estimate predictive likelihood for nearly 2500 model structures using a small toolbox of reusable algorithms. Using a greedy search over this grammar, we automatically choose the decomposition structure from raw data by evaluating only a small fraction of all models. The proposed method typically finds the correct structure for synthetic data and backs off gracefully to simpler models under heavy noise. It learns sensible structures for datasets as diverse as image patches, motion capture, 20 Questions, and U.S. Senate votes, all using exactly the same code. We then consider several improvements to compositional structure search. We present compositional importance sampling (CIS), a novel procedure for marginal likelihood estimation which requires only posterior inference and marginal likelihood estimation algorithms corresponding to the production rules of the grammar. We analyze the performance of CIS in the case of identifying additional structure within a low-rank decomposition. This analysis yields insights into how one should design a space of models to be recursively searchable. We next consider the problem of marginal likelihood estimation for the production rules. We present a novel method for obtaining ground truth marginal likelihood values on synthetic data, which enables the rigorous quantitative comparison of marginal likelihood estimators. Using this method, we compare a wide variety of marginal likelihood estimators for the production rules of our grammar. Finally, we present a framework for analyzing the sequences of distributions used in annealed importance sampling, a state-of-the-art marginal likelihood estimator, and present a novel sequence of intermediate distributions based on averaging moments of the initial and target distributions.
by Roger Baker Grosse.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
18

Velasco-Cruz, Ciro. "Spatially Correlated Model Selection (SCOMS)." Diss., Virginia Tech, 2012. http://hdl.handle.net/10919/27791.

Full text
Abstract:
In this dissertation, a variable selection method for spatial data is developed. It is assumed that the spatial process is non-stationary as a whole but piece-wise stationary. The pieces where the spatial process is stationary are called regions. The variable selection approach accounts for two sources of correlation: (1) the spatial correlation of the data within the regions, and (2) the correlation of adjacent regions. The variable selection is carried out by including indicator variables that characterize the significance of the regression coefficients. The Ising distribution, used as a prior for the vector of indicator variables, models the dependence of adjacent regions. We present a case study on brook trout data where the response of interest is the presence/absence of the fish at sites in the eastern United States. We find that the method outperforms probit regression in which the spatial field is assumed stationary and isotropic. Additionally, the method outperforms the case where multiple regions are assumed independent of their neighbors.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
19

You, Di. "Model Selection in Kernel Methods." The Ohio State University, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=osu1322581224.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Osaka, Haruki. "Asymptotics of Mixture Model Selection." Thesis, The University of Sydney, 2021. https://hdl.handle.net/2123/27230.

Full text
Abstract:
In this thesis, we consider the likelihood ratio test (LRT) when testing for homogeneity in a three component normal mixture model. It is well-known that the LRT in this setting exhibits non-standard asymptotic behaviour, due to non-identifiability of the model parameters and possible degeneracy of the Fisher information matrix. In fact, Liu and Shao (2004) showed that for the test of homogeneity in a two component normal mixture model with a single fixed component, the limiting distribution under the null hypothesis is an extreme value Gumbel distribution, rather than the usual chi-squared distribution in regular parametric models for which the classical Wilks' theorem applies. We wish to generalise this result to a three component normal mixture to show that similar non-standard asymptotics also occur for this model. Our approach follows closely that of Bickel and Chernoff (1993), where the relevant asymptotics of the LRT statistic were studied indirectly by first considering a certain Gaussian process associated with the testing problem. The equivalence between the process studied by Bickel and Chernoff (1993) and the LRT was later proved by Liu and Shao (2004). Consequently, they verified that the LRT statistic for this problem diverges to infinity at the rate of loglog n; a statement first conjectured in Hartigan (1985). In a similar spirit, we consider the limiting distribution of the supremum of a certain quadratic form. More precisely, the quadratic form we consider is the score statistic for the test for homogeneity in the sub-model where the mean parameters are assumed fixed. The supremum of this quadratic form is shown to have a limiting distribution of extreme value type, again with a divergence rate of loglog n. Finally, we show that the LRT statistic for the three component normal mixture model can be uniformly approximated by this quadratic form, thereby proving that the two statistics share the same limiting distribution.
APA, Harvard, Vancouver, ISO, and other styles
21

JIANG, DONGMING. "OBJECTIVE BAYESIAN TESTING AND MODEL SELECTION FOR POISSON MODELS." University of Cincinnati / OhioLINK, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1185821399.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Liu, Tuo. "Model Selection and Adaptive Lasso Estimation of Spatial Models." The Ohio State University, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=osu1500379101560737.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Wu, Jingwen. "Model-based clustering and model selection for binned data." Thesis, Supélec, 2014. http://www.theses.fr/2014SUPL0005/document.

Full text
Abstract:
This thesis studies Gaussian mixture model-based clustering approaches and criteria for model selection in binned data clustering. Fourteen binned-EM algorithms and fourteen bin-EM-CEM algorithms are developed for fourteen parsimonious Gaussian mixture models. These new algorithms combine the computation-time advantages of binning data with the simpler parameter estimation of parsimonious Gaussian mixture models. The complexities of the binned-EM and bin-EM-CEM algorithms are calculated and compared to the complexities of the EM and CEM algorithms, respectively. In order to select the right model, one which fits the data well and satisfies the clustering precision requirements within a reasonable computation time, the AIC, BIC, ICL, NEC and AWE criteria are extended to binned data clustering when the proposed binned-EM and bin-EM-CEM algorithms are used. The advantages of the different proposed methods are illustrated through experimental studies.
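As background for the criteria named above, here is a minimal sketch of choosing the number of Gaussian mixture components by BIC with standard (unbinned) EM in scikit-learn; the binned-EM and bin-EM-CEM variants developed in the thesis are not implemented here, and the data are synthetic.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic two-cluster data; this only illustrates BIC-based selection of the
# number of components, not the thesis's binned-data algorithms.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
               rng.normal(4.0, 1.0, size=(200, 2))])

bic = {}
for k in range(1, 6):
    gmm = GaussianMixture(n_components=k, covariance_type="full",
                          random_state=0).fit(X)
    bic[k] = gmm.bic(X)          # lower BIC means a better fit/complexity trade-off

best_k = min(bic, key=bic.get)   # number of components selected by BIC
```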
APA, Harvard, Vancouver, ISO, and other styles
24

Lu, Pingbo. "Calibrated Bayes factors for model selection and model averaging." The Ohio State University, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=osu1343396705.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Schnücker, Annika [Verfasser]. "Model Selection Methods for Panel Vector Autoregressive Models / Annika Schnücker." Berlin : Freie Universität Berlin, 2018. http://d-nb.info/1176708147/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Lipkovich, Ilya A. "Bayesian Model Averaging and Variable Selection in Multivariate Ecological Models." Diss., Virginia Tech, 2002. http://hdl.handle.net/10919/11045.

Full text
Abstract:
Bayesian Model Averaging (BMA) is a new area in modern applied statistics that provides data analysts with an efficient tool for discovering promising models and obtaining estimates of their posterior probabilities via Markov chain Monte Carlo (MCMC). These probabilities can be further used as weights for model averaged predictions and estimates of the parameters of interest. As a result, variance components due to model selection are estimated and accounted for, contrary to the practice of conventional data analysis (such as, for example, stepwise model selection). In addition, variable activation probabilities can be obtained for each variable of interest. This dissertation is aimed at connecting BMA and various ramifications of the multivariate technique called Reduced-Rank Regression (RRR). In particular, we are concerned with Canonical Correspondence Analysis (CCA) in ecological applications where the data are represented by a site by species abundance matrix with site-specific covariates. Our goal is to incorporate the multivariate techniques, such as Redundancy Analysis and Canonical Correspondence Analysis, into the general machinery of BMA, taking into account such complicating phenomena as outliers and clustering of observations within a single data-analysis strategy. Traditional implementations of model averaging are concerned with selection of variables. We extend the methodology of BMA to selection of subgroups of observations and implement several approaches to cluster and outlier analysis in the context of the multivariate regression model. The proposed algorithm of cluster analysis can accommodate restrictions on the resulting partition of observations when some of them form sub-clusters that have to be preserved when larger clusters are formed.
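The dissertation's BMA machinery is MCMC-based; purely as an illustration of weighting models by posterior probability and computing variable activation probabilities, here is a sketch using the common BIC approximation to posterior model probabilities over all subsets of a small hypothetical linear model.

```python
import numpy as np
import statsmodels.api as sm
from itertools import combinations

def bic_model_probs(y, X, names):
    """Approximate posterior model probabilities over all non-empty subsets via BIC."""
    bics = {}
    p = X.shape[1]
    for k in range(1, p + 1):
        for subset in combinations(range(p), k):
            fit = sm.OLS(y, sm.add_constant(X[:, subset])).fit()
            bics[tuple(names[j] for j in subset)] = fit.bic
    vals = np.array(list(bics.values()))
    w = np.exp(-0.5 * (vals - vals.min()))
    return dict(zip(bics.keys(), w / w.sum()))

# Hypothetical data with three candidate covariates.
rng = np.random.default_rng(2)
X = rng.normal(size=(150, 3))
y = 0.5 + 1.5 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(size=150)
probs = bic_model_probs(y, X, ["x1", "x2", "x3"])

# Activation probability of x1: total probability of the models that include it.
p_x1 = sum(pr for model, pr in probs.items() if "x1" in model)
```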
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
27

Camehl, Annika [Verfasser]. "Model Selection Methods for Panel Vector Autoregressive Models / Annika Schnücker." Berlin : Freie Universität Berlin, 2018. http://d-nb.info/1176708147/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Dhurandhar, Amit. "Semi-analytical method for analyzing models and model selection measures." [Gainesville, Fla.] : University of Florida, 2009. http://purl.fcla.edu/fcla/etd/UFE0024733.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Goşoniu, Nicoleta Francisca. "On model selection in additive regression /." Zürich : ETH, 2008. http://e-collection.ethbib.ethz.ch/show?type=diss&nr=17637.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Lui, Hon-kwong, and 呂漢光. "An econometric model of spouse selection." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1996. http://hub.hku.hk/bib/B30110750.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Luo, Ye Ph D. Massachusetts Institute of Technology. "High-dimensional econometrics and model selection." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/98686.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Economics, 2015.
Title as it appears in MIT Commencement Exercises program, June 5, 2015: Essays in high-dimensional econometrics and model selection. Cataloged from PDF version of thesis.
Includes bibliographical references.
This dissertation consists of three chapters. Chapter 1 proposes a new method to solve the many-moment problem: in the Generalized Method of Moments (GMM), when the number of moment conditions is comparable to or larger than the sample size, the traditional methods lead to biased estimators. We propose a LASSO-based selection procedure in order to choose the informative moments and then, using the selected moments, conduct optimal GMM. This method can significantly reduce the bias of the optimal GMM estimator while retaining most of the information in the full set of moments. We establish theoretical asymptotics of the LASSO and post-LASSO estimators. The formulation of LASSO is a convex optimization problem and thus the computational cost is low compared to all existing alternative moment selection procedures. We propose penalty terms using data-driven methods, whose calculation is carried out by a non-trivial adaptive algorithm. In Chapter 2, we consider partially identified models with many inequalities. Under such circumstances, existing inference procedures may break down asymptotically and are computationally difficult to conduct. We first propose a combinatorial method to select the informative inequalities in the Core Determining Class problem, in which a large set of linear inequalities is generated from a bipartite graph. Our method selects the set of irredundant inequalities and outperforms all existing methods in both shrinking the number of inequalities and computational speed. We further consider a more general problem with many linear inequalities. We propose an inequality selection method similar to the Dantzig selector. We establish theoretical results for such a selection method under our sparsity assumptions. Chapter 3 proposes an innovative way of reporting results in empirical analysis of economic data. Instead of reporting the Average Partial Effect, we propose to report multiple effects sorted in increasing order, as an alternative and more complete summary measure of the heterogeneity in the model. We establish asymptotics and inference for such a procedure via the functional delta method. Numerical examples and an empirical application to female labor supply using data from the 1980 U.S. Census illustrate the performance of our methods in finite samples.
by Ye Luo.
Chapter 1. Selecting informative moments via LASSO -- Chapter 2. Core determining class: construction, approximation, and inference -- Chapter 3. Summarizing partial effects beyond averages.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
32

McGrory, Clare Anne. "Variational approximations in Bayesian model selection." Thesis, University of Glasgow, 2005. http://theses.gla.ac.uk/6941/.

Full text
Abstract:
The research presented in this thesis is on the topic of the Bayesian approach to statistical inference. In particular it focuses on the analysis of mixture models. Mixture models are a useful tool for representing complex data and are widely applied in many areas of statistics (see, for example, Titterington et al. (1985)). The representation of mixture models as missing data models is often useful as it makes more techniques of inference available to us. In addition, it allows us to introduce further dependencies within the mixture model hierarchy leading to the definition of the hidden Markov model and the hidden Markov random field model (see Titterington (1990)). Chapter 1 introduces the main themes of the thesis. It provides an overview of variational methods for approximate Bayesian inference and describes the Deviance Information Criterion for Bayesian model selection. Chapter 2 reviews the theory of finite mixture models and extends the variational approach and the Deviance Information Criterion to mixtures of Gaussians. Chapter 3 examines the use of the variational approximation for general mixtures of exponential family models and considers the specific application to mixtures of Poisson and Exponential densities. Chapter 4 describes how the variational approach can be used in the context of hidden Markov models. It also describes how the Deviance Information Criterion can be used for model selection with this class of model. Chapter 5 explores the use of variational Bayes and the Deviance Information Criterion in hidden Markov random field analysis. In particular, the focus is on the application to image analysis. Chapter 6 summarises the research presented in this thesis and suggests some possible avenues of future development. The material in chapter 2 was presented at the ISBA 2004 world conference in Viña del Mar, Chile and was awarded a prize for best student presentation.
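For readers unfamiliar with the Deviance Information Criterion mentioned in this abstract, its textbook definition is reproduced below; this is the standard form, not a result specific to the thesis.

```latex
% Deviance Information Criterion (standard definition)
\[
  D(\theta) = -2 \log p(y \mid \theta), \qquad
  p_D = \overline{D(\theta)} - D(\bar{\theta}), \qquad
  \mathrm{DIC} = \overline{D(\theta)} + p_D,
\]
% where \overline{D(\theta)} is the posterior mean deviance and \bar{\theta} is the
% posterior mean of the parameters; smaller DIC indicates a preferred model.
```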
APA, Harvard, Vancouver, ISO, and other styles
33

Arledge, Christopher S. "Cosmological Model Selection and Akaike’s Criterion." Ohio University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1430478203.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Lui, Hon-kwong. "An econometric model of spouse selection /." Hong Kong : University of Hong Kong, 1996. http://sunzi.lib.hku.hk/hkuto/record.jsp?B16027450.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Zhang, Tao. "Discrepancy-based algorithms for best-subset model selection." Diss., University of Iowa, 2013. https://ir.uiowa.edu/etd/4800.

Full text
Abstract:
The selection of a best-subset regression model from a candidate family is a common problem that arises in many analyses. In best-subset model selection, we consider all possible subsets of regressor variables; thus, numerous candidate models may need to be fit and compared. One of the main challenges of best-subset selection arises from the size of the candidate model family: specifically, the probability of selecting an inappropriate model generally increases as the size of the family increases. For this reason, it is usually difficult to select an optimal model when best-subset selection is attempted based on a moderate to large number of regressor variables. Model selection criteria are often constructed to estimate discrepancy measures used to assess the disparity between each fitted candidate model and the generating model. The Akaike information criterion (AIC) and the corrected AIC (AICc) are designed to estimate the expected Kullback-Leibler (K-L) discrepancy. For best-subset selection, both AIC and AICc are negatively biased, and the use of either criterion will lead to overfitted models. To correct for this bias, we introduce a criterion AICi, which has a penalty term evaluated from Monte Carlo simulation. A multistage model selection procedure AICaps, which utilizes AICi, is proposed for best-subset selection. In the framework of linear regression models, the Gauss discrepancy is another frequently applied measure of proximity between a fitted candidate model and the generating model. Mallows' conceptual predictive statistic (Cp) and the modified Cp (MCp) are designed to estimate the expected Gauss discrepancy. For best-subset selection, Cp and MCp exhibit negative estimation bias. To correct for this bias, we propose a criterion CPSi that again employs a penalty term evaluated from Monte Carlo simulation. We further devise a multistage procedure, CPSaps, which selectively utilizes CPSi. In this thesis, we consider best-subset selection in two different modeling frameworks: linear models and generalized linear models. Extensive simulation studies are compiled to compare the selection behavior of our methods and other traditional model selection criteria. We also apply our methods to a model selection problem in a study of bipolar disorder.
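As a point of reference for the criteria discussed above, here is a minimal best-subset sketch that scores every candidate subset by AIC and the closed-form AICc for Gaussian linear models; it does not implement the chapter's AICi or CPSi criteria, whose penalty terms require Monte Carlo simulation, and the data are hypothetical.

```python
import numpy as np
import statsmodels.api as sm
from itertools import combinations

def aic_aicc(fit, n):
    """AIC and small-sample corrected AICc; k counts regression coefficients plus sigma^2."""
    k = fit.df_model + fit.k_constant + 1        # slopes + intercept + error variance
    aic = -2.0 * fit.llf + 2.0 * k
    aicc = aic + 2.0 * k * (k + 1.0) / (n - k - 1.0)
    return aic, aicc

# Hypothetical best-subset search over five candidate regressors.
rng = np.random.default_rng(3)
n, p = 40, 5
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 1] + rng.normal(size=n)

scores = {}
for k in range(1, p + 1):
    for subset in combinations(range(p), k):
        fit = sm.OLS(y, sm.add_constant(X[:, subset])).fit()
        scores[subset] = aic_aicc(fit, n)

best_by_aicc = min(scores, key=lambda s: scores[s][1])   # subset with the smallest AICc
```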
APA, Harvard, Vancouver, ISO, and other styles
36

Maiti, Dipayan. "Multiset Model Selection and Averaging, and Interactive Storytelling." Diss., Virginia Tech, 2012. http://hdl.handle.net/10919/28563.

Full text
Abstract:
The Multiset Sampler [Leman et al., 2009] has previously been deployed and developed for efficient sampling from complex stochastic processes. We extend the sampler and the surrounding theory to model selection problems. In such problems efficient exploration of the model space becomes a challenge since independent and ad-hoc proposals might not be able to jointly propose multiple parameter sets which correctly explain a new proposed model. In order to overcome this we propose a multiset on the model space to enable efficient exploration of multiple model modes with almost no tuning. The Multiset Model Selection (MSMS) framework is based on independent priors for the parameters and model indicators on variables. We show that posterior model probabilities can be easily obtained from multiset averaged posterior model probabilities in MSMS. We also obtain typical Bayesian model averaged estimates for the parameters from MSMS. We apply our algorithm to linear regression where it allows easy moves between parameter modes of different models, and in probit regression where it allows jumps between widely varying model specific covariance structures in the latent space of a hierarchical model. The Storytelling algorithm [Kumar et al., 2006] constructs stories by discovering and connecting latent connections between documents in a network. Such automated algorithms often do not agree with the user's mental map of the data. Hence systems that incorporate feedback through visual interaction from the user are of immediate importance. We propose a visual analytic framework in which such interactions are naturally incorporated into the existing Storytelling algorithm through a redefinition of the latent topic space used in the similarity measure of the network. The document network can be explored using the newly learned normalized topic weights for each document. Hence our algorithm augments the limitations of human sensemaking capabilities in large document networks by providing a collaborative framework between the underlying model and the user. Our formulation of the problem is a supervised topic modeling problem where the supervision is based on relationships imposed by the user as a set of inequalities derived from tolerances on edge costs from the inverse shortest path problem. We show a probabilistic modeling of the relationships based on auxiliary variables and propose a Gibbs sampling based strategy. We provide detailed results from simulated data and the Atlantic Storm data set.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
37

Wenren, Cheng. "Mixed Model Selection Based on the Conceptual Predictive Statistic." Bowling Green State University / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1403735738.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Pan, Juming. "Adaptive LASSO For Mixed Model Selection via Profile Log-Likelihood." Bowling Green State University / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1466633921.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Smith, Andrew Korb. "New results in dimension reduction and model selection." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/22586.

Full text
Abstract:
Thesis (Ph. D.)--Industrial and Systems Engineering, Georgia Institute of Technology, 2008.
Committee Chair: Huo, Xiaoming; Committee Member: Serban, Nicoleta; Committee Member: Shapiro, Alexander; Committee Member: Yuan, Ming; Committee Member: Zha, Hongyuan.
APA, Harvard, Vancouver, ISO, and other styles
40

Sommer, Julia C. [Verfasser]. "Regularized estimation and model selection in compartment models / Julia C. Sommer." München : Verlag Dr. Hut, 2013. http://d-nb.info/1037286790/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Bakir, Mehmet Emin. "Automatic selection of statistical model checkers for analysis of biological models." Thesis, University of Sheffield, 2017. http://etheses.whiterose.ac.uk/20216/.

Full text
Abstract:
Statistical Model Checking (SMC) blends the speed of simulation with the rigorous analytical capabilities of model checking, and its success has prompted researchers to implement a number of SMC tools whose availability provides flexibility and fine-tuned control over model analysis. However, each tool has its own practical limitations, and different tools have different requirements and performance characteristics. The performance of different tools may also depend on the specific features of the input model or the type of query to be verified. Consequently, choosing the most suitable tool for verifying any given model requires a significant degree of experience, and in most cases, it is challenging to predict the right one. The aim of our research has been to simplify the model checking process for researchers in biological systems modelling by simplifying and rationalising the model selection process. This has been achieved through delivery of the various key contributions listed below. • We have developed a software component for verification of kernel P (kP) system models, using the NuSMV model checker. We integrated it into a larger software platform (www.kpworkbench.org). • We surveyed five popular SMC tools, comparing their modelling languages, external dependencies, expressibility of specification languages, and performance. To the best of our knowledge, this is the first known attempt to categorise the performance of SMC tools based on the commonly used property specifications (property patterns) for model checking. • We have proposed a set of model features which can be used for predicting the fastest SMC tool for biological model verification, and have shown, moreover, that the proposed features both reduce computation time and increase predictive power. • We used machine learning algorithms for predicting the fastest SMC tool for verification of biological models, and have shown that this approach can successfully predict the fastest SMC tool with over 90% accuracy. • We have developed a software tool, SMC Predictor, that predicts the fastest SMC tool for a given model and property query, and have made this freely available to the wider research community (www.smcpredictor.com). Our results show that using our methodology can generate significant savings in the amount of time and resources required for model verification.
APA, Harvard, Vancouver, ISO, and other styles
42

Smith, Connor James. "Resampling Based Model Selection for Correlated and Complex Data." Thesis, The University of Sydney, 2022. https://hdl.handle.net/2123/27428.

Full text
Abstract:
Variable selection is a key component of regression modelling, but slight changes to the initial data can result in changes to the models identified. In this thesis, we identify and examine multiple problems within the variable selection space and show how, through the use of stability-based approaches, we can construct solutions where statistical frameworks are currently lacking. At its core, this thesis tackles complex data in a generalized linear model (GLM) framework, in both robust and higher-dimensional settings. We target three main aspects: - The inability to use exhaustive variable selection approaches within the robust generalized linear model space. - The struggles of stable variable selection for omics micro-array data where the number of variables is significantly larger than the total number of observations. - Extracting information from multiple penalized regression solution paths to classify variables into different classes through both automated and visual classification. In Chapter 1, we provide an overview of variable selection methods with the main focus placed on GLMs. In Chapter 2, we bring variable selection methods in the robust GLM space closer to the gold standard of the exhaustive search through the new RobStab (Robust Stability) framework. In Chapter 3, we propose a novel stability-based variable selection method, VIVID (VIsualisation of Variable Importance Differences), for omics GLM data. In Chapter 4, we expand upon the use of a single tuning parameter within penalized regression for variable selection with the new method ParSPaS. In Chapter 5, we make some final remarks and conclude the thesis. For all proposed methods, we provide publicly available computational implementations through R.
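To illustrate the general stability idea behind resampling-based selection (not RobStab, VIVID, or ParSPaS themselves, which the author provides in R), here is a minimal Python sketch that records how often each variable survives a lasso fit across bootstrap resamples; the data and threshold are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.utils import resample

def selection_frequencies(X, y, n_boot=100, tol=1e-6, seed=0):
    """Bootstrap selection proportions: how often each variable is kept by the lasso."""
    rng = np.random.RandomState(seed)
    counts = np.zeros(X.shape[1])
    for _ in range(n_boot):
        Xb, yb = resample(X, y, random_state=rng)
        coef = LassoCV(cv=5).fit(Xb, yb).coef_
        counts += np.abs(coef) > tol
    return counts / n_boot            # stable variables have proportions near one

# Hypothetical sparse example: only the first and fifth variables matter.
rng = np.random.default_rng(6)
X = rng.normal(size=(120, 10))
y = 2.0 * X[:, 0] - 1.5 * X[:, 4] + rng.normal(size=120)
freq = selection_frequencies(X, y, n_boot=50)
```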
APA, Harvard, Vancouver, ISO, and other styles
43

Mu, He Qing. "Bayesian model class selection on regression problems." Thesis, University of Macau, 2010. http://umaclib3.umac.mo/record=b2492988.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Coleman, Kimberley. "A new capture-recapture model selection criterion /." Thesis, McGill University, 2007. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=101841.

Full text
Abstract:
Capture-recapture methods are used to estimate population size from overlapping, incomplete sources of information. With three or more sources, dependence between sources may be modelled using log-linear models. We propose a Coefficient of Incremental Dependence Criterion (CIDC) for selecting an estimate of population size among all possible estimates that result from hierarchical log-linear models. A penalty for the number of parameters in the model was selected via simulation for the three-source and four-source settings. The performance of the proposed criterion was compared to the Akaike Information Criterion (AIC) through simulation. The CIDC was found to modestly outperform the AIC for data generated from a population size of approximately 100, with AIC performing consistently better for larger population sizes. Modifications to the criterion such as incorporating the estimated population size and the type of source interaction present should be investigated, along with the mathematical properties of the CIDC.
APA, Harvard, Vancouver, ISO, and other styles
45

Ning, Hoi-Kwan Flora. "Model-based regression clustering with variable selection." Thesis, University of Oxford, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.497059.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Hildebrand, Annelize. "Model selection." Diss., 1995. http://hdl.handle.net/10500/16951.

Full text
Abstract:
In developing an understanding of real-world problems, researchers develop mathematical and statistical models. Various model selection methods exist which can be used to obtain a mathematical model that best describes the real-world situation in some or other sense. These methods aim to assess the merits of competing models by concentrating on a particular criterion. Each selection method is associated with its own criterion and is named accordingly. The better known ones include Akaike's Information Criterion, Mallows' Cp and cross-validation, to name a few. The value of the criterion is calculated for each model and the model corresponding to the minimum value of the criterion is then selected as the "best" model.
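To make the "calculate the criterion for each model and select the minimum" recipe in this abstract concrete, here is a minimal sketch of Mallows' Cp over all subsets of a hypothetical linear model; the same generic pattern applies to AIC or cross-validation scores.

```python
import numpy as np
import statsmodels.api as sm
from itertools import combinations

def mallows_cp(y, X_subset, sigma2_full, n):
    """Mallows' Cp for one candidate subset, anchored on the full model's error variance."""
    fit = sm.OLS(y, sm.add_constant(X_subset)).fit()
    p = X_subset.shape[1] + 1                 # coefficients including the intercept
    return fit.ssr / sigma2_full + 2 * p - n

# Hypothetical data with four candidate regressors.
rng = np.random.default_rng(4)
n = 80
X = rng.normal(size=(n, 4))
y = 1.0 + 0.8 * X[:, 0] + rng.normal(size=n)

full = sm.OLS(y, sm.add_constant(X)).fit()
sigma2_full = full.ssr / full.df_resid        # error variance estimate from the full model

cp = {s: mallows_cp(y, X[:, s], sigma2_full, n)
      for k in range(1, 5) for s in combinations(range(4), k)}
best_subset = min(cp, key=cp.get)             # the model with the smallest Cp is selected
```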
Mathematical Sciences
M. Sc. (Statistics)
APA, Harvard, Vancouver, ISO, and other styles
47

Hu, Chin-Yen, and 胡智彥. "Model selection for two part models." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/14583065088945422872.

Full text
Abstract:
Master's thesis
National Yunlin University of Science and Technology
Department of Finance
Academic year 90 (ROC calendar)
In this study, we aim to identify which model is robust under different data distributions, especially when facing censored or Tobit-like variables. To investigate this, we examine two competing models: the lognormal model and Cragg's model. Using simulated data from two different distributions, we apply Vuong's model selection tests for competing hurdle (two-tier) models. In these simulations, we find that Cragg's model is more robust than the lognormal model. We therefore compare it with the traditional Tobit model as an alternative for research on R&D expenditure. After testing with the KLIC rule on real data, we find that Cragg's model is more suitable than the Tobit model for this data set.
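Vuong's test, referred to above, compares non-nested models via pointwise log-likelihood contributions; a minimal sketch of the uncorrected statistic is given below, assuming the two fitted models can supply per-observation log-likelihoods (this is an editorial illustration, not the thesis's own code).

```python
import numpy as np
from scipy import stats

def vuong_statistic(loglik1, loglik2):
    """Uncorrected Vuong statistic: large positive values favour model 1, negative model 2."""
    m = np.asarray(loglik1) - np.asarray(loglik2)   # pointwise log-likelihood ratios
    n = len(m)
    v = np.sqrt(n) * m.mean() / m.std()             # approximately standard normal under the null
    p_value = 2.0 * (1.0 - stats.norm.cdf(abs(v)))  # two-sided p-value
    return v, p_value

# Usage sketch: loglik1 and loglik2 would be per-observation log-likelihoods from,
# e.g., a fitted Cragg two-part model and a fitted lognormal (or Tobit) model.
```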
APA, Harvard, Vancouver, ISO, and other styles
48

江支耀. "Model selection in regression models with heteroscedasticity." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/27534234750910615312.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Chang, Le. "Essays on Robust Model Selection and Model Averaging for Linear Models." Phd thesis, 2017. http://hdl.handle.net/1885/139176.

Full text
Abstract:
Model selection is central to all applied statistical work. Selecting the variables for use in a regression model is one important example of model selection. This thesis is a collection of essays on robust model selection procedures and model averaging for linear regression models. In the first essay, we propose robust Akaike information criteria (AIC) for MM-estimation and an adjusted robust scale based AIC for M and MM-estimation. Our proposed model selection criteria can maintain their robust properties in the presence of a high proportion of outliers and outliers in the covariates. We compare our proposed criteria with other robust model selection criteria discussed in previous literature. Our simulation studies demonstrate a significant outperformance of robust AIC based on MM-estimation in the presence of outliers in the covariates. The real data example also shows a better performance of robust AIC based on MM-estimation. The second essay focuses on robust versions of the "Least Absolute Shrinkage and Selection Operator" (lasso). The adaptive lasso is a method for performing simultaneous parameter estimation and variable selection. The adaptive weights used in its penalty term mean that the adaptive lasso achieves the oracle property. In this essay, we propose an extension of the adaptive lasso named the Tukey-lasso. By using Tukey's biweight criterion, instead of squared loss, the Tukey-lasso is resistant to outliers in both the response and covariates. Importantly, we demonstrate that the Tukey-lasso also enjoys the oracle property. A fast accelerated proximal gradient (APG) algorithm is proposed and implemented for computing the Tukey-lasso. Our extensive simulations show that the Tukey-lasso, implemented with the APG algorithm, achieves very reliable results, including for high-dimensional data where p>n. In the presence of outliers, the Tukey-lasso is shown to offer substantial improvements in performance compared to the adaptive lasso and other robust implementations of the lasso. Real data examples further demonstrate the utility of the Tukey-lasso. In many statistical analyses, a single model is used for statistical inference, ignoring the process that leads to the model being selected. To account for this model uncertainty, many model averaging procedures have been proposed. In the last essay, we propose an extension of a bootstrap model averaging approach, called bootstrap lasso averaging (BLA). BLA utilizes the lasso for model selection. This is in contrast to other forms of bootstrap model averaging that use AIC or Bayesian information criteria (BIC). The use of the lasso improves the computation speed and allows BLA to be applied even when the number of variables p is larger than the sample size n. Extensive simulations confirm that BLA has outstanding finite sample performance, in terms of both variable selection and prediction accuracy, compared with traditional model selection and model averaging methods. Several real data examples further demonstrate an improved out-of-sample predictive performance of BLA.
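The classical (non-robust) adaptive lasso discussed in this abstract can be implemented by rescaling columns with pilot-estimate weights; a minimal sketch is below. It is not the Tukey-lasso or the APG algorithm of the thesis, and the pilot estimator, tuning choices and data are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

def adaptive_lasso(X, y, gamma=1.0):
    """Plain adaptive lasso via column rescaling, with weights from an OLS pilot fit."""
    beta_init = LinearRegression().fit(X, y).coef_
    w = np.abs(beta_init) ** gamma + 1e-8    # adaptive weights; small constant avoids divide-by-zero
    lasso = LassoCV(cv=5).fit(X * w, y)      # lasso on rescaled columns == weighted penalty
    return lasso.coef_ * w                   # map coefficients back to the original scale

# Hypothetical sparse example with ten candidate covariates.
rng = np.random.default_rng(5)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 4] + rng.normal(size=200)

coef = adaptive_lasso(X, y)
selected = np.flatnonzero(np.abs(coef) > 1e-6)   # indices retained by the adaptive lasso
```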
APA, Harvard, Vancouver, ISO, and other styles
50

Wu, Ming-chuan, and 吳旻娟. "Developing a selection model for logistics strategy selection." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/29460189068441130214.

Full text
Abstract:
Master's thesis
National Kaohsiung First University of Science and Technology
Graduate Institute of Transportation and Warehousing Operations
Academic year 93 (ROC calendar)
In recent years, with rapid changes in industrial structure and rising consumer demand, logistics has become more important than before. For any enterprise, logistics operations capability has become an important factor in its success. This research defines a logistics strategy as the development path an enterprise chooses for carrying out its logistics activities. The research uses a Fuzzy AHP approach to develop a selection model for logistics strategy: it obtains weight values for the aspects and indicators of the factors considered in each logistics strategy, and combines these with a Fuzzy Synthetic Evaluation Model to calculate an overall evaluation of each alternative logistics strategy, helping decision makers compare the alternatives through systematic indicators. Not only can we obtain the evaluation of each strategy, but also the result of each indicator within every aspect. Finally, statistical tests are applied to explore the relationships among the enterprise's characteristics, the consideration factors of the logistics strategy, and the logistics strategy execution factors. The main contributions of this research are to classify and reorganise the related literature on logistics strategy and to propose a view of logistics strategy classification. This research aims to assist enterprises in making future considerations and references for logistics-strategy-related decisions.
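As a rough illustration of how AHP-style weights are derived (crisp AHP only; the fuzzy extension and the fuzzy synthetic evaluation step used in the thesis are not shown), here is a minimal sketch with a hypothetical pairwise comparison matrix for three logistics-strategy criteria.

```python
import numpy as np

def ahp_weights(pairwise):
    """Crisp AHP priority weights: normalized principal eigenvector of the comparison matrix."""
    A = np.asarray(pairwise, dtype=float)
    eigvals, eigvecs = np.linalg.eig(A)
    principal = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
    w = np.abs(principal)                 # eigenvector sign is arbitrary; weights must be positive
    return w / w.sum()

# Hypothetical 3x3 pairwise comparison of criteria (e.g. cost, service level, flexibility).
A = [[1.0, 3.0, 5.0],
     [1/3, 1.0, 2.0],
     [1/5, 1/2, 1.0]]
weights = ahp_weights(A)                  # relative importance weights that sum to one
```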
APA, Harvard, Vancouver, ISO, and other styles
