Dissertations / Theses on the topic 'Parametric regression models'

Consult the top 50 dissertations / theses for your research on the topic 'Parametric regression models.'

You can also download the full text of each publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Li, Lingzhu. "Model checking for general parametric regression models." HKBU Institutional Repository, 2019. https://repository.hkbu.edu.hk/etd_oa/654.

Full text
Abstract:
Model checking for regressions has drawn considerable attention in the last three decades. Compared with global smoothing tests, local smoothing tests, which are more sensitive to high-frequency alternatives, can only detect local alternatives distinct from the null model at a much slower rate when the dimension of the predictor is high. When the number of covariates is large, the nonparametric estimation used in local smoothing tests lacks efficiency, and the corresponding tests then have trouble maintaining the significance level and detecting the alternatives. To tackle this issue, we propose two methods under a high but fixed dimension framework. Further, we investigate a model checking test under divergent dimension, where the numbers of covariates and unknown parameters diverge with the sample size n.

The first proposed test is constructed upon a typical kernel-based local smoothing test using the projection method. Through projection and integration, the resulting test statistic has a closed form that depends only on the residuals and the distances between the sample points. A merit of the developed test is that the distance is easy to compute compared with kernel estimation, especially when the dimension is high. Moreover, the test inherits some features of local smoothing tests owing to its construction. Although it is ultimately similar in spirit to the Integrated Conditional Moment test, its weight function helps to collect more information from the sample than the Integrated Conditional Moment test. Simulations and real data analysis demonstrate the power of the test.

The second test, which is a synthesis of local and global smoothing tests, aims at solving the slow convergence rate caused by nonparametric estimation in local smoothing tests. A significant feature of this approach is that it allows nonparametric estimation-based tests, under the alternatives, to share the merits of existing empirical process-based tests. The proposed hybrid test can detect local alternatives at the fastest possible rate, like the empirical process-based tests, and simultaneously retains the sensitivity to high-frequency alternatives of the nonparametric estimation-based tests. This feature is achieved by utilizing an indicative dimension from the field of dimension reduction. As a by-product, we provide a systematic study of a residual-related central subspace for model adaptation, showing when alternative models can be indicated and when they cannot. Numerical studies are conducted to verify its application.

Since data volumes are increasing, the numbers of predictors and unknown parameters may diverge as the sample size n goes to infinity. Model checking under divergent dimension, however, is almost uncharted in the literature. In this thesis, an adaptive-to-model test is proposed to handle the divergent dimension based on the two previously introduced tests. Theoretical results show that, to obtain asymptotic normality of the parameter estimator, the number of unknown parameters should be of order o(n^{1/3}). As a spinoff, we also demonstrate the asymptotic properties of the estimators of the residual-related central subspace and central mean subspace under different hypotheses.
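As a rough illustration of the kind of statistic described above — one that depends only on residuals and pairwise distances of the sample points — the following R sketch builds a quadratic-form test with a distance kernel and calibrates it by a simple wild bootstrap. The kernel, the null model, and the bootstrap scheme are illustrative assumptions, not the thesis' exact construction.

```r
## A minimal sketch of an ICM-type lack-of-fit test: the statistic uses only
## residuals and pairwise distances, and its null law is approximated by a
## wild bootstrap (a simplification of the usual scheme).
set.seed(1)
n <- 200; p <- 5
X <- matrix(rnorm(n * p), n, p)
y <- drop(X %*% rep(1, p)) + 0.5 * sin(X[, 1]) + rnorm(n)  # mild deviation from linearity

e  <- residuals(lm(y ~ X))          # residuals under the null (linear) model
K  <- exp(-as.matrix(dist(X)))      # kernel built from pairwise distances only
Tn <- drop(t(e) %*% K %*% e) / n^2  # quadratic-form test statistic

B  <- 499
Tb <- replicate(B, {                # wild bootstrap with Rademacher weights
  eb <- e * sample(c(-1, 1), n, replace = TRUE)
  drop(t(eb) %*% K %*% eb) / n^2
})
mean(Tb >= Tn)                      # bootstrap p-value
```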
APA, Harvard, Vancouver, ISO, and other styles
2

Chen, Chunxia. "Semi-parametric estimation in Tobit regression models." Kansas State University, 2013. http://hdl.handle.net/2097/15300.

Full text
Abstract:
Master of Science
Department of Statistics
Weixing Song
In the classical Tobit regression model, the regression error term is often assumed to have a zero-mean normal distribution with unknown variance, and the regression function is assumed to be linear. If the normality assumption is violated, the commonly used maximum likelihood estimate becomes inconsistent. Moreover, the likelihood function becomes very complicated if the regression function is nonlinear, even when the error density is normal, which makes the maximum likelihood estimation procedure hard to implement. In the fully nonparametric setup, when both the regression function and the distribution of the error term ε are unknown, some nonparametric estimators of the regression function have been proposed. Although the assumption of knowing the distribution is strict, it is widely adopted in the Tobit regression literature and is also supported by many empirical studies in econometric research. In fact, a majority of the relevant research assumes that ε has a normal distribution with mean 0 and unknown standard deviation. In this report, we develop a semi-parametric estimation procedure for the regression function, assuming that the error term follows a distribution from a class of zero-mean symmetric location and scale families. A minimum distance procedure for estimating the parameters of the regression function when it has a specified parametric form is also constructed. Compared with the existing semiparametric and nonparametric methods in the literature, our method is more efficient in that more information, in particular the knowledge of the distribution of ε, is used. Moreover, the computation is relatively inexpensive. Given that many applications do assume that ε has a normal or other known distribution, the current work provides practical tools for statistical inference in the Tobit regression model.
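For reference, the classical Tobit model discussed above can be fitted in R as a left-censored Gaussian regression; this sketch uses simulated data and the survival package, and does not implement the thesis' semi-parametric minimum distance procedure.

```r
## Classical Tobit as a left-censored Gaussian regression via survreg();
## the data and coefficients are simulated for illustration.
library(survival)
set.seed(2)
n  <- 300
x  <- rnorm(n)
ys <- 1 + 2 * x + rnorm(n)   # latent response
y  <- pmax(ys, 0)            # observed response, censored at zero

## event = TRUE where y is observed exactly, FALSE where left-censored at 0
fit <- survreg(Surv(y, y > 0, type = "left") ~ x, dist = "gaussian")
summary(fit)                 # estimates close to (1, 2)
```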
APA, Harvard, Vancouver, ISO, and other styles
3

Delgado, Carlos Alberto Cardozo. "Semi-parametric generalized log-gamma regression models." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-15032018-185352/.

Full text
Abstract:
The central objective of this work is to develop statistical tools for semi-parametric regression models with generalized log-gamma errors in the presence of censored and uncensored observations. The estimates of the parameters are obtained through the multivariate version of the Newton-Raphson algorithm and an adequate combination of the Fisher scoring and backfitting algorithms. The properties of the penalized maximum likelihood estimators are studied through analytical tools and simulations. Some diagnostic techniques, such as quantile and deviance-type residuals as well as local influence measures, are derived. The methodologies are implemented in the statistical computing environment R, and the package sglg is developed. Finally, we give some applications of the models to real data.
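A hedged sketch of the error distribution involved: the generalized log-gamma density in one common parametrization (shape λ ≠ 0), which recovers the normal case as λ → 0. The parametrization is an assumption on our part; the thesis' fitting machinery lives in the sglg package, whose interface is not reproduced here.

```r
## Generalized log-gamma density, one common parametrization (lambda != 0):
## f(y) = |lambda| / (sigma * Gamma(k)) * k^k * exp(k * (lambda*z - exp(lambda*z))),
## with z = (y - mu)/sigma and k = lambda^(-2). Illustrative only.
dglg <- function(y, mu = 0, sigma = 1, lambda = 1) {
  z <- (y - mu) / sigma
  k <- lambda^-2
  abs(lambda) / (sigma * gamma(k)) * k^k * exp(k * (lambda * z - exp(lambda * z)))
}
integrate(dglg, -Inf, Inf, mu = 0, sigma = 1, lambda = 0.5)  # ~ 1, sanity check
curve(dglg(x, lambda = 0.5), -5, 5, ylab = "density")        # skewed error law
```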
APA, Harvard, Vancouver, ISO, and other styles
4

Peluso, Alina. "Novel regression models for discrete response." Thesis, Brunel University, 2017. http://bura.brunel.ac.uk/handle/2438/15581.

Full text
Abstract:
In a regression context, the aim is to analyse a response variable of interest conditional on a set of covariates. In many applications the response variable is discrete. Examples include the event of surviving a heart attack, the number of hospitalisation days, the number of times that individuals benefit from a health service, and so on. This thesis advances the methodology and the application of regression models with discrete response.

First, we present a difference-in-differences approach to model a binary response in a health policy evaluation framework. In particular, generalized linear mixed methods are employed to model multiple dependent outcomes in order to quantify the effect of an adopted pay-for-performance program while accounting for the heterogeneity of the data at the multiple nested levels. The results show how the policy had a positive effect on the hospitals' quality in terms of those outcomes that can be more influenced by a managerial activity.

Next, we focus on regression models for count response variables. In a parametric framework, Poisson regression is the simplest model for count data, though it is often found inadequate in real applications, particularly in the presence of excessive zeros and in the case of dispersion, i.e. when the conditional mean differs from the conditional variance. Negative Binomial regression is the standard model for over-dispersed data, but it fails in the presence of under-dispersion. Poisson-Inverse Gaussian regression can be used for over-dispersed data, Generalised-Poisson regression can be employed for under-dispersed data, and Conway-Maxwell Poisson regression can be employed in both cases, though the interpretability of these models is not straightforward and they are often computationally demanding. While jittering is the default non-parametric approach for count data, inference has to be made for each individual quantile, separate quantiles may cross, and the underlying uniform random sampling can generate instability in the estimation. These features motivate the development of a novel parametric regression model for counts via the Discrete Weibull distribution. This distribution is able to adapt to different types of dispersion relative to Poisson, and it also has the advantage of a closed-form expression for the quantiles. As well as the standard regression model, generalized linear mixed models and generalized additive models are presented via this distribution. Simulated and real data applications with different types of dispersion show a good performance of Discrete Weibull-based regression models compared with existing regression approaches for count data.
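To make the Discrete Weibull idea concrete, the sketch below assumes the type-I discrete Weibull pmf with a logit link for q and fits both parameters by maximum likelihood on simulated data; the link choice and simulation design are illustrative, not necessarily the thesis' exact specification.

```r
## Type-I discrete Weibull: P(Y = y) = q^(y^beta) - q^((y+1)^beta), y = 0,1,2,...
## Regression sketch: logit link for q, log link keeps beta > 0; MLE via optim().
ddw <- function(y, q, beta) q^(y^beta) - q^((y + 1)^beta)

set.seed(4)
n <- 500
x <- runif(n)
q <- plogis(1 + 2 * x)                      # true model: logit(q) = 1 + 2x
y <- numeric(n)
for (i in 1:n) {                            # inverse-cdf sampling, F(y) = 1 - q^((y+1)^beta)
  u    <- runif(1)
  y[i] <- ceiling((log(1 - u) / log(q[i]))^(1 / 1.5)) - 1
}
y <- pmax(y, 0)

negll <- function(par) {
  qi <- plogis(par[1] + par[2] * x)
  b  <- exp(par[3])
  -sum(log(ddw(y, qi, b)))
}
fit <- optim(c(0, 0, 0), negll)
fit$par                                     # approx. (1, 2, log 1.5)
```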
APA, Harvard, Vancouver, ISO, and other styles
5

Shadat, Wasel Bin. "Specification testing of Garch regression models." Thesis, University of Manchester, 2011. https://www.research.manchester.ac.uk/portal/en/theses/specification-testing-of-garch-regression-models(56c218db-9b91-4d8c-bf26-8377ab185c71).html.

Full text
Abstract:
This thesis analyses, derives and evaluates specification tests of Generalized Auto-Regressive Conditional Heteroskedasticity (GARCH) regression models, both univariate and multivariate. Of particular interest, in the first half of the thesis, is the derivation of robust test procedures designed to assess the Constant Conditional Correlation (CCC) assumption often employed in multivariate GARCH (MGARCH) models. New asymptotically valid conditional moment tests are proposed which are simple to construct, easily implementable following full or partial Quasi Maximum Likelihood (QML) estimation, and robust to non-normality. In doing so, a non-normality robust version of Tse's (2000) LM test is provided. In addition, new and easily programmable expressions for the expected Hessian matrix associated with the QMLE are obtained. The finite sample performances of these tests are investigated in an extensive Monte Carlo study, programmed in GAUSS.

In the second half of the thesis, attention is devoted to nonparametric testing of GARCH regression models. First, simultaneous consistent nonparametric tests of the conditional mean and conditional variance structure of univariate GARCH models are considered. The approach is developed from the Integrated Generalized Spectral (IGS) and Projected Integrated Conditional Moment (PICM) procedures proposed recently by Escanciano (2008 and 2009, respectively) for time series models. Extending Escanciano (2008), a new and simple wild bootstrap procedure is proposed to implement these tests. A Monte Carlo study compares the performance of these nonparametric tests and four parametric tests of nonlinearity and/or asymmetry under a wide range of alternatives. Although the proposed bootstrap scheme does not strictly satisfy the asymptotic requirements, the simulation results demonstrate its ability to control the size extremely well, so the power comparison seems justified. Furthermore, this suggests there may exist weaker conditions under which the tests are implementable. The simulation exercise also presents new evidence of the effect of conditional mean misspecification on various parametric tests of conditional variance. The testing procedures are also illustrated with the S&P 500 data.

Finally, the PICM and IGS approaches are extended to the MGARCH case. The procedure is illustrated with a bivariate CCC-GARCH model, but can be generalized to other MGARCH specifications. A simulation exercise shows that these tests have satisfactory size and are robust to non-normality. The marginal mean and variance tests have excellent power; however, the marginal covariance tests lack power for some alternatives.
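As background for the testing problem, the following sketch fits a GARCH(1,1) by Gaussian quasi-maximum likelihood in base R; specification tests of the kind described above would operate on residuals from a fit like this one. The parameter values and the reparametrization used for unconstrained optimization are illustrative choices.

```r
## GARCH(1,1): sigma2_t = omega + alpha * e_{t-1}^2 + beta * sigma2_{t-1},
## fitted by Gaussian quasi-maximum likelihood with optim().
set.seed(5)
n  <- 2000
e  <- numeric(n); s2 <- numeric(n)
s2[1] <- 0.1 / (1 - 0.05 - 0.90)                   # unconditional variance
e[1]  <- sqrt(s2[1]) * rnorm(1)
for (t in 2:n) {
  s2[t] <- 0.1 + 0.05 * e[t - 1]^2 + 0.90 * s2[t - 1]
  e[t]  <- sqrt(s2[t]) * rnorm(1)
}

negqll <- function(par) {                          # Gaussian quasi-log-likelihood
  w <- exp(par[1]); a <- plogis(par[2]); b <- plogis(par[3])
  h <- numeric(n); h[1] <- var(e)
  for (t in 2:n) h[t] <- w + a * e[t - 1]^2 + b * h[t - 1]
  0.5 * sum(log(h) + e^2 / h)
}
fit <- optim(c(log(0.1), qlogis(0.05), qlogis(0.9)), negqll)
c(omega = exp(fit$par[1]), alpha = plogis(fit$par[2]), beta = plogis(fit$par[3]))
```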
APA, Harvard, Vancouver, ISO, and other styles
6

Espigolan, Rafael [UNESP]. "Parametric and semi-parametric models for predicting genomic breeding values of complex traits in Nelore cattle." Universidade Estadual Paulista (UNESP), 2017. http://hdl.handle.net/11449/149846.

Full text
Abstract:
Animal breeding aims to improve the economic productivity of future generations of domestic species through selection. Most traits of economic interest in livestock have a complex and quantitative expression, i.e. they are influenced by a large number of genes and affected by environmental factors. Statistical analysis of phenotypes and pedigree information allows estimating the breeding values of the selection candidates based on the infinitesimal model. A large amount of genomic data is now available for the identification and selection of genetically superior individuals, with the potential to increase the accuracy of prediction of genetic values and thus the efficiency of animal breeding programs. Numerous studies have been conducted in order to identify appropriate methodologies for specific breeds and traits, which will result in more accurate genomic estimated breeding values (GEBVs). Therefore, the objective of this study was to verify the possibility of applying semi-parametric models to genomic selection and to compare their prediction ability with that of parametric models for real (carcass, meat quality, growth and reproductive traits) and simulated data. The phenotypic and pedigree information used was provided by eleven farms belonging to four animal breeding programs. For carcass and meat quality traits, the data set contained 3,643 records for rib eye area (REA), 3,619 records for backfat thickness (BFT), 3,670 records for meat tenderness (TEN) and 3,378 observations for hot carcass weight (HCW). A total of 825,364 records for yearling weight (YW) and 166,398 for age at first calving (AFC) were used for the growth and reproductive traits of Nelore cattle. Genotypes of 2,710, 2,656, 2,749, 2,495, 4,455 and 1,760 animals were available for REA, BFT, TEN, HCW, YW and AFC, respectively. After quality control, approximately 450,000 single nucleotide polymorphisms (SNP) remained. The methods of analysis were genomic BLUP (GBLUP), single-step GBLUP (ssGBLUP), Bayesian LASSO (BL) and the semi-parametric approaches Reproducing Kernel Hilbert Spaces (RKHS) regression and Kernel Averaging (KA). A five-fold cross-validation with thirty random replicates was carried out, and the models were compared in terms of their prediction mean squared error (MSE) and accuracy of prediction (ACC). The ACC ranged from 0.39 to 0.40 (REA), 0.38 to 0.41 (BFT), 0.23 to 0.28 (TEN), 0.33 to 0.35 (HCW), 0.36 to 0.51 (YW) and 0.49 to 0.56 (AFC). For all traits, the GBLUP and BL models showed very similar prediction accuracies. For REA, BFT and HCW, all models provided similar prediction accuracies; however, RKHS regression had the best fit across traits compared to KA. For traits with a larger number of phenotyped animals than genotyped animals (YW and AFC), ssGBLUP is indicated. Judged by overall performance across all traits, RKHS regression is particularly appealing for application in genomic selection, especially for low-heritability traits. Simulated genotypes, pedigree and phenotypes for four traits A, B, C and D were obtained using heritabilities based on the real data (0.09, 0.12, 0.36 and 0.39, respectively). The simulated genome consisted of 735,293 markers and 1,000 QTLs randomly distributed over 29 pairs of autosomes, with lengths varying from 40 to 146 centimorgans (cM), totaling 2,333 cM. It was assumed that the QTLs explained 100% of the genetic variance. Considering minor allele frequencies greater than or equal to 0.01, a total of 430,000 markers were randomly selected. The phenotypes were generated by adding residuals, randomly drawn from a normal distribution with mean zero, to the true breeding values, and the whole simulation process was replicated 10 times. ACC was quantified using correlations between the predicted genomic breeding values and the true breeding values simulated for generations 12 to 15. The average linkage disequilibrium, measured between pairs of adjacent markers for all simulated traits, was 0.21 for the recent generations (12, 13 and 14) and 0.22 for generation 15. The ACC for the simulated traits A, B, C and D ranged from 0.43 to 0.44, 0.47 to 0.48, 0.80 to 0.82 and 0.72 to 0.73, respectively. The different genomic selection methodologies implemented in this study showed similar accuracies of prediction, and the optimal method was sometimes trait dependent. In general, RKHS regressions were preferable in terms of ACC and provided the smallest MSE estimates compared to the other models.
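A minimal GBLUP sketch on simulated genotypes, assuming known variance components: breeding values for a held-out fold are predicted from a genomic relationship matrix, and ACC is computed as the correlation between predictions and phenotypes in that fold, mirroring the cross-validation design described above.

```r
## GBLUP as kernel ridge prediction with a genomic relationship matrix G;
## genotypes, heritability (h2 = 0.4) and fold size are simulated/assumed.
set.seed(6)
n <- 400; m <- 1000                          # animals, markers
M <- matrix(rbinom(n * m, 2, 0.3), n, m)     # 0/1/2 genotype codes
Z <- scale(M)                                # centred, standardised markers
G <- tcrossprod(Z) / m                       # genomic relationship matrix
u <- drop(Z %*% rnorm(m, 0, sqrt(0.4 / m)))  # true breeding values
y <- u + rnorm(n, 0, sqrt(0.6))              # phenotypes

test  <- sample(n, 80)                       # one cross-validation fold
train <- setdiff(1:n, test)
lam   <- 0.6 / 0.4                           # sigma2_e / sigma2_u, assumed known
K     <- G[train, train] + lam * diag(length(train))
uhat  <- drop(G[test, train] %*% solve(K, y[train] - mean(y[train])))
cor(uhat, y[test])                           # ACC in the held-out fold
```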
FAPESP: 2014/00779-0
FAPESP: 2015/13084-3
APA, Harvard, Vancouver, ISO, and other styles
7

Wang, Sejong. "Three nonparametric specification tests for parametric regression models : the kernel estimation approach." Connect to resource, 1994. http://rave.ohiolink.edu/etdc/view.cgi?acc%5Fnum=osu1261492759.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Mostafa, Abdelelah M. "Regression approach to software reliability models." [Tampa, Fla] : University of South Florida, 2006. http://purl.fcla.edu/usf/dc/et/SFE0001648.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Läuter, Henning. "Estimation in partly parametric additive Cox models." Universität Potsdam, 2003. http://opus.kobv.de/ubp/volltexte/2011/5150/.

Full text
Abstract:
The dependence between survival times and covariates is described, e.g., by proportional hazard models. We consider partly parametric Cox models and discuss here the estimation of the parameters of interest. We present the maximum likelihood approach and extend the results of Huang (1999) from linear to nonlinear parameters. Then we investigate the least squares estimation and formulate conditions for the a.s. boundedness and consistency of these estimators.
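For orientation, a partly parametric proportional hazards fit of the kind discussed can be expressed in R with a parametric term plus a penalized smooth term; the veteran data and the pspline smoother here are illustrative stand-ins, not the paper's estimators.

```r
## Partly parametric Cox model: a parametric effect (celltype) plus a
## smooth additive effect of age via survival's penalized spline term.
library(survival)
fit <- coxph(Surv(time, status) ~ celltype + pspline(age), data = veteran)
summary(fit)   # parametric coefficients plus linear/nonlinear spline parts
```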
APA, Harvard, Vancouver, ISO, and other styles
10

Masiulaitytė, Inga. "Regression and degradation models in reliability theory and survival analysis." Doctoral thesis, Lithuanian Academic Libraries Network (LABT), 2010. http://vddb.laba.lt/obj/LT-eLABa-0001:E.02~2010~D_20100527_134956-15325.

Full text
Abstract:
In this doctoral thesis, redundant systems and degradation models are considered. To ensure high reliability of important elements of a system, stand-by units can be used. These units are commuted (switched in) and operate in place of the failed main unit. The stand-by units can function under different conditions: "hot", "cold" or "warm" reserving. In the thesis, systems with "warm" stand-by units are analyzed. Hypotheses of smooth commuting are formulated and goodness-of-fit tests for these hypotheses are constructed. Nonparametric and parametric point and interval estimation procedures are given. Modeling and statistical estimation of the reliability of systems from failure time and degradation data are considered; fairly general degradation models are studied, which describe the failure intensity of units as a function of both the applied stresses and the degradation level, the latter modeled using stochastic processes.
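A heavily simplified simulation sketch of a two-unit system with one "warm" stand-by: the stand-by ages at a reduced rate until the main unit fails and is then switched in. The Weibull lifetimes, the ageing-rate approximation of smooth commuting, and all parameter values are illustrative assumptions, not the thesis' model.

```r
## Crude warm stand-by simulation: while the main unit works for time t1,
## the stand-by accumulates only r * t1 of "age"; after commutation it
## operates at full stress with the remaining life of its Weibull lifetime.
set.seed(8)
sim_system <- function(shape = 2, scale = 1, r = 0.4) {
  t1 <- rweibull(1, shape, scale)   # failure time of the main unit
  a  <- r * t1                      # age accumulated by the warm stand-by
  u  <- rweibull(1, shape, scale)   # total lifetime of the stand-by
  t1 + max(u - a, 0)                # system fails at t1 if stand-by already dead
}
lifetimes <- replicate(1e4, sim_system())
mean(lifetimes)                     # estimated mean system life
```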
APA, Harvard, Vancouver, ISO, and other styles
11

Pawar, Roshan. "Predicting bid prices in construction projects using non-parametric statistical models." [College Station, Tex. : Texas A&M University, 2007. http://hdl.handle.net/1969.1/ETD-TAMU-1464.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Helvaci, Aziz. "Comparison Of Parametric Models For Conceptual Duration Estimation Of Building Projects." Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/2/12609759/index.pdf.

Full text
Abstract:
Estimation of construction durations is a crucial part of project planning, as several key decisions are based on the estimated durations. In general, construction durations are estimated by using planning and scheduling techniques such as Gantt or bar charts, the Critical Path Method (CPM), and the Program Evaluation and Review Technique (PERT). However, these techniques usually require detailed design information for the estimation of activity durations and the determination of the sequencing of activities. In some cases, pre-design duration estimates may be performed using these techniques; however, the accuracy of such estimates depends mainly on the experience of the planning engineer. This study aims to develop and compare alternative methods for conceptual duration estimation of building constructions using the basic information available at the early stages of a project. Five parametric duration estimation models are developed with data from 17 building projects constructed by a contractor in the United States. Regression analysis and artificial neural networks are used in the development of these five duration estimation models. A parametric cost estimation model is developed using regression analysis, with its cost estimates used to assess the prediction performance of the cost-based duration estimation models. Finally, the prediction performances of all parametric duration estimation models are determined and compared. The models provided reasonably accurate estimates of construction durations. The results also indicated that construction durations can be predicted accurately without making an estimate of the project cost.
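A toy version of the two model families compared above — a parametric regression and a small artificial neural network for conceptual duration estimation — is sketched below on invented project descriptors; the variables, sample size, and error measure are assumptions for illustration.

```r
## Parametric (OLS) vs. neural network duration models on hypothetical
## project descriptors, compared by mean absolute percentage error (MAPE).
library(nnet)
set.seed(9)
n    <- 68                                   # illustrative sample only
area <- runif(n, 1e3, 5e4)                   # floor area, m^2
flrs <- sample(2:20, n, replace = TRUE)      # number of floors
dur  <- 100 + 0.004 * area + 8 * flrs + rnorm(n, 0, 20)  # duration, days
dat  <- data.frame(dur, area, flrs)

ols <- lm(dur ~ area + flrs, data = dat)
ann <- nnet(dur / 600 ~ scale(area) + scale(flrs), data = dat,
            size = 3, linout = TRUE, trace = FALSE)  # response rescaled to (0,1)

mape <- function(obs, pred) mean(abs(obs - pred) / obs) * 100
c(OLS = mape(dat$dur, fitted(ols)),
  ANN = mape(dat$dur, fitted(ann) * 600))
```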
APA, Harvard, Vancouver, ISO, and other styles
13

Das, Debasish. "Bayesian Sparse Regression with Application to Data-driven Understanding of Climate." Diss., Temple University Libraries, 2015. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/313587.

Full text
Abstract:
Computer and Information Science
Ph.D.
Sparse regressions based on constraining the L1-norm of the coefficients became popular due to their ability to handle high dimensional data, unlike regular regressions, which suffer from overfitting and model identifiability issues, especially when the sample size is small. They are often the method of choice in many fields of science and engineering for simultaneously selecting covariates and fitting parsimonious linear models that are better generalizable and easily interpretable. However, significant challenges may be posed by the need to accommodate extremes and other domain constraints, such as dynamical relations among variables, spatial and temporal constraints, and the need to provide uncertainty estimates and feature correlations, among others. We adopted a hierarchical Bayesian version of the sparse regression framework and exploited its inherent flexibility to accommodate these constraints. We applied sparse regression to the feature selection problem of statistical downscaling of climate variables, with particular focus on their extremes. This is important for many impact studies where climate change information is required at a spatial scale much finer than that provided by global or regional climate models. Characterizing the dependence of extremes on covariates can help in the identification of plausible causal drivers and inform extremes downscaling. We propose a general-purpose sparse Bayesian framework for covariate discovery that accommodates the non-Gaussian distribution of extremes within a hierarchical Bayesian sparse regression model. We obtain posteriors over regression coefficients, which indicate the dependence of extremes on the corresponding covariates and provide uncertainty estimates, using a variational Bayes approximation. The method is applied to selecting informative atmospheric covariates at multiple spatial scales, as well as indices of large scale circulation and global warming related to the frequency of precipitation extremes over the continental United States. Our results confirm the dependence relations that may be expected from known precipitation physics and generate novel insights which can inform physical understanding. We plan to extend our model to discover covariates for extreme intensity in the future. We further extend our framework to handle the dynamic relationships among climate variables using a nonparametric Bayesian mixture of sparse regression models based on the Dirichlet Process (DP). The extended model can achieve simultaneous clustering and discovery of covariates within each cluster. Moreover, a priori knowledge about the association between pairs of data points is incorporated in the model through must-link constraints on a Markov Random Field (MRF) prior. A scalable and efficient variational Bayes approach is developed to infer posteriors on regression coefficients and cluster variables.
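As a rough non-Bayesian analogue of the covariate-discovery step, the sketch below runs an L1-penalized (lasso) regression with glmnet on a p >> n design; unlike the hierarchical Bayesian model above, it yields no posteriors, no uncertainty estimates, and no extreme-value likelihood.

```r
## Frequentist L1 stand-in for sparse covariate discovery: cross-validated
## lasso on simulated data with many more covariates than samples.
library(glmnet)
set.seed(10)
n <- 100; p <- 200                     # p >> n, as in covariate discovery
X <- matrix(rnorm(n * p), n, p)
y <- drop(X[, 1:3] %*% c(2, -1.5, 1)) + rnorm(n)

cvfit <- cv.glmnet(X, y, alpha = 1)    # penalty chosen by cross-validation
b     <- as.matrix(coef(cvfit, s = "lambda.1se"))
which(b != 0)                          # retained terms: intercept + covariates 1-3
```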
Temple University--Theses
APA, Harvard, Vancouver, ISO, and other styles
14

Schildcrout, Jonathan Scott. "Marginal modeling of longitudinal, binary response data : semiparametric and parametric estimation with long response series and an efficient outcome dependent sampling design /." Thesis, Connect to this title online; UW restricted, 2004. http://hdl.handle.net/1773/9540.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

Chau, Thi Tuyet Trang. "Non-parametric methodologies for reconstruction and estimation in nonlinear state-space models." Thesis, Rennes 1, 2019. http://www.theses.fr/2019REN1S010/document.

Full text
Abstract:
The amount of both observational and model-simulated data within the environmental, climate and ocean sciences has grown at an accelerating rate. Observational (e.g. satellite, in-situ) data are generally accurate but still subject to observational errors and available only on a complicated spatio-temporal sampling. Increasing computer power and improved understanding of physical processes have advanced model accuracy and resolution, but purely model-driven solutions may still not be accurate enough. Filtering and smoothing (sequential data assimilation) methods have been developed to tackle these issues. Their context is usually formalized in the form of a state-space model comprising the dynamical model, which describes the evolution of the physical process (state), and the observation model, which describes the link between the physical process and the available observations. In this thesis, we tackle three problems related to statistical inference for nonlinear state-space models: state reconstruction, parameter estimation and replacement of the dynamical model by an emulator constructed from data. For the first problem, we introduce an original smoothing algorithm which combines the Conditional Particle Filter (CPF) and Backward Simulation (BS) algorithms. This CPF-BS algorithm allows for efficient exploration of the state of the physical variable, sequentially refining the exploration around the trajectories which best meet the constraints of the dynamical model and the observations. We show on several toy models that, at the same computation time, the CPF-BS algorithm gives better results than other CPF algorithms and the stochastic EnKS algorithm which is commonly used in operational applications. We then discuss the problem of estimating unknown parameters in state-space models. The most common statistical algorithm for estimating the parameters of a state-space model is the EM algorithm, which iteratively computes a numerical approximation of the maximum likelihood estimators. We show that the EM and CPF-BS algorithms can be combined to effectively estimate the parameters in toy models. In some applications, the dynamical model is unknown or very expensive to solve numerically, but observations or simulations are available. It is then possible to reconstruct the state conditionally on the observations by using filtering/smoothing algorithms in which the dynamical model is replaced by a statistical emulator constructed from the observations. We show that the EM and CPF-BS algorithms can be adapted to this framework and provide non-parametric estimation of the dynamical model of the state from noisy observations. Finally, the proposed algorithms are applied to impute wind data (produced by Météo France).
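To fix ideas, here is a minimal bootstrap particle filter on a linear-Gaussian toy state-space model; the CPF-BS smoother described above adds a conditioned reference trajectory and a backward-simulation pass on top of this basic machinery.

```r
## Bootstrap particle filter on x_t = 0.9 x_{t-1} + noise, y_t = x_t + noise;
## particle count, noise levels and AR coefficient are illustrative.
set.seed(11)
T_ <- 100; N <- 500                    # time steps, particles
x  <- numeric(T_)
x[1] <- rnorm(1)
for (t in 2:T_) x[t] <- 0.9 * x[t - 1] + rnorm(1, 0, 0.5)
y  <- x + rnorm(T_, 0, 1)              # noisy observations

xp <- rnorm(N)                         # initial particles
xf <- numeric(T_)                      # filtering means
for (t in 1:T_) {
  if (t > 1) xp <- 0.9 * xp + rnorm(N, 0, 0.5)   # propagate with dynamic model
  w  <- dnorm(y[t], xp, 1)                       # weight by observation density
  w  <- w / sum(w)
  xf[t] <- sum(w * xp)
  xp <- sample(xp, N, replace = TRUE, prob = w)  # multinomial resampling
}
cor(xf, x)                             # the filter tracks the hidden state
```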
APA, Harvard, Vancouver, ISO, and other styles
16

Čabla, Adam. "Odhady v analýze přežívání." Master's thesis, Vysoká škola ekonomická v Praze, 2009. http://www.nusl.cz/ntk/nusl-17134.

Full text
Abstract:
This thesis introduces methods used in time-to-event analysis. It is written in general terms and is therefore applicable to any example. The thesis deals with the problem of censoring, i.e. for some observations the event of interest occurs only after the end of follow-up, which is typical of lifetime analysis. The methods covered are nonparametric and parametric estimates of the survival function and its characteristics, and regression models, namely the Cox model and the accelerated failure time model, which examine the effect of covariates on the survival function. Besides the survival function, the thesis presents the hazard function, which expresses the intensity of the analyzed event, and the cumulative hazard function, which, as the name suggests, is created by cumulative summation of the hazard function. Estimates of these functions are obtainable from the survival function, and for parametric estimates a formula based on the parameters of the chosen distribution often exists. The empirical part of the thesis examines the influence of several different types and degrees of censoring on parametric and nonparametric estimates of the survival function, mean and median. The other empirical example is the application of regression analysis to data from the lung cancer study conducted by the Mayo Clinic.
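The Mayo Clinic lung cancer data mentioned above ships with R's survival package, so the two main tools of the thesis — a nonparametric survival curve estimate and a Cox regression for covariate effects — can be illustrated directly:

```r
## Kaplan-Meier estimate and Cox proportional hazards model on the
## Mayo Clinic lung cancer data bundled with the survival package.
library(survival)
km  <- survfit(Surv(time, status) ~ sex, data = lung)   # survival by sex
summary(km, times = c(180, 365))                        # S(t) at 6 and 12 months
cox <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)
summary(cox)                                            # covariate effects
```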
APA, Harvard, Vancouver, ISO, and other styles
17

Nakamura, Luiz Ricardo. "Advances on the Birnbaum-Saunders distribution." Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-30092016-171320/.

Full text
Abstract:
The Birnbaum-Saunders (BS) distribution is the most popular model used to describe lifetime processes under fatigue. Throughout the years, this distribution has received a wide range of applications, demanding more flexible extensions to solve more complex problems. One of the best-known extensions of the BS distribution is the generalized Birnbaum-Saunders (GBS) family of distributions, which includes the Birnbaum-Saunders special-case (BS-SC) and the Birnbaum-Saunders generalized t (BSGT) models as special cases. Although the BS-SC distribution was previously developed in the literature, it was never deeply studied; hence, in this thesis, we provide a full Bayesian study and develop a tool to generate random numbers from this distribution. Further, we develop a very flexible regression model, admitting different degrees of skewness and kurtosis, based on the BSGT distribution, using the generalized additive models for location, scale and shape (GAMLSS) framework. We also introduce a new extension of the BS distribution called the Birnbaum-Saunders power (BSP) family of distributions, which contains several special or limiting cases already published in the literature, including the GBS family. The main feature of the new family is that it can produce both unimodal and bimodal shapes depending on its parameter values. We also introduce this new family of distributions into the GAMLSS framework, in order to model any or all of the parameters of the distribution using parametric linear and/or nonparametric smooth functions of explanatory variables. Throughout this thesis we present five different applications to real data sets in order to illustrate the developed theoretical results.
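For reference, a sketch of the baseline two-parameter Birnbaum-Saunders density (shape α, scale β); the BSP family introduced in the thesis generalizes this baseline and can be bimodal, which this two-parameter version cannot.

```r
## Classical Birnbaum-Saunders density:
## f(t) = [(beta/t)^(1/2) + (beta/t)^(3/2)] / (2 alpha beta sqrt(2 pi))
##        * exp(-(t/beta + beta/t - 2) / (2 alpha^2)),  t > 0.
dbs <- function(t, alpha = 0.5, beta = 1) {
  a <- sqrt(beta / t)
  (a + a^3) / (2 * alpha * beta * sqrt(2 * pi)) *
    exp(-(t / beta + beta / t - 2) / (2 * alpha^2))
}
integrate(dbs, 0, Inf, alpha = 0.5, beta = 1)          # ~ 1, sanity check
curve(dbs(x, alpha = 0.5), 0.01, 4, ylab = "density")  # unimodal shape
```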
APA, Harvard, Vancouver, ISO, and other styles
18

Malsiner-Walli, Gertraud, Paul Hofmarcher, and Bettina Grün. "Semi-parametric Regression under Model Uncertainty: Economic Applications." Wiley, 2019. http://dx.doi.org/10.1111/obes.12294.

Full text
Abstract:
Economic theory does not always specify the functional relationship between dependent and explanatory variables, or even isolate a particular set of covariates. This means that model uncertainty is pervasive in empirical economics. In this paper, we indicate how Bayesian semi-parametric regression methods in combination with stochastic search variable selection can be used to address two model uncertainties simultaneously: (i) the uncertainty with respect to the variables which should be included in the model and (ii) the uncertainty with respect to the functional form of their effects. The presented approach enables the simultaneous identification of robust linear and nonlinear effects. The additional insights gained are illustrated on applications in empirical economics, namely willingness to pay for housing, and cross-country growth regression.
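A rough, non-Bayesian analogue of the variable-and-shape selection described above: mgcv's double-penalty shrinkage can zero out whole smooth terms and lets effectively linear effects emerge from nominally smooth ones, though without the stochastic search or posterior inclusion probabilities of the paper's approach.

```r
## Double-penalty shrinkage in mgcv: select = TRUE penalizes both the wiggly
## and the null-space parts of each smooth, so irrelevant terms shrink to zero.
library(mgcv)
set.seed(14)
n  <- 300
x1 <- runif(n); x2 <- runif(n); x3 <- runif(n)   # x3 is pure noise
y  <- 2 * x1 + sin(2 * pi * x2) + rnorm(n, 0, 0.3)

fit <- gam(y ~ s(x1) + s(x2) + s(x3), method = "REML", select = TRUE)
summary(fit)   # s(x1) near-linear, s(x2) nonlinear, s(x3) shrunk toward zero
```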
APA, Harvard, Vancouver, ISO, and other styles
19

Tano, Richard. "Determining multimediastreaming content." Thesis, Umeå universitet, Institutionen för fysik, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-50376.

Full text
Abstract:
This Master's Thesis report was written by Umeå University Engineering Physics student Richard Tano during his thesis work at Ericsson Luleå. Monitoring network quality is of utmost importance to network providers. This can be done with models evaluating QoS (Quality of Service) and conforming to ITU-T Recommendations. When determining video stream quality, it is more important to evaluate the QoE (Quality of Experience), to understand how the user perceives the quality. This is ranked in MOS (Mean Opinion Score) values. An important aspect of determining the QoE is the video content type, which is correlated with the coding complexity and MOS values of the video. In this work, the possibilities to improve quality estimation models complying with ITU-T Study Group 12 (q.14) were investigated. Methods were evaluated and an algorithm was developed that applies time series analysis of packet statistics to determine the MOS scores of video streams. The methods used in the algorithm include a novel combination of frequent-pattern analysis and regression analysis. A model which incorporates the algorithm for usage from low to high bitrates was defined. The new model resulted in around 20% improved precision in MOS score estimation compared to the existing reference model. Furthermore, an algorithm using only regression statistics and modeling of related statistical parameters was developed. Its improvement in coding estimation was comparable with that of the earlier algorithm, but its efficiency increased considerably.
APA, Harvard, Vancouver, ISO, and other styles
20

Mays, James Edward. "Model robust regression: combining parametric, nonparametric, and semiparametric methods." Diss., Virginia Polytechnic Institute and State University, 1995. http://hdl.handle.net/10919/49937.

Full text
Abstract:
In obtaining a regression fit to a set of data, ordinary least squares regression depends directly on the parametric model formulated by the researcher. If this model is incorrect, a least squares analysis may be misleading. Alternatively, nonparametric regression (kernel or local polynomial regression, for example) has no dependence on an underlying parametric model, but instead depends entirely on the distances between regressor coordinates and the prediction point of interest. This procedure avoids the necessity of a reliable model, but in using no information from the researcher, may fit to irregular patterns in the data. The proper combination of these two regression procedures can overcome their respective problems. Considered is the situation where the researcher has an idea of which model should explain the behavior of the data, but this model is not adequate throughout the entire range of the data. An extension of partial linear regression and two methods of model robust regression are developed and compared in this context. These methods involve parametric fits to the data and nonparametric fits to either the data or residuals. The two fits are then combined in the most efficient proportions via a mixing parameter. Performance is based on bias and variance considerations.
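The core MRR idea above — a convex combination of a parametric and a nonparametric fit through a mixing parameter λ — can be sketched in a few lines of R; the fixed λ = 0.5 here is purely illustrative, whereas the dissertation studies principled, data-driven choices of the mixing proportion.

```r
## Model robust regression sketch: combine a (misspecified) linear fit with
## a loess fit via a mixing parameter lambda in [0, 1].
set.seed(15)
n <- 100
x <- sort(runif(n, 0, 10))
y <- 2 + 0.5 * x + 2 * sin(x) + rnorm(n, 0, 0.7)  # linear trend + local structure

f_par <- fitted(lm(y ~ x))      # parametric component
f_non <- fitted(loess(y ~ x))   # nonparametric component
lam   <- 0.5                    # illustrative fixed mixing parameter
yhat  <- lam * f_non + (1 - lam) * f_par

plot(x, y)
lines(x, yhat, lwd = 2)         # combined model robust fit
```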
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
21

Starnes, Brett Alden. "Asymptotic Results for Model Robust Regression." Diss., Virginia Tech, 1999. http://hdl.handle.net/10919/30244.

Full text
Abstract:
Since the mid-1980s, many statisticians have studied methods for combining parametric and nonparametric estimates to improve the quality of fits in a regression problem. Notably, in 1987, Einsporn and Birch proposed the Model Robust Regression estimate (MRR1), in which estimates of the parametric function, f, and the nonparametric function, g, were combined in a straightforward fashion via the use of a mixing parameter, λ. This technique was studied extensively at small samples and was shown to be quite effective at modeling various unusual functions. In 1995, Mays and Birch developed the MRR2 estimate as an alternative to MRR1. This model involves first forming the parametric fit to the data, and then adding in an estimate of g according to the lack of fit demonstrated by the error terms. Using small samples, they illustrated the superiority of MRR2 over MRR1 in most situations. In this dissertation we develop asymptotic convergence rates for both MRR1 and MRR2 in OLS and GLS (maximum likelihood) settings. In many of these settings, it is demonstrated that the user of MRR1 or MRR2 achieves the best convergence rates available regardless of whether or not the model is properly specified. This is the "Golden Result of Model Robust Regression". It turns out that the selection of the mixing parameter is paramount in determining whether or not this result is attained.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
22

Zhang, Tianyang. "Partly parametric generalized additive model." Diss., University of Iowa, 2010. https://ir.uiowa.edu/etd/913.

Full text
Abstract:
In many scientific studies, the response variable bears a generalized nonlinear regression relationship with a certain covariate of interest, which may, however, be confounded by other covariates with unknown functional form. We propose a new class of models, the partly parametric generalized additive model (PPGAM) for doing generalized nonlinear regression with the confounding covariate effects adjusted nonparametrically. To avoid the curse of dimensionality, the PPGAM specifies that, conditional on the covariates, the response distribution belongs to the exponential family with the mean linked to an additive predictor comprising a nonlinear parametric function that is of main interest, plus additive, smooth functions of other covariates. The PPGAM extends both the generalized additive model (GAM) and the generalized nonlinear regression model. We propose to estimate a PPGAM by the method of penalized likelihood. We derive some asymptotic properties of the penalized likelihood estimator, including consistency and asymptotic normality of the parametric estimator of the nonlinear regression component. We propose a model selection criterion for the PPGAM, which resembles the BIC. We illustrate the new methodologies by simulations and real applications. We have developed an R package PPGAM that implements the methodologies expounded herein.
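A hedged analogue in mgcv (not the PPGAM package itself, whose interface is not reproduced here): a generalized model with a parametric term of main interest plus smooth adjustments for confounding covariates; the PPGAM's genuinely nonlinear parametric component is simplified to a linear one in this sketch.

```r
## Exponential-family response with a parametric effect of interest (dose)
## and nonparametric confounder adjustment, in the spirit of the PPGAM.
library(mgcv)
set.seed(16)
n    <- 400
dose <- runif(n, 0, 3); z1 <- runif(n); z2 <- runif(n)
eta  <- 1 + 0.8 * dose + cos(2 * pi * z1) + z2^2
y    <- rpois(n, exp(eta / 2))

fit <- gam(y ~ dose + s(z1) + s(z2), family = poisson, method = "REML")
summary(fit)   # parametric dose effect plus smooth confounder adjustments
```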
APA, Harvard, Vancouver, ISO, and other styles
23

Hoglin, Phillip J. "Survival analysis and accession optimization of prior enlisted United States Marine Corps officers." Thesis, Monterey, California. Naval Postgraduate School, 2004. http://hdl.handle.net/10945/1673.

Full text
Abstract:
Approved for public release, distribution is unlimited
The purpose of this thesis is, first, to analyze the determinants of the survival of United States Marine Corps officers and, second, to develop a methodology to optimize the accessions of prior and non-prior enlisted officers. Using data from the Marine Corps Officer Accession Career file (MCCOAC), the Cox Proportional Hazards Model is used to estimate the effects of officer characteristics on their survival as commissioned officers in the USMC. A Markov model for career transition is combined with fiscal data to determine the optimum number of prior and non-prior enlisted officers under the constraints of force structure and budget. The findings indicate that prior enlisted officers have a better survival rate than their non-prior enlisted counterparts. Additionally, officers who are married, commissioned through MECEP, graduate in the top third of their TBS class, and are assigned to a combat support MOS have a better survival rate than officers who are unmarried, commissioned through USNA, graduate in the middle third of their TBS class, and are assigned to either a combat or a combat service support MOS. The findings also indicate that the optimum number of prior enlisted officer accessions may be considerably lower than recent trends and may differ across MOS. Based on the findings, it is recommended that prior enlisted officer accession figures be reviewed.
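The accession-optimization side can be caricatured with a small Markov career-transition model iterated to a steady state; the grades, transition rates, and accession vector below are invented for illustration, not taken from the thesis.

```r
## Toy Markov career-transition model: rows give annual probabilities of
## staying in grade or promoting; residual probability is attrition.
grades <- c("O1_O3", "O4", "O5plus")
P <- rbind(c(0.85, 0.05, 0.00),
           c(0.00, 0.80, 0.10),
           c(0.00, 0.00, 0.88))
accessions <- c(500, 0, 0)            # all entries arrive in the junior grades

stock <- c(0, 0, 0)
for (yr in 1:60) stock <- drop(stock %*% P) + accessions
round(setNames(stock, grades))        # steady-state officer inventory by grade
```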
Major, Australian Army
APA, Harvard, Vancouver, ISO, and other styles
24

Valença, Dione Maria. "O modelo de regressão gama generalizada para discriminar entre modelos parametricos de tempo de vida." [s.n.], 1994. http://repositorio.unicamp.br/jspui/handle/REPOSIP/325403.

Full text
Abstract:
Advisor: Jonathan Biele
Dissertation (master's) – Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Ciência da Computação
Abstract: Not informed.
Master in Statistics
APA, Harvard, Vancouver, ISO, and other styles
25

Diniz, Márcio Augusto. "Modelos bayesianos semi-paramétricos para dados binários." Universidade de São Paulo, 2015. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-02112015-013658/.

Full text
Abstract:
This work proposes semi-parametric Bayesian models for binary data. The first model is a scale mixture that handles discrepancies related to the kurtosis of the logistic model; it extends the proposal of Basu and Mukhopadhyay (2000) by allowing the prior distribution of the parameters to be interpreted through odds ratios. The second model combines the scale mixture with the transformation proposed by Yeo and Johnson (2000), so that kurtosis as well as skewness can be adjusted and an informative skewness parameter estimated. This transformation is much more appropriate for handling negative values than the Box and Cox (1964) transformation used by Guerrero and Johnson (1982), and it is simpler than the model proposed by Stukel (1988). Finally, the third model is the most general of the three: a location-scale mixture able to describe kurtosis, skewness, and also bimodality. The model proposed by Newton et al. (1996), although quite general, does not give applied researchers a tangible interpretation of the prior distribution. The models are evaluated through the Cramér-von Mises, Kolmogorov-Smirnov, and Anderson-Darling probability distance measures and through Conditional Predictive Ordinates.
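The practical appeal of the Yeo-Johnson transformation mentioned above is that, unlike Box-Cox, it is defined for negative values; SciPy ships an implementation. This is only a sketch of the transformation itself, on synthetic data, not of the thesis's Bayesian mixture models.

```python
import numpy as np
from scipy.stats import yeojohnson

rng = np.random.default_rng(1)
x = rng.gamma(2.0, 1.0, 500) - 3.0   # skewed predictor that takes negative values
xt, lam = yeojohnson(x)              # lam estimated by maximum likelihood
print(lam, xt.mean(), xt.std())      # transformed data are closer to symmetric
```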
APA, Harvard, Vancouver, ISO, and other styles
26

Hossain, Shahadut. "Dealing with measurement error in covariates with special reference to logistic regression model: a flexible parametric approach." Thesis, University of British Columbia, 2007. http://hdl.handle.net/2429/408.

Full text
Abstract:
In many fields of statistical application the fundamental task is to quantify the association between some explanatory variables or covariates and a response or outcome variable through a suitable regression model. The accuracy of such quantification depends on how precisely we measure the relevant covariates. In many instances, we cannot measure some of the covariates accurately; rather, we can measure only noisy versions of them. In statistical terminology this is known as measurement error, or errors in variables. Regression analyses based on noisy covariate measurements lead to biased and inaccurate inference about the true underlying response-covariate associations. In this thesis we investigate some aspects of measurement error modelling in the case of binary logistic regression models. We suggest a flexible parametric approach for adjusting the measurement error bias while estimating the response-covariate relationship through a logistic regression model. We investigate the performance of the proposed flexible parametric approach in comparison with other flexible parametric and nonparametric approaches through extensive simulation studies, and we also compare the proposed method with its competitors on a real-life data set. Though emphasis is put on the logistic regression model, the proposed method is applicable to other members of the generalized linear model family and to other types of non-linear regression models as well. Finally, we develop a new computational technique to approximate the large-sample bias that may arise due to exposure model misspecification in the estimation of the regression parameters in a measurement error scenario.
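To make the attenuation caused by covariate measurement error concrete, here is a sketch of a naive logistic fit versus a simple regression-calibration correction on simulated data. Regression calibration is a standard baseline, not the flexible parametric approach the thesis proposes, and the reliability ratio is treated as known here although in practice it would be estimated from replicates or validation data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 2000
x = rng.normal(0, 1, n)          # true covariate, unobserved
w = x + rng.normal(0, 0.8, n)    # noisy measurement of x
y = rng.binomial(1, 1.0 / (1.0 + np.exp(0.5 - 1.0 * x)))   # logit(p) = -0.5 + x

naive = sm.Logit(y, sm.add_constant(w)).fit(disp=0)        # slope biased toward zero

lam = 1.0 / (1.0 + 0.8 ** 2)     # reliability ratio var(x)/var(w), known in this toy
rc = sm.Logit(y, sm.add_constant(lam * w)).fit(disp=0)     # regression calibration

print(naive.params[1], rc.params[1])   # compare with the true slope of 1.0
```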
APA, Harvard, Vancouver, ISO, and other styles
27

Torrent, Hudson da Silva. "Estimação não-paramétrica e semi-paramétrica de fronteiras de produção." reponame:Biblioteca Digital de Teses e Dissertações da UFRGS, 2010. http://hdl.handle.net/10183/25786.

Full text
Abstract:
There is a large and growing literature on the specification and estimation of production frontiers and hence of the efficiency of production units. This thesis focuses on deterministic production frontier models, which are based on the assumption that all observed data lie in the technological set. Among the existing statistical models and estimators for deterministic frontiers, a promising approach is that of Martins-Filho and Yao (2007), who propose an estimation procedure consisting of three stages. Their estimator is fairly easy to implement, as it involves standard nonparametric procedures, and it has a number of desirable characteristics vis-a-vis traditional deterministic frontier estimators such as DEA and FDH. This thesis comprises three papers that improve on the model of Martins-Filho and Yao (2007). In the first paper, their estimation procedure is improved by adopting a variant of the local exponential smoothing proposed in Ziegelmann (2002). The resulting estimator is shown to be consistent and asymptotically normal; moreover, owing to the local exponential smoothing, potentially negative estimates of the conditional variance function, which could hinder the use of Martins-Filho and Yao's estimator, are avoided. In the second paper, a novel method for estimating production frontiers in only two stages is proposed: the second stage of Martins-Filho and Yao, as well as that of the first paper, can be eliminated, whereas estimation of the same frontier model otherwise requires three stages with different versions of the second stage. Asymptotic properties are studied, showing consistency and asymptotic normality of the proposed estimator under standard assumptions. In the third paper, a semiparametric variation of the frontier model studied in the second paper is proposed. That model is rewritten so that the production frontier and the efficiency of production units can be estimated in a multiple-input context without suffering the curse of dimensionality. The approach places the model within the framework of additive models, based on assumptions about the way inputs combine in production; in particular, the cases of additive and multiplicative inputs, which are widely considered in economic theory and applications, are treated. Monte Carlo studies are performed in all papers to shed light on the finite-sample properties of the proposed estimators. Furthermore, a real data study is carried out in all papers, ranking efficiency within a sample of US law enforcement agencies using US crime data.
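The role of the local exponential smoother in the first paper is to keep conditional-variance estimates positive. Below is a minimal sketch in the spirit of Ziegelmann (2002), with an ad hoc bandwidth and a derivative-free optimizer; it is not the authors' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def local_exp_variance(x, sq_resid, grid, h):
    # fit sigma^2(pt) = exp(a + b*(x - pt)) by kernel-weighted least squares;
    # the exponential link guarantees a positive variance estimate
    out = []
    for pt in np.atleast_1d(grid):
        w = np.exp(-0.5 * ((x - pt) / h) ** 2)
        obj = lambda th: np.sum(w * (sq_resid - np.exp(th[0] + th[1] * (x - pt))) ** 2)
        res = minimize(obj, x0=(np.log(sq_resid.mean()), 0.0), method="Nelder-Mead")
        out.append(np.exp(res.x[0]))
    return np.array(out)

rng = np.random.default_rng(3)
x = np.sort(rng.uniform(0, 1, 200))
resid = rng.normal(0, 0.2 + 0.5 * x)        # heteroscedastic residuals
print(local_exp_variance(x, resid ** 2, np.array([0.25, 0.75]), h=0.1))
```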
APA, Harvard, Vancouver, ISO, and other styles
28

Hoare, Armando. "Parametric, non-parametric and statistical modeling of stony coral reef data." [Tampa, Fla] : University of South Florida, 2008. http://purl.fcla.edu/usf/dc/et/SFE0002470.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Race, Jonathan Andrew. "Semi-parametric Survival Analysis via Dirichlet Process Mixtures of the First Hitting Time Model." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu157357742741077.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Devamitta, Perera Muditha V. "Statistical Analysis and Modeling of Ovarian and Breast Cancer." Scholar Commons, 2017. https://scholarcommons.usf.edu/etd/7395.

Full text
Abstract:
The objective of the present study is to investigate key aspects of ovarian and breast cancers, two main causes of mortality among women. Identifying the true behavior of survivorship and the influential risk factors is essential for designing treatment protocols, increasing disease awareness, and preventing possible causes of disease. There is a commonly held belief that African Americans have a higher risk of cancer mortality. We studied racial disparities in overall and disease-free survival among women diagnosed with ovarian cancer and found no significant difference in survival experience among the three race groups: Whites, African Americans, and Other races. Tumor sizes at diagnosis did differ significantly among the races, with African American women tending to have larger ovarian tumors at diagnosis. Prognostic models play a major role in health data research: they can be used to estimate adjusted survival probabilities and absolute and relative risks, and to determine significantly contributing risk factors. A prognostic model is a valuable tool only if it is developed carefully, evaluating the underlying model assumptions and inadequacies and determining whether the model most relevant to the study objectives has been selected. In the present study we developed such statistical models for survival data on ovarian and breast cancers. We found that the histology of ovarian cancer had risk ratios that vary over time. We built two types of parametric models to estimate absolute risks and survival probabilities and to adjust for the time dependency of the relative risk of histology: one based on classical probability distributions, and the other a more flexible parametric model that estimates the baseline cumulative hazard function using spline functions. In contrast to women diagnosed with ovarian cancer, women with breast cancer showed significantly different survivorship among races, with Whites having a poorer overall survival rate than African Americans and Other races. In the breast cancer study, we identified that age and progesterone receptor status have time-dependent hazard ratios and that age and tumor size display non-linear effects on the hazard. We adjusted for those non-proportional hazards and non-linear effects using an extended Cox regression model in order to generate more meaningful interpretations of the data.
APA, Harvard, Vancouver, ISO, and other styles
31

Knefati, Muhammad Anas. "Estimation non-paramétrique du quantile conditionnel et apprentissage semi-paramétrique : applications en assurance et actuariat." Thesis, Poitiers, 2015. http://www.theses.fr/2015POIT2280/document.

Full text
Abstract:
The thesis consists of two parts: one devoted to the estimation of conditional quantiles and the other to supervised learning. The conditional quantile estimation part is organized in three chapters. Chapter 1 introduces local linear regression and presents the methods most used in the literature to estimate the smoothing parameter. Chapter 2 reviews existing nonparametric estimation methods for the conditional quantile and compares them through numerical experiments on simulated and real data. Chapter 3 is devoted to a new conditional quantile estimator that we propose, based on a kernel that is asymmetric in x; under some hypotheses, this estimator proves more efficient than the usual ones. The supervised learning part also comprises three chapters. Chapter 4 is an introduction to statistical learning and the basic notions used in this part. Chapter 5 reviews conventional methods of supervised classification. Chapter 6 proposes a method for transferring a semiparametric learning model; the performance of this method is shown by numerical experiments on morphometric and credit-scoring data.
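A bare-bones version of a kernel-weighted conditional quantile estimator can be written as a weighted sample quantile, which minimizes the kernel-weighted check (pinball) loss over constants. The sketch below uses a symmetric Gaussian kernel, whereas the thesis's contribution is precisely an asymmetric kernel; the bandwidth is fixed by hand.

```python
import numpy as np

def cond_quantile(x, y, x0, tau, h):
    # tau-th conditional quantile at x0 as a kernel-weighted sample quantile of y
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    order = np.argsort(y)
    cw = np.cumsum(w[order]) / w.sum()          # weighted empirical CDF along sorted y
    return y[order][np.searchsorted(cw, tau)]

rng = np.random.default_rng(4)
x = rng.uniform(0, 1, 500)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 500)
print(cond_quantile(x, y, x0=0.5, tau=0.9, h=0.05))
```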
APA, Harvard, Vancouver, ISO, and other styles
32

Abdel-Salam, Abdel-Salam Gomaa. "Profile Monitoring with Fixed and Random Effects using Nonparametric and Semiparametric Methods." Diss., Virginia Tech, 2009. http://hdl.handle.net/10919/29387.

Full text
Abstract:
Profile monitoring is a relatively new approach in quality control, best used where the process data follow a profile (or curve) at each time period. The essential idea is to model the profile via parametric, nonparametric, or semiparametric methods and then monitor the fitted profiles or the estimated random effects over time to determine whether the profiles have changed. Most previous studies in profile monitoring focused on parametric modeling of either linear or nonlinear profiles, with both fixed and random effects, under the assumption of correct model specification. Our work considers cases where the parametric model for the family of profiles is unknown or at least uncertain. Consequently, we monitor profiles via two techniques: a nonparametric technique, and a semiparametric procedure that combines parametric and nonparametric profile fits, which we refer to as model robust profile monitoring (MRPM). We also incorporate a mixed-model approach into both the parametric and nonparametric fits; for mixed-effects models, the MMRPM method extends MRPM to account for the correlation within profiles and to treat the collection of profiles as a random sample from a common population. For each case, we formulated two Hotelling's T² statistics, one based on the estimated random effects and one based on the fitted values, and obtained the corresponding control limits. In addition, we used two formulas for the estimated variance-covariance matrix: one based on the pooled sample variance-covariance estimator and a second based on successive differences. A Monte Carlo study was performed to compare the integrated mean square errors (IMSE) and the probability of signal of the parametric, nonparametric, and semiparametric approaches. Both correlated and uncorrelated error structures were evaluated for varying amounts of model misspecification, numbers of profiles, numbers of observations per profile, shift locations, and in- and out-of-control situations. The semiparametric (MMRPM) method was competitive in both the uncorrelated and correlated scenarios and often clearly superior to the parametric and nonparametric methods over all levels of misspecification. For a correctly specified model, the IMSE and the simulated probability of signal for the parametric and MMRPM methods were identical (or nearly so); for severe model misspecification, the nonparametric and MMRPM methods were identical (or nearly so); and for mild misspecification, the MMRPM method was superior to both. This simulation therefore supports the claim that the MMRPM method is robust to model misspecification. The MMRPM method also performed better for data sets with a correlated error structure. Moreover, the performance of the nonparametric and MMRPM methods improved as the number of observations per profile increased, since more observations over the same range of X generally allow the penalized spline method to use more knots, giving greater flexibility and improved fits in the nonparametric and, consequently, the semiparametric curves.
The parametric, nonparametric, and semiparametric approaches were used to fit the relationship between the torque produced by an engine and engine speed in the automotive industry. A Hotelling's T² statistic based on the estimated random effects was then used in Phase I studies to identify outlying profiles. All three methods showed that the process was stable; nevertheless, the nonparametric and MMRPM results provide a better description of the actual behavior of each profile, giving the user greater ability to interpret the true relationship between engine speed and torque for this type of engine and an increased likelihood of detecting unusual engines in future production. We conclude that the nonparametric and semiparametric approaches perform better than the parametric approach when the user's model is misspecified. The case study demonstrates that the proposed nonparametric and semiparametric methods are more efficient, flexible, and robust to model misspecification for Phase I profile monitoring in a practical application. Both methods yield charts with good ability to detect changes in Phase I data and with easily calculated control limits, providing greater flexibility and efficiency than current parametric Phase I profile-monitoring methods that rely on correct model specification, an unrealistic assumption in many practical industrial problems.
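A compact sketch of the monitoring statistic itself, a Hotelling's T² per profile with the successive-differences variance-covariance estimator, is given below. The rows stand in for estimated random effects (or fitted values) from whichever profile fit is used; control limits and the pooled-covariance variant are omitted.

```python
import numpy as np

def t2_successive_differences(effects):
    # effects: m x p array, one row of estimated random effects per profile
    m, p = effects.shape
    d = np.diff(effects, axis=0)           # successive differences between profiles
    S = d.T @ d / (2.0 * (m - 1))          # variance-covariance estimate from differences
    Sinv = np.linalg.pinv(S)
    c = effects - effects.mean(axis=0)
    return np.einsum("ij,jk,ik->i", c, Sinv, c)   # one T^2 value per profile

rng = np.random.default_rng(5)
effects = rng.normal(0, 1, (30, 4))        # 30 in-control profiles, 4 effects each
effects[7] += 2.5                          # one shifted (outlying) profile
print(np.argmax(t2_successive_differences(effects)))   # flags profile 7
```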
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
33

Tran, Xuan Quang. "Les modèles de régression dynamique et leurs applications en analyse de survie et fiabilité." Thesis, Bordeaux, 2014. http://www.theses.fr/2014BORD0147/document.

Full text
Abstract:
This thesis explores dynamic regression models and assesses statistical inference for survival and reliability data analysis. The dynamic regression models considered include parametric proportional hazards and accelerated failure time models with possibly time-dependent covariates. We discuss the following problems. First, we present a generalized chi-squared test statistic Y2n that is suited to survival and reliability data in three cases: complete data, right-censored data, and right-censored data with covariates; we describe in detail the theory and the mechanics of using the Y2n statistic in survival and reliability data analysis. Next, we consider flexible parametric models and evaluate their statistical significance using the Y2n and log-likelihood test statistics. These include accelerated failure time (AFT) and proportional hazards (PH) models based on the Hypertabastic distribution, proposed to investigate the distribution of survival and reliability data in comparison with other parametric models. Simulation studies were designed to demonstrate the asymptotic normality of the maximum likelihood estimators of the Hypertabastic parameters and to validate the asymptotic properties of the Y2n statistic for the Hypertabastic distribution when the right-censoring probability equals 0% and 20%. In the last chapter, we apply the two parametric models to three real-life data sets. The first application uses the data of Freireich et al. on the comparison of two treatment groups, with additional information on the log white blood cell count, to test the ability of a therapy to prolong the remission times of acute leukemia patients; the Hypertabastic AFT model proved accurate for these data. The second application is a brain tumour study of malignant glioma patients, given by Sauerbrei and Schumacher; the best model was the Hypertabastic PH model after adding five significant covariates. The third application uses the data of Semenova and Bitukov on the survival times of multiple myeloma patients. We did not propose an exact model for this data set because the survival curves cross; we therefore suggest fitting a different dynamic model, such as the Simple Cross-Effect model, to these data.
APA, Harvard, Vancouver, ISO, and other styles
34

Mackových, Marek. "Regresní analýza EKG pro odhad polohy srdce vůči měřicím elektrodám." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2014. http://www.nusl.cz/ntk/nusl-220859.

Full text
Abstract:
This work focuses on regression analysis of morphological parameters calculated from the ECG for estimating the position of the heart relative to the measuring electrodes. It consists of a theoretical analysis of the problems of ECG recording and a description of the data obtained from experiments on isolated animal hearts. The theoretical part is followed by a description of the calculation of parameters suitable for regression analysis and their use in training and testing regression models that estimate the position of the heart relative to the measuring electrodes.
APA, Harvard, Vancouver, ISO, and other styles
35

Kozáček, Vojtěch. "Experimentální stanovení závislosti parametrů NDT a pevnosti v tlaku betonu." Master's thesis, Vysoké učení technické v Brně. Fakulta stavební, 2020. http://www.nusl.cz/ntk/nusl-409957.

Full text
Abstract:
The diploma thesis deals with non-destructive testing of concrete and with the relationship between the determined parameters and the compressive strength of concrete. It focuses mainly on the ultrasonic pulse velocity method and the rebound hammer test. The experimental part describes non-destructive tests performed on concrete blocks; compressive strength was tested on drill cores taken from the blocks. The aim of the thesis is to find regression models for the relationship between compressive strength and the non-destructive parameters, followed by an analysis of the results.
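A regression of core strength on the two NDT parameters, in the combined spirit the thesis describes, can be sketched as follows. The numbers are synthetic and the linear form is just one of the candidate models such a study would compare.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
upv = rng.uniform(3.8, 4.8, 40)        # ultrasonic pulse velocity [km/s]
rebound = rng.uniform(25, 45, 40)      # rebound number
fc = -80 + 20 * upv + 0.9 * rebound + rng.normal(0, 2, 40)   # core strength [MPa]

model = sm.OLS(fc, sm.add_constant(np.column_stack([upv, rebound]))).fit()
print(model.params, model.rsquared)    # fitted calibration and its fit quality
```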
APA, Harvard, Vancouver, ISO, and other styles
36

Bršlicová, Tereza. "Bezkontaktní detekce fyziologických parametrů z obrazových sekvencí." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2015. http://www.nusl.cz/ntk/nusl-221320.

Full text
Abstract:
This thesis deals with contactless, non-invasive methods for estimating heart and respiratory rate. Non-contact measurement involves recording persons with a camera; the values of the physiological parameters are then assessed from the image sequences using suitable approaches. The theoretical part is devoted to a description of the various methods and their implementation. The practical part describes the design and realization of an experiment for contactless detection of heart and respiratory rate. The experiment was carried out on 10 volunteers whose heart and respiratory rates were simultaneously recorded with the sophisticated BIOPAC system. Processing and analysis of the measured data were conducted in the Matlab environment. Finally, the results of the contactless detection were compared with the BIOPAC reference measurements, and the results are statistically evaluated and discussed.
APA, Harvard, Vancouver, ISO, and other styles
37

Savegnago, Rodrigo Pelicioni [UNESP]. "Modelos de regressão aleatória, análise multivariada e redes neurais artificiais na avaliação genética da produção de leite de vacas Holandesas." Universidade Estadual Paulista (UNESP), 2013. http://hdl.handle.net/11449/102800.

Full text
Abstract:
The objectives of this study were to estimate genetic parameters for milk yield using random regression models, to compare the expected genetic gain for this trait under different selection indexes, to use cluster and discriminant analyses to explore the genetic pattern of milk production and identify the animals most suitable for selection, and to investigate which information should be used in artificial neural networks to predict breeding values for milk yield up to 305 days in milk. Estimates of heritability for the monthly test-day milk yields ranged from 0.12 ± 0.04 to 0.31 ± 0.04. Estimates of genetic and permanent environmental correlations were close to one for adjacent test days and decreased as the time between test days increased. The magnitude of the heritability estimates indicated that milk yield should respond to selection, and the phase between 121 and 240 days in milk would show the best response due to the highest estimates in that period. Selection based on milk yield between 121 and 150 days in milk is recommended because of the high heritability of the trait in this phase and its high genetic correlations with milk yield in the other periods of lactation. Which selection index to use will depend, among other factors, on the selection goals of the breeding program. Indexes based on eigenvectors of the additive genetic matrix, with greater selection emphasis on persistency, are recommended if the goal is to improve milk yield and lactation persistency simultaneously; if the goal is to improve milk yield alone, the index based on breeding values for milk yield up to 305 days in milk would be the most appropriate, as it showed the greatest expected genetic gain for this trait. The breeding values for milk yield every 30 days were used as grouping variables for the animals. It was found that the population ...
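The covariables of such a random regression (test-day) model are Legendre polynomials evaluated on a standardized age scale; a sketch of how they are typically built is below. The normalization follows common animal-breeding usage, and the mixed-model machinery itself is omitted.

```python
import numpy as np

def legendre_covariables(days, d_min, d_max, order):
    # map days in milk to [-1, 1] and evaluate normalized Legendre polynomials,
    # the usual covariables of a random regression test-day model
    a = 2.0 * (days - d_min) / (d_max - d_min) - 1.0
    phi = np.polynomial.legendre.legvander(a, order)          # columns P_0..P_order
    return phi * np.sqrt((2.0 * np.arange(order + 1) + 1.0) / 2.0)

Z = legendre_covariables(np.array([5.0, 35.0, 95.0, 185.0, 305.0]), 5.0, 305.0, 4)
print(Z.shape)   # (5 test days, 5 polynomial covariables)
```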
APA, Harvard, Vancouver, ISO, and other styles
38

Savegnago, Rodrigo Pelicioni. "Modelos de regressão aleatória, análise multivariada e redes neurais artificiais na avaliação genética da produção de leite de vacas Holandesas /." Jaboticabal, 2013. http://hdl.handle.net/11449/102800.

Full text
Abstract:
Advisor: Danísio Prado Munari
Co-advisor: Lenira El Faro
Committee: João Ademir de Oliveira
Committee: Sandra Aidar de Queiroz
Committee: Cláudia Cristina Paro de Paz
Committee: José Bento Sterman Ferraz
Abstract: The objectives of this study were to estimate genetic parameters for milk yield using random regression models, to compare the expected genetic gain for this trait under different selection indexes, to use cluster and discriminant analyses to explore the genetic pattern of milk production and identify the animals most suitable for selection, and to investigate which information should be used in artificial neural networks to predict breeding values for milk yield up to 305 days in milk. Estimates of heritability for the monthly test-day milk yields ranged from 0.12 ± 0.04 to 0.31 ± 0.04. Estimates of genetic and permanent environmental correlations were close to one for adjacent test days and decreased as the time between test days increased. The magnitude of the heritability estimates indicated that milk yield should respond to selection, and the phase between 121 and 240 days in milk would show the best response due to the highest estimates in that period. Selection based on milk yield between 121 and 150 days in milk is recommended because of the high heritability of the trait in this phase and its high genetic correlations with milk yield in the other periods of lactation. Which selection index to use will depend, among other factors, on the selection goals of the breeding program. Indexes based on eigenvectors of the additive genetic matrix, with greater selection emphasis on persistency, are recommended if the goal is to improve milk yield and lactation persistency simultaneously; if the goal is to improve milk yield alone, the index based on breeding values for milk yield up to 305 days in milk would be the most appropriate, as it showed the greatest expected genetic gain for this trait. The breeding values for milk yield every 30 days were used as grouping variables for the animals. It was found that the population ...
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
39

Bertipaglia, Tássia Souza [UNESP]. "Estimativas de parâmetros genéticos para pesos do nascimento aos dois anos de idade para bovinos da raça Brahman utilizando modelos de regressão aleatória." Universidade Estadual Paulista (UNESP), 2013. http://hdl.handle.net/11449/92572.

Full text
Abstract:
The objective of this study was to estimate covariance functions using random regression models for the analysis of repeated weight measurements of Brahman cattle in Brazil. Genetic parameters were estimated from 88,788 weight records, from birth to 744 days of age, of 17,499 animals from the database of the Brazilian Association of Zebu Breeders (ABCZ). The models included direct additive genetic, maternal genetic, and animal permanent environmental effects as random, contemporary group as a fixed effect, and cow age at calving (quadratic) nested within the animal's age class as a covariate. The random regression analyses used fourth-order orthogonal Legendre polynomials to model the population mean trend. Residual variances were modeled either as homogeneous or with five age-class levels. The models were compared using the Schwarz Bayesian (BIC) and Akaike (AIC) information criteria. The best model indicated by these criteria fitted the direct genetic effect with a quadratic polynomial, the maternal genetic effect with a cubic polynomial, the animal permanent environmental effect with a cubic polynomial, and heterogeneous residual variances (5 levels). Heritability estimates for the direct effect were higher at the beginning and end of the studied period, with values of 0.47 at birth, 0.38 at 60 and 120 days, 0.53 at 205 days, 0.70 at 365 days, 0.76 at 550 days, and 0.52 at 744 days of age. Maternal heritability estimates were maximal at birth (0.16). Genetic correlations, except those involving birth weight, ranged from moderate to high and decreased as the distance between ages increased. Greater efficiency of selection for weight can be obtained by considering weights near weaning, the period in which the estimates of genetic variance and heritability were increasing.
APA, Harvard, Vancouver, ISO, and other styles
40

Bertipaglia, Tássia Souza. "Estimativas de parâmetros genéticos para pesos do nascimento aos dois anos de idade para bovinos da raça Brahman utilizando modelos de regressão aleatória /." Jaboticabal, 2013. http://hdl.handle.net/11449/92572.

Full text
Abstract:
Advisor: Ricardo da Fonseca
Committee: Danísio Prado Munari
Committee: Maria Eugênia Zerlotti Mercadante
Abstract: The objective of this study was to estimate covariance functions using random regression models for the analysis of repeated weight measurements of Brahman cattle in Brazil. Genetic parameters were estimated from 88,788 weight records, from birth to 744 days of age, of 17,499 animals from the database of the Brazilian Association of Zebu Breeders (ABCZ). The models included direct additive genetic, maternal genetic, and animal permanent environmental effects as random, contemporary group as a fixed effect, and cow age at calving (quadratic) nested within the animal's age class as a covariate. The random regression analyses used fourth-order orthogonal Legendre polynomials to model the population mean trend. Residual variances were modeled either as homogeneous or with five age-class levels. The models were compared using the Schwarz Bayesian (BIC) and Akaike (AIC) information criteria. The best model indicated by these criteria fitted the direct genetic effect with a quadratic polynomial, the maternal genetic effect with a cubic polynomial, the animal permanent environmental effect with a cubic polynomial, and heterogeneous residual variances (5 levels). Heritability estimates for the direct effect were higher at the beginning and end of the studied period, with values of 0.47 at birth, 0.38 at 60 and 120 days, 0.53 at 205 days, 0.70 at 365 days, 0.76 at 550 days, and 0.52 at 744 days of age. Maternal heritability estimates were maximal at birth (0.16). Genetic correlations, except those involving birth weight, ranged from moderate to high and decreased as the distance between ages increased. Greater efficiency of selection for weight can be obtained by considering weights near weaning, the period in which the estimates of genetic variance and heritability were increasing.
Master
APA, Harvard, Vancouver, ISO, and other styles
41

Balzotti, Christopher Stephen. "Multidisciplinary Assessment and Documentation of Past and Present Human Impacts on the Neotropical Forests of Petén, Guatemala." BYU ScholarsArchive, 2010. https://scholarsarchive.byu.edu/etd/2129.

Full text
Abstract:
Tropical forests provide important habitat for a tremendous diversity of plant and animal species. However, limitations in measuring and monitoring the structure and function of tropical forests have caused these systems to remain poorly understood. Remote-sensing technology provides a powerful tool for quantifying structural patterns and associating them with resource use. Satellite and aerial platforms can be used to collect remotely sensed images of tropical forests for ecological research and management. Chapter 1 highlights the resources available for tropical forest remote sensing and presents a case study demonstrating its application to a neotropical forest in the Petén region of northern Guatemala. The ancient polity of Tikal has been extensively studied by archaeologists and soil scientists, but little is known about the subsistence and ancient farming techniques that sustained its inhabitants. The objective of chapter 2 was to create predictive models for ancient maize (Zea mays L.) agriculture in Tikal National Park, Petén, Guatemala, improving our understanding of settlement patterns and the ecological potential surrounding the site in a cost-effective manner. Ancient maize agriculture was traced in this study through carbon (C) isotopic signatures left in the soil humin fraction. Probability models predicting C isotopic enrichment and carbonate C were used to outline areas of potential long-term maize agriculture. It was found that the Tikal area not only supports a great variety of potential food production systems, but the models also suggest that multiple maize agricultural practices were used.
APA, Harvard, Vancouver, ISO, and other styles
42

Nováková, Marie. "Mapování pohybových artefaktů ve fMRI." Master's thesis, Vysoké učení technické v Brně. Fakulta elektrotechniky a komunikačních technologií, 2013. http://www.nusl.cz/ntk/nusl-220039.

Full text
Abstract:
This thesis summarizes the theory of magnetic resonance and the functional magnetic resonance imaging method. It focuses on the influence of motion artifacts and on image preprocessing methods, especially realignment. It explores the possibility of using the movement parameters obtained when aligning the functional scans to create maps that show the expression of motion artifacts. Three different methods were designed, implemented, and tested, leading to probability, power, and group statistical maps showing areas typically affected by movement artifacts.
APA, Harvard, Vancouver, ISO, and other styles
43

Bhatti, Sajjad Haider. "Estimation of the mincerian wage model addressing its specification and different econometric issues." Phd thesis, Université de Bourgogne, 2012. http://tel.archives-ouvertes.fr/tel-00780563.

Full text
Abstract:
In the present doctoral thesis, we estimated Mincer's (1974) semi-logarithmic wage function for French and Pakistani labour force data. This model is considered a standard tool for estimating the relationship between earnings/wages and different contributory factors. Despite its wide and extensive use, simple estimation of the Mincerian model is biased because of different econometric problems; the main sources of bias noted in the literature are endogeneity of schooling, measurement error, and sample selectivity. We tackled the endogeneity and measurement error biases via an instrumental variables two-stage least squares approach, for which we proposed two new instrumental variables. The first is defined as "the average years of schooling in the family of the concerned individual" and the second as "the average years of schooling in the country, for the particular age group, of the particular gender, at the particular time when an individual joined the labour force". Schooling was found to be endogenous for both countries, and comparing the two instruments we selected the second as the more appropriate. We applied the Heckman (1979) two-step procedure to eliminate possible sample selection bias, which was found to be significantly positive for both countries: in both countries, people who decided not to participate in the labour force as wage workers would have earned less than participants had they decided to work as wage earners. We also estimated a specification that tackles endogeneity and sample selectivity together, given the relative scarcity of such studies worldwide and their absence for France and Pakistan in particular; differences in coefficients proved the worth of this specification. We further estimated the model semi-parametrically, but contrary to the general norm in the context of the Mincerian model, our semi-parametric estimation contained a non-parametric component from the first-stage schooling equation instead of one from the selection equation. For both countries, we found the parametric model to be more appropriate. We found the errors to be heteroscedastic for the data from both countries and then applied adaptive estimation to control the adverse effects of heteroscedasticity; comparing simple and adaptive estimation, we prefer the adaptive specification of the parametric model for both countries. Finally, we applied quantile regression to the model selected from mean regression. Quantile regression revealed that the explanatory factors influence different parts of the wage distributions of the two countries differently. For both Pakistan and France, this is the first study that corrects both sample selectivity and endogeneity in a single specification within a quantile regression framework.
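The final step, quantile regression on a Mincer-type equation, is easy to sketch with statsmodels. This toy version omits the instrumental-variable and Heckman corrections that are the thesis's actual contribution, and the coefficients are invented for the simulation only.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
n = 1000
df = pd.DataFrame({
    "schooling": rng.integers(5, 18, n).astype(float),
    "experience": rng.uniform(0, 30, n),
})
df["log_wage"] = (0.08 * df["schooling"] + 0.04 * df["experience"]
                  - 0.0007 * df["experience"] ** 2 + rng.normal(0, 0.4, n))

for tau in (0.10, 0.50, 0.90):
    fit = smf.quantreg("log_wage ~ schooling + experience + I(experience**2)", df).fit(q=tau)
    print(tau, round(fit.params["schooling"], 4))   # return to schooling across the distribution
```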
APA, Harvard, Vancouver, ISO, and other styles
44

Cardozo, Sandra Vergara. "Função da probabilidade da seleção do recurso (RSPF) na seleção de habitat usando modelos de escolha discreta." Universidade de São Paulo, 2009. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-11032009-143806/.

Full text
Abstract:
In ecology, the behavior of animals is often studied to better understand their preferences for different types of food and habitat. The present work addresses this topic in three chapters. The first concerns the estimation of a resource selection probability function (RSPF) compared with a single-choice discrete choice model (DCM), using chi-squared statistics to obtain the estimates; the best estimates were obtained by the single-choice DCM method. Animals, however, do not base their selection on a single choice, and with the RSPF the maximum likelihood estimates used in logistic regression still did not reach the objectives, since the animals have more than one choice. R, Minitab, and the Fortran programming language were used for the computations in this chapter. The second chapter discusses further the likelihood of the first chapter: a new likelihood for the RSPF is presented which takes into account the used and unused units, and parametric and non-parametric bootstrap methods are employed to study the bias and variance of the parameter estimators, using a Fortran program for the calculations. In the third chapter, the new likelihood of chapter 2 is combined with a discrete choice model to solve part of the problem presented in the first chapter. A nested structure is proposed for modelling habitat selection by 28 spotted owls (Strix occidentalis), as is a generalized nested logit model using random utility maximization and a random RSPF. Numerical optimization methods and the SAS system were employed to estimate the nested structural parameters.
APA, Harvard, Vancouver, ISO, and other styles
45

Silva, Kesley Leandro da. "Estratégias de momentum no mercado cambial." reponame:Repositório Institucional do FGV, 2016. http://hdl.handle.net/10438/15773.

Full text
Abstract:
I use weekly data to investigate the profitability of momentum strategies in the currency market based on two different methods of extracting a possibly nonlinear trend. I compare their performance with traditional moving-average rules, a linear method widely used by market practitioners. I find that the performance of all strategies is extremely sensitive to the choice of currency, the lags used and the evaluation criterion. Even so, the G10 currencies show better average results under the nonlinear methods, while emerging-market currencies show mixed results. I also adopt a risk-management methodology for the momentum strategies, aimed at minimising large losses. It succeeds in lowering the maximum weekly losses, the standard deviation, the skewness and the kurtosis for most currencies under both strategies. In terms of performance, risk-managed trading based on the HP filter delivers higher returns and Sharpe ratios for about 70% of the strategies, while trading based on nonparametric regression does better for about 60% of the strategies.
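As a sketch of the first trend-extraction idea, the fragment below builds an HP-filter momentum signal on weekly exchange rates. It is a simplification under stated assumptions: the thesis's exact lags, currencies and risk-management layer are not reproduced, the smoothing parameter for weekly data is itself a modelling choice, and filtering the full sample as done here would leak future information in a real backtest.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.filters.hp_filter import hpfilter

def hp_momentum_returns(fx: pd.Series, lamb: float = 270400.0) -> pd.Series:
    """Long when the HP-filtered trend is rising, short when falling.

    fx: weekly spot rates; lamb: HP smoothing parameter (an assumption here,
    not the thesis's calibration).
    """
    log_px = np.log(fx)
    _, trend = hpfilter(log_px, lamb=lamb)      # hpfilter -> (cycle, trend)
    signal = np.sign(trend.diff()).shift(1)     # trade on last week's slope
    return signal * log_px.diff()               # strategy weekly log-returns

# Example on simulated data (10 years of weekly observations):
rng = np.random.default_rng(0)
fx = pd.Series(np.exp(np.cumsum(rng.normal(0.0, 0.01, 520))))
r = hp_momentum_returns(fx).dropna()
print("annualised Sharpe:", np.sqrt(52) * r.mean() / r.std())
```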
APA, Harvard, Vancouver, ISO, and other styles
46

Matias, Stephane Paul Jordão. "Análise paramétrica do consumo de electricidade e água para o comércio alimentar a retalho e grossista." Master's thesis, Instituto Superior de Economia e Gestão, 2012. http://hdl.handle.net/10400.5/10327.

Full text
Abstract:
Master's dissertation in Economic and Business Decision (Decisão Económica e Empresarial)
Electricity and water consumption has been the subject of several studies seeking to understand what drives it and to find solutions that improve the economic and environmental performance of organisations. In this vein, the present work develops a parametric analysis, using the linear regression model, to identify the variables on which water and electricity consumption depend for the retail and cash-and-carry formats. The analysis made it possible to assess the relevance of several variables in explaining consumption across the various establishments of the Jerónimo Martins group, and to detect establishments with extreme consumption.
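A parametric analysis of this kind is straightforward to reproduce. The sketch below is entirely illustrative: the covariates, synthetic data and outlier threshold are assumptions, not those of the dissertation. It regresses store-level electricity use on store characteristics with OLS and flags establishments with extreme consumption via studentized residuals.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 200
df = pd.DataFrame({
    "area_m2": rng.uniform(300, 5000, n),          # sales area
    "hours_open": rng.integers(70, 110, n),        # weekly opening hours
    "format": rng.choice(["retail", "cash_carry"], n),
})
df["kwh"] = (50 * df["area_m2"] + 800 * df["hours_open"]
             + rng.normal(0, 2e4, n))              # synthetic consumption

model = smf.ols("kwh ~ area_m2 + hours_open + C(format)", data=df).fit()
print(model.summary())                             # which variables matter

resid = model.get_influence().resid_studentized_internal
print(df[np.abs(resid) > 2])                       # candidate extreme stores
```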
APA, Harvard, Vancouver, ISO, and other styles
47

Kamari, Halaleh. "Qualité prédictive des méta-modèles construits sur des espaces de Hilbert à noyau auto-reproduisant et analyse de sensibilité des modèles complexes." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASE010.

Full text
Abstract:
This work addresses the estimation of a meta-model of a complex model, denoted m. The model m depends on d input variables X1, ..., Xd that are independent and have a known distribution. The meta-model, denoted f∗, approximates the Hoeffding decomposition of m and allows its Sobol indices to be estimated. It belongs to a reproducing kernel Hilbert space (RKHS), denoted H, constructed as a direct sum of Hilbert spaces (Durrande et al. (2013)). The estimator of the meta-model, denoted f^, is obtained by minimising a least-squares criterion penalised by the sum of the Hilbert norm and the empirical L2-norm (Huet and Taupin (2017)). This procedure, called RKHS ridge group sparse, both selects and estimates the terms of the Hoeffding decomposition, and therefore selects the non-zero Sobol indices and estimates them. It makes it possible to estimate Sobol indices even of high order, a point known to be difficult in practice. The work comprises a theoretical part and a practical part. In the theoretical part, I established upper bounds on the empirical L2 risk and the L2 risk of the estimator f^ in a regression model with non-Gaussian, unbounded errors; that is, upper bounds, in the L2-norm and the empirical L2-norm, on the distance between the model m and its estimator f^ in the RKHS H. In the practical part, I developed an R package, called RKHSMetaMod, that implements the RKHS ridge group sparse procedure and a special case of it, the RKHS group lasso procedure. The package applies equally to a model m that can be evaluated at any point and to an unknown regression model. To optimise execution time and memory use, all of the package's functions except one are written using the C++ libraries GSL and Eigen and then interfaced with the R environment so as to offer a user-friendly package. The performance of the package's functions, in terms of the predictive quality of the estimator and the estimation of the Sobol indices, is validated by a simulation study.
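The overall workflow, fitting a kernel-based meta-model and then reading Sobol indices off it, can be caricatured in a few lines. The sketch below is a simplification, not the RKHSMetaMod procedure: plain kernel ridge stands in for the ridge group sparse penalty, the Ishigami test function stands in for m, and first-order indices are estimated by Monte Carlo pick-freeze.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def ishigami(X, a=7.0, b=0.1):                  # classical sensitivity test case
    return (np.sin(X[:, 0]) + a * np.sin(X[:, 1]) ** 2
            + b * X[:, 2] ** 4 * np.sin(X[:, 0]))

rng = np.random.default_rng(0)
d, n = 3, 400
X_train = rng.uniform(-np.pi, np.pi, (n, d))
meta = KernelRidge(kernel="rbf", alpha=1e-3, gamma=0.5)
meta.fit(X_train, ishigami(X_train))            # cheap surrogate of the model

def first_order_sobol(predict, j, n_mc=20000):
    """Saltelli-style pick-freeze estimate of the first-order index S_j."""
    A = rng.uniform(-np.pi, np.pi, (n_mc, d))
    B = rng.uniform(-np.pi, np.pi, (n_mc, d))
    B_j = B.copy()
    B_j[:, j] = A[:, j]                         # freeze coordinate j
    yA, yB, yBj = predict(A), predict(B), predict(B_j)
    return np.mean(yA * (yBj - yB)) / np.var(np.concatenate([yA, yB]))

for j in range(d):
    print(f"S{j + 1} estimated on the meta-model:",
          round(first_order_sobol(meta.predict, j), 2))
```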
APA, Harvard, Vancouver, ISO, and other styles
48

Liley, Albert James. "Statistical co-analysis of high-dimensional association studies." Thesis, University of Cambridge, 2017. https://www.repository.cam.ac.uk/handle/1810/270628.

Full text
Abstract:
Modern medical practice and science involve complex phenotypic definitions. Understanding patterns of association across this range of phenotypes requires co-analysis of high-dimensional association studies in order to characterise shared and distinct elements. In this thesis I address several problems in this area, with the general linking aim of making more efficient use of available data. The main application of these methods is the analysis of genome-wide association studies (GWAS) and similar studies. First, I developed methodology for a Bayesian conditional false discovery rate (cFDR) for leveraging GWAS results using summary statistics from a related disease. I extended an existing method to enable a shared-control design, increasing power and applicability, and developed an approximate bound on the false-discovery rate (FDR) for the procedure. Using the new method I identified several new variant-disease associations. I then developed a second application of the shared-control design in the context of study replication, improving power at the cost of changing the spectrum of sensitivity to systematic errors in study cohorts; this has applications in studies of rare diseases and in between-case analyses. Next, I developed a method for partially characterising heterogeneity within a disease by modelling the bivariate distribution of case-control and within-case effect sizes. Using an adaptation of a likelihood-ratio test, this allows an assessment of whether disease heterogeneity corresponds to differences in disease pathology. I applied this method to a range of simulated and real datasets, gaining insight into the cause of heterogeneity in autoantibody positivity in type 1 diabetes (T1D). Finally, I investigated the relation of subtypes of juvenile idiopathic arthritis (JIA) to adult diseases, using modified genetic risk scores and linear discriminants in a penalised regression framework. The contribution of this thesis is a range of methodological developments in the comparative analysis of high-dimensional association studies. Such methods will have wide application in the analysis of GWAS and related areas, particularly in the development of stratified medicine.
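For orientation, the classical empirical cFDR estimator that this line of work builds on is very compact: cFDR(p, q) ≈ p · Pr(Q ≤ q) / Pr(P ≤ p, Q ≤ q), estimated from the observed pairs of p-values. The sketch below implements only that textbook version; the thesis's Bayesian extension and shared-control adjustment are not shown.

```python
import numpy as np

def cfdr(p, q):
    """Empirical cFDR for each variant's pair of p-values.

    p: p-values for the principal phenotype; q: p-values for the
    conditional phenotype at the same variants.
    """
    p, q = np.asarray(p, float), np.asarray(q, float)
    n = p.size
    out = np.empty(n)
    for i in range(n):
        joint = np.mean((p <= p[i]) & (q <= q[i]))   # Pr(P<=p, Q<=q)
        out[i] = p[i] * np.mean(q <= q[i]) / max(joint, 1.0 / n)
    return np.minimum(out, 1.0)

# Example: count variants with cFDR below 0.01
rng = np.random.default_rng(0)
p, q = rng.uniform(size=5000), rng.uniform(size=5000)
print(np.sum(cfdr(p, q) < 0.01))  # ~0 under the global null
```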
APA, Harvard, Vancouver, ISO, and other styles
49

Winkler, Anderson M. "Widening the applicability of permutation inference." Thesis, University of Oxford, 2016. https://ora.ox.ac.uk/objects/uuid:ce166876-0aa3-449e-8496-f28bf189960c.

Full text
Abstract:
This thesis is divided into three main parts. In the first, we note that although permutation tests can provide exact control of false positives under the reasonable assumption of exchangeability, there are common settings in which global exchangeability does not hold, such as experiments with repeated measurements or tests in which subjects are related to each other. To allow permutation inference in such cases, we propose an extension of the well-known concept of exchangeability blocks, allowing these to be nested in a hierarchical, multi-level definition. This definition permits only permutations that leave the original joint distribution unaltered, thus preserving exchangeability, and the null hypothesis is tested using only this subset of all otherwise possible permutations. We do not need to model the degree of dependence between observations explicitly; rather, such a permutation scheme leaves any dependence intact. The strategy is compatible with heteroscedasticity and can be used with permutations, sign flippings, or both combined. In the second part, we exploit properties of test statistics to obtain accelerations that are independent of generic software or hardware improvements. We compare six different approaches using synthetic and real data, assessing the methods in terms of their error rates, power, agreement with a reference result, and the risk of reaching a different decision on rejecting the null hypothesis (the resampling risk). In the third part, we investigate and compare methods for assessing cortical volume and area from magnetic resonance images using surface-based methods. Using data from young adults born with very low birth weight and coetaneous controls, we show that, instead of volume, the permutation-based non-parametric combination (NPC) of thickness and area is a more sensitive option for studying joint effects on these two quantities, giving equal weight to variation in both and allowing a better characterisation of biological processes that can affect brain morphology.
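To make the block idea concrete, here is a minimal one-sample sign-flipping test in which signs are flipped per block rather than per observation, so within-block dependence is left intact. This is an illustrative simplification of whole-block exchangeability under assumed data, not the thesis's multi-level scheme or any particular software implementation.

```python
import numpy as np

def sign_flip_test(d, blocks, n_perm=10000, seed=0):
    """One-sample test on paired differences d, flipping signs per block.

    blocks: block label for each observation; observations in the same
    block always receive the same sign, preserving within-block dependence.
    """
    rng = np.random.default_rng(seed)
    d = np.asarray(d, dtype=float)
    labels = np.unique(blocks)
    idx = np.searchsorted(labels, blocks)       # block index per observation
    t_obs = d.mean()
    count = 1                                   # the identity flip
    for _ in range(n_perm):
        flips = rng.choice([-1.0, 1.0], size=labels.size)
        if abs((flips[idx] * d).mean()) >= abs(t_obs):
            count += 1
    return count / (n_perm + 1)                 # permutation p-value

# Example: 40 observations in 10 blocks of 4, true mean shift of 0.5
rng = np.random.default_rng(1)
blocks = np.repeat(np.arange(10), 4)
d = 0.5 + rng.normal(size=40) + rng.normal(size=10)[blocks]  # block effects
print(sign_flip_test(d, blocks))
```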
APA, Harvard, Vancouver, ISO, and other styles
50

Ahmed, Mohamed Salem. "Contribution à la statistique spatiale et l'analyse de données fonctionnelles." Thesis, Lille 3, 2017. http://www.theses.fr/2017LIL30047/document.

Full text
Abstract:
This thesis concerns statistical inference for spatial and/or functional data. We are interested in estimating the unknown parameters of several models from samples obtained by random or non-random (stratified) sampling schemes and composed of independent or spatially dependent variables. The specificity of the proposed methods is that they take the nature of the sample under study (stratified, or composed of spatially dependent data) into account. We begin with data valued in a space of infinite dimension, so-called functional data. First, we study a functional binary choice model in a context of endogenous stratified sampling (case-control or choice-based sampling). The specificity of this study is that the proposed method takes the sampling scheme into consideration. We describe a conditional likelihood function under the considered sampling distribution, together with a dimension-reduction strategy, to define a feasible conditional maximum likelihood estimator of the model. Asymptotic properties of the proposed estimators are established, and they are applied to simulated and real data. Second, we study a functional linear autoregressive spatial model whose particularity lies in the functional nature of the explanatory variable and in the structure of the spatial dependence among the sampled variables. The estimation procedure consists of reducing the infinite dimension of the functional explanatory variable and maximising a quasi-likelihood associated with the model. We establish the consistency and asymptotic normality of the estimator and illustrate its numerical performance via simulations and an application to real data. In the second part of the thesis, we address regression and prediction problems for real-valued spatial variables. We first generalise the k-nearest neighbours (k-NN) method to predict a spatial process at unobserved sites in the presence of spatial covariates, the specificity of the proposed predictor being that it accommodates heterogeneity in the covariate. We establish the almost-complete convergence, with rate, of the predictor, whose performance is demonstrated on simulated and environmental data. We then generalise the partially linear probit model for independent data to spatial data, using a linear spatial process to model the disturbances, which allows greater flexibility and encompasses several types of spatial dependence. We propose a semiparametric estimation approach based on weighted likelihood and the generalised method of moments, establish the consistency and asymptotic distribution of the proposed estimators, and investigate their finite-sample performance on simulated data. Our contribution closes with a study using spatial binary choice models to identify risk factors for UADT (upper aerodigestive tract) cancers in the Nord region of France, which displays the country's highest incidence and mortality rates for these cancers.
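The k-NN idea of the second part is easy to sketch. The fragment below is an illustrative simplification under assumed data: the thesis's predictor handles covariate heterogeneity more carefully, whereas here a single tuning weight simply balances geographic distance against covariate distance.

```python
import numpy as np

def knn_spatial_predict(s0, x0, S, X, Y, k=5, alpha=0.5):
    """Predict the process at site s0 with covariate value x0.

    S: (n, 2) observed site coordinates; X: (n,) covariate; Y: (n,) responses.
    alpha balances geographic distance against covariate distance.
    """
    d_space = np.linalg.norm(S - s0, axis=1)
    d_covar = np.abs(X - x0)
    d = alpha * d_space / d_space.max() + (1 - alpha) * d_covar / d_covar.max()
    nearest = np.argsort(d)[:k]                 # indices of the k closest units
    return Y[nearest].mean()

# Example with simulated data:
rng = np.random.default_rng(0)
S = rng.uniform(0, 10, (200, 2))
X = rng.normal(size=200)
Y = np.sin(S[:, 0]) + 0.5 * X + rng.normal(0, 0.1, 200)
print(knn_spatial_predict(np.array([5.0, 5.0]), 0.2, S, X, Y))
```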
APA, Harvard, Vancouver, ISO, and other styles