Dissertations / Theses on the topic 'Linear regression'
Consult the top 50 dissertations / theses for your research on the topic 'Linear regression.'
Bai, Xue. "Robust linear regression." Kansas State University, 2012. http://hdl.handle.net/2097/14977.
Department of Statistics
Weixin Yao
In practice, when applying a statistical method, it often occurs that some observations deviate from the usual model assumptions. Least-squares (LS) estimators are very sensitive to outliers: even a single atypical value may have a large effect on the regression parameter estimates. The goal of robust regression is to develop methods that are resistant to the possibility that one or several unknown outliers may occur anywhere in the data. In this paper, we review various robust regression methods, including the M-estimate, LMS estimate, LTS estimate, S-estimate, τ-estimate, MM-estimate, GM-estimate, and REWLS estimate. Finally, we compare these robust estimates on their robustness and efficiency through a simulation study. A real data set application is also provided to compare the robust estimates with the traditional least squares estimator.
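To make the sensitivity of least squares to outliers concrete, the following is a minimal sketch of one M-estimate, assuming the Huber loss and an iteratively reweighted least squares (IRLS) fit. The function name `huber_irls`, the tuning constant 1.345, and the synthetic data are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def huber_irls(X, y, delta=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate of regression coefficients via IRLS (illustrative sketch)."""
    X = np.column_stack([np.ones(len(y)), X])      # add an intercept column
    beta = np.linalg.lstsq(X, y, rcond=None)[0]    # start from the OLS fit
    for _ in range(n_iter):
        r = y - X @ beta
        # Robust residual scale: median absolute deviation, rescaled for normal data.
        scale = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        u = np.abs(r) / scale
        # Huber weights: 1 for small residuals, delta/|u| for large ones.
        w = np.where(u <= delta, 1.0, delta / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Synthetic data (assumed for illustration): true line y = 1 + 2x, five gross outliers.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)
y[-5:] += 30.0

beta_ols = np.linalg.lstsq(np.column_stack([np.ones_like(x), x]), y, rcond=None)[0]
beta_huber = huber_irls(x, y)
print("OLS slope:  ", beta_ols[1])
print("Huber slope:", beta_huber[1])
```

In this setup the Huber slope should stay much closer to the true value of 2 than the OLS slope does, because the outliers are down-weighted rather than fitted.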
Hernandez, Erika Lyn. "Parameter Estimation in Linear-Linear Segmented Regression." Diss., 2010. http://contentdm.lib.byu.edu/ETD/image/etd3551.pdf.
Ollikainen, Kati. "Parameter Estimation in Linear Regression." Doctoral diss., University of Central Florida, 2006. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/4138.
Ph.D.
Department of Industrial Engineering and Management Systems
Engineering and Computer Science
Industrial Engineering and Management Systems
Chen, Xinyu. "Inference in Constrained Linear Regression." Digital WPI, 2017. https://digitalcommons.wpi.edu/etd-theses/405.
Waterman, Megan Janet Tuttle. "Linear Mixed Model Robust Regression." Diss., Virginia Tech, 2002. http://hdl.handle.net/10919/27708.
Ph.D.
Ratnasingam, Suthakaran. "Sequential Change-point Detection in Linear Regression and Linear Quantile Regression Models Under High Dimensionality." Bowling Green State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu159050606401363.
Rettes, Julio Alberto Sibaja. "Robust algorithms for linear regression and locally linear embedding." Repositório Institucional da UFC, 2017. http://www.repositorio.ufc.br/handle/riufc/22445.
Nowadays a very large quantity of data is flowing around our digital society. There is a growing interest in converting this large amount of data into valuable and useful information. Machine learning plays an essential role in the transformation of data into knowledge. However, the probability of outliers inside the data is too high to marginalize the importance of robust algorithms. To understand that, various models of outliers are studied. In this work, several robust estimators within the generalized linear model for regression framework are discussed and analyzed: namely, the M-Estimator, the S-Estimator, the MM-Estimator, the RANSAC and the Theil-Sen estimator. This choice is motivated by the necessity of examining algorithms with different working principles. In particular, the M-, S-, MM-Estimator are based on a modification of the least squares criterion, whereas the RANSAC is based on finding the smallest subset of points that guarantees a predefined model accuracy. The Theil Sen, on the other hand, uses the median of least square models to estimate. The performance of the estimators under a wide range of experimental conditions is compared and analyzed. In addition to the linear regression problem, the dimensionality reduction problem is considered. More specifically, the locally linear embedding, the principal component analysis and some robust approaches of them are treated. Motivated by giving some robustness to the LLE algorithm, the RALLE algorithm is proposed. Its main idea is to use different sizes of neighborhoods to construct the weights of the points; to achieve this, the RAPCA is executed in each set of neighbors and the risky points are discarded from the corresponding neighborhood. The performance of the LLE, the RLLE and the RALLE over some datasets is evaluated.
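As a rough illustration of one of the estimators named above, here is a sketch of the Theil-Sen estimator for simple linear regression, assuming the classical formulation in which the slope is the median of the slopes through all pairs of points (each pairwise line is the least squares fit to that pair). The function name `theil_sen` and the toy data are assumptions made for this example.

```python
import numpy as np
from itertools import combinations

def theil_sen(x, y):
    """Theil-Sen fit for one predictor: median of pairwise slopes (illustrative sketch)."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]                      # skip pairs with identical x
    ]
    slope = np.median(slopes)
    intercept = np.median(y - slope * x)     # median residual offset
    return slope, intercept

# Toy data (assumed): exact line y = 1 + 3x with one corrupted response.
x = np.arange(10.0)
y = 3.0 * x + 1.0
y[9] = 100.0

slope, intercept = theil_sen(x, y)
ols_slope = np.polyfit(x, y, 1)[0]
print("Theil-Sen:", slope, intercept)
print("OLS slope:", ols_slope)
```

Because 36 of the 45 pairwise slopes are unaffected by the corrupted point, the median recovers the underlying line, while the least squares slope is pulled upward.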
Peraça, Maria da Graça Teixeira. "Modelos para estimativa do grau de saturação do concreto mediante variáveis ambientais que influenciam na sua variação." Repositório Institucional da FURG, 2009. http://repositorio.furg.br/handle/1/3436.
In engineering, it is fundamental to estimate the service life of built structures, which in this study means the time required for chloride ions to reach the concrete reinforcement. One of the coefficients that affect the service life of concrete is the diffusion coefficient, which is directly influenced by the saturation degree (SD) of the concrete. Recent studies have led to the development of a method for measuring the SD. Although this method is efficient, using it still costs considerable time and money. The objective of this study is to reduce these costs by computing a good approximation of the SD with mathematical models that estimate its value from the environmental variables that affect its variation. The variables analysed are: atmospheric pressure, dry-air temperature, maximum temperature, minimum temperature, internal evaporation rate (Pichê), precipitation rate, relative humidity, insolation, visibility, cloudiness and external evaporation rate. All of them were statistically analysed and compared with SD measurements obtained over four years of weekly readings for different families of concrete. These analyses measured the relationship among the data and showed that the variables with the most influence on the SD are maximum temperature and relative humidity. Statistical models were then developed to calculate, from the environmental data provided by the meteorological database and without the cost in time and money, the approximate average SD for each season in the southern region of Brazil, thus providing a better estimate of the service life of concrete structures.
Bocci, Cynthia Jacqueline. "Linear regression with spatially correlated data." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1999. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape10/PQDD_0012/NQ52271.pdf.
Mahmood, Nozad. "Sparse Ridge Fusion For Linear Regression." Master's thesis, University of Central Florida, 2013. http://digital.library.ucf.edu/cdm/ref/collection/ETD/id/5986.
M.S.
Masters
Statistics
Sciences
Statistical Computing
Cao, Chendi. "Linear regression with Laplace measurement error." Kansas State University, 2016. http://hdl.handle.net/2097/32719.
Statistics
Weixing Song
In this report, an improved estimation procedure is proposed for the regression parameter in simple linear regression models with Laplace measurement error. The estimation procedure is made feasible by a Tweedie-type equality established for E(X|Z), where Z = X + U, X and U are independent, and U follows a Laplace distribution. When the density function of X is unknown, a kernel estimator for E(X|Z) is constructed in the estimation procedure. A leave-one-out cross-validation bandwidth selection method is designed. The finite sample performance of the proposed estimation procedure is evaluated by simulation studies. A comparison study is also conducted to show the superiority of the proposed estimation procedure over some existing estimation methods.
Gündüz, Necla. "D-optimal designs for weighted linear regression and binary regression models." Thesis, University of Glasgow, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.301629.
Rodrigues, Cátia Sofia Martins. "Quais os fatores que determinam o rendimento dos indivíduos em Portugal? - Regressão de Quantis." Master's thesis, Instituto Superior de Economia e Gestão, 2021. http://hdl.handle.net/10400.5/23425.
Although income inequality has decreased significantly over the years, the issue is still under study, mainly from an econometric perspective, with the aim of identifying and understanding the factors behind those inequalities. The main focus of this project is to identify and study the factors that determine the income of individuals living in Portugal, adopting a quantile regression approach, since groups of individuals with different income levels may behave differently. For this purpose, a regression model was built using data from Statistics Portugal (INE). The variable under study is the annual income of residents in Portugal in 2019, and the model has eight regressors that characterize not only the individual, such as age, sex or marital status, but also the employer, such as its size and the number of working hours. The estimation results show that some factors, namely the individual's gender, level of education and region of residence, are responsible for significant differences in annual income. However, these differences are not uniform across groups: they behave differently for individuals with lower, medium or high incomes. This nonlinear behavior also shows the advantage of quantile regression over the most common econometric method, linear regression, whose objective is to estimate the effect of the explanatory variables on the mean of the dependent variable. The database was built using SQL Developer and the analysis was conducted in Stata.
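The contrast between modelling the mean and modelling a quantile can be sketched with the check (pinball) loss that quantile regression minimizes in place of squared error. The snippet below is only an illustration on a constant-only model with made-up numbers; the function name `pinball_loss` is an assumption, and it is not the Stata analysis used in the thesis.

```python
import numpy as np

def pinball_loss(residuals, tau):
    """Average check loss: tau*r for r >= 0 and (tau - 1)*r for r < 0."""
    r = np.asarray(residuals, dtype=float)
    return np.mean(np.where(r >= 0, tau * r, (tau - 1.0) * r))

# Made-up incomes with one very high value (assumed data).
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Constant-only model: search the observed values for the best-fitting constant.
best_q50 = min(y, key=lambda c: pinball_loss(y - c, 0.5))
best_q90 = min(y, key=lambda c: pinball_loss(y - c, 0.9))

print("mean (least squares fit):", y.mean())
print("tau = 0.5 fit:", best_q50)
print("tau = 0.9 fit:", best_q90)
```

With tau = 0.5 the loss is minimized at the median, which the extreme value barely moves, whereas the least squares fit (the mean) is dominated by it. Changing tau targets a different part of the distribution, which is what lets quantile regression describe low, medium and high income groups separately.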
Edlund, Ove. "Solution of linear programming and non-linear regression problems using linear M-estimation methods." Luleå, 1999. http://epubl.luth.se/1402-1544/1999/17/index.html.
Bullas, J. M. David. "K-nearest neighbours with weighted linear regression." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp01/MQ34340.pdf.
Hamzah, Nor Aishah. "Robust regression estimation in generalized linear models." Thesis, University of Bristol, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.294372.
Ah-Kine, Pascal Soon Shien. "Simultaneous confidence bands in linear regression analysis." Thesis, University of Southampton, 2010. https://eprints.soton.ac.uk/167557/.
Essomba, Rene Franck. "An investigation into Functional Linear Regression Modeling." Master's thesis, University of Cape Town, 2015. http://hdl.handle.net/11427/15591.
Gormley, Nolan D. "Knotilus: A Differentiable Piecewise Linear Regression Framework." Bowling Green State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1617222994436272.
Khogasteh, Sam, and Edvin Wiorek. "Predicting Influencer Actual Reach Using Linear Regression." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-299339.
In recent years, the influencer marketing industry has grown drastically, yet the effectiveness of this form of marketing remains relatively unexplored. This report uses linear regression to explore how different performance metrics are related to the reach of social media profiles. The data sets were collected manually or by web scraping. By splitting the data into training and test sets, we examined how well the linear regression model can predict actual reach, page views, and a profile's growth over one week. We concluded that there is a statistically significant correlation between several performance metrics of a profile page and the number of page views for that account. The study is, however, limited by its data volume and time span, which motivates future studies to further establish the degree of correlation. The results can benefit companies in choosing which influencers to collaborate with, as well as in determining the expected return of a specific collaboration. This can in turn contribute to a more efficient, authentic and transparent market, which also leaves consumers less exposed to marketing from misleading and malicious influencers.
Mirzayeva, Hijran. "Nonsmooth optimization algorithms for clusterwise linear regression." Thesis, University of Ballarat, 2013. http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/41975.
Doctor of Philosophy
Smith, David McCulloch. "Regression using QR decomposition methods." Thesis, University of Kent, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.303532.
Möls, Märt. "Linear mixed models with equivalent predictors." Online version, 2004. http://dspace.utlib.ee/dspace/bitstream/10062/1339/5/Mols.pdf.
Forslund, Gustaf, and David Åkesson. "Predicting share price by using Multiple Linear Regression." Thesis, KTH, Farkost och flyg, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-140645.
Aldahmani, Saeed. "High-dimensional linear regression problems via graphical models." Thesis, University of Essex, 2017. http://repository.essex.ac.uk/19207/.
Saleem, Aban, and Jacob Blomgren. "Modelling Pupils’ Grades with Multiple Linear Regression Model." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-275672.
This thesis, in mathematical statistics and industrial economics, was carried out to analyse final grades for year 9 in Swedish schools. The aim was to understand which variables had a statistically significant effect on pupils' final grades, so that municipalities can understand which variables matter for improving average school results. A regression analysis was performed on data from the Swedish National Agency for Education (Skolverket) to see which variables were statistically significant. The final regression model, obtained by iteratively reducing the set of variables, showed that mainly structural covariates, such as pupils' academic background, the share of female students and the share of students with a Swedish background, had a significant effect on students' academic results. The adjusted R2 of the final model was 0.5289. In the discussion, the model was evaluated against previous research. In addition, the balanced scorecard theory developed by Robert S. Kaplan and David P. Norton was used to discuss relevant key performance indicators for achieving the school's strategic goals.
Brodbeck, William Joseph. "The Effect of Readability on Simple Linear Regression." Bowling Green State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1591867761661656.
Bunea, Florentina. "A model selection approach to partially linear regression." Thesis, 2000. http://hdl.handle.net/1773/8971.
Mahmood, Arshad. "Rainfall prediction in Australia : Clusterwise linear regression approach." Thesis, Federation University Australia, 2017. http://researchonline.federation.edu.au/vital/access/HandleResolver/1959.17/159251.
Doctor of Philosophy
Sardy, Sylvain. "A Comparison of Two Linear Nonparametric Regression Techniques." DigitalCommons@USU, 1992. https://digitalcommons.usu.edu/etd/7123.
Nunes, Hélio Rubens de Carvalho. "Ponderação Bayesiana de modelos em regressão linear clássica." Universidade de São Paulo, 2005. http://www.teses.usp.br/teses/disponiveis/11/11134/tde-16112005-155133/.
The objective of this work was to introduce Bayesian Model Averaging (BMA) to researchers in agronomy and to discuss its advantages and limitations. BMA makes it possible to combine the results of different models for a given quantity of interest, and so presents itself as an alternative data-analysis methodology to the usual model selection approaches, for example the coefficient of multiple determination (R2), the adjusted coefficient of multiple determination (adjusted R2), Mallows' Cp statistic and the prediction error sum of squares (PRESS). Several recent studies have compared the performance of BMA with that of model selection approaches; however, many situations remain to be explored before a general conclusion about this methodology can be reached. In this work, BMA was applied to data from an agronomic experiment. The predictive performance of BMA was then compared with that of the selection approaches cited above in a simulation study that varied the degree of multicollinearity, measured by the condition number of the standardized X'X matrix, and the number of observations in the sample. In each of those situations, 1000 samples were generated from the descriptive information of the agronomic data. The predictive performance of the methodologies under comparison was measured by the logarithm of the predictive score (LEP). The empirical results indicated that BMA performs similarly to the usual model selection approaches in the multicollinearity situations explored.
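The idea of combining models rather than selecting one can be sketched as follows. This is not the procedure used in the thesis: it assumes the common approximation in which posterior model probabilities are taken proportional to exp(-BIC/2) with equal prior weight on every subset of predictors, and the function name `bma_bic` and the simulated data are hypothetical.

```python
import numpy as np
from itertools import combinations

def bma_bic(X, y):
    """Approximate BMA over all predictor subsets using BIC-based weights (sketch)."""
    n, p = X.shape
    models, bics, coefs = [], [], []
    for k in range(p + 1):
        for subset in combinations(range(p), k):
            # Design matrix: intercept plus the predictors in this subset.
            Z = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
            beta = np.linalg.lstsq(Z, y, rcond=None)[0]
            rss = np.sum((y - Z @ beta) ** 2)
            bics.append(n * np.log(rss / n) + Z.shape[1] * np.log(n))
            full = np.zeros(p + 1)               # coefficients padded with zeros
            full[[0] + [j + 1 for j in subset]] = beta
            models.append(subset)
            coefs.append(full)
    bics = np.array(bics)
    weights = np.exp(-0.5 * (bics - bics.min()))  # subtract the minimum for stability
    weights /= weights.sum()
    beta_avg = weights @ np.array(coefs)          # model-averaged coefficients
    return models, weights, beta_avg

# Simulated data (assumed): only the first of three predictors matters.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=100)

models, weights, beta_avg = bma_bic(X, y)
# Posterior inclusion probability of predictor 0: total weight of models containing it.
incl0 = sum(w for m, w in zip(models, weights) if 0 in m)
print("averaged coefficients:", beta_avg)
print("inclusion probability of predictor 0:", incl0)
```

Enumerating every subset is only feasible for a handful of predictors; with more, some form of search or sampling over models would be needed.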
Bowtell, Philip. "Non-linear functional relationships." Thesis, University of Reading, 1995. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.284183.
Lawrence, David E. "Cluster-Based Bounded Influence Regression." Diss., Virginia Tech, 2003. http://hdl.handle.net/10919/28455.
Ph.D.
Taga, Marcel Frederico de Lima. "Regressão linear com medidas censuradas." Universidade de São Paulo, 2008. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-05122008-005901/.
We consider a simple linear regression model in which both variables are interval censored. To motivate the problem we use data from an audiometric study designed to evaluate the possibility of prediction of behavioral thresholds from physiological thresholds. We develop prediction intervals for the response variable, obtain the maximum likelihood estimators of the proposed model and compare their performance with that of estimators obtained under ordinary linear regression models.
Januario, Ana Paula Ferrari. "Análise estatística da produção de vitelão Mertolengo." Master's thesis, Universidade de Évora, 2021. http://hdl.handle.net/10174/29316.
Rodriguez, Mary Ana Petersen. "Parâmetros genéticos e fenotípicos do perfil de ácidos graxos do leite de vacas da raça holandesa." Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/11/11139/tde-30102013-110828/.
During the last decades, genetic improvement of dairy cattle in Brazil was based only on the importation of genetic material, resulting in small genetic gains for traits of economic interest. There is a perceived need for genetic evaluation under national environmental conditions to increase milk production together with quality. In this context, knowledge of milk composition is very important for understanding how certain environmental and especially genetic factors may influence increases in protein content (PROT), fat (FAT) and beneficial fatty acids (FA) and reductions in somatic cell count, with the aim of improving the nutritional quality of this product. The aim of this study was to predict the levels of FA of interest using Bayesian linear regression, to estimate the variance components and heritability coefficients, and to compare random regression models fitted with Legendre polynomial functions of different orders. Milk samples were analysed by gas chromatography and mid-infrared spectrometry to determine fatty acids. The results obtained by both methods were compared using Pearson's correlation, Bland-Altman analysis and Bayesian linear regression; subsequently, prediction equations were developed for myristic acid (C14:0) and conjugated linoleic acid (CLA) from simple and multiple Bayesian linear regressions considering non-informative and informative priors. Legendre orthogonal polynomials of 1st to 6th order were used to fit the random regressions of the traits. Predicting FA by linear regression proved viable, with prediction errors ranging from 0.01 to 4.84 g per 100 g of fat for C14:0 and from 0.002 to 1.85 per 100 g of fat for CLA; the smallest prediction errors were obtained with multiple regression under a non-informative prior. The best-fitting models were of 1st order for FAT, PROT, C16:0, C18:0, C18:1C9, CLA, saturated (SAT), unsaturated (UNSAT), monounsaturated (MONO) and polyunsaturated (POLY) fatty acids, and of 2nd order for somatic cell score (SCS) and C14:0. In the best-fitting models, the heritability estimates ranged from 0.08 to 0.11 for FAT; 0.28 to 0.35 for PROT; 0.03 to 0.22 for SCS; 0.12 to 0.31 for C16:0; 0.08 to 0.14 for C18:0; 0.24 to 0.43 for C14:0; 0.07 to 0.17 for C18:1C9; 0.13 to 0.39 for CLA; 0.14 to 0.31 for SAT; 0.04 to 0.14 for UNSAT; 0.04 to 0.13 for MONO; and 0.09 to 0.20 for POLY; the estimate for PROD was 0.12. We conclude that improvements in the nutritional quality of milk can be obtained by including productive traits and the fatty acid profile in genetic selection programs.
Medeiros, Patrick Valverde. "Análise da evapotranspiração de referência a partir de medidas lisimétricas e ajuste estatístico de estimativas de nove equações empírico-teóricas com base na equação de Penman-Monteith." Universidade de São Paulo, 2008. http://www.teses.usp.br/teses/disponiveis/18/18138/tde-21052008-090008/.
Quantifying evapotranspiration is essential for determining the water balance of a watershed and for establishing crop water deficit. This work therefore analyses the reference evapotranspiration (ETo) for the region of Jaboticabal-SP. The behavior of the phenomenon in the region was studied through the interpretation of data from 12 drainage lysimeters (EToLis) and of theoretical estimates from 10 different equations available in the literature. A statistical analysis indicated that the theoretical ETo estimates, compared with the EToLis, did not show good agreement and error indices. On the assumption that the operation of the lysimeters did not allow a reliable determination of ETo, a local adjustment of the theoretical methods for estimating ETo was considered. An autoregression (AR) of the noise of these equations relative to the annual average estimate from the Penman-Monteith equation (EToPM), taken as the standard, was performed for fortnightly and monthly periods. Adjustment by simple linear regression was also analysed. The results indicate that effective radiation is the most important climatic variable for establishing ETo in the region. The Penman-Monteith estimate correlated excellently with the estimates from the Makkink (1957) equation and the energy balance. The local adjustments gave excellent results for most of the tested equations, especially the FAO-24 solar radiation, Makkink (1957), Jensen-Haise (1963), Camargo (1971), radiation balance, Turc (1961) and Thornthwaite (1948) equations. Adjustment by simple linear regression is easier to carry out and also gave excellent results.
Kartal, Elcin. "Metamodeling Complex Systems Using Linear And Nonlinear Regression Methods." Master's thesis, METU, 2007. http://etd.lib.metu.edu.tr/upload/2/12608930/index.pdf.
Bentley, Jason Phillip. "Exact Markov chain Monte Carlo and Bayesian linear regression." Thesis, University of Canterbury. Mathematics and Statistics, 2009. http://hdl.handle.net/10092/2534.
Crews, Hugh Bates. "Fast FSR Methods for Second-Order Linear Regression Models." NCSU, 2008. http://www.lib.ncsu.edu/theses/available/etd-04282008-151809/.
Tsakonas, Efthymios. "Convex Optimization for Assignment and Generalized Linear Regression Problems." Doctoral thesis, KTH, Signalbehandling, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-150338.
QC 20140902
Gustafsson, Alexander, and Sebastian Wogenius. "Modelling Apartment Prices with the Multiple Linear Regression Model." Thesis, KTH, Matematisk statistik, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-146735.
This thesis examines which factors are of greatest statistical significance for the sale price of apartments in Stockholm's inner city. The factors examined are address, floor area, balcony, year of construction, elevator, tiled stove, floor number, duplex layout, monthly fee, attic apartment, and number of rooms. Based on this study, a model is constructed to predict apartment prices. To determine which factors affect apartment prices, sales statistics are analysed. The mathematical method used is multiple linear regression analysis. A smaller literature and case study, included in this thesis, examines the relationship between proximity to public transport and apartment prices in Stockholm. The results show that it is possible to construct a model, from the factors examined, that can predict apartment prices in Stockholm's inner city with a coefficient of determination of 91% and a 95% confidence interval of two million SEK. It is further concluded that the model predicts lower-priced apartments more accurately. In the literature and case study, the results indicate support for the hypothesis that proximity to public transport has a positive effect on the price of an apartment. This should, however, be viewed with caution, because the purpose of the modelling differs between an individual application and a socio-economic application.
Lin, Shan. "Simultaneous confidence bands for linear and logistic regression models." Thesis, University of Southampton, 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.443030.
"Supervised ridge regression in high dimensional linear regression." 2013. http://library.cuhk.edu.hk/record=b5549319.
In the field of statistical learning, we usually have many features with which to determine the behavior of some response. For example, in gene testing problems we have many genes as features, and their relations with a certain disease need to be determined. Without specific knowledge available, the simplest and most fundamental way to model this kind of problem is a linear model. There are many existing methods for solving linear regression, such as conventional ordinary least squares, ridge regression, and the LASSO (least absolute shrinkage and selection operator). Let N denote the number of samples and p the number of predictors. In ordinary settings where we have enough samples (N > p), ordinary linear regression methods like ridge regression will usually give reasonable predictions for future values of the response. In modern statistical learning, we quite often meet high-dimensional problems (N << p), such as document classification and microarray data testing problems. In high-dimensional problems it is generally quite difficult to identify the relationship between the predictors and the response without further assumptions. Although many predictors are available for prediction, in many real problems most of them are actually spurious. A predictor is spurious if it is not directly related to the response. For example, in microarray data testing problems, millions of genes may be available for prediction, but only a few hundred are actually related to the target disease. Conventional linear regression techniques such as the LASSO and ridge regression both have limitations in high-dimensional problems.
The LASSO is one of the state-of-the-art techniques for sparsity recovery, but when applied to high-dimensional problems its performance degrades considerably in the presence of measurement noise, resulting in high-variance predictions and large prediction error. Ridge regression, on the other hand, is more robust to additive measurement noise, but has the obvious limitation of not being able to separate true predictors from spurious ones. As mentioned above, in many high-dimensional problems a large number of the predictors may be spurious; in these cases ridge's inability to separate spurious from true predictors results in poor interpretability of the model as well as poor prediction performance. The new technique that I propose in this thesis aims to address the limitations of these two methods, resulting in more accurate and stable prediction in high-dimensional linear regression problems with significant measurement noise. The idea is simple: instead of doing a single-step regression, we divide the regression procedure into two steps. In the first step, we try to separate the seemingly relevant predictors from those that are obviously spurious by calculating the univariate correlations between the predictors and the response. We then discard the predictors that have very small or zero correlation with the response, which leaves a reduced predictor set. In the second step, we perform a ridge regression of the response on the reduced predictor set; the result of this ridge regression is our desired output. The thesis is organized as follows. First, I start with a literature review of the linear regression problem, introduce ridge regression and the LASSO in detail, and explain more precisely their limitations in high-dimensional problems.
Then I introduce my new method, called supervised ridge regression, and show why it should dominate ridge regression and the LASSO in high-dimensional problems; some simulation results are presented to strengthen the argument. Finally, I conclude with the possible limitations of my method and point out possible directions for further investigation.
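The two-step procedure described in this abstract (screen predictors by their univariate correlation with the response, then run ridge regression on the survivors) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names (`pearson`, `solve`, `supervised_ridge`), the correlation cut-off `threshold`, the penalty `lam`, and the synthetic data are all introduced here for illustration. Only the Python standard library is used so that the sketch runs without dependencies.

```python
import random
from statistics import mean


def center(values):
    """Subtract the mean from every value."""
    m = mean(values)
    return [v - m for v in values]


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def pearson(x, y):
    """Sample Pearson correlation; returns 0.0 if either input is constant."""
    xc, yc = center(x), center(y)
    denom = (dot(xc, xc) * dot(yc, yc)) ** 0.5
    return dot(xc, yc) / denom if denom > 0 else 0.0


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def supervised_ridge(X, y, threshold, lam):
    """Two-step fit: correlation screening, then ridge on the survivors.

    X is a list of rows, y a list of responses, lam > 0 the ridge penalty
    (threshold and lam are hypothetical tuning parameters).
    Returns (kept column indices, ridge coefficients on centered data).
    """
    columns = list(zip(*X))
    # Step 1: discard predictors whose |correlation| with y is below threshold.
    kept = [j for j, col in enumerate(columns)
            if abs(pearson(col, y)) >= threshold]
    if not kept:
        return kept, []
    # Step 2: ridge regression, i.e. solve (Z'Z + lam * I) beta = Z'y.
    Z = [center(columns[j]) for j in kept]
    yc = center(y)
    A = [[dot(zi, zj) + (lam if i == j else 0.0)
          for j, zj in enumerate(Z)] for i, zi in enumerate(Z)]
    b = [dot(zi, yc) for zi in Z]
    return kept, solve(A, b)


if __name__ == "__main__":
    random.seed(0)
    n, p = 30, 100  # synthetic setting with fewer samples than predictors
    X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    # Only the first three predictors drive the response; the rest are spurious.
    y = [3 * r[0] - 2 * r[1] + 2 * r[2] + random.gauss(0, 0.5) for r in X]
    kept, beta = supervised_ridge(X, y, threshold=0.4, lam=1.0)
    print("kept predictors:", kept)
    print("ridge coefficients:", [round(c, 2) for c in beta])
```

The values of `threshold` and `lam` here are arbitrary; in practice they would typically be tuned, for example by cross-validation, and a numerical library would replace the hand-written solver.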
Zhu, Xiangchen.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2013.
Includes bibliographical references (leaves 68-69).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Abstracts also in Chinese.
Chapter 1. --- BASICS ABOUT LINEAR REGRESSION
Chapter 1.1 --- Introduction
Chapter 1.2 --- Linear Regression and Least Squares
Chapter 1.2.1 --- Standard Notations
Chapter 1.2.2 --- Least Squares and Its Geometric Meaning
Chapter 2. --- PENALIZED LINEAR REGRESSION
Chapter 2.1 --- Introduction
Chapter 2.2 --- Deficiency of the Ordinary Least Squares Estimate
Chapter 2.3 --- Ridge Regression
Chapter 2.3.1 --- Introduction to Ridge Regression
Chapter 2.3.2 --- Expected Prediction Error And Noise Variance Decomposition of Ridge Regression
Chapter 2.3.3 --- Shrinkage effects on different principal components by ridge regression
Chapter 2.4 --- The LASSO
Chapter 2.4.1 --- Introduction to the LASSO
Chapter 2.4.2 --- The Variable Selection Ability and Geometry of LASSO
Chapter 2.4.3 --- Coordinate Descent Algorithm to solve for the LASSO
Chapter 3. --- LINEAR REGRESSION IN HIGH-DIMENSIONAL PROBLEMS
Chapter 3.1 --- Introduction
Chapter 3.2 --- Spurious Predictors and Model Notations for High-dimensional Linear Regression
Chapter 3.3 --- Ridge and LASSO in High-dimensional Linear Regression
Chapter 4. --- THE SUPERVISED RIDGE REGRESSION
Chapter 4.1 --- Introduction
Chapter 4.2 --- Definition of Supervised Ridge Regression
Chapter 4.3 --- An Underlying Latent Model
Chapter 4.4 --- Ridge LASSO and Supervised Ridge Regression
Chapter 4.4.1 --- LASSO vs SRR
Chapter 4.4.2 --- Ridge regression vs SRR
Chapter 5. --- TESTING AND SIMULATION
Chapter 5.1 --- A Simulation Example
Chapter 5.2 --- More Experiments
Chapter 5.2.1 --- Correlated Spurious and True Predictors
Chapter 5.2.2 --- Insufficient Amount of Data Samples
Chapter 5.2.3 --- Low Dimensional Problem
Chapter 6. --- CONCLUSIONS AND DISCUSSIONS
Chapter 6.1 --- Conclusions
Chapter 6.2 --- References and Related Works
"Benchmarking non-linear series with quasi-linear regression." 2012. http://library.cuhk.edu.hk/record=b5549055.
For a target socio-economic variable, two sources of data with different collection frequencies may be available in survey data analysis. In general, because of differences in sample size or data source, the two sets of data do not agree with each other. Usually, the more frequent observations are less reliable, and the less frequent observations are much more accurate. In the benchmarking problem, the less frequent observations are treated as benchmarks and are used to adjust the higher-frequency data.
In the common benchmarking setting, the survey error and the target variable are assumed to be independent (the additive case). In reality, however, they are usually correlated (the multiplicative case): the larger the variable, the larger the survey error. To deal with this problem, Chen and Wu (2006) proposed a regression method called quasi-linear regression for the multiplicative case. In this thesis, assuming that the survey error follows an AR(1) model, we first demonstrate the benchmarking procedure using the default error model for quasi-linear regression. We then propose an error modelling procedure based on the benchmark forecast method. Finally, we compare the performance of the default error model with that of the fitted error model and offer some guidelines for selecting the error model.
Luk, Wing Pan.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2012.
Includes bibliographical references (leaves 56-57).
Abstracts also in Chinese.
Chapter 1 --- Introduction
Chapter 1.1 --- Recent Development For Benchmarking Methods
Chapter 1.2 --- Multiplicative Case And Benchmarking Problem
Chapter 2 --- Benchmarking With Quasi-linear Regression
Chapter 2.1 --- Iterative Procedure For Quasi-linear Regression
Chapter 2.2 --- Prediction Using Default Value φ
Chapter 2.3 --- Performance Of Using Default Error Model
Chapter 3 --- Estimation Of φ Via BM Forecasting method
Chapter 3.1 --- Benchmark Forecasting Method
Chapter 3.2 --- Performance Of Benchmark Forecasting Method
Chapter 4 --- Benchmarking By The Estimated Value
Chapter 4.1 --- Benchmarking With The Estimated Error Model
Chapter 4.2 --- Performance Of Using Estimated Error Model
Chapter 4.3 --- Suggestions For Selecting Error Model
Chapter 5 --- Fitting AR(1) Model For Non-AR(1) Error
Chapter 5.1 --- Settings For Non-AR(1) Model
Chapter 5.2 --- Simulation Studies
Chapter 6 --- An Illustrative Example: The Canada Total Retail Trade Series
Chapter 7 --- Conclusion
Bibliography
Lu, QiQi. "Linear regression under multiple changepoints." 2004. http://purl.galileo.usg.edu/uga%5Fetd/lu%5Fqiqi%5F200408%5Fphd.
曾麗齡. "Linear Regression with Censored Data." Thesis, 1990. http://ndltd.ncl.edu.tw/handle/73824948008674721841.
Dias, Sónia Manuela Mendes. "Linear regression with empirical distributions." Doctoral thesis, 2014. https://repositorio-aberto.up.pt/handle/10216/74191.
Huang, Min Ching. "Piecewise linear tree-structured regression." 1989. http://catalog.hathitrust.org/api/volumes/oclc/21951798.html.
Typescript. Vita. Includes bibliographical references (leaves 101-104).
Dias, Sónia Manuela Mendes. "Linear regression with empirical distributions." Tese, 2014. https://repositorio-aberto.up.pt/handle/10216/74191.