Dissertations on the topic "LASSO regression models"
Browse the top 34 dissertations for research on the topic "LASSO regression models".
Patnaik, Kaushik. "Adaptive learning in lasso models." Thesis, Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/54353.
Chen, Xiaohui. "Lasso-type sparse regression and high-dimensional Gaussian graphical models." Thesis, University of British Columbia, 2012. http://hdl.handle.net/2429/42271.
Olaya, Bucaro Orlando. "Predicting risk of cyberbullying victimization using lasso regression." Thesis, Uppsala universitet, Statistiska institutionen, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-338767.
Mo, Lili. "A class of operator splitting methods for least absolute shrinkage and selection operator (LASSO) models." HKBU Institutional Repository, 2012. https://repository.hkbu.edu.hk/etd_ra/1391.
Miller, Ryan. "Marginal false discovery rate approaches to inference on penalized regression models." Diss., University of Iowa, 2018. https://ir.uiowa.edu/etd/6474.
Marques, Matheus Augustus Pumputis. "Análise e comparação de alguns métodos alternativos de seleção de variáveis preditoras no modelo de regressão linear." Universidade de São Paulo, 2018. http://www.teses.usp.br/teses/disponiveis/45/45133/tde-23082018-210710/.
This work studies several variable selection methods for linear regression that have appeared in the last 15 years: Least Angle Regression (LARS), Noise Addition Model Selection (NAMS), the False Selection Rate (FSR), the Bayesian LASSO, and the Spike-and-Slab LASSO. The methodology consists of analyzing and comparing these methods. Applications to real data sets and a simulation study follow, in which all methods are shown to be promising, with the Bayesian methods showing the best results.
Zhai, Jing, Chiu-Hsieh Hsu, and Z. John Daye. "Ridle for sparse regression with mandatory covariates with application to the genetic assessment of histologic grades of breast cancer." BIOMED CENTRAL LTD, 2017. http://hdl.handle.net/10150/622811.
Song, Song. "Confidence bands in quantile regression and generalized dynamic semiparametric factor models." Doctoral thesis, Humboldt-Universität zu Berlin, Wirtschaftswissenschaftliche Fakultät, 2010. http://dx.doi.org/10.18452/16341.
In many applications it is necessary to know the stochastic fluctuation of the maximal deviations of nonparametric quantile estimates, e.g., for checking various parametric models. Uniform confidence bands are therefore constructed for nonparametric quantile estimates of regression functions. The first method is based on strong approximations of the empirical process and extreme value theory. The strong uniform consistency rate is also established under general conditions. The second method is based on bootstrap resampling. It is proved that the bootstrap approximation provides a substantial improvement. The case of multidimensional and discrete regressor variables is dealt with using a partial linear model. A labor market analysis is provided to illustrate the method. High-dimensional time series that reveal nonstationary and possibly periodic behavior occur frequently in many fields of science, e.g., macroeconomics, meteorology, medicine, and financial engineering. One common approach is to separate the modeling of high-dimensional time series into the time propagation of low-dimensional time series and high-dimensional time-invariant functions via dynamic factor analysis. We propose a two-step estimation procedure. At the first step, we detrend the time series by incorporating a time basis selected by a group Lasso-type technique and choose the space basis based on smoothed functional principal component analysis. We show properties of this estimator under the dependent scenario. At the second step, we obtain the detrended low-dimensional stochastic process (stationary).
Sawert, Marcus. "Predicting deliveries from suppliers : A comparison of predictive models." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-39314.
Yu, Lili. "Variable selection in the general linear model for censored data." Columbus, Ohio : Ohio State University, 2007. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1173279515.
Anderskär, Erika, and Frida Thomasson. "Inkrementell responsanalys av Scandnavian Airlines medlemmar : Vilka kunder ska väljas vid riktad marknadsföring?" Thesis, Linköpings universitet, Statistik och maskininlärning, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139465.
Lundberg, Jacob. "Resource Efficient Representation of Machine Learning Models : investigating optimization options for decision trees in embedded systems." Thesis, Linköpings universitet, Statistik och maskininlärning, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-162013.
Karmann, Clémence. "Inférence de réseaux pour modèles inflatés en zéro." Thesis, Université de Lorraine, 2019. http://www.theses.fr/2019LORR0146/document.
Network inference has more and more applications, particularly in human health and the environment, for the study of microbiological and genomic data. Networks are indeed an appropriate tool to represent, or even study, relationships between entities. Many mathematical estimation techniques have been developed, particularly in the context of Gaussian graphical models, but also for binary or mixed data. The processing of abundance data (of microorganisms such as bacteria, for example) is particular for two reasons: on the one hand, the data do not directly reflect reality, because a sequencing process takes place to duplicate species and this process introduces variability; on the other hand, a species may be absent from some samples. We are then in the context of zero-inflated data. Many graph inference methods exist for Gaussian, binary, and mixed data, but zero-inflated models are rarely studied, although they reflect the structure of many data sets in a relevant way. The objective of this thesis is to infer networks for zero-inflated models. We restrict ourselves to conditional dependency graphs. The work presented in this thesis is divided into two main parts. The first concerns graph inference methods based on the estimation of neighbourhoods by a procedure combining ordinal regression models and variable selection methods. The second focuses on graph inference in a model where the variables are Gaussian and zero-inflated by double truncation (right and left).
Huynh, Bao Tuyen. "Estimation and feature selection in high-dimensional mixtures-of-experts models." Thesis, Normandie, 2019. http://www.theses.fr/2019NORMC237.
This thesis deals with the problem of modeling and estimation of high-dimensional mixture-of-experts (MoE) models, towards effective density estimation, prediction, and clustering of heterogeneous and high-dimensional data. We propose new strategies based on regularized maximum-likelihood estimation (MLE) of MoE models to overcome the limitations of standard methods, including MLE with Expectation-Maximization (EM) algorithms, and to simultaneously perform feature selection so that sparse models are encouraged in such a high-dimensional setting. We first introduce a methodology for mixture-of-experts parameter estimation and variable selection, based on l1 (lasso) regularization and the EM framework, for regression and clustering in high-dimensional contexts. Then, we extend the method to regularized mixture-of-experts models for discrete data, including classification. We develop efficient algorithms to maximize the proposed l1-penalized observed-data log-likelihood function. Our proposed strategies enjoy efficient monotone maximization of the optimized criterion and, unlike previous approaches, they do not rely on approximations of the penalty functions, avoid matrix inversion, and exploit the efficiency of the coordinate ascent algorithm, particularly within the proximal Newton-based approach.
Liu, Li. "Grouped variable selection in high dimensional partially linear additive Cox model." Diss., University of Iowa, 2010. https://ir.uiowa.edu/etd/847.
Shah, Smit. "Comparison of Some Improved Estimators for Linear Regression Model under Different Conditions." FIU Digital Commons, 2015. http://digitalcommons.fiu.edu/etd/1853.
Kim, Byung-Jun. "Semiparametric and Nonparametric Methods for Complex Data." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/99155.
Doctor of Philosophy
Complex data have become increasingly varied in research fields such as epidemiology, genomics, and analytical chemistry with the development of science, technology, and study design over the past few decades. For example, in epidemiology, the matched case-crossover study design is used to investigate the association between the clustered binary outcomes of disease and a covariate measured with error within a certain period by stratifying subjects' conditions. In genomics, highly correlated and high-dimensional (HCHD) data are required to identify important genes and their interaction effects on diseases. In analytical chemistry, multiple time series data are generated to recognize the complex patterns among multiple classes. Due to this great diversity, we encounter three problems in analyzing the following three types of data: (1) matched case-crossover data, (2) HCHD data, and (3) time-series data. We contribute to the development of statistical methods to deal with such complex data. First, under the matched study, we discuss an idea about hypothesis testing to effectively determine the association between observed factors and the risk of the disease of interest. Because, in practice, we do not know the specific form of the association, it might be challenging to set a specific alternative hypothesis. To reflect reality, we consider the possibility that some observations are measured with errors. By considering these measurement errors, we develop a testing procedure under the matched case-crossover framework. This testing procedure has the flexibility to make inferences on various hypothesis settings. Second, we consider data where the number of variables is very large compared to the sample size, and the variables are correlated with each other. In this case, our goal is to identify the variables that are important for the outcome among a large number of variables and to build their network.
For example, identifying a few genes associated with diabetes among the whole genome can be used to develop biomarkers. With the approach proposed in the second project, we can identify differentially expressed and important genes and their network structure with consideration for the outcome. Lastly, we consider the scenario of changing patterns of interest over time with application to gas chromatography. We propose an efficient detection method to effectively distinguish the patterns of multi-level subjects in time-trend analysis. We suggest that our proposed method can give valuable information for efficiently searching for the distinguishable patterns so as to reduce the burden of examining all observations in the data.
Mukhopadhyay, Shraddha. "Comparison of existing ZOI estimation methods with different model specifications and data." Thesis, Högskolan Dalarna, Mikrodataanalys, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:du-34397.
Chu, Shuyu. "Change Detection and Analysis of Data with Heterogeneous Structures." Diss., Virginia Tech, 2017. http://hdl.handle.net/10919/78613.
Ph. D.
Bécu, Jean-Michel. "Contrôle des fausses découvertes lors de la sélection de variables en grande dimension." Thesis, Compiègne, 2016. http://www.theses.fr/2016COMP2264/document.
In the regression framework, many studies focus on the high-dimensional problem where the number of measured explanatory variables is very large compared to the sample size. While variable selection is a classical question, the usual methods are not applicable in the high-dimensional case. In this manuscript, we therefore develop the transposition of statistical tests to high dimension. These tests operate on estimates of regression coefficients obtained by penalized linear regression, which is applicable in high dimension. The main objective of these tests is false discovery control. The first contribution of this manuscript provides a quantification of the uncertainty for regression coefficients estimated by ridge regression in high dimension. Ridge regression penalizes the coefficients through their l2 norm. To quantify this uncertainty, we devise a statistical test based on permutations. The second contribution is based on a two-step selection approach. The first step is dedicated to the screening of variables, based on the parsimonious Lasso regression. The second step consists in cleaning the resulting set by testing the relevance of the pre-selected variables. These tests are made on adaptive-ridge estimates, where the penalty is constructed from the Lasso estimates learned during the screening step. A last contribution consists in transposing this approach to the selection of groups of variables.
Tjärnberg, Andreas. "Exploring the Boundaries of Gene Regulatory Network Inference." Doctoral thesis, Stockholms universitet, Institutionen för biokemi och biofysik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-122149.
At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 4: Manuscript.
Liley, Albert James. "Statistical co-analysis of high-dimensional association studies." Thesis, University of Cambridge, 2017. https://www.repository.cam.ac.uk/handle/1810/270628.
McIlhagga, William H. "penalized: A MATLAB toolbox for fitting generalized linear models with penalties." 2015. http://hdl.handle.net/10454/10882.
penalized is a flexible, extensible, and efficient MATLAB toolbox for penalized maximum likelihood. penalized allows you to fit a generalized linear model (gaussian, logistic, poisson, or multinomial) using any of ten provided penalties, or none. The toolbox can be extended by creating new maximum likelihood models or new penalties. The toolbox also includes routines for cross-validation and plotting.
Zeng, Yan. "A Study of Missing Data Imputation and Predictive Modeling of Strength Properties of Wood Composites." 2011. http://trace.tennessee.edu/utk_gradthes/1041.
Kunc, Vladimír. "Comparison of different models for forecasting of Czech electricity market." Master's thesis, 2017. http://www.nusl.cz/ntk/nusl-367836.
Mpfumali, Phathutshedzo. "Probabilistic solar power forecasting using partially linear additive quantile regression models: an application to South African data." Diss., 2019. http://hdl.handle.net/11602/1349.
Department of Statistics
This study discusses an application of partially linear additive quantile regression models in predicting medium-term global solar irradiance using data from Tellerie radiometric station in South Africa for the period August 2009 to April 2010. Variables are selected using a least absolute shrinkage and selection operator (Lasso) via hierarchical interactions, and the parameters of the developed models are estimated using the Barrodale and Roberts algorithm. The best models are selected based on the Akaike information criterion (AIC), Bayesian information criterion (BIC), adjusted R squared (AdjR2), and generalised cross validation (GCV). The accuracy of the forecasts is evaluated using mean absolute error (MAE) and root mean square error (RMSE). To improve the accuracy of forecasts, a convex forecast combination algorithm is used, in which the average loss suffered by the models is based on the pinball loss function. A second forecast combination method, quantile regression averaging (QRA), is also used. The best set of forecasts is selected based on the prediction interval coverage probability (PICP), prediction interval normalised average width (PINAW), and prediction interval normalised average deviation (PINAD). The results show that QRA is the best model since it produces more robust prediction intervals than the other models. The percentage improvement is calculated, and the results demonstrate that the QRA model yields a small improvement over GAM with interactions, whereas QRA yields a higher percentage improvement over a convex forecast combination model. A major contribution of this dissertation is the inclusion of a non-linear trend variable and the extension of forecast combination models to include the QRA.
NRF
Martin, Jacqueline. "Modellierung des Unfallgeschehens im Radverkehr am Beispiel der Stadt Dresden." 2020. https://tud.qucosa.de/id/qucosa%3A73500.
Kusiak, Caroline. "Real-Time Dengue Forecasting In Thailand: A Comparison Of Penalized Regression Approaches Using Internet Search Data." 2018. https://scholarworks.umass.edu/masters_theses_2/708.
Thanyani, Maduvhahafani. "Forecasting hourly electricity demand in South Africa using machine learning models." Diss., 2020. http://hdl.handle.net/11602/1595.
Department of Statistics
Short-term load forecasting in South Africa using machine learning and statistical models is discussed in this study. The research is focused on carrying out a comparative analysis in forecasting hourly electricity demand. This study was carried out using South Africa's aggregated hourly load data from Eskom. The comparison uses support vector regression (SVR), stochastic gradient boosting (SGB), and artificial neural networks (NN), with a generalized additive model (GAM) as the benchmark model for forecasting hourly electricity demand. In both modelling frameworks, variable selection is done using the least absolute shrinkage and selection operator (Lasso). The SGB model yielded the lowest root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) on testing data. The SGB model also yielded the lowest RMSE, MAE, and MAPE on training data. The models' forecasts are combined using convex combination and quantile regression averaging (QRA). QRA was found to be the best forecast combination model based on the RMSE, MAE, and MAPE.
NRF
Chen, Szu-Cheng, and 陳思成. "Lasso Quantile Regression Model to Construct Asia and Taiwan Systemic Risk Measurement." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/4b2fy9.
Huang, Hsin-Hsiung, and 黃信雄. "Study on the Lasso Method for Variable Selection in Linear Regression Model with Mallows' Cp." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/41127770529976845884.
國立臺灣大學 (National Taiwan University)
數學研究所 (Graduate Institute of Mathematics)
95
When the number of predictors in a linear regression model is large, regularization is a commonly used method to reduce the complexity of the fitted model. LASSO (Tibshirani, 1996) is advocated as a useful regularization method for achieving sparsity or parsimony of the resulting fitted model. In this thesis, we study the operating characteristics of LASSO coupled with Mallows' Cp in identifying the orthonormal predictor variables of a linear regression when the number of predictors and the number of observations are of the same magnitude. The characteristics include the chosen number of predictors and the proportion of correctly identified predictors. This result can be useful in multiple testing.
Hsin-Hsiung, Huang. "Study on the Lasso Method for Variable Selection in Linear Regression Model with Mallows' Cp." 2006. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0001-0701200722590000.
He, Zangdong. "Variable selection and structural discovery in joint models of longitudinal and survival data." Thesis, 2014. http://hdl.handle.net/1805/6365.
Joint models of longitudinal and survival outcomes have been used with increasing frequency in clinical investigations. Correct specification of fixed and random effects, as well as their functional forms is essential for practical data analysis. However, no existing methods have been developed to meet this need in a joint model setting. In this dissertation, I describe a penalized likelihood-based method with adaptive least absolute shrinkage and selection operator (ALASSO) penalty functions for model selection. By reparameterizing variance components through a Cholesky decomposition, I introduce a penalty function of group shrinkage; the penalized likelihood is approximated by Gaussian quadrature and optimized by an EM algorithm. The functional forms of the independent effects are determined through a procedure for structural discovery. Specifically, I first construct the model by penalized cubic B-spline and then decompose the B-spline to linear and nonlinear elements by spectral decomposition. The decomposition represents the model in a mixed-effects model format, and I then use the mixed-effects variable selection method to perform structural discovery. Simulation studies show excellent performance. A clinical application is described to illustrate the use of the proposed methods, and the analytical results demonstrate the usefulness of the methods.
Dobreva, Maria Lubomirova. "Data-driven evaluation of real estate liquidity : predicting days on market to optimize the sales strategy of a startup." Master's thesis, 2019. http://hdl.handle.net/10362/91224.
This is a research project applying data mining techniques to real estate data in cooperation with Homeheed, a real estate startup that provides a platform solution serving as a single source of truth in Sofia, Bulgaria. The project proposes a predictive model that uses LASSO regression to determine days on market. The findings are expected to benefit the startup by providing insights into which listings are more attractive, and so to support a faster return on investment. Additionally, the paper includes an experimental part that targets misleading and fake listings in order to support the detection of fraud and of a listing's real availability. The project's main objectives and assumptions are that advanced statistics and information management can build such a synergy with data and business models that both the market entry strategy and the quality of service are enhanced.