Journal articles on the topic “LASSO algorithm”

This list covers the top 50 journal articles for research on the topic “LASSO algorithm”.

1

Gaines, Brian R., Juhyun Kim, and Hua Zhou. “Algorithms for Fitting the Constrained Lasso.” Journal of Computational and Graphical Statistics 27, no. 4 (August 7, 2018): 861–71. http://dx.doi.org/10.1080/10618600.2018.1473777.

2

Bonnefoy, Antoine, Valentin Emiya, Liva Ralaivola, and Remi Gribonval. “Dynamic Screening: Accelerating First-Order Algorithms for the Lasso and Group-Lasso.” IEEE Transactions on Signal Processing 63, no. 19 (October 2015): 5121–32. http://dx.doi.org/10.1109/tsp.2015.2447503.

3

Zhou, Helper, and Victor Gumbo. “Supervised Machine Learning for Predicting SMME Sales: An Evaluation of Three Algorithms.” African Journal of Information and Communication, no. 27 (May 31, 2021): 1–21. http://dx.doi.org/10.23962/10539/31371.

Annotation:
The emergence of machine learning algorithms presents the opportunity for a variety of stakeholders to perform advanced predictive analytics and to make informed decisions. However, to date there have been few studies in developing countries that evaluate the performance of such algorithms—with the result that pertinent stakeholders lack an informed basis for selecting appropriate techniques for modelling tasks. This study aims to address this gap by evaluating the performance of three machine learning techniques: ordinary least squares (OLS), least absolute shrinkage and selection operator (LASSO), and artificial neural networks (ANNs). These techniques are evaluated in respect of their ability to perform predictive modelling of the sales performance of small, medium and micro enterprises (SMMEs) engaged in manufacturing. The evaluation finds that the ANNs algorithm’s performance is far superior to that of the other two techniques, OLS and LASSO, in predicting the SMMEs’ sales performance.
4

Wu, Tong Tong, and Kenneth Lange. “Coordinate descent algorithms for lasso penalized regression.” Annals of Applied Statistics 2, no. 1 (March 2008): 224–44. http://dx.doi.org/10.1214/07-aoas147.

5

Tsiligkaridis, Theodoros, Alfred O. Hero III, and Shuheng Zhou. “On Convergence of Kronecker Graphical Lasso Algorithms.” IEEE Transactions on Signal Processing 61, no. 7 (April 2013): 1743–55. http://dx.doi.org/10.1109/tsp.2013.2240157.

6

Muchisha, Nadya Dwi, Novian Tamara, Andriansyah Andriansyah, and Agus M. Soleh. “Nowcasting Indonesia’s GDP Growth Using Machine Learning Algorithms.” Indonesian Journal of Statistics and Its Applications 5, no. 2 (June 30, 2021): 355–68. http://dx.doi.org/10.29244/ijsa.v5i2p355-368.

Annotation:
GDP is very important to be monitored in real time because of its usefulness for policy making. We built and compared the ML models to forecast real-time Indonesia's GDP growth. We used 18 variables that consist a number of quarterly macroeconomic and financial market statistics. We have evaluated the performance of six popular ML algorithms, such as Random Forest, LASSO, Ridge, Elastic Net, Neural Networks, and Support Vector Machines, in doing real-time forecast on GDP growth from 2013:Q3 to 2019:Q4 period. We used the RMSE, MAD, and Pearson correlation coefficient as measurements of forecast accuracy. The results showed that the performance of all these models outperformed AR (1) benchmark. The individual model that showed the best performance is random forest. To gain more accurate forecast result, we run forecast combination using equal weighting and lasso regression. The best model was obtained from forecast combination using lasso regression with selected ML models, which are Random Forest, Ridge, Support Vector Machine, and Neural Network.
7

Jain, Rahi, and Wei Xu. “HDSI: High dimensional selection with interactions algorithm on feature selection and testing.” PLOS ONE 16, no. 2 (February 16, 2021): e0246159. http://dx.doi.org/10.1371/journal.pone.0246159.

Annotation:
Feature selection on high dimensional data along with the interaction effects is a critical challenge for classical statistical learning techniques. Existing feature selection algorithms such as random LASSO leverages LASSO capability to handle high dimensional data. However, the technique has two main limitations, namely the inability to consider interaction terms and the lack of a statistical test for determining the significance of selected features. This study proposes a High Dimensional Selection with Interactions (HDSI) algorithm, a new feature selection method, which can handle high-dimensional data, incorporate interaction terms, provide the statistical inferences of selected features and leverage the capability of existing classical statistical techniques. The method allows the application of any statistical technique like LASSO and subset selection on multiple bootstrapped samples; each contains randomly selected features. Each bootstrap data incorporates interaction terms for the randomly sampled features. The selected features from each model are pooled and their statistical significance is determined. The selected statistically significant features are used as the final output of the approach, whose final coefficients are estimated using appropriate statistical techniques. The performance of HDSI is evaluated using both simulated data and real studies. In general, HDSI outperforms the commonly used algorithms such as LASSO, subset selection, adaptive LASSO, random LASSO and group LASSO.
8

Qin, Zhiwei, Katya Scheinberg, and Donald Goldfarb. “Efficient block-coordinate descent algorithms for the Group Lasso.” Mathematical Programming Computation 5, no. 2 (March 31, 2013): 143–69. http://dx.doi.org/10.1007/s12532-013-0051-x.

9

Johnson, Karl M., and Thomas P. Monath. “Imported Lassa Fever — Reexamining the Algorithms.” New England Journal of Medicine 323, no. 16 (October 18, 1990): 1139–41. http://dx.doi.org/10.1056/nejm199010183231611.

10

Zhao, Yingdong, and Richard Simon. “Development and Validation of Predictive Indices for a Continuous Outcome Using Gene Expression Profiles.” Cancer Informatics 9 (January 2010): CIN.S3805. http://dx.doi.org/10.4137/cin.s3805.

Annotation:
There have been relatively few publications using linear regression models to predict a continuous response based on microarray expression profiles. Standard linear regression methods are problematic when the number of predictor variables exceeds the number of cases. We have evaluated three linear regression algorithms that can be used for the prediction of a continuous response based on high dimensional gene expression data. The three algorithms are the least angle regression (LAR), the least absolute shrinkage and selection operator (LASSO), and the averaged linear regression method (ALM). All methods are tested using simulations based on a real gene expression dataset and analyses of two sets of real gene expression data and using an unbiased complete cross validation approach. Our results show that the LASSO algorithm often provides a model with somewhat lower prediction error than the LAR method, but both of them perform more efficiently than the ALM predictor. We have developed a plug-in for BRB-ArrayTools that implements the LAR and the LASSO algorithms with complete cross-validation.
11

Bunea, Florentina, Johannes Lederer, and Yiyuan She. “The Group Square-Root Lasso: Theoretical Properties and Fast Algorithms.” IEEE Transactions on Information Theory 60, no. 2 (February 2014): 1313–25. http://dx.doi.org/10.1109/tit.2013.2290040.

12

Rakotomamonjy, A. “Surveying and comparing simultaneous sparse approximation (or group-lasso) algorithms.” Signal Processing 91, no. 7 (July 2011): 1505–26. http://dx.doi.org/10.1016/j.sigpro.2011.01.012.

13

Ghosh, Pronab, Asif Karim, Syeda Tanjila Atik, Saima Afrin, and Mohd Saifuzzaman. “Expert cancer model using supervised algorithms with a LASSO selection approach.” International Journal of Electrical and Computer Engineering (IJECE) 11, no. 3 (June 1, 2021): 2631. http://dx.doi.org/10.11591/ijece.v11i3.pp2631-2639.

Annotation:
One of the most critical issues of the mortality rate in the medical field in current times is breast cancer. Nowadays, a large number of men and women is facing cancer-related deaths due to the lack of early diagnosis systems and proper treatment per year. To tackle the issue, various data mining approaches have been analyzed to build an effective model that helps to identify the different stages of deadly cancers. The study successfully proposes an early cancer disease model based on five different supervised algorithms such as logistic regression (henceforth LR), decision tree (henceforth DT), random forest (henceforth RF), Support vector machine (henceforth SVM), and K-nearest neighbor (henceforth KNN). After an appropriate preprocessing of the dataset, least absolute shrinkage and selection operator (LASSO) was used for feature selection (FS) using a 10-fold cross-validation (CV) approach. Employing LASSO with 10-fold cross-validation has been a novel steps introduced in this research. Afterwards, different performance evaluation metrics were measured to show accurate predictions based on the proposed algorithms. The result indicated top accuracy was received from RF classifier, approximately 99.41% with the integration of LASSO. Finally, a comprehensive comparison was carried out on Wisconsin breast cancer (diagnostic) dataset (WBCD) together with some current works containing all features.
14

Dermoune, Azzouz, Daoud Ounaissi, and Nadji Rahmania. “Oscillation of Metropolis–Hastings and simulated annealing algorithms around LASSO estimator.” Mathematics and Computers in Simulation 135 (May 2017): 39–50. http://dx.doi.org/10.1016/j.matcom.2015.09.003.

15

Onose, Alexandru, and Bogdan Dumitrescu. “Adaptive Randomized Coordinate Descent for Sparse Systems: Lasso and Greedy Algorithms.” IEEE Transactions on Signal Processing 63, no. 15 (August 2015): 4091–101. http://dx.doi.org/10.1109/tsp.2015.2436369.

16

Gillies, Christopher E., Xiaoli Gao, Nilesh V. Patel, Mohammad-Reza Siadat, and George D. Wilson. “Improved Feature Selection by Incorporating Gene Similarity into the LASSO.” International Journal of Knowledge Discovery in Bioinformatics 3, no. 1 (January 2012): 1–22. http://dx.doi.org/10.4018/jkdb.2012010101.

Annotation:
Personalized medicine is customizing treatments to a patient’s genetic profile and has the potential to revolutionize medical practice. An important process used in personalized medicine is gene expression profiling. Analyzing gene expression profiles is difficult, because there are usually few patients and thousands of genes, leading to the curse of dimensionality. To combat this problem, researchers suggest using prior knowledge to enhance feature selection for supervised learning algorithms. The authors propose an enhancement to the LASSO, a shrinkage and selection technique that induces parameter sparsity by penalizing a model’s objective function. Their enhancement gives preference to the selection of genes that are involved in similar biological processes. The authors’ modified LASSO selects similar genes by penalizing interaction terms between genes. They devise a coordinate descent algorithm to minimize the corresponding objective function. To evaluate their method, the authors created simulation data where they compared their model to the standard LASSO model and an interaction LASSO model. The authors’ model outperformed both the standard and interaction LASSO models in terms of detecting important genes and gene interactions for a reasonable number of training samples. They also demonstrated the performance of their method on a real gene expression data set from lung cancer cell lines.
17

Costa, Marcelo A., Enrico A. Colosimo, and Carolina G. Miranda. “SELECTING PROFILES OF IN DEBT CLIENTS OF A BRAZILIAN TELEPHONE COMPANY: NEW LASSO AND ADAPTIVE LASSO ALGORITHMS IN THE COX MODEL.” Pesquisa Operacional 35, no. 2 (August 2015): 401–21. http://dx.doi.org/10.1590/0101-7438.2015.035.02.0401.

18

Li, Ao, and Hayaru Shouno. “Dictionary-Based Image Denoising by Fused-Lasso Atom Selection.” Mathematical Problems in Engineering 2014 (2014): 1–10. http://dx.doi.org/10.1155/2014/368602.

Annotation:
We proposed an efficient image denoising scheme by fused lasso with dictionary learning. The scheme has two important contributions. The first one is that we learned the patch-based adaptive dictionary by principal component analysis (PCA) with clustering the image into many subsets, which can better preserve the local geometric structure. The second one is that we coded the patches in each subset by fused lasso with the clustering learned dictionary and proposed an iterative Split Bregman to solve it rapidly. We present the capabilities with several experiments. The results show that the proposed scheme is competitive to some excellent denoising algorithms.
19

Zou, Jian, and Yuli Fu. “Split Bregman algorithms for sparse group Lasso with application to MRI reconstruction.” Multidimensional Systems and Signal Processing 26, no. 3 (February 12, 2014): 787–802. http://dx.doi.org/10.1007/s11045-014-0282-7.

20

Li, Yingying, and Yaxuan Zhang. “Bounded Perturbation Resilience of Two Modified Relaxed CQ Algorithms for the Multiple-Sets Split Feasibility Problem.” Axioms 10, no. 3 (August 23, 2021): 197. http://dx.doi.org/10.3390/axioms10030197.

Annotation:
In this paper, we present some modified relaxed CQ algorithms with different kinds of step size and perturbation to solve the Multiple-sets Split Feasibility Problem (MSSFP). Under mild assumptions, we establish weak convergence and prove the bounded perturbation resilience of the proposed algorithms in Hilbert spaces. Treating appropriate inertial terms as bounded perturbations, we construct the inertial acceleration versions of the corresponding algorithms. Finally, for the LASSO problem and three experimental examples, numerical computations are given to demonstrate the efficiency of the proposed algorithms and the validity of the inertial perturbation.
21

Xiong, ZHANG, LV Xinyan, JI Tao, and ZHONG Chen. “Water permeability prediction of sponge city pavement materials based on different machine learning algorithms.” E3S Web of Conferences 194 (2020): 05023. http://dx.doi.org/10.1051/e3sconf/202019405023.

Annotation:
Permeable pavement material is one of the most important supporting materials in the construction of sponge city, and its water permeability is the most important performance index. The water permeability test of permeable pavement materials is a tedious and complicated experimental work. It is of great research significance to predict the water permeability of permeable pavement materials through structural parameters modeling. In this paper, the database is first established by experimental means, and then the prediction models of LASSO (Least absolute shrinkage and selection operator), SVR (Support vector regression) and GBR (Gradient Boosting Regression) machine learning algorithms are established. Through the four factors of particle size, particle size distribution, shape parameters and binder content predict the water permeability of sponge city pavement materials. The results show that different machine learning algorithms have different sensitivity to the distribution of data samples. The fitting effect of GBR model water permeability prediction is better than that of SVR and LASSO models. The test value-predicted value MSE is 0.0051 and R2 is 0.92, which can effectively predict the water permeability of sponge city pavement materials.
22

Guo, Hongping, Zuguo Yu, Jiyuan An, Guosheng Han, Yuanlin Ma, and Runbin Tang. “A Two-Stage Mutual Information Based Bayesian Lasso Algorithm for Multi-Locus Genome-Wide Association Studies.” Entropy 22, no. 3 (March 13, 2020): 329. http://dx.doi.org/10.3390/e22030329.

Annotation:
Genome-wide association study (GWAS) has turned out to be an essential technology for exploring the genetic mechanism of complex traits. To reduce the complexity of computation, it is well accepted to remove unrelated single nucleotide polymorphisms (SNPs) before GWAS, e.g., by using iterative sure independence screening expectation-maximization Bayesian Lasso (ISIS EM-BLASSO) method. In this work, a modified version of ISIS EM-BLASSO is proposed, which reduces the number of SNPs by a screening methodology based on Pearson correlation and mutual information, then estimates the effects via EM-Bayesian Lasso (EM-BLASSO), and finally detects the true quantitative trait nucleotides (QTNs) through likelihood ratio test. We call our method a two-stage mutual information based Bayesian Lasso (MBLASSO). Under three simulation scenarios, MBLASSO improves the statistical power and retains the higher effect estimation accuracy when comparing with three other algorithms. Moreover, MBLASSO performs best on model fitting, the accuracy of detected associations is the highest, and 21 genes can only be detected by MBLASSO in Arabidopsis thaliana datasets.
23

Liu, Zhenqiu, and Gang Li. “Efficient Regularized Regression with L0 Penalty for Variable Selection and Network Construction.” Computational and Mathematical Methods in Medicine 2016 (2016): 1–11. http://dx.doi.org/10.1155/2016/3456153.

Annotation:
Variable selections for regression with high-dimensional big data have found many applications in bioinformatics and computational biology. One appealing approach is the L0 regularized regression which penalizes the number of nonzero features in the model directly. However, it is well known that L0 optimization is NP-hard and computationally challenging. In this paper, we propose efficient EM (L0EM) and dual L0EM (DL0EM) algorithms that directly approximate the L0 optimization problem. While L0EM is efficient with large sample size, DL0EM is efficient with high-dimensional (n ≪ m) data. They also provide a natural solution to all Lp, p ∈ [0,2], problems, including lasso with p = 1 and elastic net with p ∈ [1,2]. The regularized parameter λ can be determined through cross validation or AIC and BIC. We demonstrate our methods through simulation and high-dimensional genomic data. The results indicate that L0 has better performance than lasso, SCAD, and MC+, and L0 with AIC or BIC has similar performance as computationally intensive cross validation. The proposed algorithms are efficient in identifying the nonzero variables with less bias and constructing biologically important networks with high-dimensional big data.
24

Mukhtar, Majid Khan Bin Majahar Ali, Anam Javaid, Mohd Tahir Ismail, and Ahmad Fudholi. “Accurate and Hybrid Regularization - Robust Regression Model in Handling Multicollinearity and Outlier Using 8SC for Big Data.” Mathematical Modelling of Engineering Problems 8, no. 4 (August 31, 2021): 547–56. http://dx.doi.org/10.18280/mmep.080407.

Annotation:
Regressions have been continuously received great attention. However, there are still open issues in regression, and two of the issues is regression with multicollinearity and outlier. Regularization (Ridge, Lasso, and Elastic Net) techniques implement a means to control regression coefficients. The methods can decrease the variance and reduce our sample error for tackle multicollinearity. In robust regression, it is a form of regression method designed to overcome outliers. Robust regression is an important method for analyzing data that are infected with outliers. The data have been interacted on the second order interaction. The data contained 435 different independent interaction variables. The primary focus of this paper is to analyze and compare the impact of three different variable selection techniques regularization regression algorithms for the data seaweed drying. After that, it will be analyzed through robust regression (Tukey Bi-Square, Hampel, and Huber). As the result, the Lasso-Hampel was better than others with the MAE (4.09641), RMSE (5.275992), MAPE (7.9962), SSE (182491.2), R-square (0.6514791), and R-square Adjusted (0.649279). The method of Lasso-Hampel is able to be relied on investigation of the accuracy in big data obtained from regularization and robust regression.
25

Kim, Baekjin, Donghyeon Yu, and Joong-Ho Won. “Comparative study of computational algorithms for the Lasso with high-dimensional, highly correlated data.” Applied Intelligence 48, no. 8 (October 20, 2016): 1933–52. http://dx.doi.org/10.1007/s10489-016-0850-7.

26

Nesterov, Yurii, and Arkadi Nemirovski. “On first-order algorithms for l1/nuclear norm minimization.” Acta Numerica 22 (April 2, 2013): 509–75. http://dx.doi.org/10.1017/s096249291300007x.

Annotation:
In the past decade, problems related to l1/nuclear norm minimization have attracted much attention in the signal processing, machine learning and optimization communities. In this paper, devoted to l1/nuclear norm minimization as ‘optimization beasts’, we give a detailed description of two attractive first-order optimization techniques for solving problems of this type. The first one, aimed primarily at lasso-type problems, comprises fast gradient methods applied to composite minimization formulations. The second approach, aimed at Dantzig-selector-type problems, utilizes saddle-point first-order algorithms and reformulation of the problem of interest as a generalized bilinear saddle-point problem. For both approaches, we give complete and detailed complexity analyses and discuss the application domains.
27

Radovanović, Sandro, Boris Delibašić, Miloš Jovanović, Milan Vukićević, and Milija Suknović. “A Framework for Integrating Domain Knowledge in Logistic Regression with Application to Hospital Readmission Prediction.” International Journal on Artificial Intelligence Tools 28, no. 06 (September 2019): 1960006. http://dx.doi.org/10.1142/s0218213019600066.

Annotation:
It is commonly understood that machine learning algorithms discover and extract knowledge based on data at hand. However, a huge amount of knowledge is available which is in machine-readable format and ready for inclusion in machine learning algorithms and models. In this paper, we propose a framework that integrates domain knowledge in form of ontologies/hierarchies into logistic regression using stacked generalization. Namely, relations from ontology/hierarchy are used in stacking manner in order to obtain higher, more abstract concepts. Obtained concepts are further used for prediction. The problem we solved is unplanned 30-days hospital readmission, which is considered as one of the major problems in healthcare. Proposed framework yields better results compared to Ridge, Lasso, and Tree Lasso Logistic Regression. Results suggest that the proposed framework improves AUC by up to 9.5% on pediatric datasets and up to 4% on morbidly obese patients’ datasets and also improves AUPRC by up to 5.7% on pediatric datasets and up to 2.6% on morbidly obese patients’ datasets on average. This indicates that the inclusion of domain knowledge improves the predictive performance of Logistic Regression.
28

Marica, Vasile George, and Alexandra Horobet. “Conditional Granger Causality and Genetic Algorithms in VAR Model Selection.” Symmetry 11, no. 8 (August 3, 2019): 1004. http://dx.doi.org/10.3390/sym11081004.

Annotation:
Overcoming symmetry in combinatorial evolutionary algorithms is a challenge for existing niching methods. This research presents a genetic algorithm designed for the shrinkage of the coefficient matrix in vector autoregression (VAR) models, constructed on two pillars: conditional Granger causality and Lasso regression. Departing from a recent information theory proof that Granger causality and transfer entropy are equivalent, we propose a heuristic method for the identification of true structural dependencies in multivariate economic time series. Through rigorous testing, both empirically and through simulations, the present paper proves that genetic algorithms initialized with classical solutions are able to easily break the symmetry of random search and progress towards specific modeling.
29

Goutman, Stephen A., Jonathan Boss, Kai Guo, Fadhl M. Alakwaa, Adam Patterson, Sehee Kim, Masha Georges Savelieff, Junguk Hur, and Eva L. Feldman. “Untargeted metabolomics yields insight into ALS disease mechanisms.” Journal of Neurology, Neurosurgery & Psychiatry 91, no. 12 (September 14, 2020): 1329–38. http://dx.doi.org/10.1136/jnnp-2020-323611.

Annotation:
Objective: To identify dysregulated metabolic pathways in amyotrophic lateral sclerosis (ALS) versus control participants through untargeted metabolomics. Methods: Untargeted metabolomics was performed on plasma from ALS participants (n=125) around 6.8 months after diagnosis and healthy controls (n=71). Individual differential metabolites in ALS cases versus controls were assessed by Wilcoxon rank-sum tests, adjusted logistic regression and partial least squares-discriminant analysis (PLS-DA), while group lasso explored sub-pathway-level differences. Adjustment parameters included sex, age and body mass index (BMI). Metabolomics pathway enrichment analysis was performed on metabolites selected by the above methods. Finally, machine learning classification algorithms applied to group lasso-selected metabolites were evaluated for classifying case status. Results: There were no group differences in sex, age and BMI. Significant metabolites selected were 303 by Wilcoxon, 300 by logistic regression, 295 by PLS-DA and 259 by group lasso, corresponding to 11, 13, 12 and 22 enriched sub-pathways, respectively. ‘Benzoate metabolism’, ‘ceramides’, ‘creatine metabolism’, ‘fatty acid metabolism (acyl carnitine, polyunsaturated)’ and ‘hexosylceramides’ sub-pathways were enriched by all methods, and ‘sphingomyelins’ by all but Wilcoxon, indicating these pathways significantly associate with ALS. Finally, machine learning prediction of ALS cases using group lasso-selected metabolites achieved the best performance by regularised logistic regression with elastic net regularisation, with an area under the curve of 0.98 and specificity of 83%. Conclusion: In our analysis, ALS led to significant metabolic pathway alterations, which had correlations to known ALS pathomechanisms in the basic and clinical literature, and may represent important targets for future ALS therapeutics.
30

Chen, Chengbin, Chudong Pan, Zepeng Chen, and Ling Yu. “Structural damage detection via combining weighted strategy with trace Lasso.” Advances in Structural Engineering 22, no. 3 (September 11, 2018): 597–612. http://dx.doi.org/10.1177/1369433218795310.

Annotation:
With the rapid development of computation technologies, swarm intelligence–based algorithms become an innovative technique used for addressing structural damage detection issues, but traditional swarm intelligence–based structural damage detection methods often face with insufficient detection accuracy and lower robustness to noise. As an exploring attempt, a novel structural damage detection method is proposed to tackle the above deficiency via combining weighted strategy with trace least absolute shrinkage and selection operator (Lasso). First, an objective function is defined for the structural damage detection optimization problem by using structural modal parameters; a weighted strategy and the trace Lasso are also involved into the objection function. A novel antlion optimizer algorithm is then employed as a solution solver to the structural damage detection optimization problem. To assess the capability of the proposed structural damage detection method, two numerical simulations and a series of laboratory experiments are performed, and a comparative study on effects of different parameters, such as weighted coefficients, regularization parameters and damage patterns, on the proposed structural damage detection methods are also carried out. Illustrated results show that the proposed structural damage detection method via combining weighted strategy with trace Lasso is able to accurately locate structural damages and quantify damage severities of structures.
31

Zhou, Xiabing, Xingxing Xing, Lei Han, Haikun Hong, Kaigui Bian, and Kunqing Xie. “Structure Feature Learning Method for Incomplete Data.” International Journal of Pattern Recognition and Artificial Intelligence 30, no. 09 (November 2016): 1660007. http://dx.doi.org/10.1142/s0218001416600077.

Annotation:
Learning with incomplete data remains challenging in many real-world applications especially when the data is high-dimensional and dynamic. Many imputation-based algorithms have been proposed to handle with incomplete data, where these algorithms use statistics of the historical information to remedy the missing parts. However, these methods merely use the structural information existing in the data, which are very helpful for sharing between the complete entries and the missing ones. For example, in traffic system, some group information and temporal smoothness exist in the data structure. In this paper, we propose to incorporate these structural information and develop structural feature leaning method for learning with incomplete data (SFLIC). The SFLIC model adopt a fused Lasso based regularizer and a group Lasso style regularizer to enlarge the data sharing along both the temporal smoothness level and the feature group level to fill the gap where the data entries are missing. The proposed SFLIC model is a nonsmooth function according to the model parameters, and we adopt the smoothing proximal gradient (SPG) method to seek for an efficient solution. We evaluate our model on both synthetic and real-world highway traffic datasets. Experimental results show that our method outperforms the state-of-the-art methods.
32

Liang, Zhaohui, Jun Liu, Jimmy X. Huang, and Xing Zeng. "Fast Screening Technology for Drug Emergency Management: Predicting Suspicious SNPs for ADR with Information Theory-based Models." Combinatorial Chemistry & High Throughput Screening 21, no. 2 (April 17, 2018): 93–99. http://dx.doi.org/10.2174/1386207321666180115094814.

Annotation:
Objective: The genetic polymorphism of Cytochrome P450 (CYP450) is considered one of the main causes of adverse drug reactions (ADRs). In order to explore the latent correlations between ADRs and potentially corresponding single-nucleotide polymorphisms (SNPs) in CYP450, three algorithms based on information theory are used as the main method to predict the possible relation. Methods: The study uses a retrospective case-control design to explore the potential relation of ADRs to specific genomic locations and SNPs. The genomic data collected from 53 healthy volunteers are used for the analysis; another set of genomic data, collected from 30 healthy volunteers excluded from the study, is used as the control group. The SNPs at the five loci CYP2D6*2, *10, *14 and CYP1A2*1C, *1F are detected with the Applied Biosystems 3130xl. The raw data are processed with ChromasPro to detect the specific alleles at these loci in each sample. The secondary data are reorganized and processed in R and combined with the ADR reports from clinical records. Three information theory-based algorithms are implemented for the screening task: JMI, CMIM, and mRMR. If a SNP is selected by more than two algorithms, we are confident in concluding that it is related to the corresponding ADR. The selection results are compared with the control decision tree + LASSO regression model. Results: In the study group where ADRs occur, 10 SNPs are considered relevant to the occurrence of a specific ADR by the combined information theory model. In comparison, only 5 SNPs are considered relevant to a specific ADR by the decision tree + LASSO regression model. In addition, the new method detects more relevant SNP-ADR pairs that are affected by both SNP and dosage. This implies that the new information theory-based model is effective in discovering correlations between ADRs and CYP450 SNPs and is helpful in predicting the potentially vulnerable genotype for some ADRs. Conclusion: The newly proposed information theory-based model shows superior performance in detecting the relation between SNP and ADR compared to the decision tree + LASSO regression model. The new model is more sensitive in detecting ADRs than the old method, while the old method is more reliable. Therefore, the choice of algorithm should depend on pragmatic needs.
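The consensus rule described in this abstract (keep a SNP only when enough selection algorithms agree) can be sketched in a few lines of Python. The SNP identifiers and selection lists below are invented for illustration and are not data from the study; the vote threshold is left as a parameter, since "more than two" of three algorithms corresponds to `min_votes=3`.

```python
from collections import Counter

def consensus_features(selections, min_votes):
    """Return features chosen by at least `min_votes` selection algorithms.

    `selections` maps an algorithm name to the list of features it selected.
    """
    votes = Counter(
        feature
        for chosen in selections.values()
        for feature in set(chosen)  # count each algorithm at most once
    )
    return sorted(f for f, n in votes.items() if n >= min_votes)

# Hypothetical selections by the three algorithms named in the abstract.
selections = {
    "JMI": ["snp_a", "snp_b", "snp_c"],
    "CMIM": ["snp_a", "snp_c", "snp_d"],
    "mRMR": ["snp_a", "snp_b", "snp_d"],
}

strict = consensus_features(selections, min_votes=3)
lenient = consensus_features(selections, min_votes=2)
print(strict, lenient)
```

Raising the threshold trades sensitivity for reliability, which mirrors the trade-off the authors describe between their model and the baseline.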
33

Schmid, Matthias, Olaf Gefeller, Elisabeth Waldmann, Andreas Mayr, and Tobias Hepp. "Approaches to Regularized Regression – A Comparison between Gradient Boosting and the Lasso." Methods of Information in Medicine 55, no. 05 (May 2016): 422–30. http://dx.doi.org/10.3414/me16-01-0033.

Annotation:
Summary. Background: Penalization and regularization techniques for statistical modeling have attracted increasing attention in biomedical research due to their advantages in the presence of high-dimensional data. A special focus lies on algorithms that incorporate automatic variable selection, like the least absolute shrinkage and selection operator (lasso) or statistical boosting techniques. Objectives: Focusing on the linear regression framework, this article compares the two most common techniques for this task, the lasso and gradient boosting, both from a methodological and a practical perspective. Methods: We describe these methods, highlighting under which circumstances their results will coincide in low-dimensional settings. In addition, we carry out extensive simulation studies comparing the performance in settings with more predictors than observations and investigate multiple combinations of noise-to-signal ratio and number of true non-zero coefficients. Finally, we examine the impact of different tuning methods on the results. Results: Both methods carry out penalization and variable selection for possibly high-dimensional data, often resulting in very similar models. An advantage of the lasso is its faster run-time; a strength of the boosting concept is its modular nature, making it easy to extend to other regression settings. Conclusions: Although following different strategies with respect to optimization and regularization, both methods imply similar constraints on the estimation problem, leading to comparable performance regarding prediction accuracy and variable selection in practice.
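To make the lasso's automatic variable selection concrete, the following is a minimal pure-Python sketch of cyclic coordinate descent with soft-thresholding for the objective (1/2n)·||y − Xb||² + λ·||b||₁. It is a generic textbook solver, not the implementation compared in the article; the toy design matrix, the response, and the value of `lam` are assumptions chosen so the effect is easy to follow.

```python
def soft_threshold(z, t):
    """Shrink z towards zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Covariance of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta

# Toy data (hypothetical): two orthogonal predictors, y = 2*x1 + 0.1*x2.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.1, -1.9, 1.9, -2.1]

beta_hat = lasso_cd(X, y, lam=0.5)
print(beta_hat)
```

With these orthogonal, unit-scale columns, the strong coefficient is shrunk from 2 to about 1.5 and the weak one is set exactly to zero, which is the shrinkage-plus-selection behaviour the abstract refers to.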
34

Jing, Liping, Michael K. Ng, and Tieyong Zeng. "On Gene Selection and Classification for Cancer Microarray Data Using Multi-Step Clustering and Sparse Representation." Advances in Adaptive Data Analysis 03, no. 01n02 (April 2011): 127–48. http://dx.doi.org/10.1142/s1793536911000763.

Annotation:
Microarray data profile gene expression on a whole-genome scale and provide a good way to study associations between gene expression and the occurrence or progression of cancer. Many researchers have realized that microarray data are useful for predicting cancer cases. However, the high dimension of gene expression data, which is significantly larger than the sample size, makes this task very difficult. It is very important to identify the significant genes causing cancer. Many feature selection algorithms have been proposed that focus on improving cancer predictive accuracy at the expense of ignoring the correlations between the features. In this work, a novel framework (named SGS) is presented for significant gene selection and efficient cancer case classification. The proposed framework first performs a clustering algorithm to find gene groups in which the genes have a higher correlation coefficient, then selects (1) the significant genes in each group using the Bayesian Lasso method and (2) important gene groups using the group Lasso method, and finally builds a prediction model on the reduced gene space with an efficient classification algorithm (such as support vector machine (SVM), 1NN, and regression). Experimental results on publicly available microarray data show that the proposed framework often outperforms existing feature selection and prediction methods such as SAM, information gain (IG), and Lasso-type prediction models.
35

Youngstrom, Eric A., Tate F. Halverson, Jennifer K. Youngstrom, Oliver Lindhiem, and Robert L. Findling. "Evidence-Based Assessment From Simple Clinical Judgments to Statistical Learning: Evaluating a Range of Options Using Pediatric Bipolar Disorder as a Diagnostic Challenge." Clinical Psychological Science 6, no. 2 (December 8, 2017): 243–65. http://dx.doi.org/10.1177/2167702617741845.

Annotation:
Reliability of clinical diagnoses is often low. There are many algorithms that could improve diagnostic accuracy, and statistical learning is becoming popular. Using pediatric bipolar disorder as a clinically challenging example, we evaluated a series of increasingly complex models ranging from simple screening to a supervised LASSO (least absolute shrinkage and selection operator) regression in a large (N = 550) academic clinic sample. We then externally validated models in a community clinic (N = 511) with the same candidate predictors and semistructured interview diagnoses, providing high methodological consistency; the clinics also had substantially different demography and referral patterns. Models performed well according to internal validation metrics. Complex models degraded rapidly when externally validated. Naive Bayesian and logistic models concentrating on predictors identified in prior meta-analyses tied or bettered LASSO models when externally validated. Implementing these methods would improve clinical diagnostic performance. Statistical learning research should continue to invest in high-quality indicators and diagnoses to supervise model training.
36

Liu, Pengfei, and Weidong Tian. "Identification of DNA methylation patterns and biomarkers for clear-cell renal cell carcinoma by multi-omics data analysis." PeerJ 8 (August 3, 2020): e9654. http://dx.doi.org/10.7717/peerj.9654.

Annotation:
Background: Tumorigenesis is highly heterogeneous, and clinicopathological signatures alone are not enough to effectively distinguish clear cell renal cell carcinoma (ccRCC) and improve risk stratification of patients. DNA methylation (DNAm), which is stable yet reversible, often occurs in the early stage of tumorigenesis. Disorders of transcription and metabolism are also important molecular mechanisms of tumorigenesis. Therefore, it is necessary to identify effective biomarkers involved in tumorigenesis through multi-omics analysis; these biomarkers also provide new potential therapeutic targets. Method: The discovery stage involved 160 pairs of ccRCC and matched normal tissues for investigation of DNAm and biomarkers, as well as 318 cases of ccRCC including clinical signatures. Correlation analysis of epigenetic, transcriptomic and metabolomic data revealed the connection and discordance among multi-omics and the deregulated functional modules. Diagnostic or prognostic biomarkers were obtained by the correlation analysis, the Least Absolute Shrinkage and Selection Operator (LASSO) and the LASSO-Cox methods. Two classifiers were established based on random forest (RF) and LASSO-Cox algorithms in training datasets. Seven independent datasets were used to evaluate robustness and universality. The molecular biological functions of the biomarkers were investigated using DAVID and GeneMANIA. Results: Based on multi-omics analysis, the epigenetic measurements uniquely identified DNAm dysregulation of cellular mechanisms resulting in transcriptomic alterations, including cell proliferation, immune response and inflammation. Combination of the gene co-expression network and metabolic network identified 134 CpG sites (CpGs) as potential biomarkers. Based on the LASSO and RF algorithms, five CpGs were obtained to build a diagnostic classifier with better classification performance (AUC > 99%). An eight-CpG-based prognostic classifier was obtained to improve risk stratification (hazard ratio (HR) > 4; log-rank test, p-value < 0.01). Based on independent datasets and seven additional cancers, the diagnostic and prognostic classifiers also had better robustness and stability. The molecular biological functions of genes with abnormal methylation were significantly associated with glycolysis/gluconeogenesis and signal transduction. Conclusion: The present study provides a comprehensive analysis of ccRCC using multi-omics data. These findings indicate that multi-omics analysis can identify novel epigenetic factors, which were the most important causes of advanced cancer and poor clinical prognosis. Diagnostic and prognostic biomarkers were identified, providing a promising avenue for developing effective therapies for ccRCC.
37

Shi, Yuanyuan, Junyu Zhao, Xianchong Song, Zuoyu Qin, Lichao Wu, Huili Wang, and Jian Tang. "Hyperspectral band selection and modeling of soil organic matter content in a forest using the Ranger algorithm." PLOS ONE 16, no. 6 (June 28, 2021): e0253385. http://dx.doi.org/10.1371/journal.pone.0253385.

Annotation:
Effective soil spectral band selection and modeling methods can improve modeling accuracy. To establish a hyperspectral prediction model of soil organic matter (SOM) content, this study investigated a forested Eucalyptus plantation in Huangmian Forest Farm, Guangxi, China. The Ranger and Lasso algorithms were used to screen spectral bands. Subsequently, models were established using four algorithms: partial least squares regression, random forest (RF), a support vector machine, and an artificial neural network (ANN). The optimal model was then selected. The results showed that the modeling accuracy was higher when band selection was based on the Ranger algorithm than when it was based on the Lasso algorithm. ANN modeling had the best goodness of fit, and the model established by RF had the most stable modeling results. Based on the above results, a new method is proposed in this study for band selection in the early phase of soil hyperspectral modeling. The Ranger algorithm can be applied to screen the spectral bands, and ANN or RF can then be selected to construct the prediction model based on different datasets, which is applicable to establish the prediction model of SOM content in red soil plantations. This study provides a reference for the remote sensing of soil fertility in forests of different soil types and a theoretical basis for developing portable equipment for the hyperspectral measurement of SOM content in forest habitats.
38

Ghosh, Pronab, Sami Azam, Mirjam Jonkman, Asif Karim, F. M. Javed Mehedi Shamrat, Eva Ignatious, Shahana Shultana, Abhijith Reddy Beeravolu, and Friso De Boer. "Efficient Prediction of Cardiovascular Disease Using Machine Learning Algorithms With Relief and LASSO Feature Selection Techniques." IEEE Access 9 (2021): 19304–26. http://dx.doi.org/10.1109/access.2021.3053759.

39

Castelli, Mauro, Maria Dobreva, Roberto Henriques, and Leonardo Vanneschi. "Predicting Days on Market to Optimize Real Estate Sales Strategy." Complexity 2020 (February 25, 2020): 1–22. http://dx.doi.org/10.1155/2020/4603190.

Annotation:
Irregularities and fraud are frequent in the real estate market in Bulgaria due to the substantial lack of rigorous legislation. For instance, agencies frequently publish fake or unavailable apartment listings at a low price to attract the attention of unaware potential customers. For this reason, systems able to identify fake listings and improve the transparency of listing authenticity and availability are much in demand. Recent research has highlighted that the number of days a published listing remains online can correlate strongly with the probability of the listing being fake. An accurate predictive model for the number of days a published listing will stay online can therefore be very helpful for identifying fake listings. In this paper, we investigate the use of four different machine learning algorithms for this task: Lasso, Ridge, Elastic Net, and Artificial Neural Networks. The results, obtained on a vast dataset made available by the Bulgarian company Homeheed, show the appropriateness of Lasso regression.
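Three of the four models compared here (Lasso, Ridge, Elastic Net) differ only in the penalty added to the least-squares loss. A minimal sketch of that relationship follows, assuming the common parametrization with an overall strength `lam` and a mixing weight `l1_ratio`; the abstract does not state which parametrization the authors used, and the coefficient vector is made up.

```python
def elastic_net_penalty(beta, lam, l1_ratio):
    """Elastic net penalty: lam * (l1_ratio * L1 + 0.5 * (1 - l1_ratio) * L2^2).

    l1_ratio = 1.0 gives the lasso penalty, l1_ratio = 0.0 a ridge penalty.
    """
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return lam * (l1_ratio * l1 + 0.5 * (1.0 - l1_ratio) * l2_squared)

beta = [3.0, -4.0, 0.0]  # hypothetical coefficients

lasso_value = elastic_net_penalty(beta, lam=0.1, l1_ratio=1.0)
ridge_value = elastic_net_penalty(beta, lam=0.1, l1_ratio=0.0)
mixed_value = elastic_net_penalty(beta, lam=0.1, l1_ratio=0.5)
print(lasso_value, ridge_value, mixed_value)
```

Only the L1 part can set coefficients exactly to zero, so Lasso and Elastic Net perform variable selection while Ridge only shrinks.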
40

Zhao, Sipei, and Qiaoxi Zhu. "Comparative study of loudspeaker position optimization techniques for multizone sound field reproduction." INTER-NOISE and NOISE-CON Congress and Conference Proceedings 263, no. 4 (August 1, 2021): 2486–93. http://dx.doi.org/10.3397/in-2021-2150.

Annotation:
Multizone sound field reproduction aims to generate personal sound zones in a shared space with multiple loudspeakers. Conventionally, loudspeakers are placed in a regular pattern such as a circular, arc or linear array; such layouts are empirical rather than optimal and are chosen mainly for the convenience of physical placement. Recently, several algorithms have been proposed to select a fixed number of loudspeaker locations from a large set of candidate positions, such as the sparse regularization methods (i.e. Lasso and Elastic Net), the Constrained Match Pursuit (CMP) method, and the Gram-Schmidt Orthogonalization (GSO) method. Most of these methods were investigated for single-zone rather than multizone sound field reproduction based on pressure matching techniques. This paper compares the performance of state-of-the-art techniques for loudspeaker position optimization in a multizone sound field reproduction system in terms of reproduction error, acoustic contrast and array effort. Simulation results demonstrate that the CMP-LS method shows the best performance in terms of lower MSE and higher AC, while the Lasso method needs the lowest AE.
41

Tian, Mi, Tao Wang, and Peng Wang. "Development and Clinical Validation of a Seven-Gene Prognostic Signature Based on Multiple Machine Learning Algorithms in Kidney Cancer." Cell Transplantation 30 (January 1, 2021): 096368972096917. http://dx.doi.org/10.1177/0963689720969176.

Annotation:
About a third of patients with kidney cancer experience recurrence or cancer-related progression. Clinically, kidney cancer prognoses may be quite different, even in patients with kidney cancer at the same clinical stage. Therefore, there is an urgent need to screen for kidney cancer prognosis biomarkers. Differentially expressed genes (DEGs) were identified using kidney cancer RNA sequencing data from the Gene Expression Omnibus (GEO) database. Biomarkers were screened using random forest (RF) and support vector machine (SVM) models, and a multigene signature was constructed using the least absolute shrinkage and selection operator (LASSO) regression analysis. Univariate and multivariate Cox regression analyses were performed to explore the relationships between clinical features and prognosis. Finally, the reliability and clinical applicability of the model were validated, and relationships with biological pathways were identified. Western blots were also performed to evaluate gene expression. A total of 50 DEGs were obtained by intersecting the RF and SVM models. A seven-gene signature (RNASET2, EZH2, FXYD5, KIF18A, NAT8, CDCA7, and WNT7B) was constructed by LASSO regression. Univariate and multivariate Cox regression analyses showed that the seven-gene signature was an independent prognostic factor for kidney cancer. Finally, a predictive nomogram was established in The Cancer Genome Atlas (TCGA) cohort and validated internally. In tumor tissue, RNASET2 and FXYD5 were highly expressed and NAT8 was lowly expressed at the protein and transcription levels. This model could complement the clinicopathological characteristics of kidney cancer and promote the personalized management of patients with kidney cancer.
42

Shing, Jaimie Zhi, Marie Griffin, James C. Slaughter, Manideepthi Pemmaraju, Edward F. Mitchel, Rachel S. Chang, and Pamela C. Hull. "4486 Assessing the Validity of an ICD-9 and ICD-10 Coding Algorithm for Identifying Cervical Premalignant Lesions Using Administrative Claims Data." Journal of Clinical and Translational Science 4, s1 (June 2020): 45. http://dx.doi.org/10.1017/cts.2020.167.

Annotation:
OBJECTIVES/GOALS: We compared the validity of an International Classification of Diseases, Clinical Modification (ICD) algorithm for identifying high-grade cervical intraepithelial neoplasia and adenocarcinoma in situ (together referred to as CIN2+) from ICD 9th revision (ICD-9) and 10th revision (ICD-10) codes. METHODS/STUDY POPULATION: Using Tennessee Medicaid data, we identified cervical diagnostic procedures in 2008-2017 among females aged 18-39 years in Davidson County, TN. Gold-standard cases were pathology-confirmed CIN2+ diagnoses validated by HPV-IMPACT, a population-based surveillance project in catchment areas of five US states. Procedures in the ICD transition year (2015) were excluded to account for implementation lag. We pre-grouped diagnosis and procedure codes by theme. We performed feature selection using least absolute shrinkage and selection operator (LASSO) logistic regression with 10-fold cross validation and validated models by ICD-9 era (2008-2014, N = 6594) and ICD-10 era (2016-2017, N = 1270). RESULTS/ANTICIPATED RESULTS: Of 7864 cervical diagnostic procedures, 880 (11%) were true CIN2+ cases. LASSO logistic regression selected the strongest features of case status: Having codes for a CIN2+ tissue diagnosis, non-specific CIN tissue diagnosis, high-grade squamous intraepithelial lesion, receiving a cervical treatment procedure, and receiving a cervical/vaginal biopsy. Features of non-case status were codes for a CIN1 tissue diagnosis, Pap test, and HPV DNA test. The ICD-9 vs ICD-10 algorithms predicted case status with 68% vs 63% sensitivity, 95% vs 94% specificity, 63% vs 64% positive predictive value, 96% vs 94% negative predictive value, 92% vs 89% accuracy, and C-indices of 0.95 vs 0.92, respectively. DISCUSSION/SIGNIFICANCE OF IMPACT: Overall, the algorithm’s validity for identifying CIN2+ case status was similar between coding versions. ICD-9 had slightly better discriminative ability. 
Results support a prior study concluding that ICD-10 implementation has not substantially improved the quality of administrative data from ICD-9.
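The validity measures reported in this abstract (sensitivity, specificity, positive and negative predictive value, accuracy) all derive from the four cells of a confusion matrix. The sketch below shows the arithmetic; the counts are hypothetical round numbers chosen for readability and are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard validity measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true cases correctly flagged
        "specificity": tn / (tn + fp),  # non-cases correctly cleared
        "ppv": tp / (tp + fp),          # flagged records that are true cases
        "npv": tn / (tn + fn),          # cleared records that are non-cases
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts: 100 true cases among 1000 procedures.
metrics = diagnostic_metrics(tp=60, fp=30, tn=870, fn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When cases are rare, as in this illustration, accuracy and negative predictive value stay high even at moderate sensitivity, which is why the abstract reports all of these measures together.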
43

Toraya, Hideo. "Finding the best-fit background function for whole-powder-pattern fitting using LASSO combined with tree search." Journal of Applied Crystallography 54, no. 2 (February 14, 2021): 427–38. http://dx.doi.org/10.1107/s1600576720016751.

Annotation:
A new linear function for modelling the background in whole-powder-pattern fitting has been derived by applying LASSO (least absolute shrinkage and selection operator) and the technique of tree search. The background function (BGF) consists of terms $b_n^{L}(2\theta/180)^{-n/2}$ and $b_n^{H}(1 - 2\theta/180)^{-n/2}$ for the low- and high-angle sides, respectively. Some variable parameters of the BGF should be fixed at zero while others should be varied in order to find the best fit for a given data set without inducing overfitting. The LASSO algorithm can automatically select the variables in linear regression analysis. However, it finds the best-fit BGF with a set of adjustable parameters for a given data set while it derives a different set of parameters for a different data set. Thus, LASSO derives multiple solutions depending on the data set used. By regarding the individual solutions from LASSO as nodes of trees, tree structures were constructed from these solutions. The root node has the maximum number of adjustable parameters, P. P decreases with descending levels of the tree one by one, and leaf nodes have just one parameter. By evaluating individual solutions (nodes) by their $\chi^2$ index, the best-fit single path from a root node to a leaf node was found. The present BGF can be used simply by varying P in the range 1–10. The BGF thus derived as a final single solution was incorporated into computer programs for Pawley-based whole-powder-pattern decomposition and Rietveld refinement, and the performance of the BGF was tested in comparison with the polynomials currently widely used as the BGF. The present BGF has been demonstrated to be stable and to give an excellent fit, comparable to polynomials but with a smaller number of adjustable parameters and without introducing undulation into the calculated background curve. Basic algorithms used in statistics and machine learning have been demonstrated to be useful in developing an analytical model in X-ray crystallography.
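As a rough illustration of the background function's form, the sketch below evaluates a sum of low-angle terms b·(2θ/180)^(−n/2) and high-angle terms b·(1 − 2θ/180)^(−n/2). The abstract does not say which values of n are retained or whether further terms (such as a constant) are included, so the coefficients are passed as dictionaries keyed by n, and all numbers here are hypothetical.

```python
def background(two_theta, low, high):
    """Evaluate the background at a diffraction angle 2-theta (degrees).

    `low` and `high` map an exponent index n to the coefficients of the
    low-angle and high-angle terms. Terms that are not listed are treated as
    fixed at zero, which is how a LASSO-selected sparse model would look.
    """
    x = two_theta / 180.0
    if not 0.0 < x < 1.0:
        raise ValueError("2-theta must lie strictly between 0 and 180 degrees")
    low_part = sum(b * x ** (-n / 2) for n, b in low.items())
    high_part = sum(b * (1.0 - x) ** (-n / 2) for n, b in high.items())
    return low_part + high_part

# Hypothetical sparse model with one low-angle and one high-angle term.
value = background(45.0, low={2: 8.0}, high={2: 3.0})
print(value)
```

The low-angle terms grow as 2θ approaches 0 and the high-angle terms grow as 2θ approaches 180°, so a handful of selected terms can follow a background that rises at both ends of the pattern.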
44

Gim, Jeong-An, Yonghan Kwon, Hyun A. Lee, Kyeong-Ryoon Lee, Soohyun Kim, Yoonjung Choi, Yu Kyong Kim, and Howard Lee. "A Machine Learning-Based Identification of Genes Affecting the Pharmacokinetics of Tacrolimus Using the DMET™ Plus Platform." International Journal of Molecular Sciences 21, no. 7 (April 4, 2020): 2517. http://dx.doi.org/10.3390/ijms21072517.

Annotation:
Tacrolimus is an immunosuppressive drug with a narrow therapeutic index and large interindividual variability. We identified genetic variants that predict tacrolimus exposure in healthy Korean males using machine learning algorithms such as decision tree, random forest, and least absolute shrinkage and selection operator (LASSO) regression. rs776746 (CYP3A5) and rs1137115 (CYP2A6) are single nucleotide polymorphisms (SNPs) that can affect exposure to tacrolimus. A decision tree, when coupled with random forest analysis, is an efficient tool for predicting exposure to tacrolimus based on genotype. These tools are helpful for determining an individualized dose of tacrolimus.
45

Gosselt, Helen R., Maxime M. A. Verhoeven, Maja Bulatović-Ćalasan, Paco M. Welsing, Maurits C. F. J. de Rotte, Johanna M. W. Hazes, Floris P. J. G. Lafeber, Mark Hoogendoorn, and Robert de Jonge. "Complex Machine-Learning Algorithms and Multivariable Logistic Regression on Par in the Prediction of Insufficient Clinical Response to Methotrexate in Rheumatoid Arthritis." Journal of Personalized Medicine 11, no. 1 (January 14, 2021): 44. http://dx.doi.org/10.3390/jpm11010044.

Annotation:
The goals of this study were to examine whether machine-learning algorithms outperform multivariable logistic regression in the prediction of insufficient response to methotrexate (MTX); secondly, to examine which features are essential for correct prediction; and finally, to investigate whether the best performing model specifically identifies insufficient responders to MTX (combination) therapy. The prediction of insufficient response (3-month Disease Activity Score 28-Erythrocyte-sedimentation rate (DAS28-ESR) > 3.2) was assessed using logistic regression, least absolute shrinkage and selection operator (LASSO), random forest, and extreme gradient boosting (XGBoost). The baseline features of 355 rheumatoid arthritis (RA) patients from the “treatment in the Rotterdam Early Arthritis CoHort” (tREACH) and the U-Act-Early trial were combined for analyses. The model performances were compared using area under the curve (AUC) of receiver operating characteristic (ROC) curves, 95% confidence intervals (95% CI), and sensitivity and specificity. Finally, the best performing model following feature selection was tested on 101 RA patients starting tocilizumab (TCZ)-monotherapy. Logistic regression (AUC = 0.77, 95% CI: 0.68–0.86) performed as well as LASSO (AUC = 0.76, 95% CI: 0.67–0.85), random forest (AUC = 0.71, 95% CI: 0.61–0.81), and XGBoost (AUC = 0.70, 95% CI: 0.61–0.81), yet logistic regression reached the highest sensitivity (81%). The most important features were baseline DAS28 (components). For all algorithms, models with six features performed similarly to those with 16. When applied to the TCZ-monotherapy group, logistic regression’s sensitivity significantly dropped from 83% to 69% (p = 0.03). In the current dataset, logistic regression performed equally well compared to machine-learning algorithms in the prediction of insufficient response to MTX. Models could be reduced to six features, which is more conducive to clinical implementation. 
Interestingly, the prediction model was specific to MTX (combination) therapy response.
46

Luo, Mi, Yifu Wang, Yunhong Xie, Lai Zhou, Jingjing Qiao, Siyu Qiu, and Yujun Sun. "Combination of Feature Selection and CatBoost for Prediction: The First Application to the Estimation of Aboveground Biomass." Forests 12, no. 2 (February 13, 2021): 216. http://dx.doi.org/10.3390/f12020216.

Annotation:
Increasing numbers of explanatory variables tend to result in information redundancy and “dimensional disaster” in the quantitative remote sensing of forest aboveground biomass (AGB). Feature selection of model factors is an effective method for improving the accuracy of AGB estimates. Machine learning algorithms are also widely used in AGB estimation, although little research has addressed the use of the categorical boosting algorithm (CatBoost) for AGB estimation. Both feature selection and regression for AGB estimation models are typically performed with the same machine learning algorithm, but there is no evidence to suggest that this is the best method. Therefore, the present study focuses on evaluating the performance of the CatBoost algorithm for AGB estimation and comparing the performance of different combinations of feature selection methods and machine learning algorithms. AGB estimation models of four forest types were developed based on Landsat OLI data using three feature selection methods (recursive feature elimination (RFE), variable selection using random forests (VSURF), and least absolute shrinkage and selection operator (LASSO)) and three machine learning algorithms (random forest regression (RFR), extreme gradient boosting (XGBoost), and categorical boosting (CatBoost)). Feature selection had a significant influence on AGB estimation. RFE preserved the most informative features for AGB estimation and was superior to VSURF and LASSO. In addition, CatBoost improved the accuracy of the AGB estimation models compared with RFR and XGBoost. AGB estimation models using RFE for feature selection and CatBoost as the regression algorithm achieved the highest accuracy, with root mean square errors (RMSEs) of 26.54 Mg/ha for coniferous forest, 24.67 Mg/ha for broad-leaved forest, 22.62 Mg/ha for mixed forests, and 25.77 Mg/ha for all forests. 
The combination of RFE and CatBoost performed better than the VSURF–RFR combination, in which random forests were used for both feature selection and regression, indicating that feature selection and regression performed by a single machine learning algorithm may not always ensure optimal AGB estimation. Extending the application of new machine learning algorithms and feature selection methods is a promising way to improve the accuracy of AGB estimates.
47

Mashayekhi, Morteza, and Robin Gras. "Rule Extraction from Decision Trees Ensembles: New Algorithms Based on Heuristic Search and Sparse Group Lasso Methods." International Journal of Information Technology & Decision Making 16, no. 06 (November 2017): 1707–27. http://dx.doi.org/10.1142/s0219622017500055.

Annotation:
Decision trees are examples of easily interpretable models whose predictive accuracy is normally low. In comparison, decision tree ensembles (DTEs) such as random forest (RF) exhibit high predictive accuracy while being regarded as black-box models. We propose three new rule extraction algorithms from DTEs. The RF[Formula: see text]DHC method, a hill climbing method with downhill moves (DHC), is used to search for a rule set that decreases the number of rules dramatically. In the RF[Formula: see text]SGL and RF[Formula: see text]MSGL methods, the sparse group lasso (SGL) method, and the multiclass SGL (MSGL) method are employed respectively to find a sparse weight vector corresponding to the rules generated by RF. Experimental results with 24 data sets show that the proposed methods outperform similar state-of-the-art methods, in terms of human comprehensibility, by greatly reducing the number of rules and limiting the number of antecedents in the retained rules, while preserving the same level of accuracy.
48

Liu, Xiaoli, Jianzhong Wang, Fulong Ren, and Jun Kong. "Group Guided Fused Laplacian Sparse Group Lasso for Modeling Alzheimer’s Disease Progression." Computational and Mathematical Methods in Medicine 2020 (February 7, 2020): 1–23. http://dx.doi.org/10.1155/2020/4036560.

Annotation:
As the largest cause of dementia, Alzheimer’s disease (AD) has brought serious burdens to patients and their families, mostly in the financial, psychological, and emotional aspects. In order to assess the progression of AD and develop new treatment methods for the disease, it is essential to infer the trajectories of patients’ cognitive performance over time and to identify biomarkers that connect the patterns of brain atrophy and AD progression. In this article, a structured regularized regression approach termed group guided fused Laplacian sparse group Lasso (GFL-SGL) is proposed to infer disease progression by considering multiple predictions of the same cognitive scores at different time points (longitudinal analysis). The proposed GFL-SGL simultaneously exploits the interrelated structures within the MRI features and among the tasks with the sparse group Lasso (SGL) norm and presents a novel group guided fused Laplacian (GFL) regularization. This combination effectively incorporates both the relatedness among multiple longitudinal time points, through a general weighted (undirected) dependency graph, and the useful inherent group structure in the features. Furthermore, an algorithm based on the alternating direction method of multipliers (ADMM) is derived to optimize the nonsmooth objective function of the proposed approach. Experiments on the dataset from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) show that the proposed GFL-SGL outperformed some other state-of-the-art algorithms and effectively fused the multimodality data. The compact sets of cognition-relevant imaging biomarkers identified by our approach are consistent with the results of clinical studies.
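The sparse group Lasso norm mentioned in this abstract combines an elementwise L1 penalty with a groupwise L2 penalty. As a rough illustration only, and not the GFL-SGL algorithm of the article, the proximal operator of that plain penalty (the kind of subproblem that ADMM-style or proximal solvers typically handle) can be sketched as follows. The function names, the group layout, and the weights `lam1` and `lam2` are assumptions made for this example.

```python
import math


def soft_threshold(x, t):
    """Scalar soft-thresholding: the proximal operator of t * |x|."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def prox_sparse_group_lasso(w, groups, lam1, lam2):
    """Proximal operator of lam1 * ||w||_1 + lam2 * sum_g ||w_g||_2.

    `groups` is a list of index lists that partition range(len(w)).
    All names here are illustrative, not taken from the article.
    """
    out = [0.0] * len(w)
    for idx in groups:
        # Step 1 (lasso part): shrink each coefficient toward zero.
        shrunk = [soft_threshold(w[i], lam1) for i in idx]
        # Step 2 (group part): rescale the whole group, or zero it out
        # when its norm does not exceed lam2.
        norm = math.sqrt(sum(v * v for v in shrunk))
        scale = max(1.0 - lam2 / norm, 0.0) if norm > 0.0 else 0.0
        for i, v in zip(idx, shrunk):
            out[i] = scale * v
    return out


# Two groups of two features each; the second group is removed entirely.
result = prox_sparse_group_lasso(
    [4.0, -5.0, 0.5, 0.2], [[0, 1], [2, 3]], lam1=1.0, lam2=2.5
)
```

In this toy call the first group survives with shrunken coefficients, while the second group is set to zero as a whole, which is the group-level sparsity that the SGL norm is meant to encourage.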
49

Vasudha Bahl, Nidhi Sengar, Manu Shahi, Abhay Singh, and Amita Goel. “Machine Learning House Price Prediction.” International Journal for Modern Trends in Science and Technology 6, no. 12 (December 13, 2020): 186–89. http://dx.doi.org/10.46501/ijmtst061236.

Annotation:
This document presents the implementation of machine learning algorithms for predicting house and real estate prices. Because house and real estate prices change with market conditions, they are very difficult to predict with conventional methods, which may sometimes give exaggerated results that incur losses. To predict prices more accurately and precisely, we base the predictions on the statistics of the particular area, which capture the trends and factors on which the price depends. To analyse these data, several algorithms are used, namely random forest, linear regression, lasso regression, etc. Using these algorithms decreases the margin of error and yields more precise results. We therefore recommend that real estate agents, house vendors, and the general public consider the model for better valuation of houses. The model can also be integrated with real estate websites to give better recommendations based on prices using machine learning algorithms.
50

Bretó, Carles, Priscila Espinosa, Penélope Hernández, and Jose M. Pavía. “An Entropy-Based Machine Learning Algorithm for Combining Macroeconomic Forecasts.” Entropy 21, no. 10 (October 19, 2019): 1015. http://dx.doi.org/10.3390/e21101015.

Annotation:
This paper applies a Machine Learning approach with the aim of providing a single aggregated prediction from a set of individual predictions. Departing from the well-known maximum-entropy inference methodology, a new factor capturing the distance between the true and the estimated aggregated predictions presents a new problem. Algorithms such as ridge, lasso or elastic net help in finding a new methodology to tackle this issue. We carry out a simulation study to evaluate the performance of such a procedure and apply it in order to forecast and measure predictive ability using a dataset of predictions on Spanish gross domestic product.
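To make the role of ridge, lasso, and elastic net penalties in forecast combination more concrete, the sketch below fits combination weights by coordinate descent on a penalized least-squares objective. It is a generic illustration under stated assumptions, not the entropy-based procedure of the article: the function `combine_forecasts`, the data layout `F[t][j]`, and the parameters `lam` and `alpha` are hypothetical names chosen for this example (`alpha=1` gives lasso, `alpha=0` gives ridge, values in between give elastic net).

```python
def soft_threshold(x, t):
    """Scalar soft-thresholding used by the lasso part of the penalty."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0


def combine_forecasts(F, y, lam, alpha, n_iter=200):
    """Elastic-net weights for combining individual forecasts.

    Minimizes (1/2n) * sum_t (y[t] - sum_j F[t][j] * w[j])**2
              + lam * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2**2)
    by cyclic coordinate descent. F[t][j] is forecaster j's prediction
    at time t and y[t] is the realized value (illustrative layout).
    """
    n, p = len(F), len(F[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of forecaster j with the partial residual
            z = 0.0    # mean squared value of forecaster j
            for t in range(n):
                pred_without_j = sum(F[t][k] * w[k] for k in range(p) if k != j)
                rho += F[t][j] * (y[t] - pred_without_j)
                z += F[t][j] ** 2
            rho /= n
            z /= n
            denom = z + lam * (1.0 - alpha)
            # Guard against an all-zero forecaster with no ridge term.
            w[j] = soft_threshold(rho, lam * alpha) / denom if denom > 0.0 else 0.0
    return w


# Toy data: the realized series is the average of the two forecasters.
F = [[1.0, 2.0], [2.0, 1.0], [3.0, 3.0]]
y = [1.5, 1.5, 3.0]
w_unpenalized = combine_forecasts(F, y, lam=0.0, alpha=1.0)
w_heavy_lasso = combine_forecasts(F, y, lam=100.0, alpha=1.0)
```

With no penalty the weights approach equal weighting on this toy data, while a very large lasso penalty drives every weight to zero. Real applications would choose `lam` and `alpha` by cross-validation or a similar out-of-sample criterion.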