Academic literature on the topic 'Best Subset Selection'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Best Subset Selection.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Best Subset Selection"

1

Groz, Benoît, and Silviu Maniu. "Hypervolume Subset Selection with Small Subsets." Evolutionary Computation 27, no. 4 (December 2019): 611–37. http://dx.doi.org/10.1162/evco_a_00235.

Full text
Abstract:
The hypervolume subset selection problem (HSSP) aims at approximating a set of n multidimensional points in R^d with an optimal subset of a given size. The size k of the subset is a parameter of the problem, and an approximation is considered best when it maximizes the hypervolume indicator. This problem has proved popular in recent years as a procedure for multiobjective evolutionary algorithms. Efficient algorithms are known for planar points (d = 2), but there are hardly any results on HSSP in larger dimensions (d ≥ 3). So far, most algorithms in higher dimensions essentially enumerate all possible subsets to determine the optimal one, and most of the effort has been directed toward improving the efficiency of hypervolume computation. We propose efficient algorithms for the selection problem in dimension 3 when either k or n − k is small, and extend our techniques to arbitrary dimensions for small k.
APA, Harvard, Vancouver, ISO, and other styles
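The abstract above concerns maximizing the hypervolume indicator over fixed-size subsets. As a point of reference only, here is a minimal brute-force sketch (not the authors' algorithm) for the planar case d = 2, assuming minimization objectives; the reference point, the toy front, and all names are illustrative.

```python
from itertools import combinations

def hypervolume_2d(points, ref):
    """Hypervolume dominated by 2-D points w.r.t. a reference point
    (minimization: smaller coordinates are better)."""
    hv, prev_y = 0.0, ref[1]
    for x, y in sorted(points):          # sweep by increasing first objective
        if y < prev_y:                   # skip dominated points
            hv += (ref[0] - x) * (prev_y - y)
            prev_y = y
    return hv

def best_hv_subset(points, k, ref):
    """Exhaustively find the size-k subset maximizing the hypervolume."""
    return max(combinations(points, k), key=lambda s: hypervolume_2d(s, ref))

front = [(1, 5), (2, 3), (3, 2), (5, 1)]
print(best_hv_subset(front, 2, ref=(6, 6)))
```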
2

Tamura, Ryuta, Ken Kobayashi, Yuichi Takano, Ryuhei Miyashiro, Kazuhide Nakata, and Tomomi Matsui. "Best Subset Selection for Eliminating Multicollinearity." Journal of the Operations Research Society of Japan 60, no. 3 (2017): 321–36. http://dx.doi.org/10.15807/jorsj.60.321.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Alrefaei, Mahmoud H., and Mohammad Almomani. "Subset selection of best simulated systems." Journal of the Franklin Institute 344, no. 5 (August 2007): 495–506. http://dx.doi.org/10.1016/j.jfranklin.2006.02.020.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Bofinger, Eve, and Kerrie Mengersen. "Subset selection of the t best populations." Communications in Statistics - Theory and Methods 15, no. 10 (January 1986): 3145–61. http://dx.doi.org/10.1080/03610928608829299.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Takano, Yuichi, and Ryuhei Miyashiro. "Best subset selection via cross-validation criterion." TOP 28, no. 2 (February 14, 2020): 475–88. http://dx.doi.org/10.1007/s11750-020-00538-1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Gupta, Shanti S., and Hwa-Ming Yang. "Subset selection procedures for the best population." Journal of Statistical Planning and Inference 12 (January 1985): 213–33. http://dx.doi.org/10.1016/0378-3758(85)90071-0.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Laan, Paul van der. "Subset Selection of an Almost Best Treatment." Biometrical Journal 34, no. 6 (1992): 647–56. http://dx.doi.org/10.1002/bimj.4710340602.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Zhang, Zhongheng. "Variable selection with stepwise and best subset approaches." Annals of Translational Medicine 4, no. 7 (April 2016): 136. http://dx.doi.org/10.21037/atm.2016.03.35.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Zhu, Junxian, Canhong Wen, Jin Zhu, Heping Zhang, and Xueqin Wang. "A polynomial algorithm for best-subset selection problem." Proceedings of the National Academy of Sciences 117, no. 52 (December 16, 2020): 33117–23. http://dx.doi.org/10.1073/pnas.2014241117.

Full text
Abstract:
Best-subset selection aims to find a small subset of predictors, so that the resulting linear model is expected to have the most desirable prediction accuracy. It is not only important and imperative in regression analysis but also has far-reaching applications in every facet of research, including computer science and medicine. We introduce a polynomial algorithm, which, under mild conditions, solves the problem. This algorithm exploits the idea of sequencing and splicing to reach a stable solution in finite steps when the sparsity level of the model is fixed but unknown. We define an information criterion that helps the algorithm select the true sparsity level with a high probability. We show that when the algorithm produces a stable optimal solution, that solution is the oracle estimator of the true parameters with probability one. We also demonstrate the power of the algorithm in several numerical studies.
APA, Harvard, Vancouver, ISO, and other styles
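To give a flavor of the support-swapping ("splicing") idea mentioned above, the following is a heavily simplified sketch: for a fixed sparsity level k, it repeatedly exchanges one in-support predictor for one out-of-support predictor whenever the swap lowers the residual sum of squares. This is not the authors' certified polynomial algorithm; the marginal-association initialization and the toy data are illustrative assumptions.

```python
import numpy as np
from itertools import product

def rss(X, y, support):
    Xs = X[:, sorted(support)]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    r = y - Xs @ beta
    return float(r @ r)

def swap_best_subset(X, y, k, max_iter=100):
    p = X.shape[1]
    support = set(np.argsort(-np.abs(X.T @ y))[:k])   # start from marginal association
    best = rss(X, y, support)
    for _ in range(max_iter):
        improved = False
        for j_in, j_out in product(sorted(support), set(range(p)) - support):
            cand = (support - {j_in}) | {j_out}
            val = rss(X, y, cand)
            if val < best - 1e-12:                    # accept an improving swap
                support, best, improved = cand, val, True
                break
        if not improved:
            break
    return sorted(support), best

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
y = 3 * X[:, 2] - 2 * X[:, 7] + 0.1 * rng.standard_normal(100)
print(swap_best_subset(X, y, k=2))   # expected support: [2, 7]
```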
10

Bertsimas, Dimitris, Angela King, and Rahul Mazumder. "Best subset selection via a modern optimization lens." Annals of Statistics 44, no. 2 (April 2016): 813–52. http://dx.doi.org/10.1214/15-aos1388.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Dissertations / Theses on the topic "Best Subset Selection"

1

Zhang, Tao. "Discrepancy-based algorithms for best-subset model selection." Diss., University of Iowa, 2013. https://ir.uiowa.edu/etd/4800.

Full text
Abstract:
The selection of a best-subset regression model from a candidate family is a common problem that arises in many analyses. In best-subset model selection, we consider all possible subsets of regressor variables; thus, numerous candidate models may need to be fit and compared. One of the main challenges of best-subset selection arises from the size of the candidate model family: specifically, the probability of selecting an inappropriate model generally increases as the size of the family increases. For this reason, it is usually difficult to select an optimal model when best-subset selection is attempted based on a moderate to large number of regressor variables. Model selection criteria are often constructed to estimate discrepancy measures used to assess the disparity between each fitted candidate model and the generating model. The Akaike information criterion (AIC) and the corrected AIC (AICc) are designed to estimate the expected Kullback-Leibler (K-L) discrepancy. For best-subset selection, both AIC and AICc are negatively biased, and the use of either criterion will lead to overfitted models. To correct for this bias, we introduce a criterion AICi, which has a penalty term evaluated from Monte Carlo simulation. A multistage model selection procedure AICaps, which utilizes AICi, is proposed for best-subset selection. In the framework of linear regression models, the Gauss discrepancy is another frequently applied measure of proximity between a fitted candidate model and the generating model. Mallows' conceptual predictive statistic (Cp) and the modified Cp (MCp) are designed to estimate the expected Gauss discrepancy. For best-subset selection, Cp and MCp exhibit negative estimation bias. To correct for this bias, we propose a criterion CPSi that again employs a penalty term evaluated from Monte Carlo simulation. We further devise a multistage procedure, CPSaps, which selectively utilizes CPSi. In this thesis, we consider best-subset selection in two different modeling frameworks: linear models and generalized linear models. Extensive simulation studies are compiled to compare the selection behavior of our methods and other traditional model selection criteria. We also apply our methods to a model selection problem in a study of bipolar disorder.
APA, Harvard, Vancouver, ISO, and other styles
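For readers unfamiliar with the criteria named above, this is a small illustrative sketch of best-subset selection by minimizing AICc over all candidate subsets of a Gaussian linear model; the thesis' Monte-Carlo-corrected criteria (AICi, CPSi) and multistage procedures are not reproduced, and the toy data are made up.

```python
import numpy as np
from itertools import combinations

def aic_aicc(X, y, subset):
    n = len(y)
    Xs = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = float(np.sum((y - Xs @ beta) ** 2))
    k = Xs.shape[1] + 1                          # regression coefficients + error variance
    aic = n * np.log(rss / n) + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)   # small-sample correction
    return aic, aicc

def best_subset_by_aicc(X, y, max_size):
    candidates = [s for r in range(1, max_size + 1)
                  for s in combinations(range(X.shape[1]), r)]
    return min(candidates, key=lambda s: aic_aicc(X, y, s)[1])

rng = np.random.default_rng(1)
X = rng.standard_normal((80, 6))
y = 2 * X[:, 0] - X[:, 3] + rng.standard_normal(80)
print(best_subset_by_aicc(X, y, max_size=3))     # expected subset: (0, 3)
```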
2

Carter, Knute Derek. "Best-subset model selection based on multitudinal assessments of likelihood improvements." Diss., University of Iowa, 2013. https://ir.uiowa.edu/etd/5726.

Full text
Abstract:
Given a set of potential explanatory variables, one model selection approach is to select the best model, according to some criterion, from among the collection of models defined by all possible subsets of the explanatory variables. A popular procedure that has been used in this setting is to select the model that results in the smallest value of the Akaike information criterion (AIC). One drawback in using the AIC is that it can lead to the frequent selection of overspecified models. This can be problematic if the researcher wishes to assert, with some level of certainty, the necessity of any given variable that has been selected. This thesis develops a model selection procedure that allows the researcher to nominate, a priori, the probability at which overspecified models will be selected from among all possible subsets. The procedure seeks to determine if the inclusion of each candidate variable results in a sufficiently improved fitting term, and hence is referred to as the SIFT procedure. In order to determine whether there is sufficient evidence to retain a candidate variable or not, a set of threshold values are computed. Two procedures are proposed: a naive method based on a set of restrictive assumptions; and an empirical permutation-based method. Graphical tools have also been developed to be used in conjunction with the SIFT procedure. The graphical representation of the SIFT procedure clarifies the process being undertaken. Using these tools can also assist researchers in developing a deeper understanding of the data they are analyzing. The naive and empirical SIFT methods are investigated by way of simulation under a range of conditions within the standard linear model framework. The performance of the SIFT methodology is compared with model selection by minimum AIC; minimum Bayesian Information Criterion (BIC); and backward elimination based on p-values. The SIFT procedure is found to behave as designed—asymptotically selecting those variables that characterize the underlying data generating mechanism, while limiting the selection of false or spurious variables to the desired level. The SIFT methodology offers researchers a promising new approach to model selection, whereby they are now able to control the probability of selecting an overspecified model to a level that best suits their needs.
APA, Harvard, Vancouver, ISO, and other styles
3

Haeggström, Andreas, and Jennie Sund. "Prognosmodell för svenska läns bruttoregionalprodukt (BRP): En komparativ analys av bayesian model averaging, best subset selection och en longitudinell modell" [Forecasting model for the gross regional product (BRP) of Swedish counties: a comparative analysis of Bayesian model averaging, best subset selection, and a longitudinal model]. Thesis, Umeå universitet, Statistik, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-105339.

Full text
Abstract:
The primary aim of this thesis is to build a forecasting model for the gross regional product (BRP) of Sweden's 21 counties. The need for such a model is motivated by the fact that Statistics Sweden (SCB) currently publishes the definitive BRP figures with a two-year delay, so regional decision-makers may want an indication of how BRP has developed over the two most recent years. The method used is Bayesian model averaging (BMA), which is evaluated and compared with two other methods: a multiple linear model estimated by ordinary least squares with variable selection performed by best subset selection (BSS), and a time-series model referred to here as a longitudinal model (LM). Among other things, the results show that the models suffer from multicollinearity. How well the three methods predict BRP is evaluated with the validation set approach and reported with several precision measures; one of them, the mean absolute percentage error (MAPE), was 6.67% for BMA, 6.61% for BSS, and 4.08% for LM.
APA, Harvard, Vancouver, ISO, and other styles
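The MAPE figures quoted above follow the standard definition of the mean absolute percentage error; a minimal sketch with invented numbers (not the thesis data):

```python
import numpy as np

def mape(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

actual   = [100.0, 120.0, 95.0, 130.0]    # hold-out (validation set) values
forecast = [ 96.0, 126.0, 99.0, 124.0]    # model forecasts
print(f"MAPE = {mape(actual, forecast):.2f}%")
```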
4

Liu, Feng-Chi (劉峰旗). "Best Subset Selection of AR-GARCH Models." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/45538737888749003214.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

McLean, Kevin. "Obtaining the Best Model Predictions and Parameter Estimates Using Limited Data." Thesis, 2011. http://hdl.handle.net/1974/6757.

Full text
Abstract:
Engineers who develop fundamental models for chemical processes are often unable to estimate all of the model parameters due to problems with parameter identifiability and estimability. The literature concerning these two concepts is reviewed and techniques for assessing parameter identifiability and estimability in nonlinear dynamic models are summarized. Modellers often face estimability problems when the available data are limited or noisy. In this situation, modellers must decide whether to conduct new experiments, change the model structure, or to estimate only a subset of the parameters and leave others at fixed values. Estimating only a subset of important model parameters is a technique often used by modellers who face estimability problems and it may lead to better model predictions with lower mean squared error (MSE) than the full model with all parameters estimated. Different methods in the literature for parameter subset selection are discussed and compared. An orthogonalization algorithm combined with a recent MSE-based criterion has been used successfully to rank parameters from most to least estimable and to determine the parameter subset that should be estimated to obtain the best predictions. In this work, this strategy is applied to a batch reactor model using additional data and results are compared with computationally-expensive leave-one-out cross-validation. A new simultaneous ranking and selection technique based on this MSE criterion is also described. Unfortunately, results from these parameter selection techniques are sensitive to the initial parameter values and the uncertainty factors used to calculate sensitivity coefficients. A robustness test is proposed and applied to assess the sensitivity of the selected parameter subset to the initial parameter guesses. The selected parameter subsets are compared with those selected using another MSE-based method proposed by Chu et al. (2009). The computational efforts of these methods are compared and recommendations are provided to modellers.
Thesis (Master, Chemical Engineering), Queen's University, 2011.
APA, Harvard, Vancouver, ISO, and other styles
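A hedged sketch of the orthogonalization idea summarized above: parameters are ranked from most to least estimable by greedily picking the sensitivity-matrix column with the largest norm and projecting the remaining columns onto its orthogonal complement. The MSE-based cut-off criterion and the robustness test from the thesis are omitted, and the toy sensitivity matrix is invented.

```python
import numpy as np

def rank_parameters(Z):
    """Rank parameters (columns of the scaled sensitivity matrix Z) by estimability."""
    Z = Z.astype(float).copy()
    remaining, order = list(range(Z.shape[1])), []
    while remaining:
        j_star = max(remaining, key=lambda j: np.linalg.norm(Z[:, j]))
        order.append(j_star)
        remaining.remove(j_star)
        v = Z[:, j_star]
        if v @ v > 0:
            for j in remaining:                      # remove the component along v
                Z[:, j] -= (v @ Z[:, j]) / (v @ v) * v
    return order

rng = np.random.default_rng(2)
Z = rng.standard_normal((50, 3))
Z[:, 2] = Z[:, 0] + 1e-3 * rng.standard_normal(50)   # nearly collinear sensitivities
print(rank_parameters(Z))   # one of the two collinear parameters is ranked last
```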
6

Chung, Yu-Wei (莊昱偉). "Monte Carlo Sampling of Sequential Likelihood Procedure for Selecting a Subset of Size s which are contained in the Best t (t>=s) populations." Thesis, 1995. http://ndltd.ncl.edu.tw/handle/09111651118401105662.

Full text
Abstract:
Master's thesis, Department of Mathematics, Tamkang University (ROC year 83).
In many experimental situations one is faced with the problem, e.g. in drug efficiency or crop yields, of selecting the better ones from a given collection. Bechhofer (1954) developed a procedure based on a predetermined number of observations from normal populations with unknown means and known variance. Mahamunulu (1967) considered a fixed-sample procedure for selecting a subset of size s that contains at least c of the t best populations, where max(1, s + t + 1 − k) ≤ c ≤ s, out of k populations in total.
APA, Harvard, Vancouver, ISO, and other styles

Books on the topic "Best Subset Selection"

1

Hurtado de Mendoza, Diego (1503–1575), Jerry Kelly (book designer, b. 1955), Antoine Morillon (1522–1556), and Association internationale de bibliophilie, eds. Index of best authors: By subject classification compiled in 1547 by Antoine Morillon for Antoine Perrenot de Granvelle, including a selection of Greek manuscripts in the library of Diego Hurtado de Mendoza (Besançon, Bibliothèque Municipale, MS Granvelle 90, ff. 11–18v). [Place of publication not identified]: Prepared for the visit to Besançon by the Association internationale de bibliophilie, 2014.

Find full text
APA, Harvard, Vancouver, ISO, and other styles
2

Hankin, David, Michael S. Mohr, and Kenneth B. Newman. Sampling Theory. Oxford University Press, 2019. http://dx.doi.org/10.1093/oso/9780198815792.001.0001.

Full text
Abstract:
We present a rigorous but understandable introduction to the field of sampling theory for ecologists and natural resource scientists. Sampling theory concerns itself with development of procedures for random selection of a subset of units, a sample, from a larger finite population, and with how to best use sample data to make scientifically and statistically sound inferences about the population as a whole. The inferences fall into two broad categories: (a) estimation of simple descriptive population parameters, such as means, totals, or proportions, for variables of interest, and (b) estimation of uncertainty associated with estimated parameter values. Although the targets of estimation are few and simple, estimates of means, totals, or proportions see important and often controversial uses in management of natural resources and in fundamental ecological research, but few ecologists or natural resource scientists have formal training in sampling theory. We emphasize the classical design-based approach to sampling in which variable values associated with units are regarded as fixed and uncertainty of estimation arises via various randomization strategies that may be used to select samples. In addition to covering standard topics such as simple random, systematic, cluster, unequal probability (stressing the generality of Horvitz–Thompson estimation), multi-stage, and multi-phase sampling, we also consider adaptive sampling, spatially balanced sampling, and sampling through time, three areas of special importance for ecologists and natural resource scientists. The text is directed to undergraduate seniors, graduate students, and practicing professionals. Problems emphasize application of the theory and R programming in ecological and natural resource settings.
APA, Harvard, Vancouver, ISO, and other styles
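As a small illustration of the design-based estimation the book stresses, here is the Horvitz–Thompson estimator of a population total under known, possibly unequal inclusion probabilities (toy numbers only):

```python
import numpy as np

def horvitz_thompson_total(y_sample, inclusion_probs):
    """Estimate the population total: sum of sampled values weighted by 1/pi_i."""
    y = np.asarray(y_sample, float)
    pi = np.asarray(inclusion_probs, float)
    return float(np.sum(y / pi))

# three sampled units with unequal inclusion probabilities
print(horvitz_thompson_total([12.0, 7.5, 20.0], [0.10, 0.25, 0.50]))   # 190.0
```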
3

Chow, Jade, John Patterson, Kathy Boursicot, and David Sales, eds. Oxford Assess and Progress: Medical Sciences. Oxford University Press, 2012. http://dx.doi.org/10.1093/oso/9780199605071.001.0001.

Full text
Abstract:
Oxford Assess and Progress is a new and unique revision resource for medical students. Written and edited by subject and assessment experts, the series provides a wealth of popular assessment questions and extra features to be truly fit for purpose and assessment success! Medical students will benefit from a comprehensive selection of Single Best Answer questions and Extended Matching Questions designed to test understanding and application of core medical science topics. Well illustrated, many assessment items are image based to prepare students for such exam questions. Chapter introductions provide a helpful quick overview of each topic. Ideal companions to the best-selling Oxford Handbooks, these excellent self-assessment guides can also be used entirely independently. Oxford Assess and Progress: Medical Sciences doesn't simply reveal the correct or wrong answer. Readers are directed to further revision material via detailed feedback on why the correct answer is best, and references to the Oxford Handbook of Medical Sciences and resources such as medical science textbooks. Each question is rated out of four possible levels of difficulty, from medical student to junior doctor. Carefully compiled and reviewed to ensure quality, students can rely on the Oxford Assess and Progress series to prepare for their exams.
APA, Harvard, Vancouver, ISO, and other styles
4

Minelli, Alessandro. Evolvability and Its Evolvability. Oxford University Press, 2017. http://dx.doi.org/10.1093/oso/9780199377176.003.0007.

Full text
Abstract:
No universally accepted notion of evolvability is available, focus being alternatively put onto either genetic or phenotypic change. The heuristic power of this concept is best found when considering the intricacies of the genotype→phenotype map, which is not necessarily predictable, expression of variation depending on the structure of gene networks and especially on the modularity and robustness of developmental systems. We can hardly ignore evolvability whenever studying the role of cryptic variation in evolution, the often pervious boundary between phenotypic plasticity and the expression of a genetic polymorphism, the major phenotypic leaps that the mechanisms of development can produce based on point mutations, or the morphological stasis that reveals how robust a developmental process can be in front of genetic change. Evolvability is subject itself to evolution, but it is still uncertain to what extent there is positive selection for enhanced evolvability, or for evolvability biased in a specific direction.
APA, Harvard, Vancouver, ISO, and other styles
5

Rutherford, Donald, ed. Oxford Studies in Early Modern Philosophy, Volume X. Oxford University Press, 2021. http://dx.doi.org/10.1093/oso/9780192897442.001.0001.

Full text
Abstract:
Oxford Studies in Early Modern Philosophy is an annual series, presenting a selection of the best current work in the history of early modern philosophy. It focuses on the seventeenth and eighteenth centuries—the extraordinary period of intellectual flourishing that begins, roughly, with Descartes and his contemporaries and ends with Kant. It also publishes work on thinkers or movements outside of that framework, provided they are important in illuminating early modern thought. The core of the subject matter is philosophy and its history. But the volume’s chapters reflect the fact that philosophy in the early modern period was much broader in its scope than it is currently taken to be and included a great deal of what now belongs to the natural sciences. Furthermore, philosophy in the period was closely connected with other disciplines, such as theology, law and medicine, and with larger questions of social, political, and religious history. Volume 10 includes chapters dedicated to a wide set of topics in the philosophies of Thomas White, Spinoza, Locke, Leibniz, and Hume.
APA, Harvard, Vancouver, ISO, and other styles
6

Manson, S. S., and G. R. Halford. Fatigue and Durability of Structural Materials. ASM International, 2006. http://dx.doi.org/10.31399/asm.tb.fdsm.9781627083447.

Full text
Abstract:
Fatigue and Durability of Structural Materials serves as a reference, textbook, and guide for engineers who design or maintain equipment subject to fatigue damage and failure. Using images, diagrams, and equations, it explains how cyclic loading affects the composition, structure, and properties of metals and the lifetime and performance of machine components. It describes the fundamentals of fatigue analysis, the role of dislocations, the concept of mean stress, the complexity of multiaxial loading, and the impact of cumulative fatigue damage. It discusses the influence of notches and cracks on shaft failures, the effects of fatigue on nonmetals, the characteristics of fatigue mechanisms, and the use of fatigue life equations and approximating techniques. It also defines important terms and concepts, includes relevant background information, and provides guidelines and best practices on part sizing, materials selection and processing routes, fabrication methods, surface preparation, the introduction of favorable residual stresses, material restoration and healing, and permissible crack growth. For information on the print version, ISBN 978-0-87170-825-0, follow this link.
APA, Harvard, Vancouver, ISO, and other styles
7

Rutherford, Donald, ed. Oxford Studies in Early Modern Philosophy, Volume IX. Oxford University Press, 2019. http://dx.doi.org/10.1093/oso/9780198852452.001.0001.

Full text
Abstract:
Oxford Studies in Early Modern Philosophy is an annual series, presenting a selection of the best current work in the history of early modern philosophy. It focuses on the seventeenth and eighteenth centuries—the extraordinary period of intellectual flourishing that begins, roughly, with Descartes and his contemporaries and ends with Kant. It also publishes work on thinkers or movements outside of that framework, provided they are important in illuminating early modern thought. The core of the subject matter is philosophy and its history. But the volume’s chapters reflect the fact that philosophy in the early modern period was much broader in its scope than it is currently taken to be and included a great deal of what now belongs to the natural sciences. Furthermore, philosophy in the period was closely connected with other disciplines, such as theology, law, and medicine, and with larger questions of social, political, and religious history. Volume 9 includes chapters dedicated to a wide set of topics in the philosophies of Descartes, Malebranche, Locke, Leibniz, Hume, and Kant.
APA, Harvard, Vancouver, ISO, and other styles
8

Letters and pape[rs] on agriculture: Extracted from the correspondence of a society instituted at Halifax, for promoting agriculture in the province of Nova-Scotia ; to which is added a selection of papers on various branches of husbandry, from some of the best publications on the subject in Europe and America. Halifax: Printed by J. Howe, 1990.

Find full text
APA, Harvard, Vancouver, ISO, and other styles

Book chapters on the topic "Best Subset Selection"

1

Tuv, Eugene, Alexander Borisov, and Kari Torkkola. "Best Subset Feature Selection for Massive Mixed-Type Problems." In Intelligent Data Engineering and Automated Learning – IDEAL 2006, 1048–56. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006. http://dx.doi.org/10.1007/11875581_125.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Gonzalez, P. L., R. Cléroux, and B. Rioux. "Selecting the Best Subset of Variables in Principal Component Analysis." In Compstat, 115–20. Heidelberg: Physica-Verlag HD, 1990. http://dx.doi.org/10.1007/978-3-642-50096-1_18.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Chakraborty, Basabi. "Particle Swarm Optimization Algorithm and its Hybrid Variants for Feature Subset Selection." In Handbook of Research on Computational Intelligence for Engineering, Science, and Business, 449–66. IGI Global, 2013. http://dx.doi.org/10.4018/978-1-4666-2518-1.ch017.

Full text
Abstract:
Selecting an optimum subset of features from a large set of features is an important preprocessing step for pattern classification, data mining, or machine learning applications. Feature subset selection basically comprises defining a criterion function for evaluation of the feature subset and developing a search strategy to find the best feature subset from a large number of feature subsets. Many mathematical and statistical techniques have been proposed so far. Recently, biologically inspired computing has been gaining popularity for solving real-world problems because of its greater flexibility compared to traditional statistical or mathematical techniques. In this chapter, the role of Particle Swarm Optimization (PSO), one of the recently developed bio-inspired evolutionary computational (EC) approaches, in designing algorithms for producing an optimal feature subset from a large feature set is examined. A state-of-the-art review of Particle Swarm Optimization algorithms and their hybrids with other soft computing techniques for feature subset selection is presented, followed by the author's proposals of PSO-based algorithms. Simple simulation experiments with benchmark data sets and their results are shown to evaluate their respective effectiveness and comparative performance in selecting the best feature subset from a set of features.
APA, Harvard, Vancouver, ISO, and other styles
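To make the wrapper idea concrete, below is a compact, generic sketch of binary PSO for feature subset selection with a cross-validated k-NN fitness. The sigmoid transfer function, swarm parameters, dataset, and classifier are illustrative assumptions, not the chapter's settings.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
n_particles, n_features, n_iter = 10, X.shape[1], 15
w, c1, c2 = 0.7, 1.5, 1.5                      # inertia, cognitive and social weights

def fitness(bits):
    mask = np.asarray(bits, dtype=bool)
    if not mask.any():
        return 0.0
    return cross_val_score(KNeighborsClassifier(5), X[:, mask], y, cv=3).mean()

pos = (rng.random((n_particles, n_features)) > 0.5).astype(float)   # 0/1 positions
vel = rng.uniform(-1.0, 1.0, (n_particles, n_features))
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[np.argmax(pbest_fit)].copy()

for _ in range(n_iter):
    r1, r2 = rng.random(vel.shape), rng.random(vel.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = (rng.random(vel.shape) < 1.0 / (1.0 + np.exp(-vel))).astype(float)  # sigmoid transfer
    fit = np.array([fitness(p) for p in pos])
    better = fit > pbest_fit
    pbest[better], pbest_fit[better] = pos[better], fit[better]
    gbest = pbest[np.argmax(pbest_fit)].copy()

print("selected features:", np.flatnonzero(gbest))
print("best CV accuracy:", round(float(pbest_fit.max()), 3))
```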
4

Nechval, Nicholas A., Konstantin N. Nechval, Maris Purgailis, and Uldis Rozevskis. "Selection of the Best Subset of Variables in Regression and Time Series Models." In Cybernetics and Systems Theory in Management, 303–20. IGI Global, 2010. http://dx.doi.org/10.4018/978-1-61520-668-1.ch016.

Full text
Abstract:
The problem of variable selection is one of the most pervasive model selection problems in statistical applications. Often referred to as the problem of subset selection, it arises when one wants to model the relationship between a variable of interest and a subset of potential explanatory variables or predictors, but there is uncertainty about which subset to use. Several papers have dealt with various aspects of the problem, but it appears that the typical regression user has not benefited appreciably. One reason for the lack of resolution of the problem is the fact that it has not been well defined. Indeed, it is apparent that there is not a single problem, but rather several problems for which different answers might be appropriate. The intent of this chapter is not to give specific answers but merely to present a new, simple multiplicative variable selection criterion based on the parametrically penalized residual sum of squares to address the subset selection problem in multiple linear regression analysis, where the objective is to select a minimal subset of predictor variables without sacrificing any explanatory power. The variables which optimize this criterion are chosen to be the best variables. The authors find that the proposed criterion performs consistently well across a wide variety of variable selection problems. Practical utility of this criterion is demonstrated by numerical examples.
APA, Harvard, Vancouver, ISO, and other styles
5

Tiwari, Arvind Kumar. "Feature Selection Algorithms for Classification and Clustering." In Cognitive Analytics, 422–42. IGI Global, 2020. http://dx.doi.org/10.4018/978-1-7998-2460-2.ch022.

Full text
Abstract:
Feature selection is an important topic in data mining, especially for high-dimensional datasets. Feature selection is a process commonly used in machine learning, wherein subsets of the features available from the data are selected for application of a learning algorithm. The best subset contains the least number of dimensions that most contribute to accuracy. Feature selection methods can be decomposed into three main classes: filter methods, wrapper methods, and embedded methods. This chapter presents an empirical comparison of feature selection methods and their algorithms. In view of the substantial number of existing feature selection algorithms, the need arises for criteria that make it possible to decide adequately which algorithm to use in a given situation. This chapter reviews several fundamental algorithms found in the literature and assesses their performance in a controlled scenario.
APA, Harvard, Vancouver, ISO, and other styles
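The filter/wrapper distinction described above can be illustrated in a few lines with scikit-learn. The dataset, the mutual-information filter score, and forward sequential selection as the wrapper are arbitrary choices for illustration; embedded methods (e.g. L1-penalized models), the third class, are not shown.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif, SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Filter: score each feature independently of any learning algorithm
filt = SelectKBest(mutual_info_classif, k=5).fit(X, y)
print("filter picks:", filt.get_support(indices=True))

# Wrapper: greedily add the features that most improve a cross-validated model
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
wrap = SequentialFeatureSelector(model, n_features_to_select=5,
                                 direction="forward", cv=3).fit(X, y)
print("wrapper picks:", wrap.get_support(indices=True))
```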
6

Tiwari, Arvind Kumar. "Feature Selection Algorithms for Classification and Clustering." In Ubiquitous Machine Learning and Its Applications, 143–67. IGI Global, 2017. http://dx.doi.org/10.4018/978-1-5225-2545-5.ch007.

Full text
Abstract:
Feature selection is an important topic in data mining, especially for high-dimensional datasets. Feature selection is a process commonly used in machine learning, wherein subsets of the features available from the data are selected for application of a learning algorithm. The best subset contains the least number of dimensions that most contribute to accuracy. Feature selection methods can be decomposed into three main classes: filter methods, wrapper methods, and embedded methods. This chapter presents an empirical comparison of feature selection methods and their algorithms. In view of the substantial number of existing feature selection algorithms, the need arises for criteria that make it possible to decide adequately which algorithm to use in a given situation. This chapter reviews several fundamental algorithms found in the literature and assesses their performance in a controlled scenario.
APA, Harvard, Vancouver, ISO, and other styles
7

Dhamodharavadhani S. and Rathipriya R. "Variable Selection Method for Regression Models Using Computational Intelligence Techniques." In Handbook of Research on Machine and Deep Learning Applications for Cyber Security, 416–36. IGI Global, 2020. http://dx.doi.org/10.4018/978-1-5225-9611-0.ch019.

Full text
Abstract:
The regression model (RM) is an important tool for modeling and analyzing data. It is one of the popular predictive modeling techniques that explore the relationship between a dependent (target) variable and independent (predictor) variables. Variable selection methods are used to form a good and effective regression model. Many variable selection methods exist for regression models, such as filter methods, wrapper methods, embedded methods, forward selection, backward elimination, stepwise methods, and so on. In this chapter, a computational intelligence-based variable selection method is discussed with respect to regression models in cybersecurity. Generally, these regression models depend on the set of (predictor) variables. Therefore, variable selection methods are used to select the best subset of predictors from the entire set of variables. A genetic algorithm-based quick-reduct method is proposed to extract an optimal predictor subset from the given data to form an optimal regression model.
APA, Harvard, Vancouver, ISO, and other styles
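Purely as an illustration of the genetic-algorithm machinery mentioned above, here is a generic GA wrapper for variable selection (bit-string chromosomes, tournament selection, uniform crossover, bit-flip mutation). It is not the chapter's GA-based quick-reduct method, no rough-set reduct is computed, and the fitness, dataset, and operator settings are invented.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
n_feat, pop_size, n_gen, p_mut = X.shape[1], 20, 15, 0.05

def fitness(chrom):
    mask = chrom.astype(bool)
    return cross_val_score(GaussianNB(), X[:, mask], y, cv=3).mean() if mask.any() else 0.0

pop = rng.integers(0, 2, (pop_size, n_feat))
for _ in range(n_gen):
    fits = np.array([fitness(c) for c in pop])
    new_pop = [pop[np.argmax(fits)].copy()]                       # elitism
    while len(new_pop) < pop_size:
        # tournament selection of two parents
        a, b = (pop[max(rng.choice(pop_size, 2), key=lambda i: fits[i])] for _ in range(2))
        child = np.where(rng.random(n_feat) < 0.5, a, b)          # uniform crossover
        flip = rng.random(n_feat) < p_mut                         # bit-flip mutation
        new_pop.append(np.where(flip, 1 - child, child))
    pop = np.array(new_pop)

fits = np.array([fitness(c) for c in pop])
best = pop[np.argmax(fits)]
print("selected variables:", np.flatnonzero(best), "CV accuracy:", round(float(fits.max()), 3))
```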
8

Dhamodharavadhani S. and Rathipriya R. "Variable Selection Method for Regression Models Using Computational Intelligence Techniques." In Research Anthology on Multi-Industry Uses of Genetic Programming and Algorithms, 742–61. IGI Global, 2021. http://dx.doi.org/10.4018/978-1-7998-8048-6.ch037.

Full text
Abstract:
The regression model (RM) is an important tool for modeling and analyzing data. It is one of the popular predictive modeling techniques that explore the relationship between a dependent (target) variable and independent (predictor) variables. Variable selection methods are used to form a good and effective regression model. Many variable selection methods exist for regression models, such as filter methods, wrapper methods, embedded methods, forward selection, backward elimination, stepwise methods, and so on. In this chapter, a computational intelligence-based variable selection method is discussed with respect to regression models in cybersecurity. Generally, these regression models depend on the set of (predictor) variables. Therefore, variable selection methods are used to select the best subset of predictors from the entire set of variables. A genetic algorithm-based quick-reduct method is proposed to extract an optimal predictor subset from the given data to form an optimal regression model.
APA, Harvard, Vancouver, ISO, and other styles
9

Bidi, Noria, and Zakaria Elberrichi. "Best Features Selection for Biomedical Data Classification Using Seven Spot Ladybird Optimization Algorithm." In Cognitive Analytics, 407–21. IGI Global, 2020. http://dx.doi.org/10.4018/978-1-7998-2460-2.ch021.

Full text
Abstract:
This article presents a new adaptive algorithm called FS-SLOA (Feature Selection-Seven Spot Ladybird Optimization Algorithm), a meta-heuristic feature selection method based on the foraging behavior of the seven-spot ladybird. The new technique has been applied to find the best subset of features, i.e., the one achieving the highest classification accuracy with three classifiers: Naive Bayes (NB), Nearest Neighbors (KNN), and the Support Vector Machine (SVM). The authors' proposed approach has been evaluated on four well-known benchmark datasets (Wisconsin Breast Cancer, Pima Diabetes, Mammographic Mass, and Dermatology) taken from the UCI machine learning repository. Experimental results show that the classification accuracy of FS-SLOA is the best performing for the different datasets.
APA, Harvard, Vancouver, ISO, and other styles
10

Alaoui, Abdiya, and Zakaria Elberrichi. "Enhanced Ant Colony Algorithm for Best Features Selection for a Decision Tree Classification of Medical Data." In Advances in Library and Information Science, 278–93. IGI Global, 2020. http://dx.doi.org/10.4018/978-1-7998-1021-6.ch015.

Full text
Abstract:
Classification algorithms are widely applied in the medical domain to classify data for diagnosis. The datasets contain a considerable number of irrelevant attributes, and diagnosis is costly because many tests are required to predict a disease. Feature selection is one of the significant tasks of the preprocessing phase: it extracts a subset of attributes from a large set and excludes redundant, irrelevant, or noisy attributes. The cost of diagnosis can be decreased by avoiding numerous tests through the selection of the features that are important for predicting the disease. Applied to the task of supervised classification, the authors construct a robust learning model for disease prediction. The search for a subset of features is an NP-hard problem, which can be addressed by metaheuristics. In this chapter, a wrapper approach that hybridizes an ant colony algorithm with AdaBoost over decision trees is proposed to improve classification, using an enhanced global pheromone updating rule. The experimental results show that this approach gives good results.
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Best Subset Selection"

1

Wang, Yu, Louis Luangkesorn, and Larry J. Shuman. "Best-subset selection procedure." In 2011 Winter Simulation Conference - (WSC 2011). IEEE, 2011. http://dx.doi.org/10.1109/wsc.2011.6148118.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Gieseke, Fabian, Kai Lars Polsterer, Ashish Mahabal, Christian Igel, and Tom Heskes. "Massively-parallel best subset selection for ordinary least-squares regression." In 2017 IEEE Symposium Series on Computational Intelligence (SSCI). IEEE, 2017. http://dx.doi.org/10.1109/ssci.2017.8285225.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Zhang, Ping, Ju Jiang, Xueshan Han, and Zhuoxun Lin. "M-best subset selection from n alternatives based on genetic algorithm." In 2011 24th IEEE Canadian Conference on Electrical and Computer Engineering (CCECE). IEEE, 2011. http://dx.doi.org/10.1109/ccece.2011.6030526.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Joshi, Alok A., Peter H. Meckl, Galen B. King, and Kristofer Jennings. "Information-Theoretic Sensor Subset Selection: Application to Signal-Based Fault Isolation in Diesel Engines." In ASME 2006 International Mechanical Engineering Congress and Exposition. ASMEDC, 2006. http://dx.doi.org/10.1115/imece2006-15903.

Full text
Abstract:
In this paper a stepwise information-theoretic feature selector is designed and implemented to reduce the dimension of a data set without losing pertinent information. The effectiveness of the proposed feature selector is demonstrated by selecting features from forty three variables monitored on a set of heavy duty diesel engines and then using this feature space for classification of faults in these engines. Using a cross-validation technique, the effects of various classification methods (linear regression, quadratic discriminants, probabilistic neural networks, and support vector machines) and feature selection methods (regression subset selection, RV-based selection by simulated annealing, and information-theoretic selection) are compared based on the percentage misclassification. The information-theoretic feature selector combined with the probabilistic neural network achieved an average classification accuracy of 90%, which was the best performance of any combination of classifiers and feature selectors under consideration.
APA, Harvard, Vancouver, ISO, and other styles
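As a rough, generic stand-in for the stepwise information-theoretic selector described above (not the paper's method), the sketch below greedily adds the feature with the highest mutual-information relevance to the class minus its average redundancy with the features already chosen (an mRMR-style rule). The dataset and the number of selected features are arbitrary.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

X, y = load_breast_cancer(return_X_y=True)
n_select = 5
relevance = mutual_info_classif(X, y, random_state=0)        # MI between each feature and the class

selected, remaining = [], list(range(X.shape[1]))
while len(selected) < n_select:
    def score(j):
        if not selected:
            return relevance[j]
        redundancy = np.mean([mutual_info_regression(X[:, [j]], X[:, k], random_state=0)[0]
                              for k in selected])
        return relevance[j] - redundancy                      # relevance minus redundancy
    best = max(remaining, key=score)
    selected.append(best)
    remaining.remove(best)

print("stepwise selection order:", selected)
```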
5

Qian, Chao, Yang Yu, and Ke Tang. "Approximation Guarantees of Stochastic Greedy Algorithms for Subset Selection." In Twenty-Seventh International Joint Conference on Artificial Intelligence {IJCAI-18}. California: International Joint Conferences on Artificial Intelligence Organization, 2018. http://dx.doi.org/10.24963/ijcai.2018/205.

Full text
Abstract:
Subset selection is a fundamental problem in many areas, which aims to select the best subset of size at most $k$ from a universe. Greedy algorithms are widely used for subset selection, and have shown good approximation performances in deterministic situations. However, their behaviors are stochastic in many realistic situations (e.g., large-scale and noisy). For general stochastic greedy algorithms, bounded approximation guarantees were obtained only for subset selection with monotone submodular objective functions, while real-world applications often involve non-monotone or non-submodular objective functions and can be subject to a more general constraint than a size constraint. This work proves their approximation guarantees in these cases, and thus largely extends the applicability of stochastic greedy algorithms.
APA, Harvard, Vancouver, ISO, and other styles
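For context, the deterministic baseline behind the stochastic setting analyzed above is the standard greedy algorithm for cardinality-constrained maximization of a monotone submodular function; a toy maximum-coverage instance illustrates it.

```python
def greedy_max_coverage(sets, k):
    """Pick at most k sets maximizing the size of their union (monotone submodular)."""
    chosen, covered = [], set()
    for _ in range(k):
        gains = {i: len(s - covered) for i, s in enumerate(sets) if i not in chosen}
        best = max(gains, key=gains.get)
        if gains[best] == 0:            # no marginal gain left
            break
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

universe_sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
print(greedy_max_coverage(universe_sets, k=2))   # picks sets 2 and 0
```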
6

Bigler, Tamara, and Oliver Strub. "A Local-branching Heuristic for the Best Subset Selection Problem in Linear Regression." In 2018 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM). IEEE, 2018. http://dx.doi.org/10.1109/ieem.2018.8607366.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Bian, Chao, Chao Qian, Frank Neumann, and Yang Yu. "Fast Pareto Optimization for Subset Selection with Dynamic Cost Constraints." In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/302.

Full text
Abstract:
Subset selection with cost constraints is a fundamental problem with various applications such as influence maximization and sensor placement. The goal is to select a subset from a ground set to maximize a monotone objective function such that a monotone cost function is upper bounded by a budget. Previous algorithms with bounded approximation guarantees include the generalized greedy algorithm, POMC and EAMC, all of which can achieve the best known approximation guarantee. In real-world scenarios, the resources often vary, i.e., the budget often changes over time, requiring the algorithms to adapt the solutions quickly. However, when the budget changes dynamically, all these three algorithms either achieve arbitrarily bad approximation guarantees, or require a long running time. In this paper, we propose a new algorithm FPOMC by combining the merits of the generalized greedy algorithm and POMC. That is, FPOMC introduces a greedy selection strategy into POMC. We prove that FPOMC can maintain the best known approximation guarantee efficiently.
APA, Harvard, Vancouver, ISO, and other styles
8

Cross, Valerie, and Michael Zmuda. "Ensemble Creation using Fuzzy Similarity Measures and Feature Subset Evaluators." In 2nd International Conference on Machine Learning Techniques and NLP (MLNLP 2021). Academy and Industry Research Collaboration Center (AIRCC), 2021. http://dx.doi.org/10.5121/csit.2021.111407.

Full text
Abstract:
Current machine learning research is addressing the problem that occurs when the data set includes numerous features but the number of training examples is small. Microarray data, for example, typically has a very large number of features, the genes, as compared to the number of training examples, the patients. An important research problem is to develop techniques to effectively reduce the number of features by selecting the best set of features for use in a machine learning process, referred to as the feature selection problem. Another means of addressing high-dimensional data is the use of an ensemble of base classifiers. Ensembles have been shown to improve the predictive performance of a single model by training multiple models and combining their predictions. This paper examines combining an enhancement of the random subspace model of feature selection, using fuzzy set similarity measures, with different measures of evaluating feature subsets in the construction of an ensemble classifier. Experimental results show that in most cases a fuzzy set similarity measure paired with a feature subset evaluator outperforms the corresponding fuzzy similarity measure by itself, and the learning process only needs to occur on typically about half the number of base classifiers, since the feature subset evaluator eliminates the feature subsets of low quality from use in the ensemble. In general, the fuzzy consistency index is the better-performing feature subset evaluator, and inclusion maximum is the better-performing fuzzy similarity measure.
APA, Harvard, Vancouver, ISO, and other styles
9

Gu, Lei. "A Comparison of Polynomial Based Regression Models in Vehicle Safety Analysis." In ASME 2001 International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. American Society of Mechanical Engineers, 2001. http://dx.doi.org/10.1115/detc2001/dac-21063.

Full text
Abstract:
Vehicle crash is a highly nonlinear event in terms of the structural and dummy responses. However, linear and quadratic polynomial regressions are still widely used in design optimization and reliability-based optimization for vehicle safety analysis. This paper investigates polynomial-based subset selection regression models for vehicle safety analysis. Three subset selection techniques are discussed: all possible subsets with a linear polynomial, stepwise selection with a quadratic polynomial, and sequential replacement with quadratic and cubic polynomials. The methods have been applied to data from finite element simulations of vehicle full frontal crash, side impact, and frontal offset impact. It is shown that subset selection with the sequential replacement algorithm gives the most accurate responses. It is also shown that, for limited finite element simulation data, the quadratic polynomial is good enough for most structural and dummy responses when gauges and materials are used as design variables. For vehicle weight, a linear polynomial fits well.
APA, Harvard, Vancouver, ISO, and other styles
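A hedged sketch of the general recipe studied above: expand the design variables into quadratic polynomial terms and choose a small subset of terms by exhaustive search under a simple criterion (adjusted R²). The simulated response and the criterion are illustrative; the paper uses crash-simulation responses and, among others, a sequential replacement search.

```python
import numpy as np
from itertools import combinations
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, (60, 3))                               # three "design variables"
y = 1.0 + 2.0 * X[:, 0] - 1.5 * X[:, 1] ** 2 + 0.05 * rng.standard_normal(60)

poly = PolynomialFeatures(degree=2, include_bias=False)
Z = poly.fit_transform(X)                                     # linear + quadratic terms
names = poly.get_feature_names_out(["x1", "x2", "x3"])

def adj_r2(cols):
    A = np.column_stack([np.ones(len(y)), Z[:, list(cols)]])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    n, p = len(y), A.shape[1]
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return 1 - (1 - r2) * (n - 1) / (n - p)

subsets = (c for r in (1, 2, 3) for c in combinations(range(Z.shape[1]), r))
best = max(subsets, key=adj_r2)
print("selected terms:", [names[i] for i in best])            # expect x1 and x2^2
```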
10

Dereziński, Michał, Rajiv Khanna, and Michael W. Mahoney. "Improved Guarantees and a Multiple-descent Curve for Column Subset Selection and the Nystrom Method (Extended Abstract)." In Thirtieth International Joint Conference on Artificial Intelligence {IJCAI-21}. California: International Joint Conferences on Artificial Intelligence Organization, 2021. http://dx.doi.org/10.24963/ijcai.2021/647.

Full text
Abstract:
The Column Subset Selection Problem (CSSP) and the Nystrom method are among the leading tools for constructing interpretable low-rank approximations of large datasets by selecting a small but representative set of features or instances. A fundamental question in this area is: what is the cost of this interpretability, i.e., how well can a data subset of size k compete with the best rank k approximation? We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees which go beyond the standard worst-case analysis. Our approach leads to significantly better bounds for datasets with known rates of singular value decay, e.g., polynomial or exponential decay. Our analysis also reveals an intriguing phenomenon: the cost of interpretability as a function of k may exhibit multiple peaks and valleys, which we call a multiple-descent curve. A lower bound we establish shows that this behavior is not an artifact of our analysis, but rather it is an inherent property of the CSSP and Nystrom tasks. Finally, using the example of a radial basis function (RBF) kernel, we show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
APA, Harvard, Vancouver, ISO, and other styles
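A brief sketch of the comparison at the heart of the abstract above: approximate a matrix by projecting onto k of its own columns (chosen here by QR with column pivoting, one common CSSP heuristic) and compare the error with the best rank-k approximation from the SVD. The data matrix and k are arbitrary.

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(4)
# synthetic matrix with decaying singular values
A = rng.standard_normal((100, 30)) @ np.diag(0.5 ** np.arange(30)) @ rng.standard_normal((30, 30))
k = 5

# CSSP-style: pick k columns by pivoted QR, then project A onto their span
_, _, piv = qr(A, pivoting=True)
C = A[:, piv[:k]]
A_cssp = C @ np.linalg.pinv(C) @ A

# Best rank-k approximation (Eckart-Young) from the SVD
U, s, Vt = np.linalg.svd(A, full_matrices=False)
A_svd = U[:, :k] * s[:k] @ Vt[:k]

print("column-subset error:", np.linalg.norm(A - A_cssp, "fro"))
print("best rank-k error:  ", np.linalg.norm(A - A_svd, "fro"))
```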

Reports on the topic "Best Subset Selection"

1

Seginer, Ido, James Jones, Per-Olof Gutman, and Eduardo Vallejos. Optimal Environmental Control for Indeterminate Greenhouse Crops. United States Department of Agriculture, August 1997. http://dx.doi.org/10.32747/1997.7613034.bard.

Full text
Abstract:
Increased world competition, as well as increased concern for the environment, drive all manufacturing systems, including greenhouses, towards high-precision operation. Optimal control is an important tool to achieve this goal, since it finds the best compromise between conflicting demands, such as higher profits and environmental concerns. The report, which is a collection of papers, each with its own abstract, outlines an approach for optimal, model-based control of the greenhouse environment. A reliable crop model is essential for this approach, and a significant portion of the effort went in this direction, resulting in a radically new version of the tomato model TOMGRO, which can be used as a prototype model for other greenhouse crops. Truly optimal control of a very complex system requires prohibitively large computer resources. Two routes to model simplification have, therefore, been tried: model reduction (to fewer state variables) and simplified decision making. Crop model reduction from nearly 70 state variables to about 5 was accomplished by either selecting a subset of the original variables or by forming combinations of them. Model dynamics were then fitted either with mechanistic relationships or with neural networks. To simplify the decision-making process, the number of costate variables (control policy parameters) was reduced to one or two. The dry-matter state variable was transformed in such a way that its costate became essentially constant throughout the season. A quasi-steady-state control algorithm was implemented in an experimental greenhouse. A constant value for the dry-matter costate was able to control ventilation and CO2 enrichment simultaneously by continuously producing weather-dependent optimal setpoints and then maintaining them closely.
APA, Harvard, Vancouver, ISO, and other styles