Journal articles on the topic 'Tree Ensemble'

To see the other types of publications on this topic, follow the link: Tree Ensemble.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 journal articles for your research on the topic 'Tree Ensemble.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Alazba, Amal, and Hamoud Aljamaan. "Software Defect Prediction Using Stacking Generalization of Optimized Tree-Based Ensembles." Applied Sciences 12, no. 9 (April 30, 2022): 4577. http://dx.doi.org/10.3390/app12094577.

Full text
Abstract:
Software defect prediction refers to the automatic identification of defective parts of software through machine learning techniques. Ensemble learning has exhibited excellent prediction outcomes in comparison with individual classifiers. However, most of the previous work utilized ensemble models in the context of software defect prediction with the default hyperparameter values, which are considered suboptimal. In this paper, we investigate the applicability of a stacking ensemble built with fine-tuned tree-based ensembles for defect prediction. We used grid search to optimize the hyperparameters of seven tree-based ensembles: random forest, extra trees, AdaBoost, gradient boosting, histogram-based gradient boosting, XGBoost and CatBoost. Then, a stacking ensemble was built utilizing the fine-tuned tree-based ensembles. The ensembles were evaluated using 21 publicly available defect datasets. Empirical results showed large impacts of hyperparameter optimization on extra trees and random forest ensembles. Moreover, our results demonstrated the superiority of the stacking ensemble over all fine-tuned tree-based ensembles.
APA, Harvard, Vancouver, ISO, and other styles
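The pipeline in this abstract (grid-searched tree ensembles combined in a stacking ensemble) can be sketched roughly as follows. This is an illustrative sketch with scikit-learn, not the authors' code: the synthetic data, the two base ensembles, and the tiny hyperparameter grids are all assumptions for demonstration.

```python
# Hypothetical sketch: tune two tree-based ensembles with grid search,
# then stack them under a meta-learner (here logistic regression).
from sklearn.datasets import make_classification
from sklearn.ensemble import (RandomForestClassifier, ExtraTreesClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Stand-in for a defect dataset (features X, defect labels y).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Grid-search each base ensemble's hyperparameters first ...
rf = GridSearchCV(RandomForestClassifier(random_state=0),
                  {"n_estimators": [50, 100], "max_depth": [None, 10]}).fit(X_tr, y_tr)
et = GridSearchCV(ExtraTreesClassifier(random_state=0),
                  {"n_estimators": [50, 100]}).fit(X_tr, y_tr)

# ... then build the stacking ensemble from the tuned estimators.
stack = StackingClassifier(
    estimators=[("rf", rf.best_estimator_), ("et", et.best_estimator_)],
    final_estimator=LogisticRegression())
stack.fit(X_tr, y_tr)
print(round(stack.score(X_te, y_te), 2))
```

The paper tunes seven tree-based ensembles over 21 defect datasets; the sketch keeps only two base learners to show the mechanics.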
2

WINDEATT, T., and G. ARDESHIR. "DECISION TREE SIMPLIFICATION FOR CLASSIFIER ENSEMBLES." International Journal of Pattern Recognition and Artificial Intelligence 18, no. 05 (August 2004): 749–76. http://dx.doi.org/10.1142/s021800140400340x.

Full text
Abstract:
The goal of designing an ensemble of simple classifiers is to improve the accuracy of a recognition system. However, the performance of ensemble methods is problem-dependent, and the classifier learning algorithm has an important influence on ensemble performance. In particular, base classifiers that are too complex may result in overfitting. In this paper, the performance of Bagging, Boosting and Error-Correcting Output Code (ECOC) is compared for five decision tree pruning methods. A description is given for each of the pruning methods and the ensemble techniques. AdaBoost.OC, which is a combination of Boosting and ECOC, is compared with the pseudo-loss-based version of Boosting, AdaBoost.M2, and the influence of pruning on the performance of the ensembles is studied. Motivated by the result that both pruned and unpruned ensembles made by AdaBoost.OC give similar accuracy, pruned ensembles are compared with ensembles of Decision Stumps. This leads to the hypothesis that ensembles of simple classifiers may give better performance for some problems. Using the application of face recognition, it is shown that an AdaBoost.OC ensemble of Decision Stumps outperforms an ensemble of pruned C4.5 trees for face identification, but is inferior for face verification. The implication is that in some real-world tasks, to achieve the best accuracy of an ensemble, it may be necessary to select base classifier complexity.
APA, Harvard, Vancouver, ISO, and other styles
3

Pahno, Steve, Jidong J. Yang, and S. Sonny Kim. "Use of Machine Learning Algorithms to Predict Subgrade Resilient Modulus." Infrastructures 6, no. 6 (May 21, 2021): 78. http://dx.doi.org/10.3390/infrastructures6060078.

Full text
Abstract:
Modern machine learning methods, such as tree ensembles, have recently become extremely popular due to their versatility and scalability in handling heterogeneous data and have been successfully applied across a wide range of domains. In this study, two widely applied tree ensemble methods, i.e., random forest (parallel ensemble) and gradient boosting (sequential ensemble), were investigated to predict resilient modulus, using routinely collected soil properties. Laboratory test data on sandy soils from nine borrow pits in Georgia were used for model training and testing. For comparison purposes, the two tree ensemble methods were evaluated against a regression tree model and a multiple linear regression model, demonstrating their superior performance. The results revealed that a single tree model generally suffers from high variance, while providing a similar performance to the traditional multiple linear regression model. By leveraging a collection of trees, both tree ensemble methods, Random Forest and eXtreme Gradient Boosting, significantly reduced variance and improved prediction accuracy, with the eXtreme Gradient Boosting being the best model, with an R2 of 0.95 on the test dataset.
APA, Harvard, Vancouver, ISO, and other styles
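The parallel-versus-sequential ensemble comparison described above can be sketched in a few lines. A hedged sketch only: the Georgia soil data is not reproduced here, so a synthetic regression dataset stands in, and the hyperparameters are illustrative defaults rather than the paper's tuned settings.

```python
# Illustrative comparison of a parallel ensemble (random forest, bagging)
# and a sequential ensemble (gradient boosting) on synthetic data.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=400, n_features=8, noise=5.0, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

for model in (RandomForestRegressor(n_estimators=200, random_state=1),      # parallel
              GradientBoostingRegressor(n_estimators=200, random_state=1)): # sequential
    r2 = r2_score(y_te, model.fit(X_tr, y_tr).predict(X_te))
    print(type(model).__name__, round(r2, 3))
```

The paper's R² of 0.95 for XGBoost refers to its own laboratory test data; the numbers printed here depend entirely on the synthetic stand-in.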
4

PETERSON, ADAM H., and TONY R. MARTINEZ. "REDUCING DECISION TREE ENSEMBLE SIZE USING PARALLEL DECISION DAGS." International Journal on Artificial Intelligence Tools 18, no. 04 (August 2009): 613–20. http://dx.doi.org/10.1142/s0218213009000305.

Full text
Abstract:
This research presents a new learning model, the Parallel Decision DAG (PDDAG), and shows how to use it to represent an ensemble of decision trees while using significantly less storage. Ensembles such as Bagging and Boosting have a high probability of encoding redundant data structures, and PDDAGs provide a way to remove this redundancy in decision tree based ensembles. When trained by encoding an ensemble, the new model behaves similarly to the original ensemble, and can be made to perform identically to it. The reduced storage requirements allow an ensemble approach to be used in cases where storage requirements would normally be exceeded, and the smaller model can potentially execute faster by reducing redundant computation.
APA, Harvard, Vancouver, ISO, and other styles
5

Jiang, Xiangkui, Chang-an Wu, and Huaping Guo. "Forest Pruning Based on Branch Importance." Computational Intelligence and Neuroscience 2017 (2017): 1–11. http://dx.doi.org/10.1155/2017/3162571.

Full text
Abstract:
A forest is an ensemble with decision trees as members. This paper proposes a novel strategy for pruning a forest to enhance ensemble generalization ability and reduce ensemble size. Unlike conventional ensemble pruning approaches, the proposed method evaluates the importance of branches of trees with respect to the whole ensemble using a novel metric called importance gain. The importance of a branch is defined by considering ensemble accuracy and the diversity of ensemble members, and thus the metric reasonably evaluates how much improvement in ensemble accuracy can be achieved when a branch is pruned. Our experiments show that the proposed method can significantly reduce ensemble size and improve ensemble accuracy, whether the ensembles are constructed by an algorithm such as bagging or obtained by an ensemble selection algorithm, and whether each decision tree is pruned or unpruned.
APA, Harvard, Vancouver, ISO, and other styles
6

Kułaga, Rafał, and Marek Gorgoń. "FPGA Implementation of Decision Trees and Tree Ensembles for Character Recognition in Vivado Hls." Image Processing & Communications 19, no. 2-3 (September 1, 2014): 71–82. http://dx.doi.org/10.1515/ipc-2015-0012.

Full text
Abstract:
Decision trees and decision tree ensembles are popular machine learning methods used for classification and regression. In this paper, an FPGA implementation of decision trees and tree ensembles for letter and digit recognition in Vivado High-Level Synthesis is presented. Two publicly available datasets were used at both training and testing stages. Different optimizations for tree code and tree node layout in memory are considered. Classification accuracy, throughput and resource usage for different training algorithms, tree depths and ensemble sizes are discussed. The correctness of the module’s operation was verified using C/RTL cosimulation and on a Zynq-7000 SoC device, using the Xillybus IP core for data transfer between the processing system and the programmable logic.
APA, Harvard, Vancouver, ISO, and other styles
7

Ranzato, Francesco, and Marco Zanella. "Abstract Interpretation of Decision Tree Ensemble Classifiers." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 5478–86. http://dx.doi.org/10.1609/aaai.v34i04.5998.

Full text
Abstract:
We study the problem of formally and automatically verifying robustness properties of decision tree ensemble classifiers such as random forests and gradient boosted decision tree models. A recent stream of works showed how abstract interpretation, which is ubiquitously used in static program analysis, can be successfully deployed to formally verify (deep) neural networks. In this work we push forward this line of research by designing a general and principled abstract interpretation-based framework for the formal verification of robustness and stability properties of decision tree ensemble models. Our abstract interpretation-based method may induce complete robustness checks of standard adversarial perturbations and output concrete adversarial attacks. We implemented our abstract verification technique in a tool called silva, which leverages an abstract domain of not necessarily closed real hyperrectangles and is instantiated to verify random forests and gradient boosted decision trees. Our experimental evaluation on the MNIST dataset shows that silva provides a precise and efficient tool which advances the current state of the art in tree ensemble verification.
APA, Harvard, Vancouver, ISO, and other styles
8

Louk, Maya Hilda Lestari, and Bayu Adhi Tama. "Tree-Based Classifier Ensembles for PE Malware Analysis: A Performance Revisit." Algorithms 15, no. 9 (September 17, 2022): 332. http://dx.doi.org/10.3390/a15090332.

Full text
Abstract:
Given the escalating number and variety of malware, combating it is becoming increasingly strenuous. Machine learning techniques are often used in the literature to automatically discover the models and patterns behind such challenges and create solutions that can keep up with the rapid pace at which malware evolves. This article compares various tree-based ensemble learning methods that have been proposed in the analysis of PE malware. A tree-based ensemble is an unconventional learning paradigm that constructs and combines a collection of base learners (e.g., decision trees), as opposed to the conventional learning paradigm, which aims to construct individual learners from training data. Several tree-based ensemble techniques, such as random forest, XGBoost, CatBoost, GBM, and LightGBM, are taken into consideration and are appraised using different performance measures, such as accuracy, MCC, precision, recall, AUC, and F1. In addition, the experiment includes many public datasets, such as BODMAS, Kaggle, and CIC-MalMem-2022, to demonstrate the generalizability of the classifiers in a variety of contexts. Based on the test findings, all tree-based ensembles performed well, and performance differences between algorithms are not statistically significant, particularly when their respective hyperparameters are appropriately configured. The proposed tree-based ensemble techniques also outperformed other, similar PE malware detectors that have been published in recent years.
APA, Harvard, Vancouver, ISO, and other styles
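The evaluation protocol in this abstract (scoring a tree ensemble with accuracy, MCC, precision, recall, AUC, and F1) is straightforward to reproduce in outline. A hedged sketch: the imbalanced synthetic dataset and the single random forest stand in for the paper's malware datasets and its several ensembles.

```python
# Sketch: compute the six performance measures named in the abstract
# for one tree-based ensemble on synthetic, mildly imbalanced data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, matthews_corrcoef,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, weights=[0.7], random_state=7)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=7)

clf = RandomForestClassifier(n_estimators=100, random_state=7).fit(X_tr, y_tr)
pred, proba = clf.predict(X_te), clf.predict_proba(X_te)[:, 1]

scores = {"acc": accuracy_score(y_te, pred),
          "mcc": matthews_corrcoef(y_te, pred),
          "prec": precision_score(y_te, pred),
          "rec": recall_score(y_te, pred),
          "auc": roc_auc_score(y_te, proba),  # AUC needs scores, not labels
          "f1": f1_score(y_te, pred)}
print({k: round(v, 3) for k, v in scores.items()})
```

Note that AUC is computed from predicted probabilities while the other measures use hard labels, which is the usual convention behind such comparison tables.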
9

Buschjäger, Sebastian, Sibylle Hess, and Katharina J. Morik. "Shrub Ensembles for Online Classification." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 6 (June 28, 2022): 6123–31. http://dx.doi.org/10.1609/aaai.v36i6.20560.

Full text
Abstract:
Online learning algorithms have become a ubiquitous tool in the machine learning toolbox and are frequently used in small, resource-constrained environments. Among the most successful online learning methods are Decision Tree (DT) ensembles. DT ensembles provide excellent performance while adapting to changes in the data, but they are not resource efficient. Incremental tree learners keep adding new nodes to the tree but never remove old ones, increasing memory consumption over time. Gradient-based tree learning, on the other hand, requires the computation of gradients over the entire tree, which is costly for even moderately sized trees. In this paper, we propose a novel memory-efficient online classification ensemble called shrub ensembles for resource-constrained systems. Our algorithm trains small to medium-sized decision trees on small windows and uses stochastic proximal gradient descent to learn the ensemble weights of these 'shrubs'. We provide a theoretical analysis of our algorithm and include an extensive discussion on the behavior of our approach in the online setting. In a series of 2,959 experiments on 12 different datasets, we compare our method against 8 state-of-the-art methods. Our Shrub Ensembles (SE) retain excellent performance even when only little memory is available. We show that SE offers a better accuracy-memory trade-off in 7 of 12 cases, while having statistically significantly better performance than most other methods. Our implementation is available at https://github.com/sbuschjaeger/se-online.
APA, Harvard, Vancouver, ISO, and other styles
10

Franzese, Giulio, and Monica Visintin. "Probabilistic Ensemble of Deep Information Networks." Entropy 22, no. 1 (January 14, 2020): 100. http://dx.doi.org/10.3390/e22010100.

Full text
Abstract:
We describe a classifier made of an ensemble of decision trees, designed using information theory concepts. In contrast to algorithms C4.5 or ID3, the tree is built from the leaves instead of the root. Each tree is made of nodes trained independently of the others, to minimize a local cost function (information bottleneck). The trained tree outputs the estimated probabilities of the classes given the input datum, and the outputs of many trees are combined to decide the class. We show that the system is able to provide results comparable to those of the tree classifier in terms of accuracy, while it shows many advantages in terms of modularity, reduced complexity, and memory requirements.
APA, Harvard, Vancouver, ISO, and other styles
11

Liu, F. T., K. M. Ting, Y. Yu, and Z. H. Zhou. "Spectrum of Variable-Random Trees." Journal of Artificial Intelligence Research 32 (May 29, 2008): 355–84. http://dx.doi.org/10.1613/jair.2470.

Full text
Abstract:
In this paper, we show that a continuous spectrum of randomisation exists, in which most existing tree randomisations operate only around the two ends of the spectrum. That leaves a huge part of the spectrum largely unexplored. We propose a base learner VR-Tree which generates trees with variable randomness. VR-Trees are able to span from the conventional deterministic trees to the complete-random trees using a probabilistic parameter. Using VR-Trees as the base models, we explore the entire spectrum of randomised ensembles, together with Bagging and Random Subspace. We discover that the two halves of the spectrum have distinct characteristics, and understanding them allows us to propose a new approach to building better decision tree ensembles. We name this approach Coalescence, which coalesces a number of points in the random half of the spectrum. Coalescence acts as a committee of "experts" to cater for unforeseeable conditions presented in training data. Coalescence is found to perform better than any single operating point in the spectrum, without the need to tune to a specific level of randomness. In our empirical study, Coalescence ranks top among the benchmarked ensemble methods, including Random Forests, Random Subspace and C5 Boosting; and only Coalescence is significantly better than Bagging and Max-Diverse Ensemble among all the methods in the comparison. Although Coalescence is not significantly better than Random Forests, we have identified conditions under which one will perform better than the other.
APA, Harvard, Vancouver, ISO, and other styles
12

Fern, Alan, and Paul Lewis. "Ensemble Monte-Carlo Planning: An Empirical Study." Proceedings of the International Conference on Automated Planning and Scheduling 21 (March 22, 2011): 58–65. http://dx.doi.org/10.1609/icaps.v21i1.13458.

Full text
Abstract:
Monte-Carlo planning algorithms, such as UCT, select actions at each decision epoch by intelligently expanding a single search tree given the available time and then selecting the best root action. Recent work has provided evidence that it can be advantageous to instead construct an ensemble of search trees and to make a decision according to a weighted vote. However, these prior investigations have only considered the application domains of Go and Solitaire and were limited in the scope of ensemble configurations considered. In this paper, we conduct a more exhaustive empirical study of ensemble Monte-Carlo planning using the UCT algorithm in a set of six additional domains. In particular, we evaluate the advantages of a broad set of ensemble configurations in terms of space and time efficiency in both parallel and single-core models. Our results demonstrate that ensembles are an effective way to improve performance per unit time given a parallel time model and performance per unit space in a single-core model. However, contrary to prior isolated observations, we did not find significant evidence that ensembles improve performance per unit time in a single-core model.
APA, Harvard, Vancouver, ISO, and other styles
13

Jung, M., M. Reichstein, and A. Bondeau. "Towards global empirical upscaling of FLUXNET eddy covariance observations: validation of a model tree ensemble approach using a biosphere model." Biogeosciences Discussions 6, no. 3 (May 26, 2009): 5271–304. http://dx.doi.org/10.5194/bgd-6-5271-2009.

Full text
Abstract:
Global, spatially and temporally explicit estimates of carbon and water fluxes derived from the empirical up-scaling of eddy covariance measurements would constitute a new and possibly powerful data stream for studying the variability of the global terrestrial carbon and water cycle. This paper introduces and validates a machine learning approach dedicated to the upscaling of observations from the current global network of eddy covariance towers (FLUXNET). We present a new model, the TRee Induction ALgorithm (TRIAL), that performs hierarchical stratification of the data set into units where particular multiple regressions for a target variable hold. We propose an ensemble approach (Evolving tRees with RandOm gRowth, ERROR) in which the base learning algorithm is perturbed in order to gain a diverse sequence of different model trees which evolves over time. We evaluate the efficiency of the model tree ensemble approach using an artificial data set derived from the Lund-Potsdam-Jena managed Land (LPJmL) biosphere model. We aim at reproducing global monthly gross primary production as simulated by LPJmL from 1998–2005 using only locations and months where high-quality FLUXNET data exist for the training of the model trees. The model trees are trained with the LPJmL land cover and meteorological input data, climate data, and the fraction of absorbed photosynthetically active radiation simulated by LPJmL. Given that we know the "true result" in the form of global LPJmL simulations, we can effectively study the performance of the model tree ensemble upscaling and the associated problems of extrapolation capacity. We show that the model tree ensemble is able to explain 92% of the variability of the global LPJmL GPP simulations.
The mean spatial pattern and the seasonal variability of GPP, which constitute the largest sources of variance, are very well reproduced (96% and 94% of variance explained, respectively), while the monthly interannual anomalies, which account for much less variance, are less well matched (41% of variance explained). We demonstrate the substantially improved accuracy of the model tree ensemble over individual model trees, in particular for the monthly anomalies and for situations of extrapolation. We estimate that roughly one fifth of the domain is subject to extrapolation, where the model tree ensemble is still able to reproduce 73% of the LPJmL GPP variability. This paper presents for the first time a benchmark for a global FLUXNET upscaling approach that will be employed in future studies. Although the real-world FLUXNET upscaling is more complicated than for the noise-free and reduced-complexity biosphere model presented here, our results show that an empirical upscaling from the current FLUXNET network with a model tree ensemble is feasible and able to extract global patterns of carbon flux variability.
APA, Harvard, Vancouver, ISO, and other styles
14

Mišić, Velibor V. "Optimization of Tree Ensembles." Operations Research 68, no. 5 (September 2020): 1605–24. http://dx.doi.org/10.1287/opre.2019.1928.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

Choi, Sung Hoon, and Hyunjoong Kim. "Tree size determination for classification ensemble." Journal of the Korean Data and Information Science Society 27, no. 1 (January 31, 2016): 255–64. http://dx.doi.org/10.7465/jkdi.2016.27.1.255.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Liu, Chanjuan, Shijie Zhou, You-Gan Wang, and Zhihua Hu. "Natural mortality estimation using tree-based ensemble learning models." ICES Journal of Marine Science 77, no. 4 (June 5, 2020): 1414–26. http://dx.doi.org/10.1093/icesjms/fsaa058.

Full text
Abstract:
Empirical studies are popular in estimating the fish natural mortality rate (M). However, these empirical methods derive M from other life-history parameters and are often perceived as less reliable than direct methods. To improve the predictive performance and reliability of empirical methods, we develop ensemble learning models, including bagging trees, random forests, and boosting trees, to predict M based on a dataset of 256 records of both Chondrichthyes and Osteichthyes. Three common life-history parameters are used as predictors: the maximum age and two growth parameters (growth coefficient and asymptotic length). In addition, the taxonomic variable class is included to distinguish Chondrichthyes and Osteichthyes. Results indicate that tree-based ensemble learning models significantly improve the accuracy of the M estimate, compared to traditional statistical regression models and the basic regression tree model. Among the ensemble learning models, boosting trees and random forests perform best on the training dataset, but the former performs slightly better on the test dataset. We develop four boosting trees models for estimating M based on varying life-history parameters, and an R package is provided for interested readers to estimate M for their new species.
APA, Harvard, Vancouver, ISO, and other styles
17

Rahman, Raziur, Saad Haider, Souparno Ghosh, and Ranadip Pal. "Design of Probabilistic Random Forests with Applications to Anticancer Drug Sensitivity Prediction." Cancer Informatics 14s5 (January 2015): CIN.S30794. http://dx.doi.org/10.4137/cin.s30794.

Full text
Abstract:
Random forests consisting of an ensemble of regression trees with equal weights are frequently used for the design of predictive models. In this article, we consider an extension of the methodology by representing the regression trees in the form of probabilistic trees and analyzing the nature of heteroscedasticity. The probabilistic tree representation allows for analytical computation of confidence intervals (CIs), and the tree weight optimization is expected to provide stricter CIs with comparable performance in mean error. We approached the ensemble of probabilistic trees' prediction from the perspectives of a mixture distribution and as a weighted sum of correlated random variables. We applied our methodology to the drug sensitivity prediction problem on synthetic data and the Cancer Cell Line Encyclopedia dataset and illustrated that tree weights can be selected to reduce the average length of the CI without an increase in mean error.
APA, Harvard, Vancouver, ISO, and other styles
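The starting point of this abstract, a forest as an equally weighted ensemble whose per-tree spread carries interval information, can be illustrated with a plain random forest. This is an illustrative sketch, not the article's probabilistic-tree formulation: it uses equal weights and a crude mean-plus-or-minus-two-standard-deviations interval, whereas the article derives analytical CIs and optimizes the tree weights to tighten them.

```python
# Sketch: derive a rough prediction interval for one sample from the
# spread of individual tree predictions in an (equally weighted) forest.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=3)
forest = RandomForestRegressor(n_estimators=100, random_state=3).fit(X, y)

# Collect every tree's prediction for the first sample ...
per_tree = np.array([t.predict(X[:1])[0] for t in forest.estimators_])
mean, sd = per_tree.mean(), per_tree.std()

# ... and report mean +/- 2 sd as a crude interval around the ensemble mean.
print(round(mean, 2), (round(mean - 2 * sd, 2), round(mean + 2 * sd, 2)))
```

Reweighting the trees (the article's contribution) changes both the ensemble mean and the interval width; the equal-weight version above is the baseline being improved upon.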
18

Ruaud, Albane, Niklas Pfister, Ruth E. Ley, and Nicholas D. Youngblut. "Interpreting tree ensemble machine learning models with endoR." PLOS Computational Biology 18, no. 12 (December 14, 2022): e1010714. http://dx.doi.org/10.1371/journal.pcbi.1010714.

Full text
Abstract:
Tree ensemble machine learning models are increasingly used in microbiome science as they are compatible with the compositional, high-dimensional, and sparse structure of sequence-based microbiome data. While such models are often good at predicting phenotypes based on microbiome data, they only yield limited insights into how microbial taxa may be associated. We developed endoR, a method to interpret tree ensemble models. First, endoR simplifies the fitted model into a decision ensemble. Then, it extracts information on the importance of individual features and their pairwise interactions, displaying them as an interpretable network. Both the endoR network and importance scores provide insights into how features, and interactions between them, contribute to the predictive performance of the fitted model. Adjustable regularization and bootstrapping help reduce the complexity and ensure that only essential parts of the model are retained. We assessed endoR on both simulated and real metagenomic data. We found endoR to have comparable accuracy to other common approaches while easing and enhancing model interpretation. Using endoR, we also confirmed published results on gut microbiome differences between cirrhotic and healthy individuals. Finally, we utilized endoR to explore associations between human gut methanogens and microbiome components. Indeed, these hydrogen consumers are expected to interact with fermenting bacteria in a complex syntrophic network. Specifically, we analyzed a global metagenome dataset of 2203 individuals and confirmed the previously reported association between Methanobacteriaceae and Christensenellales. Additionally, we observed that Methanobacteriaceae are associated with a network of hydrogen-producing bacteria. Our method accurately captures how tree ensembles use features and interactions between them to predict a response. 
As demonstrated by our applications, the resultant visualizations and summary outputs facilitate model interpretation and enable the generation of novel hypotheses about complex systems.
APA, Harvard, Vancouver, ISO, and other styles
19

Gavrylenko, Svitlana, and Oleksii Hornostal. "Development of a method for identification of the state of computer systems based on bagging classifiers." Advanced Information Systems 5, no. 4 (December 20, 2021): 5–9. http://dx.doi.org/10.20998/2522-9052.2021.4.01.

Full text
Abstract:
The subject of the research is methods and means of identifying the state of a computer system. The purpose of the article is to improve the quality of computer system state identification by developing a method based on ensemble classifiers. Task: to investigate methods for constructing bagging classifiers based on decision trees, to configure them, and to develop a method for identifying the state of the computer system. Methods used: artificial intelligence methods, machine learning, ensemble methods. The following results were obtained: bagging classifiers based on the meta-algorithms Pasting Ensemble, Bootstrap Ensemble, Random Subspace Ensemble, Random Patches Ensemble, and Random Forest were investigated, and their accuracy in identifying the state of the computer system was assessed. The tuning parameters of individual decision trees were studied and their optimal values found, including: the maximum number of features used in the construction of the tree; the minimum number of branches when building a tree; the minimum number of leaves; and the maximum tree depth. The optimal number of trees in the ensemble was determined. A method for identifying the state of the computer system is proposed, which differs from the known ones in the choice of the classification meta-algorithm and the selection of the optimal parameters for its adjustment. The accuracy of the developed method is assessed. The developed method is implemented in software and investigated in solving the problem of identifying an abnormal state of computer system functioning. Conclusions: the scientific novelty of the results obtained lies in the development of a method for identifying the state of the computer system by choosing a meta-algorithm for classification and determining the optimal parameters for its configuration.
APA, Harvard, Vancouver, ISO, and other styles
20

Kelarev, Andrei V., Jemal Abawajy, Andrew Stranieri, and Herbert F. Jelinek. "Empirical Investigation of Decision Tree Ensembles for Monitoring Cardiac Complications of Diabetes." International Journal of Data Warehousing and Mining 9, no. 4 (October 2013): 1–18. http://dx.doi.org/10.4018/ijdwm.2013100101.

Full text
Abstract:
Cardiac complications of diabetes require continuous monitoring since they may lead to increased morbidity or sudden death of patients. In order to monitor clinical complications of diabetes using wearable sensors, a small set of features has to be identified and effective algorithms for their processing need to be investigated. This article focuses on detecting and monitoring cardiac autonomic neuropathy (CAN) in diabetes patients. The authors investigate and compare the effectiveness of classifiers based on the following decision trees: ADTree, J48, NBTree, RandomTree, REPTree, and SimpleCart. The authors perform a thorough study comparing these decision trees as well as several decision tree ensembles created by applying the following ensemble methods: AdaBoost, Bagging, Dagging, Decorate, Grading, MultiBoost, Stacking, and two multi-level combinations of AdaBoost and MultiBoost with Bagging for the processing of data from diabetes patients for pervasive health monitoring of CAN. This paper concentrates on the particular task of applying decision tree ensembles for the detection and monitoring of cardiac autonomic neuropathy using these features. Experimental outcomes presented here show that the authors' application of decision tree ensembles for the detection and monitoring of CAN in diabetes patients achieved better performance parameters compared with the results obtained previously in the literature.
APA, Harvard, Vancouver, ISO, and other styles
21

Gbenga, Fadare Oluwaseun, Adetunmbi Adebayo Olusola, and Oyinloye Oghenerukevwe Elohor. "Towards Optimization of Malware Detection using Extra-Tree and Random Forest Feature Selections on Ensemble Classifiers." International Journal of Recent Technology and Engineering 9, no. 6 (March 30, 2021): 223–32. http://dx.doi.org/10.35940/ijrte.f5545.039621.

Full text
Abstract:
The proliferation of malware on computer communication systems poses great security challenges to stored confidential data and other valuable assets across the globe. There have been several attempts at curbing the menace using a signature-based approach, and in recent times machine learning techniques have been extensively explored. This paper proposes a framework combining extra-tree and random-forest feature selection with eight ensemble techniques on five base learners: KNN, Naive Bayes, SVM, Decision Trees, and Logistic Regression. K-Nearest Neighbors returns the highest accuracy of 96.48%, 96.40%, and 87.89% on extra-tree, random-forest, and without feature selection (WFS), respectively. Random forest ensemble accuracy on both feature selections is the highest, with 98.50% and 98.16% on random forest and extra-tree, respectively. The Extreme Gradient Boosting classifier is next on random-forest FS with an accuracy of 98.37%, while Voting returns the least detection accuracy of 95.80%. On extra-tree FS, Bagging is next with a detection accuracy of 98.09%, while Voting returns the least accuracy of 95.54%. Random Forest scores highest on all seven evaluative measures under both extra-tree and random-forest feature selection techniques. The study results show that the tree-based ensemble model is proficient and effective for malware classification.
APA, Harvard, Vancouver, ISO, and other styles
22

Dagogo-George, Tamunopriye Ene, Hammed Adeleye Mojeed, Abdulateef Oluwagbemiga Balogun, Modinat Abolore Mabayoje, and Shakirat Aderonke Salihu. "Tree-based homogeneous ensemble model with feature selection for diabetic retinopathy prediction." Jurnal Teknologi dan Sistem Komputer 8, no. 4 (October 13, 2020): 297–303. http://dx.doi.org/10.14710/jtsiskom.2020.13669.

Full text
Abstract:
Diabetic retinopathy (DR) is a condition that emerges from prolonged diabetes, causing severe damage to the eyes. Early diagnosis of this disease is highly imperative, as late diagnosis may be fatal. Existing studies employed machine learning approaches, with support vector machines (SVM) having the highest performance on most analyses and decision trees (DT) having the lowest. However, SVM has been known to suffer from parameter and kernel selection problems, which undermine its predictive capability. Hence, this study presents homogeneous ensemble classification methods with DT as the base classifier to optimize predictive performance. Boosting and Bagging ensemble methods with feature selection were employed, and experiments were carried out using Python scikit-learn libraries on DR datasets extracted from the UCI Machine Learning Repository. Experimental results showed that Bagged and Boosted DT were better than SVM. Specifically, Bagged DT performed best with accuracy 65.38%, f-score 0.664, and AUC 0.731, followed by Boosted DT with accuracy 65.42%, f-score 0.655, and AUC 0.724, when compared to SVM (accuracy 65.16%, f-score 0.652, and AUC 0.721). These results indicate that DT's predictive performance can be optimized by employing homogeneous ensemble methods to outperform SVM in predicting DR.
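The comparison of bagged DT, boosted DT, and SVM described above can be reproduced in outline with scikit-learn (which the abstract names); a built-in dataset stands in for the UCI diabetic retinopathy data:

```python
from sklearn.datasets import load_breast_cancer  # stand-in for the UCI DR data
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Homogeneous ensembles with a decision tree base, versus a plain SVM.
models = {
    "Bagged DT": BaggingClassifier(DecisionTreeClassifier(),
                                   n_estimators=50, random_state=0),
    "Boosted DT": AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                                     n_estimators=50, random_state=0),
    "SVM": SVC(),
}
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in models.items()}
```

On this stand-in data the ranking need not match the paper's; the point is the experimental setup, not the numbers.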
APA, Harvard, Vancouver, ISO, and other styles
23

García-Martín, Eva, Albert Bifet, and Niklas Lavesson. "Energy modeling of Hoeffding tree ensembles." Intelligent Data Analysis 25, no. 1 (January 26, 2021): 81–104. http://dx.doi.org/10.3233/ida-194890.

Full text
Abstract:
Energy consumption reduction has been an increasing trend in machine learning over the past few years due to its socio-ecological importance. In new challenging areas such as edge computing, energy consumption and predictive accuracy are key variables during algorithm design and implementation. State-of-the-art ensemble stream mining algorithms are able to create highly accurate predictions at a substantial energy cost. This paper introduces the nmin adaptation method to ensembles of Hoeffding tree algorithms, to further reduce their energy consumption without sacrificing accuracy. We also present extensive theoretical energy models of such algorithms, detailing their energy patterns and how nmin adaptation affects their energy consumption. We have evaluated the energy efficiency and accuracy of the nmin adaptation method on five different ensembles of Hoeffding trees on 11 publicly available datasets. The results show that we are able to reduce energy consumption significantly, by 21% on average, while affecting accuracy by less than one percent on average.
APA, Harvard, Vancouver, ISO, and other styles
24

Dağ, Özge Hüsniye Namlı. "Predicting the Success of Ensemble Algorithms in the Banking Sector." International Journal of Business Analytics 6, no. 4 (October 2019): 12–31. http://dx.doi.org/10.4018/ijban.2019100102.

Full text
Abstract:
The banking sector, like other service sectors, improves in accordance with its customers' needs. Therefore, knowing customers' needs and predicting customer behavior are very important for competition in the banking sector. Data mining uncovers relationships and hidden patterns in large data sets. Classification algorithms, one of the applications of data mining, are used very effectively in decision making. In this study, the C4.5 algorithm, a decision tree algorithm widely used in classification problems, is used in an integrated way with ensemble machine learning methods in order to increase the efficiency of the algorithms. Data obtained via direct marketing campaigns of Portuguese banks were used to classify whether customers have term deposit accounts or not. Artificial neural networks and support vector machines, as traditional artificial intelligence methods, and Bagging-C4.5 and Boosted-C4.5, as ensemble-decision tree hybrid methods, were used in classification. Bagging-C4.5 achieved more powerful classification success than the other algorithms used. For this study, the ensemble-decision tree hybrid methods give better results than artificial neural networks and support vector machines as traditional artificial intelligence methods.
APA, Harvard, Vancouver, ISO, and other styles
25

Wei, Yan Yan, and Tao Sheng Li. "An Empirical Study on Feature Subsampling-Based Ensembles." Applied Mechanics and Materials 239-240 (December 2012): 848–52. http://dx.doi.org/10.4028/www.scientific.net/amm.239-240.848.

Full text
Abstract:
Feature subsampling techniques help to create diversity for classifier ensembles. In this article we investigate two feature subsampling-based ensemble methods, the Random Subspace Method (RSM) and the Rotation Forest Method (RFM), to explore their usability with different learning algorithms and their robustness on noisy data. The experiments show that RSM with IBk works better than RFM and AdaBoost, and that RFM with tree classifiers and rule classifiers achieves more prominent improvement than the others. We also find that the Logistic algorithm is not suitable for any of the three ensembles. When adding classification noise to the original data sets, ensembles outperform single classifiers at lower noise levels but fail to maintain this superiority at higher noise levels.
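The Random Subspace Method studied here can be sketched in scikit-learn as a `BaggingClassifier` that resamples features rather than rows (Weka's IBk corresponds to k-nearest neighbors); synthetic data is used since the paper's benchmark sets are not given:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=600, n_features=40, n_informative=12,
                           random_state=1)

# Random Subspace Method: each base learner sees all training samples but
# only a random half of the features (bootstrap=False disables row resampling).
rsm_knn = BaggingClassifier(KNeighborsClassifier(),
                            n_estimators=30, max_features=0.5,
                            bootstrap=False, random_state=1)
score = cross_val_score(rsm_knn, X, y, cv=5).mean()
```

Rotation Forest has no scikit-learn implementation, so only the RSM half of the comparison is shown.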
APA, Harvard, Vancouver, ISO, and other styles
26

Zou, Yao, Changchun Gao, Meng Xia, and Congyuan Pang. "Credit scoring based on a Bagging-cascading boosted decision tree." Intelligent Data Analysis 26, no. 6 (November 12, 2022): 1557–78. http://dx.doi.org/10.3233/ida-216228.

Full text
Abstract:
Establishing precise credit scoring models to predict the potential default probability is vital for credit risk management. Machine learning models, especially ensemble learning approaches, have shown substantial progress in improving credit scoring performance. The Bagging ensemble approach improves credit scoring performance by optimizing the prediction variance, while boosting ensemble algorithms reduce the prediction error by controlling the prediction bias. In this study, we propose a hybrid ensemble method that combines the advantages of the Bagging ensemble strategy and the boosting ensemble optimization pattern, which balances the variance-bias tradeoff well. The proposed method uses XGBoost as a base learner, which ensures low-bias prediction. Moreover, the Bagging strategy is introduced to train the base learner, preventing over-fitting in the proposed method. The Bagging-boosting ensemble algorithm is further assembled in a cascading way, making the proposed hybrid ensemble algorithm a good solution for balancing the variance-bias tradeoff in credit scoring. Experimental results on the Australian, German, Japanese, and Taiwan datasets show that the proposed Bagging-cascading boosted decision tree provides more accurate credit scoring results.
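The core idea of bagging over a boosted base learner can be sketched as follows; scikit-learn's `GradientBoostingClassifier` is swapped in for XGBoost to keep the example dependency-free, the cascading stage is omitted, and a synthetic imbalanced table stands in for the credit datasets:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for a credit-scoring table (minority class = defaults).
X, y = make_classification(n_samples=800, n_features=20, weights=[0.7, 0.3],
                           random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=2)

# Bagging over a boosted base learner: boosting lowers bias inside each bag,
# bagging over bootstrap samples lowers variance across bags.
model = BaggingClassifier(GradientBoostingClassifier(n_estimators=50,
                                                     random_state=2),
                          n_estimators=10, random_state=2).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
```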
APA, Harvard, Vancouver, ISO, and other styles
27

Yang, Dazhi. "Ultra-fast analog ensemble using kd-tree." Journal of Renewable and Sustainable Energy 11, no. 5 (September 2019): 053703. http://dx.doi.org/10.1063/1.5124711.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Liu, Xiaoqian, Qianmu Li, Tao Li, and Dong Chen. "Differentially private classification with decision tree ensemble." Applied Soft Computing 62 (January 2018): 807–16. http://dx.doi.org/10.1016/j.asoc.2017.09.010.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Sun, Tao, and Zhi-Hua Zhou. "Structural diversity for decision tree ensemble learning." Frontiers of Computer Science 12, no. 3 (February 15, 2018): 560–70. http://dx.doi.org/10.1007/s11704-018-7151-8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Yamaguchi, Takashi, Yuki Noguchi, Kenneth J. Mackin, and Takumi Ichimura. "Cluster ensemble in adaptive tree structured clustering." International Journal of Knowledge Engineering and Soft Data Paradigms 3, no. 1 (2011): 69. http://dx.doi.org/10.1504/ijkesdp.2011.039879.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Frederickson, Greg N., and D. J. Guan. "Preemptive Ensemble Motion Planning on a Tree." SIAM Journal on Computing 21, no. 6 (December 1992): 1130–52. http://dx.doi.org/10.1137/0221066.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Frederickson, G. N., and D. J. Guan. "Nonpreemptive Ensemble Motion Planning on a Tree." Journal of Algorithms 15, no. 1 (July 1993): 29–60. http://dx.doi.org/10.1006/jagm.1993.1029.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Hsu, Kuo-Wei. "A Theoretical Analysis of Why Hybrid Ensembles Work." Computational Intelligence and Neuroscience 2017 (2017): 1–12. http://dx.doi.org/10.1155/2017/1930702.

Full text
Abstract:
Inspired by the group decision making process, ensembles, or combinations of classifiers, have been found favorable in a wide variety of application domains. Some researchers propose to use a mixture of two different types of classification algorithms to create a hybrid ensemble. Why does such an ensemble work? The question remains. Following the concept of diversity, which is one of the fundamental elements of the success of ensembles, we conduct a theoretical analysis of why hybrid ensembles work, connecting the use of different algorithms to accuracy gain. We also conduct experiments on the classification performance of hybrid ensembles of classifiers created by decision tree and naïve Bayes classification algorithms, each of which is a top data mining algorithm often used to create non-hybrid ensembles. Through this paper, we thus provide a complement to the theoretical foundation of creating and using hybrid ensembles.
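A minimal hybrid ensemble of the two algorithm families the paper studies, decision trees and naïve Bayes, can be sketched as a soft-voting combination in scikit-learn (a built-in dataset stands in for the paper's benchmarks):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Hybrid ensemble: mixing two different algorithm families adds a kind of
# diversity that resampling a single algorithm cannot provide.
hybrid = VotingClassifier([("dt", DecisionTreeClassifier(random_state=0)),
                           ("nb", GaussianNB())],
                          voting="soft")
score = cross_val_score(hybrid, X, y, cv=5).mean()
```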
APA, Harvard, Vancouver, ISO, and other styles
34

Jun, Sungbum. "Evolutionary Algorithm for Improving Decision Tree with Global Discretization in Manufacturing." Sensors 21, no. 8 (April 18, 2021): 2849. http://dx.doi.org/10.3390/s21082849.

Full text
Abstract:
Due to the recent advance in the industrial Internet of Things (IoT) in manufacturing, the vast amount of data from sensors has triggered the need for leveraging such big data for fault detection. In particular, interpretable machine learning techniques, such as tree-based algorithms, have drawn attention to the need to implement reliable manufacturing systems, and identify the root causes of faults. However, despite the high interpretability of decision trees, tree-based models make a trade-off between accuracy and interpretability. In order to improve the tree’s performance while maintaining its interpretability, an evolutionary algorithm for discretization of multiple attributes, called Decision tree Improved by Multiple sPLits with Evolutionary algorithm for Discretization (DIMPLED), is proposed. The experimental results with two real-world datasets from sensors showed that the decision tree improved by DIMPLED outperformed the performances of single-decision-tree models (C4.5 and CART) that are widely used in practice, and it proved competitive compared to the ensemble methods, which have multiple decision trees. Even though the ensemble methods could produce slightly better performances, the proposed DIMPLED has a more interpretable structure, while maintaining an appropriate performance level.
APA, Harvard, Vancouver, ISO, and other styles
35

Pleșoianu, Alin-Ionuț, Mihai-Sorin Stupariu, Ionuț Șandric, Ileana Pătru-Stupariu, and Lucian Drăguț. "Individual Tree-Crown Detection and Species Classification in Very High-Resolution Remote Sensing Imagery Using a Deep Learning Ensemble Model." Remote Sensing 12, no. 15 (July 29, 2020): 2426. http://dx.doi.org/10.3390/rs12152426.

Full text
Abstract:
Traditional methods for individual tree-crown (ITC) detection (image classification, segmentation, template matching, etc.) applied to very high-resolution remote sensing imagery have been shown to struggle in disparate landscape types or image resolutions due to scale problems and information complexity. Deep learning promised to overcome these shortcomings due to its superior performance and versatility, proven with reported detection rates of ~90%. However, such models still find their limits in transferability across study areas, because of different tree conditions (e.g., isolated trees vs. compact forests) and/or resolutions of the input data. This study introduces a highly replicable deep learning ensemble design for ITC detection and species classification based on the established single shot detector (SSD) model. The ensemble model design is based on varying the input data for the SSD models, coupled with a voting strategy for the output predictions. Very high-resolution unmanned aerial vehicles (UAV), aerial remote sensing imagery and elevation data are used in different combinations to test the performance of the ensemble models in three study sites with highly contrasting spatial patterns. The results show that ensemble models perform better than any single SSD model, regardless of the local tree conditions or image resolution. The detection performance and the accuracy rates improved by 3–18% with only as few as two participant single models, regardless of the study site. However, when more than two models were included, the performance of the ensemble models only improved slightly and even dropped.
APA, Harvard, Vancouver, ISO, and other styles
36

Barukab, Omar, Amir Ahmad, Tabrej Khan, and Mujeeb Rahiman Thayyil Kunhumuhammed. "Analysis of Parkinson’s Disease Using an Imbalanced-Speech Dataset by Employing Decision Tree Ensemble Methods." Diagnostics 12, no. 12 (November 30, 2022): 3000. http://dx.doi.org/10.3390/diagnostics12123000.

Full text
Abstract:
Parkinson's disease (PD) currently affects approximately 10 million people worldwide. The detection of PD-positive subjects is vital in terms of disease prognostics, diagnostics, management, and treatment. Different types of early symptoms, such as speech impairment and changes in writing, are associated with Parkinson's disease. To classify potential patients with PD, many researchers used machine learning algorithms on various datasets related to this disease. In our research, we study the dataset of PD vocal impairment features, which is an imbalanced dataset. We propose a comparative performance evaluation using various decision tree ensemble methods, with or without oversampling techniques. In addition, we compare the performance of classifiers with different sizes of ensembles and various ratios of the minority class and the majority class with oversampling and undersampling. Finally, we combine feature selection with the best-performing ensemble classifiers. The results show that AdaBoost, random forest, and the RUSBoost decision tree developed for imbalanced datasets perform well in performance metrics such as precision, recall, F1-score, area under the receiver operating characteristic curve (AUROC), and the geometric mean. Further, feature selection methods, namely lasso and information gain, were used to screen the 10 best features using the best ensemble classifiers. AdaBoost with the information gain feature selection method is the best-performing ensemble method, with an F1-score of 0.903.
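The oversampling-plus-ensemble setup described here can be sketched with plain random oversampling (a simple stand-in for SMOTE-style techniques, avoiding the imbalanced-learn dependency) on a synthetic imbalanced dataset standing in for the vocal-impairment data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

# Synthetic imbalanced stand-in (~10% minority class).
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=3)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=3)

# Random oversampling: duplicate minority rows until the classes balance.
rng = np.random.default_rng(3)
minority = np.where(y_tr == 1)[0]
extra = rng.choice(minority, size=np.sum(y_tr == 0) - minority.size,
                   replace=True)
X_bal = np.vstack([X_tr, X_tr[extra]])
y_bal = np.concatenate([y_tr, y_tr[extra]])

# Tree ensemble trained on the balanced set, scored with F1 as in the paper.
clf = RandomForestClassifier(n_estimators=100, random_state=3).fit(X_bal, y_bal)
f1 = f1_score(y_te, clf.predict(X_te))
```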
APA, Harvard, Vancouver, ISO, and other styles
37

Liu, Rencheng, Saqib Ali, Syed Fakhar Bilal, Zareen Sakhawat, Azhar Imran, Abdullah Almuhaimeed, Abdulkareem Alzahrani, and Guangmin Sun. "An Intelligent Hybrid Scheme for Customer Churn Prediction Integrating Clustering and Classification Algorithms." Applied Sciences 12, no. 18 (September 18, 2022): 9355. http://dx.doi.org/10.3390/app12189355.

Full text
Abstract:
Nowadays, customer churn is one of the main concerns in the telecom sector, as it affects revenue directly. Telecom companies are looking to design novel methods to identify customers at risk of churning, and hence require suitable systems to overcome the growing churn challenge. Recently, integrating different clustering and classification models to develop hybrid learners (ensembles) has gained wide acceptance. Ensembles are gaining approval in the domain of big data since they have achieved excellent predictions compared to single classifiers. Therefore, in this study, we propose a customer churn prediction (CCP) ensemble system fully incorporating clustering and classification learning techniques. The proposed churn prediction model uses an ensemble of clustering and classification algorithms to improve CCP model performance. Initially, a few clustering algorithms, such as k-means, k-medoids, and random clustering, are employed to test churn prediction datasets. Next, to enhance the results, a hybridization technique is applied using different ensemble algorithms to evaluate the performance of the proposed system. The above-mentioned clustering algorithms, integrated with different classifiers including gradient boosted tree (GBT), decision tree (DT), random forest (RF), deep learning (DL), and naive Bayes (NB), are evaluated on two standard telecom datasets acquired from Orange and Cell2Cell. The experimental results reveal that, compared to the bagging ensemble technique, the stacking-based hybrid model (k-medoids-GBT-DT-DL) achieves the top accuracies of 96% and 93.6% on the Orange and Cell2Cell datasets, respectively. The proposed method outperforms conventional state-of-the-art churn prediction algorithms.
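One simple way to hybridize clustering with classification, as this line of work does, is to feed the cluster assignment to the classifier as an extra feature; the sketch below swaps k-medoids for k-means (which ships with scikit-learn) and uses a synthetic table in place of the Orange/Cell2Cell data:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in for a churn table.
X, y = make_classification(n_samples=1000, n_features=15, random_state=4)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=4)

# Hybrid scheme: cluster first (on training data only), then append the
# cluster id as a feature before fitting the gradient boosted tree (GBT).
km = KMeans(n_clusters=4, n_init=10, random_state=4).fit(X_tr)
X_tr_h = np.column_stack([X_tr, km.predict(X_tr)])
X_te_h = np.column_stack([X_te, km.predict(X_te)])

gbt = GradientBoostingClassifier(random_state=4).fit(X_tr_h, y_tr)
acc = accuracy_score(y_te, gbt.predict(X_te_h))
```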
APA, Harvard, Vancouver, ISO, and other styles
38

Bian, Yan Shan, Ling Da Wu, Rong Huan Yu, and Zhi Ke Chen. "Individual Tree Detection from High Spatial Resolution Imagery Using Color and Texture Features." Applied Mechanics and Materials 519-520 (February 2014): 703–7. http://dx.doi.org/10.4028/www.scientific.net/amm.519-520.703.

Full text
Abstract:
An automatic individual tree detection method based purely on imagery is proposed. Color and texture features are selected to form a vector for pixel-level classification, and a classifier is trained to assign a label to each pixel. Other features can be integrated into the pixel vector to extract more information about trees. An ensemble method combining multiple logistic regression classifiers improves the effectiveness of the single pixel-level classifier. Spectral, shape, and knowledge characteristics of individual tree crowns are then used for tree top localization. Finally, tree crowns are delineated by a region-based algorithm.
APA, Harvard, Vancouver, ISO, and other styles
39

Geethanjali, P., and K. K. Ray. "STATISTICAL PATTERN RECOGNITION TECHNIQUE FOR IMPROVED REAL-TIME MYOELECTRIC SIGNAL CLASSIFICATION." Biomedical Engineering: Applications, Basis and Communications 25, no. 02 (April 2013): 1350026. http://dx.doi.org/10.4015/s1016237213500269.

Full text
Abstract:
The authors in this paper propose a statistical technique for pattern recognition of electromyogram (EMG) signals, along with an effective feature ensemble, to achieve improved classification performance with less processing time and memory space. In this study, EMG signals from 10 healthy subjects and two transradial amputees for six motions of the hand and wrist are considered for identification of the intended motion. The time domain features extracted from four channels of myoelectric signals are grouped into three ensembles to identify the effectiveness of feature ensembles in classification. The three feature ensembles obtained from multichannel continuous EMG signals are applied to the new classifiers, namely simple logistic regression (SLR), the J48 algorithm for decision trees (DT), the logistic model tree (LMT), and a feature subspace ensemble using k-nearest neighbor (kNN). The SLR, DT, and LMT classifiers select only the dominant features during training to develop the model for pattern recognition; this selection of features reduces the processing time as well as the memory space of the controller for real-time application. The performance of SLR, DT, LMT, and the feature subspace ensemble using kNN is compared with other conventional classifiers, such as neural networks (NN), simple kNN, and linear discriminant analysis (LDA). The average classification accuracy with SLR is found to be better with feature ensemble-1 compared to the other classifiers. Moreover, the statistical Kruskal–Wallis test shows that the classification performance of SLR is not only better but also takes less time and memory space compared to other classifiers. The performance of the classifier is also tested in real time with transradial amputees, actuating a drive for two intended motions with a TMS320F28335 eZdsp controller. The experimental results show that the SLR classifier improves the controller response in real time.
APA, Harvard, Vancouver, ISO, and other styles
40

Cahyana, Nur Heri, Yuli Fauziah, and Agus Sasmito Aribowo. "The Comparison of Tree-Based Ensemble Machine Learning for Classifying Public Datasets." RSF Conference Series: Engineering and Technology 1, no. 1 (December 23, 2021): 407–13. http://dx.doi.org/10.31098/cset.v1i1.412.

Full text
Abstract:
This study aims to determine the best tree-based ensemble machine learning methods for classifying the datasets used, 34 datasets in total. It also examines the relationship between the number of records and columns of the test dataset and the number of estimators (trees) for each ensemble model, namely random forest, extra trees, AdaBoost, and gradient boosting. The four methods are compared on maximum accuracy and on the number of estimators needed when classifying the test datasets. Based on the experimental results, the best tree-based ensemble method and the best number of estimators have been obtained for the classification of each dataset used in the study. The extra trees method is the best classifier for binary-class and multi-class problems, random forest is good for multi-class problems, and AdaBoost is a fairly good method for binary-class problems. The number of rows, columns, and data classes is positively correlated with the number of estimators: processing a dataset with many rows, columns, or classes requires more estimators than processing a small one. However, the number of classes and accuracy are negatively correlated, meaning that accuracy decreases as there are more classes to separate.
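The estimator-count comparison the study performs can be sketched as a small sweep over `n_estimators` for the four ensembles named, on a synthetic multi-class dataset standing in for the 34 public datasets:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (RandomForestClassifier, ExtraTreesClassifier,
                              AdaBoostClassifier, GradientBoostingClassifier)
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=20, n_classes=3,
                           n_informative=8, random_state=5)

best = {}
for name, make in [("RF", RandomForestClassifier),
                   ("ET", ExtraTreesClassifier),
                   ("Ada", AdaBoostClassifier),
                   ("GB", GradientBoostingClassifier)]:
    # Sweep the estimator count and keep the size with the best CV accuracy.
    scores = {n: cross_val_score(make(n_estimators=n, random_state=5),
                                 X, y, cv=3).mean()
              for n in (10, 50, 100)}
    best[name] = max(scores, key=scores.get)
```

`best` then maps each ensemble to the estimator count that worked best on this particular dataset, mirroring the per-dataset tables in the paper.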
APA, Harvard, Vancouver, ISO, and other styles
41

Cao, Mengli, and Xingwei Ling. "Quantitative Comparison of Tree Ensemble Learning Methods for Perfume Identification Using a Portable Electronic Nose." Applied Sciences 12, no. 19 (September 27, 2022): 9716. http://dx.doi.org/10.3390/app12199716.

Full text
Abstract:
Perfume identification (PI) based on an electronic nose (EN) can be used for exposing counterfeit perfumes more time-efficiently and cost-effectively than using gas chromatography and mass spectrometry instruments. During the past five years, decision-tree-based ensemble learning methods, also called tree ensemble learning methods, have demonstrated excellent performance when solving multi-class classification problems. However, the performance of tree ensemble learning methods for the EN-based PI problem remains uncertain. In this paper, four well-known tree ensemble learning classification methods, random forest (RF), stagewise additive modeling using a multi-class exponential loss function (SAMME), gradient-boosting decision tree (GBDT), and extreme gradient boosting (XGBoost), were implemented for PI using our self-designed EN. For fair comparison, all the tested classification methods used as input the same feature data extracted using principal component analysis. Moreover, two benchmark methods, neural network and support vector machine, were also tested with the same experimental setup. The quantitative results of experiments undertaken demonstrated that the mean PI accuracy achieved by XGBoost was up to 97.5%, and that XGBoost outperformed other tested methods in terms of accuracy mean and variance based on our self-designed EN.
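The shared experimental setup this paper uses, PCA-extracted features feeding every classifier, can be sketched as a scikit-learn pipeline; the wine dataset stands in for the e-nose sensor readings, and scikit-learn's gradient boosting is swapped in for XGBoost to avoid an extra dependency:

```python
from sklearn.datasets import load_wine  # stand-in for e-nose sensor features
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_wine(return_X_y=True)

# Putting PCA inside the pipeline keeps it fitted only on each training
# fold, so the cross-validated comparison between classifiers stays fair.
pipe = make_pipeline(StandardScaler(), PCA(n_components=5),
                     GradientBoostingClassifier(random_state=6))
score = cross_val_score(pipe, X, y, cv=5).mean()
```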
APA, Harvard, Vancouver, ISO, and other styles
42

Godara, Jyoti, Isha Batra, Rajni Aron, and Mohammad Shabaz. "Ensemble Classification Approach for Sarcasm Detection." Behavioural Neurology 2021 (November 22, 2021): 1–13. http://dx.doi.org/10.1155/2021/9731519.

Full text
Abstract:
Cognitive science is a field which focuses on analyzing the human brain using applications of data mining (DM). Databases are utilized to gather and store large volumes of data, from which authenticated information is extracted using suitable measures. This research work is based on detecting sarcasm in text data. It introduces a scheme to detect sarcasm based on the PCA algorithm, the k-means algorithm, and ensemble classification. Four ensemble classifiers are designed with the objective of detecting sarcasm. The first ensemble classification algorithm (SKD) is the combination of SVM, KNN, and decision tree. In the second ensemble classifier (SLD), SVM, logistic regression, and decision tree classifiers are combined for sarcasm detection. In the third ensemble model (MLD), MLP, logistic regression, and decision tree are combined, and the last one (SLM) is the combination of MLP, logistic regression, and SVM. The proposed model is implemented in Python and tested on five datasets of different sizes, and the performance of the models is evaluated with regard to various metrics.
APA, Harvard, Vancouver, ISO, and other styles
43

Zhang, Shangtong, and Hengshuai Yao. "ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 5789–96. http://dx.doi.org/10.1609/aaai.v33i01.33015789.

Full text
Abstract:
In this paper, we propose an actor ensemble algorithm, named ACE, for continuous control with a deterministic policy in reinforcement learning. In ACE, we use actor ensemble (i.e., multiple actors) to search the global maxima of the critic. Besides the ensemble perspective, we also formulate ACE in the option framework by extending the option-critic architecture with deterministic intra-option policies, revealing a relationship between ensemble and options. Furthermore, we perform a look-ahead tree search with those actors and a learned value prediction model, resulting in a refined value estimation. We demonstrate a significant performance boost of ACE over DDPG and its variants in challenging physical robot simulators.
APA, Harvard, Vancouver, ISO, and other styles
44

Liu, Jiaming, Liuan Wang, Linan Zhang, Zeming Zhang, and Sicheng Zhang. "Predictive analytics for blood glucose concentration: an empirical study using the tree-based ensemble approach." Library Hi Tech 38, no. 4 (July 1, 2020): 835–58. http://dx.doi.org/10.1108/lht-08-2019-0171.

Full text
Abstract:
Purpose: The primary objective of this study was to recognize critical indicators in predicting blood glucose (BG) through data-driven methods and to compare the prediction performance of four tree-based ensemble models, i.e., bagging with tree regressors (Bagging-DT), AdaBoost with tree regressors (AdaBoost-DT), random forest (RF), and gradient boosting decision tree (GBDT). Design/methodology/approach: This study proposed a majority voting feature selection method combining lasso regression with the Akaike information criterion (LR-AIC), lasso regression with the Bayesian information criterion (LR-BIC), and RF to select indicators with excellent predictive performance from an initial 38 indicators in 5,642 samples. The selected features were deployed to build the tree-based ensemble models, and the 10-fold cross-validation (CV) method was used to evaluate the performance of each ensemble model. Findings: The results of feature selection indicated that age, corpuscular hemoglobin concentration (CHC), red blood cell volume distribution width (RBCVDW), red blood cell volume, and leucocyte count are the five most important clinical/physical indicators in BG prediction. Furthermore, this study also found that the GBDT ensemble model combined with the proposed majority voting feature selection method is better than the other three models with respect to prediction performance and stability. Practical implications: This study proposed a novel BG prediction framework for better predictive analytics in health care. Social implications: This study incorporated medical background and machine learning technology to reduce diabetes morbidity and formulate precise medical schemes. Originality/value: The majority voting feature selection method combined with the GBDT ensemble model provides an effective decision-making tool for predicting BG and detecting diabetes risk in advance.
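The voting idea behind the paper's feature selection, each selector votes for the features it keeps and only features with enough votes survive, can be sketched with two voters (cross-validated lasso and RF importances; the AIC/BIC lasso variants are omitted for brevity, so "majority" here means both voters agree) on a synthetic regression table with 38 indicators:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the 38 clinical/physical indicators.
X, y = make_regression(n_samples=300, n_features=38, n_informative=5,
                       noise=10.0, random_state=7)

# Voter 1: lasso keeps a feature if its coefficient survives shrinkage.
lasso_votes = LassoCV(cv=5, random_state=7).fit(X, y).coef_ != 0

# Voter 2: random forest keeps features above median importance.
rf_imp = RandomForestRegressor(n_estimators=200,
                               random_state=7).fit(X, y).feature_importances_
rf_votes = rf_imp > np.median(rf_imp)

# A feature is selected only when both voters agree.
selected = np.where(lasso_votes & rf_votes)[0]
```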
APA, Harvard, Vancouver, ISO, and other styles
45

Cho, Sung-bin. "Corporate Bankruptcy Prediction using Decision Tree Ensemble Technique." Journal of the Korea Management Engineers Society 25, no. 4 (December 31, 2020): 63–71. http://dx.doi.org/10.35373/kmes.25.4.5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Zhou, Yu, Hui Li, Mei Chen, Zhenyu Dai, and Ming Zhu. "Accelerate tree ensemble learning based on adaptive sampling." Journal of Computational Methods in Sciences and Engineering 20, no. 2 (July 10, 2020): 509–19. http://dx.doi.org/10.3233/jcm-193912.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Lavanya, D. "Ensemble Decision Tree Classifier For Breast Cancer Data." International Journal of Information Technology Convergence and Services 2, no. 1 (February 29, 2012): 17–24. http://dx.doi.org/10.5121/ijitcs.2012.2103.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Li, Feijiang, Yuhua Qian, and Jieting Wang. "GoT: a Growing Tree Model for Clustering Ensemble." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 9 (May 18, 2021): 8349–56. http://dx.doi.org/10.1609/aaai.v35i9.17015.

Full text
Abstract:
The clustering ensemble technique, which integrates multiple clustering results, can improve the accuracy and robustness of the final clustering. In many clustering ensemble algorithms, the co-association matrix (CA matrix), which reflects the frequency of any two samples being partitioned into the same cluster, plays an important role. However, the CA matrix is generally highly sparse with low value density, which may limit the performance of an algorithm based on it. To handle these issues, in this paper we propose a growing tree model (GoT). In this model, the CA matrix is first refined by the shortest-path technique so that its sparsity is mitigated. Then, a set of representative prototype examples is discovered. Finally, to handle the low value density of the CA matrix, the prototypes gradually connect to their neighborhoods, like a set of trees growing up. The rationality of the discovered prototype examples is illustrated by theoretical and experimental analysis. The working mechanism of the GoT is visually shown on synthetic data sets. Experimental analyses on eight UCI data sets and eight image data sets show that the GoT outperforms nine representative clustering ensemble algorithms.
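The co-association matrix at the heart of such methods is straightforward to build; the sketch below constructs it from repeated k-means runs on synthetic blobs (the GoT refinement steps themselves are not reproduced here):

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, _ = make_blobs(n_samples=100, centers=3, random_state=8)

# Co-association (CA) matrix: entry (i, j) is the fraction of base
# clusterings that put samples i and j in the same cluster.
n, runs = len(X), 20
ca = np.zeros((n, n))
for seed in range(runs):
    labels = KMeans(n_clusters=3, n_init=1, random_state=seed).fit_predict(X)
    ca += (labels[:, None] == labels[None, :])
ca /= runs
```

The diagonal is always 1 (every sample co-occurs with itself), and off-diagonal sparsity of `ca` is exactly the issue the GoT's shortest-path refinement targets.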
APA, Harvard, Vancouver, ISO, and other styles
49

Park, Sangho, and Chanmin Kim. "Comparison of tree-based ensemble models for regression." Communications for Statistical Applications and Methods 29, no. 5 (September 30, 2022): 561–89. http://dx.doi.org/10.29220/csam.2022.29.5.561.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Weng, Jinxian, and Qiang Meng. "Ensemble Tree Approach to Estimating Work Zone Capacity." Transportation Research Record: Journal of the Transportation Research Board 2286, no. 1 (January 2012): 56–67. http://dx.doi.org/10.3141/2286-07.

Full text
APA, Harvard, Vancouver, ISO, and other styles