Journal articles on the topic 'Random Decision Forests'

1

Jeong, Hoyeon, Youngjune Kim, and So Yeong Lim. "A Predictive Model for Farmland Purchase/Rent Using Random Forests." Korean Agricultural Economics Association 63, no. 3 (September 30, 2022): 153–68. http://dx.doi.org/10.24997/kjae.2022.63.3.153.

Abstract:
This study offers guidance for understanding farmland purchase and rent decisions in Korea through an analysis using random forests, a supervised machine learning algorithm. The Farm Household Economy Survey is used to predict the relationship between farmland acquisition and farm household economic characteristics. Our main findings are twofold. First, a farmland purchase decision is positively related to transfer incomes, the value of inventory and fixed assets, and the value of farmland that farmers own. Second, a farmland rent decision is also positively associated with rent paid in the prior year, revenue from field crops, inventory and agricultural assets, and transfer incomes.
2

Wu, David J., Tony Feng, Michael Naehrig, and Kristin Lauter. "Privately Evaluating Decision Trees and Random Forests." Proceedings on Privacy Enhancing Technologies 2016, no. 4 (October 1, 2016): 335–55. http://dx.doi.org/10.1515/popets-2016-0043.

Abstract:
Decision trees and random forests are common classifiers with widespread use. In this paper, we develop two protocols for privately evaluating decision trees and random forests. We operate in the standard two-party setting where the server holds a model (either a tree or a forest), and the client holds an input (a feature vector). At the conclusion of the protocol, the client learns only the model’s output on its input and a few generic parameters concerning the model; the server learns nothing. The first protocol we develop provides security against semi-honest adversaries. We then give an extension of the semi-honest protocol that is robust against malicious adversaries. We implement both protocols and show that both variants are able to process trees with several hundred decision nodes in just a few seconds and a modest amount of bandwidth. Compared to previous semi-honest protocols for private decision tree evaluation, we demonstrate a tenfold improvement in computation and bandwidth.
3

Kumano, So, and Tatsuya Akutsu. "Comparison of the Representational Power of Random Forests, Binary Decision Diagrams, and Neural Networks." Neural Computation 34, no. 4 (March 23, 2022): 1019–44. http://dx.doi.org/10.1162/neco_a_01486.

Abstract:
In this letter, we compare the representational power of random forests, binary decision diagrams (BDDs), and neural networks in terms of the number of nodes. We assume that an axis-aligned function on a single variable is assigned to each edge in random forests and BDDs, and the activation functions of neural networks are sigmoid, rectified linear unit, or similar functions. Based on existing studies, we show that for any random forest, there exists an equivalent depth-3 neural network with a linear number of nodes. We also show that for any BDD with balanced width, there exists an equivalent shallow depth neural network with a polynomial number of nodes. These results suggest that even shallow neural networks have the same or higher representation power than deep random forests and deep BDDs. We also show that in some cases, an exponential number of nodes is required to express a given random forest by a random forest with far fewer trees, which suggests that many trees are required for random forests to represent some specific knowledge efficiently.
4

Zhang, Heng-Ru, Fan Min, and Xu He. "Aggregated Recommendation through Random Forests." Scientific World Journal 2014 (2014): 1–11. http://dx.doi.org/10.1155/2014/649596.

Abstract:
Aggregated recommendation refers to the process of suggesting one kind of item to a group of users. Compared to user-oriented or item-oriented approaches, it is more general and, therefore, more appropriate for cold-start recommendation. In this paper, we propose a random forest approach to create aggregated recommender systems. The approach is used to predict the rating that a group of users gives to a kind of item. In the preprocessing stage, we merge user, item, and rating information to construct an aggregated decision table, where rating information serves as the decision attribute. We also model the data conversion process corresponding to the new-user, new-item, and both-new problems. In the training stage, a forest is built for the aggregated training set, where each leaf is assigned a distribution of discrete ratings. In the testing stage, we present four predicting approaches to compute evaluation values based on the distribution of each tree. Experimental results on the well-known MovieLens dataset show that the aggregated approach maintains an acceptable level of accuracy.
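The abstract does not spell out its four predicting approaches, so the following is only a minimal sketch of how per-tree leaf distributions over discrete ratings could be combined. The function names, the two combination rules (expected value and most likely rating), and the sample distributions are all assumptions made for illustration.

```python
from collections import Counter


def aggregate_leaf_distributions(leaf_distributions):
    """Average per-tree rating distributions (dicts mapping rating -> probability)."""
    total = Counter()
    for dist in leaf_distributions:
        for rating, prob in dist.items():
            total[rating] += prob
    n_trees = len(leaf_distributions)
    return {rating: prob / n_trees for rating, prob in sorted(total.items())}


def expected_rating(dist):
    """Probability-weighted mean rating (hypothetical prediction rule)."""
    return sum(rating * prob for rating, prob in dist.items())


def most_likely_rating(dist):
    """Rating with the highest averaged probability (hypothetical prediction rule)."""
    return max(dist, key=dist.get)


# Hypothetical leaf distributions reached in three trees for one
# (user group, item kind) pair.
leaves = [
    {3: 0.5, 4: 0.5},
    {4: 1.0},
    {4: 0.5, 5: 0.5},
]
forest_dist = aggregate_leaf_distributions(leaves)
print(forest_dist, expected_rating(forest_dist), most_likely_rating(forest_dist))
```

Averaging distributions before choosing a rating keeps the uncertainty of each tree, whereas voting on each tree's single best rating would discard it.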
5

Audemard, Gilles, Steve Bellart, Louènas Bounia, Frédéric Koriche, Jean-Marie Lagniez, and Pierre Marquis. "Trading Complexity for Sparsity in Random Forest Explanations." Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 5 (June 28, 2022): 5461–69. http://dx.doi.org/10.1609/aaai.v36i5.20484.

Abstract:
Random forests have long been considered as powerful model ensembles in machine learning. By training multiple decision trees, whose diversity is fostered through data and feature subsampling, the resulting random forest can lead to more stable and reliable predictions than a single decision tree. This however comes at the cost of decreased interpretability: while decision trees are often easily interpretable, the predictions made by random forests are much more difficult to understand, as they involve a majority vote over multiple decision trees. In this paper, we examine different types of reasons that explain "why" an input instance is classified as positive or negative by a Boolean random forest. Notably, as an alternative to prime-implicant explanations taking the form of subset-minimal implicants of the random forest, we introduce majoritary reasons which are subset-minimal implicants of a strict majority of decision trees. For these abductive explanations, the tractability of the generation problem (finding one reason) and the optimization problem (finding one minimum-sized reason) are investigated. Unlike prime-implicant explanations, majoritary reasons may contain redundant features. However, in practice, prime-implicant explanations - for which the identification problem is DP-complete - are slightly larger than majoritary reasons that can be generated using a simple linear-time greedy algorithm. They are also significantly larger than minimum-sized majoritary reasons which can be approached using an anytime Partial MaxSAT algorithm.
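The definition of a majoritary reason given above (a subset-minimal set of the instance's feature values that forces a strict majority of trees to the forest's class) can be illustrated with a small greedy sketch. This is not the authors' implementation: the tree encoding (a leaf is a bool, an internal node is a tuple `(feature, low, high)`), the function names, and the three-tree forest are assumptions, and the implicant check assumes each feature is tested at most once per path.

```python
def tree_predict(tree, x):
    """Classify a Boolean instance x (list of 0/1) with one tree."""
    while not isinstance(tree, bool):
        feature, low, high = tree
        tree = high if x[feature] else low
    return tree


def term_implies(tree, term, target):
    """True if every instance consistent with the partial assignment `term`
    (dict feature -> value) is classified as `target` by the tree."""
    if isinstance(tree, bool):
        return tree == target
    feature, low, high = tree
    if feature in term:
        return term_implies(high if term[feature] else low, term, target)
    # Feature is free: both branches must lead to the target class.
    return term_implies(low, term, target) and term_implies(high, term, target)


def majoritary_reason(forest, x):
    """Greedily drop features while a strict majority of trees is still forced
    to the forest's prediction for x."""
    target = 2 * sum(tree_predict(t, x) for t in forest) > len(forest)
    term = dict(enumerate(x))
    for feature in list(term):
        value = term.pop(feature)
        votes = sum(term_implies(t, term, target) for t in forest)
        if 2 * votes <= len(forest):
            term[feature] = value  # the feature is needed, so restore it
    return term


# Hypothetical forest over three Boolean features.
forest = [
    (0, False, True),                    # x0
    (1, False, True),                    # x1
    (0, False, (2, False, True)),        # x0 and x2
]
reason = majoritary_reason(forest, [1, 1, 1])
print(reason)
```

Because adding feature values can never break an implicant, one pass over the features is enough for subset-minimality; the result depends on the order in which features are tried.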
6

Ho, Tin Kam. "The random subspace method for constructing decision forests." IEEE Transactions on Pattern Analysis and Machine Intelligence 20, no. 8 (1998): 832–44. http://dx.doi.org/10.1109/34.709601.

7

Fröhlich, B., E. Rodner, M. Kemmler, and J. Denzler. "Efficient Gaussian process classification using random decision forests." Pattern Recognition and Image Analysis 21, no. 2 (June 2011): 184–87. http://dx.doi.org/10.1134/s1054661811020337.

8

Fletcher, Sam, and Md Zahidul Islam. "Differentially private random decision forests using smooth sensitivity." Expert Systems with Applications 78 (July 2017): 16–31. http://dx.doi.org/10.1016/j.eswa.2017.01.034.

9

Thongkam, Jaree, and Vatinee Sukmak. "Enhancing Decision Tree with AdaBoost for Predicting Schizophrenia Readmission." Advanced Materials Research 931-932 (May 2014): 1467–71. http://dx.doi.org/10.4028/www.scientific.net/amr.931-932.1467.

Abstract:
A psychiatric readmission is argued to be an adverse outcome because it is costly and occurs when relapse to the illness is so severe. An analysis of systematic models in readmission data can provide useful insight into the quicker and sicker patients with schizophrenia. This research aims to develop and investigate schizophrenia readmission prediction models using data mining techniques including decision tree, Random Tree, Random Forests, AdaBoost, Bagging and a combination of AdaBoost with decision tree, AdaBoost with Random Tree, AdaBoost with Random Forests, Bagging with decision tree, Bagging with Random Tree and Bagging with Random Forests. The experimental results successfully showed that AdaBoost with decision tree has the highest precision, recall and F-measure up to 98.11%, 98.79% and 98.41%, respectively.
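For readers unfamiliar with the three metrics reported above, here is a minimal sketch of how precision, recall, and F-measure are computed from confusion counts. The counts in the example are made up and are not taken from the study.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F-measure (F1) from confusion counts.

    tp: true positives, fp: false positives, fn: false negatives.
    """
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for a readmission classifier.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(precision, recall, f1)
```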
10

Fröhlich, B., E. Rodner, M. Kemmler, and J. Denzler. "Large-scale Gaussian process classification using random decision forests." Pattern Recognition and Image Analysis 22, no. 1 (March 2012): 113–20. http://dx.doi.org/10.1134/s1054661812010166.

11

Ibrahim, Muhammad. "Evolution of Random Forest from Decision Tree and Bagging: A Bias-Variance Perspective." Dhaka University Journal of Applied Science and Engineering 7, no. 1 (February 1, 2023): 66–71. http://dx.doi.org/10.3329/dujase.v7i1.62888.

Abstract:
Ensemble methods are among the most heavily used techniques in machine learning, and the random forest arguably spearheads this army of learners. Having sprung from the decision tree in the late 1990s, the random forest has rightfully attracted practitioners, who have applied this powerful yet simple-to-understand technique widely and successfully to numerous applications. In this study we explain the evolution of a random forest from a decision tree in the context of the bias and variance of learning theory. While doing so, we focus on the interplay between the correlation and generalization error of the random forest. This analysis is expected to enrich the literature on random forests by providing further insight into their working mechanism. These insights will help practitioners implement the algorithm more wisely and in an informed way. DUJASE Vol. 7(1) 67-71, 2022 (January)
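The interplay between correlation and variance mentioned above is often summarized by a standard textbook identity, given here as background rather than as a result from the article. The symbols are assumptions: $B$ identically distributed tree predictions $T_b(x)$, each with variance $\sigma^2$ and pairwise correlation $\rho$.

```latex
\operatorname{Var}\!\left(\frac{1}{B}\sum_{b=1}^{B} T_b(x)\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{B}\,\sigma^{2}
```

Adding trees shrinks only the second term, so the first term, driven by the correlation between trees, sets a floor on the variance. Random feature selection at each split is aimed at lowering $\rho$.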
12

Sadorsky, Perry. "Predicting Gold and Silver Price Direction Using Tree-Based Classifiers." Journal of Risk and Financial Management 14, no. 5 (April 29, 2021): 198. http://dx.doi.org/10.3390/jrfm14050198.

Abstract:
Gold is often used by investors as a hedge against inflation or adverse economic times. Consequently, it is important for investors to have accurate forecasts of gold prices. This paper uses several machine learning tree-based classifiers (bagging, stochastic gradient boosting, random forests) to predict the price direction of gold and silver exchange traded funds. Decision tree bagging, stochastic gradient boosting, and random forests predictions of gold and silver price direction are much more accurate than those obtained from logit models. For a 20-day forecast horizon, tree bagging, stochastic gradient boosting, and random forests produce accuracy rates of between 85% and 90% while logit models produce accuracy rates of between 55% and 60%. Stochastic gradient boosting accuracy is a few percentage points less than that of random forests for forecast horizons over 10 days. For those looking to forecast the direction of gold and silver prices, tree bagging and random forests offer an attractive combination of accuracy and ease of estimation. For each of gold and silver, a portfolio based on the random forests price direction forecasts outperformed a buy and hold portfolio.
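The abstract does not say exactly how price direction is encoded, so the sketch below assumes a common convention: the label at time t is 1 when the price `horizon` steps ahead is higher than the current price, and 0 otherwise. The function names and the price series are hypothetical.

```python
def direction_labels(prices, horizon):
    """Label each time t with 1 if prices[t + horizon] > prices[t], else 0.

    The last `horizon` observations have no label because their future
    price is unknown.
    """
    return [
        1 if prices[t + horizon] > prices[t] else 0
        for t in range(len(prices) - horizon)
    ]


def accuracy(predicted, actual):
    """Share of direction forecasts that match the realized direction."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Hypothetical price series and a 2-step forecast horizon.
prices = [10, 11, 9, 12, 13, 8]
labels = direction_labels(prices, horizon=2)
print(labels)
```

A classifier trained on features observed at time t would be scored against these labels with `accuracy`, which is the kind of hit rate the 85% to 90% figures above refer to.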
13

Zhang, Chunying, Wenjie Wang, Lu Liu, Jing Ren, and Liya Wang. "Three-Branch Random Forest Intrusion Detection Model." Mathematics 10, no. 23 (November 26, 2022): 4460. http://dx.doi.org/10.3390/math10234460.

Abstract:
Network intrusion detection has the problems of large amounts of data, numerous attributes, and different levels of importance for each attribute in detection. However, in random forests, the detection results have large deviations due to the random selection of attributes. Therefore, aiming at the current problems, considering increasing the probability of essential features being selected, a network intrusion detection model based on three-way selected random forest (IDTSRF) is proposed, which integrates three decision branches and random forest. Firstly, according to the characteristics of attributes, it is proposed to evaluate the importance of attributes by combining decision boundary entropy, and using three decision rules to divide attributes; secondly, to keep the randomness of attributes, three attribute random selection rules based on attribute randomness are established, and a certain number of attributes are randomly selected from three candidate fields according to conditions; finally, the training sample set is formed by using autonomous sampling method to select samples and combining three randomly selected attribute sets randomly, and multiple decision trees are trained to form a random forest. The experimental results show that the model has high precision and recall.
14

Pramanik, Moumita, Ratika Pradhan, Parvati Nandy, Akash Kumar Bhoi, and Paolo Barsocchi. "Machine Learning Methods with Decision Forests for Parkinson’s Detection." Applied Sciences 11, no. 2 (January 8, 2021): 581. http://dx.doi.org/10.3390/app11020581.

Abstract:
Biomedical engineers prefer decision forests over traditional decision trees to design state-of-the-art Parkinson’s Detection Systems (PDS) on massive acoustic signal data. However, the challenge that researchers face with decision forests is identifying the minimum number of decision trees required to achieve maximum detection accuracy with the lowest error rate. This article examines two recent decision forest algorithms, Systematically Developed Forest (SysFor) and Decision Forest by Penalizing Attributes (ForestPA), along with the popular Random Forest, to design three distinct Parkinson’s detection schemes with an optimum number of decision trees. The proposed approach uses the minimum number of decision trees needed to achieve maximum detection accuracy. The training and testing samples and the density of trees in the forest are kept dynamic and incremental to obtain decision forests with maximum capability for detecting Parkinson’s Disease (PD). Incremental tree densities with dynamic training and testing of decision forests proved to be a better approach for detection of PD. The proposed approaches are examined along with other state-of-the-art classifiers, including modern deep learning techniques, to observe the detection capability. The article also provides a guideline for generating an ideal training and testing split of two modern acoustic datasets of Parkinson’s and control subjects donated by the Department of Neurology in Cerrahpaşa, Istanbul and Departamento de Matemáticas, Universidad de Extremadura, Cáceres, Spain. Among the three proposed detection schemes, Forest by Penalizing Attributes (ForestPA) proved to be a promising Parkinson’s disease detector, needing only a small number of decision trees in the forest to score the highest detection accuracy of 94.12% to 95.00%.
16

Depari, Deo Haganta, Yuni Widiastiwi, and Mayanda Mega Santoni. "Perbandingan Model Decision Tree, Naive Bayes dan Random Forest untuk Prediksi Klasifikasi Penyakit Jantung." Informatik : Jurnal Ilmu Komputer 18, no. 3 (December 28, 2022): 239. http://dx.doi.org/10.52958/iftk.v18i3.4694.

Abstract:
The heart, a hollow muscular organ that pumps blood through the blood vessels by repeated rhythmic contractions, is one of the human organs that plays a role in the circulatory system. As one of the most important organs in the body, the heart carries a risk of death if an abnormality occurs in it. Heart problems are divided into two types: heart disease and heart attack. According to WHO data, as many as 7.3 million people worldwide die from heart disease. This study uses the heart disease patient dataset “Personal Key Indicators of Heart Disease” and applies the Decision Tree, Naive Bayes, and Random Forest classification algorithms. The aims of this study are to determine how to process and analyze the data, how to apply the Decision Tree, Naive Bayes, and Random Forest methods to heart disease classification, what accuracy these methods achieve, how they compare with one another, and which method is best for heart disease classification. The result of this study is a performance evaluation of the Decision Tree, Naive Bayes, and Random Forest classification methods, in which the accuracy of Decision Tree is 0.71%, Naive Bayes 0.72%, and Random Forest 0.75%.
17

Hoang, Van Dung, My Ha Le, Hyun-Deok Kang, and Kang-Hyun Jo. "Local descriptors based random forests for human detection." Science and Technology Development Journal 18, no. 3 (August 30, 2015): 199–207. http://dx.doi.org/10.32508/stdj.v18i3.902.

Abstract:
This paper presents a framework based on random forests that uses local feature descriptors to detect humans with a dynamic camera. The contribution addresses two issues in human detection against a variety of backgrounds. First, it presents local feature descriptors based on multi-scale Histograms of Oriented Gradients (HOG) to improve the accuracy of the system. Multi-scale HOG descriptors provide an extensive feature space from which highly discriminative features can be obtained. Second, a detection system using a cascade of Random Forests (RF) is used for training and prediction. Here, the decision forest is based on optimizing the set of parameters for each binary decision using the linear support vector machine (SVM) technique. Finally, the cascade-based detection system is presented as a way to reduce computational cost.
18

Doubleday, Kevin, Jin Zhou, Hua Zhou, and Haoda Fu. "Risk controlled decision trees and random forests for precision Medicine." Statistics in Medicine 41, no. 4 (November 16, 2021): 719–35. http://dx.doi.org/10.1002/sim.9253.

19

Noroozi, Fatemeh, Tomasz Sapiński, Dorota Kamińska, and Gholamreza Anbarjafari. "Vocal-based emotion recognition using random forests and decision tree." International Journal of Speech Technology 20, no. 2 (February 9, 2017): 239–46. http://dx.doi.org/10.1007/s10772-017-9396-2.

20

Xu, Ning, Jiangping Wang, Guojun Qi, Thomas Huang, and Weiyao Lin. "Ontological Random Forests for Image Classification." International Journal of Information Retrieval Research 5, no. 3 (July 2015): 61–74. http://dx.doi.org/10.4018/ijirr.2015070104.

Abstract:
Previous image classification approaches mostly neglect semantics, which has two major limitations. First, categories are simply treated independently while in fact they have semantic overlaps. For example, “sedan” is a specific kind of “car”. Therefore, it's unreasonable to train a classifier to distinguish between “sedan” and “car”. Second, image feature representations used for classifying different categories are the same. However, the human perception system is believed to use different features for different objects. In this paper, we leverage semantic ontologies to solve the aforementioned problems. The authors propose an ontological random forest algorithm where the splitting of decision trees are determined by semantic relations among categories. Then hierarchical features are automatically learned by multiple-instance learning to capture visual dissimilarities at different concept levels. Their approach is tested on two image classification datasets. Experimental results demonstrate that their approach not only outperforms state-of-the-art results but also identifies semantic visual features.
21

Polaka, Inese, Igor Tom, and Arkady Borisov. "Decision Tree Classifiers in Bioinformatics." Scientific Journal of Riga Technical University. Computer Sciences 42, no. 1 (January 1, 2010): 118–23. http://dx.doi.org/10.2478/v10143-010-0052-4.

Abstract:
This paper presents a literature review of articles, published in the last ten years, on the use of decision tree classifiers in gene microarray data analysis. The main focus is on research addressing the cancer classification problem using single decision tree classifiers (the C4.5 and CART algorithms) and decision tree forests (e.g., random forests), showing the strengths and weaknesses of the proposed methodologies compared to other popular classification methods. The article also touches on the use of decision tree classifiers in gene selection.
22

Duroux, Roxane, and Erwan Scornet. "Impact of subsampling and tree depth on random forests." ESAIM: Probability and Statistics 22 (2018): 96–128. http://dx.doi.org/10.1051/ps/2018008.

Abstract:
Random forests are ensemble learning methods introduced by Breiman [Mach. Learn. 45 (2001) 5–32] that operate by averaging several decision trees built on a randomly selected subspace of the data set. Despite their widespread use in practice, the respective roles of the different mechanisms at work in Breiman’s forests are not yet fully understood, neither is the tuning of the corresponding parameters. In this paper, we study the influence of two parameters, namely the subsampling rate and the tree depth, on Breiman’s forests performance. More precisely, we prove that quantile forests (a specific type of random forests) based on subsampling and quantile forests whose tree construction is terminated early have similar performances, as long as their respective parameters (subsampling rate and tree depth) are well chosen. Moreover, experiments show that a proper tuning of these parameters leads in most cases to an improvement of Breiman’s original forests in terms of mean squared error.
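To make the two tuning parameters concrete, the following toy sketch builds a one-dimensional regression forest in which each tree is grown on a subsample drawn without replacement and stops at a fixed depth. It is an illustration under stated assumptions, not the paper's construction: the median split rule, all function names, and the step-function data are chosen here for simplicity.

```python
import random
from statistics import mean, median


def build_tree(points, depth):
    """Grow a tree on (x, y) pairs, splitting at the median x until depth runs out."""
    ys = [y for _, y in points]
    if depth == 0 or len(points) < 2:
        return ("leaf", mean(ys))
    cut = median([x for x, _ in points])
    left = [p for p in points if p[0] <= cut]
    right = [p for p in points if p[0] > cut]
    if not left or not right:  # all x values identical, so no split is possible
        return ("leaf", mean(ys))
    return ("node", cut, build_tree(left, depth - 1), build_tree(right, depth - 1))


def predict_tree(tree, x):
    """Follow splits down to a leaf and return its mean response."""
    while tree[0] == "node":
        _, cut, left, right = tree
        tree = left if x <= cut else right
    return tree[1]


def build_forest(points, n_trees, subsample_rate, depth, seed=0):
    """Each tree sees a random subsample (without replacement) of the data."""
    rng = random.Random(seed)
    k = max(1, int(subsample_rate * len(points)))
    return [build_tree(rng.sample(points, k), depth) for _ in range(n_trees)]


def predict_forest(forest, x):
    """Average the tree predictions."""
    return mean([predict_tree(tree, x) for tree in forest])


# Hypothetical data: a step function on x = 0..19.
data = [(x, 0.0 if x < 10 else 1.0) for x in range(20)]
forest = build_forest(data, n_trees=25, subsample_rate=0.5, depth=2)
print(predict_forest(forest, 2), predict_forest(forest, 17))
```

Lowering `subsample_rate` and lowering `depth` both leave more observations per leaf, which gives some intuition for why the abstract treats the two parameters as playing similar roles.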
23

Yuan, Lina, Huajun Chen, and Jing Gong. "Classifications Based Decision Tree and Random Forests for Fanjing Mountains’ Tea." IOP Conference Series: Materials Science and Engineering 394 (August 7, 2018): 052002. http://dx.doi.org/10.1088/1757-899x/394/5/052002.

24

Doubleday, Kevin, Hua Zhou, Haoda Fu, and Jin Zhou. "An Algorithm for Generating Individualized Treatment Decision Trees and Random Forests." Journal of Computational and Graphical Statistics 27, no. 4 (June 14, 2018): 849–60. http://dx.doi.org/10.1080/10618600.2018.1451337.

25

Groll, Andreas, Cristophe Ley, Gunther Schauberger, and Hans Van Eetvelde. "A hybrid random forest to predict soccer matches in international tournaments." Journal of Quantitative Analysis in Sports 15, no. 4 (October 25, 2019): 271–87. http://dx.doi.org/10.1515/jqas-2018-0060.

Abstract:
In this work, we propose a new hybrid modeling approach for the scores of international soccer matches which combines random forests with Poisson ranking methods. While the random forest is based on the competing teams’ covariate information, the latter method estimates ability parameters on historical match data that adequately reflect the current strength of the teams. We compare the new hybrid random forest model to its separate building blocks as well as to conventional Poisson regression models with regard to their predictive performance on all matches from the four FIFA World Cups 2002–2014. It turns out that by combining the random forest with the team ability parameters from the ranking methods as an additional covariate the predictive power can be improved substantially. Finally, the hybrid random forest is used (in advance of the tournament) to predict the FIFA World Cup 2018. To complete our analysis on the previous World Cup data, the corresponding 64 matches serve as an independent validation data set and we are able to confirm the compelling predictive potential of the hybrid random forest which clearly outperforms all other methods including the betting odds.
26

Siders, ZA, ND Ducharme-Barth, F. Carvalho, D. Kobayashi, S. Martin, J. Raynor, TT Jones, and RNM Ahrens. "Ensemble Random Forests as a tool for modeling rare occurrences." Endangered Species Research 43 (October 8, 2020): 183–97. http://dx.doi.org/10.3354/esr01060.

Abstract:
Relative to target species, priority conservation species occur rarely in fishery interactions, resulting in imbalanced, overdispersed data. We present Ensemble Random Forests (ERFs) as an intuitive extension of the Random Forest algorithm to handle rare event bias. Each Random Forest receives individual stratified randomly sampled training/test sets, then down-samples the majority class for each decision tree. Results are averaged across Random Forests to generate an ensemble prediction. Through simulation, we show that ERFs outperform Random Forest with and without down-sampling, as well as with the synthetic minority over-sampling technique, for highly class imbalanced to balanced datasets. Spatial covariance greatly impacts ERFs’ perceived performance, as shown through simulation and case studies. In case studies from the Hawaii deep-set longline fishery, giant manta ray Mobula birostris syn. Manta birostris and scalloped hammerhead Sphyrna lewini presence had high spatial covariance and high model test performance, while false killer whale Pseudorca crassidens had low spatial covariance and low model test performance. Overall, we find ERFs have 4 advantages: (1) reduced successive partitioning effects; (2) prediction uncertainty propagation; (3) better accounting for interacting covariates through balancing; and (4) minimization of false positives, as the majority of Random Forests within the ensemble vote correctly. As ERFs can readily mitigate rare event bias without requiring large presence sample sizes or imparting considerable balancing bias, they are likely to be a valuable tool in bycatch and species distribution modeling, as well as spatial conservation planning, especially for protected species where presence can be rare.
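Two of the steps described above, down-sampling the majority class and averaging predictions across forests, can be sketched as follows. This is a minimal illustration and not the ERF implementation: the function names, the choice to match the minority-class count exactly, and the sample numbers are assumptions.

```python
import random


def downsample_majority(labels, rng):
    """Return sorted indices keeping every minority case and an equally
    sized random draw (without replacement) of majority cases."""
    present = [i for i, y in enumerate(labels) if y == 1]
    absent = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (
        (present, absent) if len(present) <= len(absent) else (absent, present)
    )
    return sorted(minority + rng.sample(majority, len(minority)))


def ensemble_average(per_forest_probs):
    """Average predicted presence probabilities across forests, site by site."""
    return [sum(column) / len(column) for column in zip(*per_forest_probs)]


# Hypothetical rare-event data: 2 presences among 20 observations.
labels = [0] * 18 + [1] * 2
kept = downsample_majority(labels, random.Random(0))
print(kept)

# Hypothetical predictions from two forests for two sites.
averaged = ensemble_average([[0.2, 0.8], [0.4, 0.6]])
print(averaged)
```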
27

Rajure, Pranita. "Prediction of Domestic Airline Tickets using Machine Learning." International Journal for Research in Applied Science and Engineering Technology 9, no. VI (June 14, 2021): 666–74. http://dx.doi.org/10.22214/ijraset.2021.35053.

Abstract:
Airlines usually keep their pricing strategies as commercial secrets, and information is always asymmetric, so it is difficult for ordinary customers to estimate future flight price changes. However, a reasonable prediction can help customers decide when to buy air tickets at a lower price. Flight price prediction can be regarded as a typical time series prediction problem. When you give customers a device that can help them save some money, they will pay you back with loyalty, which is priceless. Interesting fact: Fareboom users started spending twice as much time per session within a month of the release of an airfare price forecasting feature. By considering features such as departure time, the number of days left until departure, and time of day, the system suggests the best time to buy the ticket. Features are extracted from the collected data to apply a Random Forest machine learning (ML) model. Using this information, we intend to build a system that can help buyers decide whether to buy a ticket or not. We have used the Random Forest algorithm, a popular supervised machine learning algorithm that can be used for both classification and regression problems. It is based on the concept of ensemble learning, which is the process of combining multiple classifiers to solve a complex problem and to improve the performance of the model. Random forests are a strong modelling technique and much more robust than a single decision tree. They aggregate many decision trees to limit overfitting as well as error due to bias and therefore yield useful results. Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them.
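The closing statement about strength and correlation echoes the upper bound on generalization error in Breiman's 2001 paper on random forests. It is stated here as background, with $PE^{*}$ the generalization error of the forest, $s$ the strength of the individual trees, and $\bar{\rho}$ the mean correlation between them; it is not a result of this article.

```latex
PE^{*} \;\le\; \frac{\bar{\rho}\,\bigl(1 - s^{2}\bigr)}{s^{2}}
```

The bound tightens as the trees become individually stronger (larger $s$) or less correlated with one another (smaller $\bar{\rho}$).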
28

Purwanto, Anang Dwi, Ketut Wikantika, Albertus Deliar, and Soni Darmawan. "Decision Tree and Random Forest Classification Algorithms for Mangrove Forest Mapping in Sembilang National Park, Indonesia." Remote Sensing 15, no. 1 (December 21, 2022): 16. http://dx.doi.org/10.3390/rs15010016.

Abstract:
Sembilang National Park, one of the best and largest mangrove areas in Indonesia, is very vulnerable to disturbance by community activities. Changes in the dynamic condition of mangrove forests in Sembilang National Park must be quickly and easily accompanied by mangrove monitoring efforts. One way to monitor mangrove forests is to use remote sensing technology. Recently, machine-learning classification techniques have been widely used to classify mangrove forests. This study aims to investigate the ability of decision tree (DT) and random forest (RF) machine-learning algorithms to determine the mangrove forest distribution in Sembilang National Park. The satellite data used are Landsat-7 ETM+ acquired on 30 June 2002 and Landsat-8 OLI acquired on 9 September 2019, as well as supporting data such as SPOT 6/7 image acquired in 2020–2021, MERIT DEM and an existing mangrove map. The pre-processing includes radiometric and atmospheric corrections performed using the semi-automatic classification plugin contained in Quantum GIS. We applied decision tree and random forest algorithms to classify the mangrove forest. In the DT algorithm, threshold analysis is carried out to obtain the most optimal threshold value in distinguishing mangrove and non-mangrove objects. Here, the use of DT and RF algorithms involves several important parameters, namely, the normalized difference moisture index (NDMI), normalized difference soil index (NDSI), near-infrared (NIR) band, and digital elevation model (DEM) data. The results of DT and RF classification from Landsat-7 ETM+ and Landsat-8 OLI images show similarities regarding mangrove spatial distribution. The DT classification algorithm with the parameter combination NDMI+NDSI+DEM is very effective in classifying Landsat-7 ETM+ image, while the parameter combination NDMI+NIR is very effective in classifying Landsat-8 OLI image. 
The RF classification algorithm with the parameter Image (6 bands), the number of trees = 100, the number of predictor variables (mtry) set to the square root, and the minimum node size = 6 provides the highest overall accuracy for the Landsat-7 ETM+ image, while combining Image (7 bands) + NDMI+NDSI+DEM parameters with the number of trees = 100, mtry = all variables, and the minimum node size = 6 provides the highest overall accuracy for the Landsat-8 OLI image. The overall classification accuracy is higher when using the RF algorithm (99.12%) instead of DT (92.82%) for the Landsat-7 ETM+ image, but it is slightly higher when using the DT algorithm (98.34%) instead of the RF algorithm (97.79%) for the Landsat-8 OLI image. Overall, the RF classification algorithm outperforms DT because all RF classification model parameters provide a higher producer accuracy in mapping mangrove forests. This development of the classification method should support the monitoring and rehabilitation programs of mangroves more quickly and easily, particularly in Indonesia.
29

Zhang, J., S. Huang, E. H. Hogg, V. Lieffers, Y. Qin, and F. He. "Estimating spatial variation in Alberta forest biomass from a combination of forest inventory and remote sensing data." Biogeosciences 11, no. 10 (May 27, 2014): 2793–808. http://dx.doi.org/10.5194/bg-11-2793-2014.

Abstract:
Uncertainties in the estimation of tree biomass carbon storage across large areas pose challenges for the study of forest carbon cycling at regional and global scales. In this study, we attempted to estimate the present aboveground biomass (AGB) in Alberta, Canada, by taking advantage of a spatially explicit data set derived from a combination of forest inventory data from 1968 plots and spaceborne light detection and ranging (lidar) canopy height data. Ten climatic variables, together with elevation, were used for model development and assessment. Four approaches, including spatial interpolation, non-spatial and spatial regression models, and decision-tree-based modeling with random forests algorithm (a machine-learning technique), were compared to find the "best" estimates. We found that the random forests approach provided the best accuracy for biomass estimates. Non-spatial and spatial regression models gave estimates similar to random forests, while spatial interpolation greatly overestimated the biomass storage. Using random forests, the total AGB stock in Alberta forests was estimated to be 2.26 × 10⁹ Mg (megagram), with an average AGB density of 56.30 ± 35.94 Mg ha⁻¹. At the species level, three major tree species, lodgepole pine, trembling aspen and white spruce, stocked about 1.39 × 10⁹ Mg biomass, accounting for nearly 62% of total estimated AGB. Spatial distribution of biomass varied with natural regions, land cover types, and species. Furthermore, the relative importance of predictor variables on determining biomass distribution varied with species. This study showed that the combination of ground-based inventory data, spaceborne lidar data, land cover classification, and climatic and environmental variables was an efficient way to estimate the quantity, distribution and variation of forest biomass carbon stocks across large regions.
30

Zhang, J., S. Huang, E. H. Hogg, V. Lieffers, Y. Qin, and F. He. "Estimating spatial variation in Alberta forest biomass from a combination of forest inventory and remote sensing data." Biogeosciences Discussions 10, no. 12 (December 4, 2013): 19005–44. http://dx.doi.org/10.5194/bgd-10-19005-2013.

Abstract:
Uncertainties in the estimation of tree biomass carbon storage across large areas pose challenges for the study of forest carbon cycling at regional and global scales. In this study, we attempted to estimate the present biomass carbon storage in Alberta, Canada, by taking advantage of a spatially explicit dataset derived from a combination of forest inventory data from 1968 plots and spaceborne light detection and ranging (LiDAR) canopy height data. Ten climatic variables, together with elevation, were used for model development and assessment. Four approaches, including spatial interpolation, non-spatial and spatial regression models, and decision-tree based modelling with random forests algorithm (a machine-learning technique), were compared to find the "best" estimates. We found that the random forests approach provided the best accuracy for biomass estimates. Non-spatial and spatial regression models gave estimates similar to random forests, while spatial interpolation greatly overestimated the biomass storage. Using random forests, the total biomass stock in Alberta forests was estimated to be 3.11 × 10⁹ Mg, with the average biomass density of 77.59 Mg ha⁻¹. At the species level, three major tree species, lodgepole pine, trembling aspen and white spruce, stocked about 1.91 × 10⁹ Mg biomass, accounting for 61% of total estimated biomass. Spatial distribution of biomass varied with natural regions, land cover types, and species. The relative importance of predictor variables on determining biomass distribution also varied with species. This study showed that the combination of ground-based inventory data, spaceborne LiDAR data, land cover classification, climatic and environmental variables was an efficient way to estimate the quantity, distribution and variation of forest biomass carbon stocks across large regions.
31

Zhou, Xiao, Fengying Guan, Shaohui Fan, Zixu Yin, Xuan Zhang, Chengji Li, and Yang Zhou. "Modeling Degraded Bamboo Shoots in Southeast China." Forests 13, no. 9 (September 14, 2022): 1482. http://dx.doi.org/10.3390/f13091482.

Abstract:
Degraded bamboo shoots (DBS) constitute an important variable in the carbon fixation of bamboo forests. DBS are useful for informed decision making in bamboo forests. Despite their importance, studies on DBS are limited. In this study, we aimed to develop models to describe DBS variations. By using DBS data from 64 plots of Yixing forest farm in Jiangsu Province, China, a mixed-effects model was constructed, including block-level random effects. We evaluated the potential impact of several variables on DBS. The number of bamboo shoots (NBS), mean height to crown base (MHCB), hydrolytic nitrogen (HN), and available potassium (AK) significantly contributed to the model. By introducing the block-level random effect in the logistic model, the fitting statistics were significantly improved. The model showed that there were increased DBS in bamboo stands with decreased MHCB and AK, whereas DBS decreased with decreasing NBS and HN. The application of K fertilizer reduced the number of DBS during the emergence stage. By adjusting these factors, the number of DBS in bamboo forests can be reduced, which provides a theoretical basis for increasing the biomass of bamboo forests. It can also provide an important basis for studying the carbon sink characteristics of bamboo forests and help to formulate more effective bamboo forest management plans.
32

Ranzato, Francesco, and Marco Zanella. "Abstract Interpretation of Decision Tree Ensemble Classifiers." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 04 (April 3, 2020): 5478–86. http://dx.doi.org/10.1609/aaai.v34i04.5998.

Abstract:
We study the problem of formally and automatically verifying robustness properties of decision tree ensemble classifiers such as random forests and gradient boosted decision tree models. A recent stream of works showed how abstract interpretation, which is ubiquitously used in static program analysis, can be successfully deployed to formally verify (deep) neural networks. In this work we push forward this line of research by designing a general and principled abstract interpretation-based framework for the formal verification of robustness and stability properties of decision tree ensemble models. Our abstract interpretation-based method may induce complete robustness checks of standard adversarial perturbations and output concrete adversarial attacks. We implemented our abstract verification technique in a tool called silva, which leverages an abstract domain of not necessarily closed real hyperrectangles and is instantiated to verify random forests and gradient boosted decision trees. Our experimental evaluation on the MNIST dataset shows that silva provides a precise and efficient tool which advances the current state of the art in tree ensembles verification.
33

Cichosz, Paweł, and Łukasz Pawełczak. "Imitation learning of car driving skills with decision trees and random forests." International Journal of Applied Mathematics and Computer Science 24, no. 3 (September 1, 2014): 579–97. http://dx.doi.org/10.2478/amcs-2014-0042.

Abstract:
Machine learning is an appealing and useful approach to creating vehicle control algorithms, both for simulated and real vehicles. One common learning scenario that is often possible to apply is learning by imitation, in which the behavior of an exemplary driver provides training instances for a supervised learning algorithm. This article follows this approach in the domain of simulated car racing, using the TORCS simulator. In contrast to most prior work on imitation learning, a symbolic decision tree knowledge representation is adopted, which combines potentially high accuracy with human readability, an advantage that can be important in many applications. Decision trees are demonstrated to be capable of representing high quality control models, reaching the performance level of sophisticated pre-designed algorithms. This is achieved by enhancing the basic imitation learning scenario to include active retraining, automatically triggered on control failures. It is also demonstrated how better stability and generalization can be achieved by sacrificing human-readability and using decision tree model ensembles. The methodology for learning control models contributed by this article can hopefully be applied to solve real-world control tasks, as well as to develop video game bots.
34

Haidar, Aissa, Tarik Ahajjam, Imad Zeroual, and Yousef Farhaoui. "Application of machine learning algorithms for predicting outcomes of accident cases in Moroccan courts." Indonesian Journal of Electrical Engineering and Computer Science 26, no. 2 (May 1, 2022): 1103. http://dx.doi.org/10.11591/ijeecs.v26.i2.pp1103-1108.

Abstract:
Due to the large number of legal cases, their processing by the courts is generally very slow. Among these cases are accident cases, which require speedy judgment to compensate the victims of those accidents. To this end, we explored the possibilities offered by machine learning to simulate the work of judges and help shorten the time to a decision. We applied different machine learning algorithms, such as linear regression, decision trees, and random forests. According to the results achieved, the random forest is the best-performing model, with an accuracy of about 91.05%.
35

Dutschmann, Thomas-Martin, and Knut Baumann. "Evaluating High-Variance Leaves as Uncertainty Measure for Random Forest Regression." Molecules 26, no. 21 (October 28, 2021): 6514. http://dx.doi.org/10.3390/molecules26216514.

Abstract:
Uncertainty measures estimate the reliability of a predictive model. Especially in the field of molecular property prediction as part of drug design, model reliability is crucial. Besides other techniques, Random Forests have a long tradition in machine learning related to chemoinformatics and are widely used. Random Forests consist of an ensemble of individual regression models, namely, decision trees, and therefore provide an uncertainty measure already by construction. Regarding the disagreement of single-model predictions, a narrower distribution of predictions is interpreted as a higher reliability. The standard deviation of the decision tree ensemble predictions is the default uncertainty measure for Random Forests. Due to the increasing application of machine learning in drug design, there is a constant search for novel uncertainty measures that, ideally, outperform classical uncertainty criteria. When analyzing Random Forests, it appears obvious to consider the variance of the dependent variables within each terminal decision tree leaf to obtain predictive uncertainties. Hereby, predictions that arise from more leaves of high variance are considered less reliable. As expected, the number of such high-variance leaves yields a reasonable uncertainty measure. Depending on the dataset, it can also outperform ensemble uncertainties. However, small-scale comparisons, i.e., considering only a few datasets, are insufficient, since they are more prone to chance correlations. Therefore, large-scale estimations are required to make general claims about the performance of uncertainty measures. On several chemoinformatic regression datasets, high-variance leaves are compared to the standard deviation of ensemble predictions. It turns out that high-variance leaf uncertainty is meaningful, but not superior to the default ensemble standard deviation. A brief possible explanation is offered.
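The default ensemble uncertainty mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `ensemble_uncertainty`, the use of the population standard deviation, and the example values are all assumptions made for illustration.

```python
from statistics import mean, pstdev

def ensemble_uncertainty(tree_predictions):
    """Return (prediction, uncertainty) for a single sample.

    tree_predictions holds the value predicted by each tree of the forest.
    The forest prediction is their mean; their spread (population standard
    deviation, an assumption of this sketch) is the uncertainty estimate.
    """
    return mean(tree_predictions), pstdev(tree_predictions)

# Hypothetical per-tree predictions for two samples.
agreeing = [2.0, 2.0, 2.0, 2.0]      # trees agree -> low uncertainty
disagreeing = [1.0, 3.0, 1.0, 3.0]   # trees disagree -> higher uncertainty

pred_a, unc_a = ensemble_uncertainty(agreeing)
pred_d, unc_d = ensemble_uncertainty(disagreeing)
```

Both samples receive the same mean prediction, but only the second has a non-zero spread, which is what "a narrower distribution of predictions is interpreted as a higher reliability" amounts to in practice.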
36

Li, Zizhao, Shoudong Bi, Shuang Hao, and Yuhuan Cui. "Aboveground biomass estimation in forests with random forest and Monte Carlo-based uncertainty analysis." Ecological Indicators 142 (September 2022): 109246. http://dx.doi.org/10.1016/j.ecolind.2022.109246.

37

Park, Se-Rin, Suyeon Kim, and Sang-Woo Lee. "Evaluating the Relationships between Riparian Land Cover Characteristics and Biological Integrity of Streams Using Random Forest Algorithms." International Journal of Environmental Research and Public Health 18, no. 6 (March 19, 2021): 3182. http://dx.doi.org/10.3390/ijerph18063182.

Abstract:
The relationships between land cover characteristics in riparian areas and the biological integrity of rivers and streams are critical in riparian area management decision-making. This study aims to evaluate such relationships using the Trophic Diatom Index (TDI), Benthic Macroinvertebrate Index (BMI), Fish Assessment Index (FAI), and random forest regression, which can capture nonlinear and complex relationships with limited training datasets. Our results indicate that the proportions of land cover types in riparian areas, including urban, agricultural, and forested areas, have greater impacts on the biological communities in streams than those offered by land cover spatial patterns. The proportion of forests in riparian areas has the greatest influence on the biological integrity of streams. Partial dependence plots indicate that the biological integrity of streams gradually improves until the proportion of riparian forest areas reach about 60%; it rapidly decreases until riparian urban areas reach 25%, and declines significantly when the riparian agricultural area ranges from 20% to 40%. Overall, this study highlights the importance of riparian forests in the planning, restoration, and management of streams, and suggests that partial dependence plots may serve to provide insightful quantitative criteria for defining specific objectives that managers and decision-makers can use to improve stream conditions.
38

Shao, Zhenfeng, Yuan Zhang, Lei Zhang, Yang Song, and Minjun Peng. "COMBINING SPECTRAL AND TEXTURE FEATURES USING RANDOM FOREST ALGORITHM: EXTRACTING IMPERVIOUS SURFACE AREA IN WUHAN." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B7 (June 21, 2016): 351–58. http://dx.doi.org/10.5194/isprs-archives-xli-b7-351-2016.

Abstract:
Impervious surface area (ISA) is one of the most important indicators of urban environments. At present, based on multi-resolution remote sensing images, numerous approaches have been proposed to extract impervious surface, using statistical estimation, sub-pixel classification and spectral mixture analysis method of sub-pixel analysis. Through these methods, impervious surfaces can be effectively applied to regional-scale planning and management. However, for large-scale regions, high resolution remote sensing images can provide more details, and therefore they will be more conducive to analysis for environmental monitoring and urban management. Since the purpose of this study is to map impervious surfaces more effectively, three classification algorithms (random forests, decision trees, and artificial neural networks) were tested for their ability to map impervious surface. Random forests outperformed decision trees and artificial neural networks in precision. Combining the spectral indices and texture, random forests is applied to impervious surface extraction with a producer’s accuracy of 0.98, a user’s accuracy of 0.97, an overall accuracy of 0.98, and a kappa coefficient of 0.97.
39

Shao, Zhenfeng, Yuan Zhang, Lei Zhang, Yang Song, and Minjun Peng. "COMBINING SPECTRAL AND TEXTURE FEATURES USING RANDOM FOREST ALGORITHM: EXTRACTING IMPERVIOUS SURFACE AREA IN WUHAN." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B7 (June 21, 2016): 351–58. http://dx.doi.org/10.5194/isprsarchives-xli-b7-351-2016.

Abstract:
Impervious surface area (ISA) is one of the most important indicators of urban environments. At present, based on multi-resolution remote sensing images, numerous approaches have been proposed to extract impervious surface, using statistical estimation, sub-pixel classification and spectral mixture analysis method of sub-pixel analysis. Through these methods, impervious surfaces can be effectively applied to regional-scale planning and management. However, for large-scale regions, high resolution remote sensing images can provide more details, and therefore they will be more conducive to analysis for environmental monitoring and urban management. Since the purpose of this study is to map impervious surfaces more effectively, three classification algorithms (random forests, decision trees, and artificial neural networks) were tested for their ability to map impervious surface. Random forests outperformed decision trees and artificial neural networks in precision. Combining the spectral indices and texture, random forests is applied to impervious surface extraction with a producer’s accuracy of 0.98, a user’s accuracy of 0.97, an overall accuracy of 0.98, and a kappa coefficient of 0.97.
40

Wang, Peng, and Ningchao Zhang. "Decision tree classification algorithm for non-equilibrium data set based on random forests." Journal of Intelligent & Fuzzy Systems 39, no. 2 (August 31, 2020): 1639–48. http://dx.doi.org/10.3233/jifs-179937.

Abstract:
In order to overcome the problems of poor accuracy and high complexity of current classification algorithms for non-equilibrium data sets, this paper proposes a decision tree classification algorithm for non-equilibrium data sets based on random forest. Wavelet packet decomposition is used to denoise non-equilibrium data, and SNM algorithm and RFID are combined to remove redundant data from data sets. Based on the results of data processing, the non-equilibrium data sets are classified by random forest method. According to Bootstrap resampling method with certain constraints, the majority and minority samples of each sample subset are sampled, CART is used to train the data set, and a decision tree is constructed. The final classification results are obtained by voting over the CART decision tree classifications. Experimental results show that the proposed algorithm has the characteristics of high classification accuracy and low complexity, and it is a feasible classification algorithm for non-equilibrium data sets.
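The resampling and voting steps described in this abstract can be sketched roughly as follows. The abstract does not state what the "certain constraints" are, so this sketch assumes class balance (equal draws from the majority and minority classes); the function names `balanced_bootstrap` and `majority_vote` and the toy data are hypothetical, and the CART training step is omitted.

```python
import random
from collections import Counter

def balanced_bootstrap(majority, minority, rng):
    """Draw one bootstrap subset with equal numbers of both classes.

    Assumption of this sketch: the constraint is class balance. Each class
    is sampled with replacement, len(minority) items per class.
    """
    n = len(minority)
    return rng.choices(majority, k=n) + rng.choices(minority, k=n)

def majority_vote(tree_labels):
    """Combine the labels predicted by the individual trees."""
    return Counter(tree_labels).most_common(1)[0][0]

rng = random.Random(0)
majority = [("maj", i) for i in range(90)]  # toy imbalanced data set
minority = [("min", i) for i in range(10)]

subset = balanced_bootstrap(majority, minority, rng)  # training set for one tree
label = majority_vote(["min", "maj", "min"])          # votes from three trees
```

Each tree would be trained on its own balanced subset, so the minority class is not swamped during training, and the forest's answer is the most frequent label among the trees.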
41

Peterson, Seth H., Janet Franklin, Dar A. Roberts, and Jan W. van Wagtendonk. "Mapping fuels in Yosemite National Park." Canadian Journal of Forest Research 43, no. 1 (January 2013): 7–17. http://dx.doi.org/10.1139/cjfr-2012-0213.

Abstract:
Decades of fire suppression have led to unnaturally large accumulations of fuel in some forest communities in the western United States, including those found in lower and midelevation forests in Yosemite National Park in California. We employed the Random Forests decision tree algorithm to predict fuel models as well as 1-h live and 1-, 10-, and 100-h dead fuel loads using a suite of climatic, topographic, remotely sensed, and burn history predictor variables. Climate variables and elevation consistently were most useful for predicting all types of fuels, but remotely sensed variables increased the kappa accuracy metric by 5–12 percentage points in each case, demonstrating the utility of using disparate data sources in a topographically diverse region dominated by closed-canopy vegetation. Fire history information (time-since-fire) generally only increased kappa by 1 percentage point, and only for the largest fuel classes. The Random Forests models were applied to the spatial predictor layers to produce maps of fuel models and fuel loads, and these showed that fuel loads are highest in the low-elevation forests that have been most affected by fire suppression impacting the natural fire regime.
42

Sadorsky, Perry. "A Random Forests Approach to Predicting Clean Energy Stock Prices." Journal of Risk and Financial Management 14, no. 2 (January 24, 2021): 48. http://dx.doi.org/10.3390/jrfm14020048.

Abstract:
Climate change, green consumers, energy security, fossil fuel divestment, and technological innovation are powerful forces shaping an increased interest towards investing in companies that specialize in clean energy. Well informed investors need reliable methods for predicting the stock prices of clean energy companies. While the existing literature on forecasting stock prices shows how difficult it is to predict stock prices, there is evidence that predicting stock price direction is more successful than predicting actual stock prices. This paper uses the machine learning method of random forests to predict the stock price direction of clean energy exchange traded funds. Some well-known technical indicators are used as features. Decision tree bagging and random forests predictions of stock price direction are more accurate than those obtained from logit models. For a 20-day forecast horizon, tree bagging and random forests methods produce accuracy rates of between 85% and 90% while logit models produce accuracy rates of between 55% and 60%. Tree bagging and random forests are easy to understand and estimate and are useful methods for forecasting the stock price direction of clean energy stocks.
43

Espinosa Gómez, Yenny, et al. "Using Decision Tree and Random Forests to Classify Land Coverage in Tomine Reservoir." International Journal of Mechanical and Production Engineering Research and Development 10, no. 5 (2020): 21–34. http://dx.doi.org/10.24247/ijmperdoct20202.

44

Sohn, Myoung‐Kyu, Sang‐Heon Lee, Hyunduk Kim, and Hyeyoung Park. "Enhanced hand part classification from a single depth image using random decision forests." IET Computer Vision 10, no. 8 (July 2016): 861–67. http://dx.doi.org/10.1049/iet-cvi.2015.0239.

45

Jian Xue and Yunxin Zhao. "Random Forests of Phonetic Decision Trees for Acoustic Modeling in Conversational Speech Recognition." IEEE Transactions on Audio, Speech, and Language Processing 16, no. 3 (March 2008): 519–28. http://dx.doi.org/10.1109/tasl.2007.913036.

46

Beghoura, Mohamed Amine, Abdelhak Boubetra, and Abdallah Boukerram. "Green software requirements and measurement: random decision forests-based software energy consumption profiling." Requirements Engineering 22, no. 1 (July 26, 2015): 27–40. http://dx.doi.org/10.1007/s00766-015-0234-2.

47

Snodgrass, G. Matthew, André B. Rosay, and Angela R. Gover. "Modeling the Referral Decision in Sexual Assault Cases: An Application of Random Forests." American Journal of Criminal Justice 39, no. 2 (May 7, 2013): 267–91. http://dx.doi.org/10.1007/s12103-013-9210-x.

48

Minbashi, Niloofar, Markus Bohlin, Carl-William Palmqvist, and Behzad Kordnejad. "The Application of Tree-Based Algorithms on Classifying Shunting Yard Departure Status." Journal of Advanced Transportation 2021 (September 7, 2021): 1–10. http://dx.doi.org/10.1155/2021/3538462.

Abstract:
Shunting yards are one of the main areas impacting the reliability of rail freight networks, and delayed departures from shunting yards can also affect the punctuality of mixed-traffic networks. Methods for automatic detection of departures that are likely to be delayed can therefore contribute towards increasing the reliability and punctuality of both freight and passenger services. In this paper, we compare the performance of tree-based methods (decision trees and random forests), which have been highly successful in a wide range of generic applications, in classifying the status (delayed, early, and on-time) of departing trains from shunting yards, focusing on the delayed departures as the minority class. We use a total of 6,243 train connections (representing over 21,000 individual wagon connections) for a one-month period from the Hallsberg yard in Sweden, which is the largest shunting yard in Scandinavia. Considering our dataset, our results show a slight difference between the application of decision trees and random forests in detecting delayed departures as the minority class. To remedy this, enhanced sampling for minority classes is applied by the synthetic minority oversampling technique (SMOTE) to improve detecting and assigning delayed departures. Applying SMOTE improved the sensitivity, precision, and F-measure of delayed departures by 20% for decision trees and by 30% for random forests. Overall, random forests show a relatively better performance in detecting all three departure classes before and after applying SMOTE. Although the preliminary results presented in this paper are encouraging, future studies are needed to investigate the computational performance of tree-based algorithms using larger datasets and considering additional predictors.
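The core idea of SMOTE, as named in this abstract, is to create synthetic minority samples by interpolating between an existing minority sample and one of its minority-class neighbours. The sketch below shows only that interpolation step; the function name `smote_sample` and the feature values are hypothetical, and the nearest-neighbour search used by the full technique is left out.

```python
import random

def smote_sample(x, neighbor, rng):
    """Create one synthetic minority sample on the segment x -> neighbor."""
    u = rng.random()  # interpolation weight in [0, 1)
    return [a + u * (b - a) for a, b in zip(x, neighbor)]

rng = random.Random(42)
x = [1.0, 10.0]         # hypothetical delayed-departure feature vector
neighbor = [3.0, 20.0]  # one of its minority-class neighbours
synthetic = smote_sample(x, neighbor, rng)
```

Because every synthetic point lies between two real minority samples, the minority class is enlarged without simply duplicating existing rows.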
49

Rakhee, Rakhee, Archana Singh, Mamta Mittal, and Amrender Kumar. "Qualitative analysis of random forests for evaporation prediction in Indian Regions." Indian Journal of Agricultural Sciences 90, no. 6 (September 14, 2020): 1140–44. http://dx.doi.org/10.56093/ijas.v90i6.104786.

Abstract:
The performance of logistic regression, discriminant analysis, and random forest has been compared for the prediction of evaporation in different regions of India during 2019 at ICAR-IARI, New Delhi. The present experiment was performed for Raipur (Chhattisgarh), Karnal (Haryana), Pattambi (Kerala) and Anantpur (Andhra Pradesh). Evaporation and other weather parameters were collected for the years 1985-2012, 1973-2005, 1991-2005 and 1958-2010, respectively. The performance of the techniques is compared using classification, misclassification, and sensitivity of the model along with the Receiver Operating Characteristics (ROC) curve and Area Under Curve (AUC) value. Two sets of independent variables are used. In the first set, maximum and minimum temperature, morning and evening relative humidity, wind speed, rainfall, and bright sunshine hours are used. In the second set, mean temperature, mean relative humidity, bright sunshine hours, and wind speed are used to see the effect on evaporation. It is found that higher accuracy is obtained using the second set as predictors. The model validation accuracy is checked by running the developed model on out-of-sample data, i.e. testing data (last three years). The study demonstrates that the random forest approach predicts evaporation much better than logistic regression and discriminant analysis. The random forest model can provide timely information for decision-makers to make crucial decisions affected by evaporation conditions in India.
50

Benáček, Patrik, Aleš Farda, and Petr Štěpánek. "Postprocessing of Ensemble Weather Forecast Using Decision Tree–Based Probabilistic Forecasting Methods." Weather and Forecasting 38, no. 1 (January 2023): 69–82. http://dx.doi.org/10.1175/waf-d-22-0006.1.

Abstract:
Producing an accurate and calibrated probabilistic forecast has high social and economic value. Systematic errors or biases in the ensemble weather forecast can be corrected by postprocessing models whose development is an urgent challenge. Traditionally, the bias correction is done by employing linear regression models that estimate the conditional probability distribution of the forecast. Although this model framework works well, it is restricted to a prespecified model form that often relies on a limited set of predictors only. Most machine learning (ML) methods can tackle these problems with a point prediction, but only a few of them can be applied effectively in a probabilistic manner. The tree-based ML techniques, namely, natural gradient boosting (NGB), quantile random forests (QRF), and distributional regression forests (DRF), are used to adjust hourly 2-m temperature ensemble prediction at lead times of 1–10 days. The ensemble model output statistics (EMOS) and its boosting version are used as benchmark models. The model forecast is based on the European Centre for Medium-Range Weather Forecasts (ECMWF) for the Czech Republic domain. Two training periods, 2015–18 and 2018 only, were used to train the models, and their prediction skill was evaluated in 2019. The results show that the QRF and NGB methods provide the best performance for 1–2-day forecasts, while the EMOS method outperforms other methods for 8–10-day forecasts. Key components to improving short-term forecasting are additional atmospheric/surface state predictors and the 4-yr training sample size. Significance Statement: Machine learning methods have great potential and have begun to be widely applied in meteorology in recent years. A new technique called natural gradient boosting (NGB) has been released and used in this paper to refine the probabilistic forecast of surface temperature. It was found that the NGB has better prediction skills than the traditional ensemble model output statistics in forecasting 1 and 2 days in advance. The NGB has similar prediction skills with lower computational demands compared to other advanced machine learning methods such as the quantile random forests. We showed a path to employ the NGB method in this task, which can be followed for refining other and more challenging meteorological variables such as wind speed or precipitation.