Academic literature on the topic 'XGBOOST PREDICTION MODEL'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'XGBOOST PREDICTION MODEL.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "XGBOOST PREDICTION MODEL"

1

Zhao, Haolei, Yixian Wang, Xian Li, Panpan Guo, and Hang Lin. "Prediction of Maximum Tunnel Uplift Caused by Overlying Excavation Using XGBoost Algorithm with Bayesian Optimization." Applied Sciences 13, no. 17 (August 28, 2023): 9726. http://dx.doi.org/10.3390/app13179726.

Full text
Abstract:
The uplifting behaviors of existing tunnels due to overlying excavations are complex and non-linear. Multiple factors contribute to them, making them difficult to predict accurately. To address this issue, an extreme gradient boosting (XGBoost) prediction model based on Bayesian optimization (BO), namely BO-XGBoost, was developed specifically for assessing tunnel uplift. The model incorporated factors such as engineering design, soil types, and site construction conditions as input parameters. The performance of the BO-XGBoost model was compared with other models such as support vector machines (SVMs), the classification and regression tree (CART) model, and the standard XGBoost model. In preparation for the model, 170 datasets from a construction site were collected and divided into 70% for training and 30% for testing. The BO-XGBoost model demonstrated a superior predictive performance, providing the most accurate displacement predictions and exhibiting better generalization capabilities. Further analysis revealed that the accuracy of the BO-XGBoost model was primarily influenced by the site's construction factors. The interpretability of the BO-XGBoost model will provide valuable guidance for geotechnical practitioners in their decision-making processes.
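Many of the entries below rest on the same core mechanism: gradient boosting fits an additive ensemble of small trees, each round correcting the residuals of the rounds before it. A minimal pure-Python sketch of that idea, using depth-1 stumps and squared error (the names `fit_stump` and `boost` are ours, not from any cited paper; real XGBoost adds regularization, second-order gradients, and many engineering refinements):

```python
# Illustrative sketch of additive boosting with depth-1 regression stumps.
# Not the XGBoost library -- just the residual-fitting idea behind it.

def fit_stump(x, residuals):
    """Least-squares best single-threshold split on a 1-D feature."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda v: lmean if v <= t else rmean

def boost(x, y, n_rounds=20, lr=0.5):
    """Additive model: base prediction plus learning-rate-weighted stumps,
    each fitted to the residuals left by the previous rounds."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + lr * stump(xi) for pi, xi in zip(pred, x)]
    return lambda v: base + lr * sum(s(v) for s in stumps)
```

On a simple step-shaped target, the residuals shrink geometrically with each round, which is the behavior the cited papers exploit at scale.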
APA, Harvard, Vancouver, ISO, and other styles
2

Gu, Xinqin, Li Yao, and Lifeng Wu. "Prediction of Water Carbon Fluxes and Emission Causes in Rice Paddies Using Two Tree-Based Ensemble Algorithms." Sustainability 15, no. 16 (August 13, 2023): 12333. http://dx.doi.org/10.3390/su151612333.

Abstract:
Quantification of water carbon fluxes in rice paddies and analysis of their causes are essential for agricultural water management and carbon budgets. In this regard, two tree-based machine learning models, extreme gradient boosting (XGBoost) and random forest (RF), were constructed to predict evapotranspiration (ET), net ecosystem carbon exchange (NEE), and methane flux (FCH4) at seven rice paddy sites. During the training process, the k-fold cross-validation algorithm was applied, splitting the available data into multiple subsets or folds to avoid overfitting, and the XGBoost model was used to assess the importance of input factors. When predicting ET, the XGBoost model outperformed the RF model at all sites. Solar radiation was the most important input to ET predictions. For NEE, the XGBoost models also performed better at the six sites other than KR-CRK, and the root mean square error decreased by 0.90–11.21% compared to the RF models. Across all sites (except the JP-Mse site, where net radiation (NETRAD) data were absent), NETRAD and the normalized difference vegetation index (NDVI) performed well for predicting NEE. Air temperature, soil water content (SWC), and longwave radiation were particularly important at individual sites. Similarly, the XGBoost model was more capable of predicting FCH4 than the RF model, except at the IT-Cas site. FCH4 sensitivity to input factors varied from site to site. SWC, ecosystem respiration, NDVI, and soil temperature were important for FCH4 prediction. It is proposed to use the XGBoost model to model water carbon fluxes in rice paddies.
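Several abstracts in this list rely on k-fold cross-validation. The underlying index split can be sketched in a few lines, assuming a simple shuffled partition (the function name is ours):

```python
import random

def kfold_indices(n, k, seed=0):
    """Yield (train, validation) index lists: the data indices are shuffled,
    split into k folds, and each fold serves once as the validation set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val
```

Each observation appears in exactly one validation fold, so every data point is used for both training and validation across the k rounds.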
3

Liu, Jialin, Jinfa Wu, Siru Liu, Mengdie Li, Kunchang Hu, and Ke Li. "Predicting mortality of patients with acute kidney injury in the ICU using XGBoost model." PLOS ONE 16, no. 2 (February 4, 2021): e0246306. http://dx.doi.org/10.1371/journal.pone.0246306.

Abstract:
Purpose The goal of this study is to construct a mortality prediction model using the XGBoost (eXtreme Gradient Boosting) decision tree model for AKI (acute kidney injury) patients in the ICU (intensive care unit), and to compare its performance with that of three other machine learning models. Methods We used the eICU Collaborative Research Database (eICU-CRD) for model development and performance comparison. The prediction performance of the XGBoost model was compared with three other machine learning models: LR (logistic regression), SVM (support vector machines), and RF (random forest). In the model comparison, the AUROC (area under the receiver operating curve), accuracy, precision, recall, and F1 score were used to evaluate the predictive performance of each model. Results A total of 7548 AKI patients were analyzed in this study. The overall in-hospital mortality of AKI patients was 16.35%. The best performing algorithm in this study was XGBoost, with the highest AUROC (0.796, p < 0.01), F1 (0.922, p < 0.01) and accuracy (0.860). The precision (0.860) and recall (0.994) of the XGBoost model rank second among the four models. Conclusion The XGBoost model had clear performance advantages over the other machine learning models. This will be helpful for risk identification and early intervention for AKI patients at risk of death.
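The evaluation metrics named above (precision, recall, F1, accuracy, and AUROC) can all be computed from first principles. A sketch, with AUROC taken in its rank interpretation, the probability that a randomly chosen positive outscores a randomly chosen negative (function names are ours):

```python
def confusion_metrics(y_true, y_pred):
    """Precision, recall, F1, and accuracy from binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, f1, accuracy

def auroc(y_true, scores):
    """AUROC as the probability that a random positive outscores a
    random negative (ties count as half a win)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

The pairwise-comparison form of AUROC is quadratic in the number of samples; production code sorts by score instead, but the value computed is the same.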
4

Wang, Jun, Wei Rong, Zhuo Zhang, and Dong Mei. "Credit Debt Default Risk Assessment Based on the XGBoost Algorithm: An Empirical Study from China." Wireless Communications and Mobile Computing 2022 (March 19, 2022): 1–14. http://dx.doi.org/10.1155/2022/8005493.

Abstract:
The bond market is an important part of China's capital market. However, defaults have become frequent in the bond market in recent years, and consequently, the default risk of Chinese credit bonds has become increasingly prominent. Therefore, the assessment of default risk is particularly important. In this paper, we utilize 31 indicators at the macroeconomic level and the corporate micro level for the prediction of bond defaults, and we conduct principal component analysis to extract 10 principal components from them. We use the XGBoost algorithm to analyze the importance of variables and assess the credit debt default risk based on the XGBoost prediction model, calculating evaluation indicators such as the area under the ROC curve (AUC), accuracy, precision, recall, and F1-score to evaluate the classification prediction effect of the model. Finally, the grid search algorithm and k-fold cross-validation are used to optimize the parameters of the XGBoost model and determine the final classification prediction model. Our research results demonstrate that, after parameter optimization, the optimized XGBoost model has a significantly improved prediction accuracy compared to the original model, which is beneficial to improving the prediction effect for practical applications.
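Grid search over hyperparameters, as used above, is an exhaustive loop over the Cartesian product of candidate values. A sketch in which `score_fn` stands in for a cross-validated model evaluation (all names are illustrative, not from the paper):

```python
import itertools

def grid_search(param_grid, score_fn):
    """Evaluate every combination in the grid; return the best-scoring
    parameter dict and its score. score_fn would typically wrap a
    k-fold cross-validated model fit."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        s = score_fn(params)
        if s > best_score:
            best_params, best_score = params, s
    return best_params, best_score
```

The cost grows multiplicatively with each added parameter, which is why papers like the GA-based entry below turn to search heuristics for larger grids.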
5

Gu, Zhongyuan, Miaocong Cao, Chunguang Wang, Na Yu, and Hongyu Qing. "Research on Mining Maximum Subsidence Prediction Based on Genetic Algorithm Combined with XGBoost Model." Sustainability 14, no. 16 (August 22, 2022): 10421. http://dx.doi.org/10.3390/su141610421.

Abstract:
The extreme gradient boosting (XGBoost) ensemble learning algorithm excels at solving complex nonlinear relational problems. In order to accurately predict the surface subsidence caused by mining, this work introduces a combined genetic algorithm (GA) and XGBoost model for mining subsidence prediction, developed in Python. The hyperparameter vector of XGBoost is optimized by a genetic algorithm to improve the prediction accuracy and reliability of the XGBoost model. In a model prediction evaluation on domestic mining subsidence data sets, the results show that the R2 (coefficient of determination) of the prediction results of the GA-XGBoost model is 0.941, the RMSE (root mean square error) is 0.369, and the MAE (mean absolute error) is 0.308. Compared with classic ensemble learning models such as XGBoost, random deep forest, and gradient boosting, the GA-XGBoost model has higher prediction accuracy and performance than any single machine learning model.
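A genetic algorithm for hyperparameter optimization, as in GA-XGBoost, can be sketched as selection, crossover, and mutation over real-valued vectors. Here `fitness` is a stand-in for a cross-validated model score; all names, operators, and settings are illustrative choices, not the paper's:

```python
import random

def genetic_search(fitness, bounds, pop_size=30, generations=40, seed=1):
    """Evolve a population of real-valued hyperparameter vectors by
    truncation selection, uniform crossover, and Gaussian mutation."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        elite = ranked[: pop_size // 2]          # keep the better half
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.sample(elite, 2)
            # uniform crossover: each gene from one of the two parents
            child = [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]
            i = rng.randrange(dim)               # mutate one gene
            lo, hi = bounds[i]
            child[i] = min(hi, max(lo, child[i] + rng.gauss(0, 0.1 * (hi - lo))))
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)
```

Because the elite half is carried over unchanged each generation, the best solution found never degrades; mutation around the elites provides local refinement.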
6

Kang, Leilei, Guojing Hu, Hao Huang, Weike Lu, and Lan Liu. "Urban Traffic Travel Time Short-Term Prediction Model Based on Spatio-Temporal Feature Extraction." Journal of Advanced Transportation 2020 (August 14, 2020): 1–16. http://dx.doi.org/10.1155/2020/3247847.

Abstract:
In order to improve the accuracy of short-term travel time prediction in an urban road network, a hybrid model for spatio-temporal feature extraction and prediction of urban road network travel time is proposed in this research, combining empirical dynamic modeling (EDM) and complex networks (CN) with an XGBoost prediction model. Due to the highly nonlinear and dynamic nature of travel time series, it is necessary to consider both the time dependence and the spatial reliance of travel time series when predicting the travel time of road networks. The dynamic features of the travel time series can be revealed by the EDM method, a nonlinear approach based on chaos theory, while the spatial characteristics of urban traffic topology can be reflected from the perspective of complex networks. To guarantee the validity of the spatio-temporal features extracted by empirical dynamic modeling and complex networks (EDMCN) for urban traffic travel time prediction, an XGBoost prediction model is built on those features. Through an in-depth exploration of the travel time and topology of a particular road network in Guiyang, the EDMCN-XGBoost prediction model's performance is verified. The results show that, compared with the single XGBoost, autoregressive moving average, artificial neural network, support vector machine, and other models, the proposed EDMCN-XGBoost prediction model presents a better performance in forecasting.
7

Wang, Wenle, Wentao Xiong, Jing Wang, Lei Tao, Shan Li, Yugen Yi, Xiang Zou, and Cui Li. "A User Purchase Behavior Prediction Method Based on XGBoost." Electronics 12, no. 9 (April 28, 2023): 2047. http://dx.doi.org/10.3390/electronics12092047.

Abstract:
With the increasing use of electronic commerce, the number of online purchasers has been rising rapidly. Predicting user behavior from the collected data has therefore become a vital issue. However, traditional machine learning algorithms for prediction require significant computing time and often produce unsatisfactory results. In this paper, a prediction model based on XGBoost is proposed to predict user purchase behavior. Firstly, a user value model (LDTD) utilizing multi-feature fusion is proposed to differentiate between user types based on the available user account data. Multi-feature behavior fusion is carried out to generate the user tag feature according to user behavior patterns. Next, the XGBoost feature importance model is employed to analyze multi-dimensional features and identify the feature with the most significant weight value as the key feature for constructing the model. This feature, together with other user features, is then used for prediction via the XGBoost model. Compared to existing machine learning models such as K-Nearest Neighbor (KNN), Support Vector Machine (SVM), Random Forest (RF), and Back Propagation Neural Network (BPNN), the eXtreme Gradient Boosting (XGBoost) model outperforms them with an accuracy of 0.9761, an F1 score of 0.9763, and a ROC value of 0.9768. Thus, the XGBoost model demonstrates superior stability and algorithm efficiency, making it an ideal choice for predicting user purchase behavior with high levels of accuracy.
8

Oubelaid, Adel, Abdelhameed Ibrahim, and Ahmed M. Elshewey. "Bridging the Gap: An Explainable Methodology for Customer Churn Prediction in Supply Chain Management." Journal of Artificial Intelligence and Metaheuristics 4, no. 1 (2023): 16–23. http://dx.doi.org/10.54216/jaim.040102.

Abstract:
Customer churn prediction is a critical task for businesses aiming to retain their valuable customers. Nevertheless, the lack of transparency and interpretability in machine learning models hinders their implementation in real-world applications. In this paper, we introduce a novel methodology for customer churn prediction in supply chain management that addresses the need for explainability. Our approach takes advantage of XGBoost as the underlying predictive model. We recognize the importance of not only accurately predicting churn but also providing actionable insights into the key factors driving customer attrition. To achieve this, we employ Local Interpretable Model-agnostic Explanations (LIME), a state-of-the-art technique for generating intuitive and understandable explanations. By applying LIME to the predictions made by XGBoost, we enable decision-makers to gain insight into the model's decision process and the reasons behind its churn predictions. Through a comprehensive case study on customer churn data, we demonstrate the success of our explainable ML approach. Our methodology not only achieves high prediction accuracy but also offers interpretable explanations that highlight the underlying drivers of customer churn. These insights provide valuable guidance for decision-making processes within supply chain management.
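LIME proper fits a weighted linear surrogate over random perturbations of a single instance. As a much-simplified illustration of the same model-agnostic idea, one can estimate how sensitive a prediction is to each feature by finite differences around the instance (a sketch with our naming, not the LIME library's API):

```python
def local_sensitivity(predict, x, delta=1e-4):
    """Crude local explanation: finite-difference sensitivity of the model
    output to each feature around instance x. LIME instead samples many
    perturbations and fits a locally weighted linear surrogate."""
    base = predict(x)
    effects = []
    for i in range(len(x)):
        xp = list(x)
        xp[i] += delta
        effects.append((predict(xp) - base) / delta)
    return effects
```

For a black-box model, the sign and magnitude of each entry indicate how the prediction would move if that feature increased slightly, which is the kind of per-instance attribution the paper reports.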
9

Liu, Yuan, Wenyi Du, Yi Guo, Zhiqiang Tian, and Wei Shen. "Identification of high-risk factors for recurrence of colon cancer following complete mesocolic excision: An 8-year retrospective study." PLOS ONE 18, no. 8 (August 11, 2023): e0289621. http://dx.doi.org/10.1371/journal.pone.0289621.

Abstract:
Background Colon cancer recurrence is a common adverse outcome for patients after complete mesocolic excision (CME) and greatly affects the near-term and long-term prognosis of patients. This study aimed to develop a machine learning model that can identify high-risk factors before, during, and after surgery, and predict the occurrence of postoperative colon cancer recurrence. Methods The study included 1187 patients with colon cancer, including 110 patients who had recurrent colon cancer. The researchers collected 44 characteristic variables, including patient demographic characteristics, basic medical history, preoperative examination information, type of surgery, and intraoperative information. Four machine learning algorithms, namely extreme gradient boosting (XGBoost), random forest (RF), support vector machine (SVM), and k-nearest neighbor algorithm (KNN), were used to construct the model. The researchers evaluated the model using the k-fold cross-validation method, ROC curve, calibration curve, decision curve analysis (DCA), and external validation. Results Among the four prediction models, the XGBoost algorithm performed the best. The ROC curve results showed that the AUC value of XGBoost was 0.962 in the training set and 0.952 in the validation set, indicating high prediction accuracy. The XGBoost model was stable during internal validation using the k-fold cross-validation method. The calibration curve demonstrated high predictive ability of the XGBoost model. The DCA curve showed that patients who received interventional treatment had a higher benefit rate under the XGBoost model. The external validation set’s AUC value was 0.91, indicating good extrapolation of the XGBoost prediction model. Conclusion The XGBoost machine learning algorithm-based prediction model for colon cancer recurrence has high prediction accuracy and clinical utility.
10

He, Wenwen, Hongli Le, and Pengcheng Du. "Stroke Prediction Model Based on XGBoost Algorithm." International Journal of Applied Sciences & Development 1 (December 13, 2022): 7–10. http://dx.doi.org/10.37394/232029.2022.1.2.

Abstract:
In this paper, randomly measured individual sample data are preprocessed: for example, outlier values are deleted and the features of the samples are normalized to between 0 and 1. The correlation analysis approach is then used to determine and rank the relevance of stroke characteristics, and factors with poor correlation are discarded. The samples are randomly split into a 70% training set and a 30% testing set. Finally, the random forest model and the XGBoost algorithm, combined with cross-validation and grid search, are implemented to learn the stroke characteristics. The accuracy on the testing set of the XGBoost algorithm is 0.9257, which is better than that of the random forest model at 0.8991. Thus, the XGBoost model is selected to predict stroke for ten people, and the obtained conclusion is that two people have a stroke and eight people do not.
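The preprocessing steps described, outlier deletion and normalization of features to between 0 and 1, can be sketched as below. The standard-deviation-based outlier rule is one common convention; the abstract does not state the exact rule used, so both the rule and the names are assumptions:

```python
def minmax_scale(values):
    """Rescale a feature column linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def drop_outliers(values, k=2.0):
    """Delete points more than k standard deviations from the mean.
    (One common convention; the paper does not specify its rule.)"""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [v for v in values if abs(v - mean) <= k * std]
```

In practice the scaler's `lo`/`hi` must be computed on the training split only and reused on the test split, otherwise information leaks across the 70/30 boundary.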

Dissertations / Theses on the topic "XGBOOST PREDICTION MODEL"

1

Pettersson, Gustav, and John Almqvist. "Lavinprognoser och maskininlärning : Att prediktera lavinprognoser med maskininlärning och väderdata." Thesis, Uppsala universitet, Institutionen för informatik och media, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-387205.

Abstract:
This research project examines the feasibility of using machine learning to predict avalanche danger by using XGBoost and openly available weather data. Avalanche forecasts and meteorological modelled weather data have been gathered for the six areas in Sweden where Naturvårdsverket, through lavinprognoser.se, issues avalanche forecasts. The avalanche forecasts are collected from lavinprognoser.se, and the modelled weather data is collected from the MESAN model, which is produced and provided by the Swedish Meteorological and Hydrological Institute. 40 machine learning models, in the form of XGBoost, have been trained on this data set, with the goal of assessing the main aspects of an avalanche forecast and the overall avalanche danger. The results show it is possible to predict the day-to-day avalanche danger for the 2018/19 season in Södra Jämtlandsfjällen with an accuracy of 71% and a mean average error of 0.256, by applying machine learning to the weather data for that region. The contribution of XGBoost in this context is demonstrated by applying the simpler method of logistic regression to the same data set and comparing the results: the logistic regression performs worse, with an accuracy of 56% and a mean average error of 0.459. The contribution of this research is a proof of concept showing the feasibility of predicting avalanche danger in Sweden with the help of machine learning and weather data.
2

Henriksson, Erik, and Kristopher Werlinder. "Housing Price Prediction over Countrywide Data : A comparison of XGBoost and Random Forest regressor models." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302535.

Abstract:
The aim of this research project is to investigate how an XGBoost regressor compares to a Random Forest regressor in terms of predictive performance on housing prices, with the help of two data sets. The comparison considers training time, inference time, and the three evaluation metrics R2, RMSE, and MAPE. The data sets are described in detail together with background about the regressor models that are used. The method involves substantial data cleaning of the two data sets, hyperparameter tuning to find optimal parameters, and 5-fold cross-validation in order to achieve good performance estimates. The finding of this research project is that XGBoost performs better on both small and large data sets. While the Random Forest model can achieve similar results as the XGBoost model, it needs a much longer training time, between 2 and 50 times as long, and has a longer inference time, around 40 times as long. This makes XGBoost especially superior when used on larger sets of data.
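The three evaluation metrics used in this comparison, R2, RMSE, and MAPE, can be computed directly from true and predicted values; a minimal sketch (function name is ours):

```python
def regression_metrics(y_true, y_pred):
    """R2 (coefficient of determination), RMSE, and MAPE (as a fraction)."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    rmse = (ss_res / n) ** 0.5
    mape = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mape
```

RMSE is in the target's own units (useful for prices), MAPE is scale-free, and R2 measures variance explained relative to a constant-mean baseline, which is why studies like this one report all three.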
3

Kinnander, Mathias. "Predicting profitability of new customers using gradient boosting tree models : Evaluating the predictive capabilities of the XGBoost, LightGBM and CatBoost algorithms." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-19171.

Abstract:
In the context of providing credit online to customers in retail shops, the provider must perform risk assessments quickly and often based on scarce historical data. This can be achieved by automating the process with machine learning algorithms. Gradient boosting tree algorithms have demonstrated themselves to be capable in a wide range of application scenarios. However, they are yet to be implemented for predicting the profitability of new customers based solely on the customers' first purchases. This study aims to evaluate the predictive performance of the XGBoost, LightGBM, and CatBoost algorithms in this context. The Recall and Precision metrics were used as the basis for assessing the models' performance. The experiment implemented for this study shows that the three models display similar capabilities while also being biased towards the majority class.
4

Svensson, William. "CAN STATISTICAL MODELS BEAT BENCHMARK PREDICTIONS BASED ON RANKINGS IN TENNIS?" Thesis, Uppsala universitet, Statistiska institutionen, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-447384.

Abstract:
The aim of this thesis is to beat a benchmark prediction accuracy of 64.58 percent based on player rankings on the ATP tour in tennis; that is, the player with the better rank in a tennis match is deemed the winner. Three statistical models are used: logistic regression, random forest, and XGBoost. The data cover the period from 2000 to 2010 and comprise over 60,000 observations with 49 variables each. After the data were prepared and new variables were created from the differences between the two players in each match, all three statistical models outperformed the benchmark prediction. All three models had an accuracy of around 66 percent, with logistic regression performing best at 66.45 percent. The most important variables overall for the models are the total win rate on different surfaces, the total win rate, and rank.
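The benchmark itself, predicting that the better-ranked player wins, reduces to counting matches in which the winner held the lower rank number; a sketch (our naming, with each match given as a winner/loser rank pair):

```python
def rank_benchmark_accuracy(matches):
    """Fraction of matches won by the better-ranked player.
    Each match is a (winner_rank, loser_rank) pair; a lower number
    means a better rank."""
    hits = sum(1 for winner_rank, loser_rank in matches
               if winner_rank < loser_rank)
    return hits / len(matches)
```

Any model worth fitting has to beat this number, which is why the thesis treats 64.58 percent as the floor.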
5

Herrmann, Vojtěch. "Moderní predikční metody pro finanční časové řady." Master's thesis, 2021. http://www.nusl.cz/ntk/nusl-437908.

Abstract:
This thesis compares two approaches to modelling and predicting time series: a traditional one (the ARIMAX model) and a modern one (gradient-boosted decision trees within the framework of the XGBoost library). In the first part of the thesis we introduce the theoretical framework of supervised learning, the ARIMAX model, and gradient boosting in the context of decision trees. In the second part we fit ARIMAX and XGBoost models that both predict a specific time series, the daily volume of the S&P 500 index, a task that is crucial in many fields. After that we compare the results of the two approaches, describe the advantages of the XGBoost model, which presumably lead to its better results in this specific simulation study, and show the importance of hyperparameter optimization. Afterwards, we compare the practicality of the methods, especially with regard to their computational demands. In the last part of the thesis, a hybrid model theory is derived and algorithms to obtain the optimal hybrid model are proposed. These algorithms are then used for the mentioned prediction problem. The optimal hybrid model combines the ARIMAX and XGBoost models and performs better than each of the individual models on its own.
6

KELLER, AISHWARYA. "HYBRID RESAMPLING AND XGBOOST PREDICTION MODEL USING PATIENT'S INFORMATION AND DRAWING AS FEATURES FOR PARKINSON'S DISEASE DETECTION." Thesis, 2021. http://dspace.dtu.ac.in:8080/jspui/handle/repository/19442.

Abstract:
In the list of the most commonly occurring neurodegenerative disorders, Parkinson's disease ranks second, while Alzheimer's disease tops the list. It has no definite examination for an exact diagnosis. It has been observed that the handwriting of an individual suffering from Parkinson's disease deteriorates considerably. Therefore, many computer vision and micrography-based methods have been used by researchers to explore handwriting as a detection parameter. Yet, these methods suffer from two major drawbacks: prediction-model bias due to the imbalance in the data, and a low rate of classification accuracy. The proposed technique is designed to alleviate prediction bias and low classification accuracy by use of hybrid resampling (Synthetic Minority Oversampling Technique and Wilson's Edited Nearest Neighbours) techniques and Extreme Gradient Boosting (XGBoost). Additionally, there is evidence of innate neurological differences between men and women, and between the old and the young. There is also a significant link between a person's dominant hand and the side of the body where the initial manifestation begins. Yet gender, age, and handedness information has not previously been utilized for Parkinson's disease detection. In this research work, a prediction method is developed incorporating age, gender, and dominant hand as features to identify Parkinson's disease.

The proposed hybrid resampling and XGBoost method's experimental results yield an accuracy of 98.24%, the highest so far, when age is taken as a parameter along with nine statistical parameters (root mean square, largest value of radius difference between ET and HT, smallest value of radius difference between ET and HT, standard deviation of ET and HT radius difference, mean relative tremor, maximum ET, minimum HT, standard deviation of exam template values, number of instances where the HT and ET radius difference changes from a negative value to a positive value or vice versa) achieved on the HandPD dataset. The conventional accuracy is 98.24% (meanders) and 95.37% (spirals) when age is used along with the nine statistical parameters extracted from the dataset. It becomes 97.02% (meanders) and 97.12% (spirals) when age, gender, and handedness information are utilised. The proposed method's results were compared with existing methods, and it is evident that the method outperforms its predecessors.
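SMOTE, the oversampling half of the hybrid resampling above, creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. A pure-Python sketch of that interpolation step (the ENN cleaning step is omitted, and the function name and parameters are ours):

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples: pick a minority sample,
    find its k nearest minority neighbours, and interpolate a random
    fraction of the way towards one of them."""
    rng = random.Random(seed)

    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5

    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((m for m in minority if m is not x),
                            key=lambda m: dist(x, m))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # fraction of the way towards the neighbour
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic
```

Because each synthetic point lies on a segment between two real minority points, the new samples stay inside the minority region rather than being naive duplicates, which is what reduces the prediction bias the thesis targets.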
7

(5930375), Junhui Wang. "SYSTEMATICALLY LEARNING OF INTERNAL RIBOSOME ENTRY SITE AND PREDICTION BY MACHINE LEARNING." Thesis, 2019.

Abstract:

Internal ribosome entry sites (IRES) are segments of mRNA found in untranslated regions, which can recruit the ribosome and initiate translation independently of the more widely used 5’ cap dependent translation initiation mechanism. IRES play an important role in conditions where 5’ cap dependent translation initiation has been blocked or repressed. They have been found to play important roles in viral infection, cellular apoptosis, and response to other external stimuli. It has been suggested that about 10% of mRNAs, both viral and cellular, can utilize IRES. But due to the limitations of the IRES bicistronic assay, which is the gold standard for identifying IRES, relatively few IRES have been definitively described and functionally validated compared to the potential overall population. Viral and cellular IRES may be mechanistically different, but this is difficult to analyze because the mechanistic differences are still not clearly defined. Identifying additional IRES is an important step towards better understanding IRES mechanisms. Development of a new bioinformatics tool that can accurately predict IRES from sequence would be a significant step forward in identifying IRES-based regulation and in elucidating IRES mechanism. This dissertation systematically studies the features which can distinguish IRES from nonIRES sequences. Sequence features such as kmer words, and structural features such as the predicted MFE of folding, QMFE, and sequence/structure triplets, are evaluated as possible discriminative features. These potential features are incorporated into an IRES classifier based on XGBoost, a machine learning model, to classify novel sequences as belonging to IRES or nonIRES groups. The XGBoost model performs better than previous predictors, with higher accuracy and lower computational time. The number of features in the model has been greatly reduced, compared to previous predictors, by adding global kmer and structural features.
The trained XGBoost model has been implemented as the first high-throughput bioinformatics tool for IRES prediction, IRESpy. This website provides a public tool for all IRES researchers and can be used in other genomics applications such as gene annotation and analysis of differential gene expression.

APA, Harvard, Vancouver, ISO, and other styles
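The feature-engineering idea in the abstract above — kmer word counts feeding a boosted-tree classifier — can be sketched as follows. The sequences, labels, and enrichment motif are synthetic stand-ins, and scikit-learn's `GradientBoostingClassifier` is used in place of XGBoost (`xgboost.XGBClassifier` would be the drop-in choice in practice):

```python
# Hedged sketch of kmer featurization + boosted-tree classification.
# All data here are synthetic; a real pipeline would use curated
# IRES / non-IRES sequences and xgboost.XGBClassifier.
from itertools import product
import random

from sklearn.ensemble import GradientBoostingClassifier

def kmer_counts(seq, k=3):
    """Count every length-k word over the RNA alphabet in seq."""
    words = ["".join(p) for p in product("ACGU", repeat=k)]
    return [seq.count(w) for w in words]

random.seed(0)

def random_seq(n=60):
    return "".join(random.choice("ACGU") for _ in range(n))

# Synthetic stand-in data: "positives" enriched in a GC-rich motif.
pos = [random_seq().replace("AAA", "GCG") for _ in range(40)]
neg = [random_seq() for _ in range(40)]
X = [kmer_counts(s) for s in pos + neg]
y = [1] * len(pos) + [0] * len(neg)

clf = GradientBoostingClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)
acc = clf.score(X, y)  # training accuracy on the synthetic data
print(f"training accuracy: {acc:.2f}")
```

Structural features such as the predicted MFE would simply be appended to the same per-sequence feature vector before fitting.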
8

Salvaire, Pierre Antony Jean Marie. "Explaining the predictions of a boosted tree algorithm : application to credit scoring." Master's thesis, 2019. http://hdl.handle.net/10362/85991.

Full text
Abstract:
Dissertation report presented as partial requirement for obtaining the Master’s degree in Information Management, with a specialization in Business Intelligence and Knowledge Management
The main goal of this report is to contribute to the adoption of complex «black box» machine learning models in the field of credit scoring for retail credit. Although numerous investigations have shown the potential benefits of using complex models, we identified the lack of interpretability as one of the main factors preventing a full and trustworthy adoption of these new modeling techniques. Intrinsically linked with recent data concerns such as the individual right to explanation and fairness (introduced in the GDPR), and with model reliability, we believe that this kind of research is crucial for easing adoption among credit risk practitioners. We build a standard linear scorecard model along with a more advanced algorithm called Extreme Gradient Boosting (XGBoost) on an open-source retail credit dataset. The modeling scenario is a binary classification task consisting in identifying clients that will experience a delinquency state of 90 days past due or worse. The interpretation of the scorecard model is performed using the raw output of the algorithm, while more complex data perturbation techniques, namely Partial Dependence Plots and Shapley Additive Explanations, are computed for the XGBoost algorithm. As a result, we observe that the XGBoost algorithm is statistically more performant at distinguishing “bad” from “good” clients. Additionally, we show that the global interpretation of the XGBoost model is not as accurate as that of the scorecard algorithm. At an individual level, however (for each instance of the dataset), we show that the level of interpretability is very similar, as both methods are able to quantify the contribution of each variable to the predicted risk of a specific application.
APA, Harvard, Vancouver, ISO, and other styles

Book chapters on the topic "XGBOOST PREDICTION MODEL"

1

Zhong, Weijian, Xiaoqin Lian, Chao Gao, Xiang Chen, and Hongzhou Tan. "PM2.5 Concentration Prediction Based on mRMR-XGBoost Model." In Machine Learning and Intelligent Communications, 327–36. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-04409-0_30.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Yu, Sun, Liwei Tian, Yijun Liu, and Yuankai Guo. "LSTM-XGBoost Application of the Model to the Prediction of Stock Price." In Lecture Notes in Computer Science, 86–98. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-78609-0_8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Ratre, Sushila, and Jyotsna Jayaraj. "Sales Prediction Using ARIMA, Facebook’s Prophet and XGBoost Model of Machine Learning." In Lecture Notes in Electrical Engineering, 101–11. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-19-5868-7_9.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Sanchez-Atuncar, Giancarlo, Victor Manuel Cabrejos-Yalán, and Yesenia del Rosario Vasquez-Valencia. "Machine Learning Model Optimization for Energy Efficiency Prediction in Buildings Using XGBoost." In Lecture Notes in Networks and Systems, 309–15. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-33258-6_29.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Pradeep, S., M. Kishore, G. Oviya, S. Poorani, and R. Anitha. "XGBoost-Based Prediction and Evaluation Model for Enchanting Subscribers in Industrial Sector." In Proceedings of Fourth International Conference on Computing, Communications, and Cyber-Security, 283–95. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-1479-1_22.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Uttam, Atul Kumar. "Urinary System Diseases Prediction Using Supervised Machine Learning-Based Model: XGBoost and Random Forest." In Lecture Notes in Electrical Engineering, 179–85. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-16-8542-2_14.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Hojaji, Fazilat, Adam J. Toth, and Mark J. Campbell. "A Machine Learning Approach for Modeling and Analyzing of Driver Performance in Simulated Racing." In Communications in Computer and Information Science, 95–105. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-26438-2_8.

Full text
Abstract:
The emerging field of esports lacks approaches for ensuring high-quality analytics and training in professional and amateur esports teams. In this paper, we demonstrate the application of Artificial Intelligence (AI) and Machine Learning (ML) approaches in the esports domain, particularly in simulated racing. To achieve this, we gathered a variety of feature-rich telemetry data from several web sources, captured through MoTec telemetry software and the ACC simulated racing game. We performed a number of analyses using ML algorithms to classify laps into performance levels, evaluate driving behaviors along these performance levels, and finally define a prediction model highlighting the channels/features that have a significant impact on driver performance. To identify the optimal feature set, three feature selection algorithms, i.e., the Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and Random Forest (RF), were applied; out of 84 features, a subset of 10 was selected as the best feature subset. For classification, XGBoost outperformed RF and SVM with the highest accuracy score among the evaluated models. The study highlights the promising use of AI to categorize sim racers according to their technical-tactical behaviour, enhancing sim racing knowledge and know-how.
APA, Harvard, Vancouver, ISO, and other styles
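The feature-selection step described in the abstract above — ranking telemetry channels by a tree model's importances and keeping the top subset — can be sketched in a few lines. The channels and labels are invented; the study itself compared SVM, XGBoost, and RF selectors over 84 channels, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost here:

```python
# Hedged sketch of importance-based feature selection with a
# boosted-tree model on synthetic "telemetry" data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
n, n_features = 300, 12
X = rng.normal(size=(n, n_features))
# Only the first two synthetic "channels" drive the lap-performance label;
# the remaining ten are pure noise.
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

model = GradientBoostingClassifier(n_estimators=100, random_state=0).fit(X, y)
top = np.argsort(model.feature_importances_)[::-1][:2]
print(sorted(top.tolist()))  # indices of the two most important channels
```

The selected indices would then define the reduced feature set passed to the final classifier.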
8

Tahsin, Labeba, and Shaily Roy. "Prediction of COVID-19 Severity Level Using XGBoost Algorithm: A Machine Learning Approach Based on SIR Epidemiological Model." In Intelligent Systems and Sustainable Computing, 69–78. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0011-2_7.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Dong, W., Y. Huang, B. Lehane, and G. Ma. "An Intelligent Multi-objective Design Optimization Method for Nanographite-Based Electrically Conductive Cementitious Composites." In Lecture Notes in Civil Engineering, 339–46. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-3330-3_35.

Full text
Abstract:
Nanographite (NG) is a promising conductive filler for producing effective electrically conductive cementitious composites for use in structural health monitoring. Since acceptable mechanical strength and electrical resistivity are both required, the design of NG-based cementitious composites (NGCC) is a complicated multi-objective optimization problem. This study proposes a data-driven method to address this multi-objective design optimization (MODO) issue for NGCC using machine learning (ML) techniques and the non-dominated sorting genetic algorithm (NSGA-II). Prediction models for the uniaxial compressive strength (UCS) and electrical resistivity (ER) of NGCC are established by Bayesian-tuned XGBoost with prepared datasets. Results show that they have excellent performance in predicting both properties, with high R2 (0.95 and 0.92, 0.99 and 0.98) and low mean absolute error (1.24 and 3.44, 0.15 and 0.22). The influence of critical features on NGCC's properties is quantified by ML theory, which helps determine the variables to be optimized and define their constraints for the MODO. The MODO program is developed on the basis of NSGA-II. It optimizes NGCC's UCS and ER simultaneously and successfully achieves a set of Pareto solutions, which can facilitate appropriate parameter selection in NGCC design.
APA, Harvard, Vancouver, ISO, and other styles
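The property-prediction step in the abstract above amounts to fitting a boosted-tree regressor and scoring it with R2 and mean absolute error, the two metrics the authors quote. A sketch on synthetic mix-design data, with invented inputs (`filler`, `wc`) and scikit-learn's `GradientBoostingRegressor` standing in for the Bayesian-tuned XGBoost model:

```python
# Hedged sketch: boosted-tree regression scored with R2 and MAE.
# All variables are synthetic stand-ins for NGCC mix parameters.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400
filler = rng.uniform(0, 5, n)    # hypothetical nanographite content (%)
wc = rng.uniform(0.3, 0.6, n)    # hypothetical water/cement ratio
X = np.column_stack([filler, wc])
# Synthetic "strength": falls with w/c ratio and mildly with filler content.
strength = 80 - 60 * wc - 2 * filler + rng.normal(0, 1.5, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, strength, random_state=0)
model = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
pred = model.predict(X_te)
r2 = r2_score(y_te, pred)
mae = mean_absolute_error(y_te, pred)
print(f"R2={r2:.2f}, MAE={mae:.2f}")
```

In the paper, a second regressor of the same form predicts electrical resistivity, and NSGA-II then searches the two fitted models jointly for Pareto-optimal mixes.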
10

Dierckx, Thomas, Jesse Davis, and Wim Schoutens. "Quantifying News Narratives to Predict Movements in Market Risk." In Data Science for Economics and Finance, 265–85. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-66891-4_12.

Full text
Abstract:
The theory of Narrative Economics suggests that narratives present in media influence market participants and drive economic events. In this chapter, we investigate how financial news narratives relate to movements in the CBOE Volatility Index. To this end, we first introduce an uncharted dataset where news articles are described by a set of financial keywords. We then perform topic modeling to extract news themes, comparing canonical latent Dirichlet allocation to a technique combining doc2vec and Gaussian mixture models. Finally, using the state-of-the-art XGBoost (Extreme Gradient Boosted Trees) machine learning algorithm, we show that the obtained news features outperform a simple baseline when predicting CBOE Volatility Index movements on different time horizons.
APA, Harvard, Vancouver, ISO, and other styles
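The pipeline described in the abstract above — embed documents, cluster the embeddings into "topics" with a Gaussian mixture, then feed topic memberships to a boosted-tree classifier — can be loosely sketched as below. Everything here is synthetic; the chapter uses doc2vec embeddings of keyword-tagged news and CBOE Volatility Index movements as the target, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost:

```python
# Hedged sketch of the topic-features -> boosted-tree pipeline.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic 8-d "document embeddings" drawn from two latent themes.
theme = rng.integers(0, 2, 300)
emb = rng.normal(size=(300, 8)) + theme[:, None] * 3.0

gmm = GaussianMixture(n_components=2, random_state=0).fit(emb)
topics = gmm.predict_proba(emb)  # soft topic memberships per document
# Synthetic up/down target correlated with the dominant theme
# (10% of labels flipped to mimic noise).
y = (theme ^ (rng.uniform(size=300) < 0.1)).astype(int)

clf = GradientBoostingClassifier(random_state=0).fit(topics, y)
acc = clf.score(topics, y)
print(f"training accuracy: {acc:.2f}")
```

The soft memberships (rather than hard cluster labels) are what make the mixture-model step a usable topic representation for the downstream classifier.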

Conference papers on the topic "XGBOOST PREDICTION MODEL"

1

Siyuan, Liu, Liu Jingyuan, Gu Hangping, and Ren Minhua. "Sleep staging prediction model based on XGBoost." In 2021 International Conference on Electronic Information Engineering and Computer Science (EIECS). IEEE, 2021. http://dx.doi.org/10.1109/eiecs53707.2021.9587974.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Al-Mudhafar, Watheq J., and David A. Wood. "Tree-Based Ensemble Algorithms for Lithofacies Classification and Permeability Prediction in Heterogeneous Carbonate Reservoirs." In Offshore Technology Conference. OTC, 2022. http://dx.doi.org/10.4043/31780-ms.

Full text
Abstract:
Rock facies are typically identified either by core analysis, which provides visually interpreted lithofacies, or indirectly from suites of recorded well-log data, generating electrofacies interpretations. Since lithofacies cannot be obtained for all reservoir intervals, drilled sections and/or wells, it is commonly essential to model the discrete lithofacies as a function of well-log data (electrofacies) to predict poorly sampled or non-cored intervals. This process is called predictive lithofacies classification. In this study, measured discrete lithofacies distributions (based on core data) are comparatively modeled with well-log data using two tree-based ensemble algorithms configured as classifiers: extreme gradient boosting (XGBoost) and adaptive boosting (AdaBoost). The predicted lithofacies are then combined with recorded well-log data for analysis by an XGBoost regression model to predict permeability. The input well-log variables are log porosity, gamma ray, water saturation, neutron porosity, deep resistivity, and bulk density. The data are derived from the Mishrif carbonate reservoir in a giant southern Iraqi oil field. For efficient lithofacies classification and permeability modelling, random sub-sampling cross-validation was applied to the well-log dataset to generate two subsets: a training subset for model tuning and a testing subset for prediction of data points unseen during training. Confusion matrices and the total correct percentage (TCP) of predictions are used to measure the prediction performance of each algorithm and identify the most realistic lithofacies classification. The TCPs for the XGBoost and AdaBoost classifiers on the training subset were 98% and 100%, respectively, while those achieved on the testing subsets were 97% and 96%. The mismatch between the measured and predicted permeability from the XGBoost regressor was quantified using root mean square error.
The XGBoost model provides accurate lithofacies classification and permeability predictions for the cored data. It is therefore considered suitable for providing reliable predictions of lithofacies and permeability for the non-cored intervals of the same well and for non-cored wells in the studied reservoir. The workflow for lithofacies and permeability prediction was fully implemented and visualized using open-source R code.
APA, Harvard, Vancouver, ISO, and other styles
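The evaluation described in the abstract above — a confusion matrix summarized as a "total correct percentage" (TCP) — is simply overall accuracy read off the matrix diagonal. A sketch on synthetic well-log-style inputs, with scikit-learn's `GradientBoostingClassifier` standing in for the paper's XGBoost and AdaBoost models:

```python
# Hedged sketch: boosted-tree lithofacies classification scored by TCP.
# Inputs and facies labels are synthetic stand-ins for real well-log data.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600
porosity = rng.uniform(0.05, 0.3, n)
gamma_ray = rng.uniform(20, 120, n)
X = np.column_stack([porosity, gamma_ray])
# Three synthetic "lithofacies" split on porosity and gamma-ray thresholds.
facies = (porosity > 0.18).astype(int) + (gamma_ray > 80).astype(int)

# Random sub-sampling into training and testing subsets, as in the paper.
X_tr, X_te, y_tr, y_te = train_test_split(X, facies, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)
cm = confusion_matrix(y_te, clf.predict(X_te))
tcp = 100 * np.trace(cm) / cm.sum()  # total correct percentage
print(f"TCP = {tcp:.1f}%")
```

The per-class rows of `cm` show which facies are being confused, which the TCP alone does not reveal.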
4

Ma, Tao, Yusen Zhang, Xiangxin Nie, Xinchao Zhao, and Yexing Li. "An XGBoost-based Electric Vehicle Battery Consumption Prediction Model." In 2021 IEEE International Conference on Power, Intelligent Computing and Systems (ICPICS). IEEE, 2021. http://dx.doi.org/10.1109/icpics52425.2021.9524291.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Qiongyu, Shi. "Prediction of O2O Coupon Usage Based on XGBoost Model." In ICEME '20: 2020 The 11th International Conference on E-business, Management and Economics. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3414752.3414775.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Wang, Yiying, Zhe Yan, and Lidong Xing. "A Movie Score Prediction Model Based on XGBoost Algorithm." In 2021 International Conference on Culture-oriented Science & Technology (ICCST). IEEE, 2021. http://dx.doi.org/10.1109/iccst53801.2021.00108.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Duan, Ran, You Li, Baohua Qiang, and Laixin Zhou. "A Feature Selection-Based XGBoost Model for Fault Prediction." In 2021 17th International Conference on Computational Intelligence and Security (CIS). IEEE, 2021. http://dx.doi.org/10.1109/cis54983.2021.00056.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Zhang, Zhixin, Gaofeng Xu, Hongting Wang, and Kaibo Zhou. "Anode Effect prediction based on Expectation Maximization and XGBoost model." In 2018 IEEE 7th Data Driven Control and Learning Systems Conference (DDCLS). IEEE, 2018. http://dx.doi.org/10.1109/ddcls.2018.8516046.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Gupta, Aashish, Shilpa Sharma, Shubham Goyal, and Mamoon Rashid. "Novel XGBoost Tuned Machine Learning Model for Software Bug Prediction." In 2020 International Conference on Intelligent Engineering and Management (ICIEM). IEEE, 2020. http://dx.doi.org/10.1109/iciem48762.2020.9160152.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Tang, Qi, Guoen Xia, Xianquan Zhang, and Feng Long. "A Customer Churn Prediction Model Based on XGBoost and MLP." In 2020 International Conference on Computer Engineering and Application (ICCEA). IEEE, 2020. http://dx.doi.org/10.1109/iccea50009.2020.00133.

Full text
APA, Harvard, Vancouver, ISO, and other styles