Academic literature on the topic 'XGBOOST MODEL'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'XGBOOST MODEL.'

Journal articles on the topic "XGBOOST MODEL"

1

Yang, Hao, Jiaxi Li, Siru Liu, Xiaoling Yang, and Jialin Liu. "Predicting Risk of Hypoglycemia in Patients With Type 2 Diabetes by Electronic Health Record–Based Machine Learning: Development and Validation." JMIR Medical Informatics 10, no. 6 (June 16, 2022): e36958. http://dx.doi.org/10.2196/36958.

Abstract:
Background: Hypoglycemia is a common adverse event in the treatment of diabetes. To cope with hypoglycemia efficiently, effective hypoglycemia prediction models need to be developed. Objective: The aim of this study was to develop and validate machine learning models to predict the risk of hypoglycemia in adult patients with type 2 diabetes. Methods: We used the electronic health records of all adult patients with type 2 diabetes admitted to West China Hospital between November 2019 and December 2021. The prediction model was developed based on XGBoost and natural language processing. F1 score, area under the receiver operating characteristic curve (AUC), and decision curve analysis (DCA) were used as the main criteria to evaluate model performance. Results: We included 29,843 patients with type 2 diabetes, of whom 2804 patients (9.4%) developed hypoglycemia. In this study, the embedding machine learning model (XGBoost3) showed the best performance among all the models. The AUC and the accuracy of XGBoost were 0.82 and 0.93, respectively. XGBoost3 was also superior to the other models in DCA. Conclusions: The Paragraph Vector–Distributed Memory model can effectively extract features and improve the performance of the XGBoost model, which can then effectively predict hypoglycemia in patients with type 2 diabetes.
2

Oukhouya, Hassan, Hamza Kadiri, Khalid El Himdi, and Raby Guerbaz. "Forecasting International Stock Market Trends: XGBoost, LSTM, LSTM-XGBoost, and Backtesting XGBoost Models." Statistics, Optimization & Information Computing 12, no. 1 (November 3, 2023): 200–209. http://dx.doi.org/10.19139/soic-2310-5070-1822.

Abstract:
Forecasting time series is crucial for financial research and decision-making in business. The nonlinearity of stock market prices profoundly impacts global economic and financial sectors. This study focuses on modeling and forecasting the daily prices of key stock indices - MASI, CAC 40, DAX, FTSE 250, NASDAQ, and HKEX, representing the Moroccan, French, German, British, US, and Hong Kong markets, respectively. We compare the performance of machine learning models, including Long Short-Term Memory (LSTM), eXtreme Gradient Boosting (XGBoost), and the hybrid LSTM-XGBoost, and utilize the skforecast library for backtesting. Results show that the hybrid LSTM-XGBoost model, optimized using Grid Search (GS), outperforms other models, achieving high accuracy in forecasting daily prices. This contribution offers financial analysts and investors valuable insights, facilitating informed decision-making through precise forecasts of international stock prices.
3

Gu, Kai, Jianqi Wang, Hong Qian, and Xiaoyan Su. "Study on Intelligent Diagnosis of Rotor Fault Causes with the PSO-XGBoost Algorithm." Mathematical Problems in Engineering 2021 (April 26, 2021): 1–17. http://dx.doi.org/10.1155/2021/9963146.

Abstract:
Building on fault category detection, this paper proposes the diagnosis of rotor fault causes, which contributes to the field of intelligent operation and maintenance. To improve diagnostic accuracy and practical efficiency, a hybrid model based on the particle swarm optimization-extreme gradient boosting algorithm, namely PSO-XGBoost, is designed. XGBoost is used as a classifier to diagnose rotor fault causes and performs well owing to its second-order Taylor expansion and explicit regularization term. PSO is used to automate the tuning of XGBoost's parameters, which overcomes the shortcomings of adjusting them empirically or by trial and error. The hybrid model combines the advantages of the two algorithms and can diagnose nine rotor fault causes accurately. Based on the diagnostic results, maintenance measures drawn from the corresponding knowledge base are provided intelligently. Finally, the proposed PSO-XGBoost model is compared with five state-of-the-art intelligent classification methods. The experimental results demonstrate that the proposed method has higher diagnostic accuracy and practical efficiency in diagnosing rotor fault causes.
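For readers unfamiliar with the two ingredients named in this abstract, the second-order Taylor expansion and the explicit regularization term, the following is the standard form of the XGBoost objective at boosting round t as given in the general XGBoost literature. It is included as background only; the abstract does not state which loss function or regularization settings the authors used, so the symbols below are generic.

```latex
% Objective at boosting round t, approximated to second order around the
% previous prediction \hat{y}_i^{(t-1)}; f_t is the tree added in round t.
\mathcal{L}^{(t)} \approx \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big)
    + g_i\, f_t(x_i) + \tfrac{1}{2}\, h_i\, f_t(x_i)^2 \Big] + \Omega(f_t)

% First and second derivatives of the loss with respect to the prediction.
g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\qquad
h_i = \frac{\partial^2 l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \big(\hat{y}_i^{(t-1)}\big)^2}

% Regularization term: T is the number of leaves, w the vector of leaf weights.
\Omega(f) = \gamma\, T + \tfrac{1}{2}\, \lambda\, \lVert w \rVert^2
```

Here γ and λ are among the hyperparameters that a search procedure such as PSO would typically tune.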
4

Liu, Jialin, Jinfa Wu, Siru Liu, Mengdie Li, Kunchang Hu, and Ke Li. "Predicting mortality of patients with acute kidney injury in the ICU using XGBoost model." PLOS ONE 16, no. 2 (February 4, 2021): e0246306. http://dx.doi.org/10.1371/journal.pone.0246306.

Abstract:
Purpose: The goal of this study is to construct a mortality prediction model using the XGBoost (eXtreme Gradient Boosting) decision tree model for AKI (acute kidney injury) patients in the ICU (intensive care unit), and to compare its performance with that of three other machine learning models. Methods: We used the eICU Collaborative Research Database (eICU-CRD) for model development and performance comparison. The prediction performance of the XGBoost model was compared with that of three other machine learning models: LR (logistic regression), SVM (support vector machines), and RF (random forest). In the model comparison, the AUROC (area under the receiver operating characteristic curve), accuracy, precision, recall, and F1 score were used to evaluate the predictive performance of each model. Results: A total of 7548 AKI patients were analyzed in this study. The overall in-hospital mortality of AKI patients was 16.35%. The best performing algorithm in this study was XGBoost, with the highest AUROC (0.796, p < 0.01), F1 (0.922, p < 0.01), and accuracy (0.860). The precision (0.860) and recall (0.994) of the XGBoost model ranked second among the four models. Conclusion: The XGBoost model showed clear performance advantages over the other machine learning models. This will be helpful for risk identification and early intervention for AKI patients at risk of death.
5

Ji, Shouwen, Xiaojing Wang, Wenpeng Zhao, and Dong Guo. "An Application of a Three-Stage XGBoost-Based Model to Sales Forecasting of a Cross-Border E-Commerce Enterprise." Mathematical Problems in Engineering 2019 (September 16, 2019): 1–15. http://dx.doi.org/10.1155/2019/8503252.

Abstract:
Sales forecasting is even more vital for supply chain management in e-commerce, where a huge amount of transaction data is generated every minute. In order to enhance the logistics service experience of customers and optimize inventory management, e-commerce enterprises focus more on improving the accuracy of sales prediction with machine learning algorithms. In this study, a C-A-XGBoost forecasting model based on the XGBoost model is proposed, taking the sales features of commodities and the tendency of the data series into account. First, a C-XGBoost model is established to forecast each of the clusters produced by a two-step clustering algorithm, incorporating sales features into the C-XGBoost model as influencing factors of forecasting. Second, an A-XGBoost model is used to forecast the tendency, with the ARIMA model for the linear part and the XGBoost model for the nonlinear part. The final results are obtained by assigning weights to the forecasting results of the C-XGBoost and A-XGBoost models and summing them. In a comparison with the ARIMA, XGBoost, C-XGBoost, and A-XGBoost models using data from the Jollychic cross-border e-commerce platform, the C-A-XGBoost model is shown to outperform the other four models.
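The combination scheme described in this abstract can be illustrated with a minimal Python sketch. It assumes the A-XGBoost forecast is the sum of a linear (ARIMA) component and a nonlinear (XGBoost) component, and that the final forecast is a weighted sum of the C-XGBoost and A-XGBoost outputs. The function names, the weights, and the numbers are hypothetical; the abstract does not say how the weights are chosen.

```python
def additive_hybrid(linear_part, nonlinear_part):
    """Sum a linear forecast and a nonlinear correction, element by element."""
    return [lin + nonlin for lin, nonlin in zip(linear_part, nonlinear_part)]


def combine_forecasts(c_forecast, a_forecast, w_c, w_a):
    """Weighted sum of two forecast series (weights are assumptions here)."""
    return [w_c * c + w_a * a for c, a in zip(c_forecast, a_forecast)]


# Hypothetical numbers, for illustration only.
arima_linear = [100.0, 110.0]      # stand-in for the ARIMA (linear) forecast
xgb_nonlinear = [5.0, -10.0]       # stand-in for the XGBoost (nonlinear) part
a_forecast = additive_hybrid(arima_linear, xgb_nonlinear)

c_forecast = [95.0, 120.0]         # stand-in for the cluster-based forecast
final = combine_forecasts(c_forecast, a_forecast, w_c=0.5, w_a=0.5)
print(a_forecast, final)
```

Equal weights are used only to keep the arithmetic easy to follow; in practice the weights would be fitted or chosen from validation error.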
6

Zhu, Yiming. "Stock Price Prediction based on LSTM and XGBoost Combination Model." Transactions on Computer Science and Intelligent Systems Research 1 (October 12, 2023): 94–109. http://dx.doi.org/10.62051/z6dere47.

Abstract:
In recent years, many machine learning and deep learning algorithms have been applied to stock prediction, providing a reference basis for stock trading, and LSTM neural network and XGBoost algorithm are two typical representatives, each with advantages and disadvantages in prediction. In view of this, we propose a combination model based on LSTM and XGBoost, which combines the advantages of LSTM in processing time series data and the ability of XGBoost to evaluate the importance of features. The combination model first selects feature variables with high importance through XGBoost, performs data dimensionality reduction, and then uses LSTM to make predictions. In order to verify the feasibility of the combination model, we built XGBoost, LSTM and LSTM-XGBoost models, and carried out experiments on three data sets of China Eastern Airlines, China Merchants Bank and Kweichow Moutai respectively. Finally, we concluded that the proposed LSTM-XGBoost model has good feasibility and universality in stock price prediction by comparing the accuracy of the predicted images and their performance in RMSE, RMAE, and MAPE indicators.
7

Xiong, Shuai, Zhixiang Liu, Chendi Min, Ying Shi, Shuangxia Zhang, and Weijun Liu. "Compressive Strength Prediction of Cemented Backfill Containing Phosphate Tailings Using Extreme Gradient Boosting Optimized by Whale Optimization Algorithm." Materials 16, no. 1 (December 28, 2022): 308. http://dx.doi.org/10.3390/ma16010308.

Abstract:
Unconfined compressive strength (UCS) is the most significant mechanical index for cemented backfill, and it is mainly determined by traditional mechanical tests. This study optimized the extreme gradient boosting (XGBoost) model by utilizing the whale optimization algorithm (WOA) to construct a hybrid model for the UCS prediction of cemented backfill. The PT proportion, the OPC proportion, the FA proportion, the solid concentration, and the curing age were selected as input variables, and the UCS of the cemented PT backfill was selected as the output variable. The original XGBoost model, the XGBoost model optimized by particle swarm optimization (PSO-XGBoost), and the decision tree (DT) model were also constructed for comparison with the WOA-XGBoost model. The results showed that the values of the root mean square error (RMSE), coefficient of determination (R2), and mean absolute error (MAE) obtained from the WOA-XGBoost model, XGBoost model, PSO-XGBoost model, and DT model were equal to (0.241, 0.967, 0.184), (0.426, 0.917, 0.336), (0.316, 0.943, 0.258), and (0.464, 0.852, 0.357), respectively. The results show that the proposed WOA-XGBoost has better prediction accuracy than the other machine learning models, confirming the ability of the WOA to enhance XGBoost in cemented PT backfill strength prediction. The WOA-XGBoost model could be a fast and accurate method for the UCS prediction of cemented PT backfill.
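The three evaluation metrics reported in this abstract (RMSE, R2, and MAE) follow standard definitions, which can be sketched in plain Python as below. The sample values are made up for illustration and are not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measured and predicted strengths.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(rmse(y_true, y_pred), r2(y_true, y_pred), mae(y_true, y_pred))
```

Lower RMSE and MAE and higher R2 indicate a better fit, which is how the four models in the abstract are ranked.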
8

Wang, Yu, Li Guo, Yanrui Zhang, and Xinyue Ma. "Research on CSI 300 Stock Index Price Prediction Based On EMD-XGBoost." Frontiers in Computing and Intelligent Systems 3, no. 1 (March 17, 2023): 72–77. http://dx.doi.org/10.54097/fcis.v3i1.6027.

Abstract:
The combination of artificial intelligence techniques and quantitative investment has given birth to various types of price prediction models based on machine learning algorithms. In this study, we verify the applicability of machine learning fused with statistical methods through the EMD-XGBoost model for stock price prediction. In the modeling process, specific solutions are proposed for the overfitting problems that arise. The stock prediction model of machine learning fused with statistical learning was constructed from an empirical perspective, and an XGBoost algorithm model based on empirical mode decomposition was proposed. The data set selected for the experiment was the closing price of the CSI 300 index, and the model was judged by four indicators: mean absolute error, mean error, and root mean square error, etc. The method used for the experiment was the EMD-XGBoost network model, which had the following advantages: first, combining the empirical mode decomposition method with the XGBoost model is conducive to mining the time series data for Second, the decomposition of the CSI 300 index data by the empirical mode decomposition method helps to improve the accuracy of the XGBoost model for time series data prediction. The experiments show that the EMD-XGBoost model outperforms the single ARIMA or LSTM network model as well as the EMD-LSTM network model in terms of mean absolute error, mean error, and root mean square error.
9

Harriz, Muhammad Alfathan, Nurhaliza Vania Akbariani, Harlis Setiyowati, and Handri Santoso. "Enhancing the Efficiency of Jakarta's Mass Rapid Transit System with XGBoost Algorithm for Passenger Prediction." Jambura Journal of Informatics 5, no. 1 (April 27, 2023): 1–6. http://dx.doi.org/10.37905/jji.v5i1.18814.

Abstract:
This study is based on a machine learning algorithm known as XGBoost. We used the XGBoost algorithm to forecast the capacity of Jakarta's mass transit system. Using preprocessed raw data obtained from the Jakarta Open Data website for the period 2020-2021 as a training medium, we achieved a mean absolute percentage error of 69. However, after the model was fine-tuned, the MAPE was significantly reduced by 28.99% to 49.97. The XGBoost algorithm was found to be effective in detecting patterns and trends in the data, which can be used to improve routes and plan future studies by providing valuable insights. It is possible that additional data points, such as holidays and weather conditions, will further enhance the accuracy of the model in future research. As a result of implementing XGBoost, Jakarta's transportation system can optimize resource utilization and improve customer service in order to improve passenger satisfaction. Future studies may benefit from additional data points, such as holidays and weather conditions, in order to improve XGBoost's efficiency.
10

Siringoringo, Rimbun, Resianta Perangin-angin, and Jamaluddin Jamaluddin. "MODEL HIBRID GENETIC-XGBOOST DAN PRINCIPAL COMPONENT ANALYSIS PADA SEGMENTASI DAN PERAMALAN PASAR." METHOMIKA Jurnal Manajemen Informatika dan Komputerisasi Akuntansi 5, no. 2 (October 31, 2021): 97–103. http://dx.doi.org/10.46880/jmika.vol5no2.pp97-103.

Abstract:
Extreme Gradient Boosting (XGBoost) is a popular boosting algorithm based on decision trees. XGBoost is the best in the boosting group and has excellent convergence. On the other hand, XGBoost is a heavily hyperparameterized model. Determining the value of each parameter is difficult, and the results obtained can become trapped in a local optimum. Determining the value of each parameter manually also takes a lot of time. In this study, a Genetic Algorithm (GA) is applied to find the optimal values of the XGBoost hyperparameters for the market segmentation problem. The evaluation of the model is based on the ROC curve. The ROC test results for the SVM, Logistic Regression, and Genetic-XGBoost models are 0.89, 0.98, and 0.99, respectively. The results show that the Genetic-XGBoost model can be applied to market segmentation and forecasting.

Dissertations / Theses on the topic "XGBOOST MODEL"

1

Matos, Sara Madeira. "Interpretable models of loss given default." Master's thesis, Instituto Superior de Economia e Gestão, 2021. http://hdl.handle.net/10400.5/20981.

Abstract:
Master's in Applied Econometrics and Forecasting
Credit risk management is an area where regulators expect banks to have transparent and auditable risk models, which would preclude the use of more accurate black-box models. Furthermore, the opaqueness of these models may hide unknown biases that may lead to unfair lending decisions. In this study, we show that banks do not have to sacrifice predictive accuracy at the cost of model transparency to be compliant with regulatory requirements. We illustrate this by showing that the predictions of credit losses given by a black-box model can be easily explained in terms of their inputs. Because black-box models fit the data better, banks should consider the determinants of credit losses suggested by these models in lending decisions and pricing of credit exposures.
2

Wigren, Richard, and Filip Cornell. "Marketing Mix Modelling: A comparative study of statistical models." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-160082.

Abstract:
Deciding on the optimal media advertising spend is a complex issue that many companies face today. With the rise of new ways to market products, the choices can appear infinite. One methodical way to do this is to use Marketing Mix Modelling (MMM), in which statistical modelling is used to attribute sales to media spending. However, many problems arise during the modelling. Modelling and mitigation of uncertainty, time-dependencies of sales, incorporation of expert information, and interpretation of models are all issues that need to be addressed. This thesis aims to investigate the effectiveness of eight different statistical and machine learning methods in terms of prediction accuracy and certainty, each one addressing one of the previously mentioned issues. It is concluded that while Shapley Value Regression has the highest certainty in terms of coefficient estimation, it sacrifices some prediction accuracy. The overall highest performing model is the Bayesian hierarchical model, achieving both high prediction accuracy and high certainty.
3

Pettersson, Gustav, and John Almqvist. "Lavinprognoser och maskininlärning : Att prediktera lavinprognoser med maskininlärning och väderdata." Thesis, Uppsala universitet, Institutionen för informatik och media, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-387205.

Abstract:
This research project examines the feasibility of using machine learning to predict avalanche danger by using XGBoost and openly available weather data. Avalanche forecasts and meteorological modelled weather data have been gathered for the six areas in Sweden where Naturvårdsverket issues avalanche forecasts through lavinprognoser.se. The avalanche forecasts are collected from lavinprognoser.se, and the modelled weather data is collected from the MESAN model, which is produced and provided by the Swedish Meteorological and Hydrological Institute. 40 machine learning models, in the form of XGBoost, have been trained on this data set, with the goal of assessing the main aspects of an avalanche forecast and the overall avalanche danger. The results show it is possible to predict the day-to-day avalanche danger for the 2018/19 season in Södra Jämtlandsfjällen with an accuracy of 71% and a Mean Average Error of 0.256 (the Swedish-language abstract reports 0.295), by applying machine learning to the weather data for that region. The contribution of XGBoost in this context is demonstrated by applying the simpler method of logistic regression to the data set and comparing the results. The logistic regression performs worse, with an accuracy of 56% and a Mean Average Error of 0.459. The contribution of this research is a proof of concept, showing feasibility in predicting avalanche danger in Sweden with the help of machine learning and weather data.
4

Karlsson, Henrik. "Uplift Modeling : Identifying Optimal Treatment Group Allocation and Whom to Contact to Maximize Return on Investment." Thesis, Linköpings universitet, Statistik och maskininlärning, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-157962.

Abstract:
This report investigates the possibilities of modelling the causal effect of treatment within the insurance domain to increase the return on investment of sales through telemarketing. In order to capture the causal effect, two or more subgroups are required, where one group receives control treatment. Two different uplift models are used to model the causal effect of treatment: the Class Transformation Method and Modeling Uplift Directly with Random Forests. Both methods are evaluated by the Qini curve and the Qini coefficient. To model the causal effect of treatment, comparison with a control group is a necessity. The report attempts to find the optimal treatment group allocation in order to maximize the precision in the difference between the treatment group and the control group. Further, the report provides a rule of thumb that ensures that the control group is of sufficient size to be able to model the causal effect. If has provided the data material used to model uplift, and it consists of approximately 630,000 customer interactions and 60 features. The total uplift in the data set, the difference in purchase rate between the treatment group and control group, is approximately 3%. Uplift by random forest with a Euclidean distance splitting criterion, which tries to maximize the distributional divergence between treatment group and control group, performs best and captures 15% of the theoretical best model. The same model manages to capture 77% of the total amount of purchases in the treatment group by only giving treatment to half of the treatment group. With the purchase rates in the data set, the optimal treatment group allocation is approximately 58%-70%, but the study could be performed with as much as approximately 97% treatment group allocation.
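Two ideas from this abstract can be illustrated with a short Python sketch: the total uplift, defined in the abstract as the difference in purchase rate between the treatment and control groups, and the label used by the class transformation method as it is commonly defined in the uplift literature (the transformed label is 1 for treated purchasers and for untreated non-purchasers). The exact variant used in the thesis is not stated in the abstract, and the data below are made up.

```python
def purchase_rate(outcomes):
    """Fraction of customers who purchased (outcomes are 0/1 flags)."""
    return sum(outcomes) / len(outcomes)


def total_uplift(treated_outcomes, control_outcomes):
    """Difference in purchase rate between treatment and control groups."""
    return purchase_rate(treated_outcomes) - purchase_rate(control_outcomes)


def class_transform(treated, purchased):
    """Transformed label Z: 1 if (treated and purchased) or
    (not treated and not purchased), else 0."""
    return 1 if treated == purchased else 0


# Hypothetical 0/1 purchase flags, for illustration only.
treated_outcomes = [1, 0, 0, 1, 0]
control_outcomes = [0, 0, 1, 0, 0]
uplift = total_uplift(treated_outcomes, control_outcomes)
print(uplift)
```

A single classifier trained on the transformed label can then be used to rank customers by estimated uplift, which is what the Qini curve evaluates.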
5

Henriksson, Erik, and Kristopher Werlinder. "Housing Price Prediction over Countrywide Data : A comparison of XGBoost and Random Forest regressor models." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302535.

Abstract:
The aim of this research project is to investigate how an XGBoost regressor compares to a Random Forest regressor in terms of predictive performance on housing prices, using two data sets. The comparison considers training time, inference time, and the three evaluation metrics R2, RMSE, and MAPE. The data sets are described in detail, together with background on the regressor models that are used. The method involves substantial cleaning of the two data sets, hyperparameter tuning to find optimal parameters, and 5-fold cross-validation in order to achieve good performance estimates. The finding of this research project is that XGBoost performs better on both small and large data sets. While the Random Forest model can achieve results similar to those of the XGBoost model, it needs a much longer training time, between 2 and 50 times as long, and has a longer inference time, around 40 times as long. This makes XGBoost especially superior when used on larger sets of data.
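The 5-fold cross-validation mentioned in this abstract can be sketched in plain Python as an index-splitting routine. This is a generic illustration, not the authors' code: the function name is hypothetical, the folds are contiguous, and no shuffling is applied, whereas a real experiment would usually shuffle the data first.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds.

    Each sample index appears in exactly one test fold.
    """
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


folds = list(k_fold_indices(10, k=5))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Each model is trained on the training indices of a fold and scored on its test indices, and the k scores are averaged to give the performance estimate.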
6

Kinnander, Mathias. "Predicting profitability of new customers using gradient boosting tree models : Evaluating the predictive capabilities of the XGBoost, LightGBM and CatBoost algorithms." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-19171.

Abstract:
In the context of providing credit online to customers in retail shops, the provider must perform risk assessments quickly and often based on scarce historical data. This can be achieved by automating the process with Machine Learning algorithms. Gradient Boosting Tree algorithms have proven capable in a wide range of application scenarios. However, they are yet to be implemented for predicting the profitability of new customers based solely on the customers' first purchases. This study aims to evaluate the predictive performance of the XGBoost, LightGBM, and CatBoost algorithms in this context. The Recall and Precision metrics were used as the basis for assessing the models' performance. The experiment implemented for this study shows that the models display similar capabilities while also being biased towards the majority class.
7

Svensson, William. "CAN STATISTICAL MODELS BEAT BENCHMARK PREDICTIONS BASED ON RANKINGS IN TENNIS?" Thesis, Uppsala universitet, Statistiska institutionen, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-447384.

Abstract:
The aim of this thesis is to beat a benchmark prediction accuracy of 64.58 percent based on player rankings on the ATP tour in tennis. Under this benchmark, the player with the better rank in a tennis match is predicted to be the winner. Three statistical models are used: logistic regression, random forest, and XGBoost. The data cover the years 2000-2010 and comprise over 60,000 observations with 49 variables each. After the data were prepared, new variables were created, and the difference between the two players in each match was taken, all three statistical models outperformed the benchmark prediction. All three models had an accuracy of around 66 percent, with logistic regression performing best at 66.45 percent. The most important variables overall for the models are the total win rate on different surfaces, the total win rate, and rank.
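The ranking-based benchmark described in this abstract, in which the better-ranked player is predicted to win, can be sketched as follows. The function names and the match records are hypothetical and serve only to show how such a benchmark accuracy is computed.

```python
def predict_by_rank(rank_a, rank_b):
    """Predict 'a' if player A has the better (numerically lower) rank, else 'b'."""
    return "a" if rank_a < rank_b else "b"


def benchmark_accuracy(matches):
    """Share of matches in which the better-ranked player actually won.

    Each match is a tuple (rank_a, rank_b, winner) with winner 'a' or 'b'.
    """
    correct = sum(
        1 for rank_a, rank_b, winner in matches
        if predict_by_rank(rank_a, rank_b) == winner
    )
    return correct / len(matches)


# Hypothetical matches: (rank of A, rank of B, actual winner).
matches = [(1, 20, "a"), (15, 3, "b"), (8, 40, "b"), (50, 12, "b")]
accuracy = benchmark_accuracy(matches)
print(accuracy)
```

Any statistical model is then judged by whether its accuracy on the same matches exceeds this baseline.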
8

Liu, Xiaoyang. "Machine Learning Models in Fullerene/Metallofullerene Chromatography Studies." Thesis, Virginia Tech, 2019. http://hdl.handle.net/10919/93737.

Abstract:
Machine learning methods are now extensively applied in various scientific research areas to build models. Unlike regular models, machine learning based models use a data-driven approach. Machine learning algorithms can learn knowledge from available data that is otherwise hard to recognize. Data-driven approaches enhance the role of algorithms and computers and accelerate computation by offering alternative views. In this thesis, we explore the possibility of applying machine learning models to the prediction of chromatographic retention behaviors. Chromatographic separation is a key technique for the discovery and analysis of fullerenes. In previous studies, differential equation models have achieved great success in predicting chromatographic retention. However, most differential equation models require experimental measurements or theoretical computations for many parameters, which are not easy to obtain. Fullerenes/metallofullerenes are rigid, spherical molecules with only carbon atoms, which makes predicting their chromatographic retention behaviors and other properties much simpler than for flexible molecules with more conformational variation. In this thesis, I propose that the polarizability of a fullerene molecule can be estimated directly from its structure. Structural motifs are used to simplify the model, and the models with motifs provide satisfying predictions. The data set contains 31947 isomers and their polarizability data and is split into a training set with 90% of the data points and a complementary testing set. In addition, a second testing set of large fullerene isomers is prepared and used to test whether a model trained on small fullerenes gives accurate predictions for large fullerenes.
Machine learning models can be applied in a wide range of areas, such as scientific research. In this thesis, machine learning models are applied to predict the chromatography behaviors of fullerenes based on their molecular structures. Chromatography is a common technique for separating mixtures, and the separation results from differences in the interactions between molecules and a stationary phase. In real experiments, a mixture usually contains a large family of different compounds, and it requires a lot of work and resources to identify the target compound. Therefore, models are extremely important for studies of chromatography. Traditional models are built on physical rules and involve several parameters. The physical parameters are measured by experiment or computed theoretically. However, both approaches are time consuming and not easy to carry out. For fullerenes, my previous studies have shown that the chromatography model can be simplified so that only one parameter, polarizability, is required. A machine learning approach is introduced to enhance the model by predicting the molecular polarizabilities of fullerenes based on their structures. The structure of a fullerene is represented by several local structures. Several types of machine learning models are built and tested on our data set, and the results show that the neural network gives the best predictions.
9

Sharma, Vibhor. "Early Stratification of Gestational Diabetes Mellitus (GDM) by building and evaluating machine learning models." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-281398.

Abstract:
Gestational diabetes mellitus (GDM), a condition involving abnormal levels of glucose in the blood plasma, has seen a rapid surge among gestating mothers from different regions and ethnicities around the world. The current method of screening and diagnosing GDM is restricted to the Oral Glucose Tolerance Test (OGTT). With the advent of machine learning algorithms, healthcare has seen a surge of machine learning methods for disease diagnosis, which are increasingly being employed in clinical settings. Yet in the area of GDM, these algorithms have not been widely used to generate multi-parametric diagnostic models that could aid clinicians in diagnosing the condition. In the literature, there is an evident scarcity of applications of machine learning algorithms to GDM diagnosis. It has been limited to the proposed use of some very simple algorithms such as logistic regression. Hence, we have attempted to address this research gap by employing a wide array of machine learning algorithms, known to be effective for binary classification, for early GDM classification among gestating mothers. This can aid clinicians in the early diagnosis of GDM and offers a chance to mitigate the adverse outcomes related to GDM among gestating mothers and their progeny. We set up an empirical study to examine the performance of different machine learning algorithms used specifically for the task of GDM classification. These algorithms were trained on a set of predictor variables chosen by experts. We then compared the results with the existing machine learning methods in the literature for GDM classification, based on a set of performance metrics. Our model couldn’t outperform the already proposed machine learning models for GDM classification. We attribute this to our chosen set of predictor variables and to the under-reporting of various performance metrics, such as precision, in the existing literature, which leads to a lack of informed comparison.
10

Gregório, Rafael Leite. "Modelo híbrido de avaliação de risco de crédito para corporações brasileiras com base em algoritmos de aprendizado de máquina." Universidade Católica de Brasília, 2018. https://bdtd.ucb.br:8443/jspui/handle/tede/2432.

Abstract:
Credit risk assessment plays a relevant role for financial institutions because it is associated with possible losses and has a large impact on balance sheets. Although there are several studies on applications of machine learning and finance models, a study that integrates the available knowledge about credit risk assessment is still lacking. This work aims to specify a machine learning model of the probability of default of publicly traded companies in the Bovespa Index (corporations) and, from the model's estimates, to obtain a risk assessment metric based on letter ratings. We combined methodologies found in the literature and estimated models comprising fundamental (balance sheet) and corporate governance data, macroeconomic variables, and variables resulting from the application of the proprietary KMV credit risk assessment model. We tested the XGBoost and LinearSVM algorithms, which have very different characteristics but are potentially useful for the problem. Parameter grids were searched to identify the most representative variables and to specify the best-performing model. The model selected was XGBoost, and its performance was very similar to the results obtained for the North American stock market in analogous research. The estimated credit ratings appear to be more sensitive to the economic and financial situation of the companies than those issued by traditional rating agencies.

Books on the topic "XGBOOST MODEL"

1

Nokeri, Tshepo Chris. Data Science Solutions with Python: Fast and Scalable Models Using Keras, Pyspark MLlib, H2O, XGBoost, and Scikit-Learn. Apress L. P., 2022.


Book chapters on the topic "XGBOOST MODEL"

1

Saadat, Sumaya, and V. Joseph Raymond. "Malware Classification Using CNN-XGBoost Model." In Artificial Intelligence Techniques for Advanced Computing Applications, 191–202. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-5329-5_19.

2

Venkat, Karthik, Tarika Gautam, Mohit Yadav, and Mukhtiar Singh. "An XGBoost Ensemble Model for Residential Load Forecasting." In Advances in Intelligent Systems and Computing, 321–34. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-15-8443-5_26.

3

Zhong, Weijian, Xiaoqin Lian, Chao Gao, Xiang Chen, and Hongzhou Tan. "PM2.5 Concentration Prediction Based on mRMR-XGBoost Model." In Machine Learning and Intelligent Communications, 327–36. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-04409-0_30.

4

Ren, Xudie, Haonan Guo, Shenghong Li, Shilin Wang, and Jianhua Li. "A Novel Image Classification Method with CNN-XGBoost Model." In Digital Forensics and Watermarking, 378–90. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-64185-0_28.

5

Chang, Wen-Chih, Yi-Hong Guo, Ya-Ling Yang, Ming-Chien Hsu, Yi-Hsuan Chu, Ting-Yi Chu, and Long-Cheng Meng. "Using the XGBoost Model to Predict Santander Customer Trading." In Lecture Notes in Electrical Engineering, 115–24. Singapore: Springer Singapore, 2021. http://dx.doi.org/10.1007/978-981-16-0115-6_11.

6

Ye, Lu. "Credit Rating of Chinese Companies Based on XGBoost Model." In New Perspectives and Paradigms in Applied Economics and Business, 99–111. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-23844-4_8.

7

Zolotareva, Ekaterina. "Aiding Long-Term Investment Decisions with XGBoost Machine Learning Model." In Artificial Intelligence and Soft Computing, 414–27. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-87897-9_37.

9

Petrovic, Aleksandar, Milos Antonijevic, Ivana Strumberger, Nebojsa Budimirovic, Nikola Savanovic, and Stefana Janicijevic. "Intrusion Detection by XGBoost Model Tuned by Improved Multi-verse Optimizer." In Proceedings of the 1st International Conference on Innovation in Information Technology and Business (ICIITB 2022), 203–18. Dordrecht: Atlantis Press International BV, 2023. http://dx.doi.org/10.2991/978-94-6463-110-4_15.

10

Yu, Sun, Liwei Tian, Yijun Liu, and Yuankai Guo. "LSTM-XGBoost Application of the Model to the Prediction of Stock Price." In Lecture Notes in Computer Science, 86–98. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-78609-0_8.


Conference papers on the topic "XGBOOST MODEL"

1

Zhaoweijie, Chenliang, and Hujiangmin. "Forecast Rossmann Store Sales Base on Xgboost Model." In 2020 2nd International Conference on Economic Management and Model Engineering (ICEMME). IEEE, 2020. http://dx.doi.org/10.1109/icemme51517.2020.00110.

2

Zhang, Yixuan, Jialiang Tong, Ziyi Wang, and Fengqiang Gao. "Customer Transaction Fraud Detection Using Xgboost Model." In 2020 International Conference on Computer Engineering and Application (ICCEA). IEEE, 2020. http://dx.doi.org/10.1109/iccea50009.2020.00122.

3

Wan, Fang. "XGBoost Based Supply Chain Fraud Detection Model." In 2021 IEEE 2nd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE). IEEE, 2021. http://dx.doi.org/10.1109/icbaie52039.2021.9390041.

4

Siyuan, Liu, Liu Jingyuan, Gu Hangping, and Ren Minhua. "Sleep staging prediction model based on XGBoost." In 2021 International Conference on Electronic Information Engineering and Computer Science (EIECS). IEEE, 2021. http://dx.doi.org/10.1109/eiecs53707.2021.9587974.

5

Liu, Feng, Xiaowei Liu, and Hao Yan. "Driving Style Identification Model based on XGBoost." In AIAM2021: 2021 3rd International Conference on Artificial Intelligence and Advanced Manufacture. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3495018.3495033.

6

Xiao, Bei, Peng-Cheng Luo, Zhi-Jun Cheng, Xiao-Nan Zhang, and Xin-Wu Hu. "Systematic Combat Effectiveness Evaluation Model Based on Xgboost." In 2018 12th International Conference on Reliability, Maintainability, and Safety (ICRMS). IEEE, 2018. http://dx.doi.org/10.1109/icrms.2018.00033.

7

Ba Alawi, Abdulfattah E., Ferhat Bozkurt, and Faruk Baturalp. "Xgboost-Based Multi-Steps Cybersecurity Attacks Detection Model." In 2023 3rd International Conference on Emerging Smart Technologies and Applications (eSmarTA). IEEE, 2023. http://dx.doi.org/10.1109/esmarta59349.2023.10293597.

8

Zhang, Yibin, Chunyan Shao, and Chen Zou. "Prediction of Customers’ Behaviors Based on XGBoost Model." In 2023 2nd International Conference on Data Analytics, Computing and Artificial Intelligence (ICDACAI). IEEE, 2023. http://dx.doi.org/10.1109/icdacai59742.2023.00076.

9

Ribeiro, Matheus Henrique Dal Molin, Ramon Gomes Silva, Viviana Cocco Mariani, and Leandro dos Santos Coelho. "Dengue Cases Forecasting Based on eXtreme Gradient Boosting Ensemble with Coyote Optimization." In Congresso Brasileiro de Inteligência Computacional. SBIC, 2021. http://dx.doi.org/10.21528/cbic2021-36.

Abstract:
Dengue is considered a public health problem in tropical regions, periodically affecting an increasing number of citizens. Consequently, the development of efficient models is essential for short- and long-term forecasting, supporting health care officials in optimally distributing available resources in dengue-prone areas. Hybridization of two or more models is a common solution to this problem, where one can take advantage of diversity among models to reduce both the bias and the variance of the prediction error obtained using single models. Ensemble approaches are therefore attractive. In this paper, we propose a novel ensemble learning approach combining eXtreme Gradient Boosting (XGBoost) and the Coyote Optimization Algorithm (COA) to capture the nonlinearity in a dataset and forecast dengue cases. The performance of the XGBoost model depends on the appropriate choice of its hyperparameters. In this study, COA has been employed to tune the XGBoost hyperparameters. The proposed hybrid COA-XGBoost model is applied to a dengue time-series dataset from Parana, Brazil. Averages of precipitation, temperature, thermal amplitude, relative humidity, and previous dengue cases are used as input variables, and dengue cases are the output variable. The performance of the proposed COA-XGBoost model has been compared with XGBoost when hyperparameters are obtained using other optimization techniques such as Differential Evolution, Genetic Algorithm, Cuckoo Search Optimization, Grey Wolf Optimizer, and Firefly Algorithm. The results indicate that the proposed COA-XGBoost can be a competitive model when compared to other classical techniques.
10

Ma, Tao, Yusen Zhang, Xiangxin Nie, Xinchao Zhao, and Yexing Li. "An XGBoost-based Electric Vehicle Battery Consumption Prediction Model." In 2021 IEEE International Conference on Power, Intelligent Computing and Systems (ICPICS). IEEE, 2021. http://dx.doi.org/10.1109/icpics52425.2021.9524291.
