Journal articles on the topic 'Online ensemble regression'

Consult the top 50 journal articles for your research on the topic 'Online ensemble regression.'

1

Liu, Yang, Bo He, Diya Dong, Yue Shen, Tianhong Yan, Rui Nian, and Amaury Lendasse. "Particle Swarm Optimization Based Selective Ensemble of Online Sequential Extreme Learning Machine." Mathematical Problems in Engineering 2015 (2015): 1–10. http://dx.doi.org/10.1155/2015/504120.

Abstract:
A novel particle swarm optimization based selective ensemble (PSOSEN) of online sequential extreme learning machine (OS-ELM) is proposed. It is based on the original OS-ELM with an adaptive selective ensemble framework. Two novel insights are proposed in this paper. First, a novel selective ensemble algorithm referred to as particle swarm optimization selective ensemble is proposed; PSOSEN is a general selective ensemble method that is applicable to any learning algorithm, including batch learning and online learning. Second, an adaptive selective ensemble framework for online learning is designed to balance the accuracy and speed of the algorithm. Experiments for both regression and classification problems with UCI data sets are carried out. Comparisons between OS-ELM, simple ensemble OS-ELM (EOS-ELM), genetic algorithm based selective ensemble (GASEN) of OS-ELM, and the proposed particle swarm optimization based selective ensemble of OS-ELM empirically show that the proposed algorithm achieves good generalization performance and fast learning speed.
2

Rahmawati, Eka, and Candra Agustina. "Implementasi Teknik Bagging untuk Peningkatan Kinerja J48 dan Logistic Regression dalam Prediksi Minat Pembelian Online." Jurnal Teknologi Informasi dan Terapan 7, no. 1 (June 9, 2020): 16–19. http://dx.doi.org/10.25047/jtit.v7i1.123.

Abstract:
The rapid growth of online shopping sites makes business in the virtual world very promising. Purchase intention is one of the keys to success for an online store. There are several data mining methods for making predictions on online purchase intention datasets. The data can represent the characteristics or habits of each user who has visited a site, whether or not the visit ends with a transaction. Popular algorithms with good performance in data mining include J48 and Logistic Regression. However, data sometimes has a class imbalance problem, so an ensemble technique needs to be applied. One technique that can be applied is bagging. This research tests data with the bagging technique to improve the performance of the J48 algorithm and Logistic Regression. With bagging, accuracy reaches 89.68% for the J48 algorithm and 88.50% for the Logistic Regression algorithm. These figures show an increase compared with initial testing without ensemble techniques. Recall, F-Measure, and AUC values also increased. Keywords: purchasing intentions; J48; Logistic Regression; Bagging
3

Hansrajh, Arvin, Timothy T. Adeliyi, and Jeanette Wing. "Detection of Online Fake News Using Blending Ensemble Learning." Scientific Programming 2021 (July 28, 2021): 1–10. http://dx.doi.org/10.1155/2021/3434458.

Abstract:
The exponential growth in fake news and its inherent threat to democracy, public trust, and justice has escalated the necessity for fake news detection and mitigation. Detecting fake news is a complex challenge as it is intentionally written to mislead and hoodwink. Humans are not good at identifying fake news. The detection of fake news by humans is reported to be at a rate of 54% and an additional 4% is reported in the literature as being speculative. The significance of fighting fake news is exemplified during the present pandemic. Consequently, social networks are ramping up the usage of detection tools and educating the public in recognising fake news. In the literature, it was observed that several machine learning algorithms have been applied to the detection of fake news with limited and mixed success. However, several advanced machine learning models are not being applied, although recent studies are demonstrating the efficacy of the ensemble machine learning approach; hence, the purpose of this study is to assist in the automated detection of fake news. An ensemble approach is adopted to help resolve the identified gap. This study proposed a blended machine learning ensemble model developed from logistic regression, support vector machine, linear discriminant analysis, stochastic gradient descent, and ridge regression, which is then used on a publicly available dataset to predict if a news report is true or not. The proposed model will be appraised with the popular classical machine learning models, while performance metrics such as AUC, ROC, recall, accuracy, precision, and f1-score will be used to measure the performance of the proposed model. Results presented showed that the proposed model outperformed other popular classical machine learning models.
4

Zhang, Junbo, C. Y. Chung, and Lin Guan. "Noise Effect and Noise-Assisted Ensemble Regression in Power System Online Sensitivity Identification." IEEE Transactions on Industrial Informatics 13, no. 5 (October 2017): 2302–10. http://dx.doi.org/10.1109/tii.2017.2671351.

5

Azeez, Nureni Ayofe, and Emad Fadhal. "Classification of Virtual Harassment on Social Networks Using Ensemble Learning Techniques." Applied Sciences 13, no. 7 (April 4, 2023): 4570. http://dx.doi.org/10.3390/app13074570.

Abstract:
Background: Internet social media platforms have become quite popular, enabling a wide range of online users to stay in touch with their friends and relatives wherever they are at any time. This has led to a significant increase in virtual crime from the inception of these platforms to the present day. Users are harassed online when confidential information about them is stolen, or when another user posts insulting or offensive comments about them. This has posed a significant threat to online social media users, both mentally and psychologically. Methods: This research compares traditional classifiers and ensemble learning in classifying virtual harassment in online social media networks. It applies seven machine learning algorithms (Naive Bayes (NB), Decision Tree (DT), K-Nearest Neighbor (KNN), Logistic Regression (LR), Neural Network (NN), Quadratic Discriminant Analysis (QDA), and Support Vector Machine (SVM)) and four ensemble learning models (Ada Boosting, Gradient Boosting, Random Forest, and Max Voting) to four different datasets. The results were compared using twelve evaluation metrics: Accuracy, Precision, Recall, F1-measure, Specificity, Matthews Correlation Coefficient (MCC), Cohen's Kappa Coefficient (KAPPA), Area Under Curve (AUC), False Discovery Rate (FDR), False Negative Rate (FNR), False Positive Rate (FPR), and Negative Predictive Value (NPV). Results: For dataset 1, Logistic Regression had the highest accuracy among the machine learning algorithms (0.6923), while the Max Voting ensemble had the highest accuracy among the ensembles (0.7047). For dataset 2, K-Nearest Neighbor, Support Vector Machine, and Logistic Regression all shared the highest accuracy of 0.8769 among the machine learning algorithms, while the Random Forest and Gradient Boosting ensembles both had the highest accuracy of 0.8779. For dataset 3, the Support Vector Machine had the highest accuracy of 0.9243 among the machine learning algorithms, while the Random Forest ensemble had the highest accuracy of 0.9258. For dataset 4, the Support Vector Machine and Logistic Regression both had 0.8383, while the Max Voting ensemble obtained an accuracy of 0.8280. A bar chart was used to represent the results, showing the minimum, maximum, and quartile ranges. Conclusions: This technique has helped considerably in comparing the selected machine learning algorithms and ensembles for detecting and exposing various forms of cyber harassment in cyberspace. Finally, the best and weakest algorithms were revealed.
6

Bodyanskiy, Ye V., Kh V. Lipianina-Honcharenko, and A. O. Sachenko. "ENSEMBLE OF ADAPTIVE PREDICTORS FOR MULTIVARIATE NONSTATIONARY SEQUENCES AND ITS ONLINE LEARNING." Radio Electronics, Computer Science, Control, no. 4 (January 2, 2024): 91. http://dx.doi.org/10.15588/1607-3274-2023-4-9.

Abstract:
Context. In this research, we explore an ensemble of metamodels that utilizes multivariate signals to generate forecasts. The ensemble includes various traditional forecasting models such as multivariate regression, exponential smoothing, and ARIMAX, as well as nonlinear structures based on artificial neural networks, ranging from simple feedforward networks to deep architectures like LSTM and transformers. Objective. The goal of this research is to develop an effective method for combining forecasts from multiple models, forming metamodels, to create a unified forecast that surpasses the accuracy of individual models. We aim to investigate the effectiveness of the proposed ensemble in forecasting tasks with nonstationary signals. Method. The proposed ensemble of metamodels employs the method of Lagrange multipliers to estimate the parameters of the metamodel. The Kuhn-Tucker system of equations is solved to obtain unbiased estimates using the least squares method. Additionally, we introduce a recurrent form of the least squares algorithm for adaptive processing of nonstationary signals. Results. The proposed ensemble method is evaluated on a dataset of time series. Metamodels formed by combining various individual models demonstrate improved forecast accuracy compared to individual models. The approach shows effectiveness in capturing nonstationary patterns and enhancing overall forecasting accuracy. Conclusions. The ensemble of metamodels, which utilizes multivariate signals for forecast generation, offers a promising approach to achieve better forecasting accuracy. By combining diverse models, the ensemble exhibits robustness to nonstationarity and improves the reliability of forecasts.
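As a rough illustration of the Method step in this abstract, the sketch below combines several forecasts with weights found by least squares under a Lagrange-multiplier constraint, solving the resulting Kuhn-Tucker (KKT) system directly. The sum-to-one constraint on the weights, the function names `solve_linear` and `combine_weights`, and the toy data are assumptions made for illustration; the abstract does not state the exact constraint, and this is a batch sketch, not the authors' recurrent algorithm.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def combine_weights(forecasts, target):
    """Least-squares combination weights constrained to sum to one.

    forecasts: list of k forecast series (each a list of length T).
    target:    list of T observed values.
    Minimises ||y - F w||^2 subject to sum(w) = 1 (assumed constraint).
    Stationarity of the Lagrangian gives 2 A w + lam * 1 = 2 b,
    with A = F^T F and b = F^T y, plus the constraint row 1^T w = 1.
    """
    k = len(forecasts)
    a = [[sum(p * q for p, q in zip(forecasts[i], forecasts[j]))
          for j in range(k)] for i in range(k)]
    b = [sum(p * y for p, y in zip(forecasts[i], target)) for i in range(k)]
    kkt = [[2.0 * a[i][j] for j in range(k)] + [1.0] for i in range(k)]
    kkt.append([1.0] * k + [0.0])
    rhs = [2.0 * bi for bi in b] + [1.0]
    solution = solve_linear(kkt, rhs)  # last entry is the multiplier
    return solution[:k]


# Toy data (hypothetical): three forecast series and an observed series.
toy_forecasts = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 1.0, 4.0, 3.0, 6.0],
    [0.0, 1.0, 0.0, 2.0, 1.0],
]
toy_target = [1.0, 1.5, 3.0, 3.5, 4.0]
toy_weights = combine_weights(toy_forecasts, toy_target)
print(toy_weights, sum(toy_weights))
```

The combined forecast at each step is then the weighted sum of the individual forecasts. An adaptive variant for nonstationary signals would update the weights recursively as new observations arrive, which this batch sketch does not attempt.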
7

R, Chitra A., and Dr Arjun B. C. "Performance Analysis of Regression Algorithms for Used Car Price Prediction: KNIME Analytics Platform." International Journal for Research in Applied Science and Engineering Technology 11, no. 2 (February 28, 2023): 1324–31. http://dx.doi.org/10.22214/ijraset.2023.49180.

Abstract:
In recent years, people's willingness to buy used cars has increased, which is reflected in the buying and selling of such cars. With advances in technology, online portals for marketing used cars have emerged. Many online portals aim to match available used cars with user needs, present trends, and various selection criteria. Used car price prediction is performed using machine learning algorithms provided by the KNIME Analytics Platform: Linear Regression, Tree Ensemble (Regression), Random Forest (Regression), Gradient Boosted Tree (Regression), and Simple Regression Tree. The analysis shows that the Gradient Boosted Tree (Regression) prediction is closest to the target.
8

Setiawan, Yahya, Jondri Jondri, and Widi Astuti. "Twitter Sentiment Analysis on Online Transportation in Indonesia Using Ensemble Stacking." JURNAL MEDIA INFORMATIKA BUDIDARMA 6, no. 3 (July 25, 2022): 1452. http://dx.doi.org/10.30865/mib.v6i3.4359.

Abstract:
Online transportation is a transportation innovation that has emerged along with the development of online-based applications that provide many features and conveniences. Many users write their responses to these applications on social media such as Twitter, and many opinions are conveyed directly to the official accounts of online transportation providers. These responses are very numerous and can be used for sentiment analysis of online transportation. However, the analysis cannot be done manually, so a system is needed that can analyze user responses on Twitter automatically. In this study, a sentiment analysis system was built for online transportation in Indonesia using the ensemble stacking algorithm, which simplifies the sentiment analysis and increases its accuracy. Ensemble stacking is an advanced machine learning method that can improve on the performance of the base classifiers. The system uses three base classifiers: SVM with an RBF kernel, SVM with a linear kernel, and logistic regression. The best accuracy on the Gojek dataset is 88%, and the best F1 score is 87%. Applied to online transportation sentiment analysis on Twitter, ensemble stacking obtained better accuracy than the base classifiers used.
9

de Almeida, Ricardo, Yee Mey Goh, Radmehr Monfared, Maria Teresinha Arns Steiner, and Andrew West. "An ensemble based on neural networks with random weights for online data stream regression." Soft Computing 24, no. 13 (November 9, 2019): 9835–55. http://dx.doi.org/10.1007/s00500-019-04499-x.

Abstract:
Most information sources in the current technological world are generating data sequentially and rapidly, in the form of data streams. The evolving nature of processes may often cause changes in data distribution, also known as concept drift, which is difficult to detect and causes loss of accuracy in supervised learning algorithms. As a consequence, online machine learning algorithms that are able to update actively according to possible changes in the data distribution are required. Although many strategies have been developed to tackle this problem, most of them are designed for classification problems. Therefore, in the domain of regression problems, there is a need for the development of accurate algorithms with dynamic updating mechanisms that can operate in a computational time compatible with today’s demanding market. In this article, the authors propose a new bagging ensemble approach based on neural network with random weights for online data stream regression. The proposed method improves the data prediction accuracy as well as minimises the required computational time compared to a recent algorithm for online data stream regression from literature. The experiments are carried out using four synthetic datasets to evaluate the algorithm’s response to concept drift, along with four benchmark datasets from different industries. The results indicate improvement in data prediction accuracy, effectiveness in handling concept drift, and much faster updating times compared to the existing available approach. Additionally, the use of design of experiments as an effective tool for hyperparameter tuning is demonstrated.
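To make the idea of a bagging ensemble for online stream regression concrete, here is a minimal sketch that processes one sample at a time. It is not the authors' method: it swaps in simple linear base learners trained by stochastic gradient descent instead of neural networks with random weights, it assumes Poisson(1) resampling as the online stand-in for bootstrap sampling, and it has no drift handling. All class and function names are hypothetical.

```python
import math
import random


class OnlineLinearRegressor:
    """One-feature linear model updated by stochastic gradient descent."""

    def __init__(self, learning_rate=0.1):
        self.w = 0.0
        self.b = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.w * x + self.b

    def learn_one(self, x, y):
        error = y - self.predict(x)
        self.w += self.learning_rate * error * x
        self.b += self.learning_rate * error


def poisson(rng, lam=1.0):
    """Draw a Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


class OnlineBaggingRegressor:
    """Averages base learners that each see a sample Poisson(1) times."""

    def __init__(self, n_members=10, seed=0):
        self.rng = random.Random(seed)
        self.members = [OnlineLinearRegressor() for _ in range(n_members)]

    def predict(self, x):
        return sum(m.predict(x) for m in self.members) / len(self.members)

    def learn_one(self, x, y):
        # Each member trains on the sample k ~ Poisson(1) times, which
        # approximates drawing a bootstrap replicate of the stream.
        for member in self.members:
            for _ in range(poisson(self.rng)):
                member.learn_one(x, y)


# Toy stream (hypothetical): noise-free samples of y = 2x + 1.
stream_rng = random.Random(42)
model = OnlineBaggingRegressor(n_members=5, seed=1)
for _ in range(3000):
    x = stream_rng.uniform(-1.0, 1.0)
    model.learn_one(x, 2.0 * x + 1.0)
print(model.predict(0.5))
```

Because every update touches only the current sample, memory use stays constant as the stream grows. Reacting to concept drift would need an extra mechanism, such as replacing or reweighting poorly performing members.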
10

Kothapalli, Mandakini, et al. "Ensemble Learning for fraud detection in Online Payment System." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 10 (November 2, 2023): 1070–76. http://dx.doi.org/10.17762/ijritcc.v11i10.8626.

Abstract:
The imbalance problem in fraud detection systems refers to the unequal distribution of fraud and non-fraud cases in the data used to train machine learning models, which can make it difficult to detect fraudulent activity accurately. Instances of fraud generally occur much less frequently than other instances, which results in a very unbalanced dataset. This imbalance can present challenges for machine learning algorithms, as they may become biased towards the majority class (that is, non-fraud cases) and fail to detect fraud accurately. In such situations, machine learning models may have a high overall accuracy but a low recall for the minority class (i.e., fraud cases), which means that many instances of fraud will be misclassified and will not be found. In this study, the Synthetic Minority Over-sampling Technique (SMOTE) is used to balance the data set, and decision trees (DT), enhanced logistic regression (ELR), and Naive Bayes (NB) are used to classify the dataset. A majority voting mechanism is used to ensemble the DT, NB, and ELR methods and analyze the performance of the model. The ensemble outperformed the individual algorithms, with an accuracy of 98.62%, an F1 score of 95.21%, a precision of 98.02%, and a recall of 96.75%.
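The majority voting step described in this abstract can be sketched in a few lines. This is an illustrative hard-voting helper only: the function name `majority_vote` and the toy label lists are hypothetical, and the SMOTE balancing and the three classifiers themselves are not reproduced here.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label predictions by hard majority voting.

    predictions: list of label lists, one per model, all the same length.
    Returns one label per sample. With an odd number of binary
    classifiers there are no ties; otherwise the label counted first
    among the tied ones wins.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all models must predict the same number of samples")
    voted = []
    for labels in zip(*predictions):  # labels from every model for one sample
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Toy predictions (hypothetical): 1 = fraud, 0 = non-fraud.
dt_pred = [1, 0, 1, 0]
nb_pred = [1, 1, 0, 0]
elr_pred = [0, 1, 1, 0]
print(majority_vote([dt_pred, nb_pred, elr_pred]))
```

Hard voting uses only the predicted labels. A soft-voting variant would average predicted class probabilities instead, which requires calibrated probability outputs from each model.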
11

Putri, Anastasia Kinanti, and Hari Suparwito. "Uji Algoritma Stacking Ensemble Classifier pada Kemampuan Adaptasi Mahasiswa Baru dalam Pembelajaran Online." KONSTELASI: Konvergensi Teknologi dan Sistem Informasi 3, no. 1 (June 7, 2023): 1–12. http://dx.doi.org/10.24002/konstelasi.v3i1.7009.

Abstract:
The shift in teaching methods from classroom-based to online learning has had a very significant impact. Students are required to adapt to changes in teaching and learning patterns. This study aims to classify the adaptability of new students in online learning with a machine learning approach using a stacking ensemble algorithm. The research method combines single classifiers through the ensemble stacking (stacked generalization) technique, using Random Forest, Decision Tree, K-Nearest Neighbor, Support Vector Machine, and Neural Network as base learners and Logistic Regression as the meta learner. The study obtained F1 scores of 89.26% for Random Forest, 88.58% for Decision Tree, 84.25% for K-NN, 88.98% for SVM, 89.06% for Neural Network, 89.07% for Logistic Regression, and 88.86% for Stacking. Although Stacking improved on single classifiers such as Decision Tree and K-NN, it did not outperform Random Forest, SVM, Neural Network, or Logistic Regression. Validating model accuracy with cross-validation yielded an F1 score that stayed constant at 88% for every n-fold, which indicates that the implemented stacking model is sound and stable. This is also shown by a stability test of the stacking algorithm on random data of 10 and 5 records, each repeated 5 times, in which the F1 score stayed consistently at 88%.
12

Zheng, Shuihua, Kaixin Liu, Yili Xu, Hao Chen, Xuelei Zhang, and Yi Liu. "Robust Soft Sensor with Deep Kernel Learning for Quality Prediction in Rubber Mixing Processes." Sensors 20, no. 3 (January 27, 2020): 695. http://dx.doi.org/10.3390/s20030695.

Abstract:
Although several data-driven soft sensors are available, reliable online prediction of the Mooney viscosity in industrial rubber mixing processes is still a challenging task. A robust semi-supervised soft sensor, called ensemble deep correntropy kernel regression (EDCKR), is proposed. It integrates the ensemble strategy, deep belief network (DBN), and correntropy kernel regression (CKR) into a unified soft sensing framework. The multilevel DBN-based unsupervised learning stage extracts useful information from all secondary variables. Subsequently, a supervised CKR model is built to explore the relationship between the extracted features and the Mooney viscosity values. Without cumbersome preprocessing steps, the negative effects of outliers are reduced using the CKR-based robust nonlinear estimator. With the help of the ensemble strategy, more reliable prediction results are obtained. An industrial case validates the practicality and reliability of EDCKR.
13

Lalloué, Benoît, Jean-Marie Monnez, and Eliane Albuisson. "Construction and Update of an Online Ensemble Score Involving Linear Discriminant Analysis and Logistic Regression." Applied Mathematics 13, no. 02 (2022): 228–42. http://dx.doi.org/10.4236/am.2022.132018.

14

Udayana, I. Putu Agus Eka Darma, Ni Putu Eka Kherismawati, and I. Gede Iwan Sudipa. "Detection of Student Drowsiness Using Ensemble Regression Trees in Online Learning During a COVID-19 Pandemic." Telematika 19, no. 2 (June 30, 2022): 229. http://dx.doi.org/10.31315/telematika.v19i2.7044.

Abstract:
Online lectures became mandatory for delivering education during the COVID-19 pandemic. This significant change certainly creates a different experience for students. Regarding online learning, several public health experts and ophthalmologists say that residual radiation from electronic screens is causing an epidemic of eye fatigue. Research on smart classrooms appeared several years ago, but in reality it has not been implemented according to the planned concept. Current smart classroom research either uses outdated methods that make the computer system incongruent (such as decision trees on video feeds) or stays at the level of empirical studies or blueprints, which offer little help as an academic footing or as reference material for students. This study aims to build an intelligent system that can evaluate students' attention during online classes, using teaching videos as learning feeds and as input for predictions. It uses advanced algorithms from several computational domains, namely face segmentation, landmarking, PERCLOS observation, yawning, and decision analysis using Ensemble Regression Trees, to detect students' sleepiness; this is expected to patch the shortcomings of the PERCLOS algorithm and the problems found in single regression tree-based implementations. In testing, the system detected sleepy subjects in learning videos with an accuracy of 80%, so it can later help teachers understand why some students are sleepy during online classes, whether because of uninteresting material or other reasons.
15

Lee, Sangjae, and Joon Yeon Choeh. "Exploring the influence of online word-of-mouth on hotel booking prices: insights from regression and ensemble-based machine learning methods." Data Science in Finance and Economics 4, no. 1 (2024): 65–82. http://dx.doi.org/10.3934/dsfe.2024003.

Abstract:
Previous studies have extensively investigated the effects of online word-of-mouth (eWOM) factors such as volume and valence on product sales. However, studies of the effect of eWOM factors on product prices are lacking. It is necessary to examine how various eWOM factors can either explain or affect product prices. The objective of this study is to suggest explanatory and predictive analytics using a regression analysis and ensemble-based machine learning methods for eWOM factors and hotel booking prices. This study utilizes publicly available data from a hotel booking site to build a sample of eWOM factors. The final study sample was comprised of 927 hotels. The important eWOM factors found to affect hotel prices are the review depth and the review rating, which are moderated by a number of reviews to affect prices. The effect of the number of positive words is moderated by the review helpfulness to affect the price. The review depth and rating, along with the number of reviews, should be considered in the design of hotel services, as these provide the rationale for adjusting the prices of various aspects of hotel services. Furthermore, the comparison results when applying various ensemble-based machine learning methods to predict prices using eWOM factors based on a 46-fold cross-validation partition method indicated that ensemble methods (bagging and boosting) based on decision trees outperformed ensemble methods based on k-nearest neighbor methods and neural networks. This shows that bagging and boosting methods are effective ways to improve the prediction performance outcomes when using decision trees. The explanatory and predictive analytics using eWOM factors for hotel booking prices offers a better understanding in terms of how the accommodation prices of hotel services can be explained and predicted by eWOM factors.
16

Kadam, Aishwarya. "WEB MINING TO DETECT ONLINE SPREAD OF TERRORISM." INTERNATIONAL JOURNAL OF SCIENTIFIC RESEARCH IN ENGINEERING AND MANAGEMENT 08, no. 04 (April 21, 2024): 1–5. http://dx.doi.org/10.55041/ijsrem31243.

Abstract:
To fight online terrorism, integrating web mining techniques with data mining algorithms is crucial. Various algorithms like Logistic Regression, K-Nearest Neighbors (KNN), Support Vector Machines (SVM), Naive Bayes (NB), Decision Trees (DT), Random Forests (RF), and Gradient Boosting (GB) are studied for detecting terrorist activities online, extracting data, identifying patterns, and relevant information. Logistic Regression provides a probabilistic framework, KNN uses similarity metrics, SVM constructs hyper-planes, NB assumes feature independence, DT builds decision trees, RF applies ensemble learning, and GB boosts weak learners' performance. These algorithms aid in proactive monitoring and prevention of online terrorism through efficient analysis of structured and unstructured web data. By combining web mining and data mining strengths, this study emphasizes a comprehensive approach to combatting online terrorism dissemination, helping security agencies anticipate evolving threats and prevent terrorist propagation effectively. Key Words: Terrorism, Naive-bayes, random forest, web mining, Gradient boosting
17

Saleh Hussein, Ameer, Rihab Salah Khairy, Shaima Miqdad Mohamed Najeeb, and Haider Th Salim Alrikabi. "Credit Card Fraud Detection Using Fuzzy Rough Nearest Neighbor and Sequential Minimal Optimization with Logistic Regression." International Journal of Interactive Mobile Technologies (iJIM) 15, no. 05 (March 16, 2021): 24. http://dx.doi.org/10.3991/ijim.v15i05.17173.

Abstract:
The global online communication channel made possible by the internet has increased credit card fraud, leading to losses in the billions annually for consumers and financial institutions. Fraudsters constantly devise new strategies to perpetrate illegal transactions. As such, innovative fraud detection systems are imperative to curb these losses. This paper presents the combination of multiple classifiers through a stacking ensemble technique for credit card fraud detection. The fuzzy-rough nearest neighbor (FRNN) and sequential minimal optimization (SMO) are employed as base classifiers. Their combined prediction becomes the input for the meta-classifier, logistic regression (LR), resulting in a final predictive outcome for improved detection. Simulation results compared with seven other algorithms affirm that the ensemble model can adequately detect credit card fraud, with detection rates of 84.90% and 76.30%.
18

Jin, Huaiping, Xiangguang Chen, Li Wang, Kai Yang, and Lei Wu. "Adaptive Soft Sensor Development Based on Online Ensemble Gaussian Process Regression for Nonlinear Time-Varying Batch Processes." Industrial & Engineering Chemistry Research 54, no. 30 (July 28, 2015): 7320–45. http://dx.doi.org/10.1021/acs.iecr.5b01495.

19

Jin, Huaiping, Xiangguang Chen, Li Wang, Kai Yang, and Lei Wu. "Dual learning-based online ensemble regression approach for adaptive soft sensor modeling of nonlinear time-varying processes." Chemometrics and Intelligent Laboratory Systems 151 (February 2016): 228–44. http://dx.doi.org/10.1016/j.chemolab.2016.01.009.

20

Zhou, Zhiyu, Xu Gao, Jianxin Zhang, Zefei Zhu, and Xudong Hu. "A novel hybrid model using the rotation forest-based differential evolution online sequential extreme learning machine for illumination correction of dyed fabrics." Textile Research Journal 89, no. 7 (March 20, 2018): 1180–97. http://dx.doi.org/10.1177/0040517518764020.

Abstract:
This study proposes an ensemble differential evolution online sequential extreme learning machine (DE-OSELM) for textile image illumination correction based on the rotation forest framework. The DE-OSELM solves the inaccuracy and long training time problems associated with traditional illumination correction algorithms. First, the Grey–Edge framework is used to extract the low-dimensional and efficient image features as online sequential extreme learning machine (OSELM) input vectors to improve the training and learning speed of the OSELM. Since the input weight and hidden-layer bias of OSELMs are randomly obtained, the OSELM algorithm has poor prediction accuracy and low robustness. To overcome this shortcoming, a differential evolution algorithm that has the advantages of good global search ability and robustness is used to optimize the input weight and hidden-layer bias of the DE-OSELM. To further improve the generalization ability and robustness of the illumination correction model, the rotation forest algorithm is used as the ensemble framework, and the DE-OSELM is used as the base learner to replace the regression tree algorithm in the original rotation forest algorithm. Then, the obtained multiple different DE-OSELM learners are aggregated to establish the prediction model. The experimental results show that compared with the textile color correction algorithm based on the support vector regression and extreme learning machine algorithms, the ensemble illumination correction method achieves high prediction accuracy, strong robustness, and good generalization ability.
21

Bokolo, Biodoumoye George, and Qingzhong Liu. "Advanced Algorithmic Approaches for Scam Profile Detection on Instagram." Electronics 13, no. 8 (April 19, 2024): 1571. http://dx.doi.org/10.3390/electronics13081571.

Abstract:
Social media platforms like Instagram have become a haven for online scams, employing various deceptive tactics to exploit unsuspecting users. This paper investigates advanced algorithmic approaches to combat this growing threat. We explore various machine learning models for scam profile detection on Instagram. Our methodology involves collecting a comprehensive dataset from a trusted source and meticulously preprocessing the data for analysis. We then evaluate the effectiveness of a suite of machine learning algorithms, including decision trees, logistic regression, SVMs, and other ensemble methods. Each model’s performance is measured using established metrics like accuracy, precision, recall, and F1-scores. Our findings indicate that ensemble methods, particularly random forest, XGBoost, and gradient boosting, outperform other models, achieving accuracy of 90%. The insights garnered from this study contribute significantly to the body of knowledge in social media forensics, offering practical implications for the development of automated tools to combat online deception.
22

Budiman, Arif, Mohamad Ivan Fanany, and Chan Basaruddin. "Adaptive Online Sequential ELM for Concept Drift Tackling." Computational Intelligence and Neuroscience 2016 (2016): 1–17. http://dx.doi.org/10.1155/2016/8091267.

Abstract:
A machine learning method needs to adapt to changes in the environment over time. Such changes are known as concept drift. In this paper, we propose a concept drift tackling method as an enhancement of Online Sequential Extreme Learning Machine (OS-ELM) and Constructive Enhancement OS-ELM (CEOS-ELM) by adding adaptive capability for classification and regression problems. The scheme is named adaptive OS-ELM (AOS-ELM). It is a single classifier scheme that works well to handle real drift, virtual drift, and hybrid drift. The AOS-ELM also works well for sudden drift and recurrent context change types. The scheme is a simple unified method implemented in simple lines of code. We evaluated AOS-ELM on regression and classification problems by using public concept drift data sets (SEA and STAGGER) and other public data sets such as MNIST, USPS, and IDS. Experiments show that our method gives a higher kappa value compared to the multiclassifier ELM ensemble. Even though AOS-ELM in practice does not need an increase in hidden nodes, we address some issues related to increasing the hidden nodes, such as error condition and rank values. We propose taking the rank of the pseudoinverse matrix as an indicator parameter to detect the “underfitting” condition.
23

Kaneko, Hiromasa, and Kimito Funatsu. "Adaptive soft sensor based on online support vector regression and Bayesian ensemble learning for various states in chemical plants." Chemometrics and Intelligent Laboratory Systems 137 (October 2014): 57–66. http://dx.doi.org/10.1016/j.chemolab.2014.06.008.

24

Santoso, Dwi Budi, Aliyatul Munna, and Dewi Handayani Untari Ningsih. "Improved playstore review sentiment classification accuracy with stacking ensemble." Journal of Soft Computing Exploration 5, no. 1 (March 18, 2024): 38–45. http://dx.doi.org/10.52465/joscex.v5i1.247.

Abstract:
In today's digital era, user reviews on the Playstore platform are an invaluable source of information for developers, offering insights that are critical for service improvement. Previous research has explored the application of stacking ensemble methods, such as in the context of predicting depression among university students, to enhance prediction accuracy. However, these studies often do not explicitly detail the data acquisition process, leaving a gap in understanding the applicability of these methods to different domains. This research aims to bridge this gap by applying the stacking ensemble approach to improve the accuracy of sentiment classification in Playstore reviews, with a clear exposition of the data collection method. Utilizing Logistic Regression as the meta classifier, this methodology is executed in several stages. Initially, data was collected from user reviews of online loan applications on Google Playstore, ensuring transparency in the data acquisition process. The data is then classified using three basic models: Random Forest, Naive Bayes, and SVM. The outputs of these models serve as inputs to the Logistic Regression meta model. A comparison of each base model output with the meta model was subsequently carried out. The test results on the Playstore review dataset demonstrated an increase in accuracy, precision, recall, and F1 score compared to using a single model, achieving an accuracy of 87.05%, which surpasses Random Forest (85.6%), Naive Bayes (85.55%), and SVM (86.5%). This indicates the effectiveness of the stacking ensemble method in providing deeper and more accurate insights into user sentiment, overcoming the limitations of single models and previous research by explicitly addressing data acquisition methods.
25

Asad, Rimsha, Saud Altaf, Shafiq Ahmad, Haitham Mahmoud, Shamsul Huda, and Sofia Iqbal. "Machine Learning-Based Hybrid Ensemble Model Achieving Precision Education for Online Education Amid the Lockdown Period of COVID-19 Pandemic in Pakistan." Sustainability 15, no. 6 (March 19, 2023): 5431. http://dx.doi.org/10.3390/su15065431.

Abstract:
Institutions of higher learning have made persistent efforts to provide students with a high-quality education. Educational data mining (EDM) enables academic institutions to gain insight into student data in order to extract information for making predictions. COVID-19 represents the most catastrophic pandemic in human history. As a result of the global pandemic, all educational systems were shifted to online learning (OL). Due to issues with accessing the internet, disinterest, and a lack of available tools, online education has proven challenging for many students. Acquiring accurate education has emerged as a major goal for the future of this popular medium of education. Therefore, the focus of this research was to identify attributes that could help in predicting students’ performance through a generalizable model achieving precision education in online education. The dataset used in this research was compiled from a survey taken primarily during the academic year of COVID-19, from the perspective of Pakistani university students. Five machine learning (ML) regressors were used to train the model, and the results were then analyzed. SVM outperformed the other methods, yielding 87.5% accuracy, the highest of all the models tested. After that, an efficient hybrid ensemble model of machine learning was used to predict student performance using NB, KNN, SVM, decision tree, and logistic regression during the COVID-19 period, yielding outstanding results. Finally, the accuracy obtained through the hybrid ensemble model was 98.6%, which demonstrated that the hybrid ensemble learning model performed better than any other model for predicting the performance of students.
26

Pacol, Caren Ambat. "Sentiment Analysis of Students’ Feedback on Faculty Online Teaching Performance Using Machine Learning Techniques." WSEAS TRANSACTIONS ON INFORMATION SCIENCE AND APPLICATIONS 21 (February 19, 2024): 65–76. http://dx.doi.org/10.37394/23209.2024.21.7.

Abstract:
The pandemic has given rise to challenges across different sectors, particularly in educational institutions. The mode of instruction has shifted from in-person to flexible learning, leading to increased stress and concerns for key stakeholders such as teachers, parents, and students. The ongoing spread of diseases has made in-person classes unfeasible. Even if limited face to face classes will be allowed, online teaching is deemed to remain a practice to support instructional delivery to students. Therefore, it is essential to understand the challenges and issues encountered in online teaching, particularly from the perspective of students. This knowledge is crucial for supervisors and administrators, as it provides insights to aid in planning intervention measures. These interventions can support teachers in enhancing their online teaching performance for the benefit of their students. A process that can be applied to achieve this goal is sentiment analysis. In the field of education, one of the applications of sentiment analysis is in the evaluation of faculty teaching performance. It has been a practice in educational institutions to periodically assess their teachers’ performance. However, it has not been easy to take into account the students’ comments due to the lack of methods for automated text analytics. In line with this, techniques in sentiment analysis are presented in this study. Base models such as Naïve Bayes, Support Vector Machines, Logistic Regression, and Random Forest were explored in experiments and compared to a combination of the four called ensemble. Outcomes indicate that the ensemble of the four outperformed the base models. The utilization of Ngram vectorization in conjunction with ensemble techniques resulted in the highest F1 score compared to Count and TF-IDF methods. 
Additionally, this approach achieved the highest Cohen’s Kappa and Matthews Correlation Coefficient (MCC), along with the lowest Cross-entropy, signifying its preference as the model of choice for sentiment classification. When applied in conjunction with an ensemble, Count vectorization yielded the highest Cohen’s Kappa and Matthews Correlation Coefficient (MCC) and the lowest Cross-entropy loss in topic classification. Visualization techniques revealed that 65.4% of student responses were positively classified, while 25.5% were negatively classified. Meanwhile, predictions indicated that 47% of student responses were related to instructional design/delivery, 45.3% described the personality/behavior of teachers, 3.4% focused on the use of technology, 2.9% on content, and 1.5% on student assessment.
27

Giamarelos, Nikolaos, Myron Papadimitrakis, Marios Stogiannos, Elias N. Zois, Nikolaos-Antonios I. Livanos, and Alex Alexandridis. "A Machine Learning Model Ensemble for Mixed Power Load Forecasting across Multiple Time Horizons." Sensors 23, no. 12 (June 8, 2023): 5436. http://dx.doi.org/10.3390/s23125436.

Abstract:
The increasing penetration of renewable energy sources tends to redirect the power systems community’s interest from the traditional power grid model towards the smart grid framework. During this transition, load forecasting for various time horizons constitutes an essential electric utility task in network planning, operation, and management. This paper presents a novel mixed power-load forecasting scheme for multiple prediction horizons ranging from 15 min to 24 h ahead. The proposed approach makes use of a pool of models trained by several machine-learning methods with different characteristics, namely neural networks, linear regression, support vector regression, random forests, and sparse regression. The final prediction values are calculated using an online decision mechanism based on weighting the individual models according to their past performance. The proposed scheme is evaluated on real electrical load data sensed from a high voltage/medium voltage substation and is shown to be highly effective, as it results in R² coefficient values ranging from 0.99 to 0.79 for prediction horizons ranging from 15 min to 24 h ahead, respectively. The method is compared to several state-of-the-art machine-learning approaches, as well as a different ensemble method, producing highly competitive results in terms of prediction accuracy.
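
An online decision mechanism that weights individual models according to their past performance can take many forms. The sketch below uses an exponentially weighted average, a standard online-learning rule chosen here purely for illustration; it is not claimed to be the mechanism used in the paper. The class name `OnlineWeightedEnsemble` and the learning rate `eta` are hypothetical.

```python
import math


class OnlineWeightedEnsemble:
    """Combine base-model forecasts with weights driven by past squared error."""

    def __init__(self, n_models, eta=0.5):
        self.eta = eta  # hypothetical learning rate for the weight update
        self.weights = [1.0 / n_models] * n_models

    def predict(self, base_predictions):
        # Weighted average of the base models' current forecasts.
        return sum(w * p for w, p in zip(self.weights, base_predictions))

    def update(self, base_predictions, target):
        # Shrink each weight exponentially in that model's squared error.
        scaled = [w * math.exp(-self.eta * (p - target) ** 2)
                  for w, p in zip(self.weights, base_predictions)]
        total = sum(scaled)
        if total > 0.0:  # keep the old weights if everything underflowed
            self.weights = [s / total for s in scaled]


if __name__ == "__main__":
    ensemble = OnlineWeightedEnsemble(n_models=2)
    for step in range(20):
        truth = float(step)
        # Model 0 is nearly right, model 1 is biased by +2.
        ensemble.update([truth + 0.1, truth + 2.0], truth)
    print(ensemble.weights)
```

Because the weights are renormalized after every observation, a base model that degrades later in the stream loses influence again without any retraining of the pool.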
28

Gaikwad, D. P., Vismita Nagrale, and M. P. Bauskar. "Ensemble of Learner for Network Intrusion Detection System." Journal of Network Security Computer Networks 9, no. 1 (April 6, 2023): 25–34. http://dx.doi.org/10.46610/jonscn.2023.v09i01.004.

Abstract:
The use of the internet has increased drastically for online communication and working from home. Data sharing and integration of global information bring network security risks. To protect private data and information, network security is becoming a very important research topic. An intrusion detection system is generally used as a safe operational tool. It detects and prevents intruders in a network by issuing a warning before the attack is launched in a network. An ensemble technique is extensively used to implement intrusion detection systems. In this paper, the stacking method of the ensemble has been proposed for the intrusion detection system. Three base classifiers have been stacked using the Meta classifier. Multilayer Perceptron, Ripple Down Rule learner, and RepTree Decision Tree have been used as Base Classifiers. These Base Classifiers are stacked using Logistic Regression with ridge estimator Meta Classifier. These learners have been trained and tested using the NSL dataset. A genetic algorithm has been applied for choosing relevant and most corrected features which helped in reducing the dimension of the dataset. Experimental results demonstrate that the proposed stacked classifier gives accuracies of 79.36%, 99.72%, and 99.64% on the test dataset, the training dataset, and cross-validation, respectively. It is observed that the proposed stacked classifiers do better than existing hybrid intrusion detection systems.
29

Jin, Huaiping, Jiangang Li, Meng Wang, Bin Qian, Biao Yang, Zheng Li, and Lixian Shi. "Ensemble Just-In-Time Learning-Based Soft Sensor for Mooney Viscosity Prediction in an Industrial Rubber Mixing Process." Advances in Polymer Technology 2020 (March 27, 2020): 1–14. http://dx.doi.org/10.1155/2020/6575326.

Abstract:
The lack of online sensors for Mooney viscosity measurement has posed significant challenges for enabling efficient monitoring, control, and optimization of industrial rubber mixing process. To obtain real-time and accurate estimations of Mooney viscosity, a novel soft sensor method, referred to as multimodal perturbation- (MP-) based ensemble just-in-time learning Gaussian process regression (MP-EJITGPR), is proposed by exploiting ensemble JIT learning. This method employs perturbations on similarity measure and input variables for generating the diversity of JIT learners. Furthermore, a set of accurate and diverse JIT learners are built through an evolutionary multiobjective optimization by balancing the accuracy and diversity objectives explicitly. Moreover, all base JIT learners are combined adaptively using a finite mixture mechanism. The proposed method is applied to an industrial rubber mixing process for Mooney viscosity prediction, and the experimental results demonstrate its effectiveness and superiority over traditional soft sensor methods.
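
The idea of an ensemble of just-in-time (JIT) learners that differ in their similarity measure can be illustrated with a deliberately simplified sketch. Here each JIT learner builds a throwaway local model per query, and that local model is just the mean of the `k` most similar historical samples; the paper's Gaussian process regression, input-variable perturbation, evolutionary selection, and finite mixture combination are all replaced by this k-nearest mean and a plain average. All function names are hypothetical.

```python
def euclidean(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) ** 0.5


def manhattan(a, b):
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def jit_predict(query, X, y, distance, k=3):
    """Build a local model for one query: the mean target of the k nearest samples."""
    nearest = sorted(range(len(X)), key=lambda i: distance(query, X[i]))[:k]
    return sum(y[i] for i in nearest) / len(nearest)


def ensemble_jit_predict(query, X, y, distances=(euclidean, manhattan), k=3):
    """Average JIT learners that differ only in their similarity measure."""
    predictions = [jit_predict(query, X, y, d, k) for d in distances]
    return sum(predictions) / len(predictions)


if __name__ == "__main__":
    # Toy historical database: y = 2x sampled on a grid.
    X = [[i / 10.0] for i in range(11)]
    y = [2.0 * row[0] for row in X]
    print(ensemble_jit_predict([0.52], X, y))
```

Nothing is trained ahead of time in this scheme: the historical database is the model, so appending new labeled samples to `X` and `y` is all that is needed to keep the predictor current.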
30

Fayaz, Muhammad, Atif Khan, Javid Ur Rahman, Abdullah Alharbi, M. Irfan Uddin, and Bader Alouffi. "Ensemble Machine Learning Model for Classification of Spam Product Reviews." Complexity 2020 (December 17, 2020): 1–10. http://dx.doi.org/10.1155/2020/8857570.

Abstract:
Nowadays, online product reviews have been at the heart of the product assessment process for a company and its customers. They give feedback to a company on improving product quality, planning, and monitoring its business schemes in order to increase sales and gain more profit. They are also helpful for customers to select the right products with less effort and time. Most companies make spam reviews of products in order to increase product sales and gain more profit. Detecting spam product reviews is a challenging issue in NLP (natural language processing). Numerous machine learning approaches have attempted to detect and classify the product reviews as spam or nonspam. However, in order to improve the classification accuracy, this study has introduced an ensemble machine learning model that combines predictions from multilayer perceptron (MLP), K-Nearest Neighbour (KNN), and Random Forest (RF) and predicts the outcome of the review as spam or real (nonspam), based on the majority vote of the contributing models. In order to accomplish the task of spam review classification, the proposed ensemble and other benchmark boosting approaches are tested with 25 statistical features extracted from mobile application reviews of Yelp Dataset. Then, three different selection techniques are exploited to diminish the feature space and filter out the top 10 optimal features. The effectiveness of the proposed ensemble, the individual models, and other benchmark boosting approaches is again evaluated with 10 optimal features in terms of classification accuracy. Experimental outcomes illustrate that the proposed ensemble model outperformed the individual classifiers (MLP, KNN, and RF) and state-of-the-art boosting approaches like Generalized Boost Regression Model (GBM), Extreme Gradient Boost (XGBoost), and AdaBoost Regression Model in terms of classification accuracy.
31

Ochola, Dennis, Bastiaen Boekelo, Gerrie W. J. van de Ven, Godfrey Taulya, Jerome Kubiriba, Piet J. A. van Asten, and Ken E. Giller. "Mapping spatial distribution and geographic shifts of East African highland banana (Musa spp.) in Uganda." PLOS ONE 17, no. 2 (February 17, 2022): e0263439. http://dx.doi.org/10.1371/journal.pone.0263439.

Abstract:
East African highland banana (Musa acuminata genome group AAA-EA; hereafter referred to as banana) is critical for Uganda’s food supply, hence our aim to map current distribution and to understand changes in banana production areas over the past five decades. We collected banana presence/absence data through an online survey based on high-resolution satellite images and coupled this data with independent covariates as inputs for ensemble machine learning prediction of current banana distribution. We assessed geographic shifts of production areas using spatially explicit differences between the 1958 and 2016 banana distribution maps. The biophysical factors associated with banana spatial distribution and geographic shift were determined using a logistic regression model and classification and regression tree, respectively. Ensemble models were superior (AUC = 0.895; 0.907) compared to their constituent algorithms trained with 12 and 17 covariates, respectively: random forests (AUC = 0.883; 0.901), gradient boosting machines (AUC = 0.878; 0.903), and neural networks (AUC = 0.870; 0.890). The logistic regression model (AUC = 0.879) performance was similar to that for the ensemble model and its constituent algorithms. In 2016, banana cultivation was concentrated in the western (44%) and central (36%) regions, while only a small proportion was in the eastern (18%) and northern (2%) regions. About 60% of increased cultivation since 1958 was in the western region; 50% of decreased cultivation in the eastern region; and 44% of continued cultivation in the central region. Soil organic carbon, soil pH, annual precipitation, slope gradient, bulk density and blue reflectance were associated with increased banana cultivation while precipitation seasonality and mean annual temperature were associated with decreased banana cultivation over the past 50 years. 
The maps of spatial distribution and geographic shift of banana can support targeting of context-specific intensification options and policy advocacy to avert agriculture driven environmental degradation.
32

Aslam, Naila, Kewen Xia, Furqan Rustam, Ernesto Lee, and Imran Ashraf. "Self voting classification model for online meeting app review sentiment analysis and topic modeling." PeerJ Computer Science 8 (December 15, 2022): e1141. http://dx.doi.org/10.7717/peerj-cs.1141.

Abstract:
Online meeting applications (apps) have emerged as a potential solution for conferencing, education and meetings, etc. during the COVID-19 outbreak and are used by private companies and governments alike. A large number of such apps compete with each other by providing a different set of functions towards users’ satisfaction. These apps take users’ feedback in the form of opinions and reviews which are later used to improve the quality of services. Sentiment analysis serves as the key function to obtain and analyze users’ sentiments from the posted feedback indicating the importance of efficient and accurate sentiment analysis. This study proposes the novel idea of self voting classification (SVC) where multiple variants of the same model are trained using different feature extraction approaches and the final prediction is based on the ensemble of these variants. For experiments, the data collected from the Google Play store for online meeting apps were used. Primarily, the focus of this study is to use a support vector machine (SVM) with the proposed SVC approach using both soft voting (SV) and hard voting (HV) criteria, however, decision tree, logistic regression, and k nearest neighbor have also been investigated for performance appraisal. Three variants of models are trained on a bag of words, term frequency-inverse document frequency, and hashing features to make the ensemble. Experimental results indicate that the proposed SVC approach can elevate the performance of traditional machine learning models substantially. The SVM obtains 1.00 and 0.98 accuracy scores, using HV and SV criteria, respectively when used with the proposed SVC approach. Topic-wise sentiment analysis using the latent Dirichlet allocation technique is performed as well for topic modeling.
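
The difference between the hard voting and soft voting criteria mentioned above can be shown with a small sketch. The three "variants" stand in for the same model trained on different feature extractions; their probability outputs below are made-up illustrative numbers, and the function names `hard_vote` and `soft_vote` are hypothetical.

```python
from collections import Counter


def hard_vote(labels):
    """Return the label predicted by the largest number of variants."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities):
    """Average per-class probabilities across variants and pick the top class."""
    classes = list(probabilities[0])
    average = {c: sum(p[c] for p in probabilities) / len(probabilities)
               for c in classes}
    return max(average, key=average.get)


if __name__ == "__main__":
    # Illustrative outputs of three variants for one review.
    variant_probs = [
        {"pos": 0.90, "neg": 0.10},  # e.g. bag-of-words variant
        {"pos": 0.40, "neg": 0.60},  # e.g. TF-IDF variant
        {"pos": 0.45, "neg": 0.55},  # e.g. hashing variant
    ]
    variant_labels = [max(p, key=p.get) for p in variant_probs]
    print(hard_vote(variant_labels))
    print(soft_vote(variant_probs))
```

In this toy case the two criteria disagree: two of three variants lean negative, so hard voting returns `neg`, while the one confident positive variant pulls the averaged probability to `pos` under soft voting.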
33

Yang, Kai, Huaiping Jin, Xiangguang Chen, Jiayu Dai, Li Wang, and Dongxiang Zhang. "Soft sensor development for online quality prediction of industrial batch rubber mixing process using ensemble just-in-time Gaussian process regression models." Chemometrics and Intelligent Laboratory Systems 155 (July 2016): 170–82. http://dx.doi.org/10.1016/j.chemolab.2016.04.009.

34

Khozouie, Nasim, Omid Rahmani Seryasat, and Sadegh Moshrefzadeh. "Prediction of Diabetes using Supervised Learning Approach." Health Nexus 2, no. 2 (2024): 103–11. http://dx.doi.org/10.61838/kman.hn.2.2.12.

Abstract:
This paper provides an in-depth review of diverse supervised machine learning models used for predicting diabetes. It discusses the strengths and limitations of various algorithms, including decision trees, Random Forest, Rotation Forest ensemble classifier, K-star, Naive Bayes, Logistic Regression, Functional Tree, and belief neural network. To predict diabetes, a publicly available diabetes dataset from the website /chistio is used. It includes 520 patient samples, of which 200 are diabetic patients and 320 are non-diabetic patients, with 16 features. Results are tested on the Weka 3.6 open-source platform and validated using AUC, CA, F1, precision, and recall.
35

Pradeep K V, Rajarajeshwari S, Sujay Doshi, D. Yuvaraj, Nachiyappan S. "Ensemble Learning-Based Browser Extension for Mitigating Cyber Attacks Carried out using Malicious Short URLs." Journal of Electrical Systems 20, no. 3s (April 4, 2024): 158–69. http://dx.doi.org/10.52783/jes.1264.

Abstract:
As digitization continues to expand and cybercrimes become more prevalent, it is crucial to prioritize the implementation of robust security measures. Malicious short URLs are frequently utilized as a vector for cyber-attacks on online forums and social media platforms. To address this issue, a plugin-based solution that uses ensemble learning to combine random forest, k-neighbors classifier, and logistic regression into a stacked model was developed. The model was trained on a combination of the three most popular Kaggle datasets, with over 1,081,195 URLs. Additionally, gradient boosting was applied to further enhance the model's performance, resulting in 92% detection accuracy. To facilitate the proposed solution, we developed a browser extension with Flask and JavaScript that identifies URLs as malicious or safe. The work emphasizes the need for effective measures to mitigate cyber-attack risks, particularly those involving malicious short URLs.
36

Santhiya, S., and C. S. KanimozhiSelvi. "A study on dyslexia detection using machine learning techniques for checklist, questionnaire and online game based datasets." Applied and Computational Engineering 5, no. 1 (June 14, 2023): 837–42. http://dx.doi.org/10.54254/2755-2721/5/20230722.

Abstract:
Learning disabilities are one of the most common developmental disorders in children. Learning is fundamental to a child's overall development. Children struggle with daily activities such as reading, speaking, organizing things, and so on. The specific learning disorders are classified into dyslexia, dysgraphia, and dyscalculia. Children who find difficulty in reading and are unable to differentiate speech sounds are said to have dyslexia. Dysgraphia and dyscalculia deal with written and mathematical calculations. Early diagnosis and detection are essential for early recovery from diseases. The proposed article presents methodologies and techniques used for detecting dyslexia. The primary contribution of this paper is a comparative analysis of various machine learning algorithms for diagnosing dyslexia, including SVM, KNN, Logistic Regression, K-mean Clustering, Oversampling, and Ensemble methods. Deep learning methods such as CNN and LeNet architecture have been used to identify dyslexia. The proposed study examines recent advances in detecting dyslexia using machine learning and deep learning approaches and identifies prospective research areas for the future.
37

Abidi, Syed Muhammad Raza, Wu Zhang, Saqib Ali Haidery, Sanam Shahla Rizvi, Rabia Riaz, Hu Ding, and Se Jin Kwon. "Educational Sustainability through Big Data Assimilation to Quantify Academic Procrastination Using Ensemble Classifiers." Sustainability 12, no. 15 (July 28, 2020): 6074. http://dx.doi.org/10.3390/su12156074.

Abstract:
Ubiquitous online learning is continuing to expand, and the factors affecting success and educational sustainability need to be quantified. Procrastination is one of the compelling characteristics that students observe as a failure to achieve the weaker outcomes. Past studies have mainly assessed the behaviors of procrastination by describing explanatory work. Throughout this research, we concentrate on predictive measures to identify and forecast procrastinator students by using ensemble machine learning models (i.e., Logistic Regression, Decision Tree, Gradient Boosting, and Forest). Our results indicate that the Gradient Boosting autotuned is a predictive champion model of high precision compared to the other default and hyper-parameterized tuned models in the pipeline. The accuracy we enumerated for the VALIDATION partition dataset is 91.77 percent, based on the Kolmogorov–Smirnov statistics. Additionally, our model allows teachers to monitor each procrastinator student who interacts with the web-based e-learning platform and take corrective action on the next day of the class. The earlier prediction of such procrastination behaviors would assist teachers in classifying students before completing the task, homework, or mastery of a skill, which is useful and a path to developing a sustainable atmosphere for education or education for sustainable development.
38

Otorokpo, Emakpor Augustine, Margaret Dumebi Okpor, Rume Elizabeth Yoro, Success Endurance Brizimor, Ayo Michael Ifioko, Dickson Abiodun Obasuyi, Chris Chukwufunaya Odiakaose, et al. "DaBO-BoostE: Enhanced Data Balancing via Oversampling Technique for a Boosting Ensemble in Card-Fraud Detection." Advances in Multidisciplinary & Scientific Research Journal Publications 12 (2024): 45–66. http://dx.doi.org/10.22624/aims/maths/v12n2p4.

Abstract:
The unauthorized use of credit card information for fraudulent financial benefits by fraudsters without the knowledge of unsuspecting users has become rampant due to the financial inclusivity of financial institutions in their bid to reach both semi-urban and rural settlers. This, in turn, has continued to ripple across society, with huge financial losses and lowered user trust for all cardholders. Thus, banks and financial institutions are today poised to implement fraud detection schemes. Five algorithms, with and without the synthetic minority over-sampling technique (SMOTE), were trained to assess how well they performed, namely: Random Forest (RF), K-Nearest-Neighbor (KNN), Naive Bayes (NB), Support Vector Machines (SVM), and Logistic Regression (LR). Tested via Flask and integrated via Streamlit as an application programming interface on various platforms, our experimental proposed RF ensemble performed best with an accuracy of 0.9802 after applying SMOTE, while LR, KNN, NB, SVM and DT yielded an accuracy of 0.9219, 0.9435, 0.9508, 0.5 and 0.9008 respectively. Our proposed ensemble achieved an F1-score of 0.9919, while LR, KNN, NB, SVM and DT yielded 0.9805, 0.921, 0.9125, and 0.8145 respectively. Results imply that the proposed ensemble can be used with the SMOTE data balancing technique for enhanced prediction in card fraud detection. Keywords: Random Forest, SMOTE, credit card fraud detection, feature selection, imbalanced dataset Otorokpo, A., Okpor, M.D., Yoro, E.R., Brizimor, S., Ifiokor, A.M., Obasuyi, D., Odiakaose, C.C., Ojugo, A.A., Atuduhor, R., Akiakeme, E., Ako, R.E., & Geteloma, V.O. (2024): DaBO-BoostE: Enhanced Data Balancing via Oversampling Technique for a Boosting Ensemble in Card-Fraud Detection. Journal of Advances in Mathematical & Computational Science. Vol. 12, No. 1. Pp 45-66. Available online at www.isteams.net/mathematics-computationaljournal. dx.doi.org/10.22624/AIMS/MATHS/V12N2P4
39

Alajali, Walaa, Wei Zhou, Sheng Wen, and Yu Wang. "Intersection Traffic Prediction Using Decision Tree Models." Symmetry 10, no. 9 (September 7, 2018): 386. http://dx.doi.org/10.3390/sym10090386.

Abstract:
Traffic prediction is a critical task for intelligent transportation systems (ITS). Prediction at intersections is challenging as it involves various participants, such as vehicles, cyclists, and pedestrians. In this paper, we propose a novel approach for the accurate intersection traffic prediction by introducing extra data sources other than road traffic volume data into the prediction model. In particular, we take advantage of the data collected from the reports of road accidents and roadworks happening near the intersections. In addition, we investigate two types of learning schemes, namely batch learning and online learning. Three popular ensemble decision tree models are used in the batch learning scheme, including Gradient Boosting Regression Trees (GBRT), Random Forest (RF) and Extreme Gradient Boosting Trees (XGBoost), while the Fast Incremental Model Trees with Drift Detection (FIMT-DD) model is adopted for the online learning scheme. The proposed approach is evaluated using public data sets released by the Victorian Government of Australia. The results indicate that the accuracy of intersection traffic prediction can be improved by incorporating nearby accidents and roadworks information.
40

Isiaka, Dauda Olorunkemi, Joshua Babatunde Agbogun, and Taiwo Kolajo. "A Framework for Predictive - Diagnosis of Prevalent Illness among University Students." Journal of Applied Artificial Intelligence 3, no. 2 (December 31, 2022): 24–38. http://dx.doi.org/10.48185/jaai.v3i2.667.

Abstract:
The issue of identifying the prevalence of sickness that is linked to the population of a nation, state, neighborhood, organization, or school has not been taken into consideration by the majority of prior studies on the prediction of illness among populations. They frequently merely choose any sickness based on assumption, while those that determined the prevalence of the condition before developing their framework utilized survey data or data from web repositories, which removes idiosyncrasies from those data. In order to increase performance, this research suggests an enhanced data analytics framework for the predictive diagnosis of common illnesses affecting university students. In order to do this, exploratory data analysis (EDA) using a multivariate analytic technique was conducted using a high-level model methodology using CRISP-DM stages. When the suggested strategy was evaluated on support vector machines, ensemble gradient boosting, random forest, decision tree, K-neighbors, and linear regression machine learning models, experimental findings revealed that it outperformed current methods. In comparison to other reviewed frameworks that used survey datasets, standardized or online repositories' dataset, the framework with emphasis on the ensemble Gradient Boosting classifier and regression had accuracy of 100% and mean absolute error of 0.18, respectively. It is also steady due to its ability to manage both small and big data sets without impacting the model's performance. The enhanced results through localized dataset demonstrate the benefit of including local data sources in the process of developing models for the diagnosis and prognosis of prevalent illnesses of any area with people.
41

Lee, Sangjae, and Joon Yeon Choeh. "Movie Production Efficiency Moderating between Online Word-of-Mouth and Subsequent Box Office Revenue." Sustainability 12, no. 16 (August 14, 2020): 6602. http://dx.doi.org/10.3390/su12166602.

Abstract:
Studies of the production efficiency of movies, which is determined from the relationship between movie resource powers (the powers of actors, directors, distributors, and production companies) and box office, are almost nonexistent. Our study examines how efficiency moderates the relationship between eWOM (online word-of-mouth) and revenue, and shows the difference in prediction performance between efficient and inefficient movies. With efficiency estimated by data envelopment analysis, movie efficiency negatively moderates the effects of review depth and volume on subsequent box office revenue, compensating for the negative effects of a smaller box office in the previous period, while efficiency exerts a positive moderating effect on the influence of review rating and the number of positive reviews on revenue. This shows that, for inefficient movies, review depth and volume are affected by the slack in movie resource powers, whereas for efficient movies high ratings and positive responses affect revenue. The results of decision trees, k-nearest-neighbors, and linear regression analysis based on ensemble methods using eWOM or movie variables indicate that movies with inefficient movie resource powers yield greater prediction performance than movies with efficient movie resource powers. This shows that diverse variation in the efficiency of movie resource powers contributes to prediction performance.
42

Alqahtani, Hassan, and Asok Ray. "Neural Network-Based Automated Assessment of Fatigue Damage in Mechanical Structures." Machines 8, no. 4 (December 16, 2020): 85. http://dx.doi.org/10.3390/machines8040085.

Abstract:
This paper proposes a methodology for automated assessment of fatigue damage, which has been tested and validated with polycrystalline-alloy (Aℓ7075-T6) specimens on an experimental apparatus. Based on an ensemble of time series of ultrasonic test (UT) data, the proposed procedure is found to be capable of detecting fatigue-damage (at an early stage) in mechanical structures, which is followed by online evaluation of the associated risk. The underlying concept is built upon two neural network (NN)-based models, where the first NN model identifies the feature of the UT data belonging to one of the two classes: undamaged structure and damaged structure, and the second NN model further classifies an identified damaged structure into three classes: low-risk, medium-risk, and high-risk. The input information to the second NN model is the crack tip opening displacement (CTOD), which is computed by the first NN model via linear regression from an ensemble of optical data, acquired from the experiments. Both NN models have been trained by using scaled conjugate gradient algorithms. The results show that the first NN model classifies the energy of UT signals with (up to) 98.5% accuracy, and that the accuracy of the second NN model is 94.6%.
43

Chen, Yifei, Zhenyu Jia, Dan Mercola, and Xiaohui Xie. "A Gradient Boosting Algorithm for Survival Analysis via Direct Optimization of Concordance Index." Computational and Mathematical Methods in Medicine 2013 (2013): 1–8. http://dx.doi.org/10.1155/2013/873595.

Abstract:
Survival analysis focuses on modeling and predicting the time to an event of interest. Many statistical models have been proposed for survival analysis. They often impose strong assumptions on hazard functions, which describe how the risk of an event changes over time depending on covariates associated with each individual. In particular, the prevalent proportional hazards model assumes that covariates are multiplicatively related to the hazard. Here we propose a nonparametric model for survival analysis that does not explicitly assume particular forms of hazard functions. Our nonparametric model utilizes an ensemble of regression trees to determine how the hazard function varies according to the associated covariates. The ensemble model is trained using a gradient boosting method to optimize a smoothed approximation of the concordance index, which is one of the most widely used metrics in survival model performance evaluation. We implemented our model in a software package called GBMCI (gradient boosting machine for concordance index) and benchmarked the performance of our model against other popular survival models with a large-scale breast cancer prognosis dataset. Our experiment shows that GBMCI consistently outperforms other methods based on a number of covariate settings. GBMCI is implemented in R and is freely available online.
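The concordance index that this abstract optimizes can be sketched in a few lines. The example below computes Harrell's C on right-censored data and a sigmoid-smoothed variant that is differentiable in the scores. The sigmoid smoothing, the `alpha` parameter, and the toy data are assumptions for illustration; the abstract does not state the exact smoothing GBMCI uses.

```python
import math

def concordance_index(times, events, scores):
    """Harrell's C: share of comparable pairs whose predicted order matches the observed order.

    scores: higher means longer predicted survival.
    events: 1 if the event was observed, 0 if the record is censored.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when i has an observed event strictly before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] < scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # ties count as half
    return concordant / comparable

def smoothed_concordance(times, events, scores, alpha=10.0):
    """Same pairs as above, with the 0/1 indicator replaced by a sigmoid (an assumed smoothing)."""
    total, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                total += 1.0 / (1.0 + math.exp(-alpha * (scores[j] - scores[i])))
    return total / comparable

# Hypothetical data: the third subject is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
scores = [0.1, 0.5, 0.4, 0.9]
c_index = concordance_index(times, events, scores)
c_smooth = smoothed_concordance(times, events, scores)
```

As `alpha` grows, the smoothed value approaches the exact index, which is why a smoothed surrogate can stand in for the non-differentiable metric during gradient-based training.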
44

Ali, Hashir, Ehtesham Hashmi, Sule Yayilgan Yildirim, and Sarang Shaikh. "Analyzing Amazon Products Sentiment: A Comparative Study of Machine and Deep Learning, and Transformer-Based Techniques." Electronics 13, no. 7 (March 31, 2024): 1305. http://dx.doi.org/10.3390/electronics13071305.

Abstract:
In recent years, online shopping has surged in popularity, with customer reviews becoming a crucial aspect of the decision-making process. Reviews not only help potential customers make informed choices, but also provide businesses with valuable feedback and build trust. In this study, we conducted a thorough analysis of the Amazon reviews dataset, which includes several product categories. Our primary objective was to accurately classify sentiments using natural language processing, machine learning, ensemble learning, and deep learning techniques. Our research workflow encompassed several crucial steps. We explore data collection procedures; preprocessing steps, including normalization and tokenization; and feature extraction, utilizing the Bag-of-Words and TF–IDF methods. We conducted experiments employing a variety of machine learning algorithms, including Multinomial Naive Bayes, Random Forest, Decision Tree, and Logistic Regression. Additionally, we harnessed Bagging as an ensemble learning technique. Furthermore, we explored deep learning-based algorithms, such as CNNs, Bidirectional LSTM, and transformer-based models, like XLNet and BERT. Our comprehensive evaluations, utilizing metrics such as accuracy, precision, recall, and F1 score, revealed that the BERT algorithm outperformed others, achieving an impressive accuracy rate of 89%. This research provides valuable insights into the sentiment analysis of Amazon reviews, aiding both consumers and businesses in making informed decisions and enhancing product and service quality.
45

Yin, Zhijun, Lina M. Sulieman, and Bradley A. Malin. "A systematic literature review of machine learning in online personal health data." Journal of the American Medical Informatics Association 26, no. 6 (March 25, 2019): 561–76. http://dx.doi.org/10.1093/jamia/ocz009.

Abstract:
Objective: User-generated content (UGC) in online environments provides opportunities to learn an individual’s health status outside of clinical settings. However, the nature of UGC brings challenges in both data collection and processing. The purpose of this study is to systematically review the effectiveness of applying machine learning (ML) methodologies to UGC for personal health investigations. Materials and Methods: We searched PubMed, Web of Science, IEEE Library, ACM library, AAAI library, and the ACL anthology. We focused on research articles that were published in English and in peer-reviewed journals or conference proceedings between 2010 and 2018. Publications that applied ML to UGC with a focus on personal health were identified for further systematic review. Results: We identified 103 eligible studies which we summarized with respect to 5 research categories, 3 data collection strategies, 3 gold standard dataset creation methods, and 4 types of features applied in ML models. Popular off-the-shelf ML models were logistic regression (n = 22), support vector machines (n = 18), naive Bayes (n = 17), ensemble learning (n = 12), and deep learning (n = 11). The most investigated problems were mental health (n = 39) and cancer (n = 15). Common health-related aspects extracted from UGC were treatment experience, sentiments and emotions, coping strategies, and social support. Conclusions: The systematic review indicated that ML can be effectively applied to UGC in facilitating the description and inference of personal health. Future research needs to focus on mitigating bias introduced when building study cohorts, creating features from free text, improving the clinical credibility of UGC, and model interpretability.
46

Massey, Alexander, Corentin Boennec, Claudia Ximena Restrepo-Ortiz, Christophe Blanchet, Samuel Alizon, and Mircea T. Sofonea. "Real-time forecasting of COVID-19-related hospital strain in France using a non-Markovian mechanistic model." PLOS Computational Biology 20, no. 5 (May 17, 2024): e1012124. http://dx.doi.org/10.1371/journal.pcbi.1012124.

Abstract:
Projects such as the European Covid-19 Forecast Hub publish forecasts on the national level for new deaths, new cases, and hospital admissions, but not direct measurements of hospital strain like critical care bed occupancy at the sub-national level, which is of particular interest to health professionals for planning purposes. We present a sub-national French framework for forecasting hospital strain based on a non-Markovian compartmental model, its associated online visualisation tool and a retrospective evaluation of the real-time forecasts it provided from January to December 2021 by comparing to three baselines derived from standard statistical forecasting methods (a naive model, auto-regression, and an ensemble of exponential smoothing and ARIMA). In terms of median absolute error for forecasting critical care unit occupancy at the two-week horizon, our model only outperformed the naive baseline for 4 out of 14 geographical units and underperformed compared to the ensemble baseline for 5 of them at the 90% confidence level (n = 38). However, for the same level at the 4 week horizon, our model was never statistically outperformed for any unit despite outperforming the baselines 10 times spanning 7 out of 14 geographical units. This implies modest forecasting utility for longer horizons which may justify the application of non-Markovian compartmental models in the context of hospital-strain surveillance for future pandemics.
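The evaluation described in this abstract compares forecasts by median absolute error against a naive baseline. The sketch below shows that comparison on made-up numbers. It assumes the naive model is a persistence forecast (the last observed value carried forward), which the abstract does not specify; the occupancy series and the model forecasts are hypothetical.

```python
from statistics import median

def median_absolute_error(actual, forecast):
    """Median of the absolute differences between observed and forecast values."""
    return median(abs(a - f) for a, f in zip(actual, forecast))

def naive_forecast(series, horizon):
    """Persistence baseline (assumed): the forecast for t + horizon is the value at t."""
    return series[:-horizon]

# Hypothetical daily critical care bed occupancy for one geographical unit.
occupancy = [100, 110, 125, 140, 150, 155, 150, 140]
horizon = 2

actual = occupancy[horizon:]                  # values that were being forecast
naive = naive_forecast(occupancy, horizon)    # baseline forecasts for the same dates
model = [120, 138, 152, 158, 149, 138]        # hypothetical model forecasts

naive_mae = median_absolute_error(actual, naive)
model_mae = median_absolute_error(actual, model)
```

The median, unlike the mean, is insensitive to a few very large misses, which matters when occupancy jumps sharply during an epidemic wave.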
47

Zhang, Yanju, Ruopeng Xie, Jiawei Wang, André Leier, Tatiana T. Marquez-Lago, Tatsuya Akutsu, Geoffrey I. Webb, Kuo-Chen Chou, and Jiangning Song. "Computational analysis and prediction of lysine malonylation sites by exploiting informative features in an integrative machine-learning framework." Briefings in Bioinformatics 20, no. 6 (August 24, 2018): 2185–99. http://dx.doi.org/10.1093/bib/bby079.

Abstract:
As a newly discovered post-translational modification (PTM), lysine malonylation (Kmal) regulates a myriad of cellular processes from prokaryotes to eukaryotes and has important implications in human diseases. Despite its functional significance, computational methods to accurately identify malonylation sites are still lacking and urgently needed. In particular, there is currently no comprehensive analysis and assessment of different features and machine learning (ML) methods that are required for constructing the necessary prediction models. Here, we review, analyze and compare 11 different feature encoding methods, with the goal of extracting key patterns and characteristics from residue sequences of Kmal sites. We identify optimized feature sets, with which four commonly used ML methods (random forest, support vector machines, K-nearest neighbor and logistic regression) and one recently proposed [Light Gradient Boosting Machine (LightGBM)] are trained on data from three species, namely, Escherichia coli, Mus musculus and Homo sapiens, and compared using randomized 10-fold cross-validation tests. We show that integration of the single method-based models through ensemble learning further improves the prediction performance and model robustness on the independent test. When compared to the existing state-of-the-art predictor, MaloPred, the optimal ensemble models were more accurate for all three species (AUC: 0.930, 0.923 and 0.944 for E. coli, M. musculus and H. sapiens, respectively). Using the ensemble models, we developed an accessible online predictor, kmal-sp, available at http://kmalsp.erc.monash.edu/. We hope that this comprehensive survey and the proposed strategy for building more accurate models can serve as a useful guide for inspiring future developments of computational methods for PTM site prediction, expedite the discovery of new malonylation and other PTM types and facilitate hypothesis-driven experimental validation of novel malonylated substrates and malonylation sites.
48

Wang, Peng, and Zhengliang Xu. "A Novel Consumer Purchase Behavior Recognition Method Using Ensemble Learning Algorithm." Mathematical Problems in Engineering 2020 (December 19, 2020): 1–10. http://dx.doi.org/10.1155/2020/6673535.

Abstract:
With the prosperous development of e-commerce platforms, consumer returns often occur. The issue of returns has become a stumbling block to the profitability of e-commerce companies. To protect consumers’ purchase rights, the Chinese government has introduced a 7-day no-reason return policy. In order to use the return policy to attract consumers to buy, various e-commerce platforms have created a more relaxed and convenient return environment for consumers. On the one hand, the introduction of the return policy has increased customer trust in e-commerce platforms and stimulated purchase demand. On the other hand, the return behavior also increases the cost of the e-commerce platform. With the upgrading of consumption, customers pay more attention to personalized experience. In addition to price, the quality of services provided by e-commerce platforms also directly affects customers’ purchasing decisions and return behavior when they buy online. Therefore, under the personalized return policy of the e-commerce platform, whether consumers will make another purchase is worth studying. To this end, an ensemble learning method (AdaBoost-FSVM) based on fuzzy support vector machine (FSVM) is applied to predict the purchase intention of consumers. First, the grid search method is used to optimize the modeling parameters of the FSVM base classifier. Second, the AdaBoost-FSVM ensemble prediction model is constructed by using multiple base classifiers. In order to evaluate the performance of the prediction models used, logistic regression (LR), support vector machine (SVM), FSVM, random forest (RF), and XGBoost were used to construct prediction models for purchasing behavior. The experimental results demonstrate that the method used in this study predicts more accurately than the comparison algorithms. The predictive model used in this study can be used in the recommendation system of shopping websites and can also be used to guide e-commerce companies to customize various preferential policies and services, so as to quickly and accurately stimulate the purchase intention of more potential consumers.
49

Correa, Ramon Santos, Patricia Teixeira Sampaio, Rafael Utsch Braga, Victor Alberto Lambertucci, Gustavo Matheus Almeida, and Antonio Padua Braga. "Prediction of Mechanical Properties of Seamless Steel Tubes Using Artificial Neural Networks." International Journal of Computational Intelligence and Applications 19, no. 04 (October 15, 2020): 2050028. http://dx.doi.org/10.1142/s1469026820500285.

Abstract:
A bottleneck of laboratory analysis in process industries, including steelmaking plants, is the low sampling rate. Inference models using only variables measured online have therefore been used to make such information available in advance. This study develops predictive models for key mechanical properties of seamless steel tubes, namely yield strength, ultimate tensile strength and hardness. A plant in Brazil was used as the case study. The sample sizes of some steel tube families for a given property are discrepant and sometimes very small. To overcome this sample imbalance and lack of representativeness, committees of predictive neural network models based on bagging predictors, a type of ensemble method, were adopted. As a result, all steel families for all properties have been satisfactorily described, with correlations between targets and model estimates close to 99%. These results were compared to multiple linear regression, support vector machine and a simpler neural network. Having such information available in advance favors corrective actions before tube production is complete, mitigating rework costs in general.
50

Alarfaj, Fawaz Khaled, and Jawad Abbas Khan. "Deep Dive into Fake News Detection: Feature-Centric Classification with Ensemble and Deep Learning Methods." Algorithms 16, no. 11 (November 3, 2023): 507. http://dx.doi.org/10.3390/a16110507.

Abstract:
The online spread of fake news on various platforms has emerged as a significant concern, posing threats to public opinion, political stability, and the dissemination of reliable information. Researchers have turned to advanced technologies, including machine learning (ML) and deep learning (DL) techniques, to detect and classify fake news to address this issue. This research study explores fake news classification using diverse ML and DL approaches. We utilized a well-known “Fake News” dataset sourced from Kaggle, encompassing a labelled news collection. We implemented diverse ML models, including multinomial naïve bayes (MNB), gaussian naïve bayes (GNB), Bernoulli naïve Bayes (BNB), logistic regression (LR), and passive aggressive classifier (PAC). Additionally, we explored DL models, such as long short-term memory (LSTM), convolutional neural networks (CNN), and CNN-LSTM. We compared the performance of these models based on key evaluation metrics, such as accuracy, precision, recall, and the F1 score. Additionally, we conducted cross-validation and hyperparameter tuning to ensure optimal performance. The results provide valuable insights into the strengths and weaknesses of each model in classifying fake news. We observed that DL models, particularly LSTM and CNN-LSTM, showed better performance compared to traditional ML models. These models achieved higher accuracy and demonstrated robustness in classification tasks. These findings emphasize the potential of DL models to tackle the spread of fake news effectively and highlight the importance of utilizing advanced techniques to address this challenging problem.