Journal articles on the topic 'Bayes predictor'

Author: Grafiati

Published: 30 November 2024

Consult the top 50 journal articles for your research on the topic 'Bayes predictor.'

1

Liang, Guohua, Xingquan Zhu, and Chengqi Zhang. "An Empirical Study of Bagging Predictors for Different Learning Algorithms." Proceedings of the AAAI Conference on Artificial Intelligence 25, no. 1 (August 4, 2011): 1802–3. http://dx.doi.org/10.1609/aaai.v25i1.8026.

Abstract:

Bagging is a simple yet effective design that combines multiple single learners to form an ensemble for prediction. Despite its popularity in real-world applications, existing research mainly treats unstable learners as the key to the performance gain of a bagging predictor, and many key factors remain unclear. For example, it is not clear when a bagging predictor can outperform a single learner, or what performance gain to expect when different learning algorithms are used to form a bagging predictor. In this paper, we carry out comprehensive empirical studies to evaluate bagging predictors using 12 different learning algorithms and 48 benchmark data sets. Our analysis uses robustness and stability decompositions to characterize different learning algorithms, through which we rank all learning algorithms and comparatively study their bagging predictors to draw conclusions. Our studies assert that both stability and robustness are key requirements for building a high-performance bagging predictor. In addition, our studies demonstrate that bagging is statistically superior to most single base learners, except for KNN and Naïve Bayes (NB). Multi-layer perceptron (MLP), Naïve Bayes Trees (NBTree), and PART are the learning algorithms with the best bagging performance.
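As a rough illustration of the bagging idea this abstract refers to, the sketch below trains several base learners on bootstrap resamples and combines them by majority vote. The base learner (`train_stump`, a one-feature threshold rule), the toy data, and all parameter names are assumptions made for illustration; they are not the algorithms or data sets evaluated in the paper.

```python
import random
from collections import Counter

def train_stump(sample):
    """Fit a one-feature threshold rule (hypothetical base learner)."""
    # sample: list of (x, label) pairs with numeric x and labels 0/1
    best = None
    for threshold, _ in sample:
        for polarity in (0, 1):
            # polarity is the label predicted when x >= threshold
            errors = sum(
                (polarity if x >= threshold else 1 - polarity) != y
                for x, y in sample
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, polarity)
    _, threshold, polarity = best
    return lambda x: polarity if x >= threshold else 1 - polarity

def bagging_fit(data, n_learners=25, seed=0):
    """Train one base learner per bootstrap resample of the data."""
    rng = random.Random(seed)
    learners = []
    for _ in range(n_learners):
        # Bootstrap: draw len(data) points with replacement
        bootstrap = [rng.choice(data) for _ in data]
        learners.append(train_stump(bootstrap))
    return learners

def bagging_predict(learners, x):
    """Combine the base learners by majority vote."""
    votes = Counter(learner(x) for learner in learners)
    return votes.most_common(1)[0][0]

# Toy, made-up data: small x values are class 0, large ones class 1
data = [(0.1, 0), (0.4, 0), (0.5, 0), (1.2, 1), (1.5, 1), (1.9, 1)]
ensemble = bagging_fit(data)
low_prediction = bagging_predict(ensemble, 0.0)
high_prediction = bagging_predict(ensemble, 2.5)
```

An odd number of learners is used here so that a binary majority vote cannot tie.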

2

Zhang, Shenghan, Yufeng Gu, Yinshan Gao, Xinxing Wang, Daoyong Zhang, and Liming Zhou. "Petrophysical Regression regarding Porosity, Permeability, and Water Saturation Driven by Logging-Based Ensemble and Transfer Learnings: A Case Study of Sandy-Mud Reservoirs." Geofluids 2022 (October 5, 2022): 1–31. http://dx.doi.org/10.1155/2022/9443955.

Abstract:

Most petrophysical models used in conventional logging interpretation imply that porosity, permeability, and water saturation have a linear or nonlinear relationship with well logs, which suggests that these three parameters can be predicted by regression on logging sequences. On this basis, ensemble learning, which was partly developed for fitting problems, can be regarded as a solution. The light gradient boosting machine (LightGBM) is a representative state-of-the-art ensemble learner and is therefore adopted to predict the three target reservoir properties. To safeguard the prediction quality of LightGBM, a continuous restricted Boltzmann machine (CRBM) and Bayesian optimization (Bayes) are introduced to enhance the significance of the input logs and to set the hyperparameters. The resulting hybrid predictor is named CRBM-Bayes-LightGBM. To validate it, data from the Chang 8 member, Jiyuan Oilfield, Ordos Basin, northern China, are used in the experiments. For comparison, three established predictors, k-nearest neighbors (KNN), support vector regression (SVR), and random forest (RF), are introduced as competitors. Because ensemble learning models tend to underfit small datasets, transfer learning is employed as an auxiliary technique to help the core predictor achieve a satisfactory prediction.
Three experiments are then designed for the four predictors, and a comprehensive analysis of the results leads to two conclusions: (1) compared with the three competitors, the LightGBM-based predictor produces more reliable predictions, and its reliability improves further with more training samples; (2) transfer learning is effective for small datasets and performs better still when combined with the proposed predictor. Consequently, CRBM-Bayes-LightGBM combined with transfer learning shows stronger capability and the expected robustness in predicting porosity, permeability, and water saturation, which makes the proposed predictor a preferred choice when geologists, geophysicists, or petrophysicists need to characterize sandy-mud reservoirs.

3

Irmayani, Irmayani, and Budyanita Asrun. "Klasifikasi Sosial Ekonomi Menggunakan Naïve Bayes Classifier." Dewantara Journal of Technology 2, no. 2 (November 17, 2021): 70–74. http://dx.doi.org/10.59563/djtech.v2i2.138.

Abstract:

Data mining encompasses several methods for supporting decision making, one of which is classification. Classification in turn includes several approaches, one of which is the Naive Bayes Classifier. The Naive Bayes model is based on Bayes' theorem and has a classification capability similar to that of a decision tree. This study applies the Naive Bayes Classifier to socioeconomic data from Kelurahan Amessangeng, Palopo City. By sampling residents and using the available predictor variables, the study draws conclusions intended to yield accurate information that can support decisions on socioeconomic policy in Kelurahan Amessangeng.
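A minimal sketch of a categorical Naive Bayes classifier of the kind this abstract mentions is given below. It scores each class by its log prior plus the sum of per-feature log likelihoods, with Laplace smoothing (`alpha`). The function names, the feature values, and the class labels are all hypothetical; the abstract does not specify the study's variables or implementation.

```python
import math
from collections import Counter, defaultdict

def fit_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for j, v in enumerate(row):
            feature_counts[(j, label)][v] += 1
            values[j].add(v)
    return class_counts, feature_counts, values

def predict_naive_bayes(model, row, alpha=1.0):
    """Return the class with the highest smoothed log posterior score."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)  # log prior
        for j, v in enumerate(row):
            count = feature_counts[(j, c)][v]
            # Laplace-smoothed estimate of P(feature j = v | class c)
            score += math.log((count + alpha) / (n_c + alpha * len(values[j])))
        scores[c] = score
    return max(scores, key=scores.get)

# Made-up toy records: (income level, housing status) -> hypothetical label
rows = [("low", "renting"), ("low", "renting"), ("high", "owner"),
        ("high", "owner"), ("low", "owner")]
labels = ["assist", "assist", "no_assist", "no_assist", "assist"]
model = fit_naive_bayes(rows, labels)
first = predict_naive_bayes(model, ("low", "renting"))
second = predict_naive_bayes(model, ("high", "owner"))
```

Working in log space avoids numerical underflow when many features are multiplied together.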

4

Liu, Laura, Hyungsik Roger Moon, and Frank Schorfheide. "Forecasting With Dynamic Panel Data Models." Econometrica 88, no. 1 (2020): 171–201. http://dx.doi.org/10.3982/ecta14952.

Abstract:

This paper considers the problem of forecasting a collection of short time series using cross‐sectional information in panel data. We construct point predictors using Tweedie's formula for the posterior mean of heterogeneous coefficients under a correlated random effects distribution. This formula utilizes cross‐sectional information to transform the unit‐specific (quasi) maximum likelihood estimator into an approximation of the posterior mean under a prior distribution that equals the population distribution of the random coefficients. We show that the risk of a predictor based on a nonparametric kernel estimate of the Tweedie correction is asymptotically equivalent to the risk of a predictor that treats the correlated random effects distribution as known (ratio optimality). Our empirical Bayes predictor performs well compared to various competitors in a Monte Carlo study. In an empirical application, we use the predictor to forecast revenues for a large panel of bank holding companies and compare forecasts that condition on actual and severely adverse macroeconomic conditions.
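For orientation, the textbook Gaussian form of Tweedie's formula can be written as follows. This is the standard scalar version, not the panel-data version developed in the paper; the symbols $y$, $\theta$, $\sigma^2$, $\pi$, and $m$ are notation assumed here for illustration.

```latex
% Assumed setup: y | \theta \sim N(\theta, \sigma^2), with prior \theta \sim \pi
% Marginal density of the observation:
m(y) = \int \frac{1}{\sigma}\,\phi\!\left(\frac{y-\theta}{\sigma}\right)\pi(\theta)\,d\theta
% Tweedie's formula for the posterior mean:
\mathbb{E}[\theta \mid y] = y + \sigma^{2}\,\frac{d}{dy}\,\log m(y)
```

The correction term depends on the prior only through the marginal density $m$, which is why it can be estimated from cross-sectional data without specifying the prior directly.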

5

Robertson, David E., and Q. J. Wang. "A Bayesian Approach to Predictor Selection for Seasonal Streamflow Forecasting." Journal of Hydrometeorology 13, no. 1 (February 1, 2012): 155–71. http://dx.doi.org/10.1175/jhm-d-10-05009.1.

Abstract:

Statistical methods commonly used for forecasting climate and streamflows require the selection of appropriate predictors. Poorly designed predictor selection procedures can result in poor forecasts for independent events. This paper introduces a predictor selection method for the Bayesian joint probability modeling approach to seasonal streamflow forecasting at multiple sites. The method compares forecasting models using a pseudo-Bayes factor (PsBF). A stepwise expansion of a base model is carried out by including the candidate predictor with the highest PsBF that exceeds a selection threshold. Predictors representing the initial catchment conditions are selected on their ability to forecast streamflows and predictors representing future climate influences are selected on their ability to forecast rainfall. The final forecasting model combines selected predictors representing both initial catchment conditions and future climate influences to jointly forecast seasonal streamflows and rainfall. Applications of the predictor selection method to two catchments in eastern Australia show that the best predictors representing initial catchment conditions and future climate influences vary with location and forecast date. Antecedent streamflows are the best indicator of the initial catchment conditions. Predictors representing future climate influences are only selected for forecasts made between July and January. Indicators of El Niño dominate the selected predictors representing future climate influences. The skill of streamflow forecasts varies considerably between locations and throughout the year. Skill scores for the perennial streams of the Goulburn River catchment exceed 40% for several seasons, while for the intermittent streams in the Burdekin River catchment, the skill scores are lower.

6

Wu, Yaning, Song Huang, Haijin Ji, Changyou Zheng, and Chengzu Bai. "A novel Bayes defect predictor based on information diffusion function." Knowledge-Based Systems 144 (March 2018): 1–8. http://dx.doi.org/10.1016/j.knosys.2017.12.015.

7

Hashem, Atef F., and Alaa H. Abdel-Hamid. "Statistical Prediction Based on Ordered Ranked Set Sampling Using Type-II Censored Data from the Rayleigh Distribution under Progressive-Stress Accelerated Life Tests." Journal of Mathematics 2023 (March 30, 2023): 1–19. http://dx.doi.org/10.1155/2023/5211682.

Abstract:

The objective of ranked set sampling is to gather observations from a population that is more likely to cover the population’s full range of values. In this paper, the ordered ranked set sample is obtained using the idea of order statistics from independent and nonidentically distributed random variables under progressive-stress accelerated life tests. The lifetime of the item tested under normal conditions is suggested to be subject to the Rayleigh distribution with a scale parameter satisfying the inverse power law such that the applied stress is a nonlinear increasing function of time. Considering the type-II censoring scheme, one-sample prediction for censored lifetimes is discussed. Numerous point predictors including the Bayes point predictor, conditional median predictor, and best unbiased predictor for future order statistics are discussed. Additionally, conditional prediction intervals for future order statistics are also studied. The theoretical findings reported in this work are shown by illustrative examples based on simulated data as well as real data sets. The effectiveness of the prediction methods is then evaluated by a Monte Carlo simulation study.

8

Alvarez, R. Michael, Delia Bailey, and Jonathan N. Katz. "An Empirical Bayes Approach to Estimating Ordinal Treatment Effects." Political Analysis 19, no. 1 (2011): 20–31. http://dx.doi.org/10.1093/pan/mpq033.

Abstract:

Ordinal variables (categorical variables with a defined order to the categories, but without equal spacing between them) are frequently used in social science applications. Although a good deal of research exists on the proper modeling of ordinal response variables, there is no clear directive on how to model ordinal treatment variables. The usual approaches found in the literature are either to use fully unconstrained, though additive, ordinal group indicators or to use a numeric predictor constrained to be continuous. Generalized additive models are a useful exception to these assumptions. In contrast to the generalized additive modeling approach, we propose the use of a Bayesian shrinkage estimator to model ordinal treatment variables. The estimator we discuss in this paper allows the model to contain both individual group-level indicators and a continuous predictor. In contrast to traditionally used shrinkage models that pull the data toward a common mean, we use a linear model as the basis. Thus, each individual effect can be arbitrary, but the model “shrinks” the estimates toward a linear ordinal framework according to the data. We demonstrate the estimator on two political science examples: the impact of voter identification requirements on turnout and the impact of the frequency of religious service attendance on the liberality of abortion attitudes.

9

Arumi, Endah Ratna, Sumarno Adi Subrata, and Anisa Rahmawati. "Implementation of Naïve bayes Method for Predictor Prevalence Level for Malnutrition Toddlers in Magelang City." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 7, no. 2 (March 3, 2023): 201–7. http://dx.doi.org/10.29207/resti.v7i2.4438.

Abstract:

Nutritional status is an important factor in assessing the growth and development of babies and toddlers. Cases of malnutrition are increasing, especially in Magelang City, and malnutrition can affect the health of toddlers. This study therefore aims to predict the prevalence level of malnutrition with the Naïve Bayes method. The research uses an observational, single-center design at the Magelang City Office. The Naïve Bayes method is applied to time series data; it is widely used for prediction, especially on data sets with many categorical or nominal attributes. The results show that the Naïve Bayes method predicted the number of cases of malnourished toddlers in Magelang City with an accuracy of 75%, a figure attributed to the very small amount of training data, and that malnutrition is most prevalent in three areas: Magersari, North Tidar, and Panjang.

10

Burghardt, Thomas P., and Katalin Ajtai. "Neural/Bayes network predictor for inheritable cardiac disease pathogenicity and phenotype." Journal of Molecular and Cellular Cardiology 119 (June 2018): 19–27. http://dx.doi.org/10.1016/j.yjmcc.2018.04.006.

11

Haddouche, Maxime, Benjamin Guedj, Omar Rivasplata, and John Shawe-Taylor. "PAC-Bayes Unleashed: Generalisation Bounds with Unbounded Losses." Entropy 23, no. 10 (October 12, 2021): 1330. http://dx.doi.org/10.3390/e23101330.

Abstract:

We present new PAC-Bayesian generalisation bounds for learning problems with unbounded loss functions. This extends the relevance and applicability of the PAC-Bayes learning framework, where most of the existing literature focuses on supervised learning problems with a bounded loss function (typically assumed to take values in the interval [0;1]). In order to relax this classical assumption, we propose to allow the range of the loss to depend on each predictor. This relaxation is captured by our new notion of HYPothesis-dependent rangE (HYPE). Based on this, we derive a novel PAC-Bayesian generalisation bound for unbounded loss functions, and we instantiate it on a linear regression problem. To make our theory usable by the largest audience possible, we include discussions on actual computation, practicality and limitations of our assumptions.

12

M, Dhanush, Hency Raj, Pratik Bothra, Raman Zanwar, and Dr S. Nagraj. "Chronic Kidney Disease Prediction by using Naive Bayes." International Journal for Research in Applied Science and Engineering Technology 12, no. 5 (May 31, 2024): 1994–98. http://dx.doi.org/10.22214/ijraset.2024.62002.

Abstract:

Chronic kidney disease (CKD) is a chronic condition in which the kidneys do not work properly or fail completely, leading to dialysis or other related diseases and reducing quality of life. Its symptoms cannot be identified in the preliminary stage, and very few people are aware of the disease or can recognize symptoms early. An early CKD predictor model with higher prediction accuracy and precision is therefore needed, as is a decision support system that helps nephrologists in emergencies. In this research, a naive Bayesian classifier is used for classification along with hierarchy-based selection (nb cb h). The NB classifier works efficiently with huge datasets and reduces computational complexity, and the speed of prediction and disease severity analysis with NB is extremely high.

13

Aristawidya, Rafika, Indahwati Indahwati, Erfiani Erfiani, Anwar Fitrianto, and Muftih A. A. "PERBANDINGAN ANALISIS REGRESI LOGISTIK BINER DAN NAÏVE BAYES CLASSIFIER UNTUK MEMPREDIKSI FAKTOR RESIKO DIABETES." Jurnal Lebesgue : Jurnal Ilmiah Pendidikan Matematika, Matematika dan Statistika 5, no. 2 (August 8, 2024): 782–94. http://dx.doi.org/10.46306/lb.v5i2.617.

Abstract:

Diabetes is a global health problem that is increasing in prevalence worldwide. This study compares the performance of two data analysis methods, binary logistic regression and the naïve Bayes classifier, in predicting diabetes risk. It aims to identify the factors that significantly affect diabetes risk and to classify diabetes risk using binary logistic regression, then to compare that classification with the naïve Bayes classifier. Binary logistic regression models the relationship between independent predictor variables and a binary dependent variable, while the naïve Bayes classifier assumes independence between variables. Both methods were evaluated on accuracy, sensitivity, specificity, and positive predictive value. The results show that the factors that influence the risk of diabetes are Age, Gender, Polyuria, Polydipsia, Genital thrush, Itching, Irritability, and Partial paresis. Furthermore, binary logistic regression has a higher classification accuracy (92.31%) than the naïve Bayes classifier (84.61%). Binary logistic regression was therefore identified as the best method for predicting diabetes risk in the context of this study.
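The four evaluation measures named in this abstract can all be computed from a binary confusion matrix, as in the sketch below. The counts are made-up numbers chosen for illustration and are not taken from the study; `binary_metrics` is a hypothetical helper name.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification measures from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),                # true positive rate
        "specificity": tn / (tn + fp),                # true negative rate
        "ppv": tp / (tp + fp),                        # positive predictive value
    }

# Hypothetical counts: 40 true positives, 5 false positives,
# 45 true negatives, 10 false negatives
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, since accuracy alone can hide poor detection of the minority class.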

14

Nuzhny, A. S. "Bayes regularization in the selection of weight coefficients in the predictor ensembles." Proceedings of the Institute for System Programming of RAS 31, no. 4 (2019): 113–20. http://dx.doi.org/10.15514/ispras-2019-31(4)-7.

15

Singh, Pardeep. "Diseases Predictor using Machine Learning." International Journal of Scientific Research in Engineering and Management 08, no. 05 (May 14, 2024): 1–5. http://dx.doi.org/10.55041/ijsrem33957.

Abstract:

For disease management and treatment to be successful in the healthcare industry, prompt and correct diagnosis is crucial. The potential for creating intelligent disease prediction systems has greatly increased with the use of machine learning techniques. This study describes Project Ailment Analysis, a Python-based disease prediction program that makes use of the Naive Bayes, Decision Tree, and Random Forest machine learning methods. This project's primary objective is to identify the most likely illness from the patient's records and symptoms. The system seeks to optimize the diagnosis process and assist medical practitioners in making well-informed decisions by leveraging the power of these algorithms. The article discusses the rationale behind the project, presents a literature review of related work, describes the objectives, the proposed approach, and the implemented algorithms. Furthermore, the methodology, implementation details and future scope of the project are discussed and concluded with a summary of the research findings.

16

Nguyen, Vu-Linh, and Eyke Hüllermeier. "Multilabel Classification with Partial Abstention: Bayes-Optimal Prediction under Label Independence." Journal of Artificial Intelligence Research 72 (November 2, 2021): 613–65. http://dx.doi.org/10.1613/jair.1.12610.

Abstract:

In contrast to conventional (single-label) classification, the setting of multilabel classification (MLC) allows an instance to belong to several classes simultaneously. Thus, instead of selecting a single class label, predictions take the form of a subset of all labels. In this paper, we study an extension of the setting of MLC, in which the learner is allowed to partially abstain from a prediction, that is, to deliver predictions on some but not necessarily all class labels. This option is useful in cases of uncertainty, where the learner does not feel confident enough on the entire label set. Adopting a decision-theoretic perspective, we propose a formal framework of MLC with partial abstention, which builds on two main building blocks: First, the extension of underlying MLC loss functions so as to accommodate abstention in a proper way, and second the problem of optimal prediction, that is, finding the Bayes-optimal prediction minimizing this generalized loss in expectation. It is well known that different (generalized) loss functions may have different risk-minimizing predictions, and finding the Bayes predictor typically comes down to solving a computationally complex optimization problem. In the most general case, given a prediction of the (conditional) joint distribution of possible labelings, the minimizer of the expected loss needs to be found over a number of candidates which is exponential in the number of class labels. We elaborate on properties of risk minimizers for several commonly used (generalized) MLC loss functions, show them to have a specific structure, and leverage this structure to devise efficient methods for computing Bayes predictors. Experimentally, we show MLC with partial abstention to be effective in the sense of reducing loss when being allowed to abstain.
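To make the idea of a risk-minimizing prediction with partial abstention concrete, the sketch below handles one deliberately simple case: a Hamming-type loss that charges 1 for each wrong label, 0 for each correct label, and a constant penalty for each abstained label. Because this loss is a sum over labels, the expected loss can be minimized label by label from the marginal probabilities alone. The loss, the constant `abstain_cost`, and the function name are assumptions for illustration; the paper treats a broader family of losses and penalties.

```python
def bayes_optimal_partial(marginals, abstain_cost):
    """Minimize expected loss per label; None marks an abstention.

    marginals: estimated P(label_i = 1 | x) for each label i.
    abstain_cost: assumed constant penalty for abstaining on one label.
    """
    prediction = []
    expected_loss = 0.0
    for p in marginals:
        risk_if_one = 1.0 - p   # expected loss of predicting 1
        risk_if_zero = p        # expected loss of predicting 0
        best_label = 1 if risk_if_one <= risk_if_zero else 0
        best_risk = min(risk_if_one, risk_if_zero)
        if abstain_cost < best_risk:
            # Abstaining is cheaper than the best committed prediction
            prediction.append(None)
            expected_loss += abstain_cost
        else:
            prediction.append(best_label)
            expected_loss += best_risk
    return prediction, expected_loss

# Hypothetical marginals for four labels and a per-label abstention cost of 0.25
prediction, loss = bayes_optimal_partial([0.95, 0.55, 0.10, 0.40], abstain_cost=0.25)
```

In this toy setting the learner commits on the two confident labels and abstains on the two whose marginals are close to 0.5.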

17

Budiarti, Retno, Febri Hemarani, Mohammad Reza, and Rindi Melati Mulyasari. "Comparative Analysis of Machine Learning Algorithms on Family Wellness Classification." CAUCHY: Jurnal Matematika Murni dan Aplikasi 9, no. 2 (November 1, 2024): 222–36. http://dx.doi.org/10.18860/ca.v9i2.28259.

Abstract:

Family welfare is a state in which a family can experience happiness, have a decent quality of life, and be sufficient in meeting primary and secondary needs in family life. One factor that influences family welfare is the amount of per capita expenditure. This study aims to compare the performance of three machine learning algorithms, namely KNN (K-Nearest Neighbors), random forest, and naive Bayes, in classifying the status of families per province in Indonesia as prosperous or not prosperous. The data used in this study is demographic and social statistics data from the years 2017-2021, obtained from the bps.go.id website. The first statistical analysis conducted is principal component analysis (PCA) with 9 predictor variables. PCA produces four principal components which are then used in the KNN, random forest, and naive Bayes methods. The analysis results from the KNN, random forest, and naive Bayes methods each yield an F1-score of 65.46%, 68%, and 69.44%, respectively.

18

Golparvar, L., and A. Parsian. "On Bayes predictor of times to failure of Type-II progressively censored sample." Journal of Statistical Computation and Simulation 85, no. 17 (November 7, 2014): 3420–36. http://dx.doi.org/10.1080/00949655.2014.977287.

19

Pennell, Michael L., and David B. Dunson. "Nonparametric Bayes Testing of Changes in a Response Distribution with an Ordinal Predictor." Biometrics 64, no. 2 (June 2008): 413–23. http://dx.doi.org/10.1111/j.1541-0420.2007.00885.x.

20

Premalatha, Mrs M., G. M. Delipan, V. Kavyashri, S. Sanjay, and K. Srijayakanth. "Human Symptoms Based on Diseases Predictor." International Journal for Research in Applied Science and Engineering Technology 11, no. 4 (April 30, 2023): 2425–29. http://dx.doi.org/10.22214/ijraset.2023.50642.

Abstract:

Many situations in day-to-day life affect human health. Problems arise quickly and new diseases are emerging rapidly. The main objective of this project is to apply classification algorithms to build a model that predicts the occurrence of various diseases and to identify the classification algorithm that best estimates a patient's disease probability. Identifying the possibility of disease in patients is a tedious task for doctors and researchers because it requires experience and additional medical tests. The project therefore seeks the classification algorithm that gives the greatest accuracy improvement when classifying normal and abnormal persons. It implements Naïve Bayes, support vector machine, and decision tree classifiers and calculates their accuracy scores. The applied NBS, SVM, and DT classifiers help predict disease with higher accuracy on the new data set. Python 3.9 is used as the coding language.

21

Hidayatillah, Rumaisah, Mirwan Mirwan, Mohammad Hakam, and Aryo Nugroho. "Levels of Political Participation Based on Naive Bayes Classifier." IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 13, no. 1 (January 31, 2019): 73. http://dx.doi.org/10.22146/ijccs.42531.

Abstract:

Social media is now growing rapidly and globally and has become an important part of society. During the campaign period for regional head elections in Indonesia, the candidates and their supporting parties actively use social media as a campaign tool. Twitter in particular is known as a political microblogging medium that can provide data about current political events based on users’ tweets. Using Twitter as a data source, this study analyzes public participation during the campaign period for the 2018 Central Java regional head election. The purpose is to observe how much reaction each candidate in the election receives. A crawling program downloads all tweets containing certain candidate names. After a series of preprocessing stages, the data are classified using Naive Bayes. The predictor features in the classification datasets are the numbers of replies, retweets, and likes, while the target variable is the reaction, divided into three levels: high, medium, and low. These levels are determined from users’ reactions to a tweet. Under these rules, Naive Bayes classified 76.74% of the data correctly for Ganjar Pranowo and 68.81% for Sudirman Said.

22

Budhy Adzy, Luthfy, Asriyanik Asriyanik, and Agung Pambudi. "ALGORITMA NAÏVE BAYES UNTUK KLASIFIKASI KELAYAKAN PENERIMA BANTUAN IURAN JAMINAN KESEHATAN PEMERINTAH DAERAH KABUPATEN SUKABUMI." Jurnal Mnemonic 6, no. 1 (May 15, 2023): 1–10. http://dx.doi.org/10.36040/mnemonic.v6i1.5714.

Abstract:

Health Insurance (Jaminan Kesehatan, JK) for Contribution Assistance Recipients (Penerima Bantuan Iuran, PBI) is coverage in the form of health care that gives recipients health protection benefits. It is granted to each member of the public who has paid contributions or whose contributions are paid by the government. A common problem in the field is that the current selection of Beneficiary Families (Keluarga Penerima Manfaat, KPM) does not yet help the Supervisor decide objectively and on target. This study was conducted to help the Supervisor determine how to select eligible PBI-JK candidates objectively and on target, based on established standards. The classification applies the naïve Bayes algorithm with the Knowledge Discovery in Databases (KDD) method. Naïve Bayes is a data mining algorithm and statistical classifier: a simple probabilistic classifier that applies Bayes' theorem under a strong assumption of independence between variables. Its advantages are that it scales with the number of predictors and data points, can produce probability predictions, and handles both continuous and discrete data. The outcome of this study is an automatic classification of eligibility for health insurance contribution assistance, which labels each record as eligible or not eligible for assistance under the Sukabumi Regency local government's Health Insurance Contribution Assistance Recipient program.

23

Prawira, Arya, Desi Arisandi, and Tri Sutrisno. "Penerapan Algoritma Naive Bayes dan Multiple Linear Regression Untuk Prediksi Status dan Plafon Kredit (Studi Kasus: Bank ABC)." Journal on Education 5, no. 1 (December 27, 2022): 1075–87. http://dx.doi.org/10.31004/joe.v5i1.720.

Abstract:

Despite changing technology, human resources are still needed in many parts of decision making, and companies and organizations still rely on people to analyze data. However, human analysis takes more time and effort to complete, and human resources can bring negative factors such as being hard to manage. It is therefore important to provide excellent training so that staff can reach the required outcome. Machine learning is one of the most common bodies of knowledge used in decision making. It takes many forms, such as regression, classification, and clustering; two of these, regression and classification, are used in this application. Naive Bayes is a classification method rooted in Bayes' theorem. It uses historical data to predict future outcomes based on the characteristics of that data. Multiple linear regression involves more than one independent variable, or predictor. With machine learning and human resources combined, one can readily analyze creditworthiness and determine the credit limit of a bank customer without much time or effort.

24

Antika, Dwi Putri, Mohamat Fatekurohman, and I. Made Tirta. "Banking Credit Risk Analysis with Naive Bayes Approach and Cox Proportional Hazard." International Journal of Advanced Engineering Research and Science 9, no. 8 (2022): 365–70. http://dx.doi.org/10.22161/ijaers.98.41.

Abstract:

Credit is needed by some people for certain purposes, and it requires an intermediary such as a bank. The debtor may be unable to make payments according to the original terms or may even cause losses, in which case the bank may lose the opportunity to earn interest and its total income falls. This problem is a case of non-performing loans. In statistics, the time between a person missing a payment and the loan becoming non-current can be predicted using survival analysis. To predict credit status, classification or prediction methods from machine learning can be used to find out how much influence the predictor variables have. This study focuses on the credit risk case of how a bank decides to grant credit to prospective debtors, using the Naive Bayes classifier from machine learning and Cox regression from survival analysis. Evaluation of the naive Bayes classifier using accuracy, the confusion matrix, and ROC shows that it performs well in predicting credit status. Multinomial Naïve Bayes has a higher performance value in this study, 92%, than Gaussian Naïve Bayes and Bernoulli Naïve Bayes. Cox regression shows that income factors and loan history have a major influence on determining credit status.

25

Aggarwal, Priyanka. "Bayes Predictor of One-Parameter Exponential Family Type Population Mean Under Balanced Loss Function." Communications in Statistics - Theory and Methods 35, no. 8 (August 2006): 1397–408. http://dx.doi.org/10.1080/03610920600637321.

26

Lubis, Ahmadi Irmansyah, and Rudy Chandra. "Forward Selection Attribute Reduction Technique for Optimizing Naïve Bayes Performance in Sperm Fertility Prediction." Sinkron 8, no. 1 (January 1, 2023): 275–85. http://dx.doi.org/10.33395/sinkron.v8i1.11967.

Abstract:

Infertility in couples is an important issue that can destroy family harmony, and many people still consider infertility a female problem. However, about 7% of men of childbearing age suffer from infertility. The biggest factor causing male infertility is poor sperm quality, and sperm analysis can be the best predictor of male fertility potential. Machine learning and data mining techniques can be used to automate disease diagnosis. This study aims to obtain a regular form classification model from sperm sample data of 100 volunteers. The model can be used to classify male fertility into two classes, namely normal and alter (decreased fertility). The study uses a fertility dataset obtained from the UCI Machine Learning Repository. Data preprocessing is required before the data mining process. Classification is carried out using Naive Bayes, with attribute reduction by forward selection to see how much the accuracy of Naive Bayes improves. Naive Bayes without attribute reduction has an accuracy of 85%, while Naive Bayes with forward selection has an accuracy of 88% in predicting sperm fertility. Using forward selection to reduce attributes therefore increased accuracy by 3 percentage points and can be used to help predict sperm fertility.

27

Echeverri, Julián, Juan C. Zambrano, and Albeiro López Herrera. "Genomic evaluation of Holstein cattle in Antioquia (Colombia): a case study." Revista Colombiana de Ciencias Pecuarias 27, no. 4 (November 6, 2014): 306–14. http://dx.doi.org/10.17533/udea.rccp.324905.

Abstract:

Background: DNA markers have been widely used in genetic evaluation throughout the last decade because they increase the reliability of breeding values (BV), mainly in young animals. Objective: to compare breeding values estimated through the conventional method (best linear unbiased predictor, BLUP) with methods that include molecular markers for milk traits in Holstein cattle in Antioquia (Colombia). Methods: predictions of breeding values were performed using three methods: BLUP, molecular best linear unbiased predictor (MBLUP), and Bayes C. The breeding values were compared using Spearman's correlation coefficient and the linear regression coefficient. Results: all Spearman correlation coefficients between breeding values obtained by different methods were greater than 0.5, while linear regression coefficients ranged between -2.10 and 1.58 (the Spanish and Portuguese versions of the abstract give the upper bound as 1.96). Conclusions: prediction of breeding values through BLUP, MBLUP and Bayes C showed different results in terms of the magnitude of the estimated values. However, animal ranking according to breeding values was not significantly different. Keywords: genetic markers, genomic selection, breeding value, milk quality, milk traits.

28

MORALES, MARÍA, CARMELO RODRÍGUEZ, and ANTONIO SALMERÓN. "SELECTIVE NAIVE BAYES FOR REGRESSION BASED ON MIXTURES OF TRUNCATED EXPONENTIALS." International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 15, no. 06 (December 2007): 697–716. http://dx.doi.org/10.1142/s0218488507004959.

Abstract:

Naive Bayes models have been successfully used in classification problems where the class variable is discrete. These models have also been applied to regression or prediction problems, i.e. classification problems where the class variable is continuous, but usually under the assumption that the joint distribution of the feature variables and the class is multivariate Gaussian. In this paper we are interested in regression problems where some of the feature variables are discrete while the others are continuous. We propose a Naive Bayes predictor based on the approximation of the joint distribution by a Mixture of Truncated Exponentials (MTE). We have followed a filter-wrapper procedure for selecting the variables to be used in the construction of the model. This scheme is based on the mutual information between each of the candidate variables and the class. Since the mutual information can not be computed exactly for the MTE distribution, we introduce an unbiased estimator of it, based on Monte Carlo methods. We test the performance of the proposed model in artificial and real-world datasets.

29

Dou, Yiping, Nhu D. Le, and James V. Zidek. "Temporal Forecasting with a Bayesian Spatial Predictor: Application to Ozone." Advances in Meteorology 2012 (2012): 1–13. http://dx.doi.org/10.1155/2012/191575.

Abstract:

This paper develops and empirically compares two Bayesian and empirical Bayes space-time approaches for forecasting next-day hourly ground-level ozone concentrations. The comparison involves the Chicago area in the summer of 2000 and measurements from fourteen monitors as reported in the EPA's AQS database. One of these approaches adapts a multivariate method originally designed for spatial prediction. The second is based on a state-space modeling approach originally developed and used in a case study involving one week in Mexico City with ten monitoring sites. The first method proves superior to the second in the Chicago Case Study, judged by several criteria, notably root mean square predictive accuracy, computing times, and calibration of 95% predictive intervals.

30

Mackenzie Rivero, Alexander, Alberto Rodríguez Rodríguez, Edwin Joao Merchán Carreño, and Rodrigo Martínez Béjar. "Machine Learning for the Evolutionary Analysis of Breast Cancer." Journal of Science and Research: Revista Ciencia e Investigación 3, CITT2017 (February 22, 2018): 44–49. http://dx.doi.org/10.26910/issn.2528-8083vol3isscitt2017.2018pp44-49.

Abstract:

Machine learning is used to create a predictive data model from the analysis of a data set with 286 instances and nine attributes belonging to the Institute of Oncology of the University Medical Center, Ljubljana. The data are preprocessed with intelligent data analysis techniques to eliminate missing values and to evaluate each attribute so that results can be optimized. Several classification algorithms were used, including J48 trees, random forest, Bayes net, naive Bayes, and decision table, to find the one that, given the characteristics of the data, yields the best classification percentage and therefore a better confusion matrix, using 66% of the data for learning and 33% for validating the model. With this model, a predictor with 71.134% effectiveness is obtained for estimating whether or not breast cancer recurs.

31

Chakraborty, Prithwish, Manish Marwah, Martin Arlitt, and Naren Ramakrishnan. "Fine-Grained Photovoltaic Output Prediction Using a Bayesian Ensemble." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (September 20, 2021): 274–80. http://dx.doi.org/10.1609/aaai.v26i1.8179.

Abstract:

Local and distributed power generation is increasingly reliant on renewable power sources, e.g., solar (photovoltaic or PV) and wind energy. The integration of such sources into the power grid is challenging, however, due to their variable and intermittent energy output. To effectively use them on a large scale, it is essential to be able to predict power generation at a fine-grained level. We describe a novel Bayesian ensemble methodology involving three diverse predictors. Each predictor estimates mixing coefficients for integrating PV generation output profiles but captures fundamentally different characteristics. Two of them employ classical parameterized (naive Bayes) and non-parametric (nearest neighbor) methods to model the relationship between weather forecasts and PV output. The third predictor captures the sequentiality implicit in PV generation and uses motifs mined from historical data to estimate the most likely mixture weights using a stream prediction methodology. We demonstrate the success and superiority of our methods on real PV data from two locations that exhibit diverse weather conditions. Predictions from our model can be harnessed to optimize scheduling of delay tolerant workloads, e.g., in a data center.

32

Simarmata, Justin Eduardo, Gerhard-Wilhelm Weber, and Debora Chrisinta. "Performance Evaluation of Classification Methods on Big Data: Decision Trees, Naive Bayes, K-Nearest Neighbors, and Support Vector Machines." Jurnal Matematika, Statistika dan Komputasi 20, no. 3 (May 15, 2024): 623–38. http://dx.doi.org/10.20956/j.v20i3.32970.

Abstract:

Performance evaluation of classification methods on big data is becoming increasingly important in addressing the challenges of data analysis at scale. This study conducts a comparative evaluation of four classification methods, namely Decision Trees (DT), Naive Bayes (NB), k-Nearest Neighbors (KNN), and Support Vector Machines (SVM), on big data, using both simulated data and real data available in the ISLR package in RStudio. The simulated data consisted of two types of datasets, generated from predictor variables that were normally distributed with different means and variances, and response variables generated in classes adjusted to the characteristics of the predictor variables with different proportions. The real data are taken from two types of numeric variables and predictor variables available in the package. The sample sizes evaluated for each method are n = 500, n = 1000 and n = 5000. For the real data, samples are divided randomly to maintain data representativeness. Performance is measured using accuracy. The simulation results for Dataset 1 show that the methods that influence the quality of the classification when applied to big data are DT and KNN. In Dataset 2, however, the results of the DT method change because of the influence of the number of classes and the proportion of the class distribution in the data. The simulation results are confirmed on real data, where the same methods influence quality when applied to big data, while the NB and SVM methods do not show a consistent influence. The observations in this study show that the DT and KNN methods have several advantages that make them suitable for application to big data.

33

S, Arun Kumar. "Risk Assess: A Symptom-Based Disease Predictor." International Journal for Research in Applied Science and Engineering Technology 12, no. 1 (January 31, 2024): 724–30. http://dx.doi.org/10.22214/ijraset.2024.58037.

Abstract:

Recently, the health care sector has seen extraordinary changes through the integration of data collection. This study presents "Risk Assess", an innovative web application developed with Flask. "Risk Assess" serves as a comprehensive platform that works with patient vitals to generate risk assessments. This Python-based web application uses different machine learning models, such as Classification and Regression Trees (CART), Linear Support Vector Machine (SVM), Gaussian Naïve Bayes (NB), and K-Nearest Neighbor (KNN), to analyze data and build a model that predicts whether a given set of symptoms leads to a particular disease. Notably, "Risk Assess" streamlines prediction for cancer, diabetes, heart disease, kidney disease, and liver disease, improving convenience and operational efficiency. The project features a secure login page that lets users create unique credentials using their email addresses. By requiring a username and password tied to the user's email, it prioritizes both security and user convenience. The paper covers the working architecture, design principles, and implementation details, highlighting the crucial role of machine learning. "Risk Assess" exemplifies the use of Python-based technologies and machine learning, showing the potential for symptom-based prediction through innovative digital solutions.

34

Carpita, Maurizio, Enrico Ciavolino, and Paola Pasca. "Exploring and modelling team performances of the Kaggle European Soccer database." Statistical Modelling 19, no. 1 (January 10, 2019): 74–101. http://dx.doi.org/10.1177/1471082x18810971.

Abstract:

This study explores a large, open database of soccer leagues in 10 European countries. Data related to players, teams and matches covering seven seasons (from 2009/2010 to 2015/2016) were retrieved from Kaggle, an online platform in which big data are available for predictive modelling and analytics competition among data scientists. Based on preliminary data analysis, experts’ evaluation and players’ position on the football pitch, role-based indicators of teams’ performance have been built and used to estimate the win probability of the home team with the binomial logistic regression (BLR) model, which has been extended to include the ELO rating predictor and two random effects due to the hierarchical structure of the dataset. The predictive power of the BLR model and its extensions has been compared with that of other statistical modelling approaches (Random Forest, Neural Network, k-NN, Naïve Bayes). Results showed that role-based indicators substantially improved the performance of all the models used in both this work and in previous works available on Kaggle. The base BLR model increased prediction accuracy by 10 percentage points, and showed the importance of defence performances, especially in the last seasons. Inclusion of both the ELO rating predictor and the random effects did not substantially improve prediction, as the simpler BLR model performed equally well. With respect to the other models, only Naïve Bayes showed more balanced results in predicting both win and no-win of the home team.

35

Delgado, Rosario, and Héctor Sánchez-Delgado. "The effect of seasonality in predicting the level of crime. A spatial perspective." PLOS ONE 18, no. 5 (May 31, 2023): e0285727. http://dx.doi.org/10.1371/journal.pone.0285727.

Abstract:

This paper presents an innovative methodology to study the use of seasonality (the existence of cyclical patterns) to help predict the level of crime. This methodology combines the simplicity of entropy-based metrics that describe temporal patterns of a phenomenon, on the one hand, and the predictive power of machine learning on the other. First, the classical Colwell’s metrics Predictability and Contingency are used to measure different aspects of seasonality in a geographical unit. Second, if those metrics turn out to be significantly different from zero, supervised machine learning classification algorithms are built, validated and compared, to predict the level of crime based on the time unit. The methodology is applied to a case study in Barcelona (Spain), with the month as the unit of time and the municipal district as the geographical unit (the city is divided into 10 districts), using a set of property crime data covering the period 2010-2018. The results show that (a) Colwell’s metrics are significantly different from zero in all municipal districts, (b) the month of the year is a good predictor of the level of crime, and (c) Naive Bayes is the most competitive classifier among those tested. The districts can be ordered using Naive Bayes, based on the strength of the month as a predictor for each of them. Surprisingly, this order coincides with that obtained using Contingency. This fact is very revealing, given the apparent disconnection between entropy-based metrics and machine learning classifiers.

36

Aisy, Salsabila Rahadatul, and Budi Prasetiyo. "Sentiment Analysist of the TPKS Law on Twitter Using InSet Lexicon with Multinomial Naïve Bayes and Support Vector Machine Based on Soft Voting." Recursive Journal of Informatics 1, no. 2 (September 29, 2023): 93–101. http://dx.doi.org/10.15294/rji.v1i2.68324.

Abstract:

The Indonesian Sexual Violence Law (TPKS Law) regulates forms of sexual violence. It drew both support and opposition during drafting and was officially ratified on April 12th, 2022. Since ratification, opinion has remained divided, and supervision of the law's implementation is needed. Purpose: This study was conducted to identify the application and accuracy of soft voting over the multinomial naïve Bayes and support vector machine algorithms, and to gauge public opinion on the TPKS Law as a support tool in evaluating the law. Methods/Study design/approach: The InSet lexicon is used for labeling, and classification uses soft voting over the multinomial naïve Bayes and support vector machine algorithms. Result/Findings: The accuracy obtained by applying 10-fold cross-validation to soft voting is 84.31%, using a weight of 1:3 for multinomial naïve Bayes and support vector machine. Soft voting obtains better accuracy than either standalone predictor and works well for sentiment analysis of the TPKS Law. Novelty/Originality/Value: This study uses two combined lexicons (the Colloquial Indonesian lexicon and the InaNLP formalization dictionary) in the normalization process and uses the InSet lexicon for automatic labeling in sentiment analysis of the TPKS Law.
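The 1:3 weighting mentioned in this abstract can be illustrated with a short sketch of weighted soft voting. This is an assumption-laden example, not the authors' code: the function name `soft_vote` and the probability values are hypothetical, and it assumes both base models output class probabilities (for a support vector machine this typically requires a separate probability calibration step).

```python
def soft_vote(probabilities, weights):
    """Weighted average of per-model class probabilities.

    Returns the winning label and the averaged probabilities.
    """
    if len(probabilities) != len(weights):
        raise ValueError("one weight per model is required")
    total_weight = sum(weights)
    classes = probabilities[0].keys()
    averaged = {
        c: sum(w * p[c] for p, w in zip(probabilities, weights)) / total_weight
        for c in classes
    }
    label = max(averaged, key=averaged.get)
    return label, averaged


# Hypothetical class probabilities for one tweet from the two base models.
nb_probs = {"positive": 0.70, "negative": 0.30}   # multinomial naive Bayes
svm_probs = {"positive": 0.40, "negative": 0.60}  # support vector machine
label, averaged = soft_vote([nb_probs, svm_probs], weights=[1, 3])
```

With these made-up inputs, the heavier weight on the support vector machine decides the vote, even though naive Bayes alone would have chosen the other class.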

37

Schepen, Andrew, Q. J. Wang, and David Robertson. "Evidence for Using Lagged Climate Indices to Forecast Australian Seasonal Rainfall." Journal of Climate 25, no. 4 (February 8, 2012): 1230–46. http://dx.doi.org/10.1175/jcli-d-11-00156.1.

Abstract:

Lagged oceanic and atmospheric climate indices are potentially useful predictors of seasonal rainfall totals. A rigorous Bayesian joint probability modeling approach is applied to find the cross-validation predictive densities of gridded Australian seasonal rainfall totals using lagged climate indices as predictors over the period of 1950–2009. The evidence supporting the use of each climate index as a predictor of seasonal rainfall is quantified by the pseudo-Bayes factor based on cross-validation predictive densities. The evidence strongly supports the use of climate indices from the Pacific region with weaker, but positive, evidence for the use of climate indices from the Indian region and the extratropical region. The spatial structure and seasonal variation of the evidence for each climate index is mapped and compared. Spatially, the strongest supporting evidence is found for forecasting in northern and eastern Australia. Seasonally, the strongest evidence is found from August–October to November–January and the weakest evidence is found from March–May to May–July. In some regions and seasons, there is little evidence supporting the use of climate indices for forecasting seasonal rainfall. Climate indices derived from sea surface temperature anomalies in the Pacific region show stronger persistence in the relationship with Australian seasonal rainfall totals than climate indices derived from sea surface temperature anomalies in the Indian region. Climate indices derived from atmospheric variables are also strongly supported, provided they represent the large-scale circulation. Many climate indices are found to show similar supporting evidence for forecasting Australian seasonal rainfall, leading to the prospect of combining climate indices in multiple predictor models and/or model averaging.

38

Richard, Mark, and Jan Vecer. "Efficiency Testing of Prediction Markets: Martingale Approach, Likelihood Ratio and Bayes Factor Analysis." Risks 9, no. 2 (February 1, 2021): 31. http://dx.doi.org/10.3390/risks9020031.

Abstract:

This paper studies the efficient market hypothesis in prediction markets, and the results are illustrated for the in-play football betting market using the quoted odds for the English Premier League. Our analysis is based on the martingale property, where the last quoted probability should be the best predictor of the outcome and all previous quotes should be statistically insignificant. We use regression analysis to test for the significance of the previous quotes in both the time setup and the spatial setup based on stopping times, when the quoted probabilities reach certain bounds. The main contribution of this paper is to show how a potentially different distributional opinion based on the violation of market efficiency can be monetized by optimal trading, where the agent maximizes a logarithmic utility function. In particular, the trader can realize a trading profit that corresponds to the likelihood ratio in the situation of one market maker and one market taker, or the Bayes factor in the situation of two or more market takers.

39

Hussain, A., G. Ali, F. Akhtar, Z. H. Khand, and A. Ali. "Design and Analysis of News Category Predictor." Engineering, Technology & Applied Science Research 10, no. 5 (October 26, 2020): 6380–85. http://dx.doi.org/10.48084/etasr.3825.

Abstract:

Recent technological advancements have significantly changed the way news is produced, consumed, and disseminated. Frequent and on-the-spot news reporting has been enabled, which smartphones can access anywhere and anytime. News categorization or classification can significantly help in its proper and timely dissemination. This study evaluates and compares news category predictors' performance based on four supervised machine learning models. We choose a standard dataset of British Broadcasting Corporation (BBC) news consisting of five categories: business, sports, technology, politics, and entertainment. Four multi-class news category predictors have been developed and trained on the same dataset: Naïve Bayes, Random Forest, K-Nearest Neighbors (KNN), and Support Vector Machine (SVM). Each category predictor's performance was evaluated by analyzing the confusion matrix and quantifying precision, recall, and overall accuracy on the test dataset. In the end, the performance of all category predictors was studied and compared. The results show that all category predictors achieved satisfactory accuracy. The SVM model performed best among the four supervised learning models, categorizing news articles with 98.3% accuracy, while the KNN model obtained the lowest accuracy. The KNN model's performance can, however, be enhanced by investigating the optimal number of neighbors (K).
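The evaluation described in this abstract (precision, recall, and overall accuracy read off a confusion matrix) can be sketched as follows. The function name `per_class_metrics` and the counts are hypothetical and are not the paper's results; the sketch assumes rows index the true class and columns the predicted class.

```python
def per_class_metrics(confusion, classes):
    """Compute overall accuracy and per-class (precision, recall).

    confusion[i][j] is the number of items of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(classes)))
    metrics = {}
    for k, name in enumerate(classes):
        true_pos = confusion[k][k]
        predicted_k = sum(row[k] for row in confusion)  # column sum
        actual_k = sum(confusion[k])                    # row sum
        precision = true_pos / predicted_k if predicted_k else 0.0
        recall = true_pos / actual_k if actual_k else 0.0
        metrics[name] = (precision, recall)
    return correct / total, metrics


# Hypothetical counts for three of the news categories (not the paper's results).
classes = ["business", "sports", "technology"]
confusion = [
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
]
accuracy, metrics = per_class_metrics(confusion, classes)
```

Reporting precision and recall per class, alongside overall accuracy, shows which categories a predictor confuses with one another, which a single accuracy figure hides.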

40

Ju, Zhe, and Shi-Yun Wang. "Identify Lysine Neddylation Sites Using Bi-profile Bayes Feature Extraction via the Chou’s 5-steps Rule and General Pseudo Components." Current Genomics 20, no. 8 (January 23, 2020): 592–601. http://dx.doi.org/10.2174/1389202921666191223154629.

Abstract:

Introduction: Neddylation is a highly dynamic and reversible post-translational modification that has been found to be involved in various biological processes. Abnormal neddylation has previously been shown to be closely related to some human diseases. The detection of neddylation sites is essential for elucidating the regulation mechanisms of protein neddylation. Objective: As the detection of lysine neddylation sites by traditional experimental methods is often expensive and time-consuming, it is imperative to design computational methods to identify neddylation sites. Methods: In this study, a bioinformatics tool named NeddPred is developed to identify underlying protein neddylation sites. A bi-profile Bayes feature extraction is used to encode neddylation sites, and a fuzzy support vector machine model is utilized to overcome the problem of noise and class imbalance in the prediction. Results: As illustrated by 10-fold cross-validation, NeddPred achieved a Matthew's correlation coefficient of 0.7082 and an area under the receiver operating characteristic curve of 0.9769. Independent tests show that NeddPred significantly outperforms the existing lysine neddylation site predictor NeddyPreddy. Conclusion: Therefore, NeddPred can be a complement to the existing tools for the prediction of neddylation sites. A user-friendly web server for NeddPred is accessible at 123.206.31.171/NeddPred/.

41

Tauil, Yasser Bulaty, Bruno Samways dos Santos, and Rafael Henrique Palma Lima. "Machine learning techniques in classifying satisfaction with the economy of Latin American Citizens." OBSERVATÓRIO DE LA ECONOMÍA LATINOAMERICANA 22, no. 5 (May 27, 2024): e4912. http://dx.doi.org/10.55905/oelv22n5-199.

Abstract:

Applications of machine learning algorithms offer a new perspective for exploring data from opinion polls, aiming to better understand perceptions and population profiles on various topics, including satisfaction with the economy. In this context, this research aimed to apply machine learning algorithms to classify economic satisfaction from citizens in Latin American countries, evaluating their performance on different datasets and analyzing the variables contributing most to this theme. Four base algorithms were applied: Random Forest; Gradient Boosting; XGBoost; and Naïve Bayes, along with two ensemble methods: hard voting; and soft voting. Subsequently, variable importance was assessed using Gradient Boosting and Random Forest methods, showing the contribution of the predictors to the economic satisfaction. The results showed an accuracy of approximately 86% for three of the base classifiers and soft voting but all methods performed better in classifying dissatisfied citizens, struggling to recognize patterns of those satisfied with the economy. The most important predictor variables considered for classifying this satisfaction were "Satisfaction with democracy" and "Comparison with the economy 12 months before".

42

Tobin, Martin J., and Amal Jubran. "Variable performance of weaning-predictor tests: role of Bayes' theorem and spectrum and test-referral bias." Intensive Care Medicine 32, no. 12 (November 8, 2006): 2002–12. http://dx.doi.org/10.1007/s00134-006-0439-4.

43

Zwick, Rebecca. "The Validity of the GMAT for the Prediction of Grades in Doctoral Study in Business and Management: An Empirical Bayes Approach." Journal of Educational Statistics 18, no. 1 (March 1993): 91–107. http://dx.doi.org/10.3102/10769986018001091.

Abstract:

A validity study was conducted to examine the degree to which GMAT scores and undergraduate grade-point average (UGPA) could predict first-year average (FYA) and final grade-point average in doctoral programs in business. A variety of empirical Bayes regression models, some of which took into account possible differences in regressions across schools and cohorts, were investigated for this purpose. Indexes of model fit showed that the most parsimonious model, which did not allow for school or cohort effects, was just as useful for prediction as the more complex models. The three preadmissions measures were found to be associated with graduate school grades, though to a lesser degree than in MBA programs. The prediction achieved using UGPA alone as a predictor tended to be more accurate than that obtained using GMAT verbal (GMATV) and GMAT quantitative (GMATQ) scores together. Including all three predictors was more effective than using only UGPA. The most likely explanation for the lower levels of prediction than in MBA programs is that doctoral programs tend to be more selective. Within-school means on GMATV, GMATQ, UGPA, and FYA were higher than those found in MBA validity studies; within-school standard deviations on FYA tended to be smaller. Among these very select, academically competent doctoral students, highly accurate prediction of grades may not be possible.

44

Salehnasab, Cirruse, Abbas Hajifathali, Farkhondeh Asadi, Elham Roshandel, Alireza Kazemi, and Arash Roshanpoor. "Machine Learning Classification Algorithms to Predict aGvHD following Allo-HSCT: A Systematic Review." Methods of Information in Medicine 58, no. 06 (December 2019): 205–12. http://dx.doi.org/10.1055/s-0040-1709150.

Abstract:

Background: Acute graft-versus-host disease (aGvHD) is the most important cause of mortality in patients receiving allogeneic hematopoietic stem cell transplantation. Given that it occurs at the stage of severe tissue damage, its diagnosis is late. With the advancement of machine learning (ML), promising real-time models to predict aGvHD have emerged. Objective: This article aims to synthesize the literature on ML classification algorithms for predicting aGvHD, highlighting the algorithms and important predictor variables used. Methods: A systematic review of ML classification algorithms used to predict aGvHD was performed using a search of the PubMed, Embase, Web of Science, Scopus, Springer, and IEEE Xplore databases undertaken up to April 2019, based on the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) statement. Studies focusing on the use of ML classification algorithms to predict aGvHD were considered. Results: After applying the inclusion and exclusion criteria, 14 studies were selected for evaluation. The analysis showed that the algorithms used were Artificial Neural Network (79%), Support Vector Machine (50%), Naive Bayes (43%), k-Nearest Neighbors (29%), Regression (29%), and Decision Trees (14%). Many predictor variables were used in these studies, so we grouped them into more abstract categories: biomarkers, demographics, infections, clinical, genes, transplants, drugs, and other variables. Conclusion: Each of these ML algorithms has particular characteristics and different proposed predictors. These ML algorithms therefore appear to have high potential for predicting aGvHD if the modeling process is performed correctly.

45

Johnston, Rich D. "Influence of Physical Characteristics and Match Outcome on Technical Errors During Rugby League Match Play." International Journal of Sports Physiology and Performance 14, no. 8 (September 1, 2019): 1043–49. http://dx.doi.org/10.1123/ijspp.2018-0354.

Abstract:

Purpose: To explore the relationship between technical errors during rugby league games, match success, and physical characteristics. Methods: A total of 27 semiprofessional rugby league players participated in this study (24.8 [2.5] y, 183.5 [5.3] cm, 97.1 [11.6] kg). Aerobic fitness, strength, and power were assessed prior to the start of the competitive season before technical performance was tracked during 22 competitive fixtures. Attacking errors were determined as any error that occurred in possession of the ball that resulted in a handover to the opposition. Defensive errors included line breaks, penalties, and missed or ineffective tackles. Match outcome, the zone on the field in which each error occurred, and the number of errors in an error chain (≤60 s between errors) were assessed. Results: During a loss, there were more defensive errors in the 0- to 40-m zone than when a match was won (effect size = 0.99 [0.04–1.94]). Error chains were a predictor of conceding a try (P = .0001, r² = .22), with the odds ratio increasing to 2.33 when there were 7 errors per chain. High lower-body strength was associated with fewer defensive errors for backs (Bayes factor = 3.67) and forwards (Bayes factor = 19.31); relative bench press was also important for backs (Bayes factor = 3.21). Conclusions: Fewer defensive errors occur in the 0- to 40-m zone during winning matches; lower-body strength is strongly associated with fewer defensive errors in rugby league players.

46

Endang S Kresnawati, Yulia Resti, Bambang Suprihatin, M. Rendy Kurniawan, and Widya Ayu Amanda. "Coronary Artery Disease Prediction Using Decision Trees and Multinomial Naïve Bayes with k-Fold Cross Validation." INOMATIKA 3, no. 2 (July 31, 2021): 174–89. http://dx.doi.org/10.35438/inomatika.v3i2.266.

Abstract:

Coronary artery disease has been the leading cause of death worldwide for at least two decades (2000–2019) and showed the largest increase in deaths over that period compared with other causes of death. Successfully predicting coronary artery disease early from medical data benefits patients and also the stability of the national economy. The aim of this study is to predict coronary artery disease by implementing two statistical learning methods, Multinomial Naïve Bayes and decision trees, with 10-fold cross-validation, where numerical variables are discretized to obtain categorical variables. The results show that the decision tree method performs better than the Multinomial Naïve Bayes method in predicting coronary artery disease. The decision tree method achieved an accuracy of 99.63%, a sensitivity of 100%, a specificity of 99.33%, a precision of 99.23%, and a negative predictive value (NPV) of 100%. These measures indicate that the decision tree method is suitable for predicting coronary artery disease, including on independent data consisting of other coronary artery disease data with the same predictor variables. The results also show that using a different reference from previous studies for discretizing the numerical variables can improve the method's performance in predicting coronary artery disease.
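
The five performance measures reported in this abstract (accuracy, sensitivity, specificity, precision, and negative predictive value) are all ratios of binary confusion-matrix counts. The sketch below shows the standard definitions, assuming "disease present" is the positive class; the function name and the counts are hypothetical and are not the study's data.

```python
def binary_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Standard binary classification measures from confusion-matrix counts."""

    def ratio(num: int, den: int) -> float:
        # Report NaN rather than raising when a measure is undefined.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "precision": ratio(tp, tp + fp),    # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }


# Hypothetical counts for illustration only (not from the study).
metrics = binary_metrics(tp=40, tn=50, fp=5, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In a k-fold setting, one common choice is to sum the four counts over all folds before computing the ratios; whether the study pooled counts or averaged per-fold scores is not stated in the abstract.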

47

Dou, Lijun, Xiaoling Li, Hui Ding, Lei Xu, and Huaikun Xiang. "iRNA-m5C_NB: A Novel Predictor to Identify RNA 5-Methylcytosine Sites Based on the Naive Bayes Classifier." IEEE Access 8 (2020): 84906–17. http://dx.doi.org/10.1109/access.2020.2991477.

48

Lu, Peng, Jianxin Chen, Huihui Zhao, Yibo Gao, Liangtao Luo, Xiaohan Zuo, Qi Shi, Yiping Yang, Jianqiang Yi, and Wei Wang. "In Silico Syndrome Prediction for Coronary Artery Disease in Traditional Chinese Medicine." Evidence-Based Complementary and Alternative Medicine 2012 (2012): 1–11. http://dx.doi.org/10.1155/2012/142584.

Abstract:

Coronary artery disease (CAD) is the leading cause of death in the world. The differentiation of syndrome (ZHENG) is the criterion for diagnosis and therapy in TCM. Therefore, syndrome prediction in silico can improve the performance of treatment. In this paper, we present a Bayesian network framework to construct a high-confidence syndrome predictor based on the optimum subset, that is, the subset selected by Support Vector Machine (SVM) feature selection. Syndromes of CAD can be divided into asthenia and sthenia syndromes. According to the hierarchical characteristics of syndromes, we first label every case with one of three syndrome types (asthenia, sthenia, or both) to handle patients who present several syndromes. On the basis of these three classes, we design SVM feature selection to obtain the optimum symptom subset and compare this subset with Markov blanket feature selection using ROC. Using this subset, the six predictors of CAD syndromes are constructed by the Bayesian network technique. We also build Naïve Bayes, C4.5, Logistic, and radial basis function (RBF) network models for comparison with the Bayesian network. In conclusion, the Bayesian network method based on the optimum symptoms is a practical method for predicting six syndromes of CAD in TCM.

49

Kartini, Fetty Tri Anggraeny, Aang Kisnu Darmawan, Anik Anekawati, and Ivana Yudhisari. "Early Prediction for Graduation of Private High School Students with Machine Learning Approach." Technium: Romanian Journal of Applied Sciences and Technology 16 (October 29, 2023): 129–36. http://dx.doi.org/10.47577/technium.v16i.9971.

Abstract:

Graduation rates indicate school success. Predicting student graduation helps schools identify students in danger of dropping out and intervene early to enhance academic performance. It can also help policymakers create graduation and dropout prevention initiatives. However, based on a literature search, predicting student graduation rates from admission test scores is difficult. School grades are a better predictor of timely tertiary graduation than acceptance test scores because college success requires cognitive abilities and self-regulation competencies, which are better indexed by school grades. Self-efficacy, school academic culture, and future expectations can also affect student graduation rates. Finally, the selective admissions modality needs to be refined. This study aims to (1) predict private high school graduation with eight algorithms: Random tree, Naïve Bayes Multinomial, Support Vector Machine (SVM), Random forest (RF), K-Nearest Neighbor, Ada Boost, Multilayer perceptron, and Logistic regression, and (2) compare the performance of the eight algorithms. For the first aim, the Random tree, Naïve Bayes Multinomial, Random forest (RF), and Ada Boost algorithms all perform at 99.49%. For the second aim, the Random tree approach outperforms the other algorithms in Accuracy (99.49%), Precision (100%), F-Measure (99.74%), and time consumption (0 seconds). This research contributes in two ways: scientifically, by testing these eight algorithms for predicting private high school graduation, and practically, by recommending that school administrators develop a selective enrollment model.

50

Verma, Deepa, Kirti Kushwah, Muskan Jain, Pooja Jain, and Riya Pal. "Remote Diagnosis Based on Symptoms." International Journal for Research in Applied Science and Engineering Technology 10, no. 5 (May 31, 2022): 3739–42. http://dx.doi.org/10.22214/ijraset.2022.43209.

Abstract:

This machine learning approach to disease prediction is based on predictive modelling that predicts a patient's illness from the symptoms provided by users as input to the system. This paper presents a study of predicting multiple diseases using machine learning algorithms. Here we use supervised machine learning, with the implementation done by applying Decision Tree, Random Forest, Naïve Bayes, and KNN algorithms, which can help in early and accurate prediction of diseases and better patient care. The results indicated that the system would be useful and user-friendly for patients, supporting timely diagnosis of a patient's disease. Medicine and health care are some of the most crucial elements of the economy and human life. There is a tremendous amount of change between the world we live in now and the world of some weeks ago. Everything has turned grim and different. In this situation, where everything has turned virtual, doctors and nurses are making the utmost effort to save people's lives even though they have to risk their own. There are still some remote villages that lack medical facilities. Machines are always considered better than humans because, free of human error, they can perform tasks more efficiently and with a consistent level of accuracy. A disease predictor can be called a virtual doctor, which can predict the disease of any patient without human error. Also, for diseases like COVID-19 and Ebola, a disease predictor is a blessing because it can identify a person's illness without any physical contact. Keywords: Machine Learning, Disease Prediction, Decision Tree, Random Forest, Naïve Bayes.
