Journal articles on the topic 'HYBRID RESAMPLING'

Consult the top 50 journal articles for your research on the topic 'HYBRID RESAMPLING'.

1

Arun, Pattathal V., and Sunil K. Katiyar. "A CNN based Hybrid approach towards automatic image registration." Geodesy and Cartography 62, no. 1 (June 1, 2013): 33–49. http://dx.doi.org/10.2478/geocart-2013-0005.

Abstract:
Image registration is a key component of various image processing operations that involve the analysis of different image data sets. Automatic image registration has seen the application of many intelligent methodologies over the past decade; however, the inability to properly model object shape as well as contextual information has limited the attainable accuracy. In this paper, we propose a framework for accurate feature shape modelling and adaptive resampling using advanced techniques such as vector machines, the Cellular Neural Network (CNN), SIFT, coresets, and cellular automata. CNN was found to be effective in improving both the feature matching and resampling stages of registration, and the complexity of the approach was considerably reduced using coreset optimization. The salient features of this work are CNN-based SIFT feature point optimisation, adaptive resampling, and intelligent object modelling. The developed methodology was compared with contemporary methods using different statistical measures. Investigations over various satellite images revealed that considerable success was achieved with the approach. The system dynamically uses spectral and spatial information to represent contextual knowledge via a CNN-Prolog approach. The methodology also proved effective in providing intelligent interpretation and adaptive resampling.
2

Arun, Pattathal Vijayakumar. "A CNN BASED HYBRID APPROACH TOWARDS AUTOMATIC IMAGE REGISTRATION." Geodesy and Cartography 39, no. 3 (September 26, 2013): 121–28. http://dx.doi.org/10.3846/20296991.2013.840409.

Abstract:
Image registration is a key component of spatial analyses that involve different data sets of the same area. Automatic approaches in this domain have seen the application of several intelligent methodologies over the past decade; however, the accuracy of these approaches has been limited by the inability to properly model shape as well as contextual information. In this paper, we investigate the possibility of an evolutionary-computing-based framework for automatic image registration. The Cellular Neural Network (CNN) was found to be effective in improving both the feature matching and resampling stages of registration, and the complexity of the approach was considerably reduced using coreset optimization. A CNN-Prolog approach was adopted to dynamically use spectral and spatial information to represent contextual knowledge. The salient features of this work are feature point optimisation, adaptive resampling, and intelligent object modelling. Investigations over various satellite images revealed that considerable success was achieved with the procedure. The methodology also proved effective in providing intelligent interpretation and adaptive resampling.
3

Zafar, Taimoor, Tariq Mairaj, Anzar Alam, and Haroon Rasheed. "Hybrid resampling scheme for particle filter-based inversion." IET Science, Measurement & Technology 14, no. 4 (June 1, 2020): 396–406. http://dx.doi.org/10.1049/iet-smt.2018.5531.

4

Jentsch, Carsten, and Jens-Peter Kreiss. "The multiple hybrid bootstrap — Resampling multivariate linear processes." Journal of Multivariate Analysis 101, no. 10 (November 2010): 2320–45. http://dx.doi.org/10.1016/j.jmva.2010.06.005.

5

Lee, Ernesto, Furqan Rustam, Wajdi Aljedaani, Abid Ishaq, Vaibhav Rupapara, and Imran Ashraf. "Predicting Pulsars from Imbalanced Dataset with Hybrid Resampling Approach." Advances in Astronomy 2021 (December 3, 2021): 1–13. http://dx.doi.org/10.1155/2021/4916494.

Abstract:
Pulsar stars, usually neutron stars, are spherical, compact objects containing a large quantity of mass. Each pulsar possesses a magnetic field and emits a slightly different pattern of electromagnetic radiation, which is used to identify potential candidates for a real pulsar star. Pulsars are considered an important cosmic phenomenon, and scientists use them to study nuclear physics, gravitational waves, and collisions between black holes. Automating the detection of pulsar stars can accelerate their study. This study presents an accurate and efficient approach for true pulsar detection using supervised machine learning. For experiments, the high time-resolution universe (HTRU2) dataset is used. To resolve the data imbalance problem and overcome model overfitting, a hybrid resampling approach is presented. Experiments are performed with imbalanced and balanced datasets using well-known machine learning algorithms. Results demonstrate that the proposed hybrid resampling approach is highly effective in avoiding model overfitting and increasing prediction accuracy. With the proposed hybrid resampling approach, the extra trees classifier achieves a 0.993 accuracy score for true pulsar star prediction.
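The hybrid resampling idea described in this abstract, undersampling the majority class while synthetically oversampling the minority class toward a common size, can be illustrated with a short sketch. This is not the paper's implementation; the function name, the meet-in-the-middle target size, and the linear-interpolation oversampling are illustrative assumptions.

```python
import random


def hybrid_resample(X, y, target=None, seed=0):
    """Toy hybrid resampling: shrink large classes by random undersampling
    and grow small classes by interpolating between random same-class pairs."""
    rng = random.Random(seed)
    by_class = {}
    for xi, yi in zip(X, y):
        by_class.setdefault(yi, []).append(xi)
    if target is None:
        # meet in the middle: the average class size becomes the common target
        target = sum(len(s) for s in by_class.values()) // len(by_class)
    Xr, yr = [], []
    for label, samples in by_class.items():
        if len(samples) >= target:
            chosen = rng.sample(samples, target)          # undersample
        else:
            chosen = list(samples)                        # oversample
            while len(chosen) < target:
                a, b = rng.choice(samples), rng.choice(samples)
                t = rng.random()
                chosen.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
        Xr.extend(chosen)
        yr.extend([label] * target)
    return Xr, yr
```

On a 90:10 binary dataset this yields a 50:50 split of 100 samples, after which any of the classifiers mentioned in the abstract could be trained on the balanced data.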
6

Saputro, Dewi Retno Sari, Sulistyaningsih Sulistyaningsih, and Purnami Widyaningsih. "SPATIAL AUTOREGRESSIVE (SAR) MODEL WITH ENSEMBLE LEARNING-MULTIPLICATIVE NOISE WITH LOGNORMAL DISTRIBUTION (CASE ON POVERTY DATA IN EAST JAVA)." MEDIA STATISTIKA 14, no. 1 (June 22, 2021): 89–97. http://dx.doi.org/10.14710/medstat.14.1.89-97.

Abstract:
The Spatial Autoregressive (SAR) model can be used to model spatial data. The accuracy of the estimated parameters of the SAR model can be improved, and the error rate reduced, by resampling. Resampling is done by adding noise to the data using Ensemble Learning (EL) with multiplicative noise. The research objective is to estimate the parameters of the SAR model using EL with multiplicative noise. This research also applied a non-hybrid ensemble spatial regression model with lognormally distributed multiplicative noise to poverty data in East Java in 2016. The results showed that the estimated values of this model were obtained by averaging the parameter estimates of 10 Spatial Error Models (SEM) resulting from resampling. The multiplicative noise is generated from a lognormal distribution with a mean of one and a standard deviation of 0.433. The Root Mean Squared Error (RMSE) produced by the non-hybrid ensemble spatial regression model with lognormally distributed multiplicative noise is 22.99.
7

Abdullahi, Dauda Sani, Dr Muhammad Sirajo Aliyu, and Usman Musa Abdullahi. "Comparative analysis of resampling algorithms in the prediction of stroke diseases." UMYU Scientifica 2, no. 1 (March 30, 2023): 88–94. http://dx.doi.org/10.56919/usci.2123.011.

Abstract:
Stroke is a serious cause of death globally. Early prediction of the disease would save many lives, but most clinical datasets, including the stroke dataset, are imbalanced, making predictive algorithms biased towards the majority class. The objective of this research is to compare different data resampling algorithms on the stroke dataset to improve the prediction performance of machine learning models. This paper considered five resampling algorithms, namely Random Over-Sampling (ROS), the Synthetic Minority Oversampling Technique (SMOTE), Adaptive Synthetic sampling (ADASYN), and the hybrid techniques SMOTE with Edited Nearest Neighbours (SMOTE-ENN) and SMOTE with Tomek links (SMOTE-Tomek), combined with six machine learning classifiers, namely Logistic Regression (LR), Decision Tree (DT), K-Nearest Neighbours (KNN), Support Vector Machines (SVM), Random Forest (RF), and XGBoost (XGB). The hybrid technique SMOTE-ENN benefits the classifiers the most, followed by SMOTE, while the combination of SMOTE and XGB performs best, with an accuracy of 97.99%, a G-mean score of 0.99, and an AUC-ROC score of 0.99. Resampling algorithms balance the dataset and enhance the predictive power of machine learning algorithms. We therefore recommend resampling the stroke dataset when predicting stroke rather than modelling the imbalanced dataset directly.
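SMOTE-ENN, the strongest technique in this comparison, follows SMOTE's oversampling with an Edited Nearest Neighbours (ENN) cleaning pass that removes samples contradicted by their neighbourhood. Below is an illustrative sketch of the ENN step only, not the authors' code; the choice of k = 3 and the Euclidean metric are assumptions.

```python
import math
from collections import Counter


def edited_nearest_neighbours(X, y, k=3):
    """ENN cleaning: drop every sample whose k nearest neighbours
    (by Euclidean distance) mostly carry a different class label."""
    keep_X, keep_y = [], []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # distances to all other samples, paired with their labels
        neighbours = sorted(
            (math.dist(xi, xj), yj)
            for j, (xj, yj) in enumerate(zip(X, y)) if j != i
        )[:k]
        majority, _ = Counter(label for _, label in neighbours).most_common(1)[0]
        if majority == yi:
            keep_X.append(xi)
            keep_y.append(yi)
    return keep_X, keep_y
```

In the SMOTE-ENN combination this pass runs after oversampling, so noisy or borderline points (original or synthetic) near the class boundary are cleaned away.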
8

Jadwal, Pankaj Kumar, Sonal Jain, and Basant Agarwal. "Clustering-based hybrid resampling techniques for social lending data." International Journal of Intelligent Systems Technologies and Applications 20, no. 3 (2021): 183. http://dx.doi.org/10.1504/ijista.2021.10044536.

9

Jadwal, Pankaj Kumar, Sonal Jain, and Basant Agarwal. "Clustering-based hybrid resampling techniques for social lending data." International Journal of Intelligent Systems Technologies and Applications 20, no. 3 (2021): 183. http://dx.doi.org/10.1504/ijista.2021.120495.

10

Karthikeyan, S., and T. Kathirvalavakumar. "A Hybrid Data Resampling Algorithm Combining Leader and SMOTE for Classifying the High Imbalanced Datasets." Indian Journal Of Science And Technology 16, no. 16 (April 27, 2023): 1214–20. http://dx.doi.org/10.17485/ijst/v16i16.146.

11

Datta, Debaleena, Pradeep Kumar Mallick, Jana Shafi, Jaeyoung Choi, and Muhammad Fazal Ijaz. "Computational Intelligence for Observation and Monitoring: A Case Study of Imbalanced Hyperspectral Image Data Classification." Computational Intelligence and Neuroscience 2022 (April 30, 2022): 1–23. http://dx.doi.org/10.1155/2022/8735201.

Abstract:
Imbalance in hyperspectral images creates difficulties for their analysis and classification. Resampling techniques are utilized to minimize the data imbalance; however, previous research has explored only a limited number of resampling methods, and relatively little work has been done in this area. In this study, we present an illustrative study of the performance of existing resampling techniques, viz. oversampling, undersampling, and hybrid sampling, for removing the imbalance from the minor samples of hyperspectral datasets. The balanced dataset is then classified using tree-based ensemble classifiers that include spectral and spatial features. Finally, a comparative study is performed based on statistical analysis of the outcomes of those classifiers, discussed in the results section. In addition, we applied a new hybrid ensemble classifier named random rotation forest to our dataset. Three benchmark hyperspectral datasets, Indian Pines, Salinas Valley, and Pavia University, are used for the experiments. We take precision, recall, F-score, Cohen's kappa, and overall accuracy as assessment metrics to evaluate our model. The results show that SMOTE, Tomek links, and their combinations stand out as the more effective resampling strategies. Moreover, ensemble classifiers such as rotation forest and random rotation ensemble provide higher accuracy than others of their kind.
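Tomek links, one of the strategies this study found effective, are pairs of opposite-class samples that are each other's nearest neighbour; undersampling removes the majority-class member of each pair to clean the class boundary. An illustrative detector follows (not taken from the paper; a brute-force nearest-neighbour search is assumed for clarity).

```python
import math


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, of opposite-class samples
    that are mutual nearest neighbours (Tomek links)."""
    def nearest(i):
        # index of the sample closest to X[i] (excluding i itself)
        return min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )
    links = []
    for i in range(len(X)):
        j = nearest(i)
        if i < j and y[i] != y[j] and nearest(j) == i:
            links.append((i, j))
    return links
```

A SMOTE + Tomek combination, as evaluated in the study, would first oversample the minority class and then delete the majority-class member of every detected link.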
12

Ongko, Erianto, and Hartono Hartono. "Hybrid approach redefinition-multi class with resampling and feature selection for multi-class imbalance with overlapping and noise." Bulletin of Electrical Engineering and Informatics 10, no. 3 (June 1, 2021): 1718–28. http://dx.doi.org/10.11591/eei.v10i3.3057.

Abstract:
Class imbalance and overlapping in multi-class data can reduce classification performance and accuracy. Noise must also be considered because it too degrades classification performance. Using a resampling algorithm and feature selection, this paper proposes a method for improving the performance of hybrid approach redefinition-multi class (HAR-MI). A resampling algorithm can overcome noise but cannot handle overlapping well; feature selection deals well with overlapping but can degrade in the presence of noise. The HAR-MI approach addresses multi-class imbalance issues but has some drawbacks when dealing with overlapping. The contribution of this paper is a new approach for dealing with class imbalance, overlapping, and noise in multi-class data. This is accomplished by employing minimizing overlapping selection (MOSS) as an ensemble learning algorithm and preprocessing technique in HAR-MI, and by employing multi-class combination cleaning and resampling (MC-CCR) as the resampling algorithm at the processing stage. Under overlapping and classifier-performance evaluation, the proposed method produces good results, as evidenced by higher augmented R-value, class average accuracy, class balance accuracy, multi-class G-mean, and confusion entropy.
13

Shivanandappa, Manjunatha, and Malini M. Patil. "Extraction of image resampling using correlation aware convolution neural networks for image tampering detection." International Journal of Electrical and Computer Engineering (IJECE) 12, no. 3 (June 1, 2022): 3033. http://dx.doi.org/10.11591/ijece.v12i3.pp3033-3043.

Abstract:
Detecting hybrid tampering attacks in an image is extremely difficult, especially when copy-clone tampered segments exhibit illumination and contrast levels identical to those of genuine objects. Existing methods fail to detect tampering when the image undergoes hybrid transformations such as scaling, rotation, and compression, and also fail under small, smooth tampering. Existing resampling feature extraction using deep learning techniques fails to obtain a good correlation among neighbouring pixels in both horizontal and vertical directions. This work presents a correlation-aware convolutional neural network (CA-CNN) for extracting resampling features to detect hybrid tampering attacks. The image is resized to detect tampering in small, smooth regions. The CA-CNN is composed of three layers: horizontal, vertical, and correlated. The correlated layer obtains correlated resampling features from the horizontal and vertical sequences; the features are then aggregated and a descriptor is built. An experiment was conducted to evaluate the performance of the CA-CNN model against existing tampering detection methodologies on various datasets. The results show that CA-CNN is effective under various distortions and post-processing attacks such as joint photographic experts group (JPEG) compression and scaling. The model achieves much better accuracy, recall, precision, false positive rate (FPR), and F-measure compared to existing methodologies.
14

Zhang, Xudong, Liang Zhao, Wei Zhong, and Feng Gu. "A novel hybrid resampling algorithm for parallel/distributed particle filters." Journal of Parallel and Distributed Computing 151 (May 2021): 24–37. http://dx.doi.org/10.1016/j.jpdc.2021.02.005.

15

Ustyannie, Windyaning, Emy Setyaningsih, and Catur Iswahyudi. "Optimization of software defects prediction in imbalanced class using a combination of resampling methods with support vector machine and logistic regression." JURNAL INFOTEL 13, no. 4 (December 9, 2021): 176–84. http://dx.doi.org/10.20895/infotel.v13i4.726.

Abstract:
The main difficulties in producing high-accuracy software defect prediction arise when the data set has an imbalanced class and dichotomous characteristics. The imbalanced-class problem can be solved with a data-level approach such as resampling, while prediction on data with dichotomous characteristics can be approached with classification methods. This study analyzed the performance of the proposed software defect prediction method to identify the combination of resampling method and classification method that provides the highest accuracy. The proposed combination first applies a resampling process using oversampling, undersampling, or hybrid methods, and then applies a classification method, namely the Support Vector Machine (SVM) algorithm or the Logistic Regression (LR) algorithm. The proposed model was tested on five NASA MDP data sets, each with 37 attributes. Based on the t-test, the p-value of 0.0344 is below 0.05 and the t-statistic of 3.1524 exceeds the critical value of 2.7765, which indicates that the proposed combination is suitable for classifying imbalanced classes. The performance of the classification algorithms also improved with the resampling process: the average increase in AUC from resampling is 17.19% for the SVM algorithm and 7.26% for the LR algorithm compared to no resampling. Combining the three resampling methods with the SVM and LR algorithms shows that the best combination for software defect prediction on imbalanced classes is the oversampling method with the SVM algorithm, with an average accuracy of 84.02% and an AUC of 91.65%.
16

Hartono, Hartono, and Erianto Ongko. "Avoiding Overfitting dan Overlapping in Handling Class Imbalanced Using Hybrid Approach with Smoothed Bootstrap Resampling and Feature Selection." JOIV : International Journal on Informatics Visualization 6, no. 2 (June 28, 2022): 343. http://dx.doi.org/10.30630/joiv.6.2.985.

Abstract:
Datasets often exhibit class imbalance, indicated by the presence of a class with far more samples (the majority) than the others (the minority). This condition can cause the minority class to be missed even when the overall accuracy is high. In handling class imbalance, both diversity and classifier performance must be considered; hence the hybrid approach, which combines a sampling method with classifier ensembles, yields satisfactory results. The hybrid approach generally uses oversampling, which is prone to overfitting, indicated by high accuracy on the training data but lower accuracy on the testing data. Therefore, in this study, Smoothed Bootstrap Resampling is used as the oversampling method in the hybrid approach, as it can prevent overfitting. Class imbalance is not the only contributor to declining classifier performance; overlapping must also be considered. Overlap can be reduced with feature selection, which minimizes the overlap degree. This research combined feature selection with Hybrid Approach Redefinition, modifying the use of Smoothed Bootstrap Resampling in handling class imbalance in medical datasets. The preprocessing stage of the proposed method uses Smoothed Bootstrap Resampling and feature selection, with Feature Assessment by Sliding Thresholds (FAST) as the feature selection method, while processing uses Random Under-Sampling and SMOTE. Overlap is measured with the augmented R-value, and classifier performance with the balanced error rate, precision, recall, and F-value. The balanced error rate states the combined error of the majority and minority classes under 10-fold cross-validation, allowing each subset to become training data. The results showed that the proposed method performs better than the comparison methods.
17

Snieder, Everett, Karen Abogadil, and Usman T. Khan. "Resampling and ensemble techniques for improving ANN-based high-flow forecast accuracy." Hydrology and Earth System Sciences 25, no. 5 (May 18, 2021): 2543–66. http://dx.doi.org/10.5194/hess-25-2543-2021.

Abstract:
Data-driven flow-forecasting models, such as artificial neural networks (ANNs), are increasingly featured in research for their potential use in operational riverine flood warning systems. However, the distributions of observed flow data are imbalanced, resulting in poor prediction accuracy on high flows in terms of both amplitude and timing error. Resampling and ensemble techniques have been shown to improve model performance on imbalanced datasets, but their efficacy (individually or combined) has not been explicitly evaluated for improving high-flow forecasts. In this research, we systematically evaluate and compare three resampling methods, random undersampling (RUS), random oversampling (ROS), and the synthetic minority oversampling technique for regression (SMOTER), and four ensemble techniques, randomised weights and biases, bagging, adaptive boosting (AdaBoost), and least-squares boosting (LSBoost), on their ability to improve high-stage prediction accuracy using ANNs. These methods are implemented both independently and in combined hybrid techniques, in which the resampling methods are embedded within the ensemble methods; this systematic approach to embedding resampling methods is a novel contribution, and this research presents the first analysis of the effects of combining these methods on high-stage prediction accuracy. Data from two Canadian watersheds (the Bow River in Alberta and the Don River in Ontario), representing distinct hydrological systems, are used as the basis for the comparison. The models are evaluated on overall performance and on typical and high-stage subsets. The results indicate that resampling produces marginal improvements to high-stage prediction accuracy, whereas ensemble methods produce more substantial improvements, with or without resampling. Many of the techniques produced an asymmetric trade-off between typical and high-stage performance: reducing high-stage error resulted in disproportionately larger error at typical stages. The methods proposed in this study highlight the diversity-in-learning concept and support future studies on adapting ensemble algorithms for resampling. This research contains many of the first instances of such methods for flow forecasting and addresses the imbalance problem and heteroscedasticity commonly observed in high-flow and flood-forecasting models.
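SMOTER, one of the resampling methods compared in this paper, extends SMOTE to regression by interpolating the continuous target with the same mixing factor as the features. The sketch below is a toy illustration under assumed parameters (k nearest neighbours within a pre-selected set of rare cases), not the authors' implementation.

```python
import math
import random


def smoter(X_rare, y_rare, n_new, k=3, seed=0):
    """Minimal SMOTER: synthesize n_new (x, y) pairs by interpolating a
    rare case with one of its k nearest rare neighbours; the regression
    target is blended with the same factor t as the features."""
    rng = random.Random(seed)
    X_syn, y_syn = [], []
    for _ in range(n_new):
        i = rng.randrange(len(X_rare))
        # k nearest neighbours of sample i among the rare cases
        neighbours = sorted(
            (j for j in range(len(X_rare)) if j != i),
            key=lambda j: math.dist(X_rare[i], X_rare[j]),
        )[:k]
        j = rng.choice(neighbours)
        t = rng.random()
        X_syn.append([a + t * (b - a) for a, b in zip(X_rare[i], X_rare[j])])
        y_syn.append(y_rare[i] + t * (y_rare[j] - y_rare[i]))
    return X_syn, y_syn
```

In a flow-forecasting setting, the "rare cases" would be the high-stage observations, and the synthetic samples would be appended to the training set before fitting the ANN.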
18

Seetan, Raed, Jacob Bible, Michael Karavias, Wael Seitan, and Sam Thangiah. "Radiation Hybrid Mapping: A Resampling-based Method for Building High-Resolution Maps." Advances in Science, Technology and Engineering Systems Journal 2, no. 3 (August 2017): 1390–400. http://dx.doi.org/10.25046/aj0203175.

19

Cao, Lu, and Hong Shen. "Imbalanced data classification based on hybrid resampling and twin support vector machine." Computer Science and Information Systems 14, no. 3 (2017): 579–95. http://dx.doi.org/10.2298/csis161221017l.

Abstract:
Imbalanced datasets exist widely in real life, and the identification of the minority class tends to be the focus of classification. As an enhanced variant of the support vector machine (SVM), the twin support vector machine (TWSVM) provides an effective technique for data classification. TWSVM, however, relies on a relatively balanced training sample distribution to improve the classification accuracy of the whole dataset and is not effective in dealing with imbalanced data classification problems. In this paper, we propose combining a resampling technique, which uses oversampling and undersampling to balance the training data, with TWSVM to deal with imbalanced data classification. Experimental results show that our proposed approach outperforms other state-of-the-art methods.
20

Malek, Nur Hanisah Abdul, Wan Fairos Wan Yaacob, Yap Bee Wah, Syerina Azlin Md Nasir, Norshahida Shaadan, and Sapto Wahyu Indratno. "Comparison of ensemble hybrid sampling with bagging and boosting machine learning approach for imbalanced data." Indonesian Journal of Electrical Engineering and Computer Science 29, no. 1 (January 1, 2022): 598. http://dx.doi.org/10.11591/ijeecs.v29.i1.pp598-608.

Abstract:
Training on an imbalanced dataset can cause classifiers to overfit the majority class and increases the possibility of information loss for the minority class; moreover, accuracy may not give a clear picture of a classifier's performance. This paper used decision tree (DT), support vector machine (SVM), artificial neural network (ANN), K-nearest neighbours (KNN), and naïve Bayes (NB) models, together with ensemble models such as random forest (RF) and gradient boosting (GB) that use bagging and boosting methods, with three sampling approaches and seven performance metrics, to investigate the effect of class imbalance on water quality data. Based on the results, the best model was gradient boosting without resampling for almost all metrics except balanced accuracy, sensitivity, and area under the curve (AUC), followed by the random forest model without resampling in terms of specificity, precision, and AUC. However, in terms of balanced accuracy and sensitivity, the highest performance was achieved by random forest on a randomly under-sampled dataset. Considering each performance metric separately, the results showed that for specificity and precision it is better not to resample for either ensemble classifier, while balanced accuracy and sensitivity improved for both ensemble classifiers on all the resampled datasets.
21

Wang, Xiaohui, Hao Zhang, Shengzhou Bai, and Yuxian Yue. "Design of agile satellite constellation based on hybrid-resampling particle swarm optimization method." Acta Astronautica 178 (January 2021): 595–605. http://dx.doi.org/10.1016/j.actaastro.2020.09.040.

22

Wang, Qiang. "A Hybrid Sampling SVM Approach to Imbalanced Data Classification." Abstract and Applied Analysis 2014 (2014): 1–7. http://dx.doi.org/10.1155/2014/972786.

Abstract:
Imbalanced datasets are frequently found in many real applications. Resampling is one of the effective solutions because it generates a relatively balanced class distribution. In this paper, a hybrid sampling SVM approach combining an oversampling technique and an undersampling technique is proposed for the imbalanced data classification problem. The approach first uses an undersampling technique to delete majority-class samples carrying less classification information and then applies an oversampling technique to gradually create new positive samples, generating a balanced training dataset to replace the original imbalanced one. Experimental results on real-world datasets show that the proposed approach can identify informative samples and deal with the imbalanced data classification problem.
23

Ersa Budi Sutanto, Ghytsa Alif Jabir, Nadhifan Humam Fitrial, Ni Luh Putu Yayang Septia Ningsih, Siti Andhasah Siti Andhasah, and Rani Nooraeni. "Faktor-Faktor yang Memengaruhi Pernikahan Dini pada Wanita Usia 20-24 di Indonesia Tahun 2017: Penerapan Metode Regresi Logistik Biner dengan Penyesuaian Resampling Data Imbalance." Jurnal Statistika dan Aplikasinya 3, no. 1 (June 28, 2019): 39–49. http://dx.doi.org/10.21009/jsa.03105.

Abstract:
Early marriage is marriage involving children under the age of 18. In general, the prevalence of early marriage in Indonesia is still quite high, as 23 of Indonesia's 34 provinces have early-marriage prevalence rates above the national average. The outcome categories in this case are imbalanced, so an adjustment is needed when analysing the data. A frequently encountered problem is that early-marriage cases are not recorded in official documents. To overcome this, this study uses the statements of women aged 20-24 at the time of the survey who reported having married before the age of 18. The variables used in this study are poverty status, classification of the area of residence, early-marriage status of the head of household, education of the head of household, type of work of the head of household, number of household members, and internet-use status of women aged 20-24. Before analysis, the imbalanced data were adjusted with resampling techniques, comprising oversampling, undersampling, and hybrid methods, and then analysed with binary logistic regression. The factors influencing early marriage among women aged 20-24 were also identified. Based on processing in R, the most appropriate resampling technique for this case is oversampling; with that method, all variables have a significant effect on early marriage among women aged 20-24.
24

Nieto-del-Amor, Félix, Gema Prats-Boluda, Javier Garcia-Casado, Alba Diaz-Martinez, Vicente Jose Diago-Almela, Rogelio Monfort-Ortiz, Dongmei Hao, and Yiyao Ye-Lin. "Combination of Feature Selection and Resampling Methods to Predict Preterm Birth Based on Electrohysterographic Signals from Imbalance Data." Sensors 22, no. 14 (July 7, 2022): 5098. http://dx.doi.org/10.3390/s22145098.

Abstract:
Due to its high sensitivity, electrohysterography (EHG) has emerged as an alternative technique for predicting preterm labor. The main obstacle in designing preterm labor prediction models is the inherent preterm/term imbalance ratio, which can give rise to relatively low performance. Numerous studies obtained promising preterm labor prediction results using the synthetic minority oversampling technique. However, these studies generally overestimate mathematical models’ real generalization capacity by generating synthetic data before splitting the dataset, leaking information between the training and testing partitions and thus reducing the complexity of the classification task. In this work, we analyzed the effect of combining feature selection and resampling methods to overcome the class imbalance problem for predicting preterm labor by EHG. We assessed undersampling, oversampling, and hybrid methods applied to the training and validation dataset during feature selection by genetic algorithm, and analyzed the resampling effect on training data after obtaining the optimized feature subset. The best strategy consisted of undersampling the majority class of the validation dataset to 1:1 during feature selection, without subsequent resampling of the training data, achieving an AUC of 94.5 ± 4.6%, average precision of 84.5 ± 11.7%, maximum F1-score of 79.6 ± 13.8%, and recall of 89.8 ± 12.1%. Our results outperformed the techniques currently used in clinical practice, suggesting the EHG could be used to predict preterm labor in clinics.
APA, Harvard, Vancouver, ISO, and other styles
25

Wongvorachan, Tarid, Surina He, and Okan Bulut. "A Comparison of Undersampling, Oversampling, and SMOTE Methods for Dealing with Imbalanced Classification in Educational Data Mining." Information 14, no. 1 (January 16, 2023): 54. http://dx.doi.org/10.3390/info14010054.

Full text
Abstract:
Educational data mining is capable of producing useful data-driven applications (e.g., early warning systems in schools or the prediction of students’ academic achievement) based on predictive models. However, the class imbalance problem in educational datasets could hamper the accuracy of predictive models as many of these models are designed on the assumption that the predicted class is balanced. Although previous studies proposed several methods to deal with the imbalanced class problem, most of them focused on the technical details of how to improve each technique, while only a few focused on the application aspect, especially for the application of data with different imbalance ratios. In this study, we compared several sampling techniques to handle the different ratios of the class imbalance problem (i.e., moderately or extremely imbalanced classifications) using the High School Longitudinal Study of 2009 dataset. For our comparison, we used random oversampling (ROS), random undersampling (RUS), and the combination of the synthetic minority oversampling technique for nominal and continuous (SMOTE-NC) and RUS as a hybrid resampling technique. We used the Random Forest as our classification algorithm to evaluate the results of each sampling technique. Our results show that random oversampling for moderately imbalanced data and hybrid resampling for extremely imbalanced data seem to work best. The implications for educational data mining applications and suggestions for future research are discussed.
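The hybrid resampling idea compared in this study (synthetic minority oversampling combined with random undersampling) can be sketched as follows. This is a numeric-only simplification of SMOTE-style interpolation plus random undersampling, not the SMOTE-NC + RUS pipeline used in the paper; all names and the toy data are illustrative.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Generate synthetic minority samples by interpolating between a
    random minority sample and one of its k nearest minority neighbours
    (numeric-only simplification of SMOTE)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        a = rng.choice(minority)
        neighbours = sorted((m for m in minority if m is not a),
                            key=lambda m: sum((x - z) ** 2 for x, z in zip(a, m)))[:k]
        b = rng.choice(neighbours)
        t = rng.random()
        synthetic.append([x + t * (z - x) for x, z in zip(a, b)])
    return synthetic

def hybrid_resample(majority, minority, target, seed=0):
    """Hybrid strategy: undersample the majority and oversample the
    minority so that both classes contain `target` samples."""
    rng = random.Random(seed)
    maj = rng.sample(majority, target)
    mino = minority + smote_like_oversample(minority, target - len(minority), seed=seed)
    return maj, mino

majority = [[float(i), 0.0] for i in range(50)]   # 50 majority samples
minority = [[float(i), 1.0] for i in range(5)]    # 5 minority samples
maj, mino = hybrid_resample(majority, minority, target=20)
print(len(maj), len(mino))  # 20 20
```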
APA, Harvard, Vancouver, ISO, and other styles
26

Haberlandt, U., A. D. Ebner von Eschenbach, and I. Buchwald. "A space-time hybrid hourly rainfall model for derived flood frequency analysis." Hydrology and Earth System Sciences Discussions 5, no. 4 (September 1, 2008): 2459–90. http://dx.doi.org/10.5194/hessd-5-2459-2008.

Full text
Abstract:
Abstract. For derived flood frequency analysis based on hydrological modelling long continuous precipitation time series with high temporal resolution are needed. Often, the observation network with recording rainfall gauges is poor, so stochastic precipitation synthesis is a good alternative. Here, a hybrid two step procedure is proposed to provide suitable space-time precipitation fields as input for hydrological modelling. First, a univariate alternating renewal model is presented to simulate independent hourly precipitation time series for several locations. In the second step a multi-site resampling procedure is applied on the synthetic point rainfall event series to reproduce the spatial dependence structure of rainfall. The alternating renewal model describes wet spell durations, dry spell durations and wet spell amounts using univariate frequency distributions separately for two seasons. The dependence between wet spell amount and duration is accounted for by 2-copulas. For disaggregation of the wet spells into hourly intensities a predefined profile is used. In the second step resampling is carried out successively on all synthetic event series using simulated annealing with an objective function considering three bivariate spatial rainfall characteristics. In a case study synthetic precipitation is generated for two mesoscale catchments in the Bode river basin of northern Germany and applied for derived flood frequency analysis using the hydrological model HEC-HMS. The results show good performance in reproducing average and extreme rainfall characteristics as well as in reproducing observed flood frequencies. However, they also show that it is important to consider the same rainfall station network for calibration of the hydrological model with observed data as for application using synthetic rainfall data.
APA, Harvard, Vancouver, ISO, and other styles
27

Restrepo, John, Nelson Correa-Rojas, and Jorge Herrera-Ramirez. "Speckle Noise Reduction in Digital Holography Using a DMD and Multi-Hologram Resampling." Applied Sciences 10, no. 22 (November 22, 2020): 8277. http://dx.doi.org/10.3390/app10228277.

Full text
Abstract:
Speckle noise is a well-documented problem on coherent imaging techniques like Digital Holography. A method to reduce the speckle noise level is presented, based on introducing a Digital Micromirror Device to phase modulate the illumination over the object. Multiple holograms with varying illuminations are recorded and the reconstructed intensities are averaged to obtain a final improved image. A simple numerical resampling scheme is proposed to further improve noise reduction. The obtained results demonstrate the effectiveness of the hybrid approach.
APA, Harvard, Vancouver, ISO, and other styles
28

Sukamto, Hadiyanto, and Kurnianingsih. "A Hybrid Resampling Method with K-Nearest Neighbour (FHR-KNN) for Imbalanced Preeclampsia Dataset." Ingénierie des systèmes d information 28, no. 2 (April 30, 2023): 483–90. http://dx.doi.org/10.18280/isi.280225.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Haberlandt, U., A. D. Ebner von Eschenbach, and I. Buchwald. "A space-time hybrid hourly rainfall model for derived flood frequency analysis." Hydrology and Earth System Sciences 12, no. 6 (December 15, 2008): 1353–67. http://dx.doi.org/10.5194/hess-12-1353-2008.

Full text
Abstract:
Abstract. For derived flood frequency analysis based on hydrological modelling long continuous precipitation time series with high temporal resolution are needed. Often, the observation network with recording rainfall gauges is poor, especially regarding the limited length of the available rainfall time series. Stochastic precipitation synthesis is a good alternative either to extend or to regionalise rainfall series to provide adequate input for long-term rainfall-runoff modelling with subsequent estimation of design floods. Here, a new two step procedure for stochastic synthesis of continuous hourly space-time rainfall is proposed and tested for the extension of short observed precipitation time series. First, a single-site alternating renewal model is presented to simulate independent hourly precipitation time series for several locations. The alternating renewal model describes wet spell durations, dry spell durations and wet spell intensities using univariate frequency distributions separately for two seasons. The dependence between wet spell intensity and duration is accounted for by 2-copulas. For disaggregation of the wet spells into hourly intensities a predefined profile is used. In the second step a multi-site resampling procedure is applied on the synthetic point rainfall event series to reproduce the spatial dependence structure of rainfall. Resampling is carried out successively on all synthetic event series using simulated annealing with an objective function considering three bivariate spatial rainfall characteristics. In a case study synthetic precipitation is generated for some locations with short observation records in two mesoscale catchments of the Bode river basin located in northern Germany. The synthetic rainfall data are then applied for derived flood frequency analysis using the hydrological model HEC-HMS. The results show good performance in reproducing average and extreme rainfall characteristics as well as in reproducing observed flood frequencies. The presented model has the potential to be used for ungauged locations through regionalisation of the model parameters.
APA, Harvard, Vancouver, ISO, and other styles
30

Zemmal, Nawel, Nacer Eddine Benzebouchi, Nabiha Azizi, Didier Schwab, and Samir Brahim Belhaouari. "Unbalanced Learning for Diabetes Diagnosis Based on Enhanced Resampling and Stacking Classifier." International Journal of Intelligent Information Technologies 18, no. 1 (January 1, 2022): 1–29. http://dx.doi.org/10.4018/ijiit.309583.

Full text
Abstract:
Diabetes is characterized by an abnormally enhanced concentration of glucose in the blood serum. It has a damaging impact on several noble body systems. Today, the concept of unbalanced learning has developed considerably in the domain of medical diagnosis, which greatly reduces the generation of erroneous classification results. The paper takes a hybrid approach to imbalanced learning by proposing an enhanced multimodal meta-learning method called IRESAMPLE+St to distinguish between normal and diabetic patients. This approach relies on the Stacking paradigm by utilizing the complementarity that may exist between classifiers. In the same focus, a modified RESAMPLE-based technique referred to as IRESAMPLE+ and the SMOTE method are integrated as a preliminary resampling step to resolve the problem of unbalanced data. The suggested IRESAMPLE+St provides a computerized diabetes diagnostic system with impressive results compared with the principal related studies, reflecting the design and engineering successes achieved.
APA, Harvard, Vancouver, ISO, and other styles
31

Cave-Ayland, Christopher, Chris-Kriton Skylaris, and Jonathan W. Essex. "A Monte Carlo Resampling Approach for the Calculation of Hybrid Classical and Quantum Free Energies." Journal of Chemical Theory and Computation 13, no. 2 (January 31, 2017): 415–24. http://dx.doi.org/10.1021/acs.jctc.6b00506.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Freitas, J. F. G. de, M. Niranjan, A. H. Gee, and A. Doucet. "Sequential Monte Carlo Methods to Train Neural Network Models." Neural Computation 12, no. 4 (April 1, 2000): 955–93. http://dx.doi.org/10.1162/089976600300015664.

Full text
Abstract:
We discuss a novel strategy for training neural networks using sequential Monte Carlo algorithms and propose a new hybrid gradient descent/sampling importance resampling algorithm (HySIR). In terms of computational time and accuracy, the hybrid SIR is a clear improvement over conventional sequential Monte Carlo techniques. The new algorithm may be viewed as a global optimization strategy that allows us to learn the probability distributions of the network weights and outputs in a sequential framework. It is well suited to applications involving on-line, nonlinear, and nongaussian signal processing. We show how the new algorithm outperforms extended Kalman filter training on several problems. In particular, we address the problem of pricing option contracts, traded in financial markets. In this context, we are able to estimate the one-step-ahead probability density functions of the options prices.
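The sampling-importance-resampling step underlying algorithms such as HySIR can be illustrated generically: particles are redrawn with probability proportional to their importance weights. The following is a minimal multinomial-resampling sketch, not the HySIR implementation.

```python
import bisect
import itertools
import random

def resample(particles, weights, seed=0):
    """Multinomial importance resampling: redraw N particles with
    probability proportional to their importance weights."""
    rng = random.Random(seed)
    cum = list(itertools.accumulate(weights))
    total = cum[-1]
    return [particles[bisect.bisect(cum, rng.random() * total)]
            for _ in particles]

particles = [0.0, 1.0, 2.0, 3.0]
weights = [0.0, 0.0, 1.0, 0.0]        # all importance mass on particle 2
print(resample(particles, weights))   # [2.0, 2.0, 2.0, 2.0]
```

In a full particle filter this step follows the importance-weighting stage and precedes the next prediction; HySIR additionally moves particles with gradient-descent updates.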
APA, Harvard, Vancouver, ISO, and other styles
33

Zheng, Wei, and Jinlei Shen. "Adjustable hybrid resampling approach to computationally efficient probabilistic inference of structural damage based on vibration measurements." Journal of Civil Structural Health Monitoring 6, no. 1 (December 22, 2015): 153–73. http://dx.doi.org/10.1007/s13349-015-0149-0.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Han, Guanghui, Xiabi Liu, Heye Zhang, Guangyuan Zheng, Nouman Qadeer Soomro, Murong Wang, and Weihua Liu. "Hybrid resampling and multi-feature fusion for automatic recognition of cavity imaging sign in lung CT." Future Generation Computer Systems 99 (October 2019): 558–70. http://dx.doi.org/10.1016/j.future.2019.05.009.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Harmon, Mark E., Olga N. Krankina, and Jay Sexton. "Decomposition vectors: a new approach to estimating woody detritus decomposition dynamics." Canadian Journal of Forest Research 30, no. 1 (February 1, 2000): 76–84. http://dx.doi.org/10.1139/x99-187.

Full text
Abstract:
A chronosequence of three species of logs (Pinus sylvestris L., Picea abies (L.) Karst, and Betula pendula Roth.) from northwestern Russia was resampled to develop a new method to estimate rates of biomass, volume, and density loss. We call this resampling of a chronosequence the decomposition-vector method, and it represents a hybrid between the chronosequence and time-series approaches. The decomposition-vector method with a 3-year resampling interval gave decomposition rates statistically similar to those of the one-time chronosequence method. This indicated that, for most cases, a negative exponential pattern of biomass, volume, and density loss occurred. In the case of biomass loss of P. sylvestris, however, polynomial regression indicated decomposition rates were initially low, then increased, and then decreased as biomass was lost. This strongly suggests three distinct phases: the first when decomposers colonized the woody detritus, a second period of rapid exponential mass loss, and a third period of slow decomposition. The consequences for this complex pattern of decomposition were explored at the ecosystem level using a simple model. We found that a single rate constant can be used if inputs vary within a factor of 10, but that this approach is problematical if inputs are more variable.
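The single-rate negative exponential model referred to above reduces to x(t2) = x(t1) * exp(-k * dt), so a decomposition-vector estimate of k follows from two density measurements on the same log. A small sketch with made-up numbers:

```python
import math

def decay_rate(x1, x2, dt):
    """Negative exponential rate constant k from two density
    measurements dt years apart: x2 = x1 * exp(-k * dt)."""
    return -math.log(x2 / x1) / dt

# Hypothetical log densities (g/cm^3) at the start and end of a
# 3-year resampling interval
k = decay_rate(0.40, 0.37, 3)
print(round(k, 3))  # 0.026 (per year)
```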
APA, Harvard, Vancouver, ISO, and other styles
36

Filho, Alberto Cargnelutti, Cleiton Antonio Wartha, Jéssica Andiara Kleinpaul, Ismael Mario Marcio Neu, and Daniela Lixinski Silveira. "Sample Size to Estimate the Mean and Median of Traits in Canola." Journal of Agricultural Science 10, no. 11 (October 15, 2018): 123. http://dx.doi.org/10.5539/jas.v10n11p123.

Full text
Abstract:
The aim of this study was to determine the sample size (i.e., number of plants) required to estimate the mean and median of canola (Brassica napus L.) traits of the Hyola 61, Hyola 76, and Hyola 433 hybrids with precision levels. At 124 days after sowing, 225 plants of each hybrid were randomly collected. In each plant, morphological (plant height) and productive traits (number of siliques, fresh matter of siliques, fresh matter of aerial part without siliques, fresh matter of aerial part, dry matter of siliques, dry matter of aerial part without siliques, and dry matter of aerial part) were measured. For each trait, measures of central tendency, variability, skewness, and kurtosis were calculated. Sample size was determined by resampling with replacement, using 10,000 resamples. The sample size required for the estimation of measures of central tendency (mean and median) varies between traits and hybrids. Productive traits required larger sample sizes in relation to the morphological traits. Larger sample sizes are required for the hybrids Hyola 433, Hyola 61, and Hyola 76, in this sequence. In order to estimate the mean of canola traits of the Hyola 61, Hyola 76, and Hyola 433 hybrids with the amplitude of the confidence interval of 95% equal to 30% of the estimated mean, 208 plants are required, whereas 661 plants are necessary to estimate the median with the same precision.
APA, Harvard, Vancouver, ISO, and other styles
37

Cordoba, Diego A. L., Carla M. C. C. Koike, and Flavio de Barros Vidal. "Particle Filter and Visual Tracking: a Hybrid Resampling approach to improving robustness in cluttered and occluded environments." Learning and Nonlinear Models 14, no. 2 (2016): 4–15. http://dx.doi.org/10.21528/lnlm-vol14-no2-art1.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Alatawi, Mohammed Naif, Najah Alsubaie, Habib Ullah Khan, Tariq Sadad, Hathal Salamah Alwageed, Shaukat Ali, and Islam Zada. "Cyber Security against Intrusion Detection Using Ensemble-Based Approaches." Security and Communication Networks 2023 (February 18, 2023): 1–7. http://dx.doi.org/10.1155/2023/8048311.

Full text
Abstract:
Cyberattacks are increasing rapidly as hackers apply ever more advanced techniques, and the demand for cyber security grows day by day as cybercriminals operate throughout the digital world. Designing privacy and security measures for IoT-based systems is therefore necessary for a secure network. Although various machine learning techniques have been applied to achieve the goal of cyber security, much work is still needed on intrusion detection. Recently, the concept of hybrid learning has drawn the attention of information security specialists seeking further improvement against cyber threats. In the proposed framework, a hybrid swarm intelligence and evolutionary method for feature selection, namely, PSO-GA (PSO-based GA), is applied to the CICIDS-2017 dataset before training the model. The model is evaluated using ELM-BA, which is based on bootstrap resampling to increase the reliability of the ELM. This work achieved the highest accuracy of 100% on PortScan, SQL injection, and brute-force attacks, which shows that the proposed model can be employed effectively in cybersecurity applications.
APA, Harvard, Vancouver, ISO, and other styles
39

Razon, Alan Marquez, Yizhou Chen, Han Yushan, Steven Gagniere, Michael Tupek, and Joseph Teran. "A Linear and Angular Momentum Conserving Hybrid Particle/Grid Iteration for Volumetric Elastic Contact." Proceedings of the ACM on Computer Graphics and Interactive Techniques 6, no. 3 (August 16, 2023): 1–25. http://dx.doi.org/10.1145/3606924.

Full text
Abstract:
We present a momentum conserving hybrid particle/grid iteration for resolving volumetric elastic collision. Our hybrid method uses implicit time stepping with a Lagrangian finite element discretization of the volumetric elastic material together with impulse-based collision-correcting momentum updates designed to exactly conserve linear and angular momentum. We use a two-step process for collisions: first we use a novel grid-based approach that leverages the favorable collision resolution properties of Particle-In-Cell (PIC) techniques, then we finalize with a classical collision impulse strategy utilizing continuous collision detection. Our PIC approach uses Affine-Particle-In-Cell momentum transfers as collision preventing impulses together with novel perfectly momentum conserving boundary resampling and downsampling operators that prevent artifacts in portions of the boundary where the grid resolution is of disparate resolution. We combine this with a momentum conserving augury iteration to remove numerical cohesion and model sliding friction. Our collision strategy has the same continuous collision detection as traditional approaches, however our hybrid particle/grid iteration drastically reduces the number of iterations required. Lastly, we develop a novel symmetric positive semi-definite Rayleigh damping model that increases the convexity of the nonlinear systems associated with implicit time stepping. We demonstrate the robustness and efficiency of our approach in a number of collision intensive examples.
APA, Harvard, Vancouver, ISO, and other styles
40

Fortin, Mathieu, Rubén Manso, and Robert Schneider. "Parametric bootstrap estimators for hybrid inference in forest inventories." Forestry: An International Journal of Forest Research 91, no. 3 (November 22, 2017): 354–65. http://dx.doi.org/10.1093/forestry/cpx048.

Full text
Abstract:
Abstract In forestry, the variable of interest is not always directly available from forest inventories. Consequently, practitioners have to rely on models to obtain predictions of this variable of interest. This context leads to hybrid inference, which is based on both the probability design and the model. Unfortunately, the current analytical hybrid estimators for the variance of the point estimator are mainly based on linear or nonlinear models and their use is limited when the model reaches a high level of complexity. An alternative consists of using a variance estimator based on resampling methods (Rubin, D. B. (1987). Multiple imputation for nonresponse surveys. John Wiley & Sons, Hoboken, New Jersey, USA). However, it turns out that a parametric bootstrap (BS) estimator of the variance can be biased in contexts of hybrid inference. In this study, we designed and tested a corrected BS estimator for the variance of the point estimator, which can easily be implemented as long as all of the stochastic components of the model can be properly simulated. Like previous estimators, this corrected variance estimator also makes it possible to distinguish the contribution of the sampling and the model to the variance of the point estimator. The results of three simulation studies of increasing complexity showed no evidence of bias for this corrected variance estimator, which clearly outperformed the BS variance estimator used in previous studies. Since the implementation of this corrected variance estimator is not much more complicated, we recommend its use in contexts of hybrid inference based on complex models.
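The general parametric bootstrap variance estimator that this paper corrects can be sketched generically: re-simulate the model's stochastic components many times and take the variance of the resulting point estimates. The toy model below (a Gaussian model error and a sample-mean point estimator) is our assumption for illustration, not the authors' corrected estimator.

```python
import random
import statistics

def bootstrap_variance(point_estimator, simulate, B=200, seed=0):
    """Parametric bootstrap: re-simulate the model's stochastic
    components B times and return the variance of the point estimates."""
    rng = random.Random(seed)
    estimates = [point_estimator(simulate(rng)) for _ in range(B)]
    return statistics.variance(estimates)

# Toy model: each of n = 25 plot predictions is 10 plus a Gaussian
# model error with sd = 2; the point estimator is the sample mean.
def simulate(rng, n=25):
    return [10.0 + rng.gauss(0.0, 2.0) for _ in range(n)]

var_hat = bootstrap_variance(statistics.fmean, simulate)
print(var_hat)  # close to 2**2 / 25 = 0.16
```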
APA, Harvard, Vancouver, ISO, and other styles
41

Chen, Liping, Jiabao Jiang, and Yong Zhang. "HSDP: A Hybrid Sampling Method for Imbalanced Big Data Based on Data Partition." Complexity 2021 (June 21, 2021): 1–9. http://dx.doi.org/10.1155/2021/6877284.

Full text
Abstract:
The classical classifiers are ineffective in dealing with the problem of imbalanced big dataset classification. Resampling the datasets and balancing samples distribution before training the classifier is one of the most popular approaches to resolve this problem. An effective and simple hybrid sampling method based on data partition (HSDP) is proposed in this paper. First, all the data samples are partitioned into different data regions. Then, the data samples in the noise minority samples region are removed and the samples in the boundary minority samples region are selected as oversampling seeds to generate the synthetic samples. Finally, a weighted oversampling process is conducted considering the generation of synthetic samples in the same cluster of the oversampling seed. The weight of each selected minority class sample is computed by the ratio between the proportion of majority class in the neighbors of this selected sample and the sum of all these proportions. Generation of synthetic samples in the same cluster of the oversampling seed guarantees new synthetic samples located inside the minority class area. Experiments conducted on eight datasets show that the proposed method, HSDP, is better than or comparable with the typical sampling methods for F-measure and G-mean.
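The weighting step described in this abstract (each selected minority sample weighted by the proportion of majority-class points among its neighbours, normalised over all selected samples) can be sketched as follows. This is a simplified illustration; the data-partition and cluster-constrained generation steps of HSDP are omitted.

```python
def seed_weights(minority, majority, k=3):
    """For each minority sample, compute the proportion of majority-class
    points among its k nearest neighbours, then normalise the proportions
    so the weights sum to 1."""
    labelled = [(p, 0) for p in majority] + [(p, 1) for p in minority]
    props = []
    for s in minority:
        neighbours = sorted(
            (q for q in labelled if q[0] is not s),
            key=lambda q: sum((a - b) ** 2 for a, b in zip(s, q[0])))[:k]
        props.append(sum(1 for _, y in neighbours if y == 0) / k)
    total = sum(props)
    return [p / total for p in props]

minority = [[0.0], [0.2], [0.4], [5.0]]   # [5.0] lies near the class boundary
majority = [[4.8], [5.2], [6.0]]
w = seed_weights(minority, majority)
print([round(x, 3) for x in w])  # [0.167, 0.167, 0.167, 0.5]
```

Boundary minority samples, whose neighbourhoods are dominated by the majority class, receive the largest weights and therefore spawn the most synthetic samples.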
APA, Harvard, Vancouver, ISO, and other styles
42

Liu, Jian, Jin Yuan, Jiyuan Cui, Yunru Liu, and Xuemei Liu. "Contour Resampling-Based Garlic Clove Bud Orientation Recognition for High-Speed Precision Seeding." Agriculture 12, no. 9 (August 29, 2022): 1334. http://dx.doi.org/10.3390/agriculture12091334.

Full text
Abstract:
Achieving fast and accurate recognition of garlic clove bud orientation is necessary for high-speed garlic seed righting operation and precision sowing. However, disturbances from actual field sowing conditions, such as garlic skin, vibration, and rapid movement of garlic seeds, can affect the accuracy of recognition. Meanwhile, garlic precision planters need to realize a recognition algorithm with low-delay calculation under the condition of limited computing power, which is a challenge for embedded computing platforms. Existing solutions suffer from low recognition rate and high algorithm complexity. Therefore, a high-speed method for recognizing garlic clove bud direction based on deep learning is proposed, which uses an auxiliary device to obtain the garlic clove contours as the basis for bud orientation classification. First, hybrid garlic breeds with the largest variation in shape were selected randomly and used as research materials, and a binary image dataset of garlic seed contours was created through image sampling and various data enhancement methods to ensure the generalization of the model that had been trained on the data. Second, three lightweight deep-learning classifiers, transfer learning based on MobileNetV3, a naive convolutional neural network model, and a contour resampling-based fully connected network, were utilized to realize accurate and high-speed orientation recognition of garlic clove buds. Third, after the optimization of the model’s structure and hyper-parameters, recognition models suitable for different levels of embedded hardware performance were trained and tested on the low-cost embedded platform. The experimental results showed that the MobileNetV3 model based on transfer learning, the naive convolutional neural network model, and the fully connected model achieved accuracy of 98.71, 98.21, and 98.16%, respectively. The recognition speed of the three including auxiliary programs was 19.35, 97.39, and 151.40 FPS, respectively. Theoretically, the processing speed of 151 seeds per second achieves a 1.3 hm2/h planting speed with single-row operation, which outperforms state-of-the-art methods in garlic-clove-bud-orientation recognition and could meet the needs of high-speed precise seeding.
APA, Harvard, Vancouver, ISO, and other styles
43

Leow, Yen-Siang, Kok-Why Ng, Yih-Jian Yoong, and Seng-Beng Ng. "Sickle cell segmentation and classification for thalassemia aid diagnosis." F1000Research 10 (November 23, 2021): 1185. http://dx.doi.org/10.12688/f1000research.73314.1.

Full text
Abstract:
Background: Thalassemia is a hereditary blood disease in which abnormal red blood cells (RBCs) carry insufficient oxygen throughout the body. Conventional methods of thalassemia detection through a complete blood count (CBC) test and peripheral blood smear image still possess a lot of weaknesses. Methods: This paper proposes a hybrid segmentation method that incorporates adaptive thresholding and the Canny edge method to segment the RBCs. Morphological operations are performed to clean the leftovers. Shape and texture features are extracted using the segmented masks and the gray level co-occurrence matrix. Data imbalance treatment is used for solving the imbalanced cell-type class distribution. In the data resampling layer, the synthetic minority oversampling technique (SMOTE), adaptive synthetic sampling (ADASYN), and random oversampling (ROS) are performed and evaluated using the decision tree and logistic regression. In the classification layer, the decision tree, random forest classifier, and support vector machine (SVM) are assessed and compared for the best performance in classification. Results: The proposed method outperforms the other methods in the image segmentation layer with a structural similarity index measure (SSIM) of 89.88%. In the data resampling layer, ADASYN is employed as it is more accurate than SMOTE and ROS. The random forest classifier is chosen at the classification layer as it is more accurate than the decision tree and SVM. Conclusions: The proposed method is tested on the latest dataset of erythrocyteIDB3 and it solves the issue of imbalanced data due to the insufficient cell classes.
APA, Harvard, Vancouver, ISO, and other styles
44

Zhang, Ming, Kai Wang, and Yan-ting Zhou. "Online State of Charge Estimation of Lithium-Ion Cells Using Particle Filter-Based Hybrid Filtering Approach." Complexity 2020 (January 10, 2020): 1–10. http://dx.doi.org/10.1155/2020/8231243.

Full text
Abstract:
Filtering based state of charge (SOC) estimation with an equivalent circuit model is commonly extended to Lithium-ion (Li-ion) batteries for electric vehicle (EV) or similar energy storage applications. During the last several decades, different implementations of online parameter identification such as Kalman filters have been presented in literature. However, if the system is a moving EV during rapid acceleration or regenerative braking or when using heating or air conditioning, most of the existing works suffer from poor prediction of state and state estimation error covariance, leading to the problem of accuracy degeneracy of the algorithm. On this account, this paper presents a particle filter-based hybrid filtering method particularly for SOC estimation of Li-ion cells in EVs. A sampling importance resampling particle filter is used in combination with a standard Kalman filter and an unscented Kalman filter as a proposal distribution for the particle filter to be made much faster and more accurate. Test results show that the error on the state estimate is less than 0.8% despite additive current measurement noise with 0.05 A deviation.
APA, Harvard, Vancouver, ISO, and other styles
45

Hua, Jianping, Michael L. Bittner, and Edward R. Dougherty. "Evaluating Gene Set Enrichment Analysis via a Hybrid Data Model." Cancer Informatics 13s1 (January 2014): CIN.S13305. http://dx.doi.org/10.4137/cin.s13305.

Full text
Abstract:
Gene set enrichment analysis (GSA) methods have been widely adopted by biological labs to analyze data and generate hypotheses for validation. Most of the existing comparison studies focus on whether the existing GSA methods can produce accurate P-values; however, practitioners are often more concerned with the correct gene-set ranking generated by the methods. The ranking performance is closely related to two critical goals associated with GSA methods: the ability to reveal biological themes and ensuring reproducibility, especially for small-sample studies. We have conducted a comprehensive simulation study focusing on the ranking performance of seven representative GSA methods. We overcome the limitation on the availability of real data sets by creating hybrid data models from existing large data sets. To build the data model, we pick a master gene from the data set to form the ground truth and artificially generate the phenotype labels. Multiple hybrid data models can be constructed from one data set and multiple data sets of smaller sizes can be generated by resampling the original data set. This approach enables us to generate a large batch of data sets to check the ranking performance of GSA methods. Our simulation study reveals that for the proposed data model, the Q2 type GSA methods have in general better performance than other GSA methods and the global test has the most robust results. The properties of a data set play a critical role in the performance. For the data sets with highly connected genes, all GSA methods suffer significantly in performance.
APA, Harvard, Vancouver, ISO, and other styles
46

Bashir, Adnan, Muhammad Ahmed Shehzad, Aamna Khan, Ayesha Niaz, Muhammad Nabeel Asghar, Ramy Aldallal, and Mutua Kilai. "Use of Wavelet and Bootstrap Methods in Streamflow Prediction." Journal of Mathematics 2023 (February 17, 2023): 1–13. http://dx.doi.org/10.1155/2023/4222934.

Full text
Abstract:
Streamflow prediction is vital for controlling and mitigating the effects of floods. Physical prediction models often provide satisfactory results, but they require massive computational work and hydrogeomorphological variables to develop a prediction system. At the same time, data-driven prediction models are quick to apply, easy to handle, and reliable. This study investigates a new hybrid model, the wavelet bootstrap quadratic response surface, for accurate streamflow prediction. Wavelet analysis is a well-known time-frequency joint analysis technique applied in various fields like biological signals, vibration signals, and hydrological signals; here it is used to denoise the time series data. Bootstrap is a nonparametric method for assessing uncertainty that uses an intensive resampling methodology with replacement. The authors analyzed the results of the studied models with different statistical metrics, and it has been observed that the wavelet bootstrap quadratic response surface model provides the most efficient results.
APA, Harvard, Vancouver, ISO, and other styles
47

Lu, Ya-Ting, Horng-Jiun Chao, Yi-Chun Chiang, and Hsiang-Yin Chen. "Explainable Machine Learning Techniques To Predict Amiodarone-Induced Thyroid Dysfunction Risk: Multicenter, Retrospective Study With External Validation." Journal of Medical Internet Research 25 (February 7, 2023): e43734. http://dx.doi.org/10.2196/43734.

Full text
Abstract:
Background Machine learning offers new solutions for predicting life-threatening, unpredictable amiodarone-induced thyroid dysfunction. Traditional regression approaches for adverse-effect prediction without time-series consideration of features have yielded suboptimal predictions. Machine learning algorithms with multiple data sets at different time points may generate better performance in predicting adverse effects. Objective We aimed to develop and validate machine learning models for forecasting individualized amiodarone-induced thyroid dysfunction risk and to optimize a machine learning–based risk stratification scheme with a resampling method and readjustment of the clinically derived decision thresholds. Methods This study developed machine learning models using multicenter, delinked electronic health records. It included patients receiving amiodarone from January 2013 to December 2017. The training set was composed of data from Taipei Medical University Hospital and Wan Fang Hospital, while data from Taipei Medical University Shuang Ho Hospital were used as the external test set. The study collected stationary features at baseline and dynamic features at the first, second, third, sixth, ninth, 12th, 15th, 18th, and 21st months after amiodarone initiation. We used 16 machine learning models, including extreme gradient boosting, adaptive boosting, k-nearest neighbor, and logistic regression models, along with the original data set and 3 resampling methods: oversampling with the borderline synthetic minority oversampling technique, undersampling with the edited nearest neighbor rule, and a hybrid over- and undersampling method. Model performance was compared based on accuracy, precision, recall, F1-score, geometric mean (G-mean), area under the receiver operating characteristic curve (AUROC), and area under the precision-recall curve (AUPRC). Feature importance was determined by the best model.
The decision threshold was readjusted to identify the best cutoff value, and a Kaplan-Meier survival analysis was performed. Results The training set contained 4075 patients from Taipei Medical University Hospital and Wan Fang Hospital, of whom 583 (14.3%) developed amiodarone-induced thyroid dysfunction, while the external test set included 2422 patients from Taipei Medical University Shuang Ho Hospital, of whom 275 (11.4%) developed amiodarone-induced thyroid dysfunction. The extreme gradient boosting oversampling machine learning model demonstrated the best predictive outcomes among all 16 models. Its accuracy, precision, recall, F1-score, G-mean, AUPRC, and AUROC were 0.923, 0.632, 0.756, 0.688, 0.845, 0.751, and 0.934, respectively. After readjusting the cutoff, the best value was 0.627, and the F1-score reached 0.699. The best threshold classified 286 of 2422 patients (11.8%) in the test set as high-risk subjects, among whom 275 were true positives. A shorter treatment duration; higher levels of thyroid-stimulating hormone and high-density lipoprotein cholesterol; and lower levels of free thyroxin, alkaline phosphatase, and low-density lipoprotein were the most important features. Conclusions Machine learning models combined with resampling methods can predict amiodarone-induced thyroid dysfunction and serve as a support tool for individualized risk prediction and clinical decision support.
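The threshold-readjustment step described in this abstract, choosing the cutoff on predicted probabilities that maximizes F1, can be sketched as below. The function name and the grid of candidate cutoffs are assumptions; the study's exact search procedure is not given in the abstract:

```python
def best_f1_threshold(scores, labels):
    """Sweep candidate cutoffs over the predicted probabilities and return
    (best F1, cutoff). Labels are 1 for positive, 0 for negative."""
    best = (0.0, 0.5)
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best[0]:
            best = (f1, t)
    return best
```

On imbalanced clinical data, moving the cutoff away from the default 0.5 in this way trades precision against recall, which is why the study reports a readjusted best value (0.627) rather than the default.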
APA, Harvard, Vancouver, ISO, and other styles
48

Harwood, Stuart M., Dimitar Trenev, Spencer T. Stober, Panagiotis Barkoutsos, Tanvi P. Gujarati, Sarah Mostame, and Donny Greenberg. "Improving the Variational Quantum Eigensolver Using Variational Adiabatic Quantum Computing." ACM Transactions on Quantum Computing 3, no. 1 (March 31, 2022): 1–20. http://dx.doi.org/10.1145/3479197.

Full text
Abstract:
The variational quantum eigensolver (VQE) is a hybrid quantum-classical algorithm for finding the minimum eigenvalue of a Hamiltonian that involves the optimization of a parameterized quantum circuit. Since the resulting optimization problem is in general nonconvex, the method can converge to suboptimal parameter values that do not yield the minimum eigenvalue. In this work, we address this shortcoming by adopting the concept of variational adiabatic quantum computing (VAQC) as a procedure to improve VQE. In VAQC, the ground state of a continuously parameterized Hamiltonian is approximated via a parameterized quantum circuit. We discuss some basic theory of VAQC to motivate the development of a hybrid quantum-classical homotopy continuation method. The proposed method has parallels with a predictor-corrector method for numerical integration of differential equations. While there are theoretical limitations to the procedure, we see in practice that VAQC can successfully find good initial circuit parameters to initialize VQE. We demonstrate this with two examples from quantum chemistry. Through these examples, we provide empirical evidence that VAQC, combined with other techniques (an adaptive termination criterion for the classical optimizer and a variance-based resampling method for the expectation evaluation), can provide more accurate solutions than “plain” VQE, for the same amount of effort.
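One common form of variance-based resampling for expectation evaluation is to split the measurement budget across Hamiltonian terms in proportion to each term's weighted standard deviation, which minimizes the variance of the weighted-sum estimator for a fixed total shot count. The sketch below illustrates that standard allocation rule; it is an assumption that this is the variant the paper uses, since the abstract gives no details, and the function name is invented:

```python
import math

def allocate_shots(variances, weights, total_shots):
    """Distribute a shot budget across Hamiltonian terms proportionally
    to |w_i| * sigma_i, the optimal allocation for estimating
    sum_i w_i * <H_i> with minimum total variance."""
    scores = [abs(w) * math.sqrt(v) for w, v in zip(weights, variances)]
    norm = sum(scores) or 1.0
    # Each term gets at least one shot so no expectation is left unmeasured.
    return [max(1, round(total_shots * s / norm)) for s in scores]
```

Re-running this allocation as the circuit parameters change (and hence as the per-term variance estimates change) is what makes the scheme a *resampling* method rather than a fixed measurement plan.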
APA, Harvard, Vancouver, ISO, and other styles
49

Han, Guanghui, Xiabi Liu, Guangyuan Zheng, Murong Wang, and Shan Huang. "Automatic recognition of 3D GGO CT imaging signs through the fusion of hybrid resampling and layer-wise fine-tuning CNNs." Medical & Biological Engineering & Computing 56, no. 12 (June 6, 2018): 2201–12. http://dx.doi.org/10.1007/s11517-018-1850-z.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Jung, Ilok, Jaewon Ji, and Changseob Cho. "EmSM: Ensemble Mixed Sampling Method for Classifying Imbalanced Intrusion Detection Data." Electronics 11, no. 9 (April 23, 2022): 1346. http://dx.doi.org/10.3390/electronics11091346.

Full text
Abstract:
Research on the application of machine learning to the field of intrusion detection is attracting great interest. However, depending on the application, it is difficult to collect the data needed for training and testing, as the least frequent data type reflects the most serious threats; the resulting imbalanced data lead to overfitting and hinder precise classification. To solve this problem, in this study, we propose a mixed resampling method that combines the synthetic minority oversampling technique with the edited nearest neighbor rule, increasing the minority class and removing noisy data to generate a balanced dataset. A bagging ensemble algorithm is then used to optimize the model with the new data. We performed verification using two public intrusion detection datasets: PKDD2007 (balanced) and CSIC2012 (imbalanced). The proposed technique yields improved performance over state-of-the-art techniques. Furthermore, the proposed technique enables improved true positive identification and classification of serious threats that rarely occur, representing a major functional innovation.
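The two halves of a SMOTE-plus-ENN mixed sampling scheme, synthesizing minority points and editing out noisy ones, can be sketched in a minimal form as below. This is an illustration of the general technique, not the paper's EmSM implementation; the function names, the Euclidean metric, and the majority-vote rule are assumptions:

```python
import random

def _dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def smote(minority, n_new, k=3, seed=0):
    """SMOTE-style oversampling: synthesize points on the segment between
    a minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    new = []
    for _ in range(n_new):
        p = rng.choice(minority)
        neigh = sorted((q for q in minority if q is not p),
                       key=lambda q: _dist(p, q))[:k]
        q = rng.choice(neigh)
        lam = rng.random()  # interpolation factor in [0, 1)
        new.append(tuple(pi + lam * (qi - pi) for pi, qi in zip(p, q)))
    return new

def enn_filter(X, y, k=3):
    """Edited nearest neighbours: drop samples whose label disagrees with
    the majority of their k nearest neighbours (removes noisy points)."""
    keep = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        neigh = sorted((j for j in range(len(X)) if j != i),
                       key=lambda j: _dist(xi, X[j]))[:k]
        agree = sum(1 for j in neigh if y[j] == yi)
        if agree * 2 >= k:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]
```

In a pipeline like the one the abstract describes, `smote` would first grow the minority class, `enn_filter` would then prune points (from either class) that sit in the wrong neighbourhood, and the balanced result would feed the bagging ensemble.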
APA, Harvard, Vancouver, ISO, and other styles
