Journal articles on the topic 'FEATURE SELECTION TECHNIQUE'


1

Sharaff, Aakanksha, Naresh Kumar Nagwani, and Kunal Swami. "Impact of Feature Selection Technique on Email Classification." International Journal of Knowledge Engineering-IACSIT 1, no. 1 (2015): 59–63. http://dx.doi.org/10.7763/ijke.2015.v1.10.

2

Salama, Mostafa A., and Ghada Hassan. "A Novel Feature Selection Measure Partnership-Gain." International Journal of Online and Biomedical Engineering (iJOE) 15, no. 04 (February 27, 2019): 4. http://dx.doi.org/10.3991/ijoe.v15i04.9831.

Abstract:
Multivariate feature selection techniques search for the optimal feature subset to reduce the dimensionality, and hence the complexity, of a classification task. Statistical feature selection techniques measure the mutual correlation between features as well as the correlation of each feature to the target feature. However, adding a feature to a feature subset could deteriorate the classification accuracy even though this feature positively correlates to the target class. Although most existing feature ranking/selection techniques consider the interdependency between features, the nature of the interaction between features in relation to the classification problem is still not well investigated. This study proposes a technique for forward feature selection that calculates the novel measure Partnership-Gain to select a subset of features whose partnership constructively correlates to the target feature classification. Comparative analysis with other well-known techniques shows that the proposed technique has either an enhanced or a comparable classification accuracy on the datasets studied. We present a visualization of the degree and direction of the proposed measure of features’ partnerships for a better understanding of the measure’s nature.
3

Sikri, Alisha, N. P. Singh, and Surjeet Dalal. "Analysis of Rank Aggregation Techniques for Rank Based on the Feature Selection Technique." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 3s (March 11, 2023): 95–108. http://dx.doi.org/10.17762/ijritcc.v11i3s.6160.

Abstract:
Feature selection is the process of choosing the most important features from a group of attributes and removing the less important or redundant ones, in order to improve classification accuracy and lower future computation and data collection costs. A variety of feature selection procedures have been described in the literature to narrow down the features that need to be analyzed. Six feature selection methods are used in this study: Chi-Square (CS), Information Gain (IG), Relief, Gain Ratio (GR), Symmetrical Uncertainty (SU), and Mutual Information (MI). Based on the outcomes of these six methods, the feature ranks obtained on the provided dataset are aggregated using four rank aggregation strategies: "rank aggregation," "Borda Count (BC) methodology," "score and rank combination," and "unified feature scoring" (UFS). These four procedures by themselves were unable to generate a clear selection rank for the features. An ensemble of the aggregated ranks is therefore built to produce different feature ranks, using the bagging method of majority voting.
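As an illustration of one of the aggregation strategies named in this abstract, the following is a minimal sketch of a Borda count over best-first feature rankings. It is not the authors' implementation; the function name `borda_count`, the feature names, and the three input rankings are hypothetical.

```python
from collections import defaultdict


def borda_count(rankings):
    """Aggregate several best-first feature rankings with the Borda count."""
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            # The top-ranked feature earns n - 1 points, the last earns 0.
            scores[feature] += n - 1 - position
    # Sort by descending score; break ties by name so the result is stable.
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical rankings produced by three different feature selection methods.
rankings = [
    ["f1", "f2", "f3", "f4"],
    ["f2", "f1", "f4", "f3"],
    ["f1", "f3", "f2", "f4"],
]
aggregated = borda_count(rankings)
print(aggregated)
```

Each method contributes points according to position only, so methods whose raw scores live on different scales can still be combined.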
4

Goswami, Saptarsi, Amit Kumar Das, Amlan Chakrabarti, and Basabi Chakraborty. "A feature cluster taxonomy based feature selection technique." Expert Systems with Applications 79 (August 2017): 76–89. http://dx.doi.org/10.1016/j.eswa.2017.01.044.

5

Jain, Rahi, and Wei Xu. "HDSI: High dimensional selection with interactions algorithm on feature selection and testing." PLOS ONE 16, no. 2 (February 16, 2021): e0246159. http://dx.doi.org/10.1371/journal.pone.0246159.

Abstract:
Feature selection on high dimensional data along with the interaction effects is a critical challenge for classical statistical learning techniques. Existing feature selection algorithms such as random LASSO leverage the capability of LASSO to handle high dimensional data. However, the technique has two main limitations, namely the inability to consider interaction terms and the lack of a statistical test for determining the significance of selected features. This study proposes a High Dimensional Selection with Interactions (HDSI) algorithm, a new feature selection method, which can handle high-dimensional data, incorporate interaction terms, provide statistical inferences for the selected features and leverage the capability of existing classical statistical techniques. The method allows the application of any statistical technique, such as LASSO or subset selection, on multiple bootstrapped samples, each of which contains randomly selected features. Each bootstrap sample incorporates interaction terms for the randomly sampled features. The selected features from each model are pooled and their statistical significance is determined. The statistically significant features are used as the final output of the approach, and their final coefficients are estimated using appropriate statistical techniques. The performance of HDSI is evaluated using both simulated data and real studies. In general, HDSI outperforms commonly used algorithms such as LASSO, subset selection, adaptive LASSO, random LASSO and group LASSO.
6

Ramineni, Vyshnavi, and Goo-Rak Kwon. "Diagnosis of Alzheimer’s Disease using Wrapper Feature Selection Method." Korean Institute of Smart Media 12, no. 3 (April 30, 2023): 30–37. http://dx.doi.org/10.30693/smj.2023.12.3.30.

Abstract:
Alzheimer’s disease (AD) is currently managed through early diagnosis, which can only slow the symptoms, and research is still ongoing. With this in mind, several machine learning classification models that use T1-weighted images have been proposed to identify AD. In this paper, we consider an improved feature selection that reduces complexity by using wrapper techniques and a Restricted Boltzmann Machine (RBM). The present work used the subcortical and cortical features of 278 subjects from the ADNI dataset to identify AD from sMRI. Multi-class classification is used for the experiment, i.e., AD, EMCI, LMCI, HC. The proposed feature selection consists of forward feature selection, backward feature selection, and combined PCA & RBM. Forward and backward feature selection are iterative methods: forward selection starts with no features, while backward selection starts with all features included. PCA is used to reduce the dimensions and RBM is used to select the best features without interpreting them. We compared the three models with PCA for analysis. The experiment shows that combined PCA & RBM and backward feature selection give the best accuracy with the RF classification model, i.e., 88.65% and 88.56%, respectively.
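The forward variant of wrapper selection described in this abstract can be sketched as follows. This is a generic greedy illustration, not the paper's pipeline: the scoring function (leave-one-out accuracy of a 1-nearest-neighbour classifier), the function names, and the toy data are all assumptions made for the example.

```python
def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on `features`."""
    correct = 0
    for i, xi in enumerate(X):
        best_dist, best_label = None, None
        for j, xj in enumerate(X):
            if i == j:
                continue
            d = sum((xi[f] - xj[f]) ** 2 for f in features)
            if best_dist is None or d < best_dist:
                best_dist, best_label = d, y[j]
        correct += best_label == y[i]
    return correct / len(X)


def forward_selection(X, y, score=loo_1nn_accuracy):
    """Start with no features; repeatedly add the one that most improves the score."""
    remaining = list(range(len(X[0])))
    selected, best_score = [], 0.0
    while remaining:
        candidate_score, candidate = max(
            (score(X, y, selected + [f]), f) for f in remaining
        )
        if candidate_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected, best_score


# Toy data (hypothetical): column 0 separates the classes, column 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.2], [1.2, 8.8]]
y = [0, 0, 0, 1, 1, 1]
selected, best = forward_selection(X, y)
print(selected, best)
```

Backward selection mirrors this loop: it starts from the full feature list and removes the feature whose deletion hurts the score least.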
7

Zabidi, A., W. Mansor, and Khuan Y. Lee. "Optimal Feature Selection Technique for Mel Frequency Cepstral Coefficient Feature Extraction in Classifying Infant Cry with Asphyxia." Indonesian Journal of Electrical Engineering and Computer Science 6, no. 3 (June 1, 2017): 646. http://dx.doi.org/10.11591/ijeecs.v6.i3.pp646-655.

Abstract:
Mel Frequency Cepstral Coefficient is an efficient feature representation method for extracting human-audible audio signals. However, its representation of features is large and redundant. Therefore, feature selection is required to select the optimal subset of Mel Frequency Cepstral Coefficient features. The performance of two types of feature selection techniques, Orthogonal Least Squares (OLS) and F-ratio, for selecting Mel Frequency Cepstral Coefficient features of infant cry with asphyxia was examined. OLS selects the feature subset based on each feature's contribution to the reduction of error, while F-ratio selects features according to their discriminative abilities. The feature selection techniques were combined with a Multilayer Perceptron to distinguish between asphyxiated infant cry and normal cry signals. The performance of the feature selection methods was examined by analysing the Multilayer Perceptron classification accuracy resulting from each combination. The results indicate that Orthogonal Least Squares is the most suitable feature selection method for classifying infant cry with asphyxia, since it produces the highest classification accuracy.
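To make the F-ratio criterion mentioned in this abstract concrete, here is a small sketch. It assumes one common definition, the variance of the class means divided by the average within-class variance; the paper may define the ratio differently. The function names and the toy values are hypothetical.

```python
from statistics import mean, pvariance


def f_ratio(values, labels):
    """Variance of the class means divided by the mean within-class variance."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    between = pvariance([mean(g) for g in groups.values()])
    within = mean([pvariance(g) for g in groups.values()])
    return between / within if within > 0 else float("inf")


def rank_by_f_ratio(X, y):
    """Return feature indices sorted from most to least discriminative."""
    scores = [f_ratio([row[j] for row in X], y) for j in range(len(X[0]))]
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    return order, scores


# Toy data (hypothetical): column 0 differs between classes, column 1 does not.
X = [[1.0, 1.0], [1.2, 3.0], [0.8, 2.0], [3.0, 2.0], [3.2, 1.0], [2.8, 3.0]]
y = [0, 0, 0, 1, 1, 1]
order, scores = rank_by_f_ratio(X, y)
print(order, scores)
```

A feature whose class means coincide gets a ratio of zero, however widely its values are spread inside each class.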
8

Miftahushudur, Tajul, Chaeriah Bin Ali Wael, and Teguh Praludi. "Infinite Latent Feature Selection Technique for Hyperspectral Image Classification." Jurnal Elektronika dan Telekomunikasi 19, no. 1 (August 31, 2019): 32. http://dx.doi.org/10.14203/jet.v19.32-37.

Abstract:
The classification process is one of the most crucial processes in hyperspectral imaging. One limitation of classification with machine learning techniques is its complexity, since the hyperspectral image format has a thousand bands that can be used as features for learning. This paper presents a comparison between two feature selection techniques based on a probabilistic approach that not only tackle this problem but also improve accuracy. Infinite Latent Feature Selection (ILFS) and Relief are applied to a hyperspectral image to select the most important features, or bands, before classification with a Support Vector Machine (SVM). The results showed that ILFS improves classification accuracy more than Relief (92.21% vs. 88.10%). However, Relief needs fewer features to reach its best accuracy, with only 6 features compared with 9 for ILFS.
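For readers unfamiliar with Relief, the sketch below shows the basic two-class weight update. It is a simplified illustration, not the implementation used in the paper: it visits every sample once instead of drawing random samples, assumes each class has at least two samples, and uses hypothetical names and toy data.

```python
def relief(X, y):
    """Basic two-class Relief: reward features that differ on the nearest miss
    and penalise features that differ on the nearest hit."""
    n, m = len(X), len(X[0])
    # Scale differences by each feature's range so the weights are comparable.
    spans = [
        (max(row[j] for row in X) - min(row[j] for row in X)) or 1.0
        for j in range(m)
    ]

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(j, a, b) for j in range(m))

    weights = [0.0] * m
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        hit = min(hits, key=lambda k: dist(X[i], X[k]))
        miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(m):
            weights[j] += (diff(j, X[i], X[miss]) - diff(j, X[i], X[hit])) / n
    return weights


# Toy data (hypothetical): column 0 separates the classes, column 1 is noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.2], [1.2, 8.8]]
y = [0, 0, 0, 1, 1, 1]
weights = relief(X, y)
print(weights)
```

Features (bands, in the hyperspectral setting) would then be kept in descending order of weight until the desired number is reached.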
9

Saifan, Ahmad A., and Lina Abu-wardih. "Software Defect Prediction Based on Feature Subset Selection and Ensemble Classification." ECTI Transactions on Computer and Information Technology (ECTI-CIT) 14, no. 2 (October 9, 2020): 213–28. http://dx.doi.org/10.37936/ecti-cit.2020142.224489.

Abstract:
Two primary issues have emerged in the machine learning and data mining community: how to deal with imbalanced data and how to choose appropriate features. These are of particular concern in the software engineering domain, and more specifically the field of software defect prediction. This research highlights a procedure which includes a feature selection technique to single out relevant attributes, and an ensemble technique to handle the class-imbalance issue. In order to determine the advantages of feature selection and ensemble methods we look at two potential scenarios: (1) Ensemble models constructed from the original datasets, without feature selection; (2) Ensemble models constructed from the reduced datasets after feature selection has been applied. Four feature selection techniques are employed: Principal Component Analysis (PCA), Pearson’s correlation, Greedy Stepwise Forward selection, and Information Gain (IG). The aim of this research is to assess the effectiveness of feature selection techniques using ensemble techniques. Five datasets, obtained from the PROMISE software depository, are analyzed; tentative results indicate that ensemble methods can improve the model's performance without the use of feature selection techniques. PCA feature selection and bagging based on K-NN perform better than both bagging based on SVM and boosting based on K-NN and SVM, and feature selection techniques including Pearson’s correlation, Greedy stepwise, and IG weaken the ensemble models’ performance.
10

Ali, Tariq, Asif Nawaz, and Hafiza Ayesha Sadia. "Genetic Algorithm Based Feature Selection Technique for Electroencephalography Data." Applied Computer Systems 24, no. 2 (December 1, 2019): 119–27. http://dx.doi.org/10.2478/acss-2019-0015.

Abstract:
High dimensionality is a well-known problem in which the data contain a huge number of features, many of which are not helpful for a particular data mining task such as classification or clustering. Therefore, feature selection is frequently used to reduce the dimensionality of the data set. Feature selection is a multi-objective task: it reduces dataset dimensionality, decreases the running time, and also improves the expected accuracy. In this study, our goal is to reduce the number of features of electroencephalography data for eye state classification and achieve the same or even better classification accuracy with the fewest features. We propose a genetic algorithm-based feature selection technique with the KNN classifier. The accuracy obtained with the feature subset selected by the proposed technique is better than that obtained with the full feature set. Results show that the classification accuracy of the proposed technique is 3% higher on average than the accuracy without feature selection.
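The general idea of pairing a genetic algorithm with a KNN classifier can be sketched as below. This is a generic illustration under stated assumptions, not the authors' algorithm: the fitness function (leave-one-out k-NN accuracy minus a small per-feature penalty), the operators, all parameter values, and the synthetic data are hypothetical.

```python
import random


def knn_loo_accuracy(X, y, mask, k=3):
    """Leave-one-out accuracy of k-NN using only the features where mask is 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        dists = sorted(
            (sum((xi[j] - xo[j]) ** 2 for j in cols), y[o])
            for o, xo in enumerate(X) if o != i
        )
        votes = [label for _, label in dists[:k]]
        correct += max(set(votes), key=votes.count) == y[i]
    return correct / len(X)


def fitness(X, y, mask, penalty=0.01):
    """Accuracy minus a small cost per selected feature (an assumed trade-off)."""
    return knn_loo_accuracy(X, y, mask) - penalty * sum(mask)


def ga_select(X, y, pop_size=12, generations=15, mutation_rate=0.1, seed=0):
    """Evolve bit masks over the features; needs at least two features."""
    rng = random.Random(seed)
    n = len(X[0])
    score = lambda mask: fitness(X, y, mask)
    # Seed the population with the full feature set plus random masks.
    population = [[1] * n] + [
        [rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        next_gen = [population[0][:]]  # elitism: keep the best mask unchanged
        while len(next_gen) < pop_size:
            p1 = max(rng.sample(population, 2), key=score)  # tournament
            p2 = max(rng.sample(population, 2), key=score)
            cut = rng.randint(1, n - 1)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)


# Synthetic data (hypothetical): two informative columns, two noise columns.
data_rng = random.Random(1)
y = [i % 2 for i in range(20)]
X = [
    [c + data_rng.gauss(0, 0.2), 2 * c + data_rng.gauss(0, 0.3),
     data_rng.uniform(0, 5), data_rng.uniform(0, 5)]
    for c in y
]
best = ga_select(X, y)
print(best, fitness(X, y, best))
```

Because the initial population contains the full feature set and the best mask is always carried forward, the returned mask never scores worse than using every feature under this fitness function.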
11

Seetha, Hari, M. Narasimha Murty, and R. Saravanan. "Effective feature selection technique for text classification." International Journal of Data Mining, Modelling and Management 7, no. 3 (2015): 165. http://dx.doi.org/10.1504/ijdmmm.2015.071451.

12

Sarkar, Chandrima, Sarah Cooley, and Jaideep Srivastava. "Robust Feature Selection Technique Using Rank Aggregation." Applied Artificial Intelligence 28, no. 3 (March 14, 2014): 243–57. http://dx.doi.org/10.1080/08839514.2014.883903.

13

Ahmad, Amir, and Lipika Dey. "A feature selection technique for classificatory analysis." Pattern Recognition Letters 26, no. 1 (January 2005): 43–56. http://dx.doi.org/10.1016/j.patrec.2004.08.015.

14

Phogat, Manu, and Dharmender Kumar. "Disease Single Nucleotide Polymorphism Selection using Hybrid Feature Selection Technique." Journal of Physics: Conference Series 1950, no. 1 (August 1, 2021): 012079. http://dx.doi.org/10.1088/1742-6596/1950/1/012079.

15

Solovei, Olga. "NEW ORGANIZATION PROCESS OF FEATURE SELECTION BY FILTER WITH CORRELATION-BASED FEATURES SELECTION METHOD." Innovative Technologies and Scientific Solutions for Industries, no. 3 (21) (November 18, 2022): 39–50. http://dx.doi.org/10.30837/itssi.2022.21.039.

Abstract:
The subject of the article is feature selection techniques that are used in the data preprocessing step before building machine learning models. The paper focuses on the Filter technique when it uses Correlation-based Feature Selection (further CFS) with the symmetrical uncertainty method (further CFS-SU) or CFS with Pearson correlation (further CFS-PearCorr). The goal of the work is to increase the efficiency of feature selection by Filter with CFS by proposing a new organization process of feature selection. The tasks solved in the article are: review and analyze the existing organization process of feature selection by Filter with CFS; identify the root causes of the performance degradation; propose a new approach; evaluate the proposed approach. To implement the specified tasks, the following methods were used: information theory, process theory, algorithm theory, statistics theory, sampling techniques, data modeling theory, and scientific experiments. Results. The obtained results show that: 1) the evaluation function of the chosen feature subset cannot be based only on the CFS merit, as this degrades the results of a learning algorithm; 2) the accuracies of the classification learning algorithms improved and the values of the determination coefficient of the regression learning algorithms increased when features were selected according to the proposed new organization process. Conclusions. The new organization process for feature selection proposed in the current work combines filter and learning algorithm properties in the evaluation strategy, which helps to choose the optimal feature subset for a predefined learning algorithm. The computational complexity of the proposed approach to feature selection does not depend on the dataset’s dimensions, which makes it robust to different data varieties; it eliminates the time needed to search for feature subsets, as subsets are selected randomly. The conducted experiments showed that the classification and regression learning algorithms with features selected according to the new flow outperformed the same learning algorithms built without the new process applied in the data preprocessing step.
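For reference, the CFS merit that this abstract refers to is usually written as follows. The formula is the standard one from the CFS literature rather than a quotation from this paper, and the symbols are chosen here for illustration: k is the number of features in subset S, \overline{r_{cf}} is the mean feature-class correlation, and \overline{r_{ff}} is the mean feature-feature correlation (computed with symmetrical uncertainty or Pearson correlation, depending on the variant).

```latex
\mathrm{Merit}_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}}

% Worked example with assumed values k = 3, \overline{r_{cf}} = 0.6, \overline{r_{ff}} = 0.2:
\mathrm{Merit}_S = \frac{3 \times 0.6}{\sqrt{3 + 3 \times 2 \times 0.2}}
                 = \frac{1.8}{\sqrt{4.2}} \approx 0.88
```

The numerator rewards subsets whose features correlate with the class, while the denominator penalises subsets whose features correlate with each other.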
16

Wiradinata, Trianggoro, and Adi Suryaputra Paramita. "Clustering and Feature Selection Technique for Improving Internet Traffic Classification Using K-NN." Journal of Advances in Computer Networks 4, no. 1 (2016): 24–27. http://dx.doi.org/10.18178/jacn.2016.4.1.198.

17

Mohamed, Rozlini, Munirah Mohd Yusof, Noorhaniza Wahid, Norhanifah Murli, and Muhaini Othman. "Bat algorithm and k-means techniques for classification performance improvement." Indonesian Journal of Electrical Engineering and Computer Science 15, no. 3 (September 1, 2019): 1411. http://dx.doi.org/10.11591/ijeecs.v15.i3.pp1411-1418.

Abstract:
This paper presents Bat Algorithm and K-Means techniques for classification performance improvement. The objective of this study is to investigate the efficiency of the Bat Algorithm on discrete datasets and to find the optimum features in discrete datasets. In this study, a technique that comprises a discretization technique and a feature selection technique is proposed. Our contribution covers two stages of classification: pre-processing and feature selection. First, we propose a discretization technique called BkMD, which hybridizes the Bat Algorithm and the K-Means classifier. Second, we propose BkMDFS as a feature selection technique, in which the Bat Algorithm is embedded into BkMD. To evaluate the proposed techniques, 14 continuous datasets from various applications are used in the experiment. The results show that BkMDFS performs best on most performance measures. Hence, the Bat Algorithm has the potential to serve as both a discretization technique and a feature selection technique.
18

Abubakar, Shamsuddeen Muhammad, and Zahraddeen Sufyanu. "Comparisons of Filter, Wrapper and Embedded-Based Feature Selection Techniques for Consistency of Software Metrics Analysis." SLU Journal of Science and Technology 4, no. 1&2 (July 20, 2022): 188–204. http://dx.doi.org/10.56471/slujst.v4i.238.

Abstract:
Identifying and selecting the most consistent subset of metrics that improves the performance of a software defect prediction model is a paramount but challenging problem that has received little attention in the literature. The current research aimed at investigating the consistency of subsets of metrics that are produced by embedded feature selection techniques. Ten (10) feature selection techniques were used from the families of filter and wrapper-based feature selection techniques commonly used in the defect prediction domain. Ten (10) publicly available defect datasets were studied, which span both proprietary and open source domains. SVM-RFE-RF presented 42-93% consistent metrics across datasets, while the prior study on non-embedded techniques produced 56.5% consistent metrics at the median. The SVM-RFE-LF approach of the embedded feature selection technique produced 54-80% consistent metrics across datasets and 42.5% at the median. In line with the purpose stated in the title, embedded-based feature selection techniques produced the most efficient and consistent subset selection across all the datasets when compared with their filter and wrapper-based counterparts.
19

Thepade, Sudeep, Rik Das, and Saurav Ghosh. "A Novel Feature Extraction Technique Using Binarization of Bit Planes for Content Based Image Classification." Journal of Engineering 2014 (2014): 1–13. http://dx.doi.org/10.1155/2014/439218.

Abstract:
A number of techniques have been proposed earlier for feature extraction using image binarization. Efficiency of the techniques was dependent on proper threshold selection for the binarization method. In this paper, a new feature extraction technique using image binarization has been proposed. The technique has binarized the significant bit planes of an image by selecting local thresholds. The proposed algorithm has been tested on a public dataset and has been compared with existing widely used techniques using binarization for extraction of features. It has been inferred that the proposed method has outclassed all the existing techniques and has shown consistent classification performance.
20

Chotchantarakun, Knitchepon, and Ohm Sornil. "Adaptive Multi-level Backward Tracking for Sequential Feature Selection." Journal of ICT Research and Applications 15, no. 1 (June 29, 2021): 1–20. http://dx.doi.org/10.5614/itbj.ict.res.appl.2021.15.1.1.

Abstract:
In the past few decades, the large amount of available data has become a major challenge in data mining and machine learning. Feature selection is a significant preprocessing step for selecting the most informative features by removing irrelevant and redundant features, especially for large datasets. These selected features play an important role in information searching and enhancing the performance of machine learning models. In this research, we propose a new technique called One-level Forward Multi-level Backward Selection (OFMB). The proposed algorithm consists of two phases. The first phase aims to create preliminarily selected subsets. The second phase provides an improvement on the previous result by an adaptive multi-level backward searching technique. Hence, the idea is to apply an improvement step during the feature addition and an adaptive search method on the backtracking step. We have tested our algorithm on twelve standard UCI datasets based on k-nearest neighbor and naive Bayes classifiers. Their accuracy was then compared with some popular methods. OFMB showed better results than the other sequential forward searching techniques for most of the tested datasets.
21

Manshah, Muhammad, Rana Aamir Raza, Saadia Ajmal, Urooj Pasha, and Asghar Ali. "An Efficient Swarm based Feature Selection Technique using Random Weight Neural Network." JOURNAL OF NANOSCOPE (JN) 2, no. 2 (December 31, 2021): 231–55. http://dx.doi.org/10.52700/jn.v2i2.49.

Abstract:
Feature selection (FS) is one of the most important pre-processing tasks in machine learning (ML) and data mining. It selects optimum features by eliminating noisy and irrelevant features from the data to improve the generalization ability of a learning model (i.e., a classifier). During the classification process, data with a high dimensional feature space requires different optimization techniques to obtain better predictive performance. In this paper we present a swarm intelligence based technique called binary artificial bee colony (Binary-ABC) to obtain the optimum feature subset. Different binary and multiclass datasets are utilized to evaluate the performance of our proposed technique. Experimental results show that our technique provides better generalization ability with a random weight neural network (RWNN) when compared with other ML classifiers.
22

Ayyad, Sarah M., Ahmed I. Saleh, and Labib M. Labib. "A new distributed feature selection technique for classifying gene expression data." International Journal of Biomathematics 12, no. 04 (May 2019): 1950039. http://dx.doi.org/10.1142/s1793524519500396.

Abstract:
Classification of gene expression data is a pivotal research area that plays a substantial role in the diagnosis and prediction of diseases. Generally, feature selection is one of the most extensively used techniques in data mining approaches, especially in classification. Gene expression data are usually composed of dozens of samples characterized by thousands of genes. This increases the dimensionality and brings with it irrelevant and redundant features. Accordingly, the selection of informative genes (features) becomes difficult, which badly affects the gene classification accuracy. In this paper, we consider feature selection for classifying gene expression microarray datasets. The goal is to detect the genes most likely to be cancer-related in a distributed manner, which helps in effectively classifying the samples. Initially, the large number of features under consideration is subdivided and distributed among several processors. Then, a new filter selection method based on a fuzzy inference system is applied to each subset of the dataset. Finally, all the resulting features are ranked, and a wrapper-based selection method is applied. Experimental results showed that our proposed feature selection technique performs better than other techniques, since it produces lower time latency and improves classification performance.
23

Al-Rasheed, Amal. "Identification of important features and data mining classification techniques in predicting employee absenteeism at work." International Journal of Electrical and Computer Engineering (IJECE) 11, no. 5 (October 1, 2021): 4587. http://dx.doi.org/10.11591/ijece.v11i5.pp4587-4596.

Abstract:
Employee absenteeism at work costs organizations billions a year. Predicting employees’ absenteeism and the reasons behind their absence helps organizations reduce expenses and increase productivity. Data mining turns the vast volume of human resources data into information that can help in decision-making and prediction. Although the selection of features is a critical step in data mining to enhance the efficiency of the final prediction, it is not yet known which method of feature selection is better. Therefore, this paper aims to compare the performance of three well-known feature selection methods in absenteeism prediction: relief-based feature selection, correlation-based feature selection and information-gain feature selection. In addition, this paper aims to find the best combination of feature selection method and data mining technique for enhancing the absenteeism prediction accuracy. Seven classification techniques were used as prediction models. Additionally, a cross-validation approach was utilized to assess the applied prediction models in order to obtain more realistic and reliable results. The dataset used was built at a courier company in Brazil from records of absenteeism at work. In the experimental results, correlation-based feature selection surpasses the other methods across the performance measurements. Furthermore, the bagging classifier was the best-performing data mining technique when features were selected using correlation-based feature selection, with an accuracy rate of 92%.
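Of the three methods compared in this abstract, information gain is the simplest to show in a few lines. The sketch below computes it for categorical features as the drop in class entropy after splitting on the feature; the function names and the tiny example columns are hypothetical and unrelated to the paper's dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Class entropy minus the entropy remaining after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical columns: `shift` determines the label, `weekday` tells nothing.
labels = ["absent", "absent", "present", "present"]
shift = ["night", "night", "day", "day"]
weekday = ["mon", "tue", "mon", "tue"]
gains = {
    "shift": information_gain(shift, labels),
    "weekday": information_gain(weekday, labels),
}
print(gains)
```

Features are then ranked by gain, and those below a chosen threshold are dropped before the classifier is trained.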
24

Khaokaew, Yonchanok, Tanapat Anusas-Amornkul, and Koonlachat Meesublak. "Intrusion Detection System Based on Hybrid Feature Selection and Support Vector Machine (HFS-SVM)." Applied Mechanics and Materials 781 (August 2015): 125–28. http://dx.doi.org/10.4028/www.scientific.net/amm.781.125.

Abstract:
In recent years, anomaly-based intrusion detection techniques have been continuously developed, and the support vector machine (SVM) is one such technique. However, it requires substantial training time and storage when there is a large number of features. In this paper, a hybrid feature selection, using Correlation-based Feature Selection and Motif Discovery using Random Projection techniques, is proposed to reduce the number of features from 41 to 3 on the KDD'99 dataset. It is compared with a regular SVM technique with 41 features. The results show that the accuracy rate remains high at 98% and the training time is almost half that of the regular SVM.
25

Pan, Wei, Pei Jun Ma, and Xiao Hong Su. "Large Margin Feature Selection for Support Vector Machine." Applied Mechanics and Materials 274 (January 2013): 161–64. http://dx.doi.org/10.4028/www.scientific.net/amm.274.161.

Abstract:
Feature selection is a preprocessing step in pattern analysis and machine learning. In this paper, we design an algorithm for feature subset selection. We present an L1-norm regularization technique for sparse feature weights. A margin loss is introduced to evaluate features, and we employ gradient descent to search for the optimal solution that maximizes the margin. The proposed technique is tested on UCI data sets. Compared with four margin-based loss functions for SVM, the proposed technique is effective and efficient.
26

Hany, Maha, Shaheera Rashwan, and Neveen M. Abdelmotilib. "A Machine Learning Method for Prediction of Yogurt Quality and Consumers Preferencesusing Sensory Attributes and Image Processing Techniques." Machine Learning and Applications: An International Journal 10, no. 1 (March 30, 2023): 1–7. http://dx.doi.org/10.5121/mlaij.2023.10101.

Abstract:
Prediction of quality and consumers’ preferences is an essential task for food producers to improve their market share and reduce any gap in food safety standards. In this paper, we develop a machine learning method to predict yogurt preferences based on the sensory attributes and analysis of samples’ images using image processing texture and color feature extraction techniques. We compare three unsupervised ML feature selection techniques (Principal Component Analysis, Independent Component Analysis and t-distributed Stochastic Neighbour Embedding) with one supervised ML feature selection technique (Linear Discriminant Analysis) in terms of classification accuracy. Results show the efficiency of the supervised ML feature selection technique over the traditional feature selection techniques.
27

Sahu, Sanat Kumar, and A. K. Shrivas. "Comparative Study of Classification Models with Genetic Search Based Feature Selection Technique." International Journal of Applied Evolutionary Computation 9, no. 3 (July 2018): 1–11. http://dx.doi.org/10.4018/ijaec.2018070101.

Abstract:
Feature selection plays a very important role in retrieving the relevant features from datasets and computationally improves the performance of a model. The objective of this study is to evaluate the most important features of a chronic kidney disease (CKD) dataset and diagnose the CKD problem. In this research work, the authors have used a genetic search with the Wrapper Subset Evaluator method for feature selection to increase the overall performance of the classification model. They have also used Bayes Network, Classification and Regression Tree (CART), Radial Basis Function Network (RBFN) and J48 classifiers for classification of CKD and non-CKD data. The proposed genetic search based feature selection technique (GSBFST) selects the best features from the CKD dataset, and the performance of the classifiers is compared between the proposed and existing genetic search feature selection techniques (FSTs). All classification models give better results with the proposed GSBFST than without an FST or with existing genetic search FSTs.
APA, Harvard, Vancouver, ISO, and other styles
28

S, Varshavardhini. "An Efficient Feature Subset Selection with Fuzzy Wavelet Neural Network for Data Mining in Big Data Environment." Journal of Internet Services and Information Security 13, no. 2 (May 30, 2023): 233–48. http://dx.doi.org/10.58346/jisis.2023.i2.015.

Abstract:
Big data refers to the massive quantity of data being generated at a drastic speed from various heterogeneous sources, namely social media, mobile devices, internet transactions, networked devices, and sensors. Several data mining (DM) and machine learning (ML) models have been presented for the extraction of knowledge from Big Data. Since big datasets include numerous features, feature selection techniques are essential to eliminate unwanted and unrelated features which degrade the classification efficiency. The adoption of DM tools for big data environments necessitates remodeling the algorithm. In this aspect, this paper presents an intelligent feature subset selection with fuzzy wavelet neural network (FSS-FWNN) for big data classification. The FSS-FWNN technique incorporates the Hadoop Ecosystem tool for handling big data in an effectual way. Besides, the FSS-FWNN technique involves three processes, namely preprocessing, feature selection, and classification. In addition, the quasi-oppositional chicken swarm optimization (QOCSO) technique is employed for the feature selection process and the FWNN technique is applied for the classification process. The design of the QOCSO algorithm as an FS technique for big data classification shows the novelty of the work, and the feature subset selection process considerably enhances the classification performance. An extensive set of simulations is carried out and the results are reviewed in terms of several evaluation factors in order to analyse the improvement of the FSS-FWNN approach. The experimental findings demonstrated that the FSS-FWNN approach outperformed the most current algorithms.
29

Shirazi, Syed Atir Raza, Sania Shamim, Abdul Hannan Khan, and Aqsa Anwar. "Intrusion detection using decision tree classifier with feature reduction technique." Mehran University Research Journal of Engineering and Technology 42, no. 2 (March 28, 2023): 30. http://dx.doi.org/10.22581/muet1982.2302.04.

Abstract:
The number of internet users and network services has increased rapidly over the past decade. A large volume of data is produced and transmitted over the network. The number of security threats to the network has also increased. Although many machine learning approaches and methods are used in intrusion detection systems to detect attacks, they are generally not efficient for large datasets and real-time detection. Machine learning classifiers that use all features of a dataset suffer reduced detection accuracy. A reduced feature selection technique that selects the most relevant features to detect the attack with an ML approach has been used to obtain higher accuracy. In this paper, we used the recursive feature elimination technique and selected the more relevant features with machine learning approaches for big data to meet the challenge of detecting the attack. We applied this technique and classifier to the NSL KDD dataset. Results showed that using all features for detection increases complexity in the context of large data, and that feature selection improves classifier performance in terms of both efficiency and accuracy.
30

Kim, Sung-Dong. "A Feature Selection Technique based on Distributional Differences." Journal of Information Processing Systems 2, no. 1 (March 1, 2006): 23–27. http://dx.doi.org/10.3745/jips.2006.2.1.023.

31

ASonawale, Swati, and Roshani Ade. "Dimensionality Reduction: An Effective Technique for Feature Selection." International Journal of Computer Applications 117, no. 3 (May 20, 2015): 18–23. http://dx.doi.org/10.5120/20535-2893.

32

Jenul, Anna, Stefan Schrunner, Kristian Hovde Liland, Ulf Geir Indahl, Cecilia Marie Futsaether, and Oliver Tomic. "RENT—Repeated Elastic Net Technique for Feature Selection." IEEE Access 9 (2021): 152333–46. http://dx.doi.org/10.1109/access.2021.3126429.

33

Xu, Jiucheng, Lin Sun, Yunpeng Gao, and Tianhe Xu. "An ensemble feature selection technique for cancer recognition." Bio-Medical Materials and Engineering 24, no. 1 (2014): 1001–8. http://dx.doi.org/10.3233/bme-130897.

34

Bruzzone, L., and S. B. Serpico. "A technique for feature selection in multiclass problems." International Journal of Remote Sensing 21, no. 3 (January 2000): 549–63. http://dx.doi.org/10.1080/014311600210740.

35

Lin, Hao, and Wei Chen. "Prediction of thermophilic proteins using feature selection technique." Journal of Microbiological Methods 84, no. 1 (January 2011): 67–70. http://dx.doi.org/10.1016/j.mimet.2010.10.013.

36

Priyadarsini, Pullagura Indira, Manikonda Srininivasa Sesha Sai, Akula Suneetha, and Munnangi Velengini Bala Teresa Santhi. "Robust Feature Selection Technique for Intrusion Detection System." International Journal of Control and Automation 11, no. 2 (February 28, 2018): 33–44. http://dx.doi.org/10.14257/ijca.2018.11.2.04.

37

Kamarudin, Muhammad Hilmi, Carsten Maple, and Tim Watson. "Hybrid feature selection technique for intrusion detection system." International Journal of High Performance Computing and Networking 13, no. 2 (2019): 232. http://dx.doi.org/10.1504/ijhpcn.2019.097503.

38

Watson, Tim, Muhammad Hilmi Kamarudin, and Carsten Maple. "Hybrid feature selection technique for intrusion detection system." International Journal of High Performance Computing and Networking 13, no. 2 (2019): 232. http://dx.doi.org/10.1504/ijhpcn.2019.10018670.

39

Bhattacharya, Abhishek, and Radha Tamal Goswami. "Community Based Feature Selection Method for Detection of Android Malware." Journal of Global Information Management 26, no. 3 (July 2018): 54–77. http://dx.doi.org/10.4018/jgim.2018070105.

Abstract:
The amount of malware has been rising drastically as smartphones and tablets running the Android operating system have gained popularity around the world in the last couple of years. One of the popular methods of static detection is permission/feature-based detection of malware through the AndroidManifest.xml file using machine learning classifiers. Ignoring important features or keeping irrelevant features may confuse classification algorithms. Therefore, to reduce classification time and improve accuracy, different feature reduction tools have been used in past literature. Community detection is one of the major tools in social network analysis, but its implementation in the context of malware detection is quite rare. In this article, the authors introduce a community-based feature reduction technique for Android malware detection. The proposed method is evaluated on two datasets consisting of 3004 benign components and 1363 malware components. The proposed community-based feature reduction technique produces a classification accuracy of 98.20% and ROC value up to 0.989.
40

Mahmoud, Hanan Ahmed Hosni, Abeer Abdulaziz AlArfaj, and Alaaeldin M. Hafez. "A Fast Hybrid Classification Algorithm with Feature Reduction for Medical Images." Applied Bionics and Biomechanics 2022 (March 22, 2022): 1–11. http://dx.doi.org/10.1155/2022/1367366.

Abstract:
In this paper, we introduce a fast hybrid fuzzy classification algorithm with feature reduction for medical images. We incorporated the quantum-based grasshopper computing algorithm (QGH) with feature extraction using a fuzzy clustering technique (C-means). QGH integrates quantum computing into machine learning and intelligence applications. The objective of our technique is to integrate the QGH method specifically into cervical cancer detection that is based on image processing. Many features such as color, geometry, and texture found in the cells imaged in Pap smear lab tests are crucial in cancer diagnosis. Our proposed technique is based on the extraction of the best features using more than 2600 public Pap smear images and further applies a feature reduction technique to reduce the feature space. Performance evaluation of our approach evaluates the influence of the extracted features on the classification precision by performing two experimental setups. The first setup uses all the extracted features, which leads to classification without feature bias. The second setup is a fusion technique which utilizes QGH with the fuzzy C-means algorithm to choose the best features. In the setups, we allocate the assessment to accuracy based on the selection of best features and of different categories of the cancer. In the last setup, we utilized a fusion technique engaged with statistical techniques to launch a qualitative agreement with the feature selection in several experimental setups.
41

Adeleke, A., N. A. Samsudin, Z. A. Othman, and S. K. Ahmad Khalid. "A two-step feature selection method for quranic text classification." Indonesian Journal of Electrical Engineering and Computer Science 16, no. 2 (November 1, 2019): 730. http://dx.doi.org/10.11591/ijeecs.v16.i2.pp730-736.

Abstract:
Feature selection is an integral phase in text classification problems. It is primarily applied in preprocessing text data prior to labeling. However, FS techniques have some limitations. The filter-based FS techniques have the drawback of lower accuracy, while the wrapper-based techniques are computationally expensive. In this paper, a two-step FS method is presented. In the first step, a chi-square (CH) filter-based technique is used to reduce the dimensionality of the feature set, and then a wrapper correlation-based (CFS) technique is employed in the second step to further select the most relevant features from the reduced feature set. Specifically, the ultimate aim is to reduce the computational runtime while achieving high classification accuracy. Subsequently, the proposed method was applied in labeling instances of the input data (Quranic verses) using standard classifiers: naïve Bayes (NB), support vector machine (SVM), and decision trees (J48). The results show that the proposed method achieved an accuracy of 93.6% in 4.17 secs.
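The first (filter) step described in this abstract can be illustrated with a minimal sketch. It assumes binary term-presence features and a one-vs-rest class label; the function names, the toy data, and the choice of `k` are hypothetical and are not taken from the article. The second (wrapper) step is not shown.

```python
def chi_square(presence, labels, positive):
    """Chi-square statistic of a binary feature against a one-vs-rest class label."""
    a = b = c = d = 0
    for present, label in zip(presence, labels):
        in_class = label == positive
        if present and in_class:
            a += 1  # term present, target class
        elif present:
            b += 1  # term present, other class
        elif in_class:
            c += 1  # term absent, target class
        else:
            d += 1  # term absent, other class
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator if denominator else 0.0


def filter_top_k(features, labels, positive, k):
    """Filter step: keep the k features with the highest chi-square score."""
    scores = {name: chi_square(col, labels, positive) for name, col in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k], scores


# Hypothetical toy data: three term-presence features over four documents.
labels = ["A", "A", "B", "B"]
features = {
    "term1": [1, 1, 0, 0],  # present only in class A
    "term2": [1, 0, 1, 0],  # unrelated to the class
    "term3": [1, 1, 1, 0],  # weakly related to the class
}

kept, scores = filter_top_k(features, labels, positive="A", k=2)
```

The reduced set in `kept` would then be handed to a wrapper-style search, which is more expensive per candidate but now has far fewer candidates to evaluate.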
42

Naseri, Hamed, E. Owen D. Waygood, Bobin Wang, Zachary Patterson, and Ricardo A. Daziano. "A Novel Feature Selection Technique to Better Predict Climate Change Stage of Change." Sustainability 14, no. 1 (December 21, 2021): 40. http://dx.doi.org/10.3390/su14010040.

Abstract:
Indications of people’s environmental concern are linked to transport decisions and can provide great support for policymaking on climate change. This study aims to better predict individual climate change stage of change (CC-SoC) based on different features of transport-related behavior, General Ecological Behavior, New Environmental Paradigm, and socio-demographic characteristics. Together these sources result in over 100 possible features that indicate someone’s level of environmental concern. Such a large number of features may create several analytical problems, such as overfitting, accuracy reduction, and high computational costs. To this end, a new feature selection technique, named the Coyote Optimization Algorithm-Quadratic Discriminant Analysis (COA-QDA), is first proposed to find the optimal features to predict CC-SoC with the highest accuracy. Different conventional feature selection methods (Lasso, Elastic Net, Random Forest Feature Selection, Extra Trees, and Principal Component Analysis Feature Selection) are employed to compare with the COA-QDA. Afterward, eight classification techniques are applied to solve the prediction problem. Finally, a sensitivity analysis is performed to determine the most important features affecting the prediction of CC-SoC. The results indicate that COA-QDA outperforms conventional feature selection methods by increasing average testing data accuracy from 0.7% to 5.6%. Logistic Regression surpasses other classifiers with the highest prediction accuracy.
43

Gupta, Shikha, and Anuradha Chug. "A feature selection strategy for improving software maintainability prediction." Intelligent Data Analysis 26, no. 2 (March 14, 2022): 311–44. http://dx.doi.org/10.3233/ida-215825.

Abstract:
Software maintainability is a significant factor when choosing particular software. It is helpful in estimating the effort required after delivering the software to the customer. However, issues like imbalanced distribution of datasets, and redundant and irrelevant occurrence of various features, degrade the performance of maintainability prediction models. Therefore, the current study applies the ImpS algorithm to handle imbalanced data and extensively investigates several Feature Selection (FS) techniques including Symmetrical Uncertainty (SU), RandomForest filter, and Correlation-based FS using one open-source, three proprietary and two commercial datasets. Eight different machine learning algorithms are utilized for developing prediction models. The performance of the models is evaluated using Accuracy, G-Mean, Balance, and Area under the ROC Curve (AUC). Two statistical tests, the Friedman Test and the Wilcoxon Signed Ranks Test, are conducted for assessing different FS techniques. The results substantiate that FS techniques significantly improve the performance of various prediction models with an overall improvement of 18.58%, 129.73%, 80.00%, and 45.76% in the median values of Accuracy, G-Mean, Balance, and AUC, respectively, for all the datasets taken together. The Friedman test advocates the supremacy of the SU FS technique. The Wilcoxon Signed Ranks test shows that the SU FS technique is significantly superior to the CFS technique for three out of six datasets.
44

Kurniabudi, Kurniabudi, Abdul Harris, and Albertus Edward Mintaria. "Komparasi Information Gain, Gain Ratio, CFs-Bestfirst dan CFs-PSO Search Terhadap Performa Deteksi Anomali." JURNAL MEDIA INFORMATIKA BUDIDARMA 5, no. 1 (January 22, 2021): 332. http://dx.doi.org/10.30865/mib.v5i1.2258.

Abstract:
Large data dimensionality is one of the issues in anomaly detection. One approach used to overcome large data dimensions is feature selection. An effective feature selection technique will produce the most relevant features and can improve the ability of the classification algorithm to detect attacks. There have been many studies on feature selection techniques, each using different methods and strategies to find the best and most relevant features. In this study, the Information Gain, Gain Ratio, CFs-BestFirst and CFs-PSO Search techniques were compared. The features selected by the four techniques were further validated with the Naive Bayes, k-NN and J48 classification algorithms. This study uses the ISCX CICIDS-2017 dataset. Based on the test results, the feature selection techniques affect the performance of the Naive Bayes, k-NN and J48 algorithms. Increasingly relevant and important features can improve detection performance. The test results also show that the number of features influences the processing time. CFs-BestFirst produces a smaller number of features compared to CFs-PSO Search, Information Gain and Gain Ratio, so it requires lower processing time. In addition, k-NN requires a higher processing time than Naive Bayes and J48.
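Two of the ranking criteria compared in this abstract, Information Gain and Gain Ratio, can be sketched for discrete features as follows. This is a minimal illustration using the standard textbook definitions; the feature names and toy data are hypothetical and do not come from the CICIDS-2017 dataset or from the article.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(feature, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain normalised by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy data: two discrete features and a binary label.
labels = ["attack", "attack", "normal", "normal"]
features = {
    "flag": ["S0", "S0", "SF", "SF"],       # separates the classes perfectly
    "proto": ["tcp", "udp", "tcp", "udp"],  # carries no class information
}

# Rank features from most to least informative.
ranking = sorted(
    features,
    key=lambda name: information_gain(features[name], labels),
    reverse=True,
)
```

Gain Ratio divides by the feature's own entropy, which penalises features with many distinct values that would otherwise score a high Information Gain simply by fragmenting the data.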
45

Alalhareth, Mousa, and Sung-Chul Hong. "An Improved Mutual Information Feature Selection Technique for Intrusion Detection Systems in the Internet of Medical Things." Sensors 23, no. 10 (May 22, 2023): 4971. http://dx.doi.org/10.3390/s23104971.

Abstract:
In healthcare, the Internet of Things (IoT) is used to remotely monitor patients and provide real-time diagnoses, which is referred to as the Internet of Medical Things (IoMT). This integration poses a risk from cybersecurity threats that can harm patient data and well-being. Hackers can manipulate biometric data from biosensors or disrupt the IoMT system, which is a major concern. To address this issue, intrusion detection systems (IDS) have been proposed, particularly using deep learning algorithms. However, developing IDS for IoMT is challenging due to high data dimensionality leading to model overfitting and degraded detection accuracy. Feature selection has been proposed to prevent overfitting, but the existing methods assume that feature redundancy increases linearly with the size of the selected features. Such an assumption does not hold, as the amount of information a feature carries about the attack pattern varies from feature to feature, especially when dealing with early patterns, due to data sparsity that makes it difficult to perceive the common characteristics of selected features. This negatively affects the ability of the mutual information feature selection (MIFS) goal function to estimate the redundancy coefficient accurately. To overcome this issue, this paper proposes an enhanced feature selection technique called Logistic Redundancy Coefficient Gradual Upweighting MIFS (LRGU-MIFS) that evaluates candidate features individually instead of comparing them with common characteristics of the already-selected features. Unlike the existing feature selection techniques, LRGU calculates the redundancy score of a feature using the logistic function. It increases the redundancy value based on the logistic curve, which reflects the nonlinearity of the relationship of the mutual information between features in the selected set. Then, the LRGU was incorporated into the goal function of MIFS as a redundancy coefficient. 
The experimental evaluation shows that the proposed LRGU was able to identify a compact set of significant features that outperformed those selected by the existing techniques. The proposed technique overcomes the challenge of perceiving common characteristics in cases of insufficient attack patterns and outperforms existing techniques in identifying significant features.
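The abstract builds on the MIFS goal function, in which a candidate feature is scored by its relevance to the class minus a coefficient times its redundancy with the features already selected. The following is a minimal sketch of the classic greedy form with a fixed coefficient `beta`, which is an assumption here. The logistic, gradually upweighted coefficient proposed in the article is not reproduced, since its exact form is not given in the abstract. The feature names and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mifs(features, labels, k, beta=0.5):
    """Greedy MIFS: pick the feature maximising relevance - beta * summed redundancy."""
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(features[name], labels)
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            )
            return relevance - beta * redundancy

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: "a_copy" duplicates "a"; "b" is independent of "a".
features = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],
    "a_copy": [0, 0, 1, 1, 0, 0, 1, 1],
    "b": [0, 1, 0, 1, 0, 1, 0, 1],
}
labels = [2 * a + b for a, b in zip(features["a"], features["b"])]

chosen = mifs(features, labels, k=2)
```

In this toy case all three features are equally relevant on their own, but once "a" is selected the redundancy term pushes its duplicate below "b".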
46

Najm, Assia, Abdelali Zakrani, and Abdelaziz Marzak. "Optainet-based technique for SVR feature selection and parameters optimization for software cost prediction." MATEC Web of Conferences 348 (2021): 01002. http://dx.doi.org/10.1051/matecconf/202134801002.

Abstract:
Software cost prediction is a crucial element of a project’s success because it helps project managers to efficiently estimate the effort needed for any project. Many machine learning methods exist in the literature, such as decision trees, artificial neural networks (ANN), and support vector regressors (SVR). However, many studies confirm that accurate estimation greatly depends on hyperparameter optimization and on proper input feature selection, which strongly affects the accuracy of software cost prediction models (SCPM). In this paper, we propose an enhanced model using SVR and the Optainet algorithm. Optainet is used simultaneously for (1) selecting the best set of features and (2) tuning the parameters of the SVR model. The experimental evaluation was conducted using a 30% holdout over seven datasets. The performance of the suggested model is then compared to the SVR model tuned using Optainet without feature selection. The results were also compared to the Boruta and random forest feature selection methods. The experiments show that, over all datasets, the Optainet-based method significantly improves the accuracy of the SVR model and outperforms the random forest and Boruta feature selection methods.
47

Zerhari, B., A. Ait Lehcen, and S. Mouline. "A New Horizo-Vertical Distributed Feature Selection Approach." Cybernetics and Information Technologies 18, no. 4 (November 1, 2018): 15–28. http://dx.doi.org/10.2478/cait-2018-0045.

Abstract:
Feature selection has been a very active research topic that addresses the problem of reducing dimensionality. Meanwhile, datasets are continuously growing over time in both the number of samples and the number of features. As a result, handling both irrelevant and redundant features has become a real challenge. In this paper we propose a new straightforward framework which combines the horizontal and vertical distributed feature selection techniques, called the Horizo-Vertical Distributed Feature Selection approach (HVDFS), aimed at achieving good performance as well as reducing the number of features. The effectiveness of our approach is demonstrated on three well-known datasets, compared to the centralized and the previous distributed approaches, using four well-known classifiers.
48

Rana, Bharti, Akanksha Juneja, and Ramesh Kumar Agrawal. "Relevant Feature Subset Selection from Ensemble of Multiple Feature Extraction Methods for Texture Classification." International Journal of Computer Vision and Image Processing 5, no. 1 (January 2015): 48–65. http://dx.doi.org/10.4018/ijcvip.2015010103.

Abstract:
Performance of texture classification for a given set of texture patterns depends on the choice of feature extraction technique. Integration of features from various feature extraction methods not only eliminates the risk of method selection but also brings benefits from the participating methods, which play complementary roles in representing the underlying texture pattern. However, it comes at the cost of a large feature vector which may contain redundant features. The presence of such redundant features leads to high computation time and memory requirements and may deteriorate the performance of the classifier. In this research work, in the first phase, a pool of texture features is constructed by integrating features from seven well-known feature extraction methods. In the second phase, a few popular feature subset selection techniques are investigated to determine a minimal subset of relevant features from this pool of features. In order to check the efficacy of the proposed approach, performance is evaluated on the publicly available Brodatz dataset, in terms of classification error. Experimental results demonstrate substantial improvement in classification performance over existing feature extraction techniques. Furthermore, ranking and statistical tests also strengthen the results.
49

O’Leary, Daniel, and Joel Kubby. "Feature Selection and ANN Solar Power Prediction." Journal of Renewable Energy 2017 (November 8, 2017): 1–7. http://dx.doi.org/10.1155/2017/2437387.

Abstract:
A novel method of solar power forecasting for individuals and small businesses is developed in this paper based on machine learning, image processing, and acoustic classification techniques. Increases in the production of solar power at the consumer level require automated forecasting systems to minimize loss, cost, and environmental impact for homes and businesses that produce and consume power (prosumers). These new participants in the energy market, prosumers, require new artificial neural network (ANN) performance tuning techniques to create accurate ANN forecasts. Input masking, an ANN tuning technique developed for acoustic signal classification and image edge detection, is applied to prosumer solar data to improve prosumer forecast accuracy over traditional macrogrid ANN performance tuning techniques. ANN inputs tailor time-of-day masking based on error clustering in the time domain. Results show an improvement in prediction to target correlation, the R2 value, lowering inaccuracy of sample predictions by 14.4%, with corresponding drops in mean average error of 5.37% and root mean squared error of 6.83%.
50

Bhattacharya, Abhishek, Radha Tamal Goswami, Kuntal Mukherjee, and Nhu Gia Nguyen. "An Ensemble Voted Feature Selection Technique for Predictive Modeling of Malwares of Android." International Journal of Information System Modeling and Design 10, no. 2 (April 2019): 46–69. http://dx.doi.org/10.4018/ijismd.2019040103.

Abstract:
Each Android application requires a set of permissions at installation time, and these are considered as features which can be utilized in permission-based identification of Android malware. Recently, ensemble feature selection techniques have received increasing attention over conventional techniques in different applications. In this work, a cluster-based voted ensemble feature selection technique combining five base wrapper approaches from R libraries is proposed for identifying the most prominent set of features in the predictive modeling of Android malware. The proposed method preserves both desirable properties of an ensemble feature selector: accuracy and diversity. Moreover, in this work, five different data partitioning ratios are considered and the impact of those ratios on the predictive model is measured using the coefficient of determination (r-square) and root mean square error. The proposed strategy has produced significantly better outcomes in terms of the number of selected features and classification accuracy.