Journal articles on the topic 'KNN classification'

Consult the top 50 journal articles for your research on the topic 'KNN classification.'

1

Gweon, Hyukjun, Matthias Schonlau, and Stefan H. Steiner. "The k conditional nearest neighbor algorithm for classification and class probability estimation." PeerJ Computer Science 5 (May 13, 2019): e194. http://dx.doi.org/10.7717/peerj-cs.194.

Abstract:
The k nearest neighbor (kNN) approach is a simple and effective nonparametric algorithm for classification. One of the drawbacks of kNN is that the method can only give coarse estimates of class probabilities, particularly for low values of k. To avoid this drawback, we propose a new nonparametric classification method based on nearest neighbors conditional on each class: the proposed approach calculates the distance between a new instance and the kth nearest neighbor from each class, estimates posterior probabilities of class memberships using the distances, and assigns the instance to the class with the largest posterior. We prove that the proposed approach converges to the Bayes classifier as the size of the training data increases. Further, we extend the proposed approach to an ensemble method. Experiments on benchmark data sets show that both the proposed approach and the ensemble version of the proposed approach on average outperform kNN, weighted kNN, probabilistic kNN and two similar algorithms (LMkNN and MLM-kHNN) in terms of the error rate. A simulation shows that kCNN may be useful for estimating posterior probabilities when the class distributions overlap.
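
The abstract fully specifies the per-class distance computation but not the exact posterior estimator, so the sketch below uses a simple inverse-distance normalisation as a stand-in for the paper's estimator:

```python
# Minimal sketch of the k conditional nearest neighbor (kCNN) idea: per-class
# distance to the k-th nearest neighbor, turned into posterior-like scores.
# The inverse-distance normalisation is an assumption for illustration only.
import numpy as np

def kcnn_predict(X_train, y_train, x_query, k=3, eps=1e-12):
    classes = np.unique(y_train)
    # distance from the query to the k-th nearest neighbor within each class
    # (assumes every class has at least k training points)
    d_k = np.array([
        np.sort(np.linalg.norm(X_train[y_train == c] - x_query, axis=1))[k - 1]
        for c in classes
    ])
    scores = 1.0 / (d_k + eps)           # smaller distance -> larger score (assumed form)
    posteriors = scores / scores.sum()   # normalise to a probability vector
    return classes[np.argmax(posteriors)], posteriors
```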
2

Zhang, Shichao. "Cost-sensitive KNN classification." Neurocomputing 391 (May 2020): 234–42. http://dx.doi.org/10.1016/j.neucom.2018.11.101.

3

Zhao, Puning, and Lifeng Lai. "Efficient Classification with Adaptive KNN." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 12 (May 18, 2021): 11007–14. http://dx.doi.org/10.1609/aaai.v35i12.17314.

Abstract:
In this paper, we propose an adaptive kNN method for classification, in which different k are selected for different test samples. Our selection rule is easy to implement since it is completely adaptive and does not require any knowledge of the underlying distribution. The convergence rate of the risk of this classifier to the Bayes risk is shown to be minimax optimal for various settings. Moreover, under some special assumptions, the convergence rate is especially fast and does not decay with the increase of dimensionality.
4

Zhang, Shichao, Xuelong Li, Ming Zong, Xiaofeng Zhu, and Debo Cheng. "Learning k for kNN Classification." ACM Transactions on Intelligent Systems and Technology 8, no. 3 (April 22, 2017): 1–19. http://dx.doi.org/10.1145/2990508.

5

Khairina, Nurul, Theofil Tri Saputra Sibarani, Rizki Muliono, Zulfikar Sembiring, and Muhathir Muhathir. "Identification of Pneumonia using The K-Nearest Neighbors Method using HOG Fitur Feature Extraction." JOURNAL OF INFORMATICS AND TELECOMMUNICATION ENGINEERING 5, no. 2 (January 26, 2022): 562–68. http://dx.doi.org/10.31289/jite.v5i2.6216.

Abstract:
Pneumonia is a lung infection generally caused by viruses, bacteria, or fungi, and it can be fatal. The K-Nearest Neighbors method is a classification method that assigns the majority class among the k closest training examples. Pneumonia is often dismissed because its early symptoms resemble a common cough, so fast and accurate information from health experts is needed so that symptoms can be recognized early and handled more quickly. In this study, the K-Nearest Neighbors method is combined with HOG feature extraction to identify pneumonia more accurately, encoding expert knowledge into a computer system designed to solve the identification problem. Three KNN variants are used: Fine KNN, Cosine KNN, and Cubic KNN, evaluated by accuracy, precision, recall, and F1-score. The results showed that classification ran well with all three methods: Fine KNN reached an accuracy of 80.67%, Cosine KNN 84.93333%, and Cubic KNN 83.13333%. Fine KNN achieved precision, recall, and F1-score values of 0.794842, 0.923706, and 0.854442; Cosine KNN 0.803048, 0.954039, and 0.872056; and Cubic KNN 0.73388, 0.964561, and 0.833555. Overall, positive and negative identification of pneumonia was most accurate with Cosine KNN, which reached 84.93333%.
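
A hedged sketch of the pipeline the abstract describes: HOG features feeding a kNN classifier, using scikit-image and scikit-learn. The image variables are placeholders, the HOG parameters are illustrative, and "Fine/Cosine/Cubic KNN" are MATLAB presets approximated here by the cosine-metric variant only:

```python
import numpy as np
from skimage.feature import hog
from skimage.transform import resize
from sklearn.neighbors import KNeighborsClassifier

def hog_features(images, size=(128, 128)):
    # images: iterable of 2-D grayscale arrays; HOG cell sizes are placeholders
    return np.array([hog(resize(img, size),
                         pixels_per_cell=(16, 16),
                         cells_per_block=(2, 2)) for img in images])

# train_images, train_labels and test_images are hypothetical placeholders.
X_train = hog_features(train_images)
cosine_knn = KNeighborsClassifier(n_neighbors=10, metric='cosine')
cosine_knn.fit(X_train, train_labels)
pred = cosine_knn.predict(hog_features(test_images))
```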
6

Raeisi Shahraki, Hadi, Saeedeh Pourahmad, and Najaf Zare. "K Important Neighbors: A Novel Approach to Binary Classification in High Dimensional Data." BioMed Research International 2017 (2017): 1–9. http://dx.doi.org/10.1155/2017/7560807.

Abstract:
K nearest neighbors (KNN) is known as one of the simplest nonparametric classifiers, but in high dimensional settings the accuracy of KNN is degraded by nuisance features. In this study, we propose the K important neighbors (KIN) method as a novel approach for binary classification in high dimensional problems. To avoid the curse of dimensionality, we implement smoothly clipped absolute deviation (SCAD) logistic regression at the initial stage and account for the importance of each feature in constructing the dissimilarity measure, imposing each feature's contribution on the Euclidean distance as a function of its SCAD coefficient. This hybrid dissimilarity measure, which combines information from both features and distances, enjoys the good properties of SCAD penalized regression and KNN simultaneously. Simulation studies showed that, compared to KNN, KIN performs well in terms of both accuracy and dimension reduction. The proposed approach was found to be capable of eliminating nearly all of the noninformative features, owing to the oracle property of the SCAD penalized regression used in constructing the dissimilarity measure. In very sparse settings, KIN also outperforms support vector machine (SVM) and random forest (RF), the best competing classifiers.
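
scikit-learn has no SCAD penalty, so the sketch below substitutes an L1-penalised logistic regression as a stand-in for the SCAD stage; with SCAD, the weights would come from the SCAD coefficient estimates instead. The feature-weighting mechanics are otherwise as the abstract describes:

```python
# Penalised logistic regression supplies per-feature importances that reweight
# the Euclidean distance used by kNN (binary classification assumed).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

def fit_kin(X_train, y_train, k=5, C=0.1):
    sel = LogisticRegression(penalty='l1', solver='liblinear', C=C)
    sel.fit(X_train, y_train)
    w = np.abs(sel.coef_).ravel()        # importance per feature; many become 0
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train * w, y_train)        # scaled features == weighted distance
    return knn, w

# Scaling each feature by w before Euclidean kNN is equivalent to using the
# weighted distance sqrt(sum_j w_j^2 (x_j - y_j)^2).
```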
7

Yang, Zhida, Peng Liu, and Yi Yang. "Convective/Stratiform Precipitation Classification Using Ground-Based Doppler Radar Data Based on the K-Nearest Neighbor Algorithm." Remote Sensing 11, no. 19 (September 29, 2019): 2277. http://dx.doi.org/10.3390/rs11192277.

Abstract:
Stratiform and convective rain types are associated with different cloud physical processes, vertical structures, thermodynamic influences and precipitation types. Distinguishing convective and stratiform systems is beneficial to meteorology research and weather forecasting. However, there is no clear boundary between stratiform and convective precipitation. In this study, a machine learning algorithm, K-nearest neighbor (KNN), is used to classify precipitation types. Six Doppler radar (WSR-98D/SA) data sets from Jiangsu, Guangzhou and Anhui Provinces in China were used as training and classification samples, and the 2A23 product of the Tropical Precipitation Measurement Mission (TRMM) was used to obtain the training labels and evaluate the classification performance. Classifying precipitation types using KNN requires three steps. First, features are selected from the radar data by comparing the range of each variable for different precipitation types. Second, the same unclassified samples are classified with different k values to choose the best-performing k. Finally, the unclassified samples are put into the KNN algorithm with the best k to classify precipitation types, and the classification performance is evaluated. Three types of cases, squall line, embedded convective and stratiform cases, are classified by KNN. The KNN method can accurately classify the location and area of stratiform and convective systems. For stratiform classifications, KNN has a 95% probability of detection, 8% false alarm rate, and 87% cumulative success index; for convective classifications, KNN yields a 78% probability of detection, a 13% false alarm rate, and a 69% cumulative success index. These results imply that KNN can correctly classify almost all stratiform precipitation and most convective precipitation types. This result suggests that KNN has great potential in classifying precipitation types.
8

Ganatra, Dr Dhimant. "Improving classification accuracy: The KNN approach." International Journal of Advanced Trends in Computer Science and Engineering 9, no. 4 (August 25, 2020): 6147–50. http://dx.doi.org/10.30534/ijatcse/2020/287942020.

9

Su, Yixin, and Sheng-Uei Guan. "Density and Distance Based KNN Approach to Classification." International Journal of Applied Evolutionary Computation 7, no. 2 (April 2016): 45–60. http://dx.doi.org/10.4018/ijaec.2016040103.

Abstract:
The KNN algorithm is a simple and efficient algorithm for classification problems. However, it encounters difficulties on datasets with non-uniform density distributions: the standard KNN voting mechanism may lose essential information by considering the majority only, degrading performance on unevenly distributed data. Another drawback is that KNN treats all participating candidates equally when judging a test datum. To overcome these weaknesses, a Region of Influence Based KNN (RI-KNN) is proposed. RI-KNN computes for each training datum region-of-influence information based on its nearby data (i.e., locality information), so that each training datum encodes some locality information from its region. Information from both the training and testing stages contributes to the weighting formula. By solving these two problems, RI-KNN is shown to outperform KNN on several artificial and real datasets without sacrificing much time cost on nearly all tested datasets.
10

Boyko, Nataliya I., and Mykhaylo V. Muzyka. "Methods of analysis of multimodal data to increase the accuracy of classification." Applied Aspects of Information Technology 5, no. 2 (July 4, 2022): 147–60. http://dx.doi.org/10.15276/aait.05.2022.11.

Abstract:
This paper proposes methods for analyzing multimodal data that help improve the overall accuracy of K-Nearest Neighbor (KNN) classification while minimizing its risk. The mechanism for increasing the accuracy of KNN classification is considered. The research methods used in this work are comparison, analysis, induction, and experiment. The aim was to improve the accuracy of KNN classification by comparing existing algorithms and applying new methods. Many literature and media sources on classification by the k nearest neighbors algorithm were analyzed, and the most interesting variations of the algorithm were selected. Emphasis is placed on achieving maximum classification accuracy by comparing existing methods and improving the choice of the number k and the search for the nearest class. Algorithms with and without data analysis and preprocessing are also compared. All the strategies discussed in this article are evaluated purely empirically. An experimental classification by k nearest neighbors with different variations was performed, using two data sets of different sizes, with the value of k and the test sample size as classification arguments. The paper studies three variants of the k nearest neighbors algorithm: the classical KNN, KNN with the lowest average, and hybrid KNN. These algorithms are compared for different test sample sizes and different values of k. The article also analyzes the data before classification. As for selecting the number k, no simple method gives the maximum result with great accuracy. The essence of the algorithm is to find the k objects closest to the sample among objects already assigned to predefined, numbered classes. Then, among these k objects, the frequency of each class is counted and the most common class is assigned to the selected object. If two classes occur equally often and most frequently, the class with the smaller number is assigned.
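
The voting rule spelled out at the end of the abstract translates directly into code; this sketch assumes integer class labels so "smaller number" is well defined:

```python
# Majority vote among the k nearest neighbors, ties broken in favour of the
# class with the smaller number.
import numpy as np

def vote_with_tiebreak(X_train, y_train, x_query, k=5):
    order = np.argsort(np.linalg.norm(X_train - x_query, axis=1))[:k]
    labels, counts = np.unique(y_train[order], return_counts=True)
    # np.unique returns labels sorted ascending, and argmax takes the first
    # maximum, so ties resolve to the smaller class number automatically.
    return labels[np.argmax(counts)]
```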
11

Boro, Debojit, and Dhruba K. Bhattacharyya. "Particle swarm optimisation-based KNN for improving KNN and ensemble classification performance." International Journal of Innovative Computing and Applications 6, no. 3/4 (2015): 145. http://dx.doi.org/10.1504/ijica.2015.073004.

12

McHugh, E. S., A. P. Shinn, and J. W. Kay. "Discrimination of the notifiable pathogen Gyrodactylus salaris from G. thymalli (Monogenea) using statistical classifiers applied to morphometric data." Parasitology 121, no. 3 (September 2000): 315–23. http://dx.doi.org/10.1017/s0031182099006381.

Abstract:
The identification and discrimination of 2 closely related and morphologically similar species of Gyrodactylus, G. salaris and G. thymalli, were assessed using the statistical classification methodologies Linear Discriminant Analysis (LDA) and k-Nearest Neighbours (KNN). These statistical methods were applied to morphometric measurements made on the gyrodactylid attachment hooks. The mean estimated classification percentages of correctly identifying each species were 98·1% (LDA) and 97·9% (KNN) for G. salaris and 99·9% (LDA) and 73·2% (KNN) for G. thymalli. The analysis was expanded to include another 2 closely related species and the new classification efficiencies were 94·6% (LDA) and 98·0% (KNN) for G. salaris; 98·2% (LDA) and 72·6% (KNN) for G. thymalli; 86·7% (LDA) and 91·8% (KNN) for G. derjavini; and 76·5% (LDA) and 77·7% (KNN) for G. truttae. The higher correct classification scores of G. salaris and G. thymalli by the LDA classifier in the 2-species analysis over the 4-species analysis suggested the development of a 2-stage classifier. The mean estimated correct classification scores were 99·97% (LDA) and 99·99% (KNN) for the G. salaris–G. thymalli pairing and 99·4% (LDA) and 99·92% (KNN) for the G. derjavini–G. truttae pairing. Assessment of the 2-stage classifier using only marginal hook data was very good with classification efficiencies of 100% (LDA) and 99·6% (KNN) for the G. salaris–G. thymalli pairing and 97·2% (LDA) and 99·2% (KNN) for the G. derjavini–G. truttae pairing. Paired species were then discriminated individually in the second stage of the classifier using data from the full set of hooks. These analyses demonstrate that using the methods of LDA and KNN statistical classification, the discrimination of closely related and pathogenic species of Gyrodactylus may be achieved using data derived from light microscope studies.
13

Pandey, Shubham, Vivek Sharma, and Garima Agrawal. "Modification of KNN Algorithm." International Journal of Engineering and Computer Science 8, no. 11 (November 28, 2019): 24869–77. http://dx.doi.org/10.18535/ijecs/v8i11.4383.

Abstract:
K-Nearest Neighbor (KNN) classification is one of the most fundamental and simple classification methods. It is among the most frequently used classification algorithms when there is little or no prior knowledge about the distribution of the data. In this paper a modification is proposed to improve the performance of KNN. The main idea is to use a set of robust neighbors in the training data; the modified KNN proposed here is better than traditional KNN in both robustness and performance. Inspired by the traditional KNN algorithm, the method classifies an input query according to the most frequent tag among its neighbors, with the tag closest to the new tuple having the greatest say. The proposed modified KNN can be considered a kind of weighted KNN, in which the query label is approximated by weighting the neighbors of the query: the procedure computes the frequency of each label among the neighbors, with each neighbor's contribution multiplied by a factor inversely proportional to its distance from the new tuple. The proposed method is evaluated on several standard UCI data sets. Experiments show a significant improvement in the performance of the KNN method.
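
A sketch of the inverse-distance weighting the abstract describes; scikit-learn's `KNeighborsClassifier(weights='distance')` implements the same idea if a library version is preferred:

```python
# Each neighbor's vote is scaled by a factor inversely proportional to its
# distance to the new tuple, so label scores are distance-weighted frequencies.
import numpy as np

def modified_knn_predict(X_train, y_train, x_query, k=5, eps=1e-12):
    d = np.linalg.norm(X_train - x_query, axis=1)
    idx = np.argsort(d)[:k]
    scores = {}
    for i in idx:
        scores[y_train[i]] = scores.get(y_train[i], 0.0) + 1.0 / (d[i] + eps)
    return max(scores, key=scores.get)
```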
14

Najadat, Hassan, Rasha Obeidat, and Ismail Hmeidi. "Clustering Generalised Instances Set Approaches for Text Classification." Journal of Information & Knowledge Management 10, no. 01 (March 2011): 91–107. http://dx.doi.org/10.1142/s0219649211002857.

Abstract:
This paper introduces three new text classification methods: Clustering-Based Generalised Instances Set (CB-GIS), Multilevel Clustering-Based Generalised Instances Set (MLC-GIS) and Multilevel Clustering-Based k Nearest Neighbours (MLC-kNN). These new methods aim to unify the strengths and overcome the drawbacks of the three similarity-based text classification methods, namely kNN, centroid-based and GIS. The new methods utilise a clustering technique called spherical K-means to represent each class by a representative set of generalised instances to be used later in the classification. The CB-GIS method applies a flat clustering method while MLC-GIS and MLC-kNN apply multilevel clustering. Extensive experiments have been conducted to evaluate the new methods and compare them with the kNN, centroid-based and GIS classifiers on the Reuters-21578(10) benchmark dataset, in terms of both classification performance and classification efficiency. The experimental results show that the top-performing classification method is the MLC-kNN classifier, followed by the MLC-GIS and CB-GIS classifiers. According to the best micro-averaged F1 scores, the new methods (CB-GIS, MLC-GIS, MLC-kNN) have improvements of 4.48%, 4.65% and 4.76% over kNN, 1.84%, 1.92% and 2.12% over the centroid-based classifier, and 5.26%, 5.34% and 5.45% over GIS, respectively. With respect to the best macro-averaged F1 scores, the new methods have improvements of 10.29%, 10.19% and 10.45% over kNN, 0.1%, 0.03% and 0.29% over the centroid-based classifier, and 3.75%, 3.68% and 3.94% over GIS, respectively.
15

Xing, Wenchao, and Yilin Bei. "Medical Health Big Data Classification Based on KNN Classification Algorithm." IEEE Access 8 (2020): 28808–19. http://dx.doi.org/10.1109/access.2019.2955754.

16

Kumar, N. Suresh, and Pothina Praveena. "Evolution of hybrid distance based kNN classification." IAES International Journal of Artificial Intelligence (IJ-AI) 10, no. 2 (June 1, 2021): 510. http://dx.doi.org/10.11591/ijai.v10.i2.pp510-518.

Abstract:
The evolution of classification for opinion mining and user review analysis spans decades, reaching into ubiquitous computing in efforts such as movie review analysis. The performance of linear and non-linear models is discussed for classifying the positive and negative reviews in movie data sets. The effectiveness of linear and non-linear algorithms is tested and compared in terms of average accuracy by implementing them on the Internet Movie Database (IMDb). The hybrid kNN model optimizes classification performance in terms of accuracy: the polarity prediction rate is improved with random-distance-weighted-kNN-ABC when compared with the kNN algorithm applied alone.
17

Abdul, Zrar Kh, Abdulbasit K. Al‑Talabani, Chnoor M. Rahman, and Safar M. Asaad. "Electrocardiogram Heartbeat Classification using Convolutional Neural Network-k Nearest Neighbor." ARO-THE SCIENTIFIC JOURNAL OF KOYA UNIVERSITY 12, no. 1 (February 29, 2024): 61–67. http://dx.doi.org/10.14500/aro.11444.

Abstract:
Electrocardiogram (ECG) analysis is widely used by cardiologists and medical practitioners for monitoring cardiac health. A high-performance automatic ECG classification system is challenging because it is difficult to detect and categorize the different waveforms in the signal, especially in manual analysis of ECG signals, so a better classification system is needed in terms of performance and accuracy. Hence, in this paper, the authors propose an accurate ECG classification and monitoring system called convolutional neural network-k nearest neighbor (CNN-kNN). The proposed method utilizes a 1D-CNN and kNN. Unlike the existing techniques, the examined technique does not need training while classifying the ECG signals. The CNN-kNN is evaluated against the PhysioNet MIT-BIH and PTB diagnostics datasets. The CNN is fed with the raw ECG beat signal directly. In addition, the learned features are extracted from the 1D-CNN model, their dimensionality is reduced using two fully connected layers, and they are then fed to the k-NN classifier. The CNN-kNN model achieved average accuracies of 98% and 97.4% on arrhythmia and myocardial infarction classification, respectively. These results are evidence of the strong ability of the proposed model compared to the models discussed in this article.
18

Aung, Swe Swe, Itaru Nagayama, and Shiro Tamaki. "Dual-kNN for a Pattern Classification Approach." IEIE Transactions on Smart Processing & Computing 6, no. 5 (October 31, 2017): 326–33. http://dx.doi.org/10.5573/ieiespc.2017.6.5.326.

19

Kaur, Gagandeep, Anshu Sharma, and Anurag Sharma. "Heart Disease Prediction using KNN classification approach." International Journal of Computer Sciences and Engineering 7, no. 5 (May 31, 2019): 416–20. http://dx.doi.org/10.26438/ijcse/v7i5.416420.

20

Jia, Bin-Bin, and Min-Ling Zhang. "Multi-Dimensional Classification via kNN Feature Augmentation." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 3975–82. http://dx.doi.org/10.1609/aaai.v33i01.33013975.

Abstract:
Multi-dimensional classification (MDC) deals with the problem where one instance is associated with multiple class variables, each of which specifies its class membership w.r.t. one specific class space. Existing approaches learn from MDC examples by focusing on modeling dependencies among class variables, while the potential usefulness of manipulating the feature space hasn't been investigated. In this paper, a first attempt towards feature manipulation for MDC is proposed, which enriches the original feature space with kNN-augmented features. Specifically, simple counting statistics on the class memberships of neighboring MDC examples are used to generate the augmented feature vector. In this way, discriminative information from the class spaces is encoded into the feature space to help train the multi-dimensional classification model. To validate the effectiveness of the proposed feature augmentation technique, extensive experiments over eleven benchmark data sets as well as four state-of-the-art MDC approaches are conducted. Experimental results clearly show that, compared to the original feature space, the classification performance of existing MDC approaches can be significantly improved by incorporating kNN-augmented features.
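
A minimal sketch of the augmentation step as described: for each class space, append the counts of the class memberships of each instance's k nearest neighbors to its original features. Function and parameter names are illustrative:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def augment_with_knn_counts(X, Y, k=5):
    # X: (n, d) features; Y: (n, q) class labels, one column per class space
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)           # first neighbor is the point itself
    idx = idx[:, 1:]
    extra = []
    for j in range(Y.shape[1]):         # one block of counts per class space
        classes = np.unique(Y[:, j])
        counts = np.array([[np.sum(Y[idx[i], j] == c) for c in classes]
                           for i in range(X.shape[0])])
        extra.append(counts)
    return np.hstack([X] + extra)       # original features + neighbor counts
```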
21

Yousif, Yousif Elfatih. "Weather Prediction System Using KNN Classification Algorithm." European Journal of Information Technologies and Computer Science 2, no. 1 (February 10, 2022): 10–13. http://dx.doi.org/10.24018/compute.2022.2.1.44.

Abstract:
Data mining is a technology that facilitates extracting relevant, related items from a set of data. It is the process of analyzing data from different perspectives and discovering problems, patterns, and correlations in data sets that are useful for predicting outcomes and making correct decisions. Weather prediction is a field of meteorology built by collecting dynamic data on the current state of the weather, such as temperature, humidity, rainfall, and wind. In this paper, we designed a system that uses the k-Nearest Neighbors classification algorithm to predict weather from previous data, determining the expected temperature and humidity. The predictions were compared with real observations, and the agreement was good and acceptable.
22

Ghauri, Sajjad Ahmed. "KNN Based Classification of Digital Modulated Signals." IIUM Engineering Journal 17, no. 2 (November 30, 2016): 71–82. http://dx.doi.org/10.31436/iiumej.v17i2.641.

Abstract:
Demodulation without knowledge of the modulation scheme requires Automatic Modulation Classification (AMC). When the receiver has limited information about the received signal, AMC becomes an essential process. AMC finds an important place in many civil and military fields, such as modern electronic warfare, interfering source recognition, frequency management, and link adaptation. In this paper we explore the use of K-nearest neighbor (KNN) for modulation classification with different distance measurement methods. Five modulation schemes are used for classification: Binary Phase Shift Keying (BPSK), Quadrature Phase Shift Keying (QPSK), Quadrature Amplitude Modulation (QAM), 16-QAM and 64-QAM. Higher order cumulants (HOC) are used as the input feature set to the classifier. Simulation results show that the proposed classification method provides better results for the considered modulation formats.
23

Jia, Bin-Bin, and Min-Ling Zhang. "Multi-dimensional classification via kNN feature augmentation." Pattern Recognition 106 (October 2020): 107423. http://dx.doi.org/10.1016/j.patcog.2020.107423.

24

Wang, Shi Min, and Xian Zhe Cao. "Research on Text Classification Technique." Applied Mechanics and Materials 278-280 (January 2013): 2081–84. http://dx.doi.org/10.4028/www.scientific.net/amm.278-280.2081.

Abstract:
The paper first introduces the basic concepts and the generic process of text classification, then describes three text classification algorithms in detail, and finally evaluates NB, SVM and KNN experimentally with the data mining software Weka. The experimental results show that KNN is more efficient than the other two algorithms in recall and precision.
25

Desiani, Anita, Adinda Ayu Lestari, M. Al-Ariq, Ali Amran, and Yuli Andriani. "Comparison of Support Vector Machine and K-Nearest Neighbors in Breast Cancer Classification." Pattimura International Journal of Mathematics (PIJMath) 1, no. 1 (May 1, 2022): 33–42. http://dx.doi.org/10.30598/pijmathvol1iss1pp33-42.

Abstract:
Cancer is one of the leading causes of death, and breast cancer is the second leading cause of cancer death in women. One way to assess the malignancy of breast cancer at an early stage is to classify it using data mining. Two widely used data mining methods with good accuracy are the Support Vector Machine (SVM) and K-Nearest Neighbors (KNN). Percentage-split and cross-validation evaluation techniques were used to evaluate and compare the SVM and KNN classification models. The SVM classifier was more accurate than KNN under cross-validation, at 95.7081%, while KNN was more accurate than SVM under the percentage split, at 95.4220%. The comparison shows that both the KNN and SVM methods work well for breast cancer classification.
26

Ma, Manfu, Wei Deng, Hongtong Liu, and Xinmiao Yun. "An Intrusion Detection Model based on Hybrid Classification algorithm." MATEC Web of Conferences 246 (2018): 03027. http://dx.doi.org/10.1051/matecconf/201824603027.

Abstract:
Because a single classification algorithm cannot meet the performance requirements of intrusion detection, an intrusion detection model, KNN-NB, based on a hybrid of KNN and naive Bayes is proposed, combining KNN's strength on numerical values with naive Bayes' advantage on structured data. The model first preprocesses the NSL-KDD intrusion detection data set. Then, exploiting the advantages of the KNN algorithm on numerical data, it computes the distance between samples according to the feature items and selects the K samples with the smallest distance. Finally, naive Bayes produces the final result. Experimental results on the NSL-KDD dataset show that the KNN-NB algorithm achieves more balanced performance than the traditional KNN and naive Bayes algorithms in terms of accuracy, sensitivity, false detection rate, specificity, and missed detection rate.
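
A compressed sketch of the two-step decision the abstract describes. GaussianNB is an assumption for illustration; the abstract does not state which naive Bayes variant is used on the preprocessed NSL-KDD features:

```python
# Find the k training samples closest to the query, then let a naive Bayes
# model fitted on just those neighbors produce the final label.
import numpy as np
from sklearn.naive_bayes import GaussianNB

def knn_nb_predict(X_train, y_train, x_query, k=25):
    idx = np.argsort(np.linalg.norm(X_train - x_query, axis=1))[:k]
    # k should be large enough that the neighborhood is not degenerate
    nb = GaussianNB().fit(X_train[idx], y_train[idx])
    return nb.predict(x_query.reshape(1, -1))[0]
```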
27

Sajib, AH, AZM Shafiullah, and AH Sumon. "An Alternative Algorithm for Classification Based on Robust Mahalanobis Distance." Dhaka University Journal of Science 61, no. 1 (May 27, 2013): 81–85. http://dx.doi.org/10.3329/dujs.v61i1.15101.

Abstract:
This study considers the classification problem for a binary output attribute when the input attributes are drawn from a multivariate normal distribution, in both the clean and contaminated cases. Classical metrics are affected by outliers, while robust metrics are computationally inefficient. To achieve robustness and computational efficiency at the same time, we propose a new robust distance metric for the K-Nearest Neighbor (KNN) method, which we call the Alternative Robust Mahalanobis Distance (ARMD) metric; KNN using ARMD is the alternative KNN method. Classical metrics use a non-robust estimate (the mean) as a building block. To construct the proposed ARMD metric, we replace the mean with its robust counterpart, the median. Our simulation studies show that the proposed alternative KNN method gives better results on contaminated data than classical KNN, and performs similarly to classical KNN with an existing robust metric. The major advantage of the proposed method is that it requires less computing time than classical KNN with the existing robust metric.
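
The abstract names the substitution (median for mean) but not the full construction, so the following is one plausible reading, offered purely as an assumption: estimate the scatter matrix around the coordinate-wise median and use it in the usual quadratic-form distance for kNN:

```python
import numpy as np

def armd_metric(X_train):
    # Assumed reading of ARMD: centre the data at the median, not the mean.
    med = np.median(X_train, axis=0)
    Z = X_train - med                    # median-centred data
    S = Z.T @ Z / (len(X_train) - 1)     # scatter around the median
    S_inv = np.linalg.pinv(S)            # pseudo-inverse for stability
    def dist(a, b):
        diff = a - b
        return np.sqrt(diff @ S_inv @ diff)
    return dist
```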
28

Taguelmimt, Redha, and Rachid Beghdad. "DS-kNN." International Journal of Information Security and Privacy 15, no. 2 (April 2021): 131–44. http://dx.doi.org/10.4018/ijisp.2021040107.

Abstract:
On one hand, there are many proposed intrusion detection systems (IDSs) in the literature. On the other hand, many studies try to deduce the important features that can best detect attacks. This paper presents a new and an easy-to-implement approach to intrusion detection, named distance sum-based k-nearest neighbors (DS-kNN), which is an improved version of k-NN classifier. Given a data sample to classify, DS-kNN computes the distance sum of the k-nearest neighbors of the data sample in each of the possible classes of the dataset. Then, the data sample is assigned to the class having the smallest sum. The experimental results show that the DS-kNN classifier performs better than the original k-NN algorithm in terms of accuracy, detection rate, false positive, and attacks classification. The authors mainly compare DS-kNN to CANN, but also to SVM, S-NDAE, and DBN. The obtained results also show that the approach is very competitive.
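
The classification rule is stated completely in the abstract and is easy to sketch:

```python
# DS-kNN: for each candidate class, sum the distances to the k nearest
# neighbors drawn from that class, and assign the class with the smallest sum.
import numpy as np

def ds_knn_predict(X_train, y_train, x_query, k=5):
    best_class, best_sum = None, np.inf
    for c in np.unique(y_train):
        # assumes each class has at least k training samples
        d = np.sort(np.linalg.norm(X_train[y_train == c] - x_query, axis=1))[:k]
        if d.sum() < best_sum:
            best_class, best_sum = c, d.sum()
    return best_class
```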
29

Demidova, Liliya A. "Two-Stage Hybrid Data Classifiers Based on SVM and kNN Algorithms." Symmetry 13, no. 4 (April 7, 2021): 615. http://dx.doi.org/10.3390/sym13040615.

Abstract:
The paper considers the problem of developing two-stage hybrid SVM-kNN classifiers that increase data classification quality by refining classification decisions near the class boundary defined by the SVM classifier. In the first stage, an SVM classifier with default parameter values is developed, with the training dataset designed on the basis of the initial dataset; either a binary SVM or a one-class SVM algorithm is used. Based on the results of training the SVM classifier, two variants of the training dataset are formed for developing the kNN classifier: one uses all objects from the original training dataset located inside the strip dividing the classes, and the other uses only those objects from the initial training dataset located inside the area containing all misclassified objects from the class-dividing strip. In the second stage, the kNN classifier is developed using this new training dataset, with its parameter values determined during training to maximize classification quality. The classification quality of the two-stage hybrid SVM-kNN classifier was assessed with various indicators on the test dataset. When the kNN classifier improves the quality of classification near the class boundary defined by the SVM classifier, the two-stage hybrid SVM-kNN classifier is recommended for further use. The experimental results obtained on various datasets confirm the feasibility of using two-stage hybrid SVM-kNN classifiers for the data classification problem.
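
A compressed sketch of the two-stage idea under simplifying assumptions: binary labels, an RBF SVM with default parameters, and the class-dividing strip taken as |decision function| < 1:

```python
# Confident cases are decided by the SVM; queries inside the strip are
# re-decided by a kNN trained on training points that fall inside the strip.
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

def fit_svm_knn(X_train, y_train, k=5):
    svm = SVC(kernel='rbf').fit(X_train, y_train)         # binary assumed
    in_strip = np.abs(svm.decision_function(X_train)) < 1.0
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train[in_strip],
                                                  y_train[in_strip])
    def predict(X):
        y = svm.predict(X)
        near = np.abs(svm.decision_function(X)) < 1.0
        if near.any():
            y[near] = knn.predict(X[near])                # refine near boundary
        return y
    return predict
```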
30

Wang, Chun Ye, and Xiao Feng Zhou. "The MapReduce Parallel Study of KNN Algorithm." Advanced Materials Research 989-994 (July 2014): 2123–27. http://dx.doi.org/10.4028/www.scientific.net/amr.989-994.2123.

Abstract:
Although parallelizing the KNN algorithm improves its classification efficiency, the computation of the parallel algorithm still grows with the scale of the training sample data, affecting classification efficiency. To address this shortcoming, this paper improves the original parallel KNN algorithm in the MapReduce model by adding a text pretreatment step to improve the classification efficiency of the algorithm.
31

Kette, Efraim Kurniawan Dairo. "Modified Correlation Weight K-Nearest Neighbor Classifier Using Training Dataset Cleaning Method." Indonesian Journal of Physics 32, no. 2 (December 28, 2021): 20–25. http://dx.doi.org/10.5614/itb.ijp.2021.32.2.5.

Abstract:
In pattern recognition, the k-Nearest Neighbor (kNN) algorithm is the simplest non-parametric algorithm. Due to its simplicity, model cases and the quality of the training data itself usually influence kNN classification performance. Therefore, this article proposes a sparse correlation weight model, combined with the Training Data Set Cleaning (TDC) method by Classification Ability Ranking (CAR), called the CAR classification method based on Coefficient-Weighted kNN (CAR-CWKNN), to improve kNN classifier performance. Correlation weights in Sparse Representation (SR) have been proven to increase classification accuracy. SR can reveal the 'neighborhood' structure of the data, which is why it is very suitable for nearest-neighbor classification. The Classification Ability (CA) function is applied to rank and select the best training samples in the cleaning stage. The Leave One Out (LV1) concept in the CA works by removing data considered likely to yield wrong classification results from the original training data, thereby reducing the influence of training data quality on kNN classification performance. Experiments with four public UCI data sets related to classification problems show that the CAR-CWKNN method provides better performance in terms of accuracy.
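
The cleaning stage is close in spirit to leave-one-out editing; the paper's CA ranking is more elaborate, so the following Wilson-style edit is only a simplified stand-in:

```python
# Drop training points whose own nearest neighbors (excluding the point
# itself) would misclassify them.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def loo_clean(X, y, k=5):
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    keep = []
    for i, row in enumerate(idx):
        labels, counts = np.unique(y[row[1:]], return_counts=True)  # skip self
        if labels[np.argmax(counts)] == y[i]:
            keep.append(i)       # neighbors agree with the point's own label
    return X[keep], y[keep]
```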
32

Sasikala, R., and S. P. Swornambiga. "Novel K-Nearest Neighbor with Convolutional Neural Networks (KNN-CNN) for Accurate Brain Tumor Detection in Image Mining." Tuijin Jishu/Journal of Propulsion Technology 44, no. 4 (October 26, 2023): 2090–99. http://dx.doi.org/10.52783/tjjpt.v44.i4.1184.

Abstract:
Brain tumor classification plays a crucial role in early diagnosis and effective treatment planning. In this paper, we propose a novel approach, K-Nearest Neighbor with Convolutional Neural Networks (KNN-CNN), for accurate brain tumor classification. The proposed method combines the strengths of K-Nearest Neighbor (KNN) and Convolutional Neural Networks (CNNs) to leverage both traditional feature-based classification and deep learning-based feature extraction. We use CNNs to learn high-level features from brain tumor images, and KNN is employed to classify tumors based on the extracted features. The experimental results on a brain tumor dataset demonstrate the effectiveness and efficiency of the KNN-CNN approach, achieving high classification accuracy and outperforming traditional methods.
33

Barth, Jackson, Duwani Katumullage, Chenyu Yang, and Jing Cao. "Classification of Wines Using Principal Component Analysis." Journal of Wine Economics 16, no. 1 (February 2021): 56–67. http://dx.doi.org/10.1017/jwe.2020.35.

Abstract:
Classification of wines with a large number of correlated covariates may lead to classification results that are difficult to interpret. In this study, we use a publicly available dataset on wines from three known cultivars, where there are 13 highly correlated variables measuring chemical compounds of wines. The goal is to produce an efficient classifier with straightforward interpretation to shed light on the important features of wines in the classification. To achieve the goal, we incorporate principal component analysis (PCA) in the k-nearest neighbor (kNN) classification to deal with the serious multicollinearity among the explanatory variables. PCA can identify the underlying dominant features and provide a more succinct and straightforward summary over the correlated covariates. The study shows that kNN combined with PCA yields a much simpler and interpretable classifier that has comparable performance with kNN based on all the 13 variables. The appropriate number of principal components is chosen to strike a balance between predictive accuracy and simplicity of interpretation. Our final classifier is based on only two principal components, which can be interpreted as the strength of taste and level of alcohol and fermentation in wines, respectively. (JEL Classifications: C10, C14, D83)
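
The pipeline is simple enough to sketch directly; the wine data loaded here is scikit-learn's copy of the same UCI dataset, and k = 5 is a placeholder rather than the paper's choice:

```python
# Project the 13 correlated measurements onto two principal components, then
# run kNN in that plane, evaluated by 10-fold cross-validation.
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)
clf = make_pipeline(StandardScaler(), PCA(n_components=2),
                    KNeighborsClassifier(n_neighbors=5))
print(cross_val_score(clf, X, y, cv=10).mean())
```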
34

Prasanti, Annisya Aprilia, M. Ali Fauzi, and Muhammad Tanzil Furqon. "Neighbor Weighted K-Nearest Neighbor for Sambat Online Classification." Indonesian Journal of Electrical Engineering and Computer Science 12, no. 1 (October 1, 2018): 155. http://dx.doi.org/10.11591/ijeecs.v12.i1.pp155-160.

Abstract:
Sambat Online is one implementation of e-government for complaint management provided by the Malang City Government. Each complaint must be classified into its intended department. In this study, an automatic complaint classification system using Neighbor Weighted K-Nearest Neighbor (NW-KNN) is proposed because Sambat Online has imbalanced data. The system consists of three main stages: preprocessing, N-gram feature extraction, and classification using NW-KNN. Based on the experimental results, the NW-KNN algorithm is able to classify the imbalanced data well, with the optimal number of neighbors k = 3 and unigrams as the best features, achieving 77.85% precision, 74.18% recall, and a 75.25% f-measure. Compared to conventional KNN, the NW-KNN algorithm also proved better for imbalanced data problems, albeit by a slight margin.
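
A hedged sketch of neighbor-weighted voting for imbalanced data: votes from large classes are down-weighted so small classes are not drowned out. The power-law weight with exponent e follows the general NW-KNN recipe but is an assumption here, not necessarily the paper's exact formula:

```python
import numpy as np

def nwknn_predict(X_train, y_train, x_query, k=3, e=4.0):
    classes, sizes = np.unique(y_train, return_counts=True)
    # class weight shrinks as the class gets larger (assumed power-law form)
    w = {c: (sizes.min() / n) ** (1.0 / e) for c, n in zip(classes, sizes)}
    idx = np.argsort(np.linalg.norm(X_train - x_query, axis=1))[:k]
    scores = {c: 0.0 for c in classes}
    for i in idx:
        scores[y_train[i]] += w[y_train[i]]   # class-size-weighted vote
    return max(scores, key=scores.get)
```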
35

Muhammad Noor Mathivanan, Norsyela, Nor Azura Md. Ghani, and Roziah Mohd Janor. "Improving Classification Accuracy Using Clustering Technique." Bulletin of Electrical Engineering and Informatics 7, no. 3 (September 1, 2018): 465–70. http://dx.doi.org/10.11591/eei.v7i3.1272.

Abstract:
Product classification is a key issue in e-commerce domains. Many products are released to the market rapidly, and selecting the correct category in the taxonomy for each product has become a challenging task. Classification models are useful for classifying products precisely. This study proposes applying clustering prior to classification, using a large-scale real-world data set to assess how much the clustering technique improves the classification model. Conventional text classification procedures such as preprocessing, feature extraction and feature selection are applied before the clustering technique. Results show that clustering improves the accuracy of the classification model. The best classification model for all three approaches (classification only, classification with hierarchical clustering, and classification with K-means clustering) is the K-Nearest Neighbor (KNN) model. Even though the accuracy of the KNN models is the same across approaches, the KNN model with K-means clustering had the shortest execution time. Hence, applying K-means clustering prior to the KNN model helps reduce computation time.
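
One reading of "K-means prior to KNN" that matches the reported speed-up is to restrict each query's neighbor search to its nearest cluster; the sketch below follows that assumption and is not necessarily the paper's exact pipeline:

```python
# Cluster the training data, then answer each query with a kNN restricted to
# the query's nearest cluster, which shrinks the search set.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

class ClusteredKNN:
    def __init__(self, n_clusters=10, k=5):
        self.km = KMeans(n_clusters=n_clusters, n_init=10)
        self.k = k

    def fit(self, X, y):
        # assumes every cluster ends up non-empty and larger than k
        labels = self.km.fit_predict(X)
        self.models = {c: KNeighborsClassifier(n_neighbors=self.k)
                            .fit(X[labels == c], y[labels == c])
                       for c in range(self.km.n_clusters)}
        return self

    def predict(self, X):
        cl = self.km.predict(X)          # route each query to its cluster
        return np.array([self.models[c].predict(x.reshape(1, -1))[0]
                         for c, x in zip(cl, X)])
```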
36

Prabavathy, S. "Classification of Musical Instruments Sound Using Pre-Trained Model with Machine Learning Techniques." Asian Journal of Electrical Sciences 9, no. 1 (May 5, 2020): 45–48. http://dx.doi.org/10.51983/ajes-2020.9.1.2369.

Abstract:
Classifying musical instruments by machine is a challenging task, and musical data classification has become a popular research field; classifying instruments manually requires a huge amount of work. The proposed system classifies musical instruments using GoogleNet, a pretrained network model, with SVM and kNN as the two techniques used to classify the extracted features. This paper addresses musical instrument classification based on features extracted from various instruments using recent algorithms, and compares the performance of kNN with SVM. The instruments are identified and the accuracy is computed with the SVM and kNN classifiers; SVM with GoogleNet achieves a high accuracy rate of 99% in classifying the musical instruments. Sixteen musical instruments were used to measure the accuracy of SVM and kNN.
37

Chen, Guobin, Xianzhong Xie, and Shijin Li. "Research on Complex Classification Algorithm of Breast Cancer Chip Based on SVM-RFE Gene Feature Screening." Complexity 2020 (June 13, 2020): 1–12. http://dx.doi.org/10.1155/2020/1342874.

Abstract:
Screening and classification of characteristic genes is a complex classification problem, and the characteristic sequences of gene expression are high-dimensional. How to select an effective gene screening algorithm is the main problem to be solved in analyzing gene chips. A combination of KNN, SVM, and SVM-RFE is selected to handle this complex classification problem, providing a new method for solving such problems. In gene chip pretreatment, logFC and P values in the gene expression matrix are screened and different gene features are selected, and then the SVM-RFE algorithm is used to rank and screen genes. Firstly, the characteristics of the gene chips are analyzed and the numbers of probes and genes are counted; clustering analysis of each sample and PCA classification analysis of different samples are carried out. Secondly, the basic SVM and KNN algorithms are tested, including important indexes such as error rate and accuracy, to obtain the optimal parameters. Finally, the accuracy, precision, recall, and F1 of several complex classification algorithms are compared across SVM, KNN, KNN-PCA, SVM-PCA, SVM-RFE-SVM, and SVM-RFE-KNN at P = 0.01, 0.05, and 0.001. SVM-RFE-SVM has the best classification effect and can be used as a gene chip classification algorithm to analyze the characteristics of genes.
38

Lee, Yuchun. "Handwritten Digit Recognition Using K Nearest-Neighbor, Radial-Basis Function, and Backpropagation Neural Networks." Neural Computation 3, no. 3 (September 1991): 440–49. http://dx.doi.org/10.1162/neco.1991.3.3.440.

Abstract:
Results of recent research suggest that carefully designed multilayer neural networks with local “receptive fields” and shared weights may be unique in providing low error rates on handwritten digit recognition tasks. This study, however, demonstrates that these networks, radial basis function (RBF) networks, and k nearest-neighbor (kNN) classifiers, all provide similar low error rates on a large handwritten digit database. The backpropagation network is overall superior in memory usage and classification time but can provide “false positive” classifications when the input is not a digit. The backpropagation network also has the longest training time. The RBF classifier requires more memory and more classification time, but less training time. When high accuracy is warranted, the RBF classifier can generate a more effective confidence judgment for rejecting ambiguous inputs. The simple kNN classifier can also perform handwritten digit recognition, but requires a prohibitively large amount of memory and is much slower at classification. Nevertheless, the simplicity of the algorithm and fast training characteristics makes the kNN classifier an attractive candidate in hardware-assisted classification tasks. These results on a large, high input dimensional problem demonstrate that practical constraints including training time, memory usage, and classification time often constrain classifier selection more strongly than small differences in overall error rate.
39

Pulungan, Annisa Fadhillah, Muhammad Zarlis, and Saib Suwilo. "Analysis of Braycurtis, Canberra and Euclidean Distance in KNN Algorithm." SinkrOn 4, no. 1 (September 26, 2019): 74. http://dx.doi.org/10.33395/sinkron.v4i1.10207.

Abstract:
Classification is a technique used to build a classification model from a sample of training data. One of the most popular classification techniques is the K-Nearest Neighbor (KNN). The KNN algorithm has important parameters that affect its performance: the value of K and the distance metric. The distance between two points is computed with the chosen distance function before the KNN classification process. The purpose of this study was to analyze and compare the performance of KNN using different distance functions, namely the Braycurtis, Canberra and Euclidean distances, from an accuracy perspective. This study uses the Iris dataset from the UCI Machine Learning Repository. The evaluation method used is 10-fold cross-validation. The results showed that the Braycurtis distance performed better than the Canberra and Euclidean distances at K=6, K=7, K=8 and K=10, with accuracy values of 96%.
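
The comparison maps directly onto scikit-learn, which supports all three metrics; k = 6 is taken from the abstract, the rest of the setup mirrors its description:

```python
# Same kNN model evaluated with 10-fold cross-validation on Iris under each
# of the three distance functions.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
for metric in ('braycurtis', 'canberra', 'euclidean'):
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=6, metric=metric),
                          X, y, cv=10).mean()
    print(metric, round(acc, 3))
```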
40

Liu, Zongying, Shaoxi Li, Jiangling Hao, Jingfeng Hu, and Mingyang Pan. "An Efficient and Fast Model Reduced Kernel KNN for Human Activity Recognition." Journal of Advanced Transportation 2021 (June 2, 2021): 1–9. http://dx.doi.org/10.1155/2021/2026895.

Abstract:
With the accumulation of data and the development of artificial intelligence, human activity recognition has attracted much attention from researchers. Many classic machine learning algorithms, such as artificial neural networks, feedforward neural networks, K-nearest neighbors, and support vector machines, achieve good performance for detecting human activity, but these algorithms have their own limitations and their prediction accuracy still has room to improve. In this study, we focus on K-nearest neighbors (KNN) and address its limitations. Firstly, a kernel method is employed in the KNN model, transforming the input features into high-dimensional features; the proposed KNN with kernel (K-KNN) improves classification accuracy. Secondly, a novel reduced kernel method is proposed and used in the K-KNN model, named Reduced Kernel KNN (RK-KNN), which reduces processing time and enhances classification performance. Moreover, this study proposes an approach for choosing the number of neighbors K, which reduces the parameter dependency problem. In the experiments, the proposed RK-KNN obtains the best performance on benchmark and human activity datasets compared with other models, showing superior classification ability in human activity recognition. The accuracy on the human activity data is 91.60% for HAPT and 92.67% for Smartphone. On average, compared with conventional KNN, the proposed RK-KNN increases accuracy by 1.82% and decreases the standard deviation by 0.27, while the gap in processing time between KNN and RK-KNN across all datasets is only 1.26 seconds.
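
A sketch of the kernel trick behind K-KNN: distances are computed in the kernel-induced feature space via d²(x, y) = K(x, x) − 2 K(x, y) + K(y, y). The RBF kernel and gamma value are assumptions, and the "reduced kernel" step of RK-KNN (evaluating the kernel against a subset of the training set) is not reproduced here:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def kernel_knn_predict(X_train, y_train, X_test, k=5, gamma=0.5):
    # for the RBF kernel the diagonals are all 1, but the general form is kept
    K_xx = rbf_kernel(X_test, X_test, gamma=gamma).diagonal()
    K_yy = rbf_kernel(X_train, X_train, gamma=gamma).diagonal()
    K_xy = rbf_kernel(X_test, X_train, gamma=gamma)
    d2 = K_xx[:, None] - 2.0 * K_xy + K_yy[None, :]   # squared feature-space distance
    idx = np.argsort(d2, axis=1)[:, :k]               # k nearest per test point
    votes = y_train[idx]                              # labels assumed 0..C-1 ints
    return np.array([np.bincount(row).argmax() for row in votes])
```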
41

Xiao, Xing Jiang, Wei Qing Wang, Hua Feng Ding, and Lu Cao. "KNN Algorithm Based on Weighted Entropy of Attribute Value." Advanced Materials Research 179-180 (January 2011): 1000–1004. http://dx.doi.org/10.4028/www.scientific.net/amr.179-180.1000.

Abstract:
The traditional KNN algorithm usually adopts the Euclidean distance formula to measure the distance between two samples. Since each attribute functions differently in the actual sample data, classification accuracy is reduced accordingly. This article proposes a method to weight attribute values by entropy, namely a KNN algorithm based on the weighted entropy of attribute values. Experiments indicate that, compared with the traditional KNN algorithm, the proposed algorithm not only preserves classification efficiency but also enhances classification accuracy.
42

Wahyono, Wahyono, I. Nyoman Prayana Trisna, Sarah Lintang Sariwening, Muhammad Fajar, and Danur Wijayanto. "Comparison of distance measurement on k-nearest neighbour in textual data classification." Jurnal Teknologi dan Sistem Komputer 8, no. 1 (November 5, 2019): 54–58. http://dx.doi.org/10.14710/jtsiskom.8.1.2020.54-58.

Abstract:
One algorithm for classifying textual data in automatic document organization applications is KNN, which works by turning word representations into vectors. The distance calculation in the KNN algorithm is essential for measuring the closeness between data elements. This study compares four distance calculations commonly used in KNN, namely Euclidean, Chebyshev, Manhattan, and Minkowski. The dataset consisted of 448 comments on Eminem videos from YouTube. This study showed that the Euclidean or Minkowski distance on the KNN algorithm achieved the best result compared to Chebyshev and Manhattan. The best results for KNN are obtained when the K value is 3.
43

Ehsani, Rezvan, and Finn Drabløs. "Robust Distance Measures for kNN Classification of Cancer Data." Cancer Informatics 19 (January 2020): 117693512096554. http://dx.doi.org/10.1177/1176935120965542.

Abstract:
The k-Nearest Neighbor (kNN) classifier represents a simple and very general approach to classification. Still, the performance of kNN classifiers can often compete with more complex machine-learning algorithms. The core of kNN depends on a "guilt by association" principle where classification is performed by measuring the similarity between a query and a set of training patterns, often computed as distances. The relative performance of kNN classifiers is closely linked to the choice of distance or similarity measure, and it is therefore relevant to investigate the effect of using different distance measures when comparing biomedical data. In this study on classification of cancer data sets, we have used both common and novel distance measures, including the novel distance measures Sobolev and Fisher, and we have evaluated the performance of kNN with these distances on 4 cancer data sets of different type. We find that the performance when using the novel distance measures is comparable to the performance with more well-established measures, in particular for the Sobolev distance. We define a robust ranking of all the distance measures according to overall performance. Several distance measures show robust performance in kNN over several data sets, in particular the Hassanat, Sobolev, and Manhattan measures. Some of the other measures show good performance on selected data sets but seem to be more sensitive to the nature of the classification data. It is therefore important to benchmark distance measures on similar data prior to classification to identify the most suitable measure in each case.
44

Wang, Cailing, LeiChao Li, SuQiang He, and Jing Zhang. "Tumor imaging diagnosis analysis based on improved KNN algorithm." Journal of Physics: Conference Series 2132, no. 1 (December 1, 2021): 012018. http://dx.doi.org/10.1088/1742-6596/2132/1/012018.

Abstract:
As a simple, effective and non-parametric analysis method, KNN is widely used in text classification, image recognition, etc. [1]. However, this method requires a large amount of calculation in practical applications, and an uneven distribution of training samples directly reduces the accuracy of tumor image classification. To solve this problem, we propose a dynamically weighted KNN method to improve classification accuracy, used for automatic prediction, classification and abnormality detection in medical tumor images based on image features. Tumor images are divided into two categories by their characteristics: benign and malignant. This method can help doctors make medical diagnoses and analyses more accurately. The experimental results show that this method has certain advantages over the traditional KNN algorithm.
APA, Harvard, Vancouver, ISO, and other styles
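The abstract does not detail the paper's dynamic weighting scheme; a common form is inverse-distance weighting, which scikit-learn exposes directly. This sketch contrasts it with traditional unweighted kNN on a stand-in benign/malignant data set:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Stand-in two-class (benign/malignant) tabular data, not tumor images.
X, y = load_breast_cancer(return_X_y=True)
for weights in ("uniform", "distance"):
    clf = KNeighborsClassifier(n_neighbors=7, weights=weights)
    score = cross_val_score(clf, X, y, cv=5).mean()
    print(weights, round(score, 3))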
45

Qi, Ai Ling, Jing Fang Wang, Frank Wang, Unekwu Idachaba, and Gbola Akanmu. "Welding Defect Classification of Ultrasonic Detection Based on PCA and KNN." Applied Mechanics and Materials 380-384 (August 2013): 902–6. http://dx.doi.org/10.4028/www.scientific.net/amm.380-384.902.

Full text
Abstract:
To address the problem of welding defect classification in ultrasonic detection, a classification method based on PCA and KNN is proposed, according to the characteristics of welding defects, in order to solve the problems of ultrasonic testing signal feature extraction and defect recognition. Because redundant attributes in feature extraction affect classification accuracy, an ultrasonic flaw feature extraction algorithm based on PCA is proposed in this paper. Intelligent classification of ultrasonic flaws has always been a difficult problem in NDT, so the KNN algorithm is used to classify the different defect types. Compared with the BP algorithm, experimental results show that the model based on PCA and KNN produces stable classification results with high accuracy and can effectively improve classification efficiency.
APA, Harvard, Vancouver, ISO, and other styles
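A minimal sketch of the PCA-then-KNN pattern the abstract describes, on synthetic stand-in data; the ultrasonic signal feature extraction itself is not reproduced here:

from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for extracted ultrasonic flaw features.
X, y = make_classification(n_samples=300, n_features=40, n_informative=8,
                           n_classes=3, random_state=0)
# PCA removes redundant attributes before the KNN classifier.
model = make_pipeline(StandardScaler(), PCA(n_components=8),
                      KNeighborsClassifier(n_neighbors=5))
print(cross_val_score(model, X, y, cv=5).mean())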
46

Jasmir, Jasmir, Siti Nurmaini, and Bambang Tutuko. "Fine-Grained Algorithm for Improving KNN Computational Performance on Clinical Trials Text Classification." Big Data and Cognitive Computing 5, no. 4 (October 28, 2021): 60. http://dx.doi.org/10.3390/bdcc5040060.

Full text
Abstract:
Text classification is an important component in many applications and has attracted researchers to continue developing innovations and building new classification models from clinical trial texts. Many methods are used in building classification models, including supervised learning. The purpose of this study is to improve the computational performance of one supervised learning method, KNN, in building a clinical trial document classification model by combining KNN with a fine-grained algorithm. This research reduced the computation time of KNN from 388,274 s to 260,641 s on a clinical trial text dataset of 1,000,000 records.
APA, Harvard, Vancouver, ISO, and other styles
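The fine-grained algorithm itself is not described in the abstract and is not reproduced here; as a generic illustration of how kNN query cost can be cut, this sketch times brute-force against kd-tree neighbor search in scikit-learn on synthetic data:

import time
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.random((20000, 10))
y = (X[:, 0] > 0.5).astype(int)
for algo in ("brute", "kd_tree"):
    clf = KNeighborsClassifier(n_neighbors=5, algorithm=algo).fit(X, y)
    t0 = time.perf_counter()
    clf.predict(X[:2000])
    print(algo, round(time.perf_counter() - t0, 3), "s")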
47

Sugesti, Annisa, Moch Abdul Mukid, and Tarno Tarno. "PERBANDINGAN KINERJA MUTUAL K-NEAREST NEIGHBOR (MKNN) DAN K-NEAREST NEIGHBOR (KNN) DALAM ANALISIS KLASIFIKASI KELAYAKAN KREDIT." Jurnal Gaussian 8, no. 3 (August 30, 2019): 366–76. http://dx.doi.org/10.14710/j.gauss.v8i3.26681.

Full text
Abstract:
Credit feasibility analysis is important for lenders to avoid risk amid the increase in credit applications. This analysis can be carried out with classification techniques; the technique used in this research is instance-based classification. Such techniques tend to be simple but depend heavily on the choice of K, the number of nearest neighbors considered when classifying new data. A small value of K is very sensitive to outliers. This weakness can be overcome by an algorithm able to handle outliers, one of which is Mutual K-Nearest Neighbor (MKNN). MKNN removes outliers first, then predicts the class of a new observation based on the majority class among its mutual nearest neighbors. The algorithm is compared with KNN applied without outlier removal. The models are evaluated by 10-fold cross validation, and classification performance is measured by the Geometric Mean (G-Mean) of sensitivity and specificity. Based on the analysis, the optimal value of K is 9 for MKNN and 3 for KNN; the highest G-Mean, 0.718, is produced by KNN, while MKNN produces a G-Mean of 0.702. The best alternative for classifying credit feasibility in this study is the K-Nearest Neighbor (KNN) algorithm with K=3. Keywords: Classification, Credit, MKNN, KNN, G-Mean.
APA, Harvard, Vancouver, ISO, and other styles
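A sketch of the evaluation protocol the abstract describes: 10-fold cross validation with the Geometric Mean of sensitivity and specificity, computed here from a pooled confusion matrix on synthetic stand-in data rather than the study's credit data:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier

# Imbalanced two-class stand-in for the credit-feasibility data.
X, y = make_classification(n_samples=500, weights=[0.7], random_state=0)
pred = cross_val_predict(KNeighborsClassifier(n_neighbors=3), X, y, cv=10)
tn, fp, fn, tp = confusion_matrix(y, pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print("G-Mean:", np.sqrt(sensitivity * specificity))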
48

Tan, Lingling, Junkai Yi, and Fei Yang. "Improving Performance of Massive Text Real-Time Classification for Document Confidentiality Management." Applied Sciences 14, no. 4 (February 15, 2024): 1565. http://dx.doi.org/10.3390/app14041565.

Full text
Abstract:
For classified and sensitive electronic documents within enterprises and organizations, and in order to standardize and strengthen confidentiality management and meet the practical needs of secret text classification, an automatic document classification optimization method based on keyword retrieval and the kNN classification algorithm is proposed. The method supports keyword classification management, provides users with keywords at multiple risk levels, and combines a matching scanning algorithm to label keywords of different levels. The labeled text is used as the training set for the kNN algorithm to classify the target text and realize classification protection of text data. To address the shortcomings of existing kNN text classification methods, namely large feature vector dimensions, low classification efficiency, and low accuracy, an optimization method using a feature selection algorithm and a kNN algorithm based on the AVX instruction set is proposed to realize real-time classification of massive texts. By constructing a keyword dictionary and an optimized feature vector, parallel calculation of feature vector weights and distance vectors is realized, improving the accuracy and efficiency of text classification. The experimental results show that the multi-class performance of the proposed feature selection algorithm, tf-DE, is better than that of the traditional tf-idf algorithm, and that the classification performance of kNN is comparable to that of the support vector machine (SVM) algorithm. As the feature vector dimension increases, classification performance improves and classification time grows linearly. The AVX-256 accelerated version takes about 55% of the time of the original version, verifying the method's effectiveness for multi-class classification of massive texts in document confidentiality management.
APA, Harvard, Vancouver, ISO, and other styles
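The paper's tf-DE feature selection and AVX-accelerated kernels are not reproduced here; this sketch shows only the baseline tf-idf + kNN text classification pipeline it builds on, with an invented toy corpus and hypothetical risk labels:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

docs = ["internal salary report", "public press release",
        "confidential merger memo", "company newsletter"]
labels = ["secret", "public", "secret", "public"]  # hypothetical risk levels
# tf-idf feature vectors feed a brute-force kNN classifier.
model = make_pipeline(TfidfVectorizer(), KNeighborsClassifier(n_neighbors=3))
model.fit(docs, labels)
print(model.predict(["quarterly salary memo"]))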
49

Li, Runya, and Shenglian Li. "Multimedia Image Data Analysis Based on KNN Algorithm." Computational Intelligence and Neuroscience 2022 (April 12, 2022): 1–8. http://dx.doi.org/10.1155/2022/7963603.

Full text
Abstract:
In order to improve the reliability of multispectral remote sensing image analysis, the KNN algorithm and hyperspectral remote sensing technology are used to combine advanced multimedia technology with spectral technology and subdivide the spectrum. Different classification methods are applied to the CHRIS 0° image and the results compared: SVM gives the highest classification accuracy, 72.8448%, with a Kappa coefficient of 0.6770. SVM is then used to classify CHRIS images from five angles; ordered from high to low accuracy, the results are FZA = 0 > FZA = −36 > FZA = −55 > FZA = 36 > FZA = 55. When SVM is used to classify the multi-angle combined image and the result is compared with the CHRIS 0° result, the overall classification accuracy for the angle-combined image is lower than for the single-angle image. When SVM is used to classify the band-combined image and the result is compared with CHRIS 0°, the overall classification accuracy for forest types in the band-combined image is very low, worse than the multi-angle combined result. This verifies that when CHRIS multi-angle hyperspectral data are used for classification, the SVM method should be used, as it classifies spectral remote sensing image data with the best effect.
APA, Harvard, Vancouver, ISO, and other styles
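A sketch of the kind of comparison the abstract reports, overall accuracy and Kappa coefficient for SVM versus kNN, computed here on a stand-in scikit-learn data set since the CHRIS hyperspectral imagery is not available:

from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)  # stand-in multi-class image data
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
for name, clf in [("SVM", SVC()), ("kNN", KNeighborsClassifier())]:
    pred = clf.fit(Xtr, ytr).predict(Xte)
    print(name, accuracy_score(yte, pred), cohen_kappa_score(yte, pred))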
50

Nur Ghaniaviyanto Ramadhan. "Indonesian Online News Topics Classification using Word2Vec and K-Nearest Neighbor." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 5, no. 6 (December 30, 2021): 1083–89. http://dx.doi.org/10.29207/resti.v5i6.3547.

Full text
Abstract:
News is information disseminated through newspapers, radio, television, the internet, and other media. Survey results show that news items on many topics are spread across the internet, which makes it difficult for readers to find the news topics they want to read. This problem can be solved by grouping, i.e., classification, carried out as a computerized process. This study classifies several Indonesian-language news topics using the KNN classification model together with word2vec, which converts words into vectors to facilitate the classification process. The study also determines the optimal K value for KNN. The word2vec and KNN combination achieves an accuracy of 89.2% with K=7 and outperforms the support vector machine, logistic regression, and random forest classification models.
APA, Harvard, Vancouver, ISO, and other styles
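A sketch (assuming gensim 4.x) of the word2vec + kNN pipeline the abstract describes: each document is represented by the mean of its word vectors, then classified with K = 7. The tiny Indonesian corpus and labels below are invented for illustration:

import numpy as np
from gensim.models import Word2Vec
from sklearn.neighbors import KNeighborsClassifier

docs = [["harga", "saham", "naik"],
        ["tim", "sepak", "bola", "menang"],
        ["bank", "sentral", "turunkan", "suku", "bunga"],
        ["pemain", "bola", "cedera"]] * 4          # tiny stand-in corpus
labels = ["ekonomi", "olahraga", "ekonomi", "olahraga"] * 4
w2v = Word2Vec(sentences=docs, vector_size=50, min_count=1, seed=0)

def doc_vector(tokens):
    # Average the word vectors of the tokens present in the vocabulary.
    return np.mean([w2v.wv[t] for t in tokens if t in w2v.wv], axis=0)

X = np.array([doc_vector(d) for d in docs])
clf = KNeighborsClassifier(n_neighbors=7).fit(X, labels)
print(clf.predict([doc_vector(["harga", "suku", "bunga"])]))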