Journal articles on the topic 'Data Classification'

Author: Grafiati

Published: 4 June 2021

Last updated: 7 September 2023

Consult the top 50 journal articles for your research on the topic 'Data Classification.'

1

Geethika, Paruchuri, and Voleti Prasanthi. "Booster in High Dimensional Data Classification." International Journal of Trend in Scientific Research and Development Volume-2, Issue-3 (April 30, 2018): 1186–90. http://dx.doi.org/10.31142/ijtsrd11368.

2

Alhaisoni, Majed Mohaia, Rabie A. Ramadan, and Ahmed Y. Khedr. "SCF: Smart Big Data Classification Framework." Indian Journal of Science and Technology 12, no. 37 (October 10, 2019): 1–8. http://dx.doi.org/10.17485/ijst/2019/v12i37/148647.

3

S, Gowtham, and Karuppusamy S. "Review of Data Mining Classification Techniques." Bonfring International Journal of Software Engineering and Soft Computing 9, no. 2 (April 30, 2019): 8–11. http://dx.doi.org/10.9756/bijsesc.9013.

4

Uprichard, Emma. "Dirty Data: Longitudinal Classification Systems." Sociological Review 59, no. 2_suppl (December 2011): 93–112. http://dx.doi.org/10.1111/j.1467-954x.2012.02058.x.

Abstract:

Typically in longitudinal quantitative research, classifications are tracked over time. However, most classifications change in absolute terms in that some die whilst others are created, and in their meaning. There is a need, therefore, to re-think how longitudinal quantitative research might explore both the qualitative changes to classification systems as well as the quantitative changes within each classification. By drawing on the changing classifications of local food retail outlets in the city of York (UK) since the 1950s as an illustrative example, an alternative way of graphing longitudinal quantitative data is presented which ultimately provides a description of both types of change over time. In so doing, this article argues for the increased use of ‘dirty data’ in longitudinal quantitative analysis, a step which allows for the exploration of both qualitative and quantitative changes to, and within, classification systems. This ultimately challenges existing assumptions relating to the quality and type of data used in quantitative research and how change in the social world is measured in general.

5

Anam, Mamoona, Dr Kantilal P. Rane, Ali Alenezi, Ruby Mishra, Dr Swaminathan Ramamurthy, and Ferdin Joe John Joseph. "Content Classification Tasks with Data Preprocessing Manifestations." Webology 19, no. 1 (January 20, 2022): 1413–30. http://dx.doi.org/10.14704/web/v19i1/web19094.

Abstract:

Deep reinforcement learning has a major hurdle in terms of data efficiency. We solve this challenge by pretraining an encoder with unlabeled input, which is subsequently finetuned on a tiny quantity of task-specific input. We use a mixture of latent dynamics modelling and unsupervised goal-conditioned RL to encourage learning representations that capture various elements of the underlying MDP. Our approach significantly outperforms previous work combining offline representation pretraining with task-specific finetuning when limited to 100k steps of interaction on Atari games (equivalent to two hours of human experience) and compares favourably with other pretraining methods that require orders of magnitude more data. When paired with larger models and more diverse, task-aligned observational data, our methodology shows great promise, nearing human-level performance and data efficiency on Atari in the best-case scenario.

6

Rani, A. Nithya, and Dr Antony Selvdoss Davamani. "Classification on Missing Data for Multiple Imputations." International Journal of Trend in Scientific Research and Development Volume-2, Issue-3 (April 30, 2018): 745–49. http://dx.doi.org/10.31142/ijtsrd9566.

7

N.J., Anjala. "Algorithmic Assessment of Text based Data Classification in Big Data Sets." Journal of Advanced Research in Dynamical and Control Systems 12, SP4 (March 31, 2020): 1231–34. http://dx.doi.org/10.5373/jardcs/v12sp4/20201598.

8

Bian, Jiang, Dayong Tian, Yuanyan Tang, and Dacheng Tao. "Trajectory Data Classification." ACM Transactions on Intelligent Systems and Technology 10, no. 4 (August 29, 2019): 1–34. http://dx.doi.org/10.1145/3330138.

9

Suthaharan, Shan. "Big data classification." ACM SIGMETRICS Performance Evaluation Review 41, no. 4 (April 17, 2014): 70–73. http://dx.doi.org/10.1145/2627534.2627557.

10

Anonymous. "Sonar data classification." Eos, Transactions American Geophysical Union 69, no. 38 (1988): 868. http://dx.doi.org/10.1029/88eo01128.

11

R, Vijaya Kumar Reddy, Srinivasa Rao B, Shaik Subhani, and Ravi Prakash. "Effectiveness of Data Augmentation on Handwritten Digit Classification." Journal of Advanced Research in Dynamical and Control Systems 11, no. 12 (December 20, 2019): 90–96. http://dx.doi.org/10.5373/jardcs/v11i12/20193216.

12

G., Dr Ayyappan. "Various classifications for caesarian section classification dataset data set." Indian Journal of Computer Science and Engineering 9, no. 6 (December 20, 2018): 145–47. http://dx.doi.org/10.21817/indjcse/2018/v9i6/180906013.

13

Aydadenta, Husna, and Adiwijaya. "On the classification techniques in data mining for microarray data classification." Journal of Physics: Conference Series 971 (March 2018): 012004. http://dx.doi.org/10.1088/1742-6596/971/1/012004.

14

Ustuner, M., F. B. Sanli, and S. Abdikan. "BALANCED VS IMBALANCED TRAINING DATA: CLASSIFYING RAPIDEYE DATA WITH SUPPORT VECTOR MACHINES." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B7 (June 21, 2016): 379–84. http://dx.doi.org/10.5194/isprs-archives-xli-b7-379-2016.

Abstract:

The accuracy of supervised image classification is highly dependent upon several factors such as the design of training set (sample selection, composition, purity and size), resolution of input imagery and landscape heterogeneity. The design of training set is still a challenging issue since the sensitivity of classifier algorithm at learning stage is different for the same dataset. In this paper, the classification of RapidEye imagery with balanced and imbalanced training data for mapping the crop types was addressed. Classification with imbalanced training data may result in low accuracy in some scenarios. Support Vector Machines (SVM), Maximum Likelihood (ML) and Artificial Neural Network (ANN) classifications were implemented here to classify the data. For evaluating the influence of the balanced and imbalanced training data on image classification algorithms, three different training datasets were created. Two different balanced datasets which have 70 and 100 pixels for each class of interest and one imbalanced dataset in which each class has different number of pixels were used in classification stage. Results demonstrate that ML and NN classifications are affected by imbalanced training data in resulting a reduction in accuracy (from 90.94% to 85.94% for ML and from 91.56% to 88.44% for NN) while SVM is not affected significantly (from 94.38% to 94.69%) and slightly improved. Our results highlighted that SVM is proven to be a very robust, consistent and effective classifier as it can perform very well under balanced and imbalanced training data situations. Furthermore, the training stage should be precisely and carefully designed for the need of adopted classifier.

15

Ustuner, M., F. B. Sanli, and S. Abdikan. "BALANCED VS IMBALANCED TRAINING DATA: CLASSIFYING RAPIDEYE DATA WITH SUPPORT VECTOR MACHINES." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B7 (June 21, 2016): 379–84. http://dx.doi.org/10.5194/isprsarchives-xli-b7-379-2016.

16

Ambulkar, Bhagyashree, and Prof Gunjan Agre. "Data Mining Over Encrypted Data of Database Client Engine Using Hybrid Classification Approach." International Journal of Innovative Research in Computer Science & Technology 5, no. 3 (May 31, 2017): 291–94. http://dx.doi.org/10.21276/ijircst.2017.5.3.7.

17

Patil, Swati B., and Arjun Kuruva. "Analysis of User Session Data using the Map Reduce Classification with Big Data." International Journal of Trend in Scientific Research and Development Volume-2, Issue-5 (August 31, 2018): 2008–11. http://dx.doi.org/10.31142/ijtsrd18221.

18

Sikun, Gan. "BIG DATA EMOTION CLASSIFICATION." Young Scholars Journal, no. 1-2 (2022): 18–22. http://dx.doi.org/10.29013/ysj-22-1.2-18-22.

19

Belarouci, Sara, and Mohammed Amine Chikh. "Medical imbalanced data classification." Advances in Science, Technology and Engineering Systems Journal 2, no. 3 (April 2017): 116–24. http://dx.doi.org/10.25046/aj020316.

20

Picka, Jeffrey D. "Data Science and Classification." Technometrics 49, no. 3 (August 2007): 363–64. http://dx.doi.org/10.1198/tech.2007.s513.

21

PENG, L., B. YANG, Y. CHEN, and A. ABRAHAM. "Data gravitation based classification." Information Sciences 179, no. 6 (March 1, 2009): 809–19. http://dx.doi.org/10.1016/j.ins.2008.11.007.

22

Borodko, A. "CLASSIFICATION OF DATA CENTERS." Telecom IT 7, no. 1 (2019): 1–9. http://dx.doi.org/10.31854/2307-1303-2019-7-1-1-9.

Abstract:

The data center (DC) is the most progressive form of computing resources when it is necessary to provide services to a wide range of users. Research subject. The article discusses the classification of data centers, their main functions, composition, purpose of creation and factors affecting them. Methodology and core results. The article provides a classification and structural analysis of the methods and technologies for constructing information storage and processing systems. The work analyzes, with a systematic approach, the factors affecting data centers. Practical relevance. The proposed classification can be used in the tasks of systematically introducing Internet of Things devices into the data center, implementing software-defined data centers and developing methods for assessing the effectiveness of the functioning of the data center.

23

Otley, Amanda, Michelle Morris, Andy Newing, and Mark Birkin. "Local and Application-Specific Geodemographics for Data-Led Urban Decision Making." Sustainability 13, no. 9 (April 26, 2021): 4873. http://dx.doi.org/10.3390/su13094873.

Abstract:

This work seeks to introduce improvements to the traditional variable selection procedures employed in the development of geodemographic classifications. It presents a proposal for shifting from a traditional approach for generating general-purpose one-size-fits-all geodemographic classifications to application-specific classifications. This proposal addresses the recent scepticism towards the utility of general-purpose applications by employing supervised machine learning techniques in order to identify contextually relevant input variables from which to develop geodemographic classifications with increased discriminatory power. A framework introducing such techniques in the variable selection phase of geodemographic classification development is presented via a practical use-case that is focused on generating a geodemographic classification with an increased capacity for discriminating the propensity for Library use in the UK city of Leeds. Two local classifications are generated for the city, one a general-purpose classification, and the other, an application-specific classification incorporating supervised Feature Selection methods in the selection of input variables. The discriminatory power of each classification is evaluated and compared, with the result successfully demonstrating the capacity for the application-specific approach to generate a more contextually relevant result, and thus underpins increasingly targeted public policy decision making, particularly in the context of urban planning.

24

Kumari, S. Surya, and G. Anjan Babu. "Sentiment classification using unlabelled data with emoticon classification." International Journal of Knowledge Engineering and Soft Data Paradigms 7, no. 1 (2020): 1. http://dx.doi.org/10.1504/ijkesdp.2020.112616.

25

Kumari, S. Surya, and G. Anjan Babu. "Sentiment classification using unlabelled data with emoticon classification." International Journal of Knowledge Engineering and Soft Data Paradigms 7, no. 1 (2020): 1. http://dx.doi.org/10.1504/ijkesdp.2020.10034771.

26

Senthamil Selvi, M., and S. Jansi Rani. "Classification of Admission Data Using Classification Learner Toolbox." Journal of Physics: Conference Series 1979, no. 1 (August 1, 2021): 012043. http://dx.doi.org/10.1088/1742-6596/1979/1/012043.

27

Jiang, Tingxuan, Harald van der Werff, and Freek van der Meer. "Classification Endmember Selection with Multi-Temporal Hyperspectral Data." Remote Sensing 12, no. 10 (May 15, 2020): 1575. http://dx.doi.org/10.3390/rs12101575.

Abstract:

In hyperspectral image classification, so-called spectral endmembers are used as reference data. These endmembers are either extracted from an image or taken from another source. Research has shown that endmembers extracted from an image usually perform best when classifying a single image. However, it is unclear if this also holds when classifying multi-temporal hyperspectral datasets. In this paper, we use spectral angle mapper, which is a frequently used classifier for hyperspectral datasets to classify multi-temporal airborne visible/infrared imaging spectrometer (AVIRIS) hyperspectral imagery. Three classifications are done on each of the images with endmembers being extracted from the corresponding image, and three more classifications are done on the three images while using averaged endmembers. We apply image-to-image registration and change detection to analyze the consistency of the classification results. We show that the consistency of classification accuracy using the averaged endmembers (around 65%) outperforms the classification results generated using endmembers that are extracted from each image separately (around 40%). We conclude that, for multi-temporal datasets, it is better to have an endmember collection that is not directly from the image, but is processed to a representative average.
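As a rough illustration of the classifier named in this abstract, the spectral angle mapper assigns a pixel to the endmember whose spectrum makes the smallest angle with the pixel spectrum. The sketch below is not taken from the article: the function names, the toy three-band spectra, and the element-wise mean used to stand in for "averaged endmembers" are all assumptions made for illustration.

```python
import math


def spectral_angle(pixel, endmember):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * e for p, e in zip(pixel, endmember))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_e = math.sqrt(sum(e * e for e in endmember))
    cos_theta = dot / (norm_p * norm_e)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)


def classify(pixel, endmembers):
    """Return the name of the endmember with the smallest spectral angle."""
    return min(endmembers, key=lambda name: spectral_angle(pixel, endmembers[name]))


def average_endmembers(per_image):
    """Element-wise mean of each endmember across images (assumed averaging rule)."""
    averaged = {}
    for name in per_image[0]:
        spectra = [image[name] for image in per_image]
        averaged[name] = [sum(band) / len(spectra) for band in zip(*spectra)]
    return averaged


# Hypothetical three-band reflectance spectra, for illustration only.
library = {"soil": [0.3, 0.4, 0.5], "vegetation": [0.05, 0.1, 0.6]}
label = classify([0.06, 0.12, 0.55], library)
```

Because the angle ignores vector length, a spectrum and a brighter or darker scaled copy of it have an angle of zero, which is one common reason this measure is used when illumination varies between images.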

28

Krawczyk, Bartosz, Jerzy Stefanowski, and Michał Wozniak. "Data stream classification and big data analytics." Neurocomputing 150 (February 2015): 238–39. http://dx.doi.org/10.1016/j.neucom.2014.10.025.

29

Mirarchi, Domenico, Giovanni Canino, Patrizia Vizza, Pierangelo Veltri, Salvatore Cuomo, Claudio Petrolo, and Giuseppe Chiarella. "Data mining techniques for vestibular data classification." International Journal of Internet Technology and Secured Transactions 7, no. 1 (2017): 51. http://dx.doi.org/10.1504/ijitst.2017.085734.

30

Petrolo, Claudio, Salvatore Cuomo, Pierangelo Veltri, Patrizia Vizza, Giovanni Canino, Domenico Mirarchi, and Giuseppe Chiarella. "Data mining techniques for vestibular data classification." International Journal of Internet Technology and Secured Transactions 7, no. 1 (2017): 51. http://dx.doi.org/10.1504/ijitst.2017.10006656.

31

McDonough, Caitrin W., Steven M. Smith, Rhonda M. Cooper-DeHoff, and William R. Hogan. "Optimizing Antihypertensive Medication Classification in Electronic Health Record-Based Data: Classification System Development and Methodological Comparison." JMIR Medical Informatics 8, no. 2 (February 27, 2020): e14777. http://dx.doi.org/10.2196/14777.

Abstract:

Background: Computable phenotypes have the ability to utilize data within the electronic health record (EHR) to identify patients with certain characteristics. Many computable phenotypes rely on multiple types of data within the EHR including prescription drug information. Hypertension (HTN)-related computable phenotypes are particularly dependent on the correct classification of antihypertensive prescription drug information, as well as corresponding diagnoses and blood pressure information. Objective: This study aimed to create an antihypertensive drug classification system to be utilized with EHR-based data as part of HTN-related computable phenotypes. Methods: We compared 4 different antihypertensive drug classification systems based off of 4 different methodologies and terminologies, including 3 RxNorm Concept Unique Identifier (RxCUI)–based classifications and 1 medication name–based classification. The RxCUI-based classifications utilized data from (1) the Drug Ontology, (2) the new Medication Reference Terminology, and (3) the Anatomical Therapeutic Chemical Classification System and DrugBank, whereas the medication name–based classification relied on antihypertensive drug names. Each classification system was applied to EHR-based prescription drug data from hypertensive patients in the OneFlorida Data Trust. Results: There were 13,627 unique RxCUIs and 8025 unique medication names from the 13,879,046 prescriptions. We observed a broad overlap between the 4 methods, with 84.1% (691/822) to 95.3% (695/729) of terms overlapping pairwise between the different classification methods. Key differences arose from drug products with multiple dosage forms, drug products with an indication of benign prostatic hyperplasia, drug products that contain more than 1 ingredient (combination products), and terms within the classification systems corresponding to retired or obsolete RxCUIs. Conclusions: In total, 2 antihypertensive drug classifications were constructed, one based on RxCUIs and one based on medication name, that can be used in future computable phenotypes that require antihypertensive drug classifications.

32

P., Avila Clemenshia. "A Research on Cancer Subtype Classification Using Gene Expression Data." Journal of Advanced Research in Dynamical and Control Systems 12, SP4 (March 31, 2020): 490–500. http://dx.doi.org/10.5373/jardcs/v12sp4/20201514.

33

Mutasher, Watheq Ghanim, and Abbas Fadhil Aljuboori. "Real Time Big Data Sentiment Analysis and Classification of Facebook." Webology 19, no. 1 (January 20, 2022): 1112–27. http://dx.doi.org/10.14704/web/v19i1/web19076.

Abstract:

Many people use Facebook to connect and share their views on various issues, with the majority of user-generated content consisting of textual information. Because so many people post messages in real time about their situation and their thoughts on a range of everyday subjects, the collection and analysis of these data, which may well be helpful for political decisions or public opinion monitoring, is a worthwhile research project. This paper therefore analyzes public text posts from the Facebook stream in real time in a Hadoop ecosystem, using Apache Spark with NLTK in Python. Posts or feeds are gathered from the Facebook API in real time and stored in a database, and Apache Spark is used for quick query processing of the text partitions on each data node (machine). An Amazon cloud-based Hadoop cluster is also used to process the large volume of data and to eliminate on-site hardware, IT support, and other operational difficulties, including the installation and configuration of Hadoop components such as the Hadoop Distributed File System and Apache Spark. Using a decision dictionary, sentiment is labelled as positive, negative, or neutral, and two machine learning algorithms (naive Bayes and support vector machine) are used to build a predictive model; the outcome demonstrates a high level of precision in sentiment analysis.

34

Raviya, Kaushik H., and Biren Gajjar. "Performance Evaluation of Different Data Mining Classification Algorithm Using WEKA." Paripex - Indian Journal Of Research 2, no. 1 (January 15, 2012): 19–21. http://dx.doi.org/10.15373/22501991/jan2013/8.

35

Hajizadeh, Zahra, Mohammad Taheri, and Mansoor Zolghadri Jahromi. "Nearest Neighbor Classification with Locally Weighted Distance for Imbalanced Data." International Journal of Computer and Communication Engineering 3, no. 2 (2014): 81–86. http://dx.doi.org/10.7763/ijcce.2014.v3.296.

36

Yastikli, N., and Z. Cetin. "CLASSIFICATION OF LiDAR DATA WITH POINT BASED CLASSIFICATION METHODS." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B3 (June 9, 2016): 441–45. http://dx.doi.org/10.5194/isprs-archives-xli-b3-441-2016.

Abstract:

LiDAR is one of the most effective systems for 3 dimensional (3D) data collection in wide areas. Nowadays, airborne LiDAR data is used frequently in various applications such as object extraction, 3D modelling, change detection and revision of maps with increasing point density and accuracy. The classification of the LiDAR points is the first step of LiDAR data processing chain and should be handled in proper way since the 3D city modelling, building extraction, DEM generation, etc. applications directly use the classified point clouds. The different classification methods can be seen in recent researches and most of researches work with the gridded LiDAR point cloud. In grid based data processing of the LiDAR data, the characteristic point loss in the LiDAR point cloud especially vegetation and buildings or losing height accuracy during the interpolation stage are inevitable. In this case, the possible solution is the use of the raw point cloud data for classification to avoid data and accuracy loss in gridding process. In this study, the point based classification possibilities of the LiDAR point cloud is investigated to obtain more accurate classes. The automatic point based approaches, which are based on hierarchical rules, have been proposed to achieve ground, building and vegetation classes using the raw LiDAR point cloud data. In proposed approaches, every single LiDAR point is analyzed according to their features such as height, multi-return, etc. then automatically assigned to the class which they belong to. The use of un-gridded point cloud in proposed point based classification process helped the determination of more realistic rule sets. The detailed parameter analyses have been performed to obtain the most appropriate parameters in the rule sets to achieve accurate classes. 
The hierarchical rule sets were created for proposed Approach 1 (using selected spatial-based and echo-based features) and Approach 2 (using only selected spatial-based features) and have been tested in the study area in Zekeriyaköy, Istanbul which includes the partly open areas, forest areas and many types of the buildings. The data set used in this research obtained from Istanbul Metropolitan Municipality which was collected with ‘Riegl LSM-Q680i’ full-waveform laser scanner with the density of 16 points/m². The proposed automatic point based Approach 1 and Approach 2 classifications successfully produced the ground, building and vegetation classes which were very similar although different features were used.

37

Chincholkar, Bhushan R. "Implementation Analysis of Data Classification Approach for Sentiment Classification." International Journal for Research in Applied Science and Engineering Technology 9, no. VII (July 15, 2021): 1509–12. http://dx.doi.org/10.22214/ijraset.2021.36613.

Abstract:

Sentiment analysis is one of the fastest growing fields, with demand and potential benefits increasing every day. Sentiment analysis aims to classify the polarity of a document through natural language processing and text analysis. With the help of the internet and modern technology, there has been tremendous growth in the amount of data. Each individual is in a position to express his or her own ideas freely on social media. All of this data can be analyzed and used to draw benefits and quality information. In this paper, the focus is on cyber-hate classification based on public opinion or views, since the spread of hate speech on social media can have disruptive impacts on social sentiment analysis. In particular, we propose a modified approach with two-stage training for dealing with text ambiguity and classifying three types of sentiment (positive, negative and neutral), and compare its performance with popular methods as well as some existing fuzzy approaches. We then compare the performance of the proposed approach with commonly used sentiment classifiers that are known to perform well in this task. The experimental results indicate that our modified approach performs marginally better than the other algorithms.

38

Yastikli, N., and Z. Cetin. "CLASSIFICATION OF LiDAR DATA WITH POINT BASED CLASSIFICATION METHODS." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B3 (June 9, 2016): 441–45. http://dx.doi.org/10.5194/isprsarchives-xli-b3-441-2016.

39

Kim, Yeseul, Kyung-Do Lee, Sang-Il Na, Suk-Young Hong, No-Wook Park, and Hee Young Yoo. "MODIS Data-based Crop Classification using Selective Hierarchical Classification." Korean Journal of Remote Sensing 32, no. 3 (June 30, 2016): 235–44. http://dx.doi.org/10.7780/kjrs.2016.32.3.3.

40

张, 俊达. "Volume Data Classification Visualization Based on Probabilistic Classification Model." Computer Science and Application 09, no. 11 (2019): 1986–92. http://dx.doi.org/10.12677/csa.2019.911223.

41

Di Prinzio, M., A. Castellarin, and E. Toth. "Data-driven catchment classification: application to the PUB problem." Hydrology and Earth System Sciences 15, no. 6 (June 23, 2011): 1921–35. http://dx.doi.org/10.5194/hess-15-1921-2011.

Abstract:

A promising approach to catchment classification makes use of unsupervised neural networks (Self Organising Maps, SOMs), which organise input data through non-linear techniques depending on the intrinsic similarity of the data themselves. Our study considers ∼300 Italian catchments scattered nationwide, for which several descriptors of the streamflow regime and geomorphoclimatic characteristics are available. We compare a reference classification, identified by using indices of the streamflow regime as input to SOM, with four alternative classifications, which were identified on the basis of catchment descriptors that can be derived for ungauged basins. One alternative classification adopts the available catchment descriptors as input to SOM; the remaining classifications are identified by applying SOM to sets of derived variables obtained by applying Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA) to the available catchment descriptors. The comparison is performed relative to a PUB problem, that is, the prediction of several streamflow indices in ungauged basins. We perform an extensive cross-validation to quantify nationwide the accuracy of predictions of mean annual runoff, mean annual flood, and flood quantiles associated with given exceedance probabilities. Results of the study indicate that performing PCA and, in particular, CCA on the available set of catchment descriptors before applying SOM significantly improves the effectiveness of SOM classifications by reducing the uncertainty of hydrological predictions in ungauged sites.

42

Jooa, Jae Yun, and Seokho Lee. "Binary classification on compositional data." Communications for Statistical Applications and Methods 28, no. 1 (January 31, 2021): 89–97. http://dx.doi.org/10.29220/csam.2021.28.1.089.

43

Zheng, H., H. Y. Wang, N. D. Black, and R. J. Winder. "Data structures, coding and classification." Technology and Health Care 18, no. 1 (February 21, 2010): 71–87. http://dx.doi.org/10.3233/thc-2010-0568.

44

Sen, Prithviraj, Galileo Namata, Mustafa Bilgic, Lise Getoor, Brian Galligher, and Tina Eliassi-Rad. "Collective Classification in Network Data." AI Magazine 29, no. 3 (September 6, 2008): 93. http://dx.doi.org/10.1609/aimag.v29i3.2157.

Abstract:

Many real-world applications produce networked data such as the world-wide web (hypertext documents connected via hyperlinks), social networks (for example, people connected by friendship links), communication networks (computers connected via communication links) and biological networks (for example, protein interaction networks). A recent focus in machine learning research has been to extend traditional machine learning classification techniques to classify nodes in such networks. In this article, we provide a brief introduction to this area of research and how it has progressed during the past decade. We introduce four of the most widely used inference algorithms for classifying networked data and empirically compare them on both synthetic and real-world data.
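
As a rough illustration of classifying nodes from their neighbours' labels, the sketch below repeatedly assigns each unlabelled node the majority label among its already-labelled neighbours. This is a simplified stand-in chosen for illustration, not necessarily one of the four inference algorithms the article compares; the function name, the tie-breaking rule and the toy graph are assumptions.

```python
from collections import Counter


def iterative_classify(edges, known, max_iters=10):
    """Label unlabelled nodes by majority vote over labelled neighbours."""
    # Build an undirected adjacency map from the edge list.
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    labels = dict(known)  # observed labels are never overwritten
    for _ in range(max_iters):
        changed = False
        for node in sorted(neighbours):
            if node in known:
                continue
            # Count only neighbours that currently carry a label.
            votes = Counter(labels[m] for m in neighbours[node] if m in labels)
            if not votes:
                continue
            # Ties are broken by taking the smallest label in sorted order.
            best = max(sorted(votes), key=lambda lab: votes[lab])
            if labels.get(node) != best:
                labels[node] = best
                changed = True
        if not changed:
            break  # labels are stable, so stop early
    return labels


edges = [("a", "u"), ("b", "u"), ("c", "u"), ("c", "v"), ("d", "v")]
known = {"a": "x", "b": "x", "c": "y", "d": "y"}
result = iterative_classify(edges, known)
```

In this toy graph, node `u` has two `x` neighbours and one `y` neighbour, while node `v` has only `y` neighbours, so the two unlabelled nodes receive different labels.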

45

Molitor, Denali, and Deanna Needell. "Hierarchical Classification Using Binary Data." AI Magazine 40, no. 2 (June 24, 2019): 59–65. http://dx.doi.org/10.1609/aimag.v40i2.2846.

Abstract:

In classification problems, especially those that categorize data into a large number of classes, the classes often naturally follow a hierarchical structure. That is, some classes are likely to share similar structures and features. Those characteristics can be captured by considering a hierarchical relationship among the class labels. Motivated by a recent simple classification approach on binary data, we propose a variant that is tailored to efficient classification of hierarchical data. In certain settings, specifically when some classes are significantly easier to identify than others, we showcase computational and accuracy advantages.

46

Salama, M., A. Hasanen, and A. Fahmy. "Pattern-based Data-Classification Technique." International Conference on Electrical Engineering 7, no. 7 (May 1, 2010): 1–25. http://dx.doi.org/10.21608/iceeng.2010.33265.

47

Piech, Izabela, Tadeusz Żaba, and Aleksandra Jankowska. "DATA CLASSIFICATION BASED ON PHOTOGRAMMETRY." Geomatics, Landmanagement and Landscape 2 (2020): 93–110. http://dx.doi.org/10.15576/gll/2020.2.93.

48

Krishnaswamy, Anilesh K., Haoming Li, David Rein, Hanrui Zhang, and Vincent Conitzer. "Classification with Strategically Withheld Data." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 6 (May 18, 2021): 5514–22. http://dx.doi.org/10.1609/aaai.v35i6.16694.

Abstract:

Machine learning techniques can be useful in applications such as credit approval and college admission. However, to be classified more favorably in such contexts, an agent may decide to strategically withhold some of her features, such as bad test scores. This is a missing data problem with a twist: which data is missing depends on the chosen classifier, because the specific classifier is what may create the incentive to withhold certain feature values. We address the problem of training classifiers that are robust to this behavior. We design three classification methods: MINCUT, Hill-Climbing (HC) and Incentive-Compatible Logistic Regression (IC-LR). We show that MINCUT is optimal when the true distribution of data is fully known. However, it can produce complex decision boundaries, and hence be prone to overfitting in some cases. Based on a characterization of truthful classifiers (i.e., those that give no incentive to strategically hide features), we devise a simpler alternative called HC which consists of a hierarchical ensemble of out-of-the-box classifiers, trained using a specialized hill-climbing procedure which we show to be convergent. For several reasons, MINCUT and HC are not effective in utilizing a large number of complementarily informative features. To this end, we present IC-LR, a modification of Logistic Regression that removes the incentive to strategically drop features. We also show that our algorithms perform well in experiments on real-world data sets, and present insights into their relative performance in different settings.
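
The notion of a truthful classifier, one that gives no incentive to hide features, can be made concrete with a small brute-force check. The sketch below is an illustration under stated assumptions, not the paper's MINCUT, HC or IC-LR methods: withheld features are represented as `None`, a higher output is assumed to be more favourable to the agent, and the two toy classifiers (`naive`, `strict`) and the pass mark of 50 are hypothetical.

```python
from itertools import combinations


def withheld_variants(x):
    """Yield every version of x with at least one revealed feature hidden."""
    revealed = [i for i, v in enumerate(x) if v is not None]
    for k in range(1, len(revealed) + 1):
        for hidden in combinations(revealed, k):
            yield tuple(None if i in hidden else v for i, v in enumerate(x))


def is_truthful(classifier, samples):
    """Return True if hiding features never improves any sample's outcome."""
    for x in samples:
        honest = classifier(x)
        if any(classifier(z) > honest for z in withheld_variants(x)):
            return False
    return True


def naive(x):
    # Ignores missing scores: accepts if no revealed score is below 50.
    return int(all(v is None or v >= 50 for v in x))


def strict(x):
    # Treats a missing score as a failing one.
    return int(all(v is not None and v >= 50 for v in x))


samples = [(80, 30), (60, 70)]
naive_ok = is_truthful(naive, samples)
strict_ok = is_truthful(strict, samples)
```

With `naive`, the applicant with scores `(80, 30)` is rejected when honest but accepted after hiding the 30, so the check fails. With `strict`, hiding a score can only hurt, so the check passes on these samples.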

49

Patil, Adwait. "Covid Classification Using Audio Data." International Journal for Research in Applied Science and Engineering Technology 9, no. 10 (October 31, 2021): 1633–37. http://dx.doi.org/10.22214/ijraset.2021.38675.

Abstract:

The coronavirus outbreak has affected the entire world adversely. This project was developed to help the general public assess their chances of being COVID-positive using just a coughing sound and basic patient data. Audio classification is one of the most interesting applications of deep learning. Like image data, audio data is stored in the form of bits, and to understand and analyze this audio data we used Mel frequency cepstral coefficients (MFCCs), which make it possible to feed the audio to our neural network. In this project we used Coughvid, a crowdsourced dataset consisting of 27000 audio files and metadata for the same number of patients. We used a 1D Convolutional Neural Network (CNN) to process the audio and metadata. The future scope for this project is a model that rates how likely it is that a person is infected, instead of performing binary classification. Keywords: audio classification, Mel frequency cepstral coefficients, convolutional neural network, deep learning, Coughvid

50

Lis, Kamila, Mateusz Koryciński, and Konrad A. Ciecierski. "Classification of masked image data." PLOS ONE 16, no. 7 (July 6, 2021): e0254181. http://dx.doi.org/10.1371/journal.pone.0254181.

Abstract:

Data classification is one of the most commonly used applications of machine learning. There are many well-developed algorithms that perform this task excellently in various environments and for different data distributions. Classification algorithms, just like other machine learning algorithms, have one thing in common: in order to operate on data, they must see the data. In the present world, where concerns about privacy, GDPR (General Data Protection Regulation), business confidentiality and security keep growing, this requirement to work directly on the original data might become, in some situations, a burden. In this paper, an approach to the classification of images that cannot be directly accessed during training is presented. It has been shown that one can train a deep neural network to create such a representation of the original data that i) without additional information, the original data cannot be restored, and ii) this representation, called a masked form, can still be used for classification purposes. Moreover, it has been shown that classification of the masked data can be done using both classical and neural network-based classifiers.
