Journal articles on the topic 'Textual data-mining'

Consult the top 50 journal articles for your research on the topic 'Textual data-mining.'

1

Yasuda, Akio. "Reviewing 'Text Mining': Textual Data Mining." IEEJ Transactions on Electronics, Information and Systems 125, no. 5 (2005): 682–89. http://dx.doi.org/10.1541/ieejeiss.125.682.

2

Raiyani, Ronak S., Bankim Radadiya, and Satish Thumar. "Analyzing, Developing and Implementing Data Mining Techniques on Databases, Web Contents and Textual Data." Paripex - Indian Journal Of Research 2, no. 3 (January 15, 2012): 48–50. http://dx.doi.org/10.15373/22501991/mar2013/18.

3

Yassir, Ali Hameed, Ali A. Mohammed, Adel Abdul-Jabbar Alkhazraji, Mustafa Emad Hameed, Mohammed Saad Talib, and Mohanad Faeq Ali. "Sentimental classification analysis of polarity multi-view textual data using data mining techniques." International Journal of Electrical and Computer Engineering (IJECE) 10, no. 5 (October 1, 2020): 5526. http://dx.doi.org/10.11591/ijece.v10i5.pp5526-5534.

Abstract:
The data and information available in most community environments are complex in nature. Sentimental data resources may consist of textual data collected from multiple information sources with different representations and usually handled by different analytical models. These data resource characteristics can form multi-view polarity textual data. However, knowledge creation from this type of sentimental textual data requires considerable analytical effort and capability. In particular, data mining practices can provide exceptional results in handling textual data formats. Moreover, when the textual data exist in multi-view or unstructured formats, hybrid and integrated analysis with text data mining algorithms is vital to obtaining helpful results. The objective of this research is to enhance knowledge discovery from sentimental multi-view textual data, which can be considered an unstructured data format, in order to classify polarity information documents into two different categories of useful information. This paper discusses a proposed framework with integrated data mining algorithms, which applies the X-means algorithm for clustering and the HotSpot algorithm for association rules. The analysis results show improved accuracy in classifying the sentimental multi-view textual data into two categories when the proposed framework is applied to an online polarity user-review dataset on given topics.
4

Jayasudha, J., and A. Christina Esther. "Mining Sequential Pattern of Data in Textual Document Using Data Mining Classification Technique." Asian Journal of Computer Science and Technology 8, S1 (February 5, 2019): 41–45. http://dx.doi.org/10.51983/ajcst-2019.8.s1.1961.

Abstract:
Text documents are transmitted over the internet for text communication. This causes many problems, such as repeated text, because the same data are provided on the internet. Characterizing and extracting such data is a critical task for researchers. Many researchers have characterized such data and applied the results in real-life scenarios, such as real-time monitoring of abnormal user behaviors. However, detecting and characterizing the personalized behavior of a user has some drawbacks. To solve this problem, this paper analyzes the sequential data and characterizes user behavior with the help of a data mining sequential pattern matching algorithm.
5

Eltaher, Mohammed, and Jeongkyu Lee. "Social User Mining." International Journal of Multimedia Data Engineering and Management 4, no. 4 (October 2013): 58–70. http://dx.doi.org/10.4018/ijmdem.2013100104.

Abstract:
In recent years, the pervasive use of social media has generated huge amounts of data that are starting to gain a lot of attention. Each social media source utilizes different data types, such as textual and visual. For example, Twitter is for short text messages, Flickr is for images and videos, and Facebook allows all of these data types. It is highly desirable to find patterns of social media users from such different data formats. With the use of data mining techniques, social media data open up many opportunities for researchers. Despite its short history, social media mining has become a very active research area. This paper provides a comprehensive survey of recent research on social user mining. In particular, the survey focuses on two aspects: (1) social user mining based on data types, such as textual, visual, and both textual and visual information, and (2) social user mining based on mining techniques. In addition, we present our current research on social user mining as well as its future directions.
6

Davahli, Mohammad Reza, Waldemar Karwowski, Edgar Gutierrez, Krzysztof Fiok, Grzegorz Wróbel, Redha Taiar, and Tareq Ahram. "Identification and Prediction of Human Behavior through Mining of Unstructured Textual Data." Symmetry 12, no. 11 (November 19, 2020): 1902. http://dx.doi.org/10.3390/sym12111902.

Abstract:
The identification of human behavior can provide useful information across multiple job spectra. Recent advances in applying data-based approaches to social sciences have increased the feasibility of modeling human behavior. In particular, studying human behavior by analyzing unstructured textual data has recently received considerable attention because of the abundance of textual data. The main objective of the present study was to discuss the primary methods for identifying and predicting human behavior through the mining of unstructured textual data. Of the 823 articles analyzed, 87 met the predefined inclusion criteria and were included in the literature review. Our results show that the included articles could be symmetrically classified into two groups. The first group of articles attempted to identify the leading indicators of human behavior in unstructured textual data. In this group, the data-based approaches had three main components: (1) collecting self-reported survey data, (2) collecting data from social media and extracting data features, and (3) applying correlation analysis to evaluate the relationship between two sets of data. In contrast, the second group focused on the accuracy of data-based approaches for predicting human behavior. In this group, the data-based approaches could be categorized into (1) approaches based on labeled unstructured textual data and (2) approaches based on unlabeled unstructured textual data. The review provides a comprehensive insight into unstructured textual data mining to identify and predict human behavior and personality traits.
7

Holzman, Lars E., Todd A. Fisher, Leon M. Galitsky, April Kontostathis, and William M. Pottenger. "A Software Infrastructure for Research in Textual Data Mining." International Journal on Artificial Intelligence Tools 13, no. 04 (December 2004): 829–49. http://dx.doi.org/10.1142/s0218213004001843.

Abstract:
Few tools exist that address the challenges facing researchers in the Textual Data Mining (TDM) field. Some are too specific to their application, or are prototypes not suitable for general use. More general tools often are not capable of processing large volumes of data. We have created a Textual Data Mining Infrastructure (TMI) that incorporates both existing and new capabilities in a reusable framework conducive to developing new tools and components. TMI adheres to strict guidelines that allow it to run in a wide range of processing environments – as a result, it accommodates the volume of computing and diversity of research occurring in TDM. A unique capability of TMI is support for optimization. This facilitates text mining research by automating the search for optimal parameters in text mining algorithms. In this article we describe a number of applications that use the TMI. A brief tutorial is provided on the use of TMI. We present several novel results that have not been published elsewhere. We also discuss how the TMI utilizes existing machine-learning libraries, thereby enabling researchers to continue and extend their endeavors with minimal effort. Towards that end, TMI is available on the web at .
8

Chen, Pei Bin, Lan Hu, Hui Yang, Xiang Feng Xue, Chuan Xu Liu, and Xin Jian Li. "Target Value Analysis Based on Data Mining Technology." Applied Mechanics and Materials 602-605 (August 2014): 3096–99. http://dx.doi.org/10.4028/www.scientific.net/amm.602-605.3096.

Abstract:
In this paper, data mining technology and the mining process are explained, and several common methods of data mining are described. Based on the characteristics of the target value, the application of text classification and textual association in target value mining is discussed, and a process model of data mining concerning target value is also presented.
9

Ur-Rahman, Nadeem. "Textual Data Mining For Knowledge Discovery and Data Classification: A Comparative Study." European Scientific Journal, ESJ 13, no. 21 (July 31, 2017): 429. http://dx.doi.org/10.19044/esj.2017.v13n21p429.

Abstract:
Business Intelligence solutions are key to enable industrial organisations (either manufacturing or construction) to remain competitive in the market. These solutions are achieved through analysis of data which is collected, retrieved and re-used for prediction and classification purposes. However many sources of industrial data are not being fully utilised to improve the business processes of the associated industry. It is generally left to the decision makers or managers within a company to take effective decisions based on the information available throughout product design and manufacture or from the operation of business or production processes. Substantial efforts and energy are required in terms of time and money to identify and exploit the appropriate information that is available from the data. Data Mining techniques have long been applied mainly to numerical forms of data available from various data sources but their applications to analyse semi-structured or unstructured databases are still limited to a few specific domains. The applications of these techniques in combination with Text Mining methods based on statistical, natural language processing and visualisation techniques could give beneficial results. Text Mining methods mainly deal with document clustering, text summarisation and classification and mainly rely on methods and techniques available in the area of Information Retrieval (IR). These help to uncover the hidden information in text documents at an initial level. This paper investigates applications of Text Mining in terms of Textual Data Mining (TDM) methods which share techniques from IR and data mining. These techniques may be implemented to analyse textual databases in general but they are demonstrated here using examples of Post Project Reviews (PPR) from the construction industry as a case study. The research is focused on finding key single or multiple term phrases for classifying the documents into two classes i.e. good information and bad information documents to help decision makers or project managers to identify key issues discussed in PPRs which can be used as a guide for future project management process.
10

Alguliev, Rasim M., Ramiz M. Aliguliyev, and Saadat A. Nazirova. "Classification of Textual E-Mail Spam Using Data Mining Techniques." Applied Computational Intelligence and Soft Computing 2011 (2011): 1–8. http://dx.doi.org/10.1155/2011/416308.

Abstract:
A new method for clustering of spam messages collected in bases of antispam system is offered. The genetic algorithm is developed for solving clustering problems. The objective function is a maximization of similarity between messages in clusters, which is defined by k-nearest neighbor algorithm. Application of genetic algorithm for solving constrained problems faces the problem of constant support of chromosomes which reduces convergence process. Therefore, for acceleration of convergence of genetic algorithm, a penalty function that prevents occurrence of infeasible chromosomes at ranging of values of function of fitness is used. After classification, knowledge extraction is applied in order to get information about classes. Multidocument summarization method is used to get the information portrait of each cluster of spam messages. Classifying and parametrizing spam templates, it will be also possible to define the thematic dependence from geographical dependence (e.g., what subjects prevail in spam messages sent from certain countries). Thus, the offered system will be capable to reveal purposeful information attacks if those occur. Analyzing origins of the spam messages from collection, it is possible to define and solve the organized social networks of spammers.
11

Zia, Amjad, Muzzamil Aziz, Ioana Popa, Sabih Ahmed Khan, Amirreza Fazely Hamedani, and Abdul R. Asif. "Artificial Intelligence-Based Medical Data Mining." Journal of Personalized Medicine 12, no. 9 (August 24, 2022): 1359. http://dx.doi.org/10.3390/jpm12091359.

Abstract:
Understanding published unstructured textual data using traditional text mining approaches and tools is becoming a challenging issue due to the rapid increase in electronic open-source publications. The application of data mining techniques in the medical sciences is an emerging trend; however, traditional text-mining approaches are insufficient to cope with the current upsurge in the volume of published data. Therefore, artificial intelligence-based text mining tools are being developed and used to process large volumes of data and to explore the hidden features and correlations in the data. This review provides a clear-cut and insightful understanding of how artificial intelligence-based data-mining technology is being used to analyze medical data. We also describe a standard process of data mining based on CRISP-DM (Cross-Industry Standard Process for Data Mining) and the most common tools/libraries available for each step of medical data mining.
12

Hassani, Hossein, Christina Beneki, Stephan Unger, Maedeh Taj Mazinani, and Mohammad Reza Yeganegi. "Text Mining in Big Data Analytics." Big Data and Cognitive Computing 4, no. 1 (January 16, 2020): 1. http://dx.doi.org/10.3390/bdcc4010001.

Abstract:
Text mining in big data analytics is emerging as a powerful tool for harnessing the power of unstructured textual data by analyzing it to extract new knowledge and to identify significant patterns and correlations hidden in the data. This study seeks to determine the state of text mining research by examining the developments within published literature over past years and provide valuable insights for practitioners and researchers on the predominant trends, methods, and applications of text mining research. In accordance with this, more than 200 academic journal articles on the subject are included and discussed in this review; the state-of-the-art text mining approaches and techniques used for analyzing transcripts and speeches, meeting transcripts, and academic journal articles, as well as websites, emails, blogs, and social media platforms, across a broad range of application areas are also investigated. Additionally, the benefits and challenges related to text mining are also briefly outlined.
13

Kim, Yoon-Sung, Hae-Chang Rim, and Do-Gil Lee. "Business environmental analysis for textual data using data mining and sentence-level classification." Industrial Management & Data Systems 119, no. 1 (February 4, 2019): 69–88. http://dx.doi.org/10.1108/imds-07-2017-0317.

Abstract:
Purpose: The purpose of this paper is to propose a methodology to analyze a large amount of unstructured textual data into categories of business environmental analysis frameworks. Design/methodology/approach: This paper uses machine learning to classify a vast amount of unstructured textual data by category of business environmental analysis framework. Generally, it is difficult to produce high quality and massive training data for machine-learning-based system in terms of cost. Semi-supervised learning techniques are used to improve the classification performance. Additionally, the lack of feature problem that traditional classification systems have suffered is resolved by applying semantic features by utilizing word embedding, a new technique in text mining. Findings: The proposed methodology can be used for various business environmental analyses and the system is fully automated in both the training and classifying phases. Semi-supervised learning can solve the problems with insufficient training data. The proposed semantic features can be helpful for improving traditional classification systems. Research limitations/implications: This paper focuses on classifying sentences that contain the information of business environmental analysis in large amount of documents. However, the proposed methodology has a limitation on the advanced analyses which can directly help managers establish strategies, since it does not summarize the environmental variables that are implied in the classified sentences. Using the advanced summarization and recommendation techniques could extract the environmental variables among the sentences, and they can assist managers to establish effective strategies. Originality/value: The feature selection technique developed in this paper has not been used in traditional systems for business and industry, so that the whole process can be fully automated. It also demonstrates practicality so that it can be applied to various business environmental analysis frameworks. In addition, the system is more economical than traditional systems because of semi-supervised learning, and can resolve the lack of feature problem that traditional systems suffer. This work is valuable for analyzing environmental factors and establishing strategies for companies.
14

Karthica, M., and K. Meenakshi Sundaram. "A Comparative Analysis of Text Mining Techniques and Algorithms." International Journal for Modern Trends in Science and Technology 9, no. 01 (January 25, 2023): 54–61. http://dx.doi.org/10.46501/ijmtst0901010.

Abstract:
With abundant technological progress and its colossal consumption, a gigantic quantity of unstructured text data is produced digitally. This type of data contains rich information as well as knowledge. Therefore, in order to extract such knowledge from unstructured text data, a data expert needs to apply mining techniques to textual data. Text mining is the procedure of extracting hidden, previously unidentified, and considerably useful information from unstructured textual data. Web browsers have become a significant implement for making information available at our fingertips. The World Wide Web has become filled with information, and it has become tough to retrieve data according to what is required. Text mining is a subdivision of web mining. This paper deals with a study of different techniques, patterns of content text mining, and the areas that have been influenced by content mining. The web contains efficient, unstructured, partially prearranged, and multimedia data. This paper focuses on text mining techniques and algorithms that help to retrieve information from huge data collections using a content-based method.
15

Kostoff, Ronald N., and Eliezer Geisler. "Strategic Management and Implementation of Textual Data Mining in Government Organizations." Technology Analysis & Strategic Management 11, no. 4 (December 1999): 493–525. http://dx.doi.org/10.1080/095373299107302.

16

Rajput, D., R. Thakur, and G. Thakur. "Karnaugh Map Approach for Mining Frequent Termset from Uncertain Textual Data." British Journal of Mathematics & Computer Science 4, no. 3 (January 10, 2014): 333–46. http://dx.doi.org/10.9734/bjmcs/2014/6023.

17

Pancerz, Krzysztof, and Olga Mich. "Numerical Data Clustering Algorithms in Mining Real Estate Listings." Barometr Regionalny. Analizy i Prognozy 12, no. 3 (January 9, 2015): 43–50. http://dx.doi.org/10.56583/br.1035.

Abstract:
In the paper, we propose a method for mining real-estate listings using clustering algorithms intended for numerical data. The presented approach is based on information systems over ontological graphs. Such information systems have been proposed to deal with data in the form of concepts linked by different semantic relations. Special attention is given to the preprocessing steps that transform advertisements in textual form into information systems defined over ontological graphs, as well as to encoding attribute values for clustering algorithms.
18

Kobayashi, Vladimer B., Stefan T. Mol, Hannah A. Berkers, Gábor Kismihók, and Deanne N. Den Hartog. "Text Mining in Organizational Research." Organizational Research Methods 21, no. 3 (August 10, 2017): 733–65. http://dx.doi.org/10.1177/1094428117722619.

Abstract:
Despite the ubiquity of textual data, so far few researchers have applied text mining to answer organizational research questions. Text mining, which essentially entails a quantitative approach to the analysis of (usually) voluminous textual data, helps accelerate knowledge discovery by radically increasing the amount of data that can be analyzed. This article aims to acquaint organizational researchers with the fundamental logic underpinning text mining, the analytical stages involved, and contemporary techniques that may be used to achieve different types of objectives. The specific analytical techniques reviewed are (a) dimensionality reduction, (b) distance and similarity computing, (c) clustering, (d) topic modeling, and (e) classification. We describe how text mining may extend contemporary organizational research by allowing the testing of existing or new research questions with data that are likely to be rich, contextualized, and ecologically valid. After an exploration of how evidence for the validity of text mining output may be generated, we conclude the article by illustrating the text mining process in a job analysis setting using a dataset composed of job vacancies.
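As a rough illustration of item (b) in the abstract above, distance and similarity computing, the following Python sketch computes the cosine similarity between two texts represented as raw term-count vectors. It is only one common similarity measure and is not taken from the article; the function name and the whitespace tokenization are assumptions made for brevity.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between two texts using raw term counts.

    Tokenization is a plain lowercase whitespace split, which is an
    assumption for illustration; real pipelines usually do more.
    """
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())

    # Dot product over the terms the two texts share.
    shared_terms = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared_terms)

    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))

    # An empty text has no direction, so treat its similarity as zero.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity("text mining", "data mining"))
    print(cosine_similarity("text mining", "job vacancies"))
```

Texts with no terms in common score 0.0, and texts with identical term counts score approximately 1.0, up to floating-point rounding.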
19

Gutierrez, Edgar, Waldemar Karwowski, Krzysztof Fiok, Mohammad Reza Davahli, Tameika Liciaga, and Tareq Ahram. "Analysis of Human Behavior by Mining Textual Data: Current Research Topics and Analytical Techniques." Symmetry 13, no. 7 (July 16, 2021): 1276. http://dx.doi.org/10.3390/sym13071276.

Abstract:
The goal of this study was to conduct a literature review of current approaches and techniques for identifying, understanding, and predicting human behaviors through mining a variety of sources of textual data with a focus on enabling classification of psychological behaviors regarding emotion, cognition, and social empathy. This review was performed using keyword searches in ISI Web of Science, Engineering Village Compendex, ProQuest Dissertations, and Google Scholar. Our findings show that, despite recent advancements in predicting human behaviors based on unstructured textual data, significant developments in data analytics systems for identification, determination of interrelationships, and prediction of human cognitive, emotional and social behaviors remain lacking.
20

Abid, Amal, Salma Jamoussi, and Abdelmajid Ben Hamadou. "AIS-Clus: A Bio-Inspired Method for Textual Data Stream Clustering." Vietnam Journal of Computer Science 06, no. 02 (May 2019): 223–56. http://dx.doi.org/10.1142/s2196888819500143.

Abstract:
The spread of real-time applications has led to a huge amount of data shared between users. This vast volume of data rapidly evolving over time is referred to as a data stream. Clustering and processing such data pose many challenges to the data mining community. Indeed, traditional data mining techniques become unfeasible for mining such a continuous flow of data, where characteristics, features, and concepts are rapidly changing over time. This paper presents a novel method for data stream clustering. In this context, major challenges of data stream processing are addressed, namely, infinite length, concept drift, novelty detection, and feature evolution. To handle these issues, the proposed method uses the Artificial Immune System (AIS) meta-heuristic. The latter has been widely used for data mining tasks and has the property of adaptability required by data stream clustering algorithms. Our method, called AIS-Clus, is able to detect novel concepts using the performance of the learning process of the AIS meta-heuristic. Furthermore, AIS-Clus has the ability to adapt its model to handle concept drift and feature evolution for textual data streams. Experiments have been performed on textual datasets, and efficient and promising results were obtained.
21

Yudiarta, Nyoman Gede, Made Sudarma, and Wayan Gede Ariastina. "Penerapan Metode Clustering Text Mining Untuk Pengelompokan Berita Pada Unstructured Textual Data." Majalah Ilmiah Teknologi Elektro 17, no. 3 (December 5, 2018): 339. http://dx.doi.org/10.24843/mite.2018.v17i03.p06.

Abstract:
Good governance is governance whose programs are known to and benefit the public. Within the Bali Provincial Government, the body tasked with disseminating information is the Public Relations Bureau of the Bali Provincial Secretariat (Biro Humas Setda Provinsi Bali), through the media it owns. Because no category is attached when news is entered into the media, in this case the Biro Humas website, it is difficult to determine which news items belong to a given category. Clustering is a method for addressing this problem. One of the algorithms used in clustering is the K-Means algorithm. This research focuses on a design for grouping news data into categories using K-Means. To make the clustering process easier, the documents obtained are first preprocessed. Document preprocessing consists of case folding, tokenization, filtering, and stemming. TF-IDF is used to weight the terms obtained in document preprocessing. Trials with different amounts of data, namely 50, 100, 200, 300, 400, and 500 items, showed that the K-Means algorithm applied to cluster the news works and gives satisfactory accuracy, with an average precision of 73.11%, recall of 69.65%, and purity of 0.80 across all test data. Comparing the individual test sets, the test on 50 items had the highest average precision and recall, at 76.92% and 79.58% respectively, while the highest purity, 0.83, was obtained in the test on 300 items.
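As a rough illustration of the steps named in this abstract (case folding, tokenization, filtering, Tf-Idf weighting, and purity as an evaluation measure), the following Python sketch may help. It is not the authors' implementation: the stop-word list, the function names, and the tf * log(N / df) weighting variant are assumptions chosen for brevity, and stemming and the K-Means step itself are omitted.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"the", "a", "of", "in", "and", "to"}


def preprocess(text):
    """Case folding, tokenization, and stop-word filtering (no stemming)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


def purity(cluster_ids, true_labels):
    """Share of items that carry the majority true label of their cluster."""
    by_cluster = {}
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(c.values()) for c in by_cluster.values())
    return majority_total / len(true_labels)


if __name__ == "__main__":
    docs = ["The governor of Bali", "Bali news"]
    print(tf_idf(docs))
    # Cluster 0 is pure; cluster 1 is split between two labels.
    print(purity([0, 0, 1, 1], ["a", "a", "b", "a"]))  # prints 0.75
```

With this weighting, a term that occurs in every document receives weight 0, which is why very common words contribute nothing to the distance computations a clustering step would rely on.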
22

Li, Shenzhi, Tianhao Wu, and William M. Pottenger. "Distributed higher order association rule mining using information extracted from textual data." ACM SIGKDD Explorations Newsletter 7, no. 1 (June 2005): 26–35. http://dx.doi.org/10.1145/1089815.1089820.

23

Menon, Rakesh, Loh Han Tong, and S. Sathiyakeerthi. "Analyzing textual databases using data mining to enable fast product development processes." Reliability Engineering & System Safety 88, no. 2 (May 2005): 171–80. http://dx.doi.org/10.1016/j.ress.2004.07.007.

24

Iqbal, Farkhund, Hamad Binsalleeh, Benjamin C. M. Fung, and Mourad Debbabi. "A unified data mining solution for authorship analysis in anonymous textual communications." Information Sciences 231 (May 2013): 98–112. http://dx.doi.org/10.1016/j.ins.2011.03.006.

25

Roche, Mathieu. "COVID-19 and Media datasets: Period- and location-specific textual data mining." Data in Brief 33 (December 2020): 106356. http://dx.doi.org/10.1016/j.dib.2020.106356.

26

Radinsky, K., S. Davidovich, and S. Markovitch. "Learning to Predict from Textual Data." Journal of Artificial Intelligence Research 45 (December 26, 2012): 641–84. http://dx.doi.org/10.1613/jair.3865.

Abstract:
Given a current news event, we tackle the problem of generating plausible predictions of future events it might cause. We present a new methodology for modeling and predicting such future news events using machine learning and data mining techniques. Our Pundit algorithm generalizes examples of causality pairs to infer a causality predictor. To obtain precisely labeled causality examples, we mine 150 years of news articles and apply semantic natural language modeling techniques to headlines containing certain predefined causality patterns. For generalization, the model uses a vast number of world knowledge ontologies. Empirical evaluation on real news articles shows that our Pundit algorithm performs as well as non-expert humans.
27

Islam, Mohammad Rabiul, Imad Fakhri Al-Shaikhli, Rizal Bin Mohd Nor, and Vijayakumar Varadarajan. "Technical Approach in Text Mining for Stock Market Prediction: A Systematic Review." Indonesian Journal of Electrical Engineering and Computer Science 10, no. 2 (May 1, 2018): 770. http://dx.doi.org/10.11591/ijeecs.v10.i2.pp770-777.

Abstract:
Text mining methods and techniques have opened up the mining task across the information retrieval discipline in the field of soft computing techniques. Finding meaningful information in the vast amount of electronic textual data has become a humongous task for trading decisions. This empirical research examines the role of text mining in financial text analysis, where stock predictive models need to improve based on the rank search method. The review in this paper basically focuses on text mining techniques, methods, and principal component analysis, which help reduce dimensionality within the characteristics and optimal features. Moreover, the most sophisticated soft-computing methods and techniques are reviewed in terms of analysis, comparison, and evaluation of their performance on electronic textual data. Given the significance of the research, this empirical study also highlights the limitations of different strategies and methods with respect to specific aspects of the theoretical framework for enhancing performance.
28

Chen, Yixin, Wen Wang, Wenbo He, and Xiaofeng Li. "An Empirical Study of the Textual Content of Online Videos." International Journal of Semantic Computing 10, no. 03 (September 2016): 323–46. http://dx.doi.org/10.1142/s1793351x16400122.

Abstract:
Fuelled by the advancement in multimedia technologies, users across the world have witnessed the proliferation of online videos. Compared with the visual content of these videos, the textual content, for example, titles, tags, or descriptions, has been more broadly exploited in real-world video data mining and information retrieval tasks. To enhance the understanding of videos and improve performance on tasks such as automatic video annotation, video clustering, and cross-modal tag cleansing, the textual and visual content of videos are combined through various methods. However, without an empirical study of the properties of these contents, such methods lack a solid basis for achieving satisfactory performance. Therefore, in this paper, we conduct this study to verify the properties of textual content and draw insights from these analyses to promote further developments in video data mining that combine the two contents.
29

Gul, Sumeer, Shohar Bano, and Taseen Shah. "Exploring data mining: facets and emerging trends." Digital Library Perspectives 37, no. 4 (October 20, 2021): 429–48. http://dx.doi.org/10.1108/dlp-08-2020-0078.

Abstract:
Purpose: Data mining, along with its varied technologies like numerical mining, textual mining, multimedia mining, web mining, sentiment analysis and big data mining, proves itself as an emerging field and manifests itself in the form of different techniques such as information mining; big data mining; big data mining and Internet of Things (IoT); and educational data mining. This paper aims to discuss how these technologies and techniques are used to derive information and, eventually, knowledge from data.
Design/methodology/approach: An extensive review of literature on data mining and its allied techniques was carried out to ascertain the emerging procedures and techniques in the domain of data mining. Clarivate Analytics' Web of Science and Sciverse Scopus were explored to discover the extent of literature published on data mining and its varied facets. Literature was searched against various keywords such as data mining; information mining; big data; big data and IoT; and educational data mining. Further, the works citing the literature on data mining were also explored to visualize a broad gamut of emerging techniques in this growing field.
Findings: The study validates that knowledge discovery in databases has rendered data mining an emerging field; the data present in these databases paves the way for data mining techniques and analytics. This paper provides a unique view of the usage of data and the logical patterns derived from it, and of how new procedures, algorithms and mining techniques are being continuously upgraded for their multipurpose use for the betterment of human life and experiences.
Practical implications: The paper highlights different aspects of data mining, its different technological approaches, and how these emerging data technologies are used to derive logical insights from data and make data more meaningful.
Originality/value: The paper tries to highlight the current trends and facets of data mining.
30

Cabrio, Elena, Julien Cojan, Alessio Palmero Aprosio, and Fabien Gandon. "Natural language interaction with the web of data by mining its textual side." Intelligenza Artificiale 6, no. 2 (2012): 121–33. http://dx.doi.org/10.3233/ia-120034.

31

Oukid, Lamia, Omar Boussaid, Nadjia Benblidia, and Fadila Bentayeb. "TLabel." International Journal of Data Warehousing and Mining 12, no. 4 (October 2016): 54–74. http://dx.doi.org/10.4018/ijdwm.2016100103.

Abstract:
Data Warehousing technologies and On-Line Analytical Processing (OLAP) feature a wide range of techniques for the analysis of structured data. However, these techniques are inadequate when it comes to analyzing textual data. Indeed, classical aggregation operators have earned their spurs in the online analysis of numerical data, but are unsuitable for the analysis of textual data. To alleviate this shortcoming, on-line analytical processing in text cubes requires new analysis operators adapted to textual data. In this paper, the authors propose a new aggregation operator named Text Label (TLabel), based on text categorization. Their operator aggregates textual data in several classes of documents. Each class is associated with a label that represents the semantic content of the textual data of the class. TLabel is founded on a tailoring of text mining techniques to OLAP. To validate their operator, the authors perform an experimental study and the preliminary results show the interest of their approach for Text OLAP.
32

Cook, Diane J., Nitish Manocha, and Lawrence B. Holder. "Using a Graph-Based Data Mining System to Perform Web Search." International Journal of Pattern Recognition and Artificial Intelligence 17, no. 05 (August 2003): 705–20. http://dx.doi.org/10.1142/s0218001403002617.

Abstract:
The World Wide Web provides an immense source of information. Accessing information of interest presents a challenge to scientists and analysts, particularly if the desired information is structural in nature. Our goal is to design a structural search engine that uses the hyperlink structure of the Web, in addition to textual information, to search for sites of interest. Our structural search engine, called WebSUBDUE, searches not only for particular words or topics but also for a desired hyperlink structure. Enhanced by WordNet text functions, our search engine retrieves sites corresponding to structures formed by graph-based user queries. We hypothesize that this system can form the heart of a structural query engine, and demonstrate the approach on a number of structural web queries.
33

Bognár, Eszter Katalin. "Novel IT Technologies on the Digital Battlefield: The Application of Big Data and Data Mining Technologies." Hadmérnök 15, no. 4 (2020): 141–58. http://dx.doi.org/10.32567/hm.2020.4.10.

Abstract:
In modern warfare, the most important innovation to date has been the utilisation of information as a weapon. The basis of successful military operations is the ability to correctly assess a situation based on credible collected information. In today’s military, the primary challenge is not the actual collection of data. It has become more important to extract relevant information from that data. This requirement cannot be successfully completed without necessary improvements in tools and techniques to support the acquisition and analysis of data. This study defines Big Data and its concept as applied to military reconnaissance, focusing on the processing of imagery and textual data, bringing to light modern data processing and analytics methods that enable effective processing.
34

Nielbo, Kristoffer L., Ryan Nichols, and Edward Slingerland. "Mining the Past – Data-Intensive Knowledge Discovery in the Study of Historical Textual Traditions." Journal of Cognitive Historiography 3, no. 1-2 (August 7, 2017): 93–118. http://dx.doi.org/10.1558/jch.31662.

35

Zhecheva, Denitsa, and Nayden NENKOV. "Business demands for processing unstructured textual data – text mining techniques for companies to implement." Access Journal - Access to Science, Business, Innovation in the digital economy 3, no. 2 (April 17, 2022): 107–20. http://dx.doi.org/10.46656/access.2022.3.2(2).

36

Liem, David A., Sanjana Murali, Dibakar Sigdel, Yu Shi, Xuan Wang, Jiaming Shen, Howard Choi, et al. "Phrase mining of textual data to analyze extracellular matrix protein patterns across cardiovascular disease." American Journal of Physiology-Heart and Circulatory Physiology 315, no. 4 (October 1, 2018): H910–H924. http://dx.doi.org/10.1152/ajpheart.00175.2018.

Abstract:
Extracellular matrix (ECM) proteins have been shown to play important roles regulating multiple biological processes in an array of organ systems, including the cardiovascular system. Using a novel bioinformatics text-mining tool, we studied six categories of cardiovascular disease (CVD), namely, ischemic heart disease, cardiomyopathies, cerebrovascular accident, congenital heart disease, arrhythmias, and valve disease, anticipating novel ECM protein-disease and protein-protein relationships hidden within vast quantities of textual data. We conducted a phrase-mining analysis, delineating the relationships of 709 ECM proteins with the 6 groups of CVDs reported in 1,099,254 abstracts. The technology pipeline known as Context-Aware Semantic Online Analytical Processing was applied to semantically rank the association of proteins to each CVD and all six CVDs, performing analyses to quantify each protein-disease relationship. We performed principal component analysis and hierarchical clustering of the data, where each protein was visualized as a six-dimensional vector. We found that ECM proteins display variable degrees of association with the six CVDs; certain CVDs share groups of associated proteins, whereas others have divergent protein associations. We identified 82 ECM proteins sharing associations with all 6 CVDs. Our bioinformatics analysis ascribed distinct ECM pathways (via Reactome) from this subset of proteins, namely, insulin-like growth factor regulation and interleukin-4 and interleukin-13 signaling, suggesting their contribution to the pathogenesis of all six CVDs. Finally, we performed hierarchical clustering analysis and identified protein clusters predominantly associated with a targeted CVD; analyses of these proteins revealed unexpected insights underlying the key ECM-related molecular pathogenesis of each CVD, including virus assembly and release in arrhythmias. 
NEW & NOTEWORTHY The present study is the first application of a text-mining algorithm to characterize the relationships of 709 extracellular matrix-related proteins with 6 categories of cardiovascular disease described in 1,099,254 abstracts. Our analysis informed unexpected extracellular matrix functions, pathways, and molecular relationships implicated in the six cardiovascular diseases.
37

Netolický, Pavel, Jonáš Petrovský, and František Dařena. "Text‑Mining in Streams of Textual Data Using Time Series Applied to Stock Market." Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis 66, no. 6 (2018): 1573–80. http://dx.doi.org/10.11118/actaun201866061573.

Abstract:
Each day, a lot of text data is generated. This data comes from various sources and may contain valuable information. In this article, we use text mining methods to discover if there is a connection between news articles and changes of the S&P 500 stock index. The index values and documents were divided into time windows according to the direction of the index value changes. We achieved a classification accuracy of 65–74 %.
38

Menon, Rakesh, Loh Han Tong, S. Sathiyakeerthi, Aarnout Brombacher, and Christopher Leong. "The Needs and Benefits of Applying Textual Data Mining within the Product Development Process." Quality and Reliability Engineering International 20, no. 1 (January 30, 2004): 1–15. http://dx.doi.org/10.1002/qre.536.

39

Ur-Rahman, N., and J. A. Harding. "Textual data mining for industrial knowledge management and text classification: A business oriented approach." Expert Systems with Applications 39, no. 5 (April 2012): 4729–39. http://dx.doi.org/10.1016/j.eswa.2011.09.124.

40

Chen, Jingfeng, Wei Wei, Chonghui Guo, Lin Tang, and Leilei Sun. "Textual analysis and visualization of research trends in data mining for electronic health records." Health Policy and Technology 6, no. 4 (December 2017): 389–400. http://dx.doi.org/10.1016/j.hlpt.2017.10.003.

41

Al-Hassan, Abeer A., Faleh Alshameri, and Edgar H. Sibley. "A research case study: Difficulties and recommendations when using a textual data mining tool." Information & Management 50, no. 7 (November 2013): 540–52. http://dx.doi.org/10.1016/j.im.2013.05.010.

42

Ali, Wajid, Wanli Zuo, Rahman Ali, Xianglin Zuo, and Gohar Rahman. "Causality Mining in Natural Languages Using Machine and Deep Learning Techniques: A Survey." Applied Sciences 11, no. 21 (October 27, 2021): 10064. http://dx.doi.org/10.3390/app112110064.

Abstract:
The era of big textual corpora and machine learning technologies has paved the way for researchers in numerous data mining fields. Among them, causality mining (CM) from textual data has become a significant area of concern and has attracted growing attention from researchers. Causality (cause-effect relations) serves as an essential category of relationships, which plays a significant role in question answering, future event prediction, discourse comprehension, decision making, future scenario generation, medical text mining, behavior prediction, and textual prediction entailment. However, despite decades of development, CM techniques still leave room for performance improvement, especially for ambiguous and implicitly expressed causalities. The ineffectiveness of the early attempts is mainly due to small, ambiguous, heterogeneous, and domain-specific datasets constructed with manually crafted linguistic and syntactic rules. Many researchers have deployed shallow machine learning (ML) and deep learning (DL) techniques to deal with such datasets, and they achieved satisfactory performance. In this survey, we provide a comprehensive review of some state-of-the-art shallow ML and DL approaches in CM. We present a detailed taxonomy of CM and discuss popular ML and DL approaches with their comparative weaknesses and strengths, applications, popular datasets, and frameworks. Lastly, the future research challenges are discussed with illustrations of how to transform them into productive future research directions.
43

Guan, Jian, Alan S. Levitan, and Sandeep Goyal. "Text Mining Using Latent Semantic Analysis: An Illustration through Examination of 30 Years of Research at JIS." Journal of Information Systems 32, no. 1 (October 1, 2016): 67–86. http://dx.doi.org/10.2308/isys-51625.

Abstract:
Big Data presents a tremendous challenge for the accounting profession today. This challenge is characterized by, among other things, the explosive growth of unstructured data, such as text. In recent years, new text-mining methods have emerged to turn unstructured textual data into actionable information. A critical role of accounting information systems (AIS) research is to help the accounting profession assess and utilize these methodologies in an accounting context. This paper introduces latent semantic analysis (LSA), a text-mining approach that discovers latent structures in unstructured textual data, to the AIS research community. An LSA-based approach is used to analyze AIS research as published in the Journal of Information Systems (JIS) over the last 30 years. JIS research serves as an appropriate domain of analysis because of a perceived need to contextualize the scope of AIS research. The research themes and trends resulting from this analysis contribute to a better understanding of this identity.
44

Das, Subasish, Anandi Dutta, and Marcus A. Brewer. "Case Study of Trend Mining in Transportation Research Record Articles." Transportation Research Record: Journal of the Transportation Research Board 2674, no. 10 (July 24, 2020): 1–14. http://dx.doi.org/10.1177/0361198120936254.

Abstract:
This study employs two topic models to perform trend mining on an abundance of textual data to determine trends in research topics from immense collections of unstructured documents over the years. This study collected data from the titles and abstracts of the papers published in Transportation Research Record: Journal of the Transportation Research Board, since 1974. The content of these papers was ideal for examining research trends in various fields of research because it contains a large amount of textual data. In previous studies, exploratory analysis tools such as text mining were used to provide descriptive information about the data. However, this method does not provide researchers with quantifications of the topics and their correlations. Furthermore, the contents examined in this study are largely unstructured, and therefore they require faster machine learning algorithms to decipher them. For these reasons, the research team chose to employ two topic modeling tools, latent Dirichlet allocation and the structural topic model, to perform trend mining. This analysis succeeded in extracting 20 main topics, identified by keywords, from the data. The research team also developed two interactive topic model visualization tools that can be used to extract topics from journal titles and abstracts, respectively. The findings from this study provide researchers with a further understanding of research patterns within the ever-evolving area of transportation engineering studies.
45

Gholizadeh, Shafie, Armin Seyeditabari, and Wlodek Zadrozny. "Topological Signature of 19th Century Novelists: Persistent Homology in Text Mining." Big Data and Cognitive Computing 2, no. 4 (October 18, 2018): 33. http://dx.doi.org/10.3390/bdcc2040033.

Abstract:
Topological Data Analysis (TDA) refers to a collection of methods that find the structure of shapes in data. Although TDA methods have recently been used in many areas of data mining, they have not been widely applied to text mining tasks. In most text processing algorithms, the order in which different entities appear or co-appear is lost. Assuming these lost orders are informative features of the data, TDA may play a significant role in closing the resulting gap in the text processing state of the art. Once provided, the topology of different entities through a textual document may reveal some additional information regarding the document that is not reflected in any other features from conventional text processing methods. In this paper, we introduce a novel approach that employs TDA in text processing in order to capture and use the topology of different same-type entities in textual documents. First, we will show how to extract some topological signatures in the text using persistent homology, i.e., a TDA tool that captures the topological signature of a data cloud. Then we will show how to utilize these signatures for text classification.
46

Fafalios, Pavlos, Panagiotis Papadakos, and Yannis Tzitzikas. "Enriching Textual Search Results at Query Time Using Entity Mining, Linked Data and Link Analysis." International Journal of Semantic Computing 08, no. 04 (December 2014): 515–44. http://dx.doi.org/10.1142/s1793351x14400170.

Abstract:
The integration of the classical Web (of documents) with the emerging Web of Data is a challenging vision. In this paper we focus on an integration approach during searching which aims at enriching the responses of non-semantic search systems with semantic information, i.e. Linked Open Data (LOD), and exploiting the outcome for offering advanced exploratory search services which provide an overview of the search space and allow the users to explore the related LOD. We use named entities identified in the search results for automatically connecting search hits with LOD and we consider a scenario where this entity-based integration is performed at query time with no human effort and no a-priori indexing which is beneficial in terms of configurability and freshness. However, the number of identified entities can be high and the same is true for the semantic information about these entities that can be fetched from the available LOD. To this end, in this paper we propose a Link Analysis-based method which is used for ranking (and thus selecting to show) the more important semantic information related to the search results. We report the results of a survey regarding the marine domain with promising results, and comparative results that illustrate the effectiveness of the proposed (PageRank-based) ranking scheme. Finally, we report experimental results regarding efficiency showing that the proposed functionality can be offered even at query time.
47

Hacking, Coen, Hilde Verbeek, Jan P. H. Hamers, Katya Sion, and Sil Aarts. "Text mining in long-term care: Exploring the usefulness of artificial intelligence in a nursing home setting." PLOS ONE 17, no. 8 (August 25, 2022): e0268281. http://dx.doi.org/10.1371/journal.pone.0268281.

Abstract:
Objectives: In nursing homes, narrative data are collected to evaluate quality of care as perceived by residents or their family members. This results in a large amount of textual data. However, as the volume of data increases, it becomes beyond the capability of humans to analyze it. This study aims to explore the usefulness of text mining approaches regarding narrative data gathered in a nursing home setting.
Design: Exploratory study showing a variety of text mining approaches.
Setting and participants: Data has been collected as part of the project ‘Connecting Conversations’: assessing experienced quality of care by conducting individual interviews with residents of nursing homes (n = 39), family members (n = 37) and care professionals (n = 49).
Methods: Several pre-processing steps were applied. A variety of text mining analyses were conducted: individual word frequencies, bigram frequencies, a correlation analysis and a sentiment analysis. A survey was conducted to establish a sentiment analysis model tailored to text collected in long-term care for older adults.
Results: Residents, family members and care professionals uttered respectively 285, 362 and 549 words per interview. Word frequency analysis showed that words that occurred most frequently in the interviews are often positive. Despite some differences in word usage, correlation analysis displayed that similar words are used by all three groups to describe quality of care. Most interviews displayed a neutral sentiment. Care professionals expressed a more diverse sentiment compared to residents and family members. A topic clustering analysis showed a total of 12 topics including ‘relations’ and ‘care environment’.
Conclusions and implications: This study demonstrates the usefulness of text mining to extend our knowledge regarding quality of care in a nursing home setting. With the rise of textual (narrative) data, text mining can lead to valuable new insights for long-term care for older adults.
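
As an illustration of two of the analyses named in this abstract (individual word frequencies and bigram frequencies), the following is a minimal sketch in Python. It is not the authors' pipeline: the tokenizer, the function names, and the two interview snippets are assumptions invented for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_and_bigram_counts(documents):
    """Count single words and adjacent word pairs across all documents."""
    words, bigrams = Counter(), Counter()
    for doc in documents:
        tokens = tokenize(doc)
        words.update(tokens)
        # zip pairs each token with its successor, so bigrams never
        # cross a document boundary.
        bigrams.update(zip(tokens, tokens[1:]))
    return words, bigrams


# Hypothetical interview snippets, invented for illustration only.
interviews = [
    "The staff are kind and the care is good.",
    "Good care, kind staff, nice garden.",
]
words, bigrams = word_and_bigram_counts(interviews)
print(words.most_common(3))
print(bigrams.most_common(3))
```

A real study would usually add pre-processing such as stop-word removal before counting, which this sketch leaves out.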
48

Shaikh, Anoud, Naeem Ahmed Mahoto, and Mukhtiar Ali Unar. "Bringing Shape to Textual Data – A Feasible Demonstration." Mehran University Research Journal of Engineering and Technology 38, no. 4 (October 1, 2019): 901–14. http://dx.doi.org/10.22581/muet1982.1904.04.

Abstract:
The Internet has revolutionized the communication paradigm. This has led to an immense amount of unstructured data (i.e. textual data), which is a major source of useful knowledge about people in several application domains. TM (Text Mining) extracts high quality information to discover knowledge by drawing patterns and relationships in textual data. This field has attracted great attention from the research community. As a result, several attempts have been made to propose, introduce and refine techniques applied for uncovering knowledge from text data. This study aims at: (1) presenting existing TM techniques in the scientific literature, (2) reporting challenges/issues and gaps that still need attention, and (3) proposing a framework to bring shape to textual data. A prototype has been developed to demonstrate the effectiveness and potential worth of the proposed approach and to display how unstructured data (i.e. news articles in this study) has been brought to a shape representing interesting knowledge. The proposed framework implements basic NLP (Natural Language Processing) functions in combination with AYLIEN API (Application Programming Interface) functions. The results reveal how events, celebrities and popular news items have been covered in the electronic media, and they also represent the subjectivity of topical news events. The news coverage trends highlight the significance of daily news events, which may assist in getting insight about the media groups.
49

Zhang, Yudong, Wenhao Zheng, and Ming Li. "Learning Uniform Semantic Features for Natural Language and Programming Language Globally, Locally and Sequentially." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 5845–52. http://dx.doi.org/10.1609/aaai.v33i01.33015845.

Abstract:
Semantic feature learning for natural language and programming language is a preliminary step in addressing many software mining tasks. Many existing methods leverage information in lexicon and syntax to learn features for textual data. However, such information is inadequate to represent the entire semantics in either text sentence or code snippet. This motivates us to propose a new approach to learn semantic features for both languages, through extracting three levels of information, namely global, local and sequential information, from textual data. For tasks involving both modalities, we project the data of both types into a uniform feature space so that the complementary knowledge in between can be utilized in their representation. In this paper, we build a novel and general-purpose feature learning framework called UniEmbed, to uniformly learn comprehensive semantic representation for both natural language and programming language. Experimental results on three real-world software mining tasks show that UniEmbed outperforms state-of-the-art models in feature learning and prove the capacity and effectiveness of our model.
50

高淑貞. "IIDMCC: An Innovation Idea Discovery Model Using Online Customers Complaint Messages." 網際網路技術學刊 (Journal of Internet Technology) 23, no. 2 (March 2022): 209–16. http://dx.doi.org/10.53106/160792642022032302002.

Abstract:
Online customers’ complaints have attracted increasing attention from innovation developers. By applying text mining and classification-oriented data mining techniques, an innovation idea discovery model using online customers’ complaint messages (IIDMCC) was proposed and implemented in this article. Methods included text mining to derive bags of words, sparsity exclusion to produce a term matrix, and supervised classification data mining to reveal decision rules. The IIDMCC showed 90.63% prediction accuracy based on 14720 complaint messages collected from the official forum and online communities of a case company in the mobile phone sector from Taiwan. Validation of data inputs, method, and outputs was conducted via case company specialists. The article concludes that analyses of online complaint messages may potentially contribute to the exploration and discovery of innovation ideas. The paper demonstrates the use of mining open textual data in general and complaint messages in particular in the domain of knowledge discovery in databases.
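
The first two steps named in this abstract (deriving bags of words, then excluding sparse terms to produce a term matrix) can be sketched as follows. This is only an illustration under stated assumptions, not the IIDMCC implementation: the sparsity definition (the share of messages that do not contain a term), the `max_sparsity` threshold, the function name, and the sample complaints are all hypothetical, and the supervised classification step is omitted.

```python
from collections import Counter


def build_term_matrix(messages, max_sparsity=0.5):
    """Return (vocabulary, matrix), dropping terms that are absent from
    more than max_sparsity of the messages."""
    # One bag of words (term -> count) per message.
    bags = [Counter(message.lower().split()) for message in messages]
    n = len(bags)
    # Document frequency: in how many messages each term appears.
    doc_freq = Counter(term for bag in bags for term in bag)
    # Sparsity of a term is assumed here to be 1 - df / n.
    vocab = sorted(t for t, df in doc_freq.items() if 1 - df / n <= max_sparsity)
    # Rows are messages, columns are the retained terms.
    matrix = [[bag[t] for t in vocab] for bag in bags]
    return vocab, matrix


# Hypothetical complaint messages, invented for illustration only.
complaints = [
    "battery drains fast",
    "screen cracked and battery drains",
    "battery overheats",
    "screen flickers",
]
vocab, matrix = build_term_matrix(complaints, max_sparsity=0.5)
print(vocab)
for row in matrix:
    print(row)
```

The resulting rows could then be passed, together with labels, to any supervised classifier; which classifier the article used is not stated in the abstract.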