Journal articles on the topic 'Big text data'

Consult the top 50 journal articles for your research on the topic 'Big text data.'

1

N.J., Anjala. "Algorithmic Assessment of Text based Data Classification in Big Data Sets." Journal of Advanced Research in Dynamical and Control Systems 12, SP4 (March 31, 2020): 1231–34. http://dx.doi.org/10.5373/jardcs/v12sp4/20201598.

2

Hassani, Hossein, Christina Beneki, Stephan Unger, Maedeh Taj Mazinani, and Mohammad Reza Yeganegi. "Text Mining in Big Data Analytics." Big Data and Cognitive Computing 4, no. 1 (January 16, 2020): 1. http://dx.doi.org/10.3390/bdcc4010001.

Abstract:
Text mining in big data analytics is emerging as a powerful tool for harnessing the power of unstructured textual data by analyzing it to extract new knowledge and to identify significant patterns and correlations hidden in the data. This study seeks to determine the state of text mining research by examining the developments within published literature over past years and provide valuable insights for practitioners and researchers on the predominant trends, methods, and applications of text mining research. In accordance with this, more than 200 academic journal articles on the subject are included and discussed in this review; the state-of-the-art text mining approaches and techniques used for analyzing transcripts and speeches, meeting transcripts, and academic journal articles, as well as websites, emails, blogs, and social media platforms, across a broad range of application areas are also investigated. Additionally, the benefits and challenges related to text mining are also briefly outlined.
3

Kodabagi, M. M., Deepa Sarashetti, and Vilas Naik. "A Text Information Retrieval Technique for Big Data Using Map Reduce." Bonfring International Journal of Software Engineering and Soft Computing 6, Special Issue (October 31, 2016): 22–26. http://dx.doi.org/10.9756/bijsesc.8236.

4

Courtney, Kyle, Rachael Samberg, and Timothy Vollmer. "Big data gets big help: Law and policy literacies for text data mining." College & Research Libraries News 81, no. 4 (April 9, 2020): 193. http://dx.doi.org/10.5860/crln.81.4.193.

Abstract:
A wealth of digital texts and the proliferation of automated research methodologies enable researchers to analyze large sets of data at a speed that would be impossible to achieve through manual review. When researchers use these automated techniques and methods for identifying, extracting, and analyzing patterns, trends, and relationships across large volumes of un- or thinly structured digital content, they are applying a methodology called text data mining or TDM. TDM is also referred to, with slightly different emphases, as “computational text analysis” or “content mining.”
5

Rajagopal, D., and K. Thilakavalli. "Efficient Text Mining Prototype for Big Data." International Journal of Data Mining And Emerging Technologies 5, no. 1 (2015): 38. http://dx.doi.org/10.5958/2249-3220.2015.00007.5.

6

Iqbal, Waheed, Waqas Ilyas Malik, Faisal Bukhari, Khaled Mohamad Almustafa, and Zubiar Nawaz. "Big Data Full-Text Search Index Minimization Using Text Summarization." Information Technology and Control 50, no. 2 (June 17, 2021): 375–89. http://dx.doi.org/10.5755/j01.itc.50.2.25470.

Abstract:
An efficient full-text search is achieved by indexing the raw data with an additional 20 to 30 percent storage cost. In the context of Big Data, this additional storage space is huge and introduces challenges to entertain full-text search queries with good performance. It also incurs overhead to store, manage, and update the large size index. In this paper, we propose and evaluate a method to minimize the index size to offer full-text search over Big Data using an automatic extractive-based text summarization method. To evaluate the effectiveness of the proposed approach, we used two real-world datasets. We indexed actual and summarized datasets using Apache Lucene and studied average simple overlapping, Spearman’s rho correlation, and average ranking score measures of search results obtained using different search queries. Our experimental evaluation shows that automatic text summarization is an effective method to reduce the index size significantly. We obtained a maximum of 82% reduction in index size with 42% higher relevance of the search results using the proposed solution to minimize the full-text index size.
7

Toon, Elizabeth, Carsten Timmermann, and Michael Worboys. "Text-Mining and the History of Medicine: Big Data, Big Questions?" Medical History 60, no. 2 (March 14, 2016): 294–96. http://dx.doi.org/10.1017/mdh.2016.18.

8

Lepper, Marcel. "Big Data, Global Villages." Philological Encounters 1, no. 1-4 (January 26, 2016): 131–62. http://dx.doi.org/10.1163/24519197-00000006.

Abstract:
How should the field of philology react to the ongoing quantitative growth of its material basis? This essay will first discuss two opposing strategies: The quantitative analysis of large amounts of data, promoted above all by Franco Moretti, is contrasted with the canon-oriented method of resorting to small corpora. Yet both the culturally conservative anxiety over growing masses of texts as well as the enthusiasm for the ‘digital humanities’ and the technological indexation of large text corpora prove to be unmerited when considering the complexity of the problem. Therefore, this essay advocates for a third, heuristic approach, which 1) accounts for the changes in global text production and storage, 2) is conscious of the material-political conditions that determine the accessibility of texts, and 3) creates a bridge between close and distant reading by binding quantitative approaches to fundamental, qualitative philological principles, thus helping philologists keep track of the irritating, provocative, and subversive elements of texts that automated queries inevitably miss.
9

Khan, Zaheer, and Tim Vorley. "Big data text analytics: an enabler of knowledge management." Journal of Knowledge Management 21, no. 1 (February 13, 2017): 18–34. http://dx.doi.org/10.1108/jkm-06-2015-0238.

Abstract:
Purpose: The purpose of this paper is to examine the role of big data text analytics as an enabler of knowledge management (KM). The paper argues that big data text analytics represents an important means to visualise and analyse data, especially unstructured data, which have the potential to improve KM within organisations. Design/methodology/approach: The study uses text analytics to review 196 articles published in two of the leading KM journals – Journal of Knowledge Management and Journal of Knowledge Management Research & Practice – in 2013 and 2014. The text analytics approach is used to process, extract and analyse the 196 papers to identify trends in terms of keywords, topics and keyword/topic clusters to show the utility of big data text analytics. Findings: The findings show how big data text analytics can have a key enabler role in KM. Drawing on the 196 articles analysed, the paper shows the power of big data-oriented text analytics tools in supporting KM through the visualisation of data. In this way, the authors highlight the nature and quality of the knowledge generated through this method for efficient KM in developing a competitive advantage. Research limitations/implications: The research has important implications concerning the role of big data text analytics in KM, and specifically the nature and quality of knowledge produced using text analytics. The authors use text analytics to exemplify the value of big data in the context of KM and highlight how future studies could develop and extend these findings in different contexts. Practical implications: Results contribute to understanding the role of big data text analytics as a means to enhance the effectiveness of KM. The paper provides important insights that can be applied to different business functions, from supply chain management to marketing management to support KM, through the use of big data text analytics. Originality/value: The study demonstrates the practical application of the big data tools for data visualisation, and, with it, improving KM.
10

Kagan, Pavel. "Big data sets in construction." E3S Web of Conferences 110 (2019): 02007. http://dx.doi.org/10.1051/e3sconf/201911002007.

Abstract:
The paper studies the processing of large information data arrays (Big Data) in construction. The issues of the applicability of the big data concept (Big Data) at various stages of the life cycle of buildings and structures are considered. Methods for data conversion for their further processing are proposed. The methods used in the analysis of "big data" allow working with unstructured data sets (Data Mining). An approach is considered, in which the analysis of arbitrary data can be reduced to text analysis, similar to the analysis of ordinary text messages. At the moment, it is important and interesting to isolate non-obvious links present in the analysed data. The advantage of using big data is that it is not necessary to advance hypotheses for testing. Hypotheses appear during data analysis. Dependence analysis is a basic approach when working with big data. The concept of an automatic big data analysis system is proposed. For data mining, text analysis algorithms should be used, and discriminant functions should be used for the main problem to be solved (data classification).
11

Jun, Sunghae. "A Big Data Preprocessing using Statistical Text Mining." Journal of Korean Institute of Intelligent Systems 25, no. 5 (October 25, 2015): 470–76. http://dx.doi.org/10.5391/jkiis.2015.25.5.470.

12

Christen, Markus, Thomas Niederberger, Thomas Ott, Suleiman Aryobsei, and Reto Hofstetter. "Micro-text classification between small and big data." Nonlinear Theory and Its Applications, IEICE 6, no. 4 (2015): 556–69. http://dx.doi.org/10.1587/nolta.6.556.

13

Noh, Hyung-Nam. "Big Data Text Mining -Focusing on political speeches-." Journal of Speech Communication ll, no. 26 (November 2014): 289–325. http://dx.doi.org/10.18625/jsc.2014..26.289.

14

Ye, Zhan, Ahmad P. Tafti, Karen Y. He, Kai Wang, and Max M. He. "SparkText: Biomedical Text Mining on Big Data Framework." PLOS ONE 11, no. 9 (September 29, 2016): e0162721. http://dx.doi.org/10.1371/journal.pone.0162721.

15

Gattiker, A., F. H. Gebara, H. P. Hofstee, J. D. Hayes, and A. Hylick. "Big Data text-oriented benchmark creation for Hadoop." IBM Journal of Research and Development 57, no. 3/4 (May 2013): 10:1–10:6. http://dx.doi.org/10.1147/jrd.2013.2240732.

16

Tresor, Kotonko Lumanga Manga, and Xu Dhe zi. "WEKA for Reducing High-Dimensional Big Text Data." International Journal of Advanced Engineering Research and Science 5, no. 11 (2018): 52–55. http://dx.doi.org/10.22161/ijaers.5.11.10.

17

Tresor, Kotonko Lumanga Manga, and Professor Xu Dezhi. "WEKA FOR REDUCING HIGH-DIMENSIONAL BIG TEXT DATA." Indian Journal of Computer Science and Engineering 9, no. 4 (October 20, 2018): 124–29. http://dx.doi.org/10.21817/indjcse/2018/v9i5/180905016.

18

Dhanani, Jenish, Rupa Mehta, and Dipti Rana. "Sentiment Weighted Word Embedding for Big Text Data." International Journal of Web-Based Learning and Teaching Technologies 16, no. 6 (November 2021): 1–17. http://dx.doi.org/10.4018/ijwltt.20211101.oa2.

Abstract:
Sentiment analysis is the practice of eliciting a sentiment orientation of people's opinions (i.e. positive, negative and neutral) toward the specific entity. Word embedding technique like Word2vec is an effective approach to encode text data into real-valued semantic feature vectors. However, it fails to preserve sentiment information that results in performance deterioration for sentiment analysis. Additionally, big sized textual data consisting of large vocabulary and its associated feature vectors demands huge memory and computing power. To overcome these challenges, this research proposed a MapReduce based Sentiment weighted Word2Vec (MSW2V), which learns the sentiment and semantic feature vectors using sentiment dictionary and big textual data in a distributed MapReduce environment, where memory and computing power of multiple computing nodes are integrated to accomplish the huge resource demand. Experimental results demonstrate the outperforming performance of the MSW2V compared to the existing distributed and non-distributed approaches.
19

Choi, Jin Hee, and Kyoung ho Choi. "Big data text mining for the term ‘combat uniform’." Taegu Science University Defense Security Institute 5, no. 2 (April 30, 2021): 43–51. http://dx.doi.org/10.37181/jscs.2021.5.2.043.

20

HeeKu, Jin, and Yoon Su Jeong. "A study on social big data analysis using text clustering." International Journal of Engineering & Technology 7, no. 2.12 (April 3, 2018): 1. http://dx.doi.org/10.14419/ijet.v7i2.12.11023.

Abstract:
Background/Objectives: As the use of big data increases in various fields, the use of social big data analysis for social media is increasing rapidly. This study proposed a method to apply text clustering for analysis by related topics of texts extracted using text mining of social big data. Methods/Statistical analysis: R was used for data collection and analysis, and social big data was collected from Twitter. The clustering model applicable to the related subject analysis of Twitter text was compared and selected and text clustering was performed. Text clustering is analyzed through a cluster dendrogram by generating a corpus, then grouping similar entities from the term-document matrix, and removing the sparse words. Findings: In this study, text clustering improves the difficulty in analyzing by word association and subject in text mining methods such as word cloud. Especially, in the text clustering model for the related topic analysis of social big data, the hierarchical clustering model based on the cosine similarity was more suitable than the non-hierarchical model for identifying which terms in the tweet have an association with each other. In addition, cluster dendrogram has been found to be effective in analyzing text contexts by grouping several groups of similar texts repeatedly in the visualization process. Improvements/Applications: This study can be used to confirm ideas and opinions of various participants by using Social Big Data, and to analyze more precisely the complex relationship between the prediction of social problems and the phenomenon.
21

Lee, Jungmin, Eunja Jun, and Jungmin Chae. "Big Data Analysis for Dance Studies Using Text Mining." Journal of Dance Society for Documentation & History 42 (September 30, 2016): 191–212. http://dx.doi.org/10.26861/sddh.2016.42.191.

22

Kim, Ilhwan. "Newspaper Big Data and Text Mining for Digital Humanities." Journal of Language & Literature 78 (June 30, 2019): 41–62. http://dx.doi.org/10.15565/jll.2019.06.78.41.

23

Bharadwaj, Deepankar, and Arvind Shukla. "Text Mining Technique on Big Data Using Genetic Algorithm." International Journal of Computer Sciences and Engineering 6, no. 7 (July 31, 2018): 674–81. http://dx.doi.org/10.26438/ijcse/v6i7.674681.

24

Alam, Saqib, and Nianmin Yao. "Big Data Analytics, Text Mining and Modern English Language." Journal of Grid Computing 17, no. 2 (August 4, 2018): 357–66. http://dx.doi.org/10.1007/s10723-018-9452-4.

25

Huang, Shan Shan, and In Sik Shin. "Analysis of Big Data Text on Hanwha Visual Design." Korea Institute of Design Research Society 6, no. 1 (March 31, 2021): 318–26. http://dx.doi.org/10.46248/kidrs.2021.1.318.

26

Yang, Yijie. "Internet User Consumption Behavior Based on Big Data." E3S Web of Conferences 292 (2021): 02004. http://dx.doi.org/10.1051/e3sconf/202129202004.

Abstract:
With the vigorous development of the Internet economy, competition in the domestic market has become increasingly fierce. This research mainly discusses the consumption behavior of Internet users based on big data. Text preprocessing is required first after the target text has been obtained. Clean and delete content that is of no value or limited value in the text. First, define the Request network connection request. Secondly, write a function module for obtaining multiple commodity IDs. Traverse the full text to obtain the target content, which is often used to retrieve and replace the target text. Finally, the crawling of individual product information. According to the research product goals, big data technology is used to obtain data on two dimensions of product sales information and user experience information. Store the product sales specification text and product review text as divergent texts for the next stage of data cleaning to predict user consumption behavior. The function of the product, in the price range of 100-200 yuan, the user’s attention is 27%. This research helps companies formulate precise marketing strategies.
27

Haider, Murtaza, and Amir Gandomi. "When big data made the headlines: mining the text of big data coverage in the news media." International Journal of Services Technology and Management 27, no. 1/2 (2021): 23. http://dx.doi.org/10.1504/ijstm.2021.113574.

28

Gandomi, Amir, and Murtaza Haider. "When big data made the headlines: mining the text of big data coverage in the news media." International Journal of Services Technology and Management 27, no. 1/2 (2021): 23. http://dx.doi.org/10.1504/ijstm.2021.10035936.

29

Li, Qin, Shaobo Li, Sen Zhang, Jie Hu, and Jianjun Hu. "A Review of Text Corpus-Based Tourism Big Data Mining." Applied Sciences 9, no. 16 (August 12, 2019): 3300. http://dx.doi.org/10.3390/app9163300.

Abstract:
With the massive growth of the Internet, text data has become one of the main formats of tourism big data. As an effective expression means of tourists’ opinions, text mining of such data has big potential to inspire innovations for tourism practitioners. In the past decade, a variety of text mining techniques have been proposed and applied to tourism analysis to develop tourism value analysis models, build tourism recommendation systems, create tourist profiles, and make policies for supervising tourism markets. The successes of these techniques have been further boosted by the progress of natural language processing (NLP), machine learning, and deep learning. With the understanding of the complexity due to this diverse set of techniques and tourism text data sources, this work attempts to provide a detailed and up-to-date review of text mining techniques that have been, or have the potential to be, applied to modern tourism big data analysis. We summarize and discuss different text representation strategies, text-based NLP techniques for topic extraction, text classification, sentiment analysis, and text clustering in the context of tourism text mining, and their applications in tourist profiling, destination image analysis, market demand, etc. Our work also provides guidelines for constructing new tourism big data applications and outlines promising research areas in this field for incoming years.
30

Novo-Lourés, María, Reyes Pavón, Rosalía Laza, David Ruano-Ordas, and Jose R. Méndez. "Using Natural Language Preprocessing Architecture (NLPA) for Big Data Text Sources." Scientific Programming 2020 (August 1, 2020): 1–13. http://dx.doi.org/10.1155/2020/2390941.

Abstract:
During the last years, big data analysis has become a popular means of taking advantage of multiple (initially valueless) sources to find relevant knowledge about real domains. However, a large number of big data sources provide textual unstructured data. A proper analysis requires tools able to adequately combine big data and text-analysing techniques. Keeping this in mind, we combined a pipelining framework (BDP4J (Big Data Pipelining For Java)) with the implementation of a set of text preprocessing techniques in order to create NLPA (Natural Language Preprocessing Architecture), an extendable open-source plugin implementing preprocessing steps that can be easily combined to create a pipeline. Additionally, NLPA incorporates the possibility of generating datasets using either a classical token-based representation of data or newer synset-based datasets that would be further processed using semantic information (i.e., using ontologies). This work presents a case study of NLPA operation covering the transformation of raw heterogeneous big data into different dataset representations (synsets and tokens) and using the Weka application programming interface (API) to launch two well-known classifiers.
31

van Altena, Allard, Perry Moerland, Aeilko Zwinderman, and Sílvia Olabarriaga. "Usage of the Term Big Data in Biomedical Publications: A Text Mining Approach." Big Data and Cognitive Computing 3, no. 1 (February 6, 2019): 13. http://dx.doi.org/10.3390/bdcc3010013.

Abstract:
In this study, we attempt to assess the value of the term Big Data when used by researchers in their publications. For this purpose, we systematically collected a corpus of biomedical publications that use and do not use the term Big Data. These documents were used as input to a machine learning classifier to determine how well they can be separated into two groups and to determine the most distinguishing classification features. We generated 100 classifiers that could correctly distinguish between Big Data and non-Big Data documents with an area under the Receiver Operating Characteristic (ROC) curve of 0.96. The differences between the two groups were characterized by terms specific to Big Data themes—such as ‘computational’, ‘mining’, and ‘challenges’—and also by terms that indicate the research field, such as ‘genomics’. The ROC curves when plotted for various time intervals showed no difference over time. We conclude that there is a detectable and stable difference between publications that use the term Big Data and those that do not. Furthermore, the use of the term Big Data within a publication seems to indicate a distinct type of research in the biomedical field. Therefore, we conclude that value can be attributed to the term Big Data when used in a publication and this value has not changed over time.
32

Rodzvilla, John. "Deep Text: Using Text Analytics to Conquer Information Overload, Get Real Value From Social Media, and Add Big(ger) Text to Big Data." Journal of Web Librarianship 11, no. 2 (April 3, 2017): 148–49. http://dx.doi.org/10.1080/19322909.2017.1302273.

33

Streuber, Sonja. "Deep Text: Using Text Analytics To Conquer Information Overload, Get Real Value From Social Media, and Add Big(Ger) text to Big Data." Public Services Quarterly 13, no. 3 (July 3, 2017): 179–81. http://dx.doi.org/10.1080/15228959.2017.1370065.

34

Bayat, Behrooz. "Deep Text: Using Text Analytics to Conquer Information Overload, Get Real Value from Social Media, and Add Big(ger) Text to Big Data." Electronic Library 35, no. 6 (November 6, 2017): 1269–70. http://dx.doi.org/10.1108/el-09-2017-0188.

35

Raghupathi, Viju, Yilu Zhou, and Wullianallur Raghupathi. "Exploring Big Data Analytic Approaches to Cancer Blog Text Analysis." International Journal of Healthcare Information Systems and Informatics 14, no. 4 (October 2019): 1–20. http://dx.doi.org/10.4018/ijhisi.2019100101.

Abstract:
In this article, the authors explore the potential of a big data analytics approach to unstructured text analytics of cancer blogs. The application is developed using Cloudera platform's Hadoop MapReduce framework. It uses several text analytics algorithms, including word count, word association, clustering, and classification, to identify and analyze the patterns and keywords in cancer blog postings. This article establishes an exploratory approach to involving big data analytics methods in developing text analytics applications for the analysis of cancer blogs. Additional insights are extracted through various means, including the development of categories or keywords contained in the blogs, the development of a taxonomy, and the examination of relationships among the categories. The application has the potential for generalizability and implementation with health content in other blogs and social media. It can provide insight and decision support for cancer management and facilitate efficient and relevant searches for information related to cancer.
36

Jun, Sunghae. "Patent Big Data Analysis Using Bayesian Text Mining and Visualization." Journal of Korean Institute of Intelligent Systems 30, no. 2 (April 30, 2020): 154–60. http://dx.doi.org/10.5391/jkiis.2020.30.2.154.

37

Kim, Doo Hwan, and Ho Jeong Park. "Military Security Policy Research Using Big Data and Text Mining." Journal of Information and Security 19, no. 4 (October 31, 2019): 23–34. http://dx.doi.org/10.33778/kcsa.2019.19.4.023.

38

Chowkwanyun, Merlin. "Big Data, Large-Scale Text Analysis, and Public Health Research." American Journal of Public Health 109, S2 (February 2019): S126–S127. http://dx.doi.org/10.2105/ajph.2019.304965.

39

Ayed, Abdelkarim Ben, Mohamed Ben Halima, and Adel M. Alimi. "MapReduce Based Text Detection in Big Data Natural Scene Videos." Procedia Computer Science 53 (2015): 216–23. http://dx.doi.org/10.1016/j.procs.2015.07.297.

40

Moreno, Antonio, and Teófilo Redondo. "Text Analytics: the convergence of Big Data and Artificial Intelligence." International Journal of Interactive Multimedia and Artificial Intelligence 3, no. 6 (2016): 57. http://dx.doi.org/10.9781/ijimai.2016.369.

41

Andrade, Carina Sofia, and Maribel Yasmina Santos. "Sentiment Analysis with Text Mining in Contexts of Big Data." International Journal of Technology and Human Interaction 13, no. 3 (July 2017): 47–67. http://dx.doi.org/10.4018/ijthi.2017070104.

Abstract:
The evolution of technology, along with the common use of different devices connected to the Internet, provides a vast growth in the volume and variety of data that are daily generated at high velocity, phenomenon commonly denominated as Big Data. Related with this, several Text Mining techniques make possible the extraction of useful insights from that data, benefiting the decision-making process across multiple areas, using the information, models, patterns or tendencies that these techniques are able to identify. With Sentiment Analysis, it is possible to understand which sentiments and opinions are implicit in this data. This paper proposes an architecture for Sentiment Analysis that uses data from the Twitter, which is able to collect, store, process and analyse data on a real-time fashion. To demonstrate its utility, practical applications are developed using real world examples where Sentiment Analysis brings benefits when applied. With the presented demonstration case, it is possible to verify the role of each used technology and the techniques adopted for Sentiment Analysis.
42

Gupta, Vedika, Vivek Kumar Singh, Udayan Ghose, and Pankaj Mukhija. "A quantitative and text-based characterization of big data research." Journal of Intelligent & Fuzzy Systems 36, no. 5 (May 14, 2019): 4659–75. http://dx.doi.org/10.3233/jifs-179016.

43

Chatterjee, Ankush, Umang Gupta, Manoj Kumar Chinnakotla, Radhakrishnan Srikanth, Michel Galley, and Puneet Agrawal. "Understanding Emotions in Text Using Deep Learning and Big Data." Computers in Human Behavior 93 (April 2019): 309–17. http://dx.doi.org/10.1016/j.chb.2018.12.029.

44

Gerakidis, Sergios, Sofia Megarchioti, and Basilis Mamalis. "Efficient Big Text Data Clustering Algorithms using Hadoop and Spark." International Journal of Computer Applications 174, no. 15 (January 15, 2021): 13–21. http://dx.doi.org/10.5120/ijca2021921030.

45

Amazal, Houda, and Mohamed Kissi. "A New Big Data Feature Selection Approach for Text Classification." Scientific Programming 2021 (April 19, 2021): 1–10. http://dx.doi.org/10.1155/2021/6645345.

Abstract:
Feature selection (FS) is a fundamental task for text classification problems. Text feature selection aims to represent documents using the most relevant features. This process can reduce the size of datasets and improve the performance of the machine learning algorithms. Many researchers have focused on elaborating efficient FS techniques. However, most of the proposed approaches are evaluated for small datasets and validated using single machines. As textual data dimensionality becomes higher, traditional FS methods must be improved and parallelized to handle textual big data. This paper proposes a distributed approach for feature selection based on mutual information (MI) method, which is widely applied in pattern recognition and machine learning. A drawback of MI is that it ignores the frequency of the terms during the selection of features. The proposal introduces a distributed FS method, namely, Maximum Term Frequency-Mutual Information (MTF-MI), based on term frequency and mutual information techniques to improve the quality of the selected features. The proposed approach is implemented on Hadoop using the MapReduce programming model. The effectiveness of MTF-MI is demonstrated through several text classification experiments using the multinomial Naïve Bayes classifier on three datasets. Through a series of tests, the results reveal that the proposed MTF-MI method improves the classification results compared with four state-of-the-art methods in terms of macro-F1 and micro-F1 measures.
46

Alotaibi, Youseef, Muhammad Noman Malik, Huma Hayat Khan, Anab Batool, Saif ul Islam, Abdulmajeed Alsufyani, and Saleh Alghamdi. "Suggestion Mining from Opinionated Text of Big Social Media Data." Computers, Materials & Continua 68, no. 3 (2021): 3323–38. http://dx.doi.org/10.32604/cmc.2021.016727.

47

Alharbi, Abdullah, Wael Alosaimi, and M. Irfan Uddin. "Automatic Surveillance of Pandemics Using Big Data and Text Mining." Computers, Materials & Continua 68, no. 1 (2021): 303–17. http://dx.doi.org/10.32604/cmc.2021.016230.

48

Singh, Shashi Pal, Ajai Kumar, Rachna Awasthi, Neetu Yadav, and Shikha Jain. "Intelligent Bilingual Data Extraction and Rebuilding Using Data Mining for Big Data." Journal of Computational and Theoretical Nanoscience 17, no. 1 (January 1, 2020): 513–18. http://dx.doi.org/10.1166/jctn.2020.8699.

Abstract:
Today, data come from many sources and in many file formats, structures, and types, forming a huge collection of unstructured content on the internet and social media. This gives rise to the categorization of data as unstructured, semi-structured, and structured. Data that exist in an irregular manner without any particular schema are referred to as unstructured data, which are very difficult to process because they contain irregularities and ambiguities. We therefore focus on an Intelligent Processing Unit that converts unstructured big data into meaningful information. Intelligent text extraction is a technique that automatically identifies and extracts text from a file format. The system consists of several stages, including pre-processing, keyphrase extraction, and transformation, which extract text and retrieve structured data from unstructured data. Combining multiple methods gives better results. We currently work with various file formats, converting each file into DOCX, which arrives in unstructured form, and then obtain the file in structured form with the help of intelligent pre-processing. The pre-processing stages turn the unstructured data or corpus into meaningful structured data. In the first stage, the system removes stop words, unwanted symbols, noisy data, and line spacing. The second stage extracts data from various sources and file types into properly formatted plain text. In the third stage, the data are transformed from one format to another so that the user can understand them. The final step rebuilds the file in its original format, maintaining the file’s tags. Large files are divided into smaller sub-files so that parallel processing algorithms can process them quickly.
Parallel processing is a very important concept for text extraction: with its help, a big file is broken into small files, which improves the result. Data extraction is performed bilingually and represents the most relevant information contained in the document. Keyphrase extraction is an important problem in data mining, knowledge retrieval, and natural speech processing. A keyword extraction technique is used to abstract keywords that uniquely identify a document. Rebuilding is an important part of this project: the whole approach is applied to the file in its own format, and the final output must be in the same format as the input file. The concept is widely used, but little work has been done on combining many functionalities in one tool, which motivates a tool that can easily and efficiently convert unstructured files into structured ones.
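The first stage and the file-splitting idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the stop-word list, the function names, the chunk size, and the use of a thread pool are all hypothetical choices.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Illustrative stop-word list (assumption); a real system would use a fuller, per-language list.
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "in", "to"}


def clean(text):
    """Lower-case the text, replace non-word symbols with spaces, and drop stop words."""
    tokens = re.sub(r"[^\w\s]", " ", text.lower()).split()
    return " ".join(tok for tok in tokens if tok not in STOP_WORDS)


def split_into_chunks(lines, chunk_size):
    """Divide a large list of lines into smaller consecutive sub-lists."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]


def clean_in_parallel(lines, chunk_size=2, workers=4):
    """Clean each chunk in a worker thread and return the lines in their original order."""
    chunks = split_into_chunks(lines, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        cleaned = list(pool.map(lambda chunk: [clean(line) for line in chunk], chunks))
    return [line for chunk in cleaned for line in chunk]


sample = [
    "The price of Data-Mining, in 2020!",
    "An extraction   of the TEXT...",
    "Rebuilding is the final step.",
]
result = clean_in_parallel(sample)
```

`pool.map` preserves chunk order, so the cleaned lines can be reassembled in the original sequence, which matters if the file is later rebuilt in its original layout.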
49

Debao, Dai, Ma Yinxia, and Zhao Min. "Analysis of big data job requirements based on K-means text clustering in China." PLOS ONE 16, no. 8 (August 5, 2021): e0255419. http://dx.doi.org/10.1371/journal.pone.0255419.

Abstract:
This paper aims to understand the characteristics of domestic big data job requirements through k-means text clustering, to help enterprises and employees identify big data talent, and to promote the further development of big data-related research. First, crawler software is used to crawl recruitment information about "big data" on the zhaopin.com recruitment website. Then, Jieba word segmentation and K-means text clustering are used to cluster big data recruitment positions, and the number of clusters is determined by the average within-group sum of squares. Finally, big data jobs are divided into 10 categories, and the urban distribution, salary level, education requirements, and experience requirements of big data jobs are discussed and analyzed from the perspectives of the overall data set and the clustering results, to clarify the characteristics of big data job demands. The analysis results show that demand for big data jobs is mainly concentrated in first-tier and new first-tier cities. Enterprises prefer job seekers with a college or bachelor’s degree and more than one year of relevant experience. Wages differ among different types of jobs. The higher the position, the higher the requirements for education and experience.
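Choosing the number of clusters from the average within-group sum of squares can be sketched as below. This is an illustration only, not the authors' pipeline: whitespace splitting stands in for Jieba segmentation, the six job ads are invented, and the deterministic farthest-point initialisation is an assumption made so the example is reproducible.

```python
from collections import Counter


def vectorize(docs):
    """Turn whitespace-tokenized documents into term-count vectors over a shared vocabulary."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    return [[float(Counter(doc.split())[t]) for t in vocab] for doc in docs]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(points, centers):
    """Index of the closest centre for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j])) for p in points]


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with farthest-point initialisation (deterministic, an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from its nearest existing centre.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    for _ in range(iters):
        assign = nearest(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            # Keep the old centre if a cluster happens to be empty.
            new_centers.append(
                [sum(col) / len(members) for col in zip(*members)] if members else centers[j]
            )
        if new_centers == centers:
            break
        centers = new_centers
    return nearest(points, centers), centers


def avg_within_ss(points, assign, centers):
    """Average squared distance from each point to its assigned cluster centre."""
    return sum(sq_dist(p, centers[a]) for p, a in zip(points, assign)) / len(points)


# Invented job ads (hypothetical stand-ins for crawled, segmented postings).
ads = [
    "hadoop spark engineer",
    "spark hadoop developer",
    "data analyst sql",
    "sql data analyst report",
    "product manager data",
    "product manager roadmap",
]
points = vectorize(ads)
wss = {k: avg_within_ss(points, *kmeans(points, k)) for k in (1, 2, 3)}
assign3, _ = kmeans(points, 3)
```

Plotting `wss` against `k` and looking for the point where the decrease levels off is the usual way to read such a curve; on this toy data the three job families separate at `k = 3`.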
50

Antons, David, and Christoph F. Breidbach. "Big Data, Big Insights? Advancing Service Innovation and Design With Machine Learning." Journal of Service Research 21, no. 1 (December 11, 2017): 17–39. http://dx.doi.org/10.1177/1094670517738373.

Abstract:
Service innovation is intertwined with service design, and knowledge from both fields should be integrated to advance theoretical and normative insights. However, studies bridging service innovation and service design are in their infancy. This is because the body of service innovation and service design research is large and heterogeneous, which makes it difficult, if not impossible, for any human to read and understand its entire content and to delineate appropriate guidelines on how to broaden the scope of either field. Our work addresses this challenge by presenting the first application of topic modeling, a type of machine learning, to review and analyze currently available service innovation and service design research (n = 641 articles with 10,543 pages of written text or 4,119,747 words). We provide an empirical contribution to service research by identifying and analyzing 69 distinct research topics in the published text corpus, a theoretical contribution by delineating an extensive research agenda consisting of four research directions and 12 operationalizable guidelines to facilitate cross-fertilization between the two fields, and a methodological contribution by introducing and demonstrating the applicability of topic modeling and machine learning as a novel type of big data analytics to our discipline.