
Contents

Journal articles
Theses
Book chapters
Conference proceedings

Academic literature on the topic "Text Stream Clustering"

Author: Grafiati

Published: December 28, 2024


Browse the topical lists of articles, books, theses, conference proceedings, and other scholarly sources on the topic "Text Stream Clustering".


Journal articles on the topic "Text Stream Clustering"

1

Vo, Tham, and Phuc Do. "GOW-Stream: A novel approach of graph-of-words based mixture model for semantic-enhanced text stream clustering." Intelligent Data Analysis 25, no. 5 (September 15, 2021): 1211–31. http://dx.doi.org/10.3233/ida-205443.

Abstract

Recently, the rapid growth of social networks and online news resources on the Internet has made text stream clustering an essential application in multiple domains (e.g., text retrieval diversification, social event detection, and text summarization). Unlike traditional static text clustering, the text stream clustering task has specific key challenges related to the rapid change of topics/clusters and the high velocity of incoming streaming document batches. Recent well-known model-based text stream clustering models, such as DTM, DCT, and MStream, are considered word-independent evaluation approaches, meaning that they largely ignore the relations between words while sampling clusters/topics. This leads to a decrease in overall model accuracy, especially for short text documents such as comments and microblogs in social networks. To tackle these problems, in this paper we propose a novel graph-of-words (GOW) based approach to text stream clustering, called GOW-Stream. Applying the common GOWs generated from each document batch while sampling clusters/topics helps overcome the word-independent evaluation challenge. Our proposed GOW-Stream promises significantly better text stream clustering performance than recent state-of-the-art baselines. Extensive experiments on multiple real-world benchmark datasets demonstrate the effectiveness of our proposed model in terms of both accuracy and running time.


2

Qiang, Jipeng, Wanyin Xu, Yun Li, Yunhao Yuan, and Yi Zhu. "Lifelong Learning Augmented Short Text Stream Clustering Method." IEEE Access 9 (2021): 70493–501. http://dx.doi.org/10.1109/access.2021.3078096.


3

Gong, Linghui, Jianping Zeng, and Shiyong Zhang. "Text stream clustering algorithm based on adaptive feature selection." Expert Systems with Applications 38, no. 3 (March 2011): 1393–99. http://dx.doi.org/10.1016/j.eswa.2010.07.041.


4

Ma, Hui Fang, and Hui Li Ma. "Combining Burst Detection for Hot Topic Extraction." Advanced Materials Research 268-270 (July 2011): 1283–88. http://dx.doi.org/10.4028/www.scientific.net/amr.268-270.1283.

Abstract

As traditional text representations are not suitable for online dynamic streams, this paper presents a hot topic extraction technique that can be used for tracking news topics over time. The model incorporates individual word bursts into the document-word vector representation, which emphasizes the temporal features of text streams. A burst detection approach based on an energy ratio threshold is proposed, and TF-PDF is then combined with it to weight the terms. Experimental results demonstrate that this model is effective for topic extraction from news streams and that it improves clustering performance.


5

Taninpong, Phimphaka, and Sudsanguan Ngamsuriyaroj. "Tree-based text stream clustering with application to spam mail classification." International Journal of Data Mining, Modelling and Management 10, no. 4 (2018): 353. http://dx.doi.org/10.1504/ijdmmm.2018.095354.


6

Ngamsuriyaroj, Sudsanguan, and Phimphaka Taninpong. "Tree-based text stream clustering with application to spam mail classification." International Journal of Data Mining, Modelling and Management 10, no. 4 (2018): 353. http://dx.doi.org/10.1504/ijdmmm.2018.10015879.


7

Li, Pei, and Ze Deng. "Use of Distributed Semi-Supervised Clustering for Text Classification." Journal of Circuits, Systems and Computers 28, no. 08 (July 2019): 1950127. http://dx.doi.org/10.1142/s0218126619501275.

Abstract

Text classification is an important way to handle and organize textual data. Among existing methods of text classification, semi-supervised clustering is a mainstream technique. In the era of ‘Big data’, current semi-supervised clustering approaches for text classification are generally not applicable to massive text data because of their excessive costs in scalability and computing performance. To address this problem, this study proposes D-TESC, a scalable text classification algorithm for large-scale text collections, obtained by modifying a state-of-the-art centralized semi-supervised clustering approach for text classification (TESC). D-TESC can process textual data in a distributed manner to achieve high scalability. The experimental results indicate that the D-TESC algorithm (1) has classification quality comparable to TESC, and (2) outperforms TESC by an average of 7.2 times in terms of scalability when using eight CPU threads.


8

Chen, Junyang, Zhiguo Gong, and Weiwen Liu. "A Dirichlet process biterm-based mixture model for short text stream clustering." Applied Intelligence 50, no. 5 (February 1, 2020): 1609–19. http://dx.doi.org/10.1007/s10489-019-01606-1.


9

Kumar, Sushil, and Komal Kumar Bhatia. "Clustering Based Approach for Novelty Detection in Text Documents." Asian Journal of Computer Science and Technology 8, no. 2 (May 5, 2019): 116–21. http://dx.doi.org/10.51983/ajcst-2019.8.2.2130.

Abstract

Because information on the internet is overloaded, retrieving information for a given query returns redundant and irrelevant results. It is necessary to retrieve information that is both relevant and novel for the user's query, so that the user needs minimal effort to satisfy the information need. In this work we propose a clustering-based approach for novelty detection that provides relevant and novel documents for the information need. Based on the user query, the incoming stream of documents is clustered using the k-means algorithm. Cluster heads are then selected from the various clusters with the minimum distance. These cluster heads are the novel documents from a collection of documents from different clusters having the large distance. The proposed technique can be further used in the field of information retrieval.
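
One possible reading of the k-means step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy 2-D document vectors, the function names `kmeans` and `cluster_heads`, the deterministic seeding with the first k points, and the interpretation of a "cluster head" as the member nearest to its cluster centroid are all hypothetical choices made for the example.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


def cluster_heads(points, labels, centroids):
    """For each cluster, return the index of the member closest to its centroid."""
    heads = {}
    for c, centroid in enumerate(centroids):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            heads[c] = min(members, key=lambda i: dist(points[i], centroid))
    return heads


# Hypothetical document vectors forming two well-separated groups.
docs = [(0.0, 0.0), (10.0, 10.0), (1.0, 0.0), (0.0, 1.0), (11.0, 10.0), (10.0, 11.0)]
labels, centroids = kmeans(docs, k=2)
heads = cluster_heads(docs, labels, centroids)
print(labels)  # [0, 1, 0, 0, 1, 1]
print(heads)   # {0: 0, 1: 1}
```

In a real setting the vectors would come from a text representation such as TF-IDF, and the seeding and head-selection rules would need to follow the paper itself.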


10

Hamou, Reda Mohamed, Abdelmalek Amine, and Ahmed Chaouki Lokbani. "The Social Spiders in the Clustering of Texts." International Journal of Artificial Life Research 3, no. 3 (July 2012): 1–14. http://dx.doi.org/10.4018/jalr.2012070101.

Abstract

In this paper the authors experiment with and test a new biomimetic approach based on social spiders to solve a combinatorial problem, i.e., the automatic classification of texts, motivated by the very large data streams that flow, particularly on the web. Textual data was represented by a language-independent method, i.e., character and word n-grams, because there is currently no learning method that can directly represent unstructured data (text). To validate the classification, the authors used an evaluation measure based on recall and precision (F-measure). During the experiment, the authors found social spiders to be a powerful visualization tool, which they exploit to perform visual classification.


Theses on the topic "Text Stream Clustering"

1

Crossman, Nathaniel C. "Stream Clustering And Visualization Of Geotagged Text Data For Crisis Management." Wright State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=wright1590957641168863.


2

Wang, Ye. "Robust Text Mining in Online Social Network Context." Thesis, 2018. https://vuir.vu.edu.au/38645/.

Abstract

Text mining is involved in a broad scope of applications in diverse domains that mainly, but not exclusively, serve political, commercial, medical and academic needs. With the rapid development of Internet technology over the past thirty years and the advent of online social media and networks in the past decade, text data has taken on the features of online social data streams, for example, explosive growth, constantly changing content and huge volume. As a result, text mining is no longer oriented merely to the textual content itself, but requires consideration of its surroundings and a combination of theories and techniques from stream processing and social network analysis, which gives rise to a wide range of applications used for understanding thoughts spread over the world, such as sentiment analysis, mass surveillance and market prediction. Topic detection, which automatically discovers sequences of words that represent appropriate themes in a collection of documents, is closely associated with document clustering and classification. These two tasks play integral roles in revealing deep insight into the text content in the whole text mining framework. However, most existing detection techniques cannot adapt to the dynamic social context, which exposes bottlenecks in detection performance and deficiencies of topic models. In this thesis, we focus on text data streams, investigating novel techniques and solutions for robust text mining to tackle the challenges arising from the online social context by incorporating methodologies of stream processing, topic detection, and document clustering and classification. In particular, we have advanced the state of the art by making the following contributions: 1. A Multi-Window based Ensemble Learning (MWEL) framework is proposed for imbalanced streaming data that comprehensively improves the classification performance. MWEL ensures that the ensemble classifier is kept up to date and adaptive to the evolving data distribution by applying a multi-window monitoring mechanism and an efficient updating strategy. 2. A semi-supervised learning method is proposed to detect latent topics from news streams and the corresponding social context, with a constraint propagation scheme that exploits the hidden geometrical structure as supervised information in the given data space. A collective learning algorithm is proposed to integrate the textual content into the social context. A locally weighted scheme is afterwards proposed to improve the stability of the algorithm. 3. A Robust Hierarchical Ensemble (RHE) framework is introduced to enhance the robustness of the topic model. On the one hand, it reduces the repercussions caused by outliers and noise, and on the other, it overcomes inherent defects of text data. RHE adapts to the changing distribution of the text stream by constructing a flexible document hierarchy which can be dynamically adjusted. A discussion of how to extract the most valuable social context is conducted with experiments for the purpose of removing some noises from the surroundings and efficiency of the proposed.


Book chapters on the topic "Text Stream Clustering"

1

Sharma, Iti, Aaditya Jain, and Harish Sharma. "Stream and Online Clustering for Text Documents." In International Conference on Advanced Computing Networking and Informatics, 469–75. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-13-2673-8_49.


2

Olariu, Andrei. "Hierarchical Clustering in Improving Microblog Stream Summarization." In Computational Linguistics and Intelligent Text Processing, 424–35. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-37256-8_35.


3

Li, Chunshan, Yunming Ye, Xiaofeng Zhang, Dianhui Chu, Shengchun Deng, and Xiaofei Xu. "Clustering Based Topic Events Detection on Text Stream." In Intelligent Information and Database Systems, 42–52. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-05476-6_5.


4

Molina, Roberto, Waldo Hasperué, and Augusto Villa Monte. "D3CAS: Distributed Clustering Algorithm Applied to Short-Text Stream Processing." In Communications in Computer and Information Science, 211–20. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-20787-8_15.


5

Attaoui, Mohammed Oualid, Mustapha Lebbah, Nabil Keskes, Hanene Azzag, and Mohammed Ghesmoune. "Soft Subspace Growing Neural Gas for Data Stream Clustering." In Artificial Neural Networks and Machine Learning – ICANN 2019: Text and Time Series, 569–80. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-30490-4_46.


6

Joshi, Basanta, Umanga Bista, and Manoj Ghimire. "Intelligent Clustering Scheme for Log Data Streams." In Computational Linguistics and Intelligent Text Processing, 454–65. Berlin, Heidelberg: Springer Berlin Heidelberg, 2014. http://dx.doi.org/10.1007/978-3-642-54903-8_38.


7

Liu, Yubao, Jiarong Cai, Jian Yin, and Ada Wai-Chee Fu. "Clustering Massive Text Data Streams by Semantic Smoothing Model." In Advanced Data Mining and Applications, 389–400. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007. http://dx.doi.org/10.1007/978-3-540-73871-8_36.


8

Luo, Yonghong, Ying Zhang, Xiaoke Ding, Xiangrui Cai, Chunyao Song, and Xiaojie Yuan. "StrDip: A Fast Data Stream Clustering Algorithm Using the Dip Test of Unimodality." In Web Information Systems Engineering – WISE 2018, 193–208. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-02925-8_14.


9

Zhao, Yanchang, Longbing Cao, Huaifeng Zhang, and Chengqi Zhang. "Data Clustering." In Handbook of Research on Innovations in Database Technologies and Applications, 562–72. IGI Global, 2009. http://dx.doi.org/10.4018/978-1-60566-242-8.ch060.

Abstract

Clustering is one of the most important techniques in data mining. This chapter presents a survey of popular approaches for data clustering, including well-known clustering techniques, such as partitioning clustering, hierarchical clustering, density-based clustering and grid-based clustering, and recent advances in clustering, such as subspace clustering, text clustering and data stream clustering. The major challenges and future trends of data clustering will also be introduced in this chapter. The remainder of this chapter is organized as follows. The background of data clustering will be introduced in Section 2, including the definition of clustering, categories of clustering techniques, features of good clustering algorithms, and the validation of clustering. Section 3 will present main approaches for clustering, which range from the classic partitioning and hierarchical clustering to recent approaches of bi-clustering and semisupervised clustering. Challenges and future trends will be discussed in Section 4, followed by the conclusions in the last section.


10

Park, Jun Pyo, Chang-Sup Park, and Yon Dohn Chung. "Energy and Latency Efficient Access of Wireless XML Stream." In Cross-Disciplinary Models and Applications of Database Management, 57–79. IGI Global, 2012. http://dx.doi.org/10.4018/978-1-61350-471-0.ch003.

Abstract

In this article, we address the problem of delayed query processing raised by tree-based index structures in wireless broadcast environments, which increases the access time of mobile clients. We propose a novel distributed index structure and a clustering strategy for streaming XML data that enables energy and latency-efficient broadcasting of XML data. We first define the DIX node structure to implement a fully distributed index structure which contains the tag name, attributes, and text content of an element, as well as its corresponding indices. By exploiting the index information in the DIX node stream, a mobile client can access the stream with shorter latency. We also suggest a method of clustering DIX nodes in the stream, which can further enhance the performance of query processing in the mobile clients. Through extensive experiments, we demonstrate that our approach is effective for wireless broadcasting of XML data and outperforms the previous methods.


Conference proceedings on the topic "Text Stream Clustering"

1

Rao, Y., and X. J. Li. "A Topic-based Dynamic Clustering Algorithm for Text Stream." In 2015 International Conference on Artificial Intelligence and Industrial Engineering. Paris, France: Atlantis Press, 2015. http://dx.doi.org/10.2991/aiie-15.2015.130.


2

Kalogeratos, Argyris, Panagiotis Zagorisios, and Aristidis Likas. "Improving Text Stream Clustering using Term Burstiness and Co-burstiness." In SETN '16: 9th Hellenic Conference on Artificial Intelligence. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2903220.2903229.


3

Crossman, Nathaniel C., Soon M. Chung, and Vincent A. Schmidt. "Stream Clustering and Visualization of Geotagged Text Data for Crisis Management." In 2019 International Conference on Data and Software Engineering (ICoDSE). IEEE, 2019. http://dx.doi.org/10.1109/icodse48700.2019.9092760.


4

Crossman, Nathaniel C., and Soon M. Chung. "GPU-Accelerated Stream Clustering of Geotagged Text Data for Crisis Management." In 2022 International Conference on Data and Software Engineering (ICoDSE). IEEE, 2022. http://dx.doi.org/10.1109/icodse56892.2022.9971926.


5

Kumar, Jay, Junming Shao, Salah Uddin, and Wazir Ali. "An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering." In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.acl-main.70.


6

Rakib, Md Rashadul Hasan, Norbert Zeh, and Evangelos Milios. "Short Text Stream Clustering via Frequent Word Pairs and Reassignment of Outliers to Clusters." In DocEng '20: ACM Symposium on Document Engineering 2020. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3395027.3419589.


7

Si, XianLiang, Peipei Li, Xuegang Hu, and Yuhong Zhang. "An Online Dirichlet Model based on Sentence Embedding and DBSCAN for Noisy Short Text Stream Clustering." In 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 2022. http://dx.doi.org/10.1109/ijcnn55064.2022.9892414.


8

Rakib, Md Rashadul Hasan, Norbert Zeh, and Evangelos Milios. "Efficient clustering of short text streams using online-offline clustering." In DocEng '21: ACM Symposium on Document Engineering 2021. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3469096.3469866.


9

He, Qi, Kuiyu Chang, Ee-Peng Lim, and Jun Zhang. "Bursty Feature Representation for Clustering Text Streams." In Proceedings of the 2007 SIAM International Conference on Data Mining. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2007. http://dx.doi.org/10.1137/1.9781611972771.50.


10

Zhao, Yukun, Shangsong Liang, Zhaochun Ren, Jun Ma, Emine Yilmaz, and Maarten de Rijke. "Explainable User Clustering in Short Text Streams." In SIGIR '16: The 39th International ACM SIGIR conference on research and development in Information Retrieval. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2911451.2911522.
