Selected scientific literature on the topic "Text Stream Clustering"

Author: Grafiati

Published: December 28, 2024

Browse the list of current articles, books, theses, conference proceedings, and other scholarly sources relevant to the topic "Text Stream Clustering".

Journal articles on the topic "Text Stream Clustering"

1

Vo, Tham, and Phuc Do. "GOW-Stream: A novel approach of graph-of-words based mixture model for semantic-enhanced text stream clustering". Intelligent Data Analysis 25, no. 5 (September 15, 2021): 1211–31. http://dx.doi.org/10.3233/ida-205443.

Abstract:

Recently, the rapid growth of social networks and online news resources on the Internet has made text stream clustering an important application in multiple domains (e.g., text retrieval diversification, social event detection, and text summarization). Unlike traditional static text clustering, the text stream clustering task has specific key challenges related to the rapid change of topics/clusters and the high velocity of incoming streaming document batches. Recent well-known model-based text stream clustering models, such as DTM, DCT, and MStream, are considered word-independent evaluation approaches, which means they largely ignore the relations between words while sampling clusters/topics. This leads to a decrease in overall model accuracy, especially for short text documents such as comments and microblogs in social networks. To tackle these problems, in this paper we propose a novel graph-of-words (GOW) based approach to text stream clustering, called GOW-Stream. Applying the common GOWs generated from each document batch while sampling clusters/topics helps overcome the word-independent evaluation challenge. Our proposed GOW-Stream promises to achieve significantly better text stream clustering performance than recent state-of-the-art baselines. Extensive experiments on multiple real-world benchmark datasets demonstrate the effectiveness of our proposed model in terms of both accuracy and running time.
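To make the graph-of-words idea in this abstract concrete, here is a minimal sketch. It assumes a simple sliding-window co-occurrence graph and treats an edge as "common" when it appears in at least `min_support` documents of a batch. The function names, the window size, and the support threshold are illustrative assumptions; the abstract does not specify how GOW-Stream actually mines its common graphs.

```python
from collections import Counter


def word_graph_edges(tokens, window=2):
    """Return the undirected edges of a graph-of-words for one document.

    Two distinct words are linked when they occur within `window`
    consecutive positions (window=2 links only adjacent words).
    """
    edges = set()
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                edges.add(tuple(sorted((word, other))))
    return edges


def common_edges(batch, window=2, min_support=2):
    """Keep the edges that occur in at least `min_support` documents.

    This is a simplified stand-in for "common GOWs" of a document batch.
    """
    support = Counter()
    for tokens in batch:
        support.update(word_graph_edges(tokens, window))
    return {edge for edge, count in support.items() if count >= min_support}


batch = [
    "breaking news earthquake hits city".split(),
    "earthquake hits coastal city".split(),
    "football final tonight".split(),
]
shared = common_edges(batch)
print(shared)
```

Edges shared across documents carry word-relation information that a bag-of-words representation discards, which is the limitation the abstract attributes to word-independent models.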

2

Qiang, Jipeng, Wanyin Xu, Yun Li, Yunhao Yuan, and Yi Zhu. "Lifelong Learning Augmented Short Text Stream Clustering Method". IEEE Access 9 (2021): 70493–501. http://dx.doi.org/10.1109/access.2021.3078096.

3

Gong, Linghui, Jianping Zeng, and Shiyong Zhang. "Text stream clustering algorithm based on adaptive feature selection". Expert Systems with Applications 38, no. 3 (March 2011): 1393–99. http://dx.doi.org/10.1016/j.eswa.2010.07.041.

4

Ma, Hui Fang, and Hui Li Ma. "Combining Burst Detection for Hot Topic Extraction". Advanced Materials Research 268-270 (July 2011): 1283–88. http://dx.doi.org/10.4028/www.scientific.net/amr.268-270.1283.

Abstract:

As traditional text representations are not suitable for online dynamic streams, this paper presents a hot topic extraction technique that can be used for tracking news topics over time. The model incorporates individual word bursts into the document-word vector representation, which emphasizes the temporal features of text streams. A burst detection approach based on an energy ratio threshold is proposed, and TF-PDF is then combined with it to weight the terms. Experimental results demonstrate that the model is effective in topic extraction for news streams and that it improves clustering performance.
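The general idea of threshold-based burst detection can be sketched as follows. This is not the paper's energy ratio formula, which the abstract does not give. It is a simplified stand-in that flags a word when its count in the latest time window is at least `threshold` times its smoothed average count in earlier windows. The function name, the add-one smoothing, and the default threshold are all assumptions made for illustration.

```python
def bursty_words(window_counts, threshold=3.0):
    """Flag words whose latest count jumps well above their history.

    window_counts: list of dicts (word -> count), oldest window first.
    A word is bursty when current_count / (mean_past_count + 1) >= threshold.
    The +1 is an assumed smoothing term that avoids division by zero
    for words that never appeared before.
    """
    *history, current = window_counts
    bursts = set()
    for word, count in current.items():
        if history:
            past = sum(w.get(word, 0) for w in history) / len(history)
        else:
            past = 0.0
        if count / (past + 1.0) >= threshold:
            bursts.add(word)
    return bursts


windows = [
    {"election": 2, "weather": 5},
    {"election": 1, "weather": 4},
    {"election": 12, "weather": 5, "storm": 2},
]
print(bursty_words(windows))
```

In a pipeline like the one the abstract describes, the flagged words would then receive a higher weight in the document-word vectors before clustering.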

5

Taninpong, Phimphaka, and Sudsanguan Ngamsuriyaroj. "Tree-based text stream clustering with application to spam mail classification". International Journal of Data Mining, Modelling and Management 10, no. 4 (2018): 353. http://dx.doi.org/10.1504/ijdmmm.2018.095354.

6

Ngamsuriyaroj, Sudsanguan, and Phimphaka Taninpong. "Tree-based text stream clustering with application to spam mail classification". International Journal of Data Mining, Modelling and Management 10, no. 4 (2018): 353. http://dx.doi.org/10.1504/ijdmmm.2018.10015879.

7

Li, Pei, and Ze Deng. "Use of Distributed Semi-Supervised Clustering for Text Classification". Journal of Circuits, Systems and Computers 28, no. 08 (July 2019): 1950127. http://dx.doi.org/10.1142/s0218126619501275.

Abstract:

Text classification is an important way to handle and organize textual data. Among existing methods of text classification, semi-supervised clustering is a mainstream technique. In the era of ‘Big data’, current semi-supervised clustering approaches for text classification generally do not apply to massive text data because of their excessive costs in scalability and computing performance. To address this problem, this study proposes D-TESC, a scalable text classification algorithm for large-scale text collections, obtained by modifying a state-of-the-art centralized semi-supervised clustering approach for text classification (TESC). D-TESC can process textual data in a distributed manner to achieve high scalability. The experimental results indicate that D-TESC (1) has classification quality comparable to that of TESC and (2) outperforms TESC by an average factor of 7.2 in terms of scalability when using eight CPU threads.

8

Chen, Junyang, Zhiguo Gong, and Weiwen Liu. "A Dirichlet process biterm-based mixture model for short text stream clustering". Applied Intelligence 50, no. 5 (February 1, 2020): 1609–19. http://dx.doi.org/10.1007/s10489-019-01606-1.

9

Kumar, Sushil, and Komal Kumar Bhatia. "Clustering Based Approach for Novelty Detection in Text Documents". Asian Journal of Computer Science and Technology 8, no. 2 (May 5, 2019): 116–21. http://dx.doi.org/10.51983/ajcst-2019.8.2.2130.

Abstract:

Because information on the internet is overloaded, retrieving information for a given query returns redundant and irrelevant results. It is necessary to retrieve relevant and novel information for the user's query, so that the user needs minimal effort to satisfy the information need. In this work we propose a clustering-based approach to novelty detection that provides relevant and novel documents for the information need. Based on the user query, the incoming stream of documents is clustered using the k-means algorithm. Cluster heads are then selected from the various clusters by minimum distance. These cluster heads, drawn from different clusters separated by a large distance, are taken as the novel documents of the collection. The proposed technique can be further used in the field of information retrieval.
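One plausible reading of the procedure in this abstract is: cluster the document vectors with k-means, then take the document nearest each centroid as that cluster's head. The sketch below follows that reading under stated assumptions: documents are already represented as small numeric vectors, the first `k` points seed the centroids so the run is deterministic, and "minimum distance" means Euclidean distance to the centroid. None of these choices is confirmed by the abstract.

```python
import math


def kmeans(points, k, iterations=10):
    """Plain k-means with the first k points as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment


def cluster_heads(points, k):
    """Return, per cluster, the index of the point closest to the centroid."""
    centroids, assignment = kmeans(points, k)
    heads = []
    for c in range(k):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            heads.append(
                min(members, key=lambda i: math.dist(points[i], centroids[c]))
            )
    return heads


# Toy 2-D "document vectors" forming two well-separated groups.
docs = [(1.0, 0.0), (0.0, 1.0), (0.9, 0.1), (0.2, 0.8), (0.8, 0.0), (0.1, 1.0)]
heads = cluster_heads(docs, k=2)
print(heads)
```

Returning one head per cluster gives a short list of mutually dissimilar documents, which is how the abstract motivates using cluster heads as the novel results.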

10

Hamou, Reda Mohamed, Abdelmalek Amine, and Ahmed Chaouki Lokbani. "The Social Spiders in the Clustering of Texts". International Journal of Artificial Life Research 3, no. 3 (July 2012): 1–14. http://dx.doi.org/10.4018/jalr.2012070101.

Abstract:

In this paper the authors experiment with and test a new biomimetic approach based on social spiders to solve a combinatorial problem, i.e., the automatic classification of texts, motivated by the very large data streams that flow on the web in particular. Textual data were represented with a language-independent method, i.e., character and word n-grams, because there is currently no learning method that can directly represent unstructured data (text). To validate the classification, the authors used an evaluation measure based on recall and precision (F-measure). During the experiment, the authors found social spiders to be a powerful visualization tool, which they exploit to perform visual classification.
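The two standard building blocks named in this abstract, character n-gram representation and the F-measure, can be sketched briefly. The function names, the lowercasing step, and the choice of the balanced F1 variant are assumptions for illustration; the abstract does not state the n-gram size or the exact F-measure weighting the authors used.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams (language-independent features)."""
    text = text.lower()
    return Counter(text[i : i + n] for i in range(len(text) - n + 1))


def f_measure(predicted, relevant):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(relevant)
    return 2 * precision * recall / (precision + recall)


features = char_ngrams("abab", n=2)
score = f_measure(predicted=[1, 2, 3], relevant=[2, 3, 4])
print(features, score)
```

Character n-grams need no tokenizer or stop-word list, which is why they suit the language-independent setting the abstract describes.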

Theses on the topic "Text Stream Clustering"

1

Crossman, Nathaniel C. "Stream Clustering And Visualization Of Geotagged Text Data For Crisis Management". Wright State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=wright1590957641168863.

2

Wang, Ye. "Robust Text Mining in Online Social Network Context". Thesis, 2018. https://vuir.vu.edu.au/38645/.

Abstract:

Text mining is involved in a broad scope of applications in diverse domains that mainly, but not exclusively, serve political, commercial, medical and academic needs. With the rapid development of Internet technology over the past thirty years and the advent of online social media and networks in the last decade, text data has come to exhibit the features of online social data streams, for example explosive growth, constantly changing content and huge volume. As a result, text mining is no longer oriented merely to the textual content itself, but requires consideration of its surroundings and a combination of theories and techniques from stream processing and social network analysis, which has given birth to a wide range of applications used for understanding thoughts spread across the world, such as sentiment analysis, mass surveillance and market prediction. Topic detection, which automatically discovers sequences of words that represent appropriate themes in a collection of documents, is closely associated with document clustering and classification. These two tasks play integral roles in revealing deep insight into the text content within the whole text mining framework. However, most existing detection techniques cannot adapt to the dynamic social context, which exposes bottlenecks in detection performance and deficiencies in topic models. In this thesis, we focus on text data streams, investigating novel techniques and solutions for robust text mining to tackle the challenges that arise in the online social context by incorporating methodologies of stream processing, topic detection, and document clustering and classification. In particular, we have advanced the state of the art by making the following contributions: 1. A Multi-Window based Ensemble Learning (MWEL) framework is proposed for imbalanced streaming data that comprehensively improves classification performance. MWEL ensures that the ensemble classifier is kept up to date and adaptive to the evolving data distribution by applying a multi-window monitoring mechanism and an efficient updating strategy. 2. A semi-supervised learning method is proposed to detect latent topics from news streams and the corresponding social context, with a constraint propagation scheme that exploits the hidden geometrical structure of the given data space as supervised information. A collective learning algorithm is proposed to integrate the textual content into the social context. A locally weighted scheme is then proposed to improve the stability of the algorithm. 3. A Robust Hierarchical Ensemble (RHE) framework is introduced to enhance the robustness of the topic model. On the one hand, it reduces the repercussions of outliers and noise; on the other, it overcomes inherent defects of text data. RHE adapts to the changing distribution of the text stream by constructing a flexible document hierarchy that can be dynamically adjusted. A discussion of how to extract the most valuable social context is conducted with experiments for the purpose of removing some noises from the surroundings and efficiency of the proposed.

Book chapters on the topic "Text Stream Clustering"

1

Sharma, Iti, Aaditya Jain, and Harish Sharma. "Stream and Online Clustering for Text Documents". In International Conference on Advanced Computing Networking and Informatics, 469–75. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-13-2673-8_49.

2

Olariu, Andrei. "Hierarchical Clustering in Improving Microblog Stream Summarization". In Computational Linguistics and Intelligent Text Processing, 424–35. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-37256-8_35.

3

Li, Chunshan, Yunming Ye, Xiaofeng Zhang, Dianhui Chu, Shengchun Deng, and Xiaofei Xu. "Clustering Based Topic Events Detection on Text Stream". In Intelligent Information and Database Systems, 42–52. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-05476-6_5.

4

Molina, Roberto, Waldo Hasperué, and Augusto Villa Monte. "D3CAS: Distributed Clustering Algorithm Applied to Short-Text Stream Processing". In Communications in Computer and Information Science, 211–20. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-20787-8_15.

5

Attaoui, Mohammed Oualid, Mustapha Lebbah, Nabil Keskes, Hanene Azzag, and Mohammed Ghesmoune. "Soft Subspace Growing Neural Gas for Data Stream Clustering". In Artificial Neural Networks and Machine Learning – ICANN 2019: Text and Time Series, 569–80. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-30490-4_46.

6

Joshi, Basanta, Umanga Bista, and Manoj Ghimire. "Intelligent Clustering Scheme for Log Data Streams". In Computational Linguistics and Intelligent Text Processing, 454–65. Berlin, Heidelberg: Springer Berlin Heidelberg, 2014. http://dx.doi.org/10.1007/978-3-642-54903-8_38.

7

Liu, Yubao, Jiarong Cai, Jian Yin, and Ada Wai-Chee Fu. "Clustering Massive Text Data Streams by Semantic Smoothing Model". In Advanced Data Mining and Applications, 389–400. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007. http://dx.doi.org/10.1007/978-3-540-73871-8_36.

8

Luo, Yonghong, Ying Zhang, Xiaoke Ding, Xiangrui Cai, Chunyao Song, and Xiaojie Yuan. "StrDip: A Fast Data Stream Clustering Algorithm Using the Dip Test of Unimodality". In Web Information Systems Engineering – WISE 2018, 193–208. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-02925-8_14.

9

Zhao, Yanchang, Longbing Cao, Huaifeng Zhang, and Chengqi Zhang. "Data Clustering". In Handbook of Research on Innovations in Database Technologies and Applications, 562–72. IGI Global, 2009. http://dx.doi.org/10.4018/978-1-60566-242-8.ch060.

Abstract:

Clustering is one of the most important techniques in data mining. This chapter presents a survey of popular approaches to data clustering, including well-known clustering techniques, such as partitioning clustering, hierarchical clustering, density-based clustering and grid-based clustering, and recent advances in clustering, such as subspace clustering, text clustering and data stream clustering. The major challenges and future trends of data clustering are also introduced in this chapter. The remainder of this chapter is organized as follows. The background of data clustering is introduced in Section 2, including the definition of clustering, categories of clustering techniques, features of good clustering algorithms, and the validation of clustering. Section 3 presents the main approaches to clustering, which range from classic partitioning and hierarchical clustering to recent approaches such as bi-clustering and semi-supervised clustering. Challenges and future trends are discussed in Section 4, followed by the conclusions in the last section.

10

Park, Jun Pyo, Chang-Sup Park, and Yon Dohn Chung. "Energy and Latency Efficient Access of Wireless XML Stream". In Cross-Disciplinary Models and Applications of Database Management, 57–79. IGI Global, 2012. http://dx.doi.org/10.4018/978-1-61350-471-0.ch003.

Abstract:

In this article, we address the problem of delayed query processing raised by tree-based index structures in wireless broadcast environments, which increases the access time of mobile clients. We propose a novel distributed index structure and a clustering strategy for streaming XML data that enables energy and latency-efficient broadcasting of XML data. We first define the DIX node structure to implement a fully distributed index structure which contains the tag name, attributes, and text content of an element, as well as its corresponding indices. By exploiting the index information in the DIX node stream, a mobile client can access the stream with shorter latency. We also suggest a method of clustering DIX nodes in the stream, which can further enhance the performance of query processing in the mobile clients. Through extensive experiments, we demonstrate that our approach is effective for wireless broadcasting of XML data and outperforms the previous methods.

Conference papers on the topic "Text Stream Clustering"

1

Rao, Y., and X. J. Li. "A Topic-based Dynamic Clustering Algorithm for Text Stream". In 2015 International Conference on Artificial Intelligence and Industrial Engineering. Paris, France: Atlantis Press, 2015. http://dx.doi.org/10.2991/aiie-15.2015.130.

2

Kalogeratos, Argyris, Panagiotis Zagorisios, and Aristidis Likas. "Improving Text Stream Clustering using Term Burstiness and Co-burstiness". In SETN '16: 9th Hellenic Conference on Artificial Intelligence. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2903220.2903229.

3

Crossman, Nathaniel C., Soon M. Chung, and Vincent A. Schmidt. "Stream Clustering and Visualization of Geotagged Text Data for Crisis Management". In 2019 International Conference on Data and Software Engineering (ICoDSE). IEEE, 2019. http://dx.doi.org/10.1109/icodse48700.2019.9092760.

4

Crossman, Nathaniel C., and Soon M. Chung. "GPU-Accelerated Stream Clustering of Geotagged Text Data for Crisis Management". In 2022 International Conference on Data and Software Engineering (ICoDSE). IEEE, 2022. http://dx.doi.org/10.1109/icodse56892.2022.9971926.

5

Kumar, Jay, Junming Shao, Salah Uddin, and Wazir Ali. "An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering". In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.acl-main.70.

6

Rakib, Md Rashadul Hasan, Norbert Zeh, and Evangelos Milios. "Short Text Stream Clustering via Frequent Word Pairs and Reassignment of Outliers to Clusters". In DocEng '20: ACM Symposium on Document Engineering 2020. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3395027.3419589.

7

Si, XianLiang, Peipei Li, Xuegang Hu, and Yuhong Zhang. "An Online Dirichlet Model based on Sentence Embedding and DBSCAN for Noisy Short Text Stream Clustering". In 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 2022. http://dx.doi.org/10.1109/ijcnn55064.2022.9892414.

8

Rakib, Md Rashadul Hasan, Norbert Zeh, and Evangelos Milios. "Efficient clustering of short text streams using online-offline clustering". In DocEng '21: ACM Symposium on Document Engineering 2021. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3469096.3469866.

9

He, Qi, Kuiyu Chang, Ee-Peng Lim, and Jun Zhang. "Bursty Feature Representation for Clustering Text Streams". In Proceedings of the 2007 SIAM International Conference on Data Mining. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2007. http://dx.doi.org/10.1137/1.9781611972771.50.

10

Zhao, Yukun, Shangsong Liang, Zhaochun Ren, Jun Ma, Emine Yilmaz, and Maarten de Rijke. "Explainable User Clustering in Short Text Streams". In SIGIR '16: The 39th International ACM SIGIR conference on research and development in Information Retrieval. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2911451.2911522.
