Selected scholarly literature on the topic "Text Stream Clustering"

The lists below cover current articles, books, dissertations, reports, and other scholarly sources on the topic "Text Stream Clustering".

Journal articles on the topic "Text Stream Clustering"

1

Vo, Tham, and Phuc Do. "GOW-Stream: A novel approach of graph-of-words based mixture model for semantic-enhanced text stream clustering." Intelligent Data Analysis 25, no. 5 (September 15, 2021): 1211–31. http://dx.doi.org/10.3233/ida-205443.

Abstract:
Recently, the rapid growth of social networks and online news resources on the Internet has made text stream clustering an essential application in multiple domains (e.g., text retrieval diversification, social event detection, and text summarization). Unlike traditional static text clustering, text stream clustering faces specific challenges related to the rapid change of topics/clusters and the high velocity of incoming document batches. Recent well-known model-based text stream clustering models, such as DTM, DCT, and MStream, take a word-independent evaluation approach, which largely ignores the relations between words while sampling clusters/topics. This lowers overall model accuracy, especially for short text documents such as comments and microblogs in social networks. To tackle these problems, in this paper we propose a novel graph-of-words (GOW) based approach to text stream clustering, called GOW-Stream. Applying common GOWs, generated from each document batch, while sampling clusters/topics helps overcome the word-independent evaluation challenge. Our proposed GOW-Stream promises significantly better text stream clustering performance than recent state-of-the-art baselines. Extensive experiments on multiple real-world benchmark datasets demonstrate the effectiveness of our proposed model in terms of both accuracy and running time.
2

Qiang, Jipeng, Wanyin Xu, Yun Li, Yunhao Yuan, and Yi Zhu. "Lifelong Learning Augmented Short Text Stream Clustering Method." IEEE Access 9 (2021): 70493–501. http://dx.doi.org/10.1109/access.2021.3078096.

3

Gong, Linghui, Jianping Zeng, and Shiyong Zhang. "Text stream clustering algorithm based on adaptive feature selection." Expert Systems with Applications 38, no. 3 (March 2011): 1393–99. http://dx.doi.org/10.1016/j.eswa.2010.07.041.

4

Ma, Hui Fang, and Hui Li Ma. "Combining Burst Detection for Hot Topic Extraction." Advanced Materials Research 268-270 (July 2011): 1283–88. http://dx.doi.org/10.4028/www.scientific.net/amr.268-270.1283.

Abstract:
As traditional text representations are not suitable for online dynamic streams, this paper presents a hot topic extraction technique that can be used for tracking news topics over time. The model incorporates individual word bursts into the document-word vector representation, which emphasizes the temporal features of text streams. A burst detection approach based on an energy ratio threshold is proposed, and TF-PDF is then combined with it to weight the terms. Experimental results demonstrate that this model is effective for topic extraction from news streams and that it improves clustering performance.
5

Taninpong, Phimphaka, and Sudsanguan Ngamsuriyaroj. "Tree-based text stream clustering with application to spam mail classification." International Journal of Data Mining, Modelling and Management 10, no. 4 (2018): 353. http://dx.doi.org/10.1504/ijdmmm.2018.095354.

6

Ngamsuriyaroj, Sudsanguan, and Phimphaka Taninpong. "Tree-based text stream clustering with application to spam mail classification." International Journal of Data Mining, Modelling and Management 10, no. 4 (2018): 353. http://dx.doi.org/10.1504/ijdmmm.2018.10015879.

7

Li, Pei, and Ze Deng. "Use of Distributed Semi-Supervised Clustering for Text Classification." Journal of Circuits, Systems and Computers 28, no. 08 (July 2019): 1950127. http://dx.doi.org/10.1142/s0218126619501275.

Abstract:
Text classification is an important way to handle and organize textual data. Among existing methods of text classification, semi-supervised clustering is a mainstream technique. In the era of "Big data", current semi-supervised clustering approaches for text classification generally do not apply to massive text data because of their excessive costs in scalability and computing performance. To address this problem, this study proposes a scalable text classification algorithm for large-scale text collections, namely D-TESC, obtained by modifying a state-of-the-art centralized semi-supervised clustering approach for text classification (TESC). D-TESC processes textual data in a distributed manner to achieve high scalability. The experimental results indicate that the D-TESC algorithm (1) has classification quality comparable to TESC, and (2) outperforms TESC by 7.2 times on average in terms of scalability when using eight CPU threads.
8

Chen, Junyang, Zhiguo Gong, and Weiwen Liu. "A Dirichlet process biterm-based mixture model for short text stream clustering." Applied Intelligence 50, no. 5 (February 1, 2020): 1609–19. http://dx.doi.org/10.1007/s10489-019-01606-1.

9

Kumar, Sushil, and Komal Kumar Bhatia. "Clustering Based Approach for Novelty Detection in Text Documents." Asian Journal of Computer Science and Technology 8, no. 2 (May 5, 2019): 116–21. http://dx.doi.org/10.51983/ajcst-2019.8.2.2130.

Abstract:
Because information on the internet is overloaded, retrieving information for a given query returns redundant and irrelevant results. It is necessary to retrieve relevant and novel information for a query given by the user, so that the user needs minimum effort to satisfy the information need. In this work we propose a clustering-based approach for novelty detection that provides relevant and novel documents for the information need. Based on the user query, the incoming stream of documents is clustered using the k-means algorithm. Cluster heads are then selected from the various clusters with the minimum distance. These cluster heads are the novel documents in a collection of documents from different clusters separated by a large distance. The proposed technique can be further used in the field of information retrieval.
10

Hamou, Reda Mohamed, Abdelmalek Amine, and Ahmed Chaouki Lokbani. "The Social Spiders in the Clustering of Texts." International Journal of Artificial Life Research 3, no. 3 (July 2012): 1–14. http://dx.doi.org/10.4018/jalr.2012070101.

Abstract:
In this paper the authors experiment with and test a new biomimetic approach based on social spiders to solve a combinatorial problem, i.e., the automatic classification of texts, motivated by the very large data streams that flow particularly on the web. Textual data were represented by a language-independent method, i.e., character and word n-grams, because there is currently no learning method that can directly represent unstructured data (text). To validate the classification, the authors used an evaluation measure based on recall and precision (F-measure). During the experiment, the authors found social spiders to be a powerful visualization tool, which they exploit to perform visual classification.

Dissertations on the topic "Text Stream Clustering"

1

Crossman, Nathaniel C. "Stream Clustering And Visualization Of Geotagged Text Data For Crisis Management." Wright State University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=wright1590957641168863.

2

Wang, Ye. "Robust Text Mining in Online Social Network Context." Thesis, 2018. https://vuir.vu.edu.au/38645/.

Abstract:
Text mining is involved in a broad range of applications in diverse domains that mainly, but not exclusively, serve political, commercial, medical and academic needs. With the rapid development of Internet technology over the past thirty years and the advent of online social media and networks in the past decade, text data has taken on the features of online social data streams, for example explosive growth, constantly changing content and huge volume. As a result, text mining is no longer oriented merely to the textual content itself, but requires consideration of its surroundings and a combination of theories and techniques from stream processing and social network analysis, which gives rise to a wide range of applications for understanding thoughts spread across the world, such as sentiment analysis, mass surveillance and market prediction. Topic detection, which automatically discovers sequences of words that represent appropriate themes in a collection of documents, is closely associated with document clustering and classification. These two tasks play integral roles in revealing deep insight into the text content within the whole text mining framework. However, most existing detection techniques cannot adapt to the dynamic social context, which exposes bottlenecks in detection performance and deficiencies of topic models. In this thesis, we target text data streams, investigating novel techniques and solutions for robust text mining to tackle the challenges arising from the online social context by incorporating methodologies of stream processing, topic detection, and document clustering and classification. In particular, we have advanced the state of the art by making the following contributions: 1. A Multi-Window based Ensemble Learning (MWEL) framework is proposed for imbalanced streaming data that comprehensively improves the classification performance. MWEL ensures that the ensemble classifier is kept up to date and adaptive to the evolving data distribution by applying a multi-window monitoring mechanism and an efficient updating strategy. 2. A semi-supervised learning method is proposed to detect latent topics from news streams and the corresponding social context, with a constraint propagation scheme that exploits the hidden geometrical structure as supervised information in the given data space. A collective learning algorithm is proposed to integrate the textual content into the social context. A locally weighted scheme is then proposed to improve the stability of the algorithm. 3. A Robust Hierarchical Ensemble (RHE) framework is introduced to enhance the robustness of the topic model. On the one hand, it reduces the repercussions caused by outliers and noise; on the other, it overcomes inherent defects of text data. RHE adapts to the changing distribution of the text stream by constructing a flexible document hierarchy that can be dynamically adjusted. A discussion of how to extract the most valuable social context is conducted with experiments for the purpose of removing some noises from the surroundings and efficiency of the proposed.

Book chapters on the topic "Text Stream Clustering"

1

Sharma, Iti, Aaditya Jain, and Harish Sharma. "Stream and Online Clustering for Text Documents." In International Conference on Advanced Computing Networking and Informatics, 469–75. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-13-2673-8_49.

2

Olariu, Andrei. "Hierarchical Clustering in Improving Microblog Stream Summarization." In Computational Linguistics and Intelligent Text Processing, 424–35. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-37256-8_35.

3

Li, Chunshan, Yunming Ye, Xiaofeng Zhang, Dianhui Chu, Shengchun Deng, and Xiaofei Xu. "Clustering Based Topic Events Detection on Text Stream." In Intelligent Information and Database Systems, 42–52. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-05476-6_5.

4

Molina, Roberto, Waldo Hasperué, and Augusto Villa Monte. "D3CAS: Distributed Clustering Algorithm Applied to Short-Text Stream Processing." In Communications in Computer and Information Science, 211–20. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-20787-8_15.

5

Attaoui, Mohammed Oualid, Mustapha Lebbah, Nabil Keskes, Hanene Azzag, and Mohammed Ghesmoune. "Soft Subspace Growing Neural Gas for Data Stream Clustering." In Artificial Neural Networks and Machine Learning – ICANN 2019: Text and Time Series, 569–80. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-30490-4_46.

6

Joshi, Basanta, Umanga Bista, and Manoj Ghimire. "Intelligent Clustering Scheme for Log Data Streams." In Computational Linguistics and Intelligent Text Processing, 454–65. Berlin, Heidelberg: Springer Berlin Heidelberg, 2014. http://dx.doi.org/10.1007/978-3-642-54903-8_38.

7

Liu, Yubao, Jiarong Cai, Jian Yin, and Ada Wai-Chee Fu. "Clustering Massive Text Data Streams by Semantic Smoothing Model." In Advanced Data Mining and Applications, 389–400. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007. http://dx.doi.org/10.1007/978-3-540-73871-8_36.

8

Luo, Yonghong, Ying Zhang, Xiaoke Ding, Xiangrui Cai, Chunyao Song, and Xiaojie Yuan. "StrDip: A Fast Data Stream Clustering Algorithm Using the Dip Test of Unimodality." In Web Information Systems Engineering – WISE 2018, 193–208. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-030-02925-8_14.

9

Zhao, Yanchang, Longbing Cao, Huaifeng Zhang, and Chengqi Zhang. "Data Clustering." In Handbook of Research on Innovations in Database Technologies and Applications, 562–72. IGI Global, 2009. http://dx.doi.org/10.4018/978-1-60566-242-8.ch060.

Abstract:
Clustering is one of the most important techniques in data mining. This chapter presents a survey of popular approaches for data clustering, including well-known clustering techniques, such as partitioning clustering, hierarchical clustering, density-based clustering and grid-based clustering, and recent advances in clustering, such as subspace clustering, text clustering and data stream clustering. The major challenges and future trends of data clustering will also be introduced in this chapter. The remainder of this chapter is organized as follows. The background of data clustering will be introduced in Section 2, including the definition of clustering, categories of clustering techniques, features of good clustering algorithms, and the validation of clustering. Section 3 will present main approaches for clustering, which range from the classic partitioning and hierarchical clustering to recent approaches of bi-clustering and semisupervised clustering. Challenges and future trends will be discussed in Section 4, followed by the conclusions in the last section.
10

Park, Jun Pyo, Chang-Sup Park, and Yon Dohn Chung. "Energy and Latency Efficient Access of Wireless XML Stream." In Cross-Disciplinary Models and Applications of Database Management, 57–79. IGI Global, 2012. http://dx.doi.org/10.4018/978-1-61350-471-0.ch003.

Abstract:
In this article, we address the problem of delayed query processing raised by tree-based index structures in wireless broadcast environments, which increases the access time of mobile clients. We propose a novel distributed index structure and a clustering strategy for streaming XML data that enables energy and latency-efficient broadcasting of XML data. We first define the DIX node structure to implement a fully distributed index structure which contains the tag name, attributes, and text content of an element, as well as its corresponding indices. By exploiting the index information in the DIX node stream, a mobile client can access the stream with shorter latency. We also suggest a method of clustering DIX nodes in the stream, which can further enhance the performance of query processing in the mobile clients. Through extensive experiments, we demonstrate that our approach is effective for wireless broadcasting of XML data and outperforms the previous methods.

Conference papers on the topic "Text Stream Clustering"

1

Rao, Y., and X. J. Li. "A Topic-based Dynamic Clustering Algorithm for Text Stream." In 2015 International Conference on Artificial Intelligence and Industrial Engineering. Paris, France: Atlantis Press, 2015. http://dx.doi.org/10.2991/aiie-15.2015.130.

2

Kalogeratos, Argyris, Panagiotis Zagorisios, and Aristidis Likas. "Improving Text Stream Clustering using Term Burstiness and Co-burstiness." In SETN '16: 9th Hellenic Conference on Artificial Intelligence. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2903220.2903229.

3

Crossman, Nathaniel C., Soon M. Chung, and Vincent A. Schmidt. "Stream Clustering and Visualization of Geotagged Text Data for Crisis Management." In 2019 International Conference on Data and Software Engineering (ICoDSE). IEEE, 2019. http://dx.doi.org/10.1109/icodse48700.2019.9092760.

4

Crossman, Nathaniel C., and Soon M. Chung. "GPU-Accelerated Stream Clustering of Geotagged Text Data for Crisis Management." In 2022 International Conference on Data and Software Engineering (ICoDSE). IEEE, 2022. http://dx.doi.org/10.1109/icodse56892.2022.9971926.

5

Kumar, Jay, Junming Shao, Salah Uddin, and Wazir Ali. "An Online Semantic-enhanced Dirichlet Model for Short Text Stream Clustering." In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2020. http://dx.doi.org/10.18653/v1/2020.acl-main.70.

6

Rakib, Md Rashadul Hasan, Norbert Zeh, and Evangelos Milios. "Short Text Stream Clustering via Frequent Word Pairs and Reassignment of Outliers to Clusters." In DocEng '20: ACM Symposium on Document Engineering 2020. New York, NY, USA: ACM, 2020. http://dx.doi.org/10.1145/3395027.3419589.

7

Si, XianLiang, Peipei Li, Xuegang Hu, and Yuhong Zhang. "An Online Dirichlet Model based on Sentence Embedding and DBSCAN for Noisy Short Text Stream Clustering." In 2022 International Joint Conference on Neural Networks (IJCNN). IEEE, 2022. http://dx.doi.org/10.1109/ijcnn55064.2022.9892414.

8

Rakib, Md Rashadul Hasan, Norbert Zeh, and Evangelos Milios. "Efficient clustering of short text streams using online-offline clustering." In DocEng '21: ACM Symposium on Document Engineering 2021. New York, NY, USA: ACM, 2021. http://dx.doi.org/10.1145/3469096.3469866.

9

He, Qi, Kuiyu Chang, Ee-Peng Lim, and Jun Zhang. "Bursty Feature Representation for Clustering Text Streams." In Proceedings of the 2007 SIAM International Conference on Data Mining. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2007. http://dx.doi.org/10.1137/1.9781611972771.50.

10

Zhao, Yukun, Shangsong Liang, Zhaochun Ren, Jun Ma, Emine Yilmaz, and Maarten de Rijke. "Explainable User Clustering in Short Text Streams." In SIGIR '16: The 39th International ACM SIGIR conference on research and development in Information Retrieval. New York, NY, USA: ACM, 2016. http://dx.doi.org/10.1145/2911451.2911522.
