Journal articles on the topic "SUMMARIZATION ALGORITHMS"

To view other types of publications on this topic, follow the link: SUMMARIZATION ALGORITHMS.

Format your source in APA, MLA, Chicago, Harvard, and other citation styles

Choose a source type:

Browse the top 50 journal articles for research on the topic "SUMMARIZATION ALGORITHMS".

Next to each entry in the reference list there is an "Add to bibliography" button. Click it, and we will automatically generate a bibliographic citation for the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, and others.

You can also download the full text of the publication as a .pdf file and read its abstract online, when these are available in the metadata.

Browse journal articles across many disciplines and compile your bibliography correctly.

1

Chang, Hsien-Tsung, Shu-Wei Liu, and Nilamadhab Mishra. "A tracking and summarization system for online Chinese news topics." Aslib Journal of Information Management 67, no. 6 (November 16, 2015): 687–99. http://dx.doi.org/10.1108/ajim-10-2014-0147.

Full text of the source
Abstract:
Purpose – The purpose of this paper is to design and implement new tracking and summarization algorithms for Chinese news content. Based on the proposed methods and algorithms, the authors extract the important sentences contained in topic stories and list those sentences in timestamp order to ensure ease of understanding and to visualize multiple news stories on a single screen. Design/methodology/approach – This paper takes an investigational approach, implementing a new Dynamic Centroid Summarization algorithm in addition to a Term Frequency (TF)-Density algorithm and empirically computing three target parameters, i.e., recall, precision, and F-measure. Findings – The proposed TF-Density algorithm is implemented and compared with the well-known algorithms Term Frequency-Inverse Word Frequency (TF-IWF) and Term Frequency-Inverse Document Frequency (TF-IDF). Three test data sets are configured from Chinese news websites for use during the investigation, and two important findings are obtained that help the authors provide more precision and efficiency when recognizing the important words in the text. First, the authors evaluate three topic tracking algorithms, i.e., TF-Density, TF-IDF, and TF-IWF, on the said target parameters and find that the recall, precision, and F-measure of the proposed TF-Density algorithm are better than those of the TF-IWF and TF-IDF algorithms. Second, the authors implement a blind-test approach to obtain the results of topic summarization and find that the proposed Dynamic Centroid Summarization process can select topic sentences more accurately than the LexRank process. Research limitations/implications – The results show that the tracking and summarization algorithms for news topics can provide more precise and convenient results for users tracking the news. The analysis and implications are limited to Chinese news content from Chinese news websites such as Apple Library, UDN, and well-known portals like Yahoo and Google. Originality/value – The research provides an empirical analysis of Chinese news content through the proposed TF-Density and Dynamic Centroid Summarization algorithms. It focuses on improving the means of summarizing a set of news stories for browsing on a single screen and carries implications for innovative word measurements in practice.
APA, Harvard, Vancouver, ISO and other styles
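The TF-IDF baseline that the abstract above compares against can be sketched in a few lines. This is a generic illustration (the tokenization and the toy corpus are my own), not the paper's TF-Density algorithm:

```python
import math
from collections import Counter

def tfidf_sentence_scores(sentences):
    """Score each sentence by the summed TF-IDF weight of its words,
    treating each sentence as a 'document'."""
    docs = [s.lower().split() for s in sentences]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))  # document frequency
    scores = []
    for d in docs:
        tf = Counter(d)
        scores.append(sum((tf[w] / len(d)) * math.log(n / df[w]) for w in tf))
    return scores

sentences = [
    "the topic tracker follows breaking news",
    "the summarizer selects key sentences from news",
    "weather was mild today",
]
scores = tfidf_sentence_scores(sentences)
```

TF-IWF and TF-Density differ only in how the denominator of the weighting term is computed, so the same skeleton can host all three weighting schemes.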
2

Yadav, Divakar, Naman Lalit, Riya Kaushik, Yogendra Singh, Mohit, Dinesh, Arun Kr Yadav, Kishor V. Bhadane, Adarsh Kumar, and Baseem Khan. "Qualitative Analysis of Text Summarization Techniques and Its Applications in Health Domain." Computational Intelligence and Neuroscience 2022 (February 9, 2022): 1–14. http://dx.doi.org/10.1155/2022/3411881.

Full text of the source
Abstract:
Summarization is a valuable method for better utilizing the enormous amount of data available on the Internet and in various archives. Manual summarization by experts is time-consuming and practically impossible at scale: people cannot access, read, or use such a large volume of information for their needs. Therefore, summary generation is essential and beneficial in the current scenario. This paper presents an efficient qualitative analysis of the different algorithms used for text summarization. We implemented five different algorithms, namely term frequency-inverse document frequency (TF-IDF), LexRank, TextRank, BertSum, and PEGASUS, for summary generation. These algorithms were chosen based on various factors; according to the state-of-the-art literature, they generate good summary results. The performance of these algorithms is compared on two different datasets, Reddit-TIFU and MultiNews, and their results are measured using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) measure to decide the best algorithm among them. After performing this qualitative analysis, we observe that for both datasets PEGASUS had the best average F-score for abstractive text summarization, and TextRank had the best average F-score for extractive text summarization.
APA, Harvard, Vancouver, ISO and other styles
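The ROUGE measure used for comparison in the study above can be illustrated with a minimal ROUGE-1 F-score computation (a simplified sketch without stemming or stopword handling):

```python
from collections import Counter

def rouge1_f(candidate, reference):
    """ROUGE-1 F1: clipped unigram overlap between candidate and reference."""
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())  # per-word minimum of the two counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

score = rouge1_f("the cat sat on the mat", "the cat lay on the mat")
```

Here five of six unigrams overlap, so precision, recall, and F1 all equal 5/6. ROUGE-2 and ROUGE-L follow the same pattern over bigrams and longest common subsequences.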
3

Mall, Shalu, Avinash Maurya, Ashutosh Pandey, and Davain Khajuria. "Centroid Based Clustering Approach for Extractive Text Summarization." International Journal for Research in Applied Science and Engineering Technology 11, no. 6 (June 30, 2023): 3404–9. http://dx.doi.org/10.22214/ijraset.2023.53542.

Full text of the source
Abstract:
Extractive text summarization is the process of identifying the most important information in a large text and presenting it in a condensed form. One popular approach to this problem is the use of centroid-based clustering algorithms, which group similar sentences together based on their content and then select representative sentences from each cluster to form a summary. In this research, we present a centroid-based clustering algorithm for email summarization that combines the use of word embeddings with a clustering algorithm. We compare our algorithm to existing summarization techniques. Our results show that our approach performs close to existing methods in terms of summary quality, while also being computationally efficient. Overall, our work demonstrates the potential of centroid-based clustering algorithms for extractive text summarization and suggests avenues for further research in this area.
APA, Harvard, Vancouver, ISO and other styles
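The select-a-representative-per-cluster idea described above can be sketched roughly as follows. This toy version uses bag-of-words counts and a deterministic k-means-style loop in place of the paper's word embeddings; the example sentences are illustrative:

```python
import math
from collections import Counter

def bow(s):
    return Counter(s.lower().split())

def cosine(a, b):
    num = sum(a[w] * b[w] for w in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def centroid_summary(sentences, k=2, iters=10):
    """Cluster sentence vectors around k centroids, then emit the sentence
    closest to each centroid as the summary."""
    vecs = [bow(s) for s in sentences]
    centroids = vecs[:k]  # deterministic seed: first k sentences
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vecs:
            j = max(range(k), key=lambda i: cosine(v, centroids[i]))
            clusters[j].append(v)
        for i, cl in enumerate(clusters):
            if cl:  # new centroid = summed bag-of-words of the cluster
                c = Counter()
                for v in cl:
                    c.update(v)
                centroids[i] = c
    picks = []
    for c in centroids:
        j = max(range(len(vecs)), key=lambda i: cosine(vecs[i], c))
        picks.append(sentences[j])
    return picks

sents = [
    "the meeting discussed the budget",
    "the budget was approved by the board",
    "players trained hard before the match",
    "the match ended in a draw",
]
summary = centroid_summary(sents, k=2)
```

With embeddings instead of raw counts, stopwords like "the" stop dominating the similarity, which is exactly the gain the abstract reports.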
4

BOKAEI, MOHAMMAD HADI, HOSSEIN SAMETI, and YANG LIU. "Extractive summarization of multi-party meetings through discourse segmentation." Natural Language Engineering 22, no. 1 (March 4, 2015): 41–72. http://dx.doi.org/10.1017/s1351324914000199.

Full text of the source
Abstract:
In this article we tackle the problem of multi-party conversation summarization. We investigate the role of discourse segmentation of a conversation in meeting summarization. First, an unsupervised function segmentation algorithm is proposed to segment the transcript into functionally coherent parts, such as Monologue_i (which indicates a segment where speaker i is the dominant speaker, e.g., lecturing all the other participants) or Discussion_{x1, x2, ..., xn} (which indicates a segment where speakers x1 to xn are involved in a discussion). Then the salience score for a sentence is computed by leveraging the score of the segment containing the sentence. The performance of our proposed segmentation and summarization algorithms is evaluated using the AMI meeting corpus. We show better summarization performance than other state-of-the-art algorithms according to different metrics.
APA, Harvard, Vancouver, ISO and other styles
5

Dutta, Soumi, Vibhash Chandra, Kanav Mehra, Asit Kumar Das, Tanmoy Chakraborty, and Saptarshi Ghosh. "Ensemble Algorithms for Microblog Summarization." IEEE Intelligent Systems 33, no. 3 (May 2018): 4–14. http://dx.doi.org/10.1109/mis.2018.033001411.

Full text of the source
APA, Harvard, Vancouver, ISO and other styles
6

Han, Kai, Shuang Cui, Tianshuai Zhu, Enpei Zhang, Benwei Wu, Zhizhuo Yin, Tong Xu, Shaojie Tang, and He Huang. "Approximation Algorithms for Submodular Data Summarization with a Knapsack Constraint." ACM SIGMETRICS Performance Evaluation Review 49, no. 1 (June 22, 2022): 65–66. http://dx.doi.org/10.1145/3543516.3453922.

Full text of the source
Abstract:
Data summarization, a fundamental methodology aimed at selecting a representative subset of data elements from a large pool of ground data, has found numerous applications in big data processing, such as social network analysis [5, 7], crowdsourcing [6], clustering [4], network design [13], and document/corpus summarization [14]. Moreover, it is well acknowledged that the "representativeness" of a dataset in data summarization applications can often be modeled by submodularity - a mathematical concept abstracting the "diminishing returns" property in the real world. Therefore, a lot of studies have cast data summarization as a submodular function maximization problem (e.g., [2]).
APA, Harvard, Vancouver, ISO and other styles
7

Han, Kai, Shuang Cui, Tianshuai Zhu, Enpei Zhang, Benwei Wu, Zhizhuo Yin, Tong Xu, Shaojie Tang, and He Huang. "Approximation Algorithms for Submodular Data Summarization with a Knapsack Constraint." Proceedings of the ACM on Measurement and Analysis of Computing Systems 5, no. 1 (February 18, 2021): 1–31. http://dx.doi.org/10.1145/3447383.

Full text of the source
Abstract:
Data summarization, i.e., selecting representative subsets of manageable size out of massive data, is often modeled as a submodular optimization problem. Although there exist extensive algorithms for submodular optimization, many of them incur large computational overheads and hence are not suitable for mining big data. In this work, we consider the fundamental problem of (non-monotone) submodular function maximization with a knapsack constraint, and propose simple yet effective and efficient algorithms for it. Specifically, we propose a deterministic algorithm with approximation ratio 6 and a randomized algorithm with approximation ratio 4, and show that both of them can be accelerated to achieve nearly linear running time at the cost of weakening the approximation ratio by an additive factor of ε. We then consider a more restrictive setting without full access to the whole dataset, and propose streaming algorithms with approximation ratios of 8+ε and 6+ε that make one pass and two passes over the data stream, respectively. As a by-product, we also propose a two-pass streaming algorithm with an approximation ratio of 2+ε when the considered submodular function is monotone. To the best of our knowledge, our algorithms achieve the best performance bounds compared to the state-of-the-art approximation algorithms with efficient implementation for the same problem. Finally, we evaluate our algorithms in two concrete submodular data summarization applications for revenue maximization in social networks and image summarization, and the empirical results show that our algorithms outperform the existing ones in terms of both effectiveness and efficiency.
APA, Harvard, Vancouver, ISO and other styles
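A classic baseline for the problem studied above, monotone submodular maximization under a knapsack constraint, is the cost-effectiveness greedy: repeatedly add the element with the best marginal-gain/cost ratio that still fits the budget. The sketch below illustrates the idea on a toy coverage function; it is a textbook baseline, not the paper's accelerated or streaming algorithms:

```python
def greedy_knapsack(elements, cost, gain, budget):
    """Cost-effectiveness greedy for monotone submodular maximization
    under a knapsack (total-cost) constraint."""
    chosen, spent = [], 0.0
    remaining = list(elements)
    while remaining:
        best, best_ratio = None, 0.0
        for e in remaining:
            if spent + cost[e] > budget:
                continue  # element no longer fits the budget
            ratio = gain(chosen, e) / cost[e]
            if ratio > best_ratio:
                best, best_ratio = e, ratio
        if best is None:
            break
        chosen.append(best)
        spent += cost[best]
        remaining.remove(best)
    return chosen

# toy coverage instance: each element covers a set of topics
covers = {"a": {1, 2}, "b": {2, 3, 4}, "c": {5}, "d": {1, 2, 3, 4, 5}}
cost = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 3.5}

def marginal(chosen, e):
    covered = set().union(*(covers[x] for x in chosen)) if chosen else set()
    return len(covers[e] - covered)

picked = greedy_knapsack(covers, cost, marginal, budget=3.0)
```

Coverage is submodular (covering a new topic helps less once other topics are covered), so the greedy picks "b" first despite "d" covering more in absolute terms, because "d" exceeds the budget.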
8

Popescu, Claudiu, Lacrimioara Grama, and Corneliu Rusu. "A Highly Scalable Method for Extractive Text Summarization Using Convex Optimization." Symmetry 13, no. 10 (September 30, 2021): 1824. http://dx.doi.org/10.3390/sym13101824.

Full text of the source
Abstract:
The paper describes a convex optimization formulation of the extractive text summarization problem and a simple and scalable algorithm to solve it. The optimization program is constructed as a convex relaxation of an intuitive but computationally hard integer programming problem. The objective function is highly symmetric, being invariant under unitary transformations of the text representations. Another key idea is to replace the constraint on the number of sentences in the summary with a convex surrogate. For solving the program we have designed a specific projected gradient descent algorithm and analyzed its performance in terms of execution time and quality of the approximation. Using the DUC 2005 and Cornell Newsroom Summarization datasets, we have shown empirically that the algorithm can provide competitive results for single-document summarization and multi-document query-based summarization. On the Cornell Newsroom Summarization Dataset, it ranked second among the unsupervised methods tested. For the more challenging task of multi-document query-based summarization, the method was tested on the DUC 2005 dataset. Our algorithm surpassed the other reported methods with respect to the ROUGE-SU4 metric and was within 0.01 of the top-performing algorithms with respect to the ROUGE-1 and ROUGE-2 metrics.
APA, Harvard, Vancouver, ISO and other styles
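The projected-gradient approach mentioned in the abstract can be illustrated on a toy surrogate objective: maximize relevance minus a redundancy penalty over relaxed selection variables x in [0, 1]^n, projecting back onto the box after each step. The objective and the numbers below are illustrative stand-ins, not the paper's actual convex program:

```python
def project_box(x):
    """Project onto the box [0, 1]^n by clipping each coordinate."""
    return [min(1.0, max(0.0, v)) for v in x]

def extractive_relaxation(relevance, redundancy, lam=0.5, lr=0.1, iters=2000):
    """Projected gradient ascent on the concave surrogate
    f(x) = sum_i r_i x_i - lam * sum_ij R_ij x_i x_j,  x in [0, 1]^n."""
    n = len(relevance)
    x = [0.5] * n
    for _ in range(iters):
        grad = [relevance[i]
                - 2 * lam * sum(redundancy[i][j] * x[j] for j in range(n))
                for i in range(n)]
        x = project_box([x[i] + lr * grad[i] for i in range(n)])
    return x

relevance = [0.9, 0.8, 0.2]
redundancy = [[1.0, 0.9, 0.0],   # sentences 0 and 1 are highly redundant
              [0.9, 1.0, 0.0],
              [0.0, 0.0, 1.0]]
x = extractive_relaxation(relevance, redundancy)
```

Because sentences 0 and 1 are nearly redundant, the relaxation drives x[1] to zero while keeping x[0] high: the fractional solution already encodes a de-duplicated selection that rounding can turn into a summary.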
9

Boussaid, L., A. Mtibaa, M. Abid, and M. Paindavoin. "Real-Time Algorithms for Video Summarization." Journal of Applied Sciences 6, no. 8 (April 1, 2006): 1679–85. http://dx.doi.org/10.3923/jas.2006.1679.1685.

Full text of the source
APA, Harvard, Vancouver, ISO and other styles
10

Ke, Xiangyu, Arijit Khan, and Francesco Bonchi. "Multi-relation Graph Summarization." ACM Transactions on Knowledge Discovery from Data 16, no. 5 (October 31, 2022): 1–30. http://dx.doi.org/10.1145/3494561.

Full text of the source
Abstract:
Graph summarization is beneficial in a wide range of applications, such as visualization, interactive and exploratory analysis, approximate query processing, reducing the on-disk storage footprint, and graph processing on modern hardware. However, the bulk of the literature on graph summarization surprisingly overlooks the possibility of having edges of different types. In this article, we study the novel problem of producing summaries of multi-relation networks, i.e., graphs where multiple edges of different types may exist between any pair of nodes. Multi-relation graphs are an expressive model of real-world activities, in which a relation can be a topic in social networks, an interaction type in genetic networks, or a snapshot in temporal graphs. The first approach that we consider for multi-relation graph summarization is a two-step method based on summarizing each relation in isolation and then aggregating the resulting summaries in some clever way to produce a final unique summary. In doing this, as a side contribution, we provide the first polynomial-time approximation algorithm based on k-Median clustering for the classic problem of lossless single-relation graph summarization. Then, we demonstrate the shortcomings of these two-step methods and propose holistic approaches, both approximate and heuristic algorithms, to compute a summary directly for multi-relation graphs. In particular, we prove that the approximation bound of k-Median clustering for the single-relation solution can be maintained in a multi-relation graph with a proper aggregation operation over the adjacency matrices corresponding to its multiple relations. Experimental results and case studies (on co-authorship networks and brain networks) validate the effectiveness and efficiency of the proposed algorithms.
APA, Harvard, Vancouver, ISO and other styles
11

Bewoor, M. S., and S. H. Patil. "Empirical Analysis of Single and Multi Document Summarization using Clustering Algorithms." Engineering, Technology & Applied Science Research 8, no. 1 (February 20, 2018): 2562–67. http://dx.doi.org/10.48084/etasr.1775.

Full text of the source
Abstract:
The availability of various digital sources has created a demand for text mining mechanisms. Effective summary generation mechanisms are needed in order to utilize relevant information from often overwhelming digital data sources. With this in view, this paper surveys various single- as well as multi-document text summarization techniques. It also analyzes the treatment of a query sentence as an ordinary sentence segmented from the documents for text summarization. Experimental results show the degree of effectiveness in text summarization across different clustering algorithms.
APA, Harvard, Vancouver, ISO and other styles
12

Silber, H. Gregory, and Kathleen F. McCoy. "Efficiently Computed Lexical Chains as an Intermediate Representation for Automatic Text Summarization." Computational Linguistics 28, no. 4 (December 2002): 487–96. http://dx.doi.org/10.1162/089120102762671954.

Full text of the source
Abstract:
While automatic text summarization is an area that has received a great deal of attention in recent research, the problem of efficiency in this task has not been frequently addressed. When the size and quantity of documents available on the Internet and from other sources are considered, the need for a highly efficient tool that produces usable summaries is clear. We present a linear-time algorithm for lexical chain computation. The algorithm makes lexical chains a computationally feasible candidate as an intermediate representation for automatic text summarization. A method for evaluating lexical chains as an intermediate step in summarization is also presented and carried out. Such an evaluation was heretofore not possible because of the computational complexity of previous lexical chains algorithms.
APA, Harvard, Vancouver, ISO and other styles
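The lexical-chain construction discussed above can be sketched with a greedy single-pass grouping: each word joins the first chain containing a related word, otherwise it starts a new chain. The relatedness table here is a tiny hand-made stand-in for WordNet, and the paper's linear-time algorithm is considerably more involved (it resolves word senses rather than grouping surface forms):

```python
# Toy relatedness table standing in for a WordNet-style resource.
RELATED = {
    "car": {"car", "vehicle", "automobile"},
    "vehicle": {"car", "vehicle", "automobile"},
    "automobile": {"car", "vehicle", "automobile"},
    "road": {"road", "street"},
    "street": {"road", "street"},
}

def lexical_chains(words):
    """Greedy single-pass chaining: append each word to the first chain
    containing a related word, else start a new chain."""
    chains = []
    for w in words:
        for chain in chains:
            if any(w in RELATED.get(x, {x}) for x in chain):
                chain.append(w)
                break
        else:
            chains.append([w])
    return chains

chains = lexical_chains(["car", "road", "automobile", "street", "vehicle"])
```

Long, dense chains indicate central topics; sentences intersecting the strongest chains are then promoted into the summary.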
13

Varade, Saurabh, Ejaaz Sayyed, Vaibhavi Nagtode, and Shilpa Shinde. "Text Summarization using Extractive and Abstractive Methods." ITM Web of Conferences 40 (2021): 03023. http://dx.doi.org/10.1051/itmconf/20214003023.

Full text of the source
Abstract:
Text summarization is the process of converting a large text into a condensed version that preserves the original meaning and context. The main aim of any text summarization is to provide an accurate and precise summary. One approach is to use a sentence-ranking algorithm; this falls under extractive summarization. Here, a graph-based ranking algorithm is used to rank the sentences in the text, and the top k highest-scoring sentences are included in the summary. Graph-based ranking algorithms decide the importance of a vertex based on information retrieved from the graph, and TextRank is one of the most efficient such algorithms; it is also used for Web link analysis, that is, for measuring the importance of website pages. Another approach is abstractive summarization, where an LSTM encoder-decoder model is used along with an attention mechanism that focuses on important words from the input. The encoder encodes the input sequence, and the decoder, together with the attention mechanism, produces the summary as output.
APA, Harvard, Vancouver, ISO and other styles
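The graph-based ranking described above (TextRank) can be sketched as a PageRank-style iteration over a sentence similarity graph; the similarity function and the example sentences below are illustrative:

```python
import math

def similarity(a, b):
    """TextRank-style similarity: word overlap normalized by sentence lengths."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    overlap = len(wa & wb)
    if overlap == 0 or len(wa) < 2 or len(wb) < 2:
        return 0.0
    return overlap / (math.log(len(wa)) + math.log(len(wb)))

def textrank_scores(sentences, d=0.85, iters=50):
    """Power iteration on the weighted sentence graph (damping factor d)."""
    n = len(sentences)
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out = [sum(row) for row in w]  # total outgoing weight per sentence
    scores = [1.0] * n
    for _ in range(iters):
        scores = [(1 - d) + d * sum(w[j][i] / out[j] * scores[j]
                                    for j in range(n) if out[j] > 0)
                  for i in range(n)]
    return scores

sents = [
    "the cat sat on the mat",
    "the cat is on the mat",
    "the dog sat on the mat",
    "stocks fell sharply today",
]
scores = textrank_scores(sents)
```

The isolated sentence receives only the (1 - d) base score, while the mutually similar sentences reinforce each other; the top-k scorers form the extractive summary.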
14

Mohsin, Muhammad, Shazad Latif, Muhammad Haneef, Usman Tariq, Muhammad Attique Khan, Sefedine Kadry, Hwan-Seung Yong, and Jung-In Choi. "Improved Text Summarization of News Articles Using GA-HC and PSO-HC." Applied Sciences 11, no. 22 (November 9, 2021): 10511. http://dx.doi.org/10.3390/app112210511.

Full text of the source
Abstract:
Automatic Text Summarization (ATS) is gaining attention because a large volume of data is being generated at an exponential rate. Due to easy internet availability globally, large amounts of data are being generated from social networking, news, and blog websites. Manual summarization is time-consuming, and it is difficult to read and summarize a large amount of content. Automatic text summarization is the solution to this problem. This study proposes two automatic text summarization models: Genetic Algorithm with Hierarchical Clustering (GA-HC) and Particle Swarm Optimization with Hierarchical Clustering (PSO-HC). The proposed models use a word embedding model with a hierarchical clustering algorithm to group sentences conveying almost the same meaning. Modified GA- and adaptive PSO-based sentence ranking models are proposed for summarizing news text documents. Simulations are conducted and compared with other algorithms under study to evaluate the performance of the proposed methodology. Simulation results validate the superior performance of the proposed methodology.
APA, Harvard, Vancouver, ISO and other styles
15

Amoudi, Ghada, Amal Almansour, and Hanan Saleh Alghamdi. "Improved Graph-Based Arabic Hotel Review Summarization Using Polarity Classification." Applied Sciences 12, no. 21 (October 29, 2022): 10980. http://dx.doi.org/10.3390/app122110980.

Full text of the source
Abstract:
The increasing number of online product and service reviews has created a substantial information resource for individuals and businesses. Automatic review summarization helps overcome information overload. Research in automatic text summarization shows remarkable advancement; however, research on Arabic text summarization has not been conducted as extensively. This study proposes an extractive Arabic review summarization approach that incorporates the reviews' polarity and sentiment aspects and employs a graph-based ranking algorithm, TextRank. We demonstrate the advantages of the proposed methods through a set of experiments using hotel reviews from Booking.com. Reviews were grouped based on their polarity, and then TextRank was applied to produce the summary. Results were evaluated using two primary measures, BLEU and ROUGE; in addition, summaries by two native Arabic speakers were used for evaluation purposes. The results showed that this approach improved the summarization scores in most experiments, reaching an F1 score of 0.6294. The contributions of this work include applying a graph-based approach to a new domain, Arabic hotel reviews; adding a sentiment dimension to summarization; analyzing the algorithms of the two primary summarization metrics to show how these measures work and how they can be used to give accurate results; and finally, providing four human summaries for two hotels that can be utilized in further research.
APA, Harvard, Vancouver, ISO and other styles
16

Canhasi, Ercan. "Fast document summarization using locality sensitive hashing and memory access efficient node ranking." International Journal of Electrical and Computer Engineering (IJECE) 6, no. 3 (June 1, 2016): 945. http://dx.doi.org/10.11591/ijece.v6i3.9030.

Full text of the source
Abstract:
Text modeling and sentence selection are the fundamental steps of a typical extractive document summarization algorithm. The common text modeling method connects a pair of sentences based on their similarity. Even though it can effectively represent the sentence similarity graph of the given document(s), its big drawback is a large time complexity of O(n^2), where n represents the number of sentences. The quadratic time complexity makes it impractical for large documents. In this paper we propose fast approximation algorithms for text modeling and sentence selection. Our text modeling algorithm reduces the time complexity to near-linear time by rapidly finding the most similar sentences to form the sentence similarity graph. In doing so we utilized Locality-Sensitive Hashing, a fast algorithm for approximate nearest neighbor search. For the sentence selection step we propose a simple memory-access-efficient node ranking method based on the idea of sequentially scanning only the neighborhood arrays. Experimentally, we show that sacrificing a rather small percentage of recall and precision in the quality of the produced summary can reduce the quadratic time complexity to sub-linear. We see great potential for the proposed method in text summarization on mobile devices and big text data summarization for the Internet of Things in the cloud. In our experiments, besides evaluating the presented method on the standard general and query-focused multi-document summarization tasks, we also tested it on a few alternative summarization tasks, including timeline and comparative summarization.
APA, Harvard, Vancouver, ISO and other styles
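Locality-Sensitive Hashing for sentence similarity, as used in the paper above, can be illustrated with MinHash signatures, whose per-position agreement rate estimates the Jaccard similarity of token sets without comparing every pair of sentences exhaustively (a generic sketch, not the paper's exact scheme):

```python
import hashlib

def minhash(tokens, num_hashes=32):
    """MinHash signature: for each seeded hash function, keep the minimum
    hash value over the token set."""
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int(hashlib.md5(f"{seed}:{t}".encode()).hexdigest(), 16)
            for t in tokens))
    return sig

def est_jaccard(sig_a, sig_b):
    """Fraction of agreeing signature positions estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

a = set("the quick brown fox jumps over the lazy dog".split())
b = set("the quick brown fox leaps over the lazy dog".split())
c = set("an entirely different sentence altogether".split())
sa, sb, sc = (minhash(x) for x in (a, b, c))
```

Banding the signatures into hash buckets then lets near-duplicate sentences collide directly, which is how the quadratic all-pairs similarity step is replaced by near-linear graph construction.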
17

Canhasi, Ercan. "Fast document summarization using locality sensitive hashing and memory access efficient node ranking." International Journal of Electrical and Computer Engineering (IJECE) 6, no. 3 (June 1, 2016): 945. http://dx.doi.org/10.11591/ijece.v6i3.pp945-954.

Full text of the source
Abstract:
Text modeling and sentence selection are the fundamental steps of a typical extractive document summarization algorithm. The common text modeling method connects a pair of sentences based on their similarity. Even though it can effectively represent the sentence similarity graph of the given document(s), its big drawback is a large time complexity of O(n^2), where n represents the number of sentences. The quadratic time complexity makes it impractical for large documents. In this paper we propose fast approximation algorithms for text modeling and sentence selection. Our text modeling algorithm reduces the time complexity to near-linear time by rapidly finding the most similar sentences to form the sentence similarity graph. In doing so we utilized Locality-Sensitive Hashing, a fast algorithm for approximate nearest neighbor search. For the sentence selection step we propose a simple memory-access-efficient node ranking method based on the idea of sequentially scanning only the neighborhood arrays. Experimentally, we show that sacrificing a rather small percentage of recall and precision in the quality of the produced summary can reduce the quadratic time complexity to sub-linear. We see great potential for the proposed method in text summarization on mobile devices and big text data summarization for the Internet of Things in the cloud. In our experiments, besides evaluating the presented method on the standard general and query-focused multi-document summarization tasks, we also tested it on a few alternative summarization tasks, including timeline and comparative summarization.
APA, Harvard, Vancouver, ISO and other styles
18

Flannery, Jeremiah. "Using NLP to Generate MARC Summary Fields for Notre Dame's Catholic Pamphlets." International Journal of Librarianship 5, no. 1 (July 23, 2020): 20–35. http://dx.doi.org/10.23974/ijol.2020.vol5.1.158.

Full text of the source
Abstract:
Three NLP (Natural Language Processing) automated summarization techniques were tested on a special collection of Catholic pamphlets acquired by Hesburgh Libraries. The automated summaries were generated after feeding the pamphlets as .pdf files into an OCR pipeline. Extensive data cleaning and text preprocessing were necessary before the computer summarization algorithms could be launched. Using the standard ROUGE F1 scoring technique, the BERT Extractive Summarizer had the best summarization score, most closely matching the human reference summaries with an average ROUGE F1 score of 0.239. The Gensim Python package implementation of TextRank scored 0.151, and a hand-implemented TextRank algorithm created summaries that scored 0.144. This article covers the implementation of automated pipelines to read PDF text, the strengths and weaknesses of automated summarization techniques, and what the successes and failures of these summaries mean for their potential use in Hesburgh Libraries.
APA, Harvard, Vancouver, ISO and other styles
19

Meena, Yogesh Kumar, and Dinesh Gopalani. "Evolutionary Algorithms for Extractive Automatic Text Summarization." Procedia Computer Science 48 (2015): 244–49. http://dx.doi.org/10.1016/j.procs.2015.04.177.

Full text of the source
APA, Harvard, Vancouver, ISO and other styles
20

Kalyani, BJD, Jaishri Wankhede, and Shaik Shahanaz. "Data Mining Oriented Automatic Scientific Documents Summarization." International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 4 (May 4, 2023): 126–30. http://dx.doi.org/10.17762/ijritcc.v11i4.6395.

Full text of the source
Abstract:
The scientific research process usually begins with an examination of the state of the art, which may involve voluminous publications. Summarizing scientific articles can assist researchers by speeding up the research process. Summarization of scientific articles differs from general text summarization because of their specific structure and the inclusion of cited sentences. Much of the important information in scientific articles is presented in tables, figures, and algorithm pseudocode, features that rarely appear in ordinary text. Therefore, a number of methods that account for the structure of a scientific article have been suggested to improve the quality of the produced summary. This paper makes use of clustering algorithms to handle the CL-SciSumm 2020 and LongSumm 2020 scientific document summarization tasks. Three well-known clustering algorithms are employed to tackle these tasks, and several sentence-scoring functions, together with textual entailment, are used to retrieve sentences from each cluster to generate the summary.
APA, Harvard, Vancouver, ISO and other styles
21

Paramanantham, Vinsent, and Dr S. Suresh Kumar. "A Review on Key Features and Novel Methods for Video Summarization." International Journal of Engineering and Advanced Technology 12, no. 3 (February 28, 2023): 88–105. http://dx.doi.org/10.35940/ijeat.f3737.0212323.

Full text of the source
Abstract:
In this paper, we discuss the techniques, algorithms, and evaluation methods used in online, offline, supervised, unsupervised, multi-video, and clustering methods for video summarization and multi-view video summarization, drawing on various references. We have studied different techniques in the literature and described the features used for generating video summaries, along with the evaluation methods, supervised and unsupervised approaches, algorithms, and datasets used. The survey extends to the new frontier of research in computational intelligence techniques such as ANNs (Artificial Neural Networks) and other evolutionary algorithms for video summarization, using both supervised and unsupervised methods. We highlight single- and multi-video summarization with features such as video, audio, and semantic embeddings considered for video summarization in the literature. A careful presentation is attempted of performance comparisons using precision, recall, F-score, and manual evaluation methods.
APA, Harvard, Vancouver, ISO and other styles
22

Na, Liu, Tang Di, Lu Ying, Tang Xiao-Jun, and Wang Hai-Wen. "Topic-sensitive multi-document summarization algorithm." Computer Science and Information Systems 12, no. 4 (2015): 1375–89. http://dx.doi.org/10.2298/csis140815060n.

Full text of the source
Abstract:
Latent Dirichlet Allocation (LDA) has recently been used to generate topics from text corpora. However, not all the estimated topics are of equal importance or correspond to genuine themes of the domain: some topics can be a collection of irrelevant words or represent insignificant themes. This paper proposes a topic-sensitive algorithm for multi-document summarization. The algorithm uses an LDA model and a weighted linear combination strategy to identify significant topics, which are then used in sentence weight calculation. Each topic is measured by three different LDA criteria, and topic significance is evaluated by combining the multiple criteria through the weighted linear combination. In addition to topic features, the proposed approach also considers statistical features, such as term frequency, sentence position, and sentence length. It not only highlights the advantages of statistical features but also cooperates with the topic model. The experiments showed that the proposed algorithm achieves better performance than other state-of-the-art algorithms on the DUC2002 corpus.
APA, Harvard, Vancouver, ISO and other styles
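The weighted linear combination of topic and statistical features described above reduces to a dot product between a feature vector and a weight vector per sentence; the feature names, weights, and values below are made up for illustration, not the paper's tuned parameters:

```python
def sentence_score(features, weights):
    """Weighted linear combination of per-sentence features
    (topic significance, position, length, term frequency)."""
    return sum(weights[k] * features[k] for k in weights)

# illustrative weights and normalized feature values for one sentence
weights = {"topic": 0.5, "position": 0.2, "length": 0.1, "tf": 0.2}
features = {"topic": 0.8, "position": 1.0, "length": 0.5, "tf": 0.6}
score = sentence_score(features, weights)
```

Sentences are then ranked by this score and the top ones are selected, so the topic term dominates when its weight is largest, which is the paper's way of keeping insignificant LDA topics from steering the summary.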
23

Pandey, Karran, Fanny Chevalier, and Karan Singh. "Juxtaform: interactive visual summarization for exploratory shape design." ACM Transactions on Graphics 42, no. 4 (July 26, 2023): 1–14. http://dx.doi.org/10.1145/3592436.

Abstract:
We present juxtaform, a novel approach to the interactive summarization of large shape collections for conceptual shape design. We conduct a formative study to ascertain design goals for creative shape exploration tools. Motivated by a mathematical formulation of these design goals, juxtaform integrates the exploration, analysis, selection, and refinement of large shape collections to support an interactive divergence-convergence shape design workflow. We exploit sparse, segmented sketch-stroke visual abstractions of shape and a novel visual summarization algorithm to balance the needs of shape understanding, in-situ shape juxtaposition, and visual clutter. Our evaluation is three-fold: we show that existing shape and stroke clustering algorithms do not address our design goals compared to our proposed shape corpus summarization algorithm; we compare juxtaform against a structured image gallery interface for various shape design and analysis tasks; and we present multiple compelling 2D/3D applications using juxtaform.
24

Amini, Amineh, and Teh Ying Wah. "On Density-Based Clustering Algorithms over Evolving Data Streams: A Summarization Paradigm." Applied Mechanics and Materials 263-266 (December 2012): 2234–37. http://dx.doi.org/10.4028/www.scientific.net/amm.263-266.2234.

Abstract:
Clustering is one of the prominent tasks in mining data streams. Among the various clustering algorithms that have been developed, density-based methods have the ability to discover arbitrary-shape clusters and to detect outliers. Recently, various algorithms have adopted density-based methods for clustering data streams. In this paper, we look into three remarkable algorithms from the two groups of micro-clustering and grid-based methods: DenStream, D-Stream, and MR-Stream. We compare the algorithms by evaluating algorithm performance and clustering quality metrics.
25

Chettah, Khadidja, and Amer Draa. "A Quantum-Inspired Genetic Algorithm for Extractive Text Summarization." International Journal of Natural Computing Research 10, no. 2 (April 2021): 42–60. http://dx.doi.org/10.4018/ijncr.2021040103.

Abstract:
Automatic text summarization has recently become a key instrument for reducing the huge quantity of textual data. In this paper, the authors propose a quantum-inspired genetic algorithm (QGA) for extractive single-document summarization. The QGA is used inside a fully automated system as an optimizer to search for the best combination of sentences to be put in the final summary. The presented approach is compared with 11 reference methods, including supervised and unsupervised summarization techniques. The authors evaluated the performance of the proposed approach on the DUC 2001 and DUC 2002 datasets using the ROUGE-1 and ROUGE-2 evaluation metrics. The obtained results show that the proposal can compete with other state-of-the-art methods. It is ranked first out of 12, outperforming all other algorithms.
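ROUGE-1 and ROUGE-2, the metrics used in this evaluation, reduce to n-gram overlap counts between a candidate summary and a reference. A minimal sketch follows; real evaluations use the official ROUGE toolkit, and `rouge_n` is a hypothetical helper name.

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """ROUGE-N recall, precision and F1 via clipped n-gram overlap."""
    def ngrams(text, n):
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum((cand & ref).values())  # Counter & takes min counts (clipping)
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 2 * recall * precision / (recall + precision) if overlap else 0.0
    return recall, precision, f1
```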
26

Md, Abdul Quadir, Raghav V. Anand, Senthilkumar Mohan, Christy Jackson Joshua, Sabhari S. Girish, Anthra Devarajan, and Celestine Iwendi. "Data-Driven Analysis of Privacy Policies Using LexRank and KL Summarizer for Environmental Sustainability." Sustainability 15, no. 7 (March 29, 2023): 5941. http://dx.doi.org/10.3390/su15075941.

Abstract:
Natural language processing (NLP) is a field of machine learning that analyzes and manipulates huge amounts of data and generates human language. There is a variety of applications of NLP, such as sentiment analysis, text summarization, spam filtering, and language translation. Since privacy policies are important legal documents, they play a vital part in any agreement. These documents are very long, but the important points still have to be read thoroughly. Customers might not have the necessary time or the knowledge to understand all the complexities of a privacy policy document. In this context, this paper proposes an optimal model to summarize the privacy policy in the best possible way. Text summarization is the process of extracting a summary from a huge original text without losing any vital information. Using the proposed idea of a common word reduction process combined with natural language processing algorithms, this paper extracts the sentences in the privacy policy document that hold high weightage and displays them to the customer; this can save the customer the time of reading through the entire policy while providing only the important lines that they need to know before signing the document. The proposed method uses two different extractive text summarization algorithms, namely LexRank and Kullback-Leibler (KL) Summarizer, to summarize the obtained text. According to the results, the summarized sentences obtained via the common word reduction process and text summarization algorithms were more significant than the raw privacy policy text. The introduction of this novel methodology helps to identify certain important common words used in a particular sector to a greater depth, thus allowing a more in-depth study of a privacy policy. Using the common word reduction process, the sentences were reduced by 14.63%, and by applying extractive NLP algorithms, significant sentences were obtained. The results after applying the NLP algorithms showed a 191.52% increase in the repetition of common words in each sentence using the KL Summarizer algorithm, while the LexRank algorithm showed a 361.01% increase in the repetition of common words. This implies that common words play a large role in determining a sector's privacy policies, making the proposed method a real-world solution for environmental sustainability.
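The LexRank step named here ranks sentences by centrality in a cosine-similarity graph. Below is a minimal degree-style LexRank sketch, with the similarity threshold and damping factor chosen as illustrative assumptions rather than taken from the paper.

```python
import math
from collections import Counter

def lexrank(sentences, threshold=0.1, damping=0.85, iters=50):
    """Degree-variant LexRank: build a thresholded cosine-similarity
    graph over sentences, then rank nodes by power iteration."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    def cosine(a, b):
        dot = sum(a[w] * b[w] for w in a if w in b)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0
    n = len(sentences)
    # binary adjacency: keep edges above the similarity threshold
    adj = [[1.0 if i != j and cosine(vecs[i], vecs[j]) >= threshold else 0.0
            for j in range(n)] for i in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            rank = sum(scores[j] * adj[j][i] / max(sum(adj[j]), 1)
                       for j in range(n))
            new.append((1 - damping) / n + damping * rank)
        scores = new
    return scores
```

Sentences with many similar neighbours accumulate score; isolated sentences keep only the teleport term.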
27

D’Silva, Suzanne, Neha Joshi, Sudha Rao, Sangeetha Venkatraman, and Seema Shrawne. "Improved Algorithms for Document Classification &Query-based Multi-Document Summarization." International Journal of Engineering and Technology 3, no. 4 (2011): 404–9. http://dx.doi.org/10.7763/ijet.2011.v3.261.

28

Gahman, Nicholas, and Vinayak Elangovan. "A Comparison of Document Similarity Algorithms." International Journal of Artificial Intelligence & Applications 14, no. 2 (March 30, 2023): 41–50. http://dx.doi.org/10.5121/ijaia.2023.14204.

Abstract:
Document similarity is an important part of Natural Language Processing and is most commonly used for plagiarism detection and text summarization. Thus, finding the overall most effective document similarity algorithm could have a major positive impact on the field of Natural Language Processing. This report sets out to examine the numerous document similarity algorithms, and determine which ones are the most useful. It addresses the most effective document similarity algorithm by categorizing them into 3 types of document similarity algorithms: statistical algorithms, neural networks, and corpus/knowledge-based algorithms. The most effective algorithms in each category are also compared in our work using a series of benchmark datasets and evaluations that test every possible area that each algorithm could be used in.
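Two of the statistical similarity measures commonly covered in such comparisons — cosine similarity over term-frequency vectors and Jaccard similarity over word sets — can be sketched as follows. This is an illustration, not the survey's benchmark code.

```python
import math
from collections import Counter

def cosine_sim(a, b):
    """Cosine similarity over term-frequency vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va if w in vb)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def jaccard_sim(a, b):
    """Jaccard similarity over word sets: |A ∩ B| / |A ∪ B|."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0
```

Cosine weights repeated terms, while Jaccard only cares about set membership, which is why they can rank document pairs differently.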
29

Singco, Van Zachary V., Joel C. Trillo, Cristopher C. Abalorio, James Cloyd M. Bustillo, Junell T. Bojocan, and Michelle C. Elape. "OCR-based Hybrid Image Text Summarizer using Luhn Algorithm with Finetune Transformer Models for Long Document." International Journal of Emerging Technology and Advanced Engineering 13, no. 2 (February 4, 2023): 47–56. http://dx.doi.org/10.46338/ijetae0223_07.

Abstract:
The accessibility of an enormous number of image text documents on the internet has expanded the opportunities to develop a system for image text recognition with text summarization. Several approaches used in ATS in the literature are based on extractive and abstractive techniques; however, few implementations of the hybrid approach were observed. This paper employed state-of-the-art transformer models together with the Luhn algorithm on texts extracted using Tesseract OCR. Nine models were generated and tested using the hybrid text summarization approach. Using ROUGE metrics, we compared the proposed system's finetuned abstractive models against existing abstractive models that use the same dataset (XSum). As a result, the finetuned model got the highest ROUGE scores during evaluation: the ROUGE-1 score was 57%, the ROUGE-2 score was 43%, and the ROUGE-L score was 42%. Furthermore, even when better algorithms and models were available for summarization, the Luhn algorithm and the T5 finetuned model provided significant results.
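The extractive half of the pipeline, Luhn's algorithm, scores a sentence by the density of its "significant" (frequent, non-stopword) words. A minimal sketch follows; the stopword list and the frequency cutoff for significance are illustrative assumptions.

```python
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "are", "for"}

def luhn_score(sentence, significant):
    """Luhn sentence score: (number of significant words)^2 divided by
    the length of the span bracketed by the first and last of them."""
    toks = sentence.lower().split()
    idx = [i for i, t in enumerate(toks) if t in significant]
    if not idx:
        return 0.0
    span = idx[-1] - idx[0] + 1
    return len(idx) ** 2 / span

def luhn_summary(sentences, top_n=1, min_freq=2):
    """Return the top_n sentences by Luhn score."""
    freq = Counter(w for s in sentences for w in s.lower().split()
                   if w not in STOPWORDS)
    significant = {w for w, c in freq.items() if c >= min_freq}
    ranked = sorted(sentences, key=lambda s: luhn_score(s, significant),
                    reverse=True)
    return ranked[:top_n]
```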
30

Liu, Qiang, Jiaxing Wei, Hao Liu, and Yimu Ji. "A Hierarchical Parallel Graph Summarization Approach Based on Ranking Nodes." Applied Sciences 13, no. 8 (April 7, 2023): 4664. http://dx.doi.org/10.3390/app13084664.

Abstract:
Graph summarization techniques are vital in simplifying and extracting enormous quantities of graph data. Traditional static graph structure-based summarization algorithms generally follow a minimum description length (MDL) style and concentrate on minimizing the graph storage overhead. However, these methods also suffer from incomprehensive summary dimensions and inefficiency problems. In addition, the need for graph summarization techniques often varies among different graph applications, but an ideal summary method should generally retain the important characteristics of the key nodes in the final summary graph. This paper proposes a novel method based on ranking nodes, called HRNS, that follows a hierarchical parallel graph summarization approach. HRNS first preprocesses the node ranking using a hybrid weighted importance strategy and introduces the node importance factor into traditional MDL-based summarization algorithms; it then leverages a hierarchical parallel process to accelerate the summary computation. The experimental results obtained using both real and simulated datasets show that HRNS can efficiently extract nodes with high importance, with the average importance over six datasets ranging from 0.107 to 0.167; HRNS also achieves significant speedups, and its sum error ratios are lower than those of traditionally used methods.
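The node-ranking preprocessing step can be illustrated with a toy hybrid importance score. The blend below, a node's own degree mixed with its neighbours' average degree, is only a stand-in for the paper's hybrid weighted importance strategy; the blend weight `alpha` is an assumption.

```python
def rank_nodes(edges, alpha=0.7):
    """Rank nodes of an undirected graph by a hybrid importance score:
    a weighted blend of own degree and mean neighbour degree
    (illustrative only, not the HRNS strategy itself)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    deg = {n: len(ns) for n, ns in adj.items()}
    max_deg = max(deg.values())
    scores = {}
    for n, ns in adj.items():
        own = deg[n] / max_deg
        nbr = sum(deg[m] for m in ns) / (len(ns) * max_deg)
        scores[n] = alpha * own + (1 - alpha) * nbr
    return sorted(scores, key=scores.get, reverse=True)
```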
31

Al-amri, Redhwan, Raja Kumar Murugesan, Mubarak Almutairi, Kashif Munir, Gamal Alkawsi, and Yahia Baashar. "A Clustering Algorithm for Evolving Data Streams Using Temporal Spatial Hyper Cube." Applied Sciences 12, no. 13 (June 27, 2022): 6523. http://dx.doi.org/10.3390/app12136523.

Abstract:
As applications generate massive amounts of data streams, the requirement for ways to analyze and cluster this data has become a critical field of research for knowledge discovery. Data stream clustering's primary objective is to acquire insights into incoming data. Recognizing all possible patterns in data streams that enter at variable rates and structures and evolve over time is critical for acquiring insights. Analyzing data streams has been one of the vital research areas due to the inevitably evolving aspect of the data stream and its vast application domains. Existing algorithms for data stream clustering consider adding various data summarization structures, starting from grid projection and ending with buffers of Core-Micro and Macro clusters. However, it is found that the static assumption of the data summarization impacts the quality of clustering. To fill this gap, an online clustering algorithm for handling evolving data streams using a tempo-spatial hyper cube, called BOCEDS TSHC, has been developed in this research. The role of the tempo-spatial hyper cube (TSHC) is to add more dimensions to the data summarization for more degrees of freedom. TSHC, when added to Buffer-based Online Clustering for Evolving Data Streams (BOCEDS), results in a superior evolving data stream clustering algorithm. Evaluation based on both real-world and synthetic datasets has proven the superiority of the developed BOCEDS TSHC clustering algorithm over the baseline algorithms with respect to most of the clustering metrics.
32

Faguo Zhou. "Research on Chinese Multi-document Automatic Summarization Algorithms." International Journal of Advancements in Computing Technology 4, no. 23 (December 31, 2012): 43–49. http://dx.doi.org/10.4156/ijact.vol4.issue23.6.

33

Zhang, Chunyan, Junchao Wang, Qinglei Zhou, Ting Xu, Ke Tang, Hairen Gui, and Fudong Liu. "A Survey of Automatic Source Code Summarization." Symmetry 14, no. 3 (February 25, 2022): 471. http://dx.doi.org/10.3390/sym14030471.

Abstract:
Source code summarization refers to the natural language description of the source code’s function. It can help developers easily understand the semantics of the source code. We can think of the source code and the corresponding summarization as being symmetric. However, the existing source code summarization is mismatched with the source code, missing, or out of date. Manual source code summarization is inefficient and requires a lot of human effort. To overcome such situations, many studies have been conducted on Automatic Source Code Summarization (ASCS). Given a set of source code, the ASCS techniques can automatically generate a summary described with natural language. In this paper, we give a review of the development of ASCS technology. Almost all ASCS technology involves the following stages: source code modeling, code summarization generation, and quality evaluation. We further categorize the existing ASCS techniques based on the above stages and analyze their advantages and shortcomings. We also draw a clear map of the development of the existing algorithms.
34

Shirali, Nooshin, and Marjan Abdeyazdan. "An Imperialist Competitive Algorithm for Persian Text Segmentation." Ciência e Natura 37 (December 19, 2015): 247. http://dx.doi.org/10.5902/2179460x20780.

Abstract:
Segmentation has been used in different natural language processing tasks, such as information retrieval and text summarization. In this paper, a novel Persian text segmentation algorithm is proposed. Our proposed algorithm applies the imperialist competitive algorithm (ICA) to find the optimal topic boundaries. This is the first time that an evolutionary algorithm has been applied to Persian text segmentation. The experimental results show that the proposed algorithm is more accurate than other Persian text segmentation algorithms.
35

KUŞ, Anıl, and Çiğdem İnan ACI. "An Extractive Text Summarization Model for Generating Extended Abstracts of Medical Papers in Turkish." Bilgisayar Bilimleri ve Teknolojileri Dergisi 4, no. 1 (June 15, 2023): 19–26. http://dx.doi.org/10.54047/bibted.1260697.

Abstract:
The rapid growth of technology has led to an increase in the amount of data available in the digital environment. This situation makes it difficult and time-consuming for users to find the information they are looking for within this vast dataset. To alleviate this difficulty, automatic text summarization systems have been developed as a more efficient way to access relevant information in texts compared to traditional summarization techniques. This study aims to extract extended summaries of Turkish medical papers written about COVID-19. Although scientific papers already have abstracts, more comprehensive summaries are still needed. To the best of our knowledge, automatic summarization of academic studies related to COVID-19 in the Turkish language has not been done before. A dataset was created by collecting 84 Turkish papers from DergiPark. Extended summaries of 2455 and 1708 characters were obtained using widely used extractive methods, the Term Frequency and LexRank algorithms, respectively. The performance of the text summarization model was evaluated based on Recall, Precision, and F-score criteria, and the algorithms were shown to be effective for Turkish. The results of the study showed accuracy rates similar to those of previous studies in the literature.
36

Salman, Zainab Abdul-Wahid. "Text Summarizing and Clustering Using Data Mining Technique." Al-Mustansiriyah Journal of Science 34, no. 1 (March 30, 2023): 58–64. http://dx.doi.org/10.23851/mjs.v34i1.1195.

Abstract:
Text summarization is an important research topic in the field of information technology because of the large volume of texts and the large amount of data found on the Internet and social media. The task of summarizing text has gained great importance, and it requires finding highly efficient ways of extracting knowledge in various fields; thus, there is a need for methods of summarizing texts from one document or multiple documents. The summarization methods aim to capture the main content of a set of documents while reducing redundant information. In this paper, an efficient method to summarize texts is proposed that depends on a word association algorithm to separate and merge sentences after summarizing them, together with data mining techniques: the information is redistributed using the K-Means algorithm, and Term Frequency-Inverse Document Frequency (TF-IDF) is used for measuring the properties of the summarized texts. The experimental results found that the summarization ratios are good when unimportant words are deleted. The method of extracting characteristics of texts was also useful in grouping similar texts into clusters, which makes it possible to combine this method with other artificial intelligence methods, such as fuzzy logic or evolutionary algorithms, to increase summarization rates and accelerate cluster operations.
37

Koutra, Danai. "The power of summarization in graph mining and learning." Proceedings of the VLDB Endowment 14, no. 13 (September 2021): 3416. http://dx.doi.org/10.14778/3484224.3484238.

Abstract:
Our ability to generate, collect, and archive data related to everyday activities, such as interacting on social media, browsing the web, and monitoring well-being, is rapidly increasing. Getting the most benefit from this large-scale data requires analysis of patterns it contains, which is computationally intensive or even intractable. Summarization techniques produce compact data representations (summaries) that enable faster processing by complex algorithms and queries. This talk will cover summarization of interconnected data (graphs) [3], which can represent a variety of natural processes (e.g., friendships, communication). I will present an overview of my group's work on bridging the gap between research on summarized network representations and real-world problems. Examples include summarization of massive knowledge graphs for refinement [2] and on-device querying [4], summarization of graph streams for persistent activity detection [1], and summarization within graph neural networks for fast, interpretable classification [5]. I will conclude with open challenges and opportunities for future research.
38

Suman, Srishty, Utkarsh Rastogi, and Rajat Tiwari. "Image Stitching Algorithms - A Review." Circulation in Computer Science 1, no. 2 (December 24, 2016): 14–18. http://dx.doi.org/10.22632/ccs-2016-251-39.

Abstract:
Image stitching is the process of combining two or more images of the same scene into a single larger image. Image stitching is needed in many applications, such as video stabilization, video summarization, video compression, and panorama creation. The effectiveness of image stitching depends on the overlap removal, the matching of image intensities, and the techniques used for blending the images. In this paper, the various techniques devised for image stitching and their applications are reviewed.
39

Raposo, Francisco, Ricardo Ribeiro, and David Martins de Matos. "On the Application of Generic Summarization Algorithms to Music." IEEE Signal Processing Letters 22, no. 1 (January 2015): 26–30. http://dx.doi.org/10.1109/lsp.2014.2347582.

40

Cohen, Edith, Nick Duffield, Haim Kaplan, Carstent Lund, and Mikkel Thorup. "Algorithms and estimators for summarization of unaggregated data streams." Journal of Computer and System Sciences 80, no. 7 (November 2014): 1214–44. http://dx.doi.org/10.1016/j.jcss.2014.04.009.

41

Sun, Yunyun, Peng Li, Zhaohui Jiang, and Sujun Hu. "Feature fusion and clustering for key frame extraction." Mathematical Biosciences and Engineering 18, no. 6 (2021): 9294–311. http://dx.doi.org/10.3934/mbe.2021457.

Abstract:
Numerous limitations of shot-based and content-based key-frame extraction approaches have encouraged the development of cluster-based algorithms. This paper proposes an Optimal Threshold and Maximum Weight (OTMW) clustering approach that allows accurate and automatic extraction of video summarization. Firstly, the video content is analyzed using the image color, texture and information complexity, and a video feature dataset is constructed. Then a Golden Section method is proposed to determine the optimal solution of the threshold function. The initial cluster center and the cluster number k are automatically obtained by employing the improved clustering algorithm. k clusters of video frames are produced with the help of the K-MEANS algorithm. The representative frame of each cluster is extracted using the Maximum Weight method, and an accurate video summarization is obtained. The proposed approach is tested on 16 multi-type videos; the obtained key-frame quality evaluation indices, the averages of Fidelity and Ratio, are 96.11925 and 97.128, respectively. The key-frames extracted by the proposed approach are consistent with human visual judgement. The performance of the proposed approach is compared with several state-of-the-art cluster-based algorithms, and Fidelity is increased by 12.49721, 10.86455, 10.62984 and 10.4984375, respectively. In addition, Ratio is increased by 1.958 on average with small fluctuations. The obtained experimental results demonstrate the advantage of the proposed solution over several related baselines on sixteen diverse datasets and validate that the proposed approach can accurately extract video summarization from multi-type videos.
42

Sakr, Mohamed, Walid Atwa, and Arabi Keshk. "Genetic-based Summarization for Local Outlier Detection in Data Stream." International Journal of Intelligent Systems and Applications 13, no. 1 (February 8, 2021): 58–68. http://dx.doi.org/10.5815/ijisa.2021.01.05.

Abstract:
Outlier detection is one of the important tasks in data mining. Detecting outliers over streaming data has become an important task in many applications, such as network analysis, fraud detection, and environment monitoring. One of the well-known outlier detection algorithms is the Local Outlier Factor (LOF). However, the original LOF has several drawbacks that prevent it from being used with data streams: (1) it needs a lot of processing power (CPU) and large memory to detect the outliers; (2) it deals with static data, which means that upon any change in the data, LOF recalculates the outliers from the beginning on the whole dataset. These drawbacks pose big challenges to existing outlier detection algorithms in terms of their accuracy when they are implemented in a streaming environment. In this paper, we propose a new algorithm called GSILOF that focuses on detecting outliers from data streams using genetics. GSILOF solves the problem of the large memory needed, as it has a fixed memory bound. GSILOF has two phases: first, the summarization phase tries to summarize the past data that has arrived; second, the detection phase detects the outliers in the newly arriving data. The summarization phase uses a genetic algorithm to try to find the subset of points that can represent the whole original set. Our experiments have been done over real datasets, confirming the effectiveness of the proposed approach and the high quality of the approximate solutions on a set of real-world streaming data.
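The batch LOF that GSILOF builds on can be written compactly. Below is a sketch of plain (non-streaming) LOF, where scores well above 1 indicate local outliers; the streaming and genetic summarization parts of the paper are not reproduced.

```python
import math

def lof(points, k=2):
    """Batch Local Outlier Factor: k-distance, reachability distance,
    local reachability density (lrd), then LOF = mean neighbour lrd
    over own lrd. Scores near 1 are inliers; >> 1 are outliers."""
    n = len(points)
    knn, kdist = [], []
    for i in range(n):
        ds = sorted((math.dist(points[i], points[j]), j)
                    for j in range(n) if j != i)
        knn.append([j for _, j in ds[:k]])   # k nearest neighbour indices
        kdist.append(ds[k - 1][0])           # distance to the k-th neighbour
    def reach(i, j):  # reachability distance of i from j
        return max(kdist[j], math.dist(points[i], points[j]))
    lrd = [k / sum(reach(i, j) for j in knn[i]) for i in range(n)]
    return [sum(lrd[j] for j in knn[i]) / (k * lrd[i]) for i in range(n)]
```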
43

Veena R, D. Ramesh, and Hanumantappa M. "Automatic text summarization–A systematic literature review." World Journal of Advanced Engineering Technology and Sciences 8, no. 2 (March 30, 2023): 126–29. http://dx.doi.org/10.30574/wjaets.2023.8.2.0080.

Abstract:
Automatic summarization is the act of computationally condensing a set of data to produce a subset (a summary) that captures the key ideas or information within the original text. To do this, artificial intelligence algorithms that are tailored for diverse sorts of data are frequently created and used. Ten research articles drawn from databases such as IEEE, Scopus, and Springer Nature have been considered. The paradigm shift that AI has created in the field of Automatic Text Summarization is discussed in detail.
44

Krishnaveni P. and Balasundaram S. R. "Automatic Text Summarization by Providing Coverage, Non-Redundancy, and Novelty Using Sentence Graph." Journal of Information Technology Research 15, no. 1 (January 2022): 1–18. http://dx.doi.org/10.4018/jitr.2022010108.

Abstract:
The day-to-day growth of online information necessitates intensive research in automatic text summarization (ATS). ATS software produces summary text by extracting important information from the original text. With the help of summaries, users can easily read and understand the documents of interest. Most of the approaches for ATS used only local properties of text. Moreover, the numerous properties make the sentence selection difficult and complicated. So this article uses graph-based summarization to utilize structural and global properties of text. It introduces the maximal clique based sentence selection (MCBSS) algorithm to select important and non-redundant sentences that cover all concepts of the input text for the summary. The MCBSS algorithm finds novel information using maximal cliques (MCs). The experimental results of Recall-Oriented Understudy for Gisting Evaluation (ROUGE) on the Timeline dataset show that the proposed work outperforms the existing graph algorithms Bushy Path (BP), Aggregate Similarity (AS), and TextRank (TR).
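Maximal cliques, the structure MCBSS selects sentences from, can be enumerated with the classic Bron-Kerbosch algorithm. This is a sketch; building the sentence graph itself (e.g. by thresholded sentence similarity) is assumed done elsewhere.

```python
def maximal_cliques(adj):
    """Bron-Kerbosch (without pivoting) over an adjacency dict
    node -> set of neighbours; returns each maximal clique as a
    sorted list of nodes."""
    cliques = []
    def bk(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))  # no way to extend: r is maximal
            return
        for v in list(p):
            bk(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    bk(set(), set(adj), set())
    return cliques
```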
45

Lin, Yu-Ru, Hari Sundaram, and Aisling Kelliher. "JAM: Joint Action Matrix Factorization for Summarizing a Temporal Heterogeneous Social Network." Proceedings of the International AAAI Conference on Web and Social Media 3, no. 1 (March 20, 2009): 250–53. http://dx.doi.org/10.1609/icwsm.v3i1.14002.

Abstract:
This paper presents JAM (Joint Action Matrix Factorization), a novel framework to summarize social activity from rich media social networks. Summarizing social network activities requires an understanding of the relationships among concepts, users, and the context in which the concepts are used. Our work has three contributions: First, we propose a novel summarization method which extracts the co-evolution on multiple facets of social activity – who (users), what (concepts), how (actions) and when (time), and constructs a context rich summary called "activity theme". Second, we provide an efficient algorithm for mining activity themes over time. The algorithm extracts representative elements in each facet based on their co-occurrences with other facets through specific actions. Third, we propose new metrics for evaluating the summarization results based on the temporal and topological relationship among activity themes. Extensive experiments on real-world Flickr datasets demonstrate that our technique significantly outperforms several baseline algorithms. The results explore nontrivial evolution in Flickr photo-sharing communities.
46

TATAR, DOINA, ANDREEA MIHIS, DANA LUPSA, and EMMA TAMAIANU-MORITA. "ENTAILMENT-BASED LINEAR SEGMENTATION IN SUMMARIZATION." International Journal of Software Engineering and Knowledge Engineering 19, no. 08 (December 2009): 1023–38. http://dx.doi.org/10.1142/s0218194009004520.

Abstract:
This paper presents some original methods for text summarization of a single source document by extraction. The methods are based on some of our own text segmentation algorithms. We denote them as logical segmentation because for all these methods (LTT, ArcInt and ArcReal) the score of a sentence is calculated starting from the number of sentences which are entailed by it. For a text (which is a sequence of sentences) the scores form a structure which indicates how the most important sentences alternate with less important ones and organizes the text according to its logical content. The second logical method, Pure Entailment also uses definition of the relation of entailment between two texts. At least to our knowledge, it is for the first time that the relation of Text Entailment between the sentences of a text is used for segmentation and summarization. The third original method applies Dynamic Programming and centering theory to the sentences logically scored as above. The obtained ranked logical segments are used in the summarization. Our methods of segmentation and summarization are applied and evaluated against a manually realized segmentation and summarization of the same text by Donald Richie, "The Koan".
47

Christian, Hans, Mikhael Pramodana Agus, and Derwin Suhartono. "Single Document Automatic Text Summarization using Term Frequency-Inverse Document Frequency (TF-IDF)." ComTech: Computer, Mathematics and Engineering Applications 7, no. 4 (December 31, 2016): 285. http://dx.doi.org/10.21512/comtech.v7i4.3746.

Abstract:
The increasing availability of online information has triggered intensive research in the area of automatic text summarization within Natural Language Processing (NLP). Text summarization reduces the text by removing the less useful information, which helps the reader find the required information quickly. There are many kinds of algorithms that can be used to summarize text. One of them is TF-IDF (Term Frequency-Inverse Document Frequency). This research aimed to produce an automatic text summarizer implemented with the TF-IDF algorithm and to compare it with various online automatic text summarizers. To evaluate the summary produced by each summarizer, the F-measure was used as the standard comparison value. The result of this research is 67% accuracy on three data samples, which is higher than that of the other online summarizers.
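A minimal version of the TF-IDF scoring described here treats each sentence as a "document" and ranks sentences by their mean TF-IDF weight. This is an illustrative sketch under those assumptions, not the paper's implementation.

```python
import math
from collections import Counter

def tfidf_summary(sentences, top_n=2):
    """Extractive summary by TF-IDF: score each sentence by the mean
    TF-IDF of its words and return the top sentences in original order."""
    docs = [s.lower().split() for s in sentences]
    n = len(docs)
    df = Counter(w for d in docs for w in set(d))  # document frequency
    def score(d):
        tf = Counter(d)
        return sum((tf[w] / len(d)) * math.log(n / df[w]) for w in tf) / len(tf)
    ranked = sorted(range(n), key=lambda i: score(docs[i]), reverse=True)
    keep = sorted(ranked[:top_n])  # restore original sentence order
    return [sentences[i] for i in keep]
```

Words occurring in every sentence get zero IDF, so sentences built only from ubiquitous words rank last.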
48

Rana, Deepak Singh. "Generating Document Summary using Data Mining and Clustering Techniques." Mathematical Statistician and Engineering Applications 70, no. 1 (January 31, 2021): 285–92. http://dx.doi.org/10.17762/msea.v70i1.2310.

Abstract:
This paper presents a novel approach to generating document summaries using data mining and clustering techniques, specifically the K-means and bisecting K-means clustering algorithms. With the exponential growth of textual data, there is an increasing need for efficient and accurate summarization techniques to help users grasp the key information within large collections of documents. This study explores the potential of data mining and clustering methods for extracting salient features from textual data and producing high-quality summaries. By applying the K-means and bisecting K-means clustering algorithms to the preprocessed textual data, the proposed approach groups similar sentences together and selects the most representative sentences from each cluster to form the final summary. The performance of the proposed method is evaluated using standard metrics, such as precision, recall, and F1-score, and compared with existing summarization techniques. The results demonstrate that the combination of data mining and clustering techniques provides a promising solution for generating accurate and concise document summaries, with potential applications in domains such as news aggregation, scientific literature summarization, and social media content analysis.
APA, Harvard, Vancouver, ISO, and other styles
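The clustering step described above can be illustrated with a bare-bones pure-Python Lloyd's k-means over term-frequency vectors, emitting the sentence nearest each centroid. This is only a sketch of the general technique; the paper presumably uses richer preprocessing and also the bisecting variant:

```python
import math
import random
import re
from collections import Counter

def kmeans_summary(text, k=2, seed=0, iters=20):
    """Cluster sentences with basic k-means on term-frequency vectors,
    then emit the sentence nearest each centroid, in document order."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    vocab = sorted({w for s in sentences for w in re.findall(r"\w+", s.lower())})
    vecs = []
    for s in sentences:
        tf = Counter(re.findall(r"\w+", s.lower()))
        vecs.append([tf[w] for w in vocab])

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    centroids = random.Random(seed).sample(vecs, k)
    for _ in range(iters):  # Lloyd iterations: assign, then recompute means
        clusters = [[] for _ in range(k)]
        for v in vecs:
            clusters[min(range(k), key=lambda c: dist(v, centroids[c]))].append(v)
        centroids = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    picks = sorted({min(range(len(vecs)), key=lambda i: dist(vecs[i], c)) for c in centroids})
    return [sentences[i] for i in picks]
```

Bisecting K-means would instead repeatedly split the largest cluster with k=2 until the desired number of clusters is reached.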
49

Manju, K., S. David Peter, and Sumam Idicula. "A Framework for Generating Extractive Summary from Multiple Malayalam Documents." Information 12, no. 1 (January 18, 2021): 41. http://dx.doi.org/10.3390/info12010041.

Full text of the source
Abstract:
Automatic extractive text summarization retrieves a subset of data that represents the most notable sentences in the entire document. In the era of the digital explosion of mostly unstructured textual data, users need to understand huge amounts of text in a short time; this creates demand for an automatic text summarizer. From summaries, users get an idea of the entire content of a document and can decide whether or not to read it in full. This work focuses on generating a summary from multiple news documents; in this case, the summary helps reduce redundant news across different newspapers. Multi-document summarization is more challenging than single-document summarization since it must handle overlapping information among sentences from different documents. Extractive text summarization retains the most salient parts of the document while discarding irrelevant and redundant sentences. In this paper, we propose a framework for extracting a summary from multiple documents in the Malayalam language. Since multi-document summarization data sets are sparse, methods based on deep learning are difficult to apply. The proposed work discusses the performance of existing standard algorithms for multi-document summarization in the Malayalam language. We propose a sentence extraction algorithm that selects the top-ranked sentences with maximum diversity. The system is found to perform well in terms of precision, recall, and F-measure on multiple input documents.
APA, Harvard, Vancouver, ISO, and other styles
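Selecting "top-ranked sentences with maximum diversity" can be illustrated with a greedy MMR-style loop that trades relevance against redundancy with sentences already chosen. This is an illustrative analogue only; the paper's actual ranking function and diversity criterion may differ:

```python
import re
from collections import Counter

def similarity(a, b):
    """Cosine similarity over term-frequency bags of words."""
    ta = Counter(re.findall(r"\w+", a.lower()))
    tb = Counter(re.findall(r"\w+", b.lower()))
    dot = sum(ta[w] * tb[w] for w in ta)
    na = sum(v * v for v in ta.values()) ** 0.5
    nb = sum(v * v for v in tb.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def mmr_select(sentences, scores, k=3, lam=0.7):
    """Greedy MMR: balance precomputed relevance scores against
    redundancy with already selected sentences."""
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:
        def mmr(i):
            redundancy = max(
                (similarity(sentences[i], sentences[j]) for j in selected),
                default=0.0,
            )
            return lam * scores[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in sorted(selected)]
```

With `lam` near 1 the selection is pure relevance ranking; lowering it pushes the summary toward covering distinct content, which is what suppresses near-duplicate news items across sources.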
50

Papalampidi, Pinelopi, Frank Keller, and Mirella Lapata. "Movie Summarization via Sparse Graph Construction." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 15 (May 18, 2021): 13631–39. http://dx.doi.org/10.1609/aaai.v35i15.17607.

Full text of the source
Abstract:
We summarize full-length movies by creating shorter videos containing their most informative scenes. We explore the hypothesis that a summary can be created by assembling scenes that are turning points (TPs), i.e., key events in a movie that describe its storyline. We propose a model that identifies TP scenes by building a sparse movie graph that represents relations between scenes and is constructed using multimodal information. According to human judges, the summaries created by our approach are more informative and complete, and receive higher ratings, than the outputs of sequence-based models and general-purpose summarization algorithms. The induced graphs are interpretable, displaying different topologies for different movie genres.
APA, Harvard, Vancouver, ISO, and other styles
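The general flavor of ranking scenes on a sparsified graph can be sketched as below: keep only each node's strongest edges, then rank nodes by weighted degree. This is only an illustration of the idea; the paper learns the sparse graph end-to-end from multimodal features rather than thresholding a given similarity matrix:

```python
def rank_on_sparse_graph(sim, keep_top=2):
    """Sparsify a symmetric similarity matrix by keeping each node's
    strongest edges, then rank nodes by weighted degree."""
    n = len(sim)
    graph = [[0.0] * n for _ in range(n)]
    for i in range(n):
        strongest = sorted((j for j in range(n) if j != i),
                           key=lambda j: sim[i][j], reverse=True)[:keep_top]
        for j in strongest:
            graph[i][j] = graph[j][i] = sim[i][j]  # undirected edge
    degree = [sum(row) for row in graph]
    return sorted(range(n), key=lambda i: degree[i], reverse=True)
```

Scenes (nodes) whose connections survive sparsification across many neighbours would be the candidates for turning points.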
