Journal articles on the topic "Text indexing"

The top 50 journal articles on the topic "Text indexing".

1

Zbigniew, Kaleta. "Semantic Text Indexing". Computer Science 15, no. 1 (2014): 19. http://dx.doi.org/10.7494/csci.2014.15.1.19.

2

Ferragina, Paolo, and Giovanni Manzini. "Indexing compressed text". Journal of the ACM 52, no. 4 (July 2005): 552–81. http://dx.doi.org/10.1145/1082036.1082039.

3

Navarro, Gonzalo, and Nicola Prezza. "Universal compressed text indexing". Theoretical Computer Science 762 (March 2019): 41–50. http://dx.doi.org/10.1016/j.tcs.2018.09.007.

4

Amir, Amihood, Gad M. Landau, and Esko Ukkonen. "Online timestamped text indexing". Information Processing Letters 82, no. 5 (June 2002): 253–59. http://dx.doi.org/10.1016/s0020-0190(01)00275-7.

5

Jones, Kevin. "Text management and indexing". Learned Publishing 5, no. 3 (1 January 1992): 168–69. http://dx.doi.org/10.1002/leap/50055.

6

Maaß, Moritz G., and Johannes Nowak. "Text indexing with errors". Journal of Discrete Algorithms 5, no. 4 (December 2007): 662–81. http://dx.doi.org/10.1016/j.jda.2006.11.001.

7

Ferragina, Paolo, and Roberto Grossi. "Improved Dynamic Text Indexing". Journal of Algorithms 31, no. 2 (May 1999): 291–319. http://dx.doi.org/10.1006/jagm.1998.0999.

8

Bernardini, Giulia, Huiping Chen, Gabriele Fici, Grigorios Loukides, and Solon P. Pissis. "Reverse-Safe Text Indexing". ACM Journal of Experimental Algorithmics 26 (8 July 2021): 1–26. http://dx.doi.org/10.1145/3461698.

Abstract:
We introduce the notion of reverse-safe data structures. These are data structures that prevent the reconstruction of the data they encode (i.e., they cannot be easily reversed). A data structure D is called z-reverse-safe when there exist at least z datasets with the same set of answers as the ones stored by D. The main challenge is to ensure that D stores as many answers to useful queries as possible, is constructed efficiently, and has size close to the size of the original dataset it encodes. Given a text of length n and an integer z, we propose an algorithm that constructs a z-reverse-safe data structure (z-RSDS) that has size O(n) and answers decision and counting pattern matching queries of length at most d optimally, where d is maximal for any such z-RSDS. The construction algorithm takes O(n^ω log d) time, where ω is the matrix multiplication exponent. We show that, despite the n^ω factor, our engineered implementation takes only a few minutes to finish for million-letter texts. We also show that plugging our method in data analysis applications gives insignificant or no data utility loss. Furthermore, we show how our technique can be extended to support applications under realistic adversary models. Finally, we show a z-RSDS for decision pattern matching queries, whose size can be sublinear in n. A preliminary version of this article appeared in ALENEX 2020.
9

Golub, Koraljka. "Automatic Subject Indexing of Text". Knowledge Organization 46, no. 2 (2019): 104–21. http://dx.doi.org/10.5771/0943-7444-2019-2-104.

Abstract:
Automatic subject indexing addresses problems of scale and sustainability and can be at the same time used to enrich existing metadata records, establish more connections across and between resources from various metadata and resource collections, and enhance consistency of the metadata. In this work, automatic subject indexing focuses on assigning index terms or classes from established knowledge organization systems (KOSs) for subject indexing like thesauri, subject headings systems and classification systems. The following major approaches are discussed, in terms of their similarities and differences, advantages and disadvantages for automatic assigned indexing from KOSs: “text categorization,” “document clustering,” and “document classification.” Text categorization is perhaps the most widespread, machine-learning approach with what seems generally good reported performance. Document clustering automatically both creates groups of related documents and extracts names of subjects depicting the group at hand. Document classification re-uses the intellectual effort invested into creating a KOS for subject indexing and even simple string-matching algorithms have been reported to achieve good results, because one concept can be described using a number of different terms, including equivalent, related, narrower and broader terms. Finally, applicability of automatic subject indexing to operative information systems and challenges of evaluation are outlined, suggesting the need for more research.
10

Belazzougui, Djamal, and Gonzalo Navarro. "Alphabet-Independent Compressed Text Indexing". ACM Transactions on Algorithms 10, no. 4 (August 2014): 1–19. http://dx.doi.org/10.1145/2635816.

11

Hon, Wing-Kai, Tsung-Han Ku, Rahul Shah, Sharma V. Thankachan, and Jeffrey Scott Vitter. "Compressed text indexing with wildcards". Journal of Discrete Algorithms 19 (March 2013): 23–29. http://dx.doi.org/10.1016/j.jda.2012.12.003.

12

Grist, Deirdre. "Indexing legislative text: Alberta Hansard". Indexer: The International Journal of Indexing 23, no. 3 (1 April 2003): 138–39. http://dx.doi.org/10.3828/indexer.2003.23.3.7.

13

Lancaster, F. W. "Retrieval experiments: Full text versus human indexing versus automatic indexing". Journal of the American Society for Information Science 49, no. 5 (1998): 483–84. http://dx.doi.org/10.1002/(sici)1097-4571(19980415)49:5<483::aid-asi13>3.0.co;2-a.

14

Lancaster, F. W. "Retrieval experiments: Full text versus human indexing versus automatic indexing". Journal of the American Society for Information Science 49, no. 5 (1998): 484. http://dx.doi.org/10.1002/(sici)1097-4571(19980415)49:5<484::aid-asi14>3.0.co;2-6.

15

Lancaster, F. W. "Retrieval experiments: Full text versus human indexing versus automatic indexing". Journal of the American Society for Information Science 49, no. 5 (15 April 1998): 484. http://dx.doi.org/10.1002/(sici)1097-4571(19980415)49:5<484::aid-asi14>3.3.co;2-y.

16

Rocher, Tatiana, Mathieu Giraud, and Mikaël Salson. "Indexing labeled sequences". PeerJ Computer Science 4 (26 March 2018): e148. http://dx.doi.org/10.7717/peerj-cs.148.

Abstract:
Background: Labels are a way to add information to a text, such as functional annotations (for example, genes) on a DNA sequence. V(D)J recombinations are DNA recombinations involving two or three short genes in lymphocytes. Sequencing this short region (500 bp or less) produces labeled sequences and brings insight into the lymphocyte repertoire for onco-hematology or immunology studies. Methods: We present two indexes for a text with non-overlapping labels. They store the text in a Burrows–Wheeler transform (BWT) and a compressed label sequence in a Wavelet Tree. The label sequence is taken in the order of the text (TL-index) or in the order of the BWT (TLBW-index). Both indexes need a space related to the entropy of the labeled text. Results: These indexes allow efficient text–label queries to count and find labeled patterns. The TLBW-index has an overhead on simple label queries but is very efficient on combined pattern–label queries. We implemented the indexes in C++ and compared them against a baseline solution on pseudo-random as well as on V(D)J labeled texts. Discussion: New indexes such as the ones we proposed improve the way we index and query labeled texts as, for instance, lymphocyte repertoire for hematological and immunological studies.
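As a rough illustration of the BWT-based pattern counting that such indexes build on, here is a minimal Python sketch. It is not the authors' TL-index or TLBW-index (those are written in C++ and add a Wavelet Tree over the labels). The function names `bwt` and `count_occurrences` are hypothetical, the sentinel `$` is assumed to be absent from the text and smaller than every text character, and both the rotation sort and the rank scan are deliberately naive.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler transform via sorted rotations (quadratic; illustration only)."""
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def count_occurrences(bwt_string, pattern):
    """Count occurrences of pattern using backward search over the BWT."""
    # C[c] = number of characters in the text that are strictly smaller than c.
    counts = {}
    for ch in bwt_string:
        counts[ch] = counts.get(ch, 0) + 1
    C, total = {}, 0
    for ch in sorted(counts):
        C[ch] = total
        total += counts[ch]

    def rank(ch, i):
        # Occurrences of ch in bwt_string[:i]; a linear scan here, whereas a
        # practical index would answer this from a precomputed structure.
        return bwt_string.count(ch, 0, i)

    lo, hi = 0, len(bwt_string)  # half-open range of matching sorted suffixes
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + rank(ch, lo)
        hi = C[ch] + rank(ch, hi)
        if lo >= hi:
            return 0
    return hi - lo


print(bwt("banana"))                            # annb$aa
print(count_occurrences(bwt("banana"), "ana"))  # 2
```

The pattern is processed right to left, and each step narrows a range of sorted suffixes, so the count never requires the original text.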
17

Smiraglia, Richard P. "Keywords, Indexing, Text Analysis: An Editorial". Knowledge Organization 40, no. 3 (2013): 155–59. http://dx.doi.org/10.5771/0943-7444-2013-3-155.

18

Salminen, Airi, Jean Tague-Sutcliffe, and Charles McClellan. "From text to hypertext by indexing". ACM Transactions on Information Systems 13, no. 1 (2 January 1995): 69–99. http://dx.doi.org/10.1145/195705.195717.

19

Gey, Fredric, and Daniel P. Dabney. "Full-text against intellectual indexing controversy". Journal of the American Society for Information Science 41, no. 8 (December 1990): 613–14. http://dx.doi.org/10.1002/(sici)1097-4571(199012)41:8<613::aid-asi9>3.0.co;2-d.

20

Bille, Philip, Johannes Fischer, Inge Li Gørtz, Tsvi Kopelowitz, Benjamin Sach, and Hjalte Wedel Vildhøj. "Sparse Text Indexing in Small Space". ACM Transactions on Algorithms 12, no. 3 (15 June 2016): 1–19. http://dx.doi.org/10.1145/2836166.

21

Zhang, Meng, Liang Hu, and Yi Zhang. "Weighted Automata for Full-Text Indexing". International Journal of Foundations of Computer Science 22, no. 04 (June 2011): 921–43. http://dx.doi.org/10.1142/s0129054111008490.

Abstract:
Full-text index structures are widely used in string matching and bioinformatics. These structures such as DAWGs and suffix trees allow fast searches on texts. In this paper, we present a new partition of the factors of a word, called a consistent minimal linear partition. Based on this partition, we introduce the weighted directed word graph (WDWG), a space-economical full-text index. WDWGs are basically cyclic, which means that they may accept infinite strings. But by assigning weights to edges, the acceptable strings are limited only to the factors of the input string. For a given word w, any factor of w can be indexed by a state of the WDWG and its length. A WDWG of w has at most |w| states and 2|w| - 1 transition edges. We present an on-line algorithm to construct a WDWG for a given word in time linear in the length of the word. Our experiment shows the size of WDWGs is smaller than that of DAWGs for many data sets including DNA sequences, Chinese texts and English texts.
22

Rao .S, Venkata. "Correlation Preserving Indexing Based Text Clustering". IOSR Journal of Computer Engineering 13, no. 1 (2013): 27–30. http://dx.doi.org/10.9790/0661-1312730.

23

Navarro, Gonzalo, Erkki Sutinen, and Jorma Tarhio. "Indexing text with approximate q-grams". Journal of Discrete Algorithms 3, no. 2-4 (June 2005): 157–75. http://dx.doi.org/10.1016/j.jda.2004.08.003.

24

Ferragina, Paolo. "Dynamic Text Indexing under String Updates". Journal of Algorithms 22, no. 2 (February 1997): 296–328. http://dx.doi.org/10.1006/jagm.1996.0814.

25

Gibney, Daniel, and Sharma V. Thankachan. "Text Indexing for Regular Expression Matching". Algorithms 14, no. 5 (23 April 2021): 133. http://dx.doi.org/10.3390/a14050133.

Abstract:
Finding substrings of a text T that match a regular expression p is a fundamental problem. Despite being the subject of extensive research, no solution with a time complexity significantly better than O(|T||p|) has been found. Backurs and Indyk in FOCS 2016 established conditional lower bounds for the algorithmic problem based on the Strong Exponential Time Hypothesis that helps explain this difficulty. A natural question is whether we can improve the time complexity for matching the regular expression by preprocessing the text T? We show that conditioned on the Online Matrix–Vector Multiplication (OMv) conjecture, even with arbitrary polynomial preprocessing time, a regular expression query on a text cannot be answered in strongly sublinear time, i.e., O(|T|^(1−ε)) for any ε > 0. Furthermore, if we extend the OMv conjecture to a plausible conjecture regarding Boolean matrix multiplication with polynomial preprocessing time, which we call Online Matrix–Matrix Multiplication (OMM), we can strengthen this hardness result to there being no solution with a query time that is O(|T|^(3/2−ε)). These results hold for alphabet sizes three or greater. We then provide data structures that answer queries in O(|T||p|τ) time where τ∈[1,|T|] is fixed at construction. These include a solution that works for all regular expressions with Expτ·|T| preprocessing time and space. For patterns containing only ‘concatenation’ and ‘or’ operators (the same type used in the hardness result), we provide (1) a deterministic solution which requires Expτ·|T|log2|T| preprocessing time and space, and (2) when |p|≤|T|z for z=2o(log|T|), a randomized solution with amortized query time which answers queries correctly with high probability, requiring Expτ·|T|2Ωlog|T| preprocessing time and space.
26

Lienhart, Rainer, and Wolfgang Effelsberg. "Automatic text segmentation and text recognition for video indexing". Multimedia Systems 8, no. 1 (1 January 2000): 69–81. http://dx.doi.org/10.1007/s005300050006.

27

Žďárek, Jan, and Bořivoj Melichar. "Tree-Based 2D Indexing". International Journal of Foundations of Computer Science 22, no. 08 (December 2011): 1893–907. http://dx.doi.org/10.1142/s0129054111009100.

Abstract:
A new approach to 2D pattern matching, and specifically to 2D text indexing, is proposed. A transformation of a 2D text into the form of a tree is presented. It preserves the context of each element of the 2D text. The tree can be linearised using the prefix notation into the form of a string (a linear text) and the pattern matching is performed in this text. Pushdown automata indexing the 2D text are constructed over the tree representation. They allow searching for 2D prefixes, 2D suffixes, and 2D factors of the 2D text in time proportional to the size of the representation of a 2D pattern. This result achieves properties analogous to the results obtained in tree pattern matching and string indexing.
28

Ul Hassan, Mohamed Manzoor. "A Robust Multi-Keyword Text Content Retrieval by Utilizing Hash Indexing". International Journal of Innovative Research in Computer Science & Technology 9, no. 2 (March 2021): 1–5. http://dx.doi.org/10.21276/ijircst.2021.9.2.1.

29

Chatterjee, Niladri, and Pramod Kumar Sahoo. "Random Indexing and Modified Random Indexing based approach for extractive text summarization". Computer Speech & Language 29, no. 1 (January 2015): 32–44. http://dx.doi.org/10.1016/j.csl.2014.07.001.

30

Sutcliffe, Glyn. "The indexing of biography as a special genre or as historically documented text". Indexer: The International Journal of Indexing 39, no. 2 (1 June 2021): 151–63. http://dx.doi.org/10.3828/indexer.2021.16.

Abstract:
The indexing of biography as a genre, per se, is reconsidered with respect to subject indexing in general and the histories of lives in particular. The index of a biography of the chess player Bobby Fischer is compared with the index of a history of chess at the height of the Cold War conflict in which Bobby Fischer was the central protagonist. Some received theory of the indexing of biographies is critiqued and challenged by practical comparisons. Indexing from a literary perspective is considered and contrasted with back-of-book indexing from an information retrieval standpoint.
31

Moreo Fernández, Alejandro, Andrea Esuli, and Fabrizio Sebastiani. "Lightweight Random Indexing for Polylingual Text Classification". Journal of Artificial Intelligence Research 57 (13 October 2016): 151–85. http://dx.doi.org/10.1613/jair.5194.

Abstract:
Multilingual Text Classification (MLTC) is a text classification task in which documents are written each in one among a set L of natural languages, and in which all documents must be classified under the same classification scheme, irrespective of language. There are two main variants of MLTC, namely Cross-Lingual Text Classification (CLTC) and Polylingual Text Classification (PLTC). In PLTC, which is the focus of this paper, we assume (differently from CLTC) that for each language in L there is a representative set of training documents; PLTC consists of improving the accuracy of each of the |L| monolingual classifiers by also leveraging the training documents written in the other (|L| − 1) languages. The obvious solution, consisting of generating a single polylingual classifier from the juxtaposed monolingual vector spaces, is usually infeasible, since the dimensionality of the resulting vector space is roughly |L| times that of a monolingual one, and is thus often unmanageable. As a response, the use of machine translation tools or multilingual dictionaries has been proposed. However, these resources are not always available, or are not always free to use. One machine-translation-free and dictionary-free method that, to the best of our knowledge, has never been applied to PLTC before, is Random Indexing (RI). We analyse RI in terms of space and time efficiency, and propose a particular configuration of it (that we dub Lightweight Random Indexing – LRI). By running experiments on two well known public benchmarks, Reuters RCV1/RCV2 (a comparable corpus) and JRC-Acquis (a parallel one), we show LRI to outperform (both in terms of effectiveness and efficiency) a number of previously proposed machine-translation-free and dictionary-free PLTC methods that we use as baselines.
32

Boubekeur, Fatiha, and Wassila Azzoug. "Concept-Based Indexing in Text Information Retrieval". International Journal of Computer Science and Information Technology 5, no. 1 (28 February 2013): 119–36. http://dx.doi.org/10.5121/ijcsit.2013.5110.

33

Navarro, Gonzalo. "Indexing text using the Ziv–Lempel trie". Journal of Discrete Algorithms 2, no. 1 (March 2004): 87–114. http://dx.doi.org/10.1016/s1570-8667(03)00066-2.

34

Byers, David. "Full-text indexing of non-textual resources". Computer Networks and ISDN Systems 30, no. 1-7 (April 1998): 141–48. http://dx.doi.org/10.1016/s0169-7552(98)00059-2.

35

Rachidi, Youssef. "Text Detection in Video for Video Indexing". International Journal of Computer Trends and Technology 68, no. 4 (25 April 2020): 96–99. http://dx.doi.org/10.14445/22312803/ijctt-v68i4p117.

36

Woodruff, Allison Gyle, and Christian Plaunt. "GIPSY: Automated geographic indexing of text documents". Journal of the American Society for Information Science 45, no. 9 (October 1994): 645–55. http://dx.doi.org/10.1002/(sici)1097-4571(199410)45:9<645::aid-asi2>3.0.co;2-8.

37

Weinberg, Bella Hass. "Challenges in indexing electronic text and images". Journal of the American Society for Information Science 45, no. 9 (October 1994): 718–23. http://dx.doi.org/10.1002/(sici)1097-4571(199410)45:9<718::aid-asi9>3.0.co;2-f.

38

Rahim, Robbi, Nuning Kurniasih, Muhammad Dedi Irawan, Yustria Handika Siregar, Abdurrozzaq Hasibuan, Deffi Ayu Puspito Sari, Tiarma Simanihuruk, et al. "Latent Semantic Indexing for Indonesian Text Similarity". International Journal of Engineering & Technology 7, no. 2.3 (8 March 2018): 73. http://dx.doi.org/10.14419/ijet.v7i2.3.12619.

Abstract:
A document is a written record that can be used as evidence of information. Plagiarism is a deliberate or unintentional act of obtaining or attempting to obtain credit or value for a scientific work by citing some or all of another party's scientific work without stating the source properly and adequately. The Latent Semantic Indexing method serves to find text in a document that matches other text. The algorithm used is TF/IDF, which multiplies the TF value of a term in a document by its IDF value, while the Vector Space Model (VSM) is a method for measuring the closeness or similarity of words by weighting terms.
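To make the TF/IDF weighting and vector-space similarity mentioned in this abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: it covers only TF-IDF weights and cosine similarity (no latent semantic step such as a matrix decomposition), it tokenizes by lowercasing and splitting on whitespace, and the names `tfidf_vectors` and `cosine` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, with weight = tf * idf."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


docs = ["the cat sat", "the cat ran", "dogs bark loudly"]
vecs = tfidf_vectors(docs)
# Documents sharing terms score above zero; disjoint documents score zero.
print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))  # True
```

A term that appears in every document gets an IDF of zero under this formula, so it contributes nothing to the similarity score.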
39

Villarroel, Miguel, Pablo de la Fuente, Alberto Pedrero, Jesús Vegas, and Joaquín Adiego. "Obtaining feedback for indexing from highlighted text". Electronic Library 20, no. 4 (August 2002): 306–13. http://dx.doi.org/10.1108/02640470210438919.

40

Svenonius, Elaine. "Challenges in Indexing Electronic Text and Images". Information Processing & Management 31, no. 2 (March 1995): 259–60. http://dx.doi.org/10.1016/0306-4573(95)80048-x.

41

Arroyuelo, Diego, Gonzalo Navarro, and Kunihiko Sadakane. "Stronger Lempel-Ziv Based Compressed Text Indexing". Algorithmica 62, no. 1-2 (8 September 2010): 54–101. http://dx.doi.org/10.1007/s00453-010-9443-8.

42

Mansour, Nashat, Ramzi A. Haraty, Walid Daher, and Manal Houri. "An auto-indexing method for Arabic text". Information Processing & Management 44, no. 4 (July 2008): 1538–45. http://dx.doi.org/10.1016/j.ipm.2007.12.007.

43

Esler, Sandra L., and Michael L. Nelson. "NASA indexing benchmarks: evaluating text search engines". Journal of Network and Computer Applications 20, no. 4 (October 1997): 339–53. http://dx.doi.org/10.1006/jnca.1997.0049.

44

Dai, Suyang, Ronghui You, Zhiyong Lu, Xiaodi Huang, Hiroshi Mamitsuka, and Shanfeng Zhu. "FullMeSH: improving large-scale MeSH indexing with full text". Bioinformatics 36, no. 5 (9 October 2019): 1533–41. http://dx.doi.org/10.1093/bioinformatics/btz756.

Abstract:
Motivation: With the rapidly growing biomedical literature, automatically indexing biomedical articles by Medical Subject Heading (MeSH), namely MeSH indexing, has become increasingly important for facilitating hypothesis generation and knowledge discovery. Over the past years, many large-scale MeSH indexing approaches have been proposed, such as Medical Text Indexer, MeSHLabeler, DeepMeSH and MeSHProbeNet. However, the performance of these methods is hampered by using limited information, i.e. only the title and abstract of biomedical articles. Results: We propose FullMeSH, a large-scale MeSH indexing method taking advantage of the recent increase in the availability of full text articles. Compared to DeepMeSH and other state-of-the-art methods, FullMeSH has three novelties: (i) Instead of using a full text as a whole, FullMeSH segments it into several sections with their normalized titles in order to distinguish their contributions to the overall performance. (ii) FullMeSH integrates the evidence from different sections in a ‘learning to rank’ framework by combining the sparse and deep semantic representations. (iii) FullMeSH trains an Attention-based Convolutional Neural Network for each section, which achieves better performance on infrequent MeSH headings. FullMeSH has been developed and empirically trained on the entire set of 1.4 million full-text articles in the PubMed Central Open Access subset. It achieved a Micro F-measure of 66.76% on a test set of 10 000 articles, which was 3.3% and 6.4% higher than DeepMeSH and MeSHLabeler, respectively. Furthermore, FullMeSH demonstrated an average improvement of 4.7% over DeepMeSH for indexing Check Tags, a set of most frequently indexed MeSH headings. Availability and implementation: The software is available upon request. Supplementary information: Supplementary data are available at Bioinformatics online.
45

Amir, Amihood, Ayelet Butman, and Ely Porat. "On the relationship between histogram indexing and block-mass indexing". Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 372, no. 2016 (28 May 2014): 20130132. http://dx.doi.org/10.1098/rsta.2013.0132.

Abstract:
Histogram indexing, also known as jumbled pattern indexing and permutation indexing, is one of the important current open problems in pattern matching. It was introduced about 6 years ago and has seen active research since. Yet, to date there is no algorithm that can preprocess a text T in time o(|T|^2/polylog|T|) and achieve histogram indexing, even over a binary alphabet, in time independent of the text length. The pattern matching version of this problem has a simple linear-time solution. Block-mass pattern matching problem is a recently introduced problem, motivated by issues in mass-spectrometry. It is also an example of a pattern matching problem that has an efficient, almost linear-time solution but whose indexing version is daunting. However, for fixed finite alphabets, there has been progress made. In this paper, a strong connection between the histogram indexing problem and the block-mass pattern indexing problem is shown. The reduction we show between the two problems is amazingly simple. Its value lies in recognizing the connection between these two apparently disparate problems, rather than the complexity of the reduction. In addition, we show that for both these problems, even over unbounded alphabets, there are algorithms that preprocess a text T in time o(|T|^2/polylog|T|) and enable answering indexing queries in time polynomial in the query length. The contributions of this paper are twofold: (i) we introduce the idea of allowing a trade-off between the preprocessing time and query time of various indexing problems that have been stumbling blocks in the literature. (ii) We take the first step in introducing a class of indexing problems that, we believe, cannot be pre-processed in time o(|T|^2/polylog|T|) and enable linear-time query processing.
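The abstract notes that the pattern matching (non-indexed) version of histogram matching has a simple linear-time solution. One common way to get it is a sliding window over character counts, sketched below in Python. This is an illustration rather than code from the paper; the function name `jumbled_matches` is hypothetical, and the linear bound assumes constant expected time for dictionary operations.

```python
def jumbled_matches(text, pattern):
    """Return start positions where a window of text is a permutation of pattern."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    diff = {}    # diff[c] = (count of c in window) - (count of c in pattern)
    nonzero = 0  # number of characters whose diff is currently non-zero

    def bump(ch, delta):
        nonlocal nonzero
        old = diff.get(ch, 0)
        new = old + delta
        diff[ch] = new
        if old == 0 and new != 0:
            nonzero += 1
        elif old != 0 and new == 0:
            nonzero -= 1

    for ch in pattern:
        bump(ch, -1)
    for ch in text[:m]:
        bump(ch, +1)
    hits = [0] if nonzero == 0 else []
    for i in range(m, len(text)):
        bump(text[i], +1)       # character entering the window
        bump(text[i - m], -1)   # character leaving the window
        if nonzero == 0:
            hits.append(i - m + 1)
    return hits


print(jumbled_matches("abcbacab", "abc"))  # [0, 2, 3, 5]
```

Each text position is touched twice, once entering and once leaving the window, which is where the linear running time comes from. The indexing version discussed in the abstract is harder because the pattern is not known at preprocessing time.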
46

Farrow, John. "All in the mind: concept analysis in indexing". Indexer: The International Journal of Indexing 19, no. 4 (1 October 1995): 243–47. http://dx.doi.org/10.3828/indexer.1995.19.4.2.

Abstract:
The indexing process consists of the comprehension of the document to be indexed, followed by the production of a set of index terms. Differences between academic indexing and back-of-book indexing are discussed. Text comprehension is a branch of human information processing, and it is argued that the model of text comprehension and production developed by van Dijk and Kintsch can form the basis for a cognitive process model of indexing. Strategies for testing such a model are suggested.
47

Ding, Yi, and Xian Fu. "Topical Concept Based Text Clustering Method". Advanced Materials Research 532-533 (June 2012): 939–43. http://dx.doi.org/10.4028/www.scientific.net/amr.532-533.939.

Abstract:
Text clustering typically involves clustering in a high dimensional space, which appears difficult with regard to virtually all practical settings. In addition, given a particular clustering result it is typically very hard to come up with a good explanation of why the text clusters have been constructed the way they are. To solve these problems, based on topic concept clustering, this paper proposes a method for Chinese document clustering. In this paper, we introduce a novel topical document clustering method called Document Features Indexing Clustering (DFIC), which can identify topics accurately and cluster documents according to these topics. In DFIC, “topic elements” are defined and extracted for indexing base clusters. Additionally, document features are investigated and exploited. Experimental results show that DFIC can gain a higher precision (92.76%) than some widely used traditional clustering methods.
48

Gupta, Shweta, Sunita Yadav, and Rajesh Prasad. "Document Retrieval using Efficient Indexing Techniques". International Journal of Business Analytics 3, no. 4 (October 2016): 64–82. http://dx.doi.org/10.4018/ijban.2016100104.

Abstract:
Document retrieval plays a crucial role in retrieving relevant documents. Relevancy depends upon the occurrences of query keywords in a document. Several documents include similar key terms and hence they need to be indexed. Most indexing techniques are based on either an inverted index or a full-text index. An inverted index creates lists and supports word-based pattern queries, while a full-text index handles queries comprising any sequence of characters rather than just words. Problems arise when text cannot be separated as words in some western languages. Also, there are difficulties in space used by compressed versions of full-text indexes. Recently, a data structure called the wavelet tree has become popular in text compression and indexing. It indexes words or characters of the text documents and helps in retrieving top ranked documents more efficiently. This paper presents a review of the most recent efficient indexing techniques used in document retrieval.
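The word-based nature of an inverted index, as contrasted with a full-text index in this abstract, can be sketched in a few lines of Python. This is a generic illustration, not a technique taken from the reviewed paper; the names `build_inverted_index` and `search_all` are hypothetical, and tokenization is assumed to be lowercasing plus whitespace splitting.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased word to the sorted list of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search_all(index, query):
    """Return ids of documents that contain every word of the query (AND semantics)."""
    words = query.lower().split()
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)


docs = ["wavelet tree index", "inverted index lists", "suffix tree"]
index = build_inverted_index(docs)
print(search_all(index, "tree index"))  # [0]
print(search_all(index, "ave"))         # [] because "ave" is not a whole word
```

The second query shows the limitation the abstract points to: a substring such as "ave" is invisible to a word-based index, which is the case a full-text index is meant to cover.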
49

Harish, B. S. "Text Document Classification: An Approach Based on Indexing". International Journal of Data Mining & Knowledge Management Process 2, no. 1 (31 January 2012): 43–62. http://dx.doi.org/10.5121/ijdkp.2012.2104.

50

Rose, Leonard. "Index Maker: Automatic indexing from word‐processor text". Electronic Library 5, no. 3 (March 1987): 140. http://dx.doi.org/10.1108/eb044744.
