Journal articles on the topic 'String algorithms'

Consult the top 50 journal articles for your research on the topic 'String algorithms.'


1

Bhagya Sri, Mukku, Rachita Bhavsar, and Preeti Narooka. "String Matching Algorithms." International Journal Of Engineering And Computer Science 7, no. 03 (March 23, 2018): 23769–72. http://dx.doi.org/10.18535/ijecs/v7i3.19.

Abstract:
To analyze the content of the documents, the various pattern matching algorithms are used to find all the occurrences of a limited set of patterns within an input text or input document. In order to perform this task, this research work used four existing string matching algorithms; they are Brute Force algorithm, Knuth-Morris-Pratt algorithm (KMP), Boyer Moore algorithm and Rabin Karp algorithm. This work also proposes three new string matching algorithms. They are Enhanced Boyer Moore algorithm, Enhanced Rabin Karp algorithm and Enhanced Knuth-Morris-Pratt algorithm. Findings: For experimentation, this work has used two types of documents, i.e. .txt and .docx. Performance measures used are search time, number of iterations and accuracy. From the experimental results, it is realized that the enhanced KMP algorithm gives better accuracy compared to other string matching algorithms. Application/Improvements: Normally, these algorithms are used in the field of text mining, document classification, content analysis and plagiarism detection. In future, these algorithms have to be enhanced to improve their performance and the various types of documents will be used for experimentation.
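As an illustration of one of the baseline algorithms named in this abstract, the following is a minimal sketch of textbook Knuth-Morris-Pratt matching. It is not the authors' enhanced variant, and the function names `prefix_function` and `kmp_search` are chosen here for illustration only.

```python
def prefix_function(pattern):
    """failure[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of pattern[:i+1]."""
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        # Fall back through shorter borders until the next character fits.
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    failure = prefix_function(pattern)
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = failure[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = failure[k - 1]  # keep going to find overlapping matches
    return matches
```

The text is scanned once and never re-read, which is what gives the algorithm its linear running time in the combined length of text and pattern.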
2

Zhang, Zhaoyang. "Review on String-Matching Algorithm." SHS Web of Conferences 144 (2022): 03018. http://dx.doi.org/10.1051/shsconf/202214403018.

Abstract:
String matching is one of the most researched problems in computer science and has become an important factor in many technologies. The field aims to find a desired sequence of characters in complex data content using the least time and resources. The most classical and famous string-search algorithms are the Knuth-Morris-Pratt (KMP) algorithm and the Boyer-Moore (BM) algorithm. These two algorithms provide efficient heuristic jump rules based on prefixes or suffixes. The Bitap algorithm was the first to introduce bit-parallelism into the string-matching field. The Backward Non-Deterministic DAWG Matching (BNDM) algorithm is a modern practical algorithm that is an outstanding combination of theoretical research and practical application. These algorithms play a guiding role in future research on string-search algorithms that aims to improve average performance and reduce resource consumption.
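The bit-parallel idea mentioned in this abstract can be sketched with the Shift-And formulation of Bitap for exact matching. This is a minimal illustration rather than code from the reviewed paper; the name `shift_and_search` is hypothetical, and Python's arbitrary-precision integers stand in for a machine word.

```python
def shift_and_search(text, pattern):
    """Exact matching with the Shift-And (Bitap) bit-parallel technique."""
    m = len(pattern)
    if m == 0:
        return []
    # masks[ch] has bit i set when pattern[i] == ch.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    accept = 1 << (m - 1)  # bit that signals a full match
    state = 0              # bit i set: pattern[:i+1] matches a suffix of the text read so far
    matches = []
    for j, ch in enumerate(text):
        # Extend every partial match by one character in a single step.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            matches.append(j - m + 1)
    return matches
```

Each text character costs one shift, one OR and one AND, regardless of how many partial matches are alive, which is where the speed-up for short patterns comes from.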
3

Latorre, Omar. "Exact and kernelization algorithms for Closet String." Selecciones Matemáticas 7, no. 2 (December 30, 2020): 257–66. http://dx.doi.org/10.17268/sel.mat.2020.02.08.

4

Khadiev, Kamil, Artem Ilikaev, and Jevgenijs Vihrovs. "Quantum Algorithms for Some Strings Problems Based on Quantum String Comparator." Mathematics 10, no. 3 (January 26, 2022): 377. http://dx.doi.org/10.3390/math10030377.

Abstract:
We study algorithms for solving three problems on strings. These are sorting of n strings of length k, “the Most Frequent String Search Problem”, and “searching intersection of two sequences of strings”. We construct quantum algorithms that are faster than classical (randomized or deterministic) counterparts for each of these problems. The quantum algorithms are based on the quantum procedure for comparing two strings of length k in O(√k) queries. The first problem is sorting n strings of length k. We show that the classical complexity of the problem is Θ(nk) for a constant-size alphabet, but our quantum algorithm has Õ(n√k) complexity. The second one is searching for the most frequent string among n strings of length k. We show that the classical complexity of the problem is Θ(nk), but our quantum algorithm has Õ(n√k) complexity. The third problem is searching for an intersection of two sequences of strings. All strings have the same length k. The size of the first set is n, and the size of the second set is m. We show that the classical complexity of the problem is Θ((n+m)k), but our quantum algorithm has Õ((n+m)√k) complexity.
5

Franek, Frantisek, and Michael Liut. "Computing Maximal Lyndon Substrings of a String." Algorithms 13, no. 11 (November 12, 2020): 294. http://dx.doi.org/10.3390/a13110294.

Abstract:
There are two reasons to have an efficient algorithm for identifying all right-maximal Lyndon substrings of a string: firstly, Bannai et al. introduced in 2015 a linear algorithm to compute all runs of a string that relies on knowing all right-maximal Lyndon substrings of the input string, and secondly, Franek et al. showed in 2017 a linear equivalence of sorting suffixes and sorting right-maximal Lyndon substrings of a string, inspired by a novel suffix sorting algorithm of Baier. In 2016, Franek et al. presented a brief overview of algorithms for computing the Lyndon array that encodes the knowledge of right-maximal Lyndon substrings of the input string. Among those presented were two well-known algorithms for computing the Lyndon array: a quadratic in-place algorithm based on the iterated Duval algorithm for Lyndon factorization and a linear algorithmic scheme based on linear suffix sorting, computing the inverse suffix array, and applying to it the next smaller value algorithm. Duval’s algorithm works for strings over any ordered alphabet, while for linear suffix sorting, a constant or an integer alphabet is required. The authors at that time were not aware of Baier’s algorithm. In 2017, our research group proposed a novel algorithm for the Lyndon array. Though the proposed algorithm is linear in the average case and has O(nlog(n)) worst-case complexity, it is interesting as it emulates the fast Fourier algorithm’s recursive approach and introduces τ-reduction, which might be of independent interest. In 2018, we presented a linear algorithm to compute the Lyndon array of a string inspired by Phase I of Baier’s algorithm for suffix sorting. This paper presents the theoretical analysis of these two algorithms and provides empirical comparisons of both of their C++ implementations with respect to the iterated Duval algorithm.
6

Jantan, Hamidah, and Nurul Aisyiah Baharudin. "Mobile-Based Word Matching Detection using Intelligent Predictive Algorithm." International Journal of Interactive Mobile Technologies (iJIM) 13, no. 09 (September 5, 2019): 140. http://dx.doi.org/10.3991/ijim.v13i09.10848.

Abstract:
Word matching is a string searching technique for information retrieval in Natural Language Processing (NLP). Several algorithms have been used for string search and matching, such as Knuth-Morris-Pratt, Boyer-Moore, Horspool, Intelligent Predictive and many others. However, some issues need to be considered in measuring the performance of the algorithms, such as the efficiency of searching over small alphabets, the time taken in processing the pattern of the text, and the extra space needed to support a huge table or state machines. The Intelligent Predictive (IP) algorithm is capable of solving several word matching issues discovered in other string searching algorithms, especially through its ability to skip the pre-processing of the pattern, its use of simple rules during the matching process, and its avoidance of complex computations. For those reasons, the IP algorithm is used in this study. This article aims to apply the IP algorithm together with an Optical Character Recognition (OCR) tool for mobile-based word matching detection. The study consists of four phases: data preparation, mobile-based system design, algorithm implementation and result analysis. The efficiency of the proposed algorithm was evaluated based on the execution time of the searching process among the selected algorithms. The results show that the IP algorithm is more efficient in execution time than a well-known algorithm, namely the Boyer-Moore algorithm. In future work, the performance of the string searching process can be enhanced by using other suitable optimization techniques such as Genetic Algorithm, Particle Swarm Optimization, Ant Colony Optimization and many others.
7

Baeza-Yates, R. A. "Algorithms for string searching." ACM SIGIR Forum 23, no. 3-4 (April 1989): 34–58. http://dx.doi.org/10.1145/74697.74700.

8

Evans, D. J., and S. Ghanemi. "Parallel String Matching Algorithms." Kybernetes 17, no. 3 (March 1988): 32–44. http://dx.doi.org/10.1108/eb005791.

9

Nazeen, Sumaiya, M. Sohel Rahman, and Rezwana Reaz. "Indeterminate string inference algorithms." Journal of Discrete Algorithms 10 (January 2012): 23–34. http://dx.doi.org/10.1016/j.jda.2011.08.002.

10

Parshukova, N. B. "CRYPTOGRAPHIC ALGORITHMS IN SPREADSHEETS." Informatics in school, no. 8 (November 9, 2019): 51–55. http://dx.doi.org/10.32517/2221-1993-2019-18-8-51-55.

Abstract:
The article examines three well-known cryptographic algorithms (Scytale, Caesar's cipher and Vigenère's cipher) and the method of their implementation using spreadsheets. String functions such as calculating string length, finding the position of a substring in a string, selecting a substring and concatenation are considered.
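The article works in spreadsheets, but the same string primitives (length, position lookup, character selection, concatenation) can be sketched in Python. The following is a hypothetical illustration of the Caesar and Vigenère ciphers over uppercase Latin letters; the function names and the restriction to `A`-`Z` are assumptions made here, not details from the article.

```python
import string

ALPHABET = string.ascii_uppercase  # assumed alphabet: A-Z only


def caesar(text, shift):
    """Shift every letter of ALPHABET by `shift` positions; leave other characters unchanged."""
    out = []
    for ch in text:
        pos = ALPHABET.find(ch)  # position lookup, -1 if not a letter
        out.append(ch if pos == -1 else ALPHABET[(pos + shift) % len(ALPHABET)])
    return "".join(out)          # concatenation


def vigenere(text, key, decrypt=False):
    """Vigenere cipher: a Caesar shift whose amount cycles through the key letters."""
    out = []
    used = 0  # number of key letters consumed so far
    for ch in text:
        pos = ALPHABET.find(ch)
        if pos == -1:
            out.append(ch)
            continue
        shift = ALPHABET.find(key[used % len(key)])
        if decrypt:
            shift = -shift
        out.append(ALPHABET[(pos + shift) % len(ALPHABET)])
        used += 1
    return "".join(out)
```

For example, `caesar("HELLO", 3)` gives `"KHOOR"`, and calling `vigenere` again with `decrypt=True` and the same key restores the plaintext.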
11

Russo, Luıs, and Alexandre Francisco. "Small Longest Tandem Scattered Subsequences." Scientific Annals of Computer Science 31, no. 1 (August 9, 2021): 79–110. http://dx.doi.org/10.7561/sacs.2021.1.79.

Abstract:
We consider the problem of identifying tandem scattered subsequences within a string. Our algorithm identifies a longest subsequence which occurs twice without overlap in a string. This algorithm is based on the Hunt-Szymanski algorithm, therefore its performance improves if the string is not self similar, which occurs naturally on strings over large alphabets. Our algorithm relies on new results for data structures that support dynamic longest increasing sub-sequences. In the process we also obtain improved algorithms for the decremental string comparison problem.
12

BERGERON, ANNE, and SYLVIE HAMEL. "VECTOR ALGORITHMS FOR APPROXIMATE STRING MATCHING." International Journal of Foundations of Computer Science 13, no. 01 (February 2002): 53–65. http://dx.doi.org/10.1142/s0129054102000947.

Abstract:
Vector algorithms allow the computation of an output vector r = r1 r2 ⋯ rm given an input vector e = e1 e2 ⋯ em in a bounded number of operations, independent of m, the length of the vectors. The allowable operations are usually restricted to bit-wise operations available in processors, including shifts and binary addition with carry. These restrictions imply that the existence of a vector algorithm for a particular problem opens the way to extremely fast implementations, using the inherent parallelism of bit-wise operations. This paper presents general results on the existence and construction of vector algorithms, with a particular focus on problems arising from computational biology. We show that efficient vector algorithms exist for the problem of approximate string matching with arbitrary weighted distances, generalizing a previous result by G. Myers. We also characterize a class of automata for which vector algorithms can be automatically derived from the transition table of the automata.
13

Lasch, Robert, Ismail Oukid, Roman Dementiev, Norman May, Suleyman S. Demirsoy, and Kai-Uwe Sattler. "Faster & strong: string dictionary compression using sampling and fast vectorized decompression." VLDB Journal 29, no. 6 (July 20, 2020): 1263–85. http://dx.doi.org/10.1007/s00778-020-00620-x.

Abstract:
String dictionaries constitute a large portion of the memory footprint of database applications. While strong string dictionary compression algorithms exist, these come with impractical access and compression times. Therefore, lightweight algorithms such as front coding (PFC) are favored in practice. This paper endeavors to make strong string dictionary compression practical. We focus on Re-Pair Front Coding (RPFC), a grammar-based compression algorithm, since it consistently offers better compression ratios than other algorithms in the literature. To accelerate compression times, we propose block-based RPFC (BRPFC), which consists in independently compressing small blocks of the dictionary. For further accelerated compression times, especially on large string dictionaries, we also propose an alternative version of BRPFC that uses sampling to speed up compression. Moreover, to accelerate access times, we devise a vectorized access method using Intel® Advanced Vector Extensions 512 (Intel® AVX-512). Our experimental evaluation shows that sampled BRPFC offers compression times up to 190× faster than RPFC, and random string lookups 2.3× faster than RPFC on average. These results move our modified RPFC into a practical range for use in database systems because the overhead of Re-Pair-based compression for access times can be reduced by 2×.
14

Chung, K. L. "Fast string matching algorithms for run-length coded strings." Computing 54, no. 2 (June 1995): 119–25. http://dx.doi.org/10.1007/bf02238127.

15

Bernardini, Giulia, Huiping Chen, Alessio Conte, Roberto Grossi, Grigorios Loukides, Nadia Pisanti, Solon P. Pissis, Giovanna Rosone, and Michelle Sweering. "Combinatorial Algorithms for String Sanitization." ACM Transactions on Knowledge Discovery from Data 15, no. 1 (January 6, 2021): 1–34. http://dx.doi.org/10.1145/3418683.

16

Ukkonen, Esko. "Algorithms for approximate string matching." Information and Control 64, no. 1-3 (January 1985): 100–118. http://dx.doi.org/10.1016/s0019-9958(85)80046-2.

17

Lecroq, Thierry. "Fast exact string matching algorithms." Information Processing Letters 102, no. 6 (June 2007): 229–35. http://dx.doi.org/10.1016/j.ipl.2007.01.002.

18

Ghuman, Sukhpal, Emanuele Giaquinta, and Jorma Tarhio. "Lyndon Factorization Algorithms for Small Alphabets and Run-Length Encoded Strings." Algorithms 12, no. 6 (June 21, 2019): 124. http://dx.doi.org/10.3390/a12060124.

Abstract:
We present two modifications of Duval’s algorithm for computing the Lyndon factorization of a string. One of the algorithms has been designed for strings containing runs of the smallest character. It works best for small alphabets and it is able to skip a significant number of characters of the string. Moreover, it can be engineered to have linear time complexity in the worst case. When there is a run-length encoded string R of length ρ, the other algorithm computes the Lyndon factorization of R in O(ρ) time and in constant space. It is shown by experimental results that the new variations are faster than Duval’s original algorithm in many scenarios.
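For orientation, the baseline that both modifications start from, Duval's original algorithm, can be sketched as follows. This is the standard textbook version, not either of the paper's variants, and the function name `lyndon_factorization` is chosen here for illustration.

```python
def lyndon_factorization(s):
    """Duval's algorithm: split s into a non-increasing sequence of Lyndon words.

    Runs in linear time and uses constant extra space besides the output list.
    """
    n = len(s)
    i = 0
    factors = []
    while i < n:
        j, k = i + 1, i
        # Extend the current run of repeated Lyndon words as far as possible.
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i       # still one longer Lyndon word: restart comparison
            else:
                k += 1      # periodic continuation of the same word
            j += 1
        # Emit each full copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors
```

For example, `lyndon_factorization("banana")` yields `["b", "an", "an", "a"]`, a non-increasing sequence of Lyndon words whose concatenation is the input.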
19

CHRISTOU, MICHALIS, MAXIME CROCHEMORE, and COSTAS S. ILIOPOULOS. "IDENTIFYING ALL ABELIAN PERIODS OF A STRING IN QUADRATIC TIME AND RELEVANT PROBLEMS." International Journal of Foundations of Computer Science 23, no. 06 (September 2012): 1371–84. http://dx.doi.org/10.1142/s0129054112500190.

Abstract:
Abelian periodicity of strings has been studied extensively over the last years. In 2006 Constantinescu and Ilie defined the abelian period of a string and several algorithms for the computation of all abelian periods of a string were given. In contrast to the classical period of a word, its abelian version is more flexible: factors of the word are considered the same under any internal permutation of their letters. We show two O(|y|^2) algorithms for the computation of all abelian periods of a string y. The first one maps each letter to a suitable number such that each factor of the string can be identified by the unique sum of the numbers corresponding to its letters and hence abelian periods can be identified easily. The other one maps each letter to a prime number such that each factor of the string can be identified by the unique product of the numbers corresponding to its letters and so abelian periods can be identified easily. We also define weak abelian periods on strings and give an O(|y|log(|y|)) algorithm for their computation, together with some other algorithms for more basic problems.
20

Markić, Ivan, Maja Štula, Marija Zorić, and Darko Stipaničev. "Entropy-Based Approach in Selection Exact String-Matching Algorithms." Entropy 23, no. 1 (December 28, 2020): 31. http://dx.doi.org/10.3390/e23010031.

Abstract:
The string-matching paradigm is applied in every branch of computer science and in science in general. The existence of a plethora of string-matching algorithms makes it hard to choose the best one for any particular case. Expressing, measuring, and testing algorithm efficiency is a challenging task with many potential pitfalls. Algorithm efficiency can be measured based on the usage of different resources. In software engineering, algorithmic productivity is a property of an algorithm execution identified with the computational resources the algorithm consumes. Resource usage in algorithm execution can be determined, and for maximum efficiency, the goal is to minimize resource usage. Guided by the fact that standard measures of algorithm efficiency, such as execution time, directly depend on the number of executed actions, and without touching on the problems of computer power consumption or memory, which also depend on the algorithm type and the techniques used in algorithm development, we have developed a methodology that enables researchers to choose an efficient algorithm for a specific domain. The efficiency of string searching algorithms is usually observed independently of the domain texts being searched. This research paper aims to present the idea that algorithm efficiency depends on the properties of the searched string and the properties of the texts being searched, accompanied by a theoretical analysis of the proposed approach. In the proposed methodology, algorithm efficiency is expressed through character comparison count metrics. The character comparison count metric is a formal quantitative measure independent of algorithm implementation subtleties and computer platform differences. The model is developed for a particular problem domain by using appropriate domain data (patterns and texts) and provides, for a specific domain, a ranking of algorithms according to the patterns’ entropy. The proposed approach is limited to on-line exact string-matching problems based on information entropy for a search pattern. Meticulous empirical testing depicts the methodology implementation and purports soundness of the methodology.
21

Al-Ssulami, Abdulrakeeb M., Hassan Mathkour, and Mohammed Amer Arafah. "Efficient String Matching Algorithm for Searching Large DNA and Binary Texts." International Journal on Semantic Web and Information Systems 13, no. 4 (October 2017): 198–220. http://dx.doi.org/10.4018/ijswis.2017100110.

Abstract:
Exact string matching is essential in application areas such as bioinformatics and intrusion detection systems. Speeding up the string matching algorithm will therefore accelerate the searching process in DNA and binary data. Two types of fast algorithms already exist: bit-parallel algorithms and hashing algorithms. Bit-parallel algorithms are efficient when dealing with short patterns (length less than 64) but slow on long patterns. Hashing algorithms, on the other hand, have optimal sublinear average-case performance on large alphabets and long patterns, but are less efficient on small alphabets such as DNA and binary texts. In this paper, the authors present a hybrid algorithm to overcome the shortcomings of those previous algorithms. The proposed algorithm is based on q-gram hashing while guaranteeing the maximal shift in advance. Experimental results on random data and the complete human genome confirm that the proposed algorithm is efficient on various pattern lengths and small alphabets.
22

Fadlil, Abdul, Sunardi Sunardi, and Rezki Ramdhani. "Similarity Identification Based on Word Trigrams Using Exact String Matching Algorithms." INTENSIF: Jurnal Ilmiah Penelitian dan Penerapan Teknologi Sistem Informasi 6, no. 2 (August 13, 2022): 253–70. http://dx.doi.org/10.29407/intensif.v6i2.18141.

Abstract:
Several studies regarding excellent exact string matching algorithms can be used to identify similarity, including the Rabin-Karp, Winnowing, and Horspool Boyer-Moore algorithms. In determining similarities, the Rabin-Karp and Winnowing algorithms use fingerprints, while the Horspool Boyer-Moore algorithm uses a bad-character table. However, previous research focused on identifying similarities using these algorithms based on character n-gram. In contrast, identification based on the word n-gram to determine the similarity based on its linguistic meaning, especially for longer strings, had not been covered yet. Therefore, a word-level trigram was proposed to identify similarities based on the word trigrams using the three algorithms and compare each performance. Based on precision, recall, and running time comparison, the Rabin-Karp algorithm results were 100%, 100%, and 0.19 ms, respectively; the Winnowing algorithm results with the smallest window were 100%, 56%, and 0.18 ms, respectively; and the Horspool algorithm results were 100%, 100%, and 0.06 ms. From these results, it can be concluded that the performance of the Horspool Boyer-Moore algorithm is better in terms of precision, recall, and running time.
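The bad-character table that this abstract attributes to the Horspool Boyer-Moore algorithm can be sketched as below. This is a generic textbook Horspool search, not the authors' implementation; the name `horspool_search` is hypothetical. Because it only relies on slicing and equality, the same function accepts either a character string or a list of words, which is one plausible way to apply it at word level.

```python
def horspool_search(text, pattern):
    """Horspool search over any sequence (a str of characters or a list of words)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Bad-character table: distance from the rightmost occurrence of each symbol
    # (excluding the last pattern position) to the end of the pattern.
    shift = {sym: m - 1 - i for i, sym in enumerate(pattern[:-1])}
    matches = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        # Slide by the table entry of the symbol aligned with the pattern's end;
        # symbols absent from the table allow a full shift of m.
        pos += shift.get(text[pos + m - 1], m)
    return matches
```

A word-level call would look like `horspool_search("the quick brown fox jumps".split(), ["quick", "brown", "fox"])`, which reports the trigram at word index 1.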
23

Na, Joong Chae, Sukhyeun Cho, Siwon Choi, Jin Wook Kim, Kunsoo Park, and Jeong Seop Sim. "A new graph model and algorithms for consistent superstring problems." Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 372, no. 2016 (May 28, 2014): 20130134. http://dx.doi.org/10.1098/rsta.2013.0134.

Abstract:
Problems related to string inclusion and non-inclusion have been vigorously studied in diverse fields such as data compression, molecular biology and computer security. Given a finite set of positive strings P and a finite set of negative strings N, a string α is a consistent superstring if every positive string is a substring of α and no negative string is a substring of α. The shortest (resp. longest) consistent superstring problem is to find a string α that is the shortest (resp. longest) among all the consistent superstrings for the given sets of strings. In this paper, we first propose a new graph model for consistent superstrings for given P and N. In our graph model, the set of strings represented by paths satisfying some conditions is the same as the set of consistent superstrings for P and N. We also present algorithms for the shortest and the longest consistent superstring problems. Our algorithms solve the consistent superstring problems for all cases, including cases that are not considered in previous work. Moreover, our algorithms solve in polynomial time the consistent superstring problems for more cases than the previous algorithms. For the polynomially solvable cases, our algorithms are more efficient than the previous ones.
24

NGASSAM, ERNEST KETCHA, DERRICK G. KOURIE, and BRUCE W. WATSON. "ON IMPLEMENTATION AND PERFORMANCE OF TABLE-DRIVEN DFA-BASED STRING PROCESSORS." International Journal of Foundations of Computer Science 19, no. 01 (February 2008): 53–70. http://dx.doi.org/10.1142/s012905410800553x.

Abstract:
Table-driven (TD) DFA-based string processing algorithms are examined from a number of vantage points. Firstly, various strategies for implementing such algorithms in a cache-efficient manner are identified. The denotational semantics of such algorithms is encapsulated in a function whose various arguments are associated with each implementation strategy. This formal view of the implementation strategies suggests twelve different algorithms, each blending together the implementation strategies in a particular way. The performance of these algorithms is examined against a set of artificially generated data. Results indicate a number of cases where the new algorithms outperform the traditional TD algorithm.
25

Khadiev, Kamil, and Vladislav Remidovskii. "Classical and Quantum Algorithms for Assembling a Text from a Dictionary." Nonlinear Phenomena in Complex Systems 24, no. 3 (October 12, 2021): 207–21. http://dx.doi.org/10.33581/1561-4085-2021-24-3-207-221.

Abstract:
We study algorithms for solving the problem of assembling a text (long string) from a dictionary (a sequence of small strings). The problem has an application in bioinformatics and has a connection with the sequence assembly method for reconstructing a long deoxyribonucleic-acid (DNA) sequence from small fragments. The problem is assembling a string t of length n from strings s_1, ..., s_m. Firstly, we provide a classical (randomized) algorithm with running time Õ(nL^0.5 + L), where L is the sum of the lengths of s_1, ..., s_m. Secondly, we provide a quantum algorithm with running time Õ(nL^0.25 + √(mL)). Thirdly, we show the lower bound for a classical (randomized or deterministic) algorithm, which is Ω(n+L). So, we obtain the quadratic quantum speed-up with respect to the parameter L, and our quantum algorithm has a smaller running time compared to any classical (randomized or deterministic) algorithm in the case of non-constant length of strings in the dictionary.
26

Leonardo, Brinardi, and Seng Hansun. "Text Documents Plagiarism Detection using Rabin-Karp and Jaro-Winkler Distance Algorithms." Indonesian Journal of Electrical Engineering and Computer Science 5, no. 2 (February 1, 2017): 462. http://dx.doi.org/10.11591/ijeecs.v5.i2.pp462-471.

Abstract:
Plagiarism is an act that universities consider fraud: taking someone else's ideas or writings without citing the source and claiming them as one's own. Plagiarism detection systems generally implement a string matching algorithm to search for common words between text documents. Several algorithms are used for string matching; two of them are the Rabin-Karp and Jaro-Winkler Distance algorithms. The Rabin-Karp algorithm is well suited to the problem of multiple string patterns, while the Jaro-Winkler Distance algorithm has advantages in terms of time. A plagiarism detection application was developed and tested on different types of documents, i.e. doc, docx, pdf and txt. The experimental results show that both algorithms can be used to perform plagiarism detection on those documents, but the Rabin-Karp algorithm is much more effective and faster when detecting documents larger than 1000 KB.
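The rolling-hash idea behind Rabin-Karp can be sketched for a single pattern as follows. This is a generic illustration, not the application described in the abstract; the function name and the `base` and `modulus` values are assumptions chosen here.

```python
def rabin_karp_search(text, pattern, base=256, modulus=1_000_000_007):
    """Rabin-Karp search: compare rolling hashes, verify candidates by direct comparison."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, modulus)  # weight of the window's leading character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % modulus
        t_hash = (t_hash * base + ord(text[i])) % modulus
    matches = []
    for pos in range(n - m + 1):
        # Equal hashes may still be a collision, so confirm with a real comparison.
        if p_hash == t_hash and text[pos:pos + m] == pattern:
            matches.append(pos)
        if pos < n - m:
            # Roll the window: drop text[pos], append text[pos + m].
            t_hash = ((t_hash - ord(text[pos]) * high) * base + ord(text[pos + m])) % modulus
    return matches
```

Updating the hash costs constant time per position, so several patterns of the same length can be checked against one pass over the text by keeping their hashes in a set.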
27

Nadarajan, Krishnaveny, and Zuriati Ahmad Zukarnain. "Analysis of String Matching Compression Algorithms." Journal of Computer Science 4, no. 3 (March 1, 2008): 205–10. http://dx.doi.org/10.3844/jcssp.2008.205.210.

28

Galil, Zvi. "Optimal parallel algorithms for string matching." Information and Control 67, no. 1-3 (October 1985): 144–57. http://dx.doi.org/10.1016/s0019-9958(85)80031-0.

29

Baeza-Yates, Ricardo A., and Luis O. Fuentes. "A framework to animate string algorithms." Information Processing Letters 59, no. 5 (September 1996): 241–44. http://dx.doi.org/10.1016/0020-0190(96)00117-2.

30

Crochemore, M., A. Czumaj, L. Gasieniec, S. Jarominek, T. Lecroq, W. Plandowski, and W. Rytter. "Speeding up two string-matching algorithms." Algorithmica 12, no. 4-5 (November 1994): 247–67. http://dx.doi.org/10.1007/bf01185427.

31

Raskhodnikova, Sofya, Dana Ron, Ronitt Rubinfeld, and Adam Smith. "Sublinear Algorithms for Approximating String Compressibility." Algorithmica 65, no. 3 (February 22, 2012): 685–709. http://dx.doi.org/10.1007/s00453-012-9618-6.

32

Oliveira, R. M., E. S. Helou, and E. F. Costa. "String-averaging incremental stochastic subgradient algorithms." Optimization Methods and Software 34, no. 3 (July 23, 2018): 665–92. http://dx.doi.org/10.1080/10556788.2018.1496432.

33

Lecroq, Thierry. "Experimental results on string matching algorithms." Software: Practice and Experience 25, no. 7 (July 1995): 727–65. http://dx.doi.org/10.1002/spe.4380250703.

34

Jiang, Ya Ping, Yue Xia Tian, and Jun Wei Zhao. "An Improved BMQ Algorithm for Pattern Matching." Advanced Materials Research 998-999 (July 2014): 814–17. http://dx.doi.org/10.4028/www.scientific.net/amr.998-999.814.

Abstract:
Pattern matching algorithms are widely used. They play an important role in information retrieval, data mining, intrusion detection and other fields. Among them, the BM algorithm is the most common. A new improved algorithm, the BMQ algorithm, is proposed on the basis of BM and related algorithms. The improved algorithm makes use of the uniqueness and combination of the last character and the next character of the string to increase the probability of the maximum right shift. Theoretical analysis and experimental comparison show that BMQ is better than the BM algorithms in the process of string matching and string searching. To further verify its effectiveness, the improved algorithm is introduced into an intrusion detection system; the experimental results show that the BMQ algorithm improves the efficiency of intrusion detection.
35

Baturu, Charles, and Naufal abdi. "Brute Force Algorithm Implementation Of Dictionary Search." Jurnal Info Sains : Informatika dan Sains 10, no. 1 (March 1, 2020): 24–30. http://dx.doi.org/10.54209/infosains.v10i1.29.

Abstract:
The more words a printed dictionary holds, the heavier it becomes. Dictionaries in book form are difficult to carry because they are thick and heavy. With this mobile-based computer dictionary application, users no longer need to carry heavy dictionaries, and searching is faster and easier. The application is written in the Java programming language using the NetBeans 7.0 editor and an RMS (Record Management System) based database, so it can be installed on the various types of mobile phones that support Java. String matching is one of the techniques used to speed up the search for the desired word. One frequently used string matching algorithm, which matches strings by comparing text data for equality, is Brute Force, which can be used to perform string or text searches. The brute force algorithm matches a pattern against the text at every position between 0 and n-m to find occurrences of the pattern in the text.
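
The brute force scan over positions 0 to n-m described above can be sketched as follows. This is a minimal Python illustration, not the Java implementation the article describes, and the function name `brute_force_search` is hypothetical.

```python
def brute_force_search(text: str, pattern: str) -> list[int]:
    """Return every index i in 0..n-m at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    matches = []
    # Try each alignment of the pattern against the text, from 0 to n - m.
    for i in range(n - m + 1):
        j = 0
        # Compare character by character until a mismatch or a full match.
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:
            matches.append(i)
    return matches


print(brute_force_search("abracadabra", "abra"))  # prints [0, 7]
```

In the worst case the scan performs on the order of n times m character comparisons, which is the cost that KMP and Boyer-Moore style algorithms are designed to reduce.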
36

Mazurenko, A. V., and N. V. Boldyrikhin. "Accelerated preprocessing in task of searching substrings in a string." Vestnik of Don State Technical University 19, no. 3 (October 4, 2019): 290–300. http://dx.doi.org/10.23947/1992-5980-2019-19-3-290-300.

Abstract:
Introduction. The rapid development of systems such as Yandex and Google has made the task of searching for substrings in a string highly relevant, and approaches to its solution are actively investigated. The task arises in building database management systems that support associative search. It is also applicable to information security problems and to antivirus programs, where substring search algorithms are used in signature-based detection. Materials and Methods. The solution is based on the Aho-Corasick algorithm, a typical technique for searching for substrings in a string, combined with a new approach to preprocessing. Research Results. The possibility of constructing the transition function and suffix references through suffix arrays and special mappings is shown. The relationship between the prefix tree and suffix arrays was investigated, which led to a fundamentally new method of constructing the transition and error functions. The results make it possible to substantially shorten the time spent on preprocessing a set of pattern strings when an integer alphabet is used. The paper lists eight algorithms. The developed algorithms are evaluated, and the results are compared to previously known ones. Two theorems and eight lemmas are proved. Two examples illustrating the practical application of the developed preprocessing procedure are given. Discussion and Conclusions. The preprocessing procedure proposed in this paper is based on the connection between the suffix array built from a set of pattern strings and the construction of the transition and error functions at the initial stages of the Aho-Corasick algorithm. This approach differs from the traditional one and requires algorithms that build a suffix array in linear time. Thus, the paper describes algorithms that, for a certain type of alphabet, significantly reduce the time for preprocessing a set of pattern strings in comparison to the known approach proposed in the Aho-Corasick algorithm. The research results can be used in antivirus programs that search for signatures of malicious data objects in the memory of a computer system. In addition, this approach to substring search will significantly speed up database management systems that use associative search.
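
For context, the traditional Aho-Corasick preprocessing that the paper sets out to accelerate can be sketched as below. This is a minimal sketch of the classic trie plus breadth-first failure-link construction, not the suffix-array-based method the abstract proposes. The names `build_automaton` and `search` are hypothetical.

```python
from collections import deque


def build_automaton(patterns):
    """Build the trie (transition function), failure links, and outputs."""
    goto = [{}]   # goto[s] maps a character to the next state; 0 is the root
    fail = [0]    # fail[s] is the state to fall back to on a mismatch
    out = [[]]    # out[s] lists the patterns recognized on reaching state s
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    # Breadth-first traversal: shallower states are finished before deeper ones.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            # Inherit the patterns that end at the failure state.
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out


def search(text, patterns):
    """Return (start_index, pattern) for every occurrence in text."""
    goto, fail, out = build_automaton(patterns)
    s = 0
    hits = []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits


print(search("ushers", ["he", "she", "his", "hers"]))
# prints [(1, 'she'), (2, 'he'), (2, 'hers')]
```

The `build_automaton` step is the preprocessing stage. The search itself then makes a single pass over the text regardless of how many patterns are in the set.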
37

Faro, Simone, and Thierry Lecroq. "Efficient Variants of the Backward-Oracle-Matching Algorithm." International Journal of Foundations of Computer Science 20, no. 06 (December 2009): 967–84. http://dx.doi.org/10.1142/s0129054109006991.

Abstract:
In this article we present two variants of the BOM string matching algorithm which are more efficient and flexible than the original algorithm. We also present bit-parallel versions of them, obtaining an efficient variant of the BNDM algorithm. We then compare the newly presented algorithms with some of the most recent and effective string matching algorithms. It turns out that the newly proposed variants are very flexible and achieve very good results, especially in the case of large alphabets.
38

Znamenskij, Sergej Vital'evich. "Stable assessment of the quality of similarity algorithms of character strings and their normalizations." Program Systems: Theory and Applications 9, no. 4 (December 28, 2018): 561–78. http://dx.doi.org/10.25209/2079-3316-2018-9-4-561-578.

Abstract:
Choosing tools to search for hidden commonality in data of a new nature requires stable and reproducible comparative assessments of the quality of abstract algorithms for measuring the proximity of symbol strings. Conventional estimates based on artificially generated or manually labeled tests vary significantly, evaluating the method of artificial generation rather than the similarity algorithms themselves, while estimates based on user data cannot be accurately reproduced. A simple, transparent, objective, and reproducible numerical quality assessment of a string metric is proposed. Parallel texts of book translations in different languages are used. The quality of a measure is estimated by the percentage of errors over the possible attempts to identify the translation of a given paragraph among two paragraphs of a book in another language, one of which is actually the translation. The stability of the assessments is verified by their independence from the choice of book and language pair. The numerical experiment produced a stable quality ranking of algorithms for abstract character string comparison and showed a strong dependence on the choice of normalization.
39

Budimirovic, Nebojsa, and Nebojsa Bacanin. "Novel Algorithms for Graph Clustering Applied to Human Activities." Mathematics 9, no. 10 (May 12, 2021): 1089. http://dx.doi.org/10.3390/math9101089.

Abstract:
In this paper, a novel algorithm (IBC1) for graph clustering with no prior assumption about the number of clusters is introduced. An additional algorithm (IBC2) for graph clustering when the number of clusters is given beforehand is also presented, along with a new measure for evaluating clustering results: the accuracy of formed clusters (T). For the purpose of clustering human activities, a procedure for forming string sequences is presented. String symbols are obtained by modeling spatiotemporal signals from inertial measurement units. The string sequences provide a starting point for forming the complete weighted graph. Using this graph, the proposed algorithms, as well as other well-known clustering algorithms, are tested. The best results are obtained using the novel IBC2 algorithm: T = 96.43%, Rand Index (RI) 0.966, precision rate (P) 0.918, recall rate (R) 0.929 and balanced F-measure (F) 0.923.
40

Liu, Bing, Dan Han, and Shuang Zhang. "Approximate Chinese String Matching Techniques Based on Pinyin Input Method." Applied Mechanics and Materials 513-517 (February 2014): 1017–20. http://dx.doi.org/10.4028/www.scientific.net/amm.513-517.1017.

Abstract:
String matching is one of the most typical problems in computer science. Previous studies mainly focused on the exact string matching problem. However, with the rapid development of computers and the Internet and the continual emergence of new issues, researching and designing efficient approximate string matching algorithms has gained significant theoretical value and practical importance. Approximate string matching, also called string matching that allows errors, aims to find the pattern string in a text or database while allowing k differences between the pattern string and its occurrences in the text. Although a number of algorithms have been proposed for approximate string matching, few studies focus on large alphabets. Most experts are interested in small or medium-sized alphabets. For large alphabets, especially for Chinese characters and Asian phonetics, there are fewer efficient algorithms. For these reasons, this paper focuses on the approximate Chinese string matching problem based on the pinyin input method.
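
The "k differences" formulation above can be illustrated with the classic dynamic-programming approach, often attributed to Sellers. This is a minimal sketch under that assumption, not the pinyin-based technique of the paper, and the function name `approx_find` is hypothetical.

```python
def approx_find(text: str, pattern: str, k: int) -> list[int]:
    """Return end positions (exclusive) in text where some substring ending
    there is within edit distance k of pattern."""
    m = len(pattern)
    # col[i] holds the minimum edit distance between pattern[:i] and any
    # substring of the text ending at the current position.
    col = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        prev_diag = col[0]
        col[0] = 0  # a match may start anywhere in the text at no cost
        for i in range(1, m + 1):
            cur = min(
                col[i] + 1,                           # extra text character
                col[i - 1] + 1,                       # missing pattern character
                prev_diag + (pattern[i - 1] != ch),   # match or substitution
            )
            prev_diag = col[i]
            col[i] = cur
        if col[m] <= k:
            ends.append(j)
    return ends


print(approx_find("hello world", "wurld", 1))  # prints [11]
```

Setting k to 0 reduces this to exact matching. The running time is proportional to the text length times the pattern length, independent of alphabet size, which is one reason the plain dynamic program is a common baseline for large alphabets.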
41

FirdoseAhmed, Gulfishan, and Nilay Khare. "Hardware based String Matching Algorithms: A Survey." International Journal of Computer Applications 88, no. 11 (February 14, 2014): 16–19. http://dx.doi.org/10.5120/15396-3898.

42

Gupta, Sumit, and Akhtar Rasool. "Bit Parallel String Matching Algorithms: A Survey." International Journal of Computer Applications 95, no. 10 (June 18, 2014): 27–32. http://dx.doi.org/10.5120/16632-6501.

43

Barton, Carl, Costas S. Iliopoulos, and Solon P. Pissis. "Fast algorithms for approximate circular string matching." Algorithms for Molecular Biology 9, no. 1 (2014): 9. http://dx.doi.org/10.1186/1748-7188-9-9.

44

DasGupta, Bhaskar, Kishori M. Konwar, Ion I. Mandoiu, and Alex A. Shvartsman. "Highly scalable algorithms for robust string barcoding." International Journal of Bioinformatics Research and Applications 1, no. 2 (2005): 145. http://dx.doi.org/10.1504/ijbra.2005.007574.

45

Singh, Rama, Deepak Rai, and Rajesh Prasad. "A review on parameterized string matching algorithms." Journal of Information and Optimization Sciences 39, no. 1 (November 10, 2017): 275–83. http://dx.doi.org/10.1080/02522667.2017.1374730.

46

Park, J. H., and K. M. George. "Efficient parallel hardware algorithms for string matching." Microprocessors and Microsystems 23, no. 3 (October 1999): 155–68. http://dx.doi.org/10.1016/s0141-9331(99)00032-0.

47

Lemström, Kjell, Gonzalo Navarro, and Yoan Pinzon. "Practical algorithms for transposition-invariant string-matching." Journal of Discrete Algorithms 3, no. 2-4 (June 2005): 267–92. http://dx.doi.org/10.1016/j.jda.2004.08.009.

48

Muthukrishnan, S. "Detecting False Matches in String-Matching Algorithms." Algorithmica 18, no. 4 (August 1997): 512–20. http://dx.doi.org/10.1007/pl00009168.

49

Jokinen, Petteri, Jorma Tarhio, and Esko Ukkonen. "A Comparison of Approximate String Matching Algorithms." Software: Practice and Experience 26, no. 12 (December 1996): 1439–58. http://dx.doi.org/10.1002/(sici)1097-024x(199612)26:12<1439::aid-spe71>3.0.co;2-1.

50

Tarhio, Jorma, Jan Holub, and Emanuele Giaquinta. "Technology beats algorithms (in exact string matching)." Software: Practice and Experience 47, no. 12 (August 1, 2017): 1877–85. http://dx.doi.org/10.1002/spe.2511.
