Dissertations / Theses on the topic 'Exact string matching problem'

To see the other types of publications on this topic, follow the link: Exact string matching problem.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Select a source type:

Consult the top 23 dissertations / theses for your research on the topic 'Exact string matching problem.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Press on it, and we will generate automatically the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as pdf and read online its abstract whenever available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Klaib, Ahmad. "Exact string matching algorithms for searching DNA and protein sequences and searching chemical databases." Thesis, University of Huddersfield, 2014. http://eprints.hud.ac.uk/id/eprint/24266/.

Full text
Abstract:
The enormous quantities of biological and chemical files and databases are likely to grow year on year, consequently giving rise to the need to develop string-matching algorithms capable of minimizing the searching response time. Being aware of this need, this thesis aims to develop string matching algorithms to search biological sequences and chemical structures by studying exact string matching algorithms in detail. As a result, this research developed a new classification of string matching algorithms containing eight categories according to the pre-processing function of algorithms and proposed five new string matching algorithms; BRBMH, BRQS, Odd and Even algorithm (OE), Random String Matching algorithm (RSMA) and Skip Shift New algorithm (SSN). The main purpose behind the proposed algorithms is to reduce the searching response time and the total number of comparisons. They are tested by comparing them with four well- known standard algorithms, Boyer Moore Horspool (BMH), Quick Search (QS), TVSBS and BRFS. This research applied all of the algorithms to sample data files by implementing three types of tests. The number of comparison tests showed a substantial difference in the number of comparisons our algorithms use compared to the non-hybrid algorithms such as QS and BMH. In addition, the tests showed considerable difference between our algorithms and other hybrid algorithm such as TVSBS and BRFS. For instance, the average elapsed search time tests showed that our algorithms presented better average elapsed search time than the BRFS, TVSBS, QS and BMH algorithms, while the average number of tests showed better number of attempts compared to BMH, QS, TVSBS and BRFS algorithms. A new contribution has been added by this research by using the fastest proposed algorithm, the SSN algorithm, to develop a chemical structure searching toolkit to search chemical structures in our local database. The new algorithms were paralleled using OpenMP and MPI parallel models and tested at the University of Science Malaysia (USM) on a Stealth Cluster with different number of threads and processors to improve the speed of searching pattern in the given text which, as we believe, is another contribution.
APA, Harvard, Vancouver, ISO, and other styles
2

黎少斌 and Shiao-bun Lai. "Trading off time for space for the string matching problem." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1996. http://hub.hku.hk/bib/B31214216.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Lai, Shiao-bun. "Trading off time for space for the string matching problem /." Hong Kong : University of Hong Kong, 1996. http://sunzi.lib.hku.hk/hkuto/record.jsp?B18061795.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Chen, Hui-Min, and 陳慧敏. "An Exact String Matching Problem Using Data Encoding Scheme." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/99302180071058684639.

Full text
Abstract:
碩士
國立暨南國際大學
資訊工程學系
96
The traditional exact string matching problem is to find all locations of a pattern string with length m in a text with length n. Here we propose a new encoding method to shorten the both lengths of pattern and text by substituting the substring between a special character for its length in O(m+n). Then we use an exact matching algorithm to solve the exact string matching problem on the encoding pattern and text. As can be seen、by using the encoding method、the pattern and text can be shortened about 2/|Σ| times the lengths of the original ones. In practice、it performs better than 2/|Σ|. For instance、for an English sentence pattern whose length is 50 and a text whose length is 200000、in average、the pattern is shortened to 6% of its original length and the text is shortened to 12.4% of its original length. Thus、the exact matching can be done in a much shorter time.
APA, Harvard, Vancouver, ISO, and other styles
5

Chen, Hui-Min. "An Exact String Matching Problem Using Data Encoding Scheme." 2008. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0020-2406200814110600.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Chen, Kuei-Hao, and 陳奎昊. "Improved Algorithms for Exact String Matching Problems." Thesis, 2013. http://ndltd.ncl.edu.tw/handle/64377810416197972394.

Full text
Abstract:
博士
國立暨南國際大學
資訊工程學系
101
In this dissertation, we consider two problems: the exact string matching problem and its variation, the exact circular string matching. The exact string matching problem is: Given two strings, a text $T$ of length $n$ and a pattern $P$ of length $m$, find all occurrences of $P$ in $T$. We propose a strategy to analyze the average performance of the reverse factor algorithm. The analysis is based on the assumption that the text is very long as compared to the length of the pattern, and each symbol in the text is drawn uniformly from a random source with $sigma$ symbols. Our analysis uses only elementary techniques in probability theory and avoids applying complicated combinatorics in stringology. We also propose a new algorithm for exact string matching problem under the name Improved BNDM algorithm. The Improved BNDM algorithm uses the $q$-gram filtering technique to speed up the performance of the Turbo BNDM algorithm. The time complexity of the Improved BNDM algorithm achieves $mathcal{O}(n)$ in the worst case and $mathcal{O}(nlog_sigma m/m)$ in the average case where $sigma$ is the alphabet size. It is optimal in both worst case and average case. Another problem we discuss in this dissertation is the exact circular string matching problem. Given a string $P=p_1p_2cdots p_m$, let a string $P^{(i)}=p_ip_{i+1}cdots p_mp_1cdots p_{i-1}$. The exact circular string matching problem is: Given two strings, $T$ of length $n$ and $P$ of length $m$, find all occurrences of $P^{(i)}$ in $T$ for $1leq i leq m$. We propose two algorithms that perform searching of a circular string in the text using bit-parallel technique. Our algorithms use only the composition of bitwise-logical operations and basic arithmetic operations, and apply this technique to solve the problem. These algorithms are given names CSBNDM and $ ext{CSBNDN}_q$, respectively. We give several experiments to verify that they have good performance for random strings and DNA sequence.
APA, Harvard, Vancouver, ISO, and other styles
7

Hou, Kuan-Wei, and 侯冠維. "The Discrete Convolution Method for Solving the Exact String Matching Problem." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/13881027671610839668.

Full text
Abstract:
碩士
國立清華大學
電機工程學系
100
In this thesis, we introduce discrete convolution method on solving the exact string matching problem. Based on the assumption that all the text and pattern strings are generated randomly, we derived an equation which can approximate the probability of appearing of a pattern string in a text string. From this equation, we see that the probability that a pattern string appears in a text string reduces to 0 quickly as the length of the pattern string increases. Because of this observation, we introduce an algorithm based on the discrete convolution method with early termination. The algorithm terminates as soon as it discovers that a prefix of the pattern string does not appear in the text string. We show that the discrete convolution method with early termination is quite efficient to solve exact string matching problem for randomly generated text and pattern strings. In this thesis, we also show that the shift-add algorithm is equivalent to and can be implemented by the discrete convolution.
APA, Harvard, Vancouver, ISO, and other styles
8

Liao, Kuei-Hui, and 廖桂慧. "Solving the Exact String Matching Problem by Using the 2-Substring Algorithm." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/46723272181241712677.

Full text
Abstract:
碩士
國立暨南國際大學
生物醫學科技研究所
95
String matching is a very important component of many problems, such as data compression, search engine, speech recognition, virus detection, computational biology, and so on. There are many efficient method proposed to solve the string matching problem. For example, KMP algorithm、Boyer-and-Moore algorithm. In this thesis, we proposed a method to solve the exact string matching problem. We proposed a rule, called the 2-substring rule, which avoids the brute force method and can be used to solve the problem. We know the time complexity of KMP algorithm is better than Boyer-and-Moore algorithm. But the practice of Boyer-and-Moore algorithm is faster than KMP algorithm. We implement our method in practice. Our method not only is faster than KMP algorithm, Boyer and Moore algorithm, Horspool algorithm and Quick Search algorithm but also is earlier to implement.
APA, Harvard, Vancouver, ISO, and other styles
9

Zhong-He, Chen, and 陳中和. "The Application of Convolution to Suffix to Prefix Rule for the Exact String Matching Problem." Thesis, 2007. http://ndltd.ncl.edu.tw/handle/55234892315310089339.

Full text
Abstract:
碩士
國立暨南國際大學
資訊工程學系
95
In this thesis, we consider the exact string matching problem. We first point out a rule, called the suffix to prefix rule, which can be used to avoid the brute-force sliding window approach. The Backward Nondeterministic DAWG Matching algorithm, Backward Oracle algorithm and Reverse Factor algorithm all use this rule. To implement this rule, we have to find the longest suffix of text T which is equal to a prefix of pattern P. In this thesis, we point out that convolution can be used to do this. As can be seen, the convolution technique is easy to understand and easy to program.
APA, Harvard, Vancouver, ISO, and other styles
10

Chen, Zhong-He. "The Application of Convolution to Suffix to Prefix Rule for the Exact String Matching Problem." 2006. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0020-2006200719565300.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Liu, Kuei-Wen, and 劉貴文. "Some Results on Exact String Matching Algorithm." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/14472736499752714951.

Full text
Abstract:
碩士
國立暨南國際大學
資訊工程學系
96
In this paper, we introduce two algorithms for stringology. One is to solve the exact string matching problem and the other is to find the string cycle (period of a string). In the exact string matching problem, we are given two strings T=t1t2...tn and P=p1p2...pm. We are asked to find all occurrences of P in T. Our searching method is based upon a single character rule. Consider a location i in P. Suppose that pi is aligned with tj and pi <> tj. We then must move P in such a way that some pk=tj will be aligned with tj. In the thesis, we propose a method to find the optimal location i. We also modified a string matching approach, named wide window approach, which divides the text into n/m overlapping windows of size 2m-1. In the windows, the approach attempts m possible occurrence positions in parallel. It firstly searches pattern suffixes from middle to right with the modified convolution method, shifts the window directly when it fails, otherwise, scans the corresponding prefixes backward with the modified convolution method again. For the period of string problem, we are given a string T and we are asked to find the periods of T. Our algorithm is based upon the bit parallel approach.
APA, Harvard, Vancouver, ISO, and other styles
12

Liu, Kuei-Wen. "Some Results on Exact String Matching Algorithm." 2008. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0020-2406200814333600.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

Lu, Chia Wei, and 呂嘉維. "Efficient Exact and Approximate String Matching Algorithms." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/49922113552090356808.

Full text
Abstract:
博士
國立清華大學
資訊工程學系
102
In this thesis, we first propose two algorithms for exact string matching problem, which aims to find all the positions i's in a given text where a given pattern occurs. Our algorithms find an optimal selective comparing order of the characters of the pattern so that we could have a better performance in the searching phase. To find the optimal comparing order, we adopt the branch and bound approach. Moreover, our proposed algorithm can be combined with other existing exact string matching algorithms to improve the searching efficiency. The experimental results show that our algorithms indeed have the smallest number of character comparisons and are also efficient in time as compared with other existing exact string matching algorithms. Second, we propose a new filtration algorithm, as well as a hybrid filtration strategy, to efficiently solve the approximate string matching problem (also called the k-difference problem), which aims to find all the positions i's in a given text such that there exists a substring of the text ending at position i whose edit distance from a given pattern is less than or equal to a given error bound k. Our experimental results on simulated datasets of DNA sequences show that when compared with other filtration algorithms, our filtration algorithm has better performance on the efficiency to filter out those positions of the text at which the pattern does not occur approximately. Moreover, our hybrid filtration strategy further improves the effectiveness of our filtration algorithm. Third, we propose a progressive approach to solve the DNA resequencing problem which is defined as follows: We are given an unknown DNA sequence X and a known reference sequence R. Our task is to see whether X and R are similar or not. The present popular approach is to break up X into subsequences by the next generation sequencing (NGS) technologies, called reads. We then map the reads of X onto R with a suitable error bound. However, if the similarity between X and R is not very high (<95%), there would be many reads unmapped, and we then cannot obtain the mutations inside the unmapped regions. One can use a large error bound to increase the number of reads mapped. But it is not a good solution because increasing error bound will also increase the probability of false positive mapping. Our approach uses a small error bound and to increase the number of reads mapped, our approach modifies R each time after the reads are mapped. Thus our approach is a progressive approach. Compared with other available tools, our approach allows us to be able to map more reads to the reference sequence. In our simulated experiments, we also show the high correctness of our mapping algorithm.
APA, Harvard, Vancouver, ISO, and other styles
14

Pan, Z. H., and 潘子豪. "An Approximate String Matching Based on an Exact String Matching with Constant Pattern Length." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/45738203365580501083.

Full text
Abstract:
碩士
國立暨南國際大學
資訊工程學系
96
The string matching problem is a very important problem which is needed to solve in computer science, computational biology and other domains. The exact matching problem is defined as follows: Given a string T whose length is n and a string P whose length is m, n≧m, find each location where string P occurs in string T. The approximate string matching problem is defined as follows: Given a pattern string P of length and a text string T of length n, n≧m, and a maximal number k of errors allowed, find each location i of T such that there exists a suffix T of T(0,i) and ED(A,B)≦k. In our thesis, we will use the exact string matching to solve the approximate string matching problem.
APA, Harvard, Vancouver, ISO, and other styles
15

Pan, Zu-Hao. "An Approximate String Matching Based on an Exact String Matching with Constant Pattern Length." 2008. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0020-2406200816061900.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Huang, Lan-Ya, and 黃蘭雅. "An Exact String Matching Algorithms Using Hashing Scheme." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/16933370591311289241.

Full text
Abstract:
碩士
國立暨南國際大學
資訊管理學系
96
In this thesis, we consider how to solve the exact string matching problem. The exact string matching problem is to find all locations of a pattern string P in a text string T. In general, string matching algorithm problem work in linear time and linear space. The two well-known of them are Knuth-Morris-Pratt (KMP) algorithm and Boyer-Moore (BM) algorithm. We will use the hashing scheme to solve the exact string matching problem. Our method is simple to implement, and our algorithm work in constant space. Experimental shows that our algorithm is better than Brute Force algorithm and KMP algorithm.
APA, Harvard, Vancouver, ISO, and other styles
17

Huang, Lan-Ya. "An Exact String Matching Algorithms Using Hashing Scheme." 2008. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0020-2406200814285000.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Miau, Chau-An, and 繆朝安. "An Exact Algorithm for the Maximum Induced Matching Problem." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/46698554974843985792.

Full text
Abstract:
碩士
國立中正大學
資訊工程研究所
100
Given a graph G = (V,E) the MAXIMUM r-REGULAR INDUCED SUBGRAPH problem is to find a vertex set R ⊆ V of maximum cardinality such that G[R] is r-regular. An induced matching M ⊆ E in a graph G = (V,E) is a matching such that no two edges in M are joined by any third edge of the graph. The MAXIMUM INDUCED MATCHING problem is to find an induced matching of maximum cardinality. By definition the maximum induced matching problem is the maximum 1-regular induced subgraph problem. Gupta et al. gave an o(2^n) time algorithm for solving the MAXIMUM r-REGULAR INDUCED SUBGRAPH problem. This algorithm solves the MAXIMUM INDUCED MATCHING problem in O^∗(1.6957^n) where n is the number of vertices in the input graph. In this thesis, we show that the maximum induced matching problem can be reduced to the maximum independent set problem and we give a more efficient algorithm for the MAXIMUM INDUCED MATCHING problem running in time O^∗(1.4786^n).
APA, Harvard, Vancouver, ISO, and other styles
19

Ou, Chia-Shin, and 歐家欣. "On the Bit-Parallel Approaches to String Matching Problem." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/71663047138441425921.

Full text
Abstract:
博士
國立清華大學
資訊工程學系
102
In this thesis, we consider two problems: (1) Developing a systematical approach to solve problems involving special properties of bit-vectors (2) Developing a bit-parallel approach to solve the nearest neighbor string matching problem. For problem 1, suppose that we are given a vector consisting of only 1's and 0's and we are interested in finding some special properties of this vector. These properties all involve "for-all" or "there-exists" notations and we are interested in bit-parallel processes to find these properties. In Myers' work, a sequence of logical operations can be expressed as a logical formula: (∃k ≤ i, A(k) and ∀x∈[k, i – 1]) = (((A & B) + B) ^ B) | A) which is by no means easy to be obtained. The contribution of our research is to present a systematical method to find such logical formulas to solve problems involving bit vectors with "for-all" and "there-exists" notations. Using our way of thinking, the equation in Myers' work can be deduced step by step. Five logical prototype problems, "single-for-all", "single-there- exists", "multiple-for-all", "multiple- there-exists" and "multiple-there-exists-and-for-all", are defined in this thesis and are proved that all can be computed using bit-parallel operations in O(n/w) time, where w is the word size of the machine. We also consider four variants of the five problems and show that their logical formulas can be obtained using those of the five prototype problems systematically. The nearest neighbor string matching problem is defined as follows: Given a text string T = t1t2…tn and a pattern string P = p1p2…pm, the nearest neighbor string matching problem is to find all substrings of T whose edit distances with P are the smallest, among all substrings of T. The nearest neighbor string matching problem has useful applications in bioinformatics. It can be straight-forwardly solved by the Seller's Algorithm and the Myers Algorithm which are used to solve the approximate string matching problem. Hyyro and Navarro proposed a filtering algorithm to speed up the Myers Algorithm. However, Hyyro and Navarro's filtering approach needs to perform a pre-processing based on the error bound k which is given by the definition of approximate string matching problem. Hence, it is not suitable to be used to solve the nearest neighbor string matching problem which has no k. In this thesis, we present a modification of the Hyyro and Navarro Algorithm, and also present a bit-parallel algorithm combining the Myers Algorithm and the modified Hyyro and Navarro Algorithm. In experiments, we show that our bit-parallel algorithm is efficient.
APA, Harvard, Vancouver, ISO, and other styles
20

Yen, Chen-Cheng, and 顏成哲. "A Combination of Berry Ravindran Algorithm and Two Way Algorithm for Exact String Matching." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/43753477020737174183.

Full text
Abstract:
碩士
國立暨南國際大學
資訊工程學系
96
The exact string matching problem is defined as follows: We are given a text and a pattern , . Our purpose is to find all occurrences of in . It is a classical and important problem in computer science. In this thesis, we propose a new approach to solve this problem. Two Way algorithm and the Berry Ravindran algorithm are good algorithms which find all occurrences of in , but some problems still exist. If we consider the situation that the last two text characters of the window with size m are the same with that of , Berry Ravindran algorithm is not efficient in this situation. If a mismatch often occurs within v part of , Two Way algorithm is not efficient in this situation. So we combine Two Way algorithm and Berry Ravindran algorithm to slove this problem Finally; we discuss the results of Two Way algorithm, Berry Ravindran algorithm and our approach.
APA, Harvard, Vancouver, ISO, and other styles
21

Yen, Chen-Cheng. "A Combination of Berry Ravindran Algorithm and Two Way Algorithm for Exact String Matching." 2007. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0020-2406200814134200.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Yang, Chao-Yuan, and 楊朝淵. "A New Algorithm for Solving the String Matching with k-Mismatches Problem." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/53970156684807658094.

Full text
Abstract:
碩士
國立暨南國際大學
資訊工程學系
93
String matching with k-mismatches problem is defined as follows. Given a pattern P with length m, a text T with length n and an integer k, string matching with k-mismatches problem wants to find all occurrences of P on T with k maximal number of mismatches allowable. In this thesis, we propose a new algorithm to solve the problem. The algorithm uses the Kangaroo method mentioned in the paper by Galil and Giancarlo and that by Landau and Vishkin. Our algorithm and the Kangaroo Method are distinct from the number of steps to find mismatches. Each step in our algorithm may find two new mismatches or at least one new mismatch in some case. We can compare our algorithm to the Kangaroo Method. Each step in the Kangaroo Method only finds a mismatch. Another difference between our algorithm and the Kangaroo method is the number of positions on the text which we need to verify if those positions are occurrences of the pattern with at most k mismatches. Our algorithm would consider that some positions on the text are impossible to be occurrences of the pattern and then we don’t need to waste any time to check whether those locations are matches with at most k mismatches.
APA, Harvard, Vancouver, ISO, and other styles
23

Heng-WeiLin and 林恒瑋. "An Exact Algorithm for the Maximum Induced Matching Problem in Unit Disk Graphs." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/5gerxg.

Full text
APA, Harvard, Vancouver, ISO, and other styles
We offer discounts on all premium plans for authors whose works are included in thematic literature selections. Contact us to get a unique promo code!

To the bibliography