Dissertations / Theses on the topic 'String algorithm'


Consult the top 50 dissertations / theses for your research on the topic 'String algorithm.'


1

Berry, Thomas. "Algorithm engineering : string processing." Thesis, Liverpool John Moores University, 2002. http://researchonline.ljmu.ac.uk/4973/.

Full text
Abstract:
The string matching problem has attracted a lot of interest throughout the history of computer science and is crucial to the computing industry. The theoretical community in computer science has developed a rich literature on the design and analysis of string matching algorithms. To date, most of this work has been based on the asymptotic analysis of the algorithms. Such analysis rarely tells us how an algorithm will perform in practice, and considerable experimentation and fine-tuning is typically required to get the most out of a theoretical idea. In this thesis, promising string matching algorithms discovered by the theoretical community are implemented, tested and refined to the point where they can be usefully applied in practice. In the course of this work we present the following new algorithms, prove that their average-case time complexity is linear, and compare them with existing algorithms by experimentation.
- We implemented the existing one-dimensional string matching algorithms for English texts. From the experimental results we identified the best two algorithms, combined them, and introduced a new algorithm.
- We developed a new two-dimensional string matching algorithm. This algorithm uses the structure of the pattern to reduce the number of comparisons required to search for the pattern.
- We described a method for efficiently storing text. Although this reduces the size of the storage space, it is not a compression method in the usual sense: our aim is to improve both the space and the time taken by a string matching algorithm. Our new algorithm searches for patterns in the efficiently stored text without decompressing it.
- We illustrated that by pre-processing the text we can improve the speed of string matching when searching for a large number of patterns in a given text.
- We proposed a hardware solution for searching in an efficiently stored DNA text.
APA, Harvard, Vancouver, ISO, and other styles
2

MacLeod, Christopher. "The synthesis of artificial neural networks using single string evolutionary techniques." Thesis, Robert Gordon University, 1999. http://hdl.handle.net/10059/367.

Full text
Abstract:
The research presented in this thesis is concerned with optimising the structure of Artificial Neural Networks. These techniques are based on computer modelling of biological evolution or foetal development. They are known as Evolutionary, Genetic or Embryological methods. Specifically, Embryological techniques are used to grow Artificial Neural Network topologies. The Embryological Algorithm is an alternative to the popular Genetic Algorithm, which is widely used to achieve similar results. The algorithm grows in the sense that the network structure is added to incrementally and thus changes from a simple form to a more complex form. This is unlike the Genetic Algorithm, which causes the structure of the network to evolve in an unstructured or random way. The thesis outlines the following original work: The operation of the Embryological Algorithm is described and compared with the Genetic Algorithm. The results of an exhaustive literature search in the subject area are reported. The growth strategies which may be used to evolve Artificial Neural Network structure are listed. These growth strategies are integrated into an algorithm for network growth. Experimental results obtained from using such a system are described, and there is a discussion of the applications of the approach. Consideration is given to the advantages and disadvantages of this technique, and suggestions are made for future work in the area. A new learning algorithm based on Taguchi methods is also described. The report concludes that the method of incremental growth is a useful and powerful technique for defining neural network structures and is more efficient than its alternatives. Recommendations are also made with regard to the types of network to which this approach is best suited. Finally, the report contains a discussion of two important aspects of Genetic or Evolutionary techniques related to the above: Modular networks (and their synthesis) and the functionality of the network itself.
3

Dubois, Simon. "Offline Approximate String Matching for Information Retrieval : An experiment on technical documentation." Thesis, Tekniska Högskolan, Högskolan i Jönköping, JTH. Forskningsmiljö Informationsteknik, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-22566.

Full text
Abstract:
Approximate string matching consists in identifying strings as similar even if there is a number of mismatches between them. This technique is one of the solutions to reduce the strictness of exact matching in data comparison. In many cases it is useful to identify stream variation (e.g. audio) or word declension (e.g. prefix, suffix, plural). Approximate string matching can be used to score terms in Information Retrieval (IR) systems. The benefit is to return results even if query terms do not exactly match indexed terms. However, as approximate string matching algorithms only consider characters (neither context nor meaning), there is no guarantee that additional matches are relevant matches. This paper presents the effects of some approximate string matching algorithms on search results in IR systems. An experimental research design was conducted to evaluate such effects from two perspectives. First, result relevance is analysed with precision and recall. Second, performance is measured by the execution time required to compute matches. Six approximate string matching algorithms are studied. Levenshtein and Damerau-Levenshtein compute the edit distance between two terms. Soundex and Metaphone index terms based on their pronunciation. Jaccard similarity calculates the overlap coefficient between two strings. Tests are performed through IR scenarios with different contexts, information needs and search queries, designed to query technical documentation related to software development (man pages from Ubuntu). A purposive sample is selected to assess document relevance to the IR scenarios and compute IR metrics (precision, recall, F-measure). Experiments reveal that all tested approximate matching methods increase recall on average but, with the exception of Metaphone, they also decrease precision. Soundex and Jaccard similarity are not advised because they fail on too many IR scenarios. The highest recall is obtained by the edit distance algorithms, which are also the most time consuming. Because Damerau-Levenshtein brings no significant improvement over Levenshtein but costs much more time, the latter is recommended for use with specialised documentation. Finally, some other related recommendations are given to practitioners implementing IR systems on technical documentation.
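The measures named above are standard; as a minimal sketch, here are a dynamic-programming Levenshtein distance and a character-level Jaccard coefficient in Python (the thesis's exact tokenisation and term handling may differ):

```python
def levenshtein(a, b):
    """Classic DP edit distance: minimum insertions, deletions and
    substitutions needed to turn a into b, using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def jaccard(a, b):
    """Overlap of the character sets of two non-empty strings."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)
```

Note that a set-based Jaccard ignores character order entirely, which is one intuition for why it failed on many of the IR scenarios above.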
4

Frey, Jeffrey Daniel. "Finding Song Melody Similarities Using a DNA String Matching Algorithm." Kent State University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=kent1208961242.

Full text
5

Gundu, Pavan Kumar. "Trajectory Tracking Control of Unmanned Ground Vehicles using an Intermittent Learning Algorithm." Thesis, Virginia Tech, 2019. http://hdl.handle.net/10919/93213.

Full text
Abstract:
Traffic congestion and safety have become major issues in the modern world's commute. Congestion causes people to travel billions of hours more and to purchase billions of gallons of extra fuel, amounting to a congestion cost of billions of dollars. Autonomous driving vehicles are one solution to this problem because of their huge impact on efficiency, pollution, and human safety. Extensive research has also been carried out on the control design of vehicular platoons, because a further improvement in traffic throughput, without compromising safety, is possible when the vehicles in the platoon are provided with better predictive abilities. Motion control is a key area of autonomous driving research that handles moving parts of vehicles in a deliberate and controlled manner. A widely studied problem in motion control, concerned with tracking a time-parameterized reference, is trajectory tracking. Having an efficient and effective tracking algorithm embedded in the autonomous driving system is key to better performance in terms of resources consumed and tracking error. Many tracking control algorithms in the literature rely on an accurate model of the vehicle, and it can be an intimidating task to come up with an accurate model that takes into consideration conditions such as friction, heat effects, and ageing processes. Typically, control algorithms also rely on periodic execution of the tasks that update the control actions, but such updates might not always be required, resulting in unnecessary actions that waste resources. The main focus of this work is to design an intermittent model-free optimal control algorithm that enables autonomous vehicles to track trajectories at high speeds. To obtain a model-free solution, a Q-learning setup is considered, with an actor network to approximate the optimal intermittent controller and a critic network to approximate the optimal cost, resulting in the appropriate tuning laws.
Master of Science
Research effort in the area of autonomous vehicles has risen in the past few decades because these systems improve safety, comfort, transport time and energy consumption, which are some of the main issues humans face in the modern world's highway systems. Systems like emergency braking, automatic parking and blind-angle vehicle detection are creating a safer driving environment in populated areas. Such systems are known as advanced driver assistance systems (ADAS). An extension of these partially automated ADAS is vehicles with fully automated driving abilities, which are able to drive by themselves without any human involvement. An extensively proposed approach for making traffic throughput more efficient on existing highways is to assemble autonomous vehicles into platoons: small inter-vehicle spacing and many vehicles in each platoon formation improve traffic throughput significantly. Lately, advancements in computational capabilities, in terms of both algorithms and hardware, communications, and navigation and sensing devices, have contributed greatly to the development of autonomous systems (both single- and multi-agent) that operate with high reliability in uncertain or dynamic operating conditions and environments. Motion control is an important area of autonomous vehicles research. Trajectory tracking is a widely studied motion control scenario concerned with designing control laws that force a system to follow a time-dependent reference path, and it is important to have an effective and efficient trajectory-tracking control law in an autonomous vehicle to reduce the resources consumed and the tracking error.
The goal of this work is to design an intermittent model-free trajectory tracking control algorithm that needs no mathematical model of the vehicle system being controlled and that can reduce controller updates by allowing the system to evolve in open loop, closing the loop only when a user-defined triggering condition is satisfied. The approach is energy efficient in that control updates are limited to instances when they are needed rather than occurring periodically. Q-learning, a model-free reinforcement learning technique, is used in the trajectory tracking motion control algorithm to make the vehicles track their respective reference trajectories without requiring their motion model, knowledge of which is generally needed in a motion control problem. The testing of the designed algorithm in simulations and experiments is presented in this work, along with the study and development of a vehicle platform used to perform the experiments. Different motion control and sensing techniques are presented and used. The vehicle platform is shown to track a reference trajectory autonomously, without any human intervention, both in simulations and in experiments, proving the effectiveness of the proposed algorithm.
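The thesis approximates the Q-function with actor and critic networks; the underlying model-free update it builds on is ordinary Q-learning, sketched here in its tabular form (all names and parameter values are illustrative, not the thesis's):

```python
from collections import defaultdict

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    No model of the system's dynamics is needed: only the observed
    transition (s, a, r, s') drives the update."""
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
```

An intermittent (event-triggered) variant would simply skip calls to `q_update` and to the controller while the tracking error stays below a user-defined threshold.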
6

Momeninasab, Leila. "Design and Implementation of a Name Matching Algorithm for Persian Language." Thesis, Linköpings universitet, Interaktiva och kognitiva system, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-102210.

Full text
Abstract:
Name matching plays a vital role in many applications. Name matching algorithms are, for example, used in information retrieval or deduplication systems to compare names, match them together, or find names that refer to identical objects, persons, or companies. Since names in any application are subject to variations and errors that are unavoidable in any system, and because of the importance of name matching, many algorithms have been developed to handle the matching of names. These algorithms consider the name variations that may happen because of spelling, pattern or phonetic modifications. However, most existing methods were developed for use with the English language and so cover the characteristics of that language; up to now, no specific algorithm has been designed and implemented for the Persian language. The purpose of this thesis is to present a name matching algorithm for Persian. In this project, after considering all major algorithms in this area, we selected one of the basic methods for name matching and expanded it to work particularly well for Persian names. The proposed algorithm, called the Persian Edit Distance Algorithm (PEDA), was built on the characteristics of the Persian language and compares Persian names with each other on three levels: phonetic similarity, character-form similarity and keyboard distance, in order to give more accurate results for Persian names. The algorithm takes Persian names as its input and determines their similarity as a percentage in the output. In this thesis, three series of experiments were carried out to evaluate the proposed algorithm. The average f-measure is 0.86 for the first series and 0.80 for the second. The first series of experiments was repeated with Levenshtein as well, which produced 33.9% false negatives on average, while PEDA had a false-negative average of 6.4%. The third series of experiments shows that PEDA works well for one, two and three edits, with true-positive averages of 99%, 81%, and 69% respectively.
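PEDA's actual scoring rules are not reproduced in the abstract; the general idea of weighting confusable characters in an edit distance can be sketched as follows (the cost function and the `CLOSE` table below are hypothetical stand-ins, not PEDA's phonetic, character-form or keyboard tables):

```python
def weighted_edit_distance(a, b, sub_cost):
    """Edit distance where substituting two 'confusable' characters costs
    less than 1, the way phonetic- or keyboard-aware matchers weight errors."""
    prev = [float(j) for j in range(len(b) + 1)]
    for i, ca in enumerate(a, 1):
        curr = [float(i)]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1.0,        # deletion
                            curr[j - 1] + 1.0,    # insertion
                            prev[j - 1] + (0.0 if ca == cb
                                           else sub_cost(ca, cb))))
        prev = curr
    return prev[-1]

# Hypothetical table: pairs of letters treated as phonetically close.
CLOSE = {frozenset("sz"), frozenset("gh"), frozenset("td")}
cost = lambda x, y: 0.3 if frozenset((x, y)) in CLOSE else 1.0
```

A similarity percentage, as in the abstract, then follows from normalising: `1 - d / max(len(a), len(b))`, e.g. about 92.5% for "sara" vs "zara" with the toy costs above.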
7

BERNARDINI, GIULIA. "COMBINATORIAL METHODS FOR BIOLOGICAL DATA." Doctoral thesis, Università degli Studi di Milano-Bicocca, 2021. http://hdl.handle.net/10281/305220.

Full text
Abstract:
The main goal of this thesis is to develop new algorithmic frameworks to deal with (i) a convenient representation of a set of similar genomes and (ii) phylogenetic data, with particular attention to increasingly accurate tumor phylogenies. A "pan-genome" is, in general, any collection of genomic sequences to be analyzed jointly or to be used as a reference for a population. A phylogeny, in turn, describes the evolutionary relationships among a group of items, be they species of living beings, genes, natural languages, ancient manuscripts or cancer cells. With the exception of one of the results included in this thesis, related to the analysis of tumor phylogenies, the focus of the whole work is mainly theoretical, the intent being to lay firm algorithmic foundations for the problems by investigating their combinatorial aspects, rather than to provide practical tools for attacking them. Deep theoretical insight into the problems allows a rigorous analysis of existing methods, identifying their strong and weak points, providing details on how they perform and helping to decide which problems need to be further addressed. In addition, it is often the case that new theoretical results (algorithms, data structures and reductions to other well-studied problems) can either be directly applied or adapted to fit the model of a practical problem, or at least serve as inspiration for developing new practical tools. The first part of this thesis is devoted to methods for handling an elastic-degenerate text, a computational object that compactly encodes a collection of similar texts, such as a pan-genome. Specifically, we attack the problem of matching a sequence in an elastic-degenerate text, both exactly and allowing a certain number of errors, and the problem of comparing two degenerate texts.
In the second part we consider both tumor phylogenies, describing the evolution of a tumor, and "classical" phylogenies, representing, for instance, the evolutionary history of living beings. In particular, we present new techniques to compare two or more tumor phylogenies, needed to evaluate the results of different inference methods, and we give a new, efficient solution to a longstanding problem on "classical" phylogenies: deciding whether, in the presence of missing data, it is possible to arrange a set of species in a phylogenetic tree that enjoys specific properties.
8

Moradi, Arvin. "Smart Clustering System for Filtering and Cleaning User Generated Content : Creating a profanity filter for Truecaller." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-124408.

Full text
Abstract:
This thesis focuses on investigating and creating an application for filtering user-generated content. The method was to examine how profanity and racist expressions are used and manipulated to evade filtering processes in similar systems. Focus also went to studying different algorithms to make this process quick and efficient, i.e., able to process as many names as possible in the shortest amount of time, because the client needs to filter millions of new uploads every day. The result shows that the application detects profanity, including manipulated profanity. Data from the customer's database was also used for testing purposes, and the result showed that the application works in practice as well. The performance test shows that the application has a fast execution time; we could see this by approximating it to a linear function of the number of names entered. The conclusion was that the filter works and discovers profanity not detected earlier. Future updates to strengthen the decision process could introduce a third-party service, or a web interface where decisions can be controlled manually. Execution time is good and shows that 10 million names can be processed in about 6 hours. In the future, queries to the database could be parallelized so that multiple names can be processed simultaneously.
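The report does not publish the filter's actual rules; as a rough sketch of the normalisation step such a filter needs before matching (the substitution map below is a hypothetical example of the "manipulated profanity" patterns described, not Truecaller's):

```python
# Hypothetical obfuscation map: common character substitutions used to
# evade word filters ("l33t"-style spellings).
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a",
                      "5": "s", "7": "t", "@": "a", "$": "s"})

def normalize(name):
    """Lowercase, undo digit/symbol substitutions, and collapse repeated
    characters (deliberately aggressive: 'spammm' and 'spam' both map
    to 'spam', at the cost of also collapsing legitimate doubles)."""
    s = name.lower().translate(LEET)
    out = []
    for c in s:
        if not out or out[-1] != c:
            out.append(c)
    return "".join(out)

def is_blocked(name, blocklist):
    """Substring check against a normalized blocklist."""
    n = normalize(name)
    return any(bad in n for bad in blocklist)
```

Normalisation is O(length of name), so the whole pipeline stays linear in the input size, consistent with the linear-time behaviour reported above.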
9

Alex, Ann Theja. "Local Alignment of Gradient Features for Face Photo and Face Sketch Recognition." University of Dayton / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=dayton1353372694.

Full text
10

Pinzon, Yoan Jose. "String algorithms on sequence comparison." Thesis, King's College London (University of London), 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.395648.

Full text
11

Fischer, Johannes. "Data Structures for Efficient String Algorithms." Diss., lmu, 2007. http://nbn-resolving.de/urn:nbn:de:bvb:19-75053.

Full text
12

Newman, Alantha. "Algorithms for string and graph layout." Thesis, Massachusetts Institute of Technology, 2004. http://hdl.handle.net/1721.1/28745.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004.
Includes bibliographical references (p. 121-125).
Many graph optimization problems can be viewed as graph layout problems. A layout of a graph is a geometric arrangement of the vertices subject to given constraints. For example, the vertices of a graph can be arranged on a line or a circle, on a two- or three-dimensional lattice, etc. The goal is usually to place all the vertices so as to optimize some specified objective function. We develop combinatorial methods as well as models based on linear and semidefinite programming for graph layout problems. We apply these techniques to some well-known optimization problems. In particular, we give improved approximation algorithms for the string folding problem on the two- and three-dimensional square lattices. This combinatorial graph problem is motivated by the protein folding problem, which is central in computational biology. We then present a new semidefinite programming formulation for the linear ordering problem (also known as the maximum acyclic subgraph problem) and show that it provides an improved bound on the value of an optimal solution for random graphs. This is the first relaxation that improves on the trivial "all edges" bound for random graphs.
by Alantha Newman.
Ph.D.
13

Mohamed, Manal Abd El-Kadeer Kasem. "Algorithmic issues on string regularities." Thesis, King's College London (University of London), 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.422247.

Full text
14

Luccio, Flaminia L. "Distributed algorithms for routing and string recognition." Dissertation, Computer Science, Carleton University, Ottawa, 1995.

Find full text
15

Aljamea, Mudhi Mohammed. "Advances in string algorithms for information security applications." Thesis, King's College London (University of London), 2016. https://kclpure.kcl.ac.uk/portal/en/theses/advances-in-string-algorithms-for-information-security-applications(93f6c82c-8b7a-4fb1-a0bd-29f14549d57c).html.

Full text
Abstract:
This thesis focuses on introducing novel algorithms in information security by studying successful algorithms in bioinformatics and adapting them to solve some open problems in information security. Although the problems in bioinformatics and information security are different, they can be considered very similar when it comes to identifying and solving them using stringology techniques. Several successful bioinformatics algorithms have been studied and adapted to solve different security problems, such as malware detection, biometrics and big data. Firstly, we present a dynamic computer malware detection model: a novel approach for detecting malware code embedded in different types of computer files with consistency, accuracy and high speed, and without excessive memory usage. This model was inspired by REAL, an efficient read aligner used in next-generation sequencing for processing biological data. In addition, we introduce a novel algorithmic approach to detecting malicious URLs in image secret communications. Secondly, we focus on biometrics, specifically fingerprints, which are considered one of the most reliable and widely used techniques for identifying individuals. In particular, we introduce a new fingerprint matching technique which matches fingerprint information using circular approximate string matching, solving the rotation problem and overcoming the obstacles of previous methods. Finally, we conclude with an algorithmic approach to analysing big data readings from smart meters to confirm some privacy concerns.
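The thesis's circular approximate matching algorithm is not given in the abstract; the classical doubling trick it builds on can be sketched as follows (brute-force, for illustration only — the thesis's method is more sophisticated):

```python
def is_rotation(p, q):
    """q is a rotation of p iff q occurs inside p + p (exact case).
    This is the standard reduction of circular matching to linear matching."""
    return len(p) == len(q) and q in p + p

def circular_hamming(p, q):
    """Smallest Hamming distance between q and any rotation of p:
    a brute-force stand-in for circular *approximate* matching."""
    assert len(p) == len(q)
    doubled = p + p
    return min(sum(a != b for a, b in zip(doubled[i:i + len(p)], q))
               for i in range(len(p)))
```

For fingerprint templates encoded as strings, a rotation of the finger shifts the encoding cyclically, which is why circular matching removes the need for rotation alignment.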
16

Cheng, Lok-lam, and 鄭樂霖. "Approximate string matching in DNA sequences." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2003. http://hub.hku.hk/bib/B29350591.

Full text
17

Sastre, Javier M. "Efficient finite-state algorithms for the application of local grammars." Phd thesis, Université de Marne la Vallée, 2011. http://tel.archives-ouvertes.fr/tel-00621249.

Full text
Abstract:
Our work concerns the development of efficient algorithms for the application of local grammars, taking as reference those of existing open-source software: Unitex's top-down parser and Outilex's Earley-like parser. Local grammars are a formalism for representing the syntax of natural languages based on finite automata. They are a model for building precise, large-scale descriptions of natural-language syntax through systematic observation and the methodical accumulation of data, and their suitability for this task has been tested in numerous works. Because of the ambiguous nature of natural languages and the properties of local grammars, classical parsing algorithms such as LR, CYK and Tomita's cannot be used in the context of this work. Top-down and Earley parsers are possible alternatives; however, they have exponential worst-case costs in the case of local grammars. We first designed an algorithm for applying local grammars with polynomial worst-case cost. We then designed efficient data structures for representing sets of elements and sequences, which improved the speed of our algorithm in the general case. We implemented our algorithm and those of the Unitex and Outilex systems with the same tools, in order to test them under the same conditions. Moreover, we implemented different versions of each algorithm, using both our own structures and algorithms for representing sets and those provided by the GNU Standard Template Library (STL).
We compared the performance of the different algorithms and their variants in the context of an industrial project proposed by the company Telefónica I+D: increasing the comprehension capabilities of a conversational agent that provides online services, such as sending SMS messages to mobile phones, games and other digital content. Conversations with the agent are in Spanish and take place over Windows Live Messenger. Despite the limited domain and the simplicity of the grammars applied, the execution times of our algorithm, coupled with our set-representation data structures and algorithms, were the shortest. Thanks to the improved asymptotic cost, significantly lower execution times can be expected, compared with the algorithms used in the Unitex and Outilex systems, for complex, large-coverage grammars.
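The thesis's polynomial-cost algorithm itself is not reproduced here; the basic set-of-states simulation that keeps nondeterministic finite-state grammars polynomial (rather than exploring paths one by one) can be sketched as follows (the token grammar and all names are illustrative):

```python
def run_nfa(transitions, accepting, tokens):
    """Breadth-first simulation of a finite-state 'local grammar' over a
    token sequence: track the SET of reachable states per position, so the
    cost per token is bounded by the automaton size, never exponential."""
    states = {0}  # state 0 is the start state by convention here
    for tok in tokens:
        states = {q2 for q in states
                  for (q1, sym, q2) in transitions
                  if q1 == q and sym == tok}
        if not states:
            return False  # no path can consume this token
    return bool(states & accepting)

# Toy grammar fragment: "send (a)* sms" as (state, token, state) triples.
GRAMMAR = [(0, "send", 1), (1, "a", 1), (1, "sms", 2)]
```

Real local-grammar engines additionally handle sub-grammar calls (recursion), outputs and weights, which is where the Earley-like machinery comes in.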
APA, Harvard, Vancouver, ISO, and other styles
18

Toopsuwan, Chalita. "Algorithms and combinatorics of repetitions in strings." Thesis, King's College London (University of London), 2017. https://kclpure.kcl.ac.uk/portal/en/theses/algorithms-and-combinatorics-of-repetitions-in-strings(a20776fa-6a15-4c37-bf87-505211309fd7).html.

Full text
Abstract:
Repetitions in strings constitute one of the most fundamental areas of string combinatorics, with essential applications to text algorithms, data compression, and the analysis of biological sequences. The subject is closely related to periodicities, regularities, and compression: higher compression rates can be obtained from the repetitive behaviour of strings, and conversely some compression techniques are at the core of fast algorithms for detecting repetitions. Repetitions are highly periodic factors (or substrings) of strings; they come in various types, such as repeats, repetitions, squares, cubes, palindromes, and maximal periodicities, also called runs. This thesis concentrates on repetitions in strings from both algorithmic and combinatorial viewpoints, as they are intricate objects and many interesting questions about them remain open. The first part of the thesis is devoted to maximal periodicities, or runs, together with local periods and critical factorisations. An algorithm is designed to compute all runs of a string drawn from an infinite alphabet. On a string of length n, the algorithm runs in optimal O(n log n) time, while the number of runs is linear. The key model of computation is the comparison of letters, done with the equality operator only. Under the same assumptions, another time-optimal algorithm is designed, computing within the same running time the local periods and all critical factorisations. The prefix table of the input string is the main tool of these algorithms. We also design a simpler algorithm based on the Dictionary of Basic Factors of the input string. The thesis then turns to gapped palindromes and their anti-exponents. A palindrome is a string x = a1 · · · an that is equal to its reversal x̃ = an · · · a1.
A gapped palindrome is a string of the form uvũ, where u and v are strings, |v| ≥ 2, and ũ is the reversal of u. Mirroring the standard notion of string exponent, we define the anti-exponent of a gapped palindrome uvũ as the quotient of |uvũ| by |uv|. We describe an algorithm to compute the maximal anti-exponent of gapped palindromes occurring in an ordinary palindrome-free string. To compute maximal anti-exponents of factors in a palindrome-free string efficiently, we apply techniques based on the suffix automaton and the reversed Lempel-Ziv factorisation. The complexity analysis shows that the algorithm runs in linear time on a fixed-size alphabet. Repeats are also of central concern in text compression and pattern matching, so lastly the thesis studies repeats and their exponents. We design a linear-time algorithm to compute the maximal exponent of repeats occurring in an overlap-free string. The two main tools of the algorithm are a factorisation of the string and the suffix automaton of some of its factors. As a direct consequence, we obtain the pleasing result that the number of occurrences of repeats whose exponent is maximal in an overlap-free string is linear. Beyond the results presented here, many interesting questions about repetitions in strings remain open.
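The prefix table used as the main tool of the run-detection algorithms above can be sketched as follows; this is the standard Z-array construction (our own illustration in Python, not the thesis's code):

```python
def prefix_table(x):
    """pref[i] = length of the longest common prefix of x and x[i:]."""
    n = len(x)
    pref = [0] * n
    pref[0] = n
    l = r = 0  # [l, r) is the rightmost window known to match a prefix of x
    for i in range(1, n):
        k = 0
        if i < r:
            # reuse previously computed information inside the window
            k = min(pref[i - l], r - i)
        while i + k < n and x[k] == x[i + k]:
            k += 1
        pref[i] = k
        if i + k > r:
            l, r = i, i + k
    return pref
```

For example, `prefix_table("abacaba")` yields `[7, 0, 1, 0, 3, 0, 1]`; the whole table is computed in linear time using only equality comparisons of letters, matching the model of computation described above.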
APA, Harvard, Vancouver, ISO, and other styles
19

黎少斌 and Shiao-bun Lai. "Trading off time for space for the string matching problem." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1996. http://hub.hku.hk/bib/B31214216.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Lai, Shiao-bun. "Trading off time for space for the string matching problem /." Hong Kong : University of Hong Kong, 1996. http://sunzi.lib.hku.hk/hkuto/record.jsp?B18061795.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Rahman, Mohammad Sohel. "Fast and efficient algorithms on strings and sequences." Thesis, King's College London (University of London), 2007. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.617795.

Full text
Abstract:
The last few decades have witnessed the fascinating outgrowth of the field of String Algorithms. And, with the ever increasing mass of information and the rapid pace of its dissemination and sharing, the importance and relevance of these algorithms can be expected to grow even further in the near future. In this thesis, we have studied a number of interesting problems on strings and sequences and made an effort to devise efficient algorithms to solve them. The problems we have studied fall into two main categories, namely the String Matching Problems and the Longest Common Subsequence Problems. The first string matching problem handled in this thesis is the pattern matching problem under the don't-care paradigm. Don't-care pattern matching problems have been studied since 1974 and the quest for improved algorithms is still on. We present efficient algorithms for different versions of this problem. We also present efficient algorithms for a more general problem, namely the pattern matching problem with degenerate strings. Next, we consider another well-studied problem, namely the Swap Matching problem. We present a new graph-theoretic approach to model the problem, which opens a new and hitherto unexplored avenue to solve it. Then, using the model, we devise an efficient algorithm, which works particularly well for short patterns. Computer-assisted music analysis and music information retrieval have a number of tasks that can be related to stringology. To this end, we consider the problem of identifying rhythms in a musical text, which can be used in automatic song classification. Since there is no complete agreement on the definitions of the related features, we present a new paradigm to model this problem and devise an efficient algorithm to solve it. Next, we consider a number of relatively new interesting pattern matching problems from an indexing point of view.
We present efficient indexing schemes for the problem of pattern matching in given intervals, the property matching problem and the circular pattern matching problem. We conclude our study of string matching problems by focusing on devising an efficient data structure to index gapped factors. In the second part of this thesis, we study the longest common subsequence (LCS) problems and variants thereof. We first introduce a number of new LCS variants by applying the notion of gap-constraints. The new variants are motivated by practical applications from computational molecular biology and we present efficient algorithms to solve them. Then, using the techniques developed while solving these variants, we present efficient algorithms for the classic LCS problem and for another relatively new variant, namely, the Constrained LCS (CLCS) problem. Finally, we efficiently solve the LCS and CLCS problems for degenerate strings.
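As a point of reference for the LCS variants discussed above, the classic quadratic-time dynamic program (the textbook baseline, not the thesis's improved algorithms) can be sketched in Python:

```python
def lcs(a, b):
    """Longest common subsequence by dynamic programming, O(|a|*|b|)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # backtrack to recover one longest common subsequence
    out, i, j = [], m, n
    while i and j:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1]); i -= 1; j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return ''.join(reversed(out))
```

The gap-constrained and constrained LCS variants studied in the thesis modify the recurrence above; this plain version is included only to fix notation.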
APA, Harvard, Vancouver, ISO, and other styles
22

Hunt, Kieran. "A comparison of exact string search algorithms for deep packet inspection." Thesis, Rhodes University, 2018. http://hdl.handle.net/10962/60629.

Full text
Abstract:
Every day, computer networks throughout the world face a constant onslaught of attacks. To combat these, network administrators are forced to employ a multitude of mitigating measures. Devices such as firewalls and Intrusion Detection Systems are prevalent today and employ extensive Deep Packet Inspection to scrutinise each piece of network traffic. Systems such as these usually require specialised hardware to meet the demand imposed by high throughput networks. Hardware like this is extremely expensive and singular in its function. It is with this in mind that the string search algorithms are introduced. These algorithms have been proven to perform well when searching through large volumes of text and may be able to perform equally well in the context of Deep Packet Inspection. String search algorithms are designed to match a single pattern to a substring of a given piece of text. This is not unlike the heuristics employed by traditional Deep Packet Inspection systems. This research compares the performance of a large number of string search algorithms during packet processing. Deep Packet Inspection places stringent restrictions on the reliability and speed of the algorithms due to increased performance pressures. A test system had to be designed in order to properly test the string search algorithms in the context of Deep Packet Inspection. The system allowed for precise and repeatable tests of each algorithm and then for their comparison. Of the algorithms tested, the Horspool and Quick Search algorithms posted the best results for both speed and reliability. The Not So Naive and Rabin-Karp algorithms were slowest overall.
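For illustration, the Horspool algorithm that performed best in these tests can be sketched as follows; this is the standard textbook formulation (our sketch, not the author's test system):

```python
def horspool(pattern, text):
    """Boyer-Moore-Horspool: return all start offsets of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or n < m:
        return []
    # bad-character rule: shift distance when the window ends with
    # character c (characters absent from the pattern allow a full shift)
    shift = {pattern[i]: m - 1 - i for i in range(m - 1)}
    hits, pos = [], 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        pos += shift.get(text[pos + m - 1], m)
    return hits
```

The single-table shift rule keeps the preprocessing tiny, one reason Horspool-style algorithms suit the per-packet pattern scans of Deep Packet Inspection.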
APA, Harvard, Vancouver, ISO, and other styles
23

Pockrandt, Christopher Maximilian [Verfasser]. "Approximate String Matching : Improving Data Structures and Algorithms / Christopher Maximilian Pockrandt." Berlin : Freie Universität Berlin, 2019. http://d-nb.info/1183675879/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Baker, Audrey. "An algorithm for the strong freeness of quadratic lie polynomials /." Thesis, McGill University, 2006. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=100761.

Full text
Abstract:
The Lie algebra associated to the lower central and p-central series of a group is an important invariant of the group but is difficult to compute. For a finitely presented group this Lie algebra can be determined under a certain condition on the initial forms of the relators, namely that of strong freeness. We give an algorithm for the strong freeness of 4 quadratic Lie polynomials in 4 variables over an arbitrary field, thus extending a result of Bush and Labute.
APA, Harvard, Vancouver, ISO, and other styles
25

Yahiaoui, Said. "Algorithmes et applications pour la coloration et les alliances dans les graphes." Thesis, Lyon 1, 2013. http://www.theses.fr/2013LYO10274.

Full text
Abstract:
In this thesis, we are interested in the algorithmic aspects and applications of two graph problems, namely colouring and alliances. The first part concerns two variants of graph colouring, the Grundy colouring and the strict strong colouring. We begin with the study of the Grundy number of regular graphs. We give a sufficient condition for d-regular graphs with sufficiently large girth to have Grundy number d + 1; then, for any fixed k, we provide a necessary and sufficient condition for the Grundy number of a regular graph to be at least k. We characterise the class of cubic (3-regular) graphs whose Grundy number is 4, and we present a linear-time algorithm to determine the Grundy number of an arbitrary cubic graph. Furthermore, building on the strict strong colouring to decompose trees into small components, we present a new algorithm for matching unrooted, unordered labelled trees. We show that the distance computed between two trees is a pseudo-metric. Our experiments on large synthetic and real databases confirm our analytical results and show that the proposed distance is accurate and its algorithm is scalable. The second part of this thesis is devoted to alliances in graphs. We propose a self-stabilising distributed algorithm for constructing a minimal global offensive alliance in an arbitrary graph. We show that this algorithm converges under the synchronous daemon in linear time. We then give the first self-stabilising distributed algorithm for the minimal global powerful alliance problem in an arbitrary graph. We prove that this algorithm is polynomial under the unfair distributed daemon. We then show how it can be adapted to generalisations of the problem, such as the powerful k-alliance and the weighted powerful alliance.
Finally, building on the structural properties of the offensive alliance, we present a solution for decentralising the SIP signalling protocol, making its deployment possible in a mobile ad hoc network.
This thesis investigates the algorithmic aspects and applications of two graph problems, namely, colorings and alliances. In the first part, we focus on two variants of the proper vertex coloring, the Grundy coloring and the strict strong coloring. We start with the study of the Grundy number of regular graphs. We give a sufficient condition for d-regular graphs with sufficiently large girth to have Grundy number equal to d + 1. Then, using graph homomorphism, we obtain a necessary and sufficient condition for d-regular graphs to have Grundy number at least k. Moreover, we characterize cubic graphs (3-regular) for which the Grundy number is d + 1, and present a linear-time algorithm to determine the Grundy number of any arbitrary cubic graph. Subsequently, based on the strict strong coloring, we present an approach for the problem of matching labeled trees. Using this coloring, we propose a new algorithm to deterministically decompose a tree into small components. This leads to an efficient algorithm to measure an accurate distance between unrooted unordered labeled trees. The second part is devoted to the alliances in graphs. We first propose a linear-time self-stabilizing algorithm for the minimal global offensive alliance set problem, under the synchronous distributed scheduler. Then, we give the first self-stabilizing algorithm for the minimal global powerful alliance set problem in arbitrary graphs. Moreover, we show how this algorithm can be adapted to find the minimal global powerful k-alliance and the minimal weighted global powerful alliance sets. We prove that all these algorithms converge in polynomial time under the unfair distributed scheduler. Finally, based on the structural properties of the offensive alliance, we propose a solution to decentralize the signaling protocol SIP. This enables SIP applications in mobile ad hoc networks.
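For context, the Grundy number studied above is the worst case, over all vertex orders, of the number of colours used by greedy (first-fit) colouring. A minimal Python sketch of first-fit colouring (our own illustration, with hypothetical names):

```python
def first_fit_coloring(adj, order):
    """Greedy colouring: each vertex receives the smallest colour not
    used by its already-coloured neighbours. The Grundy number of a
    graph is the maximum number of colours over all vertex orders."""
    colour = {}
    for v in order:
        used = {colour[u] for u in adj[v] if u in colour}
        c = 0
        while c in used:
            c += 1
        colour[v] = c
    return colour
```

On the path 0-1-2-3, the order [0, 3, 1, 2] forces three colours while the natural order needs only two, which is why the Grundy number can exceed the chromatic number.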
APA, Harvard, Vancouver, ISO, and other styles
26

Sawada, Joseph James. "Fast algorithms to generate restricted classes of strings under rotation." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0015/NQ48228.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Magnanti, Thomas L., and Rita Vachani. "A Strong Cutting Plane Algorithm for Production Scheduling with Changeover Costs." Massachusetts Institute of Technology, Operations Research Center, 1987. http://hdl.handle.net/1721.1/5192.

Full text
Abstract:
Changeover costs (and times) are central to numerous manufacturing operations. These costs arise whenever work centers capable of processing only one product at a time switch from the manufacture of one product to another. Although many researchers have contributed to the solution of scheduling problems that include changeover costs, due to the problem's combinatorial explosiveness, optimization-based methods have met with limited success. In this paper, we develop and apply polyhedral methods from integer programming for a dynamic version of the problem. Computational tests with problems containing one to five products (and up to 225 integer variables) show that polyhedral methods based upon a set of facet inequalities developed in this paper can effectively reduce the gap between the value of an integer program formulation of the problem and its linear programming relaxation (by a factor of 94 to 100 per cent). These results suggest the use of a combined cutting plane/branch and bound procedure as a solution approach. In a test with a five product problem, this procedure, when compared with a standard linear programming-based branch and bound approach, reduced computation time by a factor of seven.
APA, Harvard, Vancouver, ISO, and other styles
28

Yim, Cheuk-hon Terence. "Approximate string alignment and its application to ESTs, mRNAs and genome mapping." Click to view the E-thesis via HKUTO, 2004. http://sunzi.lib.hku.hk/hkuto/record/B31455736.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Samir, Uzzaman Mohammad. "Algorithms and optimized implementations in context of circular string and relevant web security." Thesis, King's College London (University of London), 2017. https://kclpure.kcl.ac.uk/portal/en/theses/algorithms-and-optimized-implementations-in-context-of-circular-string-and-relevant-web-security(3f33c6b0-db65-4a39-baa3-d241963deb1e).html.

Full text
Abstract:
Ammonia plays a central role in the pathogenesis of cerebral oedema in paracetamol-induced acute liver failure (PALF). Infection and inflammation play an important synergistic role in its development. Toll-like receptors (TLRs) sense pathogens and induce inflammation but whether this contributes to the development of cerebral oedema in PALF remains unknown. I postulated that ammonia-induced cerebral oedema and immune dysfunction are mediated by TLR9 and aimed to determine whether this could be prevented in a hyperammonemic TLR9 knockout mouse model. TLR9 expression on circulating neutrophils and their function in PALF was assessed. To examine the influence of PALF plasma and endogenous DNA on TLR9 expression, healthy neutrophils were incubated with PALF plasma with/without DNase. Ammonium acetate (NH4-Ac) was injected intraperitoneally in wild type Black6 (WT-B6), TLR9-/- B6 mice and TLR9fl/fl LysCre B6 mice with TLR9 deleted from neutrophils and macrophages. The TLR9 antagonist ODN2088 was also evaluated. Neutrophil TLR9 correlated with plasma IL-8 and ammonia concentration and increased with severity of hepatic encephalopathy and systemic inflammation. Healthy neutrophil TLR9 expression increased upon stimulation with PALF plasma which was abrogated by pre-incubation with DNase. Following NH4-Ac stimulation, intracellular cytokine (IFN-γ, TNF-α and IL-6) production of lymphocytes and macrophages were increased in WT-B6 mice compared to controls. This was accompanied by increased brain water however in TLR9-/-, cytokine production and brain water content were decreased. This was seen similarly in WT-B6 administered the TLR9 antagonist ODN2088 in conjunction with NH4-Ac. TLR9fl/fl LysCre mice had decreased cytokine production and brain water compared to the TLR9fl/fl group following NH4-Ac injection. Total DNA levels were increased in the circulation after NH4-Ac injection. 
In summary, ammonia-induced cerebral oedema and immune dysfunction are mediated through TLR9 and DNA dependent. The amelioration of brain oedema and lymphocyte cytokine production by ODN2088 supports exploration of TLR9 antagonism in early PALF to prevent progression to cerebral oedema.
APA, Harvard, Vancouver, ISO, and other styles
30

Kotwal, Mithilesh N. "Encoding of trellises with strong tailbiting property." Ohio : Ohio University, 2005. http://www.ohiolink.edu/etd/view.cgi?ohiou1113584078.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Bavirisetty, Rambabu 1963. "COMPARISON OF STRESS RECOVERY ALGORITHMS." Thesis, The University of Arizona, 1987. http://hdl.handle.net/10150/276405.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Toth, Róbert. "Přibližné vyhledávání řetězců v předzpracovaných dokumentech." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2014. http://www.nusl.cz/ntk/nusl-236122.

Full text
Abstract:
This thesis deals with the problem of approximate string matching, also called string matching allowing errors. The thesis targets the area of offline algorithms, which allow very fast pattern matching thanks to an index created during an initial text preprocessing phase. Initially, we define the problem itself and demonstrate a variety of its applications, followed by a short survey of different approaches to cope with this problem. Several existing algorithms based on suffix trees are explained in detail and a new hybrid algorithm is proposed. The algorithms are implemented in the C programming language and thoroughly compared in a series of experiments, with a focus on the newly presented algorithm.
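For contrast with the indexed (offline) methods surveyed in the thesis, the standard online baseline for string matching allowing errors is Sellers' dynamic programming; a Python sketch of ours, not the thesis code:

```python
def approx_find(pattern, text, k):
    """Sellers' algorithm: report end positions j such that some substring
    of text ending at position j matches pattern with edit distance <= k."""
    m = len(pattern)
    col = list(range(m + 1))  # D[i][0] = i: cost of matching nothing
    hits = []
    for j, c in enumerate(text, 1):
        prev, col[0] = col[0], 0  # D[0][j] = 0: a match may start anywhere
        for i in range(1, m + 1):
            cur = min(col[i] + 1,                    # insertion into pattern
                      col[i - 1] + 1,                # deletion from pattern
                      prev + (pattern[i - 1] != c))  # match / substitution
            prev, col[i] = col[i], cur
        if col[m] <= k:
            hits.append(j)
    return hits
```

This runs in O(|pattern| × |text|) per query; the suffix-tree indexes studied in the thesis pay a preprocessing cost once so that each query avoids scanning the whole text.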
APA, Harvard, Vancouver, ISO, and other styles
33

Klaib, Ahmad. "Exact string matching algorithms for searching DNA and protein sequences and searching chemical databases." Thesis, University of Huddersfield, 2014. http://eprints.hud.ac.uk/id/eprint/24266/.

Full text
Abstract:
The enormous quantities of biological and chemical files and databases are likely to grow year on year, consequently giving rise to the need to develop string matching algorithms capable of minimizing the searching response time. Being aware of this need, this thesis aims to develop string matching algorithms to search biological sequences and chemical structures by studying exact string matching algorithms in detail. As a result, this research developed a new classification of string matching algorithms containing eight categories according to the pre-processing function of the algorithms, and proposed five new string matching algorithms: BRBMH, BRQS, the Odd and Even algorithm (OE), the Random String Matching algorithm (RSMA) and the Skip Shift New algorithm (SSN). The main purpose behind the proposed algorithms is to reduce the searching response time and the total number of comparisons. They are tested by comparing them with four well-known standard algorithms: Boyer-Moore-Horspool (BMH), Quick Search (QS), TVSBS and BRFS. This research applied all of the algorithms to sample data files by implementing three types of tests. The comparison-count tests showed a substantial difference between the number of comparisons our algorithms use and that of non-hybrid algorithms such as QS and BMH. In addition, the tests showed a considerable difference between our algorithms and hybrid algorithms such as TVSBS and BRFS. For instance, the average elapsed search time tests showed that our algorithms presented better average elapsed search times than the BRFS, TVSBS, QS and BMH algorithms, while the average number of attempts was also better than that of the BMH, QS, TVSBS and BRFS algorithms. A new contribution has been added by this research by using the fastest proposed algorithm, the SSN algorithm, to develop a chemical structure searching toolkit to search chemical structures in our local database.
The new algorithms were parallelised using the OpenMP and MPI parallel models and tested at the University of Science Malaysia (USM) on a Stealth cluster with different numbers of threads and processors to improve the speed of searching for a pattern in a given text, which, as we believe, is another contribution.
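Two of the evaluation metrics above, the number of character comparisons and the per-window attempts, can be illustrated with the standard Quick Search algorithm instrumented to count comparisons (a sketch of ours, not the thesis implementation):

```python
def quick_search(pattern, text):
    """Sunday's Quick Search; returns (match offsets, comparison count).
    The shift is decided by the character just past the current window."""
    m, n = len(pattern), len(text)
    shift = {c: m - i for i, c in enumerate(pattern)}  # last occurrence
    hits, comparisons, pos = [], 0, 0
    while pos <= n - m:
        i = 0
        while i < m:
            comparisons += 1
            if text[pos + i] != pattern[i]:
                break
            i += 1
        if i == m:
            hits.append(pos)
        if pos + m >= n:
            break
        pos += shift.get(text[pos + m], m + 1)
    return hits, comparisons
```

Counting comparisons alongside wall-clock time, as in the tests above, separates algorithmic work from implementation and cache effects.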
APA, Harvard, Vancouver, ISO, and other styles
34

Bingmann, Timo [Verfasser], and P. [Akademischer Betreuer] Sanders. "Scalable String and Suffix Sorting: Algorithms, Techniques, and Tools / Timo Bingmann ; Betreuer: P. Sanders." Karlsruhe : KIT-Bibliothek, 2018. http://d-nb.info/1164081144/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Nosek, Ondřej. "Hardwarová akcelerace algoritmu pro hledání podobnosti dvou DNA řetězců." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2007. http://www.nusl.cz/ntk/nusl-236882.

Full text
Abstract:
Methods for approximate string matching of the various sequences used in bioinformatics are a crucial part of development in this branch. The tasks are of very large time complexity, and we therefore want to create a hardware platform to accelerate these computations. The goal of this work is to design a generalized architecture based on FPGA technology that can work with various types of sequences. The designed accelerator card will use especially dynamic-programming algorithms such as Needleman-Wunsch and Smith-Waterman.
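The Smith-Waterman recurrence targeted by such accelerators can be sketched in software as follows (score-only, linear gap penalty; the scoring parameters are illustrative, not those of the thesis):

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman local alignment score via O(|a|*|b|) dynamic
    programming, keeping only one column-row of the matrix at a time."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # local alignment: scores are clamped at zero
            cur[j] = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best
```

The anti-diagonal independence of this recurrence is what makes it a natural fit for FPGA systolic-array implementations of the kind the thesis designs.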
APA, Harvard, Vancouver, ISO, and other styles
36

Martin, Florent de. "Influence of the nonlinear behaviour of soft soils on strong ground motions." Thesis, Châtenay-Malabry, Ecole centrale de Paris, 2010. http://www.theses.fr/2010ECAP0013/document.

Full text
Abstract:
The nonlinear behaviour of soils observed during strong seismic motions is now well established, and the deployment of vertical accelerometric arrays has enabled detailed analyses of wave propagation as well as a quantitative evaluation of physical parameters such as shear- and compression-wave velocities and damping factors as functions of strain. Despite the growing number of studies of this phenomenon, knowledge of it is still recent, and research on vertical-array data remains an important step towards understanding the complex in-situ behaviour of sediments subjected to strong seismic motions. The objective of this work is threefold. First, an inversion code based on a genetic algorithm is developed in order to invert vertical-array data via the Thomson-Haskell propagator matrix theory. This technique first allows us to validate the one-dimensional (1D) structure (e.g., shear-wave velocities, damping factors) of a vertical array in the linear domain, and then to give quantitative evidence of the nonlinear behaviour of the sediments during the 2005 Fukuoka earthquake, Japan. Second, the results of the inversion are used to test simple and advanced constitutive laws using the Finite Element Method. The results clearly show that the bilinear assumption of the simple constitutive law produces unrealistic velocity and acceleration time histories. Using an advanced constitutive law leads to better results; however, the number of parameters to be tuned in order to obtain results consistent with the observations is an unavoidable obstacle.
Third, in order to extend the study of site effects to higher dimensions, 2D and 3D codes implementing the Spectral Element Method are developed and validated by comparing their results in the linear domain with those obtained theoretically or via other numerical methods.
Nonlinear behavior of soft soils observed during strong ground motions is now well established, and the deployment of vertical arrays (i.e., borehole stations) has contributed to detailed wave propagation analyses and the assessment of quantitative physical parameters such as shear-wave velocity, pressure-wave velocity and damping factors with respect to shear strain levels. Despite the growing number of studies on this phenomenon, its knowledge is still recent and research on borehole station data remains an important step toward the understanding of the complex in-situ behavior of soft sediments subjected to strong ground motions. The purpose of this work is threefold. First, an inversion code by genetic algorithm is developed in order to invert borehole station data via the Thomson-Haskell propagator matrix method. This technique allows us to validate the one-dimensional (1D) structure (e.g., shear-wave velocity, damping factors) of a borehole in the linear elastic domain and to show quantitative evidence of the nonlinear behavior of the soft sediments during the 2005 Fukuoka Prefecture western offshore earthquake, Japan. Second, the results of the inversion are used in order to test simple and advanced constitutive laws using the Finite Element Method. The results clearly show that the bi-linear assumption of the simple constitutive law produces unrealistic velocity and acceleration time histories. The use of the advanced constitutive law leads to better results; however, the number of parameters to be tuned in order to obtain results consistent with the observation is an unavoidable obstacle. Third, in order to extend the study of site effects to higher dimensions, 2D and 3D codes of the very efficient Spectral Element Method are developed and validated by comparing their results in the linear domain with those obtained theoretically or with other numerical methods.
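The genetic-algorithm inversion described above can be illustrated, in spirit, by a minimal real-coded GA minimising a misfit function; everything here (operators, parameters, names) is our own simplified sketch, not the thesis code:

```python
import random

def genetic_minimise(misfit, bounds, pop=40, gens=60, mut=0.1):
    """Minimal real-coded genetic algorithm: elitism, truncation
    selection, uniform crossover, clamped Gaussian mutation.
    `misfit` maps a parameter vector to a value to minimise."""
    rand = lambda: [random.uniform(lo, hi) for lo, hi in bounds]
    popn = [rand() for _ in range(pop)]
    for _ in range(gens):
        scored = sorted(popn, key=misfit)
        nxt = scored[:2]  # elitism: always keep the two best
        while len(nxt) < pop:
            p, q = random.sample(scored[:pop // 2], 2)  # top half breeds
            child = [a if random.random() < 0.5 else b for a, b in zip(p, q)]
            child = [min(max(g + random.gauss(0, mut * (hi - lo)), lo), hi)
                     if random.random() < mut else g
                     for g, (lo, hi) in zip(child, bounds)]
            nxt.append(child)
        popn = nxt
    return min(popn, key=misfit)
```

In the inversion setting, the parameter vector would hold layer velocities and damping factors, and the misfit would compare observed and Thomson-Haskell-predicted borehole spectra.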
APA, Harvard, Vancouver, ISO, and other styles
37

Yim, Cheuk-hon Terence, and 嚴卓漢. "Approximate string alignment and its application to ESTs, mRNAs and genome mapping." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2004. http://hub.hku.hk/bib/B31455736.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Vishwanathan, S. V. N. "Kernel Methods Fast Algorithms and real life applications." Thesis, Indian Institute of Science, 2003. http://hdl.handle.net/2005/49.

Full text
Abstract:
Support Vector Machines (SVM) have recently gained prominence in the field of machine learning and pattern classification (Vapnik, 1995, Herbrich, 2002, Scholkopf and Smola, 2002). Classification is achieved by finding a separating hyperplane in a feature space, which can be mapped back onto a non-linear surface in the input space. However, training an SVM involves solving a quadratic optimization problem, which tends to be computationally intensive. Furthermore, it can be subject to stability problems and is non-trivial to implement. This thesis proposes an fast iterative Support Vector training algorithm which overcomes some of these problems. Our algorithm, which we christen Simple SVM, works mainly for the quadratic soft margin loss (also called the l2 formulation). We also sketch an extension for the linear soft-margin loss (also called the l1 formulation). Simple SVM works by incrementally changing a candidate Support Vector set using a locally greedy approach, until the supporting hyperplane is found within a finite number of iterations. It is derived by a simple (yet computationally crucial) modification of the incremental SVM training algorithms of Cauwenberghs and Poggio (2001) which allows us to perform update operations very efficiently. Constant-time methods for initialization of the algorithm and experimental evidence for the speed of the proposed algorithm, when compared to methods such as Sequential Minimal Optimization and the Nearest Point Algorithm are given. We present results on a variety of real life datasets to validate our claims. In many real life applications, especially for the l2 formulation, the kernel matrix K є R n x n can be written as K = Z T Z + Λ , where, Z є R n x m with m << n and Λ є R n x n is diagonal with nonnegative entries. Hence the matrix K - Λ is rank-degenerate, Extending the work of Fine and Scheinberg (2001) and Gill et al. 
(1975) we propose an efficient factorization algorithm which can be used to find a L D LT factorization of K in 0(nm2) time. The modified factorization, after a rank one update of K, can be computed in 0(m2) time. We show how the Simple SVM algorithm can be sped up by taking advantage of this new factorization. We also demonstrate applications of our factorization to interior point methods. We show a close relation between the LDV factorization of a rectangular matrix and our LDLT factorization (Gill et al., 1975). An important feature of SVM's is that they can work with data from any input domain as long as a suitable mapping into a Hilbert space can be found, in other words, given the input data we should be able to compute a positive semi definite kernel matrix of the data (Scholkopf and Smola, 2002). In this thesis we propose kernels on a variety of discrete objects, such as strings, trees, Finite State Automata, and Pushdown Automata. We show that our kernels include as special cases the celebrated Pair-HMM kernels (Durbin et al., 1998, Watkins, 2000), the spectrum kernel (Leslie et al., 20024, convolution kernels for NLP (Collins and Duffy, 2001), graph diffusion kernels (Kondor and Lafferty, 2002) and various other string-matching kernels. Because of their widespread applications in bio-informatics and web document based algorithms, string kernels are of special practical importance. By intelligently using the matching statistics algorithm of Chang and Lawler (1994), we propose, perhaps, the first ever algorithm to compute string kernels in linear time. This obviates dynamic programming with quadratic time complexity and makes string kernels a viable alternative for the practitioner. We also propose extensions of our string kernels to compute kernels on trees efficiently. This thesis presents a linear time algorithm for ordered trees and a log-linear time algorithm for unordered trees. 
In general, SVMs require prediction time proportional to the number of Support Vectors. When the dataset is noisy, a large fraction of the data points become Support Vectors and thus the time required for prediction increases. But in many applications, such as search engines or web document retrieval, the dataset is noisy yet the speed of prediction is critical. We propose a method for string kernels by which the prediction time can be reduced to linear in the length of the sequence to be classified, regardless of the number of Support Vectors. We achieve this by using a weighted version of our string kernel algorithm. We explore the relationship between dynamic systems and kernels. We define kernels on various kinds of dynamic systems including Markov chains (both discrete and continuous), diffusion processes on graphs and Markov chains, Finite State Automata, various linear time-invariant systems, etc. Trajectories are used to define kernels on the initial conditions of the underlying dynamic system. The same idea is extended to define kernels on a dynamic system with respect to a set of initial conditions. This framework leads to a large number of novel kernels and also generalizes many previously proposed kernels. Lack of adequate training data is a problem which plagues classifiers. We propose a new method to generate virtual training samples in the case of handwritten digit data. Our method uses the two-dimensional suffix tree representation of a set of matrices to encode an exponential number of virtual samples in linear space, thus leading to an increase in classification accuracy. This, in turn, leads us naturally to a compact data-dependent representation of a test pattern, which we call the description tree. We propose a new kernel for images and demonstrate a quadratic time algorithm for computing it by using the suffix tree representation of an image. 
We also describe a method to reduce the prediction time to quadratic in the size of the test image by using techniques similar to those used for string kernels.
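To make the flavour of string kernels concrete, here is a minimal p-spectrum kernel: the inner product of the vectors counting every length-p substring (p-mer) of two strings. This naive hashed version is only an illustration, not the thesis's linear-time suffix-tree construction.

```python
from collections import Counter

def p_spectrum_kernel(s, t, p):
    """p-spectrum kernel: inner product of p-mer count vectors."""
    cs = Counter(s[i:i + p] for i in range(len(s) - p + 1))
    ct = Counter(t[i:i + p] for i in range(len(t) - p + 1))
    # Sum over p-mers occurring in s; absent keys contribute zero.
    return sum(cs[w] * ct[w] for w in cs)
```

For example, with p = 2 the strings "abab" and "ab" share the single 2-mer "ab", which occurs twice in the first string, giving a kernel value of 2.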
APA, Harvard, Vancouver, ISO, and other styles
39

Robinson, Matthew Brandon Cleaver Gerald B. "Towards a systematic investigation of weakly coupled free fermionic heterotic string gauge group statistics." Waco, Tex. : Baylor University, 2009. http://hdl.handle.net/2104/5358.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Oliveira, Rafael Massambone de. "String-averaging incremental subgradient methods for constrained convex optimization problems." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-14112017-150512/.

Full text
Abstract:
In this doctoral thesis, we propose new iterative methods for solving a class of convex optimization problems. In general, we consider problems in which the objective function is composed of a finite sum of convex functions and the set of constraints is, at least, convex and closed. The iterative methods we propose are basically designed through the combination of incremental subgradient methods and string-averaging algorithms. Furthermore, in order to obtain methods able to solve optimization problems with many constraints (and possibly in high dimensions), generally given by convex functions, our analysis includes an operator that calculates approximate projections onto the feasible set, instead of the Euclidean projection. This feature is employed in the two methods we propose; one deterministic and the other stochastic. A convergence analysis is proposed for both methods and numerical experiments are performed in order to verify their applicability, especially in large scale problems.
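A minimal sketch of the plain incremental subgradient scheme that such methods build on, for minimizing a finite sum of convex functions over a convex set. This omits the string-averaging and approximate-projection machinery of the thesis; all names are illustrative.

```python
import numpy as np

def incremental_subgradient(subgrads, project, x0, cycles, alpha):
    """Incremental subgradient method for min sum_i f_i(x) over a
    convex set: cycle through the component subgradients, taking a
    step for each and projecting back onto the feasible set."""
    x = np.asarray(x0, dtype=float)
    for _ in range(cycles):
        for g in subgrads:  # one incremental pass over the components
            x = project(x - alpha * g(x))
    return x
```

With a constant step size the iterates only reach a neighbourhood of the solution set, which is why diminishing step-size rules are the usual choice in convergence analyses.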
APA, Harvard, Vancouver, ISO, and other styles
41

Barton, Carl Samuel. "Algorithmic problems in strings with applications to the analysis of biological sequences." Thesis, King's College London (University of London), 2015. http://kclpure.kcl.ac.uk/portal/en/theses/algorithmic-problems-in-strings-with-applications-to-the-analysis-of-biological-sequences(461c8961-c256-4ff8-97f7-c0718709367d).html.

Full text
Abstract:
Recent advances in molecular biology have dramatically changed the way biological data analysis is performed [119, 92]. Next-Generation Sequencing (NGS) technologies produce high-throughput data of highly controlled quality, hundreds of times faster and cheaper than a decade ago. Mapping of short reads to a reference sequence is a fundamental problem in NGS technologies. After finding an occurrence of a high-quality fragment of the read, the rest must be approximately aligned, but a good alignment would not be expected to contain a large number of gaps (consecutive insertions or deletions). We present an alternative alignment algorithm which computes the optimal alignment with a bounded number of gaps. Another problem arising from NGS technologies is merging overlapping reads into a single string. We present a data structure which allows for the efficient computation of the overlaps between two strings, as well as being applicable to other problems. Weighted strings are a representation of data that allows for a subtle representation of ambiguity in strings. In this document we present algorithms for further problems related to weighted strings: the computation of exact and approximate inverted repeats in weighted strings, computing repetitions and computing covers. We investigate the average-case complexity of wildcard matching. Wildcards can be used to model single nucleotide polymorphisms, and so efficient algorithms to search for strings with wildcards are necessary. In this document we investigate how efficient algorithms for this problem can be on average. There exist many organisms, such as viruses, bacteria, eukaryotic cells, and archaea, which have a circular DNA structure. If a biologist wishes to find occurrences of a particular virus in a carrier's DNA sequence, which may not be circular, it must be possible to efficiently locate occurrences of circular strings. In this document we present a number of algorithms for circular string matching.
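For the circular string matching problem mentioned at the end, one simple quadratic-time sketch exploits the classical fact that every rotation of a pattern P is exactly a length-|P| substring of P + P. The thesis's algorithms are far more efficient; this only illustrates the problem.

```python
def circular_matches(text, pattern):
    """Start positions in text where some rotation of pattern occurs.
    A window is a rotation of P iff it has length |P| and is a
    substring of the doubled pattern P + P."""
    m = len(pattern)
    doubled = pattern + pattern
    return [i for i in range(len(text) - m + 1)
            if text[i:i + m] in doubled]
```

For example, searching for "abc" circularly in "xcabx" reports position 1, where the rotation "cab" occurs.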
APA, Harvard, Vancouver, ISO, and other styles
42

Jackson, Robert Charles. "Data detection algorithms for perpendicular magnetic recording in the presence of strong media noise." Thesis, University of Warwick, 2008. http://wrap.warwick.ac.uk/851/.

Full text
Abstract:
As the throughput and density requirements increase for perpendicular magnetic recording channels, the presence of strong media noise degrades performance. Detection algorithms have been developed that increase performance in channels with strong media noise through the use of data-dependent detectors. However, optimal data-dependent detectors are exponentially more complex than data-independent detectors, and therefore cannot be fully exploited. In this thesis we discuss the existing detection algorithms, comparing their performance against their complexity. We then introduce a new sub-optimal detection algorithm, which employs a simple pre-detector that supplies estimates to a main detector. Numerical simulations are performed which show near-optimal performance, but without the exponential increase in complexity. We also show how detector implementations can exploit structure in the trellis to further reduce complexity, through loops and path invariants. An analytical means of measuring bit error rate from only the statistics of the noise is presented, and this is utilised to optimally determine the equaliser and ISI target coefficients for a white-noise Viterbi detector. Finally, we introduce a new class of VLSI binary addition algorithms which can be utilised to increase the throughput of a Viterbi detector, but which also have wider application in hardware design.
APA, Harvard, Vancouver, ISO, and other styles
43

Foster, Carol Lynn. "Algorithms, abstraction and implementation : a massively multilevel theory of strong equivalence of complex systems." Thesis, University of Edinburgh, 1991. http://hdl.handle.net/1842/6591.

Full text
Abstract:
This thesis puts forward a formal theory of levels and algorithms to provide a foundation for those terms as they are used in much of cognitive science and computer science. Abstraction with respect to concreteness is distinguished from abstraction with respect to detail, resulting in three levels of concreteness and a large number of algorithmic levels, which are levels of detail and the primary focus of the theory. An algorithm or ideal machine is a set of sequences of states defining a particular level of detail. Rather than one fundamental ideal machine to describe the behaviour of a complex system, there are many possible ideal machines, extending Turing's approach to reflect the multiplicity of system descriptions required to express more than weak input-output equivalence of systems. Cognitive science is concerned with stronger equivalence; e.g., do two models go through the same states at some level of description? The state-based definition of algorithms serves as a basis for such strong equivalence and facilitates formal renditions of abstraction and implementation as relations between algorithms. It is possible to prove within the new framework whether or not one given algorithm is a valid implementation of another, or whether two unequal algorithms have a common abstraction, for example. Some implications of the theory are discussed, notably a characterisation of connectionist versus classical models.
APA, Harvard, Vancouver, ISO, and other styles
44

Lee-Cultura, Serena Glyn. "An FPT Algorithm for STRING-TO-STRING CORRECTION." Thesis, 2011. http://hdl.handle.net/1828/3498.

Full text
Abstract:
Parameterized string correction decision problems investigate the possibility of transforming a given string X into a target string Y using a fixed number of edit operations, k. There are four possible edit operations: swap, delete, insert and substitute. In this work we consider the NP-complete STRING-TO-STRING CORRECTION problem restricted to deletes and swaps and parameterized by the number of allowed operations. Specifically, the problem asks whether there exists a transformation from X into Y consisting of at most k deletes or swaps. We present a fixed-parameter algorithm that runs in O(2^k(k + m)) time, where m is the length of the destination string. Further, we present an implementation of an extended version of the algorithm that constructs the transformation sequence of length at most k, given its existence. This thesis concludes with a discussion comparing the practical run times obtained from our implementation with the proposed theoretical results. Efficient string correction algorithms have applications in several areas, for example computational linguistics, error detection and correction, and computational biology.
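The flavour of such a fixed-parameter branching algorithm can be sketched as follows: strip the common prefix, then branch on repairing the first mismatch by either a delete or an adjacent swap, decrementing the budget k. This is a simplified illustration with the same O(2^k) branching structure, not the thesis's algorithm itself.

```python
def correctable(x, y, k):
    """Can x be transformed into y with at most k operations,
    each a delete or an adjacent swap?"""
    if k < 0:
        return False
    # Strip the common prefix, then branch at the first mismatch.
    i = 0
    while i < len(x) and i < len(y) and x[i] == y[i]:
        i += 1
    x, y = x[i:], y[i:]
    if not x and not y:
        return True
    if k == 0 or not x:
        return False  # out of budget, or x cannot be lengthened
    # Branch 1: delete the first mismatched character of x.
    if correctable(x[1:], y, k - 1):
        return True
    # Branch 2: swap the first two characters of x (if they differ).
    if len(x) >= 2 and x[0] != x[1] and \
            correctable(x[1] + x[0] + x[2:], y, k - 1):
        return True
    return False
```

The recursion depth is bounded by k and each node does O(m) work on the prefix, which is where the O(2^k(k + m))-style bound comes from.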
Graduate
APA, Harvard, Vancouver, ISO, and other styles
45

Shih-hsin, Liang, and 梁仕炘. "A K-modular String Matching Algorithm." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/19890631579529954123.

Full text
Abstract:
Master's thesis
National Chi Nan University
Department of Computer Science and Information Engineering
98
Many string matching algorithms have been proposed. Among them, Boyer-Moore is fast, but its time complexity is difficult to analyze. The naive, or brute-force, algorithm is the simplest string matching method. Its time complexity is O(m*n) when searching with a pattern of length m in a text of length n. Although the Morris-Pratt (MP) and Knuth-Morris-Pratt (KMP) algorithms are very simple and their time complexity is O(n), every character must be checked at least once. The Colussi method, a most peculiar one, divides its process into two steps. The first step, using the KMP prefix function table, scans from left to right and leaps over some characters. The second step searches from right to left over the characters missed by the first step. This makes it faster than KMP, but its time analysis is not as simple as KMP's. Therefore we try to find a method which is both simple and easy to analyze. Our method also has two steps, similar to Colussi's. The first step compares characters from left to right at every k-th position; the second step, also starting from the left, checks the characters not yet compared. This method is as simple and easy to analyze as KMP, but its number of comparisons is lower than KMP's.
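The two-pass comparison order described above can be sketched with a naive verifier: first probe every k-th pattern position, and only if all of those match check the skipped positions. This illustrates the comparison order only; the actual algorithm also uses KMP-style shift tables to avoid re-examining the text.

```python
def k_modular_check(text, pattern, pos, k):
    """Two-pass occurrence check of pattern at text[pos:]."""
    m = len(pattern)
    # Pass 1: compare positions 0, k, 2k, ...
    if any(text[pos + i] != pattern[i] for i in range(0, m, k)):
        return False
    # Pass 2: compare the positions skipped by pass 1.
    return all(text[pos + i] == pattern[i]
               for i in range(m) if i % k != 0)

def k_modular_search(text, pattern, k=2):
    """All occurrences of pattern, verified in the k-modular order."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if k_modular_check(text, pattern, i, k)]
```

The benefit of this order is that a mismatch in the sparse first pass rejects a window after roughly m/k comparisons instead of up to m.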
APA, Harvard, Vancouver, ISO, and other styles
46

Hsu, Yu-Ling, and 徐玉玲. "An Efficient Real Scaled String Matching Algorithm." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/44140988949103322076.

Full text
Abstract:
Master's thesis
National Chi Nan University
Department of Computer Science and Information Engineering
96
Text searching is used in a wide variety of technologies, such as computer science, multimedia libraries, web search, etc. This thesis shows how to find all occurrences of a pattern that appear under real scales in a text. In our algorithm, we consider two cases of real scales: one in which the scale is greater than or equal to 1, and one in which it is less than or equal to 1. By using the Real Scaled Indexing Tree (RSIT), we can determine whether a substring under any scale matches the text or not. Assume that the lengths of the text and the pattern are n and m, respectively. The time complexity is O(n^3).
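As a much-simplified illustration of scaled matching, consider only integer scales, where scaling a pattern by k repeats each character k times. The thesis's RSIT handles arbitrary real scales; this brute-force sketch is merely to fix the notion of a scaled occurrence.

```python
def scaled(pattern, k):
    """Pattern scaled by integer k: each character repeated k times."""
    return ''.join(c * k for c in pattern)

def scaled_matches(text, pattern, max_scale):
    """All (position, scale) pairs for integer scales 1..max_scale."""
    hits = []
    for k in range(1, max_scale + 1):
        p = scaled(pattern, k)
        start = 0
        while True:
            j = text.find(p, start)
            if j < 0:
                break
            hits.append((j, k))
            start = j + 1
    return hits
```

For example, the pattern "abc" occurs in "aabbcc" at position 0 under scale 2 but nowhere under scale 1.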
APA, Harvard, Vancouver, ISO, and other styles
47

Liu, Kuei-Wen, and 劉貴文. "Some Results on Exact String Matching Algorithm." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/14472736499752714951.

Full text
Abstract:
Master's thesis
National Chi Nan University
Department of Computer Science and Information Engineering
96
In this paper, we introduce two algorithms in stringology. One solves the exact string matching problem and the other finds the string cycle (period of a string). In the exact string matching problem, we are given two strings T = t1t2...tn and P = p1p2...pm, and are asked to find all occurrences of P in T. Our searching method is based upon a single character rule. Consider a location i in P. Suppose that pi is aligned with tj and pi ≠ tj. We must then move P in such a way that some pk = tj becomes aligned with tj. In the thesis, we propose a method to find the optimal location i. We also modify a string matching approach, the wide window approach, which divides the text into n/m overlapping windows of size 2m-1. In each window, the approach attempts m possible occurrence positions in parallel. It first searches pattern suffixes from the middle to the right with a modified convolution method; if this fails it shifts the window directly, and otherwise it scans the corresponding prefixes backward, again with the modified convolution method. For the string period problem, we are given a string T and are asked to find the periods of T. Our algorithm is based upon the bit-parallel approach.
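One way to realize the single character rule with a fixed probe position i can be sketched as follows: probe text position s + i, and shift so the rightmost occurrence of that text character before position i in the pattern realigns with it, or jump past it entirely. This is only an illustrative reading of the rule; the thesis's contribution, choosing the optimal probe location i, is not reproduced here.

```python
def single_char_search(text, pattern, i):
    """All occurrences of pattern in text, scanning with a fixed
    probe position i and shifting by the single character rule."""
    m, n = len(pattern), len(text)
    last = {}
    for k in range(i):            # rightmost occurrence before i
        last[pattern[k]] = k
    hits, s = [], 0
    while s <= n - m:
        c = text[s + i]
        if c == pattern[i] and text[s:s + m] == pattern:
            hits.append(s)
        # Realign the rightmost pattern[k] == c with k < i under
        # text[s + i]; if none exists, jump past that text position.
        s += i - last[c] if c in last else i + 1
    return hits
```

Every skipped shift d would place a pattern character unequal to c under text position s + i, so no occurrence is missed.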
APA, Harvard, Vancouver, ISO, and other styles
48

Liu, Kuei-Wen. "Some Results on Exact String Matching Algorithm." 2008. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0020-2406200814333600.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Shieh, Y. K., and 謝一功. "An Improved Approximate String Matching Algorithm Based upon the Boyer-Moore Algorithm." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/66121462520038468689.

Full text
Abstract:
Master's thesis
National Chi Nan University
Department of Computer Science and Information Engineering
96
In this thesis, we discuss the exact string matching problem and the approximate string matching problem. We avoid using a brute-force algorithm to solve the string matching problem. We improve the bad character rule of the Boyer-Moore algorithm and compare it with the Horspool algorithm, and we find that the improved rule subsumes the information used by the Horspool algorithm. Therefore, we combine the improved rule with the Horspool algorithm. The combined method is more efficient in the searching phase.
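For reference, the Horspool algorithm discussed above always shifts by the distance from the last occurrence (excluding the final position) of the text character aligned under the pattern's last position. A standard textbook sketch, not the thesis's improved rule:

```python
def horspool_search(text, pattern):
    """Boyer-Moore-Horspool: all start indices of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Shift table: distance from a character's rightmost occurrence
    # (excluding the last pattern position) to the pattern's end.
    shift = {}
    for i in range(m - 1):
        shift[pattern[i]] = m - 1 - i
    hits, j = [], 0
    while j <= n - m:
        if text[j:j + m] == pattern:
            hits.append(j)
        # Shift by the rule applied to the character under the
        # pattern's last position; characters absent from the
        # pattern allow a full shift of m.
        j += shift.get(text[j + m - 1], m)
    return hits
```

On random text the average shift approaches the pattern length, which is why Horspool is a common practical baseline despite its O(mn) worst case.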
APA, Harvard, Vancouver, ISO, and other styles
50

Shieh, Yi-Kung. "An Improved Approximate String Matching Algorithm Based upon the Boyer-Moore Algorithm." 2008. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0020-2406200816121600.

Full text
APA, Harvard, Vancouver, ISO, and other styles