
Dissertations / Theses on the topic 'Parsing (computer grammar)'



Consult the top 50 dissertations / theses for your research on the topic 'Parsing (computer grammar).'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Ghosh, Debajit 1974. "Automatic grammar induction from semantic parsing." Thesis, Massachusetts Institute of Technology, 1998. http://hdl.handle.net/1721.1/50435.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Teboul, Olivier. "Shape Grammar Parsing : application to Image-based Modeling." Phd thesis, Ecole Centrale Paris, 2011. http://tel.archives-ouvertes.fr/tel-00628906.

Full text
Abstract:
The purpose of this thesis was to perform facade image parsing with shape grammars in order to tackle single-view image-based 3D building modeling. The scope of the thesis lay at the border of Computer Graphics and Computer Vision, both in terms of methods and applications. Two different and complementary approaches were proposed: a bottom-up parsing algorithm that aimed at grouping similar regions of a facade image so as to retrieve the underlying layout, and a top-down parsing algorithm based on a very powerful framework: Reinforcement Learning. This novel parsing algorithm uses pixel-wise image supports based on supervised learning in a global optimization of a Markov Decision Process. Both methods were evaluated quantitatively and qualitatively. The second was shown to support various architectures, several shape grammars and image supports, and proved robust to challenging viewing conditions such as illumination changes and large occlusions. It outperformed the state of the art in both segmentation quality and speed, and provides a much more flexible framework in which many extensions may be envisioned. The conclusion of this work was that single-view image-based 3D building modeling can be solved elegantly by using shape grammars as a Rosetta stone to decipher the language of architecture through a well-suited Reinforcement Learning formulation. This solution is a potential answer to large-scale reconstruction of urban environments from images, and also suggests the possibility of introducing Reinforcement Learning into other vision tasks such as generic image parsing, where it has barely been explored so far.
3

Aycock, John Daniel. "Practical Earley parsing and the SPARK toolkit." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/NQ58556.pdf.

Full text
4

Pan, Yinfei. "Parallel XML parsing." Diss., Online access via UMI:, 2009.

Find full text
5

Yang, Yongsheng. "A maximum entropy approach to Chinese language parsing /." View Abstract or Full-Text, 2002. http://library.ust.hk/cgi/db/thesis.pl?COMP%202002%20YANG.

Full text
Abstract:
Thesis (M. Phil.)--Hong Kong University of Science and Technology, 2002.
Includes bibliographical references (leaves 54-55). Also available in electronic version. Access restricted to campus users.
6

Bhalerao, Rohit Dinesh. "Parallel XML parsing." Diss., Online access via UMI:, 2007.

Find full text
7

Van, Delden Sebastian Alexander. "Larger-first partial parsing." Doctoral diss., University of Central Florida, 2003. http://digital.library.ucf.edu/cdm/ref/collection/RTD/id/2038.

Full text
Abstract:
University of Central Florida College of Engineering Thesis
Larger-first partial parsing is a primarily top-down approach to partial parsing that is opposite to current easy-first, or primarily bottom-up, strategies. A rich partial tree structure is captured by an algorithm that assigns a hierarchy of structural tags to each of the input tokens in a sentence. Part-of-speech tags are first assigned to the words in a sentence by a part-of-speech tagger. A cascade of Deterministic Finite State Automata then uses this part-of-speech information to identify syntactic relations primarily in a descending order of their size. The cascade is divided into four specialized sections: (1) a Comma Network, which identifies syntactic relations associated with commas; (2) a Conjunction Network, which partially disambiguates phrasal conjunctions and fully disambiguates clausal conjunctions; (3) a Clause Network, which identifies non-comma-delimited clauses; and (4) a Phrase Network, which identifies the remaining base phrases in the sentence. Each automaton is capable of adding one or more levels of structural tags to the tokens in a sentence. The larger-first approach is compared against a well-known easy-first approach. The results indicate that this larger-first approach is capable of (1) producing a more detailed partial parse than an easy-first approach; (2) providing better containment of attachment ambiguity; (3) handling overlapping syntactic relations; and (4) achieving a higher accuracy than the easy-first approach. The automata of each network were developed by an empirical analysis of several sources and are presented here in detail.
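The cascade idea above lends itself to a compact sketch: a sequence of finite-state patterns applied over a part-of-speech tag string, each level adding one layer of structure, larger units first. The tag names and patterns below are invented for illustration and are not the thesis's own automata.

```python
import re

# A toy cascade of finite-state patterns applied to a POS-tag string.
# Each level adds one layer of structural tags, larger units first
# (clauses before phrases), mirroring the larger-first idea.
# Tag names and patterns are illustrative only.
CASCADE = [
    ("CLAUSE", re.compile(r"PRP VBZ DT NN")),   # e.g. "she eats the apple"
    ("NP",     re.compile(r"DT NN")),            # e.g. "the apple"
]

def chunk(pos_tags):
    """Return (label, start, end) spans found by running the cascade
    over the space-joined POS tag sequence, larger patterns first."""
    text = " ".join(pos_tags)
    spans = []
    for label, pattern in CASCADE:
        for m in pattern.finditer(text):
            # Convert character offsets back to token indices.
            start = text[: m.start()].count(" ")
            end = start + m.group().count(" ") + 1
            spans.append((label, start, end))
    return spans

print(chunk(["PRP", "VBZ", "DT", "NN"]))
# → [('CLAUSE', 0, 4), ('NP', 2, 4)]
```

Note how the same tokens receive both a larger (CLAUSE) and a smaller (NP) tag, giving the hierarchy of structural tags the abstract describes.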
Ph.D.
Doctorate;
Department of Electrical Engineering and Computer Science
Engineering and Computer Science
Electrical Engineering and Computer Science
215 p.
xiv, 212 leaves, bound : ill. ; 28 cm.
8

Walenski, Matthew S. "Relating parsers and grammars : on the structure and real-time comprehension of English infinitival complements /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2002. http://wwwlib.umi.com/cr/ucsd/fullcit?p3044770.

Full text
9

Lakeland, Corrin. "Lexical approaches to backoff in statistical parsing." University of Otago. Department of Computer Science, 2006. http://adt.otago.ac.nz./public/adt-NZDU20060913.134736.

Full text
Abstract:
This thesis develops a new method for predicting probabilities in a statistical parser so that more sophisticated probabilistic grammars can be used. A statistical parser uses a probabilistic grammar derived from a training corpus of hand-parsed sentences. The grammar is represented as a set of constructions - in a simple case these might be context-free rules. The probability of each construction in the grammar is then estimated by counting its relative frequency in the corpus. A crucial problem when building a probabilistic grammar is to select an appropriate level of granularity for describing the constructions being learned. The more constructions we include in our grammar, the more sophisticated a model of the language we produce. However, if too many different constructions are included, then our corpus is unlikely to contain reliable information about the relative frequency of many constructions. In existing statistical parsers two main approaches have been taken to choosing an appropriate granularity. In a non-lexicalised parser constructions are specified as structures involving particular parts-of-speech, thereby abstracting over individual words. Thus, in the training corpus two syntactic structures involving the same parts-of-speech but different words would be treated as two instances of the same event. In a lexicalised grammar the assumption is that the individual words in a sentence carry information about its syntactic analysis over and above what is carried by its part-of-speech tags. Lexicalised grammars have the potential to provide extremely detailed syntactic analyses; however, Zipf's law makes it hard for such grammars to be learned. In this thesis, we propose a method for optimising the trade-off between informative and learnable constructions in statistical parsing.
We implement a grammar which works at a level of granularity in between single words and parts-of-speech, by grouping words together using unsupervised clustering based on bigram statistics. We begin by implementing a statistical parser to serve as the basis for our experiments. The parser, based on that of Michael Collins (1999), contains a number of new features of general interest. We then implement a model of word clustering, which we believe is the first to deliver vector-based word representations for an arbitrarily large lexicon. Finally, we describe a series of experiments in which the statistical parser is trained using categories based on these word representations.
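The clustering idea described above can be illustrated in miniature: represent each word by its bigram co-occurrence counts and compare words by cosine similarity. The toy corpus and the use of left neighbours only are simplifying assumptions made here, not the thesis's actual model.

```python
from collections import defaultdict
from math import sqrt

# Represent each word by its bigram co-occurrence counts and group
# words with similar vectors. Corpus and contexts are invented.
corpus = "the cat sat . the dog sat . a cat ran . a dog ran .".split()

# Count left-neighbour co-occurrences as a crude context vector.
vectors = defaultdict(lambda: defaultdict(int))
for left, word in zip(corpus, corpus[1:]):
    vectors[word][left] += 1

def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    keys = set(u) | set(v)
    dot = sum(u[k] * v[k] for k in keys)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# "cat" and "dog" share the left contexts {the, a}, so they would
# fall into the same cluster, giving a granularity between individual
# words and whole parts-of-speech.
sim = cosine(vectors["cat"], vectors["dog"])
print(round(sim, 2))  # → 1.0
```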
10

Moss, William B. "Evaluating inherited attributes using Haskell and lazy evaluation." Diss., Connect to the thesis, 2005. http://hdl.handle.net/10066/1486.

Full text
11

Crowfoot, Norman C. "A visual aid for designing regular expression parsers /." Online version of thesis, 1988. http://hdl.handle.net/1850/10446.

Full text
12

Prost, Jean-Philippe. "Modelling Syntactic Gradience with Loose Constraint-based Parsing." Phd thesis, Université de Provence - Aix-Marseille I, 2008. http://tel.archives-ouvertes.fr/tel-00352828.

Full text
Abstract:
Grammaticality is usually conceived as a binary notion: a sentence is either grammatical or ungrammatical. However, a growing number of studies address intermediate degrees of acceptability, sometimes referred to as gradience. To date, most of this work has concentrated on human judgements of syntactic gradience. This study explores the possibility of building a robust model that agrees with such human judgements.
We propose to extend to ill-formed language the concepts of Intersective Gradience and Subsective Gradience introduced by Aarts for modelling graded judgements. In this new model, the problem raised by gradience is that of classifying an utterance into a particular category, according to criteria based on the utterance's syntactic characteristics. We extend the notion of Intersective Gradience (IG) so that it concerns the choice of the best solution among a set of candidates, and that of Subsective Gradience (SG) so that it concerns computing the degree of typicality of that structure within its category. IG is then modelled by an optimality criterion, while SG is modelled by computing a degree of grammatical acceptability. As for the syntactic characteristics required to classify an utterance, our review of different representational frameworks for natural language syntax shows that they can readily be represented in a Model-Theoretic Syntax framework. We opt for Property Grammars (PG), which offer precisely the possibility of modelling the characterisation of an utterance. We present a fully automated solution for modelling syntactic gradience, which characterises a well-formed or ill-formed sentence, generates an optimal parse tree, and computes a degree of grammatical acceptability for the utterance.
Through the development of this new model, the contribution of this work is threefold.
First, we specify a logical system for PG that allows its formalisation to be revisited from a model-theoretic perspective. In particular, it formalises the constraint satisfaction and constraint relaxation mechanisms at work in PG, as well as the way they license the projection of a category during parsing. This new system introduces the notion of loose satisfaction, together with a first-order logic formulation for reasoning about an utterance.
Second, we present our implementation of Loose Satisfaction Chart Parsing (LSCP), which we prove always yields a complete and optimal parse. This approach is based on dynamic programming and on the mechanisms described above. Although of high complexity, this algorithmic solution performs well enough to let us experiment with our model of gradience.
Third, having postulated that the prediction of human acceptability judgements can rely on factors derived from LSCP, we present a numerical model for estimating the degree of grammatical acceptability of an utterance. We measure a good correlation of its scores with human judgements of grammatical acceptability. Moreover, our model proves to perform better than a pre-existing model that we use as a reference, which itself was evaluated on manually generated parses.
13

Thompson, Cynthia Ann. "Semantic lexicon acquisition for learning natural language interfaces /." Digital version accessible at:, 1998. http://wwwlib.umi.com/cr/utexas/main.

Full text
14

Dagerman, Björn. "Semantic Analysis of Natural Language and Definite Clause Grammar using Statistical Parsing and Thesauri." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-26142.

Full text
Abstract:
Services that rely on the semantic computations of users' natural linguistic inputs are becoming more frequent. Computing semantic relatedness between texts is problematic due to the inherent ambiguity of natural language. The purpose of this thesis was to show how a sentence can be compared to a predefined semantic Definite Clause Grammar (DCG), and how a DCG-based system can benefit from such capabilities. Our approach combines openly available specialized NLP frameworks for statistical parsing, part-of-speech tagging and word-sense disambiguation. We compute semantic relatedness using a large lexical and conceptual-semantic thesaurus. We also extend an existing programming language for multimodal interfaces that uses static predefined DCGs: COactive Language Definition (COLD). That is, every word that should be acceptable by COLD needs to be explicitly defined. By applying our solution, we show how our approach can remove dependencies on word definitions and improve grammar definitions in DCG-based systems.
15

Sharp, Randall Martin. "A model of grammar based on principles of government and binding." Thesis, University of British Columbia, 1985. http://hdl.handle.net/2429/24917.

Full text
Abstract:
This thesis describes an implementation of a model of natural language grammar based on current theories of transformational grammar, collectively referred to as Government and Binding (GB) theory. A description is presented of the principles of GB, including X-bar syntax and the theories of Case, Theta, Binding, Bounding, and Government. The principles, in effect, constitute an embodiment of "universal grammar" (UG), i.e. the abstract characterization of the innately endowed human language faculty. Associated with the principles is a set of parameters that alter the effect of the principles. The "core grammar" of a specific language is an instantiation of UG with the parameters set in a particular way. To demonstrate the cross-linguistic nature of the theory, a subset of the "core grammars" of Spanish and English is implemented, including their parametric values and certain language-specific transformations required to characterize grammatical sentences. Sentences in one language are read in and converted through a series of reverse transformations to a base representation in the target language. To this representation, transformations are applied that produce a set of output sentences. The well-formedness of these sentences is verified by the general principles of UG as controlled by the parameters. Any that fail to meet the conditions are rejected so that only grammatical sentences are displayed. The model is written in the Prolog programming language.
Science, Faculty of
Computer Science, Department of
Graduate
16

Jansen, Anthony Robert 1973. "Encoding and parsing of algebraic expressions by experienced users of mathematics." Monash University, School of Computer Science and Software Engineering, 2002. http://arrow.monash.edu.au/hdl/1959.1/8059.

Full text
17

Lum, Bik. "A rule-based analysis system for Chinese sentences /." [Hong Kong : University of Hong Kong], 1989. http://sunzi.lib.hku.hk/hkuto/record.jsp?B1240231X.

Full text
18

Mansfield, Martin F. "Design of a generic parse tree for imperative languages." Virtual Press, 1992. http://liblink.bsu.edu/uhtbin/catkey/834617.

Full text
Abstract:
Since programs are written in many languages and design documents are not maintained (if they ever existed), there is a need to extract the design and other information that the programs represent. To do this without writing a separate program for each language, a common representation of the symbol table and parse tree would be required. The purpose of the parse tree and symbol table will not be to generate object code but to provide a platform for analysis tools. In this way the tool designer develops only one version instead of separate versions for each language. The generic symbol table and generic parse tree may not be as detailed as those same structures in a specific compiler, but the parse tree must include all structures for imperative languages.
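As a rough sketch of what such a language-independent parse tree and symbol table might look like (the class and field names below are invented for illustration, not taken from the thesis):

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """Generic parse-tree node shared by all imperative languages."""
    kind: str                      # e.g. "if", "loop", "assign", "call"
    text: str = ""                 # source fragment, for analysis tools
    children: List["Node"] = field(default_factory=list)

@dataclass
class Symbol:
    """Generic symbol-table entry with a language-neutral type."""
    name: str
    sym_type: str
    defined_at: Optional[Node] = None

# The same tree shape can represent an `if` from C, Pascal, or Ada:
tree = Node("if", children=[
    Node("cond", "x > 0"),
    Node("assign", "y := 1"),
])
symtab = {"x": Symbol("x", "integer"), "y": Symbol("y", "integer")}

def count_kind(node, kind):
    """Example analysis pass over the generic tree: count nodes of a kind."""
    return (node.kind == kind) + sum(count_kind(c, kind) for c in node.children)

print(count_kind(tree, "assign"))  # → 1
```

An analysis tool written against `Node` and `Symbol` then works unchanged regardless of which language's front end produced the tree, which is the point of the design described above.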
Department of Computer Science
19

Warote, Nuntaporn. "ETRANS : an English-Thai translator /." Online version of thesis, 1991. http://hdl.handle.net/1850/11639.

Full text
20

Bordihn, Henning. "Contributions to the syntactical analysis beyond context-freeness." Thesis, Universität Potsdam, 2011. http://opus.kobv.de/ubp/volltexte/2012/5971/.

Full text
Abstract:
Parsing approaches for several grammar formalisms that also generate non-context-free languages are explored: Chomsky grammars, Lindenmayer systems, grammars with controlled derivations, and grammar systems. Formal properties of these mechanisms are investigated when they are used as language acceptors. Furthermore, cooperating distributed grammar systems are restricted so that efficient deterministic parsing without backtracking becomes possible. For this class of grammar systems, the parsing algorithm is presented and the feature of leftmost derivations is investigated in detail.
21

林碧 and Bik Lum. "A rule-based analysis system for Chinese sentences." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1989. http://hub.hku.hk/bib/B31208769.

Full text
22

Hoyos, Jacob. "PLPrepare: A Grammar Checker for Challenging Cases." Digital Commons @ East Tennessee State University, 2021. https://dc.etsu.edu/etd/3898.

Full text
Abstract:
This study investigates one of the Polish language’s most arbitrary cases: the genitive masculine inanimate singular. It collects and ranks several guidelines to help language learners discern its proper usage and also introduces a framework to provide detailed feedback regarding arbitrary cases. The study tests this framework by implementing and evaluating a hybrid grammar checker called PLPrepare. PLPrepare performs similarly to other grammar checkers and is able to detect genitive case usages and provide feedback based on a number of error classifications.
23

Gupta, Pankaj. "The Design and Implementation of a Prolog Parser Using Javacc." Thesis, University of North Texas, 2002. https://digital.library.unt.edu/ark:/67531/metadc3251/.

Full text
Abstract:
Operatorless Prolog text is LL(1) in nature and any standard LL parser generator tool can be used to parse it. However, Prolog text that conforms to the ISO Prolog standard allows the definition of dynamic operators. Since Prolog operators can be defined at run-time, operator symbols are not present in the grammar rules of the language. Unless the parser generator allows for some flexibility in the specification of the grammar rules, it is very difficult to generate a parser for such text. In this thesis we discuss the existing parsing methods and their modified versions for parsing languages with dynamic operator capabilities. Implementation details of a parser that uses Javacc as a parser generator tool to parse standard Prolog text are provided. The output of the parser is an abstract syntax tree that reflects the correct precedence and associativity rules among the various operators (static and dynamic) of the language. Empirical results show that a Prolog parser generated by a parser generator like Javacc is comparable in efficiency to a hand-coded parser.
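The dynamic-operator problem described above can be illustrated with a mutable operator table driving a precedence-climbing parser, so that an operator defined at run time (in the spirit of ISO Prolog's op/3) immediately affects parsing without any change to the grammar. This is a simplified sketch: priorities follow Prolog's lower-binds-tighter convention, and every infix operator is treated as left-associative.

```python
# Operator table: name -> (priority, type). Lower priority binds tighter,
# as in ISO Prolog. Associativity classes (xfx/yfx/xfy) are simplified:
# all infix operators are treated as left-associative in this sketch.
OPS = {"+": (500, "yfx"), "*": (400, "yfx")}

def add_op(name, priority, otype):
    """Run-time operator definition, analogous to ISO Prolog's op/3."""
    OPS[name] = (priority, otype)

def parse(tokens, limit=1200):
    """Parse tokens into a nested (op, left, right) term, honouring OPS."""
    def expr(i, limit):
        left, i = tokens[i], i + 1          # operands are plain atoms here
        while i < len(tokens) and tokens[i] in OPS:
            prio, _otype = OPS[tokens[i]]
            if prio > limit:
                break
            op = tokens[i]
            # The right argument must bind strictly tighter than op.
            right, i = expr(i + 1, prio - 1)
            left = (op, left, right)
        return left, i

    term, i = expr(0, limit)
    assert i == len(tokens), "trailing tokens"
    return term

print(parse(["a", "+", "b", "*", "c"]))   # → ('+', 'a', ('*', 'b', 'c'))
add_op("**", 200, "xfx")                   # define a new, tighter operator
print(parse(["a", "*", "b", "**", "c"]))  # → ('*', 'a', ('**', 'b', 'c'))
```

Because the parser consults `OPS` rather than fixed grammar rules, a directive encountered mid-parse can extend the operator set, which is exactly what a static LL grammar cannot express.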
24

Du, Toit Christine. "The use of temporal context in the generation of strings." Thesis, Stellenbosch : Stellenbosch University, 2002. http://hdl.handle.net/10019.1/53183.

Full text
Abstract:
Thesis (MSc)--Stellenbosch University, 2002.
ENGLISH ABSTRACT: Grammars with regulated rewriting are used to restrict the application of context-free productions in order to avoid certain derivations. This enables these grammars to generate both context-free and non-context-free languages using only production rules with a context-free format. These grammars are more powerful than context-free grammars, but usually not as powerful as context-sensitive grammars. Various grammars with regulated rewriting have been developed and some will be discussed in this thesis. Propositional linear temporal logic is a formal system used to describe truth values of propositions over time. This is done by defining a timeline together with a set of propositions. It is then possible to construct temporal logic formulae, consisting of these propositions and temporal operators, to specify the truth values of the propositions for every step in the timeline. In this thesis we define and discuss temporal grammars, which combine grammars with propositional linear temporal logic. Since a derivation can be associated with a timeline, a regulating device can be constructed from temporal logic formulae that will control the application of productions within the derivation. The discussion of temporal grammars includes some of the properties of these grammars, while many ideas are illustrated by examples.
25

Shi, Lei. "A general purpose semantic parser using FrameNet and WordNet®." Thesis, University of North Texas, 2004. https://digital.library.unt.edu/ark:/67531/metadc4483/.

Full text
Abstract:
Syntactic parsing is one of the best understood language processing applications. Since language and grammar have been formally defined, it is easy for computers to parse the syntactic structure of natural language text. Does meaning have structure as well? If it has, how can we analyze the structure? Previous systems rely on a one-to-one correspondence between syntactic rules and semantic rules. But such systems can only be applied to limited fragments of English. In this thesis, we propose a general-purpose shallow semantic parser which utilizes a semantic network (WordNet), and a frame dataset (FrameNet). Semantic relations recognized by the parser are based on how human beings represent knowledge of the world. Parsing semantic structure allows semantic units and constituents to be accessed and processed in a more meaningful way than syntactic parsing, moving the automation of understanding natural language text to a higher level.
26

Reuer, Veit. "PromisD." Doctoral thesis, Humboldt-Universität zu Berlin, Philosophische Fakultät II, 2005. http://dx.doi.org/10.18452/15266.

Full text
Abstract:
The dissertation starts with an analysis of the requirements for Intelligent Computer-Assisted Language Learning (ICALL) systems, which partially depend on didactic aspects of foreign language teaching. Based on this, a type of exercise can be identified that on the one hand allows the learner to enter freely formed input, supporting the so-called communicative competence as a major didactic goal, and on the other hand can be realised with advanced computational-linguistic methods. The following chapter takes a look at grammar theories, especially Lexical Functional Grammar (LFG). The grammar theory needs to be tractable in an implementation, and it is a further advantage if the concepts of the theory are similar to those in learner grammars, in order to simplify the generation of feedback. Subsequently the user interface of the actual program is presented, with a focus on error messages. The implementation is named PromisD, which stands for "Projekt mediengestütztes interaktives Sprachenlernen - Deutsch". Finally, an anticipation-free parsing method is developed that uses no error-identification information from either the lexicon or the grammar. Error recognition is restricted to those areas where errors occur frequently in a learner corpus, in order to cover the major error types and to allow for greater efficiency when parsing authentic data. The presentation of the algorithm follows the two structural levels of LFG: the constituent structure, with a modified Earley algorithm integrating error hypotheses into the chart, and the feature structure, with a new unification strategy that handles and stores clashing values in the f-structure. The dissertation closes with an evaluation and an outlook on the generation of error messages.
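The baseline that the modified chart parser above extends is Earley's algorithm. The sketch below is a plain Earley recognizer over a toy grammar and lexicon; the error-hypothesis edges and feature structures described in the abstract are not reproduced here.

```python
# A plain Earley recognizer: chart[i] holds items (lhs, rhs, dot, origin).
# Grammar and lexicon are toy examples invented for illustration.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
}
LEXICON = {"the": "Det", "dog": "N", "sees": "V", "cat": "N"}

def recognize(words, start="S"):
    n = len(words)
    chart = [set() for _ in range(n + 1)]
    for rhs in GRAMMAR[start]:
        chart[0].add((start, tuple(rhs), 0, 0))
    for i in range(n + 1):
        changed = True
        while changed:                       # run predict/complete to fixpoint
            changed = False
            for lhs, rhs, dot, origin in list(chart[i]):
                if dot < len(rhs) and rhs[dot] in GRAMMAR:       # predict
                    for prod in GRAMMAR[rhs[dot]]:
                        new = (rhs[dot], tuple(prod), 0, i)
                        if new not in chart[i]:
                            chart[i].add(new); changed = True
                elif dot == len(rhs):                            # complete
                    for p in list(chart[origin]):
                        if p[2] < len(p[1]) and p[1][p[2]] == lhs:
                            new = (p[0], p[1], p[2] + 1, p[3])
                            if new not in chart[i]:
                                chart[i].add(new); changed = True
        if i < n:                                                # scan
            for lhs, rhs, dot, origin in list(chart[i]):
                if dot < len(rhs) and LEXICON.get(words[i]) == rhs[dot]:
                    chart[i + 1].add((lhs, rhs, dot + 1, origin))
    return any(it == (start, tuple(r), len(r), 0)
               for r in GRAMMAR[start] for it in chart[n])

print(recognize("the dog sees the cat".split()))  # → True
print(recognize("dog the sees".split()))          # → False
```

The anticipation-free idea in the thesis amounts to also adding chart items that a correct derivation would *not* license, in exactly those regions where learner errors are frequent, so ill-formed input still yields a complete analysis.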
27

Head, Michael Reuben. "Analysis and optimization for processing grid-scale XML datasets." Diss., Online access via UMI:, 2009.

Find full text
28

Skrzypczak, Piotr. "Parallel parsing of context-free grammars." Thesis, Blekinge Tekniska Högskola, Sektionen för datavetenskap och kommunikation, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-2958.

Full text
Abstract:
During the last decade an increasing interest in parallel programming can be observed. It is caused by the tendency to develop microprocessors as multicore units that can perform instructions simultaneously. A popular and widely used example of such a platform is the graphics processing unit (GPU). Its ability to perform calculations simultaneously is being investigated as a way of improving the performance of complex algorithms. Therefore, GPUs now have architectures that allow programmers and software developers to use their computational power in the same way as a CPU. One of these architectures is the CUDA platform, developed by nVidia. The aim of this thesis is to implement the parallel CYK algorithm, one of the most popular parsing algorithms, for the CUDA platform, so as to gain a significant speed-up in comparison with the sequential CYK algorithm. The thesis presents a review of existing parallelisations of the CYK algorithm, descriptions of the implemented algorithms (a basic version and a few modifications), and an experimental stage that tests these versions on various inputs in order to determine which version gives the best performance. Three versions of the algorithm are presented, of which one was selected as the best (about 10 times faster for the longest inputs). Also presented is a limited version of the algorithm that gives the best performance (even 100 times better in comparison with the non-limited sequential version) but requires the grammar to fulfil certain conditions. The motivation for the thesis is to use the developed algorithm in GCS.
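For reference, the sequential CYK recognizer that serves as the baseline for such parallelisations can be stated in a few lines for a grammar in Chomsky normal form. The grammar below is a toy example, not one used in the thesis; the triple loop over span length, start position, and split point is what the GPU versions distribute across threads.

```python
from itertools import product

# CYK recognizer for a grammar in Chomsky normal form.
# UNARY maps terminals to preterminals; BINARY maps nonterminal pairs
# to the nonterminals that derive them. Both tables are toy examples.
UNARY  = {"the": {"Det"}, "dog": {"N"}, "barks": {"V", "VP"}}
BINARY = {("Det", "N"): {"NP"}, ("NP", "VP"): {"S"}}

def cyk(words, start="S"):
    n = len(words)
    # table[i][j] = set of nonterminals deriving words[i : i + j + 1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][0] = set(UNARY.get(w, ()))
    for span in range(2, n + 1):              # substring length
        for i in range(n - span + 1):         # start position
            for split in range(1, span):      # split point
                left = table[i][split - 1]
                right = table[i + split][span - split - 1]
                for a, b in product(left, right):
                    table[i][span - 1] |= BINARY.get((a, b), set())
    return start in table[0][n - 1]

print(cyk("the dog barks".split()))  # → True
print(cyk("dog the barks".split()))  # → False
```

Cells of the same span length are independent of one another, which is what makes the algorithm attractive for a CUDA implementation: each diagonal of the table can be filled by many threads in parallel.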
APA, Harvard, Vancouver, ISO, and other styles
29

Knutsson, Ola. "Developing and Evaluating Language Tools for Writers and Learners of Swedish." Doctoral thesis, Stockholm : KTH, 2005. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-442.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Sastre, Javier M. "Efficient finite-state algorithms for the application of local grammars." Phd thesis, Université de Marne la Vallée, 2011. http://tel.archives-ouvertes.fr/tel-00621249.

Full text
Abstract:
Our work concerns the development of efficient algorithms for the application of local grammars, taking as reference those of existing open-source software: the top-down parser of Unitex and the Earley-like parser of Outilex. Local grammars are a finite-state-based formalism for representing the syntax of natural languages. They are a model for building precise, large-scale descriptions of natural language syntax through systematic observation and methodical accumulation of data. The suitability of local grammars for this task has been tested in numerous works. Because of the ambiguous nature of natural languages and the properties of local grammars, classic parsing algorithms such as LR, CYK and Tomita cannot be used in the context of this work. Top-down and Earley parsers are possible alternatives; however, they have exponential worst-case costs for local grammars. We first designed an algorithm for applying local grammars with a polynomial worst-case cost. We then designed efficient data structures for representing sets of elements and sequences, which improved the speed of our algorithm in the general case. We implemented our algorithm and those of the Unitex and Outilex systems with the same tools in order to test them under the same conditions. In addition, we implemented different versions of each algorithm, using both our own data structures and algorithms for set representation and those provided by GNU's Standard Template Library (STL).
We compared the performance of the different algorithms and their variants in the context of an industrial project proposed by the company Telefónica I+D: increasing the comprehension capacity of a conversational agent that provides online services, including sending SMS messages to mobile phones as well as games and other digital content. Conversations with the agent are in Spanish and take place through Windows Live Messenger. Despite the limited domain and the simplicity of the grammars applied, the execution times of our algorithm, coupled with our data structures and set-representation algorithms, were shorter. Thanks to the improved asymptotic cost, significantly lower execution times can be expected, compared to the algorithms used in the Unitex and Outilex systems, for complex, large-coverage grammars.
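Since local grammars are finite-state, applying one amounts to running an automaton over the token sequence. A minimal nondeterministic sketch follows; the transition table is an invented toy (a "send SMS" request pattern), not taken from the thesis:

```python
def run_automaton(transitions, accepting, tokens, start=0):
    """transitions: dict mapping (state, token) -> iterable of next states."""
    states = {start}                        # set of currently active states
    for tok in tokens:
        # advance every active state on the current token (nondeterministic step)
        states = {nxt for s in states for nxt in transitions.get((s, tok), ())}
        if not states:                      # dead end: no state survives this token
            return False
    return bool(states & accepting)         # accept if any final state is active
```

For example, with transitions encoding the pattern "send sms to NUM" and state 4 accepting, the sequence `["send", "sms", "to", "NUM"]` is recognised while a truncated request is not.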
APA, Harvard, Vancouver, ISO, and other styles
31

Tolone, Elsa. "Analyse syntaxique à l'aide des tables du Lexique-Grammaire du français." Phd thesis, Université Paris-Est, 2011. http://tel.archives-ouvertes.fr/tel-00625875/en/.

Full text
Abstract:
The Lexique-Grammaire tables, whose development was initiated by Gross (1975), constitute a very rich syntactic lexicon for French. They cover various lexical categories such as verbs, nouns, adjectives and adverbs. This linguistic database is nevertheless not directly usable by computer programs, as it is incomplete and lacks consistency. Each table groups a number of entries judged to be similar because they accept common properties. The particularity of these properties is that they are not encoded in the tables themselves but only described in the literature. To make the tables usable, the properties involved in each of them must be made explicit. In addition, a large number of these properties must be renamed for the sake of consistency. Our goal is to adapt the tables to make them usable in various Natural Language Processing (NLP) applications, in particular parsing. We explain the problems encountered and the methods adopted to enable their integration into a parser. We propose LGExtract, a generic tool for generating a syntactic lexicon for NLP from the Lexique-Grammaire tables. It is linked to a global table, in which we have added the missing properties, and to a single extraction script including all the operations related to each property that must be performed for all tables. We also present LGLex, the new syntactic lexicon generated from the verbs, predicative nouns, frozen expressions and adverbs. We then show how we converted the verbs and predicative nouns of this lexicon into the Alexina format, the format of the Lefff lexicon (Lexique des Formes Fléchies du Français) (Sagot, 2010), a large-coverage morphological and syntactic lexicon freely available for French.
This enables its integration into the FRMG (French MetaGrammar) parser (Thomasset and de La Clergerie, 2005), a large-coverage deep parser for French based on Tree Adjoining Grammars (TAG), which usually relies on the Lefff. This conversion step consists in extracting the syntactic information encoded in the Lexique-Grammaire tables. We present the linguistic foundations of this conversion process and the resulting lexicon. We evaluate the FRMG parser on the reference corpus of the French parser evaluation campaign Passage (Produire des Annotations Syntaxiques à Grande Échelle) (Hamon et al., 2008), comparing its version based on the Lefff with our version based on the converted Lexique-Grammaire tables.
APA, Harvard, Vancouver, ISO, and other styles
32

Yamangil, Elif. "Rich Linguistic Structure from Large-Scale Web Data." Thesis, Harvard University, 2013. http://dissertations.umi.com/gsas.harvard:11162.

Full text
Abstract:
The past two decades have shown an unexpected effectiveness of Web-scale data in natural language processing. Even the simplest models, when paired with unprecedented amounts of unstructured and unlabeled Web data, have been shown to outperform sophisticated ones. It has been argued that the effectiveness of Web-scale data has undermined the necessity of sophisticated modeling or laborious data set curation. In this thesis, we argue for and illustrate an alternative view, that Web-scale data not only serves to improve the performance of simple models, but also can allow the use of qualitatively more sophisticated models that would not be deployable otherwise, leading to even further performance gains.
Engineering and Applied Sciences
APA, Harvard, Vancouver, ISO, and other styles
33

Le, Roux Joseph. "La coordination dans les grammaires d'interaction." Phd thesis, Institut National Polytechnique de Lorraine - INPL, 2007. http://tel.archives-ouvertes.fr/tel-00185248.

Full text
Abstract:
This thesis presents a model of the main syntactic aspects of coordination in Guy Perrier's Interaction Grammars. Interaction Grammars make it possible to make the valencies of conjuncts explicit, and it is precisely on this notion that our model is based.
We also present all the work surrounding this model that enabled us to arrive at a realistic implementation: the development of the XMG software and its use for writing lexicalized grammars, lexical filtering by automata intersection, and parsing.
APA, Harvard, Vancouver, ISO, and other styles
34

Le, Roux Joseph. "La coordination dans les grammaires d'interaction." Electronic Thesis or Diss., Vandoeuvre-les-Nancy, INPL, 2007. http://www.theses.fr/2007INPL063N.

Full text
Abstract:
This thesis presents a model of the main syntactic aspects of coordination, using Guy Perrier's Interaction Grammars as the target formalism. Interaction Grammars make it possible to explicitly define conjuncts' valencies, and this is precisely what our model is based upon. We also present the work surrounding this model that enabled us to provide a realistic implementation: lexicalized grammar development (using our tool XMG), lexical disambiguation based on automata intersection, and parsing.
APA, Harvard, Vancouver, ISO, and other styles
35

Bauer, Daniel. "Grammar-Based Semantic Parsing Into Graph Representations." Thesis, 2017. https://doi.org/10.7916/D8JH3ZRR.

Full text
Abstract:
Directed graphs are an intuitive and versatile representation of natural language meaning because they can capture relationships between instances of events and entities, including cases where entities play multiple roles. Yet, there are few approaches in natural language processing that use graph manipulation techniques for semantic parsing. This dissertation studies graph-based representations of natural language meaning, discusses a formal-grammar based approach to the semantic construction of graph representations, and develops methods for open-domain semantic parsing into such representations. To perform string-to-graph translation I use synchronous hyperedge replacement grammars (SHRG). The thesis studies this grammar formalism from a formal, linguistic, and algorithmic perspective. It proposes a new lexicalized variant of this formalism (LSHRG), which is inspired by tree insertion grammar and provides a clean syntax/semantics interface. The thesis develops a new method for automatically extracting SHRG and LSHRG grammars from annotated “graph banks”, which uses existing syntactic derivations to structure the extracted grammar. It also discusses a new method for semantic parsing with large, automatically extracted grammars, that translates syntactic derivations into derivations of the synchronous grammar, as well as initial work on parse reranking and selection using a graph model. I evaluate this work on the Abstract Meaning Representation (AMR) dataset. The results show that the grammar-based approach to semantic analysis shows promise as a technique for semantic parsing and that string-to-graph grammars can be induced efficiently. Taken together, the thesis lays the foundation for future work on graph methods in natural language semantics.
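The reentrancy that motivates graph (rather than tree) meaning representations can be made concrete in a few lines. The AMR-style concept and role labels below are illustrative examples, not taken from the thesis:

```python
from collections import defaultdict

class MeaningGraph:
    """A directed graph with labelled edges, AMR-style."""
    def __init__(self):
        self.edges = defaultdict(list)       # source node -> [(role, target)]

    def add(self, src, role, tgt):
        self.edges[src].append((role, tgt))

    def roles_of(self, node):
        # every (source, role) pair pointing at `node`;
        # more than one incoming edge means the node is reentrant
        return [(s, r) for s, outs in self.edges.items() for r, t in outs if t == node]

# "The boy wants to sleep": `boy` is ARG0 of both events, so it is a reentrant
# node that a tree representation could not share.
g = MeaningGraph()
g.add("want-01", "ARG0", "boy")
g.add("want-01", "ARG1", "sleep-01")
g.add("sleep-01", "ARG0", "boy")
```

A synchronous grammar such as SHRG builds graphs like this compositionally, in step with a syntactic derivation.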
APA, Harvard, Vancouver, ISO, and other styles
36

Kolbly, Donovan Michael. "Extensible language implementation." Thesis, 2002. http://wwwlib.umi.com/cr/utexas/fullcit?p3110637.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Somogyi, Nora. "Interpretive parsing technique for building object networks." Diss., 2005. http://etd.library.vanderbilt.edu/ETD-db/available/etd-03312005-121604/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

"Robust parsing with confluent preorder parser." 1996. http://library.cuhk.edu.hk/record=b6073203.

Full text
Abstract:
by Ho, Kei Shiu Edward.
"June 1996."
Thesis (Ph.D.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (p. 186-193).
Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web.
Mode of access: World Wide Web.
APA, Harvard, Vancouver, ISO, and other styles
39

Mayberry, Marshall Reeves. "Incremental nonmonotonic parsing through semantic self-organization." Thesis, 2003. http://wwwlib.umi.com/cr/utexas/fullcit?p3116385.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

"GLR parsing with multiple grammars for natural language queries." 2000. http://library.cuhk.edu.hk/record=b5890308.

Full text
Abstract:
Luk Po Chui.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2000.
Includes bibliographical references (leaves 97-100).
Abstracts in English and Chinese.
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Efficiency and Memory --- p.2
Chapter 1.2 --- Ambiguity --- p.3
Chapter 1.3 --- Robustness --- p.4
Chapter 1.4 --- Thesis Organization --- p.5
Chapter 2 --- Background --- p.7
Chapter 2.1 --- Introduction --- p.7
Chapter 2.2 --- Context-Free Grammars --- p.8
Chapter 2.3 --- The LR Parsing Algorithm --- p.9
Chapter 2.4 --- The Generalized LR Parsing Algorithm --- p.12
Chapter 2.4.1 --- Graph-Structured Stack --- p.12
Chapter 2.4.2 --- Packed Shared Parse Forest --- p.14
Chapter 2.5 --- Time and Space Complexity --- p.16
Chapter 2.6 --- Related Work on Parsing --- p.17
Chapter 2.6.1 --- GLR* --- p.17
Chapter 2.6.2 --- TINA --- p.18
Chapter 2.6.3 --- PHOENIX --- p.19
Chapter 2.7 --- Chapter Summary --- p.21
Chapter 3 --- Grammar Partitioning --- p.22
Chapter 3.1 --- Introduction --- p.22
Chapter 3.2 --- Motivation --- p.22
Chapter 3.3 --- Previous Work on Grammar Partitioning --- p.24
Chapter 3.4 --- Our Grammar Partitioning Approach --- p.26
Chapter 3.4.1 --- Definitions and Concepts --- p.26
Chapter 3.4.2 --- Guidelines for Grammar Partitioning --- p.29
Chapter 3.5 --- An Example --- p.30
Chapter 3.6 --- Chapter Summary --- p.34
Chapter 4 --- Parser Composition --- p.35
Chapter 4.1 --- Introduction --- p.35
Chapter 4.2 --- GLR Lattice Parsing --- p.36
Chapter 4.2.1 --- Lattice with Multiple Granularity --- p.36
Chapter 4.2.2 --- Modifications to the GLR Parsing Algorithm --- p.37
Chapter 4.3 --- Parser Composition Algorithms --- p.45
Chapter 4.3.1 --- Parser Composition by Cascading --- p.46
Chapter 4.3.2 --- Parser Composition with Predictive Pruning --- p.48
Chapter 4.3.3 --- Comparison of Parser Composition by Cascading and Parser Composition with Predictive Pruning --- p.54
Chapter 4.4 --- Chapter Summary --- p.54
Chapter 5 --- Experimental Results and Analysis --- p.56
Chapter 5.1 --- Introduction --- p.56
Chapter 5.2 --- Experimental Corpus --- p.57
Chapter 5.3 --- ATIS Grammar Development --- p.60
Chapter 5.4 --- Grammar Partitioning and Parser Composition on ATIS Domain --- p.62
Chapter 5.4.1 --- ATIS Grammar Partitioning --- p.62
Chapter 5.4.2 --- Parser Composition on ATIS --- p.63
Chapter 5.5 --- Ambiguity Handling --- p.66
Chapter 5.6 --- Semantic Interpretation --- p.69
Chapter 5.6.1 --- Best Path Selection --- p.69
Chapter 5.6.2 --- Semantic Frame Generation --- p.71
Chapter 5.6.3 --- Post-Processing --- p.72
Chapter 5.7 --- Experiments --- p.73
Chapter 5.7.1 --- Grammar Coverage --- p.73
Chapter 5.7.2 --- Size of Parsing Table --- p.74
Chapter 5.7.3 --- Computational Costs --- p.76
Chapter 5.7.4 --- Accuracy Measures in Natural Language Understanding --- p.81
Chapter 5.7.5 --- Summary of Results --- p.90
Chapter 5.8 --- Chapter Summary --- p.91
Chapter 6 --- Conclusions --- p.92
Chapter 6.1 --- Thesis Summary --- p.92
Chapter 6.2 --- Thesis Contributions --- p.93
Chapter 6.3 --- Future Work --- p.94
Chapter 6.3.1 --- Statistical Approach on Grammar Partitioning --- p.94
Chapter 6.3.2 --- Probabilistic modeling for Best Parse Selection --- p.95
Chapter 6.3.3 --- Robust Parsing Strategies --- p.96
Bibliography --- p.97
Chapter A --- ATIS-3 Grammar --- p.101
Chapter A.l --- English ATIS-3 Grammar Rules --- p.101
Chapter A.2 --- Chinese ATIS-3 Grammar Rules --- p.104
APA, Harvard, Vancouver, ISO, and other styles
41

"Chinese noun phrase parsing with a hybrid approach." Chinese University of Hong Kong, 1996. http://library.cuhk.edu.hk/record=b5888967.

Full text
Abstract:
by Angel Suet Yi Tse.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (leaves 126-130).
Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
Plagiarism Declaration
Chapter Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Overview --- p.1
Chapter 1.2 --- Motivation --- p.2
Chapter 1.3 --- Applications of NP parsing --- p.4
Chapter 1.4 --- The Hybrid Approach of NP Partial Parsing with Rule Set Derived from de NPs --- p.5
Chapter 1.5 --- Organization of the Thesis --- p.7
Chapter Chapter 2 --- Related Work --- p.9
Chapter 2.1 --- Overview --- p.9
Chapter 2.2 --- Chinese Versus English Languages --- p.10
Chapter 2.3 --- Traditional Versus Contemporary Parsing Approaches --- p.15
Chapter 2.3.1 --- Linguistics-based and Corpus-based Knowledge Acquisition --- p.15
Chapter 2.3.2 --- Basic Processing Unit --- p.16
Chapter 2.3.3 --- Related Literature --- p.17
Chapter 2.4 --- Sentence / Free Text Parsing --- p.18
Chapter 2.4.1 --- Linguistics-based --- p.18
Chapter 2.4.2 --- Corpus-based --- p.21
Chapter 2.5 --- NP Processing --- p.22
Chapter 2.5.1 --- NP Detection --- p.22
Chapter 2.5.2 --- NP Partial Parsing --- p.26
Chapter 2.6 --- Summary --- p.27
Chapter Chapter 3 --- Knowledge Elicitation for General NP Partial Parsing from De NPs --- p.28
Chapter 3.1 --- Overview --- p.28
Chapter 3.2 --- Background --- p.29
Chapter 3.3 --- Research in De Phrases --- p.33
Chapter 3.3.1 --- Research of de Phrases in Pure Linguistics --- p.33
Chapter 3.3.2 --- Research in de Phrases in Computational Linguistics --- p.36
Chapter 3.4 --- Significance of De Phrases --- p.37
Chapter 3.4.1 --- Implication to General NP Parsing --- p.37
Chapter 3.4.2 --- Embedded Knowledge for General NP Parsing --- p.37
Chapter 3.5 --- Summary --- p.39
Chapter Chapter 4 --- Knowledge Acquisition Approaches for General NP Partial Parsing --- p.40
Chapter 4.1 --- Overview --- p.40
Chapter 4.2 --- Linguistic-based Approach --- p.41
Chapter 4.3 --- Corpus-based Approach --- p.43
Chapter 4.3.1 --- Generalization of NP Grammatical Patterns --- p.44
Chapter 4.3.2 --- Pitfall of Generalization --- p.47
Chapter 4.4 --- The Hybrid Approach --- p.47
Chapter 4.4.1 --- Combining Strategies --- p.50
Chapter 4.4.2 --- Merging Techniques --- p.53
Chapter 4.5 --- CNP3- The Chinese NP Partial Parser --- p.55
Chapter 4.5.1 --- The NP Detection and Extraction Unit (DEU) --- p.56
Chapter 4.5.2 --- The Knowledge Acquisition Unit (KAU) --- p.56
Chapter 4.5.3 --- The Parsing Unit (PU) --- p.57
Chapter 4.5.4 --- Internal Representation of Chinese NPs and Grammar Rules --- p.57
Chapter 4.6 --- Summary --- p.58
Chapter Chapter 5 --- "Experiments on Linguistics-, Corpus-based and the Hybrid Approaches" --- p.60
Chapter 5.1 --- Overview --- p.60
Chapter 5.2 --- Objective of Experiments --- p.61
Chapter 5.3 --- Experimental Setup --- p.62
Chapter 5.3.1 --- The Corpora --- p.62
Chapter 5.3.2 --- The Standard and Extended Tag Sets --- p.64
Chapter 5.4 --- Overview of Experiments --- p.67
Chapter 5.5 --- Evaluation of Linguistic De NP Rules (Experiment 1 A) --- p.70
Chapter 5.5.1 --- Method --- p.71
Chapter 5.5.2 --- Results --- p.72
Chapter 5.5.3 --- Analysis --- p.72
Chapter 5.6 --- Evaluation of Corpus-based Approach (Experiment IB) --- p.74
Chapter 5.6.1 --- Method --- p.74
Chapter 5.6.2 --- Results --- p.75
Chapter 5.6.3 --- Analysis --- p.76
Chapter 5.6.4 --- Generalization of NP Grammatical Patterns (Experiment 1B') --- p.76
Chapter 5.6.5 --- Results after Merging of Rule Sets (Experiment 1C) --- p.77
Chapter 5.6.6 --- Error Analysis --- p.79
Chapter 5.7 --- Phase II Evaluation: Test on General NP Parsing (Experiment 2) --- p.82
Chapter 5.7.1 --- Method --- p.83
Chapter 5.7.2 --- Results --- p.85
Chapter 5.7.3 --- Error Analysis --- p.86
Chapter 5.8 --- Summary --- p.92
Chapter Chapter 6 --- Reliability Evaluation of the Hybrid Approach --- p.94
Chapter 6.1 --- Overview --- p.94
Chapter 6.2 --- Objective --- p.95
Chapter 6.3 --- The Training and Test Corpora --- p.96
Chapter 6.4 --- The Knowledge Base --- p.98
Chapter 6.5 --- Convergence Sequence Tests --- p.99
Chapter 6.5.1 --- Results of Close Convergence Tests --- p.100
Chapter 6.5.2 --- Results of Open Convergence Tests --- p.104
Chapter 6.5.3 --- Conclusions with Convergence Tests --- p.106
Chapter 6.6 --- Cross Evaluation Tests --- p.106
Chapter 6.6.1 --- Results --- p.109
Chapter 6.6.2 --- Conclusions with Cross Evaluation Tests --- p.112
Chapter 6.7 --- Summary --- p.113
Chapter Chapter 7 --- Discussion and Conclusions --- p.115
Chapter 7.1 --- Overview --- p.115
Chapter 7.2 --- Difficulties Encountered --- p.116
Chapter 7.2.1 --- Lack of Standard in Part-of-speech Categorization in Chinese Language --- p.116
Chapter 7.2.2 --- Under or Over-specification of Tag Class in Tag Set --- p.118
Chapter 7.2.3 --- Difficulty in Nominal Compound NP Analysis --- p.119
Chapter 7.3 --- Conclusions --- p.120
Chapter 7.4 --- Future Work --- p.122
Chapter 7.4.1 --- Full Automation of NP Pattern Generalization --- p.122
Chapter 7.4.2 --- Incorporation of Semantic Constraints --- p.123
Chapter 7.4.3 --- Computational Structural Analysis of Nominal Compound NP --- p.124
References --- p.126
Appendix A The Extended Tag Set --- p.131
Appendix B Linguistic Grammar Rules --- p.135
Appendix C Generalized Grammar Rules --- p.138
APA, Harvard, Vancouver, ISO, and other styles
42

Lane, Richard Vernon. "Semantic Database Model Language (SDML): grammar specification and parser." 1986. http://hdl.handle.net/2097/22101.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Wong, Yuk Wah 1979. "Learning for semantic parsing and natural language generation using statistical machine translation techniques." Thesis, 2007. http://hdl.handle.net/2152/3351.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Kate, Rohit Jaivant 1978. "Learning for semantic parsing with kernels under various forms of supervision." Thesis, 2007. http://hdl.handle.net/2152/3272.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

"Syntactic and semantic interplay during Chinese text processing." Chinese University of Hong Kong, 1996. http://library.cuhk.edu.hk/record=b5888931.

Full text
Abstract:
by Tang Siu-Lam.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1996.
Includes bibliographical references (leaves 48-54).
Appendix in Chinese.
Acknowledgements --- p.I
Abstract --- p.II
Table of Contents --- p.III
Appendix --- p.IV
Introduction --- p.1
Parsing Models --- p.3
Possible Causes for the Discrepancies Observed in Past Studies --- p.7
Language Specific Properties and Parsing --- p.13
The Present Study --- p.15
Experiment 1 --- p.19
Method --- p.22
Results and Discussion --- p.25
Experiment 2 --- p.28
Method --- p.30
Results and Discussion --- p.30
Experiment 3 --- p.35
Method --- p.38
Results and Discussion --- p.38
General Discussion --- p.45
References --- p.43
Appendix --- p.55
"Instructions used in Experiments 1, 2, and 3" --- p.55
APA, Harvard, Vancouver, ISO, and other styles
46

Irwin, Warwick. "Understanding and improving object-oriented software through static software analysis : a thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science in the University of Canterbury /." 2007. http://library.canterbury.ac.nz/etd/adt-NZCU20070628.161653.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

"Conditional random fields with dynamic potentials for Chinese named entity recognition." 2008. http://library.cuhk.edu.hk/record=b5893775.

Full text
Abstract:
Wu, Yiu Kei.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2008.
Includes bibliographical references (p. 69-75).
Abstracts in English and Chinese.
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Chinese NER Problem --- p.1
Chapter 1.2 --- Contribution of Our Proposed Framework --- p.3
Chapter 2 --- Related Work --- p.6
Chapter 2.1 --- Hidden Markov Models --- p.7
Chapter 2.2 --- Maximum Entropy Models --- p.8
Chapter 2.3 --- Conditional Random Fields --- p.10
Chapter 3 --- Our Proposed Model --- p.14
Chapter 3.1 --- Background --- p.14
Chapter 3.1.1 --- Problem Formulation --- p.14
Chapter 3.1.2 --- Conditional Random Fields --- p.16
Chapter 3.1.3 --- Semi-Markov Conditional Random Fields --- p.26
Chapter 3.2 --- The Formulation of Our Proposed Model --- p.28
Chapter 3.2.1 --- The Main Principle --- p.28
Chapter 3.2.2 --- The Detailed Formulation --- p.36
Chapter 3.2.3 --- Adapting Features from Original CRF to CRFDP --- p.51
Chapter 4 --- Experiments --- p.54
Chapter 4.1 --- Datasets --- p.55
Chapter 4.2 --- Features --- p.57
Chapter 4.3 --- Evaluation Metrics --- p.61
Chapter 4.4 --- Results and Discussion --- p.63
Chapter 5 --- Conclusions and Future Work --- p.67
Bibliography --- p.69
A --- p.76
B --- p.78
C --- p.88
APA, Harvard, Vancouver, ISO, and other styles
48

"Hybrid tag-set for natural language processing." 1999. http://library.cuhk.edu.hk/record=b5889925.

Full text
Abstract:
Leung Wai Kwong.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1999.
Includes bibliographical references (leaves 90-95).
Abstracts in English and Chinese.
Chapter 1 --- Introduction --- p.1
Chapter 1.1 --- Motivation --- p.1
Chapter 1.2 --- Objective --- p.3
Chapter 1.3 --- Organization of thesis --- p.3
Chapter 2 --- Background --- p.5
Chapter 2.1 --- Chinese Noun Phrases Parsing --- p.5
Chapter 2.2 --- Chinese Noun Phrases --- p.6
Chapter 2.3 --- Problems with Syntactic Parsing --- p.11
Chapter 2.3.1 --- Conjunctive Noun Phrases --- p.11
Chapter 2.3.2 --- De-de Noun Phrases --- p.12
Chapter 2.3.3 --- Compound Noun Phrases --- p.13
Chapter 2.4 --- Observations --- p.15
Chapter 2.4.1 --- Inadequacy in Part-of-Speech Categorization for Chinese NLP --- p.16
Chapter 2.4.2 --- The Need of Semantic in Noun Phrase Parsing --- p.17
Chapter 2.5 --- Summary --- p.17
Chapter 3 --- Hybrid Tag-set --- p.19
Chapter 3.1 --- Objectives --- p.19
Chapter 3.1.1 --- Resolving Parsing Ambiguities --- p.19
Chapter 3.1.2 --- Investigation of Nominal Compound Noun Phrases --- p.20
Chapter 3.2 --- Definition of Hybrid Tag-set --- p.20
Chapter 3.3 --- Introduction to Cilin --- p.21
Chapter 3.4 --- Problems with Cilin --- p.23
Chapter 3.4.1 --- Unknown words --- p.23
Chapter 3.4.2 --- Multiple Semantic Classes --- p.25
Chapter 3.5 --- Introduction to Chinese Word Formation --- p.26
Chapter 3.5.1 --- Disyllabic Word Formation --- p.26
Chapter 3.5.2 --- Polysyllabic Word Formation --- p.28
Chapter 3.5.3 --- Observation --- p.29
Chapter 3.6 --- Automatic Assignment of Hybrid Tag to Chinese Word --- p.31
Chapter 3.7 --- Summary --- p.34
Chapter 4 --- Automatic Semantic Assignment --- p.35
Chapter 4.1 --- Previous Researches on Semantic Tagging --- p.36
Chapter 4.2 --- SAUW - Automatic Semantic Assignment of Unknown Words --- p.37
Chapter 4.2.1 --- POS-to-SC Association (Process 1) --- p.38
Chapter 4.2.2 --- Morphology-based Deduction (Process 2) --- p.39
Chapter 4.2.3 --- Di-syllabic Word Analysis (Process 3 and 4) --- p.41
Chapter 4.2.4 --- Poly-syllabic Word Analysis (Process 5) --- p.47
Chapter 4.3 --- Illustrative Examples --- p.47
Chapter 4.4 --- Evaluation and Analysis --- p.49
Chapter 4.4.1 --- Experiments --- p.49
Chapter 4.4.2 --- Error Analysis --- p.51
Chapter 4.5 --- Summary --- p.52
Chapter 5 --- Word Sense Disambiguation --- p.53
Chapter 5.1 --- Introduction to Word Sense Disambiguation --- p.54
Chapter 5.2 --- Previous Works on Word Sense Disambiguation --- p.55
Chapter 5.2.1 --- Linguistic-based Approaches --- p.56
Chapter 5.2.2 --- Corpus-based Approaches --- p.58
Chapter 5.3 --- Our Approach --- p.60
Chapter 5.3.1 --- Bi-gram Co-occurrence Probabilities --- p.62
Chapter 5.3.2 --- Tri-gram Co-occurrence Probabilities --- p.63
Chapter 5.3.3 --- Design consideration --- p.65
Chapter 5.3.4 --- Error Analysis --- p.67
Chapter 5.4 --- Summary --- p.68
Chapter 6 --- Hybrid Tag-set for Chinese Noun Phrase Parsing --- p.69
Chapter 6.1 --- Resolving Ambiguous Noun Phrases --- p.70
Chapter 6.1.1 --- Experiment --- p.70
Chapter 6.1.2 --- Results --- p.72
Chapter 6.2 --- Summary --- p.78
Chapter 7 --- Conclusion --- p.80
Chapter 7.1 --- Summary --- p.80
Chapter 7.2 --- Difficulties Encountered --- p.83
Chapter 7.2.1 --- Lack of Training Corpus --- p.83
Chapter 7.2.2 --- Features of Chinese word formation --- p.84
Chapter 7.2.3 --- Problems with linguistic sources --- p.85
Chapter 7.3 --- Contributions --- p.86
Chapter 7.3.1 --- Enrichment to the Cilin --- p.86
Chapter 7.3.2 --- Enhancement in syntactic parsing --- p.87
Chapter 7.4 --- Further Researches --- p.88
Chapter 7.4.1 --- Investigation into words that undergo semantic changes --- p.88
Chapter 7.4.2 --- Incorporation of more information into the hybrid tag-set --- p.89
Chapter A --- POS Tag-set by Tsinghua University (清華大學) --- p.96
Chapter B --- Morphological Rules --- p.100
Chapter C --- Syntactic Rules for Di-syllabic Words Formation --- p.104
APA, Harvard, Vancouver, ISO, and other styles
49

Rasooli, Mohammad Sadegh. "Cross-Lingual Transfer of Natural Language Processing Systems." Thesis, 2019. https://doi.org/10.7916/d8-dqv9-ba34.

Full text
Abstract:
Accurate natural language processing systems rely heavily on annotated datasets. In the absence of such datasets, transfer methods can help to develop a model by transferring annotations from one or more rich-resource languages to the target language of interest. These methods are generally divided into two approaches: 1) annotation projection from translation data, aka parallel data, using supervised models in rich-resource languages, and 2) direct model transfer from annotated datasets in rich-resource languages. In this thesis, we demonstrate different methods for transfer of dependency parsers and sentiment analysis systems. We propose an annotation projection method that performs well in the scenarios for which a large amount of in-domain parallel data is available. We also propose a method which is a combination of annotation projection and direct transfer that can leverage a minimal amount of information from a small out-of-domain parallel dataset to develop highly accurate transfer models. Furthermore, we propose an unsupervised syntactic reordering model to improve the accuracy of dependency parser transfer for non-European languages. Finally, we conduct a diverse set of experiments for the transfer of sentiment analysis systems in different data settings. A summary of our contributions is as follows:
* We develop accurate dependency parsers using parallel text in an annotation projection framework. We make use of the fact that the density of word alignments is a valuable indicator of reliability in annotation projection.
* We develop accurate dependency parsers in the absence of a large amount of parallel data. We use the Bible data, which is orders of magnitude smaller than a conventional parallel dataset, to provide minimal cues for creating cross-lingual word representations. Our model is also capable of boosting the performance of annotation projection with a large amount of parallel data. Our model develops cross-lingual word representations for going beyond the traditional delexicalized direct transfer methods. Moreover, we propose a simple but effective word translation approach that brings in explicit lexical features from the target language in our direct transfer method.
* We develop different syntactic reordering models that can change the source treebanks in rich-resource languages, thus preventing learning a wrong model for a non-related language. Our experimental results show substantial improvements for non-European languages.
* We develop transfer methods for sentiment analysis in different data availability scenarios. We show that we can leverage cross-lingual word embeddings to create accurate sentiment analysis systems in the absence of annotated data in the target language of interest.
We believe that the novelties that we introduce in this thesis indicate the usefulness of transfer methods. This is appealing in practice, especially since we suggest eliminating the requirement for annotating new datasets for low-resource languages, which are expensive, if not impossible, to obtain.
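The core idea of annotation projection, copying dependency arcs from a parsed source sentence to the target side through word alignments, can be sketched in a few lines. The toy sentence and alignment below are invented for illustration; real projection must also cope with many-to-many and missing alignments:

```python
def project_dependencies(src_heads, alignment):
    """Project dependency arcs from source to target through alignments.

    src_heads: src_heads[i] is the head index of source token i (-1 for the root).
    alignment: dict mapping source index -> target index (assumed one-to-one here;
    handling denser or partial alignments is where the real difficulty lies).
    """
    tgt_heads = {}
    for dep, head in enumerate(src_heads):
        if dep not in alignment:
            continue                       # unaligned dependent: nothing to project
        if head == -1:
            tgt_heads[alignment[dep]] = -1  # the root stays the root
        elif head in alignment:
            tgt_heads[alignment[dep]] = alignment[head]
    return tgt_heads
```

For instance, projecting heads [1, -1, 1] (tokens 0 and 2 attached to the root token 1) through an alignment that swaps the first and last words yields the mirrored arcs on the target side.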
APA, Harvard, Vancouver, ISO, and other styles
50

"A robust unification-based parser for Chinese natural language processing." 2001. http://library.cuhk.edu.hk/record=b5895881.

Full text
Abstract:
Chan Shuen-ti Roy.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2001.
Includes bibliographical references (leaves 168-175).
Abstracts in English and Chinese.
Chapter 1. --- Introduction --- p.12
Chapter 1.1. --- The nature of natural language processing --- p.12
Chapter 1.2. --- Applications of natural language processing --- p.14
Chapter 1.3. --- Purpose of study --- p.17
Chapter 1.4. --- Organization of this thesis --- p.18
Chapter 2. --- Organization and methods in natural language processing --- p.20
Chapter 2.1. --- Organization of natural language processing system --- p.20
Chapter 2.2. --- Methods employed --- p.22
Chapter 2.3. --- Unification-based grammar processing --- p.22
Chapter 2.3.1. --- Generalized Phrase Structure Grammar (GPSG) --- p.27
Chapter 2.3.2. --- Head-driven Phrase Structure Grammar (HPSG) --- p.31
Chapter 2.3.3. --- Common drawbacks of UBGs --- p.33
Chapter 2.4. --- Corpus-based processing --- p.34
Chapter 2.4.1. --- Drawback of corpus-based processing --- p.35
Chapter 3. --- Difficulties in Chinese language processing and its related works --- p.37
Chapter 3.1. --- A glance at the history --- p.37
Chapter 3.2. --- Difficulties in syntactic analysis of Chinese --- p.37
Chapter 3.2.1. --- Writing system of Chinese causes segmentation problem --- p.38
Chapter 3.2.2. --- Words serving multiple grammatical functions without inflection --- p.40
Chapter 3.2.3. --- Word order of Chinese --- p.42
Chapter 3.2.4. --- The Chinese grammatical word --- p.43
Chapter 3.3. --- Related works --- p.45
Chapter 3.3.1. --- Unification grammar processing approach --- p.45
Chapter 3.3.2. --- Corpus-based processing approach --- p.48
Chapter 3.4. --- Restatement of goal --- p.50
Chapter 4. --- SERUP: Statistical-Enhanced Robust Unification Parser --- p.54
Chapter 5. --- Step One: automatic preprocessing --- p.57
Chapter 5.1. --- Segmentation of lexical tokens --- p.57
Chapter 5.2. --- Conversion of date, time and numerals --- p.61
Chapter 5.3. --- Identification of new words --- p.62
Chapter 5.3.1. --- Proper nouns – Chinese names --- p.63
Chapter 5.3.2. --- Other proper nouns and multi-syllabic words --- p.67
Chapter 5.4. --- Defining smallest parsing unit --- p.82
Chapter 5.4.1. --- The Chinese sentence --- p.82
Chapter 5.4.2. --- Breaking down the paragraphs --- p.84
Chapter 5.4.3. --- Implementation --- p.87
Chapter 6. --- Step Two: grammar construction --- p.91
Chapter 6.1. --- Criteria in choosing a UBG model --- p.91
Chapter 6.2. --- The grammar in details --- p.92
Chapter 6.2.1. --- The PHON feature --- p.93
Chapter 6.2.2. --- The SYN feature --- p.94
Chapter 6.2.3. --- The SEM feature --- p.98
Chapter 6.2.4. --- Grammar rules and features principles --- p.99
Chapter 6.2.5. --- Verb phrases --- p.101
Chapter 6.2.6. --- Noun phrases --- p.104
Chapter 6.2.7. --- Prepositional phrases --- p.113
Chapter 6.2.8. --- "Ba2" and "Bei4" constructions --- p.115
Chapter 6.2.9. --- The terminal node S --- p.119
Chapter 6.2.10. --- Summary of phrasal rules --- p.121
Chapter 6.2.11. --- Morphological rules --- p.122
Chapter 7. --- Step Three: resolving structural ambiguities --- p.128
Chapter 7.1. --- Sources of ambiguities --- p.128
Chapter 7.2. --- The traditional practices: an illustration --- p.132
Chapter 7.3. --- Deficiency of current practices --- p.134
Chapter 7.4. --- A new point of view: Wu (1999) --- p.140
Chapter 7.5. --- Improvement over Wu (1999) --- p.142
Chapter 7.6. --- Conclusion on semantic features --- p.146
Chapter 8. --- Implementation, performance and evaluation --- p.148
Chapter 8.1. --- Implementation --- p.148
Chapter 8.2. --- Performance and evaluation --- p.150
Chapter 8.2.1. --- The test set --- p.150
Chapter 8.2.2. --- Segmentation of lexical tokens --- p.150
Chapter 8.2.3. --- New word identification --- p.152
Chapter 8.2.4. --- Parsing unit segmentation --- p.156
Chapter 8.2.5. --- The grammar --- p.158
Chapter 8.3. --- Overall performance of SERUP --- p.162
Chapter 9. --- Conclusion --- p.164
Chapter 9.1. --- Summary of this thesis --- p.164
Chapter 9.2. --- Contribution of this thesis --- p.165
Chapter 9.3. --- Future work --- p.166
References --- p.168
Appendix I --- p.176
Appendix II --- p.181
Appendix III --- p.183
APA, Harvard, Vancouver, ISO, and other styles