Dissertations / Theses on the topic 'Probabilistic grammar'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 28 dissertations / theses for your research on the topic 'Probabilistic grammar.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Kwiatkowski, Thomas Mieczyslaw. "Probabilistic grammar induction from sentences and structured meanings." Thesis, University of Edinburgh, 2012. http://hdl.handle.net/1842/6190.
Full text
Stüber, Torsten. "Consistency of Probabilistic Context-Free Grammars." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2012. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-86943.
Full text
Afrin, Taniza. "Extraction of Basic Noun Phrases from Natural Language Using Statistical Context-Free Grammar." Thesis, Virginia Tech, 2001. http://hdl.handle.net/10919/33353.
Full text
Master of Science
Hsu, Hsin-jen. "A neurophysiological study on probabilistic grammatical learning and sentence processing." Diss., University of Iowa, 2009. https://ir.uiowa.edu/etd/243.
Full textBrookes, James William Rowe. "Probabilistic and multivariate modelling in Latin grammar : the participle-auxiliary alternation as a case study." Thesis, University of Manchester, 2014. https://www.research.manchester.ac.uk/portal/en/theses/probabilistic-and-multivariate-modelling-in-latin-grammar-the-participleauxiliary-alternation-as-a-case-study(4ff5b912-c410-41f2-94f2-859eb1ce5b21).html.
Full text
Buys, Jan Moolman. "Probabilistic tree transducers for grammatical error correction." Thesis, Stellenbosch : Stellenbosch University, 2013. http://hdl.handle.net/10019.1/85592.
Full textENGLISH ABSTRACT: We investigate the application of weighted tree transducers to correcting grammatical errors in natural language. Weighted finite-state transducers (FST) have been used successfully in a wide range of natural language processing (NLP) tasks, even though the expressiveness of the linguistic transformations they perform is limited. Recently, there has been an increase in the use of weighted tree transducers and related formalisms that can express syntax-based natural language transformations in a probabilistic setting. The NLP task that we investigate is the automatic correction of grammar errors made by English language learners. In contrast to spelling correction, which can be performed with a very high accuracy, the performance of grammar correction systems is still low for most error types. Commercial grammar correction systems mostly use rule-based methods. The most common approach in recent grammatical error correction research is to use statistical classifiers that make local decisions about the occurrence of specific error types. The approach that we investigate is related to a number of other approaches inspired by statistical machine translation (SMT) or based on language modelling. Corpora of language learner writing annotated with error corrections are used as training data. Our baseline model is a noisy-channel FST model consisting of an n-gram language model and a FST error model, which performs word insertion, deletion and replacement operations. The tree transducer model we use to perform error correction is a weighted top-down tree-to-string transducer, formulated to perform transformations between parse trees of correct sentences and incorrect sentences. Using an algorithm developed for syntax-based SMT, transducer rules are extracted from training data of which the correct version of sentences have been parsed. Rule weights are also estimated from the training data. 
Hypothesis sentences generated by the tree transducer are reranked using an n-gram language model. We perform experiments to evaluate the performance of different configurations of the proposed models. In our implementation an existing tree transducer toolkit is used. To make decoding time feasible sentences are split into clauses and heuristic pruning is performed during decoding. We consider different modelling choices in the construction of transducer rules. The evaluation of our models is based on precision and recall. Experiments are performed to correct various error types on two learner corpora. The results show that our system is competitive with existing approaches on several error types.
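The noisy-channel baseline described in the abstract above can be sketched in a few lines: each candidate correction c for an observed sentence is scored by combining a language-model probability P(c) with an error-model probability P(observed | c), and the highest-scoring candidate wins. The probability tables below are toy stand-ins, not the thesis's n-gram language model or FST error model.

```python
import math

# Minimal noisy-channel sketch. `lm` maps a correction to its language-model
# probability; `error_model` maps (observed, correction) pairs to the
# probability of the observed sentence given the correction. All values here
# are invented for illustration.

def noisy_channel_best(observed, candidates, lm, error_model):
    """Return the candidate with the highest log P(c) + log P(observed | c)."""
    def score(c):
        return math.log(lm[c]) + math.log(error_model[(observed, c)])
    return max(candidates, key=score)
```

In a real system the candidate set would be generated by the transducer and the two models trained on learner corpora; here the decision reduces to a single arg-max over scored hypotheses.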
Shan, Yin (Information Technology & Electrical Engineering, Australian Defence Force Academy, UNSW). "Program distribution estimation with grammar models." Awarded by: University of New South Wales - Australian Defence Force Academy, School of Information Technology and Electrical Engineering, 2005. http://handle.unsw.edu.au/1959.4/38737.
Full textPinnow, Eleni. "The role of probabilistic phonotactics in the recognition of reduced pseudowords." Diss., Online access via UMI:, 2009.
Find full text
Mora, Randall P., and Jerry L. Hill. "Service-Based Approach for Intelligent Agent Frameworks." International Foundation for Telemetering, 2011. http://hdl.handle.net/10150/595661.
Full text
This paper describes a service-based Intelligent Agent (IA) approach for machine learning and data mining of distributed heterogeneous data streams. We focus on an open architecture framework that enables the programmer/analyst to build an IA suite for mining, examining and evaluating heterogeneous data for semantic representations, while iteratively building the probabilistic model in real time to improve predictability. The framework facilitates model development and evaluation while delivering the capability to tune machine learning algorithms and models to achieve increasingly favorable scores prior to production deployment. The IA framework focuses on open-standard interoperability, simplifying integration into existing environments.
Torres, Parra Jimena Cecilia. "A Perception Based Question-Answering Architecture Derived from Computing with Words." Available to subscribers only, 2009. http://proquest.umi.com/pqdweb?did=1967797581&sid=1&Fmt=2&clientId=1509&RQT=309&VName=PQD.
Full text
Plum, Guenter Arnold. "Text and Contextual Conditioning in Spoken English: A genre approach." Thesis, The University of Sydney, 1988. http://hdl.handle.net/2123/608.
Full text
Plum, Guenter Arnold. "Text and Contextual Conditioning in Spoken English: A genre approach." University of Sydney. Linguistics, 1988. http://hdl.handle.net/2123/608.
Full text
MATSUBARA, Shigeki, and Yoshihide KATO. "Incremental Parsing with Adjoining Operation." Institute of Electronics, Information and Communication Engineers, 2009. http://hdl.handle.net/2237/15001.
Full text
Kalantari, John I. "A general purpose artificial intelligence framework for the analysis of complex biological systems." Diss., University of Iowa, 2017. https://ir.uiowa.edu/etd/5953.
Full text
Aycinena, Margaret Aida. "Probabilistic geometric grammars for object recognition." Thesis, Massachusetts Institute of Technology, 2005. http://hdl.handle.net/1721.1/34640.
Full text
Includes bibliographical references (p. 121-123).
This thesis presents a generative three-dimensional (3D) representation and recognition framework for classes of objects. The framework uses probabilistic grammars to represent object classes recursively in terms of their parts, thereby exploiting the hierarchical and substitutive structure inherent to many types of objects. The framework models the 3D geometric characteristics of object parts using multivariate conditional Gaussians over dimensions, position, and rotation. I present algorithms for learning geometric models and rule probabilities given parsed 3D examples and a fixed grammar. I also present a parsing algorithm for classifying unlabeled, unparsed 3D examples given a geometric grammar. Finally, I describe the results of a set of experiments designed to investigate the chosen model representation of the framework.
by Margaret Aida Aycinena.
S.M.
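The scoring idea in the abstract above (grammar rule probabilities combined with Gaussian models of part geometry) can be illustrated with a minimal sketch. The rule names, the single scalar feature per part, and all numeric values below are invented; the thesis itself uses multivariate conditional Gaussians over dimensions, position, and rotation.

```python
import math

# Toy score for one parse of an object: sum of log rule probabilities plus
# log Gaussian likelihoods of each part's (here one-dimensional) geometry.

def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian at x."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def parse_log_score(rules_used, rule_probs, part_features, part_models):
    """Combine structural (rule) and geometric (Gaussian) evidence."""
    score = sum(math.log(rule_probs[r]) for r in rules_used)
    for part, x in part_features.items():
        mean, var = part_models[part]
        score += math.log(gaussian_pdf(x, mean, var))
    return score
```

Classification then amounts to comparing the best parse score under each class grammar: a part whose geometry sits near its Gaussian mean contributes more to the score than one far from it.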
Carroll, Glenn R. "Learning probabilistic grammars for language modeling." [S.l.]: Universität Stuttgart, Fakultätsübergreifend / Sonstige Einrichtung, 1995. http://www.bsz-bw.de/cgi-bin/xvms.cgi?SWB7084251.
Full text
Álvaro Muñoz, Francisco. "Mathematical Expression Recognition based on Probabilistic Grammars." Doctoral thesis, Universitat Politècnica de València, 2015. http://hdl.handle.net/10251/51665.
Full text
[ES] Mathematical notation is well known and used all over the world. Humanity has evolved from simple methods of representing counts to the present formal notation capable of modelling complex problems. Moreover, mathematical expressions constitute a universal language in the scientific world, and many resources containing mathematics have been created during recent decades. However, in order to access all that information efficiently, scientific documents have to be digitized or produced directly in electronic formats. Although most people are able to understand and produce mathematical information, entering mathematical expressions into electronic devices requires learning special notations or using editors. Automatic mathematical expression recognition aims to bridge the gap between a person's knowledge and the input that computers accept. In this way, printed documents containing formulae could be digitized automatically, and handwriting could be used to enter mathematical notation directly into electronic devices. This thesis focuses on developing a method for recognizing mathematical expressions. In this document we propose a method for recognizing any type of formula (printed or handwritten) based on probabilistic grammars. To this end, we develop the formal statistical framework from which several probability distributions are derived. Throughout the document, we address the definition and estimation of all these sources of probabilistic information. Finally, we define the algorithm that, given an input, globally computes the most probable mathematical expression according to the statistical framework. An important aspect of this work is providing an objective evaluation of the results and presenting them using public data and standard measures.
For this reason, we study the problems of automatic evaluation in this field and look for the best solutions. We also present several experiments using public databases, and we have taken part in several international competitions. Furthermore, we have released most of the software developed in this thesis as open source. We have also explored some applications of mathematical expression recognition. Besides the direct applications of transcription and digitization, we present two important proposals. First, we developed mucaptcha, a method to tell humans and computers apart by handwriting mathematical expressions, which represents a novel application of formula recognition. Second, we address the problem of detecting and segmenting document structure using the formal statistical framework developed in this thesis, since both are two-dimensional problems that can be modelled with probabilistic grammars. The method developed in this thesis for recognizing mathematical expressions has achieved good results at different levels. This work has produced several publications in international conferences and journals, and has been awarded prizes in international competitions.
Álvaro Muñoz, F. (2015). Mathematical Expression Recognition based on Probabilistic Grammars [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/51665
Lee, Wing Kuen. "Interpreting tables in text using probabilistic two-dimensional context-free grammars /." View abstract or full-text, 2005. http://library.ust.hk/cgi/db/thesis.pl?COMP%202005%20LEEW.
Full textScicluna, James. "Grammatical inference of probalistic context-free grammars." Nantes, 2014. http://www.theses.fr/2014NANT2071.
Full text
Probabilistic Context-Free Grammars (PCFGs) are formal statistical models which describe probability distributions over strings and over tree structures of those strings. Grammatical inference is a subfield of machine learning where the task is to learn automata or grammars (such as PCFGs) from information about their languages. In this thesis, we are interested in grammatical inference of PCFGs from text. There are various applications for this problem, chief amongst which are unsupervised parsing and language modelling in natural language processing, and RNA secondary structure prediction in bioinformatics. PCFG inference is, however, a difficult problem for a variety of reasons, and despite its importance for various applications, only a few positive results have been obtained so far. Our main contribution in this thesis is a practical PCFG learning algorithm with some proven properties, based on a principled approach. We define a new subclass of PCFGs (very similar to the one defined in (Clark, 2010)) and use distributional learning and MDL-based techniques to learn this class of grammars. We obtain competitive results in experiments that evaluate unsupervised parsing and language modelling. A minor contribution of this thesis is a compendium of undecidability results for distances between PCFGs, along with two positive results on PCFGs. Having such results can help in the process of finding learning algorithms for PCFGs.
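For readers unfamiliar with PCFGs, the basic quantity such learning algorithms must fit, the probability a grammar assigns to a string, is computed by the inside algorithm. The sketch below assumes a grammar in Chomsky normal form with a simple dictionary encoding of binary and lexical rules; it is an illustration, not the thesis's inference algorithm.

```python
from collections import defaultdict

# Inside algorithm: chart[(i, j, A)] accumulates P(A =>* words[i:j]),
# summing over all derivations. Rules are encoded as
#   binary_rules[(A, B, C)] = P(A -> B C)
#   lexical_rules[(A, w)]   = P(A -> w)

def inside_probability(words, binary_rules, lexical_rules, start="S"):
    n = len(words)
    chart = defaultdict(float)
    for i, w in enumerate(words):
        for (A, word), p in lexical_rules.items():
            if word == w:
                chart[(i, i + 1, A)] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in binary_rules.items():
                    chart[(i, j, A)] += p * chart[(i, k, B)] * chart[(k, j, C)]
    return chart[(0, n, start)]
```

Summing over split points and rules is what distinguishes the string distribution from the tree distribution: several trees may yield the same string, and their probabilities add up.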
Beneš, Vojtěch. "Syntaktický analyzátor pro český jazyk." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2014. http://www.nusl.cz/ntk/nusl-236022.
Full textBensalem, Raja. "Construction de ressources linguistiques arabes à l’aide du formalisme de grammaires de propriétés en intégrant des mécanismes de contrôle." Thesis, Aix-Marseille, 2017. http://www.theses.fr/2017AIXM0503/document.
Full text
The building of syntactically informative Arabic linguistic resources is a major issue for the development of new machine processing tools. In this thesis, we propose to create an Arabic treebank that integrates a new type of information based on the Property Grammar formalism. A syntactic property is a relation between two units of a given syntactic structure. This grammar is automatically induced from the Arabic treebank ATB. We enriched this resource with the property representations of this grammar while retaining its qualities. We also applied this enrichment to the parsing results of a state-of-the-art analyzer, the Stanford Parser; this makes possible an evaluation using a set of measures computed on this resource. We structured the tags of the units in this grammar according to a type hierarchy, which permits varying the granularity level of these units and, consequently, the accuracy level of the information. We have thus been able to construct, using this grammar, other Arabic linguistic resources. Secondly, based on this new resource, we developed a probabilistic syntactic parser based on syntactic properties; this is the first analyzer of this type that has been applied to Arabic. In the learning model, we integrated a probabilistic lexicalized property grammar that may positively affect the parsing result and describe its syntactic structures with its properties. Finally, we evaluated the parsing results of this approach by comparing them to those of the Stanford Parser.
Toussenel, François. "Étiquetage probabiliste avec un grand jeu d'étiquettes en vue de l'analyse syntaxique complète." Paris 7, 2005. http://www.theses.fr/2005PA070087.
Full text
We explore the limits of supertagging with a hidden Markov model as a pre-processing step before full parsing, using a large Lexicalized Tree Adjoining Grammar automatically extracted from a treebank. We identify two major sources of difficulty in this approach (statistical issues due to heavy data sparseness, and a clash between the global nature of the information provided by the supertags and the local vision of the hidden Markov model), and then explore three possible ways to improve the tagging step. The first two (generalization of learning data and underspecification) make use of a feature structure to represent the supertags. The third addresses the second source of difficulty and relies on the structure of the supertags to prune sequences of supertags that can never result in a full parse.
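The HMM tagging step described above can be illustrated with standard Viterbi decoding. The tag set and probability tables below are toy values, not extracted LTAG supertags; the sketch assumes every word has an emission probability under every tag.

```python
import math

# Viterbi decoding for a first-order HMM tagger: find the most likely tag
# sequence given start, transition and emission probability tables.

def viterbi(words, tags, start_p, trans_p, emit_p):
    V = [{t: math.log(start_p[t]) + math.log(emit_p[t][words[0]]) for t in tags}]
    back = [{}]
    for w in words[1:]:
        scores, ptrs = {}, {}
        for t in tags:
            # Best predecessor tag for t at this position.
            prev, s = max(
                ((p, V[-1][p] + math.log(trans_p[p][t])) for p in tags),
                key=lambda x: x[1],
            )
            scores[t] = s + math.log(emit_p[t][w])
            ptrs[t] = prev
        V.append(scores)
        back.append(ptrs)
    best = max(tags, key=lambda t: V[-1][t])
    path = [best]
    for ptrs in reversed(back[1:]):
        path.append(ptrs[path[-1]])
    return list(reversed(path))
```

With supertags the state space is far larger than with part-of-speech tags, which is exactly where the data sparseness the abstract mentions begins to bite.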
Mamián, López Esther Sofía 1985. "Métodos de pontos interiores como alternativa para estimar os parâmetros de uma gramática probabilística livre do contexto." [s.n.], 2013. http://repositorio.unicamp.br/jspui/handle/REPOSIP/306757.
Full text
Master's dissertation - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica
Abstract: In a probabilistic language model (PLM), a probability function is defined to calculate the probability that a particular string occurs within a language. These probabilities are the PLM parameters and are learned from a corpus (samples of strings) belonging to the language. Once the probabilities are calculated, yielding a language model, the extent to which the model represents the language under study can be evaluated; this measure is called perplexity per word. The PLM proposed in this work is based on probabilistic context-free grammars. The classic estimation method, Inside-Outside, can be quite time-consuming, making it unviable for complex applications. This work proposes estimating the PLM parameters using interior point methods, obtaining good results in processing time, number of iterations until convergence, and perplexity per word.
Master's
Applied Mathematics
Master in Applied Mathematics
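Perplexity per word, the evaluation measure named in the abstract above, has a compact definition: PP = 2^(-(1/N) · Σ log₂ P(sentence)), where N is the total number of words in the test set. The sketch below assumes per-sentence probabilities have already been produced by some language model; the numbers in the usage are toy values.

```python
import math

# Perplexity per word over a test set: exponentiated negative average
# log2-probability. Lower is better; a model assigning probability 1 to
# everything would score 1.0.

def perplexity_per_word(sentence_probs, total_words):
    """sentence_probs: P(sentence) for each test sentence; total_words: N."""
    log_sum = sum(math.log2(p) for p in sentence_probs)
    return 2 ** (-log_sum / total_words)
```

For example, a single two-word sentence assigned probability 0.25 gives a perplexity of 2, matching the intuition of a uniform choice between two words at each position.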
Fu, Yi-Ting (傅怡婷). "Learning Semantic Parsing Using Probabilistic Context-Free Grammar in Chinese Poetry Domains." Thesis, 2004. http://ndltd.ncl.edu.tw/handle/bhdau2.
Full text
National Tsing Hua University (國立清華大學)
Institute of Information Systems and Applications (資訊系統與應用研究所)
93
Statistical models have been used quite successfully in Natural Language Processing to recover hidden structure such as part-of-speech tags or syntactic structure. This thesis considers semantic parsing and tagging of classical Chinese poetry lines. There are five aims in this thesis: (1) construct semantic grammars; (2) modify the grammars and learn their probabilities from the training corpus; (3) parse sentences into tree structures; (4) evaluate the accuracy of the parsing results; and (5) compare with a Hidden Markov Model bi-gram tagger. In the first three tasks, we assumed that the categories of the Chinese Thesaurus are representative enough to help us analyze the semantics of the sentences, and the semantic grammars were built upon these semantic categories and semantic rules. We modified the grammars and learned the probabilities from training data with the Inside-Outside algorithm, and the Viterbi algorithm was used to find the most likely parse. In the last two tasks, we found that the PCFG semantic parser predicts semantic tags better under data sparseness and has a greater ability to resolve ambiguity. We believe that the parsing results might find broad use in machine translation, poetry generation, etc. in the future.
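The Viterbi step mentioned above, finding the single most likely parse under a PCFG, can be sketched as a max-product variant of CKY: identical to the inside algorithm except that it maximizes over derivations instead of summing. The grammar encoding below is an assumed toy format, not the thesis's semantic grammar.

```python
from collections import defaultdict

# Viterbi CKY for a PCFG in Chomsky normal form: best[(i, j, A)] holds the
# probability of the single most likely derivation of words[i:j] from A.
# Rules: binary_rules[(A, B, C)] = P(A -> B C), lexical_rules[(A, w)] = P(A -> w).

def viterbi_parse_prob(words, binary_rules, lexical_rules, start="S"):
    n = len(words)
    best = defaultdict(float)
    for i, w in enumerate(words):
        for (A, word), p in lexical_rules.items():
            if word == w:
                best[(i, i + 1, A)] = max(best[(i, i + 1, A)], p)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (A, B, C), p in binary_rules.items():
                    cand = p * best[(i, k, B)] * best[(k, j, C)]
                    best[(i, j, A)] = max(best[(i, j, A)], cand)
    return best[(0, n, start)]
```

Keeping backpointers alongside `best` would recover the tree itself; here only the probability of the most likely parse is returned.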
Lin, Jia-Kuan (林家寬). "Affective Structure Modeling of Speech for Emotion Recognition Using Probabilistic Context Free Grammar." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/38279767034205399000.
Full text
National Cheng Kung University (國立成功大學)
Institute of Medical Informatics (醫學資訊研究所)
102
Speech is the most natural way to communicate and is rich in emotional information. Recognition of emotions in speech plays an important role in affective computing. Related research on utterance-level and segment-level processing lacks an understanding of the underlying structure of emotional speech. In this thesis, a hierarchical approach to modeling affective structure based on a probabilistic context-free grammar is proposed for recognition. The Canny edge detection algorithm is employed to detect hypothesized segment boundaries in the speech signal according to spectral similarity. Emotion profiles generated from an SVM-based classification model are used to find a maximum-change boundary between segments. Then, a binary tree is constructed to derive a hierarchical structure with multi-layer speech segments. Vector quantization is further used to generate an emotion-profile codebook and a hierarchical representation of the speech segments. A probabilistic context-free grammar is adopted to model the hierarchical relations between codewords for affective structure modeling. To evaluate the proposed method, the Berlin emotional speech database (EMO-DB), with 1495 utterances covering 7 emotions, was used with a leave-one-speaker-out cross-validation scheme. To investigate the effect of utterance length, concatenations of two or more utterances from the database were also evaluated. The experimental results show that the proposed method achieved an emotion recognition accuracy of 87.22% on long utterances and outperformed the conventional SVM-based method. Further work on collecting more realistic corpora is needed for the analysis and recognition of emotions in spontaneous speech.
Cunha, Jessica Megane Taveira da. "Probabilistic Grammatical Evolution." Master's thesis, 2021. http://hdl.handle.net/10316/96066.
Full text
Grammatical Evolution (GE) [1] is one of the most popular variants of Genetic Programming (GP) [2] and has been successfully used in a wide range of problem domains. Since the original proposal, many improvements have been introduced in GE to improve its performance by addressing some of its main issues, namely low locality and high redundancy [3, 4].
In grammar-based GP methods, the choice of grammar has a significant impact on the quality of the generated solutions, since it is the grammar that defines the search space [5]. In this work, we present four variants of GE that, during the evolutionary process, explore the search space by updating the weights of each rule of the grammar. These variants introduce two alternative representation types, two grammar adjustment methods, and a new mapping method using a Probabilistic Context-Free Grammar (PCFG).
The first method is Probabilistic Grammatical Evolution (PGE), in which individuals are represented by a list of real values (genotype), each value denoting the probability of selecting a derivation rule. The genotype is mapped into a solution (phenotype) to the problem at hand using a PCFG. At each generation, the probabilities of each rule in the grammar are updated based on the expansion rules used by the best individual. Co-evolutionary Probabilistic Grammatical Evolution (Co-PGE) employs the same representation of individuals and introduces a new technique to update the grammar's probabilities, in which each individual is assigned a PCFG whose derivation-option probabilities are changed at each generation using a mutation-like operator. In both methods, the individuals are remapped after the grammar is updated.
Probabilistic Structured Grammatical Evolution (PSGE) and Co-evolutionary Probabilistic Structured Grammatical Evolution (Co-PSGE) were created by adapting the mapping and probability-update mechanisms of PGE and Co-PGE to Structured Grammatical Evolution (SGE), a method that was proposed to overcome the issues of GE while improving its performance [6]. These variants use as genotype a set of dynamic lists, one for each non-terminal of the grammar, with each element of a list being the probability used to map the individual with the PCFG.
We analyse and compare the performance of all the methods on six benchmark problems. When compared to GE, the results showed that PGE and Co-PGE are statistically similar or better on all problems, while PSGE and Co-PSGE are statistically better on all problems. We also highlight Co-PSGE, since it is statistically superior to SGE on some problems, making it competitive with the state of the art. We also performed an analysis of the representations, and the results showed that PSGE and Co-PSGE have less redundancy, and all approaches exhibited better locality than GE, which allows for a better exploration of the search space.
The analyses conducted showed that the evolved grammars help guide the evolutionary process and provide us with information about the most relevant production rules for generating better solutions. In addition, they can also be used to generate a sampling of solutions with better average fitness.
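The PCFG-based genotype-to-phenotype mapping this abstract describes can be illustrated with a short sketch. This is not the thesis's implementation: the toy grammar, function names, and wrap-around codon handling below are assumptions made purely for illustration. Each codon (a probability in [0, 1)) selects a production of the leftmost non-terminal by walking the cumulative probability mass of the grammar's rules.

```python
import itertools

# Illustrative PCFG: non-terminal -> list of (expansion, probability).
# Expansions are tuples of symbols; angle-bracketed strings are non-terminals.
PCFG = {
    "<expr>": [(("<expr>", "+", "<expr>"), 0.3),
               (("<var>",), 0.7)],
    "<var>": [(("x",), 0.5), (("y",), 0.5)],
}

def is_nonterminal(sym):
    return sym.startswith("<")

def map_genotype(genotype, grammar, start="<expr>", max_expansions=100):
    """Map a list of probabilities (codons) to a phenotype string.

    Each codon in [0, 1) selects a production of the leftmost
    non-terminal by accumulating the rules' probability mass,
    in the spirit of the PCFG-based mapping described for PGE.
    """
    codons = itertools.cycle(genotype)  # wrap around if codons run out
    symbols = [start]
    for _ in range(max_expansions):
        try:
            i = next(j for j, s in enumerate(symbols) if is_nonterminal(s))
        except StopIteration:
            break  # no non-terminals left: mapping is complete
        codon, cumulative = next(codons), 0.0
        for expansion, prob in grammar[symbols[i]]:
            cumulative += prob
            if codon < cumulative:
                symbols[i:i + 1] = list(expansion)
                break
        else:  # guard against rounding: take the last production
            symbols[i:i + 1] = list(grammar[symbols[i]][-1][0])
    return "".join(symbols)
```

Under this toy grammar, `map_genotype([0.9, 0.2], PCFG)` derives `<expr> → <var> → x`, yielding `"x"`. Updating the probabilities attached to each production (per generation in PGE, or per individual in Co-PGE) then biases future mappings toward the rules that produced fitter solutions.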
FCT
Nguyen, Ngoc Tran. "Étude de transformations grammaticales pour l'entraînement de grammaires probabilistes hors-contexte." Thèse, 2002. http://hdl.handle.net/1866/14498.
Full text
Gotti, Fabrizio. "L'atténuation statistique des surdétections d'un correcteur grammatical symbolique." Thèse, 2012. http://hdl.handle.net/1866/9809.
Full text
Grammar checking software sometimes erroneously flags a correct word sequence as an error, a problem we call overdetection in the present study. We describe the development of a system for identifying and filtering out the overdetections produced by the French grammar checker designed by the firm Druide Informatique. Various families of classifiers have been trained in a supervised way for 14 types of detections flagged by the grammar checker, using features that capture diverse linguistic phenomena (syntactic dependency links, POS tags, word context exploration, etc.), extracted from sentences with and without overdetections. Eight of the 14 classifiers we trained are now part of the latest version of a very popular commercial grammar checker. Moreover, our experiments have shown that statistical language models, SVMs and word sense disambiguation can all contribute to the improvement of these classifiers. This project is a striking illustration of a machine learning component successfully integrated within a robust, commercial natural language processing application.
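At its core, the overdetection filtering described above is binary classification over linguistic features. The stdlib-only sketch below uses a plain perceptron and invented features and data purely for illustration; the actual system relied on Druide's proprietary detections and far richer features (dependency links, POS tags, context exploration), and on stronger learners such as SVMs and statistical language models.

```python
# Hedged sketch: overdetection filtering as binary classification.
# Features and data are invented for illustration only.

def train_perceptron(rows, labels, epochs=20, lr=0.1):
    """Train a plain perceptron: sign(w.x + b) predicts the label."""
    w, b = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred  # -1, 0, or +1
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

# Each row: [LM plausibility of the flagged text, POS-pattern frequency,
#            agreement flag] -- hypothetical features for a detection.
X = [[0.90, 0.80, 1],  # fluent context: the detection was an overdetection
     [0.20, 0.10, 0],  # genuinely ungrammatical: the detection was correct
     [0.85, 0.70, 1],
     [0.30, 0.20, 0]]
y = [1, 0, 1, 0]       # 1 = overdetection (suppress), 0 = keep the flag

w, b = train_perceptron(X, y)
```

A new detection whose features resemble the fluent rows is then classified as an overdetection and suppressed, while low-plausibility detections are kept; the thesis's contribution lies in training one such (much richer) classifier per detection type.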