Dissertations / Theses on the topic 'Language models'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the top 50 dissertations / theses for your research on the topic 'Language models.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations / theses across a wide variety of disciplines and organise your bibliography correctly.

1

Livingstone, Daniel Jack. "Computer models of the evolution of language and languages." Thesis, University of the West of Scotland, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.398331.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Ryder, Robin Jeremy. "Phylogenetic models of language diversification." Thesis, University of Oxford, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.543009.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Waegner, Nicholas Paul. "Stochastic models for language acquisition." Thesis, University of Cambridge, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.309214.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Niesler, Thomas Richard. "Category-based statistical language models." Thesis, University of Cambridge, 1997. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.627372.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Wallach, Hanna Megan. "Structured topic models for language." Thesis, University of Cambridge, 2008. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.612547.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Douzon, Thibault. "Language models for document understanding." Electronic Thesis or Diss., Lyon, INSA, 2023. http://www.theses.fr/2023ISAL0075.

Full text
Abstract:
Every day, countless documents are received and processed by companies worldwide. In an effort to reduce the cost of processing each document, the largest companies have resorted to document automation technologies. In an ideal world, a document can be automatically processed without any human intervention: its content is read, and information is extracted and forwarded to the relevant service. The state-of-the-art techniques have evolved quickly in the last decades, from rule-based algorithms to statistical models. This thesis focuses on machine learning models for document information extraction. Recent advances in model architecture for natural language processing have shown the importance of the attention mechanism. Transformers have revolutionized the field by generalizing the use of attention and by pushing self-supervised pre-training to the next level. In the first part, we confirm that transformers with appropriate pre-training are able to perform document understanding tasks with high performance. We show that, when used as token classifiers for information extraction, transformers learn the task far more efficiently than recurrent networks: they need only a small proportion of the training data to approach maximum performance. This highlights the importance of self-supervised pre-training for subsequent fine-tuning. In the following part, we design specialized pre-training tasks to better prepare the model for specific data distributions such as business documents. By acknowledging the specificities of business documents, such as their table structure and their over-representation of numeric figures, we are able to target specific skills useful for the model in its future tasks. We show that these new tasks improve the model's downstream performance, even with small models. Using this pre-training approach, we are able to reach the performance of significantly bigger models without any additional cost during fine-tuning or inference. Finally, in the last part, we address one drawback of the transformer architecture: its computational cost on long sequences. We show that efficient architectures derived from the classic transformer require fewer resources and perform better on long sequences. However, due to how they approximate the attention computation, efficient models suffer from a small but significant performance drop on short sequences compared to classical architectures. This incentivizes the use of different models depending on the input length and enables concatenating multimodal inputs into a single sequence.
APA, Harvard, Vancouver, ISO, and other styles
7

Townsend, Duncan Clarke McIntire. "Using a symbolic language parser to improve Markov language models." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/100621.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 31-32).
This thesis presents a hybrid approach to natural language processing that combines an n-gram (Markov) model with a symbolic parser. In concert, these two techniques are applied to the problem of sentence simplification. The n-gram system comprises a relational database backend with a frontend application that presents a homogeneous interface for both direct n-gram lookup and Markov approximation. The query language exposed by the frontend also applies lexical information from the START natural language system to allow queries based on part of speech. Using the START natural language system's parser, English sentences are transformed into a collection of structural, syntactic, and lexical statements that are uniquely well-suited to the process of simplification. After reducing the parse of the sentence, the resulting expressions can be processed back into English. These reduced sentences are ranked by likelihood under the n-gram model.
by Duncan Clarke McIntire Townsend.
M. Eng.
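The ranking step described in the abstract above is simple to illustrate: an n-gram (Markov) model assigns each candidate simplification a likelihood, and the highest-scoring sentence is kept. The following is a minimal, self-contained sketch of that idea, assuming a toy corpus and add-one smoothing; it illustrates the general technique, not Townsend's START-backed system.

```python
from collections import defaultdict
import math

class BigramModel:
    """Minimal bigram language model with add-one smoothing."""
    def __init__(self, corpus):
        self.unigrams = defaultdict(int)
        self.bigrams = defaultdict(int)
        for sentence in corpus:
            tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
            for i, tok in enumerate(tokens):
                self.unigrams[tok] += 1
                if i > 0:
                    self.bigrams[(tokens[i - 1], tok)] += 1
        self.vocab_size = len(self.unigrams)

    def log_prob(self, sentence):
        """Log-likelihood of a sentence under the smoothed bigram model."""
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        lp = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            lp += math.log((self.bigrams[(prev, cur)] + 1) /
                           (self.unigrams[prev] + self.vocab_size))
        return lp

corpus = ["the cat sat on the mat", "the dog sat on the rug"]
model = BigramModel(corpus)
# Rank candidate simplified sentences by likelihood, keeping the best.
candidates = ["the cat sat", "sat cat the"]
print(max(candidates, key=model.log_prob))  # -> "the cat sat"
```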
APA, Harvard, Vancouver, ISO, and other styles
8

Buttery, P. J. "Computational models for first language acquisition." Thesis, University of Cambridge, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.597195.

Full text
Abstract:
This work investigates a computational model of first language acquisition; the Categorial Grammar Learner or CGL. The model builds on the work of Villavicencio, who created a parametric Categorial Grammar learner that organises its parameters into an inheritance hierarchy, and also on the work of Buszkowski and Kanazawa, who demonstrated the learnability of a k-valued Classic Categorial Grammar (which uses only the rules of function application) from strings. The CGL is able to learn a k-valued General Categorial Grammar (which uses the rules of function application, function composition and Generalised Weak Permutation). The novel concept of Sentence Objects (simple strings, augmented strings, unlabelled structures and functor-argument structures) is presented; these are potential points from which learning may commence. Augmented strings (which are strings augmented with some basic syntactic information) are suggested as a sensible input to the CGL as they are cognitively plausible objects and have greater information content than strings alone. Building on the work of Siskind, a method for constructing augmented strings from unordered logic forms is detailed, and it is suggested that augmented strings are simply a representation of the constraints placed on the space of possible parses by a string's associated semantic content. The CGL makes crucial use of a statistical Memory Module (constructed from a type memory and a word order memory) that is used both to constrain hypotheses and to handle data which is noisy or parametrically ambiguous. A consequence of the Memory Module is that the CGL learns in an incremental fashion. This echoes real child learning as documented in Brown's Stages of Language Development and also as alluded to by an included corpus study of child speech. Furthermore, the CGL learns faster when initially presented with simpler linguistic data; a further corpus study of child-directed speech suggests that this echoes the input provided to children. The CGL is demonstrated to learn from real data. It is evaluated against previous parametric learners (the Triggering Learning Algorithm of Gibson and Wexler and the Structural Triggers Learner of Fodor and Sakas) and is found to be more efficient.
APA, Harvard, Vancouver, ISO, and other styles
9

Nkadimeng, Calvin. "Language identification using Gaussian mixture models." Thesis, Stellenbosch : University of Stellenbosch, 2010. http://hdl.handle.net/10019.1/4170.

Full text
Abstract:
Thesis (MScEng (Electrical and Electronic Engineering))--University of Stellenbosch, 2010.
ENGLISH ABSTRACT: The importance of language identification for African languages is increasing dramatically due to the development of telecommunication infrastructure and the resulting growth in the volume of data and speech traffic on public networks. By automatically processing raw speech data, the vital assistance given to people in distress can be sped up, by referring their calls to a person knowledgeable in that language. To this effect, a speech corpus was developed and various algorithms were implemented and tested on raw telephone speech data. These algorithms entailed data preparation, signal processing, and statistical analysis aimed at discriminating between languages. Gaussian Mixture Models (GMMs) were chosen as the statistical model for this research due to their ability to represent an entire language with a single stochastic model that does not require phonetic transcription. Language identification for African languages using GMMs is feasible, although a few challenges, such as proper classification and an accurate study of the relationships between languages, still need to be overcome. Other methods that make use of phonetically transcribed data should be explored and tested with the new corpus for the research to be more rigorous.
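To illustrate the approach (a sketch, not the thesis's own code), the snippet below trains one Gaussian mixture per language on acoustic feature vectors and identifies an utterance by maximum average log-likelihood, using scikit-learn. The language labels, the 13-dimensional "features", and the mixture size are all placeholder assumptions; a real system would use, e.g., MFCC frames extracted from the telephone speech.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Placeholder training data: one matrix of feature frames per language.
rng = np.random.default_rng(0)
train = {
    "language_A": rng.normal(0.0, 1.0, size=(500, 13)),  # stand-in features
    "language_B": rng.normal(0.5, 1.2, size=(500, 13)),
}

# One GMM per language; a single stochastic model represents the language,
# with no phonetic transcription needed.
models = {
    lang: GaussianMixture(n_components=8, covariance_type="diag").fit(X)
    for lang, X in train.items()
}

def identify(utterance_frames):
    """Return the language whose GMM best explains the utterance frames."""
    return max(models, key=lambda lang: models[lang].score(utterance_frames))

test_utterance = rng.normal(0.5, 1.2, size=(200, 13))
print(identify(test_utterance))  # -> "language_B"
```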
APA, Harvard, Vancouver, ISO, and other styles
10

Schuster, Ingmar. "Probabilistic models of natural language semantics." Doctoral thesis, Universitätsbibliothek Leipzig, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-204503.

Full text
Abstract:
This thesis tackles the problem of modeling the semantics of natural language. Neural network models are reviewed and a new Bayesian approach is developed and evaluated. As the performance of standard Monte Carlo algorithms proved to be unsatisfactory for the developed models, the main focus lies on a new adaptive algorithm from the Sequential Monte Carlo (SMC) family. The Gradient Importance Sampling (GRIS) algorithm developed in the thesis is shown to give very good performance compared to many adaptive Markov Chain Monte Carlo (MCMC) algorithms on a range of complex target distributions. Another advantage compared to MCMC is that GRIS provides a straightforward estimate of model evidence. Finally, Sample Inflation is introduced as a means to reduce variance and speed up mode finding in Importance Sampling and SMC algorithms. Sample Inflation provides provably consistent estimates and is empirically found to improve the convergence of integral estimates.
APA, Harvard, Vancouver, ISO, and other styles
11

Damljanovic, Danica. "Natural language interfaces to conceptual models." Thesis, University of Sheffield, 2011. http://etheses.whiterose.ac.uk/1630/.

Full text
Abstract:
Accessing structured data in the form of ontologies currently requires the use of formal query languages (e.g., SeRQL or SPARQL) which pose significant difficulties for non-expert users. One way to lower the learning overhead and make ontology queries more straightforward is through a Natural Language Interface (NLI). While there are existing NLIs to structured data with reasonable performance, they tend to require expensive customisation to each new domain. Additionally, they often require specific adherence to a pre-defined syntax which, in turn, means that users still have to undergo training. In this thesis, we study the usability of NLIs from two perspectives: that of the developer who is customising the NLI system, and that of the end-user who uses it for querying. We investigate whether usability methods such as feedback and clarification dialogs can increase the usability for end users and reduce the customisation effort for the developers. To that end, we have developed two systems, QuestIO and FREyA, whose design, evaluation and comparison with similar systems form the core of the contribution of this thesis.
APA, Harvard, Vancouver, ISO, and other styles
12

Delmestri, Antonella. "Data Driven Models for Language Evolution." Doctoral thesis, Università degli studi di Trento, 2011. https://hdl.handle.net/11572/368357.

Full text
Abstract:
Natural languages that originate from a common ancestor are genetically related; words are the core of any language, and cognates are words sharing the same ancestor and etymology. Cognate identification, therefore, represents the foundation upon which the evolutionary history of languages may be discovered, while linguistic phylogenetic inference aims to estimate the genetic relationships that exist between them. In this thesis, using several techniques originally developed for biological sequence analysis, we have designed a data-driven orthographic learning system for measuring string similarity and have successfully applied it to the tasks of cognate identification and phylogenetic inference. Our system has outperformed the best comparable phonetic and orthographic cognate identification models previously reported in the literature, with statistically significant and remarkably stable results, regardless of the size of the training dataset. When applied to phylogenetic inference of the Indo-European language family, whose higher structure does not yet have consensus, our method has estimated phylogenies which are compatible with the benchmark tree and has correctly reproduced all the established major language groups and subgroups present in the dataset.
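A concrete example of the biological sequence-analysis machinery such work builds on is global alignment. The sketch below implements plain Needleman-Wunsch scoring between orthographic word forms; note that the thesis's system learns its scoring from data, whereas the match/mismatch/gap costs here are fixed, illustrative assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score between two words (Needleman-Wunsch)."""
    m, n = len(a), len(b)
    # dp[i][j] = best score aligning a[:i] with b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = i * gap
    for j in range(1, n + 1):
        dp[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + sub,   # substitution/match
                           dp[i - 1][j] + gap,       # gap in b
                           dp[i][j - 1] + gap)       # gap in a
    return dp[m][n]

# Orthographic similarity between a candidate cognate pair and a random pair:
print(needleman_wunsch("nacht", "night"))  # related pair scores higher
print(needleman_wunsch("nacht", "agua"))   # unrelated pair scores lower
```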
APA, Harvard, Vancouver, ISO, and other styles
13

Delmestri, Antonella. "Data Driven Models for Language Evolution." Doctoral thesis, University of Trento, 2011. http://eprints-phd.biblio.unitn.it/473/1/PhD-Thesis_Uploaded.pdf.

Full text
Abstract:
Natural languages that originate from a common ancestor are genetically related; words are the core of any language, and cognates are words sharing the same ancestor and etymology. Cognate identification, therefore, represents the foundation upon which the evolutionary history of languages may be discovered, while linguistic phylogenetic inference aims to estimate the genetic relationships that exist between them. In this thesis, using several techniques originally developed for biological sequence analysis, we have designed a data-driven orthographic learning system for measuring string similarity and have successfully applied it to the tasks of cognate identification and phylogenetic inference. Our system has outperformed the best comparable phonetic and orthographic cognate identification models previously reported in the literature, with statistically significant and remarkably stable results, regardless of the size of the training dataset. When applied to phylogenetic inference of the Indo-European language family, whose higher structure does not yet have consensus, our method has estimated phylogenies which are compatible with the benchmark tree and has correctly reproduced all the established major language groups and subgroups present in the dataset.
APA, Harvard, Vancouver, ISO, and other styles
14

Miao, Yishu. "Deep generative models for natural language processing." Thesis, University of Oxford, 2017. http://ora.ox.ac.uk/objects/uuid:e4e1f1f9-e507-4754-a0ab-0246f1e1e258.

Full text
Abstract:
Deep generative models are essential to Natural Language Processing (NLP) due to their outstanding ability to use unlabelled data, to incorporate abundant linguistic features, and to learn interpretable dependencies among data. As the structure becomes deeper and more complex, having an effective and efficient inference method becomes increasingly important. In this thesis, neural variational inference is applied to carry out inference for deep generative models. While traditional variational methods derive an analytic approximation for the intractable distributions over latent variables, here we construct an inference network conditioned on the discrete text input to provide the variational distribution. The powerful neural networks are able to approximate complicated non-linear distributions and open up possibilities for more interesting and complicated generative models. Therefore, we develop the potential of neural variational inference and apply it to a variety of models for NLP with continuous or discrete latent variables. This thesis is divided into three parts. Part I introduces a generic variational inference framework for generative and conditional models of text. For continuous or discrete latent variables, we apply a continuous reparameterisation trick or the REINFORCE algorithm to build low-variance gradient estimators. To further explore Bayesian non-parametrics in deep neural networks, we propose a family of neural networks that parameterise categorical distributions with continuous latent variables. Using the stick-breaking construction, an unbounded categorical distribution is incorporated into our deep generative models which can be optimised by stochastic gradient back-propagation with a continuous reparameterisation. Part II explores continuous latent variable models for NLP. Chapter 3 discusses the Neural Variational Document Model (NVDM): an unsupervised generative model of text which aims to extract a continuous semantic latent variable for each document. In Chapter 4, the neural topic models modify the neural document models by parameterising categorical distributions with continuous latent variables, where the topics are explicitly modelled by discrete latent variables. The models are further extended to neural unbounded topic models with the help of the stick-breaking construction, and a truncation-free variational inference method is proposed based on a Recurrent Stick-breaking construction (RSB). Chapter 5 describes the Neural Answer Selection Model (NASM) for learning a latent stochastic attention mechanism to model the semantics of question-answer pairs and predict their relatedness. Part III discusses discrete latent variable models. Chapter 6 introduces latent sentence compression models. The Auto-encoding Sentence Compression Model (ASC), as a discrete variational auto-encoder, generates a sentence by a sequence of discrete latent variables representing explicit words. The Forced Attention Sentence Compression Model (FSC) incorporates a combined pointer network biased towards the usage of words from the source sentence, which significantly improves the performance when jointly trained with the ASC model in a semi-supervised learning fashion. Chapter 7 describes the Latent Intention Dialogue Models (LIDM) that employ a discrete latent variable to learn underlying dialogue intentions. Additionally, the latent intentions can be interpreted as actions guiding the generation of machine responses, which could be further refined autonomously by reinforcement learning. Finally, Chapter 8 summarizes our findings and outlines directions for future work.
APA, Harvard, Vancouver, ISO, and other styles
15

Morganti, Caroline (Caroline Taylor). "Applying natural language models and causal models to project management systems." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/119577.

Full text
Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 97-101).
This thesis concerns itself with two problems. First, it examines ways in which to use natural language features in time-varying data in predictive models, specifically applied to the problem of software project maintenance. We attempted to integrate this natural language data into our existing predictive models for project management applications. Second, we began work on creating an easy-to-use, extensible causal modeling framework, a Python package called CEModels. This package allows users to create causal inference models using input data. We tested this framework on project management data as well.
by Caroline Morganti.
M. Eng.
APA, Harvard, Vancouver, ISO, and other styles
16

Rodriguez-Sanchez, I. "Matrix models of second language vocabulary acquisition." Thesis, Swansea University, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.638702.

Full text
Abstract:
Most current research in L2 vocabulary acquisition has been too focused on what it is to learn a word, and has neglected how whole vocabularies grow or decline. In general, it is assumed that vocabulary gains and losses are incremental and follow a linear progression. This thesis postulates a model which considers several discrete stages of knowledge and accounts for the unstable nature of vocabulary knowledge, where words can change from one state to any other. Matrix algebra is a tool capable of operating with such a model and producing long-term forecasts of vocabulary size. Our experimental work describes the retention and the overall growth of the vocabulary of advanced learners of Spanish. These experiments show that forecasts of vocabulary size generated by the matrix model are far more accurate than those generated by a linear model. With data from two self-rating tasks, each containing a large number of words and completed within a given interval, we build matrices which generate forecasts of vocabulary knowledge. These forecasts correlate highly with the actual knowledge measured three and four months later. This methodology is tested with subjects from various groups, using words from different frequency bands and different measurement scales. In addition, we indicate ways of identifying matrices likely to generate inaccurate predictions. This methodology is considered a step towards the establishment of a model for L2 vocabulary acquisition.
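To make the matrix model concrete, here is a small worked sketch of the underlying idea: knowledge states form a Markov chain, and powers of the transition matrix yield long-term forecasts of vocabulary size. The states, probabilities, and intervals below are invented for illustration and are not figures from the thesis.

```python
import numpy as np

# Hypothetical three-state knowledge model: a word is unknown, partially
# known, or known. Each row gives transition probabilities over one study
# interval (rows sum to 1); the numbers are illustrative only.
P = np.array([
    [0.70, 0.20, 0.10],   # unknown -> unknown / partial / known
    [0.15, 0.55, 0.30],   # partial -> ...
    [0.05, 0.15, 0.80],   # known   -> words can also be forgotten
])

v0 = np.array([0.60, 0.25, 0.15])   # initial distribution over states

# Forecast the state distribution after k intervals: v_k = v_0 @ P^k
for k in (1, 3, 12):
    vk = v0 @ np.linalg.matrix_power(P, k)
    print(f"after {k:2d} intervals, proportion known = {vk[2]:.3f}")
```

Because words can move from any state to any other, the forecast is not linear: the chain settles toward a steady state rather than growing without bound, which is the behaviour the thesis contrasts with linear models.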
APA, Harvard, Vancouver, ISO, and other styles
17

Lei, Tao Ph D. Massachusetts Institute of Technology. "Interpretable neural models for natural language processing." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/108990.

Full text
Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 109-119).
The success of neural network models often comes at a cost of interpretability. This thesis addresses the problem by providing justifications behind the model's structure and predictions. In the first part of this thesis, we present a class of sequence operations for text processing. The proposed component generalizes from convolution operations and gated aggregations. As justifications, we relate this component to string kernels, i.e. functions measuring the similarity between sequences, and demonstrate how it encodes the efficient kernel computing algorithm into its structure. The proposed model achieves state-of-the-art or competitive results compared to alternative architectures (such as LSTMs and CNNs) across several NLP applications. In the second part, we learn rationales behind the model's prediction by extracting input pieces as supporting evidence. Rationales are tailored to be short and coherent, yet sufficient for making the same prediction. Our approach combines two modular components, generator and encoder, which are trained to operate well together. The generator specifies a distribution over text fragments as candidate rationales and these are passed through the encoder for prediction. Rationales are never given during training. Instead, the model is regularized by the desiderata for rationales. We demonstrate the effectiveness of this learning framework in applications such as multi-aspect sentiment analysis. Our method achieves over 90% performance evaluated against manually annotated rationales.
by Tao Lei.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
18

Kunz, Jenny. "Neural Language Models with Explicit Coreference Decision." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-371827.

Full text
Abstract:
Coreference is an important and frequent concept in any form of discourse, and Coreference Resolution (CR) a widely used task in Natural Language Understanding (NLU). In this thesis, we implement and explore two recent models that include the concept of coreference in Recurrent Neural Network (RNN)-based Language Models (LM). Entity and reference decisions are modeled explicitly in these models using attention mechanisms. Both models learn to save the previously observed entities in a set and to decide whether the next token created by the LM is a mention of one of the entities in the set, an entity that has not been observed yet, or not an entity. After a theoretical analysis in which we compare the two LMs to each other and to a state-of-the-art Coreference Resolution system, we perform an extensive quantitative and qualitative analysis. For this purpose, we train the two models and a classical RNN-LM as the baseline model on the OntoNotes 5.0 corpus with coreference annotation. While we do not reach the baseline in the perplexity metric, we show that the models' relative performance on entity tokens has the potential to improve when the explicit entity modeling is included. We show that the most challenging point in the systems is the decision whether the next token is an entity token, while the decision of which entity the next token refers to performs comparatively well. Our analysis in the context of a text generation task shows that a widespread error source for the mention creation process is the confusion of tokens that refer to related but different entities in the real world, presumably a result of the context-based word representations in the models. Our re-implementation of the DeepMind model by Yang et al. 2016 performs notably better than the re-implementation of the EntityNLM model by Ji et al. 2017, with a perplexity of 107 compared to 131.
APA, Harvard, Vancouver, ISO, and other styles
19

Davis, Alexandre Guelman. "Subject classification through context-enriched language models." Universidade Federal de Minas Gerais, 2015. http://hdl.handle.net/1843/ESBF-9VKK2Q.

Full text
Abstract:
Throughout the years, humans have developed a complex and intricate system of communication with several means of conveying information that range from books, newspapers and television to, more recently, social media. However, efficiently retrieving and understanding messages from social media in order to extract useful information is challenging, especially considering that shorter messages are strongly dependent on context. Users often assume that their social media audience is aware of the associated background and the underlying real-world events. This allows them to shorten their messages without compromising the effectiveness of communication. Traditional data mining algorithms do not account for contextual information. We argue that exploiting context could lead to more complete and accurate analyses of social media messages. In this work, therefore, we demonstrate how relevant contextual information is to successfully filtering messages related to a selected subject. We also show that the recall rate increases if context is taken into account. Furthermore, we propose methods for filtering relevant messages without resorting only to keywords, provided the context is known and can be detected. In this dissertation, we propose a novel approach for subject classification of social media messages that considers both textual and extra-textual (or contextual) information. This approach uses a proposed context-enriched language model. Techniques based on concepts from computational linguistics, more specifically from the field of Pragmatics, are employed. To experimentally analyse the impact of the proposed approach, datasets containing messages about three major American sports (football, baseball and basketball) were used. Results indicate up to 50% improvement in retrieval over text-based approaches due to the use of contextual information.
APA, Harvard, Vancouver, ISO, and other styles
20

Pérez-Sancho, Carlos. "Stochastic language models for music information retrieval." Doctoral thesis, Universidad de Alicante, 2009. http://hdl.handle.net/10045/14217.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Labeau, Matthieu. "Neural language models : Dealing with large vocabularies." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS313/document.

Full text
Abstract:
This work investigates practical methods to ease training and improve the performance of neural language models with large vocabularies. The main limitation of neural language models is their expensive computational cost: it depends on the size of the vocabulary, with which it grows linearly. Despite several training tricks, the most straightforward way to limit computation time is to limit the vocabulary size, which is not a satisfactory solution for numerous tasks. Most of the existing methods used to train large-vocabulary language models revolve around avoiding the computation of the partition function, which ensures that output scores are normalized into a probability distribution. Here, we focus on sampling-based approaches, including importance sampling and noise contrastive estimation. These methods allow an approximate computation of the partition function. After examining the mechanism of self-normalization in noise contrastive estimation, we first propose to improve its efficiency with solutions that are adapted to the inner workings of the method, and experimentally show that they considerably ease training. Our second contribution is to expand on a generalization of several sampling-based objectives as Bregman divergences, in order to experiment with new objectives. We use Beta divergences to derive a set of objectives of which noise contrastive estimation is a particular case. Finally, we aim at improving the performance of full-vocabulary language models by augmenting the output word representations with subwords. We experiment on a Czech dataset and show that using character-based representations besides word embeddings for output representations gives better results. We also show that reducing the size of the output look-up table improves results even more.
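As a rough illustration of the noise contrastive estimation objective discussed above (a sketch, not the thesis's implementation), the following PyTorch function treats each observed word against k sampled noise words as a binary classification problem, sidestepping the partition function entirely. All tensor names and shapes are assumptions for the example.

```python
import math
import torch
import torch.nn.functional as F

def nce_loss(data_scores, noise_scores, log_q_data, log_q_noise, k):
    """Noise contrastive estimation: classify true words against k noise
    samples without ever normalising the model's output distribution.

    data_scores  : unnormalised model log-scores of observed words, shape (B,)
    noise_scores : model log-scores of k noise words per position, shape (B, k)
    log_q_data, log_q_noise : log-probabilities of the same words under the
                              noise distribution q
    """
    # P(word came from the data) = sigmoid(score - log(k * q(word)))
    delta_data = data_scores - (math.log(k) + log_q_data)
    delta_noise = noise_scores - (math.log(k) + log_q_noise)
    # Maximise log-likelihood of labelling data as data and noise as noise.
    return -(F.logsigmoid(delta_data)
             + F.logsigmoid(-delta_noise).sum(dim=1)).mean()

# Toy call with random scores: batch of 4 target words, k = 16 noise samples.
B, k = 4, 16
print(nce_loss(torch.randn(B), torch.randn(B, k),
               torch.randn(B), torch.randn(B, k), k).item())
```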
APA, Harvard, Vancouver, ISO, and other styles
22

Bayer, Ali Orkan. "Semantic Language models with deep neural Networks." Doctoral thesis, Università degli studi di Trento, 2015. https://hdl.handle.net/11572/367784.

Full text
Abstract:
Spoken language systems (SLS) communicate with users in natural language through speech. There are two main problems related to processing the spoken input in SLS. The first one is automatic speech recognition (ASR), which recognizes what the user says. The second one is spoken language understanding (SLU), which understands what the user means. We focus on the language model (LM) component of SLS. LMs constrain the search space that is used in the search for the best hypothesis. Therefore, they play a crucial role in the performance of SLS. It has long been discussed that an improvement in recognition performance does not necessarily yield better understanding performance. Therefore, optimization of LMs for understanding performance is crucial. In addition, long-range dependencies in languages are hard to handle with statistical language models. These two problems are addressed in this thesis. We investigate two different LM structures. The first LM that we investigate enables SLS to understand better what they recognize by searching the ASR hypotheses for the best understanding performance. We refer to these models as joint LMs. They use lexical and semantic units jointly in the LM. The second LM structure uses the semantic context of an utterance, which can also be described as "what the system understands", to search for a better hypothesis that improves the recognition and the understanding performance. We refer to these models as semantic LMs (SELMs). SELMs use features that are based on a well-established theory of lexical semantics, namely the theory of frame semantics. They incorporate semantic features extracted from the ASR hypothesis into the LM and handle long-range dependencies by using the semantic relationships between words and the semantic context. ASR noise is propagated to the semantic features; to suppress this noise, we introduce the use of deep semantic encodings for semantic feature extraction. In this way, SELMs optimize both the recognition and the understanding performance.
APA, Harvard, Vancouver, ISO, and other styles
23

Bayer, Ali Orkan. "Semantic Language models with deep neural Networks." Doctoral thesis, University of Trento, 2015. http://eprints-phd.biblio.unitn.it/1578/1/bayer_thesis.pdf.

Full text
Abstract:
Spoken language systems (SLS) communicate with users in natural language through speech. There are two main problems related to processing the spoken input in SLS. The first one is automatic speech recognition (ASR), which recognizes what the user says. The second one is spoken language understanding (SLU), which understands what the user means. We focus on the language model (LM) component of SLS. LMs constrain the search space that is used in the search for the best hypothesis. Therefore, they play a crucial role in the performance of SLS. It has long been discussed that an improvement in recognition performance does not necessarily yield better understanding performance. Therefore, optimization of LMs for understanding performance is crucial. In addition, long-range dependencies in languages are hard to handle with statistical language models. These two problems are addressed in this thesis. We investigate two different LM structures. The first LM that we investigate enables SLS to understand better what they recognize by searching the ASR hypotheses for the best understanding performance. We refer to these models as joint LMs. They use lexical and semantic units jointly in the LM. The second LM structure uses the semantic context of an utterance, which can also be described as "what the system understands", to search for a better hypothesis that improves the recognition and the understanding performance. We refer to these models as semantic LMs (SELMs). SELMs use features that are based on a well-established theory of lexical semantics, namely the theory of frame semantics. They incorporate semantic features extracted from the ASR hypothesis into the LM and handle long-range dependencies by using the semantic relationships between words and the semantic context. ASR noise is propagated to the semantic features; to suppress this noise, we introduce the use of deep semantic encodings for semantic feature extraction. In this way, SELMs optimize both the recognition and the understanding performance.
APA, Harvard, Vancouver, ISO, and other styles
24

Yang, Xi. "Discriminative acoustic and sequence models for GMM based automatic language identification /." View abstract or full-text, 2007. http://library.ust.hk/cgi/db/thesis.pl?ECED%202007%20YANG.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Scarcella, Alessandro. "Recurrent neural network language models in the context of under-resourced South African languages." Master's thesis, University of Cape Town, 2018. http://hdl.handle.net/11427/29431.

Full text
Abstract:
Over the past five years, neural network models have been successful across a range of computational linguistic tasks. However, these triumphs have been concentrated in languages with significant resources such as large datasets. Thus, many languages, commonly referred to as under-resourced languages, have received little attention and have yet to benefit from recent advances. This investigation aims to evaluate the implications of recent advances in neural network language modelling techniques for under-resourced South African languages. Rudimentary, single-layered recurrent neural networks (RNNs) were used to model four South African text corpora. The accuracy of these models was compared directly to legacy approaches. A suite of hybrid models was then tested. Across all four datasets, neural networks led to overall better-performing language models, either directly or as part of a hybrid model. A short examination of punctuation marks in text data revealed that performance metrics for language models are greatly overestimated when punctuation marks have not been excluded. The investigation concludes by appraising the sensitivity of RNN language models (RNNLMs) to the size of the datasets, by artificially constraining the datasets and evaluating the accuracy of the models. It is recommended that future research endeavours within this domain be directed towards evaluating more sophisticated RNNLMs as well as measuring their impact on application-focused tasks such as speech recognition and machine translation.
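For concreteness, a rudimentary single-layer RNN language model of the kind evaluated here can be written in a few lines of PyTorch. The hyperparameters and toy data below are illustrative assumptions, not those of the study.

```python
import torch
import torch.nn as nn

class RNNLM(nn.Module):
    """Single-layer recurrent language model: embed, recur, predict."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, hidden=None):
        # tokens: (batch, seq_len) integer ids; returns next-token logits.
        emb = self.embed(tokens)
        output, hidden = self.rnn(emb, hidden)
        return self.out(output), hidden

# Toy usage: perplexity is the exponential of the mean cross-entropy.
vocab_size = 1000
model = RNNLM(vocab_size)
tokens = torch.randint(0, vocab_size, (2, 20))   # placeholder corpus batch
logits, _ = model(tokens[:, :-1])                # predict each next token
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), tokens[:, 1:].reshape(-1))
print("perplexity:", loss.exp().item())
```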
APA, Harvard, Vancouver, ISO, and other styles
26

Takeda, Koichi. "Building Natural Language Processing Applications Using Descriptive Models." 京都大学 (Kyoto University), 2010. http://hdl.handle.net/2433/120372.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Henter, Gustav Eje. "Probabilistic Sequence Models with Speech and Language Applications." Doctoral thesis, KTH, Kommunikationsteori, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-134693.

Full text
Abstract:
Series data, sequences of measured values, are ubiquitous. Whenever observations are made along a path in space or time, a data sequence results. To comprehend nature and shape it to our will, or to make informed decisions based on what we know, we need methods to make sense of such data. Of particular interest are probabilistic descriptions, which enable us to represent uncertainty and random variation inherent to the world around us. This thesis presents and expands upon some tools for creating probabilistic models of sequences, with an eye towards applications involving speech and language. Modelling speech and language is not only of use for creating listening, reading, talking, and writing machines (for instance, allowing human-friendly interfaces to future computational intelligences and the smart devices of today), but probabilistic models may also ultimately tell us something about ourselves and the world we occupy. The central theme of the thesis is the creation of new or improved models more appropriate for our intended applications, by weakening limiting and questionable assumptions made by standard modelling techniques. One contribution of this thesis examines causal-state splitting reconstruction (CSSR), an algorithm for learning discrete-valued sequence models whose states are minimal sufficient statistics for prediction. Unlike many traditional techniques, CSSR does not require the number of process states to be specified a priori, but builds a pattern vocabulary from data alone, making it applicable for language acquisition and the identification of stochastic grammars. A paper in the thesis shows that CSSR handles noise and errors expected in natural data poorly, but that the learner can be extended in a simple manner to yield more robust and stable results also in the presence of corruptions. Even when the complexities of language are put aside, challenges remain. The seemingly simple task of accurately describing human speech signals, so that natural synthetic speech can be generated, has proved difficult, as humans are highly attuned to what speech should sound like. Two papers in the thesis therefore study nonparametric techniques suitable for improved acoustic modelling of speech for synthesis applications. Each of the two papers targets a known-incorrect assumption of established methods, based on the hypothesis that nonparametric techniques can better represent and recreate essential characteristics of natural speech. In the first paper of the pair, Gaussian process dynamical models (GPDMs), nonlinear, continuous state-space dynamical models based on Gaussian processes, are shown to better replicate voiced speech, without traditional dynamical features or assumptions that cepstral parameters follow linear autoregressive processes. Additional dimensions of the state-space are able to represent other salient signal aspects such as prosodic variation. The second paper, meanwhile, introduces KDE-HMMs, asymptotically-consistent Markov models for continuous-valued data based on kernel density estimation, that additionally have been extended with a fixed-cardinality discrete hidden state. This construction is shown to provide improved probabilistic descriptions of nonlinear time series, compared to reference models from different paradigms. The hidden state can be used to control process output, making KDE-HMMs compelling as a probabilistic alternative to hybrid speech-synthesis approaches.
A final paper of the thesis discusses how models can be improved even when one is restricted to a fundamentally imperfect model class. Minimum entropy rate simplification (MERS), an information-theoretic scheme for postprocessing models for generative applications involving both speech and text, is introduced. MERS reduces the entropy rate of a model while remaining as close as possible to the starting model. This is shown to produce simplified models that concentrate on the most common and characteristic behaviours, and provides a continuum of simplifications between the original model and zero-entropy, completely predictable output. As the tails of fitted distributions may be inflated by noise or empirical variability that a model has failed to capture, MERS's ability to concentrate on high-probability output is also demonstrated to be useful for denoising models trained on disturbed data.

APA, Harvard, Vancouver, ISO, and other styles
28

Lou, Bill Pi-ching. "New models of natural language for automated assessment." Thesis, University of Nottingham, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.337661.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Gwei, G. M. "New models of natural language for consultative computing." Thesis, University of Nottingham, 1987. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.378986.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

McCandless, Michael Kyle. "Automatic acquisition of language models for speech recognition." Thesis, Massachusetts Institute of Technology, 1994. http://hdl.handle.net/1721.1/36462.

Full text
Abstract:
Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1994.
Includes bibliographical references (leaves 138-141).
by Michael Kyle McCandless.
M.S.
APA, Harvard, Vancouver, ISO, and other styles
31

Brorson, Erik. "Classifying Hate Speech using Fine-tuned Language Models." Thesis, Uppsala universitet, Statistiska institutionen, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-352637.

Full text
Abstract:
Given the explosion in the size of social media, the amount of hate speech is also growing. To efficiently combat this issue we need reliable and scalable machine learning models. Current solutions rely on crowdsourced datasets that are limited in size, or on training data from self-identified hateful communities, which lacks specificity. In this thesis we introduce a novel semi-supervised modelling strategy. It is first trained on the freely available data from the hateful communities and then fine-tuned to classify hateful tweets from crowdsourced annotated datasets. We show that our model reaches state-of-the-art performance with minimal hyper-parameter tuning.
APA, Harvard, Vancouver, ISO, and other styles
32

Li, Zhongliang. "Slim Embedding Layers for Recurrent Neural Language Models." Wright State University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=wright1531950458646138.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Shao, Han. "Pretraining Deep Learning Models for Natural Language Understanding." Oberlin College Honors Theses / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=oberlin158955297757398.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Tripodi, Rocco <1982&gt. "Evolutionary game theoretic models for natural language processing." Doctoral thesis, Università Ca' Foscari Venezia, 2015. http://hdl.handle.net/10579/8351.

Full text
Abstract:
This thesis is aimed at discovering new learning algorithms inspired by principles of biological evolution, which are able to exploit relational and contextual information, viewing clustering and classification problems from a dynamical systems perspective. In particular, we have investigated how game-theoretic models can be used to solve different Natural Language Processing tasks. Traditional studies of language have used a game-theoretic perspective to study how language evolves over time and how it emerges in a community but, to the best of our knowledge, this is the first attempt to use game theory to solve specific problems in this area. These models are based on the concept of equilibrium, a state of a system which emerges after a series of interactions among the elements that are part of it. Starting from a situation in which there is uncertainty about a particular phenomenon, they describe how a disequilibrium state resolves into equilibrium. The games are situations in which a group of objects has to be classified or clustered and each of them has to choose its collocation in a predefined set of classes. The choice of each one is influenced by the choices of the others, and the satisfaction that a player has about the outcome of a game is determined by a payoff function, which the players try to maximize. After a series of interactions the players learn to play their best strategies, leading to an equilibrium state and to the resolution of the problem. From a machine-learning perspective this approach is appealing, because it can be employed as an unsupervised, semi-supervised or supervised learning model. We have used it to solve the word sense disambiguation problem. We cast this task as a constraint satisfaction problem, where each word to be disambiguated is constrained to choose the most coherent sense among those available, according to the senses that the words around it are choosing. This formulation ensures the maintenance of textual coherence and has been tested against state-of-the-art algorithms, with higher and more stable results. We have also used a game-theoretic formulation to improve the clustering results of dominant set clustering and the non-negative matrix factorization technique. We evaluated our system on different document datasets through different approaches, achieving results which outperform state-of-the-art algorithms. This work opened new perspectives in game-theoretic models, demonstrating that these approaches are promising and that they can also be employed for the resolution of new problems.
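The equilibria such games converge to are commonly computed with replicator dynamics, which the sketch below illustrates on an invented three-sense disambiguation toy problem. The payoff matrix and its figures are assumptions for the example, not data from the thesis.

```python
import numpy as np

def replicator_dynamics(A, x, iterations=200):
    """Discrete replicator dynamics: strategies (here, candidate senses)
    with above-average payoff gain probability mass until equilibrium."""
    for _ in range(iterations):
        payoffs = A @ x                  # expected payoff of each strategy
        x = x * payoffs / (x @ payoffs)  # reweight toward better strategies
    return x

# Toy payoff matrix over three candidate senses: senses 0 and 1 are
# mutually coherent with the context, sense 2 is not (illustrative only).
A = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.2],
              [0.1, 0.2, 0.3]])
x = np.full(3, 1 / 3)              # start from uniform uncertainty
print(replicator_dynamics(A, x))   # mass concentrates on senses 0 and 1
```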
APA, Harvard, Vancouver, ISO, and other styles
35

Zausa, Giulio <1998&gt. "Exploiting Language Models for Vector-Style Images Synthesis." Master's Degree Thesis, Università Ca' Foscari Venezia, 2021. http://hdl.handle.net/10579/19965.

Full text
Abstract:
Deep learning generative models have been successfully applied to synthesize images from various sources, like human faces and natural images, with impressive and realistic results. Nonetheless, not much work has been done on generating icons and vector-style images, since synthesizing them requires precision and high-frequency details. Such images are essential for modern software and web development since they communicate concepts faster and more universally. We try to fill this gap by proposing an explicit density conditional generative model that can yield high-resolution samples when trained on rasterized vector-style images. Our novel architecture can solve conditional and unconditional image generation tasks, and it is easier to train than current adversarial approaches. Moreover, we compare our work with the current state-of-the-art generative models, highlighting their strengths and weaknesses. Finally, we introduce a new dataset containing high-quality icons from 11 different styles to test the quality of our model when performing conditional random sampling and style transfer between icons.
APA, Harvard, Vancouver, ISO, and other styles
36

González, Jorge, and Francisco Casacuberta. "Phrase-based finite state models." Universität Potsdam, 2008. http://opus.kobv.de/ubp/volltexte/2008/2720/.

Full text
Abstract:
In recent years, statistical machine translation has demonstrated its usefulness within a wide variety of translation applications. Along this line, phrase-based alignment models have become the reference to follow in order to build competitive systems. Finite state models are always an interesting framework because there are well-known efficient algorithms for their representation and manipulation. This document is a contribution to the evolution of finite state models towards a phrase-based approach. The inference of stochastic transducers that are based on bilingual phrases is carefully analysed from a finite state point of view. The algorithmic phenomena that have to be taken into account in order to deal with such phrase-based finite state models at decoding time are also detailed in depth.
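As a toy illustration of the phrase-based finite state idea, the sketch below scores monotone segmentations of a source sentence against a small phrase table by dynamic programming, the simplest instance of decoding with a stochastic phrase-based transducer. The phrase table, probabilities and monotone restriction are assumptions for illustration, not the inference procedure analysed in the document.

```python
# Minimal monotone phrase-based decoder over an assumed phrase table.
import math

phrase_table = {                      # illustrative bilingual phrases
    ("la", "casa"): [("the house", 0.7)],
    ("la",): [("the", 0.9)],
    ("casa",): [("house", 0.8)],
    ("verde",): [("green", 0.9)],
}

def best_translation(source, max_phrase_len=2):
    """Viterbi over monotone segmentations: best[i] holds the highest
    log-probability translation of source[:i]."""
    best = {0: (0.0, "")}
    for i in range(1, len(source) + 1):
        for k in range(max(0, i - max_phrase_len), i):
            if k not in best:
                continue
            src = tuple(source[k:i])
            for tgt, p in phrase_table.get(src, []):
                score = best[k][0] + math.log(p)
                if i not in best or score > best[i][0]:
                    best[i] = (score, (best[k][1] + " " + tgt).strip())
    return best.get(len(source))

print(best_translation(["la", "casa", "verde"]))
```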
APA, Harvard, Vancouver, ISO, and other styles
37

Correia, Filipe André Sobral. "Model morphisms (MoMo) to enable language independent information models and interoperable business networks." Master's thesis, Faculdade de Ciências e Tecnologia, 2010. http://hdl.handle.net/10362/4782.

Full text
Abstract:
MSc. Dissertation presented at Faculdade de Ciências e Tecnologia of Universidade Nova de Lisboa to obtain the Master degree in Electrical and Computer Engineering
With the advent of globalisation, the opportunities for collaboration became more evident, with the effect of enlarging business networks. In such conditions, a key to enterprise success is reliable communication with all the partners. Therefore, organisations have been searching for flexible integrated environments to better manage their services and product life cycle, where their software applications could be easily integrated independently of the platform in use. However, with so many different information models and implementation standards being used, interoperability problems arise. Moreover, organisations are themselves at different technological maturity levels, and a solution that might be good for one can be too advanced for another, or vice versa. This dissertation responds to the above needs, proposing a high-level meta-model to be used across the entire business network, making it possible to abstract individual models from their specificities and increasing language independency and interoperability, while keeping all the enterprise legacy software's integrity intact. The strategy presented allows an incremental mapping construction, to achieve a gradual integration. To accomplish this, the author proposes Model Driven Architecture (MDA) based technologies for the development of traceable transformations and the execution of automatic Model Morphisms.
APA, Harvard, Vancouver, ISO, and other styles
38

Cortez, Marc. "Models, metaphors, and multivalent contextualizations: religious language and the nature of contextual theology." Online full text .pdf document, available to Fuller patrons only, 2004. http://www.tren.com.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Wik, Preben. "The Virtual Language Teacher : Models and applications for language learning using embodied conversational agents." Doctoral thesis, KTH, Tal-kommunikation, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-33579.

Full text
Abstract:
This thesis presents a framework for computer assisted language learning using a virtual language teacher. It is an attempt to create not only a new type of language learning software, but also a server-based application that collects large amounts of speech material for future research purposes. The motivation for the framework is to create a research platform for computer assisted language learning and computer assisted pronunciation training. Within the thesis, different feedback strategies and pronunciation error detectors are explored. This is a broad, interdisciplinary approach, combining research from a number of scientific disciplines, such as speech technology, game studies, cognitive science, phonetics, phonology, and second-language acquisition and teaching methodologies. The thesis discusses the paradigm both from a top-down point of view, where a number of functionally separate but interacting units are presented as part of a proposed architecture, and from a bottom-up point of view, by demonstrating and testing an implementation of the framework.
APA, Harvard, Vancouver, ISO, and other styles
40

Knežević, Ana. "Primena panel modela u identifikovanju faktora uspešnosti poslovanja proizvodnih preduzeća." PhD thesis, Univerzitet u Novom Sadu, Fakultet tehničkih nauka u Novom Sadu, 2015. http://www.cris.uns.ac.rs/record.jsf?recordId=95568&source=NDLTD&language=en.

Full text
Abstract:
The main goal of this research is to identify the factors that influence the business success of manufacturing companies, using the methodology of panel data analysis. Profitability is used as the measure of business success. The research covers the analysis of the influence of several internal and external factors. A significant influence on the business success of manufacturing companies was found for both internal factors (company size, financial leverage, efficiency of asset usage and asset tangibility) and external factors (inflation, GDP and interest rates).
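For readers unfamiliar with panel models, the sketch below implements the standard within (fixed-effects) estimator that underlies this kind of analysis. The toy data and variable names (size, leverage, profitability) are invented for illustration and do not reproduce the thesis's dataset or model specification.

```python
# Within (fixed-effects) estimator for y_it = x_it' beta + a_i + e_it;
# the firm-level numbers below are made up for illustration.
import numpy as np

def within_estimator(y, X, firm_ids):
    """Demean y and X within each firm to sweep out the fixed effect a_i,
    then run OLS on the demeaned data."""
    y_d, X_d = y.astype(float), X.astype(float)
    for firm in np.unique(firm_ids):
        rows = firm_ids == firm
        y_d[rows] = y_d[rows] - y_d[rows].mean()
        X_d[rows] = X_d[rows] - X_d[rows].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X_d, y_d, rcond=None)
    return beta

# Toy panel: profitability on size and leverage, three firms, two years.
firm_ids = np.array([1, 1, 2, 2, 3, 3])
X = np.array([[5.0, 0.4], [5.2, 0.5], [7.0, 0.6],
              [7.1, 0.5], [6.0, 0.3], [6.3, 0.4]])
y = np.array([0.10, 0.12, 0.08, 0.07, 0.11, 0.13])
print(within_estimator(y, X, firm_ids))
```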
APA, Harvard, Vancouver, ISO, and other styles
41

Dillehay, Tom D. "Andean Language and Archaeology: A Final Comment." Pontificia Universidad Católica del Perú, 2012. http://repositorio.pucp.edu.pe/index/handle/123456789/113367.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Konstas, Ioannis. "Joint models for concept-to-text generation." Thesis, University of Edinburgh, 2014. http://hdl.handle.net/1842/8926.

Full text
Abstract:
Much of the data found on the world wide web is in numeric, tabular, or other non-textual format (e.g., weather forecast tables, stock market charts, live sensor feeds), and thus inaccessible to non-experts or laypersons. However, most conventional search engines and natural language processing tools (e.g., summarisers) can only handle textual input. As a result, data in non-textual form remains largely inaccessible. Concept-to-text generation refers to the task of automatically producing textual output from non-linguistic input, and holds promise for rendering non-linguistic data widely accessible. Several successful generation systems have been produced in the past twenty years. They mostly rely on human-crafted rules or expert-driven grammars, implement a pipeline architecture, and usually operate in a single domain. In this thesis, we present several novel statistical models that take as input a set of database records and generate a description of them in natural language text. Our unique idea is to combine the processes of structuring a document (document planning), deciding what to say (content selection) and choosing the specific words and syntactic constructs specifying how to say it (lexicalisation and surface realisation) in a uniform joint manner. Rather than breaking up the generation process into a sequence of local decisions, we define a probabilistic context-free grammar that globally describes the inherent structure of the input (a corpus of database records and text describing some of them). This joint representation allows the individual processes (i.e., document planning, content selection, and surface realisation) to communicate and influence each other naturally. We recast generation as the task of finding the best derivation tree for a set of input database records under our grammar, and describe several algorithms for decoding in this framework that allow us to intersect the grammar with additional information capturing fluency and syntactic well-formedness constraints. We implement our generators using the hypergraph framework. Contrary to traditional systems, we learn all the necessary document, structural and linguistic knowledge from unannotated data. Additionally, we explore a discriminative reranking approach on the hypergraph representation of our model, by including more refined content selection features. Central to our approach is the idea of porting our models to various domains; we experimented on four widely different domains, namely sportscasting, weather forecast generation, booking flights, and troubleshooting guides. The performance of our systems is competitive with, and often superior to, state-of-the-art systems that use domain-specific constraints, explicit feature engineering or labelled data.
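The abstract's central move, recasting generation as finding the best derivation tree under a probabilistic context-free grammar, can be illustrated with a toy Viterbi-CKY pass. The grammar, probabilities and input below are invented and far simpler than the hypergraph machinery the thesis describes.

```python
# Toy Viterbi-CKY over a CNF PCFG: the best derivation's log-probability.
import math

unary = {("A", "low"): 0.7, ("N", "temperature"): 0.5, ("V", "falls"): 0.6}
binary = {("S", ("NP", "V")): 1.0, ("NP", ("A", "N")): 0.4}

def viterbi_cky(words):
    n = len(words)
    chart = {}                    # (i, j, symbol) -> best log-probability
    for i, w in enumerate(words):
        for (sym, word), p in unary.items():
            if word == w:
                chart[(i, i + 1, sym)] = math.log(p)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (parent, (left, right)), p in binary.items():
                    if (i, k, left) in chart and (k, j, right) in chart:
                        score = (math.log(p) + chart[(i, k, left)]
                                 + chart[(k, j, right)])
                        key = (i, j, parent)
                        chart[key] = max(chart.get(key, -math.inf), score)
    return chart.get((0, n, "S"))

print(viterbi_cky(["low", "temperature", "falls"]))
```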
APA, Harvard, Vancouver, ISO, and other styles
43

Amato, Roberta. "Human collective behavior models: language, cooperation and social conventions." Doctoral thesis, Universitat de Barcelona, 2018. http://hdl.handle.net/10803/565420.

Full text
Abstract:
The topics dealt with in this thesis are all part of the general problem of social consensus, namely how conventions flourish and decay and what motivates people to conform to them. Examples range from driving on the right side of the street, to language, rules of courtesy or moral judgments. Some conventions arise directly from the need to coordinate or conform, such as fashion or speaking the same language; others instead apply to situations where there is a tension between individual and collective interest, such as cooperation, reciprocity, etc. This thesis is developed around three main questions still open in the research field of collective human behavior: how the coexistence of concurrent conventions is possible, why cooperation in real systems is more common than predicted, and how a population undergoes collective behavioral change, namely how an initially minority norm can supplant a majority one. In the first work, we study the impact of concurrent social pressures on consensus processes. We propose a model of opinion competition where individuals participate in different social networks and receive conflicting social influences. The dynamics take place in two distinct domains, which we model as layers of a multiplex network. The novelty of our study lies in the fact that individuals can have different options in the different layers. This naturally reflects a common situation where an individual can hold different opinions in different social contexts as a result of consensus with other individuals in one context but not in the other. Our analysis shows that the latter property enriches the system's dynamics and allows not only for consensus into a single state for both layers, but also for active dynamical states of coexistence of both options. In the second model, we analyze the influence of opinion dynamics on competitive strategic games. Cooperation between humans is a quite common and stable behavior even in situations where both game theory and experiments predict the prevalence of defection. One of the reasons could be the fact that individuals engaging in strategic interactions are also exposed to social influence and, consequently, to the spread of opinions. We present a new evolutionary game model where the game and opinion dynamics take place in different layers of a multiplex network. We show that the coupling between the two dynamical processes can lead to cooperation in scenarios where the pure game dynamics predicts defection and, in some particular settings, gives rise to a metastable state in which nodes that adopt the same strategy self-organize into local groups. In the last work, we present the first extensive quantitative analysis of the phenomenon of norm change, by looking at 2,365 orthographic and lexical norm shifts that occurred in English and Spanish over the last two centuries, as recorded by millions of digitized books. We are able to identify three distinct patterns in the data depending on the nature of the norm shift. Furthermore, we propose a simple evolutionary model that captures all the identified mechanisms and reproduces quantitatively the transitions between norms. This work advances the current understanding of norm shifts in language change, most often limited to qualitative illustrations (e.g., the observation that the adoption curve of the new norm is S-shaped).
This thesis develops around three main questions that remain open in the study of collective human behavior: how is the coexistence of competing conventions (opinions, languages, etc.) possible; why is cooperation in real systems more common than predicted; and how can an initially minority norm supplant a majority one? In the first work we focus on formulating a model that accommodates the coexistence of opposing conventions as a stable dynamical solution. In the second model we analyze the influence of opinion dynamics on strategic games, and in the final work we present the first quantitative analysis (to the best of our knowledge) of norm evolution, that is, of what happens when a new social norm replaces an existing one. In summary, the results obtained in these works show that, when modeling collective human behavior, the fact that individuals participate simultaneously in different social contexts plays an important role. This implies that individuals are subject both to the influence of different social dynamics and to different, but not independent, interaction structures. We have also shown that, in the complex process of collective change in norm adoption, the nature of the norm shift leaves distinct patterns in the data, represented by three different types of dynamical transition. This last work advances the current understanding of norm evolution, which has most often been limited to qualitative illustrations (for example, the observation that the adoption curve of the new norm is S-shaped).
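A schematic version of the first model's multiplex dynamics is sketched below: two layers of binary opinions on a ring, with a coupling that makes a node reluctant to adopt an opinion disagreeing with its own state in the other layer. The ring topology, update rule and coupling strength are illustrative assumptions, not the paper's exact model.

```python
# Two-layer voter model with cross-layer coupling (schematic assumptions).
import random

def step(opinions, coupling=0.8):
    """opinions[layer][i] in {0, 1}; a node copies a random ring neighbour
    in one layer, but a copy that disagrees with the node's opinion in the
    other layer only succeeds with probability 1 - coupling."""
    n = len(opinions[0])
    layer = random.randrange(2)
    i = random.randrange(n)
    j = random.choice([(i - 1) % n, (i + 1) % n])
    candidate = opinions[layer][j]
    if candidate == opinions[1 - layer][i] or random.random() < 1 - coupling:
        opinions[layer][i] = candidate

random.seed(0)
opinions = [[random.randrange(2) for _ in range(20)] for _ in range(2)]
for _ in range(5000):
    step(opinions)
print(opinions)   # consensus per layer, or coexistence across layers
```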
APA, Harvard, Vancouver, ISO, and other styles
44

Clarkson, P. R. "Adaptation of statistical language models for automatic speech recognition." Thesis, University of Cambridge, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.597745.

Full text
Abstract:
Statistical language models encode linguistic information in such a way as to be useful to systems which process human language. Such systems include those for optical character recognition and machine translation. Currently, however, the most common application of language modelling is in automatic speech recognition, and it is this that forms the focus of this thesis. Most current speech recognition systems are dedicated to one specific task (for example, the recognition of broadcast news), and thus use a language model which has been trained on text which is appropriate to that task. If, however, one wants to perform recognition on more general language, then creating an appropriate language model is far from straightforward. A task-specific language model will often perform very badly on language from a different domain, whereas a model trained on text from many diverse styles of language might perform better in general, but will not be especially well suited to any particular domain. Thus the idea of an adaptive language model whose parameters automatically adjust to the current style of language is an appealing one. In this thesis, two adaptive language models are investigated. The first is a mixture-based model. The training text is partitioned according to the style of text, and a separate language model is constructed for each component. Each component is assigned a weighting according to its performance at modelling the observed text, and a final language model is constructed as the weighted sum of each of the mixture components. The second approach is based on a cache of recent words. Previous work has shown that words that have occurred recently have a higher probability of occurring in the immediate future than would be predicted by a standard trigram language model. This thesis investigates the hypothesis that more recent words should be considered more significant within the cache, by implementing a cache in which a word's recurrence probability decays exponentially over time. The problem of how to predict the effect of a particular language model on speech recognition accuracy is also addressed in this thesis. The results presented here, as well as those of other recent research, suggest that perplexity, the most commonly used method of evaluating language models, is not as well correlated with word error rate as was once thought. This thesis investigates the connection between a language model's perplexity and its effect on speech recognition performance, and describes the development of alternative measures of a language model's quality which are better correlated with word error rate. Finally, it is shown how the recognition performance achieved using mixture-based language models can be improved by optimising the mixture weights with respect to these new measures.
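The exponentially decaying cache described above admits a compact sketch: each past occurrence of a word contributes exp(-decay x distance) to its cache probability, which is then linearly interpolated with a base n-gram probability. The decay rate, interpolation weight and base-model stub are assumed values, not the thesis's settings.

```python
# Decaying-cache language model component (illustrative parameters).
import math
from collections import defaultdict

def cache_probability(word, history, decay=0.005):
    """P_cache(w) proportional to the sum of exp(-decay * distance) over
    the positions where w occurred (most recent word has distance 1)."""
    scores = defaultdict(float)
    for distance, past in enumerate(reversed(history), start=1):
        scores[past] += math.exp(-decay * distance)
    total = sum(scores.values()) or 1.0
    return scores[word] / total

def interpolated(word, history, base_prob, cache_weight=0.1):
    """Linear interpolation of a base n-gram probability with the cache."""
    return (1 - cache_weight) * base_prob + \
        cache_weight * cache_probability(word, history)

history = "the language model adapts because the cache remembers the".split()
print(interpolated("cache", history, base_prob=0.001))
```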
APA, Harvard, Vancouver, ISO, and other styles
45

Wong, Pak-kwong (黃伯光). "Statistical language models for Chinese recognition: speech and character." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1998. http://hub.hku.hk/bib/B31239456.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Mukherjee, Niloy, 1978-. "Spontaneous speech recognition using visual context-aware language models." Thesis, Massachusetts Institute of Technology, 2003. http://hdl.handle.net/1721.1/62380.

Full text
Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2003.
Includes bibliographical references (p. 83-88).
The thesis presents a novel situationally-aware multimodal spoken language system called Fuse that performs speech understanding for visual object selection. An experimental task was created in which people were asked to refer, using speech alone, to objects arranged on a table top. During training, Fuse acquires a grammar and vocabulary from a "show-and-tell" procedure in which visual scenes are paired with verbal descriptions of individual objects. Fuse determines a set of visually salient words and phrases and associates them with a set of visual features. Given a new scene, Fuse uses the acquired knowledge to generate class-based language models conditioned on the objects present in the scene, as well as a spatial language model that predicts the occurrences of spatial terms conditioned on target and landmark objects. The speech recognizer in Fuse uses a weighted mixture of these language models to search for more likely interpretations of user speech in the context of the current scene. During decoding, the weights are updated using a visual attention model which redistributes attention over objects based on partially decoded utterances. The dynamic situationally-aware language models enable Fuse to jointly infer the spoken language utterances underlying speech signals as well as the identities of the target objects they refer to. In an evaluation of the system, visual situationally-aware language modeling shows a significant decrease, of more than 30%, in speech recognition and understanding error rates. The underlying ideas of situation-aware speech understanding developed in Fuse may be applied in numerous areas, including assistive and mobile human-machine interfaces.
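The dynamic re-weighting of class-based language models can be sketched as a Bayesian update of mixture weights as each word arrives: objects whose language models explain the incoming words gain attention. The object vocabularies and probabilities below are invented for illustration and are not Fuse's actual models.

```python
# Schematic re-weighting of per-object language models during decoding.
def update_weights(weights, models, word):
    """Score each scene object's language model on the new word and
    renormalise the mixture weights accordingly."""
    posterior = {obj: w * models[obj].get(word, 1e-4)
                 for obj, w in weights.items()}
    total = sum(posterior.values())
    return {obj: p / total for obj, p in posterior.items()}

models = {"red_cup": {"red": 0.3, "cup": 0.4},
          "blue_block": {"blue": 0.3, "block": 0.4}}
weights = {"red_cup": 0.5, "blue_block": 0.5}
for word in ["the", "red", "cup"]:
    weights = update_weights(weights, models, word)
print(weights)    # attention has shifted towards the red cup
```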
APA, Harvard, Vancouver, ISO, and other styles
47

Lin, Chun Hung. "Automatic Question Generation with Pre-trained Masked Language Models." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-289559.

Full text
Abstract:
In this project, we study the task of generating a question from a given passage-answer pair using pre-trained masked language models. Asking questions is of importance in artificial intelligence development because a machine looks intelligent when it raises a reasonable and well-constructed question. Question generation also has applications such as drafting questions for reading comprehension tests and augmenting the training data of question answering tasks. We focus on using pre-trained masked language models throughout this project. Masked language modeling is relatively new in question generation, but it has been explored in the machine translation domain. In our experiments, we used two training techniques and two types of generation orderings. We are the first to adopt one of these training techniques for the question generation task. For evaluation, we conducted an n-gram-based precision-recall evaluation and a human evaluation. Comparing our results with the previous research literature, the experiments showed that the best of our methods was as good as LSTM-based methods. Moreover, all combinations of the training techniques and generation orderings are acceptable according to our human evaluation results. We also demonstrated that one of our techniques enables us to control the length of the generated question.
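A rough sketch of left-to-right generation with an off-the-shelf masked language model is shown below, using the Hugging Face fill-mask pipeline: a [MASK] token is appended, filled with the most probable token, and the loop repeats. The prompt format, stopping rule and choice of bert-base-uncased are assumptions; the thesis's trained models, training techniques and generation orderings are not reproduced here.

```python
# Greedy left-to-right generation with a masked LM (illustrative only;
# an off-the-shelf BERT is not fine-tuned for question generation).
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def generate_question(passage, answer, max_tokens=12):
    prompt = f"{passage} the answer is {answer}. question :"
    for _ in range(max_tokens):
        best = fill_mask(prompt + " [MASK]")[0]   # most probable filler
        token = best["token_str"]
        prompt += " " + token                     # naive detokenisation
        if token == "?":
            break
    return prompt

print(generate_question("paris is the capital of france.", "paris"))
```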
APA, Harvard, Vancouver, ISO, and other styles
48

Borggren, Lukas. "Automatic Categorization of News Articles With Contextualized Language Models." Thesis, Linköpings universitet, Artificiell intelligens och integrerade datorsystem, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177004.

Full text
Abstract:
This thesis investigates how pre-trained contextualized language models can be adapted for multi-label text classification of Swedish news articles. Various classifiers are built on pre-trained BERT and ELECTRA models, exploring global and local classifier approaches. Furthermore, the effects of domain specialization, using additional metadata features and model compression are investigated. Several hundred thousand news articles are gathered to create unlabeled and labeled datasets for pre-training and fine-tuning, respectively. The findings show that a local classifier approach is superior to a global classifier approach and that BERT outperforms ELECTRA significantly. Notably, a baseline classifier built on SVMs yields competitive performance. The effect of further in-domain pre-training varies; ELECTRA’s performance improves while BERT’s is largely unaffected. It is found that utilizing metadata features in combination with text representations improves performance. Both BERT and ELECTRA exhibit robustness to quantization and pruning, allowing model sizes to be cut in half without any performance loss.
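A minimal sketch of the kind of classifier built here: a linear multi-label head over a pre-trained encoder's [CLS] representation, trained with a per-label sigmoid loss. The model name, label count and hyperparameters are placeholders (a Swedish news classifier would start from a Swedish pre-trained model), not the thesis's configuration.

```python
# Multi-label text classification head on a pre-trained encoder (sketch).
import torch
from transformers import AutoModel, AutoTokenizer

class MultiLabelClassifier(torch.nn.Module):
    def __init__(self, model_name="bert-base-cased", num_labels=20):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.head = torch.nn.Linear(self.encoder.config.hidden_size,
                                    num_labels)

    def forward(self, **inputs):
        hidden = self.encoder(**inputs).last_hidden_state[:, 0]  # [CLS]
        return self.head(hidden)           # one logit per label

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = MultiLabelClassifier()
batch = tokenizer(["A news article about markets."], return_tensors="pt")
logits = model(**batch)
targets = torch.zeros_like(logits)         # multi-hot gold labels go here
loss = torch.nn.BCEWithLogitsLoss()(logits, targets)  # sigmoid per label
```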
APA, Harvard, Vancouver, ISO, and other styles
49

Wong, Pak-kwong. "Statistical language models for Chinese recognition : speech and character /." Hong Kong : University of Hong Kong, 1998. http://sunzi.lib.hku.hk/hkuto/record.jsp?B20158725.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Gangireddy, Siva Reddy. "Recurrent neural network language models for automatic speech recognition." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/28990.

Full text
Abstract:
The goal of this thesis is to advance the use of recurrent neural network language models (RNNLMs) for large vocabulary continuous speech recognition (LVCSR). RNNLMs are currently state-of-the-art and have been shown to consistently reduce the word error rates (WERs) of LVCSR tasks when compared to other language models. In this thesis we propose various advances to RNNLMs: improved learning procedures, enhanced contexts, and adaptation. We learned better parameters through a novel pre-training approach and enhanced the context using prosody and syntactic features. We present a pre-training method for RNNLMs, in which the output weights of a feed-forward neural network language model (NNLM) are shared with the RNNLM. This is accomplished by first fine-tuning the weights of the NNLM, which are then used to initialise the output weights of an RNNLM with the same number of hidden units. To investigate the effectiveness of the proposed pre-training method, we carried out text-based experiments on the Penn Treebank Wall Street Journal data, and ASR experiments on the TED lectures data. Across the experiments, we observe small but significant improvements in perplexity (PPL) and ASR WER. Next, we present unsupervised adaptation of RNNLMs. We adapted the RNNLMs to a target domain (topic, genre, or television programme) at test time, using ASR transcripts from first-pass recognition. We investigated two approaches to adapting the RNNLMs. In the first approach, the forward-propagating hidden activations are scaled (learning hidden unit contributions, LHUC). In the second approach, we adapt all parameters of the RNNLM. We evaluated the adapted RNNLMs by reporting WERs on multi-genre broadcast speech data. We observe small (on average 0.1% absolute) but significant improvements in WER compared to a strong unadapted RNNLM. Finally, we present the context-enhancement of RNNLMs using prosody and syntactic features. The prosody features were computed from the acoustics of the context words, and the syntactic features were derived from the surface form of the words in the context. We trained the RNNLMs with word duration, pause duration, final phone duration, syllable duration, syllable F0, part-of-speech tag and Combinatory Categorial Grammar (CCG) supertag features. The proposed context-enhanced RNNLMs were evaluated by reporting PPL and WER on two speech recognition tasks, Switchboard and TED lectures. We observed substantial improvements in PPL (5% to 15% relative) and small but significant improvements in WER (0.1% to 0.5% absolute).
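The LHUC adaptation mentioned above can be sketched in a few lines: each hidden unit's activation is rescaled by a learned amplitude 2·sigmoid(a), and during adaptation only these amplitudes are updated while all other parameters stay frozen. The sizes and the LSTM variant below are illustrative assumptions, not the thesis's exact architecture.

```python
# RNNLM with LHUC scales as the only adaptable parameters (sketch).
import torch

class LHUCRNNLM(torch.nn.Module):
    def __init__(self, vocab=1000, embed=64, hidden=128):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab, embed)
        self.rnn = torch.nn.LSTM(embed, hidden, batch_first=True)
        self.out = torch.nn.Linear(hidden, vocab)
        self.lhuc = torch.nn.Parameter(torch.zeros(hidden))

    def forward(self, tokens):
        states, _ = self.rnn(self.embed(tokens))
        states = states * (2.0 * torch.sigmoid(self.lhuc))  # LHUC scaling
        return self.out(states)

model = LHUCRNNLM()
for name, p in model.named_parameters():   # freeze all but the LHUC scales
    p.requires_grad = name == "lhuc"
logits = model(torch.randint(0, 1000, (1, 10)))
```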
APA, Harvard, Vancouver, ISO, and other styles