A ready-made bibliography on the topic "Neural language models"

Create an accurate reference in APA, MLA, Chicago, Harvard, and many other styles


See the lists of current articles, books, dissertations, abstracts, and other scholarly sources on the topic "Neural language models".

Next to every work in the bibliography there is an "Add to bibliography" button. Click it, and we will automatically create the bibliographic reference to the selected work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the scholarly publication in .pdf format and read the abstract of the work online, if the corresponding parameters are available in the metadata.

Journal articles on the topic "Neural language models"

1

Buckman, Jacob, and Graham Neubig. "Neural Lattice Language Models". Transactions of the Association for Computational Linguistics 6 (December 2018): 529–41. http://dx.doi.org/10.1162/tacl_a_00036.

Abstract:
In this work, we propose a new language modeling paradigm that has the ability to perform both prediction and moderation of information flow at multiple granularities: neural lattice language models. These models construct a lattice of possible paths through a sentence and marginalize across this lattice to calculate sequence probabilities or optimize parameters. This approach allows us to seamlessly incorporate linguistic intuitions — including polysemy and the existence of multiword lexical items — into our language model. Experiments on multiple language modeling tasks show that English neural lattice language models that utilize polysemous embeddings are able to improve perplexity by 9.95% relative to a word-level baseline, and that a Chinese model that handles multi-character tokens is able to improve perplexity by 20.94% relative to a character-level baseline.
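
The marginalization described in this abstract can be stated compactly as follows (a simplified sketch, not necessarily the authors' exact factorization): for a sentence S with the set of lattice paths \Pi(S),

    P(S) = \sum_{\pi \in \Pi(S)} \prod_{t=1}^{|\pi|} P(\pi_t \mid \pi_{<t}),

where each path \pi is one way of traversing the sentence through lattice nodes (e.g., alternative word senses or multiword chunks). Perplexity is then computed from this marginal probability in the usual way.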
2

Bengio, Yoshua. "Neural net language models". Scholarpedia 3, no. 1 (2008): 3881. http://dx.doi.org/10.4249/scholarpedia.3881.
3

Dong, Li. "Learning natural language interfaces with neural models". AI Matters 7, no. 2 (June 2021): 14–17. http://dx.doi.org/10.1145/3478369.3478375.

Abstract:
Language is the primary and most natural means of communication for humans. The learning curve of interacting with various services (e.g., digital assistants, and smart appliances) would be greatly reduced if we could talk to machines using human language. However, in most cases computers can only interpret and execute formal languages.
4

De Coster, Mathieu, and Joni Dambre. "Leveraging Frozen Pretrained Written Language Models for Neural Sign Language Translation". Information 13, no. 5 (April 23, 2022): 220. http://dx.doi.org/10.3390/info13050220.

Abstract:
We consider neural sign language translation: machine translation from signed to written languages using encoder–decoder neural networks. Translating sign language videos to written language text is especially complex because of the difference in modality between source and target language and, consequently, the required video processing. At the same time, sign languages are low-resource languages, their datasets dwarfed by those available for written languages. Recent advances in written language processing and success stories of transfer learning raise the question of how pretrained written language models can be leveraged to improve sign language translation. We apply the Frozen Pretrained Transformer (FPT) technique to initialize the encoder, decoder, or both, of a sign language translation model with parts of a pretrained written language model. We observe that the attention patterns transfer in zero-shot to the different modality and, in some experiments, we obtain higher scores (from 18.85 to 21.39 BLEU-4). Especially when gloss annotations are unavailable, FPTs can increase performance on unseen data. However, current models appear to be limited primarily by data quality and only then by data quantity, limiting potential gains with FPTs. Therefore, in further research, we will focus on improving the representations used as inputs to translation models.
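
As a rough illustration of the freezing step behind the Frozen Pretrained Transformer technique, the sketch below (PyTorch-style Python) freezes every parameter of a pretrained module except those whose names match a keyword list; the helper name apply_fpt and the keyword choices are illustrative assumptions, not the authors' code.

    import torch.nn as nn

    def apply_fpt(model: nn.Module, trainable_keywords=("norm", "embed")) -> nn.Module:
        """Freeze a pretrained transformer module, leaving only parameters whose
        names contain one of the given keywords (e.g. layer norms, embeddings)
        trainable. Illustrative sketch only."""
        for name, param in model.named_parameters():
            param.requires_grad = any(key in name for key in trainable_keywords)
        return model

    # Example: freeze a generic encoder layer and count what remains trainable.
    layer = apply_fpt(nn.TransformerEncoderLayer(d_model=256, nhead=4, batch_first=True))
    print(sum(p.numel() for p in layer.parameters() if p.requires_grad))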
5

Chang, Tyler A., and Benjamin K. Bergen. "Word Acquisition in Neural Language Models". Transactions of the Association for Computational Linguistics 10 (2022): 1–16. http://dx.doi.org/10.1162/tacl_a_00444.

Abstract:
We investigate how neural language models acquire individual words during training, extracting learning curves and ages of acquisition for over 600 words on the MacArthur-Bates Communicative Development Inventory (Fenson et al., 2007). Drawing on studies of word acquisition in children, we evaluate multiple predictors for words’ ages of acquisition in LSTMs, BERT, and GPT-2. We find that the effects of concreteness, word length, and lexical class are pointedly different in children and language models, reinforcing the importance of interaction and sensorimotor experience in child language acquisition. Language models rely far more on word frequency than children, but, like children, they exhibit slower learning of words in longer utterances. Interestingly, models follow consistent patterns during training for both unidirectional and bidirectional models, and for both LSTM and Transformer architectures. Models predict based on unigram token frequencies early in training, before transitioning loosely to bigram probabilities, eventually converging on more nuanced predictions. These results shed light on the role of distributional learning mechanisms in children, while also providing insights for more human-like language acquisition in language models.
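
As a hedged illustration of the kind of measurement behind an "age of acquisition", the snippet below marks the first checkpoint at which a word's surprisal falls below a fixed threshold and stays there; the cited study fits learning curves rather than applying a raw threshold, so this is only a simplified stand-in and the threshold value is arbitrary.

    def age_of_acquisition(surprisal_by_step, threshold=4.0):
        """Return the first training step at which a word's surprisal drops
        below `threshold` and remains below it at every later checkpoint;
        None if that never happens. Simplified stand-in, not the paper's method."""
        steps = sorted(surprisal_by_step)
        for i, step in enumerate(steps):
            if all(surprisal_by_step[s] < threshold for s in steps[i:]):
                return step
        return None

    # Toy learning curve for one word (training step -> surprisal in bits).
    curve = {100: 9.1, 500: 7.4, 1000: 5.2, 5000: 3.8, 10000: 3.1}
    print(age_of_acquisition(curve))  # 5000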
6

Mezzoudj, Freha, and Abdelkader Benyettou. "An empirical study of statistical language models: n-gram language models vs. neural network language models". International Journal of Innovative Computing and Applications 9, no. 4 (2018): 189. http://dx.doi.org/10.1504/ijica.2018.095762.
7

Mezzoudj, Freha, and Abdelkader Benyettou. "An empirical study of statistical language models: n-gram language models vs. neural network language models". International Journal of Innovative Computing and Applications 9, no. 4 (2018): 189. http://dx.doi.org/10.1504/ijica.2018.10016827.
8

Mandy Lau. "Artificial intelligence language models and the false fantasy of participatory language policies". Working papers in Applied Linguistics and Linguistics at York 1 (September 13, 2021): 4–15. http://dx.doi.org/10.25071/2564-2855.5.

Abstract:
Artificial intelligence neural language models learn from a corpus of online language data, often drawn directly from user-generated content through crowdsourcing or the gift economy, bypassing traditional keepers of language policy and planning (such as governments and institutions). Here lies the dream that the languages of the digital world can bend towards individual needs and wants, and not the traditional way around. Through the participatory language work of users, linguistic diversity, accessibility, personalization, and inclusion can be increased. However, the promise of a more participatory, just, and emancipatory language policy as a result of neural language models is a false fantasy. I argue that neural language models represent a covert and oppressive form of language policy that benefits the privileged and harms the marginalized. Here, I examine the ideology underpinning neural language models and investigate the harms that result from these emerging subversive regulatory bodies.
9

Qi, Kunxun, and Jianfeng Du. "Translation-Based Matching Adversarial Network for Cross-Lingual Natural Language Inference". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 8632–39. http://dx.doi.org/10.1609/aaai.v34i05.6387.

Abstract:
Cross-lingual natural language inference is a fundamental task in cross-lingual natural language understanding, widely addressed by neural models recently. Existing neural model based methods either align sentence embeddings between source and target languages, heavily relying on annotated parallel corpora, or exploit pre-trained cross-lingual language models that are fine-tuned on a single language and hard to transfer knowledge to another language. To resolve these limitations in existing methods, this paper proposes an adversarial training framework to enhance both pre-trained models and classical neural models for cross-lingual natural language inference. It trains on the union of data in the source language and data in the target language, learning language-invariant features to improve the inference performance. Experimental results on the XNLI benchmark demonstrate that three popular neural models enhanced by the proposed framework significantly outperform the original models.
10

Park, Myung-Kwan, Keonwoo Koo, Jaemin Lee, and Wonil Chung. "Investigating Syntactic Transfer from English to Korean in Neural L2 Language Models". Studies in Modern Grammar 121 (March 30, 2024): 177–201. http://dx.doi.org/10.14342/smog.2024.121.177.

Abstract:
This paper investigates how the grammatical knowledge obtained in the initial language (English) of neural language models (LMs) influences the learning of grammatical structures in their second language (Korean). To achieve this objective, we conduct the now well-established experimental procedure, including (i) pre-training transformer-based GPT-2 LMs with Korean and English datasets, (ii) further fine-tuning them with a specific set of Korean data as L1 or L2, and (iii) evaluating them with the test data of KBLiMP while analyzing their linguistic generalization in L1 or L2. We have found negative transfer effects in the comparison between English as L1 and Korean as L2. Furthermore, in the trajectory analysis, the second language-learning LM has captured linguistic features of Korean including syntax, syntax-semantics interface, and morphology during the progressive training step. Our study of second language learning in LMs contributes to predicting potential syntactic challenges arising from the interference by the L1 language during the learning of Korean as a foreign language.
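
Benchmarks of the KBLiMP/BLiMP family are usually scored by checking whether the model assigns a higher probability to the acceptable sentence of each minimal pair; the snippet below shows that scoring logic, where score_sentence is a hypothetical helper returning a sentence's total log-probability under the model (not part of the study's code).

    def minimal_pair_accuracy(pairs, score_sentence):
        """pairs: iterable of (acceptable, unacceptable) sentence pairs.
        score_sentence: callable giving the model's total log-probability
        of a sentence (hypothetical helper). Returns the fraction of pairs
        on which the acceptable sentence is preferred."""
        pairs = list(pairs)
        correct = sum(score_sentence(good) > score_sentence(bad) for good, bad in pairs)
        return correct / len(pairs)

    # Toy check with a dummy scorer standing in for a trained LM.
    toy = [("the dogs bark", "the dogs barks")]
    print(minimal_pair_accuracy(toy, score_sentence=lambda s: -len(s)))  # 1.0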

Doctoral dissertations on the topic "Neural language models"

1

Lei, Tao Ph D. Massachusetts Institute of Technology. "Interpretable neural models for natural language processing". Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/108990.

Abstract:
Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 109-119).
The success of neural network models often comes at a cost of interpretability. This thesis addresses the problem by providing justifications behind the model's structure and predictions. In the first part of this thesis, we present a class of sequence operations for text processing. The proposed component generalizes from convolution operations and gated aggregations. As justifications, we relate this component to string kernels, i.e. functions measuring the similarity between sequences, and demonstrate how it encodes the efficient kernel computing algorithm into its structure. The proposed model achieves state-of-the-art or competitive results compared to alternative architectures (such as LSTMs and CNNs) across several NLP applications. In the second part, we learn rationales behind the model's prediction by extracting input pieces as supporting evidence. Rationales are tailored to be short and coherent, yet sufficient for making the same prediction. Our approach combines two modular components, generator and encoder, which are trained to operate well together. The generator specifies a distribution over text fragments as candidate rationales and these are passed through the encoder for prediction. Rationales are never given during training. Instead, the model is regularized by the desiderata for rationales. We demonstrate the effectiveness of this learning framework in applications such as multi-aspect sentiment analysis. Our method achieves performance of over 90% evaluated against manually annotated rationales.
by Tao Lei.
Ph. D.
2

Kunz, Jenny. "Neural Language Models with Explicit Coreference Decision". Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-371827.

Abstract:
Coreference is an important and frequent concept in any form of discourse, and Coreference Resolution (CR) a widely used task in Natural Language Understanding (NLU). In this thesis, we implement and explore two recent models that include the concept of coreference in Recurrent Neural Network (RNN)-based Language Models (LM). Entity and reference decisions are modeled explicitly in these models using attention mechanisms. Both models learn to save the previously observed entities in a set and to decide if the next token created by the LM is a mention of one of the entities in the set, an entity that has not been observed yet, or not an entity. After a theoretical analysis where we compare the two LMs to each other and to a state of the art Coreference Resolution system, we perform an extensive quantitative and qualitative analysis. For this purpose, we train the two models and a classical RNN-LM as the baseline model on the OntoNotes 5.0 corpus with coreference annotation. While we do not reach the baseline in the perplexity metric, we show that the models’ relative performance on entity tokens has the potential to improve when including the explicit entity modeling. We show that the most challenging point in the systems is the decision if the next token is an entity token, while the decision which entity the next token refers to performs comparatively well. Our analysis in the context of a text generation task shows that a wide-spread error source for the mention creation process is the confusion of tokens that refer to related but different entities in the real world, presumably a result of the context-based word representations in the models. Our re-implementation of the DeepMind model by Yang et al. 2016 performs notably better than the re-implementation of the EntityNLM model by Ji et al. 2017 with a perplexity of 107 compared to a perplexity of 131.
3

Labeau, Matthieu. "Neural language models : Dealing with large vocabularies". Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS313/document.

Abstract:
This work investigates practical methods to ease training and improve performances of neural language models with large vocabularies. The main limitation of neural language models is their expensive computational cost: it depends on the size of the vocabulary, with which it grows linearly. Despite several training tricks, the most straightforward way to limit computation time is to limit the vocabulary size, which is not a satisfactory solution for numerous tasks. Most of the existing methods used to train large-vocabulary language models revolve around avoiding the computation of the partition function, ensuring that output scores are normalized into a probability distribution. Here, we focus on sampling-based approaches, including importance sampling and noise contrastive estimation. These methods allow an approximate computation of the partition function. After examining the mechanism of self-normalization in noise-contrastive estimation, we first propose to improve its efficiency with solutions that are adapted to the inner workings of the method and experimentally show that they considerably ease training. Our second contribution is to expand on a generalization of several sampling based objectives as Bregman divergences, in order to experiment with new objectives. We use Beta divergences to derive a set of objectives from which noise contrastive estimation is a particular case. Finally, we aim at improving performances on full vocabulary language models, by augmenting output words representation with subwords. We experiment on a Czech dataset and show that using character-based representations besides word embeddings for output representations gives better results. We also show that reducing the size of the output look-up table improves results even more.
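
For context, the partition function and the noise contrastive estimation (NCE) objective discussed above can be written in a standard textbook form (not necessarily the thesis's exact notation). With a score s_\theta(w, h) for word w in context h:

    P_\theta(w \mid h) = \frac{\exp s_\theta(w, h)}{Z_\theta(h)}, \qquad Z_\theta(h) = \sum_{w' \in V} \exp s_\theta(w', h)

NCE avoids computing Z_\theta(h) by training the model to distinguish data words from k noise words drawn from a distribution q:

    J(\theta) = \mathbb{E}_{w \sim \mathrm{data}} \left[ \log \frac{p_\theta(w \mid h)}{p_\theta(w \mid h) + k\, q(w)} \right] + k\, \mathbb{E}_{\bar{w} \sim q} \left[ \log \frac{k\, q(\bar{w})}{p_\theta(\bar{w} \mid h) + k\, q(\bar{w})} \right]

with p_\theta commonly treated as self-normalized, i.e. Z_\theta(h) \approx 1.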
4

Bayer, Ali Orkan. "Semantic Language models with deep neural Networks". Doctoral thesis, Università degli studi di Trento, 2015. https://hdl.handle.net/11572/367784.

Abstract:
Spoken language systems (SLS) communicate with users in natural language through speech. There are two main problems related to processing the spoken input in SLS. The first one is automatic speech recognition (ASR) which recognizes what the user says. The second one is spoken language understanding (SLU) which understands what the user means. We focus on the language model (LM) component of SLS. LMs constrain the search space that is used in the search for the best hypothesis. Therefore, they play a crucial role in the performance of SLS. It has long been discussed that an improvement in the recognition performance does not necessarily yield a better understanding performance. Therefore, optimization of LMs for the understanding performance is crucial. In addition, long-range dependencies in languages are hard to handle with statistical language models. These two problems are addressed in this thesis. We investigate two different LM structures. The first LM that we investigate enable SLS to understand better what they recognize by searching the ASR hypotheses for the best understanding performance. We refer to these models as joint LMs. They use lexical and semantic units jointly in the LM. The second LM structure uses the semantic context of an utterance, which can also be described as “what the system understands”, to search for a better hypothesis that improves the recognition and the understanding performance. We refer to these models as semantic LMs (SELMs). SELMs use features that are based on a well established theory of lexical semantics, namely the theory of frame semantics. They incorporate the semantic features which are extracted from the ASR hypothesis into the LM and handle long-range dependencies by using the semantic relationships between words and semantic context. ASR noise is propagated to the semantic features, to suppress this noise we introduce the use of deep semantic encodings for semantic feature extraction. In this way, SELMs optimize both the recognition and the understanding performance.
5

Bayer, Ali Orkan. "Semantic Language models with deep neural Networks". Doctoral thesis, University of Trento, 2015. http://eprints-phd.biblio.unitn.it/1578/1/bayer_thesis.pdf.

Abstract:
Spoken language systems (SLS) communicate with users in natural language through speech. There are two main problems related to processing the spoken input in SLS. The first one is automatic speech recognition (ASR) which recognizes what the user says. The second one is spoken language understanding (SLU) which understands what the user means. We focus on the language model (LM) component of SLS. LMs constrain the search space that is used in the search for the best hypothesis. Therefore, they play a crucial role in the performance of SLS. It has long been discussed that an improvement in the recognition performance does not necessarily yield a better understanding performance. Therefore, optimization of LMs for the understanding performance is crucial. In addition, long-range dependencies in languages are hard to handle with statistical language models. These two problems are addressed in this thesis. We investigate two different LM structures. The first LM that we investigate enable SLS to understand better what they recognize by searching the ASR hypotheses for the best understanding performance. We refer to these models as joint LMs. They use lexical and semantic units jointly in the LM. The second LM structure uses the semantic context of an utterance, which can also be described as “what the system understands”, to search for a better hypothesis that improves the recognition and the understanding performance. We refer to these models as semantic LMs (SELMs). SELMs use features that are based on a well established theory of lexical semantics, namely the theory of frame semantics. They incorporate the semantic features which are extracted from the ASR hypothesis into the LM and handle long-range dependencies by using the semantic relationships between words and semantic context. ASR noise is propagated to the semantic features, to suppress this noise we introduce the use of deep semantic encodings for semantic feature extraction. In this way, SELMs optimize both the recognition and the understanding performance.
6

Li, Zhongliang. "Slim Embedding Layers for Recurrent Neural Language Models". Wright State University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=wright1531950458646138.

7

Gangireddy, Siva Reddy. "Recurrent neural network language models for automatic speech recognition". Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/28990.

Abstract:
The goal of this thesis is to advance the use of recurrent neural network language models (RNNLMs) for large vocabulary continuous speech recognition (LVCSR). RNNLMs are currently state-of-the-art and shown to consistently reduce the word error rates (WERs) of LVCSR tasks when compared to other language models. In this thesis we propose various advances to RNNLMs. The advances are: improved learning procedures for RNNLMs, enhancing the context, and adaptation of RNNLMs. We learned better parameters by a novel pre-training approach and enhanced the context using prosody and syntactic features. We present a pre-training method for RNNLMs, in which the output weights of a feed-forward neural network language model (NNLM) are shared with the RNNLM. This is accomplished by first fine-tuning the weights of the NNLM, which are then used to initialise the output weights of an RNNLM with the same number of hidden units. To investigate the effectiveness of the proposed pre-training method, we have carried out text-based experiments on the Penn Treebank Wall Street Journal data, and ASR experiments on the TED lectures data. Across the experiments, we observe small but significant improvements in perplexity (PPL) and ASR WER. Next, we present unsupervised adaptation of RNNLMs. We adapted the RNNLMs to a target domain (topic or genre or television programme (show)) at test time using ASR transcripts from first pass recognition. We investigated two approaches to adapt the RNNLMs. In the first approach the forward propagating hidden activations are scaled - learning hidden unit contributions (LHUC). In the second approach we adapt all parameters of the RNNLM. We evaluated the adapted RNNLMs by showing the WERs on multi genre broadcast speech data. We observe small (on average 0.1% absolute) but significant improvements in WER compared to a strong unadapted RNNLM model. Finally, we present the context-enhancement of RNNLMs using prosody and syntactic features. The prosody features were computed from the acoustics of the context words and the syntactic features were from the surface form of the words in the context. We trained the RNNLMs with word duration, pause duration, final phone duration, syllable duration, syllable F0, part-of-speech tag and Combinatory Categorial Grammar (CCG) supertag features. The proposed context-enhanced RNNLMs were evaluated by reporting PPL and WER on two speech recognition tasks, Switchboard and TED lectures. We observed substantial improvements in PPL (5% to 15% relative) and small but significant improvements in WER (0.1% to 0.5% absolute).
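
The LHUC adaptation mentioned above is commonly written as a learned per-unit rescaling of the hidden activations (shown here in a common form; the thesis's exact parameterization may differ):

    \mathbf{h}^{(l)}_{\mathrm{adapted}} = a(\mathbf{r}^{(l)}) \odot \mathbf{h}^{(l)}, \qquad a(\mathbf{r}) = 2\,\sigma(\mathbf{r})

where the vectors r^{(l)} are the only parameters updated on the adaptation data, \sigma is the logistic sigmoid and \odot denotes element-wise multiplication.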
8

Scarcella, Alessandro. "Recurrent neural network language models in the context of under-resourced South African languages". Master's thesis, University of Cape Town, 2018. http://hdl.handle.net/11427/29431.

Abstract:
Over the past five years neural network models have been successful across a range of computational linguistic tasks. However, these triumphs have been concentrated in languages with significant resources such as large datasets. Thus, many languages, which are commonly referred to as under-resourced languages, have received little attention and have yet to benefit from recent advances. This investigation aims to evaluate the implications of recent advances in neural network language modelling techniques for under-resourced South African languages. Rudimentary, single layered recurrent neural networks (RNN) were used to model four South African text corpora. The accuracy of these models was compared directly to legacy approaches. A suite of hybrid models was then tested. Across all four datasets, neural networks led to overall better performing language models either directly or as part of a hybrid model. A short examination of punctuation marks in text data revealed that performance metrics for language models are greatly overestimated when punctuation marks have not been excluded. The investigation concludes by appraising the sensitivity of RNN language models (RNNLMs) to the size of the datasets by artificially constraining the datasets and evaluating the accuracy of the models. It is recommended that future research endeavours within this domain are directed towards evaluating more sophisticated RNNLMs as well as measuring their impact on application focused tasks such as speech recognition and machine translation.
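
Since the comparison above is reported in terms of perplexity, recall the standard definition for a test sequence of N tokens (not specific to this thesis):

    \mathrm{PPL} = \exp\left( -\frac{1}{N} \sum_{i=1}^{N} \log P(w_i \mid w_{<i}) \right)

Keeping punctuation tokens adds many easy, high-probability predictions and increases N, which is one reason reported perplexities can look better when punctuation is not excluded.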
9

Le, Hai Son. "Continuous space models with neural networks in natural language processing". Phd thesis, Université Paris Sud - Paris XI, 2012. http://tel.archives-ouvertes.fr/tel-00776704.

Abstract:
The purpose of language models is in general to capture and to model regularities of language, thereby capturing morphological, syntactical and distributional properties of word sequences in a given language. They play an important role in many successful applications of Natural Language Processing, such as Automatic Speech Recognition, Machine Translation and Information Extraction. The most successful approaches to date are based on n-gram assumption and the adjustment of statistics from the training data by applying smoothing and back-off techniques, notably Kneser-Ney technique, introduced twenty years ago. In this way, language models predict a word based on its n-1 previous words. In spite of their prevalence, conventional n-gram based language models still suffer from several limitations that could be intuitively overcome by consulting human expert knowledge. One critical limitation is that, ignoring all linguistic properties, they treat each word as one discrete symbol with no relation with the others. Another point is that, even with a huge amount of data, the data sparsity issue always has an important impact, so the optimal value of n in the n-gram assumption is often 4 or 5 which is insufficient in practice. This kind of model is constructed based on the count of n-grams in training data. Therefore, the pertinence of these models is conditioned only on the characteristics of the training text (its quantity, its representation of the content in terms of theme, date). Recently, one of the most successful attempts that tries to directly learn word similarities is to use distributed word representations in language modeling, where distributionally words, which have semantic and syntactic similarities, are expected to be represented as neighbors in a continuous space. These representations and the associated objective function (the likelihood of the training data) are jointly learned using a multi-layer neural network architecture. In this way, word similarities are learned automatically. This approach has shown significant and consistent improvements when applied to automatic speech recognition and statistical machine translation tasks. A major difficulty with the continuous space neural network based approach remains the computational burden, which does not scale well to the massive corpora that are nowadays available. For this reason, the first contribution of this dissertation is the definition of a neural architecture based on a tree representation of the output vocabulary, namely Structured OUtput Layer (SOUL), which makes them well suited for large scale frameworks. The SOUL model combines the neural network approach with the class-based approach. It achieves significant improvements on both state-of-the-art large scale automatic speech recognition and statistical machine translations tasks. The second contribution is to provide several insightful analyses on their performances, their pros and cons, their induced word space representation. Finally, the third contribution is the successful adoption of the continuous space neural network into a machine translation framework. New translation models are proposed and reported to achieve significant improvements over state-of-the-art baseline systems.
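
The class-based component that the SOUL layer combines with the neural approach can be illustrated by the simplest two-level factorization of the output distribution (the SOUL layer itself generalizes this to a tree over the vocabulary; this is a simplified sketch):

    P(w \mid h) = P\big(c(w) \mid h\big)\, P\big(w \mid c(w), h\big)

so the softmax is taken first over word classes and then only over the words of the selected class, instead of over the full vocabulary at once.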
10

Miao, Yishu. "Deep generative models for natural language processing". Thesis, University of Oxford, 2017. http://ora.ox.ac.uk/objects/uuid:e4e1f1f9-e507-4754-a0ab-0246f1e1e258.

Abstract:
Deep generative models are essential to Natural Language Processing (NLP) due to their outstanding ability to use unlabelled data, to incorporate abundant linguistic features, and to learn interpretable dependencies among data. As the structure becomes deeper and more complex, having an effective and efficient inference method becomes increasingly important. In this thesis, neural variational inference is applied to carry out inference for deep generative models. While traditional variational methods derive an analytic approximation for the intractable distributions over latent variables, here we construct an inference network conditioned on the discrete text input to provide the variational distribution. The powerful neural networks are able to approximate complicated non-linear distributions and grant the possibilities for more interesting and complicated generative models. Therefore, we develop the potential of neural variational inference and apply it to a variety of models for NLP with continuous or discrete latent variables. This thesis is divided into three parts. Part I introduces a generic variational inference framework for generative and conditional models of text. For continuous or discrete latent variables, we apply a continuous reparameterisation trick or the REINFORCE algorithm to build low-variance gradient estimators. To further explore Bayesian non-parametrics in deep neural networks, we propose a family of neural networks that parameterise categorical distributions with continuous latent variables. Using the stick-breaking construction, an unbounded categorical distribution is incorporated into our deep generative models which can be optimised by stochastic gradient back-propagation with a continuous reparameterisation. Part II explores continuous latent variable models for NLP. Chapter 3 discusses the Neural Variational Document Model (NVDM): an unsupervised generative model of text which aims to extract a continuous semantic latent variable for each document. In Chapter 4, the neural topic models modify the neural document models by parameterising categorical distributions with continuous latent variables, where the topics are explicitly modelled by discrete latent variables. The models are further extended to neural unbounded topic models with the help of stick-breaking construction, and a truncation-free variational inference method is proposed based on a Recurrent Stick-breaking construction (RSB). Chapter 5 describes the Neural Answer Selection Model (NASM) for learning a latent stochastic attention mechanism to model the semantics of question-answer pairs and predict their relatedness. Part III discusses discrete latent variable models. Chapter 6 introduces latent sentence compression models. The Auto-encoding Sentence Compression Model (ASC), as a discrete variational auto-encoder, generates a sentence by a sequence of discrete latent variables representing explicit words. The Forced Attention Sentence Compression Model (FSC) incorporates a combined pointer network biased towards the usage of words from source sentence, which significantly improves the performance when jointly trained with the ASC model in a semi-supervised learning fashion. Chapter 7 describes the Latent Intention Dialogue Models (LIDM) that employ a discrete latent variable to learn underlying dialogue intentions. Additionally, the latent intentions can be interpreted as actions guiding the generation of machine responses, which could be further refined autonomously by reinforcement learning. 
Finally, Chapter 8 summarizes our findings and directions for future work.
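
The neural variational inference used throughout the thesis maximizes the standard evidence lower bound, with the variational distribution q produced by an inference network conditioned on the text x (standard form, not thesis-specific notation):

    \log p_\theta(x) \ge \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)

where the expectation is estimated with the reparameterisation trick for continuous z and with REINFORCE-style estimators for discrete z, as the abstract notes.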

Books on the topic "Neural language models"

1

Houghton, George, ed. Connectionist models in cognitive psychology. Hove: Psychology Press, 2004.
2

Miikkulainen, Risto. Subsymbolic natural language processing: An integrated model of scripts, lexicon, and memory. Cambridge, Mass: MIT Press, 1993.

3

Bavaeva, Ol'ga. Metaphorical parallels of the neutral nomination "man" in modern English. INFRA-M Academic Publishing LLC, 2022. http://dx.doi.org/10.12737/1858259.

Abstract:
The monograph is devoted to a multidimensional analysis of metaphor in modern English as a parallel nomination that exists along with a neutral equivalent denoting a person. The problem of determining the essence of metaphorical names and their role in the language has attracted the attention of many foreign and domestic linguists on the material of various languages, but until now the fact of the parallel existence of metaphors and neutral nominations has not been emphasized. The research is in line with modern problems of linguistics related to the relationship of language, thinking and reflection of the surrounding reality. All these problems are integrated and resolved within the framework of linguistic semantics, in particular in the semantics of metaphor. Multilevel study of language material based on semantic, component, etymological analysis methods contributed to a systematic and comprehensive description of this most important part of the lexical system of the English language. Metaphorical parallels are considered as the result of the interaction of three complexes, which allows us to identify their associative-figurative base, as well as the types of metaphorical parallels, depending on the nature of the connection between direct and figurative meaning. Based on the analysis of various human character traits and behavior that evoke associations with animals, birds, objects, zoomorphic, artifact, somatic, floral and anthropomorphic metaphorical parallels of the neutral nomination "man" are distinguished. The social aspect of metaphorical parallels is also investigated as a reflection of gender, status and age characteristics of a person. It can be used in the training of philologists and translators when reading theoretical courses on lexicology, stylistics, word formation of the English language, as well as in practical classes, in lexicographic practice.
4

Arbib, Michael. Neural Models of Language Processes. Elsevier Science & Technology Books, 2012.

5

Cairns, Paul, Joseph P. Levy, Dimitrios Bairaktaris, and John A. Bullinaria. Connectionist Models of Memory and Language. Taylor & Francis Group, 2015.
6

Houghton, George. Connectionist Models in Cognitive Psychology. Taylor & Francis Group, 2004.

7

Houghton, George. Connectionist Models in Cognitive Psychology. Taylor & Francis Group, 2004.

8

Houghton, George. Connectionist Models in Cognitive Psychology. Taylor & Francis Group, 2004.

9

Houghton, George. Connectionist Models in Cognitive Psychology. Taylor & Francis Group, 2004.

10

Connectionist Models in Cognitive Psychology. Taylor & Francis Group, 2014.


Book chapters on the topic "Neural language models"

1

Skansi, Sandro. "Neural Language Models". In Undergraduate Topics in Computer Science, 165–73. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-73004-2_9.
2

Delasalles, Edouard, Sylvain Lamprier, and Ludovic Denoyer. "Dynamic Neural Language Models". In Neural Information Processing, 282–94. Cham: Springer International Publishing, 2019. http://dx.doi.org/10.1007/978-3-030-36718-3_24.
3

Hampton, Peter John, Hui Wang, and Zhiwei Lin. "Knowledge Transfer in Neural Language Models". In Artificial Intelligence XXXIV, 143–48. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-71078-5_12.
4

O’Neill, James, and Danushka Bollegala. "Learning to Evaluate Neural Language Models". In Communications in Computer and Information Science, 123–33. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-6168-9_11.
5

Goldrick, Matthew. "Neural Network Models of Speech Production". In The Handbook of the Neuropsychology of Language, 125–45. Oxford, UK: Wiley-Blackwell, 2012. http://dx.doi.org/10.1002/9781118432501.ch7.
6

G, Santhosh Kumar. "Neural Language Models for (Fake?) News Generation". In Data Science for Fake News, 129–47. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-62696-9_6.
7

Huang, Yue, and Xiaodong Gu. "Temporal Modeling Approach for Video Action Recognition Based on Vision-language Models". In Neural Information Processing, 512–23. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-8067-3_38.
8

Goldberg, Yoav. "From Linear Models to Multi-layer Perceptrons". In Neural Network Methods for Natural Language Processing, 37–39. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-031-02165-7_3.
9

Shen, Tongtong, Longbiao Wang, Xie Chen, Kuntharrgyal Khysru, and Jianwu Dang. "Exploiting the Tibetan Radicals in Recurrent Neural Network for Low-Resource Language Models". In Neural Information Processing, 266–75. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-70096-0_28.
10

Taylor, N. R., and J. G. Taylor. "The Neural Networks for Language in the Brain: Creating LAD". In Computational Models for Neuroscience, 245–65. London: Springer London, 2003. http://dx.doi.org/10.1007/978-1-4471-0085-0_9.

Conference abstracts on the topic "Neural language models"

1

Ragni, Anton, Edgar Dakin, Xie Chen, Mark J. F. Gales, and Kate M. Knill. "Multi-Language Neural Network Language Models". In Interspeech 2016. ISCA, 2016. http://dx.doi.org/10.21437/interspeech.2016-371.
2

Кузнецов, Алексей Валерьевич. "NEURAL LANGUAGE MODELS FOR HISTORICAL RESEARCH". In Высокие технологии и инновации в науке: сборник избранных статей Международной научной конференции (Санкт-Петербург, Май 2022). Crossref, 2022. http://dx.doi.org/10.37539/vt197.2022.25.51.002.

Abstract:
With the increasing availability of digitized resources of historical documents, there is a growing interest in natural language processing methods and technologies. In recent years, the field of automatic text analysis has been dominated by deep neural networks. This article analyzes the existing experience and trends in the creation of language models based on transformers for historical languages.
3

Alexandrescu, Andrei, and Katrin Kirchhoff. "Factored neural language models". In the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers. Morristown, NJ, USA: Association for Computational Linguistics, 2006. http://dx.doi.org/10.3115/1614049.1614050.
4

Arisoy, Ebru, and Murat Saraclar. "Compositional Neural Network Language Models for Agglutinative Languages". In Interspeech 2016. ISCA, 2016. http://dx.doi.org/10.21437/interspeech.2016-1239.
5

Gandhe, Ankur, Florian Metze, and Ian Lane. "Neural network language models for low resource languages". In Interspeech 2014. ISCA: ISCA, 2014. http://dx.doi.org/10.21437/interspeech.2014-560.
6

Chen, Zihao. "Neural Language Models in Natural Language Processing". In 2023 2nd International Conference on Data Analytics, Computing and Artificial Intelligence (ICDACAI). IEEE, 2023. http://dx.doi.org/10.1109/icdacai59742.2023.00104.
7

Oba, Miyu, Tatsuki Kuribayashi, Hiroki Ouchi, and Taro Watanabe. "Second Language Acquisition of Neural Language Models". In Findings of the Association for Computational Linguistics: ACL 2023. Stroudsburg, PA, USA: Association for Computational Linguistics, 2023. http://dx.doi.org/10.18653/v1/2023.findings-acl.856.
8

Liu, X., M. J. F. Gales, and P. C. Woodland. "Paraphrastic language models and combination with neural network language models". In ICASSP 2013 - 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2013. http://dx.doi.org/10.1109/icassp.2013.6639308.
9

Javier Vazquez Martinez, Hector, Annika Lea Heuser, Charles Yang, and Jordan Kodner. "Evaluating Neural Language Models as Cognitive Models of Language Acquisition". In Proceedings of the 1st GenBench Workshop on (Benchmarking) Generalisation in NLP. Stroudsburg, PA, USA: Association for Computational Linguistics, 2023. http://dx.doi.org/10.18653/v1/2023.genbench-1.4.
10

Huang, Yinghui, Abhinav Sethy, Kartik Audhkhasi, and Bhuvana Ramabhadran. "Whole Sentence Neural Language Models". In ICASSP 2018 - 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018. http://dx.doi.org/10.1109/icassp.2018.8461734.

Organizational reports on the topic "Neural language models"

1

Semerikov, Serhiy O., Illia O. Teplytskyi, Yuliia V. Yechkalo, and Arnold E. Kiv. Computer Simulation of Neural Networks Using Spreadsheets: The Dawn of the Age of Camelot. [n.p.], November 2018. http://dx.doi.org/10.31812/123456789/2648.

Abstract:
The article substantiates the necessity to develop training methods of computer simulation of neural networks in the spreadsheet environment. The systematic review of their application to simulating artificial neural networks is performed. The authors distinguish basic approaches to solving the problem of network computer simulation training in the spreadsheet environment, joint application of spreadsheets and tools of neural network simulation, application of third-party add-ins to spreadsheets, development of macros using the embedded languages of spreadsheets; use of standard spreadsheet add-ins for non-linear optimization, creation of neural networks in the spreadsheet environment without add-ins and macros. After analyzing a collection of writings of 1890-1950, the research determines the role of the scientific journal “Bulletin of Mathematical Biophysics”, its founder Nicolas Rashevsky and the scientific community around the journal in creating and developing models and methods of computational neuroscience. There are identified psychophysical basics of creating neural networks, mathematical foundations of neural computing and methods of neuroengineering (image recognition, in particular). The role of Walter Pitts in combining the descriptive and quantitative theories of training is discussed. It is shown that to acquire neural simulation competences in the spreadsheet environment, one should master the models based on the historical and genetic approach. It is indicated that there are three groups of models, which are promising in terms of developing corresponding methods – the continuous two-factor model of Rashevsky, the discrete model of McCulloch and Pitts, and the discrete-continuous models of Householder and Landahl.
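
As a pointer to the kind of historical model the report recommends starting from, here is a minimal McCulloch-Pitts threshold unit in Python; the weights and threshold in the example are illustrative, and the spreadsheet equivalent shown in the comment is only a plausible single-cell formula, not the report's own material.

    def mcculloch_pitts(inputs, weights, threshold):
        """Binary threshold unit: returns 1 when the weighted sum of the
        binary inputs reaches the threshold, otherwise 0."""
        total = sum(x * w for x, w in zip(inputs, weights))
        return 1 if total >= threshold else 0

    # An AND gate with unit weights and threshold 2; in a spreadsheet this
    # could be a formula such as =IF(A1+B1>=2,1,0).
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, mcculloch_pitts((a, b), (1, 1), threshold=2))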
2

Apicella, M. L., J. Slaton, and B. Levi. Integrated Information Support System (IISS). Volume 5. Common Data Model Subsystem. Part 10. Neutral Data Manipulation Language (NDML) Precompiler Control Module Product Specification. Fort Belvoir, VA: Defense Technical Information Center, September 1990. http://dx.doi.org/10.21236/ada250451.
3

Althoff, J. L., M. L. Apicella, and S. Singh. Integrated Information Support System (IISS). Volume 5. Common Data Model Subsystem. Part 5. Neutral Data Definition Language (NDDL) Development Specification. Fort Belvoir, VA: Defense Technical Information Center, September 1990. http://dx.doi.org/10.21236/ada252450.
4

Apicella, M. L., J. Slaton, and B. Levi. Integrated Information Support System (IISS). Volume 5. Common Data Model Subsystem. Part 13. Neutral Data Manipulation Language (NDML) Precompiler Parse NDML Product Specification. Fort Belvoir, VA: Defense Technical Information Center, September 1990. http://dx.doi.org/10.21236/ada250453.
5

Althoff, J., and M. Apicella. Integrated Information Support System (IISS). Volume 5. Common Data Model Subsystem. Part 9. Neutral Data Manipulation Language (NDML) Precompiler Development Specification. Section 2. Fort Belvoir, VA: Defense Technical Information Center, September 1990. http://dx.doi.org/10.21236/ada252526.
6

Apicella, M. L., J. Slaton, and B. Levi. Integrated Information Support System (IISS). Volume 5. Common Data Model Subsystem. Part 12. Neutral Data Manipulation Language (NDML) Precompiler Parse Procedure Division Product Specification. Fort Belvoir, VA: Defense Technical Information Center, September 1990. http://dx.doi.org/10.21236/ada250452.
7

Apicella, M. L., J. Slaton, B. Levi, and A. Pashak. Integrated Information Support System (IISS). Volume 5. Common Data Model Subsystem. Part 23. Neutral Data Manipulation Language (NDML) Precompiler Build Source Code Product Specification. Fort Belvoir, VA: Defense Technical Information Center, September 1990. http://dx.doi.org/10.21236/ada250460.
8

Apicella, M. L., J. Slaton, B. Levi, and A. Pashak. Integrated Information Support System (IISS). Volume 5. Common Data Model Subsystem. Part 24. Neutral Data Manipulation Language (NDML) Precompiler Generator Support Routines Product Specification. Fort Belvoir, VA: Defense Technical Information Center, September 1990. http://dx.doi.org/10.21236/ada250461.
9

Althoff, J., M. Apicella, and S. Singh. Integrated Information Support System (IISS). Volume 5. Common Data Model Subsystem. Part 6. Neutral Data Definition Language (NDDL) Product Specification. Section 3 of 6. Fort Belvoir, VA: Defense Technical Information Center, September 1990. http://dx.doi.org/10.21236/ada251997.
10

Althoff, J., M. Apicella, and S. Singh. Integrated Information Support System (IISS). Volume 5. Common Data Model Subsystem. Part 6. Neutral Data Definition Language (NDDL) Product Specification. Section 4 of 6. Fort Belvoir, VA: Defense Technical Information Center, September 1990. http://dx.doi.org/10.21236/ada251998.