Journal articles on the topic 'Neural language models'

Consult the top 50 journal articles for your research on the topic 'Neural language models.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Buckman, Jacob, and Graham Neubig. "Neural Lattice Language Models." Transactions of the Association for Computational Linguistics 6 (December 2018): 529–41. http://dx.doi.org/10.1162/tacl_a_00036.

Abstract:
In this work, we propose a new language modeling paradigm that has the ability to perform both prediction and moderation of information flow at multiple granularities: neural lattice language models. These models construct a lattice of possible paths through a sentence and marginalize across this lattice to calculate sequence probabilities or optimize parameters. This approach allows us to seamlessly incorporate linguistic intuitions — including polysemy and the existence of multiword lexical items — into our language model. Experiments on multiple language modeling tasks show that English neural lattice language models that utilize polysemous embeddings are able to improve perplexity by 9.95% relative to a word-level baseline, and that a Chinese model that handles multi-character tokens is able to improve perplexity by 20.94% relative to a character-level baseline.
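To make the lattice construction above concrete, here is a minimal sketch of marginalizing over all segmentations of a token sequence with a forward pass; the toy chunk probabilities are context-independent (unlike the neural model in the paper), and every name below is our own illustration rather than the authors' code.

```python
import math
from typing import Dict, List, Tuple

def logsumexp(xs: List[float]) -> float:
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def lattice_log_prob(tokens: List[str],
                     chunk_logp: Dict[Tuple[str, ...], float],
                     max_chunk: int = 2) -> float:
    """Forward pass over a segmentation lattice: alpha[i] is the log-probability
    of tokens[:i], summed over every lattice path that ends at position i."""
    n = len(tokens)
    alpha = [float("-inf")] * (n + 1)
    alpha[0] = 0.0
    for i in range(1, n + 1):
        scores = []
        for k in range(1, min(max_chunk, i) + 1):
            chunk = tuple(tokens[i - k:i])
            if chunk in chunk_logp:              # this edge exists in the lattice
                scores.append(alpha[i - k] + chunk_logp[chunk])
        if scores:
            alpha[i] = logsumexp(scores)
    return alpha[n]

# Toy lattice over "new york times" with a multiword chunk for "new york".
chunk_logp = {
    ("new",): math.log(0.2), ("york",): math.log(0.1),
    ("times",): math.log(0.3), ("new", "york"): math.log(0.15),
}
print(lattice_log_prob(["new", "york", "times"], chunk_logp))
```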
2

Bengio, Yoshua. "Neural net language models." Scholarpedia 3, no. 1 (2008): 3881. http://dx.doi.org/10.4249/scholarpedia.3881.

3

Dong, Li. "Learning natural language interfaces with neural models." AI Matters 7, no. 2 (June 2021): 14–17. http://dx.doi.org/10.1145/3478369.3478375.

Abstract:
Language is the primary and most natural means of communication for humans. The learning curve of interacting with various services (e.g., digital assistants and smart appliances) would be greatly reduced if we could talk to machines using human language. However, in most cases computers can only interpret and execute formal languages.
4

De Coster, Mathieu, and Joni Dambre. "Leveraging Frozen Pretrained Written Language Models for Neural Sign Language Translation." Information 13, no. 5 (April 23, 2022): 220. http://dx.doi.org/10.3390/info13050220.

Abstract:
We consider neural sign language translation: machine translation from signed to written languages using encoder–decoder neural networks. Translating sign language videos to written language text is especially complex because of the difference in modality between source and target language and, consequently, the required video processing. At the same time, sign languages are low-resource languages, their datasets dwarfed by those available for written languages. Recent advances in written language processing and success stories of transfer learning raise the question of how pretrained written language models can be leveraged to improve sign language translation. We apply the Frozen Pretrained Transformer (FPT) technique to initialize the encoder, decoder, or both, of a sign language translation model with parts of a pretrained written language model. We observe that the attention patterns transfer in zero-shot to the different modality and, in some experiments, we obtain higher scores (from 18.85 to 21.39 BLEU-4). Especially when gloss annotations are unavailable, FPTs can increase performance on unseen data. However, current models appear to be limited primarily by data quality and only then by data quantity, limiting potential gains with FPTs. Therefore, in further research, we will focus on improving the representations used as inputs to translation models.
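The freezing recipe described above can be sketched in a few lines; the model name "bert-base-uncased", the attribute paths, and the choice to leave layer norms and embeddings trainable are illustrative assumptions, not the authors' exact configuration.

```python
import torch.nn as nn
from transformers import AutoModel

# Load a pretrained written-language model (illustrative choice).
encoder = AutoModel.from_pretrained("bert-base-uncased")

# Freeze the transformer blocks (self-attention + feed-forward layers).
for layer in encoder.encoder.layer:
    for p in layer.parameters():
        p.requires_grad = False

# FPT-style recipes typically keep layer norms and embeddings trainable,
# along with any task-specific input/output layers added on top.
for module in encoder.modules():
    if isinstance(module, nn.LayerNorm):
        for p in module.parameters():
            p.requires_grad = True
for p in encoder.embeddings.parameters():
    p.requires_grad = True

trainable = sum(p.numel() for p in encoder.parameters() if p.requires_grad)
total = sum(p.numel() for p in encoder.parameters())
print(f"trainable parameters: {trainable} / {total}")
```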
5

Chang, Tyler A., and Benjamin K. Bergen. "Word Acquisition in Neural Language Models." Transactions of the Association for Computational Linguistics 10 (2022): 1–16. http://dx.doi.org/10.1162/tacl_a_00444.

Abstract:
We investigate how neural language models acquire individual words during training, extracting learning curves and ages of acquisition for over 600 words on the MacArthur-Bates Communicative Development Inventory (Fenson et al., 2007). Drawing on studies of word acquisition in children, we evaluate multiple predictors for words’ ages of acquisition in LSTMs, BERT, and GPT-2. We find that the effects of concreteness, word length, and lexical class are pointedly different in children and language models, reinforcing the importance of interaction and sensorimotor experience in child language acquisition. Language models rely far more on word frequency than children, but, like children, they exhibit slower learning of words in longer utterances. Interestingly, models follow consistent patterns during training for both unidirectional and bidirectional models, and for both LSTM and Transformer architectures. Models predict based on unigram token frequencies early in training, before transitioning loosely to bigram probabilities, eventually converging on more nuanced predictions. These results shed light on the role of distributional learning mechanisms in children, while also providing insights for more human-like language acquisition in language models.
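One simple way to turn such a learning curve into an 'age of acquisition', roughly in the spirit of the study above (the midpoint-threshold convention and all numbers below are our own simplification, not necessarily the authors' fitting procedure):

```python
from typing import List, Optional

def age_of_acquisition(steps: List[int],
                       surprisals: List[float],
                       baseline_surprisal: float) -> Optional[int]:
    """First training step at which a word's mean surprisal drops below the
    midpoint between a baseline (e.g., chance/unigram surprisal) and the
    model's final surprisal for that word."""
    threshold = (baseline_surprisal + surprisals[-1]) / 2.0
    for step, s in zip(steps, surprisals):
        if s <= threshold:
            return step
    return None  # the word never crosses the threshold

# Toy learning curve for one word (mean surprisal in bits per checkpoint).
steps = [100, 500, 1000, 5000, 10000, 50000]
curve = [14.0, 13.2, 11.5, 8.9, 7.1, 6.8]
print(age_of_acquisition(steps, curve, baseline_surprisal=14.0))  # -> 5000
```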
6

Mezzoudj, Freha, and Abdelkader Benyettou. "An empirical study of statistical language models: n-gram language models vs. neural network language models." International Journal of Innovative Computing and Applications 9, no. 4 (2018): 189. http://dx.doi.org/10.1504/ijica.2018.095762.

7

Mezzoudj, Freha, and Abdelkader Benyettou. "An empirical study of statistical language models: n-gram language models vs. neural network language models." International Journal of Innovative Computing and Applications 9, no. 4 (2018): 189. http://dx.doi.org/10.1504/ijica.2018.10016827.

8

Lau, Mandy. "Artificial intelligence language models and the false fantasy of participatory language policies." Working Papers in Applied Linguistics and Linguistics at York 1 (September 13, 2021): 4–15. http://dx.doi.org/10.25071/2564-2855.5.

Abstract:
Artificial intelligence neural language models learn from a corpus of online language data, often drawn directly from user-generated content through crowdsourcing or the gift economy, bypassing traditional keepers of language policy and planning (such as governments and institutions). Here lies the dream that the languages of the digital world can bend towards individual needs and wants, and not the traditional way around. Through the participatory language work of users, linguistic diversity, accessibility, personalization, and inclusion can be increased. However, the promise of a more participatory, just, and emancipatory language policy as a result of neural language models is a false fantasy. I argue that neural language models represent a covert and oppressive form of language policy that benefits the privileged and harms the marginalized. Here, I examine the ideology underpinning neural language models and investigate the harms that result from these emerging subversive regulatory bodies.
9

Qi, Kunxun, and Jianfeng Du. "Translation-Based Matching Adversarial Network for Cross-Lingual Natural Language Inference." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 8632–39. http://dx.doi.org/10.1609/aaai.v34i05.6387.

Abstract:
Cross-lingual natural language inference is a fundamental task in cross-lingual natural language understanding, widely addressed by neural models recently. Existing neural model based methods either align sentence embeddings between source and target languages, heavily relying on annotated parallel corpora, or exploit pre-trained cross-lingual language models that are fine-tuned on a single language and hard to transfer knowledge to another language. To resolve these limitations in existing methods, this paper proposes an adversarial training framework to enhance both pre-trained models and classical neural models for cross-lingual natural language inference. It trains on the union of data in the source language and data in the target language, learning language-invariant features to improve the inference performance. Experimental results on the XNLI benchmark demonstrate that three popular neural models enhanced by the proposed framework significantly outperform the original models.
10

Park, Myung-Kwan, Keonwoo Koo, Jaemin Lee, and Wonil Chung. "Investigating Syntactic Transfer from English to Korean in Neural L2 Language Models." Studies in Modern Grammar 121 (March 30, 2024): 177–201. http://dx.doi.org/10.14342/smog.2024.121.177.

Abstract:
This paper investigates how the grammatical knowledge obtained in the initial language (English) of neural language models (LMs) influences the learning of grammatical structures in their second language (Korean). To achieve this objective, we conduct the now well-established experimental procedure, including (i) pre-training transformer-based GPT-2 LMs with Korean and English datasets, (ii) further fine-tuning them with a specific set of Korean data as L1 or L2, and (iii) evaluating them with the test data of KBLiMP while analyzing their linguistic generalization in L1 or L2. We have found negative transfer effects in the comparison between English as L1 and Korean as L2. Furthermore, in the trajectory analysis, the second language-learning LM has captured linguistic features of Korean including syntax, syntax-semantics interface, and morphology during the progressive training step. Our study of second language learning in LMs contributes to predicting potential syntactic challenges arising from the interference by the L1 language during the learning of Korean as a foreign language.
11

Bayer, Ali Orkan, and Giuseppe Riccardi. "Semantic language models with deep neural networks." Computer Speech & Language 40 (November 2016): 1–22. http://dx.doi.org/10.1016/j.csl.2016.04.001.

12

Chuchupal, V. Y. "Neural language models for automatic speech recognition." Речевые технологии [Speech Technologies], no. 1-2 (2020): 27–47. http://dx.doi.org/10.58633/2305-8129_2020_1-2_27.

13

Tian, Yijun, Huan Song, Zichen Wang, Haozhu Wang, Ziqing Hu, Fang Wang, Nitesh V. Chawla, and Panpan Xu. "Graph Neural Prompting with Large Language Models." Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 17 (March 24, 2024): 19080–88. http://dx.doi.org/10.1609/aaai.v38i17.29875.

Abstract:
Large language models (LLMs) have shown remarkable generalization capability with exceptional performance in various language modeling tasks. However, they still exhibit inherent limitations in precisely capturing and returning grounded knowledge. While existing work has explored utilizing knowledge graphs (KGs) to enhance language modeling via joint training and customized model architectures, applying this to LLMs is problematic owing to their large number of parameters and high computational cost. Therefore, how to enhance pre-trained LLMs using grounded knowledge, e.g., retrieval-augmented generation, remains an open question. In this work, we propose Graph Neural Prompting (GNP), a novel plug-and-play method to assist pre-trained LLMs in learning beneficial knowledge from KGs. GNP encompasses various designs, including a standard graph neural network encoder, a cross-modality pooling module, a domain projector, and a self-supervised link prediction objective. Extensive experiments on multiple datasets demonstrate the superiority of GNP on both commonsense and biomedical reasoning tasks across different LLM sizes and settings. Code is available at https://github.com/meettyj/GNP.
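A rough sketch of the plug-and-play pattern described above: encode a retrieved KG subgraph, pool it, project it into the LLM embedding space, and prepend the result as a soft prompt. The module sizes, the single message-passing round, and the mean pooling are our assumptions; the linked repository contains the actual implementation.

```python
import torch
import torch.nn as nn

class GraphNeuralPrompt(nn.Module):
    """Tiny message-passing encoder over a KG subgraph, mean-pooled and
    projected to the LLM embedding width, yielding soft-prompt vectors."""
    def __init__(self, node_dim: int, hidden_dim: int, llm_dim: int, n_prompts: int = 4):
        super().__init__()
        self.msg = nn.Linear(node_dim, hidden_dim)
        self.upd = nn.Linear(node_dim + hidden_dim, hidden_dim)
        self.proj = nn.Linear(hidden_dim, llm_dim * n_prompts)  # "domain projector"
        self.n_prompts, self.llm_dim = n_prompts, llm_dim

    def forward(self, node_feats: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # One round of neighbourhood aggregation; adj is [N, N], row-normalized.
        neigh = adj @ torch.relu(self.msg(node_feats))
        h = torch.relu(self.upd(torch.cat([node_feats, neigh], dim=-1)))
        pooled = h.mean(dim=0)                       # pool across nodes
        return self.proj(pooled).view(self.n_prompts, self.llm_dim)

# Prepend the prompt vectors to the (frozen) LLM's input embeddings.
gnp = GraphNeuralPrompt(node_dim=64, hidden_dim=128, llm_dim=768)
nodes, adj = torch.randn(10, 64), torch.softmax(torch.randn(10, 10), dim=-1)
soft_prompt = gnp(nodes, adj)                        # [4, 768]
token_embeds = torch.randn(1, 16, 768)               # from the LLM embedding layer
inputs_embeds = torch.cat([soft_prompt.unsqueeze(0), token_embeds], dim=1)
print(inputs_embeds.shape)                           # torch.Size([1, 20, 768])
```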
14

Hale, John T., Luca Campanelli, Jixing Li, Shohini Bhattasali, Christophe Pallier, and Jonathan R. Brennan. "Neurocomputational Models of Language Processing." Annual Review of Linguistics 8, no. 1 (January 14, 2022): 427–46. http://dx.doi.org/10.1146/annurev-linguistics-051421-020803.

Abstract:
Efforts to understand the brain bases of language face the Mapping Problem: At what level do linguistic computations and representations connect to human neurobiology? We review one approach to this problem that relies on rigorously defined computational models to specify the links between linguistic features and neural signals. Such tools can be used to estimate linguistic predictions, model linguistic features, and specify a sequence of processing steps that may be quantitatively fit to neural signals collected while participants use language. Progress has been helped by advances in machine learning, attention to linguistically interpretable models, and openly shared data sets that allow researchers to compare and contrast a variety of models. We describe one such data set in detail in the Supplemental Appendix.
15

Klemen, Matej, and Slavko Zitnik. "Neural coreference resolution for Slovene language." Computer Science and Information Systems, no. 00 (2021): 60. http://dx.doi.org/10.2298/csis201120060k.

Abstract:
Coreference resolution systems aim to recognize and cluster together mentions of the same underlying entity. While there exists a large amount of research on widely spoken languages such as English and Chinese, research on coreference in other languages is comparably scarce. In this work we first present SentiCoref 1.0, a coreference resolution dataset for the Slovene language that is comparable to English-based corpora. Further, we conduct a series of analyses using various complex models that range from simple linear models to current state-of-the-art deep neural coreference approaches leveraging pre-trained contextual embeddings. Apart from SentiCoref, we also evaluate the models on the smaller coref149 Slovene dataset to justify the creation of the new corpus. We investigate the robustness of the models using cross-domain data and data augmentations. Models using contextual embeddings achieve the best results, with up to a 0.92 average F1 score on the SentiCoref dataset. Cross-domain experiments indicate that SentiCoref allows the models to learn more general patterns, which enables them to outperform models learned on coref149 only.
16

Schomacker, Thorben, and Marina Tropmann-Frick. "Language Representation Models: An Overview." Entropy 23, no. 11 (October 28, 2021): 1422. http://dx.doi.org/10.3390/e23111422.

Abstract:
In the last few decades, text mining has been used to extract knowledge from free texts. Applying neural networks and deep learning to natural language processing (NLP) tasks has led to many accomplishments for real-world language problems over the years. The developments of the last five years have resulted in techniques that have allowed for the practical application of transfer learning in NLP. The advances in the field have been substantial, and the milestone of outperforming human baseline performance based on the general language understanding evaluation has been achieved. This paper implements a targeted literature review to outline, describe, explain, and put into context the crucial techniques that helped achieve this milestone. The research presented here is a targeted review of neural language models that present vital steps towards a general language representation model.
17

Kunchukuttan, Anoop, Mitesh Khapra, Gurneet Singh, and Pushpak Bhattacharyya. "Leveraging Orthographic Similarity for Multilingual Neural Transliteration." Transactions of the Association for Computational Linguistics 6 (December 2018): 303–16. http://dx.doi.org/10.1162/tacl_a_00022.

Abstract:
We address the task of joint training of transliteration models for multiple language pairs (multilingual transliteration). This is an instance of multitask learning, where individual tasks (language pairs) benefit from sharing knowledge with related tasks. We focus on transliteration involving related tasks, i.e., languages sharing writing systems and phonetic properties (orthographically similar languages). We propose a modified neural encoder-decoder model that maximizes parameter sharing across language pairs in order to effectively leverage orthographic similarity. We show that multilingual transliteration significantly outperforms bilingual transliteration in different scenarios (average increase of 58% across a variety of languages we experimented with). We also show that multilingual transliteration models can generalize well to languages/language pairs not encountered during training and hence perform well on the zero-shot transliteration task. We show that further improvements can be achieved by using phonetic feature input.
18

Takahashi, Shuntaro, and Kumiko Tanaka-Ishii. "Evaluating Computational Language Models with Scaling Properties of Natural Language." Computational Linguistics 45, no. 3 (September 2019): 481–513. http://dx.doi.org/10.1162/coli_a_00355.

Abstract:
In this article, we evaluate computational models of natural language with respect to the universal statistical behaviors of natural language. Statistical mechanical analyses have revealed that natural language text is characterized by scaling properties, which quantify the global structure in the vocabulary population and the long memory of a text. We study whether five scaling properties (given by Zipf’s law, Heaps’ law, Ebeling’s method, Taylor’s law, and long-range correlation analysis) can serve for evaluation of computational models. Specifically, we test n-gram language models, a probabilistic context-free grammar, language models based on Simon/Pitman-Yor processes, neural language models, and generative adversarial networks for text generation. Our analysis reveals that language models based on recurrent neural networks with a gating mechanism (i.e., long short-term memory; a gated recurrent unit; and quasi-recurrent neural networks) are the only computational models that can reproduce the long memory behavior of natural language. Furthermore, through comparison with recently proposed model-based evaluation methods, we find that the exponent of Taylor’s law is a good indicator of model quality.
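Two of the scaling properties used in this evaluation are straightforward to estimate from any token stream, whether natural or model-generated. The sketch below uses plain log-log least-squares fits with no binning or significance testing, and "corpus.txt" is a placeholder for whatever text is being analyzed.

```python
import numpy as np
from collections import Counter

def zipf_exponent(tokens):
    """Slope of log(frequency) vs. log(rank); Zipf's law predicts roughly -1."""
    freqs = np.array(sorted(Counter(tokens).values(), reverse=True), dtype=float)
    ranks = np.arange(1, len(freqs) + 1, dtype=float)
    slope, _ = np.polyfit(np.log(ranks), np.log(freqs), 1)
    return slope

def heaps_exponent(tokens, n_points: int = 50):
    """Slope of log(vocabulary size) vs. log(text length); Heaps' law
    predicts a value between 0 and 1."""
    sizes = np.unique(np.linspace(1, len(tokens), n_points).astype(int))
    vocab = [len(set(tokens[:n])) for n in sizes]
    slope, _ = np.polyfit(np.log(sizes), np.log(vocab), 1)
    return slope

tokens = open("corpus.txt", encoding="utf-8").read().split()
print("Zipf exponent: ", zipf_exponent(tokens))
print("Heaps exponent:", heaps_exponent(tokens))
```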
19

Martin, Andrea E. "A Compositional Neural Architecture for Language." Journal of Cognitive Neuroscience 32, no. 8 (August 2020): 1407–27. http://dx.doi.org/10.1162/jocn_a_01552.

Abstract:
Hierarchical structure and compositionality imbue human language with unparalleled expressive power and set it apart from other perception–action systems. However, neither formal nor neurobiological models account for how these defining computational properties might arise in a physiological system. I attempt to reconcile hierarchy and compositionality with principles from cell assembly computation in neuroscience; the result is an emerging theory of how the brain could convert distributed perceptual representations into hierarchical structures across multiple timescales while representing interpretable incremental stages of (de)compositional meaning. The model's architecture—a multidimensional coordinate system based on neurophysiological models of sensory processing—proposes that a manifold of neural trajectories encodes sensory, motor, and abstract linguistic states. Gain modulation, including inhibition, tunes the path in the manifold in accordance with behavior and is how latent structure is inferred. As a consequence, predictive information about upcoming sensory input during production and comprehension is available without a separate operation. The proposed processing mechanism is synthesized from current models of neural entrainment to speech, concepts from systems neuroscience and category theory, and a symbolic-connectionist computational model that uses time and rhythm to structure information. I build on evidence from cognitive neuroscience and computational modeling that suggests a formal and mechanistic alignment between structure building and neural oscillations, and moves toward unifying basic insights from linguistics and psycholinguistics with the currency of neural computation.
20

Mukhamadiyev, Abdinabi, Mukhriddin Mukhiddinov, Ilyos Khujayarov, Mannon Ochilov, and Jinsoo Cho. "Development of Language Models for Continuous Uzbek Speech Recognition System." Sensors 23, no. 3 (January 19, 2023): 1145. http://dx.doi.org/10.3390/s23031145.

Abstract:
Automatic speech recognition systems with a large vocabulary and other natural language processing applications cannot operate without a language model. Most studies on pre-trained language models have focused on more popular languages such as English, Chinese, and various European languages, but there is no publicly available Uzbek speech dataset. Therefore, language models of low-resource languages need to be studied and created. The objective of this study is to address this limitation by developing a low-resource language model for the Uzbek language and understanding linguistic occurrences. We proposed the Uzbek language model named UzLM by examining the performance of statistical and neural-network-based language models that account for the unique features of the Uzbek language. Our Uzbek-specific linguistic representation allows us to construct more robust UzLM, utilizing 80 million words from various sources while using the same or fewer training words, as applied in previous studies. Roughly sixty-eight thousand different words and 15 million sentences were collected for the creation of this corpus. The experimental results of our tests on the continuous recognition of Uzbek speech show that, compared with manual encoding, the use of neural-network-based language models reduced the character error rate to 5.26%.
21

Oralbekova, Dina, Orken Mamyrbayev, Mohamed Othman, Dinara Kassymova, and Kuralai Mukhsina. "Contemporary Approaches in Evolving Language Models." Applied Sciences 13, no. 23 (December 1, 2023): 12901. http://dx.doi.org/10.3390/app132312901.

Abstract:
This article provides a comprehensive survey of contemporary language modeling approaches within the realm of natural language processing (NLP) tasks. This paper conducts an analytical exploration of diverse methodologies employed in the creation of language models. This exploration encompasses the architecture, training processes, and optimization strategies inherent in these models. The detailed discussion covers various models ranging from traditional n-gram and hidden Markov models to state-of-the-art neural network approaches such as BERT, GPT, LLAMA, and Bard. This article delves into different modifications and enhancements applied to both standard and neural network architectures for constructing language models. Special attention is given to addressing challenges specific to agglutinative languages within the context of developing language models for various NLP tasks, particularly for Arabic and Turkish. The research highlights that contemporary transformer-based methods demonstrate results comparable to those achieved by traditional methods employing Hidden Markov Models. These transformer-based approaches boast simpler configurations and exhibit faster performance during both training and analysis. An integral component of the article is the examination of popular and actively evolving libraries and tools essential for constructing language models. Notable tools such as NLTK, TensorFlow, PyTorch, and Gensim are reviewed, with a comparative analysis considering their simplicity and accessibility for implementing diverse language models. The aim is to provide readers with insights into the landscape of contemporary language modeling methodologies and the tools available for their implementation.
22

Hafeez, Rabab, Muhammad Waqas Anwar, Muhammad Hasan Jamal, Tayyaba Fatima, Julio César Martínez Espinosa, Luis Alonso Dzul López, Ernesto Bautista Thompson, and Imran Ashraf. "Contextual Urdu Lemmatization Using Recurrent Neural Network Models." Mathematics 11, no. 2 (January 13, 2023): 435. http://dx.doi.org/10.3390/math11020435.

Abstract:
In the field of natural language processing, machine translation is a colossally developing research area that helps humans communicate more effectively by bridging the linguistic gap. In machine translation, normalization and morphological analyses are the first and perhaps the most important modules for information retrieval (IR). To build a morphological analyzer, or to complete the normalization process, it is important to extract the correct root out of different words. Stemming and lemmatization are techniques commonly used to find the correct root words in a language. However, a few studies on IR systems for the Urdu language have shown that lemmatization is more effective than stemming due to infixes found in Urdu words. This paper presents a lemmatization algorithm based on recurrent neural network models for the Urdu language. However, lemmatization techniques for resource-scarce languages such as Urdu are not very common. The proposed model is trained and tested on two datasets, namely, the Urdu Monolingual Corpus (UMC) and the Universal Dependencies Corpus of Urdu (UDU). The datasets are lemmatized with the help of recurrent neural network models. The Word2Vec model and edit trees are used to generate semantic and syntactic embedding. Bidirectional long short-term memory (BiLSTM), bidirectional gated recurrent unit (BiGRU), bidirectional gated recurrent neural network (BiGRNN), and attention-free encoder–decoder (AFED) models are trained under defined hyperparameters. Experimental results show that the attention-free encoder-decoder model achieves an accuracy, precision, recall, and F-score of 0.96, 0.95, 0.95, and 0.95, respectively, and outperforms existing models.
23

Yogatama, Dani, Cyprien de Masson d’Autume, and Lingpeng Kong. "Adaptive Semiparametric Language Models." Transactions of the Association for Computational Linguistics 9 (2021): 362–73. http://dx.doi.org/10.1162/tacl_a_00371.

Abstract:
We present a language model that combines a large parametric neural network (i.e., a transformer) with a non-parametric episodic memory component in an integrated architecture. Our model uses extended short-term context by caching local hidden states—similar to transformer-XL—and global long-term memory by retrieving a set of nearest neighbor tokens at each timestep. We design a gating function to adaptively combine multiple information sources to make a prediction. This mechanism allows the model to use either local context, short-term memory, or long-term memory (or any combination of them) on an ad hoc basis depending on the context. Experiments on word-based and character-based language modeling datasets demonstrate the efficacy of our proposed method compared to strong baselines.
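The gating mechanism described above reduces, at prediction time, to a convex combination of three next-token distributions. A toy sketch follows; in the actual model the gate and the component distributions are computed from hidden states rather than supplied as constants.

```python
import torch
import torch.nn.functional as F

def mix_distributions(p_model: torch.Tensor,
                      p_cache: torch.Tensor,
                      p_knn: torch.Tensor,
                      gate_logits: torch.Tensor) -> torch.Tensor:
    """Adaptive mixture of parametric, cache-based, and retrieval-based
    next-token distributions via a 3-way softmax gate."""
    g = F.softmax(gate_logits, dim=-1)
    return g[0] * p_model + g[1] * p_cache + g[2] * p_knn

vocab = 8
p_model = F.softmax(torch.randn(vocab), dim=-1)   # transformer prediction
p_cache = F.softmax(torch.randn(vocab), dim=-1)   # from cached local hidden states
p_knn   = F.softmax(torch.randn(vocab), dim=-1)   # from retrieved nearest neighbours
p_final = mix_distributions(p_model, p_cache, p_knn, torch.tensor([1.0, 0.0, -1.0]))
print(p_final.sum())                              # ~1.0: still a valid distribution
```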
24

Tinn, Robert, Hao Cheng, Yu Gu, Naoto Usuyama, Xiaodong Liu, Tristan Naumann, Jianfeng Gao, and Hoifung Poon. "Fine-tuning large neural language models for biomedical natural language processing." Patterns 4, no. 4 (April 2023): 100729. http://dx.doi.org/10.1016/j.patter.2023.100729.

25

Choi, Sunjoo, Myung-Kwan Park, and Euhee Kim. "How are Korean Neural Language Models ‘surprised’ Layerwisely?" Journal of Language Sciences 28, no. 4 (November 30, 2021): 301–17. http://dx.doi.org/10.14384/kals.2021.28.4.301.

26

Zhang, Peng, Wenjie Hui, Benyou Wang, Donghao Zhao, Dawei Song, Christina Lioma, and Jakob Grue Simonsen. "Complex-valued Neural Network-based Quantum Language Models." ACM Transactions on Information Systems 40, no. 4 (October 31, 2022): 1–31. http://dx.doi.org/10.1145/3505138.

Abstract:
Language modeling is essential in Natural Language Processing and Information Retrieval related tasks. After the statistical language models, Quantum Language Model (QLM) has been proposed to unify both single words and compound terms in the same probability space without extending term space exponentially. Although QLM achieved good performance in ad hoc retrieval, it still has two major limitations: (1) QLM cannot make use of supervised information, mainly due to the iterative and non-differentiable estimation of the density matrix, which represents both queries and documents in QLM. (2) QLM assumes the exchangeability of words or word dependencies, neglecting the order or position information of words. This article aims to generalize QLM and make it applicable to more complicated matching tasks (e.g., Question Answering) beyond ad hoc retrieval. We propose a complex-valued neural network-based QLM solution called C-NNQLM to employ an end-to-end approach to build and train density matrices in a light-weight and differentiable manner, and it can therefore make use of external well-trained word vectors and supervised labels. Furthermore, C-NNQLM adopts complex-valued word vectors whose phase vectors can directly encode the order (or position) information of words. Note that complex numbers are also essential in the quantum theory. We show that the real-valued NNQLM (R-NNQLM) is a special case of C-NNQLM. The experimental results on the QA task show that both R-NNQLM and C-NNQLM achieve much better performance than the vanilla QLM, and C-NNQLM’s performance is on par with state-of-the-art neural network models. We also evaluate the proposed C-NNQLM on text classification and document retrieval tasks. The results on most datasets show that the C-NNQLM can outperform R-NNQLM, which demonstrates the usefulness of the complex representation for words and sentences in C-NNQLM.
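To make the density-matrix vocabulary concrete, here is a minimal sketch of building one from complex-valued word vectors, treating unit-normalized vectors as pure states mixed with word weights; the phase-based position encoding and all training machinery from the paper are omitted.

```python
import numpy as np

def density_matrix(word_vectors: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """rho = sum_i w_i |v_i><v_i| over unit-normalized complex word vectors,
    with the weights summing to 1; rho is Hermitian, positive semidefinite,
    and has unit trace."""
    dim = word_vectors.shape[1]
    rho = np.zeros((dim, dim), dtype=complex)
    for w, v in zip(weights, word_vectors):
        v = v / np.linalg.norm(v)
        rho += w * np.outer(v, v.conj())
    return rho

rng = np.random.default_rng(0)
n_words, dim = 3, 4
vecs = rng.normal(size=(n_words, dim)) + 1j * rng.normal(size=(n_words, dim))
rho = density_matrix(vecs, np.full(n_words, 1.0 / n_words))
print(np.trace(rho).real)              # -> 1.0
print(np.allclose(rho, rho.conj().T))  # -> True (Hermitian)
```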
27

Tanaka, Tomohiro, Ryo Masumura, and Takanobu Oba. "Neural candidate-aware language models for speech recognition." Computer Speech & Language 66 (March 2021): 101157. http://dx.doi.org/10.1016/j.csl.2020.101157.

28

Kong, Weirui, Hyeju Jang, Giuseppe Carenini, and Thalia S. Field. "Exploring neural models for predicting dementia from language." Computer Speech & Language 68 (July 2021): 101181. http://dx.doi.org/10.1016/j.csl.2020.101181.

29

Phan, Tien D., and Nur Zincir‐Heywood. "User identification via neural network based language models." International Journal of Network Management 29, no. 3 (October 30, 2018): e2049. http://dx.doi.org/10.1002/nem.2049.

30

Karyukin, Vladislav, Diana Rakhimova, Aidana Karibayeva, Aliya Turganbayeva, and Asem Turarbek. "The neural machine translation models for the low-resource Kazakh–English language pair." PeerJ Computer Science 9 (February 8, 2023): e1224. http://dx.doi.org/10.7717/peerj-cs.1224.

Abstract:
The development of the machine translation field was driven by people’s need to communicate with each other globally by automatically translating words, sentences, and texts from one language into another. The neural machine translation approach has become one of the most significant in recent years. This approach requires large parallel corpora, which are not available for low-resource languages such as Kazakh, making it difficult to achieve high performance with neural machine translation models. This article explores the existing methods for dealing with low-resource languages by artificially increasing the size of the corpora and improving the performance of Kazakh–English machine translation models. These methods are called forward translation, backward translation, and transfer learning. The Sequence-to-Sequence (recurrent neural network and bidirectional recurrent neural network) and Transformer neural machine translation architectures, with their features and specifications, are then considered for conducting experiments in training models on parallel corpora. The experimental part focuses on building translation models for the high-quality translation of formal social, political, and scientific texts, using synthetic parallel sentences generated from existing monolingual data in the Kazakh language with the forward translation approach and combining them with parallel corpora parsed from official government websites. A total corpus of 380,000 parallel Kazakh–English sentences is used to train the recurrent neural network, bidirectional recurrent neural network, and Transformer models of the OpenNMT framework. The quality of the trained models is evaluated with the BLEU, WER, and TER metrics, and sample translations are also analyzed. The RNN and BRNN models produced more precise translations than the Transformer model, and the Byte-Pair Encoding tokenization technique yielded better metric scores and translations than word tokenization. The bidirectional recurrent neural network with Byte-Pair Encoding showed the best performance, with 0.49 BLEU, 0.51 WER, and 0.45 TER.
31

Budaya, I. Gede Bintang Arya, Made Windu Antara Kesiman, and I. Made Gede Sunarya. "The Influence of Word Vectorization for Kawi Language to Indonesian Language Neural Machine Translation." Journal of Information Technology and Computer Science 7, no. 1 (September 29, 2022): 81–93. http://dx.doi.org/10.25126/jitecs.202271387.

Abstract:
People commonly use machine translation to learn textual knowledge beyond their native language. Robust machine translation services such as Google Translate already exist; however, their language lists cover only high-resource languages such as English and French, not the Kawi language, one of the local languages used in Bali's old works of literature. It is therefore necessary to study machine translation from the Kawi language into a language with more active users, such as Indonesian, to give young learners easier access. This research developed neural machine translation (NMT) models using recurrent neural network (RNN) based architectures and analyzed the influence of Word2Vec word vectorization on machine translation performance based on BLEU scores. The results show that word vectorization indeed significantly increases NMT model performance, and Long Short-Term Memory (LSTM) with an attention mechanism achieves the highest BLEU score of 20.86. The NMT models still could not achieve BLEU scores on par with those of human experts and high-resource-language machine translation. Nevertheless, this initial study can serve as a reference for the future development of Kawi-to-Indonesian NMT.
32

Wu, Yi-Chao, Fei Yin, and Cheng-Lin Liu. "Improving handwritten Chinese text recognition using neural network language models and convolutional neural network shape models." Pattern Recognition 65 (May 2017): 251–64. http://dx.doi.org/10.1016/j.patcog.2016.12.026.

33

Studenikina, Kseniia Andreevna. "Evaluation of neural models’ linguistic competence: evidence from Russian predicate agreement." Proceedings of the Institute for System Programming of the RAS 34, no. 6 (2022): 178–84. http://dx.doi.org/10.15514/ispras-2022-34(6)-14.

Abstract:
This study investigates the linguistic competence of modern language models. Artificial neural networks demonstrate high quality in many natural language processing tasks; however, their implicit grammar knowledge remains unstudied. The ability to judge a sentence as grammatical or ungrammatical is regarded as a key property of human linguistic competence. We suppose that language models’ grammar knowledge likewise shows up in their ability to judge the grammaticality of a sentence. In order to test neural networks’ linguistic competence, we probe their acquisition of predicate number agreement in Russian. A dataset consisting of artificially generated grammatical and ungrammatical sentences was created to train the language models. Automatic sentence generation allows us to test the acquisition of a particular language phenomenon while abstracting away from vocabulary and pragmatic differences. We use transfer learning of pre-trained neural networks. The results show that all the considered models demonstrate high accuracy and Matthews correlation coefficient values, which can be attributed to successful acquisition of predicate agreement rules. The classification quality is reduced for sentences with inanimate nouns, which show nominative-accusative case syncretism. The complexity of the syntactic structure turns out to be significant for the Russian models and a model for Slavic languages, but it does not affect the error distribution of multilingual models.
34

Goldberg, Yoav. "A Primer on Neural Network Models for Natural Language Processing." Journal of Artificial Intelligence Research 57 (November 20, 2016): 345–420. http://dx.doi.org/10.1613/jair.4992.

Abstract:
Over the past few years, neural networks have re-emerged as powerful machine-learning models, yielding state-of-the-art results in fields such as image recognition and speech processing. More recently, neural network models started to be applied also to textual natural language signals, again with very promising results. This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques. The tutorial covers input encoding for natural language tasks, feed-forward networks, convolutional networks, recurrent networks and recursive networks, as well as the computation graph abstraction for automatic gradient computation.
35

Babić, Karlo, Sanda Martinčić-Ipšić, and Ana Meštrović. "Survey of Neural Text Representation Models." Information 11, no. 11 (October 30, 2020): 511. http://dx.doi.org/10.3390/info11110511.

Abstract:
In natural language processing, text needs to be transformed into a machine-readable representation before any processing. The quality of further natural language processing tasks greatly depends on the quality of those representations. In this survey, we systematize and analyze 50 neural models from the last decade. The models described are grouped by the architecture of neural networks as shallow, recurrent, recursive, convolutional, and attention models. Furthermore, we categorize these models by representation level, input level, model type, and model supervision. We focus on task-independent representation models, discuss their advantages and drawbacks, and subsequently identify the promising directions for future neural text representation models. We describe the evaluation datasets and tasks used in the papers that introduced the models and compare the models based on relevant evaluations. The quality of a representation model can be evaluated as its capability to generalize to multiple unrelated tasks. Benchmark standardization is visible amongst recent models and the number of different tasks models are evaluated on is increasing.
36

Hahn, Michael. "Theoretical Limitations of Self-Attention in Neural Sequence Models." Transactions of the Association for Computational Linguistics 8 (July 2020): 156–71. http://dx.doi.org/10.1162/tacl_a_00306.

Abstract:
Transformers are emerging as the new workhorse of NLP, showing great success across tasks. Unlike LSTMs, transformers process input sequences entirely through self-attention. Previous work has suggested that the computational capabilities of self-attention to process hierarchical structures are limited. In this work, we mathematically investigate the computational power of self-attention to model formal languages. Across both soft and hard attention, we show strong theoretical limitations of the computational abilities of self-attention, finding that it cannot model periodic finite-state languages, nor hierarchical structure, unless the number of layers or heads increases with input length. These limitations seem surprising given the practical success of self-attention and the prominent role assigned to hierarchical structure in linguistics, suggesting that natural language can be approximated well with models that are too weak for the formal languages typically assumed in theoretical linguistics.
37

Yoo, YongSuk, and Kang-moon Park. "Developing Language-Specific Models Using a Neural Architecture Search." Applied Sciences 11, no. 21 (November 3, 2021): 10324. http://dx.doi.org/10.3390/app112110324.

Abstract:
This paper applies the neural architecture search (NAS) method to Korean and English grammaticality judgment tasks. Building on previous research, which only discusses the application of NAS to a Korean dataset, we extend the method to English grammatical tasks and compare the two resulting architectures for Korean and English. Since complex syntactic operations underlie the word order that is computed, the two architectures that emerge from automated NAS language modeling provide an interesting testbed for future research. To the best of our knowledge, the methodology adopted here has not been tested in the literature. Crucially, the resulting structure of the NAS application shows a design unexpected by human experts. Furthermore, NAS has generated different models for Korean and English, which involve different syntactic operations.
38

Cangelosi, Angelo. "The emergence of language: neural and adaptive agent models." Connection Science 17, no. 3-4 (September 2005): 185–90. http://dx.doi.org/10.1080/09540090500177471.

39

Zamora-Martínez, F., V. Frinken, S. España-Boquera, M. J. Castro-Bleda, A. Fischer, and H. Bunke. "Neural network language models for off-line handwriting recognition." Pattern Recognition 47, no. 4 (April 2014): 1642–52. http://dx.doi.org/10.1016/j.patcog.2013.10.020.

40

Shi, Yangyang, Martha Larson, Joris Pelemans, Catholijn M. Jonker, Patrick Wambacq, Pascal Wiggers, and Kris Demuynck. "Integrating meta-information into recurrent neural network language models." Speech Communication 73 (October 2015): 64–80. http://dx.doi.org/10.1016/j.specom.2015.06.006.

41

Lalrempuii, Candy, Badal Soni, and Partha Pakray. "An Improved English-to-Mizo Neural Machine Translation." ACM Transactions on Asian and Low-Resource Language Information Processing 20, no. 4 (May 26, 2021): 1–21. http://dx.doi.org/10.1145/3445974.

Abstract:
Machine Translation is an effort to bridge language barriers and misinterpretations, making communication more convenient through the automatic translation of languages. The quality of translations produced by corpus-based approaches predominantly depends on the availability of a large parallel corpus. Although machine translation of many Indian languages has progressively gained attention, there is very limited research on machine translation and the challenges of using various machine translation techniques for a low-resource language such as Mizo. In this article, we have implemented and compared statistical-based approaches with modern neural-based approaches for the English–Mizo language pair. We have experimented with different tokenization methods, architectures, and configurations. The performance of translations predicted by the trained models has been evaluated using automatic and human evaluation measures. Furthermore, we have analyzed the prediction errors of the models and the quality of predictions based on variations in sentence length and compared the model performance with the existing baselines.
42

Ananthanarayana, Tejaswini, Priyanshu Srivastava, Akash Chintha, Akhil Santha, Brian Landy, Joseph Panaro, Andre Webster, et al. "Deep Learning Methods for Sign Language Translation." ACM Transactions on Accessible Computing 14, no. 4 (December 31, 2021): 1–30. http://dx.doi.org/10.1145/3477498.

Abstract:
Many sign languages are bona fide natural languages with grammatical rules and lexicons hence can benefit from machine translation methods. Similarly, since sign language is a visual-spatial language, it can also benefit from computer vision methods for encoding it. With the advent of deep learning methods in recent years, significant advances have been made in natural language processing (specifically neural machine translation) and in computer vision methods (specifically image and video captioning). Researchers have therefore begun expanding these learning methods to sign language understanding. Sign language interpretation is especially challenging, because it involves a continuous visual-spatial modality where meaning is often derived based on context. The focus of this article, therefore, is to examine various deep learning–based methods for encoding sign language as inputs, and to analyze the efficacy of several machine translation methods, over three different sign language datasets. The goal is to determine which combinations are sufficiently robust for sign language translation without any gloss-based information. To understand the role of the different input features, we perform ablation studies over the model architectures (input features + neural translation models) for improved continuous sign language translation. These input features include body and finger joints, facial points, as well as vector representations/embeddings from convolutional neural networks. The machine translation models explored include several baseline sequence-to-sequence approaches, more complex and challenging networks using attention, reinforcement learning, and the transformer model. We implement the translation methods over multiple sign languages—German (GSL), American (ASL), and Chinese sign languages (CSL). From our analysis, the transformer model combined with input embeddings from ResNet50 or pose-based landmark features outperformed all the other sequence-to-sequence models by achieving higher BLEU2-BLEU4 scores when applied to the controlled and constrained GSL benchmark dataset. These combinations also showed significant promise on the other less controlled ASL and CSL datasets.
43

Karrupusamy, P. "Analysis of Neural Network Based Language Modeling." Journal of Artificial Intelligence and Capsule Networks 2, no. 1 (March 30, 2020): 53–63. http://dx.doi.org/10.36548/jaicn.2020.1.006.

Abstract:
Language modelling, usually referred to as statistical language modelling, is a fundamental and core process of natural language processing. It is also vital to related tasks such as sentence completion, automatic speech recognition, statistical machine translation, and text generation. The success of viable natural language processing relies heavily on the quality of the language model. Language modelling has previously been studied in fields such as linguistics, psychology, speech recognition, data compression, neuroscience, and machine translation. Since neural networks are a very good choice for high-quality language modelling, this paper presents an analysis of neural networks for modelling language. Using datasets such as the Penn Treebank, the Billion Word Benchmark, and WikiText, the neural network models are evaluated on word error rate, perplexity, and bilingual evaluation understudy (BLEU) scores to identify the optimal model.
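Since the evaluation above rests on perplexity and word error rate, here is a quick reminder of how the two numbers are computed; these are generic textbook formulas, not code from the paper.

```python
import math

def perplexity(token_log_probs):
    """exp of the average negative log-likelihood per token (natural log)."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance (substitutions + insertions + deletions)
    divided by the number of reference words."""
    r, h = reference.split(), hypothesis.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[len(r)][len(h)] / len(r)

print(perplexity([-2.3, -0.7, -1.5]))                      # ~4.48
print(word_error_rate("the cat sat", "the cat sat down"))  # ~0.33
```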
44

Karrupusamy, P. "Analysis of Neural Network Based Language Modeling." Journal of Artificial Intelligence and Capsule Networks 2, no. 1 (March 30, 2020): 53–63. http://dx.doi.org/10.36548/jaicn.2020.3.006.

Abstract:
Language modelling, usually referred to as statistical language modelling, is a fundamental and core process of natural language processing. It is also vital to related tasks such as sentence completion, automatic speech recognition, statistical machine translation, and text generation. The success of viable natural language processing relies heavily on the quality of the language model. Language modelling has previously been studied in fields such as linguistics, psychology, speech recognition, data compression, neuroscience, and machine translation. Since neural networks are a very good choice for high-quality language modelling, this paper presents an analysis of neural networks for modelling language. Using datasets such as the Penn Treebank, the Billion Word Benchmark, and WikiText, the neural network models are evaluated on word error rate, perplexity, and bilingual evaluation understudy (BLEU) scores to identify the optimal model.
45

Arisoy, Ebru, Stanley F. Chen, Bhuvana Ramabhadran, and Abhinav Sethy. "Converting Neural Network Language Models into Back-off Language Models for Efficient Decoding in Automatic Speech Recognition." IEEE/ACM Transactions on Audio, Speech, and Language Processing 22, no. 1 (January 2014): 184–92. http://dx.doi.org/10.1109/taslp.2013.2286919.

46

Rijhwani, Shruti, Jiateng Xie, Graham Neubig, and Jaime Carbonell. "Zero-Shot Neural Transfer for Cross-Lingual Entity Linking." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 6924–31. http://dx.doi.org/10.1609/aaai.v33i01.33016924.

Abstract:
Cross-lingual entity linking maps an entity mention in a source language to its corresponding entry in a structured knowledge base that is in a different (target) language. While previous work relies heavily on bilingual lexical resources to bridge the gap between the source and the target languages, these resources are scarce or unavailable for many low-resource languages. To address this problem, we investigate zero-shot cross-lingual entity linking, in which we assume no bilingual lexical resources are available in the source low-resource language. Specifically, we propose pivot-based entity linking, which leverages information from a high-resource “pivot” language to train character-level neural entity linking models that are transferred to the source low-resource language in a zero-shot manner. With experiments on 9 low-resource languages and transfer through a total of 54 languages, we show that our proposed pivot-based framework improves entity linking accuracy 17% (absolute) on average over the baseline systems, for the zero-shot scenario. Further, we also investigate the use of language-universal phonological representations which improves average accuracy (absolute) by 36% when transferring between languages that use different scripts.
47

Demeter, David, and Doug Downey. "Just Add Functions: A Neural-Symbolic Language Model." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7634–42. http://dx.doi.org/10.1609/aaai.v34i05.6264.

Abstract:
Neural network language models (NNLMs) have achieved ever-improving accuracy due to more sophisticated architectures and increasing amounts of training data. However, the inductive bias of these models (formed by the distributional hypothesis of language), while ideally suited to modeling most running text, results in key limitations for today's models. In particular, the models often struggle to learn certain spatial, temporal, or quantitative relationships, which are commonplace in text and are second-nature for human readers. Yet, in many cases, these relationships can be encoded with simple mathematical or logical expressions. How can we augment today's neural models with such encodings? In this paper, we propose a general methodology to enhance the inductive bias of NNLMs by incorporating simple functions into a neural architecture to form a hierarchical neural-symbolic language model (NSLM). These functions explicitly encode symbolic deterministic relationships to form probability distributions over words. We explore the effectiveness of this approach on numbers and geographic locations, and show that NSLMs significantly reduce perplexity in small-corpus language modeling, and that the performance improvement persists for rare tokens even on much larger corpora. The approach is simple and general, and we discuss how it can be applied to other word classes beyond numbers and geography.
48

Kipyatkova, Irina, and Ildar Kagirov. "Deep Models for Low-Resourced Speech Recognition: Livvi-Karelian Case." Mathematics 11, no. 18 (September 5, 2023): 3814. http://dx.doi.org/10.3390/math11183814.

Abstract:
Recently, there has been a growth in the number of studies addressing the automatic processing of low-resource languages. The lack of speech and text data significantly hinders the development of speech technologies for such languages. This paper introduces an automatic speech recognition system for Livvi-Karelian. Acoustic models based on artificial neural networks with time delays and hidden Markov models were trained using a limited speech dataset of 3.5 h. To augment the data, pitch and speech rate perturbation, SpecAugment, and their combinations were employed. Language models based on 3-grams and neural networks were trained using written texts and transcripts. The achieved word error rate of 22.80% is comparable to that reported for other low-resource languages. To the best of our knowledge, this is the first speech recognition system for Livvi-Karelian. The results can be of significance for the development of automatic speech recognition and machine translation systems, not only for Livvi-Karelian but also for other low-resource languages. Future work includes experiments with Karelian data using techniques such as transfer learning and DNN language models.
49

Gerz, Daniela, Ivan Vulić, Edoardo Ponti, Jason Naradowsky, Roi Reichart, and Anna Korhonen. "Language Modeling for Morphologically Rich Languages: Character-Aware Modeling for Word-Level Prediction." Transactions of the Association for Computational Linguistics 6 (December 2018): 451–65. http://dx.doi.org/10.1162/tacl_a_00032.

Abstract:
Neural architectures are prominent in the construction of language models (LMs). However, word-level prediction is typically agnostic of subword-level information (characters and character sequences) and operates over a closed vocabulary, consisting of a limited word set. Indeed, while subword-aware models boost performance across a variety of NLP tasks, previous work did not evaluate the ability of these models to assist next-word prediction in language modeling tasks. Such subword-level informed models should be particularly effective for morphologically-rich languages (MRLs) that exhibit high type-to-token ratios. In this work, we present a large-scale LM study on 50 typologically diverse languages covering a wide variety of morphological systems, and offer new LM benchmarks to the community, while considering subword-level information. The main technical contribution of our work is a novel method for injecting subword-level information into semantic word vectors, integrated into the neural language modeling training, to facilitate word-level prediction. We conduct experiments in the LM setting where the number of infrequent words is large, and demonstrate strong perplexity gains across our 50 languages, especially for morphologically-rich languages. Our code and data sets are publicly available.
50

Johnson, Melvin, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, et al. "Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation." Transactions of the Association for Computational Linguistics 5 (December 2017): 339–51. http://dx.doi.org/10.1162/tacl_a_00065.

Abstract:
We propose a simple solution to use a single Neural Machine Translation (NMT) model to translate between multiple languages. Our solution requires no changes to the model architecture from a standard NMT system but instead introduces an artificial token at the beginning of the input sentence to specify the required target language. Using a shared wordpiece vocabulary, our approach enables Multilingual NMT systems using a single model. On the WMT’14 benchmarks, a single multilingual model achieves comparable performance for English→French and surpasses state-of-the-art results for English→German. Similarly, a single multilingual model surpasses state-of-the-art results for French→English and German→English on WMT’14 and WMT’15 benchmarks, respectively. On production corpora, multilingual models of up to twelve language pairs allow for better translation of many individual pairs. Our models can also learn to perform implicit bridging between language pairs never seen explicitly during training, showing that transfer learning and zero-shot translation is possible for neural translation. Finally, we show analyses that hint at a universal interlingua representation in our models and also show some interesting examples when mixing languages.
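The artificial-token trick is simple enough to show inline. The sketch below illustrates the preprocessing step; the <2xx> spelling is an illustrative convention, and the paper describes the idea rather than this exact code.

```python
def add_target_token(source_sentence: str, target_lang: str) -> str:
    """Prepend an artificial token telling a single multilingual NMT model
    which language to translate into; the model architecture is unchanged."""
    return f"<2{target_lang}> {source_sentence}"

print(add_target_token("How are you?", "es"))  # -> "<2es> How are you?"
print(add_target_token("How are you?", "de"))  # -> "<2de> How are you?"
```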