Dissertations / Theses on the topic 'Large language model'
Consult the top 33 dissertations / theses for your research on the topic 'Large language model.'
Jiang, Yuandong. "Large Scale Distributed Semantic N-gram Language Model." Wright State University / OhioLINK, 2011. http://rave.ohiolink.edu/etdc/view?acc_num=wright1316200173.
Tang, Haijiang. "Building phrase based language model from large corpus." 2002. http://library.ust.hk/cgi/db/thesis.pl?ELEC%202002%20TANG.
Includes bibliographical references (leaves 74-79). Also available in electronic version. Access restricted to campus users.
McGreevy, Michael. "Statistical language modelling for large vocabulary speech recognition." Thesis, Queensland University of Technology, 2006. https://eprints.qut.edu.au/16444/1/Michael_McGreevy_Thesis.pdf.
McGreevy, Michael. "Statistical language modelling for large vocabulary speech recognition." Queensland University of Technology, 2006. http://eprints.qut.edu.au/16444/.
Tan, Ming. "A Large Scale Distributed Syntactic, Semantic and Lexical Language Model for Machine Translation." Wright State University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=wright1386111950.
Susman, Derya. "Turkish Large Vocabulary Continuous Speech Recognition By Using Limited Audio Corpus." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614207/index.pdf.
Comez, Murat Ali. "Large Vocabulary Continuous Speech Recogniton For Turkish Using Htk." Master's thesis, METU, 2003. http://etd.lib.metu.edu.tr/upload/1205491/index.pdf.
Sagen, Markus. "Large-Context Question Answering with Cross-Lingual Transfer." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-440704.
Uzelac, Lawrence Stevan. "A Multiple Coupled Microstrip Transmission Line Model for High-Speed VLSI Interconnect Simulation." PDXScholar, 1991. https://pdxscholar.library.pdx.edu/open_access_etds/4526.
Labeau, Matthieu. "Neural language models : Dealing with large vocabularies." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS313/document.
This work investigates practical methods to ease training and improve the performance of neural language models with large vocabularies. The main limitation of neural language models is their computational cost, which grows linearly with the size of the vocabulary. Despite several training tricks, the most straightforward way to limit computation time is to limit the vocabulary size, which is not a satisfactory solution for many tasks. Most existing methods for training large-vocabulary language models revolve around avoiding the computation of the partition function, which ensures that output scores are normalized into a probability distribution. Here, we focus on sampling-based approaches, including importance sampling and noise contrastive estimation, which allow an approximate computation of the partition function. After examining the mechanism of self-normalization in noise contrastive estimation, we first propose to improve its efficiency with solutions adapted to the inner workings of the method, and show experimentally that they considerably ease training. Our second contribution is to expand on a generalization of several sampling-based objectives as Bregman divergences, in order to experiment with new objectives. We use Beta divergences to derive a set of objectives of which noise contrastive estimation is a particular case. Finally, we aim to improve performance on full-vocabulary language models by augmenting output word representations with subwords. We experiment on a Czech dataset and show that using character-based representations alongside word embeddings for output representations gives better results. We also show that reducing the size of the output look-up table improves results even more.
Zervakis, Georgios. "Enriching large language models with semantic lexicons and analogies." Electronic Thesis or Diss., Université de Lorraine, 2023. http://www.theses.fr/2023LORR0039.
Recent advances in deep learning and neural networks have made it possible to address complex natural language processing tasks, which find application in a plethora of real-world problems ranging from smart assistants in mobile devices to the prediction of cancer. Nonetheless, modern systems based on these frameworks exhibit various limitations that may compromise their performance and trustworthiness, render them unfair towards minorities, or subject them to privacy leakage. It is our belief that integrating symbolic knowledge and reasoning into the deep learning framework is a necessary step towards addressing these limitations. For example, lexical resources can enrich deep neural networks with semantic or syntactic knowledge, and logical rules can provide learning and reasoning mechanisms. Therefore, the scope of this thesis is to develop and evaluate ways of integrating different types of symbolic knowledge and reasoning into a widely used language model, Bidirectional Encoder Representations from Transformers (BERT). In a first stage, we consider retrofitting, a simple and popular technique for refining distributional word embeddings based on relations coming from a semantic lexicon. Inspired by this technique, we present two methods for incorporating this knowledge into BERT contextualized embeddings. We evaluate these methods on three biomedical datasets for relation extraction and one movie review dataset for sentiment analysis, and show that they do not substantially impact the performance for these tasks. Furthermore, we conduct a qualitative analysis to provide further insights on this negative result. In a second stage, we integrate analogical reasoning with BERT as a means to improve its performance on the target sense verification task and make it more robust. To do so, we reformulate target sense verification as an analogy detection task. We present a hybrid model that combines BERT, which encodes the input data into quadruples, with a convolutional neural classifier that decides whether they constitute valid analogies. We test our system on a benchmark dataset and show that it can outperform existing approaches. Our empirical study shows the importance of the input encoding for BERT, and how this dependence is alleviated by integrating the axiomatic properties of analogies during training, while preserving performance and improving robustness.
Chadha, Vikrampal. "Simulation of large-scale system-level models." Thesis, This resource online, 1994. http://scholar.lib.vt.edu/theses/available/etd-12162009-020334/.
Kropff, Emilio. "Statistical and dynamical properties of large cortical network models: insights into semantic memory and language." Doctoral thesis, SISSA, 2007. http://hdl.handle.net/20.500.11767/4639.
Zhao, Ying. "Effective Authorship Attribution in Large Document Collections." RMIT University. Computer Science and Information Technology, 2008. http://adt.lib.rmit.edu.au/adt/public/adt-VIT20080730.162501.
Hittner, Brian Edward. "Rendering large-scale terrain models and positioning objects in relation to 3D terrain." Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2003. http://library.nps.navy.mil/uhtbin/hyperion-image/03Dec%5FHittner.pdf.
Thesis advisor(s): Don Brutzman, Curt Blais. Includes bibliographical references (p. 117-118). Also available online.
Pan, Bi-Yu. "Hierarchical test generation for VHDL behavioral models." Thesis, This resource online, 1992. http://scholar.lib.vt.edu/theses/available/etd-09052009-040449/.
West, James F. "An examination of the application of design metrics to the development of testing strategies in large-scale SDL models." Virtual Press, 2000. http://liblink.bsu.edu/uhtbin/catkey/1191725.
Department of Computer Science
Kapoor, Shekhar. "Process level test generation for VHDL behavioral models." Thesis, This resource online, 1994. http://scholar.lib.vt.edu/theses/available/etd-05022009-040753/.
Narayanaswamy, Sathyanarayanan. "Development of VHDL behavioral models with back annotated timing." Thesis, This resource online, 1994. http://scholar.lib.vt.edu/theses/available/etd-06112009-063442/.
Kubalík, Jakub. "Mining of Textual Data from the Web for Speech Recognition." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2010. http://www.nusl.cz/ntk/nusl-237170.
Münzner, Ulrike Tatjana Elisabeth. "From birth to birth A cell cycle control network of S. cerevisiae." Doctoral thesis, Humboldt-Universität zu Berlin, 2017. http://dx.doi.org/10.18452/18566.
The survival of a species depends on the correct transmission of an intact genome from one generation to the next. The cell cycle regulates this process, and its correct execution is vital for the survival of a species. The cell cycle is subject to a strict control mechanism that ensures accurate cell cycle progression, as aberrations in cell cycle progression are often linked to serious defects and diseases such as cancer. Understanding this regulatory machinery offers insights into how life functions on a molecular level and also provides a better understanding of diseases and possible approaches to control them. Cell cycle control is furthermore a complex mechanism, and studying it holistically helps in understanding its collective properties. Computational approaches facilitate holistic cell cycle control studies. However, the properties of the cell cycle control network challenge large-scale in silico studies with respect to scalability, model execution and parameter estimation. This thesis presents a mechanistically detailed and executable large-scale reconstruction of the Saccharomyces cerevisiae cell cycle control network based on reaction-contingency language. The reconstruction accounts for 229 proteins and consists of three individual cycles corresponding to the macroscopic events of DNA replication, spindle pole body duplication, and bud emergence and growth. Translated into a bipartite Boolean model and started from an initial state determined with a priori knowledge, the reconstruction has a cyclic attractor that reproduces the cyclic behavior of a wildtype yeast cell. The bipartite Boolean model has 2506 nodes and correctly responds to four cell cycle arrest chemicals. Furthermore, the bipartite Boolean model was used in a mutational study in which 37 mutants were tested and 32 were found to reproduce known phenotypes. The reconstruction of the cell cycle control network of S. cerevisiae demonstrates the power of the reaction-contingency based approach, and paves the way for network extension with regard to the cell cycle machinery itself and several signal transduction pathways interfering with the cell cycle.
Larsson-Toll, Karna. "De overdracht van Nederlandse getuigenisliteratuur naar Zweden : In welk opzicht verschillen de besluiten om vier getuigenisboeken in het Zweeds te laten vertalen en uitgeven Hoe ziet de receptie van deze boeken uit." Thesis, Stockholms universitet, Nederländska, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-189550.
Durán, Alcaide Ángel. "Development of high-performance algorithms for a new generation of versatile molecular descriptors. The Pentacle software." Doctoral thesis, Universitat Pompeu Fabra, 2010. http://hdl.handle.net/10803/7201.
The work presented in this thesis focused on developing high-performance algorithms for obtaining a new generation of molecular descriptors, with numerous advantages over their predecessors, suitable for a range of applications in drug design, and on implementing them in a commercial-grade scientific program (Pentacle). First, a new algorithm for discretizing molecular interaction fields (AMANDA) was developed, which efficiently extracts the regions of greatest interest. This algorithm was incorporated into a new generation of alignment-independent molecular descriptors called GRIND-2. The speed and efficiency of the new algorithm made it possible to apply these descriptors in virtual screening. Finally, a new alignment-independent encoding algorithm (CLACC) was fine-tuned; it yields quantitative structure-activity relationship models that have better predictive ability and are much easier to interpret than those obtained with other methods.
Yang, Yun-Shu (楊雲舒). "Large-Vocabulary Mandarin Speech Recognition using Hierarchical Language Model." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/76476966608462857598.
National Chiao Tung University, Institute of Communications Engineering, academic year 99.
Because it is difficult to list every word in a recognizer's vocabulary for large-vocabulary speech recognition, we present an approach for modeling out-of-vocabulary (OOV) words. In this thesis, we address the OOV problem for three types of Mandarin words: determinative-measure compounds, person names, and affixed words. Words are converted to sub-word units and searched for in the hypotheses, so that the flexible sub-word units cover more new words. The main focus of this study is to use grammatical and semantic information to construct a hierarchical language model for these three word types. The language model is added to improve recognition performance, with the aim of recognizing longer meaningful units such as words and word chunks.
Tsai, Wen-Hung (蔡文鴻). "An Initial Study on Language Model Estimation and Adaptation Techniques for Mandarin Large Vocabulary Continuous Speech Recognition." Thesis, 2005. http://ndltd.ncl.edu.tw/handle/64319373139039836810.
National Taiwan Normal University, Graduate Institute of Computer Science and Information Engineering, academic year 93.
Statistical language modeling, which aims to capture the regularities in human natural language and quantify the acceptability of a given word sequence, has been an important research issue in a wide variety of natural language processing (NLP) applications over the past three decades. For example, in speech recognition, the principal role of the language model is to help resolve acoustic confusion and thus separate the correct hypothesis from the competing ones. In the recent past, many applications of speech recognition technology have been developed, such as voice dictation and call routing systems. However, speech recognition performance is often seriously affected by the varying lexical and semantic characteristics of different application tasks. Thus, there is always a need for language model adaptation, whose goal is to exploit the specific lexical and semantic information inherent in the recognition domain, so as to compensate for the mismatch between training and testing conditions. In this thesis, a topical mixture model (TMM) previously proposed for probabilistic information retrieval was investigated to dynamically explore long-span latent topical information for language model adaptation. Moreover, we also studied the use of the Maximum Entropy (ME) principle for language modeling. ME is a principle for efficiently combining a variety of information sources. Under the ME criterion, each information source gives rise to a set of constraints that can be further imposed on the resultant language model. The intersection of these constraints is the set of language model probability distributions that satisfy all of them. The probability distribution with the highest entropy in this set is the solution under the ME principle. The preliminary experimental results show that the ME-based language modeling approach can achieve superior performance over the conventional Maximum Likelihood (ML) based approach in both character error rate and perplexity reductions on the Mandarin broadcast news transcription task.
Chen, Ssu-Cheng (陳思澄). "Exploring Word Embedding and Concept Information for Language Model Adaptation in Mandarin Large Vocabulary Continuous Speech Recognition." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/84394286701092463454.
National Taiwan Normal University, Department of Computer Science and Information Engineering, academic year 103.
Research on deep learning has experienced a surge of interest in recent years. Alongside the rapid development of deep learning technologies, various distributed representation methods have been proposed to embed the words of a vocabulary as vectors in a lower-dimensional space. With such distributed representations, the semantic relationship between any pair of words can be estimated by computing a similarity between the associated word vectors. Against this background, this thesis explores a novel use of distributed representations of words for language modeling (LM) in speech recognition. First, word vectors are employed to represent the words in the search history and the upcoming words during the speech recognition process, so as to dynamically adapt the language model on top of such vector representations. Second, we extend the recently proposed concept language model (CLM) by conducting relevant training data selection at the sentence level instead of the document level. By doing so, the concept classes of CLM can be more accurately estimated while simultaneously eliminating redundant or irrelevant information. In addition, since the resulting concept classes need to be dynamically selected and linearly combined to form the CLM model during the speech recognition process, we determine the relatedness of each concept class to the test utterance based on the word representations derived with either the continuous bag-of-words model (CBOW) or the skip-gram model (Skip-gram). Finally, we also combine the above LM methods for better speech recognition performance. Extensive experiments carried out on the MATBN (Mandarin Across Taiwan Broadcast News) corpus demonstrate the utility of our proposed LM methods in relation to several state-of-the-art baselines.
Feng, Zhuo. "Modeling and Analysis of Large-Scale On-Chip Interconnects." 2009. http://hdl.handle.net/1969.1/ETD-TAMU-2009-12-7142.
Nyberg, Jakob. "Response Generation Using Large-scale Pre-trained Language Models." Thesis, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-415323.
Hwang, Chien-Yo (黃健祐). "Analyzing Properties of Smoothing Issues for Language Models in Large Mandarin Corpus." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/75029464702391160845.
National Chung Hsing University, Institute of Networking and Multimedia, academic year 100.
Smoothing is a fundamental and important topic. Many applications, such as speech recognition, machine translation, input methods, and Chinese character conversion, rely on it heavily. In this thesis, we discuss the properties and entropies of smoothing methods. Because of the problem of data sparseness, smoothing methods are employed to estimate the probability of each event in language models. We cover several well-known smoothing methods: the Additive Discount method, the Good-Turing method, and the Witten-Bell method. Existing smoothing techniques have solved the data sparseness problem effectively but have not further analyzed whether the resulting frequency distribution of events is reasonable. We therefore analyze smoothing methods from a statistical point of view. We propose a set of properties for analyzing the statistical behavior of these smoothing methods. Furthermore, we present two new smoothing methods that comply with nearly all of the properties. Finally, we implement the language models using a large Mandarin corpus and discuss how to evaluate language models by cross-entropy and perplexity. We then discuss some related problems concerning the cut-off issues proposed by Katz.
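As a rough illustration of the ideas named in the abstract above (it is not taken from the thesis), the following sketch applies additive smoothing to a bigram model and evaluates it by cross-entropy and perplexity. The toy corpus, the function names, the `<s>`/`</s>` boundary markers, and the choice of `delta = 1.0` are all assumptions made for the example, and it assumes every test word appears in the training vocabulary.

```python
import math
from collections import Counter


def train_bigram_counts(sentences):
    """Count bigram contexts and bigrams over tokenized sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    vocab = {"</s>"}  # symbols that can be predicted; "<s>" is only a context
    for tokens in sentences:
        vocab.update(tokens)
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return context_counts, bigram_counts, vocab


def additive_prob(prev, word, context_counts, bigram_counts, vocab_size, delta=1.0):
    """Additive smoothing: add delta to every bigram count, seen or unseen."""
    numerator = bigram_counts[(prev, word)] + delta
    denominator = context_counts[prev] + delta * vocab_size
    return numerator / denominator


def evaluate(sentences, context_counts, bigram_counts, vocab_size, delta=1.0):
    """Return (cross-entropy in bits per token, perplexity) on held-out text."""
    log_prob_sum = 0.0
    n_tokens = 0
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        for prev, word in zip(padded, padded[1:]):
            p = additive_prob(prev, word, context_counts, bigram_counts,
                              vocab_size, delta)
            log_prob_sum += math.log2(p)
            n_tokens += 1
    cross_entropy = -log_prob_sum / n_tokens
    return cross_entropy, 2.0 ** cross_entropy


# Hypothetical toy data, for illustration only.
train = [["the", "cat", "sat"], ["the", "dog", "sat"]]
held_out = [["the", "cat", "sat"], ["cat", "dog"]]  # "cat dog" is an unseen bigram

context_counts, bigram_counts, vocab = train_bigram_counts(train)
cross_entropy, perplexity = evaluate(held_out, context_counts, bigram_counts, len(vocab))
print(f"cross-entropy: {cross_entropy:.3f} bits/token, perplexity: {perplexity:.3f}")
```

Without smoothing, the unseen bigram in the held-out text would receive probability zero and the perplexity would be infinite; adding `delta` to every count keeps each conditional distribution normalized while reserving some mass for unseen events. Good-Turing and Witten-Bell redistribute that mass differently and are not shown here.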
Patel, Parita. "Compilation of Graph Algorithms for Hybrid, Cross-Platform and Distributed Architectures." Thesis, 2017. http://etd.iisc.ac.in/handle/2005/3803.
Patel, Parita. "Compilation of Graph Algorithms for Hybrid, Cross-Platform and Distributed Architectures." Thesis, 2017. http://etd.iisc.ernet.in/2005/3803.
Arun, N. S. "Design And Implementation Of An OODBMS For VLSI Interconnect Parasitic Analysis." Thesis, 1996. https://etd.iisc.ac.in/handle/2005/1724.
Arun, N. S. "Design And Implementation Of An OODBMS For VLSI Interconnect Parasitic Analysis." Thesis, 1996. http://etd.iisc.ernet.in/handle/2005/1724.