Academic literature on the topic 'Low-Resourced language'
Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Low-Resourced language.'
Journal articles on the topic "Low-Resourced language"
Allah, Fadoua Ataa, and Siham Boulaknadel. "New Trends in Less-Resourced Language Processing: Case of Amazigh Language." International Journal on Natural Language Computing 12, no. 2 (April 29, 2023): 75–89. http://dx.doi.org/10.5121/ijnlc.2023.12207.
Kipyatkova, Irina, and Ildar Kagirov. "Deep Models for Low-Resourced Speech Recognition: Livvi-Karelian Case." Mathematics 11, no. 18 (September 5, 2023): 3814. http://dx.doi.org/10.3390/math11183814.
Singh, Pranaydeep, Orphée De Clercq, and Els Lefever. "Distilling Monolingual Models from Large Multilingual Transformers." Electronics 12, no. 4 (February 18, 2023): 1022. http://dx.doi.org/10.3390/electronics12041022.
Mabokela, Koena Ronny, Mpho Primus, and Turgay Celik. "Explainable Pre-Trained Language Models for Sentiment Analysis in Low-Resourced Languages." Big Data and Cognitive Computing 8, no. 11 (November 15, 2024): 160. http://dx.doi.org/10.3390/bdcc8110160.
Shafiq, Nida, Isma Hamid, Muhammad Asif, Qamar Nawaz, Hanan Aljuaid, and Hamid Ali. "Abstractive text summarization of low-resourced languages using deep learning." PeerJ Computer Science 9 (January 13, 2023): e1176. http://dx.doi.org/10.7717/peerj-cs.1176.
Pandit, Rajat, Saptarshi Sengupta, Sudip Kumar Naskar, Niladri Sekhar Dash, and Mohini Mohan Sardar. "Improving Semantic Similarity with Cross-Lingual Resources: A Study in Bangla—A Low Resourced Language." Informatics 6, no. 2 (May 5, 2019): 19. http://dx.doi.org/10.3390/informatics6020019.
Badawi, Soran. "Transformer-Based Neural Network Machine Translation Model for the Kurdish Sorani Dialect." UHD Journal of Science and Technology 7, no. 1 (January 15, 2023): 15–21. http://dx.doi.org/10.21928/uhdjst.v7n1y2023.pp15-21.
Kapočiūtė-Dzikienė, Jurgita, and Senait Gebremichael Tesfagergish. "Part-of-Speech Tagging via Deep Neural Networks for Northern-Ethiopic Languages." Information Technology And Control 49, no. 4 (December 19, 2020): 482–94. http://dx.doi.org/10.5755/j01.itc.49.4.26808.
Nitu, Melania, and Mihai Dascalu. "Natural Language Processing Tools for Romanian – Going Beyond a Low-Resource Language." Interaction Design and Architecture(s), no. 60 (March 15, 2024): 7–26. http://dx.doi.org/10.55612/s-5002-060-001sp.
Ngué Um, Emmanuel, Émilie Eliette, Caroline Ngo Tjomb Assembe, and Francis Morton Tyers. "Developing a Rule-Based Machine-Translation System, Ewondo–French–Ewondo." International Journal of Humanities and Arts Computing 16, no. 2 (October 2022): 166–81. http://dx.doi.org/10.3366/ijhac.2022.0289.
Dissertations / Theses on the topic "Low-Resourced language"
Aufrant, Lauriane. "Training parsers for low-resourced languages: improving cross-lingual transfer with monolingual knowledge." Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLS089/document.
As a result of the recent blossoming of Machine Learning techniques, the Natural Language Processing field faces an increasingly thorny bottleneck: the most efficient algorithms rely entirely on the availability of large training data. These technological advances consequently remain unavailable for the 7,000 languages in the world, most of which are low-resourced. One way to bypass this limitation is cross-lingual transfer, whereby resources available in another (source) language are leveraged to help build accurate systems in the desired (target) language. However, despite promising results in research settings, the standard transfer techniques lack the flexibility regarding cross-lingual resources needed to be fully usable in real-world scenarios: exploiting very sparse resources, or assorted arrays of resources. This limitation strongly diminishes the applicability of that approach. This thesis consequently proposes to combine multiple sources and resources for transfer, with an emphasis on selectivity: can we estimate which resource of which language is useful for which input? This strategy is put into practice in the frame of transition-based dependency parsing. To this end, a new transfer framework is designed, with a cascading architecture: it enables the desired combination, while ensuring better targeted exploitation of each resource, down to the level of the word. Empirical evaluation indeed dampens the enthusiasm for the purely cross-lingual approach (it remains in general preferable to annotate just a few target sentences) but also highlights its complementarity with other approaches. Several metrics are developed to characterize precisely cross-lingual similarities, syntactic idiosyncrasies, and the added value of cross-lingual information compared to monolingual training. The substantial benefits of typological knowledge are also explored.
The whole study relies on a series of technical improvements regarding the parsing framework: this work includes the release of new open-source software, PanParser, which revisits the so-called dynamic oracles to extend their use cases. Several purely monolingual contributions complete this work, including an exploration of monolingual cascading, which offers promising perspectives with easy-then-hard strategies.
Susman, Derya. "Turkish Large Vocabulary Continuous Speech Recognition By Using Limited Audio Corpus." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614207/index.pdf.
Karim, Hiva. "Best way for collecting data for low-resourced languages." Thesis, Högskolan Dalarna, Mikrodataanalys, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:du-35945.
Cordova, Johanna. "Le quechua dans les outils numériques, un défi pour le TAL ? Développement de ressources linguistiques et numériques pour le quechua ancashino." Electronic Thesis or Diss., Paris, INALCO, 2024. http://www.theses.fr/2024INAL0031.
Quechua languages are one of the Amerindian language families with the largest number of native speakers. In Peru, according to the 2017 census, 13.9% of the population have Quechua as their first language, and around 20% speak it. However, the language is almost totally absent from digital tools. In natural language processing (NLP), it is an under-resourced language, with a strong disparity in the amount of resources depending on the variety of Quechua considered. The aim of this thesis is to develop a set of fundamental tools for the automatic processing of a variety of central Quechua, Ancash Quechua, spoken by around 400,000 people and in danger of extinction according to the UNESCO classification. This process involves three stages: digitisation of the resources available in this variety (dictionaries, written corpora), implementation of a morphological analyser, and development of a treebank for morpho-syntactic analysis. These resources will be made available on the web via applications, in particular a search engine that can be used to query the dictionaries available for this language. In a global context of movements to preserve native languages, and while ambitious policies related to linguistic rights are being deployed in the countries of the Andean region, the presence of Quechua in technologies would be an important lever to strengthen its practice and facilitate its teaching.
Samson, Juan Sarah Flora. "Exploiting resources from closely-related languages for automatic speech recognition in low-resource languages from Malaysia." Thesis, Université Grenoble Alpes (ComUE), 2015. http://www.theses.fr/2015GREAM061/document.
Languages in Malaysia are dying at an alarming rate. As of today, 15 languages are in danger while two languages are extinct. One of the methods to save languages is by documenting them, but it is a tedious task when performed manually. An Automatic Speech Recognition (ASR) system could be a tool to help speed up the process of documenting speech from native speakers. However, building ASR systems for a target language requires a large amount of training data, as current state-of-the-art techniques are based on empirical approaches. Hence, there are many challenges in building ASR for languages that have limited data available. The main aim of this thesis is to investigate the effects of using data from closely-related languages to build ASR for low-resource languages in Malaysia. Past studies have shown that cross-lingual and multilingual methods could improve performance of low-resource ASR. In this thesis, we try to answer several questions concerning these approaches: How do we know which language is beneficial for our low-resource language? How does the relationship between source and target languages influence speech recognition performance? Is pooling language data an optimal approach for multilingual strategy? Our case study is Iban, an under-resourced language spoken on the island of Borneo. We study the effects of using data from Malay, a local dominant language which is close to Iban, for developing Iban ASR under different resource constraints. We have proposed several approaches to adapt Malay data to obtain pronunciation and acoustic models for Iban speech. Building a pronunciation dictionary from scratch is time-consuming, as one needs to properly define the sound units of each word in a vocabulary. We developed a semi-supervised approach to quickly build a pronunciation dictionary for Iban. It was based on bootstrapping techniques for improving Malay data to match Iban pronunciations. To increase the performance of low-resource acoustic models, we explored two acoustic modelling techniques, the Subspace Gaussian Mixture Models (SGMM) and Deep Neural Networks (DNN). We performed cross-lingual strategies using both frameworks for adapting out-of-language data to Iban speech. Results show that using Malay data is beneficial for increasing the performance of Iban ASR. We also tested SGMM and DNN to improve low-resource non-native ASR. We proposed a fine merging strategy for obtaining an optimal multi-accent SGMM. In addition, we developed an accent-specific DNN using native speech data. After applying both methods, we obtained significant improvements in ASR accuracy. From our study, we observe that using SGMM and DNN for cross-lingual strategy is effective when training data is very limited.
Books on the topic "Low-Resourced language"
Multilingual processing in eastern and southern EU languages: Low-resourced technologies and translation. Newcastle upon Tyne, UK: Cambridge Scholars Publishing, 2012.
Book chapters on the topic "Low-Resourced language"
Pattnaik, Sagarika, and Ajit Kumar Nayak. "An Automatic Summarizer for a Low-Resourced Language." In Advances in Intelligent Systems and Computing, 285–95. Singapore: Springer Singapore, 2020. http://dx.doi.org/10.1007/978-981-15-1081-6_24.
Mbaye, Derguene, Moussa Diallo, and Thierno Ibrahima Diop. "Low-Resourced Machine Translation for Senegalese Wolof Language." In Proceedings of Eighth International Congress on Information and Communication Technology, 243–55. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-3236-8_19.
Rögnvaldsson, Eiríkur. "Language Report Icelandic." In European Language Equality, 159–62. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-28819-7_21.
Adda-Decker, Martine, Lori Lamel, Gilles Adda, and Thomas Lavergne. "A First LVCSR System for Luxembourgish, a Low-Resourced European Language." In Human Language Technology Challenges for Computer Science and Linguistics, 479–90. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-14120-6_39.
Datta, Goutam, Nisheeth Joshi, and Kusum Gupta. "Analysis of Automatic Evaluation Metric on Low-Resourced Language: BERTScore vs BLEU Score." In Speech and Computer, 155–62. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-20980-2_14.
Thandil, Rizwana Kallooravi, K. P. Mohamed Basheer, and V. K. Muneer. "A Multi-feature Analysis of Accented Multisyllabic Malayalam Words—a Low-Resourced Language." In Lecture Notes in Networks and Systems, 243–51. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-1203-2_21.
Thandil, Rizwana Kallooravi, K. P. Mohamed Basheer, and V. K. Muneer. "End-to-End Unified Accented Acoustic Model for Malayalam-A Low Resourced Language." In Communications in Computer and Information Science, 346–54. Cham: Springer International Publishing, 2023. http://dx.doi.org/10.1007/978-3-031-33231-9_25.
Bani, Rkia, Samir Amri, Lahbib Zenkouar, and Zouhair Guennoun. "Part of Speech Tagging of Amazigh Language as a Very Low-Resourced Language: Particularities and Challenges." In Artificial Intelligence and Industrial Applications, 172–82. Cham: Springer Nature Switzerland, 2023. http://dx.doi.org/10.1007/978-3-031-43520-1_15.
Anagha, H. M., Karthik Sairam, Janya Mahesh, and H. R. Mamatha. "Paraphrase Generation and Deep Learning Models for Paraphrase Detection in a Low-Resourced Language: Kannada." In Advances in Data-Driven Computing and Intelligent Systems, 283–93. Singapore: Springer Nature Singapore, 2024. http://dx.doi.org/10.1007/978-981-99-9531-8_23.
Thandil, Rizwana Kallooravi, K. P. Mohamed Basheer, and V. K. Muneer. "Deep Spectral Feature Representations Via Attention-Based Neural Network Architectures for Accented Malayalam Speech—A Low-Resourced Language." In Proceedings of Data Analytics and Management, 1–13. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-99-6553-3_1.
Conference papers on the topic "Low-Resourced language"
Jayakody, Ravindu, and Gihan Dias. "Performance of Recent Large Language Models for a Low-Resourced Language." In 2024 International Conference on Asian Language Processing (IALP), 162–67. IEEE, 2024. http://dx.doi.org/10.1109/ialp63756.2024.10661169.
Pal, Vaishali, Evangelos Kanoulas, Andrew Yates, and Maarten de Rijke. "Table Question Answering for Low-resourced Indic Languages." In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 75–92. Stroudsburg, PA, USA: Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.emnlp-main.5.
Masethe, Mosima Anna, Hlaudi Daniel Masethe, and Sunday O. Ojo. "Context-Based Question Answering Using Large Language BERT Variant Models for Low Resourced Sesotho sa Leboa Language." In 2024 4th International Multidisciplinary Information Technology and Engineering Conference (IMITEC), 507–13. IEEE, 2024. https://doi.org/10.1109/imitec60221.2024.10850997.
Dong, Lukuan, Donghong Qin, Fengbo Bai, Fanhua Song, Yan Liu, Chen Xu, and Zhijian Ou. "Low-Resourced Speech Recognition for Iu Mien Language via Weakly-Supervised Phoneme-Based Multilingual Pretraining." In 2024 IEEE 14th International Symposium on Chinese Spoken Language Processing (ISCSLP), 264–68. IEEE, 2024. https://doi.org/10.1109/iscslp63861.2024.10800186.
Zhang, Jiajie, Shulin Cao, Linmei Hu, Ling Feng, Lei Hou, and Juanzi Li. "KB-Plugin: A Plug-and-play Framework for Large Language Models to Induce Programs over Low-resourced Knowledge Bases." In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2868–82. Stroudsburg, PA, USA: Association for Computational Linguistics, 2024. http://dx.doi.org/10.18653/v1/2024.emnlp-main.168.
Masethe, Hlaudi Daniel, Lawrence M. Mothapo, Sunday O. Ojo, Pius A. Owolawi, Mosima Anna Masethe, and Fausto Giunchiglia. "Machine Translation for Morphologically Rich Low-Resourced South African Languages." In 2024 4th International Multidisciplinary Information Technology and Engineering Conference (IMITEC), 71–78. IEEE, 2024. https://doi.org/10.1109/imitec60221.2024.10850972.
Abisado, Mideth B., Maria Luisa G. Bautista, Marilen F. Pacis, Joseph Marvin R. Imperial, Ramon L. Rodriguez, Bernie S. Fabito, Jean V. Malolos, Mico C. Magtira, and Mariecar G. Alfon. "RespiratoryPH: Empowering Low-Resourced Languages Through Multilingual and Multi-Labeled Social Media Dataset Towards Intelligent Public Health Disease Surveillance." In 2024 IEEE International Conference on Progress in Informatics and Computing (PIC), 1–6. IEEE, 2024. https://doi.org/10.1109/pic62406.2024.10892651.
Gupta, Akshat. "On Building Spoken Language Understanding Systems for Low Resourced Languages." In Proceedings of the 19th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology. Stroudsburg, PA, USA: Association for Computational Linguistics, 2022. http://dx.doi.org/10.18653/v1/2022.sigmorphon-1.1.
Anagha, H. M., Karthik Sairam, Janya Mahesh, and H. R. Mamatha. "Paraphrase Detection in a Low Resourced Language: Kannada." In 2023 IEEE 8th International Conference for Convergence in Technology (I2CT). IEEE, 2023. http://dx.doi.org/10.1109/i2ct57861.2023.10126391.
István, Varga, and Yokoyama Shoichi. "Bilingual dictionary generation for low-resourced language pairs." In the 2009 Conference. Morristown, NJ, USA: Association for Computational Linguistics, 2009. http://dx.doi.org/10.3115/1699571.1699625.