Увійти

Готові списки джерел за темами / Machine Translation, Reinforcement Learning, Low Resource, Adaptation

Зміст

Статті в журналах

Добірка наукової літератури з теми "Machine Translation, Reinforcement Learning, Low Resource, Adaptation"

Автор: Grafiati

Опубліковано: 11 березня 2023

Оформте джерело за APA, MLA, Chicago, Harvard та іншими стилями

Оберіть тип джерела:

Ознайомтеся зі списками актуальних статей, книг, дисертацій, тез та інших наукових джерел на тему "Machine Translation, Reinforcement Learning, Low Resource, Adaptation".

Біля кожної праці в переліку літератури доступна кнопка «Додати до бібліографії». Скористайтеся нею – і ми автоматично оформимо бібліографічне посилання на обрану працю в потрібному вам стилі цитування: APA, MLA, «Гарвард», «Чикаго», «Ванкувер» тощо.

Також ви можете завантажити повний текст наукової публікації у форматі «.pdf» та прочитати онлайн анотацію до роботи, якщо відповідні параметри наявні в метаданих.

Статті в журналах з теми "Machine Translation, Reinforcement Learning, Low Resource, Adaptation"

1

Zhan, Runzhe, Xuebo Liu, Derek F. Wong, and Lidia S. Chao. "Meta-Curriculum Learning for Domain Adaptation in Neural Machine Translation." Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 16 (May 18, 2021): 14310–18. http://dx.doi.org/10.1609/aaai.v35i16.17683.

Повний текст джерела

Анотація:

Meta-learning has been sufficiently validated to be beneficial for low-resource neural machine translation (NMT). However, we find that meta-trained NMT fails to improve the translation performance of the domain unseen at the meta-training stage. In this paper, we aim to alleviate this issue by proposing a novel meta-curriculum learning for domain adaptation in NMT. During meta-training, the NMT first learns the similar curricula from each domain to avoid falling into a bad local optimum early, and finally learns the curricula of individualities to improve the model robustness for learning domain-specific knowledge. Experimental results on 10 different low-resource domains show that meta-curriculum learning can improve the translation performance of both familiar and unfamiliar domains. All the codes and data are freely available at https://github.com/NLP2CT/Meta-Curriculum.

Стилі APA, Harvard, Vancouver, ISO та ін.

2

Li, Rumeng, Xun Wang, and Hong Yu. "MetaMT, a Meta Learning Method Leveraging Multiple Domain Data for Low Resource Machine Translation." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 8245–52. http://dx.doi.org/10.1609/aaai.v34i05.6339.

Повний текст джерела

Анотація:

Neural machine translation (NMT) models have achieved state-of-the-art translation quality with a large quantity of parallel corpora available. However, their performance suffers significantly when it comes to domain-specific translations, in which training data are usually scarce. In this paper, we present a novel NMT model with a new word embedding transition technique for fast domain adaption. We propose to split parameters in the model into two groups: model parameters and meta parameters. The former are used to model the translation while the latter are used to adjust the representational space to generalize the model to different domains. We mimic the domain adaptation of the machine translation model to low-resource domains using multiple translation tasks on different domains. A new training strategy based on meta-learning is developed along with the proposed model to update the model parameters and meta parameters alternately. Experiments on datasets of different domains showed substantial improvements of NMT performances on a limited amount of data.

Стилі APA, Harvard, Vancouver, ISO та ін.

3

Kumari, Divya, Asif Ekbal, Rejwanul Haque, Pushpak Bhattacharyya, and Andy Way. "Reinforced NMT for Sentiment and Content Preservation in Low-resource Scenario." ACM Transactions on Asian and Low-Resource Language Information Processing 20, no. 4 (June 28, 2021): 1–27. http://dx.doi.org/10.1145/3450970.

Повний текст джерела

Анотація:

The preservation of domain knowledge from source to the target is crucial in any translation workflows. Hence, translation service providers that use machine translation (MT) in production could reasonably expect that the translation process should transfer both the underlying pragmatics and the semantics of the source-side sentences into the target language. However, recent studies suggest that the MT systems often fail to preserve such crucial information (e.g., sentiment, emotion, gender traits) embedded in the source text in the target. In this context, the raw automatic translations are often directly fed to other natural language processing (NLP) applications (e.g., sentiment classifier) in a cross-lingual platform. Hence, the loss of such crucial information during the translation could negatively affect the performance of such downstream NLP tasks that heavily rely on the output of the MT systems. In our current research, we carefully balance both the sides (i.e., sentiment and semantics) during translation, by controlling a global-attention-based neural MT (NMT), to generate translations that encode the underlying sentiment of a source sentence while preserving its non-opinionated semantic content. Toward this, we use a state-of-the-art reinforcement learning method, namely, actor-critic , that includes a novel reward combination module, to fine-tune the NMT system so that it learns to generate translations that are best suited for a downstream task, viz. sentiment classification while ensuring the source-side semantics is intact in the process. Experimental results for Hindi–English language pair show that our proposed method significantly improves the performance of the sentiment classifier and alongside results in an improved NMT system.

Стилі APA, Harvard, Vancouver, ISO та ін.

4

Nayyar, Anand, Pijush Kanti Dutta Pramankit, and Rajni Mohana. "Introduction to the Special Issue on Evolving IoT and Cyber-Physical Systems: Advancements, Applications, and Solutions." Scalable Computing: Practice and Experience 21, no. 3 (August 1, 2020): 347–48. http://dx.doi.org/10.12694/scpe.v21i3.1568.

Повний текст джерела

Анотація:

Internet of Things (IoT) is regarded as a next-generation wave of Information Technology (IT) after the widespread emergence of the Internet and mobile communication technologies. IoT supports information exchange and networked interaction of appliances, vehicles and other objects, making sensing and actuation possible in a low-cost and smart manner. On the other hand, cyber-physical systems (CPS) are described as the engineered systems which are built upon the tight integration of the cyber entities (e.g., computation, communication, and control) and the physical things (natural and man-made systems governed by the laws of physics). The IoT and CPS are not isolated technologies. Rather it can be said that IoT is the base or enabling technology for CPS and CPS is considered as the grownup development of IoT, completing the IoT notion and vision. Both are merged into closed-loop, providing mechanisms for conceptualizing, and realizing all aspects of the networked composed systems that are monitored and controlled by computing algorithms and are tightly coupled among users and the Internet. That is, the hardware and the software entities are intertwined, and they typically function on different time and location-based scales. In fact, the linking between the cyber and the physical world is enabled by IoT (through sensors and actuators). CPS that includes traditional embedded and control systems are supposed to be transformed by the evolving and innovative methodologies and engineering of IoT. Several applications areas of IoT and CPS are smart building, smart transport, automated vehicles, smart cities, smart grid, smart manufacturing, smart agriculture, smart healthcare, smart supply chain and logistics, etc. Though CPS and IoT have significant overlaps, they differ in terms of engineering aspects. Engineering IoT systems revolves around the uniquely identifiable and internet-connected devices and embedded systems; whereas engineering CPS requires a strong emphasis on the relationship between computation aspects (complex software) and the physical entities (hardware). Engineering CPS is challenging because there is no defined and fixed boundary and relationship between the cyber and physical worlds. In CPS, diverse constituent parts are composed and collaborated together to create unified systems with global behaviour. These systems need to be ensured in terms of dependability, safety, security, efficiency, and adherence to real‐time constraints. Hence, designing CPS requires knowledge of multidisciplinary areas such as sensing technologies, distributed systems, pervasive and ubiquitous computing, real-time computing, computer networking, control theory, signal processing, embedded systems, etc. CPS, along with the continuous evolving IoT, has posed several challenges. For example, the enormous amount of data collected from the physical things makes it difficult for Big Data management and analytics that includes data normalization, data aggregation, data mining, pattern extraction and information visualization. Similarly, the future IoT and CPS need standardized abstraction and architecture that will allow modular designing and engineering of IoT and CPS in global and synergetic applications. Another challenging concern of IoT and CPS is the security and reliability of the components and systems. Although IoT and CPS have attracted the attention of the research communities and several ideas and solutions are proposed, there are still huge possibilities for innovative propositions to make IoT and CPS vision successful. The major challenges and research scopes include system design and implementation, computing and communication, system architecture and integration, application-based implementations, fault tolerance, designing efficient algorithms and protocols, availability and reliability, security and privacy, energy-efficiency and sustainability, etc. It is our great privilege to present Volume 21, Issue 3 of Scalable Computing: Practice and Experience. We had received 30 research papers and out of which 14 papers are selected for publication. The objective of this special issue is to explore and report recent advances and disseminate state-of-the-art research related to IoT, CPS and the enabling and associated technologies. The special issue will present new dimensions of research to researchers and industry professionals with regard to IoT and CPS. Vivek Kumar Prasad and Madhuri D Bhavsar in the paper titled "Monitoring and Prediction of SLA for IoT based Cloud described the mechanisms for monitoring by using the concept of reinforcement learning and prediction of the cloud resources, which forms the critical parts of cloud expertise in support of controlling and evolution of the IT resources and has been implemented using LSTM. The proper utilization of the resources will generate revenues to the provider and also increases the trust factor of the provider of cloud services. For experimental analysis, four parameters have been used i.e. CPU utilization, disk read/write throughput and memory utilization. Kasture et al. in the paper titled "Comparative Study of Speaker Recognition Techniques in IoT Devices for Text Independent Negative Recognition" compared the performance of features which are used in state of art speaker recognition models and analyse variants of Mel frequency cepstrum coefficients (MFCC) predominantly used in feature extraction which can be further incorporated and used in various smart devices. Mahesh Kumar Singh and Om Prakash Rishi in the paper titled "Event Driven Recommendation System for E-Commerce using Knowledge based Collaborative Filtering Technique" proposed a novel system that uses a knowledge base generated from knowledge graph to identify the domain knowledge of users, items, and relationships among these, knowledge graph is a labelled multidimensional directed graph that represents the relationship among the users and the items. The proposed approach uses about 100 percent of users' participation in the form of activities during navigation of the web site. Thus, the system expects under the users' interest that is beneficial for both seller and buyer. The proposed system is compared with baseline methods in area of recommendation system using three parameters: precision, recall and NDGA through online and offline evaluation studies with user data and it is observed that proposed system is better as compared to other baseline systems. Benbrahim et al. in the paper titled "Deep Convolutional Neural Network with TensorFlow and Keras to Classify Skin Cancer" proposed a novel classification model to classify skin tumours in images using Deep Learning methodology and the proposed system was tested on HAM10000 dataset comprising of 10,015 dermatoscopic images and the results observed that the proposed system is accurate in order of 94.06\% in validation set and 93.93\% in the test set. Devi B et al. in the paper titled "Deadlock Free Resource Management Technique for IoT-Based Post Disaster Recovery Systems" proposed a new class of techniques that do not perform stringent testing before allocating the resources but still ensure that the system is deadlock-free and the overhead is also minimal. The proposed technique suggests reserving a portion of the resources to ensure no deadlock would occur. The correctness of the technique is proved in the form of theorems. The average turnaround time is approximately 18\% lower for the proposed technique over Banker's algorithm and also an optimal overhead of O(m). Deep et al. in the paper titled "Access Management of User and Cyber-Physical Device in DBAAS According to Indian IT Laws Using Blockchain" proposed a novel blockchain solution to track the activities of employees managing cloud. Employee authentication and authorization are managed through the blockchain server. User authentication related data is stored in blockchain. The proposed work assists cloud companies to have better control over their employee's activities, thus help in preventing insider attack on User and Cyber-Physical Devices. Sumit Kumar and Jaspreet Singh in paper titled "Internet of Vehicles (IoV) over VANETS: Smart and Secure Communication using IoT" highlighted a detailed description of Internet of Vehicles (IoV) with current applications, architectures, communication technologies, routing protocols and different issues. The researchers also elaborated research challenges and trade-off between security and privacy in area of IoV. Deore et al. in the paper titled "A New Approach for Navigation and Traffic Signs Indication Using Map Integrated Augmented Reality for Self-Driving Cars" proposed a new approach to supplement the technology used in self-driving cards for perception. The proposed approach uses Augmented Reality to create and augment artificial objects of navigational signs and traffic signals based on vehicles location to reality. This approach help navigate the vehicle even if the road infrastructure does not have very good sign indications and marking. The approach was tested locally by creating a local navigational system and a smartphone based augmented reality app. The approach performed better than the conventional method as the objects were clearer in the frame which made it each for the object detection to detect them. Bhardwaj et al. in the paper titled "A Framework to Systematically Analyse the Trustworthiness of Nodes for Securing IoV Interactions" performed literature on IoV and Trust and proposed a Hybrid Trust model that seperates the malicious and trusted nodes to secure the interaction of vehicle in IoV. To test the model, simulation was conducted on varied threshold values. And results observed that PDR of trusted node is 0.63 which is higher as compared to PDR of malicious node which is 0.15. And on the basis of PDR, number of available hops and Trust Dynamics the malicious nodes are identified and discarded. Saniya Zahoor and Roohie Naaz Mir in the paper titled "A Parallelization Based Data Management Framework for Pervasive IoT Applications" highlighted the recent studies and related information in data management for pervasive IoT applications having limited resources. The paper also proposes a parallelization-based data management framework for resource-constrained pervasive applications of IoT. The comparison of the proposed framework is done with the sequential approach through simulations and empirical data analysis. The results show an improvement in energy, processing, and storage requirements for the processing of data on the IoT device in the proposed framework as compared to the sequential approach. Patel et al. in the paper titled "Performance Analysis of Video ON-Demand and Live Video Streaming Using Cloud Based Services" presented a review of video analysis over the LVS \& VoDS video application. The researchers compared different messaging brokers which helps to deliver each frame in a distributed pipeline to analyze the impact on two message brokers for video analysis to achieve LVS & VoS using AWS elemental services. In addition, the researchers also analysed the Kafka configuration parameter for reliability on full-service-mode. Saniya Zahoor and Roohie Naaz Mir in the paper titled "Design and Modeling of Resource-Constrained IoT Based Body Area Networks" presented the design and modeling of a resource-constrained BAN System and also discussed the various scenarios of BAN in context of resource constraints. The Researchers also proposed an Advanced Edge Clustering (AEC) approach to manage the resources such as energy, storage, and processing of BAN devices while performing real-time data capture of critical health parameters and detection of abnormal patterns. The comparison of the AEC approach is done with the Stable Election Protocol (SEP) through simulations and empirical data analysis. The results show an improvement in energy, processing time and storage requirements for the processing of data on BAN devices in AEC as compared to SEP. Neelam Saleem Khan and Mohammad Ahsan Chishti in the paper titled "Security Challenges in Fog and IoT, Blockchain Technology and Cell Tree Solutions: A Review" outlined major authentication issues in IoT, map their existing solutions and further tabulate Fog and IoT security loopholes. Furthermore, this paper presents Blockchain, a decentralized distributed technology as one of the solutions for authentication issues in IoT. In addition, the researchers discussed the strength of Blockchain technology, work done in this field, its adoption in COVID-19 fight and tabulate various challenges in Blockchain technology. The researchers also proposed Cell Tree architecture as another solution to address some of the security issues in IoT, outlined its advantages over Blockchain technology and tabulated some future course to stir some attempts in this area. Bhadwal et al. in the paper titled "A Machine Translation System from Hindi to Sanskrit Language Using Rule Based Approach" proposed a rule-based machine translation system to bridge the language barrier between Hindi and Sanskrit Language by converting any test in Hindi to Sanskrit. The results are produced in the form of two confusion matrices wherein a total of 50 random sentences and 100 tokens (Hindi words or phrases) were taken for system evaluation. The semantic evaluation of 100 tokens produce an accuracy of 94\% while the pragmatic analysis of 50 sentences produce an accuracy of around 86\%. Hence, the proposed system can be used to understand the whole translation process and can further be employed as a tool for learning as well as teaching. Further, this application can be embedded in local communication based assisting Internet of Things (IoT) devices like Alexa or Google Assistant. Anshu Kumar Dwivedi and A.K. Sharma in the paper titled "NEEF: A Novel Energy Efficient Fuzzy Logic Based Clustering Protocol for Wireless Sensor Network" proposed a a deterministic novel energy efficient fuzzy logic-based clustering protocol (NEEF) which considers primary and secondary factors in fuzzy logic system while selecting cluster heads. After selection of cluster heads, non-cluster head nodes use fuzzy logic for prudent selection of their cluster head for cluster formation. NEEF is simulated and compared with two recent state of the art protocols, namely SCHFTL and DFCR under two scenarios. Simulation results unveil better performance by balancing the load and improvement in terms of stability period, packets forwarded to the base station, improved average energy and extended lifetime.

Стилі APA, Harvard, Vancouver, ISO та ін.

5

Pham, Nghia-Luan, and Van-Vinh Nguyen. "Adaptation in Statistical Machine Translation for Low-resource Domains in English-Vietnamese Language." VNU Journal of Science: Computer Science and Communication Engineering 36, no. 1 (May 30, 2020). http://dx.doi.org/10.25073/2588-1086/vnucsce.231.

Повний текст джерела

Анотація:

In this paper, we propose a new method for domain adaptation in Statistical Machine Translation for low-resource domains in English-Vietnamese language. Specifically, our method only uses monolingual data to adapt the translation phrase-table, our system brings improvements over the SMT baseline system. We propose two steps to improve the quality of SMT system: (i) classify phrases on the target side of the translation phrase-table use the probability classifier model, and (ii) adapt to the phrase-table translation by recomputing the direct translation probability of phrases. Our experiments are conducted with translation direction from English to Vietnamese on two very different domains that are legal domain (out-of-domain) and general domain (in-of-domain). The English-Vietnamese parallel corpus is provided by the IWSLT 2015 organizers and the experimental results showed that our method significantly outperformed the baseline system. Our system improved on the quality of machine translation in the legal domain up to 0.9 BLEU scores over the baseline system,… Keywords: Machine Translation, Statistical Machine Translation, Domain Adaptation References [1] Philipp Koehn, Franz Josef Och, Daniel Marcu, Statistical phrase-based translation, In Proceedings of HLT-NAACL, Edmonton, Canada, 2003, 127-133. [2] Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes and Jeffrey Dean, Google’s neural machine translation system: Bridging the gap between human and machine translation, CoRR, abs/1609.08144, 2016. [3] Luisa Bentivogli, Arianna Bisazza, Mauro Cettolo and Marcello Federico, Neural versus phrase-based machine translation quality: A case study, 2016. [4] Barry Haddow, Philipp Koehn, Analysing the effect of out-of-domain data on smt systems, In Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012, 422-432. [5] Boxing Chen, Roland Kuhn and George Foster, Vector space model for adaptation in statistical machine translation, Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013, pp. 1285-1293. [6] Daniel Dahlmeier, Hwee Tou Ng, Siew Mei Wu4, Building a large annotated corpus of learner english: The nus corpus of learner english, In Proceedings of the NAACL Workshop on Innovative Use of NLP for Building Educational Appli-cations, 2013. [7] Eva Hasler, Phil Blunsom, Philipp Koehn and Barry Haddow, Dynamic topic adaptation for phrase-based mt, In Proceedings of the 14th Conference of the European Chapter of The Association for Computational Linguistics, 2014, pp. 328-337. [8] George Foster, Roland Kuhn, Mixture-model adaptation for smt, Proceedings of the Second Workshop on Statistical Machine Translation, Prague, Association for Computational Linguistics, 2007, pp. 128-135. [9] George Foster, Boxing Chen, Roland Kuhn, Simulating discriminative training for linear mixture adaptation in statistical machine translation, Proceedings of the MT Summit, 2013. [10] Hoang Cuong, Khalil Sima’an, and Ivan Titov, Adapting to all domains at once: Rewarding domain invariance in smt, Proceedings of the Transactions of the Association for Computational Linguistics (TACL), 2016. [11] Ryo Masumura, Taichi Asam, Takanobu Oba, Hirokazu Masataki, Sumitaka Sakauchi, and Akinori Ito, Hierarchical latent words language models for robust modeling to out-of domain tasks, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015, pp. 1896-1901. [12] Chenhui Chu, Raj Dabre, and Sadao Kurohashi. An empirical comparison of simple domain adaptation methods for neural machine translation, 2017. [13] Markus Freitag, Yaser Al-Onaizan, Fast domain adaptation for neural machine translation, 2016. [14] Jia Xu, Yonggang Deng, Yuqing Gao and Hermann Ney, Domain dependent statistical machine translation, In Proceedings of the MT Summit XI, 2007, pp. 515-520. [15] Hua Wu, Haifeng Wang Chengqing Zong, Domain adaptation for statistical machine translation with domain dictionary and monolingual corpora, In Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008), Manchester, UK, 2008, pp. 993-1000. [16] Adam Berger, Stephen Della Pietra, and Vincent Della Pietra, A maximum entropy approach to natural language processing, Computational Linguistics, 22, 1996. [17] 18Santanu Pal, Sudip Naskar, Josef Van Genabith, Uds-sant, English-German hybrid machine translation system, In Proceedings of the Tenth Workshop on Statistical Machine Translation, Lisbon, Portugal, September, Association for Computational Linguistics, 2015, pp. 152-157. [18] Louis Onrust, Antal van den Bosch, Hugo Van hamme, Improving cross-domain n-gram language modelling with skipgrams, Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany, 2016, pp. 137-142. [19] Mark Aronoff, Kirsten Fudeman, What is morphology, V 8. john wiley and sons, 2011. [20] Laurence C. Thompson, The problem of the word in vietnamese, In journal of the International Linguistic Association 19(1) (1963) 39-52. https:// doi.org/1080/00437956.1963.11659787. [21] Binh N. Ngo, The Vietnamese language learning framework, Journal of Southeast Asian Language Teaching 10 (2001) 1-24. [22] Le Hong Phuong, Nguyen Thi Minh Huyen, Azim Roussanaly, Ho Tuong Vinh, A hybrid approach to word segmentation of vietnamese texts, 2008. [23] Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondřej Bojar, Alexandra Constantin, Evan Herbst, Moses: Open source toolkit for statistical machine translation, In ACL-2007: Proceedings of demo and poster sessions, Prague, Czech Republic, 2007, pp.177-180. [24] Franz Josef Och, Minimum error rate training in statistical machine translation, In Proceedings of ACL, 2003, pp.160-167. [25] Andreas Stolcke, Srilm - an extensible language modeling toolkit, in proceedings of international conference on spoken language processing, 2002. [26] Papineni, Kishore, Salim Roukos, Todd Ward, WeiJing Zhu, Bleu: A method for automatic evaluation of machine translation, ACL, 2002. [27] G. Klein, Y. Kim, Y. Deng, J. Senellart, A.M. Rush, OpenNMT: Open-Source Toolkit for Neural Machine Translation. ArXiv e-prints. [28] Pratyush Banerjee, Jinhua Du, Baoli Li, Sudip Kr. Naskar, Andy Way and Josef van Genabith, Combining multi-domain statistical machine translation models using automatic classifiers, In Proceedings of AMTA 2010., 2010.

Стилі APA, Harvard, Vancouver, ISO та ін.

6

Kann, Katharina, Abteen Ebrahimi, Manuel Mager, Arturo Oncevay, John E. Ortega, Annette Rios, Angela Fan, et al. "AmericasNLI: Machine translation and natural language inference systems for Indigenous languages of the Americas." Frontiers in Artificial Intelligence 5 (December 2, 2022). http://dx.doi.org/10.3389/frai.2022.995667.

Повний текст джерела

Анотація:

Little attention has been paid to the development of human language technology for truly low-resource languages—i.e., languages with limited amounts of digitally available text data, such as Indigenous languages. However, it has been shown that pretrained multilingual models are able to perform crosslingual transfer in a zero-shot setting even for low-resource languages which are unseen during pretraining. Yet, prior work evaluating performance on unseen languages has largely been limited to shallow token-level tasks. It remains unclear if zero-shot learning of deeper semantic tasks is possible for unseen languages. To explore this question, we present AmericasNLI, a natural language inference dataset covering 10 Indigenous languages of the Americas. We conduct experiments with pretrained models, exploring zero-shot learning in combination with model adaptation. Furthermore, as AmericasNLI is a multiway parallel dataset, we use it to benchmark the performance of different machine translation models for those languages. Finally, using a standard transformer model, we explore translation-based approaches for natural language inference. We find that the zero-shot performance of pretrained models without adaptation is poor for all languages in AmericasNLI, but model adaptation via continued pretraining results in improvements. All machine translation models are rather weak, but, surprisingly, translation-based approaches to natural language inference outperform all other models on that task.

Стилі APA, Harvard, Vancouver, ISO та ін.

Ми пропонуємо знижки на всі преміум-плани для авторів, чиї праці увійшли до тематичних добірок літератури. Зв'яжіться з нами, щоб отримати унікальний промокод!