Journal articles on the topic “Multi-lingual training”

Consult the top 50 scholarly journal articles on the topic “Multi-lingual training”.

1

Chi, Zewen, Li Dong, Furu Wei, Wenhui Wang, Xian-Ling Mao, and Heyan Huang. "Cross-Lingual Natural Language Generation via Pre-Training". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7570–77. http://dx.doi.org/10.1609/aaai.v34i05.6256.

Abstract:
In this work we focus on transferring supervision signals of natural language generation (NLG) tasks between multiple languages. We propose to pretrain the encoder and the decoder of a sequence-to-sequence model under both monolingual and cross-lingual settings. The pre-training objective encourages the model to represent different languages in the shared space, so that we can conduct zero-shot cross-lingual transfer. After the pre-training procedure, we use monolingual data to fine-tune the pre-trained model on downstream NLG tasks. Then the sequence-to-sequence model trained in a single language can be directly evaluated beyond that language (i.e., accepting multi-lingual input and producing multi-lingual output). Experimental results on question generation and abstractive summarization show that our model outperforms the machine-translation-based pipeline methods for zero-shot cross-lingual generation. Moreover, cross-lingual transfer improves NLG performance of low-resource languages by leveraging rich-resource language data. Our implementation and data are available at https://github.com/CZWin32768/xnlg.
2

Cao, Yue, Xiaojun Wan, Jinge Yao, and Dian Yu. "MultiSumm: Towards a Unified Model for Multi-Lingual Abstractive Summarization". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 01 (April 3, 2020): 11–18. http://dx.doi.org/10.1609/aaai.v34i01.5328.

Abstract:
Automatic text summarization aims at producing a shorter version of the input text that conveys the most important information. However, multi-lingual text summarization, where the goal is to process texts in multiple languages and output summaries in the corresponding languages with a single model, has been rarely studied. In this paper, we present MultiSumm, a novel multi-lingual model for abstractive summarization. The MultiSumm model uses the following training regime: (I) multi-lingual learning that contains language model training, auto-encoder training, translation and back-translation training, and (II) joint summary generation training. We conduct experiments on summarization datasets for five rich-resource languages: English, Chinese, French, Spanish, and German, as well as two low-resource languages: Bosnian and Croatian. Experimental results show that our proposed model significantly outperforms a multi-lingual baseline model. Specifically, our model achieves comparable or even better performance than models trained separately on each language. As an additional contribution, we construct the first summarization dataset for Bosnian and Croatian, containing 177,406 and 204,748 samples, respectively.
3

Kovacic, Michael, and Karl Cunningham. "Effective Electrical Safety Program Training in Multi-Lingual/Cultural Environments". IEEE Transactions on Industry Applications 55, no. 4 (July 2019): 4384–88. http://dx.doi.org/10.1109/tia.2019.2907883.

4

Zhan, Qingran, Xiang Xie, Chenguang Hu, Juan Zuluaga-Gomez, Jing Wang, and Haobo Cheng. "Domain-Adversarial Based Model with Phonological Knowledge for Cross-Lingual Speech Recognition". Electronics 10, no. 24 (December 20, 2021): 3172. http://dx.doi.org/10.3390/electronics10243172.

Abstract:
Phonological-based features (articulatory features, AFs) describe the movements of the vocal organs, which are shared across languages. This paper investigates a domain-adversarial neural network (DANN) to extract reliable AFs, and different multi-stream techniques are used for cross-lingual speech recognition. First, a novel universal phonological attributes definition is proposed for Mandarin, English, German and French. Then a DANN-based AFs detector is trained using the source languages (English, German and French). For cross-lingual speech recognition, the AFs detectors are used to transfer phonological knowledge from the source languages (English, German and French) to the target language (Mandarin). Two multi-stream approaches are introduced to fuse the acoustic features and cross-lingual AFs. In addition, the monolingual AFs system (i.e., the AFs are directly extracted from the target language) is also investigated. Experiments show that the performance of the AFs detector can be improved by using convolutional neural networks (CNN) with a domain-adversarial learning method. The multi-head attention (MHA) based multi-stream approach reaches the best performance compared to the baseline, the cross-lingual adaptation approach, and other approaches. More specifically, the MHA mode with cross-lingual AFs yields significant improvements over monolingual AFs when the training data size is restricted, and it can be easily extended to other low-resource languages.
5

Zhang, Mozhi, Yoshinari Fujinuma, and Jordan Boyd-Graber. "Exploiting Cross-Lingual Subword Similarities in Low-Resource Document Classification". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 9547–54. http://dx.doi.org/10.1609/aaai.v34i05.6500.

Abstract:
Text classification must sometimes be applied in a low-resource language with no labeled training data. However, training data may be available in a related language. We investigate whether character-level knowledge transfer from a related language helps text classification. We present a cross-lingual document classification framework (caco) that exploits cross-lingual subword similarity by jointly training a character-based embedder and a word-based classifier. The embedder derives vector representations for input words from their written forms, and the classifier makes predictions based on the word vectors. We use a joint character representation for both the source language and the target language, which allows the embedder to generalize knowledge about source language words to target language words with similar forms. We propose a multi-task objective that can further improve the model if additional cross-lingual or monolingual resources are available. Experiments confirm that character-level knowledge transfer is more data-efficient than word-level transfer between related languages.
6

Guinzoni, Roberta. "Walgreens Boots Alliance goes multi-lingual through e-learning". Human Resource Management International Digest 23, no. 7 (October 12, 2015): 5–8. http://dx.doi.org/10.1108/hrmid-08-2015-0138.

Abstract:
Purpose – Explains how Walgreens Boots Alliance teamed up with Rosetta Stone on a digital language-learning program that is bringing many and significant benefits to individual employees and the company as a whole. Design/methodology/approach – Reveals that the aims were to develop English fluency for the non-English native speakers in the organization that work across countries and divisions and build language skills in at least one other main language spoken widely within the business. Provides a series of tips for anyone wanting to gain the far-reaching benefits of language learning in their organization. Findings – Describes how the e-learning course includes live online sessions with native-speaking tutors, conversation practice and games. Explains that the company also plans to supplement the structured digital-based learning with practice sessions in a work setting. These are planned to be so informal that people do not feel like they are learning or being taught. They will be social, networking occasions, such as “lunch in French” or other scheduled get-togethers where people are able to practice the language that they are learning. Practical implications – Highlights the importance of being realistic about time, supporting employees facing a digital challenge, getting sponsor support, integrating the continuous-learning concept into the organization, and using ambassadors. Social implications – Advances the view that a mastery of foreign languages is essential for successful business collaboration across countries and cultures, particularly for managers and directors. In providing language training for these serious learners, the company is not only helping its employees to understand each other better and relate to each other’s background and culture but also growing the business. 
Originality/value – Concludes that a digital-learning approach to language learning can help businesses to meet their learning goals and deliver training that is interactive, convenient and fun for learners.
7

Pinto da Costa, Mariana. "Conducting Cross-Cultural, Multi-Lingual and Multi-Country Focus Groups: Guidance for Researchers". International Journal of Qualitative Methods 20 (January 2021): 160940692110499. http://dx.doi.org/10.1177/16094069211049929.

Abstract:
Conducting cross-cultural, multi-lingual and multi-country focus groups presents unique logistic and analytical challenges. However, there is little guidance on the considerations required for such international focus groups. Based on the author’s experience of conducting such research, this publication documents the different stages of planning, fieldwork, analysis and dissemination, and how to mitigate possible challenges and overcome them. It is essential to set up an adequate research team with the linguistic and cultural background required. All researchers should have the necessary training in qualitative methods and follow a standardised approach in the facilitation of focus groups across the different countries and in the analysis of the data, ideally in their original languages.
8

Fuad, Ahlam, and Maha Al-Yahya. "AraConv: Developing an Arabic Task-Oriented Dialogue System Using Multi-Lingual Transformer Model mT5". Applied Sciences 12, no. 4 (February 11, 2022): 1881. http://dx.doi.org/10.3390/app12041881.

Abstract:
Task-oriented dialogue systems (DS) are designed to help users perform daily activities using natural language. Task-oriented DS for English language have demonstrated promising performance outcomes; however, developing such systems to support Arabic remains a challenge. This challenge is mainly due to the lack of Arabic dialogue datasets. This study introduces the first Arabic end-to-end generative model for task-oriented DS (AraConv), which uses the multi-lingual transformer model mT5 with different settings. We also present an Arabic dialogue dataset (Arabic-TOD) and used it to train and test the proposed AraConv model. The results obtained are reasonable compared to those reported in the studies of English and Chinese using the same mono-lingual settings. To avoid problems associated with a small training dataset and to improve the AraConv model’s results, we suggest joint-training, in which the model is jointly trained on Arabic dialogue data and data from one or two high-resource languages such as English and Chinese. The findings indicate the AraConv model performed better in the joint-training setting than in the mono-lingual setting. The results obtained from AraConv on the Arabic dialogue dataset provide a baseline for other researchers to build robust end-to-end Arabic task-oriented DS that can engage with complex scenarios.
9

Yan, Huijiong, Tao Qian, Liang Xie, and Shanguang Chen. "Unsupervised cross-lingual model transfer for named entity recognition with contextualized word representations". PLOS ONE 16, no. 9 (September 21, 2021): e0257230. http://dx.doi.org/10.1371/journal.pone.0257230.

Abstract:
Named entity recognition (NER) is a fundamental task in natural language processing (NLP). Supervised neural network models based on contextualized word representations can achieve highly competitive performance, but they require a large-scale manually annotated corpus for training. For resource-scarce languages, the construction of such a corpus is always expensive and time-consuming. Thus, unsupervised cross-lingual transfer is a good solution to the problem. In this work, we investigate unsupervised cross-lingual NER with model transfer based on contextualized word representations, which greatly advances cross-lingual NER performance. We study several model transfer settings of unsupervised cross-lingual NER, including (1) different types of pretrained transformer-based language models as input, (2) the exploration strategies of the multilingual contextualized word representations, and (3) multi-source adaptation. In particular, we propose an adapter-based word representation method combined with a parameter generation network (PGN) to better capture the relationship between the source and target languages. We conduct experiments on a benchmark CoNLL dataset involving four languages to simulate the cross-lingual setting. Results show that we can obtain highly competitive performance by cross-lingual model transfer. In particular, our proposed adapter-based PGN model can lead to significant improvements for cross-lingual NER.
10

Xiang, Lu, Junnan Zhu, Yang Zhao, Yu Zhou, and Chengqing Zong. "Robust Cross-lingual Task-oriented Dialogue". ACM Transactions on Asian and Low-Resource Language Information Processing 20, no. 6 (November 30, 2021): 1–24. http://dx.doi.org/10.1145/3457571.

Abstract:
Cross-lingual dialogue systems are increasingly important in e-commerce and customer service due to the rapid progress of globalization. In real-world system deployment, machine translation (MT) services are often used before and after the dialogue system to bridge different languages. However, noises and errors introduced in the MT process will result in the dialogue system's low robustness, making the system's performance far from satisfactory. In this article, we propose a novel MT-oriented noise enhanced framework that exploits multi-granularity MT noises and injects such noises into the dialogue system to improve the dialogue system's robustness. Specifically, we first design a method to automatically construct multi-granularity MT-oriented noises and multi-granularity adversarial examples, which contain abundant noise knowledge oriented to MT. Then, we propose two strategies to incorporate the noise knowledge: (i) Utterance-level adversarial learning and (ii) Knowledge-level guided method. The former adopts adversarial learning to learn a perturbation-invariant encoder, guiding the dialogue system to learn noise-independent hidden representations. The latter explicitly incorporates the multi-granularity noises, which contain the noise tokens and their possible correct forms, into the training and inference process, thus improving the dialogue system's robustness. Experimental results on three dialogue models, two dialogue datasets, and two language pairs have shown that the proposed framework significantly improves the performance of the cross-lingual dialogue system.
11

Zorab, John S. M., and M. D. Vickers. "The European Academy of Anesthesiology – 1992 and Beyond". Journal of the Royal Society of Medicine 84, no. 12 (December 1991): 704–8. http://dx.doi.org/10.1177/014107689108401205.

Abstract:
The European Academy of Anaesthesiology was founded in 1978 as a means of meeting the challenges resulting from the introduction of the Medical Directives permitting the free movement of doctors within the European Community. The Academy is a scientific forum for anaesthetists throughout Europe - not just the EC countries - and has established its own English-language journal and multi-lingual Diploma examination. It is now embarking on a system of hospital recognition linked to in-training examinations. With the help of industry and a professional communications organization, it is also exploring the production of multi-lingual educational packages. It is believed that for effective evolution of hospital practice in Europe, medical specialties need to have their own academic organizations which will develop specialist training and which are in a position to provide appropriate advice to relevant national and European bodies.
12

Santini, Marina, and Min-Chun Shih. "Exploring the Potential of an Extensible Domain-Specific Web Corpus for “Layfication”". International Journal of Cyber-Physical Systems 2, no. 1 (January 2020): 20–32. http://dx.doi.org/10.4018/ijcps.2020010102.

Abstract:
This article presents experiments based on the extensible domain-specific web corpus for “layfication”. For these experiments, both the existing layfication corpus (in Swedish and in English) and a new addition in English (the NHS-PubMed subcorpus) are used. With this extended corpus, methods to classify lay-specialized medical sublanguages cross-linguistically using small data and noisy web documents are investigated. Sublanguage is a language variety used in specific domains. Here, the authors focus on two medical sublanguages, namely the “patientspeak” (lay) and the medical jargon (specialized). Cross-lingual sublanguage classification is still largely underexplored although it can be crucial in downstream applications for digital health and cyber-physical systems. Classification models are built using small and noisy training sets in Swedish and evaluated on English test sets. The performance of Naive Bayes classifiers—built with stopwords and with Bag-of-Words—is compared with convolutional neural network classifiers leveraging on MUSE multi-lingual word embeddings. Results are promising and nuanced. These results are proposed as a first baseline for cross-lingual sublanguage classification.
13

Fang, Yuwei, Shuohang Wang, Zhe Gan, Siqi Sun, and Jingjing Liu. "FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding". Proceedings of the AAAI Conference on Artificial Intelligence 35, no. 14 (May 18, 2021): 12776–84. http://dx.doi.org/10.1609/aaai.v35i14.17512.

Abstract:
Large-scale cross-lingual language models (LM), such as mBERT, Unicoder and XLM, have achieved great success in cross-lingual representation learning. However, when applied to zero-shot cross-lingual transfer tasks, most existing methods use only single-language input for LM finetuning, without leveraging the intrinsic cross-lingual alignment between different languages that proves essential for multilingual tasks. In this paper, we propose FILTER, an enhanced fusion method that takes cross-lingual data as input for XLM finetuning. Specifically, FILTER first encodes text input in the source language and its translation in the target language independently in the shallow layers, then performs cross-language fusion to extract multilingual knowledge in the intermediate layers, and finally performs further language-specific encoding. During inference, the model makes predictions based on the text input in the target language and its translation in the source language. For simple tasks such as classification, translated text in the target language shares the same label as the source language. However, this shared label becomes less accurate or even unavailable for more complex tasks such as question answering, NER and POS tagging. To tackle this issue, we further propose an additional KL-divergence self-teaching loss for model training, based on auto-generated soft pseudo-labels for translated text in the target language. Extensive experiments demonstrate that FILTER achieves new state of the art on two challenging multilingual multi-task benchmarks, XTREME and XGLUE.
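
The KL-divergence self-teaching loss mentioned in this abstract can be illustrated with a minimal sketch. It assumes that a teacher model has already produced logits for the translated text, that these are turned into soft pseudo-labels with a softmax, and that the student is penalized by KL(teacher || student). The function names, the temperature parameter and the toy logits are illustrative assumptions and are not taken from the FILTER implementation.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into a probability distribution (numerically stable)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def self_teaching_loss(teacher_logits, student_logits, temperature=1.0):
    """Hypothetical helper: KL between soft pseudo-labels and student predictions."""
    soft_labels = softmax(teacher_logits, temperature)    # auto-generated soft pseudo-labels
    student_probs = softmax(student_logits, temperature)  # student prediction on translated text
    return kl_divergence(soft_labels, student_probs)

# Toy logits over three classes (made-up numbers, for illustration only).
teacher = [2.0, 0.5, -1.0]
student_close = [1.9, 0.6, -1.1]   # roughly agrees with the teacher
student_far = [-1.0, 0.5, 2.0]     # disagrees with the teacher

loss_close = self_teaching_loss(teacher, student_close)
loss_far = self_teaching_loss(teacher, student_far)
print(loss_close < loss_far)
```

In this sketch the loss shrinks as the student's distribution approaches the teacher's soft labels, which is the property a self-teaching term of this kind relies on when hard labels for translated text are unreliable.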
14

Choutri, Kheireddine, Mohand Lagha, Souham Meshoul, Mohamed Batouche, Yasmine Kacel, and Nihad Mebarkia. "A Multi-Lingual Speech Recognition-Based Framework to Human-Drone Interaction". Electronics 11, no. 12 (June 9, 2022): 1829. http://dx.doi.org/10.3390/electronics11121829.

Abstract:
In recent years, human–drone interaction has received increasing interest from the scientific community. When interacting with a drone, humans assume a variety of roles, the nature of which is determined by the drone’s application and degree of autonomy. Common methods of controlling drone movements include RF remote controls and ground control stations. These devices are often difficult to manipulate and may even require some training. An alternative is to use innovative methods called natural user interfaces that allow users to interact with drones in an intuitive manner using speech. However, using only one language for interaction may limit the number of users, especially if different languages are spoken in the same region. Moreover, environmental and propeller noise makes speech recognition a complicated task. The goal of this work is to use a multilingual speech recognition system that includes English, Arabic, and Amazigh to control the movement of drones. The reason for selecting these languages is that they are widely spoken in many regions, particularly in the Middle East and North Africa (MENA) zone. To achieve this goal, a two-stage approach is proposed. During the first stage, a deep learning based model for multilingual speech recognition is designed. Then, the developed model is deployed in real settings using a quadrotor UAV. The network was trained using 38,850 records including commands and unknown words mixed with noise to improve robustness. An average class accuracy of more than 93% has been achieved. After that, experiments were conducted involving 16 participants giving voice commands in order to test the efficiency of the designed system. The achieved accuracy is about 93.76% for English recognition, and 88.55% and 82.31% for Arabic and Amazigh, respectively. Finally, hardware implementation of the designed system on a quadrotor UAV was made. Real-time tests have shown that the approach is very promising as an alternative form of human–drone interaction while offering the benefit of control simplicity.
15

Wei, Bin, and Christopher Pal. "Heterogeneous Transfer Learning with RBMs". Proceedings of the AAAI Conference on Artificial Intelligence 25, no. 1 (August 4, 2011): 531–36. http://dx.doi.org/10.1609/aaai.v25i1.7925.

Abstract:
A common approach in machine learning is to use a large amount of labeled data to train a model. Usually this model can then only be used to classify data in the same feature space. However, labeled data is often expensive to obtain. A number of strategies have been developed by the machine learning community in recent years to address this problem, including semi-supervised learning, domain adaptation, multi-task learning, and self-taught learning. While training data and test data may have different distributions, they must remain in the same feature set. Furthermore, all the above methods work in the same feature space. In this paper, we consider an extreme case of transfer learning called heterogeneous transfer learning, where the feature spaces of the source task and the target tasks are disjoint. Previous approaches mostly fall in the multi-view learning category, where co-occurrence data from both feature spaces is required. We generalize the previous work on cross-lingual adaptation and propose a multi-task strategy for the task. We also propose the use of a restricted Boltzmann machine (RBM), a special type of probabilistic graphical model, as an implementation. We present experiments on two tasks: action recognition and cross-lingual sentiment classification.
16

Pallotti, Gabriele, and Cecilia Varcasia. "Service telephone call openings: a comparative study on five European languages". Journal of Intercultural Communication 8, no. 2 (June 30, 2008): 1–25. http://dx.doi.org/10.36923/jicc.v8i2.463.

Abstract:
The paper presents the results of a comparative study on how speakers of different languages (English, French, German, Italian and Spanish) manage the opening of service phone calls. Previous research has focussed on cross-cultural variability in telephone conversations, but this is the first attempt to systematically compare several European languages at the same time. The communicative strategies speakers use in each language are analyzed qualitatively and quantitatively, allowing a systematic comparison across cultures and languages and the observation of intra-cultural variability. Comparative analysis is based on five fundamental moves that may be performed in a telephone call opening: summons-answer, identification, greetings, how-are-you’s, getting-down-to-business. Implications are drawn for cross-cultural research on interaction and for training staff working in multi-lingual and multi-cultural settings.
17

Zeng, Ailing, Mijit Ablimit, and Askar Hamdulla. "MSRN and Multi-Headed Attention Mechanism for Language Identification". Information 14, no. 1 (December 28, 2022): 17. http://dx.doi.org/10.3390/info14010017.

Abstract:
With the popularity of the mobile internet, people all over the world can easily create and publish diverse media content such as multilingual and multi-dialectal audio and video. Therefore, language or dialect identification (LID) is increasingly important for practical applications such as multilingual and cross lingual processing as the front-end part of the subsequent tasks such as speech recognition and voice identification. This paper proposes a neural network framework based on a multiscale residual network (MSRN) and multi-headed self-attention (MHSA). Experimental results show that this method can effectively improve the accuracy and robustness compared to other methods. This model uses the MSRN to extract the language spectrogram feature and uses MHSA to filter useful features and suppress irrelevant features. Training and test sets are constructed from both the “Common Voice” and “Oriental Language Recognition” (AP17-OLR) datasets. The experimental results show that this model can effectively improve the accuracy and robustness of LID.
18

Li, Gen, Nan Duan, Yuejian Fang, Ming Gong, and Daxin Jiang. "Unicoder-VL: A Universal Encoder for Vision and Language by Cross-Modal Pre-Training". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 07 (April 3, 2020): 11336–44. http://dx.doi.org/10.1609/aaai.v34i07.6795.

Abstract:
We propose Unicoder-VL, a universal encoder that aims to learn joint representations of vision and language in a pre-training manner. Borrowing ideas from cross-lingual pre-trained models, such as XLM (Lample and Conneau 2019) and Unicoder (Huang et al. 2019), both visual and linguistic contents are fed into a multi-layer Transformer (Vaswani et al. 2017) for the cross-modal pre-training, where three pre-training tasks are employed: Masked Language Modeling (MLM), Masked Object Classification (MOC) and Visual-linguistic Matching (VLM). The first two tasks learn context-aware representations for input tokens based on linguistic and visual contents jointly. The last task tries to predict whether an image and a text describe each other. After pre-training on large-scale image-caption pairs, we transfer Unicoder-VL to caption-based image-text retrieval and visual commonsense reasoning, with just one additional output layer. We achieve state-of-the-art or comparable results on both tasks and show the powerful ability of the cross-modal pre-training.
19

Lindeman, David. "Lessons From Lighthouse: Operationalizing Technology to Support Older Adults in Affordable Housing Communities". Innovation in Aging 5, Supplement_1 (December 1, 2021): 465. http://dx.doi.org/10.1093/geroni/igab046.1800.

Abstract:
Lighthouse for Older Adults, an innovative public-private partnership, was developed in response to COVID-19 as a means of advancing telehealth for low-income older adults living in affordable housing communities. Residents of these communities often don’t have reliable access to devices, sufficient bandwidth for telehealth, or adequate social services, further complicated by the need for multi-lingual and culturally sensitive programs. This presentation will share program implementation strategies and outcomes, including the essential role telehealth services play in the care and wellbeing of older adults during and beyond COVID-19. This session will review evidence-based components of a telehealth intervention, including digital literacy training and technology support. Key drivers for successful implementation (e.g., peer-led training, user input into technology selection) as well as barriers to implementation (e.g., broadband installation, internet service availability/cost, tech support) will be reviewed. Lessons learned through program replication and scaling of Lighthouse telehealth services will be discussed.
20

Glavaš, Goran, and Swapna Somasundaran. "Two-Level Transformer and Auxiliary Coherence Modeling for Improved Text Segmentation". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7797–804. http://dx.doi.org/10.1609/aaai.v34i05.6284.

Abstract:
Breaking down the structure of long texts into semantically coherent segments makes the texts more readable and supports downstream applications like summarization and retrieval. Starting from an apparent link between text coherence and segmentation, we introduce a novel supervised model for text segmentation with simple but explicit coherence modeling. Our model – a neural architecture consisting of two hierarchically connected Transformer networks – is a multi-task learning model that couples the sentence-level segmentation objective with the coherence objective that differentiates correct sequences of sentences from corrupt ones. The proposed model, dubbed Coherence-Aware Text Segmentation (CATS), yields state-of-the-art segmentation performance on a collection of benchmark datasets. Furthermore, by coupling CATS with cross-lingual word embeddings, we demonstrate its effectiveness in zero-shot language transfer: it can successfully segment texts in languages unseen in training.
21

Hahn, Michael, and Marco Baroni. "Tabula Nearly Rasa: Probing the Linguistic Knowledge of Character-level Neural Language Models Trained on Unsegmented Text". Transactions of the Association for Computational Linguistics 7 (November 2019): 467–84. http://dx.doi.org/10.1162/tacl_a_00283.

Abstract:
Recurrent neural networks (RNNs) have reached striking performance in many natural language processing tasks. This has renewed interest in whether these generic sequence processing devices are inducing genuine linguistic knowledge. Nearly all current analytical studies, however, initialize the RNNs with a vocabulary of known words, and feed them tokenized input during training. We present a multi-lingual study of the linguistic knowledge encoded in RNNs trained as character-level language models, on input data with word boundaries removed. These networks face a tougher and more cognitively realistic task, having to discover any useful linguistic unit from scratch based on input statistics. The results show that our “near tabula rasa” RNNs are mostly able to solve morphological, syntactic and semantic tasks that intuitively presuppose word-level knowledge, and indeed they learned, to some extent, to track word boundaries. Our study opens the door to speculations about the necessity of an explicit, rigid word lexicon in language learning and usage.
22

Botezatu, Liuba. "The Evidence Theory and Methodology of Axiological Completions in the Context of Philological Training". INTERNATIONAL JOURNAL OF RESEARCH IN EDUCATION METHODOLOGY 7, no. 5 (30.12.2016): 1350–61. http://dx.doi.org/10.24297/ijrem.v7i5.4337.

Abstract:
The first step of evidencing the axiological completions occurs, in our interpretation, at the level of “Retroaction in the context of linguistic and literary education” as right as a modern teaching technology in the curricular pre-university system. The second stage (bachelor-master), of completing evidence, in perfect ascension (of the same interpretation L.B.) occurs at co-reporting level: Principle of Global Axiology – Methodology of Evidencing the Axiological Completions – engagement on a new level of becoming: training the philologist teacher. “Happiness lies in to know yourself”: the stylistic matrix of the biased nation, the mother language – the language of raw ascensions into the great spirituality; the own possibilities of opening – manifestation of self-realization in the universal circuit. Or, this is the primary purpose of participating approaches / engagements in professional plan: training / forming the philologist – philological genetic (bilingual – multi-lingual), philological centrist by vocation. The joints of the Evidence Methodology of Axiological Completions / MECA in this structural-phenomenal context are, on the one hand, those relating to translation into life of the possibilities / requirements listed, taking preponderance here: the possibility of returning to essence – the ability of ascension through spirituality – the possibility of highlighting the axiological completions, on the other hand – those related to the focus on diversification – information – unification, on the establishment of syntactical highlight of the pertinent datum.
23

Hirci, Nataša. "Investigating trainee translators’ views on the pronunciation of English: a Slovene perspective". Linguistica 57, no. 1 (30.12.2017): 93–106. http://dx.doi.org/10.4312/linguistica.57.1.93-106.

Abstract:
While the importance of excellent pronunciation skills for language professionals is indisputable, research attention has focused mainly on the pronunciation skills of teachers. Nevertheless, translators, and even more so interpreters, who are constantly engaged in multi-lingual communication with their clients, face tough competition in the global market, and those with poor pronunciation skills are at a considerable disadvantage. Developing good pronunciation skills is thus an aspect that should not be neglected in the training of translators and interpreters, since it may directly affect their prospects of employment. The paper explores the views of Slovene trainee translators on the pronunciation of English. Their self-perception of English pronunciation skills and expectations concerning their pronunciation are examined by using a questionnaire administered to trainee translators at the University of Ljubljana. The questionnaire results provide an insight into the participants’ perceptions of their attained pronunciation proficiency and their attention to pronunciation instruction. The analysis of the replies reveals that trainee translators view pronunciation as an important element of their speaking proficiency, highlighting the issue of intelligibility as an essential component of communicative competence. The findings raise interesting issues important for pronunciation teaching in translator training, underlining the necessity to identify specific learner needs of future translators and interpreters.
24

Singh, Bhupendra, and Patanjali Mishra. "Uniformity and Diversity of Components of Teacher Education". IRA International Journal of Education and Multidisciplinary Studies (ISSN 2455-2526) 7, no. 3 (5.07.2017): 228. http://dx.doi.org/10.21013/jems.v7.n3.p7.

Abstract:
It was the suggestion of the National Education Commission (1964-66) that investment for teacher education can return in rich surpluses because the fiscal resources essential are small when compared to the resulting developments in the upbringing of masses. Today, it is an imperative to prepare dynamic teachers. Therefore, skill development programmes for in-service and teacher trainees should be ornamented with the teaching-learning skills, real mode internship, pedagogical knowledge with application, multi-lingual and multicultural classroom management techniques and ICT based technologies. These conditions are fulfilled by implementing the components of teacher education with uniformity in knowledge and diversity in functioning. The universal and local aspects of teacher education programs might be meaningful for the environment of training. This research article is trying to ascertain their devotion to global trends versus the impact of native circumstances and cultures. Teacher Education Curriculum-1978 very rightly emphasized on the fortune of the school curriculum which is affected by the teacher education curriculum.
25

FATTAH, MOHAMED ABDEL, FUJI REN, and SHINGO KUROIWA. "SENTENCE ALIGNMENT USING FEED FORWARD NEURAL NETWORK". International Journal of Neural Systems 16, no. 06 (December 2006): 423–34. http://dx.doi.org/10.1142/s0129065706000822.

Abstract:
Parallel corpora have become an essential resource for work in multi-lingual natural language processing. However, sentence-aligned parallel corpora are more efficient than non-aligned parallel corpora for cross-language information retrieval and machine translation applications. In this paper, we present a new approach to align sentences in bilingual parallel corpora based on a feed-forward neural network classifier. A feature parameter vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. A set of manually prepared training data was used to train the feed-forward neural network. Another set of data was used for testing. Using this new approach, we could achieve an error reduction of 60% over the length-based approach when applied on English–Arabic parallel documents. Moreover, this new approach is valid for any language pair and is quite flexible, since the feature parameter vector may contain more, fewer, or different features than those used in our system, such as a lexical match feature.
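The feature vector mentioned in this abstract can be pictured with a rough sketch. This is an illustration only, not the authors' implementation: the abstract does not define the length, punctuation, or cognate scores, so the function names and all three definitions below are assumptions chosen for simplicity.

```python
import string


def length_score(src: str, tgt: str) -> float:
    """Ratio of shorter to longer sentence length in characters (assumed definition)."""
    a, b = len(src), len(tgt)
    if max(a, b) == 0:
        return 1.0
    return min(a, b) / max(a, b)


def punctuation_score(src: str, tgt: str) -> float:
    """Share of ASCII punctuation marks the two sentences have in common (assumed definition)."""
    p_src = [c for c in src if c in string.punctuation]
    p_tgt = [c for c in tgt if c in string.punctuation]
    if not p_src and not p_tgt:
        return 1.0
    shared = sum(min(p_src.count(c), p_tgt.count(c)) for c in set(p_src))
    return shared / max(len(p_src), len(p_tgt))


def cognate_score(src: str, tgt: str) -> float:
    """Share of whitespace-separated tokens that appear verbatim in both sentences (assumed definition)."""
    s, t = set(src.lower().split()), set(tgt.lower().split())
    if not s or not t:
        return 0.0
    return len(s & t) / min(len(s), len(t))


def feature_vector(src: str, tgt: str) -> list[float]:
    """Hypothetical three-element input for a sentence-pair classifier."""
    return [length_score(src, tgt), punctuation_score(src, tgt), cognate_score(src, tgt)]
```

A vector like this would be computed for each candidate sentence pair and passed to the classifier, which decides whether the pair is aligned. Note that `string.punctuation` covers ASCII marks only, so a real English–Arabic system would need a wider punctuation inventory.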
26

Zhang, Wenbo, Hangzhi Guo, Prerna Ranganathan, Jay Patel, Sathyanath Rajasekharan, Nidhi Danayak, Manan Gupta, and Amulya Yadav. "A Continual Pre-training Approach to Tele-Triaging Pregnant Women in Kenya". Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 12 (26.06.2023): 14620–27. http://dx.doi.org/10.1609/aaai.v37i12.26709.

Abstract:
Access to high-quality maternal health care services is limited in Kenya, which resulted in ∼36,000 maternal and neonatal deaths in 2018. To tackle this challenge, Jacaranda Health (a non-profit organization working on maternal health in Kenya) developed PROMPTS, an SMS based tele-triage system for pregnant and puerperal women, which has more than 350,000 active users in Kenya. PROMPTS empowers pregnant women living far away from doctors and hospitals to send SMS messages to get quick answers (through human helpdesk agents) to questions about their medical symptoms and pregnancy status. Unfortunately, ∼1.1 million SMS messages are received by PROMPTS every month, which makes it challenging for helpdesk agents to ensure that these messages can be interpreted correctly and evaluated by their level of emergency to ensure timely responses and/or treatments for women in need. This paper reports on a collaborative effort with Jacaranda Health to develop a state-of-the-art natural language processing (NLP) framework, TRIM-AI (TRIage for Mothers using AI), which can automatically predict the emergency level (or severity of medical condition) of a pregnant mother based on the content of their SMS messages. TRIM-AI leverages recent advances in multi-lingual pre-training and continual pre-training to tackle code-mixed SMS messages (between English and Swahili), and achieves a weighted F1 score of 0.774 on real-world datasets. TRIM-AI has been successfully deployed in the field since June 2022, and is being used by Jacaranda Health to prioritize the provision of services and care to pregnant women with the most critical medical conditions. Our preliminary A/B tests in the field show that TRIM-AI is ∼17% more accurate at predicting high-risk medical conditions from SMS messages sent by pregnant Kenyan mothers, which reduces the helpdesk’s workload by ∼12%.
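For readers unfamiliar with the metric, a weighted F1 score is conventionally the per-class F1 averaged with weights proportional to each class's share of the true labels. The sketch below implements that standard definition in plain Python. The label names and sample data are made up for illustration and do not come from the paper.

```python
from collections import Counter


def weighted_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Per-class F1, averaged with weights equal to each class's support in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        score += (count / total) * f1
    return score


# Hypothetical emergency levels for four messages.
y_true = ["low", "low", "high", "high"]
y_pred = ["low", "high", "high", "high"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Weighting by support keeps a rare class from dominating the average, which matters when urgent messages are a small fraction of the traffic.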
27

Aysa, Zuhragvl, Mijit Ablimit, Hankiz Yilahun, and Askar Hamdulla. "Language Identification-Based Evaluation of Single Channel Speech Separation of Overlapped Speeches". Information 13, no. 10 (11.10.2022): 492. http://dx.doi.org/10.3390/info13100492.

Abstract:
In multi-lingual, multi-speaker environments (e.g., international conference scenarios), speech, language, and background sounds can overlap. In real-world scenarios, source separation techniques are needed to separate target sounds. Downstream tasks, such as ASR, speaker recognition, speech recognition, VAD, etc., can be combined with speech separation tasks to gain a better understanding. Since most of the evaluation methods for monophonic separation are either single or subjective, this paper used the downstream recognition task as an overall evaluation criterion. Thus, the performance could be directly evaluated by the metrics of the downstream task. In this paper, we investigated a two-stage training scheme that combined speech separation and language identification tasks. To analyze and optimize the separation performance of single-channel overlapping speech, the separated speech was fed to a language identification engine to evaluate its accuracy. The speech separation model was a single-channel speech separation network trained with WSJ0-2mix. For the language identification system, we used an Oriental Language Dataset and a dataset synthesized by directly mixing different proportions of speech groups. The combined effect of these two models was evaluated for various overlapping speech scenarios. When the language identification network model was based on single-person single-speech frequency spectrum features, Chinese, Japanese, Korean, Indonesian, and Vietnamese had significantly improved recognition results over the mixed audio spectrum.
28

RAHIMI, RAZIEH, AZADEH SHAKERY, JAVID DADASHKARIMI, MOZHDEH ARIANNEZHAD, MOSTAFA DEHGHANI, and HOSSEIN NASR ESFAHANI. "Building a multi-domain comparable corpus using a learning to rank method". Natural Language Engineering 22, no. 4 (15.06.2016): 627–53. http://dx.doi.org/10.1017/s1351324916000164.

Abstract:
Comparable corpora are key translation resources for both languages and domains with limited linguistic resources. The existing approaches for building comparable corpora are mostly based on ranking candidate documents in the target language for each source document using a cross-lingual retrieval model. These approaches also exploit other evidence of document similarity, such as proper names and publication dates, to build more reliable alignments. However, the importance of each piece of evidence in the scores of candidate target documents is determined heuristically. In this paper, we employ a learning to rank method for ranking candidate target documents with respect to each source document. The ranking model is constructed by defining each piece of evidence for similarity of bilingual documents as a feature whose weight is learned automatically. Learning feature weights can significantly improve the quality of alignments, because the reliability of features depends on the characteristics of both source and target languages of a comparable corpus. We also propose a method to generate appropriate training data for the task of building comparable corpora. We employed the proposed learning-based approach to build a multi-domain English–Persian comparable corpus which covers twelve different domains obtained from Open Directory Project. Experimental results show that the created alignments have high degrees of comparability. Comparison with existing approaches for building comparable corpora shows that our learning-based approach improves both quality and coverage of alignments.
29

Kotsur, Sabina. "Content, Components and European Tendences of the Future Foreign Languages Teachers’ Professional Training". Professional Education: Methodology, Theory and Technologies, no. 9 (28.02.2019): 87–102. http://dx.doi.org/10.31470/2415-3729-2019-9-87-102.

Abstract:
The article analyzes different approaches to the «professional training of a teacher» definition as a system of organizational-pedagogical measures and vocational training systems; a critical study, the improvement and experimental use of ideas; the purposeful, systematic and organized process of pedagogical influences; qualifications in the process of studying in the corresponding direction, specialty, educational program; systems of special knowledge, abilities and skills, competences, qualities. The peculiarities of the professional training of future foreign languages teachers are defined by the author as: a possession of units of a foreign language and the ability to use them in specific situations of communication; a consistency of the initial level of foreign languages knowledge received at school with the goals and objectives, methods and technologies of forming the personality of a student as a future specialist in the process of vocational training; the ratio of theoretical and practical training, special and psycho-pedagogical, methodical preparation; a symmetrical study of two foreign languages and bilingual life (Ukrainian and Russian); knowledge and appreciation of the cultural characteristics of a nation, the language of which is studied. The author also proposes the definition of the concept of «professional training of future foreign language teachers» as a dynamic system of organizational and pedagogical influences, which is characterized by the unity of goals, content, methods and technologies of professional training of students, which study foreign languages on a multi-lingual basis, and foresees the formation of readiness for their professional activity, a professional competence. The article deals with important components of the future foreign languages teachers’ training such as: the theoretical and linguistic training; the practical training; the professional-oriented theoretical training; the methodical one. 
The article analyzes the tendencies of future teacher training in the European region. Among the modern European approaches to foreign language teacher training, the author highlights the following main trends: the unification of requirements for professional training in the process of higher education integration into European educational space; the updating of goals and content of studying and teaching foreign languages, changing educational programs and state standards taking into account common European trends; the transition from the knowledge concept to the competence paradigm in higher education; the internationalization of education; the use of new flexible technologies in the study of foreign languages, the strengthening of the practical component of vocational training.
30

Yolwas, Nurmemet, and Weijing Meng. "JSUM: A Multitask Learning Speech Recognition Model for Jointly Supervised and Unsupervised Learning". Applied Sciences 13, no. 9 (22.04.2023): 5239. http://dx.doi.org/10.3390/app13095239.

Abstract:
In recent years, the end-to-end speech recognition model has emerged as a popular alternative to the traditional Deep Neural Network–Hidden Markov Model (DNN-HMM). This approach maps acoustic features directly onto text sequences via a single network architecture, significantly streamlining the model construction process. However, the training of end-to-end speech recognition models typically necessitates a significant quantity of supervised data to achieve good performance, which poses a challenge in low-resource conditions. The use of unsupervised representation significantly reduces this necessity. Recent research has focused on end-to-end techniques employing joint Connectionist Temporal Classification (CTC) and attention mechanisms, with some also concentrating on unsupervised representation learning. This paper proposes a joint supervised and unsupervised multi-task learning model (JSUM). Our approach leverages the unsupervised pre-trained wav2vec 2.0 model as a shared encoder that integrates the joint CTC-Attention network and the generative adversarial network into a unified end-to-end architecture. Our method provides a new low-resource language speech recognition solution that optimally utilizes supervised and unsupervised datasets by combining CTC, attention, and generative adversarial losses. Furthermore, our proposed approach is suitable for both monolingual and cross-lingual scenarios.
31

Foresti, Giovanni. "La costruzione del "terzo orecchio" Ascolto psicoanalitico e setting interno dell'analista in una prospettiva storica". GRUPPI, no. 3 (June 2009): 11–26. http://dx.doi.org/10.3280/gru2008-003002.

Abstract:
The contemporary, multi-lingual psychoanalytic culture is a complicated and pluralistic mix, which may prompt high levels of theoretical and technical confusion. The paper describes some of the experiences, made during the first decade of a psychoanalyst's work, which have contributed to the construction of the Author's way of listening and working. The so-called "third ear" will be discussed here as a metaphor alluding to the psychoanalyst's working models (combinations of theories and technique) and in particular to her/his internal setting. The approach chosen to describe the latter will be historical in two different perspectives: conceptual and personal. The historical understanding of the psychoanalytic traditions is useful in balancing different theoretical perspectives and in avoiding the often complementary phenomena of getting lost on one hand, and of becoming closed and fanatic on the other. The second focus will be on the personal and never complete work that has to be done in order to assimilate/elaborate other people's ideas: the patients' thoughts and phantasies, the teachers' and supervisors' perspectives, and the colleagues' and peers' views and remarks.
Key words: psychoanalytic training, internal setting, group's dynamics, third ear, gamma function, rêverie.
32

Goldman, Omer, and Reut Tsarfaty. "Morphology Without Borders: Clause-Level Morphology". Transactions of the Association for Computational Linguistics 10 (2022): 1455–72. http://dx.doi.org/10.1162/tacl_a_00528.

Abstract:
Morphological tasks use large multi-lingual datasets that organize words into inflection tables, which then serve as training and evaluation data for various tasks. However, a closer inspection of these data reveals profound cross-linguistic inconsistencies, which arise from the lack of a clear linguistic and operational definition of what is a word, and which severely impair the universality of the derived tasks. To overcome this deficiency, we propose to view morphology as a clause-level phenomenon, rather than word-level. It is anchored in a fixed yet inclusive set of features that encapsulates all functions realized in a saturated clause. We deliver MightyMorph, a novel dataset for clause-level morphology covering 4 typologically different languages: English, German, Turkish, and Hebrew. We use this dataset to derive 3 clause-level morphological tasks: inflection, reinflection and analysis. Our experiments show that the clause-level tasks are substantially harder than the respective word-level tasks, while having comparable complexity across languages. Furthermore, redefining morphology to the clause-level provides a neat interface with contextualized language models (LMs) and allows assessing the morphological knowledge encoded in these models and their usability for morphological tasks. Taken together, this work opens up new horizons in the study of computational morphology, leaving ample space for studying neural morphology cross-linguistically.
33

Park, Jangkyoung, Ammar Ul Hassan, and Jaeyoung Choi. "CCFont: Component-Based Chinese Font Generation Model Using Generative Adversarial Networks (GANs)". Applied Sciences 12, no. 16 (10.08.2022): 8005. http://dx.doi.org/10.3390/app12168005.

Abstract:
Font generation using deep learning has made considerable progress using image style transfer, but the automatic conversion/generation of Chinese characters still remains a difficult task owing to the complex character shape and large number of Chinese characters. Most known Chinese character generation models use the image conversion method of the Chinese character shape itself; however, it is difficult to reproduce complex Chinese characters. Recent methods have utilized character compositionality by separating up to three or four components to improve the quality of generated characters, but it is still difficult to generate high-quality results for complex Chinese characters with many components. In this study, we proposed the CCFont model (component-based Chinese font generation model using generative adversarial networks (GANs)) that automatically generates all Chinese characters using Chinese character components (up to 17 components). The CCFont model generates all Chinese characters in various styles using the components of Chinese characters based on conditional GAN. By acquiring local style information from the components, the information is more accurate and there is less information loss than when global information is obtained from the image of the entire character, reducing the failure of style conversion and improving quality to produce high-quality results. Additionally, the CCFont model generates high-quality results without any additional training (zero-shot font generation without any additional training) for the first-seen characters and styles. For example, the CCFont model, which was trained with only traditional Chinese (TC) characters, generates high-quality results for languages that can be divided into components, such as Korean and Thai, as well as simplified Chinese (SC) characters that are only seen during inference. 
CCFont can be adopted as a multi-lingual font-generation model that can be applied to all languages, which can be divided into components. To the best of our knowledge, the proposed method is the first to generate a zero-shot multilingual generation model using components. Qualitative and quantitative experiments were conducted to demonstrate the effectiveness of the proposed method.
34

Andresel, Medina, Sergiu Gordea, Srdjan Stevanetic, and Mina Schütz. "An Approach for Curating Collections of Historical Documents with the Use of Topic Detection Technologies". International Journal of Digital Curation 17, no. 1 (20.09.2022): 12. http://dx.doi.org/10.2218/ijdc.v17i1.819.

Abstract:
Digital curation of materials available in large online repositories is required to enable the reuse of Cultural Heritage resources in specific activities like education or scientific research. The digitization of such valuable objects is an important task for making them accessible through digital platforms such as Europeana; therefore, ensuring the success of transcription campaigns via the Transcribathon platform is highly important for this goal. Based on impact assessment results, people are more engaged in the transcription process if the content is more oriented to specific themes, such as the First World War. Currently, efforts to group related documents into thematic collections are in general hand-crafted, and due to the large ingestion of new material they are difficult to maintain and update. The current solutions based on text retrieval are not able to support the discovery of related content, since the existing collections are multi-lingual and contain heterogeneous items like postcards, letters, journals, photographs etc. Technological advances in natural language understanding and in data management have led to the automation of document categorization via automatic topic detection. To use existing topic detection technologies on Europeana collections, there are several challenges to be addressed: (1) ensure representative and qualitative training data, (2) ensure the quality of the learned topics, and (3) provide efficient and scalable solutions for searching related content based on the automatically detected topics, and for suggesting the most relevant topics on new items. This paper describes each of these challenges and the proposed solutions in more detail, thus offering a novel perspective on how digital curation practices can be enhanced with the help of machine learning technologies.
35

Hermann, K. G., M. Protopopov, A. Serfaty, I. Hmamouchi, F. Sommerfleck, F. Macori, K. Ziegeler, T. Diekhoff, D. Poddubnyy, and J. Sieper. "POS1460 CONTRIBUTING TO THE TRAINING OF IMAGING IN RHEUMATOLOGY BY EXPERTS WORLDWIDE VIA INTERACTIVE MOBILE E-TEACHING: BERLINCASEVIEWER." Annals of the Rheumatic Diseases 81, Suppl 1 (23.05.2022): 1075.1–1075. http://dx.doi.org/10.1136/annrheumdis-2022-eular.4885.

Abstract:
Background: Rheumatology education today can be very diverse, and you can find everything from structured textbooks to YouTube channels to social media accounts. Peer-reviewed content is still recognized as a very high-quality source of information. App-based content has the advantage of bundling information in one place, always available on the go. However, the majority of offerings are only available in English.
Objectives: An app was to be created to learn about imaging in rheumatology in a very easy to understand way in different languages, with experts being able to create translated content very easily.
Methods: Using MySQL, Java, Objective-C and JavaScript, a case database with specific structure and numerous interactive elements was created for academic teaching. Special functions for the annotation of images were provided. The development was initially for devices with the iOS operating system, and later for Android. Rheumatologists and radiologists worldwide were invited to participate via the social media channels LinkedIn, Instagram, Facebook, Twitter, and TikTok.
Results: The app, called BerlinCaseViewer, was developed for smartphones, tablets and Mac computers. All information is entered and processed in a web-based database. Using XML files and ZIP archives, the relevant data is then transferred to the mobile apps. Case of the month and learning modules on rheumatoid arthritis, psoriatic arthritis, and axial spondyloarthritis are available, many in English, Spanish, French, Italian, Portuguese, German, and other languages (Figure 1). In addition to the medical image data, the patient’s medical history is also presented in an exciting way with the help of multiple-choice questions. The diagnosis becomes visible only when all questions are answered. Timeline functions can be used to visualize medical courses as well. Colored overlays are used to annotate images and can be placed with pixel precision. The user can decide whether these should be displayed as aids. Content is peer-reviewed before publication.
Figure 1. Multi-lingual presentation of medical training cases.
Conclusion: BerlinCaseViewer is a new approach not only to train medical professionals, but also to connect colleagues and overcome language barriers. As a platform, BerlinCaseViewer is open to all medical professionals to collaborate, whether to contribute their own cases or translate existing cases for use in the local language.
References: [1] BerlinCaseViewer home page: https://www.berlincaseviewer.de/
Disclosure of Interests: Kay-Geert Hermann Shareholder of: Co-founder of BerlinFlame GmbH, Mikhail Protopopov: None declared, Aline Serfaty: None declared, Ihsane Hmamouchi: None declared, Fernando Sommerfleck: None declared, Fabio Macori: None declared, Katharina Ziegeler: None declared, Torsten Diekhoff: None declared, Denis Poddubnyy Shareholder of: Co-founder of BerlinFlame GmbH, Joachim Sieper Shareholder of: Co-founder of BerlinFlame GmbH
36

Kornfeld, Shaun, Emily Kalambaheti, and Matthew Michael Antonucci. "Rehabilitation of Persistent Neurocognitive Deficits Following Sports-Related Concussion in an Amateur Football Athlete: Case Study". Neurology 98, no. 1 Supplement 1 (27.12.2021): S25.2–S25. http://dx.doi.org/10.1212/01.wnl.0000801976.20078.26.

Abstract:
Objective: Demonstrate neurocognitive improvements in an inactive, amateur football athlete following a functional neurology approach to multimodal neurorehabilitation.
Background: American Football has been reported to have one of the highest incidences of concussion in all contact sports. Given the high rate of concussive blows during play, the investigation of treatment modalities is warranted. This case study presents a 23-year-old male amateur football player who has sustained 3 diagnosed concussions with additional suspected concussions throughout his time participating in football. In addition, his symptoms persisted years after ceasing participation in all contact sports.
Design/Methods: The athlete was prescribed 10 treatment sessions over 5 consecutive days at an outpatient neurorehabilitation center specializing in functional neurology. The C3Logix neurocognitive assessment and Graded Symptom Checklist were utilized on intake and discharge. Multimodal treatment interventions included transcranial photobiomodulation, non-invasive neuromodulation of the lingual branch of the trigeminal nerve, neuromuscular reeducation of the limbs bilaterally, hand-eye coordination training, vestibular rehabilitation utilizing a three-axis whole-body off-axis rotational device, and cognitive training.
Results: On intake, composite symptom score was reported as 10/162, Trail Making Test Part A was 20.8 seconds, Part B was 41.9 seconds, Digit Symbol Matching score was 53, Simple Reaction Time was 277 milliseconds, and Choice Reaction Time was 412 milliseconds. On discharge, the patient experienced a 70% reduction in self-reported symptoms, Trails A improved to 14.8 seconds (+29%), Trails B improved to 30.3 seconds (+28%), Simple Reaction Time was 248 milliseconds (10% faster), and Choice Reaction Time was 340 milliseconds (17% faster).
Conclusions: The present case study demonstrates a meaningful improvement in symptoms and neurocognitive performance of a patient with multiple sports-related concussions. Therefore, the authors suggest further investigation into a functional neurology approach to multi-modal, intensive care to improve neurocognitive impairment in athletes who sustained concussions participating in football.
37

Zehra, Wisha, Abdul Rehman Javed, Zunera Jalil, Habib Ullah Khan and Thippa Reddy Gadekallu. "Cross corpus multi-lingual speech emotion recognition using ensemble learning". Complex & Intelligent Systems, 11.01.2021. http://dx.doi.org/10.1007/s40747-020-00250-4.

Abstract:
Receiving an accurate emotional response from robots has been a challenging task for researchers for the past few years. With the advancements in technology, robots like service robots interact with users of different cultural and lingual backgrounds. The traditional approach towards speech emotion recognition cannot be utilized to enable the robot to give an efficient and emotional response. The conventional approach towards speech emotion recognition uses the same corpus for both training and testing of classifiers to detect accurate emotions, but this approach cannot be generalized for multi-lingual environments, which is a requirement for robots used by people all across the globe. In this paper, a series of experiments is conducted to highlight an ensemble learning effect using a majority voting technique for a cross-corpus, multi-lingual speech emotion recognition system. A comparison of the performance of an ensemble learning approach against traditional machine learning algorithms is performed. This study tests the performance of a classifier trained on one corpus with data from another corpus to evaluate its efficiency for multi-lingual emotion detection. According to experimental analysis, different classifiers give the highest accuracy for different corpora. Using an ensemble learning approach gives the benefit of combining the effect of all classifiers instead of choosing one classifier and compromising accuracy on a particular language corpus. Experiments show an increased accuracy of 13% for the Urdu corpus, 8% for the German corpus, 11% for the Italian corpus, and 5% for the English corpus in within-corpus testing. For cross-corpus experiments, an improvement of 2% when training on Urdu data and testing on German data and 15% when training on Urdu data and testing on Italian data is achieved.
An increase of 7% in accuracy is obtained when testing on Urdu data and training on German data, 3% when testing on Urdu data and training on Italian data, and 5% when testing on Urdu data and training on English data. Experiments show that the ensemble learning approach gives promising results against other state-of-the-art techniques.
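The majority voting technique mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `majority_vote`, the emotion labels, and the tie-breaking rule (the label seen first wins) are assumptions made purely for illustration.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-classifier label predictions by hard majority voting.

    predictions[i][j] is the label that classifier i assigns to utterance j.
    Ties go to the label encountered first (an assumption of this sketch).
    """
    n_items = len(predictions[0])
    voted = []
    for j in range(n_items):
        # Collect every classifier's vote for utterance j.
        votes = [clf_preds[j] for clf_preds in predictions]
        # most_common(1) returns the label with the highest count;
        # equal counts keep first-encountered order.
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


# Hypothetical outputs of three classifiers on three utterances.
clf_outputs = [
    ["angry", "happy", "sad"],
    ["angry", "sad", "sad"],
    ["happy", "happy", "sad"],
]
print(majority_vote(clf_outputs))
```

Each utterance receives the label chosen by most classifiers, so a single classifier that is weak on one corpus can be outvoted by the others.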
38

Cai, Zexin, Yaogen Yang and Ming Li. "Cross-lingual multi-speaker speech synthesis with limited bilingual training data". Computer Speech & Language, July 2022, 101427. http://dx.doi.org/10.1016/j.csl.2022.101427.

39

Khan, Amjad. "Improved multi-lingual sentiment analysis and recognition using deep learning". Journal of Information Science, 12.01.2023, 016555152211372. http://dx.doi.org/10.1177/01655515221137270.

Abstract:
Speech emotion recognition (SER) is still an open problem in the natural language processing domain, since accuracy remains below target. It matters mainly because real-time applications such as human–robot interaction, human behaviour evaluation and virtual reality rely heavily on SER. Moreover, cross-lingual SER plays a significant role in practical applications, especially when users of different cultural and linguistic backgrounds interact with the system. However, the existing conventional approaches to SER cannot be employed for real-world applications because they use the same corpus for training and testing, which cannot be used in multi-lingual environments to detect or classify real emotions. In such a situation, the performance of SER is degraded. Therefore, the proposed work develops cross-lingual emotion recognition across Urdu, Italian, English and German. The features are extracted through the most widely employed audio feature, MFCCs (Mel Frequency Cepstral Coefficients). Experimental results showed that the proposed deep learning model achieves promising results on the URDU data set, with 91.25% accuracy using random forest (RF) and XGBoost classifiers.
40

Kivaisi, Alexander R., Qingjie Zhao and Jimmy T. Mbelwa. "Swahili Speech Dataset Development and Improved Pre-Training Method for Spoken Digit Recognition". ACM Transactions on Asian and Low-Resource Language Information Processing, 20.05.2023. http://dx.doi.org/10.1145/3597494.

Abstract:
A speech dataset is an essential component in building commercial speech applications. However, low-resource languages such as Swahili lack such a resource, which is vital for spoken digit recognition. For languages where such resources exist, they are usually insufficient. Thus, pre-training methods have been used with external resources to improve continuous speech recognition. However, to the best of our knowledge, no study has investigated the effect of pre-training methods specifically for spoken digit recognition. This study aimed at addressing these problems. First, we developed a Swahili spoken digit dataset for Swahili spoken digit recognition. Then, we investigated the effect of cross-lingual and multi-lingual pre-training methods on spoken digit recognition. Finally, we proposed an effective language-independent pre-training method for spoken digit recognition. The proposed method has the advantage of incorporating target language data during the pre-training stage, which leads to an optimal solution when using less training data. Experiments on Swahili (being developed), English, and Gujarati datasets show that our method achieves better performance compared with all the baselines listed in this study.
41

Zhang, Xinyu, Kelechi Ogueji, Xueguang Ma and Jimmy Lin. "Towards Best Practices for Training Multilingual Dense Retrieval Models". ACM Transactions on Information Systems, 12.08.2023. http://dx.doi.org/10.1145/3613447.

Abstract:
Dense retrieval models using a transformer-based bi-encoder architecture have emerged as an active area of research. In this paper, we focus on the task of monolingual retrieval in a variety of typologically diverse languages using such an architecture. Although recent work with multilingual transformers demonstrates that they exhibit strong cross-lingual generalization capabilities, there remain many open research questions, which we tackle here. Our study is organized as a “best practices” guide for training multilingual dense retrieval models, broken down into three main scenarios: when a multilingual transformer is available, but training data in the form of relevance judgments are not available in the language and domain of interest (“have model, no data”); when both models and training data are available (“have model and data”); and when training data are available but models are not (“have data, no model”). In considering these scenarios, we gain a better understanding of the role of multi-stage fine-tuning, the strength of cross-lingual transfer under various conditions, the usefulness of out-of-language data, and the advantages of multilingual vs. monolingual transformers. Our recommendations offer a guide for practitioners building search applications, particularly for low-resource languages, and while our work leaves open a number of research questions, we provide a solid foundation for future work.
42

Byambadorj, Zolzaya, Ryota Nishimura, Altangerel Ayush, Kengo Ohta and Norihide Kitaoka. "Text-to-speech system for low-resource language using cross-lingual transfer learning and data augmentation". EURASIP Journal on Audio, Speech, and Music Processing 2021, no. 1 (December 2021). http://dx.doi.org/10.1186/s13636-021-00225-4.

Abstract:
Deep learning techniques are currently being applied in automated text-to-speech (TTS) systems, resulting in significant improvements in performance. However, these methods require large amounts of text-speech paired data for model training, and collecting this data is costly. Therefore, in this paper, we propose a single-speaker TTS system containing both a spectrogram prediction network and a neural vocoder for the target language, using only 30 min of target language text-speech paired data for training. We evaluate three approaches for training the spectrogram prediction models of our TTS system, which produce mel-spectrograms from the input phoneme sequence: (1) cross-lingual transfer learning, (2) data augmentation, and (3) a combination of the previous two methods. In the cross-lingual transfer learning method, we used two high-resource language datasets, English (24 h) and Japanese (10 h). We also used 30 min of target language data for training in all three approaches, and for generating the augmented data used for training in methods 2 and 3. We found that using both cross-lingual transfer learning and augmented data during training resulted in the most natural synthesized target speech output. We also compare single-speaker and multi-speaker training methods, using sequential and simultaneous training, respectively. The multi-speaker models were found to be more effective for constructing a single-speaker, low-resource TTS model. In addition, we trained two Parallel WaveGAN (PWG) neural vocoders, one using 13 h of our augmented data with 30 min of target language data and one using the entire 12 h of the original target language dataset. Our subjective AB preference test indicated that the neural vocoder trained with augmented data achieved almost the same perceived speech quality as the vocoder trained with the entire target language dataset.
Overall, we found that our proposed TTS system consisting of a spectrogram prediction network and a PWG neural vocoder was able to achieve reasonable performance using only 30 min of target language training data. We also found that by using 3 h of target language data, for training the model and for generating augmented data, our proposed TTS model was able to achieve performance very similar to that of the baseline model, which was trained with 12 h of target language data.
43

Yang, Lily Wei Yun, Wei Yan Ng, Xiaofeng Lei, Shaun Chern Yuan Tan, Zhaoran Wang, Ming Yan, Mohan Kashyap Pargi et al. "Development and testing of a multi-lingual Natural Language Processing-based deep learning system in 10 languages for COVID-19 pandemic crisis: A multi-center study". Frontiers in Public Health 11 (13.02.2023). http://dx.doi.org/10.3389/fpubh.2023.1063466.

Abstract:
Purpose: The COVID-19 pandemic has drastically disrupted global healthcare systems. With the higher demand for healthcare and misinformation related to COVID-19, there is a need to explore alternative models to improve communication. Artificial Intelligence (AI) and Natural Language Processing (NLP) have emerged as promising solutions to improve healthcare delivery. Chatbots could fill a pivotal role in the dissemination and easy accessibility of accurate information in a pandemic. In this study, we developed a multi-lingual NLP-based AI chatbot, DR-COVID, which responds accurately to open-ended, COVID-19 related questions. This was used to facilitate pandemic education and healthcare delivery. Methods: First, we developed DR-COVID with an ensemble NLP model on the Telegram platform (https://t.me/drcovid_nlp_chatbot). Second, we evaluated various performance metrics. Third, we evaluated multi-lingual text-to-text translation to Chinese, Malay, Tamil, Filipino, Thai, Japanese, French, Spanish, and Portuguese. We utilized 2,728 training questions and 821 test questions in English. Primary outcome measurements were (A) overall and top 3 accuracies; (B) Area Under the Curve (AUC), precision, recall, and F1 score. Overall accuracy referred to a correct response for the top answer, whereas top 3 accuracy referred to an appropriate response for any one answer amongst the top 3 answers. AUC and its related metrics were obtained from the Receiver Operating Characteristic (ROC) curve. Secondary outcomes were (A) multi-lingual accuracy; (B) comparison to enterprise-grade chatbot systems. The sharing of training and testing datasets on an open-source platform will also contribute to existing data. Results: Our NLP model, utilizing the ensemble architecture, achieved overall and top 3 accuracies of 0.838 [95% confidence interval (CI): 0.826–0.851] and 0.922 [95% CI: 0.913–0.932] respectively.
For overall and top 3 results, AUC scores of 0.917 [95% CI: 0.911–0.925] and 0.960 [95% CI: 0.955–0.964] were achieved respectively. We achieved multi-linguicism with nine non-English languages, with Portuguese performing the best overall at 0.900. Lastly, DR-COVID generated answers more accurately and quickly than other chatbots, within 1.12–2.15 s across three devices tested. Conclusion: DR-COVID is a clinically effective NLP-based conversational AI chatbot, and a promising solution for healthcare delivery in the pandemic era.
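The distinction this abstract draws between overall (top answer) accuracy and top 3 accuracy can be sketched as follows. This is an illustrative sketch only, not the DR-COVID evaluation code: the function name `top_k_accuracy` and the toy answer identifiers are hypothetical.

```python
from typing import List, Sequence


def top_k_accuracy(ranked_answers: Sequence[Sequence[str]],
                   gold: Sequence[str], k: int) -> float:
    """Fraction of questions whose gold answer is among the first k ranked answers.

    ranked_answers[i] lists candidate answers for question i, best first.
    k=1 corresponds to overall accuracy; k=3 to top 3 accuracy.
    """
    hits = sum(1 for ranked, g in zip(ranked_answers, gold) if g in ranked[:k])
    return hits / len(gold)


# Hypothetical ranked outputs for four test questions whose correct answer is "a".
ranked = [
    ["a", "b", "c", "d"],
    ["b", "a", "c", "d"],
    ["d", "c", "b", "a"],
    ["c", "d", "a", "b"],
]
gold = ["a", "a", "a", "a"]

print(top_k_accuracy(ranked, gold, k=1))  # overall accuracy
print(top_k_accuracy(ranked, gold, k=3))  # top 3 accuracy
```

Top 3 accuracy can never be lower than overall accuracy, because any question answered correctly at rank 1 also counts as a hit within the first three ranks.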
44

Demiroglu, Cenk, Aslı Beşirli, Yasin Ozkanca and Selime Çelik. "Depression-level assessment from multi-lingual conversational speech data using acoustic and text features". EURASIP Journal on Audio, Speech, and Music Processing 2020, no. 1 (17.11.2020). http://dx.doi.org/10.1186/s13636-020-00182-4.

Abstract:
Depression is a widespread mental health problem around the world with a significant burden on economies. Its early diagnosis and treatment are critical to reduce the costs and even save lives. One key aspect to achieve that goal is to use technology and monitor depression remotely and relatively inexpensively using automated agents. There have been numerous efforts to automatically assess depression levels using audiovisual features as well as text-analysis of conversational speech transcriptions. However, difficulty in data collection and the limited amounts of data available for research present challenges that are hampering the success of the algorithms. One of the two novel contributions in this paper is to exploit databases from multiple languages for acoustic feature selection. Since a large number of features can be extracted from speech, given the small amounts of training data available, effective data selection is critical for success. Our proposed multi-lingual method was effective at selecting better features than the baseline algorithms, which significantly improved the depression assessment accuracy. The second contribution of the paper is to extract text-based features for depression assessment and use a novel algorithm to fuse the text- and speech-based classifiers, which further boosted the performance.
45

Grooms, Dustin R., Jed A. Diekfuss, Alexis B. Slutsky-Ganesh, Christopher A. DiCesare, Scott Bonnette, Michael A. Riley, Adam W. Kiefer et al. "Preliminary Report on the Train the Brain Project: Neuroplasticity of Augmented Neuromuscular Training and Improved Injury Risk Biomechanics - Part II". Journal of Athletic Training, 10.03.2022. http://dx.doi.org/10.4085/1062-6050-0548.21.

Abstract:
Context: Neuromuscular training (NMT) facilitates the acquisition of new movement patterns that reduce ACL injury risk; however, the neural mechanisms underlying these changes are unknown. Objective: Determine the relationship between brain activation and biomechanical changes following NMT with biofeedback. Study Design: Controlled Laboratory Study. Setting: Research laboratory. Participants: Final analyses included twenty high school female soccer athletes (15.7±0.95 years; 1.68±0.05 m; 59.91±5.62 kg). Main Outcome Measures: Ten participants completed 6 weeks of NMT augmented with real-time biofeedback (aNMT) to reduce knee injury risk movements, and 10 participants completed no training. aNMT was implemented with visual biofeedback that responded in real-time to injury-risk biomechanical variables. A drop vertical jump with 3D motion capture was used to assess injury risk neuromuscular changes before and after the six-week intervention. Pre to post brain activation changes were measured using functional magnetic resonance imaging (fMRI) during unilateral knee and multi-joint motor tasks. Results: Following aNMT, sensory (precuneus), visual-spatial (lingual gyrus), and motor planning (pre-motor) brain activity increased for knee specific movement and sensorimotor cortex activity for multi-joint movement decreased. Knee abduction moment during landing also decreased (4.66±5.45 Nm; p=0.02; g=0.82) in the aNMT group with no change in the control group (p>0.05). The training-induced increased brain activity for isolated knee movement was associated with decreases in knee abduction moment (r=.67, p=.036) and sensorimotor cortex activity for multi-joint movement (r=.87, p=.001). No significant change in brain activity was observed in the control group (p>0.05).
Conclusions: The relationship between neural changes observed across tasks and reduced knee abduction suggests that aNMT facilitates recruitment of sensory integration centers to support reduced injury risk mechanics and improve sensorimotor neural efficiency for multi-joint control. Further research is warranted to determine if this training related multimodal neuroplasticity enhances neuromuscular control during more complex sport-specific activities.
46

Poon, Zhimin, Esther Cui Wei Lee, Li Ping Ang and Ngiap Chuan Tan. "Experiences of primary care physicians managing postpartum care: a qualitative research study". BMC Family Practice 22, no. 1 (30.06.2021). http://dx.doi.org/10.1186/s12875-021-01494-w.

Abstract:
Background: The postpartum period is redefined as 12 weeks following childbirth. Primary care physicians (PCP) often manage postpartum women in the community after uneventful childbirths. Postpartum care significantly impacts maternal and neonatal physical and mental health. However, evidence has revealed unmet needs in postpartum maternal care. Aim: The study aimed to explore the experiences of PCPs in managing postpartum mothers. Methods: Four focus group discussions and eleven in-depth interviews with twenty-nine PCPs were conducted in this qualitative research study in urban Singapore. PCPs of both genders and variable postgraduate training background were purposively enrolled. Audited transcripts were independently coded by two investigators. Thematic content analysis was performed using the codes to identify issues in the “clinician”, “mother”, “postpartum care” and “healthcare system & policy” domains stipulated in “The Generalists’ Wheel of Knowledge, Understanding and Inquiry” framework. Findings: PCPs’ personal attributes such as gender and knowledge influenced their postpartum care delivery. Prior training, child caring experience and access to resource materials contributed to their information mastery of postpartum care. Their professional relationship with local multi-ethnic and multi-lingual Asian mothers was impacted by their mutual communication, language compatibility and understanding of local confinement practices. Consultation time constraint, awareness of community postnatal services and inadequate handover of care from the specialists hindered PCPs in the healthcare system. Discussion: Personal, maternal and healthcare system barriers currently prevent PCPs from delivering optimal postpartum care. Conclusion: Interventions to overcome the barriers to improve postpartum care will likely be multi-faceted across the domains discussed.
47

Ren, Xingzhang, Baosong Yang, Dayiheng Liu, Haibo Zhang, Xiaoyu Lv, Liang Yao and Jun Xie. "Effective Approaches to Neural Query Language Identification". Computational Linguistics, 18.07.2022, 1–22. http://dx.doi.org/10.1162/coli_a_00451.

Abstract:
Query language identification (Q-LID) plays a crucial role in cross-lingual search engines. There exist two main challenges in Q-LID: 1) insufficient contextual information in queries for disambiguation; and 2) the lack of query-style training examples for low-resource languages. In this paper, we propose a neural Q-LID model that alleviates the above problems from both model architecture and data augmentation perspectives. Concretely, we build our model upon the advanced Transformer model. In order to enhance the discrimination of queries, a variety of external features, e.g. character, word as well as script, are fed into the model and fused by a multi-scale attention mechanism. Moreover, to remedy the low-resource challenge in this task, a novel machine-translation-based strategy is proposed to automatically generate synthetic query-style data for low-resource languages. We contribute the first Q-LID test set, called QID-21, which consists of search queries in 21 languages. Experimental results reveal that our model yields better classification accuracy than strong baselines and existing LID systems on both query and traditional LID tasks.
48

Mi, Chenggang. "Improving the Robustness of Loanword Identification in Social Media Texts". ACM Transactions on Asian and Low-Resource Language Information Processing, 23.11.2022. http://dx.doi.org/10.1145/3572773.

Abstract:
As a potential bilingual resource, loanwords play a very important role in many natural language processing tasks. If loanwords in a low-resource language can be identified effectively, the generated donor-receipt word pairs will benefit many cross-lingual NLP tasks. However, most studies on loanword identification mainly focus on formal texts such as news and government documents. Loanword identification in social media texts is still an under-studied field. Since it faces many challenges and can be widely used in several downstream tasks, more efforts should be put on loanword identification in social media texts. In this study, we present a multi-task learning architecture with deep bi-directional RNNs for loanword identification in social media texts, where different task supervision can happen at different layers. The multi-task neural network architecture learns higher order feature representations from word and character sequences along with basic spell error checking (SEC), part-of-speech (POS) tagging and named entity recognition (NER) information. Experimental results on Uyghur loanword identification in social media texts in five donor languages (Chinese, Arabic, Russian, Turkish, and Farsi) show that our method achieves the best performance compared with several strong baseline systems. We also combine the loanword detection results into the training data of neural machine translation for low-resource language pairs. Experiments show that models trained on the extended datasets achieve significant improvements compared with the baseline models in all language pairs.
49

Banar, Nikolay, Walter Daelemans and Mike Kestemont. "Transfer Learning for the Visual Arts: The Multi-Modal Retrieval of Iconclass Codes". Journal on Computing and Cultural Heritage, 17.03.2023. http://dx.doi.org/10.1145/3575865.

Abstract:
Iconclass is an iconographic thesaurus which is widely used in the digital heritage domain to describe subjects depicted in artworks. Each subject is assigned a unique descriptive code, which has a corresponding textual definition. The assignment of Iconclass codes is a challenging task for computational systems, due to the large number of available labels in comparison to the limited amount of training data available. Transfer learning has become a common strategy to overcome such a data shortage. In deep learning, transfer learning consists in fine-tuning the weights of a deep neural network for a downstream task. In this work, we present a deep retrieval framework which can be fully fine-tuned for the task under consideration. Our work is based on a recent approach to this task, which already yielded state-of-the-art performance, although it could not be fully fine-tuned yet. This approach exploits the multi-linguality and multi-modality that is inherent to digital heritage data. Our framework jointly processes multiple input modalities, namely, textual and visual features. We extract the textual features from the artwork titles in multiple languages, whereas the visual features are derived from photographic reproductions of the artworks. The definitions of the Iconclass codes, containing useful textual information, are used as target labels instead of the codes themselves. As our main contribution, we demonstrate that our approach outperforms the state-of-the-art by a large margin. In addition, our approach is superior to the M3P feature extractor and outperforms the multi-lingual CLIP in most experiments due to the better quality of the visual features. Our out-of-domain and zero-shot experiments show poor results and demonstrate that the Iconclass retrieval remains a challenging task. We make our source code and models publicly available to support heritage institutions in the further enrichment of their digital collections.
50

Liu, Boxiang, and Liang Huang. "ParaMed: a parallel corpus for English–Chinese translation in the biomedical domain". BMC Medical Informatics and Decision Making 21, no. 1 (6.09.2021). http://dx.doi.org/10.1186/s12911-021-01621-8.

Abstract:
Background: Biomedical language translation requires multi-lingual fluency as well as relevant domain knowledge. Such requirements make it challenging to train qualified translators and costly to generate high-quality translations. Machine translation represents an effective alternative, but accurate machine translation requires large amounts of in-domain data. While such datasets are abundant in general domains, they are less accessible in the biomedical domain. Chinese and English are two of the most widely spoken languages, yet to our knowledge, a parallel corpus does not exist for this language pair in the biomedical domain. Description: We developed an effective pipeline to acquire and process an English-Chinese parallel corpus from the New England Journal of Medicine (NEJM). This corpus consists of about 100,000 sentence pairs and 3,000,000 tokens on each side. We showed that training on out-of-domain data and fine-tuning with as few as 4000 NEJM sentence pairs improve translation quality by 25.3 (13.4) BLEU for en→zh (zh→en) directions. Translation quality continues to improve at a slower pace on larger in-domain data subsets, with a total increase of 33.0 (24.3) BLEU for en→zh (zh→en) directions on the full dataset. Conclusions: The code and data are available at https://github.com/boxiangliu/ParaMed.