Journal articles on the topic "Cross lingual information retrieval"

A list of the top 50 scholarly journal articles on the topic "Cross lingual information retrieval".


1

Capstick, Joanne, Abdel Kader Diagne, Gregor Erbach, Hans Uszkoreit, Anne Leisenberg, and Manfred Leisenberg. "A system for supporting cross-lingual information retrieval". Information Processing & Management 36, no. 2 (March 2000): 275–89. http://dx.doi.org/10.1016/s0306-4573(99)00058-8.
2

Zuliarso, Eri, Retantyo Wardoyo, Sri Hartati, and Khabib Mustofa. "Indonesian-english cross-lingual legal ontology for information retrieval". International Journal of Web & Semantic Technology 6, no. 4 (October 30, 2015): 1–10. http://dx.doi.org/10.5121/ijwest.2015.6401.
3

Gupta, Suneet Kumar, Amit Sinha, and Mradul Jain. "Cross Lingual Information Retrieval With SMT And Query Mining". Advanced Computing: An International Journal 2, no. 5 (September 30, 2011): 33–39. http://dx.doi.org/10.5121/acij.2011.2504.
4

Saad, Farag, and Andreas Nürnberger. "Overview of prior-art cross-lingual information retrieval approaches". World Patent Information 34, no. 4 (December 2012): 304–14. http://dx.doi.org/10.1016/j.wpi.2012.08.013.
5

Sorg, P., and P. Cimiano. "Exploiting Wikipedia for cross-lingual and multilingual information retrieval". Data & Knowledge Engineering 74 (April 2012): 26–45. http://dx.doi.org/10.1016/j.datak.2012.02.003.
6

Feng, Kai, Lan Huang, Hao Xu, Kangping Wang, Wei Wei, and Rui Zhang. "Deep Multilabel Multilingual Document Learning for Cross-Lingual Document Retrieval". Entropy 24, no. 7 (July 7, 2022): 943. http://dx.doi.org/10.3390/e24070943.

Abstract:
Cross-lingual document retrieval, which aims to take a query in one language to retrieve relevant documents in another, has attracted strong research interest in the last decades. Most studies on this task start with cross-lingual comparisons at the word level and then represent documents via word embeddings, which leads to insufficient structure information. In this work, the cross-lingual comparison at the document level is achieved through the cross-lingual semantic space. Our method, MDL (deep multilabel multilingual document learning), leverages a six-layer fully connected network to project cross-lingual documents into a shared semantic space. The semantic distances can be calculated when the cross-lingual documents are transformed into embeddings in semantic space. The supervision signals are automatically extracted from the data and then used to construct the semantic space via a linear classifier. The ambiguity of manual labels could be avoided and the multilabel supervision signals can be acquired instead of a single label. The representation of the semantic space is enriched by multilabel supervision signals, which improves the discriminative ability of the embeddings. The MDL is easy to extend to other fields since it does not depend on specific data. Furthermore, MDL is more efficient than the models training all languages jointly, since each language is trained individually. Experiments on Wikipedia data showed that the proposed method outperforms the state-of-the-art cross-lingual document retrieval methods.
7

Jena, Gouranga Charan, and Siddharth Swarup Rautaray. "A comprehensive survey on cross-language information retrieval system". Indonesian Journal of Electrical Engineering and Computer Science 14, no. 1 (April 1, 2019): 127. http://dx.doi.org/10.11591/ijeecs.v14.i1.pp127-134.

Abstract:
Cross-language information retrieval (CLIR) is a retrieval process in which the user issues queries in one language to retrieve information in another language. The diversity of information and language barriers are serious issues for communication and cultural exchange across the world. To overcome such barriers, cross-language information retrieval systems are nowadays in strong demand. CLIR is a subfield of Information Retrieval (IR). Information Retrieval deals with finding useful information in a large collection of unstructured, structured, and semi-structured data in response to a user query, where the query is a set of keywords. Information Retrieval can be classified into different classes such as monolingual, bilingual, multilingual, and cross-language information retrieval. This paper focuses on the various IR variants and techniques used in CLIR systems. Further, based on the available literature, a number of challenges and issues in CLIR have been identified and discussed. It gives an overview of the advantages, limitations, and tools available in CLIR research. It also describes new application areas of CLIR such as medicine, multimedia, and question answering systems. There is a need to explore and build more specialized information systems that enable speakers of the Odia language to discover valuable information beyond linguistic and cultural barriers. This study is aimed at building, in the future, an experimental CLIR system between an under-resourced language (Odia) and one of the most commonly used online languages (English).
8

Ghanbari, Elham, and Azadeh Shakery. "Query-dependent learning to rank for cross-lingual information retrieval". Knowledge and Information Systems 59, no. 3 (July 4, 2018): 711–43. http://dx.doi.org/10.1007/s10115-018-1232-8.
9

Li, Juntao, Chang Liu, Jian Wang, Lidong Bing, Hongsong Li, Xiaozhong Liu, Dongyan Zhao, and Rui Yan. "Cross-Lingual Low-Resource Set-to-Description Retrieval for Global E-Commerce". Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 8212–19. http://dx.doi.org/10.1609/aaai.v34i05.6335.

Abstract:
With the prosperity of cross-border e-commerce, there is an urgent demand for designing intelligent approaches for assisting e-commerce sellers to offer local products for consumers from all over the world. In this paper, we explore a new task of cross-lingual information retrieval, i.e., cross-lingual set-to-description retrieval in cross-border e-commerce, which involves matching product attribute sets in the source language with persuasive product descriptions in the target language. We manually collect a new and high-quality paired dataset, where each pair contains an unordered product attribute set in the source language and an informative product description in the target language. As the dataset construction process is both time-consuming and costly, the new dataset comprises only 13.5k pairs, which is a low-resource setting and can be viewed as a challenging testbed for model development and evaluation in cross-border e-commerce. To tackle this cross-lingual set-to-description retrieval task, we propose a novel cross-lingual matching network (CLMN) with the enhancement of context-dependent cross-lingual mapping upon the pre-trained monolingual BERT representations. Experimental results indicate that our proposed CLMN yields impressive results on the challenging task and the context-dependent cross-lingual mapping on BERT yields noticeable improvement over the pre-trained multi-lingual BERT model.
10

Lu, Chao, Chengzhi Zhang, and Daqing He. "Comparative analysis of book tags: a cross-lingual perspective". Electronic Library 34, no. 4 (August 1, 2016): 666–82. http://dx.doi.org/10.1108/el-03-2015-0042.

Abstract:
Purpose: In the era of social media, users all over the world annotate books with social tags to express their preferences and interests. The purpose of this paper is to explore different tagging behaviours by analysing the book tags in different languages. Design/methodology/approach: This investigation collected nearly 56,000 tags of 1,200 books from one Chinese and two English online bookmarking systems; it combined content analysis and machine-processing methods to evaluate the similarities and differences between different tagging systems from a cross-lingual perspective. Jaccard’s coefficient was adopted to evaluate the similarity level. Findings: The results show that the similarity between mono-lingual tags of the same books is higher than that of cross-lingual tags in different systems and the similarity between tags of books written for specialties is higher than that of books written for the general public. Research limitations/implications: Those who have more in common annotate books with more similar tags. The similarity between users in tagging systems determines the similarity of the tag sets. Practical implications: The results and conclusion of this study will benefit users’ cross-lingual information retrieval and cross-lingual book recommendation for online bookmarking systems. Originality/value: This study may be one of the first to compare cross-lingual tags. Its methodology can be applied to tag comparison between any two languages. The insights of this study will help develop cross-lingual tagging systems and improve information retrieval.
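
The Jaccard coefficient named in this abstract can be illustrated with a minimal Python sketch. The tag sets below are hypothetical and are not taken from the study; the function only shows the standard definition (size of the intersection divided by size of the union), and returning 0.0 for two empty sets is a convention assumed here.

```python
def jaccard(tags_a, tags_b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two tag collections."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # assumed convention for two empty sets
    return len(a & b) / len(a | b)


# Hypothetical tags for the same book in two bookmarking systems.
system_one = {"fantasy", "magic", "adventure", "classic"}
system_two = {"fantasy", "adventure", "children", "series"}

similarity = jaccard(system_one, system_two)
print(f"Jaccard similarity: {similarity:.3f}")
```

For cross-lingual tags, the two sets would first have to be mapped into one language (for example by translation) before such a comparison makes sense; that step is outside this sketch.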
11

L, Divija, C. G. Hanisha Reddy, Rayudu Srishti, and Dr Surabhi Narayan. "Kannada to English Agricultural Cross-Lingual Retrieval: Enhancing Knowledge Access in Farming Practices". International Journal for Research in Applied Science and Engineering Technology 11, no. 9 (September 30, 2023): 1672–77. http://dx.doi.org/10.22214/ijraset.2023.55879.

Abstract:
This paper presents the development and implementation of a specialized cross-lingual information retrieval system tailored for agriculture-related queries in the Kannada language. The primary objective of the system is to facilitate accurate translation of Kannada queries into English, the target language, and to retrieve relevant documents containing vital agricultural information. The proposed system addresses key challenges including incorporating effective query preprocessing techniques, designing an efficient document retrieval mechanism, and establishing optimal data indexing strategies. Strategies have also been introduced to mitigate the challenges posed by the diversity of Kannada dialects. By overcoming the language barrier, this system enables seamless and effective knowledge dissemination in the agriculture domain.
12

Kumar, Aarti, and Sujoy Das. "Dealing with Relevance Ranking in Cross-Lingual Cross-Script Text Reuse". International Journal of Information Retrieval Research 6, no. 1 (January 2016): 16–35. http://dx.doi.org/10.4018/ijirr.2016010102.

Abstract:
Proliferation of multilingual content on the web has paved the way for text reuse to become cross-lingual and also cross-script. Identifying cross-language text reuse becomes tougher if one considers cross-script, less-resourced languages. This paper focuses on identifying text reuse between English-Hindi news articles and improving their relevance ranking using two phases: (i) a heuristic retrieval phase for reducing the search space and (ii) a post-processing phase for improving the relevance ranking. A dictionary-based strategy of Cross-Language Information Retrieval is used for heuristic retrieval, and the Parse Feature Vector Model (PFVS) is proposed for post-processing to improve the relevance ranking. The application of this model has been successful in tackling the obfuscation problems of synonymy, hyponymy, hypernymy, antonymy, sentence addition/deletion, and word inflection. Instead of using traditional approaches, Parse Feature Vectors have been explored to detect the reused documents, and to the best of the authors' knowledge this is a novel contribution with regard to these two languages.
13

Chayapathi A. R., et al. "CLOUD BASED MULTI-LANGUAGE INDEXING USING CROSS LINGUAL INFORMATION RETRIEVAL APPROACHES". INFORMATION TECHNOLOGY IN INDUSTRY 9, no. 1 (March 18, 2021): 1283–93. http://dx.doi.org/10.17762/itii.v9i1.269.

Abstract:
The exponential growth of data created by digital media (video, audio, images), physical simulations, scientific instruments, and web authoring has coincided with growing interest in cloud computing. The options for distributing and parallelizing information in clouds make retrieval and storage very complicated, especially for real-time data management. The number of web users accessing data over the Internet is steadily increasing, and an enormous amount of data on the Internet is available in various languages and can be accessed by anyone at any time. Information Retrieval (IR) deals with finding useful information in a large collection of unstructured, structured, and semi-structured data. At present, the diversity of data and language barriers are serious challenges for communication and cultural exchange across the world. To overcome such barriers, cross-language information retrieval (CLIR) systems are in strong demand. Query Expansion (QE) is the process of adding related and relevant terms to the original query to enhance its indexing ability and improve the relevance of the retrieved documents in CLIR. In this research work, QE is investigated for Hindi-English and Kannada-English CLIR, in which Hindi and Kannada queries are used to search English documents. After the query is translated, the retrieved results are ranked using Okapi BM25 so that the most relevant documents appear at the top. We propose an architecture for Hindi-English and Kannada-English CLIR that uses QE to improve the relevance of the retrieved documents. In the first experiment, QE is performed with and without Okapi BM25 ranking. The results show that the relevance of the retrieved documents is higher with Okapi BM25 than without ranking. The work clearly demonstrates that the performance of Hindi-English and Kannada-English CLIR systems can be improved significantly by query expansion using appropriate terms placed at suitable positions, and that the retrieved snippets can serve well as a continuous test collection.
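
The Okapi BM25 ranking step mentioned in this abstract can be sketched in a few lines of Python. This is not the authors' implementation: the documents, the query, the parameter values `k1=1.5` and `b=0.75` (commonly used defaults), and the non-negative IDF variant are all assumptions made for illustration, and the query is assumed to have already been translated into English.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per tokenized document (non-empty list)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant (assumed here).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical English documents and an already-translated query.
docs = [
    "rice crop disease control".split(),
    "weather forecast for the coming week".split(),
    "organic control of rice pests in paddy fields".split(),
]
query = "rice disease".split()

scores = bm25_scores(query, docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Query expansion, as described in the abstract, would add further related terms to `query` before scoring; how those terms are chosen is not specified here.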
14

Zhang, Fuwei, Zhao Zhang, Xiang Ao, Dehong Gao, Fuzhen Zhuang, Yi Wei, and Qing He. "Mind the Gap: Cross-Lingual Information Retrieval with Hierarchical Knowledge Enhancement". Proceedings of the AAAI Conference on Artificial Intelligence 36, no. 4 (June 28, 2022): 4345–53. http://dx.doi.org/10.1609/aaai.v36i4.20355.

Abstract:
Cross-Lingual Information Retrieval (CLIR) aims to rank the documents written in a language different from the user’s query. The intrinsic gap between different languages is an essential challenge for CLIR. In this paper, we introduce the multilingual knowledge graph (KG) to the CLIR task due to the sufficient information of entities in multiple languages. It is regarded as a “silver bullet” to simultaneously perform explicit alignment between queries and documents and also broaden the representations of queries. And we propose a model named CLIR with HIerarchical Knowledge Enhancement (HIKE) for our task. The proposed model encodes the textual information in queries, documents and the KG with multilingual BERT, and incorporates the KG information in the query-document matching process with a hierarchical information fusion mechanism. Particularly, HIKE first integrates the entities and their neighborhood in KG into query representations with a knowledge-level fusion, then combines the knowledge from both source and target languages to further mitigate the linguistic gap with a language-level fusion. Finally, experimental results demonstrate that HIKE achieves substantial improvements over state-of-the-art competitors.
15

Kishida, Kazuaki, Kuang-hua Chen, Sukhoon Lee, Hsin-Hsi Chen, Noriko Kando, Kazuko Kuriyama, Sung Hyon Myaeng, and Koji Eguchi. "Cross-lingual information retrieval (CLIR) task at the NTCIR workshop 3". ACM SIGIR Forum 38, no. 1 (July 2004): 17–20. http://dx.doi.org/10.1145/986278.986281.
16

Asthana, Amit, and Sanjay K. Dwivedi. "Exploring Snippets as a Dataset to Overcome Challenges in CLIR". ITM Web of Conferences 54 (2023): 01012. http://dx.doi.org/10.1051/itmconf/20235401012.

Abstract:
Cross-lingual information retrieval (CLIR) is a challenging task that requires overcoming linguistic barriers to match user queries with relevant documents in different languages. One of the major challenges in CLIR is the lack of parallel corpora, which hinders the development of effective translation models. This challenge can be addressed using snippets as a dataset to train CLIR models. Snippets can be automatically extracted from various sources, such as search engine result pages and can provide a rich and diverse set of collections for cross-lingual information retrieval. This paper initially discusses the challenges in CLIR and then explores the use of snippets as a dataset which can lead towards the development or improvements in the techniques to improve the retrieval effectiveness and further discusses the advantages and limitations of using snippets dataset in CLIR.
17

Chi, H. V. T., D. L. Anh, N. L. Thanh, and D. Dinh. "English-Vietnamese Cross-Lingual Paraphrase Identification Using MT-DNN". Engineering, Technology & Applied Science Research 11, no. 5 (October 12, 2021): 7598–604. http://dx.doi.org/10.48084/etasr.4300.

Abstract:
Paraphrase identification is a crucial task in natural language understanding, especially in cross-language information retrieval. Nowadays, Multi-Task Deep Neural Network (MT-DNN) has become a state-of-the-art method that brings outstanding results in paraphrase identification [1]. In this paper, we propose a method based on MT-DNN [2] to detect similarities between English and Vietnamese sentences. We changed the shared layers of the original MT-DNN from the original BERT [3] to other pre-trained multi-language models such as M-BERT [3] or XLM-R [4] so that our model could work on cross-language (in our case, English and Vietnamese) information retrieval. We also added some tasks as improvements to gain better results. As a result, we gained 2.3% and 2.5% increase in evaluated accuracy and F1. The proposed method was also implemented on other language pairs such as English – German and English – French. With those implementations, we got a 1.0%/0.7% improvement for English – German and a 0.7%/0.5% increase for English – French.
18

Chau, Rowena, and Chung-Hsing Yeh. "A multilingual text mining approach to web cross-lingual text retrieval". Knowledge-Based Systems 17, no. 5-6 (August 2004): 219–27. http://dx.doi.org/10.1016/j.knosys.2004.04.001.
19

Pachpande, Suhas D., and Parag U. Bhalchandra. "Cross Language Information Retrieval based on Automatic Query Translation for Marathi Documents". International Journal for Research in Applied Science and Engineering Technology 11, no. 9 (September 30, 2023): 394–400. http://dx.doi.org/10.22214/ijraset.2023.55658.

Abstract:
The current research article explores the realm of Cross Language Information Retrieval (CLIR) and its significance in the digital age. It addresses the challenges faced in CLIR, including lexical and semantic disparities, the scarcity of parallel corpora, cultural nuances, and more. The article discusses innovative solutions encompassing Machine Translation, Query Expansion, Cross-Lingual Word Embeddings, and Multilingual Information Retrieval Models to enhance CLIR's effectiveness. Furthermore, it sheds light on Information Retrieval Models, such as the Boolean Model, Vector Space Model (VSM), and Probabilistic Models, explaining their principles and applications. The study also presents experimental results highlighting the limitations of monolingual IR models and the effectiveness of cross-lingual techniques, such as translation and query expansion, in improving CLIR, making it a valuable tool for accessing information across languages.
20

Schulz, Stefan, and Udo Hahn. "Morpheme-based, cross-lingual indexing for medical document retrieval". International Journal of Medical Informatics 58-59 (September 2000): 87–99. http://dx.doi.org/10.1016/s1386-5056(00)00078-2.
21

Zhou, Dong, Séamus Lawless, Xuan Wu, Wenyu Zhao, and Jianxun Liu. "A study of user profile representation for personalized cross-language information retrieval". Aslib Journal of Information Management 68, no. 4 (July 18, 2016): 448–77. http://dx.doi.org/10.1108/ajim-06-2015-0091.

Abstract:
Purpose – With an increase in the amount of multilingual content on the World Wide Web, users are often striving to access information provided in a language of which they are non-native speakers. The purpose of this paper is to present a comprehensive study of user profile representation techniques and investigate their use in personalized cross-language information retrieval (CLIR) systems through the means of personalized query expansion. Design/methodology/approach – The user profiles consist of weighted terms computed by using frequency-based methods such as tf-idf and BM25, as well as various latent semantic models trained on monolingual documents and cross-lingual comparable documents. This paper also proposes an automatic evaluation method for comparing various user profile generation techniques and query expansion methods. Findings – Experimental results suggest that latent semantic-weighted user profile representation techniques are superior to frequency-based methods, and are particularly suitable for users with a sufficient amount of historical data. The study also confirmed that user profiles represented by latent semantic models trained on a cross-lingual level gained better performance than the models trained on a monolingual level. Originality/value – Previous studies on personalized information retrieval systems have primarily investigated user profiles and personalization strategies on a monolingual level. The effect of utilizing such monolingual profiles for personalized CLIR remains unclear. The current study fills the gap by a comprehensive study of user profile representation for personalized CLIR and a novel personalized CLIR evaluation methodology to ensure repeatable and controlled experiments can be conducted.
22

Shakery, Azadeh, and ChengXiang Zhai. "Leveraging comparable corpora for cross-lingual information retrieval in resource-lean language pairs". Information Retrieval 16, no. 1 (April 27, 2012): 1–29. http://dx.doi.org/10.1007/s10791-012-9194-z.
23

Ehsan, Nava, and Azadeh Shakery. "Candidate document retrieval for cross-lingual plagiarism detection using two-level proximity information". Information Processing & Management 52, no. 6 (November 2016): 1004–17. http://dx.doi.org/10.1016/j.ipm.2016.04.006.
24

Du, Lin, Yibo Zhang, Le Sun, and Yufang Sun. "The application of the comparable corpora in Chinese-English Cross-Lingual Information Retrieval". Journal of Computer Science and Technology 16, no. 4 (July 2001): 351–58. http://dx.doi.org/10.1007/bf02948983.
25

Martín-Valdivia, M. T., F. Martínez-Santiago, and L. A. Ureña-López. "Merging Strategy for Cross-Lingual Information Retrieval Systems based on Learning Vector Quantization". Neural Processing Letters 22, no. 2 (October 2005): 149–61. http://dx.doi.org/10.1007/s11063-005-2659-y.
26

Lam, Wai, Ki Chan, Dragomir Radev, Horacio Saggion, and Simone Teufel. "Context-based generic cross-lingual retrieval of documents and automated summaries". Journal of the American Society for Information Science and Technology 56, no. 2 (2004): 129–39. http://dx.doi.org/10.1002/asi.20104.
27

Doncel, Víctor Rodríguez, and Elena Montiel Ponsoda. "LYNX: Towards a Legal Knowledge Graph for Multilingual Europe". Law in Context. A Socio-legal Journal 37, no. 1 (December 20, 2020): 175–78. http://dx.doi.org/10.26826/law-in-context.v37i1.129.

Abstract:
Lynx is an innovation project in Europe whose objective is to develop services for legal compliance. A legal knowledge graph is built over multilingual, multijurisdictional documents using semantic web technologies. A collection of services implementing natural language techniques enables better legal information retrieval, cross-lingual answering of questions and information discovery. Three use cases are discussed, as well as the overall impact of the project.
28

Wumaier, Aishan, Cuiyun Xu, Zaokere Kadeer, Wenqi Liu, Yingbo Wang, Xireaili Haierla, Maihemuti Maimaiti, ShengWei Tian, and Alimu Saimaiti. "A Neural-Network-Based Approach to Chinese–Uyghur Organization Name Translation". Information 11, no. 10 (October 21, 2020): 492. http://dx.doi.org/10.3390/info11100492.

Abstract:
The recognition and translation of organization names (ONs) is challenging due to the complex structures and high variability involved. ONs consist not only of common generic words but also names, rare words, abbreviations and business and industry jargon. ONs are a sub-class of named entity (NE) phrases, which convey key information in text. As such, the correct translation of ONs is critical for machine translation and cross-lingual information retrieval. The existing Chinese–Uyghur neural machine translation systems have performed poorly when applied to ON translation tasks. As there are no publicly available Chinese–Uyghur ON translation corpora, an ON translation corpus is developed here, which includes 191,641 ON translation pairs. A word segmentation approach involving characterization, tagged characterization, byte pair encoding (BPE) and syllabification is proposed here for ON translation tasks. A recurrent neural network (RNN) attention framework and transformer are adapted here for ON translation tasks with different sequence granularities. The experimental results indicate that the transformer model not only outperforms the RNN attention model but also benefits from the proposed word segmentation approach. In addition, a Chinese–Uyghur ON translation system is developed here to automatically generate new translation pairs. This work significantly improves Chinese–Uyghur ON translation and can be applied to improve Chinese–Uyghur machine translation and cross-lingual information retrieval. It can also easily be extended to other agglutinative languages.
29

Novak, Erik, Luka Bizjak, Dunja Mladenić, and Marko Grobelnik. "Why is a document relevant? Understanding the relevance scores in cross-lingual document retrieval". Knowledge-Based Systems 244 (May 2022): 108545. http://dx.doi.org/10.1016/j.knosys.2022.108545.
30

Sharma, Vijay, and Namita Mittal. "Refined stop-words and morphological variants solutions applied to Hindi-English cross-lingual information retrieval". Journal of Intelligent & Fuzzy Systems 36, no. 3 (March 26, 2019): 2219–27. http://dx.doi.org/10.3233/jifs-169933.
31

Zhang, Ying, Phil Vines, and Justin Zobel. "Chinese OOV translation and post-translation query expansion in Chinese–English cross-lingual information retrieval". ACM Transactions on Asian Language Information Processing 4, no. 2 (June 2005): 57–77. http://dx.doi.org/10.1145/1105696.1105697.
32

Mishra, Dr Rudra Prasad. "Transliteration: A Magnetic Analysis". International Journal for Research in Applied Science and Engineering Technology 9, no. 11 (November 30, 2021): 85–86. http://dx.doi.org/10.22214/ijraset.2021.38742.

Abstract:
Machine transliteration is an important problem in an increasingly multilingual world as it plays a critical role in many downstream applications such as machine translation or cross-lingual information retrieval systems. There is now a vast amount of information accessible via the Internet where a lot of regional and cultural information is put on the World Wide Web in different languages and scripts. There are more than six thousand living languages in the world. Adding to the diversity is the fact that some languages are written in different scripts in different regions of the world. The multitude of foreign languages and mutually incomprehensible scripts of the same language pose a barrier to information exchange as we cannot all learn every language or script in use worldwide. Therefore, if we can get around the language barrier or at least the script barrier, we can access much more of the world's culture and can explore its abundant richness.
Keywords: Transliteration, Translation, Cross-lingual, Multilingual, Language, Script
33

Costa-jussà, Marta R., Srinivas Bangalore, Patrik Lambert, Lluís Màrquez i Elena Montiel-Ponsoda. "Introduction to the Special Issue on Cross-Language Algorithms and Applications". Journal of Artificial Intelligence Research 55 (12.01.2016): 1–15. http://dx.doi.org/10.1613/jair.5022.

Pełny tekst źródła
Streszczenie:
With the increasingly global nature of our everyday interactions, the need for multilin- gual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross- language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading re- search in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing.
34

Azarbonyad, Hosein, Azadeh Shakery and Heshaam Faili. "A learning to rank approach for cross-language information retrieval exploiting multiple translation resources". Natural Language Engineering 25, no. 3 (5.03.2019): 363–84. http://dx.doi.org/10.1017/s1351324919000032.

Abstract:
Cross-language information retrieval (CLIR), finding information in one language in response to queries expressed in another language, has attracted much attention due to the explosive growth of multilingual information on the World Wide Web. One important issue in CLIR is how to apply monolingual information retrieval (IR) methods in cross-lingual environments. Recently, the learning to rank (LTR) approach has been successfully employed in different IR tasks. In this paper, we use LTR for CLIR. In order to adapt monolingual LTR techniques to CLIR and overcome the barrier of language difference, we map monolingual IR features to CLIR ones using translation information extracted from different translation resources. The performance of CLIR is highly dependent on the size and quality of available bilingual resources. Effective use of available resources is especially important in low-resource language pairs. In this paper, we further propose an LTR-based method for combining translation resources in CLIR. We have studied the effectiveness of the proposed approach using different translation resources. Our results also show that LTR can be used to successfully combine different translation resources to improve CLIR performance. In the best scenario, the LTR-based combination method improves the performance of the single-resource-based CLIR method by 6% in terms of Mean Average Precision.
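The result in entry 34 is stated in terms of Mean Average Precision (MAP). As a reminder of what that metric measures, here is a minimal Python sketch of the standard MAP definition. The function names and the toy document ids are illustrative assumptions and are not taken from the paper.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked_doc_ids: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list against the set of relevant ids."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant doc appears.
            precision_sum += hits / rank
    # Relevant documents that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of the per-query average precision values."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


# Toy example with two hypothetical queries.
toy_runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d5", "d6"], {"d6"}),                    # AP = 1/2
]
toy_map = mean_average_precision(toy_runs)
```

A relative gain such as the reported 6% would then be computed as the ratio of two MAP values obtained on the same query set.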
35

Khwileh, Ahmad, Debasis Ganguly and Gareth J. F. Jones. "Utilisation of Metadata Fields and Query Expansion in Cross-Lingual Search of User-Generated Internet Video". Journal of Artificial Intelligence Research 55 (27.01.2016): 249–81. http://dx.doi.org/10.1613/jair.4775.

Abstract:
Recent years have seen significant efforts in the area of Cross Language Information Retrieval (CLIR) for text retrieval. This work initially focused on formally published content, but more recently research has begun to concentrate on CLIR for informal social media content. However, despite the current expansion in online multimedia archives, there has been little work on CLIR for this content. While there has been some limited work on Cross-Language Video Retrieval (CLVR) for professional videos, such as documentaries or TV news broadcasts, there has, to date, been no significant investigation of CLVR for the rapidly growing archives of informal user-generated content (UGC). Key differences between such UGC and professionally produced content are the nature and structure of the textual UGC metadata associated with it, as well as the form and quality of the content itself. In this setting, retrieval effectiveness may suffer not only from translation errors common to all CLIR tasks, but also from recognition errors associated with the automatic speech recognition (ASR) systems used to transcribe the spoken content of the video, and from the informality and inconsistency of the associated user-created metadata for each video. This work proposes and evaluates techniques to improve CLIR effectiveness for such noisy UGC content. Our experimental investigation shows that different sources of evidence, e.g. the content from different fields of the structured metadata, significantly affect CLIR effectiveness. Results from our experiments also show that each metadata field has a varying robustness to query expansion (QE) and hence can have a negative impact on CLIR effectiveness. Our work proposes a novel adaptive QE technique that predicts the most reliable source for expansion and shows how this technique can be effective for improving CLIR effectiveness for UGC content.
36

Taghizadeh, Nasrin, and Hesham Faili. "Automatic Wordnet Development for Low-Resource Languages using Cross-Lingual WSD". Journal of Artificial Intelligence Research 56 (20.05.2016): 61–87. http://dx.doi.org/10.1613/jair.4968.

Abstract:
Wordnets are an effective resource for natural language processing and information retrieval, especially for semantic processing and meaning related tasks. So far, wordnets have been constructed for many languages. However, the automatic development of wordnets for low-resource languages has not been well studied. In this paper, an Expectation-Maximization algorithm is used to create high quality and large scale wordnets for poor-resource languages. The proposed method benefits from possessing cross-lingual word sense disambiguation and develops a wordnet by only using a bi-lingual dictionary and a mono-lingual corpus. The proposed method has been executed with Persian language and the resulting wordnet has been evaluated through several experiments. The results show that the induced wordnet has a precision score of 90% and a recall score of 35%.
37

Zeng, Jing, and Chung-hong Chan. "A cross-national diagnosis of infodemics: comparing the topical and temporal features of misinformation around COVID-19 in China, India, the US, Germany and France". Online Information Review 45, no. 4 (15.01.2021): 709–28. http://dx.doi.org/10.1108/oir-09-2020-0417.

Abstract:
Purpose: This study empirically investigates how the COVID-infodemic manifests differently in different languages and in different countries. This paper focuses on the topical and temporal features of misinformation related to COVID-19 in five countries. Design/methodology/approach: COVID-related misinformation was retrieved from 4,487 fact-checked articles. A novel approach to conducting cross-lingual topic extraction was applied. The rectr algorithm, empowered by aligned word-embedding, was utilised. To examine how the COVID-infodemic interplays with the pandemic, a time series analysis was used to construct and compare their temporal development. Findings: The cross-lingual topic model findings reveal the topical characteristics of each country. On an aggregated level, health misinformation represents only a small portion of the COVID-infodemic. The time series results indicate that, for most countries, the infodemic curve fluctuates with the epidemic curve. In this study, this form of infodemic is referred to as “point-source infodemic”. The second type of infodemic is continuous infodemic, which is seen in India and the United States (US). In those two countries, the infodemic is predominantly caused by political misinformation; its temporal distribution appears to be largely unrelated to the epidemic development. Originality/value: Despite the growing attention given to misinformation research, existing scholarship is dominated by single-country or mono-lingual research. This study takes a cross-national and cross-lingual comparative approach to investigate the problem of online misinformation. This paper demonstrates how the technological barrier of cross-lingual topic analysis can be overcome with aligned word-embedding algorithms. Peer review: The peer review history for this article is available at: https://publons.com/publon/10.1108/OIR-09-2020-0417
38

Choi, YooChan. "Korean to English Patent Automatic Translation (K2E-PAT) and cross lingual retrieval on KIPRIS". World Patent Information 31, no. 2 (June 2009): 135–36. http://dx.doi.org/10.1016/j.wpi.2008.09.005.

39

Hao Yang. "Multilingual Information Retrieval Using Graph Neural Networks: Practical Applications in English Translation". Journal of Electrical Systems 20, no. 6s (29.04.2024): 1729–39. http://dx.doi.org/10.52783/jes.3091.

Abstract:
Multilingual information retrieval using graph neural networks offers practical applications in English translation by leveraging advanced computational models to enhance the efficiency and accuracy of cross-lingual search and translation tasks. By representing textual data as graphs and utilizing graph neural networks (GNNs), this approach captures intricate relationships between words and phrases across different languages, enabling more effective language understanding and translation. GNNs can learn complex linguistic structures and semantic similarities from multilingual corpora, facilitating the development of more robust translation systems that are capable of handling diverse language pairs and domains. The paper introduces a novel approach termed the Multilingual Ant Bee Optimization Graph Neural Network (MABO-GNN) for addressing optimization, classification, and multilingual translation tasks. MABO-GNN integrates ant bee optimization algorithms with graph neural networks to provide a versatile framework capable of optimizing objective functions, improving classification accuracy iteratively, and facilitating high-quality translations across multiple languages. Through comprehensive experimentation, the efficacy of MABO-GNN is demonstrated across various tasks, languages, and datasets. In optimization experiments, MABO-GNN achieves objective function values of 0.012, 0.015, 0.011, and 0.013 in Experiment 1, Experiment 2, Experiment 3, and Experiment 4, respectively, with convergence times ranging from 90 to 150 seconds. In classification tasks, the model exhibits notable performance improvements over iterations, with BLEU scores reaching 0.84 and METEOR scores reaching 0.78 in the fifth iteration. The translation results showcase BLEU scores of 0.85 for English, 0.82 for French, 0.79 for German, 0.81 for Spanish, and 0.75 for Chinese, indicating the model's proficiency in generating high-quality translations across diverse languages.
40

Huang, Yanan, and Yuji Miao. "Selection of the Most Relevant Online English Semantic Art Translation in Cross-Lingual Information Retrieval based on Speech Signal Analysis Model". International Journal of Arts and Technology 13, no. 3 (2021): 1. http://dx.doi.org/10.1504/ijart.2021.10043418.

41

Miao, Yuji, and Yanan Huang. "Selection of the most relevant online English semantic art translation in cross-lingual information retrieval based on speech signal analysis model". International Journal of Arts and Technology 13, no. 3 (2021): 200. http://dx.doi.org/10.1504/ijart.2021.120761.

42

Tamchyna, Aleš, Ondřej Dušek, Rudolf Rosa and Pavel Pecina. "MTMonkey: A Scalable Infrastructure for a Machine Translation Web Service". Prague Bulletin of Mathematical Linguistics 100, no. 1 (1.10.2013): 31–40. http://dx.doi.org/10.2478/pralin-2013-0009.

Abstract:
We present a web service which handles and distributes JSON-encoded HTTP requests for machine translation (MT) among multiple machines running an MT system, including text pre- and post-processing. It is currently used to provide MT between several languages for cross-lingual information retrieval in the EU FP7 Khresmoi project. The software consists of an application server and remote workers which handle text processing and communicate translation requests to MT systems. The communication between the application server and the workers is based on the XML-RPC protocol. We present the overall design of the software and test results which document speed and scalability of our solution. Our software is licensed under the Apache 2.0 licence and is available for download from the Lindat-Clarin repository and GitHub.
43

Mi, Chenggang, Shaolin Zhu and Rui Nie. "Improving Loanword Identification in Low-Resource Language with Data Augmentation and Multiple Feature Fusion". Computational Intelligence and Neuroscience 2021 (8.04.2021): 1–9. http://dx.doi.org/10.1155/2021/9975078.

Abstract:
Loanword identification has been studied in recent years to alleviate data sparseness in several natural language processing (NLP) tasks, such as machine translation and cross-lingual information retrieval. However, recent studies on this topic usually focus on high-resource languages (such as Chinese, English, and Russian); for low-resource languages, such as Uyghur and Mongolian, loanword identification tends to perform worse because of limited resources and a lack of annotated data. To overcome this problem, we first propose a lexical constraint-based data augmentation method to generate training data for low-resource language loanword identification; then, a loanword identification model based on a log-linear RNN is introduced to improve the performance of low-resource loanword identification by incorporating features such as word-level embeddings, character-level embeddings, pronunciation similarity, and part-of-speech (POS) into one model. Experimental results on loanword identification in Uyghur (in this study, we mainly focus on Arabic, Chinese, Russian, and Turkish loanwords in Uyghur) showed that our proposed method achieves the best performance compared with several strong baseline systems.
44

Fattah, Mohamed Abdel, Fuji Ren and Shingo Kuroiwa. "Sentence Alignment Using Feed Forward Neural Network". International Journal of Neural Systems 16, no. 06 (December 2006): 423–34. http://dx.doi.org/10.1142/s0129065706000822.

Abstract:
Parallel corpora have become an essential resource for work in multilingual natural language processing. However, sentence-aligned parallel corpora are more efficient than non-aligned parallel corpora for cross-language information retrieval and machine translation applications. In this paper, we present a new approach to aligning sentences in bilingual parallel corpora based on a feed-forward neural network classifier. A feature parameter vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. A set of manually prepared training data was used to train the feed-forward neural network. Another set of data was used for testing. Using this new approach, we achieved an error reduction of 60% over the length-based approach when applied to English–Arabic parallel documents. Moreover, this new approach is valid for any language pair and is quite flexible, since the feature parameter vector may contain more, fewer, or different features than those used in our system, such as a lexical match feature.
45

B N V Narasimha Raju, et al. "Bidirectional LSTMs with Byte Pair Encoding in NMT for CLIR using English and Telugu Parallel Corpus". International Journal on Recent and Innovation Trends in Computing and Communication 11, no. 9 (30.10.2023): 483–89. http://dx.doi.org/10.17762/ijritcc.v11i9.8832.

Abstract:
Neural Machine Translation (NMT) is crucial for Cross-Lingual Information Retrieval (CLIR). NMT is effective in translating English-language queries into Telugu. In this paper, we translate English queries into Telugu. NMT uses a parallel corpus for translation. Because Telugu is a resource-poor language, it is very difficult to supply NMT with a large parallel corpus, so NMT suffers from the Out Of Vocabulary (OOV) problem. To overcome this problem, Byte Pair Encoding (BPE) is used along with Long Short Term Memory (LSTM); BPE segments rare words into sub-words so that they can be translated. The system still faces problems such as Named Entity Recognition (NER). Some NER problems can be solved by using bidirectional LSTMs in sequence-to-sequence models. Bidirectional LSTMs (BiLSTMs) help by training the system in both directions to recognize named entities. The accuracy parameters and the BLEU score show that the translation quality of NMT with bidirectional LSTMs is slightly higher than that of regular LSTMs, a difference the authors describe as considerable.
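Entry 45 relies on Byte Pair Encoding to split rare words into sub-words. The following is a minimal Python sketch of the generic BPE merge-learning procedure on a toy word-frequency table. It is not the authors' implementation: the names `learn_bpe`, `merge_pair` and `segment`, the `</w>` end-of-word marker, and the toy English vocabulary are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def merge_pair(symbols: Tuple[str, ...], pair: Pair) -> Tuple[str, ...]:
    """Replace every adjacent occurrence of `pair` with the joined symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(word_freqs: Dict[str, int], num_merges: int) -> List[Pair]:
    """Learn up to `num_merges` merges from a word -> frequency table."""
    # Each word starts as its characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges: List[Pair] = []
    for _ in range(num_merges):
        pair_counts: Counter = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        # Greedily merge the most frequent adjacent pair.
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        vocab = {merge_pair(symbols, best): freq for symbols, freq in vocab.items()}
    return merges


def segment(word: str, merges: List[Pair]) -> List[str]:
    """Segment a (possibly unseen) word by replaying the learned merges in order."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Toy training vocabulary (hypothetical frequencies).
toy_freqs = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
toy_merges = learn_bpe(toy_freqs, num_merges=3)
# "lowest" never occurs in the toy data, but it still decomposes into known units.
print(segment("lowest", toy_merges))  # -> ['l', 'o', 'w', 'est</w>']
```

The unseen word is still representable because it falls back to characters and learned sub-words, which is the property that makes BPE useful against out-of-vocabulary terms in low-resource settings.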
46

K. V. V. Satyanarayana, M. S. V. S. Bhadri Raju, B. N. V. Narasimha Raju. "BiLSTMs and BPE for English to Telugu CLIR". Journal of Electrical Systems 20, no. 3s (4.04.2024): 2022–29. http://dx.doi.org/10.52783/jes.1798.

Abstract:
A crucial component of Cross Lingual Information Retrieval (CLIR) is Neural Machine Translation (NMT). NMT does a good job of translating queries in English into Indian languages. This study focuses on the translation of English queries into Telugu. For translations, the NMT will make use of a parallel corpus. Due to a lack of resources in the Telugu language, it is exceedingly challenging to provide NMT with sizable parallel corpora. Thus, the NMT will encounter an issue known as Out of Vocabulary (OOV). Long Short-Term Memory (LSTM) with Byte Pair Encoding (BPE), which breaks up rare words into subwords and attempts to translate them, is used to solve the OOV issue. Issues such as Named Entity Recognition (NER) continue to plague it. In sequence-to-sequence models, bidirectional LSTMs can solve certain NER challenges. Systems that need to be trained in both directions to recognize named entities can benefit from the use of Bidirectional LSTMs (BiLSTMs). The translation efficiency of NMT with BiLSTMs is significantly higher than that of normal LSTMs, as indicated by the accuracy metrics and Bilingual Evaluation Understudy (BLEU) score.
47

Mahany, Ahmed, Heba Khaled, Nouh Sabri Elmitwally, Naif Aljohani and Said Ghoniemy. "Negation and Speculation in NLP: A Survey, Corpora, Methods, and Applications". Applied Sciences 12, no. 10 (21.05.2022): 5209. http://dx.doi.org/10.3390/app12105209.

Abstract:
Negation and speculation are universal linguistic phenomena that affect the performance of Natural Language Processing (NLP) applications, such as those for opinion mining and information retrieval, especially in biomedical data. In this article, we review the corpora annotated with negation and speculation in various natural languages and domains. Furthermore, we discuss the ongoing research into recent rule-based, supervised, and transfer learning techniques for the detection of negating and speculative content. Many English corpora for various domains are now annotated with negation and speculation; moreover, the availability of annotated corpora in other languages has started to increase. However, this growth is insufficient to address these important phenomena in languages with limited resources. The use of cross-lingual models and translation of the well-known languages are acceptable alternatives. We also highlight the lack of consistent annotation guidelines and the shortcomings of the existing techniques, and suggest alternatives that may speed up progress in this research direction. Adding more syntactic features may alleviate the limitations of the existing techniques, such as cue ambiguity and detecting the discontinuous scopes. In some NLP applications, inclusion of a system that is negation- and speculation-aware improves performance, yet this aspect is still not addressed or considered an essential step.
48

Malik, Dr Pankaj, Anshul Patel, Amit Patel, Daud Khan and Aakash Solanki. "Meta-Learning for Neural Machine Translation". International Journal for Research in Applied Science and Engineering Technology 11, no. 10 (31.10.2023): 1232–37. http://dx.doi.org/10.22214/ijraset.2023.56185.

Abstract:
Neural Machine Translation (NMT) has significantly advanced the field of automated language translation, yet challenges persist in adapting to diverse language pairs, handling low-resource languages, and ensuring domain-specific translation accuracy. To address these challenges, this study explores the integration of meta-learning methodologies in NMT, aiming to enhance the adaptability and generalization capabilities of translation models. Through a comprehensive analysis of various meta-learning approaches, including Model-Agnostic Meta-Learning (MAML), metric-based meta-learning, and optimization-based meta-learning, we demonstrate the potential for improved translation accuracy and fluency across diverse language pairs and domains. Drawing upon a diverse set of bilingual corpora and employing the Transformer model as the base architecture, our experimental evaluation highlights the substantial performance improvements achieved through the integration of meta-learning techniques. The case studies and use cases presented in this study underscore the practical applications of the integrated meta-learning methodologies in facilitating cross-lingual information retrieval, low-resource language localization, specialized domain translation, and multimodal translation. While acknowledging the computational complexity and ethical implications, this study emphasizes the importance of collaborative and interdisciplinary research efforts to advance the development of more adaptive and contextually aware translation systems. The findings and insights presented in this study offer valuable implications for the advancement of NMT and automated language translation practices.
49

MacAvaney, Sean. "Effective and practical neural ranking". ACM SIGIR Forum 55, no. 1 (June 2021): 1–2. http://dx.doi.org/10.1145/3476415.3476432.

Abstract:
Supervised machine learning methods that use neural networks ("deep learning") have yielded substantial improvements to a multitude of Natural Language Processing (NLP) tasks in the past decade. Improvements to Information Retrieval (IR) tasks, such as ad-hoc search, lagged behind those in similar NLP tasks, despite considerable community efforts. Although there are several contributing factors, I argue in this dissertation that early attempts were not more successful because they did not properly consider the unique characteristics of IR tasks when designing and training ranking models. I first demonstrate this by showing how large-scale datasets containing weak relevance labels can successfully replace training on in-domain collections. This technique improves the variety of queries encountered when training and helps mitigate concerns of over-fitting particular test collections. I then show that dataset statistics available in specific IR tasks can be easily incorporated into neural ranking models alongside the textual features, resulting in more effective ranking models. I also demonstrate that contextualized representations, particularly those from transformer-based language models, considerably improve neural ad-hoc ranking performance. I find that this approach is neither limited to the task of ad-hoc ranking (as demonstrated by ranking clinical reports) nor English content (as shown by training effective cross-lingual neural rankers). These efforts demonstrate that neural approaches can be effective for ranking tasks. However, I observe that these techniques are impractical due to their high query-time computational costs. To overcome this, I study approaches for offloading computational cost to index-time, substantially reducing query-time latency. These techniques make neural methods practical for ranking tasks. 
Finally, I take a deep dive into better understanding the linguistic biases of the methods I propose compared to contemporary and traditional approaches. The findings from this analysis highlight potential pitfalls of recent methods and provide a way to measure progress in this area going forward.
50

Leonardelli, Elisa, and Sara Tonelli. "The Geography of Information Diffusion in Online Discourse on Europe and Migration". Proceedings of the International AAAI Conference on Web and Social Media 18 (28.05.2024): 904–16. http://dx.doi.org/10.1609/icwsm.v18i1.31361.

Abstract:
The online diffusion of information related to Europe and migration has been little investigated from an external point of view. However, this is a very relevant topic, especially if users have had no direct contact with Europe and their perception of it depends solely on information retrieved online. In this work we analyse the information circulating online about Europe and migration after retrieving a large amount of data from social media (Twitter), to gain new insights into topics, magnitude, and dynamics of their diffusion. We combine retweets and hashtags network analysis with geolocation of users, thus linking data to geography and allowing analysis from an “outside Europe” perspective, with a special focus on Africa. We also introduce a novel approach based on cross-lingual quotes, i.e. when content in one language is commented on and retweeted in another language, assuming these interactions are a proxy for connections between very distant communities. Results show that the majority of online discussions occur at a national level, especially when discussing migration. Language (English) is pivotal for information to become transnational and reach far. Transnational information flow is strongly unbalanced, with content mainly produced in Europe and amplified outside. Conversely, Europe-based accounts tend to be self-referential when they discuss migration-related topics. Football is the most exported topic from Europe worldwide. Moreover, important nodes in the communities discussing migration-related topics include accounts of official institutions and international agencies, together with journalists, news, commentators and activists.