
Journal articles on the topic 'Term detection in multilingual speech'

Author: Grafiati

Published: 6 September 2023


Consult the top 50 journal articles for your research on the topic 'Term detection in multilingual speech.'


1

Karayiğit, Habibe, Ali Akdagli, and Çiğdem İnan Aci. "Homophobic and Hate Speech Detection Using Multilingual-BERT Model on Turkish Social Media." Information Technology and Control 51, no. 2 (June 23, 2022): 356–75. http://dx.doi.org/10.5755/j01.itc.51.2.29988.


Abstract:

Homophobic expressions are a form of insult directed at people's sexual orientation or personality. Severe psychological trauma may occur in people who are exposed to this type of communication. It is important to develop automatic classification systems based on language models to examine social media content and distinguish homophobic discourse. This study aims to present a pre-trained Multilingual Bidirectional Encoder Representations from Transformers (M-BERT) model that can successfully detect whether Turkish comments on social media contain homophobic or related hate comments (i.e., sexist, severe humiliation, and defecation expressions). Comments in the Homophobic-Abusive Turkish Comments (HATC) dataset were collected from Instagram to train the detection models. The HATC dataset was manually labeled at the sentence level and combined with the Abusive Turkish Comments (ATC) dataset that was developed in our previous study. The HATC dataset was balanced using the resampling method, and two forms of the dataset (i.e., resHATC and original HATC) were used in the experiments. Afterward, the M-BERT model was compared with DL-based models (i.e., Long Short-Term Memory, Bidirectional Long Short-Term Memory (BiLSTM), Gated Recurrent Unit), Traditional Machine Learning (TML) classifiers (i.e., Support Vector Machine, Naive Bayes, Random Forest) and Ensemble Classifiers (i.e., Adaptive Boosting, eXtreme Gradient Boosting, Gradient Boosting) for the best model selection. The performance of the detection models was evaluated using F1-score, precision, and recall performance metrics. Results showed the best performance (homophobic F1-score: 82.64%, hateful F1-score: 91.75%, neutral F1-score: 96.08%, average F1-score: 90.15%) was achieved with the M-BERT model on the HATC dataset. The M-BERT detection model can increase the effectiveness of filters in detecting Turkish homophobic and related hate speech in social networks. It can be used to detect homophobic and related hate speech for different languages since the M-BERT model has multilingual pre-trained data.
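
The per-class and average F1 figures quoted in this abstract follow the usual definitions of precision, recall, and F1. As a minimal sketch (not the authors' evaluation code), the Python below computes the three metrics per class and an unweighted macro average. The function names and the toy labels are illustrative assumptions, and the abstract does not say whether its average is a macro or a weighted mean.

```python
def per_class_scores(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} computed one class at a time."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Guard against empty denominators when a class is never predicted or never present.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of the per-class F1 values (assumed averaging scheme)."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Toy labels for illustration only; they are not taken from the HATC dataset.
toy_true = ["homophobic", "hateful", "neutral", "neutral"]
toy_pred = ["homophobic", "neutral", "neutral", "neutral"]
toy_scores = per_class_scores(toy_true, toy_pred, ["homophobic", "hateful", "neutral"])
print(toy_scores, macro_f1(toy_scores))
```

The unweighted mean of the three class scores reported in the abstract is roughly 90.16%, close to the stated 90.15% average, which is at least consistent with a macro average.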


2

Deekshitha, G., and Leena Mary. "Multilingual spoken term detection: a review." International Journal of Speech Technology 23, no. 3 (July 22, 2020): 653–67. http://dx.doi.org/10.1007/s10772-020-09732-9.


3

Corazza, Michele, Stefano Menini, Elena Cabrio, Sara Tonelli, and Serena Villata. "A Multilingual Evaluation for Online Hate Speech Detection." ACM Transactions on Internet Technology 20, no. 2 (May 25, 2020): 1–22. http://dx.doi.org/10.1145/3377323.


4

Elouali, Aya, Zakaria Elberrichi, and Nadia Elouali. "Hate Speech Detection on Multilingual Twitter Using Convolutional Neural Networks." Revue d'Intelligence Artificielle 34, no. 1 (February 29, 2020): 81–88. http://dx.doi.org/10.18280/ria.340111.


5

Ghosh, Hiranmay, Sunil Kumar Kopparapu, Tanushyam Chattopadhyay, Ashish Khare, Sujal Subhash Wattamwar, Amarendra Gorai, and Meghna Pandharipande. "Multimodal Indexing of Multilingual News Video." International Journal of Digital Multimedia Broadcasting 2010 (2010): 1–18. http://dx.doi.org/10.1155/2010/486487.


Abstract:

The problems associated with automatic analysis of news telecasts are more severe in a country like India, where there are many national and regional language channels, besides English. In this paper, we present a framework for multimodal analysis of multilingual news telecasts, which can be augmented with tools and techniques for specific news analytics tasks. Further, we focus on a set of techniques for automatic indexing of the news stories based on keywords spotted in speech as well as on the visuals of contemporary and domain interest. English keywords are derived from RSS feed and converted to Indian language equivalents for detection in speech and on ticker texts. Restricting the keyword list to a manageable number results in drastic improvement in indexing performance. We present illustrative examples and detailed experimental results to substantiate our claim.


6

Wijonarko, Panji, and Amalia Zahra. "Spoken language identification on 4 Indonesian local languages using deep learning." Bulletin of Electrical Engineering and Informatics 11, no. 6 (December 1, 2022): 3288–93. http://dx.doi.org/10.11591/eei.v11i6.4166.


Abstract:

Language identification is at the forefront of assistance in many applications, including multilingual speech systems, spoken language translation, multilingual speech recognition, and human-machine interaction via voice. The identification of Indonesian local languages using spoken language identification technology has enormous potential to advance tourism and digital content in Indonesia. The goal of this study is to identify four Indonesian local languages (Javanese, Sundanese, Minangkabau, and Buginese) using deep learning classification techniques such as artificial neural network (ANN), convolutional neural network (CNN), and long short-term memory (LSTM). Mel-frequency cepstral coefficients (MFCCs) were selected as the features extracted from the audio data. The results showed that the LSTM model had the highest accuracy for each speech duration (3 s, 10 s, and 30 s), followed by the CNN and ANN models.


7

Ma, Yiping, and Wei Wang. "MSFL: Explainable Multitask-Based Shared Feature Learning for Multilingual Speech Emotion Recognition." Applied Sciences 12, no. 24 (December 13, 2022): 12805. http://dx.doi.org/10.3390/app122412805.


Abstract:

Speech emotion recognition (SER), a rapidly evolving task that aims to recognize the emotion of speakers, has become a key research area in affective computing. However, the variety of languages in natural multilingual scenarios severely challenges the generalization ability of SER, causing model performance to decrease quickly and driving researchers to ask how to improve the performance of multilingual SER. Recent studies mainly use feature fusion and language-controlled models to address this challenge, but key points such as the intrinsic association of languages or deep analysis of multilingual shared features (MSFs) are still neglected. To solve this problem, an explainable Multitask-based Shared Feature Learning (MSFL) model is proposed for multilingual SER. The introduction of multi-task learning (MTL) can provide related task information of language recognition for MSFL, improve its generalization in multilingual situations, and further lay the foundation for learning MSFs. Specifically, considering the generalization capability and interpretability of the model, the MTL module was combined with long short-term memory and an attention mechanism, aiming to maintain generalization in multilingual situations. Then, the feature weights acquired from the attention mechanism were ranked in descending order, and the top-ranked MSFs were compared with top-ranked monolingual features, enhancing the model interpretability based on the feature comparison. Various experiments were conducted on the Emo-DB, CASIA, and SAVEE corpora from the model generalization and interpretability aspects. Experimental results indicate that MSFL performs better than most state-of-the-art models, with an average improvement of 3.37–4.49%. In addition, the top 10 features in the MSFs contain almost all the top-ranked features of the three monolingual feature sets, which demonstrates the interpretability of MSFL.


8

Vashistha, Neeraj, and Arkaitz Zubiaga. "Online Multilingual Hate Speech Detection: Experimenting with Hindi and English Social Media." Information 12, no. 1 (December 22, 2020): 5. http://dx.doi.org/10.3390/info12010005.


Abstract:

The last two decades have seen an exponential increase in the use of the Internet and social media, which has changed basic human interaction. This has led to many positive outcomes. At the same time, it has brought risks and harms. The volume of harmful content online, such as hate speech, is not manageable by humans. The interest in the academic community to investigate automated means for hate speech detection has increased. In this study, we analyse six publicly available datasets by combining them into a single homogeneous dataset. Having classified them into three classes, abusive, hateful or neither, we create a baseline model and improve model performance scores using various optimisation techniques. After attaining a competitive performance score, we create a tool that identifies and scores a page with an effective metric in near-real-time and uses the same feedback to re-train our model. We prove the competitive performance of our multilingual model in two languages, English and Hindi. This leads to comparable or superior performance to most monolingual models.


9

Popli, Abhimanyu, and Arun Kumar. "Multilingual query-by-example spoken term detection in Indian languages." International Journal of Speech Technology 22, no. 1 (January 10, 2019): 131–41. http://dx.doi.org/10.1007/s10772-018-09585-3.


10

Thanvanthri, Srinedhi, and Shivani Ramakrishnan. "Performance of Text Classification Methods in Detection of Hate Speech in Media." International Journal for Research in Applied Science and Engineering Technology 10, no. 3 (March 31, 2022): 354–58. http://dx.doi.org/10.22214/ijraset.2022.40567.


Abstract:

With the increased popularity of social media sites like Twitter and Instagram over the years, it has become easier for users of the sites to remain anonymous while taking part in hate speech against various peoples and communities. As a result, in an effort to curb such hate speech online, its detection has gained much more attention of late. Since curbing the growing amount of hate speech online by manual methods is not feasible, detection and control via Natural Language Processing and Deep Learning methods have gained popularity. In this paper, we evaluate the performance of a sequential model with the Universal Sentence Encoder against the RoBERTa method on different datasets for hate speech detection. The results of this study show better overall performance from a Sequential model with a multilingual USE layer. Keywords: Hate Speech Detection, RoBERTa, Universal Sentence Encoder, Sequential model.


11

Panhwar, Farida Yasmin. "Functions of Code-Switching in a Private Chat on Facebook." Journal of English Language, Literature and Education 1, no. 04 (May 17, 2020): 16. http://dx.doi.org/10.54692/jelle.2020.01045.


Abstract:

Code switching is a significant language feature of multilingual countries like Pakistan. The term code switching refers to the shift from one language to another. The main objective of this research is to explore the communicative functions of code switching by a multilingual Sindhi-speaking wife and husband in an inbox chat on Facebook. Applying a qualitative methodology, one year of chat data was collected and a variety of functions was analysed using the code-switching theory of Blom and Gumperz (1972) as the theoretical framework. The findings of the study show that educated Sindhi multilinguals rely heavily on code switching for various communicative purposes such as indexing identity, quotation, rephrasing, self-correction, metalinguistic comment, reported speech, idiomatic expressions, translation, expressing anger, humour, and euphemistic expressions in order to achieve their communicative goals.


12

Wang, Chuanxu, and Pengyuan Zhang. "Optimization of Spoken Term Detection System." Journal of Applied Mathematics 2012 (2012): 1–8. http://dx.doi.org/10.1155/2012/548341.


Abstract:

Generally speaking, the performance of a spoken term detection system degrades significantly because of the mismatch between the acoustic model and spontaneous speech. This paper presents an improved spoken term detection strategy that integrates a novel phoneme confusion matrix and an improved word-level minimum classification error (MCE) training method. The first technique is presented to improve the spoken term detection rate, while the second is adopted to reject false accepts. On Mandarin conversational telephone speech (CTS), the proposed methods reduce the equal error rate (EER) by 8.4% relative.


13

Bukhari, Danish, Yutian Wang, and Hui Wang. "Multilingual Convolutional, Long Short-Term Memory, Deep Neural Networks for Low Resource Speech Recognition." Procedia Computer Science 107 (2017): 842–47. http://dx.doi.org/10.1016/j.procs.2017.03.179.


14

Minks, Amanda. "Socializing Heteroglossia among Miskitu children on the Caribbean coast of Nicaragua." Pragmatics. Quarterly Publication of the International Pragmatics Association (IPrA) 20, no. 4 (December 1, 2010): 495–522. http://dx.doi.org/10.1075/prag.20.4.02min.


Abstract:

This article adapts Bakhtin’s term “heteroglossia” as a framework for analyzing Miskitu children’s multilingual speech on Corn Island, off the Caribbean coast of Nicaragua. Analysis of naturally occurring speech in this context illustrates the utility of partial competencies and hybridized speech, supporting a view of language not as a bounded system, but as a diverse pool of communicative resources that socialize children into multiple modes of voicing and acting. More broadly, the article examines the relations between language ideologies and language socialization, and the ways that both are articulated within complex histories of cultural interaction and stratified social relations. The article challenges conventional dichotomies of language loss and revitalization by viewing the hybrid linguistic practices that enable children to bridge social and cultural worlds.


15

Zhang, Zhen, Ji Xu, Xu Yang Wang, Qing Wei Zhao, and Yong Hong Yan. "Long Mandarin Spoken Term Detection Using Two-Stage Search." Applied Mechanics and Materials 380-384 (August 2013): 2720–23. http://dx.doi.org/10.4028/www.scientific.net/amm.380-384.2720.


Abstract:

For efficient collection of speech recordings, the ability to search for spoken terms in the speech stream is an essential capability. Although Chinese spoken term detection (STD) does not suffer from the out-of-vocabulary (OOV) problem as English does, it is still hard to retrieve long spoken terms that contain four characters or more. In this paper, we detail our approach to long Mandarin spoken term detection, which combines a search on the inverted index produced by the speech recognizer with a linear scan on the syllable confusion network. First, we split the long spoken terms into syllables and search the syllables on the inverted index file to get the segments which may contain the long spoken terms. Then we use a linear scan algorithm on syllable confusion networks (SCNs). On two Mandarin conversational telephone speech sets, we compare the performance of the proposed method with that of the baseline syllable-based systems, and our approach gives satisfying performance gains over the others.
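
The two stages described in this abstract can be sketched in a few lines of Python. This is an illustrative toy, not the authors' system: the data layout (one-best syllable lists per segment, confusion networks as lists of syllable-to-posterior dictionaries), the overlap ratio used to shortlist segments, the posterior floor, and the acceptance threshold are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(segments):
    """Map each syllable to the ids of segments whose one-best transcript contains it."""
    index = defaultdict(set)
    for seg_id, syllables in segments.items():
        for syl in syllables:
            index[syl].add(seg_id)
    return index


def candidate_segments(index, term, min_overlap=0.5):
    """Stage 1: shortlist segments sharing enough distinct syllables with the term."""
    counts = defaultdict(int)
    distinct = set(term)
    for syl in distinct:
        for seg_id in index.get(syl, ()):
            counts[seg_id] += 1
    needed = min_overlap * len(distinct)  # assumed shortlist criterion
    return {seg_id for seg_id, n in counts.items() if n >= needed}


def scan_confusion_network(scn, term, floor=1e-4):
    """Stage 2: slide the term over the network; return (best_score, start_slot)."""
    best = (0.0, -1)
    for start in range(len(scn) - len(term) + 1):
        score = 1.0
        for offset, syl in enumerate(term):
            # A syllable missing from a slot gets a small floor posterior.
            score *= scn[start + offset].get(syl, floor)
        if score > best[0]:
            best = (score, start)
    return best


def two_stage_search(term, segments, networks, min_overlap=0.5, threshold=0.01):
    """Run the index lookup, then the linear scan, and keep hits above the threshold."""
    index = build_inverted_index(segments)
    hits = []
    for seg_id in sorted(candidate_segments(index, term, min_overlap)):
        score, start = scan_confusion_network(networks[seg_id], term)
        if score >= threshold:
            hits.append((seg_id, start, score))
    return hits
```

In this sketch the first stage is deliberately permissive, so a segment whose one-best transcript misrecognizes one syllable of the term can still reach the second stage, where an alternative hypothesis in the confusion network may recover it.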


16

Zia, Haris Bin, Ignacio Castro, Arkaitz Zubiaga, and Gareth Tyson. "Improving Zero-Shot Cross-Lingual Hate Speech Detection with Pseudo-Label Fine-Tuning of Transformer Language Models." Proceedings of the International AAAI Conference on Web and Social Media 16 (May 31, 2022): 1435–39. http://dx.doi.org/10.1609/icwsm.v16i1.19402.


Abstract:

Hate speech has proliferated on social media platforms in recent years. While this has been the focus of many studies, most works have exclusively focused on a single language, generally English. Low-resourced languages have been neglected due to the dearth of labeled resources. These languages, however, represent an important portion of the data due to the multilingual nature of social media. This work presents a novel zero-shot, cross-lingual transfer learning pipeline based on pseudo-label fine-tuning of Transformer Language Models for automatic hate speech detection. We employ our pipeline on benchmark datasets covering English (source) and 6 different non-English (target) languages written in 3 different scripts. Our pipeline achieves an average improvement of 7.6% (in terms of macro-F1) over previous zero-shot, cross-lingual models. This demonstrates the feasibility of high accuracy automatic hate speech detection for low-resource languages. We release our code and models at https://github.com/harisbinzia/ZeroshotCrosslingualHateSpeech.


17

Liu, Chang, and David A. Eddins. "Detection of vowels in long‐term speech‐shaped noise." Journal of the Acoustical Society of America 119, no. 5 (May 2006): 3338. http://dx.doi.org/10.1121/1.4786425.


18

Yang, Jichen, and Rohan Kumar Das. "Long-term high frequency features for synthetic speech detection." Digital Signal Processing 97 (February 2020): 102622. http://dx.doi.org/10.1016/j.dsp.2019.102622.


19

Toppo, Ravina, and Sweta Sinha. "Identifying acoustic cues for dialect profiling: Policing in multilingual communities of India." Indonesian Journal of Applied Linguistics 12, no. 2 (September 30, 2022): 521–32. http://dx.doi.org/10.17509/ijal.v12i2.43179.


Abstract:

A multilingual country such as India, with numerous languages and dialects, provides fertile ground for evasive language crimes. From threat letters to ransom demands, the scope of crime is huge. The cases of illegal immigrants have only added to the fragility of international boundaries, especially during political upheavals. This leads to further vulnerability of society and also creates challenges for the police and law enforcement agencies towards timely intervention. The purpose of the study is to exhibit dialectal variation in Indian English by comparing two varieties. The current paper is based on the acoustic analysis of Indian English spoken by two distinct groups with different mother tongues. Ten native speakers of Hindi and Bangla were recorded in an anechoic chamber. A phonetically balanced passage was selected to be read. The analysis is based on Native Language Influence Detection (Perkins Grant, 2018) to derive acoustic phonetic correlates that can be used as significant identifying markers to distinguish Indian English speakers of Bangla and Hindi speech communities. The paper highlights that dialect profiling in the Indian context can be efficiently correlated with formant frequencies and Voice Onset Time for speech data. Acoustic analysis was done in PRAAT. PRAAT was used in this study because it has often been used by other similar studies to measure desired acoustic parameters simultaneously. Formant frequencies were measured at the midpoint of the vowels in PRAAT using the LPC formant measurement algorithm. The normalization procedure was applied to the measured formant frequencies of vowels. The research affirms that acoustic analysis can provide verifiable cues for NLID. The framework can be used in the detection of native language influence in speech-centric criminal cases. The acoustic analysis shows that Indian English has subvarieties that could help in dialect profiling. The variation in Indian English vowel patterns could be due to the influence of the native language of the speakers.


20

Kapoor, Raghav, Yaman Kumar, Kshitij Rajput, Rajiv Ratn Shah, Ponnurangam Kumaraguru, and Roger Zimmermann. "Mind Your Language: Abuse and Offense Detection for Code-Switched Languages." Proceedings of the AAAI Conference on Artificial Intelligence 33 (July 17, 2019): 9951–52. http://dx.doi.org/10.1609/aaai.v33i01.33019951.


Abstract:

In multilingual societies like the Indian subcontinent, the use of code-switched languages is popular and convenient for users. In this paper, we study offense and abuse detection in the code-switched pair of Hindi and English (i.e., Hinglish), the most widely spoken such pair. The task is made difficult by the non-fixed grammar, vocabulary, semantics and spellings of Hinglish. We apply transfer learning and build an LSTM-based model for hate speech classification. This model surpasses the performance shown by the current best models to establish itself as the state of the art in the unexplored domain of Hinglish offensive text classification. We also release our model and the embeddings trained for research purposes.


21

Speights Atkins, Marisha, and Joel MacAuslan. "Quantifying continuous child speech for automated detection of speech impairment." Journal of the Acoustical Society of America 152, no. 4 (October 2022): A138. http://dx.doi.org/10.1121/10.0015808.


Abstract:

The SpeechMark® Automated Syllabic Cluster detection system was tested as a novel approach for analysis of continuous speech samples recorded from 4-year-old children classified as typically developing (TD, N = 44, M = 4.32 years, SD = 0.64) and with speech compromise (SC, N = 16, M = 4.14 years, SD = 0.66). The speakers were recruited in the Midwest and Southern regions of the United States. To test whether the TD group produced higher syllabic clusters than the SC group, we fit a generalized linear mixed effects model. The model adjusted for the potential influence of age and dialect in contributing to the group differences by including them as covariates. Results were interpreted using incidence rate ratios (IRR). Results showed that the IRR was dependent on age, as indicated by the significant interaction term group*age (p-value = 0.003). The results also showed that there was no difference between the two dialect groups. Results from linear mixed effects models showed that the speech rate was higher among speakers in the SC group with all other factors held constant (effect = 0.9, p-value = 0.055). These findings are promising as we aim to automate the analysis of continuous speech samples of young children.


22

Ramírez, Javier, José C. Segura, Carmen Benítez, Ángel de la Torre, and Antonio Rubio. "Efficient voice activity detection algorithms using long-term speech information." Speech Communication 42, no. 3-4 (April 2004): 271–87. http://dx.doi.org/10.1016/j.specom.2003.10.002.


23

Dumetz, Jerome. "Unexpected Disadvantages of a Simultaneous Quadrilingual Upbringing, a Case Study." International Journal of Teaching and Education 9, no. 1 (April 20, 2021): 1–12. http://dx.doi.org/10.52950/te.2021.9.1.001.


Abstract:

At the crossroads between linguistics and cross-cultural communication, multilingualism is frequently presented from its most positive perspective. However, even if the long-term benefits outweigh the disadvantages, frustration is often the dominant feeling among the speakers during their early years. Based upon meticulous observations and careful collection of examples in a multilingual family, this article is a case study of the difficulties encountered by polyglots growing up with four simultaneous languages: Russian, French, Czech, and English. Using the research framework usually developed for the study of bilingualism, the article reviews not only the psychological and cognitive difficulties encountered by tetraglots, but also the social and linguistic drawbacks they are confronted with. It also examines common multilingual strategies such as code-switching, word creation and language mixing. It concludes that the linguistic development of tetraglots does not differ much from that of bilinguals, except for the longer period before speech production is acquired. Quadrilingual children tend to speak later than not only monolingual children, but also bilingual ones.


24

Perwira Joan Dwitama, Aditya, Dhomas Hatta Fudholi, and Syarif Hidayat. "Indonesian Hate Speech Detection Using Bidirectional Long Short-Term Memory (Bi-LSTM)." Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) 7, no. 2 (March 26, 2023): 302–9. http://dx.doi.org/10.29207/resti.v7i2.4642.


Abstract:

Social media is a platform that allows users to express themselves freely, including spreading hate speech content. The government has issued a regulation in the UU ITE to handle and prevent hate speech on social media. Research has also been conducted using Bi-LSTM to classify text as hate speech or not. Other research proposed detecting hate speech and its categories using Bi-GRU. However, the performance of the Bi-GRU model is still lower than that of Bi-LSTM, with accuracies of 86.44% and 96.44%, respectively. Therefore, this study aims to build a model that can detect hate speech and its categories. The research offers Bi-LSTM as a classification model and IndoBERT as a tokenization model. The dataset used is a public dataset containing 13 thousand tweets. As a result, the best model obtained uses 20 epochs, a batch size of 192, and a 1-layer Bi-LSTM with 40 nodes, and applies class weighting in the optimization process. The pre-trained IndoBERT model used to support the classification performance is "indobenchmark/indobert-large-p2". The performance of the proposed model is very good, with an average accuracy, precision, and recall of 97.66%, 96.50%, and 85.25%, respectively.


25

Ramponi, Alan, Benedetta Testa, Sara Tonelli, and Elisabetta Jezek. "Addressing religious hate online: from taxonomy creation to automated detection." PeerJ Computer Science 8 (December 15, 2022): e1128. http://dx.doi.org/10.7717/peerj-cs.1128.


Abstract:

Abusive language in online social media is a pervasive and harmful phenomenon which calls for automatic computational approaches to be successfully contained. Previous studies have introduced corpora and natural language processing approaches for specific kinds of online abuse, mainly focusing on misogyny and racism. A currently underexplored area in this context is religious hate, for which efforts in data and methods to date have been rather scattered. This is exacerbated by the different annotation schemes that available datasets use, which inevitably lead to poor repurposing of data in wider contexts. Furthermore, religious hate is very much dependent on country-specific factors, including the presence and visibility of religious minorities, societal issues, historical background, and current political decisions. Motivated by the lack of annotated data specifically targeting religion and the poor interoperability of current datasets, in this article we propose a fine-grained labeling scheme for religious hate speech detection. This scheme builds on a wider and highly interoperable taxonomy of abusive language, and covers the three main monotheistic religions: Judaism, Christianity and Islam. Moreover, we introduce a Twitter dataset in two languages (English and Italian) that has been annotated following the proposed annotation scheme. We experiment with several classification algorithms on the annotated dataset, from traditional machine learning classifiers to recent transformer-based language models, assessing the difficulty of two tasks: abusive language detection and religious hate speech detection. Finally, we investigate the cross-lingual transferability of multilingual models on the tasks, shedding light on the viability of repurposing our dataset for religious hate speech detection on low-resource languages. We release the annotated data and publicly distribute the code for our classification experiments at https://github.com/dhfbk/religious-hate-speech.


26

Ollagnier, Anaïs, Elena Cabrio, Serena Villata, and Sara Tonelli. "BiRDy: Bullying Role Detection in Multi-Party Chats." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 13 (June 26, 2023): 16464–66. http://dx.doi.org/10.1609/aaai.v37i13.27080.


Abstract:

Recent studies have highlighted that private instant messaging platforms and channels are major media of cyber aggression, especially among teens. Due to the private nature of the verbal exchanges on these media, few studies have addressed the task of hate speech detection in this context. Moreover, the recent release of resources mimicking online aggression situations that may occur among teens on private instant messaging platforms is encouraging the development of solutions that deal with diversity in digital harassment. In this study, we present BiRDy: a fully Web-based platform performing participant role detection in multi-party chats. Leveraging the pre-trained language model mBERT (multilingual BERT), we release fine-tuned models relying on various contextual window strategies to classify exchanged messages according to their authors' role of involvement in cyberbullying. Integrating a role scoring function, the proposed pipeline predicts a unique role for each chat participant. In addition, detailed confidence scores are displayed. Currently, BiRDy publicly releases models for French and Italian.


27

Ghozali, Imam, Kelly Rossa Sungkono, Riyanarto Sarno, and Rachmad Abdullah. "Synonym based feature expansion for Indonesian hate speech detection." International Journal of Electrical and Computer Engineering (IJECE) 13, no. 1 (February 1, 2023): 1105. http://dx.doi.org/10.11591/ijece.v13i1.pp1105-1112.


Abstract:

Online hate speech is one of the negative impacts of internet-based social media development. Hate speech occurs due to a lack of public understanding of the difference between criticism and hate speech. The Indonesian government has regulations regarding hate speech, and most of the existing research about hate speech only focuses on feature extraction and classification methods. Therefore, this paper proposes methods to identify hate speech before a crime occurs. This paper presents an approach to detect hate speech by expanding synonyms in word embedding and compares the classification results of Word2Vec and FastText with bidirectional long short-term memory, with and without the synonym expansion process. The goal is to classify hate speech and non-hate speech. The best accuracy without the synonym expansion process is 0.90, and with it is 0.93.
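
One simple reading of synonym-based feature expansion is to append dictionary synonyms to each token sequence before it is embedded. The Python sketch below illustrates that reading only. The synonym dictionary, the cap on synonyms per token, and the function name are hypothetical and do not come from the paper, and the English toy words stand in for Indonesian vocabulary.

```python
def expand_with_synonyms(tokens, synonyms, max_per_token=2):
    """Return the tokens with up to max_per_token synonyms inserted after each one."""
    expanded = []
    for token in tokens:
        expanded.append(token)
        # Hypothetical lookup: synonyms maps a word to a ranked list of synonyms.
        for syn in synonyms.get(token, [])[:max_per_token]:
            if syn not in expanded:  # avoid adding the same word twice
                expanded.append(syn)
    return expanded


# Toy dictionary for illustration only.
toy_synonyms = {"hate": ["detest", "loathe", "despise"]}
print(expand_with_synonyms(["they", "hate", "us"], toy_synonyms))
```

The expanded sequence would then be passed to the embedding layer (Word2Vec or FastText in the abstract) in place of the original tokens.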

28

Alagu, Prakalya P., and Gaud Nirmal. "BiDETECT: BiLSTM with BERT for hate speech detection in tweets." i-manager's Journal on Computer Science 10, no. 4 (2023): 23. http://dx.doi.org/10.26634/jcom.10.4.19334.

Abstract:

The utilization of online platforms for spreading hate speech has become a major concern. The conventional techniques used to identify hate speech, such as relying on keywords and manual moderation, frequently fall short and can lead to either missed detections or incorrect identifications. In response, researchers have developed various deep learning strategies for locating hate speech in text. This paper covers a wide range of Deep Learning approaches, encompassing Convolutional Neural Networks and especially transformer-based models. It also discusses the key factors that influence the performance of these methods, such as the choice of datasets, the use of pre-processing strategies, and the design of the model architecture. In conjunction with summarizing existing research, it also identifies a selection of key hurdles and limitations of Deep Learning for discovering hate speech and proposes a novel method to overcome them: Bidirectional Long Short-Term Memory and BERT for Hate Speech Detection (BiDETECT), which involves adding a Bidirectional Long Short-Term Memory (BiLSTM) layer to Bidirectional Encoder Representations from Transformers (BERT) for classification. The hurdles include the difficulties in defining hate speech, the limitations of current datasets, and the challenges of generalizing models to new domains. It also discusses the ethical implications of employing Deep Learning to pinpoint hate speech and the need for responsible and transparent research in this area.

29

Liu, Chang, and David A. Eddins. "Categorical dependence of vowel detection in long-term speech-shaped noise." Journal of the Acoustical Society of America 123, no. 6 (June 2008): 4539–46. http://dx.doi.org/10.1121/1.2903867.

30

O’Brien, Kathleen, Ashley Woodall, and Chang Liu. "Vowel detection and vowel identification in long‐term speech‐shaped noise." Journal of the Acoustical Society of America 125, no. 4 (April 2009): 2696. http://dx.doi.org/10.1121/1.4784308.

31

Kalantari, Shahram, David Dean, and Sridha Sridharan. "Cross database audio visual speech adaptation for phonetic spoken term detection." Computer Speech & Language 44 (July 2017): 1–21. http://dx.doi.org/10.1016/j.csl.2016.09.001.

32

Kerremans, Koen, Isabelle Desmeytere, Rita Temmerman, and Patrick Wille. "Application-oriented terminography in financial forensics." Terminology 11, no. 1 (June 17, 2005): 83–106. http://dx.doi.org/10.1075/term.11.1.05ker.

Abstract:

This paper covers ongoing terminography work in the FF POIROT project, a European research project in which formal and shareable knowledge repositories (i.e. ontologies) and ontology-based applications are developed for the prevention of value added tax carousel fraud in the EU and the detection of securities fraud. We will emphasise that the knowledge requirements regarding users and applications determine what textual information should be structured at macro- and micro-levels of the FF POIROT multilingual terminology base. Furthermore, we will present our ideas concerning a multidisciplinary approach in terminography, called ‘Termontography’, for future application-oriented terminology development.

33

Isnain, Auliya Rahman, Agus Sihabuddin, and Yohanes Suyanto. "Bidirectional Long Short Term Memory Method and Word2vec Extraction Approach for Hate Speech Detection." IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 14, no. 2 (April 30, 2020): 169. http://dx.doi.org/10.22146/ijccs.51743.

Abstract:

Currently, the discussion about hate speech in Indonesia is heated, primarily on social media. Hate speech is communication that disparages a person or group based on characteristics such as race, ethnicity, gender, citizenship, religion, and organization. Twitter is one of the social media platforms that people use to express their feelings and opinions through tweets, including tweets that contain expressions of hatred, because Twitter has a significant influence on the success or destruction of one's image. This study aims to classify Indonesian tweets as hate speech or not hate speech by using the Bidirectional Long Short Term Memory (BiLSTM) method and the word2vec feature extraction method with the Continuous Bag-of-Words (CBOW) architecture. The BiLSTM is tested by calculating accuracy, precision, recall, and F-measure. The use of word2vec with the CBOW architecture and the BiLSTM method, with 10 epochs, a learning rate of 0.001, and 200 neurons in the hidden layer, produces an accuracy of 94.66%, with a precision of 99.08%, a recall of 93.74%, and an F-measure of 96.29%. In contrast, the BiLSTM with three layers has an accuracy of 96.93%. The addition of one layer to the BiLSTM increased accuracy by 2.27%.

34

Natori, Satoshi, Yuto Furuya, Hiromitsu Nishizaki, and Yoshihiro Sekiguchi. "Spoken Term Detection Using Phoneme Transition Network from Multiple Speech Recognizers' Outputs." Journal of Information Processing 21, no. 2 (2013): 176–85. http://dx.doi.org/10.2197/ipsjjip.21.176.

35

Vasilakis, Miltiadis, and Yannis Stylianou. "Voice Pathology Detection Based on Short-Term Jitter Estimations in Running Speech." Folia Phoniatrica et Logopaedica 61, no. 3 (2009): 153–70. http://dx.doi.org/10.1159/000219951.

36

Kuznetsov, V. O. ""Provocation" as an Expert Term in Forensic Linguistics." Theory and Practice of Forensic Science 15, no. 3 (October 23, 2020): 6–18. http://dx.doi.org/10.30764/1819-2785-2020-3-6-18.

Abstract:

The article addresses the category of “provocation” as a forensic term that is an interdisciplinary concept in between the legal and linguistic concepts of “provocation”. An expert term “speech provocation” has been developed through an expert analysis where the category of “provocation” has been considered from the legal, linguistic, and expert perspectives. As a part of the consideration of the concept in the expert aspect, the relationship between the legal and linguistic categories has been established. The author concluded that as an expert linguistic term in examinations in corruption cases, the term “speech provocation for an offer/payment of a bribe” is used. In this case, the speech provocation is interpreted as a verbal act which incites one of the communicators to commit an unlawful act – to bribe. That is the linguistic content of the phenomenon legally called “crime provocation”. The article also addresses the methodological aspect of the detection of speech provocation.

37

Dash, Debadatta, Paul Ferrari, Satwik Dutta, and Jun Wang. "NeuroVAD: Real-Time Voice Activity Detection from Non-Invasive Neuromagnetic Signals." Sensors 20, no. 8 (April 16, 2020): 2248. http://dx.doi.org/10.3390/s20082248.

Abstract:

Neural speech decoding-driven brain-computer interface (BCI) or speech-BCI is a novel paradigm for exploring communication restoration for locked-in (fully paralyzed but aware) patients. Speech-BCIs aim to map a direct transformation from neural signals to text or speech, which has the potential for a higher communication rate than the current BCIs. Although recent progress has demonstrated the potential of speech-BCIs from either invasive or non-invasive neural signals, the majority of the systems developed so far still assume knowing the onset and offset of the speech utterances within the continuous neural recordings. This lack of real-time voice/speech activity detection (VAD) is a current obstacle for future applications of neural speech decoding wherein BCI users can have a continuous conversation with other speakers. To address this issue, in this study, we attempted to automatically detect the voice/speech activity directly from the neural signals recorded using magnetoencephalography (MEG). First, we classified the whole segments of pre-speech, speech, and post-speech in the neural signals using a support vector machine (SVM). Second, for continuous prediction, we used a long short-term memory-recurrent neural network (LSTM-RNN) to efficiently decode the voice activity at each time point via its sequential pattern-learning mechanism. Experimental results demonstrated the possibility of real-time VAD directly from the non-invasive neural signals with about 88% accuracy.

38

Ouzounov, Atanas. "LTSD and GDMD features for Telephone Speech Endpoint Detection." Cybernetics and Information Technologies 17, no. 4 (November 27, 2017): 114–33. http://dx.doi.org/10.1515/cait-2017-0045.

Abstract:

This paper proposes a new contour-based speech endpoint detector which combines the log-Group Delay Mean-Delta (log-GDMD) feature, an adaptive two-threshold scheme and an eight-state automaton. The adaptive threshold scheme uses two pairs of thresholds: one for the starting points and one for the ending points. Each pair of thresholds is calculated by using the contour characteristics in the corresponding region of the utterance. The experimental results have shown that the proposed detector demonstrates better performance compared to the Long-Term Spectral Divergence (LTSD) one in terms of endpoint accuracy. Additional fixed-text speaker verification tests with short phrases of telephone speech based on the Dynamic Time Warping (DTW) and left-to-right Hidden Markov Model (HMM) frameworks confirm the improvements of the verification rate due to the better endpoint accuracy.

39

Demyankov, Valery Z. "POSSIBILITY AND PROBABILITY JUDGMENTS IN DIFFERENT CULTURES." RSUH/RGGU Bulletin. "Literary Theory. Linguistics. Cultural Studies" Series, no. 4 (2022): 312–22. http://dx.doi.org/10.28995/2686-7249-2022-4-312-322.

Abstract:

Ways of presenting opinions depend on mental cultures which include i.a. styles of forming judgments on possibility and on probability of events and of states of affairs. Research on a large multilingual corpus of texts in several West-European languages and in Russian shows that the possibility statements are used more than twice as often as the probability statements. The term ‘possibility’ in Latin and in modern languages denotes a physicalist attitude towards states of affairs. This term was coined much later than the term ‘probability’, originally connected to the human aspects of evaluation. The term ‘probabilis’ itself in Latin was a cognate of ‘probare’, which meant ‘approving’ and/or ‘controlling’ events. Additionally, in modern Romance languages, judgments of doubt and hope, i. e. sentences conveying speaker’s distancing from alien and non-actual opinions, usually contain verbs in a special ‘conjunctive mood’. Creating alternative ‘possible worlds’ as a figure of speech for ‘conjecture’ originated and was extensively used in writings by Leibniz in French, who used this figure in accordance with French grammar. The same ideas formulated in Russian or in German lack the ‘subjunctive’ mood, whereas it is obligatory in French.

40

Adiyaksa, Andi Fadil, Donny Richasdy, and Aditya Firman Ihsan. "Hate Speech Detection on YouTube Using Long Short-Term Memory and Latent Dirichlet Allocation Method." Journal of Information System Research (JOSH) 3, no. 4 (July 31, 2022): 644–50. http://dx.doi.org/10.47065/josh.v3i4.1875.

Abstract:

YouTube is a popular social media platform that people use as a means of sharing information and expressing opinions. Opinions can be categorized as hate if they attack something targeted. Hate speech is a behavior, word or action that is prohibited, because it causes violence to any individual and group. Expressing opinions in the form of hate speech is a problem that is still very difficult for the authorities to overcome because it is very common. Therefore, in this study a system was created to detect hate speech in the YouTube comment section, using Long Short-Term Memory and Latent Dirichlet Allocation. Several methods were tried with the aim of getting the best accuracy value, and the topic modeling process was carried out using Latent Dirichlet Allocation to produce a total of three topics containing words that often appear in YouTube comments. Based on the tests, the best accuracy is 0.657, or about 66%.

41

Yang, Li, Ying Li, Jin Wang, and Zhuo Tang. "Post Text Processing of Chinese Speech Recognition Based on Bidirectional LSTM Networks and CRF." Electronics 8, no. 11 (October 31, 2019): 1248. http://dx.doi.org/10.3390/electronics8111248.

Abstract:

With the rapid development of Internet of Things Technology, speech recognition has been applied more and more widely. Chinese Speech Recognition is a complex process. In the process of speech-to-text conversion, due to the influence of dialect, environmental noise, and context, the accuracy of speech-to-text in multi-round dialogues and specific contexts is still not high. After the general speech recognition technology, the text after speech recognition can be detected and corrected in the specific context, which is helpful to improve the robustness of text comprehension and is a beneficial supplement to the speech recognition technology. In this paper, a text processing model after Chinese Speech Recognition is proposed, which combines a bidirectional long short-term memory (LSTM) network with a conditional random field (CRF) model. The task is divided into two stages: text error detection and text error correction. In this paper, a bidirectional long short-term memory (Bi-LSTM) network and conditional random field are used in two stages of text error detection and text error correction respectively. Through verification and system test on the SIGHAN 2013 Chinese Spelling Check (CSC) dataset, the experimental results show that the model can effectively improve the accuracy of text after speech recognition.

42

Liu, Chang, and Su-Hyun Jin. "Psychometric Functions of Vowel Detection and Identification in Long-Term Speech-Shaped Noise." Journal of Speech, Language, and Hearing Research 62, no. 5 (May 21, 2019): 1473–85. http://dx.doi.org/10.1044/2018_jslhr-h-18-0320.

43

Hill, Edward, David Han, Pierre Dumouchel, Najim Dehak, Thomas Quatieri, Charles Moehs, Marlene Oscar-Berman, John Giordano, Thomas Simpatico, and Kenneth Blum. "Long Term Suboxone™ Emotional Reactivity As Measured by Automatic Detection in Speech." PLoS ONE 8, no. 7 (July 9, 2013): e69043. http://dx.doi.org/10.1371/journal.pone.0069043.

44

atha, M. Mam, and T. Bhaskar Reddy. "A Survey on Automatic Question-answering process in Speech using Spoken term detection." International Journal of Computer Trends and Technology 49, no. 5 (July 25, 2017): 263–65. http://dx.doi.org/10.14445/22312803/ijctt-v49p143.

45

Tupamahu, Ekaputra. "Language Politics and the Constitution of Racialized Subjects in the Corinthian Church." Journal for the Study of the New Testament 41, no. 2 (October 2, 2018): 223–45. http://dx.doi.org/10.1177/0142064x18804438.

Abstract:

This study examines the phenomenon of speaking in tongue(s) in the Corinthian church from the point of view of the politics of language. Instead of seeing tongue(s) as a problem of unintelligible-ecstatic speech, it reconsiders this phenomenon as a linguistic struggle. Tongue(s), in this sense, is a multilingual social dynamic that Paul perceives as chaotic. Special attention is given to the role of language as one of the crucial markers of the ancient Greeks’ collective identity. The barbarians are their imaginative and discursive ‘others’ who do not share their language. It is within this sociopolitical context that the employment of the term βάρβαρος in 1 Cor. 14:11 can be understood as a performative act of constituting racialized subjects. Such discourse is Paul’s political strategy of bringing a monolingual order into the Corinthian church.

46

Ayo, Femi Emmanuel, Olusegun Folorunso, Friday Thomas Ibharalu, and Idowu Ademola Osinuga. "Hate speech detection in Twitter using hybrid embeddings and improved cuckoo search-based neural networks." International Journal of Intelligent Computing and Cybernetics 13, no. 4 (October 12, 2020): 485–525. http://dx.doi.org/10.1108/ijicc-06-2020-0061.

Abstract:

Purpose: Hate speech is an expression of intense hatred. Twitter has become a popular analytical tool for the prediction and monitoring of abusive behaviors. Hate speech detection with social media data has witnessed special research attention in recent studies, hence, the need to design a generic metadata architecture and efficient feature extraction technique to enhance hate speech detection. Design/methodology/approach: This study proposes a hybrid embeddings enhanced with a topic inference method and an improved cuckoo search neural network for hate speech detection in Twitter data. The proposed method uses a hybrid embeddings technique that includes Term Frequency-Inverse Document Frequency (TF-IDF) for word-level feature extraction and Long Short Term Memory (LSTM) which is a variant of recurrent neural networks architecture for sentence-level feature extraction. The extracted features from the hybrid embeddings then serve as input into the improved cuckoo search neural network for the prediction of a tweet as hate speech, offensive language or neither. Findings: The proposed method showed better results when tested on the collected Twitter datasets compared to other related methods. In order to validate the performances of the proposed method, t-test and post hoc multiple comparisons were used to compare the significance and means of the proposed method with other related methods for hate speech detection. Furthermore, Paired Sample t-Test was also conducted to validate the performances of the proposed method with other related methods. Research limitations/implications: Finally, the evaluation results showed that the proposed method outperforms other related methods with mean F1-score of 91.3. Originality/value: The main novelty of this study is the use of an automatic topic spotting measure based on naïve Bayes model to improve features representation.

47

Farrús, Mireia, Joan Codina-Filbà, Elisenda Reixach, Erik Andrés, Mireia Sans, Noemí Garcia, and Josep Vilaseca. "Speech-Based Support System to Supervise Chronic Obstructive Pulmonary Disease Patient Status." Applied Sciences 11, no. 17 (August 29, 2021): 7999. http://dx.doi.org/10.3390/app11177999.

Abstract:

Patients with chronic obstructive pulmonary disease (COPD) suffer from voice changes with respect to the healthy population. However, two issues remain to be studied: how long-term speech elements such as prosody are affected; and whether physical effort and medication also affect the speech of patients with COPD, and if so, how an automatic speech-based detection system of COPD measurements can be influenced by these changes. The aim of the current study is to address both issues. To this end, long read speech from COPD and control groups was recorded, and the following experiments were performed: (a) a statistical analysis over the study and control groups to analyse the effects of physical effort and medication on speech; and (b) an automatic classification experiment to analyse how different recording conditions can affect the performance of a COPD detection system. The results obtained show that speech—especially prosodic features—is affected by physical effort and inhaled medication in both groups, though in opposite ways; and that the recording condition has a relevant role when designing an automatic COPD detection system. The current work takes a step forward in the understanding of speech in patients with COPD, and in turn, in the research on its automatic detection to help professionals supervising patient status.

48

Rudramurthy, M. S., V. Kamakshi Prasad, and R. Kumaraswamy. "Voice Activity Detection Algorithm Using Zero Frequency Filter Assisted Peaking Resonator and Empirical Mode Decomposition." Journal of Intelligent Systems 22, no. 3 (September 1, 2013): 269–82. http://dx.doi.org/10.1515/jisys-2013-0036.

Abstract:

In this article, a new adaptive data-driven strategy for voice activity detection (VAD) using empirical mode decomposition (EMD) is proposed. Speech data are decomposed using an a posteriori, adaptive, data-driven EMD in the time domain to yield a set of physically meaningful intrinsic mode functions (IMFs). Each IMF preserves the nonlinear and nonstationary property of the speech utterance. Among a set of IMFs, the IMF that dominantly contains source information, called the characteristic IMF (CIMF), can be identified and extracted by designing a zero-frequency filter-assisted peaking resonator. The detected CIMF is used to compute energy using short-term processing. By choosing a proper threshold, voiced regions in speech utterances are detected using frame energy. The proposed framework has been studied on both clean and noisy speech utterances (0-dB white noise). The proposed method shows encouraging results for VAD in the presence of white noise down to 0 dB.
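
The final step described in this abstract (short-term frame energy compared against a threshold) can be illustrated with a minimal sketch. It does not implement EMD or the zero-frequency filter-assisted peaking resonator; it applies energy thresholding directly to a toy signal. The function names, frame length, hop size, and the relative threshold are all assumptions made for illustration and are not taken from the paper.

```python
import math

def frame_energies(signal, frame_len, hop):
    """Short-term energy: sum of squared samples in each frame."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies

def detect_active_frames(signal, frame_len=160, hop=80, rel_threshold=0.1):
    """Flag a frame as active when its energy exceeds
    rel_threshold times the maximum frame energy (an assumed rule)."""
    energies = frame_energies(signal, frame_len, hop)
    if not energies:
        return []
    threshold = rel_threshold * max(energies)
    return [e > threshold for e in energies]

# Toy input at 8 kHz: 0.1 s silence, 0.1 s of a 200 Hz tone, 0.1 s silence.
sr = 8000
silence = [0.0] * 800
tone = [math.sin(2 * math.pi * 200 * n / sr) for n in range(800)]
flags = detect_active_frames(silence + tone + silence)
```

In this toy case the flagged frames cover the tone segment plus the half-overlapping frames at its edges. A relative threshold is only one possible choice; the abstract does not say how the authors select theirs.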

49

Spandana, J., T. Rakesh, K. Pranay, Ch Vijaya Bhaskar, and Sunil Bhutada. "Speech Based Parkinson's Disease Detection Using Machine Learning." International Journal for Research in Applied Science and Engineering Technology 11, no. 4 (April 30, 2023): 260–64. http://dx.doi.org/10.22214/ijraset.2023.50056.

Abstract:

The diagnosis of Parkinson's disease (PD) is often made after careful observation and evaluation of clinical indicators, such as the description of various motor symptoms. Traditional methods of diagnosis, on the other hand, may be prone to error since they depend on the subjective judgment of motions that might be difficult for human eyes to categorise. However, nonmotor symptoms of PD in its early stages may be minor and might be due to a wide variety of diseases. Thus, it is difficult to make an early diagnosis of PD since these symptoms are often disregarded. These challenges have prompted the use of machine learning approaches for the categorization of PD and healthy controls or patients with comparable clinical presentations (e.g., movement disorders or other Parkinsonian syndromes) as a means of improving diagnostic and evaluation processes for PD. PD has been diagnosed using a wide variety of data types and machine learning techniques; the goal of this article is to present a synopsis of these approaches. In this study, we investigate PD recognition from spoken language using CNN, ANN, and XGB. The CNN was fed stacked 2D input maps consisting of spectrograms and other short-term characteristics. The effectiveness of PD detection was analyzed by breaking down a voice recording into its component parts and comparing the results to those obtained by fusing all of the segments at the decision level.

50

Hildebrandt, Kristine A., Oliver Bond, and Dubi Nanda Dhakal. "A Micro-Typology of Contact Effects in Four Tibeto-Burman Languages." Journal of Language Contact 15, no. 2 (May 17, 2023): 302–40. http://dx.doi.org/10.1163/19552629-15020003.

Abstract:

When minority languages with similar typological profiles are in long-term contact with a genealogically unrelated socioeconomically dominant language, the perfect context is provided for investigating which observed contact effects are demonstrably allied to sociolinguistic dynamics rather than purely structural ones. This paper investigates the factors determining the different extent of contact effects in four Tibeto-Burman languages (Gurung, Gyalsumdo, Nar-Phu, and Manange) spoken in a geo-politically defined and multilingual region of Nepal. Using corpus data and sociolinguistic interviews collected in the field, we demonstrate that a range of social, economic and geo-spatial factors contribute to asymmetries where contact effects are observed in the four speech communities. These notably include factors specifically relevant in mountain-based communities, including proximity to transport and trekking routes, outward migration effects on small settlements, and the primary economies of the different parts of the Manang District.
