Academic literature on the topic 'Term detection in multilingual speech'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Term detection in multilingual speech.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate a bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Journal articles on the topic "Term detection in multilingual speech"

1

Karayiğit, Habibe, Ali Akdagli, and Çiğdem İnan Aci. "Homophobic and Hate Speech Detection Using Multilingual-BERT Model on Turkish Social Media." Information Technology and Control 51, no. 2 (June 23, 2022): 356–75. http://dx.doi.org/10.5755/j01.itc.51.2.29988.

Full text
Abstract:
Homophobic expressions are a form of insulting the sexual orientation or personality of people. Severe psychological traumas may occur in people who are exposed to this type of communication. It is important to develop automatic classification systems based on language models to examine social media content and distinguish homophobic discourse. This study aims to present a pre-trained Multilingual Bidirectional Encoder Representations from Transformers (M-BERT) model that can successfully detect whether Turkish comments on social media contain homophobic or related hate comments (i.e., sexist, severe humiliation, and defecation expressions). Comments in the Homophobic-Abusive Turkish Comments (HATC) dataset were collected from Instagram to train the detection models. The HATC dataset was manually labeled at the sentence level and combined with the Abusive Turkish Comments (ATC) dataset that was developed in our previous study. The HATC dataset was balanced using the resampling method, and two forms of the dataset (i.e., resHATC and original HATC) were used in the experiments. Afterward, the M-BERT model was compared with DL-based models (i.e., Long Short-Term Memory, Bidirectional Long Short-Term Memory (BiLSTM), Gated Recurrent Unit), Traditional Machine Learning (TML) classifiers (i.e., Support Vector Machine, Naive Bayes, Random Forest), and Ensemble Classifiers (i.e., Adaptive Boosting, eXtreme Gradient Boosting, Gradient Boosting) for the best model selection. The performance of the detection models was evaluated using the F1-score, precision, and recall performance metrics. Results showed that the best performance (homophobic F1-score: 82.64%, hateful F1-score: 91.75%, neutral F1-score: 96.08%, average F1-score: 90.15%) was achieved with the M-BERT model on the HATC dataset. The M-BERT detection model can increase the effectiveness of filters in detecting Turkish homophobic and related hate speech in social networks. It can also be used to detect homophobic and related hate speech in other languages, since the M-BERT model is pre-trained on multilingual data.
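As a rough illustration of the kind of pipeline this abstract describes, the sketch below fine-tunes multilingual BERT for three-way comment classification; the label mapping, placeholder data, and hyperparameters are illustrative assumptions, not the paper's exact setup.

```python
# Hedged sketch: fine-tuning multilingual BERT for 3-class comment classification.
# Label names, example text, and hyperparameters are assumptions for illustration only.
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import Dataset

labels = {"neutral": 0, "hateful": 1, "homophobic": 2}                   # assumed label mapping
train = Dataset.from_dict({"text": ["örnek yorum ..."], "label": [0]})   # placeholder data

tok = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased", num_labels=len(labels))

def encode(batch):
    # tokenize and pad comments to a fixed length
    return tok(batch["text"], truncation=True, padding="max_length", max_length=128)

train = train.map(encode, batched=True)

args = TrainingArguments(output_dir="mbert-comments", num_train_epochs=3,
                         per_device_train_batch_size=16, learning_rate=2e-5)
Trainer(model=model, args=args, train_dataset=train).train()
```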
APA, Harvard, Vancouver, ISO, and other styles
2

Deekshitha, G., and Leena Mary. "Multilingual spoken term detection: a review." International Journal of Speech Technology 23, no. 3 (July 22, 2020): 653–67. http://dx.doi.org/10.1007/s10772-020-09732-9.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Corazza, Michele, Stefano Menini, Elena Cabrio, Sara Tonelli, and Serena Villata. "A Multilingual Evaluation for Online Hate Speech Detection." ACM Transactions on Internet Technology 20, no. 2 (May 25, 2020): 1–22. http://dx.doi.org/10.1145/3377323.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Elouali, Aya, Zakaria Elberrichi, and Nadia Elouali. "Hate Speech Detection on Multilingual Twitter Using Convolutional Neural Networks." Revue d'Intelligence Artificielle 34, no. 1 (February 29, 2020): 81–88. http://dx.doi.org/10.18280/ria.340111.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Ghosh, Hiranmay, Sunil Kumar Kopparapu, Tanushyam Chattopadhyay, Ashish Khare, Sujal Subhash Wattamwar, Amarendra Gorai, and Meghna Pandharipande. "Multimodal Indexing of Multilingual News Video." International Journal of Digital Multimedia Broadcasting 2010 (2010): 1–18. http://dx.doi.org/10.1155/2010/486487.

Full text
Abstract:
The problems associated with automatic analysis of news telecasts are more severe in a country like India, where there are many national and regional language channels, besides English. In this paper, we present a framework for multimodal analysis of multilingual news telecasts, which can be augmented with tools and techniques for specific news analytics tasks. Further, we focus on a set of techniques for automatic indexing of the news stories based on keywords spotted in speech as well as on the visuals of contemporary and domain interest. English keywords are derived from RSS feed and converted to Indian language equivalents for detection in speech and on ticker texts. Restricting the keyword list to a manageable number results in drastic improvement in indexing performance. We present illustrative examples and detailed experimental results to substantiate our claim.
APA, Harvard, Vancouver, ISO, and other styles
6

Wijonarko, Panji, and Amalia Zahra. "Spoken language identification on 4 Indonesian local languages using deep learning." Bulletin of Electrical Engineering and Informatics 11, no. 6 (December 1, 2022): 3288–93. http://dx.doi.org/10.11591/eei.v11i6.4166.

Full text
Abstract:
Language identification is at the forefront of assistance in many applications, including multilingual speech systems, spoken language translation, multilingual speech recognition, and human-machine interaction via voice. The identification of Indonesian local languages using spoken language identification technology has enormous potential to advance tourism and digital content in Indonesia. The goal of this study is to identify four Indonesian local languages: Javanese, Sundanese, Minangkabau, and Buginese, utilizing deep learning classification techniques such as artificial neural network (ANN), convolutional neural network (CNN), and long short-term memory (LSTM). The feature selected for audio data extraction is the mel-frequency cepstral coefficient (MFCC). The results showed that the LSTM model had the highest accuracy for each speech duration (3 s, 10 s, and 30 s), followed by the CNN and ANN models.
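A minimal sketch of this kind of pipeline (MFCC features feeding an LSTM classifier); the file handling, shapes, and hyperparameters below are illustrative assumptions rather than the study's configuration.

```python
# Hedged sketch: MFCC extraction with librosa plus a small LSTM language-ID classifier.
import librosa
import numpy as np
import tensorflow as tf

LANGS = ["javanese", "sundanese", "minangkabau", "buginese"]

def mfcc_features(path, sr=16000, n_mfcc=13, max_frames=300):
    y, _ = librosa.load(path, sr=sr)
    m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).T        # (frames, n_mfcc)
    m = m[:max_frames]
    return np.pad(m, ((0, max_frames - len(m)), (0, 0)))         # pad/truncate to a fixed length

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(300, 13)),
    tf.keras.layers.LSTM(128),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(len(LANGS), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(X_train, y_train, ...)  # X_train: (n, 300, 13) MFCC matrices, y_train: language ids
```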
APA, Harvard, Vancouver, ISO, and other styles
7

Ma, Yiping, and Wei Wang. "MSFL: Explainable Multitask-Based Shared Feature Learning for Multilingual Speech Emotion Recognition." Applied Sciences 12, no. 24 (December 13, 2022): 12805. http://dx.doi.org/10.3390/app122412805.

Full text
Abstract:
Speech emotion recognition (SER), a rapidly evolving task that aims to recognize the emotion of speakers, has become a key research area in affective computing. However, various languages in multilingual natural scenarios extremely challenge the generalization ability of SER, causing the model performance to decrease quickly, and driving researchers to ask how to improve the performance of multilingual SER. Recent studies mainly use feature fusion and language-controlled models to address this challenge, but key points such as the intrinsic association of languages or deep analysis of multilingual shared features (MSFs) are still neglected. To solve this problem, an explainable Multitask-based Shared Feature Learning (MSFL) model is proposed for multilingual SER. The introduction of multi-task learning (MTL) can provide related task information of language recognition for MSFL, improve its generalization in multilingual situations, and further lay the foundation for learning MSFs. Specifically, considering the generalization capability and interpretability of the model, the powerful MTL module was combined with the long short-term memory and attention mechanism, aiming to maintain the generalization in multilingual situations. Then, the feature weights acquired from the attention mechanism were ranked in descending order, and the top-ranked MSFs were compared with top-ranked monolingual features, enhancing the model interpretability based on the feature comparison. Various experiments were conducted on Emo-DB, CASIA, and SAVEE corpora from the model generalization and interpretability aspects. Experimental results indicate that MSFL performs better than most state-of-the-art models, with an average improvement of 3.37–4.49%. Besides, the top 10 features in MSFs almost contain the top-ranked features in three monolingual features, which effectively demonstrates the interpretability of MSFL.
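As a rough sketch of the multitask shared-feature idea described above (not the authors' MSFL implementation), the model below pairs a shared BiLSTM encoder and simple attention pooling with separate emotion and language heads; the input shape, label inventories, and loss weights are illustrative assumptions.

```python
# Hedged sketch: multitask BiLSTM with attention pooling and two task heads.
from tensorflow.keras import layers, Model

n_frames, n_feats = 300, 39          # assumed fixed-length frame-level input (e.g. MFCC-like features)
n_emotions, n_languages = 7, 3       # assumed label inventories

inputs = layers.Input(shape=(n_frames, n_feats))
h = layers.Bidirectional(layers.LSTM(128, return_sequences=True))(inputs)   # shared encoder

# simple additive attention pooling: score each frame, normalize, take the weighted sum
scores = layers.Dense(1, activation="tanh")(h)                  # (batch, frames, 1)
weights = layers.Softmax(axis=1)(scores)                        # attention weights over time
context = layers.Flatten()(layers.Dot(axes=1)([weights, h]))    # (batch, 256)

emotion_out = layers.Dense(n_emotions, activation="softmax", name="emotion")(context)
language_out = layers.Dense(n_languages, activation="softmax", name="language")(context)

model = Model(inputs, [emotion_out, language_out])
model.compile(optimizer="adam",
              loss={"emotion": "sparse_categorical_crossentropy",
                    "language": "sparse_categorical_crossentropy"},
              loss_weights={"emotion": 1.0, "language": 0.3})   # auxiliary-task weight is an assumption
```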
APA, Harvard, Vancouver, ISO, and other styles
8

Vashistha, Neeraj, and Arkaitz Zubiaga. "Online Multilingual Hate Speech Detection: Experimenting with Hindi and English Social Media." Information 12, no. 1 (December 22, 2020): 5. http://dx.doi.org/10.3390/info12010005.

Full text
Abstract:
The last two decades have seen an exponential increase in the use of the Internet and social media, which has changed basic human interaction. This has led to many positive outcomes. At the same time, it has brought risks and harms. The volume of harmful content online, such as hate speech, is not manageable by humans. The interest in the academic community to investigate automated means for hate speech detection has increased. In this study, we analyse six publicly available datasets by combining them into a single homogeneous dataset. Having classified them into three classes, abusive, hateful or neither, we create a baseline model and improve model performance scores using various optimisation techniques. After attaining a competitive performance score, we create a tool that identifies and scores a page with an effective metric in near-real-time and uses the same feedback to re-train our model. We prove the competitive performance of our multilingual model in two languages, English and Hindi. This leads to comparable or superior performance to most monolingual models.
APA, Harvard, Vancouver, ISO, and other styles
9

Popli, Abhimanyu, and Arun Kumar. "Multilingual query-by-example spoken term detection in Indian languages." International Journal of Speech Technology 22, no. 1 (January 10, 2019): 131–41. http://dx.doi.org/10.1007/s10772-018-09585-3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Thanvanthri, Srinedhi, and Shivani Ramakrishnan. "Performance of Text Classification Methods in Detection of Hate Speech in Media." International Journal for Research in Applied Science and Engineering Technology 10, no. 3 (March 31, 2022): 354–58. http://dx.doi.org/10.22214/ijraset.2022.40567.

Full text
Abstract:
With the increased popularity of social media sites like Twitter and Instagram over the years, it has become easier for users of these sites to remain anonymous while taking part in hate speech against various peoples and communities. As a result, in an effort to curb such hate speech online, its detection has gained a lot more attention of late. Since curbing the growing amount of hate speech online by manual methods is not feasible, detection and control via Natural Language Processing and Deep Learning methods have gained popularity. In this paper, we evaluate the performance of a sequential model with the Universal Sentence Encoder against the RoBERTa method on different datasets for hate speech detection. The results of this study show greater overall performance from using a sequential model with a multilingual USE layer.
Keywords: Hate Speech Detection, RoBERTa, Universal Sentence Encoder, Sequential model.
APA, Harvard, Vancouver, ISO, and other styles

Dissertations / Theses on the topic "Term detection in multilingual speech"

1

Fancellu, Federico. "Computational models for multilingual negation scope detection." Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/33038.

Full text
Abstract:
Negation is a common property of languages, in that there are few languages, if any, that lack means to revert the truth-value of a statement. A challenge to cross-lingual studies of negation lies in the fact that languages encode and use it in different ways. Although this variation has been extensively researched in linguistics, little has been done in automated language processing. In particular, we lack computational models of processing negation that can be generalized across language. We even lack knowledge of what the development of such models would require. These models however exist and can be built by means of existing cross-lingual resources, even when annotated data for a language other than English is not available. This thesis shows this in the context of detecting string-level negation scope, i.e. the set of tokens in a sentence whose meaning is affected by a negation marker (e.g. 'not'). Our contribution has two parts. First, we investigate the scenario where annotated training data is available. We show that Bi-directional Long Short Term Memory (BiLSTM) networks are state-of-the-art models whose features can be generalized across language. We also show that these models suffer from genre effects and that for most of the corpora we have experimented with, high performance is simply an artifact of the annotation styles, where negation scope is often a span of text delimited by punctuation. Second, we investigate the scenario where annotated data is available in only one language, experimenting with model transfer. To test our approach, we first build NEGPAR, a parallel corpus annotated for negation, where pre-existing annotations on English sentences have been edited and extended to Chinese translations. We then show that transferring a model for negation scope detection across languages is possible by means of structured neural models where negation scope is detected on top of a cross-linguistically consistent representation, Universal Dependencies. On the other hand, we found cross-lingual lexical information only to help very little with performance. Finally, error analysis shows that performance is better when a negation marker is in the same dependency substructure as its scope and that some of the phenomena related to negation scope requiring lexical knowledge are still not captured correctly. In the conclusions, we tie up the contributions of this thesis and we point future work towards representing negation scope across languages at the level of logical form as well.
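A minimal sketch of a token-level BiLSTM scope tagger in the spirit of the models discussed above (not the thesis's exact architecture); the vocabulary size, sequence length, and input features are assumptions.

```python
# Hedged sketch: BiLSTM sequence tagger that labels each token as in- or out-of-scope.
import tensorflow as tf
from tensorflow.keras import layers

vocab_size, max_len = 20000, 60      # assumed vocabulary and sentence length
inputs = layers.Input(shape=(max_len,), dtype="int32")
x = layers.Embedding(vocab_size, 100, mask_zero=True)(inputs)
x = layers.Bidirectional(layers.LSTM(100, return_sequences=True))(x)
# one binary decision per token: inside vs. outside the scope of the negation cue
outputs = layers.TimeDistributed(layers.Dense(1, activation="sigmoid"))(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")
# Inputs are word-id sequences (optionally concatenated with cue/POS features);
# targets are per-token 0/1 scope labels.
```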
APA, Harvard, Vancouver, ISO, and other styles
2

Cesbron, Frédérique Chantal. "Pitch detection using the short-term phase spectrum." Thesis, Georgia Institute of Technology, 1992. http://hdl.handle.net/1853/15503.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Wallace, Roy Geoffrey. "Fast and accurate phonetic spoken term detection." Thesis, Queensland University of Technology, 2010. https://eprints.qut.edu.au/39610/1/Roy_Wallace_Thesis.pdf.

Full text
Abstract:
For the first time in human history, large volumes of spoken audio are being broadcast, made available on the internet, archived, and monitored for surveillance every day. New technologies are urgently required to unlock these vast and powerful stores of information. Spoken Term Detection (STD) systems provide access to speech collections by detecting individual occurrences of specified search terms. The aim of this work is to develop improved STD solutions based on phonetic indexing. In particular, this work aims to develop phonetic STD systems for applications that require open-vocabulary search, fast indexing and search speeds, and accurate term detection. Within this scope, novel contributions are made within two research themes, that is, accommodating phone recognition errors and, secondly, modelling uncertainty with probabilistic scores. A state-of-the-art Dynamic Match Lattice Spotting (DMLS) system is used to address the problem of accommodating phone recognition errors with approximate phone sequence matching. Extensive experimentation on the use of DMLS is carried out and a number of novel enhancements are developed that provide for faster indexing, faster search, and improved accuracy. Firstly, a novel comparison of methods for deriving a phone error cost model is presented to improve STD accuracy, resulting in up to a 33% improvement in the Figure of Merit. A method is also presented for drastically increasing the speed of DMLS search by at least an order of magnitude with no loss in search accuracy. An investigation is then presented of the effects of increasing indexing speed for DMLS, by using simpler modelling during phone decoding, with results highlighting the trade-off between indexing speed, search speed and search accuracy. The Figure of Merit is further improved by up to 25% using a novel proposal to utilise word-level language modelling during DMLS indexing. Analysis shows that this use of language modelling can, however, be unhelpful or even disadvantageous for terms with a very low language model probability. The DMLS approach to STD involves generating an index of phone sequences using phone recognition. An alternative approach to phonetic STD is also investigated that instead indexes probabilistic acoustic scores in the form of a posterior-feature matrix. A state-of-the-art system is described and its use for STD is explored through several experiments on spontaneous conversational telephone speech. A novel technique and framework is proposed for discriminatively training such a system to directly maximise the Figure of Merit. This results in a 13% improvement in the Figure of Merit on held-out data. The framework is also found to be particularly useful for index compression in conjunction with the proposed optimisation technique, providing for a substantial index compression factor in addition to an overall gain in the Figure of Merit. These contributions significantly advance the state-of-the-art in phonetic STD, by improving the utility of such systems in a wide range of applications.
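To illustrate the approximate phone-sequence matching that underlies this style of phonetic STD (this is not the DMLS implementation), the sketch below scores a query phone sequence against an indexed sequence with a weighted edit distance whose substitution costs would come from a phone confusion model; the costs and threshold shown are made up.

```python
# Hedged sketch: weighted edit distance between a query phone sequence and an indexed one.
def match_cost(query, indexed, sub_cost, ins_cost=1.0, del_cost=1.0):
    """sub_cost[(a, b)] is an assumed cost from a phone confusion model."""
    m, n = len(query), len(indexed)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + del_cost
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + ins_cost
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0.0 if query[i - 1] == indexed[j - 1] else sub_cost.get(
                (query[i - 1], indexed[j - 1]), 1.0)
            d[i][j] = min(d[i - 1][j] + del_cost,       # deletion
                          d[i][j - 1] + ins_cost,       # insertion
                          d[i - 1][j - 1] + sub)        # (mis)match
    return d[m][n]

# A putative hit is accepted when its cost falls below a tuned threshold.
costs = {("p", "b"): 0.3, ("s", "z"): 0.4}              # illustrative confusion costs
print(match_cost(["k", "a", "t"], ["k", "a", "d"], costs) < 1.0)
```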
APA, Harvard, Vancouver, ISO, and other styles
4

Zhang, Yaodong. "Unsupervised speech processing with applications to query-by-example spoken term detection." Thesis, Massachusetts Institute of Technology, 2013. http://hdl.handle.net/1721.1/79217.

Full text
Abstract:
This thesis is motivated by the challenge of searching and extracting useful information from speech data in a completely unsupervised setting. In many real-world speech processing problems, obtaining annotated data is not cost- or time-effective. We therefore ask how much we can learn from speech data without any transcription. To address this question, in this thesis we chose query-by-example spoken term detection as a specific scenario to demonstrate that this task can be done in the unsupervised setting without any annotations. To build the unsupervised spoken term detection framework, we contributed three main techniques to form a complete working flow. First, we present two posteriorgram-based speech representations which enable speaker-independent and noisy spoken term matching. The feasibility and effectiveness of both posteriorgram features are demonstrated through a set of spoken term detection experiments on different datasets. Second, we show two lower-bounding based methods for Dynamic Time Warping (DTW) based pattern matching algorithms. Both algorithms greatly outperform the conventional DTW in a single-threaded computing environment. Third, we describe the parallel implementation of the lower-bounded DTW search algorithm. Experimental results indicate that the total running time of the entire spoken term detection system grows linearly with corpus size. We also present the training of large Deep Belief Networks (DBNs) on Graphics Processing Units (GPUs). The phonetic classification experiment on the TIMIT corpus showed a speed-up of 36x for pre-training and 45x for back-propagation for a two-layer DBN trained on the GPU platform compared to the CPU platform.
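As a minimal illustration of posteriorgram-based query matching with DTW (the thesis's lower-bounding and parallel search are not reproduced here), the sketch below aligns a query posteriorgram against a segment using a negative log inner-product frame distance.

```python
# Hedged sketch: DTW alignment cost between two posteriorgrams.
import numpy as np

def frame_dist(p, q, eps=1e-10):
    """Distance between two posteriorgram frames: negative log inner product."""
    return -np.log(np.dot(p, q) + eps)

def dtw_cost(query, segment):
    """query, segment: (frames, n_classes) posteriorgrams whose rows sum to 1."""
    m, n = len(query), len(segment)
    D = np.full((m + 1, n + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            c = frame_dist(query[i - 1], segment[j - 1])
            D[i, j] = c + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[m, n] / (m + n)   # length-normalized alignment cost

# Sliding a query over utterance windows and thresholding the normalized cost yields detections.
```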
APA, Harvard, Vancouver, ISO, and other styles
5

Kalantari, Shahram. "Improving spoken term detection using complementary information." Thesis, Queensland University of Technology, 2015. https://eprints.qut.edu.au/90074/1/Shahram_Kalantari_Thesis.pdf.

Full text
Abstract:
This research has made contributions to the area of spoken term detection (STD), defined as the process of finding all occurrences of a specified search term in a large collection of speech segments. The use of visual information, in the form of the speaker's lip movements, in addition to audio, together with the topic of the speech segments and the expected frequency of words in the target speech domain, is proposed. By using this complementary information, improvements in STD performance have been achieved, enabling efficient search for keywords in large collections of multimedia documents.
APA, Harvard, Vancouver, ISO, and other styles
6

Lau, Suk-han. "The effect of type and level of noise on long-term average speech spectrum (LTASS)." Hong Kong: University of Hong Kong, 1998. http://sunzi.lib.hku.hk/hkuto/record.jsp?B17896253.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Lau, Suk-han (劉淑). "The effect of type and level of noise on long-term average speech spectrum (LTASS)." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1998. http://hub.hku.hk/bib/B31251031.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Abbs, Brandon Robert. "The temporal dynamics of auditory memory for static and dynamic sounds." Diss., University of Iowa, 2008. http://ir.uiowa.edu/etd/4.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Rouvier, Mickaël. "Structuration de contenus audio-visuel pour le résumé automatique" [Structuring audio-visual content for automatic summarization]. Thesis, Avignon, 2011. http://www.theses.fr/2011AVIG0192/document.

Full text
Abstract:
In recent years, with the advent of sites such as Youtube, Dailymotion and Blip TV, the number of videos available on the Internet has increased considerably. The size of these collections and their lack of structure limit content-based access to this data. Automatic summarization is one way to produce snippets that extract the essential content and present it as concisely as possible. In this work, we focus on extraction-based video summarization methods grounded in the analysis of the audio channel. We address the scientific problems related to this objective: content extraction, document structuring, the definition and estimation of interest functions, and algorithms for composing the summaries. On each of these aspects, we make concrete proposals that are evaluated. For content extraction, we present a fast spoken term detection method. The main novelty of this approach is that it relies on building a detector as a function of the terms being searched. We show that this self-organization strategy improves system robustness, which significantly exceeds that of the classical approach based on automatic speech recognition. We then present an acoustic filtering method based on Gaussian mixture models and factor analysis as recently used in speaker identification. The originality of our contribution lies in the use of factor analysis decompositions for the supervised estimation of filters operating in the cepstral domain. We then address the structuring of video collections. We show that using different levels of representation and different sources of information makes it possible to characterize the editorial style of a video based mainly on audio analysis, whereas most previous work suggested that the bulk of genre-related information was contained in the image. Another contribution concerns the identification of the type of discourse; we propose low-level models for detecting spontaneous speech that substantially improve on the state of the art for this kind of approach. The third focus of this work concerns the summary itself. In the context of automatic video summarization, we first try to define what a synthetic view is: is it what characterizes the document as a whole, or what a user would remember of it (for example, a moving or funny moment)? This question is discussed, and we make concrete proposals for defining interest functions corresponding to three criteria: salience, expressiveness and significance. We then propose an algorithm for finding the summary of maximal interest, derived from one introduced in previous work and based on integer linear programming.
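As a simplified illustration of the final selection step, the sketch below chooses segments that maximize a precomputed interest score under a duration budget; the thesis formulates this as an integer linear program, whereas this stand-in uses a 0/1 knapsack dynamic program with made-up inputs.

```python
# Hedged sketch: pick summary segments maximizing total interest under a duration budget.
def select_segments(durations, interests, budget):
    """durations: integer seconds per segment; interests: scores; budget: max summary length."""
    n = len(durations)
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d, v = durations[i - 1], interests[i - 1]
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]
            if d <= b:
                best[i][b] = max(best[i][b], best[i - 1][b - d] + v)
    # backtrack to recover the selected segment indices
    chosen, b = [], budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(i - 1)
            b -= durations[i - 1]
    return sorted(chosen)

print(select_segments([20, 35, 15, 40], [2.0, 3.5, 1.0, 3.0], budget=60))  # -> [0, 1]
```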
APA, Harvard, Vancouver, ISO, and other styles
10

Popli, Abhimanyu. "Framework for query-by-example and text based spoken term detection in multilingual and mixlingual speech." Thesis, 2018. http://localhost:8080/iit/handle/2074/7640.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Book chapters on the topic "Term detection in multilingual speech"

1

Mary, Leena, and Deekshitha G. "Spoken Term Detection Techniques." In SpringerBriefs in Speech Technology, 61–70. Cham: Springer International Publishing, 2018. http://dx.doi.org/10.1007/978-3-319-97761-4_5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Švec, Jan, Luboš Šmídl, and Josef V. Psutka. "An Analysis of the RNN-Based Spoken Term Detection Training." In Speech and Computer, 119–29. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-66429-3_11.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Itoh, Yoshiaki, Hiroyuki Saito, Kazuyo Tanaka, and Shi-wook Lee. "Pseudo Real-Time Spoken Term Detection Using Pre-retrieval Results." In Speech and Computer, 264–70. Cham: Springer International Publishing, 2013. http://dx.doi.org/10.1007/978-3-319-01931-4_35.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Figuerola, Carlos G., Angel F. Zazo, José L. Alonso Berrocal, and Emilio Rodríguez Vázquez de Aldana. "Interactive and Bilingual Question Answering Using Term Suggestion and Passage Retrieval." In Multilingual Information Access for Text, Speech and Images, 363–70. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005. http://dx.doi.org/10.1007/11519645_37.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Lee, Kyung-Soon, and Kyo Kageura. "Multilingual Story Link Detection Based on Event Term Weighting on Times and Multilingual Spaces." In Digital Libraries: International Collaboration and Cross-Fertilization, 398–407. Berlin, Heidelberg: Springer Berlin Heidelberg, 2004. http://dx.doi.org/10.1007/978-3-540-30544-6_43.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Vavruška, Jan, Jan Švec, and Pavel Ircing. "Phonetic Spoken Term Detection in Large Audio Archive Using the WFST Framework." In Text, Speech, and Dialogue, 402–9. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-40585-3_51.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Adhikary, Dibya Ranjan Das, Jitesh Pradhan, Abhinav Kumar, and Brijendra Pratap Singh. "A Multilingual Review of Hate Speech Detection in Social Media Content." In Cybercrime in Social Media, 85–106. Boca Raton: Chapman and Hall/CRC, 2023. http://dx.doi.org/10.1201/9781003304180-5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Lahoti, Anshul, Krishna Gurugubelli, Juan Rafael Orozco Arroyave, and Anil Kumar Vuppala. "Long-Term Average Spectral Feature-Based Parkinson’s Disease Detection from Speech." In Lecture Notes in Electrical Engineering, 603–12. Singapore: Springer Nature Singapore, 2022. http://dx.doi.org/10.1007/978-981-19-0840-8_46.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Velankar, Abhishek, Hrushikesh Patil, and Raviraj Joshi. "Mono vs Multilingual BERT for Hate Speech Detection and Text Classification: A Case Study in Marathi." In Artificial Neural Networks in Pattern Recognition, 121–28. Cham: Springer International Publishing, 2022. http://dx.doi.org/10.1007/978-3-031-20650-4_10.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Bhagat, Dhritesh, Aritra Ray, Adarsh Sarda, Nilanjana Dutta Roy, Mufti Mahmud, and Debashis De. "Improving Mental Health Through Multimodal Emotion Detection from Speech and Text Data Using Long-Short Term Memory." In Lecture Notes in Networks and Systems, 13–23. Singapore: Springer Nature Singapore, 2023. http://dx.doi.org/10.1007/978-981-19-5191-6_2.

Full text
APA, Harvard, Vancouver, ISO, and other styles

Conference papers on the topic "Term detection in multilingual speech"

1

Anguera, Xavier, Luis Javier Rodriguez-Fuentes, Igor Szőke, Andi Buzo, Florian Metze, and Mikel Penagarikano. "Query-by-example spoken term detection on multilingual unconstrained speech." In Interspeech 2014. ISCA: ISCA, 2014. http://dx.doi.org/10.21437/interspeech.2014-522.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Knill, K. M., M. J. F. Gales, S. P. Rath, P. C. Woodland, C. Zhang, and S. X. Zhang. "Investigation of multilingual deep neural networks for spoken term detection." In 2013 IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU). IEEE, 2013. http://dx.doi.org/10.1109/asru.2013.6707719.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Ram, Dhananjay, Lesly Miculicich, and Herve Bourlard. "Multilingual Bottleneck Features for Query by Example Spoken Term Detection." In 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). IEEE, 2019. http://dx.doi.org/10.1109/asru46091.2019.9003752.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

Buzo, Andi, Horia Cucu, Mihai Safta, and Corneliu Burileanu. "Multilingual query by example spoken term detection for under-resourced languages." In 2013 7th Conference on Speech Technology and Human - Computer Dialogue (SpeD 2013). IEEE, 2013. http://dx.doi.org/10.1109/sped.2013.6682655.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Mullick, Ankan. "Exploring Multilingual Intent Dynamics and Applications." In Thirty-Second International Joint Conference on Artificial Intelligence {IJCAI-23}. California: International Joint Conferences on Artificial Intelligence Organization, 2023. http://dx.doi.org/10.24963/ijcai.2023/818.

Full text
Abstract:
Multilingual intent detection and the exploration of its different characteristics have been a major field of study for the last few years. However, detecting intent dynamics from text or voice, especially in Indian multilingual contexts, is a challenging task. My first research question is therefore on intent detection, and I then work on its application in the Indian multilingual healthcare scenario. Speech dialogue systems are designed with a pre-defined set of intents to perform user-specified tasks. Newer intentions may surface over time that call for retraining. However, the newer intents may not be explicitly announced and need to be inferred dynamically. Hence, there are two crucial jobs: (a) recognizing newly emergent intents; and (b) annotating the data of the new intents in order to effectively retrain the underlying classifier. The tasks become especially challenging when a large number of new intents emerge simultaneously and there is a limited budget for manual annotation. We develop MNID (Multiple Novel Intent Detection), a cluster-based framework that can identify multiple novel intents while optimizing the human annotation cost. Empirical findings on numerous benchmark datasets (of varying sizes) show that MNID surpasses the baseline approaches in terms of accuracy and F1-score by wisely allocating the budget for annotation. We apply the intent detection approach to different domains in Indian multilingual scenarios: healthcare, finance, etc. The creation of advanced NLU healthcare systems is threatened by the lack of data and by technology constraints for resource-poor languages in developing nations like India. We evaluate the current state of several cutting-edge language models used in healthcare with the goal of detecting query intents and corresponding entities. We conduct comprehensive trials on a number of models in different realistic contexts, and we investigate their practical relevance depending on the budget and the availability of English data.
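As a rough sketch in the spirit of cluster-based novel intent discovery (not the MNID system itself), the snippet below clusters utterance embeddings and flags clusters lying far from all known-intent centroids as candidates for annotation; the embedding source, cluster count, and distance threshold are assumptions.

```python
# Hedged sketch: flag candidate novel-intent clusters by distance from known-intent centroids.
import numpy as np
from sklearn.cluster import KMeans

def find_novel_clusters(utterance_emb, known_centroids, n_clusters=20, dist_threshold=0.8):
    """utterance_emb: (n, d) sentence embeddings; known_centroids: (k, d) centroids of known intents."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(utterance_emb)
    novel = []
    for c, centroid in enumerate(km.cluster_centers_):
        # distance from this cluster to the closest known intent
        d = np.min(np.linalg.norm(known_centroids - centroid, axis=1))
        if d > dist_threshold:
            novel.append(c)   # candidate novel intent: send its member utterances for annotation
    return novel, km.labels_
```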
APA, Harvard, Vancouver, ISO, and other styles
6

Kapil, Prashant, and Asif Ekbal. "A Transformer based Multi-Task Learning Approach Leveraging Translated and Transliterated Data to Hate Speech Detection in Hindi." In 3rd International Conference on Data Science and Machine Learning (DSML 2022). Academy and Industry Research Collaboration Center (AIRCC), 2022. http://dx.doi.org/10.5121/csit.2022.121516.

Full text
Abstract:
The increase in usage of the internet has also led to an increase in unsocial activities, and hate speech is one of them. The growth of hate speech over the past few years has been one of the biggest problems, and automated techniques need to be developed to detect it. This paper aims to use eight publicly available Hindi datasets and explore different deep neural network techniques to detect aggression, hate, abuse, etc. We experimented with multilingual bidirectional encoder representations from transformers (M-BERT) and multilingual representations for Indian languages (MuRIL) in four settings: (i) a single-task learning (STL) framework; (ii) transferring the encoder knowledge to a recurrent neural network (RNN); (iii) multi-task learning (MTL), where the eight Hindi datasets were jointly trained; and (iv) pre-training the encoder with English tweets translated to Devanagari script and the same Devanagari script transliterated to romanized Hindi tweets, and then fine-tuning it in the MTL fashion. Experimental evaluation shows that cross-lingual information in MTL helps in improving the performance on all the datasets by a significant margin, hence outperforming the state-of-the-art approaches in terms of weighted F1 score. Qualitative and quantitative error analysis is also done to show the effects of the proposed approach.
APA, Harvard, Vancouver, ISO, and other styles
7

Motlicek, Petr, Fabio Valente, and Philip N. Garner. "English spoken term detection in multilingual recordings." In Interspeech 2010. ISCA: ISCA, 2010. http://dx.doi.org/10.21437/interspeech.2010-86.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Arango Monnar, Ayme, Jorge Perez, Barbara Poblete, Magdalena Saldaña, and Valentina Proust. "Resources for Multilingual Hate Speech Detection." In Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH). Stroudsburg, PA, USA: Association for Computational Linguistics, 2022. http://dx.doi.org/10.18653/v1/2022.woah-1.12.

Full text
APA, Harvard, Vancouver, ISO, and other styles
9

Röttger, Paul, Haitham Seelawi, Debora Nozza, Zeerak Talat, and Bertie Vidgen. "Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models." In Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH). Stroudsburg, PA, USA: Association for Computational Linguistics, 2022. http://dx.doi.org/10.18653/v1/2022.woah-1.15.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Sarfjoo, Seyyed Saeed, Srikanth Madikeri, and Petr Motlicek. "Speech Activity Detection Based on Multilingual Speech Recognition System." In Interspeech 2021. ISCA: ISCA, 2021. http://dx.doi.org/10.21437/interspeech.2021-1058.

Full text
APA, Harvard, Vancouver, ISO, and other styles