To see the other types of publications on this topic, follow the link: Indic language- Term detection.

Journal articles on the topic 'Indic language- Term detection'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'Indic language- Term detection.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles on a wide variety of disciplines and organise your bibliography correctly.

1

Harika, S., T. Yamini, T. Nagasaikamesh, S. H. Basha, S. Santosh Kumar, and Mrs S. Sri DurgaKameswari. "Alzheimers Disease Detection Using Different Machine Learning Algorithms." International Journal for Research in Applied Science and Engineering Technology 10, no. 10 (October 31, 2022): 62–66. http://dx.doi.org/10.22214/ijraset.2022.46937.

Full text
Abstract:
Alzheimer’s disease is the most common form of dementia, affecting parts of the brain. “Dementia” is a broad term used to describe illnesses and conditions that cause a deterioration in memory, language, and other cognitive abilities severe enough to interfere with daily life. According to estimates, this disease affects 6.2 million Americans and 5 million people in India aged 65 and older. In 2019, the most recent year for which data are available, official death certificates reported 121,499 deaths from AD, making Alzheimer’s the “sixth leading cause of death in the country and the fifth leading cause of death for people 65 and older”. In this paper, we apply several machine learning algorithms, such as Decision Trees, SVM, Logistic Regression, and Naive Bayes, to identify AD at an early stage. The Alzheimer's Disease Neuroimaging Initiative (ADNI) and the Open Access Series of Imaging Studies (OASIS) provide the data sets used to detect the disease in its early stage. The datasets consist of longitudinal MRI data (age, gender, mini-mental status, CDR). For each method, metrics such as precision, F1 score, recall, and specificity are calculated. The Decision Tree algorithm achieved the highest accuracy, 93.7%.
APA, Harvard, Vancouver, ISO, and other styles
2

Mongia, Anoushka. "DEVELOPING AN EFFECTIVE MACHINE LEARNING ALGORITHM SYSTEM IN THE EARLY DETECTION AND DIAGNOSIS OF ALZHEIMER’S DISEASE." International Journal of Research in Medical Sciences and Technology 11, no. 01 (2021): 222–29. http://dx.doi.org/10.37648/ijrmst.v11i01.022.

Abstract:
"Dementia" is a broad term used to describe diseases and conditions that cause deterioration in memory, language, and other mental capacities severe enough to interfere with day-to-day life. Alzheimer's disease (AD) is the most well-known type of dementia, affecting parts of the brain. According to estimates, this disorder affects 6.2 million Americans and 5 million people in India aged 65 and older. In 2019, the latest year for which data are available, official death certificates reported 121,499 deaths from AD, making Alzheimer's the "sixth leading cause of death in the nation". In this paper, we propose machine learning algorithms such as Decision Trees (DT), SVM, Linear Regression, and Naive Bayes to determine AD at an early stage. The Alzheimer's Disease Neuroimaging Initiative (ADNI) and the Open Access Series of Imaging Studies provide the data sets used to identify the disease in its early stage. The datasets comprise longitudinal MRI data (age, gender, mini-mental status, CDR). For each method, metrics such as accuracy, F1 score, recall, and specificity are determined. The DT algorithm achieved the highest accuracy, 93.7%.
3

Chitransh, Apar, and Birinderjit Singh Kalyan. "ARM Microcontroller Based Wireless Industrial Automation System." Indian Journal of Microprocessors and Microcontroller 1, no. 2 (September 10, 2021): 8–11. http://dx.doi.org/10.35940/ijmm.b1705.091221.

Abstract:
In the modern era, most work is completed using new and advanced technologies, and most industries run on robotics. In India, however, many companies rely on a variety of technologies, such as embedded systems, PLCs, and Arduino boards for alcohol and gas detection and, most importantly, microcontrollers. In this paper we discuss an ARM microcontroller-based wireless industrial automation system. The system consists of a coordinator module and a sensor module. The coordinator module is connected to the monitoring computer and is therefore also called the centralized unit; the sensor module is an ARM microcontroller that monitors and controls the whole plant. The coordinator unit's main job is to collect all data from the sensor module and forward that information to the IP network. For communication between the two modules we use ZigBee technology, whose main role is to set and control the plant's various parameters. The ARM microcontroller is programmed in embedded C. This paper presents an in-depth study of the ARM microcontroller, the wireless industrial automation system, and ZigBee technology.
4

Chitransh, Apar, and Birinderjit Singh Kalyan. "ARM Microcontroller Based Wireless Industrial Automation System." Indian Journal of Microprocessors and Microcontroller 1, no. 2 (September 10, 2021): 8–11. http://dx.doi.org/10.54105/ijmm.b1705.091221.

Abstract:
In the modern era, most work is completed using new and advanced technologies, and most industries run on robotics. In India, however, many companies rely on a variety of technologies, such as embedded systems, PLCs, and Arduino boards for alcohol and gas detection and, most importantly, microcontrollers. In this paper we discuss an ARM microcontroller-based wireless industrial automation system. The system consists of a coordinator module and a sensor module. The coordinator module is connected to the monitoring computer and is therefore also called the centralized unit; the sensor module is an ARM microcontroller that monitors and controls the whole plant. The coordinator unit's main job is to collect all data from the sensor module and forward that information to the IP network. For communication between the two modules we use ZigBee technology, whose main role is to set and control the plant's various parameters. The ARM microcontroller is programmed in embedded C. This paper presents an in-depth study of the ARM microcontroller, the wireless industrial automation system, and ZigBee technology.
5

Prusty, Sashikanta, Sujit Kumar Dash, Srikanta Patnaik, and Sushree Gayatri Priyadarsini Prusty. "IMPACT OF COVID-19 ON BREAST CANCER SCREENING PROGRAM (BCSP) IN INDIA." Indian Journal of Computer Science and Engineering 14, no. 3 (June 20, 2023): 416–28. http://dx.doi.org/10.21817/indjcse/2023/v14i2/231403132.

Abstract:
In the past three years, the COVID-19 virus has spread rapidly worldwide, with low- and middle-income countries affected most. Emergency limits were imposed due to the rapid infection and significant mortality rates, and only emergency medical treatments were available during the shutdowns and lockdowns in India. All non-emergency treatments, such as the Breast Cancer Screening Program (BCSP), were temporarily halted due to the huge number of deaths caused by the coronavirus. However, the ability of BC screening programs to improve survival rates while lowering mortality rates has been well demonstrated, and suspension may result in poorer outcomes for patients with BC. In this regard, early detection and treatment are critical for increased survival and long-term quality of life. We have therefore taken breast cancer patients' data for the six years from 2016 to 2021 in India to properly evaluate and analyze for our research. Assessing recent results for various features from modeled evaluations can aid pandemic responses. In addition, we propose a novel method that applies the EDA technique to graphically represent BC patients' data. The experiment was carried out in the Python programming language on the Jupyter 6.4.3 platform. We found a sudden rise in BC patients, from lakhs to millions, in 2019. This suggests that the deadly coronavirus greatly affected people during the pandemic, when people were more concerned about the virus than about breast screening.
6

Prusty, Sashikanta, Sujit Kumar Dash, Srikanta Patnaik, and Sushree Gayatri Priyadarsini Prusty. "IMPACT OF COVID-19 ON BREAST CANCER SCREENING PROGRAM (BCSP) IN INDIA." Indian Journal of Computer Science and Engineering 14, no. 3 (June 20, 2023): 416–28. http://dx.doi.org/10.21817/indjcse/2023/v14i3/231403132.

Abstract:
In the past three years, the COVID-19 virus has spread rapidly worldwide, with low- and middle-income countries affected most. Emergency limits were imposed due to the rapid infection and significant mortality rates, and only emergency medical treatments were available during the shutdowns and lockdowns in India. All non-emergency treatments, such as the Breast Cancer Screening Program (BCSP), were temporarily halted due to the huge number of deaths caused by the coronavirus. However, the ability of BC screening programs to improve survival rates while lowering mortality rates has been well demonstrated, and suspension may result in poorer outcomes for patients with BC. In this regard, early detection and treatment are critical for increased survival and long-term quality of life. We have therefore taken breast cancer patients' data for the six years from 2016 to 2021 in India to properly evaluate and analyze for our research. Assessing recent results for various features from modeled evaluations can aid pandemic responses. In addition, we propose a novel method that applies the EDA technique to graphically represent BC patients' data. The experiment was carried out in the Python programming language on the Jupyter 6.4.3 platform. We found a sudden rise in BC patients, from lakhs to millions, in 2019. This suggests that the deadly coronavirus greatly affected people during the pandemic, when people were more concerned about the virus than about breast screening.
7

Sadanandam, Manchala. "HMM Based Language Identification from Speech Utterances of Popular Indic Languages Using Spectral and Prosodic Features." Traitement du Signal 38, no. 2 (April 30, 2021): 521–28. http://dx.doi.org/10.18280/ts.380232.

Abstract:
A language identification (LID) system automatically recognises the language of a short unknown speech utterance. It extracts discriminative features and reveals the language to which the utterance belongs. In this paper, we consider concatenated feature vectors of Mel Frequency Cepstral Coefficients (MFCC) and pitch for designing the LID system. We build one reference model per language from the 14-dimensional feature vectors using Hidden Markov Models (HMMs), and each test sample is evaluated against the reference models of all listed languages. The likelihood of the test sample's feature vectors under each model is used to decide the language of the unknown utterance. We consider seven Indian languages in the experimental setup and evaluate the performance of the system. The average performance is 89.31% and 90.63% for three-state and four-state HMMs, respectively, on 3-second test speech utterances; notably, the system gives significant results with 3 seconds of test speech for the four-state HMM even though we follow a simple procedure.
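As an aside, the decision rule this abstract describes (score the unknown utterance against every language's reference model and pick the most likely) can be sketched in a few lines. The toy below stands in a single one-dimensional Gaussian per language for the paper's 14-dimensional MFCC-and-pitch HMMs; the language set, model parameters, and feature values are all invented for illustration.

```python
import math

# Toy per-language models: (mean, std) of a 1-D feature, NOT real HMMs.
MODELS = {
    "telugu": (2.0, 1.0),
    "hindi":  (5.0, 1.0),
    "tamil":  (8.0, 1.0),
}

def log_likelihood(frames, mean, std):
    """Sum of per-frame Gaussian log-densities."""
    return sum(
        -0.5 * math.log(2 * math.pi * std**2) - (x - mean)**2 / (2 * std**2)
        for x in frames
    )

def identify_language(frames):
    """Score the utterance against every reference model, return the argmax."""
    scores = {
        lang: log_likelihood(frames, m, s) for lang, (m, s) in MODELS.items()
    }
    return max(scores, key=scores.get)

frames = [4.8, 5.1, 5.3, 4.9]  # fake feature sequence near the "hindi" model
print(identify_language(frames))  # -> hindi
```

In the paper's setting, `log_likelihood` would be replaced by HMM forward-algorithm scoring over the MFCC+pitch vectors, but the argmax decision over per-language scores is the same.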
8

Allassonnière-Tang, Marc, and Marcin Kilarski. "Functions of gender and numeral classifiers in Nepali." Poznan Studies in Contemporary Linguistics 56, no. 1 (March 26, 2020): 113–68. http://dx.doi.org/10.1515/psicl-2020-0004.

Abstract:
We examine the complex nominal classification system in Nepali (Indo-European, Indic), a language spoken at the intersection of the Indo-European and Sino-Tibetan language families, which are usually associated with prototypical examples of grammatical gender and numeral classifiers, respectively. In a typologically rare pattern, Nepali possesses two gender systems based on the human/non-human and masculine/feminine oppositions, in addition to which it has also developed an inventory of at least ten numeral classifiers as a result of contact with neighbouring Sino-Tibetan languages. Based on an analysis of the lexical and discourse functions of the three systems, we show that their functional contribution involves a largely complementary distribution of workload with respect to individual functions as well as the type of categorized nouns and referents. The study thus contributes to the ongoing discussions concerning the typology and functions of nominal classification as well as the effects of long-term language contact on language structure.
9

Bera, Abhijit, Mrinal Kanti Ghose, and Dibyendu Kumar Pal. "Sentiment Analysis of Multilingual Tweets Based on Natural Language Processing (NLP)." International Journal of System Dynamics Applications 10, no. 4 (October 2021): 1–12. http://dx.doi.org/10.4018/ijsda.20211001.oa16.

Abstract:
Multilingual sentiment analysis plays an important role in a country like India, with its many languages, as the style of expression varies between languages. Indians speak a total of 22 different languages, and with the help of the Google Indic keyboard people can express their sentiments, i.e., reviews about anything, on social media in their native language from their smartphones. It has been found that the machine learning approach overcomes the limitations of other approaches. In this paper, a detailed study based on Natural Language Processing (NLP) is carried out using a Simple Neural Network (SNN), a Convolutional Neural Network (CNN), and a Long Short-Term Memory (LSTM) neural network, followed by an amalgamated model that adds a CNN layer on top of the LSTM, without worrying about the versatility of multilingualism. Around 4000 samples of reviews in English, Hindi, and Bengali are used to generate outputs for the above models and analyzed. The experimental results on these realistic reviews are found to be effective for further research work.
10

Torbati, Amir Hossein Harati Nejad, and Joseph Picone. "Predicting search term reliability for spoken term detection systems." International Journal of Speech Technology 17, no. 1 (June 6, 2013): 1–9. http://dx.doi.org/10.1007/s10772-013-9197-1.

11

Deekshitha, G., and Leena Mary. "Multilingual spoken term detection: a review." International Journal of Speech Technology 23, no. 3 (July 22, 2020): 653–67. http://dx.doi.org/10.1007/s10772-020-09732-9.

12

Sakuntharaj, Ratnasingam. "Detecting and Correcting Contextual Mistakes in Sentences Using Part of Speech Tags." Asian Journal of Research in Computer Science 15, no. 2 (March 4, 2023): 25–31. http://dx.doi.org/10.9734/ajrcos/2023/v15i2317.

Abstract:
A grammar checker is a tool that checks each sentence in a text to see whether it conforms to the grammar. When it finds a structure that conflicts with the grammar, it gives suggestions for alternatives. Grammar checkers for European languages and some Indic languages are well developed; however, perhaps owing to Tamil being a morphologically rich and agglutinative language, this has remained a challenging task for Tamil. This paper proposes an approach to detecting and correcting grammatical mistakes due to subject and finite-verb disagreement with regard to person, number, and/or gender, and due to disagreement in tense aspects, in Tamil sentences. The proposed method uses hierarchical part-of-speech tags of words to detect these mistakes, and two sets of Tamil grammar rules are used to generate suggestions for correcting them. Test results show that the proposed grammatical mistake detection and correction system performs well.
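The core idea of POS-tag-based agreement checking can be sketched very simply. The tag scheme below ("PRP-3SG" for a third-person-singular pronoun, "VBF-3PL" for a finite verb inflected third-person plural) is an invented flat stand-in for the paper's hierarchical Tamil tag set, and the example words are purely illustrative.

```python
# Assumed toy tag format: CATEGORY-FEATURES, e.g. "VBF-3SG".
def agreement_features(tag):
    """Extract the person/number suffix from a tag like 'VBF-3SG'."""
    return tag.split("-", 1)[1] if "-" in tag else None

def check_agreement(tagged_sentence):
    """Return (subject, verb) pairs whose person/number features disagree."""
    subjects = [(w, agreement_features(t))
                for w, t in tagged_sentence if t.startswith("PRP")]
    verbs = [(w, agreement_features(t))
             for w, t in tagged_sentence if t.startswith("VBF")]
    return [
        (s, v) for s, sf in subjects for v, vf in verbs
        if sf and vf and sf != vf
    ]

# "avan" (he, 3SG) paired with a 3PL verb form is flagged as a mistake.
sentence = [("avan", "PRP-3SG"), ("vandhaarkal", "VBF-3PL")]
print(check_agreement(sentence))  # -> [('avan', 'vandhaarkal')]
```

A real system would add the rule-driven correction step, generating the correctly inflected verb form for each flagged pair.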
13

Wagh, Jagruti, Gandhali Jadhav, Neha Joshi, Atharva Kamble, and Yukta Patil. "Multi-Lingual Sign language Detection System." International Journal for Research in Applied Science and Engineering Technology 10, no. 11 (November 30, 2022): 1912–14. http://dx.doi.org/10.22214/ijraset.2022.47718.

Abstract:
Sign language is the best possible communication facility for the hearing-impaired population; as a result, its importance is increasing, and research is needed for a system's successful implementation. This medium is a vital part of the communication process for individuals with hearing loss. In this paper, we present a framework for developing a word recognition system using deep learning methods, each of which has its strengths and limitations. We review the various methods available for sign language recognition, such as the convolutional neural network (CNN) combined with long short-term memory (LSTM), the recurrent neural network (RNN), and the hidden Markov model (HMM), examining each model's behaviour and operation. The goal is a model that reads both static and dynamic hand gestures and gives output in written form.
14

Dutta, Uttaran. "Digital Preservation of Indigenous Culture and Narratives from the Global South: In Search of an Approach." Humanities 8, no. 2 (March 28, 2019): 68. http://dx.doi.org/10.3390/h8020068.

Abstract:
This research seeks to digitally preserve cultural histories and artifacts, which are practiced/produced in the underserved indigenous spaces of rural eastern India. This paper is a case study of co-developing Sangraksha, a digital humanities application. The application seeks to facilitate the process of writing history from below by underrepresented populations at the margins. The villages in this research were geographically remote and socio-economically underdeveloped. The research populations represented individuals who possessed low levels of literacy, limited language proficiency in English and mainstream Indic languages (e.g., Hindi and Bengali), as well as limited familiarity with computers and computing environments. Grounded in long-term ethnographic engagements in the remote Global South, this study explored a range of cultural, aesthetic, and contextual factors that were instrumental in shaping and co-generating digital humanities solutions for under-researched international populations. On one hand, the research initiative sought to co-create a culturally meaningful and welcoming digital environment to make the experience contextually appropriate and user-friendly. On the other hand, grounded in visual and sensory methodologies, this research used community-generated imagery and multimedia (audio, photographs, and audio-visual material) to make the application inclusive and accessible. Moreover, the application-development effort also paid close attention to intercultural, local-centric, community-driven co-design aspects to make the approach socially embedded and sustainable in the long term.
15

M J, Carmel Mary Belinda, Ravikumar S, Muhammad Arif, Dhilip Kumar V, Antony Kumar K, and Arulkumaran G. "Linguistic Analysis of Hindi-English Mixed Tweets for Depression Detection." Journal of Mathematics 2022 (April 12, 2022): 1–7. http://dx.doi.org/10.1155/2022/3225920.

Abstract:
According to recent studies, young adults in India have faced mental health issues due to university closures and loss of income, low self-esteem, distress, and reported symptoms of anxiety and/or depressive disorder (43%). It is therefore high time to come up with a solution. A new classifier is proposed to find individuals who might have depression based on their tweets on the social media platform Twitter. The proposed model is based on linguistic analysis and text classification, calculating probabilities using TF*IDF (term frequency-inverse document frequency). Indians tend to tweet predominantly in English, Hindi, or a mix of the two (colloquially known as Hinglish). In the proposed approach, data collected from Twitter is screened by a classifier built using the multinomial Naive Bayes algorithm and grid search, the latter used for hyperparameter optimization. Each tweet is classified as depressed or not depressed. The entire architecture works over the English and Hindi languages, which should help with implementation globally and across multiple platforms and help put a stop to ever-increasing depression rates in a methodical and automated manner. In the proposed model pipeline, the combined techniques achieve better results: 96.15% accuracy and an F1 score of 0.914.
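The TF*IDF weighting the abstract mentions is straightforward to compute by hand. The sketch below uses a three-document toy corpus (invented sentences, not the paper's Twitter data) and a simple unsmoothed IDF; library implementations such as scikit-learn's typically use smoothed variants.

```python
import math
from collections import Counter

# Toy corpus for illustration only.
docs = [
    "i feel hopeless and alone",
    "great day with friends",
    "i feel sad and alone today",
]

def tf_idf(term, doc_index, corpus):
    """tf = term count / doc length; idf = log(N / document frequency)."""
    tokens = [d.split() for d in corpus]
    tf = Counter(tokens[doc_index])[term] / len(tokens[doc_index])
    df = sum(1 for d in tokens if term in d)
    idf = math.log(len(corpus) / df)  # simple unsmoothed IDF variant
    return tf * idf

# "alone" occurs in two of three docs, so its IDF (and weight) is low;
# "friends" occurs in only one doc, so it is weighted more heavily there.
print(round(tf_idf("alone", 0, docs), 3))    # -> 0.081
print(round(tf_idf("friends", 1, docs), 3))  # -> 0.275
```

In the paper's pipeline these weights would form the feature vectors fed to the multinomial Naive Bayes classifier.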
16

Apriliyanto, Andi, and Retno Kusumaningrum. "HOAX DETECTION IN INDONESIA LANGUAGE USING LONG SHORT-TERM MEMORY MODEL." SINERGI 24, no. 3 (July 13, 2020): 189. http://dx.doi.org/10.22441/sinergi.2020.3.003.

Abstract:
Nowadays, the internet and social media are growing fast. This has both positive and negative effects on society: they are media for communicating and sharing information without limitation, but many people use that ease to broadcast news or information that does not accord with the facts and to sway people's opinions for personal benefit, which we call a hoax. Therefore, we need to develop a system that can detect hoaxes. This research uses a neural network method with a Long Short-Term Memory (LSTM) model. Identifying hoaxes with the LSTM model involves several steps: dataset collection, data pre-processing, word embedding using pre-trained Word2Vec, and building the LSTM model. Detection performance is measured using the precision, recall, and f1-measure metrics. This research yields a highest average precision of 0.819, recall of 0.809, and f1-measure of 0.807. These results are obtained with the following parameter combination: Skip-gram Word2Vec model architecture, hierarchical softmax, a vector dimension of 100, max pooling, a dropout value of 0.5, and a learning rate of 0.001.
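The precision, recall, and f1-measure figures reported above are standard binary-classification metrics, computable directly from true and predicted labels. The labels below are toy values, not the paper's results.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# 1 = hoax, 0 = legitimate (toy predictions)
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
print(precision_recall_f1(y_true, y_pred))  # each metric is 2/3 here
```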
17

Fieri, Brillian, and Derwin Suhartono. "Offensive Language Detection Using Soft Voting Ensemble Model." MENDEL 29, no. 1 (June 30, 2023): 1–6. http://dx.doi.org/10.13164/mendel.2023.1.001.

Abstract:
Offensive language is one of the problems that have become increasingly severe along with the rise of the internet and social media usage. This language can be used to attack a person or specific groups. Automatic moderation, such as the usage of machine learning, can help detect and filter this particular language for someone who needs it. This study focuses on improving the performance of the soft voting classifier to detect offensive language by experimenting with the combinations of the soft voting estimators. The model was applied to a Twitter dataset that was augmented using several augmentation techniques. The features were extracted using Term Frequency-Inverse Document Frequency, sentiment analysis, and GloVe embedding. In this study, there were two types of soft voting models: machine learning-based, with the estimators of Random Forest, Decision Tree, Logistic Regression, Naïve Bayes, and AdaBoost as the best combination, and deep learning-based, with the best estimator combination of Convolutional Neural Network, Bidirectional Long Short-Term Memory, and Bidirectional Gated Recurrent Unit. The results of this study show that the soft voting classifier was better in performance compared to classic machine learning and deep learning models on both original and augmented datasets.
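Soft voting, as used in this abstract, amounts to averaging each estimator's predicted class probabilities and taking the argmax. The probabilities and estimator names below are invented for illustration, not outputs of the paper's trained models.

```python
def soft_vote(probabilities):
    """probabilities: per-estimator lists [p_not_offensive, p_offensive].

    Returns (predicted class index, averaged probabilities).
    """
    n = len(probabilities)
    avg = [sum(p[i] for p in probabilities) / n
           for i in range(len(probabilities[0]))]
    return avg.index(max(avg)), avg

estimator_probs = [
    [0.40, 0.60],  # e.g. a Random Forest's output (made up)
    [0.55, 0.45],  # e.g. Logistic Regression (made up)
    [0.30, 0.70],  # e.g. Naive Bayes (made up)
]
label, avg = soft_vote(estimator_probs)
print(label)  # -> 1: the averaged probability favours "offensive"
```

Note that one dissenting estimator (the second) is outvoted because the average, not a majority of hard labels, decides the class; that is the difference from hard voting.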
18

Tejedor, Javier, Michal Fapšo, Igor Szöke, Jan “Honza” Černocký, and František Grézl. "Comparison of methods for language-dependent and language-independent query-by-example spoken term detection." ACM Transactions on Information Systems 30, no. 3 (August 2012): 1–34. http://dx.doi.org/10.1145/2328967.2328971.

19

Mandal, Anupam, K. R. Prasanna Kumar, and Pabitra Mitra. "Recent developments in spoken term detection: a survey." International Journal of Speech Technology 17, no. 2 (December 14, 2013): 183–98. http://dx.doi.org/10.1007/s10772-013-9217-1.

20

Creese, Helen. "Judicial processes and legal authority in pre-colonial Bali." Bijdragen tot de taal-, land- en volkenkunde / Journal of the Humanities and Social Sciences of Southeast Asia 165, no. 4 (2009): 515–50. http://dx.doi.org/10.1163/22134379-90003631.

Abstract:
Law codes with their origins in Indic-influenced Old Javanese systems of knowledge comprise an important genre in the Balinese textual record. Written in Kawi – a term encompassing Old Javanese, Middle Javanese and High Balinese – the legal corpus forms a complex and overlapping web of indigenous legal texts and traditions that encompass the codification and administration of civil and criminal justice as well as concepts of morality and right conduct. The most significant codes include the Adhigama, Kuṭāramānawa, Pūrwādhigama, Sārasamuccaya, Swarajambu, Dewāgama (also called Krĕtopapati) and Dewadanda. Each of these law codes belongs to a shared tradition of legal thought and practice that is linked to Sanskrit Mānavadharmaśāstra traditions. Manu’s code, most notably the aṣṭadaśawyawahāra section detailing the eighteen grounds for litigation, was adopted as the model of legal textual principle in the early stages of contact between ancient India and the Indonesian archipelago. Over the course of many centuries, this model informed legal and juridical practice and was adapted and modified to suit indigenous needs. The law codes remained in use in Java until the advent of Islam towards the end of the fifteenth century, and in Bali until the colonial period in the late nineteenth and early twentieth centuries. The Balinese legal textual corpus comprises dozens of interrelated manuscripts, some complete and some fragmentary. They provide significant insights into pre-colonial judicial practices and forms of government. This article provides a survey of the corpus of legal texts and explores the nature of law in pre-colonial Bali.
21

Creese, Helen. "Old Javanese legal traditions in pre-colonial Bali." Bijdragen tot de taal-, land- en volkenkunde / Journal of the Humanities and Social Sciences of Southeast Asia 165, no. 2-3 (2009): 241–90. http://dx.doi.org/10.1163/22134379-90003636.

Abstract:
Law codes with their origins in Indic-influenced Old Javanese systems of knowledge comprise an important genre in the Balinese textual record. Written in Kawi—a term encompassing Old Javanese, Middle Javanese and High Balinese—the legal corpus forms a complex and overlapping web of indigenous legal texts and traditions that encompass the codification and administration of civil and criminal justice as well as concepts of morality and right conduct. The most significant codes include the Adhigama, Kuṭāramānawa, Pūrwādhigama, Sārasamuccaya, Swarajambu, Dewāgama (also called Krĕtopapati) and Dewadanda. Each of these law codes belongs to a shared tradition of legal thought and practice that is linked to Sanskrit Mānavadharmaśāstra traditions. Manu’s code, most notably the aṣṭadaśawyawahāra section detailing the eighteen grounds for litigation, was adopted as the model of legal textual principle in the early stages of contact between ancient India and the Indonesian archipelago. Over the course of many centuries, this model informed legal and juridical practice and was adapted and modified to suit indigenous needs. The law codes remained in use in Java until the advent of Islam towards the end of the fifteenth century, and in Bali until the colonial period in the late nineteenth and early twentieth centuries. The Balinese legal textual corpus comprises dozens of interrelated manuscripts, some complete and some fragmentary. They provide significant insights into pre-colonial judicial practices and forms of government. This article provides a survey of the corpus of legal texts and explores the nature of law in pre-colonial Bali.
22

MIZUOCHI, Satoru, Takashi NOSE, and Akinori ITO. "Spoken Term Detection of Zero-Resource Language Using Posteriorgram of Multiple Languages." Interdisciplinary Information Sciences 28, no. 1 (2022): 1–13. http://dx.doi.org/10.4036/iis.2022.a.04.

23

Hoste, Véronique, Klaar Vanopstal, Els Lefever, and Isabelle Delaere. "Classification-based scientific term detection in patient information." Terminology 16, no. 1 (May 10, 2010): 1–29. http://dx.doi.org/10.1075/term.16.1.01hos.

Abstract:
Although intended for the “average layman”, both in terms of readability and contents, the current patient information still contains many scientific terms. Different studies have concluded that the use of scientific terminology is one of the factors, which greatly influences the readability of this patient information. The present study deals with the problem of automatic term recognition of overly scientific terminology as a first step towards the replacement of the recognized scientific terms by their popular counterpart. In order to do so, we experimented with two approaches, a dictionary-based approach and a learning-based approach, which is trained on a rich feature vector. The research was conducted on a bilingual corpus of English and Dutch EPARs (European Public Assessment Report). Our results show that we can extract scientific terms with a high accuracy (> 80%, 10% below human performance) for both languages. Furthermore, we show that a lexicon-independent approach, which solely relies on orthographical and morphological information is the most powerful predictor of the scientific character of a given term.
24

Cabezas-García, Melania, and Santiago Chambó. "Multi-word term variation." Revista Española de Lingüística Aplicada/Spanish Journal of Applied Linguistics 34, no. 2 (December 15, 2021): 402–34. http://dx.doi.org/10.1075/resla.19012.cab.

Abstract:
Complex nominals (CNs) are frequently found in specialized discourse in all languages, since they are a productive method of creating terms by combining existing lexical units. In Spanish, a conceptual combination may often be rendered with a prepositional CN (PCN) or an equivalent adjectival CN (ACN), e.g., demanda de electricidad vs. demanda eléctrica [electricity demand]. Adjectives in ACNs – usually derived from nouns – are known as ‘relational adjectives’ because they encode semantic relations with other concepts. With recent exceptions, research has focused on the underlying semantic relations in CNs. In natural language processing, several works have dealt with the automatic detection of relational adjectives in Romance and Germanic languages. However, there are, to our knowledge, no discourse studies of these CNs aimed at establishing writer recommendations. This study analyzed the co-text of equivalent PCNs and ACNs to identify factors governing the use of a certain form. EcoLexicon ES, a corpus of Spanish environmental specialized texts, was used to extract 6 relational adjectives and, subsequently, a set of 12 pairs of equivalent CNs. Their behavior in co-text was analyzed by querying EcoLexicon ES and a general language corpus with 20 expressions in CQP syntax. Our results showed that the immediate linguistic co-text determined the preference for a particular structure. Based on these findings, we provide writing guidelines to assist in the production of CNs.
APA, Harvard, Vancouver, ISO, and other styles
25

Norouzian, Atta, and Richard Rose. "An approach for efficient open vocabulary spoken term detection." Speech Communication 57 (February 2014): 50–62. http://dx.doi.org/10.1016/j.specom.2013.09.002.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Popli, Abhimanyu, and Arun Kumar. "Multilingual query-by-example spoken term detection in Indian languages." International Journal of Speech Technology 22, no. 1 (January 10, 2019): 131–41. http://dx.doi.org/10.1007/s10772-018-09585-3.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Wang, Xuyang, Pengyuan Zhang, Xingyu Na, Jielin Pan, and Yonghong Yan. "Handling OOV Words in Mandarin Spoken Term Detection with an Hierarchical n-Gram Language Model." Chinese Journal of Electronics 26, no. 6 (November 1, 2017): 1239–44. http://dx.doi.org/10.1049/cje.2017.07.004.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Mehmood, Aneela, Muhammad Shoaib Farooq, Ansar Naseem, Furqan Rustam, Mónica Gracia Villar, Carmen Lili Rodríguez, and Imran Ashraf. "Threatening URDU Language Detection from Tweets Using Machine Learning." Applied Sciences 12, no. 20 (October 14, 2022): 10342. http://dx.doi.org/10.3390/app122010342.

Full text
Abstract:
Technology’s expansion has contributed to the rise in popularity of social media platforms. Twitter is one of the leading social media platforms that people use to share their opinions. Such opinions may sometimes contain threatening text, deliberate or not, which can be disturbing for other users. Consequently, the detection of threatening content on social media is an important task. Contrary to high-resource languages like English, Dutch, and others that have several such approaches, the low-resource Urdu language does not have such a luxury. Therefore, this study presents an intelligent threatening language detection approach for the Urdu language. A stacking model is proposed that uses an extra tree (ET) classifier and Bayes theorem-based Bernoulli Naive Bayes (BNB) as the base learners, while logistic regression (LR) is employed as the meta learner. A performance analysis is carried out by deploying a support vector classifier, ET, LR, BNB, a fully connected network, a convolutional neural network, long short-term memory, and a gated recurrent unit. Experimental results indicate that the stacked model performs better than both the machine learning and deep learning models. With 74.01% accuracy, 70.84% precision, 75.65% recall, and a 73.99% F1 score, the model outperforms the existing benchmark study.
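The stacking arrangement described (ET and BNB as base learners, LR as meta learner) maps directly onto scikit-learn's `StackingClassifier`. The toy English sentences below stand in for the Urdu tweet corpus, and TF-IDF is an assumed featurization; only the estimator layout follows the abstract:

```python
# Hedged sketch of the stacking model: extra trees + Bernoulli naive Bayes
# as base learners, logistic regression as the meta learner. Toy data and
# TF-IDF features are our stand-ins, not the study's Urdu dataset.
from sklearn.ensemble import ExtraTreesClassifier, StackingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline

texts = ["I will hurt you", "have a nice day", "you will regret this",
         "lovely weather today", "watch your back", "thanks for the help"]
labels = [1, 0, 1, 0, 1, 0]  # 1 = threatening

model = make_pipeline(
    TfidfVectorizer(),
    StackingClassifier(
        estimators=[("et", ExtraTreesClassifier(random_state=0)),
                    ("bnb", BernoulliNB())],
        final_estimator=LogisticRegression(),
        cv=2,  # small fold count only because the toy corpus is tiny
    ),
)
model.fit(texts, labels)
print(model.predict(["have a lovely day"]))
```

The meta learner sees the base learners' cross-validated predictions as its inputs, which is what lets the stack outperform any single base model.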
APA, Harvard, Vancouver, ISO, and other styles
29

Ramírez, Javier, José C. Segura, Carmen Benítez, Ángel de la Torre, and Antonio Rubio. "Efficient voice activity detection algorithms using long-term speech information." Speech Communication 42, no. 3-4 (April 2004): 271–87. http://dx.doi.org/10.1016/j.specom.2003.10.002.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Pham, Van Tung, Haihua Xu, Xiong Xiao, Nancy F. Chen, Eng Siong Chng, and Haizhou Li. "Re-ranking spoken term detection with acoustic exemplars of keywords." Speech Communication 104 (November 2018): 12–23. http://dx.doi.org/10.1016/j.specom.2018.09.004.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Verma, Aishwarya R. "Comparative Analysis of Language Translation and Detection System Using Machine Learning." International Journal for Research in Applied Science and Engineering Technology 9, no. 8 (August 31, 2021): 1200–1211. http://dx.doi.org/10.22214/ijraset.2021.37577.

Full text
Abstract:
Abstract: Words are the core units of meaning, expressed through speech, writing, or signs. It is important that the message sent conveys the same meaning to the one who receives it. The evolution from manual translation to digital machine translation has helped us greatly in finding meanings that are at least close to the exact intended meaning. To give machine translators a more human-friendly feel, natural language processing (NLP) combined with machine learning (ML) makes the best combination. The main challenges in machine-translated sentences involve ambiguities, lexical divergence, syntactic and lexical mismatches, semantic issues, etc., which can appear in grammar, spelling, punctuation, spacing, etc. After analyzing different algorithms, we implemented two machine translators using two different Long Short-Term Memory (LSTM) approaches and performed a comparative study of the quality of the translated text based on their respective accuracies. We used two different training approaches with encoding-decoding techniques on the same datasets, translating source English text to target Hindi text. To detect whether the text entered is English or Hindi, we used a sequential LSTM training model, analyzed on the basis of its accuracy. As a result, the first LSTM model is 84% accurate and the second LSTM model is 71% accurate in translating English to Hindi text, while the detection LSTM model is 78% accurate in detecting English text and 81% accurate in detecting Hindi text. This study has helped us analyze the appropriate machine translation approach based on its accuracy. Keywords: Accuracy, Decoding, Machine Learning (ML), Detection System, Encoding, Long Short-Term Memory (LSTM), Machine Translation, Natural Language Processing (NLP), Sequential
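The paper's English/Hindi detector is an LSTM, but for this particular language pair a far simpler non-neural baseline exists, because Hindi is written in Devanagari. The script-counting sketch below is our illustration of such a baseline, not the paper's method:

```python
# Hedged baseline for English/Hindi detection: count Devanagari letters
# (Unicode block U+0900–U+097F). This is our simplification for contrast
# with the paper's sequential LSTM detector, not its actual approach.

def detect_language(text: str) -> str:
    """Return 'hi' if the text is mostly Devanagari letters, else 'en'."""
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return "en"  # arbitrary default for empty/non-letter input
    devanagari = sum(1 for c in letters if "\u0900" <= c <= "\u097F")
    return "hi" if devanagari / len(letters) > 0.5 else "en"

print(detect_language("नमस्ते दुनिया"))  # 'hi'
print(detect_language("Hello world"))    # 'en'
```

A learned detector earns its keep on romanized Hindi or mixed-script input, where script counting alone fails.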
APA, Harvard, Vancouver, ISO, and other styles
32

Gütl, Christian. "Editorial." JUCS - Journal of Universal Computer Science 28, no. 7 (July 28, 2022): 670. http://dx.doi.org/10.3897/jucs.90508.

Full text
Abstract:
Dear Readers Welcome to the seventh issue in 2022. I am very pleased to announce the journal’s improved Scopus CiteScore of 2.7 and the Web of Science impact factor of 1.056 for 2021, indicating another scientifically successful year. On behalf of the J.UCS team, I would like to thank all the authors for their sound research contributions, the reviewers for their very helpful suggestions, and the consortium members for their financial support. Your commitment and dedicated work have contributed significantly to the long-lasting success of our journal. In this regular issue, I am very pleased to introduce four accepted papers from four different countries and 11 involved authors. Daisy Ferreira Brito, Monalessa P. Barcellos and Gleison Santos from Brazil address in their research a pattern language to support software measurement planning for statistical process control. More specifically, they use the Goal-Question-Metric format and organize it in a Measurement Planning Pattern Language. Ajay Kumar from India presents a hybridized neuro-fuzzy approach for software reliability prediction and validates the proposed approach by applying the neuro-fuzzy method to a software failure dataset. Jesus Serrano-Guerrero, Bashar Alshouha, Francisco P. Romero and Jose A. Olivas from Spain conduct a comparative study of affective knowledge-enhanced emotion detection in Arabic language. Jing Qiu, Feng Dong and Guanglu Sun from China propose and discuss a disassembly method based on a code extension selection network by combining traditional linear sweep and recursive traversal methods.
APA, Harvard, Vancouver, ISO, and other styles
33

Chakraborty, Subhalaxmi, Prayosi Paul, Suparna Bhattacharjee, Soumadeep Sarkar, and Arindam Chakraborty. "Sign Language Recognition Using Landmark Detection, GRU and LSTM." American Journal of Electronics & Communication 3, no. 3 (January 2, 2023): 20–26. http://dx.doi.org/10.15864/ajec.3305.

Full text
Abstract:
Speech impairment is a kind of disability that affects an individual's ability to communicate. People with this problem use sign language to communicate. Although communication through sign language is well established, a communication gap remains between signing and non-signing people. To overcome this complexity, researchers are trying to develop systems using deep learning approaches. The main objective of this paper is to implement a vision-based application that translates sign language into voice messages and text, reducing the gap between the two groups mentioned above. The proposed model extracts temporal and spatial features from video sequences. To extract the spatial features, MediaPipe Holistic has been used, which provides solutions for detecting face, hand, and pose landmarks. Different kinds of RNNs (Recurrent Neural Networks), namely LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit), have been used to train on the temporal features. Using both models on American Sign Language, 99% accuracy has been achieved. The experimental results show that the recognition method with MediaPipe Holistic followed by a GRU or LSTM can achieve a high recognition rate that meets the needs of a real-time Sign Language Recognition system. This analysis is expected to facilitate the creation of intelligent Sign Language Recognition systems, support knowledge accumulation, and provide direction toward the correct path.
APA, Harvard, Vancouver, ISO, and other styles
34

Farooq, Muhammad Shoaib, Ansar Naseem, Furqan Rustam, and Imran Ashraf. "Fake news detection in Urdu language using machine learning." PeerJ Computer Science 9 (May 23, 2023): e1353. http://dx.doi.org/10.7717/peerj-cs.1353.

Full text
Abstract:
With the rise of social media, the dissemination of forged content and news has been on the rise. Consequently, fake news detection has emerged as an important research problem. Several approaches have been presented to discriminate fake news from real news; however, such approaches lack robustness for multi-domain datasets, especially within the context of Urdu news. In addition, some studies use datasets machine-translated from English with Google Translate, without manual verification. This limits the wide use of such approaches for real-world applications. This study investigates these issues and proposes a fake news classifier for Urdu news. The dataset has been collected covering nine different domains and comprises 4097 news articles. Experiments are performed using term frequency-inverse document frequency (TF-IDF) and a bag of words (BoW) in combination with n-grams. The major contribution of this study is the use of feature stacking, where feature vectors of the preprocessed text and of verbs extracted from the preprocessed text are combined. Support vector machine, k-nearest neighbor, and ensemble models like random forest (RF) and extra tree (ET) were used for bagging, while stacking was applied with ET and RF as base learners and logistic regression as the meta learner. To check the robustness of the models, fivefold and independent set testing were employed. Experimental results indicate that stacking achieves 93.39%, 88.96%, 96.33%, 86.2%, and 93.17% scores for accuracy, specificity, sensitivity, MCC, ROC, and F1 score, respectively.
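The feature-stacking idea (combining feature vectors of the full text with features of extracted verbs) amounts to horizontally concatenating two sparse matrices. In the sketch below the verb extraction is faked with a hand-made list rather than a POS tagger, and the data and classifier choice are our assumptions:

```python
# Hedged sketch of feature stacking: TF-IDF over the full text stacked
# with a bag-of-words over extracted "verbs". The toy data and hand-made
# verb list are ours; the study uses a real Urdu corpus and POS tagging.
from scipy.sparse import hstack
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = ["minister denies fake report", "team wins final match",
         "leader denies secret deal", "club signs new player"]
verbs = ["denies", "wins", "denies", "signs"]  # stand-in for POS-extracted verbs
labels = [1, 0, 1, 0]  # 1 = fake

tfidf = TfidfVectorizer(ngram_range=(1, 2))  # unigrams + bigrams
bow = CountVectorizer()
X = hstack([tfidf.fit_transform(texts), bow.fit_transform(verbs)])

clf = LogisticRegression().fit(X, labels)
X_new = hstack([tfidf.transform(["president denies claim"]),
                bow.transform(["denies"])])
print(clf.predict(X_new))
```

Each view contributes its own columns, so the classifier can weight verb evidence independently of the full-text features.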
APA, Harvard, Vancouver, ISO, and other styles
35

Bouaine, Chaimaa, Faouzia Benabbou, and Imane Sadgali. "Word Embedding for High Performance Cross-Language Plagiarism Detection Techniques." International Journal of Interactive Mobile Technologies (iJIM) 17, no. 10 (May 22, 2023): 69–91. http://dx.doi.org/10.3991/ijim.v17i10.38891.

Full text
Abstract:
Academic plagiarism has become a serious concern as it leads to the retardation of scientific progress and violation of intellectual property. In this context, we make a study aiming at the detection of cross-linguistic plagiarism based on Natural language Preprocessing (NLP), Embedding Techniques, and Deep Learning. Many systems have been developed to tackle this problem, and many rely on machine learning and deep learning methods. In this paper, we propose Cross-language Plagiarism Detection (CL-PD) method based on Doc2Vec embedding techniques and a Siamese Long Short-Term Memory (SLSTM) model. Embedding techniques help capture the text's contextual meaning and improve the CL-PD system's performance. To show the effectiveness of our method, we conducted a comparative study with other techniques such as GloVe, FastText, BERT, and Sen2Vec on a dataset combining PAN11, JRC-Acquis, Europarl, and Wikipedia. The experiments for the Spanish-English language pair show that Doc2Vec+SLSTM achieve the best results compared to other relevant models, with an accuracy of 99.81%, a precision of 99.75%, a recall of 99.88%, an f-score of 99.70%, and a very small loss in the test phase.
APA, Harvard, Vancouver, ISO, and other styles
36

Ram, Dhananjay, Afsaneh Asaei, and Hervé Bourlard. "Phonetic subspace features for improved query by example spoken term detection." Speech Communication 103 (October 2018): 27–36. http://dx.doi.org/10.1016/j.specom.2018.07.001.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Et.al, Hazlina Shariff. "Non-Functional Requirement Detection Using Machine Learning and Natural Language Processing." Turkish Journal of Computer and Mathematics Education (TURCOMAT) 12, no. 3 (April 10, 2021): 2224–29. http://dx.doi.org/10.17762/turcomat.v12i3.1171.

Full text
Abstract:
A key aspect of software quality is that the software operates correctly and meets user needs. A primary concern with non-functional requirements (NFRs) is that they are often neglected because their information is hidden in the documents. An NFR is tacit knowledge about the system, and users usually hardly know how to describe NFRs; hence NFRs tend to be absent during the elicitation process. The software engineer has to act proactively to elicit software quality criteria from the user so that the objectives of the requirements can be achieved. To overcome these problems, we use machine learning to detect indicator terms of NFRs in textual requirements, so that we can remind the software engineer to elicit the missing NFRs. We developed a prototype tool to support our approach, classifying the textual requirements using supervised machine learning algorithms. A survey was done to evaluate the effectiveness of the prototype tool in detecting NFRs.
APA, Harvard, Vancouver, ISO, and other styles
38

Soni, Jayesh. "An Efficient LSTM Model for Fake News Detection." Computer Science & Engineering: An International Journal 12, no. 2 (April 30, 2022): 1–10. http://dx.doi.org/10.5121/cseij.2022.12201.

Full text
Abstract:
Information spread through online social media or sites has increased drastically with the swift growth of the Internet. Unverified or fake news reaches numerous users without concern for the trustworthiness of the information. Such fake news is created for political or commercial interests to mislead users. In current society, the spread of misinformation is a big challenge. Hence, we propose a deep learning-based Long Short-Term Memory (LSTM) classifier for fake news classification. Textual content is the primary unit in the fake news scenario. Therefore, natural language processing-based feature extraction is used to generate language-driven features. Experimental results show that NLP-based feature extraction with an LSTM model achieves a higher accuracy rate in considerably less time.
APA, Harvard, Vancouver, ISO, and other styles
39

Kartinawati, Komang Triyani, Luh Gede Pradnyawati, and I. Made Eka Dwipayana. "EARLY DETECTION OF CHILDREN'S DEVELOPMENT ON STUNTING TODDLERS." JURNAL KEDOKTERAN 7, no. 2 (July 8, 2022): 57. http://dx.doi.org/10.36679/kedokteran.v7i2.521.

Full text
Abstract:
Stunting is impaired growth in children caused by long-term insufficient nutrition from conception until the age of two. According to the 2018 Basic Health Research, the prevalence of stunting in Indonesia was 30.8%, higher than the WHO’s target. As stated in the Nutritional Status Monitoring, the prevalence of stunting in Karangasem from 2015 to 2017 was 27.5%, 26.1%, and 23.6%, respectively. Early detection of child development status could be utilized as a pilot project to construct an intervention scheme for stunting prevention. This research aims at in-depth learning about the developmental status of stunting toddlers in the village of Ban, covering gross motor, fine motor, language, and adaptive skills. The research method was a qualitative design with a phenomenological approach. In-depth interviews were arranged with 15 participants selected by purposive sampling, including 5 stunting toddlers, 5 mothers or babysitters of those toddlers, and 5 health personnel in the field. The results showed impairment of child development in long-term stunting toddlers. Without stimulation from parents, babysitters, and the social environment, these impaired children would undergo prolonged developmental failure. The data also showed that the majority of children experienced impairment in fine motor-adaptive skills, whereas their gross motor and language skills were still normal. Keywords: Stunting; Nutrition; Child Development; In-depth Interview.
APA, Harvard, Vancouver, ISO, and other styles
40

Muckenhirn, Hannah, Pavel Korshunov, Mathew Magimai-Doss, and Sebastien Marcel. "Long-Term Spectral Statistics for Voice Presentation Attack Detection." IEEE/ACM Transactions on Audio, Speech, and Language Processing 25, no. 11 (November 2017): 2098–111. http://dx.doi.org/10.1109/taslp.2017.2743340.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Dorman, Michael F., Luke M. Smith, Korine Dankowski, Geary McCandless, and James L. Parkin. "Long-Term Measures of Electrode Impedance and Auditory Thresholds for the Ineraid Cochlear Implant." Journal of Speech, Language, and Hearing Research 35, no. 5 (October 1992): 1126–30. http://dx.doi.org/10.1044/jshr.3505.1126.

Full text
Abstract:
Measures of electrode impedance and of detection thresholds for electrical stimuli were extracted from the records of patients implanted with the Ineraid cochlear prosthesis. An analysis of impedance measures, obtained at 1, 12, 24, and 36 months after surgery, demonstrated (a) a significant decrease in impedance over the first year for electrodes that carried current and (b) significant increases in impedance at 24 and 36 months for electrodes that did not carry current. An analysis of detection thresholds, obtained at the same times as the impedance measures, demonstrated that averaged thresholds for the current-carrying electrodes varied no more than 0.5 dB over the 3-year period. These results support the conclusion that stimulation with the Ineraid device does not produce deleterious changes in the electrodes or in the target neural tissue.
APA, Harvard, Vancouver, ISO, and other styles
42

Srivastava, Rahul, and Pawan Singh. "Fake news Detection Using Naive Bayes Classifier." Journal of Management and Service Science (JMSS) 2, no. 1 (February 25, 2022): 1–7. http://dx.doi.org/10.54060/jmss/002.01.005.

Full text
Abstract:
Fake news has been on the rise thanks to rapid digitalization across all platforms and mediums. Many governments throughout the world are attempting to address this issue. The use of Natural Language Processing and Machine Learning techniques to properly identify fake news is the subject of this research. The data is cleaned and feature extraction is performed using pre-processing techniques. Then, employing four distinct strategies, a fake news detection model is created. Finally, the research examines and contrasts the accuracy of Naive Bayes, Support Vector Machine (SVM), neural network, and long short-term memory (LSTM) methodologies in order to determine which is the most accurate fit for the model. The proposed model works well, with an accuracy of up to 93.6%.
APA, Harvard, Vancouver, ISO, and other styles
43

Wan Bejuri, Wan Mohd Yaakob, Nur’Ain Najiha Zakaria, Mohd Murtadha Mohamad, Warusia Mohamed Yassin, Sharifah Sakinah Syed Ahmad, and Ngo Hea Choon. "Sign language detection using convolutional neural network for teaching and learning application." Indonesian Journal of Electrical Engineering and Computer Science 28, no. 1 (October 1, 2022): 358. http://dx.doi.org/10.11591/ijeecs.v28.i1.pp358-364.

Full text
Abstract:
Teaching lower-school mathematics should be accessible to everyone. For teaching when one cannot speak, for example because of a vocal cord infection, severe spasmodic dysphonia, or another disability, sign language is the answer. However, the situation becomes difficult when the audience does not understand sign language. Thus, the purpose of this research is to design a sign language detection scheme for teaching and learning activities. In this research, images of hand gestures from the teacher or presenter are taken using a web camera so the system can predict and display each image's name. The proposed scheme detects hand movements and converts them into meaningful information. The results show that the model is the most consistent in terms of accuracy and loss compared to other methods. Furthermore, the proposed algorithm is expected to contribute to the body of knowledge and to society.
APA, Harvard, Vancouver, ISO, and other styles
44

Mohamed Abdulhamied, Reham, Mona M. Nasr, and Sarah N. Abdul Kader. "Real-time recognition of American sign language using long-short term memory neural network and hand detection." Indonesian Journal of Electrical Engineering and Computer Science 30, no. 1 (April 1, 2023): 545. http://dx.doi.org/10.11591/ijeecs.v30.i1.pp545-556.

Full text
Abstract:
Sign language recognition is very important for deaf and mute people because it provides many facilities for them: it converts hand gestures into text or speech, and it helps deaf and mute people communicate and express mutual feelings. This paper's goal is to recognize sign language using action detection, predicting what action is being demonstrated at any given time without forcing the user to wear any external devices. We captured user signs with a webcam; for example, if we signed “thank you”, the system takes the entire set of frames for that action to determine what sign is being demonstrated. The long short-term memory (LSTM) model is used to produce a real-time sign language detection and prediction flow. We also applied dropout layers to both the training and testing datasets to handle overfitting in deep learning models, which gave a good improvement in the final accuracy. We achieved 99.35% accuracy after training and implementing the model, which allows the deaf and mute to communicate more easily with society.
APA, Harvard, Vancouver, ISO, and other styles
45

Jagirdar, Srinivas, and Venkata Subba K. Reddy. "Phony News Detection in Reddit Using Natural Language Techniques and Machine Learning Pipelines." International Journal of Natural Computing Research 10, no. 3 (July 2021): 1–11. http://dx.doi.org/10.4018/ijncr.2021070101.

Full text
Abstract:
Phony news or fake news spreads like a wildfire on social media causing loss to the society. Swift detection of fake news is a priority as it reduces harm to society. This paper developed a phony news detector for Reddit posts using popular machine learning techniques in conjunction with natural language processing techniques. Popular feature extraction algorithms like CountVectorizer (CV) and Term Frequency Inverse Document Frequency (TFIDF) were implemented. These features were fed to Multinomial Naive Bayes (MNB), Random Forest (RF), Support Vector Classifier (SVC), Logistic Regression (LR), AdaBoost, and XGBoost for classifying news as either genuine or phony. Finally, coefficient analysis was performed in order to interpret the best coefficients. The study revealed that the pipeline model of MNB and TFIDF achieved a best accuracy rate of 79.05% when compared to other pipeline models.
APA, Harvard, Vancouver, ISO, and other styles
46

Siswantining, Titin, Stanley Pratama, and Devvi Sarwinda. "SPRATAMA MODEL FOR INDONESIAN PARAPHRASE DETECTION USING BIDIRECTIONAL LONG SHORT-TERM MEMORY AND BIDIRECTIONAL GATED RECURRENT UNIT." MEDIA STATISTIKA 15, no. 2 (March 5, 2023): 129–38. http://dx.doi.org/10.14710/medstat.15.2.129-138.

Full text
Abstract:
Paraphrasing is a way to write sentences in other words with the same intent or purpose. Automatic paraphrase detection can be done using Natural Language Sentence Matching (NLSM), which is part of Natural Language Processing (NLP). NLP is a computational technique for processing text in general, while NLSM is used specifically to find the relationship between two sentences. With the development of Neural Networks (NN), NLP can nowadays be done more easily by computers. Many more models for paraphrase detection have been developed for English than for Indonesian, which has less training data. This study proposes the SPratama model, which performs paraphrase detection for Indonesian using Recurrent Neural Networks (RNN), namely Bidirectional Long Short-Term Memory (BiLSTM) and Bidirectional Gated Recurrent Unit (BiGRU). The data used is "Quora Question Pairs", taken from Kaggle and translated into Indonesian using Google Translate. The results of this study indicate that the proposed model has an accuracy of around 80% for the detection of paraphrased sentences.
APA, Harvard, Vancouver, ISO, and other styles
47

Xia, Tian, and Xuemin Chen. "A Discrete Hidden Markov Model for SMS Spam Detection." Applied Sciences 10, no. 14 (July 21, 2020): 5011. http://dx.doi.org/10.3390/app10145011.

Full text
Abstract:
Many machine learning methods have been applied for short messaging service (SMS) spam detection, including traditional methods such as naïve Bayes (NB), vector space model (VSM), and support vector machine (SVM), and novel methods such as long short-term memory (LSTM) and the convolutional neural network (CNN). These methods are based on the well-known bag of words (BoW) model, which assumes documents are unordered collection of words. This assumption overlooks an important piece of information, i.e., word order. Moreover, the term frequency, which counts the number of occurrences of each word in SMS, is unable to distinguish the importance of words, due to the length limitation of SMS. This paper proposes a new method based on the discrete hidden Markov model (HMM) to use the word order information and to solve the low term frequency issue in SMS spam detection. The popularly adopted SMS spam dataset from the UCI machine learning repository is used for performance analysis of the proposed HMM method. The overall performance is compatible with deep learning by employing CNN and LSTM models. A Chinese SMS spam dataset with 2000 messages is used for further performance evaluation. Experiments show that the proposed HMM method is not language-sensitive and can identify spam with high accuracy on both datasets.
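The key insight above is that word order, which BoW discards, carries signal. As a hedged simplification of the paper's discrete HMM (a visible per-class Markov chain rather than a hidden one), the sketch below scores a message by its bigram log-likelihood under each class with add-one smoothing; the toy messages are ours:

```python
# Hedged simplification of the HMM idea: one visible bigram Markov chain
# per class, add-one smoothing, classify by higher log-likelihood.
# A true HMM adds hidden states; this keeps only the word-order signal.
import math
from collections import defaultdict

def train(messages):
    counts, totals = defaultdict(int), defaultdict(int)
    for msg in messages:
        words = ["<s>"] + msg.split()
        for prev, cur in zip(words, words[1:]):
            counts[(prev, cur)] += 1
            totals[prev] += 1
    return counts, totals

def score(model, msg, vocab_size):
    counts, totals = model
    words = ["<s>"] + msg.split()
    return sum(math.log((counts[(p, c)] + 1) / (totals[p] + vocab_size))
               for p, c in zip(words, words[1:]))

spam = ["win cash now", "win free prize now"]
ham = ["see you at lunch", "call me at home"]
vocab = {w for m in spam + ham for w in m.split()}
spam_model, ham_model = train(spam), train(ham)

def classify(msg):
    v = len(vocab) + 1  # +1 for the <s> start symbol
    return "spam" if score(spam_model, msg, v) > score(ham_model, msg, v) else "ham"

print(classify("win free cash"))     # 'spam'
print(classify("see you at home"))   # 'ham'
```

Because transitions like "win free" are scored rather than mere word presence, short messages with shared vocabulary can still be separated, which is the low-term-frequency problem the paper targets.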
APA, Harvard, Vancouver, ISO, and other styles
48

Chouikhi, Hasna, Mohammed Alsuhaibani, and Fethi Jarray. "BERT-Based Joint Model for Aspect Term Extraction and Aspect Polarity Detection in Arabic Text." Electronics 12, no. 3 (January 19, 2023): 515. http://dx.doi.org/10.3390/electronics12030515.

Full text
Abstract:
Aspect-based sentiment analysis (ABSA) is a method used to identify the aspects discussed in a given text and determine the sentiment expressed towards each aspect. This can help provide a more fine-grained understanding of the opinions expressed in the text. The majority of Arabic ABSA techniques in use today significantly rely on repeated pre-processing and feature-engineering operations, as well as the use of outside resources (e.g., lexicons). In essence, there is a significant research gap in NLP with regard to the use of transfer learning (TL) techniques and language models for aspect term extraction (ATE) and aspect polarity detection (APD) in Arabic text. While TL has proven to be an effective approach for a variety of NLP tasks in other languages, its use in the context of Arabic has been relatively under-explored. This paper aims to address this gap by presenting a TL-based approach for ATE and APD in Arabic, leveraging the knowledge and capabilities of previously trained language models. The Arabic base (Arabic version) of the BERT model serves as the foundation for the suggested models. Different BERT implementations are also contrasted. A reference ABSA dataset was used for the experiments (HAAD dataset). The experimental results demonstrate that our models surpass the baseline model and previously proposed approaches.
APA, Harvard, Vancouver, ISO, and other styles
49

Kalita, Deepjyoti, Khurshid Alam Borbora, and Dipen Nath. "Use of Bidirectional Long Short Term Memory in Spoken Word Detection with reference to the Assamese language." Indian Journal Of Science And Technology 15, no. 27 (July 20, 2022): 1364–71. http://dx.doi.org/10.17485/ijst/v15i27.655.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Tejedor, Javier, Dong Wang, Joe Frankel, Simon King, and José Colás. "A comparison of grapheme and phoneme-based units for Spanish spoken term detection." Speech Communication 50, no. 11-12 (November 2008): 980–91. http://dx.doi.org/10.1016/j.specom.2008.03.005.

Full text
APA, Harvard, Vancouver, ISO, and other styles