Journal articles on the topic 'Named Entity Classification'

Consult the top 50 journal articles for your research on the topic 'Named Entity Classification.'

1

Nadeau, David, and Satoshi Sekine. "A survey of named entity recognition and classification." Lingvisticæ Investigationes. International Journal of Linguistics and Language Resources 30, no. 1 (August 10, 2007): 3–26. http://dx.doi.org/10.1075/li.30.1.03nad.

Abstract:
This survey covers fifteen years of research in the Named Entity Recognition and Classification (NERC) field, from 1991 to 2006. We report observations about languages, named entity types, domains and textual genres studied in the literature. From the start, NERC systems have been developed using hand-made rules, but now machine learning techniques are widely used. These techniques are surveyed along with other critical aspects of NERC such as features and evaluation methods. Features are word-level, dictionary-level and corpus-level representations of words in a document. Evaluation techniques, ranging from intuitive exact match to very complex matching techniques with adjustable cost of errors, are an indisputable key to progress.
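The "intuitive exact match" evaluation mentioned in this abstract can be illustrated with a minimal sketch. The `Entity` tuple and the function name are hypothetical, chosen only for this example; the survey itself describes several scoring schemes, and this shows only the strictest one, where a prediction counts as correct when both its span and its type match the gold annotation.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    start: int   # index of the first token
    end: int     # index one past the last token
    label: str   # entity type, e.g. "PER"


def exact_match_scores(gold: set[Entity], predicted: set[Entity]) -> tuple[float, float, float]:
    """Return (precision, recall, F1), counting only exact span-and-type matches."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# One prediction matches exactly; the other has the right type but a wrong boundary.
gold = {Entity(0, 2, "PER"), Entity(5, 6, "LOC")}
predicted = {Entity(0, 2, "PER"), Entity(5, 7, "LOC")}
scores = exact_match_scores(gold, predicted)
```

Under exact match, the boundary error on the second entity is penalised twice: once as a false positive and once as a missed gold entity.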
2

Ahmad, Muhammad Tayyab, Muhammad Kamran Malik, Khurram Shahzad, Faisal Aslam, Asif Iqbal, Zubair Nawaz, and Faisal Bukhari. "Named Entity Recognition and Classification for Punjabi Shahmukhi." ACM Transactions on Asian and Low-Resource Language Information Processing 19, no. 4 (July 7, 2020): 1–13. http://dx.doi.org/10.1145/3383306.

3

Steinberger, Ralf, and Bruno Pouliquen. "Cross-lingual Named Entity Recognition." Lingvisticæ Investigationes. International Journal of Linguistics and Language Resources 30, no. 1 (August 10, 2007): 135–62. http://dx.doi.org/10.1075/li.30.1.09ste.

Abstract:
Named Entity Recognition and Classification (NERC) is a known and well-explored text analysis application that has been applied to various languages. We are presenting an automatic, highly multilingual news analysis system that fully integrates NERC for locations, persons and organisations with document clustering, multi-label categorisation, name attribute extraction, name variant merging and the calculation of social networks. The proposed application goes beyond the state-of-the-art by automatically merging the information found in news written in ten different languages, and by using the aggregated name information to automatically link related news documents across languages for all 45 language pair combinations. While state-of-the-art approaches for cross-lingual name variant merging and document similarity calculation require bilingual resources, the methods proposed here are mostly language-independent and require a minimal amount of monolingual language-specific effort. The development of resources for additional languages is therefore kept to a minimum and new languages can be plugged into the system effortlessly. The presented online news analysis application is fully functional and has, at the end of the year 2006, reached average usage statistics of 600,000 hits per day.
4

Ekbal, Asif, Sriparna Saha, and Utpal Kumar Sikdar. "Multiobjective Optimization for Biomedical Named Entity Recognition and Classification." Procedia Technology 6 (2012): 206–13. http://dx.doi.org/10.1016/j.protcy.2012.10.025.

5

Shaalan, Khaled. "A Survey of Arabic Named Entity Recognition and Classification." Computational Linguistics 40, no. 2 (June 2014): 469–510. http://dx.doi.org/10.1162/coli_a_00178.

Abstract:
As more and more Arabic textual information becomes available through the Web in homes and businesses, via Internet and Intranet services, there is an urgent need for technologies and tools to process the relevant information. Named Entity Recognition (NER) is an Information Extraction task that has become an integral part of many other Natural Language Processing (NLP) tasks, such as Machine Translation and Information Retrieval. Arabic NER has begun to receive attention in recent years. The characteristics and peculiarities of Arabic, a member of the Semitic languages family, make dealing with NER a challenge. The performance of an Arabic NER component affects the overall performance of the NLP system in a positive manner. This article attempts to describe and detail the recent increase in interest and progress made in Arabic NER research. The importance of the NER task is demonstrated, the main characteristics of the Arabic language are highlighted, and the aspects of standardization in annotating named entities are illustrated. Moreover, the different Arabic linguistic resources are presented and the approaches used in Arabic NER field are explained. The features of common tools used in Arabic NER are described, and standard evaluation metrics are illustrated. In addition, a review of the state of the art of Arabic NER research is discussed. Finally, we present our conclusions. Throughout the presentation, illustrative examples are used for clarification.
6

Asbayou, Omar. "Automatic Arabic Named Entity Extraction and Classification for Information Retrieval." International Journal on Natural Language Computing 9, no. 6 (December 30, 2020): 1–22. http://dx.doi.org/10.5121/ijnlc.2020.9601.

Abstract:
This article describes our rule-based Arabic Named Entity recognition (NER) and classification system. It is based on lists of classified proper names (PN) and particularly on syntactico-semantic patterns resulting in fine classification of Arabic NE. These patterns use syntactico-semantic combination of morpho-syntactic and syntactic entities. It also uses lexical classification of trigger words and NE extensions. These linguistic data are essential not only to named entity extraction but also to the taxonomic classification and to determining the NE frontiers. Our method is also based on the contextualisation and on the notion of NE class attributes and values. Inspired by X-bar theory and immediate constituents, we built a rule-based NER system composed of five levels of syntactico-semantic combination. We also show how the fine NE annotations in our system output (XML database) are exploited in information retrieval and information extraction.
7

Tan, Chuanqi, Wei Qiu, Mosha Chen, Rui Wang, and Fei Huang. "Boundary Enhanced Neural Span Classification for Nested Named Entity Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 9016–23. http://dx.doi.org/10.1609/aaai.v34i05.6434.

Abstract:
Named entity recognition (NER) is a well-studied task in natural language processing. However, the widely used sequence labeling framework usually has difficulty detecting entities with nested structures. The span-based method, which can easily detect nested entities in different subsequences, is naturally suitable for the nested NER problem. However, previous span-based methods have two main issues. First, classifying all subsequences is computationally expensive and very inefficient at inference. Second, the span-based methods mainly focus on learning span representations but lack explicit boundary supervision. To tackle these two issues, we propose a boundary enhanced neural span classification model. In addition to classifying the span, we propose incorporating an additional boundary detection task to predict those words that are boundaries of entities. The two tasks are jointly trained under a multitask learning framework, which enhances the span representation with additional boundary supervision. In addition, the boundary detection model has the ability to generate high-quality candidate spans, which greatly reduces the time complexity during inference. Experiments show that our approach outperforms all existing methods and achieves 85.3, 83.9, and 78.3 scores in terms of F1 on the ACE2004, ACE2005, and GENIA datasets, respectively.
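The efficiency argument in this abstract can be made concrete with a small sketch that contrasts exhaustive span enumeration with candidate generation from predicted boundaries. This is not the authors' implementation: the function names, the inclusive `(start, end)` convention, and the `max_len` cap are assumptions made for illustration, and the boundary predictions are supplied by hand rather than by a trained model.

```python
def all_spans(num_tokens: int, max_len: int) -> list[tuple[int, int]]:
    """Enumerate every (start, end) span, end inclusive, up to max_len tokens."""
    return [
        (start, end)
        for start in range(num_tokens)
        for end in range(start, min(start + max_len, num_tokens))
    ]


def boundary_spans(starts: list[int], ends: list[int], max_len: int) -> list[tuple[int, int]]:
    """Keep only spans that begin at a predicted start and finish at a predicted end."""
    return [
        (start, end)
        for start in sorted(starts)
        for end in sorted(ends)
        if start <= end < start + max_len
    ]


# Hypothetical boundary predictions for a 4-token sentence.
exhaustive = all_spans(num_tokens=4, max_len=4)
candidates = boundary_spans(starts=[0, 2], ends=[1, 3], max_len=4)
```

Here the candidate list still contains the nested pair `(0, 3)` and `(0, 1)`, so nested entities remain reachable while the span classifier sees far fewer inputs than under exhaustive enumeration.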
8

Choi, Yunsu, and Jeongwon Cha. "Korean Named Entity Recognition and Classification using Word Embedding Features." Journal of KIISE 43, no. 6 (June 15, 2016): 678–85. http://dx.doi.org/10.5626/jok.2016.43.6.678.

9

Goyal, Archana, Vishal Gupta, and Manish Kumar. "Recent Named Entity Recognition and Classification techniques: A systematic review." Computer Science Review 29 (August 2018): 21–43. http://dx.doi.org/10.1016/j.cosrev.2018.06.001.

10

Marchenko, O. O. "Machine-learning methods for text named entity recognition." PROBLEMS IN PROGRAMMING, no. 2-3 (June 2016): 150–57. http://dx.doi.org/10.15407/pp2016.02-03.150.

Abstract:
The article describes machine learning methods for named entity recognition. To build named entity classifiers, two basic machine learning models, Naïve Bayes and Conditional Random Fields, were used. A model for multi-class classification of named entities using Error Correcting Output Codes was also investigated. The paper describes a method for training the classifiers and the results of test experiments. Conditional Random Fields outperform the other models in precision and recall.
11

Rospocher, Marco, and Francesco Corcoglioniti. "Knowledge-driven joint posterior revision of named entity classification and linking." Journal of Web Semantics 65 (December 2020): 100617. http://dx.doi.org/10.1016/j.websem.2020.100617.

12

Malik, Muhammad Kamran. "Urdu Named Entity Recognition and Classification System Using Artificial Neural Network." ACM Transactions on Asian and Low-Resource Language Information Processing 17, no. 1 (November 16, 2017): 1–13. http://dx.doi.org/10.1145/3129290.

13

Saha, Sriparna, Asif Ekbal, and Utpal Kumar Sikdar. "Named entity recognition and classification in biomedical text using classifier ensemble." International Journal of Data Mining and Bioinformatics 11, no. 4 (2015): 365. http://dx.doi.org/10.1504/ijdmb.2015.067954.

14

Biswas, Arijit, Samarjeet Borah, and Bishal Pradhan. "Named Entity Recognition System in Nepali using Naive Bayes Classification Technique." International Journal of Computer Applications 145, no. 10 (July 15, 2016): 1–6. http://dx.doi.org/10.5120/ijca2016910766.

15

Hou, Wenfeng, Qing Liu, and Longbing Cao. "Cognitive Aspects-Based Short Text Representation with Named Entity, Concept and Knowledge." Applied Sciences 10, no. 14 (July 16, 2020): 4893. http://dx.doi.org/10.3390/app10144893.

Abstract:
Short text is widely seen in applications including Internet of Things (IoT). The appropriate representation and classification of short text could be severely disrupted by the sparsity and shortness of short text. One important solution is to enrich short text representation by involving cognitive aspects of text, including semantic concept, knowledge, and category. In this paper, we propose a named Entity-based Concept Knowledge-Aware (ECKA) representation model which incorporates semantic information into short text representation. ECKA is a multi-level short text semantic representation model, which extracts the semantic features from the word, entity, concept and knowledge levels by CNN, respectively. Since word, entity, concept and knowledge entity in the same short text have different cognitive informativeness for short text classification, attention networks are formed to capture these category-related attentive representations from the multi-level textual features, respectively. The final multi-level semantic representations are formed by concatenating all of these individual-level representations, which are used for text classification. Experiments on three tasks demonstrate our method significantly outperforms the state-of-the-art methods.
16

Cucchiarelli, Alessandro, and Paola Velardi. "Unsupervised Named Entity Recognition Using Syntactic and Semantic Contextual Evidence." Computational Linguistics 27, no. 1 (March 2001): 123–31. http://dx.doi.org/10.1162/089120101300346822.

Abstract:
Proper nouns form an open class, making the incompleteness of manually or automatically learned classification rules an obvious problem. The purpose of this paper is twofold: first, to suggest the use of a complementary “backup” method to increase the robustness of any hand-crafted or machine-learning-based NE tagger; and second, to explore the effectiveness of using more fine-grained evidence—namely, syntactic and semantic contextual knowledge—in classifying NEs.
17

Arévalo Rodríguez, Montserrat, Montserrat Civit Torruella, and Maria Antònia Martí. "MICE." International Journal of Corpus Linguistics 9, no. 1 (April 29, 2004): 53–68. http://dx.doi.org/10.1075/ijcl.9.1.03are.

Abstract:
In the field of corpus linguistics, Named Entity treatment includes the recognition and classification of different types of discursive elements such as proper names, dates, times, etc. These discursive elements play an important role in different Natural Language Processing applications and techniques such as Information Retrieval, Information Extraction, translation memories, document routers, etc.
18

Wang, Yu, Yining Sun, Zuchang Ma, Lisheng Gao, and Yang Xu. "An ERNIE-Based Joint Model for Chinese Named Entity Recognition." Applied Sciences 10, no. 16 (August 18, 2020): 5711. http://dx.doi.org/10.3390/app10165711.

Abstract:
Named Entity Recognition (NER) is the fundamental task for Natural Language Processing (NLP) and the initial step in building a Knowledge Graph (KG). Recently, BERT (Bidirectional Encoder Representations from Transformers), which is a pre-training model, has achieved state-of-the-art (SOTA) results in various NLP tasks, including the NER. However, Chinese NER is still a more challenging task for BERT because there are no physical separations between Chinese words, and BERT can only obtain the representations of Chinese characters. Nevertheless, the Chinese NER cannot be well handled with character-level representations, because the meaning of a Chinese word is quite different from that of the characters, which make up the word. ERNIE (Enhanced Representation through kNowledge IntEgration), which is an improved pre-training model of BERT, is more suitable for Chinese NER because it is designed to learn language representations enhanced by the knowledge masking strategy. However, the potential of ERNIE has not been fully explored. ERNIE only utilizes the token-level features and ignores the sentence-level feature when performing the NER task. In this paper, we propose the ERNIE-Joint, which is a joint model based on ERNIE. The ERNIE-Joint can utilize both the sentence-level and token-level features by joint training the NER and text classification tasks. In order to use the raw NER datasets for joint training and avoid additional annotations, we perform the text classification task according to the number of entities in the sentences. The experiments are conducted on two datasets: MSRA-NER and Weibo. These datasets contain Chinese news data and Chinese social media data, respectively. The results demonstrate that the ERNIE-Joint not only outperforms BERT and ERNIE but also achieves the SOTA results on both datasets.
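The idea of deriving a sentence-level classification label from existing NER annotations, "according to the number of entities in the sentences", can be sketched as follows. The abstract does not say how counts are mapped to classes, so the BIO tagging scheme, the function names, and the cap `max_class` are all assumptions made for illustration only.

```python
def count_entities(tags: list[str]) -> int:
    """Count entity mentions in a BIO-tagged sentence (one per "B-" tag)."""
    return sum(1 for tag in tags if tag.startswith("B-"))


def sentence_label(tags: list[str], max_class: int = 3) -> int:
    """Map a sentence to a class by its entity count, capped at max_class.

    The cap is a hypothetical bucketing choice, not taken from the paper.
    """
    return min(count_entities(tags), max_class)


# A sentence with one person mention and one location mention.
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
label = sentence_label(tags)
```

Because the label is computed from the token-level tags, no additional annotation is needed to train the auxiliary classification task.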
19

Baksa, Krešimir, Dino Golović, Goran Glavaš, and Jan Šnajder. "Tagging Named Entities in Croatian Tweets." Slovenščina 2.0: empirical, applied and interdisciplinary research 4, no. 1 (February 5, 2017): 20–41. http://dx.doi.org/10.4312/slo2.0.2016.1.20-41.

Abstract:
Named entity extraction tools designed for recognizing named entities in texts written in standard language (e.g., news stories or legal texts) have been shown to be inadequate for user-generated textual content (e.g., tweets, forum posts). In this work, we propose a supervised approach to named entity recognition and classification for Croatian tweets. We compare two sequence labelling models: a hidden Markov model (HMM) and conditional random fields (CRF). Our experiments reveal that CRF is the best model for the task, achieving a very good performance of over 87% micro-averaged F1 score. We analyse the contributions of different feature groups and influence of the training set size on the performance of the CRF model.
20

Lee, Joohong, Sangwoo Seo, and Yong Suk Choi. "Semantic Relation Classification via Bidirectional LSTM Networks with Entity-Aware Attention Using Latent Entity Typing." Symmetry 11, no. 6 (June 13, 2019): 785. http://dx.doi.org/10.3390/sym11060785.

Abstract:
Classifying semantic relations between entity pairs in sentences is an important task in natural language processing (NLP). Most previous models applied to relation classification rely on high-level lexical and syntactic features obtained by NLP tools such as WordNet, the dependency parser, part-of-speech (POS) tagger, and named entity recognizers (NER). In addition, state-of-the-art neural models based on attention mechanisms do not fully utilize information related to the entity, which may be the most crucial feature for relation classification. To address these issues, we propose a novel end-to-end recurrent neural model that incorporates an entity-aware attention mechanism with a latent entity typing (LET) method. Our model not only effectively utilizes entities and their latent types as features, but also builds word representations by applying self-attention based on symmetrical similarity of a sentence itself. Moreover, the model is interpretable by visualizing applied attention mechanisms. Experimental results obtained with the SemEval-2010 Task 8 dataset, which is one of the most popular relation classification tasks, demonstrate that our model outperforms existing state-of-the-art models without any high-level features.
21

Ali, Mohammed, Guanzheng Tan, and Aamir Hussain. "Bidirectional Recurrent Neural Network Approach for Arabic Named Entity Recognition." Future Internet 10, no. 12 (December 13, 2018): 123. http://dx.doi.org/10.3390/fi10120123.

Abstract:
Recurrent neural network (RNN) has achieved remarkable success in sequence labeling tasks with memory requirement. RNN can remember previous information of a sequence and can thus be used to solve natural language processing (NLP) tasks. Named entity recognition (NER) is a common task of NLP and can be considered a classification problem. We propose a bidirectional long short-term memory (LSTM) model for this entity recognition task of the Arabic text. The LSTM network can process sequences and relate to each part of it, which makes it useful for the NER task. Moreover, we use pre-trained word embedding to train the inputs that are fed into the LSTM network. The proposed model is evaluated on a popular dataset called “ANERcorp.” Experimental results show that the model with word embedding achieves a high F-score measure of approximately 88.01%.
22

Suzuki, Masatoshi, Koji Matsuda, Satoshi Sekine, Naoaki Okazaki, and Kentaro Inui. "A Joint Neural Model for Fine-Grained Named Entity Classification of Wikipedia Articles." IEICE Transactions on Information and Systems E101.D, no. 1 (2018): 73–81. http://dx.doi.org/10.1587/transinf.2017swp0005.

23

Seti, Xieraili, Aishan Wumaier, Turgen Yibulayin, Diliyaer Paerhati, Lulu Wang, and Alimu Saimaiti. "Named-Entity Recognition in Sports Field Based on a Character-Level Graph Convolutional Network." Information 11, no. 1 (January 5, 2020): 30. http://dx.doi.org/10.3390/info11010030.

Abstract:
Traditional named-entity recognition methods ignore the correlation between named entities and lose hierarchical structural information between the named entities in a given text. Although traditional named-entity methods are effective for conventional datasets that have simple structures, they are not as effective for sports texts. This paper proposes a Chinese sports text named-entity recognition method based on a character graph convolutional neural network (Char GCN) with a self-attention mechanism model. In this method, each Chinese character in the sports text is regarded as a node. The edge between the nodes is constructed using a similar character position and the character feature of the named-entity in the sports text. The internal structural information of the entity is extracted using a character graph convolutional neural network. The hierarchical semantic information of the sports text is captured by the self-attention model to enhance the relationship between the named entities and capture the relevance and dependency between the characters. The conditional random fields classification function can accurately identify the named entities in the Chinese sports text. Experiments conducted on four datasets demonstrate that the proposed method improves the F-score significantly, to 92.51%, 91.91%, 93.98%, and 95.01%, respectively, in comparison to traditional named-entity methods.
24

Hu, Anwen, Zhicheng Dou, Jian-Yun Nie, and Ji-Rong Wen. "Leveraging Multi-Token Entities in Document-Level Named Entity Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7961–68. http://dx.doi.org/10.1609/aaai.v34i05.6304.

Abstract:
Most state-of-the-art named entity recognition systems are designed to process each sentence within a document independently. These systems easily confuse entity types when the context information in a sentence is insufficient. To utilize the context information within the whole document, most document-level work lets neural networks learn the relation across sentences on their own, which is not intuitive enough for us humans. In this paper, we divide entities into multi-token entities that contain multiple tokens and single-token entities that are composed of a single token. We propose that the context information of multi-token entities should be more reliable in document-level NER for news articles. We design a fusion attention mechanism which not only learns the semantic relevance between occurrences of the same token, but also focuses more on occurrences belonging to multi-token entities. To identify multi-token entities, we design an auxiliary task, namely ‘Multi-token Entity Classification’, and perform this task simultaneously with document-level NER. This auxiliary task is simplified from NER and doesn't require extra annotation. Experimental results on the CoNLL-2003 dataset and OntoNotesnbm dataset show that our model outperforms state-of-the-art sentence-level and document-level NER methods.
25

Akkasi, Abbas, Ekrem Varoğlu, and Nazife Dimililer. "ChemTok: A New Rule Based Tokenizer for Chemical Named Entity Recognition." BioMed Research International 2016 (2016): 1–9. http://dx.doi.org/10.1155/2016/4248026.

Abstract:
Named Entity Recognition (NER) from text constitutes the first step in many text mining applications. The most important preliminary step for NER systems using machine learning approaches is tokenization, where raw text is segmented into tokens. This study proposes an enhanced rule-based tokenizer, ChemTok, which utilizes rules extracted mainly from the training data set. The main novelty of ChemTok is the use of the extracted rules in order to merge the tokens split in the previous steps, thus producing longer and more discriminative tokens. ChemTok is compared to the tokenization methods utilized by ChemSpot and tmChem. Support Vector Machines and Conditional Random Fields are employed as the learning algorithms. The experimental results show that the classifiers trained on the output of ChemTok outperform all classifiers trained on the output of the other two tokenizers in terms of classification performance and the number of incorrectly segmented entities.
26

Khan, Rehan, and A. J. Singh. "Developing and Deploying Algorithms for Information Extraction using Classification Measures for Named Entity Recognition." International Journal of Computer Sciences and Engineering 6, no. 10 (October 31, 2018): 235–48. http://dx.doi.org/10.26438/ijcse/v6i10.235248.

27

S, Amarappa, and Sathyanarayana S.V. "Kannada Named Entity Recognition and Classification (NERC) Based on Multinomial Naïve Bayes (MNB) Classifier." International Journal on Natural Language Computing 4, no. 4 (August 30, 2015): 39–52. http://dx.doi.org/10.5121/ijnlc.2015.4404.

28

Ali, Muhammad Asif, Yifang Sun, Bing Li, and Wei Wang. "Fine-Grained Named Entity Typing over Distantly Supervised Data Based on Refined Representations." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 7391–98. http://dx.doi.org/10.1609/aaai.v34i05.6234.

Abstract:
Fine-Grained Named Entity Typing (FG-NET) is a key component in Natural Language Processing (NLP). It aims at classifying an entity mention into a wide range of entity types. Due to the large number of entity types, distant supervision is used to collect training data for this task, which noisily assigns type labels to entity mentions irrespective of the context. In order to alleviate the noisy labels, existing approaches to FG-NET analyze the entity mentions entirely independently of each other and assign type labels solely based on the mention's sentence-specific context. This is inadequate for highly overlapping and/or noisy type labels as it hinders information passing across sentence boundaries. For this, we propose an edge-weighted attentive graph convolution network that refines the noisy mention representations by attending over corpus-level contextual clues prior to the end classification. Experimental evaluation shows that the proposed model outperforms the existing research by a relative score of up to 10.2% and 8.3% for macro-F1 and micro-F1, respectively.
29

Spruit, Marco, and Bas Vlug. "Effective and Efficient Classification of Topically-Enriched Domain-Specific Text Snippets." International Journal of Strategic Decision Sciences 6, no. 3 (July 2015): 1–17. http://dx.doi.org/10.4018/ijsds.2015070101.

Abstract:
Due to the explosive growth in the amount of text snippets over the past few years and their sparsity of text, organizations are unable to effectively and efficiently classify them, missing out on business opportunities. This paper presents TETSC: the Topically-Enriched Text Snippet Classification method. TETSC aims to solve the classification problem for text snippets in any domain. TETSC recognizes that there are different types of text snippets and, therefore, allows for stop word removal, named-entity recognition, and topical enrichment for the different types of text snippets. TETSC has been implemented in the production systems of a personal finance organization, which resulted in a classification error reduction of over 21%. Highlights: The authors create the TETSC method for classifying topically-enriched text snippets; the authors differentiate between different types of text snippets; the authors show a successful application of Named-Entity Recognition to text snippets; using multiple enrichment strategies appears to reduce effectivity.
30

Varghese, Akson Sam, Saleha Sarang, Vipul Yadav, Bharat Karotra, and Niketa Gandhi. "Bidirectional LSTM joint model for intent classification and named entity recognition in natural language understanding." International Journal of Hybrid Intelligent Systems 16, no. 1 (March 23, 2020): 13–23. http://dx.doi.org/10.3233/his-190275.

31

Wang, Peng, Jing Zhou, Yuzhang Liu, and Xingchen Zhou. "TransET: Knowledge Graph Embedding with Entity Types." Electronics 10, no. 12 (June 11, 2021): 1407. http://dx.doi.org/10.3390/electronics10121407.

Abstract:
Knowledge graph embedding aims to embed entities and relations into low-dimensional vector spaces. Most existing methods only focus on triple facts in knowledge graphs. In addition, models based on translation or distance measurement cannot fully represent complex relations. As well-constructed prior knowledge, entity types can be employed to learn the representations of entities and relations. In this paper, we propose a novel knowledge graph embedding model named TransET, which takes advantage of entity types to learn more semantic features. More specifically, circle convolution based on the embeddings of entity and entity types is utilized to map head entity and tail entity to type-specific representations, then translation-based score function is used to learn the presentation triples. We evaluated our model on real-world datasets with two benchmark tasks of link prediction and triple classification. Experimental results demonstrate that it outperforms state-of-the-art models in most cases.
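For readers unfamiliar with translation-based scoring, the following minimal sketch shows the general idea in its plain TransE-style form: a triple (head, relation, tail) is plausible when head + relation lands close to tail in the embedding space. It does not implement TransET's type-specific mapping or circle convolution; the function name and the toy two-dimensional vectors are assumptions made for illustration.

```python
import math


def translation_score(head: list[float], relation: list[float], tail: list[float]) -> float:
    """Negative L2 distance between head + relation and tail; higher means more plausible."""
    return -math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


# Toy embeddings: the first tail is exactly head + relation, the second is far away.
head = [1.0, 0.0]
relation = [0.0, 1.0]
good_score = translation_score(head, relation, [1.0, 1.0])
bad_score = translation_score(head, relation, [4.0, 5.0])
```

In link prediction, candidate tails are ranked by this kind of score; in triple classification, a threshold on the score separates plausible triples from implausible ones.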
32

Pomares-Quimbaya, Alexandra, Rafael A. Gonzalez, Oscar Mauricio Muñoz Velandia, Angel Alberto Garcia Peña, Julián Camilo Daza Rodríguez, Alejandro Sierra Múnera, and Cyril Labbé. "Concept Attribute Labeling and Context-Aware Named Entity Recognition in Electronic Health Records." International Journal of Reliable and Quality E-Healthcare 7, no. 1 (January 2018): 1–15. http://dx.doi.org/10.4018/ijrqeh.2018010101.

Abstract:
Extracting valuable knowledge from Electronic Health Records (EHR) represents a challenging task due to the presence of both structured and unstructured data, including codified fields, images and test results. Narrative text in particular contains a variety of notes which are diverse in language and detail, as well as being full of ad hoc terminology, including acronyms and jargon, which is especially challenging in non-English EHR, where there is a dearth of annotated corpora or trained case sets. This paper proposes an approach for NER and concept attribute labeling for EHR that takes into consideration the contextual words around the entity of interest to determine its sense. The approach proposes a composition method of three different NER methods, together with the analysis of the context (neighboring words) using an ensemble classification model. This contributes to disambiguate NER, as well as labeling the concept as confirmed, negated, speculative, pending or antecedent. Results show an improvement of the recall and a limited impact on precision for the NER process.
33

Putra, Fatra Nonggala, and Chastine Fatichah. "Klasifikasi jenis kejadian menggunakan kombinasi NeuroNER dan Recurrent Convolutional Neural Network pada data Twitter." Register: Jurnal Ilmiah Teknologi Sistem Informasi 4, no. 2 (July 1, 2018): 81. http://dx.doi.org/10.26594/register.v4i2.1242.

Abstract:
Incident detection from Twitter data aims to obtain real-time information as a low-cost alternative to other incident detection systems. One of the main modules of an incident detection system is the incident type classification module. Information is classified as an important incident if it contains an entity representing where the incident occurred. Some previous studies still rely on handcrafted features, or on pipeline-based model features such as n-grams, as the key classification features, which are deemed ineffective. Therefore, this research proposes a combination of Neuro Named Entity Recognition (NeuroNER) and a Recurrent Convolutional Neural Network (RCNN) as an effective classification method for incident detection. First, the system performs named entity recognition to identify location entities in the tweet text, because incident information should contain at least one location entity. Then, if a location is identified, the incident is classified using the RCNN. Experimental results show that the incident detection system combining NeuroNER and RCNN works very well, with average precision, recall, and F-measure of 92.44%, 94.76%, and 93.53%, respectively (the Indonesian-language abstract reports 94.87%, 92.73%, and 93.73%).
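The two-stage flow described in this abstract (recognize location entities first, classify only tweets that have one) can be sketched as below. This is an illustration of the gating logic only: the gazetteer lookup and keyword classifier are hypothetical stand-ins for NeuroNER and the RCNN, and all names are assumptions.

```python
# Hypothetical stand-ins: a real system would call a trained NER model
# and a trained RCNN classifier instead of these toy functions.
GAZETTEER = {"Surabaya", "Jakarta"}

def toy_find_locations(tweet):
    """Return tokens that match a small, made-up location gazetteer."""
    tokens = [word.strip(".,!?") for word in tweet.split()]
    return [token for token in tokens if token in GAZETTEER]

def toy_classify_incident(tweet):
    """Keyword-based placeholder for the incident type classifier."""
    return "flood" if "flood" in tweet.lower() else "other"

def detect_incident(tweet, find_locations, classify_incident):
    """Stage 1: require at least one location. Stage 2: classify the incident."""
    locations = find_locations(tweet)
    if not locations:
        return None  # No location entity, so the tweet is not treated as an incident.
    return {"locations": locations, "incident_type": classify_incident(tweet)}

print(detect_incident("Flood in Surabaya this morning", toy_find_locations, toy_classify_incident))
print(detect_incident("Stuck in traffic again", toy_find_locations, toy_classify_incident))
```

Passing the two stages as callables keeps the gate independent of whichever recognizer and classifier are plugged in.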
34

Fan, Runyu, Lizhe Wang, Jining Yan, Weijing Song, Yingqian Zhu, and Xiaodao Chen. "Deep Learning-Based Named Entity Recognition and Knowledge Graph Construction for Geological Hazards." ISPRS International Journal of Geo-Information 9, no. 1 (December 27, 2019): 15. http://dx.doi.org/10.3390/ijgi9010015.

Abstract:
Constructing a knowledge graph of geological hazards literature can facilitate the reuse of that literature and provide a reference for geological hazard governance. Named entity recognition (NER), as a core technology for constructing a geological hazard knowledge graph, faces the challenge that named entities in geological hazard literature are diverse in form, ambiguous in semantics, and uncertain in context. This makes it difficult to design practical features for NER classification. To address this problem, this paper proposes a deep learning-based NER model, namely the deep, multi-branch BiGRU-CRF model, which combines a multi-branch bidirectional gated recurrent unit (BiGRU) layer and a conditional random field (CRF) model. In an end-to-end, supervised process, the proposed model automatically learns and transforms features with a multi-branch bidirectional GRU layer and enhances the output with a CRF layer. In addition, we propose a pattern-based corpus construction method to build the corpus needed for the model. Experimental results indicate that the proposed model outperformed state-of-the-art models. The proposed model was used to construct a large-scale geological hazard literature knowledge graph containing 34,457 entity nodes and 84,561 relations.
35

Hema, R., and T. V. Geetha. "Recognition of Chemical Entities using Pattern Matching and Functional Group Classification." International Journal of Intelligent Information Technologies 12, no. 4 (October 2016): 21–44. http://dx.doi.org/10.4018/ijiit.2016100102.

Abstract:
The two main challenges in chemical entity recognition are: (i) new chemical compounds are constantly being synthesized, and (ii) chemical representation is highly ambiguous, with a single chemical entity described by different nomenclatures. Therefore, the identification and maintenance of chemical terminologies is a tough task. Since most existing text mining methods follow term-based approaches, they suffer from the problems of polysemy and synonymy. A Named Entity Recognition (NER) system based on pattern matching in the chemical domain is therefore developed to extract chemical entities from chemical documents. The TF-IDF and PMI association measures are used to filter out non-chemical terms. An F-score of 92.19% is achieved for chemical NER. The proposed method is compared with the baseline method and other existing approaches. As the final step, the filtered chemical entities are classified into sixteen functional groups. The classification uses an SVM one-against-all multiclass approach and achieves an accuracy of 87%. One-way ANOVA is used to compare the quality of the pattern matching method with that of other existing chemical NER methods.
36

Oliwa, Tomasz, Steven B. Maron, Leah M. Chase, Samantha Lomnicki, Daniel V. T. Catenacci, Brian Furner, and Samuel L. Volchenboum. "Obtaining Knowledge in Pathology Reports Through a Natural Language Processing Approach With Classification, Named-Entity Recognition, and Relation-Extraction Heuristics." JCO Clinical Cancer Informatics, no. 3 (December 2019): 1–8. http://dx.doi.org/10.1200/cci.19.00008.

Abstract:
PURPOSE Robust institutional tumor banks depend on continuous sample curation or else subsequent biopsy or resection specimens are overlooked after initial enrollment. Curation automation is hindered by semistructured free-text clinical pathology notes, which complicate data abstraction. Our motivation is to develop a natural language processing method that dynamically identifies existing pathology specimen elements necessary for locating specimens for future use in a manner that can be re-implemented by other institutions. PATIENTS AND METHODS Pathology reports from patients with gastroesophageal cancer enrolled in The University of Chicago GI oncology tumor bank were used to train and validate a novel composite natural language processing-based pipeline with a supervised machine learning classification step to separate notes into internal (primary review) and external (consultation) reports; a named-entity recognition step to obtain label (accession number), location, date, and sublabels (block identifiers); and a results proofreading step. RESULTS We analyzed 188 pathology reports, including 82 internal reports and 106 external consult reports, and successfully extracted named entities grouped as sample information (label, date, location). Our approach identified up to 24 additional unique samples in external consult notes that could have been overlooked. Our classification model obtained 100% accuracy on the basis of 10-fold cross-validation. Precision, recall, and F1 for class-specific named-entity recognition models show strong performance. CONCLUSION Through a combination of natural language processing and machine learning, we devised a re-implementable and automated approach that can accurately extract specimen attributes from semistructured pathology notes to dynamically populate a tumor registry.
37

Filannino, Michele, and Özlem Uzuner. "Advancing the State of the Art in Clinical Natural Language Processing through Shared Tasks." Yearbook of Medical Informatics 27, no. 01 (August 2018): 184–92. http://dx.doi.org/10.1055/s-0038-1667079.

Abstract:
Objectives: To review the latest scientific challenges organized in clinical Natural Language Processing (NLP) by highlighting the tasks, the most effective methodologies used, the data, and the sharing strategies. Methods: We harvested the literature by using Google Scholar and PubMed Central to retrieve all shared tasks organized since 2015 on clinical NLP problems on English data. Results: We surveyed 17 shared tasks. We grouped the data into four types (synthetic, drug labels, social data, and clinical data) which are correlated with size and sensitivity. We found named entity recognition and classification to be the most common tasks. Most of the methods used to tackle the shared tasks have been data-driven. There is homogeneity in the methods used to tackle the named entity recognition tasks, while more diverse solutions are investigated for relation extraction, multi-class classification, and information retrieval problems. Conclusions: There is a clear trend in using data-driven methods to tackle problems in clinical NLP. The availability of more and varied data from different institutions will undoubtedly lead to bigger advances in the field, for the benefit of healthcare as a whole.
38

Chen, Xianglong, Chunping Ouyang, Yongbin Liu, and Yi Bu. "Improving the Named Entity Recognition of Chinese Electronic Medical Records by Combining Domain Dictionary and Rules." International Journal of Environmental Research and Public Health 17, no. 8 (April 14, 2020): 2687. http://dx.doi.org/10.3390/ijerph17082687.

Abstract:
Electronic medical records are an integral part of medical texts. Entity recognition of electronic medical records has triggered many studies that propose many entity extraction methods. In this paper, an entity extraction model is proposed to extract entities from Chinese Electronic Medical Records (CEMR). In the input layer of the model, we use word embedding and dictionary features embedding as input vectors, where word embedding consists of a character representation and a word representation. Then, the input vectors are fed to the bidirectional long short-term memory to capture contextual features. Finally, a conditional random field is employed to capture dependencies between neighboring tags. We performed experiments on body classification task, and the F1 values reached 90.65%. We also performed experiments on anatomic region recognition task, and the F1 values reached 93.89%. On both tasks, our model had higher performance than state-of-the-art models, such as Bi-LSTM-CRF, Bi-LSTM-Attention, and Vote. Through experiments, our model has a good effect when dealing with small frequency entities and unknown entities; with a small training dataset, our method showed 2–4% improvement on F1 value compared to the basic Bi-LSTM-CRF models. Additionally, on anatomic region recognition task, besides using our proposed entity extraction model, 12 rules we designed and domain dictionary were adopted. Then, in this task, the weighted F1 value of the three specific entities extraction reached 84.36%.
39

Yang, Xi, Jiang Bian, Ruogu Fang, Ragnhildur I. Bjarnadottir, William R. Hogan, and Yonghui Wu. "Identifying relations of medications with adverse drug events using recurrent convolutional neural networks and gradient boosting." Journal of the American Medical Informatics Association 27, no. 1 (August 28, 2019): 65–72. http://dx.doi.org/10.1093/jamia/ocz144.

Abstract:
Objective: To develop a natural language processing system that identifies relations of medications with adverse drug events from clinical narratives. This project is part of the 2018 n2c2 challenge. Materials and Methods: We developed a novel clinical named entity recognition method based on a recurrent convolutional neural network and compared it to a recurrent neural network implemented using the long short-term memory architecture, explored methods to integrate medical knowledge as embedding layers in neural networks, and investigated 3 machine learning models, including support vector machines, random forests and gradient boosting, for relation classification. The performance of our system was evaluated using annotated data and scripts provided by the 2018 n2c2 organizers. Results: Our system was among the top ranked. Our best model submitted during this challenge (based on recurrent neural networks and support vector machines) achieved lenient F1 scores of 0.9287 for concept extraction (ranked third), 0.9459 for relation classification (ranked fourth), and 0.8778 for the end-to-end relation extraction (ranked second). We developed a novel named entity recognition model based on a recurrent convolutional neural network and further investigated gradient boosting for relation classification. The new methods improved the lenient F1 scores of the 3 subtasks to 0.9292, 0.9633, and 0.8880, respectively, which are comparable to the best performance reported in this challenge. Conclusion: This study demonstrated the feasibility of using machine learning methods to extract the relations of medications with adverse drug events from clinical narratives.
40

Bareket, Dan, and Reut Tsarfaty. "Neural Modeling for Named Entities and Morphology (NEMO2)." Transactions of the Association for Computational Linguistics 9 (2021): 909–28. http://dx.doi.org/10.1162/tacl_a_00404.

Abstract:
Named Entity Recognition (NER) is a fundamental NLP task, commonly formulated as classification over a sequence of tokens. Morphologically rich languages (MRLs) pose a challenge to this basic formulation, as the boundaries of named entities do not necessarily coincide with token boundaries; rather, they respect morphological boundaries. To address NER in MRLs we then need to answer two fundamental questions, namely, what are the basic units to be labeled, and how can these units be detected and classified in realistic settings (i.e., where no gold morphology is available). We empirically investigate these questions on a novel NER benchmark, with parallel token-level and morpheme-level NER annotations, which we develop for Modern Hebrew, a morphologically rich-and-ambiguous language. Our results show that explicitly modeling morphological boundaries leads to improved NER performance, and that a novel hybrid architecture, in which NER precedes and prunes morphological decomposition, greatly outperforms the standard pipeline, where morphological decomposition strictly precedes NER, setting a new performance bar for both Hebrew NER and Hebrew morphological decomposition tasks.
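The "classification over a sequence of tokens" formulation can be illustrated by decoding per-token labels into entity spans. The sketch below assumes the common BIO tagging scheme, which the abstract does not name, and the helper `bio_to_spans` is hypothetical.

```python
def bio_to_spans(tags):
    """Convert per-token BIO tags into (start, end, label) spans, end exclusive.

    Assumption: a stray I- tag with no matching open span starts a new span.
    """
    spans = []
    start, label = None, None
    # A trailing "O" sentinel flushes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith(("B-", "I-")):
            start, label = i, tag[2:]
    return spans

print(bio_to_spans(["B-PER", "I-PER", "O", "B-LOC"]))
```

Because each span is a run of whole tokens, this formulation cannot express an entity that begins or ends inside a token, which is the mismatch with morphological boundaries that the abstract describes.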
41

Cao, Lina, Jian Zhang, Xinquan Ge, and Jindong Chen. "Occupational profiling driven by online job advertisements: Taking the data analysis and processing engineering technicians as an example." PLOS ONE 16, no. 6 (June 22, 2021): e0253308. http://dx.doi.org/10.1371/journal.pone.0253308.

Abstract:
The occupational profiling system driven by the traditional survey method has some shortcomings such as lag in updating, time consumption and laborious revision. It is necessary to refine and improve the traditional occupational portrait system through dynamic occupational information. Under the circumstances of big data, this paper showed the feasibility of vocational portraits driven by job advertisements with data analysis and processing engineering technicians (DAPET) as an example. First, according to the description of occupation in the Chinese Occupation Classification Grand Dictionary, a text similarity algorithm was used to preliminarily choose recruitment data with high similarity. Second, Convolutional Neural Networks for Sentence Classification (TextCNN) was used to further classify the preliminary corpus to obtain a precise occupational dataset. Third, the specialty and skill were taken as named entities that were automatically extracted by the named entity recognition technology. Finally, putting the extracted entities into the occupational dataset, the occupation characteristics of multiple dimensions were depicted to form a profile of the vocation.
42

Athenikos, Sofia J., and Il-Yeol Song. "CAM." Journal of Database Management 24, no. 4 (October 2013): 51–80. http://dx.doi.org/10.4018/jdm.2013100103.

Abstract:
The problem of identifying relevant classes (entities) and associations (relationships) is a fundamental problem for conceptual modeling. In a previous work the authors introduced a conceptual modeling methodology named OMP (Ontology-based Modeling Patterns), which is based on the analysis of class categories representing entity types that are organized in the form of ontology. Since then the authors have explored a way to improve the methodology. As a result, in this paper the authors introduce a new conceptual modeling framework, entitled CAM (Class/Association-analysis-based Modeling), which is based on the analysis and classification of association types as well as entity types. The main objective of CAM is to serve as a tool to facilitate teaching the fundamentals of conceptual modeling to students in a systematic way, by providing extensible and adaptable entity/association classificatory systems that can be directly used in the problem-solving process. In this paper the authors present the CAM framework and illustrate its application.
43

Adel, Heike, and Hinrich Schuetze. "Type-aware Convolutional Neural Networks for Slot Filling." Journal of Artificial Intelligence Research 66 (September 28, 2019): 297–339. http://dx.doi.org/10.1613/jair.1.11725.

Abstract:
The slot filling task aims at extracting answers for queries about entities from text, such as "Who founded Apple". In this paper, we focus on the relation classification component of a slot filling system. We propose type-aware convolutional neural networks to benefit from the mutual dependencies between entity and relation classification. In particular, we explore different ways of integrating the named entity types of the relation arguments into a neural network for relation classification, including a joint training and a structured prediction approach. To the best of our knowledge, this is the first study on type-aware neural networks for slot filling. The type-aware models lead to the best results of our slot filling pipeline. Joint training performs comparably to structured prediction. To understand the impact of the different components of the slot filling pipeline, we perform a recall analysis, a manual error analysis and several ablation studies. Such analyses are of particular importance to other slot filling researchers since the official slot filling evaluations only assess pipeline outputs. The analyses show that coreference resolution and our convolutional neural networks in particular have a large positive impact on the final performance of the slot filling pipeline. The presented models, the source code of our system and our coreference resource are publicly available.
44

Sakurai, Shigeaki. "Analysis of Textual Data Based on Inductive Learning Techniques." International Journal of Information Retrieval Research 3, no. 2 (April 2013): 40–57. http://dx.doi.org/10.4018/ijirr.2013040103.

Abstract:
This paper introduces methods for knowledge discovery from textual data based on inductive learning techniques. The author discusses three methods for extracting features from the textual data: the first uses a key concept dictionary, the second a key phrase pattern dictionary, and the third a named entity extractor. These features are used to generate rules representing relationships between the features and text classes. The rules are described in the format of a fuzzy decision tree. The features are also used to acquire a classification model based on SVM (Support Vector Machine). The model can classify new textual data into the text classes with high classification accuracy. Lastly, this paper introduces two application tasks based on these methods and verifies the effect of the methods.
45

Tanasijević, Ivana, and Gordana Pavlović-Lažetić. "HerCulB: content-based information extraction and retrieval for cultural heritage of the Balkans." Electronic Library 38, no. 5/6 (October 30, 2020): 905–18. http://dx.doi.org/10.1108/el-03-2020-0052.

Abstract:
Purpose: The purpose of this paper is to provide a methodology for automatic annotation of a multimedia collection of intangible cultural heritage mostly in the form of interviews. Assigned annotations provide a way to search the collection. Design/methodology/approach: Annotation is based on automatic extraction of metadata and is conducted by named entity and topic extraction from textual descriptions with a rule-based approach supported by vocabulary resources, a compiled domain-specific classification scheme and domain-oriented corpus analysis. Findings: The proposed methodology for automatic annotation of a collection of intangible cultural heritage, applied on the cultural heritage of the Balkans, has very good results according to F measure, which is 0.87 for the named entity and 0.90 for topic annotation. The overall methodology enables encapsulating domain-specific and language-specific knowledge into collections of finite state transducers and allows further improvements. Originality/value: Although cultural heritage has a significant role in the development of identity of a group or an individual, it is one of those specific domains that have not yet been fully explored in case of many languages. A methodology is proposed that can be used for incorporating natural language processing techniques into digital libraries of cultural heritage.
46

Gyan, Emmanuel, François Dreyfus, and Pierre Fenaux. "REFRACTORY THROMBOCYTOPENIA AND NEUTROPENIA: A DIAGNOSTIC CHALLENGE." Mediterranean Journal of Hematology and Infectious Diseases 7 (February 13, 2015): e2015018. http://dx.doi.org/10.4084/mjhid.2015.018.

Abstract:
Background. The 2008 WHO classification identified refractory cytopenia with unilineage dysplasia (RCUD) as a composite entity encompassing refractory anemia, refractory thrombocytopenia (RT), and refractory neutropenia (RN), characterized by 10% or more dysplastic cells in the bone marrow respective lineage. The diagnosis of RT and RN is complicated by several factors. Diagnosing RT first requires exclusion of familial thrombocytopenia, chronic auto-immune thrombocytopenia, concomitant medications, viral infections, or hypersplenism. Diagnosis of RN should also be made after ruling out differential diagnoses such as ethnic or familial neutropenia, as well as acquired, drug-induced, infection-related or malignancy-related neutropenia. An accurate quantification of dysplasia should be performed in order to distinguish RT or RN from the provisional entity named idiopathic cytopenia of unknown significance (ICUS). Cytogenetic analysis, and possibly in the future somatic mutation analysis (of genes most frequently mutated in MDS), and flow cytometry analysis of aberrant antigen expression on myeloid cells may help in this differential diagnosis. Importantly, we and others found that, while isolated neutropenia and thrombocytopenia are not rare in MDS, those patients can generally be classified (according to WHO 2008 classification) as refractory cytopenia with multilineage dysplasia or refractory anemia with excess blasts, while RT and RN (according to WHO 2008) are quite rare. These results suggest in particular that identification of RT and RN as distinct entities could be reconsidered in future WHO classification updates.
47

Sharoff, Serge. "Finding next of kin: Cross-lingual embedding spaces for related languages." Natural Language Engineering 26, no. 2 (September 4, 2019): 163–82. http://dx.doi.org/10.1017/s1351324919000354.

Abstract:
Some languages have very few NLP resources, while many of them are closely related to better-resourced languages. This paper explores how the similarity between the languages can be utilised by porting resources from better- to lesser-resourced languages. The paper introduces a way of building a representation shared across related languages by combining cross-lingual embedding methods with a lexical similarity measure which is based on the weighted Levenshtein distance. One of the outcomes of the experiments is a Panslavonic embedding space for nine Balto-Slavonic languages. The paper demonstrates that the resulting embedding space helps in such applications as morphological prediction, named-entity recognition and genre classification.
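A weighted Levenshtein distance, as mentioned in this abstract, generalizes edit distance by letting substitution costs depend on the characters involved. The sketch below is a generic dynamic-programming version. The vowel-discount cost function is a made-up example and does not reproduce the weights used in the paper.

```python
VOWELS = set("aeiouy")

def vowel_discount(x, y):
    """Hypothetical substitution cost: vowel-for-vowel swaps cost half."""
    if x == y:
        return 0.0
    return 0.5 if x in VOWELS and y in VOWELS else 1.0

def weighted_levenshtein(a, b, sub_cost=None, indel_cost=1.0):
    """Edit distance with a pluggable substitution cost, using two DP rows."""
    if sub_cost is None:
        sub_cost = lambda x, y: 0.0 if x == y else 1.0
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,          # delete ca
                curr[j - 1] + indel_cost,      # insert cb
                prev[j - 1] + sub_cost(ca, cb) # substitute or match
            ))
        prev = curr
    return prev[-1]

print(weighted_levenshtein("mleko", "moloko"))                  # unit costs
print(weighted_levenshtein("mleko", "moloko", vowel_discount))  # cheaper vowel swaps
```

Lowering the cost of systematic sound correspondences makes cognates in related languages score as closer than unrelated words of similar length.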
48

Permatasari, Dinda Ayu, and Devira Anggi Maharani. "Combination of Natural Language Understanding and Reinforcement Learning for Booking Bot." Journal of Electrical, Electronic, Information, and Communication Technology 3, no. 1 (April 30, 2021): 12. http://dx.doi.org/10.20961/jeeict.3.1.49818.

Abstract:
At present, some popular messaging applications have evolved, with bots beginning to emerge. One application of chatbots is helping people book flights by applying Named Entity Recognition to the text, parsing sentences to detect user intentions, and responding, even though the conversation domain is limited. This study analyzes and designs chatbot interactions using NLU (Natural Language Understanding) so that the bot understands what the user means and provides the best and correct response. Classification using the Support Vector Machine (SVM) method with Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction is a suitable combination that produces the highest accuracy, up to 97.5%. Chatbot dialogue developed using NLU, which consists of NER and intent classification, together with a dialog manager using Reinforcement Learning, could keep the computing cost of chatbots low.
49

Wu, Stephen, Kirk Roberts, Surabhi Datta, Jingcheng Du, Zongcheng Ji, Yuqi Si, Sarvesh Soni, et al. "Deep learning in clinical natural language processing: a methodical review." Journal of the American Medical Informatics Association 27, no. 3 (December 3, 2019): 457–70. http://dx.doi.org/10.1093/jamia/ocz200.

Abstract:
Objective: This article methodically reviews the literature on deep learning (DL) for natural language processing (NLP) in the clinical domain, providing quantitative analysis to answer 3 research questions concerning methods, scope, and context of current research. Materials and Methods: We searched MEDLINE, EMBASE, Scopus, the Association for Computing Machinery Digital Library, and the Association for Computational Linguistics Anthology for articles using DL-based approaches to NLP problems in electronic health records. After screening 1,737 articles, we collected data on 25 variables across 212 papers. Results: DL in clinical NLP publications more than doubled each year, through 2018. Recurrent neural networks (60.8%) and word2vec embeddings (74.1%) were the most popular methods; the information extraction tasks of text classification, named entity recognition, and relation extraction were dominant (89.2%). However, there was a “long tail” of other methods and specific tasks. Most contributions were methodological variants or applications, but 20.8% were new methods of some kind. The earliest adopters were in the NLP community, but the medical informatics community was the most prolific. Discussion: Our analysis shows growing acceptance of deep learning as a baseline for NLP research, and of DL-based NLP in the medical community. A number of common associations were substantiated (eg, the preference of recurrent neural networks for sequence-labeling named entity recognition), while others were surprisingly nuanced (eg, the scarcity of French language clinical NLP with deep learning). Conclusion: Deep learning has not yet fully penetrated clinical NLP and is growing rapidly. This review highlighted both the popular and unique trends in this active field.
50

Wang, Xu, Shuai Zhao, Bo Cheng, Jiale Han, Yingting Li, Hao Yang, and Guoshun Nan. "HGMAN: Multi-Hop and Multi-Answer Question Answering Based on Heterogeneous Knowledge Graph (Student Abstract)." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 10 (April 3, 2020): 13953–54. http://dx.doi.org/10.1609/aaai.v34i10.7249.

Abstract:
Multi-hop question answering models based on knowledge graphs have been extensively studied. Most existing models predict a single answer with the highest probability by ranking candidate answers. However, the ranking method prevents them from predicting all the right answers. In this paper, we propose a novel model that converts the ranking of candidate answers into individual predictions for each candidate, named the heterogeneous knowledge graph based multi-hop and multi-answer model (HGMAN). HGMAN is capable of capturing more informative representations for relations, assisted by our heterogeneous graph, which consists of multiple entity nodes and relation nodes. We rely on a graph convolutional network for multi-hop reasoning and then on binary classification for each node to get multiple answers. Experimental results on the MetaQA dataset show that our proposed model outperforms all baselines.
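The contrast between ranking candidates and making an individual binary prediction for each candidate can be sketched as follows. The candidate names, probabilities, and the 0.5 threshold are made-up illustration values, and the functions are hypothetical rather than part of HGMAN.

```python
def top1_answer(scores):
    """Ranking-style prediction: return only the highest-scoring candidate."""
    return max(scores, key=scores.get)

def multi_answer(scores, threshold=0.5):
    """Per-candidate binary decision: keep every candidate at or above the threshold."""
    return sorted(name for name, prob in scores.items() if prob >= threshold)

# Hypothetical per-node probabilities for one question with two correct answers.
scores = {"film_a": 0.91, "film_b": 0.84, "film_c": 0.07}

print(top1_answer(scores))
print(multi_answer(scores))
```

Ranking returns a single candidate even when several are correct, while independent thresholding can return any number of answers, including none.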