Journal articles on the topic 'Semantic concepts extraction'


Consult the top 50 journal articles for your research on the topic 'Semantic concepts extraction.'


Browse journal articles from a wide variety of disciplines and organise your bibliography correctly.

1

Huang, Jingxiu, Ruofei Ding, Xiaomin Wu, Shumin Chen, Jiale Zhang, Lixiang Liu, and Yunxiang Zheng. "WERECE: An Unsupervised Method for Educational Concept Extraction Based on Word Embedding Refinement." Applied Sciences 13, no. 22 (November 14, 2023): 12307. http://dx.doi.org/10.3390/app132212307.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The era of educational big data has sparked growing interest in extracting and organizing educational concepts from massive amounts of information. Outcomes are of the utmost importance for artificial intelligence–empowered teaching and learning. Unsupervised educational concept extraction methods based on pre-trained models continue to proliferate due to ongoing advances in semantic representation. However, it remains challenging to directly apply pre-trained large language models to extract educational concepts; pre-trained models are built on extensive corpora and do not necessarily cover all subject-specific concepts. To address this gap, we propose a novel unsupervised method for educational concept extraction based on word embedding refinement (i.e., word embedding refinement–based educational concept extraction (WERECE)). It integrates a manifold learning algorithm to adapt a pre-trained model for extracting educational concepts while accounting for the geometric information in semantic computation. We further devise a discriminant function based on semantic clustering and Box–Cox transformation to enhance WERECE’s accuracy and reliability. We evaluate its performance on two newly constructed datasets, EDU-DT and EDUTECH-DT. Experimental results show that WERECE achieves an average precision up to 85.9%, recall up to 87.0%, and F1 scores up to 86.4%, which significantly outperforms baselines (TextRank, term frequency–inverse document frequency, isolation forest, K-means, and one-class support vector machine) on educational concept extraction. Notably, when WERECE is implemented with different parameter settings, its precision and recall sensitivity remain robust. WERECE also holds broad application prospects as a foundational technology, such as for building discipline-oriented knowledge graphs, enhancing learning assessment and feedback, predicting learning interests, and recommending learning resources.
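The discriminant function above couples semantic clustering with a Box–Cox transformation. As a rough illustration of that last ingredient (a minimal sketch, not the authors' implementation; the variable names and the fixed λ are assumptions), the one-parameter Box–Cox transform reshapes skewed similarity scores before thresholding:

```python
import math

def box_cox(x, lam):
    """One-parameter Box-Cox transform for positive x.

    lam == 0 falls back to the log transform, which is the limit of
    (x**lam - 1) / lam as lam -> 0.
    """
    if x <= 0:
        raise ValueError("Box-Cox requires positive inputs")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam

# Skewed cosine-similarity-like scores (invented for illustration);
# a suitable lambda makes the distribution more symmetric before
# thresholding candidate concepts.
scores = [0.05, 0.1, 0.2, 0.4, 0.8]
transformed = [box_cox(s, 0.5) for s in scores]
```

In practice λ is usually fitted by maximum likelihood rather than fixed by hand, and WERECE's actual discriminant additionally uses cluster structure, which this sketch omits.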
2

Li, Dao Wang. "Research on Text Conceptual Relation Extraction Based on Domain Ontology." Advanced Materials Research 739 (August 2013): 574–79. http://dx.doi.org/10.4028/www.scientific.net/amr.739.574.

Abstract:
At present, ontology learning research focuses on concept and relation extraction. Traditional extraction methods ignore the influence of semantic factors on the extraction results and lack accurate extraction of the relations among concepts. To address this problem, this paper combines association rules with semantic similarity and applies an improved comprehensive semantic similarity to relation extraction via association rule mining. Experiments show that relation extraction based on this method effectively improves the precision of the extraction results.
3

Katsadaki, Eirini, and Margarita Kokla. "Comparative Evaluation of Keyphrase Extraction Tools for Semantic Analysis of Climate Change Scientific Reports and Ontology Enrichment." AGILE: GIScience Series 5 (May 30, 2024): 1–7. http://dx.doi.org/10.5194/agile-giss-5-32-2024.

Abstract:
Keyphrase extraction is a process used for identifying important concepts and entities within unstructured information sources to facilitate ontology enrichment, semantic analysis, and information retrieval. In this paper, three different keyphrase extraction tools are compared to evaluate their accuracy and effectiveness for extracting geospatial and climate change concepts from climate change reports: term frequency–inverse document frequency (TF-IDF), Amazon Comprehend, and YAKE. Climate change reports contain vital information for comprehending the complexity of climate change causes, impacts, and interconnections, and include a wealth of information on geospatial concepts, locations, and events, but the diverse terminology used complicates information extraction and organization. The highest-scoring keyphrases are further used to enrich and populate the SWEET ontology with concepts and instances related to climate change, and meaningful relations between them, to support semantic representation and formalization of knowledge.
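Of the three tools compared, TF-IDF is simple enough to sketch in a few lines. The toy below (an illustration only; the document lists, tokenization, and weighting details are assumptions, not the paper's setup) scores each term by its in-document frequency weighted against cross-document rarity:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return per-document term scores: tf(t, d) * idf(t)."""
    n = len(docs)
    df = Counter()                      # document frequency per term
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({t: tf[t] / len(doc) * math.log(n / df[t])
                       for t in tf})
    return scores

# Three invented, pre-tokenized "reports" for illustration.
docs = [["sea", "level", "rise"],
        ["emission", "rise"],
        ["sea", "ice", "loss"]]
scores = tf_idf(docs)
# "level" occurs in only one document while "rise" occurs in two,
# so in the first document "level" outranks "rise".
```

Real keyphrase extraction would additionally generate multi-word candidate phrases (e.g. noun chunks) before scoring, which this sketch skips.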
4

AlArfaj, Abeer. "Towards relation extraction from Arabic text: a review." International Robotics & Automation Journal 5, no. 5 (December 24, 2019): 212–15. http://dx.doi.org/10.15406/iratj.2019.05.00195.

Abstract:
Semantic relation extraction is an important component of ontologies that can support many applications, e.g., text mining, question answering, and information extraction. However, extracting semantic relations between concepts is not trivial and is one of the main challenges in the Natural Language Processing (NLP) field. The Arabic language has complex morphological, grammatical, and semantic aspects, since it is a highly inflectional and derivational language, which makes the task even more challenging. In this paper, we present a review of the state of the art for relation extraction from texts, addressing the progress and difficulties in this field. We discuss several aspects related to this task, considering both taxonomic and non-taxonomic relation extraction methods. The majority of relation extraction approaches implement a combination of statistical and linguistic techniques to extract semantic relations from text. We also give special attention to the state of the work on relation extraction from Arabic texts, which needs further progress.
5

Ji, Lei, Yujing Wang, Botian Shi, Dawei Zhang, Zhongyuan Wang, and Jun Yan. "Microsoft Concept Graph: Mining Semantic Concepts for Short Text Understanding." Data Intelligence 1, no. 3 (June 2019): 238–70. http://dx.doi.org/10.1162/dint_a_00013.

Abstract:
Knowledge is important for text-related applications. In this paper, we introduce Microsoft Concept Graph, a knowledge graph engine that provides concept tagging APIs to facilitate the understanding of human languages. Microsoft Concept Graph is built upon Probase, a universal probabilistic taxonomy consisting of instances and concepts mined from the Web. We start by introducing the construction of the knowledge graph through iterative semantic extraction and taxonomy construction procedures, which extract 2.7 million concepts from 1.68 billion Web pages. We then use conceptualization models to represent text in the concept space to empower text-related applications, such as topic search, query recommendation, Web table understanding and Ads relevance. Since its release in 2016, Microsoft Concept Graph has received more than 100,000 pageviews, 2 million API calls and 3,000 registered downloads from 50,000 visitors over 64 countries.
6

Papadias, Evangelos, Margarita Kokla, and Eleni Tomai. "Educing knowledge from text: semantic information extraction of spatial concepts and places." AGILE: GIScience Series 2 (June 4, 2021): 1–7. http://dx.doi.org/10.5194/agile-giss-2-38-2021.

Abstract:
A growing body of geospatial research has shifted the focus from fully structured to semi-structured and unstructured content written in natural language. Natural language texts provide a wealth of knowledge about geospatial concepts, places, events, and activities that needs to be extracted and formalized to support semantic annotation, knowledge-based exploration, and semantic search. The paper presents a web-based prototype for the extraction of geospatial entities and concepts, and the subsequent semantic visualization and interactive exploration of the extraction results. A lightweight ontology anchored in natural language guides the interpretation of natural language texts and the extraction of relevant domain knowledge. The approach is applied to three heterogeneous sources which provide a wealth of spatial concepts and place names.
7

Hong Doan, Phuoc Thi, Ngamnij Arch-int, and Somjit Arch-int. "A Semantic Framework for Extracting Taxonomic Relations from Text Corpus." International Arab Journal of Information Technology 17, no. 3 (May 1, 2019): 325–37. http://dx.doi.org/10.34028/iajit/17/3/6.

Abstract:
Nowadays, ontologies are exploited in many applications due to their abilities in representing knowledge and inferring new knowledge. However, the manual construction of ontologies is tedious and time-consuming; therefore, automated ontology construction from text has been investigated. The extraction of taxonomic relations between concepts is a crucial step in constructing domain ontologies. To obtain taxonomic relations from a text corpus, especially when the data is deficient, the approach of using the web as a source of collective knowledge (a.k.a. the web-based approach) is usually applied. The important challenge of this approach is how to collect relevant knowledge from a large number of web pages. To overcome this issue, we propose a framework that combines Word Sense Disambiguation (WSD) and the web-based approach to extract taxonomic relations from a domain-text corpus. This framework consists of two main stages: concept extraction and taxonomic-relation extraction. Concepts acquired from the concept-extraction stage are disambiguated through the WSD module and then passed to the taxonomic-relation extraction stage. To evaluate the efficiency of the proposed framework, we conduct experiments on datasets from two domains, tourism and sport. The obtained results show that the proposed method is effective on corpora which are insufficient or have no training data. Moreover, the proposed method outperforms the state-of-the-art method on corpora yielding high WSD results.
8

Chahal, Poonam, Manjeet Singh, and Suresh Kumar. "Semantic Analysis Based Approach for Relevant Text Extraction Using Ontology." International Journal of Information Retrieval Research 7, no. 4 (October 2017): 19–36. http://dx.doi.org/10.4018/ijirr.2017100102.

Abstract:
Semantic analysis is computed by extracting the interrelated concepts used by an author in the text of a document. The concepts and the relationships among them are the most relevant elements, as they provide the maximum information related to the event or activity described by the author in the document. The relevant information retrieved from the text helps in constructing a summary of a large text in the document. This summary can further be represented in the form of an ontology and utilized in various application areas of the information retrieval process, such as crawling, indexing, and ranking. The constructed ontologies can be compared with each other to calculate a similarity index based on semantic analysis between any two texts. This paper presents a novel technique for retrieving relevant semantic information, represented in the form of an ontology, for true semantic analysis of a given text.
9

Abbas, Asim, Muhammad Afzal, Jamil Hussain, Taqdir Ali, Hafiz Syed Muhammad Bilal, Sungyoung Lee, and Seokhee Jeon. "Clinical Concept Extraction with Lexical Semantics to Support Automatic Annotation." International Journal of Environmental Research and Public Health 18, no. 20 (October 9, 2021): 10564. http://dx.doi.org/10.3390/ijerph182010564.

Abstract:
Extracting clinical concepts, such as problems, diagnoses, and treatments, from unstructured clinical narrative documents enables data-driven approaches such as machine and deep learning to support advanced applications such as clinical decision-support systems, the assessment of disease progression, and the intelligent analysis of treatment efficacy. Various tools such as cTAKES, Sophia, MetaMap, and other rule-based approaches and algorithms have been used for automatic concept extraction. Recently, machine- and deep-learning approaches have been used to extract, classify, and accurately annotate terms and phrases. However, the requirement of an annotated dataset, which is labor-intensive to produce, impedes the success of data-driven approaches. A rule-based mechanism could support the annotation process, but existing rule-based approaches fail to adequately capture contextual, syntactic, and semantic patterns. This study introduces a comprehensive rule-based system that automatically extracts clinical concepts from unstructured narratives with higher accuracy and transparency. The proposed system is a pipelined approach capable of recognizing clinical concepts of three types, problem, treatment, and test, in a dataset collected from a published repository as part of the i2b2 2010 challenge. The system's performance is compared with that of three existing systems: QuickUMLS, BIO-CRF, and the Rules (i2b2) model. Compared to the baseline systems, the average F1-score of 72.94% was found to be 13% better than QuickUMLS, 3% better than BIO-CRF, and 30.1% better than the Rules (i2b2) model. Individually, the system's performance was noticeably higher for problem-related concepts, with an F1-score of 80.45%, followed by treatment- and test-related concepts, with F1-scores of 76.06% and 55.3%, respectively. The proposed methodology significantly improves the performance of concept extraction from unstructured clinical narratives by exploiting linguistic and lexical semantic features. The approach can ease the automatic annotation of clinical data, which ultimately improves the performance of supervised data-driven applications trained with these data.
10

Arnold, Patrick, and Erhard Rahm. "Automatic Extraction of Semantic Relations from Wikipedia." International Journal on Artificial Intelligence Tools 24, no. 02 (April 2015): 1540010. http://dx.doi.org/10.1142/s0218213015400102.

Abstract:
We introduce a novel approach to extract semantic relations (e.g., is-a and part-of relations) from Wikipedia articles. These relations are used to build up a large and up-to-date thesaurus providing background knowledge for tasks such as determining semantic ontology mappings. Our automatic approach uses a comprehensive set of semantic patterns, finite state machines and NLP techniques to extract millions of relations between concepts. An evaluation for different domains shows the high quality and effectiveness of the proposed approach. We also illustrate the value of the newly found relations for improving existing ontology mappings.
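Pattern-based extraction of is-a relations of the kind the paper describes can be illustrated with a single Hearst-style pattern (a toy sketch; the actual system combines a comprehensive pattern set, finite state machines, and NLP preprocessing):

```python
import re

# "X such as A, B and C" -> (A, is-a, X), (B, is-a, X), (C, is-a, X).
# Single-word hypernyms only; a real system would match noun phrases.
PATTERN = re.compile(r"(\w+) such as ((?:\w+(?:, | and )?)+)")

def extract_isa(sentence):
    relations = []
    for hypernym, tail in PATTERN.findall(sentence):
        for hyponym in re.split(r", | and ", tail):
            relations.append((hyponym, "is-a", hypernym))
    return relations

rels = extract_isa("He studied instruments such as guitar, piano and violin")
```

A thesaurus builder would then aggregate such triples over many articles and filter them by frequency before accepting them as relations.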
11

Lee, Hyewon, Soyoung Yoon, and Ziyoung Park. "“SEMANTIC” in a Digital Curation Model." Journal of Data and Information Science 5, no. 1 (April 22, 2020): 81–92. http://dx.doi.org/10.2478/jdis-2020-0007.

Abstract:
Purpose: This study attempts to propose an abstract model by gathering concepts that focus on resource representation and description in a digital curation model, and to suggest a conceptual model that emphasizes semantic enrichment in digital curation.
Design/methodology/approach: This study conducts a literature review to analyze the preceding curation models DCC CLM, DCC&U, UC3, and DCN.
Findings: The concept of semantic enrichment is expressed in a single word, SEMANTIC, in this study. The Semantic Enrichment Model, SEMANTIC, has the elements subject, extraction, multi-language, authority, network, thing, identity, and connect.
Research limitations: This study does not reflect the actual information environment because it focuses on the concepts of the representation of digital objects.
Practical implications: This study presents the main considerations for creating and reinforcing the description and representation of digital objects when building and developing digital curation models in specific institutions.
Originality/value: This study summarizes the elements that should be emphasized in the representation of digital objects in terms of information organization.
12

Jagan, Balaji, Ranjani Parthasarathi, and T. V. Geetha. "Bootstrapping of Semantic Relation Extraction for a Morphologically Rich Language." International Journal on Semantic Web and Information Systems 15, no. 1 (January 2019): 119–49. http://dx.doi.org/10.4018/ijswis.2019010106.

Abstract:
This article focuses on the use of a bootstrapping approach for the extraction of semantic relations that exist between two different concepts in a Tamil text. The proposed system, bootstrapping approach to semantic UNL relation extraction (BASURE), extracts generic relations that exist between different components of a sentence by exploiting the morphological richness of Tamil. Tamil is essentially a partially free word order language, which means that the semantic relations between concepts can occur anywhere in the sentence, not necessarily in a fixed order. Here, the authors use Universal Networking Language (UNL), an interlingua framework, to represent word-based features and aim to define the UNL semantic relations that exist between any two constituents of a sentence. The morphological suffix, lexical category, and UNL semantic constraints associated with a word are defined as tuples of the pattern used for bootstrapping. Most systems define the initial set of seed patterns manually; however, this article uses a rule-based approach to obtain the word-based features that form the tuples of the patterns. A bootstrapping approach is then applied to extract all possible instances from the corpus and to generate new patterns. The authors also introduce the use of the UNL ontology to discover the semantic similarity between semantic tuples of a pattern, and hence to learn new patterns from the text corpus in an iterative manner. The use of the UNL ontology makes this approach general and domain-independent. The results obtained are evaluated and compared with existing approaches, and it is shown that this approach is generic, can extract all sentence-based semantic UNL relations, and significantly increases the performance of the generic semantic relation extraction system.
13

Kokla, M., V. Papadias, and E. Tomai. "ENRICHMENT AND POPULATION OF A GEOSPATIAL ONTOLOGY FOR SEMANTIC INFORMATION EXTRACTION." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-4 (September 19, 2018): 309–14. http://dx.doi.org/10.5194/isprs-archives-xlii-4-309-2018.

Abstract:
The massive amount of user-generated content available today presents a new challenge for the geospatial domain and a great opportunity to delve into linguistic, semantic, and cognitive aspects of geographic information. Ontology-based information extraction is a new, prominent field in which a domain ontology guides the extraction process and the identification of pre-defined concepts, properties, and instances from natural language texts. The paper describes an approach for enriching and populating a geospatial ontology using both a top-down and a bottom-up approach in order to enable semantic information extraction. The top-down approach is applied in order to incorporate knowledge from existing ontologies. The bottom-up approach is applied in order to enrich and populate the geospatial ontology with semantic information (concepts, relations, and instances) extracted from domain-specific web content.
14

Eljinini, Mohammad Ali H. "The Medical Semantic Web." International Journal of Information Technology and Web Engineering 6, no. 2 (April 2011): 18–28. http://dx.doi.org/10.4018/jitwe.2011040102.

Abstract:
In this paper, the need for the right information for patients with chronic diseases is elaborated, followed by some scenarios of how the semantic web can be utilised to retrieve useful and precise information by stakeholders. In previous work, the author demonstrated that the automation of knowledge acquisition from the current web is becoming an important step towards this goal. The aim was twofold: first, to learn what types of information exist in chronic disease-related websites, and secondly, how to extract and structure such information into a machine-understandable form. It has been shown that these websites exhibit many common concepts, which resulted in the construction of an ontology to guide the extraction of information from new, unseen websites. The study has also resulted in the development of a platform for information extraction that utilises the ontology. Continuing work has opened many issues, which are discussed in this paper. While further work is still needed, the experiments to date have shown encouraging results.
15

Pancerz, Krzysztof, Arkadiusz Lewicki, and Ryszard Tadeusiewicz. "Ant-based extraction of rules in simple decision systems over ontological graphs." International Journal of Applied Mathematics and Computer Science 25, no. 2 (June 1, 2015): 377–87. http://dx.doi.org/10.1515/amcs-2015-0029.

Abstract:
In the paper, the problem of extraction of complex decision rules in simple decision systems over ontological graphs is considered. The extracted rules are consistent with the dominance principle, similar to that applied in the dominance-based rough set approach (DRSA). In our study, we propose to use a heuristic algorithm, utilizing the ant-based clustering approach, to search the semantic spaces of concepts presented by means of ontological graphs. Concepts included in the semantic spaces are values of attributes describing objects in simple decision systems.
16

Desul, Sudarsana, Madurai Meenachi N., Thejas Venkatesh, Vijitha Gunta, Gowtham R., and Magapu Sai Baba. "Method for automatic key concepts extraction." Electronic Library 37, no. 1 (February 4, 2019): 2–15. http://dx.doi.org/10.1108/el-01-2018-0012.

Abstract:
Purpose: The ontology of a domain mainly consists of a set of concepts and their semantic relations. It is typically constructed and maintained using ontology editors with substantial human intervention. It is desirable to perform the task automatically, which has led to the development of ontology learning techniques. One of the main challenges of ontology learning from text is to identify key concepts from the documents. A wide range of techniques for key concept extraction have been proposed, but they suffer from low accuracy, poor performance, inflexibility, and applicability only to specific domains. The purpose of this study is to explore a new method to extract key concepts and to apply it to literature in the nuclear domain.
Design/methodology/approach: In this article, a novel method for key concept extraction is proposed and applied to documents from the nuclear domain. A hybrid approach was used, combining domain knowledge, syntactic and named-entity knowledge, and statistical methods. The performance of the developed method was evaluated against judgments obtained by two-out-of-three voting logic from three domain experts, using 120 documents retrieved from the SCOPUS database.
Findings: The work reported pertains to extracting concepts from a set of selected documents and aids the search for documents relating to given concepts. The results of a case study indicated that the method developed demonstrates better metrics than Text2Onto and CFinder. The method described is capable of extracting valid key concepts from a set of candidates with long phrases.
Research limitations/implications: The present study is restricted to literature in the English language and applied to documents from the nuclear domain. It has the potential to be extended to other domains.
Practical implications: The work carried out in the current study has the potential to lead to updating the International Nuclear Information System thesaurus for ontology in the nuclear domain, which can enable efficient search methods.
Originality/value: This work is the first attempt to automatically extract key concepts from nuclear documents. The proposed approach addresses most of the problems of current methods and thereby increases performance.
17

Huynh, Nghia Huu, Quoc Bao Ho, and Te An Nguyen. "An approach in health relation extraction." Science & Technology Development Journal - Economics - Law and Management 1, Q3 (December 31, 2017): 51–63. http://dx.doi.org/10.32508/stdjelm.v1iq3.449.

Abstract:
Extracting relations among medical concepts is very important in the medical field. The relations denote events or possible relations between concepts, and information about these relations provides users with a full view of medical problems. This helps physicians and health-care practitioners make effective decisions and minimize errors in the treatment process. This paper surveys methods for relation extraction in health texts and presents an approach for one type of specific relation (i.e., template filling). The approach combines rule-based and machine learning-based methods. The rule-based method uses semantic dependency relations among the concepts to extract the rule set. The machine learning-based method uses the SVM (Support Vector Machine) algorithm and a proposed feature set. The results of the approach were evaluated, achieving an accuracy of 0.849.
18

Wall, Jeffrey D., and Rahul Singh. "Contextualized Meaning Extraction." International Journal of Organizational and Collective Intelligence 7, no. 3 (July 2017): 15–29. http://dx.doi.org/10.4018/ijoci.2017070102.

Abstract:
Text mining is a powerful form of business intelligence that is used increasingly to inform organizational decisions. Current text mining algorithms rely heavily on the lexical, syntactic, structural, and semantic features of text to extract meaning and insight for decision making. Although semantic analysis is a useful approach to meaning extraction, pragmatics suggests that a more accurate meaning of text can be extracted by examining the context in which the text is recorded. Given that massive amounts of textual data can be drawn from multiple and diverse sources, accounting for context is increasingly important. A conceptual model is provided to explain how concepts from pragmatics can improve existing text mining algorithms to provide more accurate information for decision making. Reversing the pragmatic process of meaning expression could lead to improved text mining algorithms. The theoretical process model developed herein can provide insight into the development and refinement of text mining algorithms that draw from diverse sources.
19

Ain, Qurat Ul, Mohamed Amine Chatti, Komlan Gluck Charles Bakar, Shoeb Joarder, and Rawaa Alatrash. "Automatic Construction of Educational Knowledge Graphs: A Word Embedding-Based Approach." Information 14, no. 10 (September 27, 2023): 526. http://dx.doi.org/10.3390/info14100526.

Abstract:
Knowledge graphs (KGs) are widely used in the education domain to offer learners a semantic representation of domain concepts from educational content and their relations, termed as educational knowledge graphs (EduKGs). Previous studies on EduKGs have incorporated concept extraction and weighting modules. However, these studies face limitations in terms of accuracy and performance. To address these challenges, this work aims to improve the concept extraction and weighting mechanisms by leveraging state-of-the-art word and sentence embedding techniques. Concretely, we enhance the SIFRank keyphrase extraction method by using SqueezeBERT and we propose a concept-weighting strategy based on SBERT. Furthermore, we conduct extensive experiments on different datasets, demonstrating significant improvements over several state-of-the-art keyphrase extraction and concept-weighting techniques.
20

Anoop, V. S., and S. Asharaf. "Extracting Conceptual Relationships and Inducing Concept Lattices from Unstructured Text." Journal of Intelligent Systems 28, no. 4 (September 25, 2019): 669–81. http://dx.doi.org/10.1515/jisys-2017-0225.

Abstract:
Concept and relationship extraction from unstructured text data plays a key role in meaning-aware computing paradigms, which make computers intelligent by helping them learn, interpret, and synthesize information. These concepts and relationships leverage knowledge in the form of ontological structures, which are the backbone of the semantic web. This paper proposes a framework that extracts concepts and relationships from unstructured text data and then learns lattices that connect concepts and relationships. The proposed framework uses an off-the-shelf tool for identifying common concepts from a plain text corpus and then implements machine learning algorithms for classifying common relations that connect those concepts. Formal concept analysis is then used for generating concept lattices, which is a proven and principled method of creating formal ontologies that aid machines in learning. A rigorous and structured experimental evaluation of the proposed method on real-world datasets has been conducted. The results show that the newly proposed framework outperforms state-of-the-art approaches in concept extraction and lattice generation.
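The formal concept analysis step mentioned above derives concepts as maximal object-attribute rectangles of a binary context. A minimal, stdlib-only sketch (illustrative, not the authors' pipeline; the toy context is invented) enumerates them by closing every subset of objects:

```python
from itertools import combinations

def formal_concepts(context):
    """Enumerate the formal concepts (extent, intent) of a binary context.

    context maps each object to its set of attributes. A formal concept
    pairs a set of objects (extent) with exactly the attributes they all
    share (intent), such that neither set can be enlarged.
    """
    objects = list(context)
    attributes = set().union(*context.values())
    concepts = set()
    for r in range(len(objects) + 1):
        for subset in combinations(objects, r):
            # Close the subset: intent = common attributes,
            # extent = every object carrying that whole intent.
            intent = (attributes.intersection(*(context[o] for o in subset))
                      if subset else attributes)
            extent = frozenset(o for o in objects if intent <= context[o])
            concepts.add((extent, frozenset(intent)))
    return concepts

ctx = {"dog": {"mammal", "pet"},
       "cat": {"mammal", "pet"},
       "whale": {"mammal", "aquatic"}}
lattice = formal_concepts(ctx)
```

Ordering the resulting (extent, intent) pairs by extent inclusion yields the concept lattice; real corpora call for NextClosure-style algorithms rather than this exponential enumeration.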
21

Xu, Dongfang, Manoj Gopale, Jiacheng Zhang, Kris Brown, Edmon Begoli, and Steven Bethard. "Unified Medical Language System resources improve sieve-based generation and Bidirectional Encoder Representations from Transformers (BERT)–based ranking for concept normalization." Journal of the American Medical Informatics Association 27, no. 10 (July 27, 2020): 1510–19. http://dx.doi.org/10.1093/jamia/ocaa080.

Abstract:
Objective: Concept normalization, the task of linking phrases in text to concepts in an ontology, is useful for many downstream tasks including relation extraction, information retrieval, etc. We present a generate-and-rank concept normalization system based on our participation in the 2019 National NLP Clinical Challenges Shared Task Track 3 Concept Normalization.
Materials and Methods: The shared task provided 13 609 concept mentions drawn from 100 discharge summaries. We first design a sieve-based system that uses Lucene indices over the training data, Unified Medical Language System (UMLS) preferred terms, and UMLS synonyms to generate a list of possible concepts for each mention. We then design a listwise classifier based on the BERT (Bidirectional Encoder Representations from Transformers) neural network to rank the candidate concepts, integrating UMLS semantic types through a regularizer.
Results: Our generate-and-rank system was third of 33 in the competition, outperforming the candidate generator alone (81.66% vs 79.44%) and the previous state of the art (76.35%). During postevaluation, the model's accuracy was increased to 83.56% via improvements to how training data are generated from UMLS and incorporation of our UMLS semantic type regularizer.
Discussion: Analysis of the model shows that prioritizing UMLS preferred terms yields better performance, that the UMLS semantic type regularizer results in qualitatively better concept predictions, and that the model performs well even on concepts not seen during training.
Conclusions: Our generate-and-rank framework for UMLS concept normalization integrates key UMLS features like preferred terms and semantic types with a neural network–based ranking model to accurately link phrases in text to UMLS concepts.
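The generate-and-rank pattern described in the abstract can be sketched without Lucene, UMLS, or BERT. In the toy below, a hypothetical mini-lexicon (the identifiers and terms are invented, not real UMLS CUIs) stands in for the sieve's indices, and a Jaccard token-overlap score stands in for the BERT listwise ranker:

```python
def generate_candidates(mention, lexicon):
    """Sieve-style generation: try an exact match first, then fall
    back to any entry sharing at least one token with the mention."""
    mention_tokens = set(mention.lower().split())
    exact = [cui for cui, term in lexicon.items()
             if term.lower() == mention.lower()]
    if exact:
        return exact
    return [cui for cui, term in lexicon.items()
            if mention_tokens & set(term.lower().split())]

def rank(mention, candidates, lexicon):
    """Stand-in ranker: Jaccard overlap between token sets
    (the real system uses a BERT listwise classifier)."""
    m = set(mention.lower().split())
    def score(cui):
        t = set(lexicon[cui].lower().split())
        return len(m & t) / len(m | t)
    return sorted(candidates, key=score, reverse=True)

# Hypothetical identifiers and preferred terms for illustration only.
lexicon = {"C001": "myocardial infarction",
           "C002": "cerebral infarction",
           "C003": "heart failure"}
mention = "infarction of the myocardial wall"
cands = generate_candidates(mention, lexicon)
best = rank(mention, cands, lexicon)[0]
```

The sieve ordering (exact before fuzzy) mirrors the paper's finding that higher-precision sources such as preferred terms should be consulted first.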
22

Nasser, Ahmed, and Hayri Sever. "A Concept-based Sentiment Analysis Approach for Arabic." International Arab Journal of Information Technology 17, no. 5 (September 1, 2020): 778–88. http://dx.doi.org/10.34028/iajit/17/5/11.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Concept-Based Sentiment Analysis (CBSA) methods are considered more advanced and more accurate than ordinary sentiment analysis methods because they can detect the emotions conveyed by multi-word concept expressions in language. This paper presents a CBSA system for Arabic that utilizes both machine learning approaches and a concept-based sentiment lexicon. For extracting concepts from Arabic text, a rule-based concept extraction algorithm called the semantic parser is proposed. Different feature extraction and representation techniques are experimented with while building the sentiment analysis model for the presented Arabic CBSA system. Comprehensive, comparative experiments using different classification methods and classifier fusion models, together with different combinations of the proposed feature sets, are used to evaluate and test the presented CBSA system. The experimental results showed that the best-performing sentiment analysis model is a combined Support Vector Machine–Logistic Regression (SVM-LR) model, which obtained an F-score of 93.23% using the Concept-Based-Features+Lexicon-Based-Features+Word2vec-Features (CBF+LEX+W2V) feature combination.
23

Ben Abdessalem Karaa, Wahiba, Eman H. Alkhammash, and Aida Bchir. "Drug Disease Relation Extraction from Biomedical Literature Using NLP and Machine Learning." Mobile Information Systems 2021 (May 19, 2021): 1–10. http://dx.doi.org/10.1155/2021/9958410.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Extracting the relations between medical concepts is very valuable in the medical domain. Scientists need to extract relevant information and semantic relations between medical concepts, including protein and protein, gene and protein, drug and drug, and drug and disease. These relations can be extracted from biomedical literature available on various databases. This study examines the extraction of semantic relations that can occur between diseases and drugs. Findings will help specialists make good decisions when administering a medication to a patient and will allow them to continuously be up to date in their field. The objective of this work is to identify different features related to drugs and diseases from medical texts by applying Natural Language Processing (NLP) techniques and UMLS ontology. The Support Vector Machine classifier uses these features to extract valuable semantic relationships among text entities. The contributing factor of this research is the combination of the strength of a suggested NLP technique, which takes advantage of UMLS ontology and enables the extraction of correct and adequate features (frequency features, lexical features, morphological features, syntactic features, and semantic features), and Support Vector Machines with polynomial kernel function. These features are manipulated to pinpoint the relations between drug and disease. The proposed approach was evaluated using a standard corpus extracted from MEDLINE. The finding considerably improves the performance and outperforms similar works, especially the f-score for the most important relation “cure,” which is equal to 98.19%. The accuracy percentage is better than those in all the existing works for all the relations.
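The classification step described above can be illustrated, under heavy simplification, with scikit-learn: an SVM with a polynomial kernel over bag-of-words counts. The sentences, labels, and feature representation below are invented; the paper's actual features (frequency, lexical, morphological, syntactic, semantic, UMLS-derived) are far richer.

```python
# Minimal sketch of polynomial-kernel SVM relation classification.
# Training data is invented; this is not the paper's MEDLINE corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

sentences = [
    "drug X cures disease Y", "drug X treats disease Y",
    "drug Z causes disease W as a side effect", "disease W worsens under drug Z",
]
labels = ["cure", "cure", "side_effect", "side_effect"]

# Bag-of-words features fed into an SVM with a degree-2 polynomial kernel.
model = make_pipeline(CountVectorizer(), SVC(kernel="poly", degree=2))
model.fit(sentences, labels)
print(model.predict(["drug A cures disease B"]))
```

In the paper, richer feature vectors derived via NLP and the UMLS replace the raw term counts used here.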
24

Afef, Zwidi, Ameni Yengui, and Neji Mahmoud. "Research system of semantic information in medical videoconference based on conceptual graphs and domain ontologies." INTERNATIONAL JOURNAL OF MANAGEMENT & INFORMATION TECHNOLOGY 7, no. 2 (November 30, 2013): 979–99. http://dx.doi.org/10.24297/ijmit.v7i2.703.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The multiplication of audiovisual documents (AVDs) has created a problem for information search within gigantic databases whose contents we cannot index completely by hand. Indeed, these documents pose several complex difficulties because of the vertiginous increase in the quantity of multimedia data to be processed and the specific challenges of representing and extracting their content, in particular its semantics, since these documents contain three types of media (text, sound, image). AVDs can be classified into professionally broadcast videos (movies, shows), sports videos, surveillance videos, videoconferences, etc. In this paper, we propose a model for representing the semantic content of medical videoconference documents based on conceptual graphs, taking the different modalities into account. This model is based on the extraction of concepts and of the semantic relations between them, and draws on a domain ontology.
25

Devarajan, Viji, and Revathy Subramanian. "Analyzing semantic similarity amongst textual documents to suggest near duplicates." Indonesian Journal of Electrical Engineering and Computer Science 25, no. 3 (March 1, 2022): 1703. http://dx.doi.org/10.11591/ijeecs.v25.i3.pp1703-1711.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Data deduplication techniques remove repeated or redundant data from storage. In recent years, ever more data has been generated and stored, and storage environments hold much redundant and semantically similar content, which reduces storage efficiency and raises storage cost. To overcome this problem, we propose HBTSG, a hybrid Bidirectional Encoder Representations from Transformers (BERT) model for text semantics using a graph convolutional network: a word embedding-based deep learning model that identifies near duplicates based on the semantic relationships between text documents. In this paper we hybridize the concepts of chunking and semantic analysis. The chunking process splits the documents into blocks; the next stage identifies the semantic relationships between documents using word embedding techniques. The approach combines the advantages of chunking, feature extraction, and semantic relations to provide better results.
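A minimal, library-free sketch of the chunk-and-compare idea: split documents into fixed-size word blocks and flag a pair as near-duplicate when the cosine similarity of their term-count vectors crosses a threshold. The HBTSG model uses BERT embeddings rather than raw counts; everything below is an invented toy.

```python
# Toy near-duplicate detection via chunking plus cosine similarity of
# term-count vectors (stand-in for the BERT embeddings used in the paper).
import math
from collections import Counter

def chunks(text: str, size: int = 4):
    """Split a document into fixed-size word blocks."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def near_duplicate(d1: str, d2: str, thr: float = 0.8) -> bool:
    # For brevity the whole documents are compared; a fuller pipeline would
    # compare the chunk vectors produced by chunks() block by block.
    return cosine(Counter(d1.lower().split()), Counter(d2.lower().split())) >= thr

print(near_duplicate("the cat sat on the mat", "the cat sat on a mat"))  # True
```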
26

Navigli, Roberto, and Paola Velardi. "Learning Domain Ontologies from Document Warehouses and Dedicated Web Sites." Computational Linguistics 30, no. 2 (June 2004): 151–79. http://dx.doi.org/10.1162/089120104323093276.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
We present a method and a tool, OntoLearn, aimed at the extraction of domain ontologies from Web sites, and more generally from documents shared among the members of virtual organizations. OntoLearn first extracts a domain terminology from available documents. Then, complex domain terms are semantically interpreted and arranged in a hierarchical fashion. Finally, a general-purpose ontology, WordNet, is trimmed and enriched with the detected domain concepts. The major novel aspect of this approach is semantic interpretation, that is, the association of a complex concept with a complex term. This involves finding the appropriate WordNet concept for each word of a terminological string and the appropriate conceptual relations that hold among the concept components. Semantic interpretation is based on a new word sense disambiguation algorithm, called structural semantic interconnections.
27

Preum, Sarah Masud, Sile Shu, Homa Alemzadeh, and John A. Stankovic. "EMSContExt: EMS Protocol-Driven Concept Extraction for Cognitive Assistance in Emergency Response." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 08 (April 3, 2020): 13350–55. http://dx.doi.org/10.1609/aaai.v34i08.7048.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This paper presents a technique for automated curation of a domain-specific knowledge base or lexicon for resource-constrained domains, such as Emergency Medical Services (EMS) and its application to real-time concept extraction and cognitive assistance in emergency response. The EMS responders often verbalize critical information describing the situations at an incident scene, including patients' physical condition and medical history. Automated extraction of EMS protocol-specific concepts from responders' speech data can facilitate cognitive support through the selection and execution of the proper EMS protocols for patient treatment. Although this task is similar to the traditional NLP task of concept extraction, the underlying application domain poses major challenges, including low training resources availability (e.g., no existing EMS ontology, lexicon, or annotated EMS corpus) and domain mismatch. Hence, we develop EMSContExt, a weakly-supervised concept extraction approach for EMS concepts. It utilizes different knowledge bases and a semantic concept model based on a corpus of over 9400 EMS narratives for lexicon expansion. The expanded EMS lexicon is then used to automatically extract critical EMS protocol-specific concepts from real-time EMS speech narratives. Our experimental results show that EMSContExt achieves 0.85 recall and 0.82 F1-score for EMS concept extraction and significantly outperforms MetaMap, a state-of-the-art medical concept extraction tool. We also demonstrate the application of EMSContExt to EMS protocol selection and execution and real-time recommendation of protocol-specific interventions to the EMS responders. Here, EMSContExt outperforms MetaMap with a 6% increase and six times speedup in weighted recall and execution time, respectively.
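The lexicon-expansion step described above can be sketched as nearest-neighbour lookup in an embedding space. The two-dimensional vectors below are invented placeholders for real word embeddings, and the threshold is arbitrary; this is an illustration of the idea, not the EMSContExt implementation.

```python
# Toy lexicon expansion: add corpus terms whose embedding is close to a seed.
# Embeddings and threshold are invented for illustration.
import math

emb = {"bleeding": (0.9, 0.1), "hemorrhage": (0.88, 0.15),
       "fracture": (0.1, 0.9), "ambulance": (0.5, 0.5)}

def cos(u, v) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def expand(seeds: set, vocab) -> set:
    """Return vocabulary terms whose embedding is near any seed term."""
    return {t for t in vocab for s in seeds
            if t not in seeds and cos(emb[t], emb[s]) >= 0.99}

print(expand({"bleeding"}, emb))  # {'hemorrhage'}
```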
28

Rojas-Garcia, Juan, and Pamela Faber. "Extraction of Terms Related to Named Rivers." Languages 4, no. 3 (June 27, 2019): 46. http://dx.doi.org/10.3390/languages4030046.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
EcoLexicon is a terminological knowledge base on environmental science, whose design permits the geographic contextualization of data. For the geographic contextualization of landform concepts, this paper presents a semi-automatic method for extracting terms associated with named rivers (e.g., Mississippi River). Terms were extracted from a specialized corpus, where named rivers were automatically identified. Statistical procedures were applied for selecting both terms and rivers in distributional semantic models to construct the conceptual structures underlying the usage of named rivers. The rivers sharing associated terms were also clustered and represented in the same conceptual network. The results showed that the method successfully described the semantic frames of named rivers with explanatory adequacy, according to the premises of Frame-Based Terminology.
29

Mason, Zachary J. "CorMet: A Computational, Corpus-Based Conventional Metaphor Extraction System." Computational Linguistics 30, no. 1 (March 2004): 23–44. http://dx.doi.org/10.1162/089120104773633376.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
CorMet is a corpus-based system for discovering metaphorical mappings between concepts. It does this by finding systematic variations in domain-specific selectional preferences, which are inferred from large, dynamically mined Internet corpora. Metaphors transfer structure from a source domain to a target domain, making some concepts in the target domain metaphorically equivalent to concepts in the source domain. The verbs that select for a concept in the source domain tend to select for its metaphorical equivalent in the target domain. This regularity, detectable with a shallow linguistic analysis, is used to find the metaphorical interconcept mappings, which can then be used to infer the existence of higher-level conventional metaphors. Most other computational metaphor systems use small, hand-coded semantic knowledge bases and work on a few examples. Although CorMet's only knowledge base is WordNet (Fellbaum 1998), it can find the mappings constituting many conventional metaphors and in some cases recognize sentences instantiating those mappings. CorMet is tested on its ability to find a subset of the Master Metaphor List (Lakoff, Espenson, and Schwartz 1991).
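The core signal CorMet exploits, a verb's selectional preferences over the nouns it takes, can be estimated from (verb, object) pairs as simple conditional frequencies. The corpus below is invented, and real selectional preferences would be computed over noun classes rather than surface nouns.

```python
# Estimate selectional preferences as conditional noun frequencies per verb.
# The (verb, object) pairs are invented toy data.
from collections import Counter, defaultdict

pairs = [("attack", "virus"), ("attack", "cell"), ("attack", "fortress"),
         ("defend", "cell"), ("defend", "castle"), ("defend", "castle")]

prefs = defaultdict(Counter)
for verb, noun in pairs:
    prefs[verb][noun] += 1

def selectional_preference(verb: str) -> dict:
    """P(noun | verb) estimated from the pair counts."""
    total = sum(prefs[verb].values())
    return {n: c / total for n, c in prefs[verb].items()}

print(round(selectional_preference("defend")["castle"], 2))  # 0.67
```

Systematic shifts in these distributions between domains are what signal a candidate metaphorical mapping.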
30

Ma, Yong. "Research on Intelligent Evaluation System of Sports Training based on Video Image Acquisition and Scene Semantics." Advances in Multimedia 2022 (March 26, 2022): 1–6. http://dx.doi.org/10.1155/2022/4726450.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This paper addresses the detection of moving targets against a dynamic background in video surveillance with a fixed camera, and proposes a moving-target detection method for surveillance video based on video image acquisition and scene semantics. First, building on the concepts and methods of image recognition, semantic information about the scene was fused in to eliminate interference from areas that do not require detection. Second, a visual feature representation for remote sensing images was presented, comprising a semantic recognition method for remote sensing image scenes and CSIFT features based on PLSA. Ten types of typical remote sensing image scenes were used in the tests, with the same visual vocabulary extraction method throughout; the visual vocabulary was fixed at 600, and the number of latent semantic topics varied between 8 and 50. The test results indicated that the highest average recognition rate was obtained with 20 latent semantic topics, and that an inappropriate number of latent semantic topics leads to a decline in recognition rate. The effectiveness of the method was fully verified.
31

Anastopoulou, N., M. Kavouras, M. Kokla, and E. Tomai. "CONCEPTS – LOCATIONS – EMOTIONS: SEMANTIC ANALYSIS AND VISUALIZATION OF CLIMATE CHANGE TEXTS." International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLIII-B4-2021 (June 30, 2021): 31–37. http://dx.doi.org/10.5194/isprs-archives-xliii-b4-2021-31-2021.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Abstract. Research on knowledge discovery in the geospatial domain currently focuses on semi-structured, or even unstructured, rather than fully structured content. Attention has turned to the plethora of resources on the Web, such as HTML pages, news articles, blogs, social media, etc. Semantic information extraction in geospatial-oriented approaches is further used for semantic analysis, search, and retrieval. The aim of this paper is to extract, analyse and visualize geospatial semantic information and emotions from texts on climate change. A collection of articles on climate change is used to demonstrate the developed approach. These articles describe environmental and socio-economic dimensions of climate change across the Earth, and include a wealth of information related to environmental concepts and geographic locations affected by it. The results are analysed in order to understand which specific human emotions are associated with environmental concepts and/or locations, as well as which environmental terms are linked to locations. For a better understanding of the above-mentioned information, semantic networks are used as a powerful visualization tool of the links among concepts – locations – emotions.
32

Huang, Yikun, Xingsi Xue, and Chao Jiang. "Semantic Integration of Sensor Knowledge on Artificial Internet of Things." Wireless Communications and Mobile Computing 2020 (July 25, 2020): 1–8. http://dx.doi.org/10.1155/2020/8815001.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Artificial Internet of Things (AIoT) integrates Artificial Intelligence (AI) with the Internet of Things (IoT) to create the sensor network that can communicate and process data. To implement the communications and co-operations among intelligent systems on AIoT, it is necessary to annotate sensor data with the semantic meanings to overcome heterogeneity problem among different sensors, which requires the utilization of sensor ontology. Sensor ontology formally models the knowledge on AIoT by defining the concepts, the properties describing a concept, and the relationships between two concepts. Due to human’s subjectivity, a concept in different sensor ontologies could be defined with different terminologies and contexts, yielding the ontology heterogeneity problem. Thus, before using these ontologies, it is necessary to integrate their knowledge by finding the correspondences between their concepts, i.e., the so-called ontology matching. In this work, a novel sensor ontology matching framework is proposed, which aggregates three kinds of Concept Similarity Measures (CSMs) and an alignment extraction approach to determine the sensor ontology alignment. To ensure the quality of the alignments, we further propose a compact Particle Swarm Optimization algorithm (cPSO) to optimize the aggregating weights for the CSMs and a threshold for filtering the alignment. The experiment utilizes the Ontology Alignment Evaluation Initiative (OAEI)’s conference track and two pairs of real sensor ontologies to test cPSO’s performance. The experimental results show that the quality of the alignments obtained by cPSO statistically outperforms other state-of-the-art sensor ontology matching techniques.
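The aggregation-and-filtering step described above can be sketched as a weighted sum of concept-similarity scores followed by a threshold cut. In the paper the weights and threshold are tuned by cPSO; below they are fixed by hand, and the concept pairs and scores are invented.

```python
# Sketch of alignment extraction: aggregate several concept-similarity
# measures with weights, then keep pairs above a threshold.
def aggregate(sims, weights) -> float:
    """Weighted combination of similarity scores; weights must sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9
    return sum(s * w for s, w in zip(sims, weights))

def extract_alignment(candidates, weights, threshold=0.7):
    """candidates: {(concept_a, concept_b): [sim1, sim2, sim3]}"""
    return {pair for pair, sims in candidates.items()
            if aggregate(sims, weights) >= threshold}

# Invented concept pairs with three similarity scores each.
cands = {("Sensor", "SensingDevice"): [0.9, 0.8, 0.7],
         ("Sensor", "Platform"): [0.2, 0.3, 0.1]}
print(extract_alignment(cands, [0.5, 0.3, 0.2]))  # {('Sensor', 'SensingDevice')}
```

cPSO's role in the paper is to search for the weight vector and threshold that maximize alignment quality; this sketch shows only what a fixed choice of both computes.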
33

Kokla, Margarita, and Eric Guilbert. "A Review of Geospatial Semantic Information Modeling and Elicitation Approaches." ISPRS International Journal of Geo-Information 9, no. 3 (March 1, 2020): 146. http://dx.doi.org/10.3390/ijgi9030146.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The present paper provides a review of two research topics that are central to geospatial semantics: information modeling and elicitation. The first topic deals with the development of ontologies at different levels of generality and formality, tailored to various needs and uses. The second topic involves a set of processes that aim to draw out latent knowledge from unstructured or semi-structured content: semantic-based extraction, enrichment, search, and analysis. These processes focus on eliciting a structured representation of information in various forms such as: semantic metadata, links to ontology concepts, a collection of topics, etc. The paper reviews the progress made over the last five years in these two very active areas of research. It discusses the problems and the challenges faced, highlights the types of semantic information formalized and extracted, as well as the methodologies and tools used, and identifies directions for future research.
34

Muppavarapu, Vamsee, Gowtham Ramesh, Amelie Gyrard, and Mahda Noura. "Knowledge extraction using semantic similarity of concepts from Web of Things knowledge bases." Data & Knowledge Engineering 135 (September 2021): 101923. http://dx.doi.org/10.1016/j.datak.2021.101923.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Denecke, K. "Semantic Structuring of and Information Extraction from Medical Documents Using the UMLS." Methods of Information in Medicine 47, no. 05 (2008): 425–34. http://dx.doi.org/10.3414/me0508.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Summary Objectives: This paper introduces SeReMeD (Semantic Representation of Medical Documents), a method for automatically generating knowledge representations from natural language documents. The suitability of the Unified Medical Language System (UMLS) as domain knowledge for this method is analyzed. Methods: SeReMeD combines existing language engineering methods and semantic transformation rules for mapping syntactic information to semantic roles. In this way, the relevant content of medical documents is mapped to semantic structures. In order to extract specific data, these semantic structures are searched for concepts and semantic roles. A study is carried out that uses SeReMeD to detect specific data in medical narratives such as documented diagnoses or procedures. Results: The system is tested on chest X-ray reports. In first evaluations of the system’s performance, the generation of semantic structures achieves a correctness of 80%, whereas the extraction of documented findings obtains values of 93% precision and 83% recall. Conclusions: The results suggest that the methods described here can be used to accurately extract data from medical narratives, although there is also some potential for improving the results. The proposed methods provide two main benefits. By using existing language engineering methods, the effort required to construct a medical information extraction system is reduced. It is also possible to change the domain knowledge and therefore to create a more (or less) specialized system, capable of handling various medical sub-domains.
36

Zhu, Hong Mei, Liang Zhang, and Wei Sun. "A Method for Ontology Module Extract." Applied Mechanics and Materials 668-669 (October 2014): 1198–201. http://dx.doi.org/10.4028/www.scientific.net/amm.668-669.1198.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In the Semantic Web, extensive reuse of existing large ontologies is one of the central ideas of ontology engineering. Ontology extraction should return a relative sub-ontology that covers some sub-vocabulary, yet the efficiency of existing ontology extraction algorithms is relatively low when they try to obtain a suitable ontology module from an ontology at run time. This paper proposes an ontology module extraction method: related concepts and criteria of ontology module extraction are studied; data structures and identification and evaluation methods for ontology module extraction are discussed; and preliminary experimental results with the corresponding analysis are also shown.
37

Barbantan, Ioana, Mihaela Porumb, Camelia Lemnaru, and Rodica Potolea. "Feature Engineered Relation Extraction – Medical Documents Setting." International Journal of Web Information Systems 12, no. 3 (August 15, 2016): 336–58. http://dx.doi.org/10.1108/ijwis-03-2016-0015.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Purpose Improving healthcare services by developing assistive technologies includes both the health aid devices and the analysis of the data collected by them. The acquired data, modeled as a knowledge base, give more insight into each patient’s health status and needs. Therefore, the ultimate goal of a healthcare system is obtaining recommendations provided by an assistive decision support system using such a knowledge base, benefiting the patients, the physicians and the healthcare industry. This paper aims to define the knowledge flow for a medical assistive decision support system by structuring raw medical data and leveraging the knowledge contained in the data, proposing solutions for efficient data search, medical investigation or diagnosis, and medication prediction and relationship identification. Design/methodology/approach The solution this paper proposes for implementing a medical assistive decision support system can analyze any type of unstructured medical document; documents are processed by applying Natural Language Processing (NLP) tasks followed by semantic analysis, leading to medical concept identification and thus imposing a structure on the input documents. The structured information is filtered and classified such that custom decisions regarding patients’ health status can be made. The current research focuses on identifying the relationships between medical concepts as defined by the REMed (Relation Extraction from Medical documents) solution, which aims at finding the patterns that lead to the classification of concept pairs into concept-to-concept relations. Findings This paper proposed the REMed solution, expressed as a multi-class classification problem tackled using the support vector machine classifier. Experimentally, this paper determined the most appropriate setup for the multi-class classification problem, which is a combination of lexical, context, syntactic and grammatical features, as each feature category is good at representing particular relations, but not all. The best results we obtained are expressed as an F1-measure of 74.9 per cent, which is 1.4 per cent better than the results reported by similar systems. Research limitations/implications The difficulty in discriminating between TrIP and TrAP relations revolves around the hierarchical relationship between the two classes, as TrIP is a particular type (an instance) of TrAP. The intuition behind this behavior was that the classifier cannot discern the correct relations because of the bias toward the majority classes. The analysis was conducted using only sentences from electronic health records that contain at least two medical concepts. This limitation was introduced by the availability of the annotated data with reported results, as relations were defined at sentence level. Originality/value The originality of the proposed solution lies in the methodology to extract valuable information from the medical records via semantic searches; concept-to-concept relation identification; and recommendations for diagnosis, treatment and further investigations. The REMed solution introduces a learning-based approach for the automatic discovery of relations between medical concepts. We propose an original list of features: lexical – 3, context – 6, grammatical – 4 and syntactic – 4. The similarity feature introduced in this paper has a significant influence on the classification, and, to the best of the authors’ knowledge, it has not been used as a feature in similar solutions.
38

Antipov, I., W. Hersh, C. A. Smith, M. Mailhot, and H. J. Lowe. "Automated Semantic Indexing of Imaging Reports to Support Retrieval of Medical Images in the Multimedia Electronic Medical Record." Methods of Information in Medicine 38, no. 04/05 (1999): 303–7. http://dx.doi.org/10.1055/s-0038-1634413.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This paper describes preliminary work evaluating automated semantic indexing of radiology imaging reports to represent images stored in the Image Engine multimedia medical record system at the University of Pittsburgh Medical Center. The authors used the SAPHIRE indexing system to automatically identify important biomedical concepts within radiology reports and represent these concepts with terms from the 1998 edition of the U.S. National Library of Medicine’s Unified Medical Language System (UMLS) Metathesaurus. This automated UMLS indexing was then compared with manual UMLS indexing of the same reports. Human indexing identified appropriate UMLS Metathesaurus descriptors for 81% of the important biomedical concepts contained in the report set. SAPHIRE automatically identified UMLS Metathesaurus descriptors for 64% of the important biomedical concepts contained in the report set. The overall conclusions of this pilot study were that the UMLS Metathesaurus provided adequate coverage of the majority of the important concepts contained within the radiology report test set and that SAPHIRE could automatically identify and translate almost two thirds of these concepts into appropriate UMLS descriptors. Further work is required to improve both the recall and precision of this automated concept extraction process.
39

Subramaniyaswamy, V. "Automatic Topic Ontology Construction Using Semantic Relations from WordNet and Wikipedia." International Journal of Intelligent Information Technologies 9, no. 3 (July 2013): 61–89. http://dx.doi.org/10.4018/jiit.2013070104.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Due to the explosive growth of web technology, a huge amount of information is available as web resources over the Internet. Therefore, in order to access relevant content from web resources effectively, considerable attention is paid to the semantic web for efficient knowledge sharing and interoperability. A topic ontology is a hierarchy of a set of topics interconnected using semantic relations, and it is increasingly used in web mining techniques. Reviews of past research reveal that semiautomatic ontology construction is not capable of handling high usage. This shortcoming prompted the authors to develop an automatic topic ontology construction process; in the past, many attempts at automatic ontology construction have been made by other researchers, but these turned out to be challenging due to time, cost and maintenance. In this paper, the authors propose a novel corpus-based approach to enrich the set of categories in the ODP by automatically identifying concepts and their associated semantic relationships with corpus-based external knowledge resources, such as Wikipedia and WordNet. This topic ontology construction approach relies on concept acquisition and semantic relation extraction. A Jena API framework has been developed to organize the set of extracted semantic concepts, while Protégé provides the platform to visualize the automatically constructed topic ontology. To evaluate the performance, web documents were classified using an SVM classifier based on the ODP and on the topic ontology. The topic ontology-based classification produced better accuracy than the ODP.
40

Altaf, Saud, Sofia Iqbal, and Muhammad Waseem Soomro. "Efficient natural language classification algorithm for detecting duplicate unsupervised features." Informatics and Automation 20, no. 3 (June 2, 2021): 623–53. http://dx.doi.org/10.15622/ia.2021.3.5.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This paper focuses on capturing the meaning of Natural Language Understanding (NLU) text features to detect duplicate unsupervised features. The NLU features are compared with lexical approaches to determine the most suitable classification technique. A transfer-learning approach is utilized to train feature extraction on the Semantic Textual Similarity (STS) task. All features are evaluated on two datasets, consisting of Bosch bug reports and Wikipedia article reports. This study aims to structure recent research efforts by comparing NLU concepts for representing the semantics of text and applying them to information retrieval. The main contribution of this paper is a comparative study of semantic similarity measurements. The experimental results demonstrate the Term Frequency–Inverse Document Frequency (TF-IDF) feature results on both datasets with a reasonable vocabulary size, and indicate that a Bidirectional Long Short-Term Memory (BiLSTM) network can learn the structure of a sentence to improve the classification.
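The TF-IDF baseline highlighted in this abstract can be written by hand in a few lines. The toy corpus below is invented; a production system would use a library implementation.

```python
# Hand-rolled TF-IDF weighting over a toy tokenized corpus (illustrative only).
import math
from collections import Counter

docs = [["bug", "report", "duplicate"],
        ["bug", "fix", "patch"],
        ["wiki", "article"]]

def tfidf(doc, corpus) -> dict:
    """Term frequency times inverse document frequency for one document."""
    n = len(corpus)
    counts = Counter(doc)
    return {t: (c / len(doc)) * math.log(n / sum(1 for d in corpus if t in d))
            for t, c in counts.items()}

vec = tfidf(docs[0], docs)
# "duplicate" appears in only one document, so it outweighs the common "bug".
print(vec["duplicate"] > vec["bug"])  # True
```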
41

Poux, F., R. Neuville, P. Hallot, and R. Billen. "MODEL FOR SEMANTICALLY RICH POINT CLOUD DATA." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences IV-4/W5 (October 23, 2017): 107–15. http://dx.doi.org/10.5194/isprs-annals-iv-4-w5-107-2017.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This paper proposes an interoperable model for managing high-dimensional point clouds while integrating semantics. Point clouds from sensors are a direct source of information physically describing a 3D state of the recorded environment. As such, they are an exhaustive representation of the real world at every scale: 3D reality-based spatial data. Their generation is increasingly fast, but processing routines and data models lack the knowledge to reason from information extraction rather than interpretation. The enhanced Smart Point Cloud model developed here brings intelligence to point clouds via three connected meta-models, while linking available knowledge and classification procedures to permit semantic injection. Interoperability drives the adaptation of the model to potentially many applications through specialized domain ontologies. A first prototype, implemented in Python on a PostgreSQL database, allows semantic and spatial concepts to be combined for basic hybrid queries on different point clouds.
42

Terletskyi, D. O., and S. V. Yershov. "Decompositional Extraction and Retrieval of Conceptual Knowledge." PROBLEMS IN PROGRAMMING, no. 3-4 (December 2022): 139–53. http://dx.doi.org/10.15407/pp2022.03-04.139.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
An ability to extract hidden and implicit knowledge, to integrate it into a knowledge base, and then to retrieve the required knowledge items is an important feature of knowledge processing in many modern knowledge-based systems. However, the complexity of these tasks depends on the size of the knowledge sources used for extraction, the size of the knowledge base used for integrating the extracted knowledge, and the size of the search space used for retrieving the required knowledge items. Therefore, in this paper, we analyze the internal semantic dependencies of homogeneous classes of objects and how they affect the decomposition of such classes. Since all subclasses of a homogeneous class of objects form a complete lattice, we applied the methods of formal concept analysis for knowledge extraction and retrieval within the corresponding concept lattice. We found that this approach does not consider internal semantic dependencies within a homogeneous class of objects and can consequently cause the inference and retrieval of formal concepts that are semantically inconsistent within the modeled domain. We adapted the algorithm for the decomposition of homogeneous classes of objects, within the knowledge representation model of object-oriented dynamic networks, to perform dynamic knowledge extraction and retrieval, adding additional filtration parameters. As a result, the algorithm extracts knowledge by constructing only semantically consistent subclasses of homogeneous classes of objects and then filters them according to attribute and dependency queries, retrieving knowledge. In addition, we introduce a decomposition consistency coefficient, which allows estimating how much the algorithm can reduce the search space for knowledge extraction and thus improve performance.
To demonstrate possible application scenarios for the improved algorithm, we provide an example of knowledge extraction and retrieval via decomposition of a particular homogeneous class of objects.
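The formal concept analysis machinery this abstract builds on rests on two derivation operators over an object-attribute context; a minimal sketch, with an object-attribute context invented for illustration:

```python
# Minimal formal concept analysis sketch: the two derivation operators over
# an object-attribute context (the context data is invented for illustration).
context = {
    "sparrow": {"flies", "has_feathers"},
    "penguin": {"swims", "has_feathers"},
    "bat":     {"flies", "has_fur"},
}

def intent(objects):
    """Attributes shared by every object in the given set."""
    sets = [context[o] for o in objects]
    return set.intersection(*sets) if sets else set()

def extent(attributes):
    """Objects possessing every attribute in the given set."""
    return {o for o, attrs in context.items() if attributes <= attrs}

# A formal concept is a pair (A, B) with extent(B) == A and intent(A) == B;
# the set of all such pairs forms the complete lattice the abstract mentions.
A = extent({"has_feathers"})
B = intent(A)
assert A == {"sparrow", "penguin"} and B == {"has_feathers"}
```

The paper's contribution adds semantic-consistency filtering on top of this lattice construction, which the sketch above does not attempt to reproduce.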
43

Giordano, Vito, Marco Consoloni, Filippo Chiarello, and Gualtiero Fantoni. "Towards the extraction of semantic relations in design with natural language processing." Proceedings of the Design Society 4 (May 2024): 2059–68. http://dx.doi.org/10.1017/pds.2024.208.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Natural Language Processing (NLP) has been extensively applied in design, particularly for analyzing technical documents such as patents and scientific papers to identify entities such as functions, technical features, and problems. However, there has been less focus on understanding semantic relations within this literature, and a comprehensive definition of what constitutes a relation is still lacking. In this paper, we define relation in the context of design along with the fundamental concepts linked to it. Subsequently, we introduce a framework for employing NLP to extract relations relevant to design.
44

Wu, Denise H., Sara Waller, and Anjan Chatterjee. "The Functional Neuroanatomy of Thematic Role and Locative Relational Knowledge." Journal of Cognitive Neuroscience 19, no. 9 (September 2007): 1542–55. http://dx.doi.org/10.1162/jocn.2007.19.9.1542.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Lexical-semantic investigations in cognitive neuroscience have focused on conceptual knowledge of concrete objects. By contrast, relational concepts have been largely ignored. We examined thematic role and locative knowledge in 14 left-hemisphere-damaged patients. Relational concepts shift cognitive focus away from the object to the relationship between objects, calling into question the relevance of traditional sensory-functional accounts of semantics. If extraction of a relational structure is the critical cognitive process common to both thematic and locative knowledge, then damage to neural structures involved in such extraction would impair both kinds of knowledge. If the nature of the relationship itself is critical, then functional neuroanatomical dissociations should occur. Using a new lesion analysis method, we found that damage to the lateral temporal cortex produced deficits in thematic role knowledge and damage to inferior fronto-parietal regions produced deficits in locative knowledge. In addition, we found that conceptual knowledge of thematic roles dissociates from its mapping onto language. These relational knowledge deficits were not accounted for by deficits in processing nouns or verbs or by a general deficit in making inferences. Our results are consistent with the hypothesis that manners of visual motion serve as a point of entry for thematic role knowledge and networks dedicated to eye gaze, whereas reaching and grasping serve as a point of entry for locative knowledge. Intermediary convergence zones that are topographically guided by these sensory-motor points of entry play a critical role in the semantics of relational concepts.
45

Wang, Chenliang, Wenjiao Shi, and Hongchen Lv. "Construction of Remote Sensing Indices Knowledge Graph (RSIKG) Based on Semantic Hierarchical Graph." Remote Sensing 16, no. 1 (December 30, 2023): 158. http://dx.doi.org/10.3390/rs16010158.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Remote sensing indices are widely used in various fields of geoscience research. However, there are limits to how effectively knowledge about indices can be managed or analyzed. One of the main problems is the lack of ontology models and research on indices, which makes it difficult to acquire and update knowledge in this area. There is also a lack of techniques for analyzing the mathematical semantics of indices, making it difficult to manage and analyze their mathematical semantics directly. This study integrates an ontology with mathematical semantics to offer a novel remote sensing index knowledge graph (RSIKG) that addresses these issues. The proposed semantic hierarchical graph structure represents index knowledge with an entity-relationship layer and a mathematical semantics layer. Specifically, ontologies in the entity-relationship layer model concepts and relationships among indices, while index formulas in the mathematical semantics layer are represented as mathematical semantic graphs. A method for calculating similarity between index formulas is also proposed. The article describes the entire process of building the RSIKG, including the extraction, storage, analysis, and inference of remote sensing index knowledge. Experiments demonstrate the intuitive and practical nature of the RSIKG for analyzing index knowledge. Overall, the proposed methods are useful for knowledge queries and the analysis of indices, and the present study lays the groundwork for future research on analysis techniques and knowledge processing related to remote sensing indices.
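A formula-similarity measure in this spirit can be sketched at the token level; this is a deliberate simplification, since the paper compares mathematical semantic graphs rather than flat token sets, and the index formulas below are the standard textbook definitions used purely for illustration.

```python
# Hedged sketch of formula similarity via token-set Jaccard overlap; the
# paper's actual method operates on mathematical semantic graphs.
import re

def formula_tokens(formula):
    """Split an index formula into band/number/operator tokens."""
    return set(re.findall(r"[A-Za-z]+|\d+|[-+*/()]", formula))

def formula_similarity(f1, f2):
    """Jaccard similarity over the token sets of two formulas."""
    a, b = formula_tokens(f1), formula_tokens(f2)
    return len(a & b) / len(a | b) if a | b else 1.0

ndvi = "(NIR - RED) / (NIR + RED)"
ndwi = "(GREEN - NIR) / (GREEN + NIR)"
savi = "(NIR - RED) / (NIR + RED + L) * (1 + L)"
# SAVI shares both NIR and RED with NDVI, so it scores closer to NDVI
# than to NDWI under this measure.
assert formula_similarity(ndvi, savi) > formula_similarity(savi, ndwi)
```

A graph-based comparison, as in the paper, would additionally credit structural agreement (the shared ratio-of-difference-to-sum shape) that token overlap ignores.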
46

Zatarain, Omar, Jesse Yoe Rumbo-Morales, Silvia Ramos-Cabral, Gerardo Ortíz-Torres, Felipe d. J. Sorcia-Vázquez, Iván Guillén-Escamilla, and Juan Carlos Mixteco-Sánchez. "A Method for Perception and Assessment of Semantic Textual Similarities in English." Mathematics 11, no. 12 (June 14, 2023): 2700. http://dx.doi.org/10.3390/math11122700.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This research proposes a method for the detection of semantic similarities in text snippets; the method achieves an unsupervised extraction and comparison of semantic information by mimicking skills for the identification of clauses and possible verb conjugations, the selection of the most accurate organization of the parts of speech, and similarity analysis by a direct comparison of the parts of speech from a pair of text snippets. The method for the extraction of the parts of speech in each text exploits a knowledge base structured as a dictionary and a thesaurus to identify the possible labels of each word and its synonyms. The method consists of the processes of perception, debiasing, reasoning, and assessment. The perception module decomposes the text into blocks of information focused on the elicitation of the parts of speech. The debiasing module reorganizes the blocks of information to correct biases that may have been produced during perception. The reasoning module finds the similarities between blocks from two texts through analyses of synonymy, morphological properties, and the relative position of similar concepts within the texts. The assessment generates a judgement on the output produced by the reasoning as the averaged similarity assessment obtained from the parts-of-speech similarities of blocks. The proposed method is implemented for English, exploiting an English knowledge base for the extraction of the similarities and differences of texts. The system implements a set of syntactic and logical rules that enable autonomous reasoning over a knowledge base regardless of its concepts and knowledge domains. A system developed with the proposed method is tested on the “test” dataset used in the SemEval 2017 competition, with seven knowledge bases compiled from six dictionaries and two thesauruses.
The results indicate that the performance of the method increases as the degree of completeness of concepts and their relations increase, and the Pearson correlation for the most accurate knowledge base is 77%.
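The block-comparison idea can be sketched as a synonym-aware word overlap; the tiny thesaurus and scoring rule below are invented stand-ins for the dictionary/thesaurus knowledge bases the paper compiles.

```python
# Toy sketch of synonym-aware similarity between two part-of-speech blocks;
# the thesaurus and the averaging rule are assumptions for illustration.
THESAURUS = {"quick": {"fast", "rapid"}, "car": {"automobile"}}

def words_match(w1, w2):
    """Two words match if equal or listed as synonyms (either direction)."""
    return (w1 == w2
            or w2 in THESAURUS.get(w1, set())
            or w1 in THESAURUS.get(w2, set()))

def block_similarity(block1, block2):
    """Fraction of words in block1 with an exact or synonym match in block2."""
    hits = sum(any(words_match(a, b) for b in block2) for a in block1)
    return hits / len(block1) if block1 else 0.0

assert block_similarity(["quick", "car"], ["fast", "automobile"]) == 1.0
```

The abstract's observation that performance rises with knowledge-base completeness maps directly onto how densely `THESAURUS` covers the vocabulary.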
47

Betru, Bisrat, and Fekade Getahun. "Ontology-driven Intelligent IT Incident Management Model." International Journal of Information Technology and Computer Science 15, no. 1 (February 8, 2023): 30–41. http://dx.doi.org/10.5815/ijitcs.2023.01.04.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
A significant number of Information Technology (IT) incidents are reported through email. To design and implement an intelligent incident management system, it is essential to automatically classify each reported incident into a given incident category, which requires extracting semantic content from the reported email text. In this research work, we attempt to classify a reported incident into a given category based on its semantic content using an ontology. We have developed an Incident Ontology that can serve as a knowledge base for the incident management system, along with an automatic incident classifier that matches the semantic units of the incident report with concepts in the incident ontology. According to our evaluation, ontology-driven incident classification facilitates IT incident management, with the model showing 100% recall, 66% precision, and a 79% F1-score on sample incident reports.
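The reported metrics are internally consistent: with 100% recall and 66% precision, the harmonic mean lands at about 79%, matching the stated F1-score.

```python
# Checking the abstract's reported metrics: F1 is the harmonic mean of
# precision and recall.
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.66, 1.0)
assert abs(f1 - 0.795) < 0.005  # ~79%, as reported
```

The 100%-recall / 66%-precision profile suggests the classifier over-triggers: every true incident category is found, at the cost of spurious matches.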
48

Lin, Bingqian, Yi Zhu, Xiaodan Liang, Liang Lin, and Jianzhuang Liu. "Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation." Proceedings of the AAAI Conference on Artificial Intelligence 37, no. 2 (June 26, 2023): 1568–76. http://dx.doi.org/10.1609/aaai.v37i2.25243.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Vision-Language Navigation (VLN) is a challenging task which requires an agent to align complex visual observations to language instructions to reach the goal position. Most existing VLN agents directly learn to align raw directional features and visual features, trained using one-hot labels, to linguistic instruction features. However, the big semantic gap among these multi-modal inputs makes the alignment difficult and therefore limits navigation performance. In this paper, we propose Actional Atomic-Concept Learning (AACL), which maps visual observations to actional atomic concepts to facilitate the alignment. Specifically, an actional atomic concept is a natural language phrase containing an atomic action and an object, e.g., ``go up stairs''. These actional atomic concepts, which serve as the bridge between observations and instructions, can effectively mitigate the semantic gap and simplify the alignment. AACL contains three core components: 1) a concept mapping module to map the observations to the actional atomic concept representations through the VLN environment and the recently proposed Contrastive Language-Image Pretraining (CLIP) model, 2) a concept refining adapter to encourage more instruction-oriented object concept extraction by re-ranking the predicted object concepts by CLIP, and 3) an observation co-embedding module which utilizes concept representations to regularize the observation representations. Our AACL establishes new state-of-the-art results on both fine-grained (R2R) and high-level (REVERIE and R2R-Last) VLN benchmarks. Moreover, visualization shows that AACL significantly improves the interpretability of action decisions. Code will be available at https://gitee.com/mindspore/models/tree/master/research/cv/VLN-AACL.
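The concept re-ranking idea can be sketched as cosine scoring of candidate phrases against an observation embedding; the 3-d vectors below are invented mock-ups standing in for CLIP embeddings, and the candidate phrases are examples, not the paper's concept vocabulary.

```python
# Hypothetical sketch of concept re-ranking: candidate actional atomic
# concepts are scored against an observation embedding by cosine similarity.
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

observation = (0.9, 0.1, 0.0)            # mock image embedding
concepts = {
    "go up stairs":  (0.8, 0.2, 0.1),    # mock text embeddings
    "open the door": (0.1, 0.9, 0.2),
}
ranked = sorted(concepts, key=lambda c: cosine(observation, concepts[c]),
                reverse=True)
assert ranked[0] == "go up stairs"
```

In AACL the ranking is additionally conditioned on the instruction, which this observation-only sketch omits.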
49

Zhu, Lichao, Hangzhou Yang, and Zhijun Yan. "Mining medical related temporal information from patients’ self-description." International Journal of Crowd Science 1, no. 2 (June 12, 2017): 110–20. http://dx.doi.org/10.1108/ijcs-08-2017-0018.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Purpose: The purpose of this paper is to develop a new method to extract medical temporal information from online health communities. Design/methodology/approach: The authors trained a conditional random field model for the extraction of temporal expressions. Temporal relation identification is treated as a classification task, and several support vector machine classifiers are built in the proposed method. For model training, the authors extracted high-level semantic features, including co-reference relationships among medical concepts and the semantic similarity among words. Findings: For the extraction of TIMEX, the authors find that well-formatted expressions are easy to recognize; the main challenge is relative TIMEX such as “three days after onset”. Normalization of absolute dates and well-formatted durations shows the same difficulty, whereas frequency is easier to normalize. For the identification of DocTimeRel, the results are fairly good, and the relation is difficult to identify when it involves a relative TIMEX or a hypothetical concept. Originality/value: The authors propose a new method to extract temporal information from online clinical data and evaluate the usefulness of different levels of syntactic features in this task.
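The abstract's point about well-formatted TIMEX being easy to spot can be illustrated with a regex baseline; the paper itself trains a conditional random field for this step, so the pattern list below is only a toy stand-in that would miss exactly the relative expressions the authors identify as hard.

```python
# Illustrative regex-based TIMEX spotter; a CRF (as in the paper) is needed
# for relative expressions like "three days after onset", which this misses.
import re

TIMEX_PATTERN = re.compile(
    r"\b(\d{4}-\d{2}-\d{2}"                 # absolute dates: 2017-06-12
    r"|\d+\s+(?:day|week|month|year)s?"     # digit durations: 3 days
    r"|(?:once|twice)\s+a\s+(?:day|week))\b",  # frequencies: twice a day
    re.IGNORECASE,
)

def extract_timex(text):
    """Return all well-formatted temporal expressions found in the text."""
    return [m.group(0) for m in TIMEX_PATTERN.finditer(text)]

text = ("Fever started on 2017-06-12, lasted 3 days, "
        "and I take aspirin twice a day.")
assert extract_timex(text) == ["2017-06-12", "3 days", "twice a day"]
```

Each matched span would then feed the SVM classifiers that decide its temporal relation to the document time (DocTimeRel).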
50

Ekinci, Ekin, and Sevinç İlhan Omurca. "Concept-LDA: Incorporating Babelfy into LDA for aspect extraction." Journal of Information Science 46, no. 3 (April 29, 2019): 406–18. http://dx.doi.org/10.1177/0165551519845854.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Latent Dirichlet allocation (LDA) is one of the probabilistic topic models; it discovers the latent topic structure in a document collection. The basic assumption under LDA is that documents are viewed as a probabilistic mixture of latent topics; a topic has a probability distribution over words, and each document is modelled on the basis of a bag-of-words model. Topic models such as LDA are sufficient for learning hidden topics, but they do not take into account the deeper semantic knowledge of a document. In this article, we propose a novel method based on topic modelling to determine the latent aspects of online review documents. In the proposed model, called Concept-LDA, the feature space of reviews is enriched with concepts and named entities extracted from Babelfy, yielding topics that contain not only co-occurring words but also semantically related words. Performance in terms of topic coherence and topic quality is reported over 10 publicly available datasets, and it is demonstrated that Concept-LDA achieves better topic representations than an LDA model alone, as measured by topic coherence and F-measure. The learned topic representation of Concept-LDA leads to an accurate and easy aspect extraction task in an aspect-based sentiment analysis system.
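The enrichment step that distinguishes Concept-LDA can be sketched simply: each review's token list is augmented with linked concepts before topic modelling. The concept linker below is a toy dictionary standing in for Babelfy, with invented surface forms and concept names.

```python
# Hedged sketch of Concept-LDA's feature-space enrichment step; the concept
# links are invented stand-ins for Babelfy's entity/concept linking.
CONCEPT_LINKS = {  # surface form -> linked concept (illustrative only)
    "screen": "display_device",
    "battery": "electric_battery",
}

def enrich(tokens):
    """Append a concept token for every word the linker resolves."""
    return tokens + [CONCEPT_LINKS[t] for t in tokens if t in CONCEPT_LINKS]

review = ["the", "screen", "is", "bright", "but", "battery", "drains"]
enriched = enrich(review)
assert "display_device" in enriched and "electric_battery" in enriched
```

The enriched bag-of-words is then fed to a standard LDA trainer, so semantically related reviews share concept tokens even when their surface vocabularies differ.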

To the bibliography