Journal articles on the topic 'RDF datasets'

To see the other types of publications on this topic, follow the link: RDF datasets.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 journal articles for your research on the topic 'RDF datasets.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse journal articles across a wide variety of disciplines and organise your bibliography correctly.

1

Kim, Ju Ri, and Sung Kook Han. "R2RS: schema-based relational databases mapping to linked datasets." International Journal of Engineering & Technology 7, no. 3.3 (June 8, 2018): 119. http://dx.doi.org/10.14419/ijet.v7i2.33.13868.

Abstract:
Background/Objectives: The vast amount of high-quality data stored in relational databases (RDB) is a primary resource for Linked Open Data (LOD) datasets. This paper proposes a schema-based mapping approach from RDB to RDF that provides succinct and efficient mapping. Methods/Statistical analysis: Various approaches, languages, and tools for mapping RDB to LOD have been proposed in recent years. This paper surveys and analyzes classic mapping approaches and languages such as Direct Mapping and R2RML. The mapping approaches can be categorized by means of their data modeling. After analyzing the conventional RDB-RDF mapping methods, this paper proposes a new mapping method and discusses its typical features and applications. Findings: There are two types of mapping approaches for the translation of RDB to RDF: instance-based and schema-based. Instance-based mapping approaches generate large amounts of RDF graphs by means of mapping rules. These approaches cause data redundancy, since the same data is stored both in the RDB and as RDF, and they easily lead to data inconsistency when update operations occur. Schema-based mapping approaches can effectively avoid data redundancy, since the mapping is accomplished at the conceptual schema level. The architecture of a SPARQL endpoint based on the schema mapping approach consists of five phases: (1) generation of mapping descriptions based on mapping rules; (2) formulation of SPARQL queries for RDF graph patterns; (3) translation of the SPARQL query into an SQL query; (4) execution of the SQL query in the RDB; and (5) interpretation of the SQL query result into JSON-LD format. Experiments show that the schema-based mapping approach is a straightforward, succinct, and efficient mapping method for RDB2RDF. Improvements/Applications: This paper proposes a schema-based mapping approach called R2RS, which shows better performance than conventional mapping methods. In addition, R2RS provides an efficient implementation of a SPARQL endpoint over the RDB.
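The pipeline above hinges on phase (3), rewriting SPARQL graph patterns into SQL over the original schema. Below is a minimal Python sketch of that single step, not the authors' R2RS implementation: the predicate-to-table mapping, the person table, and the predicate IRI are all invented for illustration.

    # Minimal illustration of schema-based SPARQL-to-SQL rewriting.
    # The predicate-to-table mapping and the schema are hypothetical.
    import sqlite3

    # predicate IRI -> (table, subject column, object column)
    MAPPING = {"http://example.org/name": ("person", "id", "name")}

    def triple_pattern_to_sql(predicate_iri, object_value=None):
        """Rewrite the pattern ?s <predicate> ?o (object optionally bound) into SQL."""
        table, subj_col, obj_col = MAPPING[predicate_iri]
        sql = f"SELECT {subj_col} AS s, {obj_col} AS o FROM {table}"
        params = ()
        if object_value is not None:
            sql += f" WHERE {obj_col} = ?"
            params = (object_value,)
        return sql, params

    # In-memory database standing in for the source RDB.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])

    sql, params = triple_pattern_to_sql("http://example.org/name")
    for s, o in conn.execute(sql, params):
        # Each row corresponds to one RDF triple: <.../person/{s}> ex:name "{o}" .
        print(f"http://example.org/person/{s}", "http://example.org/name", o)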
2

Sultana, Tangina, and Young-Koo Lee. "gRDF: An Efficient Compressor with Reduced Structural Regularities That Utilizes gRePair." Sensors 22, no. 7 (March 26, 2022): 2545. http://dx.doi.org/10.3390/s22072545.

Abstract:
The explosive volume of semantic data published in the Resource Description Framework (RDF) data model demands efficient management and compression with better compression ratios and runtime. Although extensive work has been carried out on compressing RDF datasets, existing compressors do not perform well in all dimensions and rarely exploit the graph patterns and structural regularities of real-world datasets. Moreover, a variety of existing approaches reduce the size of a graph by using a grammar-based graph compression algorithm. In this study, we introduce a novel approach named gRDF (graph repair for RDF) that uses gRePair, one of the most efficient grammar-based graph compression schemes, to compress RDF datasets. In addition, we have improved the performance of HDT (header-dictionary-triple), an efficient approach for compressing RDF datasets based on structural properties, by introducing modified HDT (M-HDT). It can detect frequent graph patterns in a single pass over the dataset by employing a data-structure-oriented approach. In our proposed system, we use M-HDT for indexing the nodes and edge labels. We then employ the gRePair algorithm to identify the grammar of the RDF graph. Afterward, the system improves the performance of k2-trees by introducing a more efficient algorithm to create the trees and serialize the RDF datasets. Our experiments affirm that the proposed gRDF scheme achieves approximately 26.12%, 13.68%, 6.81%, 2.38%, and 12.76% better compression ratios than the most prominent state-of-the-art schemes (HDT, HDT++, k2-trees, RDF-TR, and gRePair, respectively) on real-world datasets. Moreover, the processing efficiency of the proposed scheme also outperforms the others.
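For context on the HDT-style indexing that M-HDT extends: such compressors first dictionary-encode all terms, so that each triple becomes three integer IDs and the string dictionary is stored once. A toy Python sketch of dictionary encoding (illustrative only; real HDT additionally compresses the ID-triple structure itself):

    # HDT-style dictionary encoding: terms become integer IDs, so each
    # triple is stored as three small integers instead of three strings.
    triples = [
        ("ex:alice", "foaf:knows", "ex:bob"),
        ("ex:alice", "foaf:name", '"Alice"'),
        ("ex:bob",   "foaf:name", '"Bob"'),
    ]

    dictionary = {}  # term -> integer ID
    def encode(term):
        return dictionary.setdefault(term, len(dictionary))

    encoded = [(encode(s), encode(p), encode(o)) for s, p, o in triples]
    inverse = {i: t for t, i in dictionary.items()}

    print(encoded)  # e.g. [(0, 1, 2), (0, 3, 4), (2, 3, 5)]
    print([tuple(inverse[i] for i in t) for t in encoded])  # lossless round trip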
3

Marx, Edgard, Tommaso Soru, Saeedeh Shekarpour, Sören Auer, Axel-Cyrille Ngonga Ngomo, and Karin Breitman. "Towards an Efficient RDF Dataset Slicing." International Journal of Semantic Computing 07, no. 04 (December 2013): 455–77. http://dx.doi.org/10.1142/s1793351x13400151.

Abstract:
Over recent years, a considerable amount of structured data has been published on the Web as Linked Open Data (LOD). Despite recent advances, consuming and using Linked Open Data within an organization is still a substantial challenge. Many of the LOD datasets are quite large, and despite progress in Resource Description Framework (RDF) data management, loading and querying them within a triple store is extremely time-consuming and resource-demanding. To overcome this consumption obstacle, we propose a process inspired by the classical Extract-Transform-Load (ETL) paradigm. In this article, we focus particularly on the selection and extraction steps of this process. We devise a fragment of the SPARQL Protocol and RDF Query Language (SPARQL), dubbed SliceSPARQL, which enables the selection of well-defined slices of datasets fulfilling typical information needs. SliceSPARQL supports graph patterns for which each connected subgraph pattern involves a maximum of one variable or Internationalized Resource Identifier (IRI) in its join conditions. This restriction guarantees efficient processing of the query against a sequential dataset dump stream. Furthermore, we evaluate our slicing approach with three different optimization strategies. Results show that dataset slices can be generated an order of magnitude faster than by the conventional approach of loading the whole dataset into a triple store.
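The restriction on join conditions is what allows a slice to be computed in one pass over a sequential dump rather than in a triple store. The Python sketch below shows the streaming idea in its simplest form, filtering an N-Triples dump by a fixed subject IRI (SliceSPARQL itself evaluates full graph patterns; the file name and IRI are hypothetical):

    # Single-pass slice extraction over an N-Triples dump: keep every
    # triple whose subject matches a fixed IRI.
    TARGET = "<http://dbpedia.org/resource/Berlin>"

    def slice_stream(path, target_subject):
        with open(path, encoding="utf-8") as dump:
            for line in dump:  # N-Triples: one triple per line
                if line.startswith(target_subject):
                    yield line.rstrip("\n")

    # for triple in slice_stream("dataset.nt", TARGET):
    #     print(triple)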
4

Hietanen, E., L. Lehto, and P. Latvala. "Providing Geographic Datasets as Linked Data in SDI." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B2 (June 8, 2016): 583–86. http://dx.doi.org/10.5194/isprs-archives-xli-b2-583-2016.

Abstract:
In this study, a prototype service that provides data from a Web Feature Service (WFS) as linked data is implemented. First, persistent and unique Uniform Resource Identifiers (URIs) are created for all spatial objects in the dataset. The objects are available from those URIs in the Resource Description Framework (RDF) data format. Next, a Web Ontology Language (OWL) ontology is created to describe the dataset's information content using the Open Geospatial Consortium's (OGC) GeoSPARQL vocabulary. The existing data model is modified to take the linked data principles into account. The implemented service produces an HTTP response dynamically. The data for the response is first fetched from the existing WFS, and the Geography Markup Language (GML) output of the WFS is then transformed on the fly into the RDF format. Content negotiation is used to serve the data in different RDF serialization formats. This solution facilitates the use of a dataset in different applications without replicating the whole dataset. In addition, individual spatial objects in the dataset can be referred to with URIs, and the needed information content of the objects can easily be extracted from the RDF serializations available from those URIs.

A solution for linking data objects to the dataset URI is also introduced using the Vocabulary of Interlinked Datasets (VoID). The dataset is divided into subsets, and each subset is given its own persistent and unique URI. This enables the whole dataset to be explored with a web browser and all individual objects to be indexed by search engines.
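Content negotiation here means that a single object URI serves several RDF serializations depending on the HTTP Accept header. A small illustration with Python's requests library (the object URI is hypothetical):

    # Request the same linked-data URI twice with different Accept headers.
    import requests

    uri = "http://example.org/feature/12345"  # hypothetical spatial object URI
    for accept in ("text/turtle", "application/rdf+xml"):
        response = requests.get(uri, headers={"Accept": accept}, timeout=10)
        print(accept, "->", response.headers.get("Content-Type"),
              len(response.text), "bytes")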
5

Hietanen, E., L. Lehto, and P. Latvala. "Providing Geographic Datasets as Linked Data in SDI." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLI-B2 (June 8, 2016): 583–86. http://dx.doi.org/10.5194/isprsarchives-xli-b2-583-2016.

Abstract:
In this study, a prototype service that provides data from a Web Feature Service (WFS) as linked data is implemented. First, persistent and unique Uniform Resource Identifiers (URIs) are created for all spatial objects in the dataset. The objects are available from those URIs in the Resource Description Framework (RDF) data format. Next, a Web Ontology Language (OWL) ontology is created to describe the dataset's information content using the Open Geospatial Consortium's (OGC) GeoSPARQL vocabulary. The existing data model is modified to take the linked data principles into account. The implemented service produces an HTTP response dynamically. The data for the response is first fetched from the existing WFS, and the Geography Markup Language (GML) output of the WFS is then transformed on the fly into the RDF format. Content negotiation is used to serve the data in different RDF serialization formats. This solution facilitates the use of a dataset in different applications without replicating the whole dataset. In addition, individual spatial objects in the dataset can be referred to with URIs, and the needed information content of the objects can easily be extracted from the RDF serializations available from those URIs.

A solution for linking data objects to the dataset URI is also introduced using the Vocabulary of Interlinked Datasets (VoID). The dataset is divided into subsets, and each subset is given its own persistent and unique URI. This enables the whole dataset to be explored with a web browser and all individual objects to be indexed by search engines.
6

Cheng, Long, and Spyros Kotoulas. "Scale-Out Processing of Large RDF Datasets." IEEE Transactions on Big Data 1, no. 4 (December 1, 2015): 138–50. http://dx.doi.org/10.1109/tbdata.2015.2505719.

7

Harbi, Razen, Ibrahim Abdelaziz, Panos Kalnis, and Nikos Mamoulis. "Evaluating SPARQL queries on massive RDF datasets." Proceedings of the VLDB Endowment 8, no. 12 (August 2015): 1848–51. http://dx.doi.org/10.14778/2824032.2824083.

8

Gu, Jinguang, Hao Dong, Zhao Liu, and Fangfang Xu. "Distributed Top-K Join Queries Optimizing for RDF Datasets." International Journal of Web Services Research 14, no. 3 (July 2017): 67–83. http://dx.doi.org/10.4018/ijwsr.2017070105.

Abstract:
In recent years, the scale of RDF datasets has been increasing rapidly, and query research on RDF datasets in the traditional centralized environment can no longer meet the growing demands of the data query field, especially for top-k queries. Based on the Spark distributed computing system and the HBase distributed storage system, a novel method is proposed for top-k queries. A top-k query plan, STA (Spark Threshold Algorithm), is proposed to reduce the join operations over RDF data. Furthermore, a better algorithm, SSJA (Spark Simple Join Algorithm), is presented to reduce the sorting-related operations on intermediate data. A cache mechanism is also proposed to speed up the SSJA algorithm. The experimental results show that the SSJA algorithm performs better than the STA algorithm in terms of cost and applicability, and that the cache mechanism can significantly improve SSJA's performance.
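STA adapts the classic threshold algorithm (TA) for top-k aggregation to Spark. A minimal single-machine TA sketch with invented scores (the RDF join and the distribution across workers are omitted):

    # Fagin-style threshold algorithm: scan two score lists in sorted order,
    # maintain the best k aggregates, and stop once no unseen key can win.
    import heapq

    list_a = {"x": 0.9, "y": 0.8, "z": 0.3}  # key -> score (random access)
    list_b = {"y": 0.7, "x": 0.5, "z": 0.6}
    sorted_a = sorted(list_a.items(), key=lambda kv: -kv[1])  # sorted access
    sorted_b = sorted(list_b.items(), key=lambda kv: -kv[1])

    def top_k(k):
        best, seen = [], set()
        for (ka, sa), (kb, sb) in zip(sorted_a, sorted_b):
            for key in (ka, kb):
                if key not in seen:
                    seen.add(key)
                    heapq.heappush(best, (list_a.get(key, 0) + list_b.get(key, 0), key))
                    if len(best) > k:
                        heapq.heappop(best)
            if len(best) == k and best[0][0] >= sa + sb:
                break  # threshold reached: no unseen key can beat the k-th score
        return sorted(best, reverse=True)

    print(top_k(2))  # [(1.5, 'y'), (1.4, 'x')]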
9

Rakhmawati, Nur Aini, and Lutfi Nur Fadzilah. "Dataset Characteristics Identification for Federated SPARQL Query." Scientific Journal of Informatics 6, no. 1 (May 24, 2019): 23–33. http://dx.doi.org/10.15294/sji.v6i1.17258.

Abstract:
Nowadays, the amount of data published in the RDF format is increasing. Federated SPARQL query engines that can query multiple distributed SPARQL endpoints have been developed recently. Federated query engines usually differ in performance from one another. One of the factors that affects the performance of a query engine is the characteristics of the accessed RDF dataset, such as the number of triples, classes, properties, subjects, entities, and objects, and the spreading factor of the dataset. The aim of this work is to identify the characteristics of RDF datasets and create a query set for evaluating a federated engine. The study was conducted by identifying 16 datasets that are used by ten research papers in the Linked Data area.
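Most of the listed characteristics can be computed directly with SPARQL aggregate queries. A sketch with Python's rdflib over a tiny in-memory graph (against a real dataset the same queries would be sent to its SPARQL endpoint):

    from rdflib import Graph

    g = Graph()
    g.parse(data="""
        @prefix ex: <http://example.org/> .
        ex:alice a ex:Person ; ex:knows ex:bob .
        ex:bob   a ex:Person .
    """, format="turtle")

    stats = {
        "triples":    "SELECT (COUNT(*) AS ?n) WHERE { ?s ?p ?o }",
        "properties": "SELECT (COUNT(DISTINCT ?p) AS ?n) WHERE { ?s ?p ?o }",
        "subjects":   "SELECT (COUNT(DISTINCT ?s) AS ?n) WHERE { ?s ?p ?o }",
        "classes":    "SELECT (COUNT(DISTINCT ?c) AS ?n) WHERE { ?s a ?c }",
    }
    for name, query in stats.items():
        print(name, "=", next(iter(g.query(query)))[0])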
10

McGlothlin, James, and Latifur Khan. "Materializing Inferred and Uncertain Knowledge in RDF Datasets." Proceedings of the AAAI Conference on Artificial Intelligence 24, no. 1 (July 5, 2010): 1951–52. http://dx.doi.org/10.1609/aaai.v24i1.7786.

Abstract:
There is a growing need for efficient and scalable semantic web queries that handle inference. There is also a growing interest in representing uncertainty in semantic web knowledge bases. In this paper, we present a bit vector schema specifically designed for RDF (Resource Description Framework) datasets. We propose a system for materializing and storing inferred knowledge using this schema. We show experimental results that demonstrate that our solution drastically improves the performance of inference queries. We also propose a solution for materializing uncertain information and probabilities using multiple bit vectors and thresholds.
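To make the bit vector idea concrete: if each (predicate, object) pair is stored as a vector with one bit per subject, materializing an inferred class membership reduces to a bitwise OR. A toy Python sketch of this layout (a simplification assumed for illustration, not the paper's exact schema):

    # Toy bit-vector store: one bit per subject for each (rdf:type, Class).
    subjects = ["alice", "bob", "carol"]  # subject ID = position in this list

    def vector(members):
        bits = 0
        for s in members:
            bits |= 1 << subjects.index(s)
        return bits

    type_vec = {
        "GradStudent": vector(["alice"]),
        "Professor":   vector(["bob"]),
    }

    # Materialize rdfs:subClassOf inference: Person = GradStudent OR Professor.
    type_vec["Person"] = type_vec["GradStudent"] | type_vec["Professor"]

    members = [s for i, s in enumerate(subjects) if type_vec["Person"] >> i & 1]
    print(members)  # ['alice', 'bob']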
11

Fernández, Javier D., Sabrina Kirrane, Axel Polleres, and Simon Steyskal. "HDTcrypt: Compression and encryption of RDF datasets." Semantic Web 11, no. 2 (February 5, 2020): 337–59. http://dx.doi.org/10.3233/sw-180335.

12

Liu, Daxin, Gong Cheng, Qingxia Liu, and Yuzhong Qu. "Fast and Practical Snippet Generation for RDF Datasets." ACM Transactions on the Web 13, no. 4 (December 20, 2019): 1–38. http://dx.doi.org/10.1145/3365575.

13

Paris, Pierre-Henri, Fayçal Hamdi, and Samira Si-said Cherfi. "Interlinking RDF-based datasets: A structure-based approach." Procedia Computer Science 159 (2019): 162–71. http://dx.doi.org/10.1016/j.procs.2019.09.171.

14

Tzitzikas, Yannis, Nikos Manolis, and Panagiotis Papadakos. "Faceted exploration of RDF/S datasets: a survey." Journal of Intelligent Information Systems 48, no. 2 (June 15, 2016): 329–64. http://dx.doi.org/10.1007/s10844-016-0413-8.

15

Mehmood, Qaiser, Muhammad Saleem, Ratnesh Sahay, Axel-Cyrille Ngonga Ngomo, and Mathieu D'Aquin. "QPPDs: Querying Property Paths Over Distributed RDF Datasets." IEEE Access 7 (2019): 101031–45. http://dx.doi.org/10.1109/access.2019.2930416.

16

Elzein, Nahla Mohammed, Mazlina Abdul Majid, Ibrahim Abaker Targio Hashem, Ashraf Osman Ibrahim, Anas W. Abulfaraj, and Faisal Binzagr. "JQPro: Join Query Processing in a Distributed System for Big RDF Data Using the Hash-Merge Join Technique." Mathematics 11, no. 5 (March 6, 2023): 1275. http://dx.doi.org/10.3390/math11051275.

Abstract:
In the last decade, the volume of semantic data has increased exponentially, with the number of Resource Description Framework (RDF) datasets exceeding trillions of triples in RDF repositories. Hence, the size of RDF datasets continues to grow, and with the increasing number of RDF triples, complex multiple RDF queries are becoming a significant demand. Sometimes, such complex queries produce many common sub-expressions in a single query or over multiple queries running as a batch. It is also difficult to minimize the number of RDF queries and the processing time for large amounts of related data in a typical distributed environment. To address this complication, we introduce a join query processing model for big RDF data, called JQPro. Adopting a MapReduce framework in JQPro, we developed three new algorithms for join query processing of RDF data: hash-join, sort-merge, and enhanced MapReduce-join. Based on the experiments conducted, the JQPro model outperformed the two popular algorithms gStore and RDF-3X with respect to average execution time. Furthermore, the JQPro model was also tested against RDF-3X, RDFox, and PARJs using the LUBM benchmark and showed better performance in comparison with the other models. In conclusion, the findings showed that JQPro achieved an 87.77% improvement in terms of execution time. Hence, in comparison with the selected models, JQPro performs better.
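The hash-join building block can be illustrated on two triple-pattern binding lists: build a hash table on the shared join variable, then probe it with the other side. A minimal in-memory Python sketch (the MapReduce distribution and merge phase of JQPro are omitted):

    # Hash join of two binding lists on the shared variable ?s.
    left  = [{"s": "ex:alice", "name": "Alice"}, {"s": "ex:bob", "name": "Bob"}]
    right = [{"s": "ex:alice", "age": 30}, {"s": "ex:carol", "age": 41}]

    def hash_join(build, probe, var):
        table = {}
        for row in build:                        # build phase
            table.setdefault(row[var], []).append(row)
        for row in probe:                        # probe phase
            for match in table.get(row[var], []):
                yield {**match, **row}           # merged variable bindings

    print(list(hash_join(left, right, "s")))
    # [{'s': 'ex:alice', 'name': 'Alice', 'age': 30}]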
17

Saleh Aloufi, Khalid. "Generating RDF resources from web open data portals." Indonesian Journal of Electrical Engineering and Computer Science 16, no. 3 (December 1, 2019): 1521. http://dx.doi.org/10.11591/ijeecs.v16.i3.pp1521-1529.

Abstract:
Open data are available from various private and public institutions in different resource formats. A great number of open datasets are already published using open data portals, where datasets and resources are mainly presented in tabular or sheet formats. However, such formats present barriers to application development and web standards. One of the recommended web standards for semantic web applications is RDF. Various research efforts have focused on presenting open data in RDF format. However, no framework has transformed tabular open data into RDF while considering the HTML tags and properties of the resources and datasets. Therefore, a methodology is required to generate RDF resources from this type of open data resource. This methodology applies data transformations of open data from a tabular format to RDF files for the Saudi Open Data Portal. The methodology successfully transforms open data resources in sheet format into RDF resources. Recommendations and future work are given to enhance the development of open data.
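The core step, turning tabular rows into RDF resources, looks roughly like the rdflib sketch below; the column names and vocabulary are hypothetical, not those of the Saudi Open Data Portal:

    import csv, io
    from rdflib import Graph, Literal, Namespace, RDF

    EX = Namespace("http://example.org/")
    rows = csv.DictReader(io.StringIO("id,name\n1,Hospital A\n2,Hospital B\n"))

    g = Graph()
    g.bind("ex", EX)
    for row in rows:
        subject = EX["facility/" + row["id"]]     # one resource per row
        g.add((subject, RDF.type, EX.Facility))
        g.add((subject, EX.name, Literal(row["name"])))

    print(g.serialize(format="turtle"))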
18

Pan, Zhengyu, Tao Zhu, Hong Liu, and Huansheng Ning. "A survey of RDF management technologies and benchmark datasets." Journal of Ambient Intelligence and Humanized Computing 9, no. 5 (June 5, 2018): 1693–704. http://dx.doi.org/10.1007/s12652-018-0876-2.

19

Grau, Bernardo Cuenca, and Egor V. Kostylev. "Logical Foundations of Linked Data Anonymisation." Journal of Artificial Intelligence Research 64 (February 16, 2019): 253–314. http://dx.doi.org/10.1613/jair.1.11355.

Abstract:
The widespread adoption of the Linked Data paradigm has been driven by the increasing demand for information exchange between organisations, as well as by regulations in domains such as health care and governance that require certain data to be published. In this setting, sensitive information is at high risk of disclosure since published data can often be seamlessly linked with arbitrary external data sources. In this paper we lay the logical foundations of anonymisation in the context of Linked Data. We consider anonymisations of RDF graphs (and, more generally, relational datasets with labelled nulls) and define notions of policy-compliant and linkage-safe anonymisations. Policy compliance ensures that an anonymised dataset does not reveal any sensitive information as specified by a policy query. Linkage safety ensures that an anonymised dataset remains compliant even if it is linked to (possibly unknown) external datasets available on the Web, thus providing provable protection guarantees against data linkage attacks. We establish the computational complexity of the underpinning decision problems both under the open-world semantics inherent to RDF and under the assumption that an attacker has complete, closed-world knowledge over some parts of the original data.
20

McGlothlin, James, and Latifur Khan. "Materializing and Persisting Inferred and Uncertain Knowledge in RDF Datasets." Proceedings of the AAAI Conference on Artificial Intelligence 24, no. 1 (July 5, 2010): 1405–12. http://dx.doi.org/10.1609/aaai.v24i1.7522.

Abstract:
As the semantic web grows in popularity and enters the mainstream of computer technology, RDF (Resource Description Framework) datasets are becoming larger and more complex. Advanced semantic web ontologies, especially in medicine and science, are developing. As more complex ontologies are developed, there is a growing need for efficient queries that handle inference. In areas such as research, it is vital to be able to perform queries that retrieve not just facts but also inferred knowledge and uncertain information. OWL (Web Ontology Language) defines rules that govern provable inference in semantic web datasets. In this paper, we detail a database schema using bit vectors that is designed specifically for RDF datasets. We introduce a framework for materializing and storing inferred triples. Our bit vector schema enables storage of inferred knowledge without a query performance penalty. Inference queries are simplified and performance is improved. Our evaluation results demonstrate that our inference solution is more scalable and efficient than the current state-of-the-art. There are also standards being developed for representing probabilistic reasoning within OWL ontologies. We specify a framework for materializing uncertain information and probabilities using these ontologies. We define a multiple vector schema for representing probabilities and classifying uncertain knowledge using thresholds. This solution increases the breadth of information that can be efficiently retrieved.
21

Fiorelli, Manuel, Maria Teresa Pazienza, Armando Stellato, and Andrea Turbati. "Change management and validation for collaborative editing of RDF datasets." International Journal of Metadata, Semantics and Ontologies 12, no. 2/3 (2017): 142. http://dx.doi.org/10.1504/ijmso.2017.090783.

22

Turbati, Andrea, Armando Stellato, Manuel Fiorelli, and Maria Teresa Pazienza. "Change management and validation for collaborative editing of RDF datasets." International Journal of Metadata, Semantics and Ontologies 12, no. 2/3 (2017): 142. http://dx.doi.org/10.1504/ijmso.2017.10011837.

23

Nurmikko-Fuller, Terhi, Daniel Bangert, Alan Dix, David Weigl, and Kevin Page. "Building Prototypes Aggregating Musicological Datasets on the Semantic Web." Bibliothek Forschung und Praxis 42, no. 2 (June 1, 2018): 206–21. http://dx.doi.org/10.1515/bfp-2018-0025.

Abstract:
Semantic Web technologies such as RDF, OWL, and SPARQL can be successfully used to bridge complementary musicological information. In this paper, we describe, compare, and evaluate the datasets and workflows used to create two such aggregator projects: In Collaboration with In Concert, and JazzCats, both of which bring together a cluster of smaller projects containing concert and performance metadata.
24

Gomathi, R., and D. Sharmila. "Application of Harmony Search Algorithm to Optimize SPARQL Protocol and Resource Description Framework Query Language Queries in Healthcare Data." Journal of Medical Imaging and Health Informatics 11, no. 11 (November 1, 2021): 2862–67. http://dx.doi.org/10.1166/jmihi.2021.3877.

Abstract:
With the rapid development of the Internet, the Semantic Web has become a platform for intelligent agents, particularly in the healthcare sector. In the past few years, the Semantic Web data field in the healthcare industry has widened. With this growth in the quantity of Semantic Web data in the health industry, several challenges remain to be resolved. One such challenge is to provide an efficient querying mechanism that can handle large amounts of Semantic Web data. Among the many query languages, SPARQL (SPARQL Protocol and RDF Query Language) is the most popular. Each of these query languages has its own design strategy, and research has identified that it is difficult to handle and query large quantities of RDF data efficiently using these languages. In the proposed approach, the harmony search metaheuristic algorithm is applied to optimize SPARQL queries over healthcare data. The application of the harmony search algorithm is evaluated with large Resource Description Framework (RDF) datasets and SPARQL queries. To assess performance, the algorithm's implementation is compared to existing nature-inspired algorithms. The performance analysis shows that the proposed application performs well for large RDF datasets.
25

Mountantonakis, Michalis, and Yannis Tzitzikas. "Linking Entities from Text to Hundreds of RDF Datasets for Enabling Large Scale Entity Enrichment." Knowledge 2, no. 1 (December 24, 2021): 1–25. http://dx.doi.org/10.3390/knowledge2010001.

Abstract:
There is a marked increase in approaches that receive a text as input and perform named entity recognition (or extraction) in order to link the recognized entities of the given text to RDF knowledge bases (or datasets). In this way, it is feasible to retrieve more information for these entities, which can be of primary importance for several tasks, e.g., facilitating manual annotation, hyperlink creation, content enrichment, improving data veracity, and others. However, current approaches link the extracted entities to only one or a few knowledge bases; therefore, it is not feasible to retrieve the URIs and facts of each recognized entity from multiple datasets or to discover the most relevant datasets for one or more extracted entities. For enabling this functionality, we introduce a research prototype, called LODsyndesisIE, which exploits three widely used Named Entity Recognition and Disambiguation tools (DBpedia Spotlight, WAT, and Stanford CoreNLP) for recognizing the entities of a given text. Afterwards, it links these entities to the LODsyndesis knowledge base, which offers data enrichment and discovery services for millions of entities over hundreds of RDF datasets. We introduce all the steps of LODsyndesisIE, and we provide information on how to exploit its services through its online application and its REST API. Concerning the evaluation, we use three evaluation collections of texts: (i) for comparing the effectiveness of combining different Named Entity Recognition tools, (ii) for measuring the gain in terms of enrichment by linking the extracted entities to LODsyndesis instead of a single or a few RDF datasets, and (iii) for evaluating the efficiency of LODsyndesisIE.
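Of the three tools combined by LODsyndesisIE, DBpedia Spotlight is reachable as a public REST API. A minimal Python call (the endpoint URL, parameters, and response fields follow Spotlight's public documentation and may change):

    import requests

    response = requests.get(
        "https://api.dbpedia-spotlight.org/en/annotate",
        params={"text": "Berlin is the capital of Germany.", "confidence": 0.5},
        headers={"Accept": "application/json"},
        timeout=15,
    )
    for resource in response.json().get("Resources", []):
        print(resource["@surfaceForm"], "->", resource["@URI"])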
26

Ahmad Khan, Aatif, and Sanjay Kumar Malik. "Assessing Large-Scale, Cross-Domain Knowledge Bases for Semantic Search." Mehran University Research Journal of Engineering and Technology 39, no. 3 (July 1, 2020): 595–602. http://dx.doi.org/10.22581/muet1982.2003.14.

Abstract:
Semantic search refers to a set of approaches dealing with the usage of Semantic Web technologies for information retrieval, in order to make the process machine-understandable and fetch precise results. Knowledge Bases (KB) act as the backbone for semantic search approaches, providing machine-interpretable information for query processing and retrieval of results. These KBs include Resource Description Framework (RDF) datasets and populated ontologies. In this paper, an assessment of the largest cross-domain KBs is presented; these are exploited in large-scale semantic search and are freely available on the Linked Open Data Cloud. Analysis of these datasets is a prerequisite for modeling effective semantic search approaches because of their suitability for particular applications. Only large-scale, cross-domain datasets are considered, with sizes of more than 10 million RDF triples. The sizes of the datasets in triple counts are surveyed, along with the triple data format(s) they support, which is quite significant for developing effective semantic search models.
27

Pirró, Giuseppe. "REWOrD: Semantic Relatedness in the Web of Data." Proceedings of the AAAI Conference on Artificial Intelligence 26, no. 1 (September 20, 2021): 129–35. http://dx.doi.org/10.1609/aaai.v26i1.8107.

Abstract:
This paper presents REWOrD, an approach to compute semantic relatedness between entities in the Web of Data representing real-world concepts. REWOrD exploits the graph nature of RDF data and the SPARQL query language to access this data. Through simple queries, REWOrD constructs weighted vectors keeping the informativeness of the RDF predicates used to make statements about the entities being compared. The most informative path is also considered to further refine informativeness. Relatedness is then computed as the cosine of the weighted vectors. Differently from previous approaches based on Wikipedia, REWOrD does not require any preprocessing or custom data transformation. Indeed, it can leverage any RDF knowledge base as a source of background knowledge. We evaluated REWOrD in different settings by using a new dataset of real-world entities and investigated its flexibility. As compared to related work on classical datasets, REWOrD obtains comparable results while, on one side, it avoids the burden of preprocessing and data transformation and, on the other side, it provides more flexibility and applicability in a broad range of domains.
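The final relatedness computation is a cosine over predicate-weight vectors. A toy Python sketch with invented informativeness weights (REWOrD derives the actual weights via SPARQL queries against the knowledge base):

    import math

    def cosine(u, v):
        dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in set(u) | set(v))
        norm = (math.sqrt(sum(x * x for x in u.values()))
                * math.sqrt(sum(x * x for x in v.values())))
        return dot / norm if norm else 0.0

    # predicate IRI -> informativeness weight (values invented)
    barcelona = {"dbo:country": 0.4, "dbo:populationTotal": 0.2, "dbo:leader": 0.1}
    madrid    = {"dbo:country": 0.4, "dbo:populationTotal": 0.25}

    print(round(cosine(barcelona, madrid), 3))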
28

Yun, Hongyan, Ying He, Li Lin, and Xiaohong Wang. "Research on Multi-Source Data Integration Based on Ontology and Karma Modeling." International Journal of Intelligent Information Technologies 15, no. 2 (April 2019): 69–87. http://dx.doi.org/10.4018/ijiit.2019040105.

Abstract:
The purpose of data integration is to integrate multi-source heterogeneous data. Ontologies address the semantic description of multi-source heterogeneous data. The authors propose a practical approach based on ontology modeling and the information toolkit Karma for fast data integration, and demonstrate an application example in detail. The Armed Conflict Location & Event Data Project (ACLED) is a publicly available conflict event dataset designed for disaggregated conflict analysis and crisis mapping. The authors analyzed the ACLED dataset and domain knowledge to build an Armed Conflict Event ontology, then constructed Karma models to integrate ACLED datasets and publish RDF data, using SPARQL queries to check the correctness of the published RDF data. The authors designed and developed an ACLED Query System based on technologies such as the Jena API, CanvasJS, and the Baidu API, which provides convenience for governments and researchers analyzing regional conflict events and crisis early warning, and which verifies the validity of the constructed ontology and the correctness of the Karma modeling.
29

Hilal, Median, Christoph G. Schuetz, and Michael Schrefl. "Using superimposed multidimensional schemas and OLAP patterns for RDF data analysis." Open Computer Science 8, no. 1 (July 1, 2018): 18–37. http://dx.doi.org/10.1515/comp-2018-0003.

Abstract:
The foundations for traditional data analysis are Online Analytical Processing (OLAP) systems that operate on multidimensional (MD) data. The Resource Description Framework (RDF) serves as the foundation for the publication of a growing amount of semantic web data still largely untapped by companies for data analysis. Most RDF data sources, however, do not correspond to the MD modeling paradigm and, as a consequence, elude traditional OLAP. The complexity of RDF data in terms of structure, semantics, and query languages renders RDF data analysis challenging for a typical analyst not familiar with the underlying data model or the SPARQL query language. Hence, conducting RDF data analysis is not a straightforward task. We propose an approach for the definition of superimposed MD schemas over arbitrary RDF datasets and show how to represent the superimposed MD schemas using well-known semantic web technologies. On top of that, we introduce OLAP patterns for RDF data analysis, which are recurring, domain-independent elements of data analysis. Analysts may compose queries by instantiating a pattern using only the MD concepts and business terms. Upon pattern instantiation, the corresponding SPARQL query over the source data can be automatically generated, sparing analysts from technical details and fostering self-service capabilities.
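A pattern instantiation of this kind can be as simple as filling a SPARQL template with the chosen multidimensional concepts. A hypothetical sketch (the paper's pattern catalogue and superimposed schemas are considerably richer):

    # Instantiate an aggregation pattern: group a measure by one dimension.
    PATTERN = """
    SELECT ?{dim} (SUM(?{measure}) AS ?total)
    WHERE {{
      ?fact a <{fact_class}> ;
            <{dim_prop}> ?{dim} ;
            <{measure_prop}> ?{measure} .
    }}
    GROUP BY ?{dim}
    """

    query = PATTERN.format(
        dim="country", measure="sales",
        fact_class="http://example.org/Sale",
        dim_prop="http://example.org/country",
        measure_prop="http://example.org/amount",
    )
    print(query)  # ready to send to a SPARQL endpoint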
30

Izquierdo, Yenier T., Grettel M. García, Elisa Menendez, Luiz André P. P. Leme, Angelo Neves, Melissa Lemos, Anna Carolina Finamore, Carlos Oliveira, and Marco A. Casanova. "Keyword search over schema-less RDF datasets by SPARQL query compilation." Information Systems 102 (December 2021): 101814. http://dx.doi.org/10.1016/j.is.2021.101814.

31

Kaneiwa, Ken, and Yuuki Yamanaka. "Aggregation Path Search using Multiple Large RDF Datasets with Equivalence Relations." Transactions of the Japanese Society for Artificial Intelligence 38, no. 2 (March 1, 2023): D—M53_1–9. http://dx.doi.org/10.1527/tjsai.38-2_d-m53.

32

Dumontier, Michel, Alasdair J. G. Gray, M. Scott Marshall, Vladimir Alexiev, Peter Ansell, Gary Bader, Joachim Baran, et al. "The health care and life sciences community profile for dataset descriptions." PeerJ 4 (August 16, 2016): e2331. http://dx.doi.org/10.7717/peerj.2331.

Abstract:
Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. However, while there are many relevant vocabularies for the annotation of a dataset, none sufficiently captures all the necessary metadata. This prevents uniform indexing and querying of dataset repositories. Towards providing a practical guide for producing a high quality description of biomedical datasets, the W3C Semantic Web for Health Care and the Life Sciences Interest Group (HCLSIG) identified Resource Description Framework (RDF) vocabularies that could be used to specify common metadata elements and their value sets. The resulting guideline covers elements of description, identification, attribution, versioning, provenance, and content summarization. This guideline reuses existing vocabularies, and is intended to meet key functional requirements including indexing, discovery, exchange, query, and retrieval of datasets, thereby enabling the publication of FAIR data. The resulting metadata profile is generic and could be used by other domains with an interest in providing machine readable descriptions of versioned datasets.
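Because the profile reuses RDF vocabularies, a dataset description is itself a small RDF graph. A sketch of a few representative elements with rdflib (the dataset IRI is hypothetical, and the guideline specifies many more elements than shown):

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import DCTERMS, RDF

    VOID = Namespace("http://rdfs.org/ns/void#")
    ds = URIRef("http://example.org/dataset/example")  # hypothetical IRI

    g = Graph()
    g.bind("dcterms", DCTERMS)
    g.bind("void", VOID)
    g.add((ds, RDF.type, VOID.Dataset))
    g.add((ds, DCTERMS.title, Literal("Example Dataset", lang="en")))
    g.add((ds, DCTERMS.license,
           URIRef("https://creativecommons.org/licenses/by/4.0/")))
    g.add((ds, VOID.triples, Literal(1234567)))

    print(g.serialize(format="turtle"))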
33

Wang, Mingyan, Qingrong Huang, Nan Wu, and Ying Pan. "RDF Subgraph Matching by Means of Star Decomposition." Journal of Internet Technology 23, no. 7 (December 2022): 1613–21. http://dx.doi.org/10.53106/160792642022122307015.

Abstract:
With the continuous development of the network, the scale of RDF data is becoming larger and larger. In the face of large-scale RDF data processing, traditional database query methods can no longer meet the needs. Due to the limited characteristics of subgraph matching, most existing algorithms repeatedly traverse many subgraphs during the query process, resulting in a large number of intermediate result sets and low query efficiency. The core problem to be solved is how to match subgraphs efficiently. In order to improve the query efficiency of RDF subgraphs in massive RDF data graphs and avoid the repeated computation of some graphs during RDF subgraph queries, an RDF subgraph query algorithm based on star decomposition is proposed in this paper. The algorithm uses the graph structure to decompose RDF subgraphs into stars and uses a custom node cost model to calculate the query order of the star subgraphs. The decomposition reduces the amount of communication among subgraphs and lowers the communication cost of query processing. Moreover, utilizing the query order for RDF subgraph matching can effectively reduce the generation of intermediate result sets and accelerate subgraph matching. On this basis, the performance of the proposed algorithm and several other widely used algorithms is compared and analyzed on two different datasets. Experiments show that the proposed algorithm has advantages in database recreation, memory size, and execution efficiency.
34

Katayama, Toshiaki, Shuichi Kawashima, Gos Micklem, Shin Kawano, Jin-Dong Kim, Simon Kocbek, Shinobu Okamoto, et al. "BioHackathon series in 2013 and 2014: improvements of semantic interoperability in life science data and services." F1000Research 8 (September 23, 2019): 1677. http://dx.doi.org/10.12688/f1000research.18238.1.

Abstract:
Publishing databases in the Resource Description Framework (RDF) model is becoming widely accepted to maximize the syntactic and semantic interoperability of open data in life sciences. Here we report advancements made in the 6th and 7th annual BioHackathons which were held in Tokyo and Miyagi respectively. This review consists of two major sections covering: 1) improvement and utilization of RDF data in various domains of the life sciences and 2) meta-data about these RDF data, the resources that store them, and the service quality of SPARQL Protocol and RDF Query Language (SPARQL) endpoints. The first section describes how we developed RDF data, ontologies and tools in genomics, proteomics, metabolomics, glycomics and by literature text mining. The second section describes how we defined descriptions of datasets, the provenance of data, and quality assessment of services and service discovery. By enhancing the harmonization of these two layers of machine-readable data and knowledge, we improve the way community wide resources are developed and published. Moreover, we outline best practices for the future, and prepare ourselves for an exciting and unanticipatable variety of real world applications in coming years.
35

De Meester, Ben, Pieter Heyvaert, Dörthe Arndt, Anastasia Dimou, and Ruben Verborgh. "RDF graph validation using rule-based reasoning." Semantic Web 12, no. 1 (November 19, 2020): 117–42. http://dx.doi.org/10.3233/sw-200384.

Abstract:
The correct functioning of Semantic Web applications requires that given RDF graphs adhere to an expected shape. This shape depends on the RDF graph and the application’s supported entailments of that graph. During validation, RDF graphs are assessed against sets of constraints, and found violations help refining the RDF graphs. However, existing validation approaches cannot always explain the root causes of violations (inhibiting refinement), and cannot fully match the entailments supported during validation with those supported by the application. These approaches cannot accurately validate RDF graphs, or combine multiple systems, deteriorating the validator’s performance. In this paper, we present an alternative validation approach using rule-based reasoning, capable of fully customizing the used inferencing steps. We compare to existing approaches, and present a formal ground and practical implementation “Validatrr”, based on N3Logic and the EYE reasoner. Our approach – supporting an equivalent number of constraint types compared to the state of the art – better explains the root cause of the violations due to the reasoner’s generated logical proof, and returns an accurate number of violations due to the customizable inferencing rule set. Performance evaluation shows that Validatrr is performant for smaller datasets, and scales linearly w.r.t. the RDF graph size. The detailed root cause explanations can guide future validation report description specifications, and the fine-grained level of configuration can be employed to support different constraint languages. This foundation allows further research into handling recursion, validating RDF graphs based on their generation description, and providing automatic refinement suggestions.
36

Freitas, André, Edward Curry, João Gabriel Oliveira, and Seán O'Riain. "A Distributional Structured Semantic Space for Querying RDF Graph Data." International Journal of Semantic Computing 05, no. 04 (December 2011): 433–62. http://dx.doi.org/10.1142/s1793351x1100133x.

Abstract:
The vision of creating a Linked Data Web brings with it the challenge of allowing queries across highly heterogeneous and distributed datasets. In order to query Linked Data on the Web today, end users need to be aware of which datasets potentially contain the data and also which data model describes these datasets. The process of allowing users to expressively query relationships in RDF while abstracting them from the underlying data model represents a fundamental problem for Web-scale Linked Data consumption. This article introduces a distributional structured semantic space which enables data-model-independent natural language queries over RDF data. The core of the approach relies on the use of a distributional semantic model to address the level of semantic interpretation demanded to build the data-model-independent approach. The article analyzes the geometric aspects of the proposed space, providing its description as a distributional structured vector space, which is built upon the Generalized Vector Space Model (GVSM). The final semantic space proved to be flexible and precise under real-world query conditions, achieving mean reciprocal rank = 0.516, avg. precision = 0.482, and avg. recall = 0.491.
37

Vaisman, Alejandro, and Kevin Chentout. "Mapping Spatiotemporal Data to RDF: A SPARQL Endpoint for Brussels." ISPRS International Journal of Geo-Information 8, no. 8 (August 10, 2019): 353. http://dx.doi.org/10.3390/ijgi8080353.

Abstract:
This paper describes how a platform for publishing and querying linked open data for the Brussels Capital region in Belgium is built. Data are provided as relational tables or XML documents and are mapped into the RDF data model using R2RML, a standard language that allows defining customized mappings from relational databases to RDF datasets. In this work, data are spatiotemporal in nature; therefore, R2RML must be adapted to allow producing spatiotemporal Linked Open Data. Data generated in this way are used to populate a SPARQL endpoint, where queries are submitted and the result can be displayed on a map. This endpoint is implemented using Strabon, a spatiotemporal RDF triple store built by extending the RDF store Sesame. The first part of the paper describes how R2RML is adapted to allow producing spatial RDF data and to support XML data sources. These techniques are then used to map data about cultural events and public transport in Brussels into RDF. Spatial data are stored in the form of stRDF triples, the format required by Strabon. In addition, the endpoint is enriched with external data obtained from the Linked Open Data Cloud, from sites like DBpedia, Geonames, and LinkedGeoData, to provide context for analysis. The second part of the paper shows, through a comprehensive set of stSPARQL (the spatial extension to SPARQL) queries, how the endpoint can be exploited.
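R2RML mappings are themselves RDF documents. The minimal mapping below (hypothetical table, IRI template, and vocabulary) shows the core R2RML terms that the paper extends for spatiotemporal data; it is parsed with rdflib simply to demonstrate that the mapping is ordinary RDF:

    from rdflib import Graph

    r2rml = """
    @prefix rr: <http://www.w3.org/ns/r2rml#> .
    @prefix ex: <http://example.org/> .

    <#EventMap>
      rr:logicalTable [ rr:tableName "events" ] ;
      rr:subjectMap [
        rr:template "http://example.org/event/{id}" ;
        rr:class ex:Event
      ] ;
      rr:predicateObjectMap [
        rr:predicate ex:name ;
        rr:objectMap [ rr:column "title" ]
      ] .
    """

    g = Graph()
    g.parse(data=r2rml, format="turtle")
    print(len(g), "mapping triples parsed")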
38

Binding, Ceri, Douglas Tudhope, and Andreas Vlachidis. "A study of semantic integration across archaeological data and reports in different languages." Journal of Information Science 45, no. 3 (July 31, 2018): 364–86. http://dx.doi.org/10.1177/0165551518789874.

Abstract:
This study investigates the semantic integration of data extracted from archaeological datasets with information extracted via natural language processing (NLP) across different languages. The investigation follows a broad theme relating to wooden objects and their dating via dendrochronological techniques, including types of wooden material, samples taken and wooden objects including shipwrecks. The outcomes are an integrated RDF dataset coupled with an associated interactive research demonstrator query builder application. The semantic framework combines the CIDOC Conceptual Reference Model (CRM) with the Getty Art and Architecture Thesaurus (AAT). The NLP, data cleansing and integration methods are described in detail together with illustrative scenarios from the web application Demonstrator. Reflections and recommendations from the study are discussed. The Demonstrator is a novel SPARQL web application, with CRM/AAT-based data integration. Functionality includes the combination of free text and semantic search with browsing on semantic links, hierarchical and associative relationship thesaurus query expansion. Queries concern wooden objects (e.g. samples of beech wood keels), optionally from a given date range, with automatic expansion over AAT hierarchies of wood types and specialised associative relationships. Following a ‘mapping pattern’ approach (via the STELETO tool) ensured validity and consistency of all RDF output. The user is shielded from the complexity of the underlying semantic framework by a query builder user interface. The study demonstrates the feasibility of connecting information extracted from datasets and grey literature reports in different languages and semantic cross-searching of the integrated information. The semantic linking of textual reports and datasets opens new possibilities for integrative research across diverse resources.
39

Nikas, Christos, Giorgos Kadilierakis, Pavlos Fafalios, and Yannis Tzitzikas. "Keyword Search over RDF: Is a Single Perspective Enough?" Big Data and Cognitive Computing 4, no. 3 (August 27, 2020): 22. http://dx.doi.org/10.3390/bdcc4030022.

Abstract:
Since the task of accessing RDF datasets through structured query languages like SPARQL is rather demanding for ordinary users, various approaches attempt to exploit the simpler and widely used keyword-based search paradigm. This task is challenging, however, since there is no clear unit of retrieval and presentation, the user information needs are in most cases not clearly formulated, the underlying RDF datasets are in most cases incomplete, and no single presentation method is appropriate for all kinds of information needs. As a means to alleviate these problems, in this paper we investigate an interaction approach that offers multiple presentation methods of the search results (multiple perspectives), allowing the user to easily switch between these perspectives and thus exploit the added value that each perspective offers. We focus on a set of fundamental perspectives, discuss the benefits of each one, compare this approach with related existing systems, and report the results of a task-based evaluation with users. The key finding of the task-based evaluation is that users not familiar with RDF (a) managed to complete the information-seeking tasks (with performance very close to that of the experienced users), and (b) rated the approach positively.
40

Hor, A.-H., A. Jadidi, and G. Sohn. "BIM-GIS Integrated Geospatial Information Model Using Semantic Web and RDF Graphs." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences III-4 (June 3, 2016): 73–79. http://dx.doi.org/10.5194/isprsannals-iii-4-73-2016.

Abstract:
In recent years, 3D virtual indoor/outdoor urban modelling has become a key spatial information framework for many civil and engineering applications such as evacuation planning, emergency and facility management. To accomplish such sophisticated decision tasks, there is a large demand for multi-scale and multi-sourced 3D urban models. Currently, Building Information Models (BIM) and Geographical Information Systems (GIS) are broadly used as the modelling sources. However, data sharing and information exchange between the two modelling domains remain a huge challenge, as syntactic or semantic approaches alone do not fully support exchanging the rich semantic and geometric information of BIM into GIS, or vice versa. This paper proposes a novel approach for integrating BIM and GIS using semantic web technologies and Resource Description Framework (RDF) graphs. The novelty of the proposed solution comes from the benefits of integrating BIM and GIS technologies into one unified model, the Integrated Geospatial Information Model (IGIM). The proposed approach consists of three main modules: construction of BIM-RDF and GIS-RDF graphs, integration of the two RDF graphs, and querying of information over the IGIM-RDF graph using SPARQL. The IGIM answers queries over both the BIM and GIS RDF graphs, producing a semantically integrated model with entities representing both BIM classes and GIS feature objects with respect to the target client application. The linkage between BIM-RDF and GIS-RDF is achieved through SPARQL endpoints and defined by a query using a set of datasets and entity classes with complementary properties, relationships, and geometries. To validate the proposed approach and its performance, a case study was also tested using the IGIM system design.
41

Hor, A.-H., A. Jadidi, and G. Sohn. "BIM-GIS Integrated Geospatial Information Model Using Semantic Web and RDF Graphs." ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences III-4 (June 3, 2016): 73–79. http://dx.doi.org/10.5194/isprs-annals-iii-4-73-2016.

Abstract:
In recent years, 3D virtual indoor/outdoor urban modelling has become a key spatial information framework for many civil and engineering applications such as evacuation planning, emergency and facility management. To accomplish such sophisticated decision tasks, there is a large demand for multi-scale and multi-sourced 3D urban models. Currently, Building Information Models (BIM) and Geographical Information Systems (GIS) are broadly used as the modelling sources. However, data sharing and information exchange between the two modelling domains remain a huge challenge, as syntactic or semantic approaches alone do not fully support exchanging the rich semantic and geometric information of BIM into GIS, or vice versa. This paper proposes a novel approach for integrating BIM and GIS using semantic web technologies and Resource Description Framework (RDF) graphs. The novelty of the proposed solution comes from the benefits of integrating BIM and GIS technologies into one unified model, the Integrated Geospatial Information Model (IGIM). The proposed approach consists of three main modules: construction of BIM-RDF and GIS-RDF graphs, integration of the two RDF graphs, and querying of information over the IGIM-RDF graph using SPARQL. The IGIM answers queries over both the BIM and GIS RDF graphs, producing a semantically integrated model with entities representing both BIM classes and GIS feature objects with respect to the target client application. The linkage between BIM-RDF and GIS-RDF is achieved through SPARQL endpoints and defined by a query using a set of datasets and entity classes with complementary properties, relationships, and geometries. To validate the proposed approach and its performance, a case study was also tested using the IGIM system design.
42

Li, Jiayu, and Antonis Bikakis. "Towards a Semantics-Based Recommendation System for Cultural Heritage Collections." Applied Sciences 13, no. 15 (August 2, 2023): 8907. http://dx.doi.org/10.3390/app13158907.

Abstract:
While the use of semantic technologies is now commonplace in the cultural heritage sector and several semantically annotated cultural heritage datasets are publicly available, there are few examples of cultural portals that exploit these datasets and technologies to improve the experience of visitors to their online collections. Aiming to address this gap, this paper explores methods for semantics-based recommendations aimed at visitors to cultural portals who want to explore online collections. The proposed methods exploit the rich semantic metadata in a cultural heritage dataset and the capabilities of a graph database system to improve the accuracy of searches through the collection and the quality of the recommendations provided to the user. The methods were developed and tested with the Archive of the Art Textbooks of Elementary and Public Schools in the Japanese Colonial Period. However, they can easily be adapted to any cultural heritage collection dataset modelled in RDF.
43

Inan, Emrah, and Oguz Dikenelli. "A Semantic-Embedding Model-Driven Seq2Seq Method for Domain-Oriented Entity Linking on Resource-Restricted Devices." International Journal on Semantic Web and Information Systems 17, no. 3 (July 2021): 73–87. http://dx.doi.org/10.4018/ijswis.2021070105.

Abstract:
General entity linking systems usually leverage the global coherence of all the mapped entities in the same document by using semantic embeddings and graph-based approaches. However, graph-based approaches are computationally expensive for open-domain datasets. In this paper, the authors overcome these problems by presenting an RDF-embedding-based seq2seq entity linking method for specific domains. They filter candidate entities of mentions having similar meanings by using the domain information of the annotated pairs. They resolve highly ambiguous pairs by using bidirectional long short-term memory (Bi-LSTM) and an attention mechanism for entity disambiguation. To evaluate the system against baseline methods, they generate a dataset including book, music, and movie categories. They achieved 0.55 (Mi-F1), 0.586 (Ma-F1), 0.846 (Mi-F1), and 0.87 (Ma-F1) scores for the high- and low-ambiguity datasets. They also compare the method with existing methods using recent (WNED-CWEB) datasets. Considering the domain-specificity of the proposed method, it achieves competitive results while using the domain-oriented datasets.
44

Ulutaş Karakol, D., G. Kara, C. Yılmaz, and Ç. Cömert. "Semantic Linking Spatial RDF Data to the Web Data Sources." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-4 (September 19, 2018): 639–45. http://dx.doi.org/10.5194/isprs-archives-xlii-4-639-2018.

Abstract:
Large amounts of spatial data are held in relational databases, and spatial data in relational databases must be converted to RDF for semantic web applications. Spatial data is an important key factor for creating spatial RDF data. Linked Data is the preferred way for users to publish and share relational database data on the Web. In order to define the semantics of the data, links are provided to vocabularies (ontologies or other external web resources) that are common conceptualizations for a domain. Linking the data of a resource vocabulary with globally published concepts of domain resources combines different data sources and datasets, makes data more understandable, discoverable, and usable, improves data interoperability and integration, provides automatic reasoning, and prevents data duplication. The need to convert relational data to RDF arises from the semantic expressiveness of Semantic Web technologies. One of the important key factors of the Semantic Web is ontologies. An ontology is an "explicit specification of a conceptualization", and the semantics of spatial data rely on ontologies. Linking spatial data from relational databases to web data sources is not an easy task for sharing machine-readable interlinked data on the Web. Tim Berners-Lee, the inventor of the World Wide Web and the advocate of the Semantic Web and Linked Data, laid down the Linked Data design principles. Based on these principles: first, spatial data in relational databases must be converted to RDF with the use of supporting tools; second, spatial RDF data must be linked to upper-level domain ontologies and related web data sources; third, external data sources (ontologies and web data sources) must be determined and the spatial RDF data linked to those sources; finally, the spatial linked data must be published on the web. The main contribution of this study is to determine the requirements for finding RDF links and to identify the deficiencies in creating and publishing linked spatial data. To achieve this objective, this study surveys existing approaches, conversion tools, and web data sources for converting relational data to spatial RDF. In this paper, we have investigated the current state of spatial RDF data, standards, open source platforms (particularly D2RQ, Geometry2RDF, TripleGeo, GeoTriples, Ontop, etc.), and web data sources. Moreover, the process of converting spatial data to RDF and linking it to web data sources is described. The implementation of linking spatial RDF data to web data sources is demonstrated with an example use case: road data has been linked to one of the most popular related web data sources, DBpedia. SILK, a tool for discovering relationships between data items within different Linked Data sources, is used as the link discovery framework. We also evaluated other link discovery tools, e.g., LIMES, and compared the results for the matching/linking task. As a result, the linked road data is shared and represented as an information resource on the web and enriched with definitions from related resources. In this way, road datasets are also linked through the related classes, individuals, spatial relations, and properties they cover, such as construction date, road length, coordinates, etc.
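Link discovery tools such as Silk and LIMES essentially compare resource properties across two datasets and emit owl:sameAs links when a similarity threshold is met. A toy exact-label-matching sketch of that idea with rdflib (real tools use configurable similarity metrics, thresholds, and blocking; the data is invented):

    from rdflib import Graph
    from rdflib.namespace import OWL, RDFS

    local = Graph().parse(data="""
        @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
        <http://example.org/road/7> rdfs:label "E80 Highway" .
    """, format="turtle")

    remote = Graph().parse(data="""
        @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
        <http://dbpedia.org/resource/European_route_E80> rdfs:label "e80 highway" .
    """, format="turtle")

    links = Graph()
    for s1, _, l1 in local.triples((None, RDFS.label, None)):
        for s2, _, l2 in remote.triples((None, RDFS.label, None)):
            if str(l1).lower() == str(l2).lower():
                links.add((s1, OWL.sameAs, s2))

    print(links.serialize(format="nt"))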
45

Gomathi, Ramalingam, and Dhandapani Sharmila. "A Novel Adaptive Cuckoo Search for Optimal Query Plan Generation." Scientific World Journal 2014 (2014): 1–7. http://dx.doi.org/10.1155/2014/727658.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The day-by-day emergence of new web pages has driven the development of semantic web technology. The World Wide Web Consortium (W3C) standard for storing semantic web data is the Resource Description Framework (RDF). To reduce the execution time of queries over large RDF graphs, evolving metaheuristic algorithms have become an alternative to traditional query optimization methods. This paper focuses on the problem of query optimization over semantic web data. An efficient algorithm called adaptive Cuckoo search (ACS), for querying large RDF graphs and generating optimal query plans, is designed in this research. Experiments were conducted on different datasets with varying numbers of predicates. The experimental results show that the proposed approach provides significant improvements in query execution time. The extent to which the algorithm is efficient is tested and the results are documented.
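The abstract does not give the ACS pseudocode, but the general shape of cuckoo-search-based plan generation can be sketched in Python as follows: each nest holds a candidate join order over triple patterns, better candidates replace worse nests, and a fraction of the worst nests is abandoned each generation. The cost model here is a deliberately simplistic placeholder, not the authors' estimator.

    import random

    def cost(order):
        # Placeholder estimator: sum of cumulative selectivity products,
        # approximating the sizes of intermediate join results.
        total, inter = 0.0, 1.0
        for sel in order:
            inter *= sel
            total += inter
        return total

    def mutate(order):
        # A "cuckoo" lays a new solution by swapping two join positions.
        a, b = random.sample(range(len(order)), 2)
        new = list(order)
        new[a], new[b] = new[b], new[a]
        return new

    def cuckoo_search(selectivities, nests=15, iters=200, pa=0.25):
        population = [random.sample(selectivities, len(selectivities))
                      for _ in range(nests)]
        for _ in range(iters):
            candidate = mutate(random.choice(population))
            j = random.randrange(nests)
            if cost(candidate) < cost(population[j]):
                population[j] = candidate
            # Abandon a fraction pa of the worst nests each generation.
            population.sort(key=cost)
            for k in range(int(pa * nests)):
                population[-(k + 1)] = random.sample(selectivities,
                                                     len(selectivities))
        return min(population, key=cost)

    # Hypothetical selectivities of four triple patterns.
    print(cuckoo_search([0.9, 0.1, 0.5, 0.05]))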
46

Yousfi, Houssameddine, Amin Mesmoudi, Allel Hadjali, Houcine Matallah, and Seif-Eddine Benkabou. "SRDF_QDAG: An efficient end-to-end RDF data management when graph exploration meets spatial processing." Computer Science and Information Systems, no. 00 (2023): 46. http://dx.doi.org/10.2298/csis230225046y.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
The popularity of RDF has led to the creation of several datasets (e.g., Yago, DBPedia) of different natures (graph, temporal, spatial). Several extensions have also been proposed for the SPARQL language to provide appropriate processing; the best known is GeoSPARQL, which integrates a set of spatial operators. In this paper, we propose new strategies to support such operators within a particular TripleStore, named RDF_QDAG, that relies on graph fragmentation and exploration and guarantees a good compromise between scalability and performance. Our proposal covers the different TripleStore components (storage, evaluation, optimization). We evaluated it using spatial queries over real RDF data and compared its performance with the latest version of a popular commercial TripleStore. The first results demonstrate the relevance of our proposal, which achieves an average performance gain of 28% when the right evaluation strategies are chosen. Based on these results, we extended the RDF_QDAG optimizer to dynamically select the evaluation strategy depending on the query, and we show that this selection yields the best strategy for most queries.
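For context, a spatial query of the kind these evaluation strategies target might look like the following GeoSPARQL filter, here wrapped in Python with SPARQLWrapper. The endpoint URL is a hypothetical placeholder; RDF_QDAG itself is not driven through this API.

    from SPARQLWrapper import SPARQLWrapper, JSON

    query = """
    PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
    PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
    SELECT ?poi WHERE {
      ?poi geo:hasGeometry/geo:asWKT ?wkt .
      FILTER(geof:sfWithin(?wkt,
        "POLYGON((2.2 48.8, 2.4 48.8, 2.4 48.9, 2.2 48.9, 2.2 48.8))"^^geo:wktLiteral))
    }
    """

    endpoint = SPARQLWrapper("http://localhost:3030/ds/sparql")  # placeholder URL
    endpoint.setQuery(query)
    endpoint.setReturnFormat(JSON)
    for row in endpoint.query().convert()["results"]["bindings"]:
        print(row["poi"]["value"])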
47

Le-Tuan, Anh, Conor Hayes, Manfred Hauswirth, and Danh Le-Phuoc. "Pushing the Scalability of RDF Engines on IoT Edge Devices." Sensors 20, no. 10 (May 14, 2020): 2788. http://dx.doi.org/10.3390/s20102788.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Semantic interoperability for the Internet of Things (IoT) is enabled by standards and technologies from the Semantic Web. As recent research suggests a move towards decentralised IoT architectures, we have investigated the scalability and robustness of RDF (Resource Description Framework) engines that can be embedded throughout the architecture, in particular at edge nodes. RDF processing at the edge facilitates the deployment of semantic integration gateways closer to low-level devices. Our focus is on how to enable scalable and robust RDF engines that can operate on lightweight devices. In this paper, we first carried out an empirical study of the scalability and behaviour of solutions for RDF data management that were designed for standard computing hardware and have been ported to run on lightweight devices at the network edge. The findings of our study show that these RDF store solutions have several shortcomings on commodity ARM (Advanced RISC Machine) boards that are representative of IoT edge node hardware. This inspired us to introduce a lightweight RDF engine for edge devices, called RDF4Led, which comprises an RDF storage layer and a SPARQL processor. RDF4Led follows the RISC-style (Reduced Instruction Set Computer) design philosophy. The design consists of a flash-aware storage structure, an indexing scheme, an alternative buffer management technique and a low-memory-footprint join algorithm, which together demonstrate improved scalability and robustness over competing solutions. With a significantly smaller memory footprint, we show that RDF4Led can handle 2 to 5 times more data than popular RDF engines such as Jena TDB (Tuple Database) and RDF4J while consuming the same amount of memory. In particular, RDF4Led requires only 10–30% of the memory of its competitors to operate on datasets of up to 50 million triples. On memory-constrained ARM boards, it performs faster updates and scales better than Jena TDB and Virtuoso. Furthermore, we demonstrate considerably faster query operations than Jena TDB and RDF4J.
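A minimal sketch of one ingredient named here, a low-memory-footprint join: merge-joining two binding streams sorted on their shared variable, so that neither input has to be fully materialised in memory. This illustrates the general technique under that assumption; it is not RDF4Led's actual implementation.

    def merge_join(left, right, key):
        # Both inputs are assumed sorted on `key`; memory use is bounded by
        # the longest run of equal keys on the right side, not input size.
        left, right = iter(left), iter(right)
        l, r = next(left, None), next(right, None)
        while l is not None and r is not None:
            if l[key] < r[key]:
                l = next(left, None)
            elif l[key] > r[key]:
                r = next(right, None)
            else:
                # Gather the run of right-side matches for this key value.
                matches = [r]
                r = next(right, None)
                while r is not None and r[key] == matches[0][key]:
                    matches.append(r)
                    r = next(right, None)
                while l is not None and l[key] == matches[0][key]:
                    for m in matches:
                        yield {**l, **m}
                    l = next(left, None)

    people = [{"s": "a", "name": "Ada"}, {"s": "b", "name": "Bob"}]
    cities = [{"s": "a", "city": "Berlin"}, {"s": "b", "city": "Cork"}]
    print(list(merge_join(people, cities, "s")))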
48

Bereta, K., G. Xiao, and M. Koubarakis. "ANSWERING GEOSPARQL QUERIES OVER RELATIONAL DATA." ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences XLII-4/W2 (July 5, 2017): 43–50. http://dx.doi.org/10.5194/isprs-archives-xlii-4-w2-43-2017.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In this paper we present Ontop-spatial, a system that answers GeoSPARQL queries on top of geospatial relational databases by performing on-the-fly GeoSPARQL-to-SQL translation using ontologies and mappings. GeoSPARQL is a geospatial extension of the SPARQL query language, standardized by the OGC for querying geospatial RDF data. Our approach goes beyond relational databases and covers all data that has a relational structure, even if only at the logical level. Our purpose is to enable on-the-fly GeoSPARQL querying that integrates multiple geospatial sources, without converting and materializing the original data as RDF and storing it in a triple store. This approach is better suited to cases where the original datasets are stored in large relational databases (or, more generally, in files with a relational structure) and/or are frequently updated.
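To make the on-the-fly translation concrete, the sketch below shows how a single GeoSPARQL predicate could be rewritten into a PostGIS-backed SQL query. This is a simplified illustration under assumed table and column names; Ontop-spatial's real rewriting is driven by ontologies and mappings and is far more general.

    # A toy GeoSPARQL-to-SQL rewriting for one predicate; `roads` and
    # `geom` are hypothetical table/column names.
    def rewrite_sf_within(wkt_polygon, table="roads", geom_col="geom"):
        # geof:sfWithin(?wkt, "POLYGON(...)") becomes a PostGIS predicate,
        # pushed down to the relational source instead of materialising RDF.
        return (
            f"SELECT id FROM {table} "
            f"WHERE ST_Within({geom_col}, ST_GeomFromText('{wkt_polygon}', 4326))"
        )

    print(rewrite_sf_within("POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))"))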
49

HACHEY, B., C. GROVER, and R. TOBIN. "Datasets for generic relation extraction." Natural Language Engineering 18, no. 1 (March 9, 2011): 21–59. http://dx.doi.org/10.1017/s1351324911000106.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
A vast amount of usable electronic data is in the form of unstructured text. The relation extraction task aims to identify useful information in text (e.g. PersonW works for OrganisationX, GeneY encodes ProteinZ) and recode it in a format, such as a relational database or RDF triplestore, that can be used more effectively for querying and automated reasoning. A number of resources have been developed for training and evaluating automatic relation extraction systems in different domains. However, comparative evaluation is impeded by the fact that these corpora use different markup formats and notions of what constitutes a relation. We describe the preparation of corpora for comparative evaluation of relation extraction across domains, based on the publicly available ACE 2004, ACE 2005 and BioInfer data sets. We present a common document type using token standoff and including detailed linguistic markup, while maintaining all information in the original annotation. The subsequent reannotation process normalises the two data sets so that they comply with a notion of relation that is intuitive, simple and informed by the semantic web. For the ACE data, we describe a process that automatically converts many relations involving nested, nominal entity mentions to relations involving non-nested, named or pronominal entity mentions. For example, the first entity is mapped from ‘one’ to ‘Amidu Berry’ in the membership relation described in ‘Amidu Berry, one half of PBS’. Moreover, we describe a comparably reannotated version of the BioInfer corpus that flattens nested relations, maps part-whole to part-part relations and maps n-ary to binary relations. Finally, we summarise experiments that compare approaches to generic relation extraction, a knowledge discovery task that uses minimally supervised techniques to achieve maximally portable extractors. These experiments illustrate the utility of the corpora.
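As a small illustration of the recoding this abstract mentions, the snippet below expresses "PersonW works for OrganisationX" as an RDF triple with rdflib; the namespace and property name are hypothetical.

    from rdflib import Graph, Namespace

    EX = Namespace("http://example.org/")  # hypothetical namespace
    g = Graph()
    # The extracted relation becomes a machine-queryable triple.
    g.add((EX.PersonW, EX.worksFor, EX.OrganisationX))
    print(g.serialize(format="turtle"))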
50

Ravindra, Padmashree, and Kemafor Anyanwu. "Nesting Strategies for Enabling Nimble MapReduce Dataflows for Large RDF Data." International Journal on Semantic Web and Information Systems 10, no. 1 (January 2014): 1–26. http://dx.doi.org/10.4018/ijswis.2014010101.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
Graph and semi-structured data are usually modeled in relational processing frameworks as “thin” relations (node, edge, node), and processing such data involves many join operations. Intermediate results of joins with multi-valued attributes or relationships contain redundant subtuples due to the repetition of single-valued attributes. The amount of redundant content is high for real-world multi-valued relationships in social network (millions of Twitter followers of popular celebrities) or biological (multiple references to related proteins) datasets. In MapReduce-based platforms such as Apache Hive and Pig, redundancy in intermediate results adds avoidable I/O, sorting, and network transfer overhead to join-intensive workloads with long workflows. Consequently, techniques for dealing with such redundancy enable more nimble execution of these workflows. This paper argues for the use of a nested data model for representing intermediate data concisely, using nesting-aware dataflow operators that allow for lazy and partial unnesting strategies. This approach reduces the overall I/O and network footprint of a workflow by representing intermediate results concisely during most of the workflow's execution, until complete unnesting is absolutely necessary. The proposed strategies are integrated into Apache Pig, and experimental evaluation over real-world and synthetic benchmark datasets confirms their superiority over relational-style MapReduce systems such as Apache Pig and Hive.
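The nesting idea can be sketched in a few lines of Python: a multi-valued relationship is kept as one nested record instead of repeated flat subtuples, and is unnested lazily via a generator only when a downstream operator needs flat tuples. This illustrates the concept, not the authors' Pig integration.

    flat = [  # flat (node, edge, node) triples with redundancy
        ("celebrity1", "followedBy", "fan1"),
        ("celebrity1", "followedBy", "fan2"),
        ("celebrity1", "followedBy", "fan3"),
    ]

    # Nested form: the single-valued prefix is stored once, and the
    # multi-valued tail is grouped into one list.
    nested = ("celebrity1", "followedBy", ["fan1", "fan2", "fan3"])

    def unnest(record):
        # Lazy, generator-based unnesting: flat tuples are produced only
        # when actually consumed by a downstream operator.
        s, p, objects = record
        for o in objects:
            yield (s, p, o)

    assert list(unnest(nested)) == flat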
