Theses / dissertations on the topic "RDF Data"


See the top 50 works (theses / dissertations) for research on the topic "RDF Data".


1

Abedjan, Ziawasch. "Improving RDF data with data mining". PhD thesis, Universität Potsdam, 2014. http://opus.kobv.de/ubp/volltexte/2014/7133/.

Abstract:
Linked Open Data (LOD) comprises a great many, often large, public data sets and knowledge bases. Those datasets are mostly presented in the RDF triple structure of subject, predicate, and object, where each triple represents a statement or fact. Unfortunately, the heterogeneity of available open data requires significant integration steps before it can be used in applications. Meta information, such as ontological definitions and exact range definitions of predicates, is desirable and ideally provided by an ontology. However, in the context of LOD, ontologies are often incomplete or simply not available. Thus, it is useful to automatically generate meta information, such as ontological dependencies, range definitions, and topical classifications. Association rule mining, which was originally applied for sales analysis on transactional databases, is a promising and novel technique to explore such data. We designed an adaptation of this technique for mining RDF data and introduce the concept of “mining configurations”, which allows us to mine RDF data sets in various ways. Different configurations enable us to identify schema and value dependencies that in combination result in interesting use cases. To this end, we present rule-based approaches for auto-completion, data enrichment, ontology improvement, and query relaxation. Auto-completion remedies the problem of inconsistent ontology usage, providing an editing user with a sorted list of commonly used predicates. A combination of different configurations extends this approach to create completely new facts for a knowledge base. We present two approaches for fact generation, a user-based approach where a user selects the entity to be amended with new facts and a data-driven approach where an algorithm discovers entities that have to be amended with missing facts. As knowledge bases constantly grow and evolve, another approach to improve the usage of RDF data is to improve existing ontologies.
Here, we present an association rule based approach to reconcile ontology and data. Interlacing different mining configurations, we infer an algorithm to discover synonymously used predicates. Those predicates can be used to expand query results and to support users during query formulation. We provide a wide range of experiments on real world datasets for each use case. The experiments and evaluations show the added value of association rule mining for the integration and usability of RDF data and confirm the appropriateness of our mining configuration methodology.
Linked Open Data (LOD) comprises many, often very large, public data sets and knowledge bases, which mostly come in the RDF triple structure of subject, predicate, and object. Each triple represents a fact. Unfortunately, the heterogeneity of the available public data requires significant integration steps before the data can be used in applications. Metadata such as ontological structures and range definitions of predicates are desirable and ideally available through a knowledge base. In the context of LOD, however, knowledge bases are often incomplete or simply unavailable. It is therefore useful to be able to generate meta information automatically, such as ontological dependencies, range and domain definitions, and topical associations of resources. A new and promising technique for examining such data is based on discovering association rules, which were originally applied to sales analysis in transactional databases. We designed an adaptation of this technique for RDF data and introduce the concept of mining configurations, which enables us to detect patterns in RDF data in different ways. Different configurations allow us to detect schema and value relationships that can be used for interesting applications. To this end, we present association-based methods for predicate suggestion, data completion, ontology improvement, and query relaxation. Predicate suggestion addresses the problem of inconsistent ontology usage by offering a user who wants to add a new fact to an RDF data set a sorted list of suitable predicates. A combination of different configurations extends this method so that completely new facts for a knowledge base are generated automatically. Here we present two methods: a user-driven one, in which a user selects the entity to be extended, and a data-driven one, in which an algorithm itself selects the entities to be extended with missing facts. Since knowledge bases constantly grow and change, another way to ease the use of RDF data is to improve ontologies. For this we present an association-rule-based method that reconciles data with the underlying ontologies. By interlacing different configurations we derive a new algorithm that discovers synonymous predicates. These predicates can be used to expand the results of a query or to support a user while formulating a query. For each of the presented applications we report a wide range of experiments on real-world data sets. The experiments and evaluations show the added value of association rule mining for the integration and usability of RDF data and confirm the suitability of our configuration-based methodology for deriving such rules.
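The predicate auto-completion idea in this abstract can be illustrated with a small sketch. It assumes one possible mining configuration, in which every subject is a transaction and its predicates are the items, and it only mines rules between single predicates. The toy triples, thresholds, and function names are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import permutations

# Toy triples (subject, predicate, object); all names are made up for illustration.
TRIPLES = [
    ("ex:Berlin", "ex:country", "ex:Germany"),
    ("ex:Berlin", "ex:population", "3600000"),
    ("ex:Berlin", "ex:mayor", "ex:SomeMayor"),
    ("ex:Paris", "ex:country", "ex:France"),
    ("ex:Paris", "ex:population", "2100000"),
    ("ex:Paris", "ex:mayor", "ex:OtherMayor"),
    ("ex:Potsdam", "ex:country", "ex:Germany"),
    ("ex:Potsdam", "ex:population", "180000"),
]


def predicate_transactions(triples):
    """Assumed configuration: one transaction per subject, items are its predicates."""
    transactions = {}
    for subject, predicate, _ in triples:
        transactions.setdefault(subject, set()).add(predicate)
    return transactions


def mine_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return rules (lhs, rhs, support, confidence) between single predicates."""
    n = len(transactions)
    item_count = Counter()
    pair_count = Counter()
    for items in transactions.values():
        item_count.update(items)
        pair_count.update(permutations(sorted(items), 2))
    rules = []
    for (lhs, rhs), together in pair_count.items():
        support = together / n
        confidence = together / item_count[lhs]
        if support >= min_support and confidence >= min_confidence:
            rules.append((lhs, rhs, support, confidence))
    return rules


def suggest_predicates(subject, transactions, rules):
    """Rank predicates the subject lacks by the best confidence of a matching rule."""
    present = transactions.get(subject, set())
    scores = {}
    for lhs, rhs, _, confidence in rules:
        if lhs in present and rhs not in present:
            scores[rhs] = max(scores.get(rhs, 0.0), confidence)
    return sorted(scores, key=lambda p: (-scores[p], p))


transactions = predicate_transactions(TRIPLES)
rules = mine_rules(transactions)
suggestions = suggest_predicates("ex:Potsdam", transactions, rules)
print(suggestions)
```

Swapping the roles (for example, objects as transactions and subjects as items) would give a different configuration over the same triples, which is the kind of variation the abstract refers to.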
2

Qiao, Shi. "QUERYING GRAPH STRUCTURED RDF DATA". Case Western Reserve University School of Graduate Studies / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=case1447198654.

3

Frommhold, Marvin, Piris Rubén Navarro, Natanael Arndt, Sebastian Tramp, Niklas Petersen and Michael Martin. "Towards versioning of arbitrary RDF data". Universität Leipzig, 2016. https://ul.qucosa.de/id/qucosa%3A15777.

Abstract:
Coherent and consistent tracking of provenance data and in particular update history information is a crucial building block for any serious information system architecture. Version Control Systems can be a part of such an architecture enabling users to query and manipulate versioning information as well as content revisions. In this paper, we introduce an RDF versioning approach as a foundation for a full featured RDF Version Control System. We argue that such a system needs support for all concepts of the RDF specification including support for RDF datasets and blank nodes. Furthermore, we placed special emphasis on the protection against unperceived history manipulation by hashing the resulting patches. In addition to the conceptual analysis and an RDF vocabulary for representing versioning information, we present a mature implementation which captures versioning information for changes to arbitrary RDF datasets.
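Protecting history by hashing patches can be sketched as a hash chain: each patch is hashed together with the hash of its parent, so changing an old patch invalidates everything after it. This is a minimal illustration under stated assumptions, not the authors' implementation. The patch layout, the use of SHA-256, and all function names are hypothetical, and blank nodes and RDF datasets, which the paper explicitly covers, are ignored here.

```python
import hashlib
import json


def patch_hash(parent_hash, added, removed):
    """Hash a patch together with its parent's hash, so the history forms a chain."""
    payload = json.dumps(
        {"parent": parent_hash, "added": sorted(added), "removed": sorted(removed)},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def commit(history, added=(), removed=()):
    """Append a patch (lists of triples) to the history and return its hash."""
    parent = history[-1]["hash"] if history else ""
    entry = {"parent": parent, "added": sorted(added), "removed": sorted(removed)}
    entry["hash"] = patch_hash(parent, entry["added"], entry["removed"])
    history.append(entry)
    return entry["hash"]


def verify(history):
    """Recompute every hash; any edit to an earlier patch breaks the chain."""
    parent = ""
    for entry in history:
        if entry["parent"] != parent:
            return False
        if entry["hash"] != patch_hash(parent, entry["added"], entry["removed"]):
            return False
        parent = entry["hash"]
    return True


def materialize(history):
    """Replay all patches in order to obtain the current set of triples."""
    graph = set()
    for entry in history:
        graph -= set(entry["removed"])
        graph |= set(entry["added"])
    return graph


history = []
commit(history, added=[("ex:a", "ex:p", "ex:b")])
commit(history, added=[("ex:a", "ex:p", "ex:c")], removed=[("ex:a", "ex:p", "ex:b")])
print(verify(history), materialize(history))
```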
4

HERRERA, JOSE EDUARDO TALAVERA. "AN ARCHITECTURE FOR RDF DATA SOURCES RECOMMENDATION". PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2012. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=21367@1.

Abstract:
PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO
COORDENAÇÃO DE APERFEIÇOAMENTO DO PESSOAL DE ENSINO SUPERIOR
PROGRAMA DE EXCELENCIA ACADEMICA
When publishing data on the Web, it is recommended to interlink data from different sources through similar resources that describe a common domain. However, as the number of data sets published on the Web of Data grows, the tasks of data discovery and selection become increasingly complex. Moreover, the distributed and interconnected nature of the data makes analysing and understanding it very time-consuming. In this context, this work aims to offer a Web architecture for identifying RDF data sources, with the goal of improving the processes of publishing, interlinking, and exploring data in the Linked Open Data. To that end, our approach uses the MapReduce model on top of the cloud computing paradigm. We can thus run parallel keyword searches over an existing semantic data index on the Web. These searches make it possible to identify candidate sources for linking the data. With this approach it was possible to integrate different Semantic Web tools into a search process that discovers relevant data sources and relates topics of interest defined by the user. Achieving our goal required text indexing and analysis to improve the search for resources in the Linked Open Data. To show the effectiveness of our approach, we developed a case study using a subset of the data of a Linked Open Data source, accessed through its SPARQL endpoint service. The results of our work show that generating statistics about the source's data does make a great difference in the search process. These statistics help the user in choosing individuals. A specialized keyword extraction process is applied to each individual in order to generate different searches over the semantic index. We show the scalability of our RDF source recommendation process using different samples of individuals.
When publishing data on the Web, it is recommended to link the data from different sources using similar resources that describe a common domain. However, the growing number of data sets published on the Web has made the data discovery and data selection tasks increasingly complex. Moreover, the distributed and interconnected nature of the data makes understanding and analysing it very time-consuming. In this context, this work aims to provide a Web architecture for identifying RDF data sources with the goal of improving the publishing, interconnection, and data exploration processes within the Linked Open Data. Our approach utilizes the MapReduce computing model on top of the cloud computing paradigm. In this manner, we are able to make parallel keyword searches over existing semantic data indexes available on the web. This makes it possible to identify candidate sources to link the data. Through this approach, it was possible to integrate different semantic web tools and relevant data sources in a search process, and also to relate topics of interest defined by the user. In order to achieve our objectives it was necessary to index and analyze text to improve the search of resources in the Linked Open Data. To show the effectiveness of our approach we developed a case study using a subset of data from a source in the Linked Open Data through its SPARQL endpoint service. The results of our work reveal that the generation and usage of the data source's statistics do make a great difference within the search process. These statistics help the user in the process of choosing individuals. Furthermore, a specialized keyword extraction process is run for each individual in order to create different search processes using the semantic index. We show the scalability of our RDF recommendation process by sampling several individuals.
5

Kaithi, Bhargavacharan Reddy. "Knowledge Graph Reasoning over Unseen RDF Data". Wright State University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=wright1571955816559707.

6

Espinola, Roger Humberto Castillo. "Indexing RDF data using materialized SPARQL queries". Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät II, 2012. http://dx.doi.org/10.18452/16582.

Abstract:
In this thesis we propose using materialized queries as an index structure for RDF data. We aim to reduce processing time by minimizing the number of comparisons between the query and the RDF data set. We also stress the role of cost models and indexes in selecting an efficient execution plan depending on the workload. We give an overview of the problem of selecting materialized queries in relational databases and discuss their use for optimizing query processing. We present RDFMatView as a framework for SPARQL queries. RDFMatView uses materialized queries as indexes and contains algorithms for finding suitable indexes for a given query and integrating them into execution plans. Selecting an efficient execution plan is the second topic of this thesis. We introduce three different cost models for processing SPARQL queries. A detailed comparison of the cost models shows that a model based on index and predicate statistics provides the most accurate information for selecting an efficient execution plan. The evaluation shows that our method reduces query processing time by several orders of magnitude compared with unoptimized SPARQL queries. Finally, we propose a simple but effective strategy for the problem of selecting materialized queries over RDF data. Starting from a given workload, those indexes are selected algorithmically that should minimize the processing time of the entire workload. We then build a set of index candidates based on query patterns and search this set for connected components. Our evaluation shows that our method for selecting indexes yields the largest savings in query processing time compared with others.
In this thesis, we propose to use materialized queries as a special index structure for RDF data. We strive to reduce the query processing time by minimizing the number of comparisons between the query and the RDF dataset. We also emphasize the role of cost models in the selection of execution plans as well as index sets for a given workload. We provide an overview of the materialized view selection problem in relational databases and discuss its application for optimization of query processing. We introduce RDFMatView, a framework for answering SPARQL queries using materialized views as indexes. We provide algorithms to discover those indexes that can be used to process a given query and we develop different strategies to integrate these views in query execution plans. The selection of an efficient execution plan is the topic of our second major contribution. We introduce three different cost models designed for SPARQL query processing with materialized views. A detailed comparison of these models reveals that a model based on index and predicate statistics provides the most accurate cost estimation. We show that selecting an execution plan using this cost model reduces processing time by several orders of magnitude compared to standard SPARQL query processing. Finally, we propose a simple yet effective strategy for the materialized view selection problem applied to RDF data. Based on a given workload of SPARQL queries we provide algorithms for selecting a set of indexes that minimizes the workload processing time. We create a candidate index by retrieving all connected components from query patterns. Our evaluation shows that using the set of suggested indexes usually achieves larger runtime savings than other index sets regarding the given workload.
7

Sherif, Mohamed Ahmed Mohamed. "Automating Geospatial RDF Dataset Integration and Enrichment". Doctoral thesis, Universitätsbibliothek Leipzig, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-215708.

Abstract:
Over the last years, the Linked Open Data (LOD) has evolved from a mere 12 to more than 10,000 knowledge bases. These knowledge bases come from diverse domains including (but not limited to) publications, life sciences, social networking, government, media, and linguistics. Moreover, the LOD cloud also contains a large number of cross-domain knowledge bases such as DBpedia and Yago2. These knowledge bases are commonly managed in a decentralized fashion and contain partly overlapping information. This architectural choice has led to knowledge pertaining to the same domain being published by independent entities in the LOD cloud. For example, information on drugs can be found in Diseasome as well as DBpedia and Drugbank. Furthermore, certain knowledge bases such as DBLP have been published by several bodies, which in turn has led to duplicated content in the LOD. In addition, large amounts of geo-spatial information have been made available with the growth of the heterogeneous Web of Data. The concurrent publication of knowledge bases containing related information promises to become a phenomenon of increasing importance with the growth of the number of independent data providers. Enabling the joint use of the knowledge bases published by these providers for tasks such as federated queries, cross-ontology question answering and data integration is most commonly tackled by creating links between the resources described within these knowledge bases. Within this thesis, we spur the transition from isolated knowledge bases to enriched Linked Data sets where information can be easily integrated and processed. To achieve this goal, we provide concepts, approaches and use cases that facilitate the integration and enrichment of information with other data types that are already present on the Linked Data Web with a focus on geo-spatial data. The first challenge that motivates our work is the lack of measures that use the geographic data for linking geo-spatial knowledge bases.
This is partly due to the geo-spatial resources being described by the means of vector geometry. In particular, discrepancies in granularity and error measurements across knowledge bases render the selection of appropriate distance measures for geo-spatial resources difficult. We address this challenge by evaluating existing literature for point set measures that can be used to measure the similarity of vector geometries. Then, we present and evaluate the ten measures that we derived from the literature on samples of three real knowledge bases. The second challenge we address in this thesis is the lack of automatic Link Discovery (LD) approaches capable of dealing with geospatial knowledge bases with missing and erroneous data. To this end, we present Colibri, an unsupervised approach that allows discovering links between knowledge bases while improving the quality of the instance data in these knowledge bases. A Colibri iteration begins by generating links between knowledge bases. Then, the approach makes use of these links to detect resources with probably erroneous or missing information. This erroneous or missing information detected by the approach is finally corrected or added. The third challenge we address is the lack of scalable LD approaches for tackling big geo-spatial knowledge bases. Thus, we present Deterministic Particle-Swarm Optimization (DPSO), a novel load balancing technique for LD on parallel hardware based on particle-swarm optimization. We combine this approach with the Orchid algorithm for geo-spatial linking and evaluate it on real and artificial data sets. The lack of approaches for automatic updating of links of an evolving knowledge base is our fourth challenge. This challenge is addressed in this thesis by the Wombat algorithm. Wombat is a novel approach for the discovery of links between knowledge bases that relies exclusively on positive examples. 
Wombat is based on generalisation via an upward refinement operator to traverse the space of Link Specifications (LS). We study the theoretical characteristics of Wombat and evaluate it on different benchmark data sets. The last challenge addressed herein is the lack of automatic approaches for geo-spatial knowledge base enrichment. Thus, we propose Deer, a supervised learning approach based on a refinement operator for enriching Resource Description Framework (RDF) data sets. We show how we can use exemplary descriptions of enriched resources to generate accurate enrichment pipelines. We evaluate our approach against manually defined enrichment pipelines and show that our approach can learn accurate pipelines even when provided with a small number of training examples. Each of the proposed approaches is implemented and evaluated against state-of-the-art approaches on real and/or artificial data sets. Moreover, all approaches are peer-reviewed and published in a conference or a journal paper. Throughout this thesis, we detail the ideas, implementation and the evaluation of each of the approaches. Moreover, we discuss each approach and present lessons learned. Finally, we conclude this thesis by presenting a set of possible future extensions and use cases for each of the proposed approaches.
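To make the idea of a point-set measure for vector geometries concrete, the sketch below computes the Hausdorff distance between two point sets. It is offered as an assumed example of such a measure; the abstract does not list the ten measures that were evaluated. Coordinates are treated as planar for simplicity, whereas real geo-spatial data would normally call for a geodesic point-to-point distance.

```python
import math


def euclidean(p, q):
    """Planar distance between two (x, y) points; a simplifying assumption."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def directed_hausdorff(a, b):
    """Largest distance from a point in a to its nearest neighbour in b."""
    return max(min(euclidean(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Two hypothetical polygons given by their vertices.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
shifted = [(x + 3.0, y) for x, y in square]
print(hausdorff(square, shifted))
```

A link discovery approach could then accept a candidate pair of resources when this distance falls below a chosen threshold; the threshold itself would have to be tuned per data set.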
8

Abedjan, Ziawasch [author], and Felix [academic supervisor] Naumann. "Improving RDF data with data mining / Ziawasch Abedjan. Betreuer: Felix Naumann". Potsdam : Universitätsbibliothek der Universität Potsdam, 2014. http://d-nb.info/1059014122/34.

9

Morgan, Juston. "Visual language for exploring massive RDF data sets". Pullman, Wash. : Washington State University, 2010. http://www.dissertations.wsu.edu/Thesis/Spring2010/J_Morgan_041210.pdf.

Abstract:
Thesis (M.S. in computer science)--Washington State University, May 2010.
Title from PDF title page (viewed on July 12, 2010). "School of Engineering and Computer Science." Includes bibliographical references (p. 33-34).
10

Fan, Zhengjie. "Concise Pattern Learning for RDF Data Sets Interlinking". Thesis, Grenoble, 2014. http://www.theses.fr/2014GRENM013/document.

Abstract:
Many data sets are published on the web using Semantic Web technologies. These data sets contain data that represent links to similar resources. If these data sets are connected by correctly constructed links, users can easily query the data through a uniform interface, as if they were querying a single data set. But finding correct links is very difficult because many comparisons have to be made. Several solutions have been proposed to solve this problem: (1) the most direct approach is to compare the attribute values of instances to identify links, but it is impossible to compare all possible pairs of attribute values. (2) Another common strategy is to compare instances according to the corresponding attributes found by instance-based ontology alignment, which generates attribute correspondences based on instances. However, it is difficult to identify similar instances across data sets because, in some cases, the values of corresponding attributes are not the same. (3) Several methods use genetic programming to build interlinking patterns for comparing different instances, but they suffer from long running times. In this thesis, an interlinking method is proposed for linking similar instances in different data sets, based on both statistical learning and symbolic learning. The input consists of two data sets, class correspondences over the two data sets, and a sample of links marked “positive” or “negative” as the result of a user assessment. The method builds a classifier that distinguishes correct links from incorrect ones in two RDF data sets using the set of assessed sample links. The classifier is composed of attribute correspondences between the corresponding classes of the two data sets, which help to compare instances and establish links. The classifier is called an interlinking pattern in this thesis. On the one hand, our method discovers potential attribute correspondences for each class correspondence via a statistical learning method: the K-medoids clustering algorithm, using statistics on instance values. On the other hand, our solution relies on an interlinking model obtained through a symbolic learning method, version space, based on the discovered potential attribute correspondences and the set of assessed sample links. Our method can solve the interlinking task when no single combined interlinking pattern covers all assessed correct links in a concise format. Experiments show that our interlinking method, with only 1% of the total links in the sample, reaches a high F-measure (from 0.94 to 0.99).
Many data sets are being published on the web with Semantic Web technology. These data sets usually contain analogous data that represent similar real-world resources. If the data sets are linked together by correctly identifying the similar instances, users can conveniently query data through a uniform interface, as if they were querying a single database. However, finding correct links is very challenging because web data sources usually have heterogeneous ontologies maintained by different organizations. Many solutions have been proposed for this problem. (1) One straightforward idea is to compare the attribute values of instances for identifying links, yet it is impossible to compare all possible pairs of attribute values. (2) Another common strategy is to compare instances with correspondences found by instance-based ontology matching, which can generate attribute correspondences based on overlapping ranges between two attributes, but which easily produces incomparable attribute correspondences or misses comparable ones. (3) Many existing solutions leverage Genetic Programming to construct interlinking patterns for comparing instances; however, the running times of these interlinking methods are usually long. In this thesis, an interlinking method is proposed to interlink instances for different data sets, based on both statistical learning and symbolic learning. On the one hand, the method discovers potential comparable attribute correspondences of each class correspondence via a K-medoids clustering algorithm with instance value statistics. We adopt K-medoids because of its high efficiency and high tolerance of irregular and even incorrect data. K-medoids classifies the attributes of each class into several groups according to their statistical value features. Groups from different classes are mapped when they have similar statistical value features, to determine potential comparable attribute correspondences. The clustering procedure effectively narrows the range of candidate attribute correspondences. On the other hand, our solution also leverages a symbolic learning method, called Version Space. Version Space is an iterative learning model that searches for the interlinking pattern from two directions. Our design can solve the interlinking task that does not have a single compatible conjunctive interlinking pattern that covers all assessed correct links with a concise format. The interlinking solution is evaluated with large-scale real-world data from IM@OAEI and CKAN. Experiments confirm that the solution with only 1% of sample links already reaches a high accuracy (0.94 to 0.99 on F-measure). The F-measure converges quickly and improves on other state-of-the-art approaches by nearly 10 percent of their F-measure values.
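The F-measure reported in this abstract is the harmonic mean of precision and recall over a set of generated links. The sketch below shows that computation on a hypothetical set of `owl:sameAs`-style links; the link data and the function name are made up for illustration and are unrelated to the IM@OAEI or CKAN evaluations.

```python
def f_measure(predicted, reference):
    """Harmonic mean of precision and recall for two collections of links."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Hypothetical links between instances of two data sets.
reference_links = {("a:1", "b:1"), ("a:2", "b:2"), ("a:3", "b:3"), ("a:4", "b:4")}
predicted_links = {("a:1", "b:1"), ("a:2", "b:2"), ("a:3", "b:3"), ("a:4", "b:9")}

score = f_measure(predicted_links, reference_links)
print(score)
```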
11

Bang, Ole Petter, and Tormod Fjeldskår. "Storing and Querying RDF in Mars". Thesis, Norwegian University of Science and Technology, Department of Computer and Information Science, 2009. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-8971.

Abstract:

As part of the Semantic Web movement, the Resource Description Framework (RDF) is gaining momentum as a format for storing data, particularly metadata. The SPARQL Protocol and RDF Query Language is a SQL-like query language, recommended by W3C for querying RDF data. FAST is exploring the possibilities of supporting storage and querying of RDF data in their Mars search engine. To facilitate this, a SPARQL parser has been created for the Microsoft .NET Framework, using the MPLex and MPPG tools from Microsoft's Managed Babel package. This thesis proposes a solution for efficiently storing and retrieving RDF data in Mars, based on decomposition and B+ Tree indexing. Further, a method for transforming SPARQL queries into Mars operator graphs is described. Finally, the implementation of a prototype is discussed. The prototype has been developed in collaboration with FAST and has required customized indexing in Mars. Some deviations from the proposed solution were made in order to create a working prototype within the available time frame. The focus has been on exploring possibilities, and performance has thus not been a priority, neither in indexing nor in evaluation.

12

Fernandez, Garcia Javier David, Jürgen Umbrich, Axel Polleres and Magnus Knuth. "Evaluating Query and Storage Strategies for RDF Archives". IOS Press, 2018. http://epub.wu.ac.at/6488/1/BEAR%2DSWJ.pdf.

Abstract:
There is an emerging demand for efficiently archiving and (temporally) querying different versions of evolving semantic Web data. As novel archiving systems are starting to address this challenge, foundations/standards for benchmarking RDF archives are needed to evaluate their storage space efficiency and the performance of different retrieval operations. To this end, we provide theoretical foundations on the design of data and queries to evaluate emerging RDF archiving systems. Then, we instantiate these foundations along a concrete set of queries on the basis of a real-world evolving dataset. Finally, we perform an empirical evaluation of various current archiving techniques and querying strategies on this data that is meant to serve as a baseline of future developments on querying archives of evolving RDF data.
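The trade-off between storage space and retrieval cost can be illustrated by comparing two simple archiving strategies: keeping a full copy of every version, and keeping only the first version plus the changes between versions. The sketch below is a toy illustration under that assumption; the class name, methods, and data are hypothetical and do not reproduce the systems or queries evaluated in the paper.

```python
class DeltaArchive:
    """Stores the first version in full and every later version as (added, removed)."""

    def __init__(self, initial):
        self.base = frozenset(initial)
        self.deltas = []

    def commit(self, new_version):
        """Record a new version as its difference from the latest stored version."""
        current = self.version(len(self.deltas))
        new_version = frozenset(new_version)
        self.deltas.append((new_version - current, current - new_version))

    def version(self, i):
        """Materialize version i by replaying the first i deltas over the base."""
        graph = set(self.base)
        for added, removed in self.deltas[:i]:
            graph -= removed
            graph |= added
        return frozenset(graph)

    def diff(self, i, j):
        """Triples added and removed when going from version i to version j."""
        a, b = self.version(i), self.version(j)
        return b - a, a - b

    def stored_triples(self):
        return len(self.base) + sum(len(a) + len(r) for a, r in self.deltas)


v0 = {("ex:a", "ex:p", "1"), ("ex:b", "ex:p", "2"), ("ex:c", "ex:p", "3")}
v1 = {("ex:a", "ex:p", "1"), ("ex:b", "ex:p", "2"), ("ex:d", "ex:p", "4")}
v2 = v1 | {("ex:e", "ex:p", "5")}

archive = DeltaArchive(v0)
archive.commit(v1)
archive.commit(v2)

full_copy_cost = len(v0) + len(v1) + len(v2)  # storing every version in full
print(archive.stored_triples(), full_copy_cost)
```

In this toy setting the delta archive stores fewer triples, but materializing a late version requires replaying every delta, which is the kind of retrieval cost a benchmark for RDF archives would need to measure.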
13

Pomykacz, Michal. "Hromadná extrakce dat veřejné správy do RDF". Master's thesis, Vysoká škola ekonomická v Praze, 2013. http://www.nusl.cz/ntk/nusl-197443.

Abstract:
The purpose of this work was to extract data from various formats (HTML, XML, XLS) and transform it for further processing. Czech public contracts and the related code lists and classifications were used as data sources. The main goal was to implement periodic data extraction, transformation to RDF, and publication of the output as Linked Data through a SPARQL endpoint. Because the UnifiedViews tool was used for the periodic extractions, extraction modules had to be designed and implemented for it. The theoretical section of this thesis explains the principles of Linked Data and the key tools used for data extraction and manipulation. The practical section deals with the design and implementation of the extractors; the part describing extractor implementation shows methods for parsing data in various dataset formats and transforming it to RDF. The conclusion presents the success of each extractor implementation, along with thoughts on its usability in the real world.
14

Meissner, Roy, and Kurt Junghanns. "Using DevOps principles to continuously monitor RDF data quality". Universität Leipzig, 2016. https://ul.qucosa.de/id/qucosa%3A15940.

Abstract:
One approach to continuously achieving a certain data quality level is to use an integration pipeline that continuously checks and monitors the quality of a data set according to defined metrics. This approach is inspired by Continuous Integration (CI) pipelines, which were introduced in software development and DevOps to perform continuous source code checks. By investigating possible tools and discussing the specific requirements of RDF data sets, an integration pipeline is derived that combines current approaches from software development and the Semantic Web and reuses existing tools. As these tools were not built explicitly for CI usage, we evaluate their usability and propose possible workarounds and improvements. Furthermore, a real-world usage scenario is discussed, outlining the benefits of using such a pipeline.
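A quality check inside such a pipeline could look roughly like the following sketch. The metric (share of subjects that carry an `rdfs:label`), the threshold, and the function names are assumptions chosen for illustration; the paper's own metrics and tools are not reproduced here. The returned status code follows the usual CI convention that a non-zero value fails the build.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def label_coverage(triples):
    """Fraction of distinct subjects that have at least one rdfs:label."""
    subjects = {s for s, _, _ in triples}
    if not subjects:
        return 1.0  # an empty data set violates nothing
    labelled = {s for s, p, _ in triples if p == RDFS_LABEL}
    return len(labelled) / len(subjects)


def quality_gate(triples, threshold=0.9):
    """Return (status, score); status 0 passes, status 1 fails the pipeline."""
    score = label_coverage(triples)
    status = 0 if score >= threshold else 1
    return status, score
```

In a CI job the status would typically be passed to `sys.exit`, so that a data set falling below the agreed level blocks publication in the same way a failing unit test blocks a release.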
15

Darari, Fariz. "Managing and Consuming Completeness Information for RDF Data Sources". Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-226655.

Abstract:
The ever-increasing amount of Semantic Web data gives rise to the question: how complete is the data? Though data on the Semantic Web is generally incomplete, many parts of it are indeed complete, such as the children of Barack Obama and the crew of Apollo 11. This thesis studies how to manage and consume completeness information about Semantic Web data. In particular, we first discuss how completeness information can guarantee the completeness of query answering. Next, we propose optimization techniques for completeness reasoning and conduct experimental evaluations to show the feasibility of our approaches. We also provide a technique to check the soundness of queries with negation via reduction to query completeness checking. We further enrich completeness information with timestamps, making it possible to check up to which point in time query answers are complete. We then introduce two demonstrators, CORNER and COOL-WD, to show how our completeness framework can be realized. Finally, we investigate an automated method to generate completeness statements from text on the Web via relation cardinality extraction.
16

Chartrand, Timothy Adam. "Ontology-Based Extraction of RDF Data from the World Wide Web". BYU ScholarsArchive, 2003. https://scholarsarchive.byu.edu/etd/56.

Abstract:
The simplicity and proliferation of the World Wide Web (WWW) have taken the availability of information to an unprecedented level. The next generation of the Web, the Semantic Web, seeks to make information more usable by machines by introducing a more rigorous structure based on ontologies. One hindrance to the Semantic Web is the lack of existing semantically marked-up data. Until there is a critical mass of Semantic Web data, few people will develop and use Semantic Web applications. This project helps promote the Semantic Web by providing content. We apply existing information-extraction techniques, in particular the BYU ontology-based data-extraction system, to extract information from the WWW based on a Semantic Web ontology and to produce Semantic Web data with respect to that ontology. As an example of how the generated Semantic Web data can be used, we provide an application for browsing the extracted data and the source documents together. In this sense, the extracted data is superimposed over, or serves as an index over, the source documents. Our experiments with ontologies in four application domains show that our approach can indeed extract Semantic Web data from the WWW with precision and recall similar to those achieved by the underlying information-extraction system, and make that data accessible to Semantic Web applications.
17

Ren, Xiangnan. "Traitement et raisonnement distribués des flux RDF". Thesis, Paris Est, 2018. http://www.theses.fr/2018PESC1139/document.

Abstract:
Real-time processing of data streams emanating from sensors is becoming a common task in industrial scenarios. In an Internet of Things (IoT) context, data are emitted from heterogeneous stream sources, i.e., coming from different domains and data models. This requires IoT applications to handle data integration from diverse resources efficiently. The processing of RDF data streams has hence become an important research field. This approach, based on Semantic Web technologies, supports a wide range of innovative applications in which real-time and reasoning aspects are pervasive. The research presented in this manuscript addresses this type of application. The key implementation goal of such applications is to handle massive incoming data streams efficiently and to support advanced data analytics services such as anomaly detection. However, a modern RDF Stream Processing (RSP) engine has to address the volume and velocity characteristics encountered in the Big Data era. In an ongoing, large-scale industrial project, we found that a stream processing engine available 24/7 usually faces massive data volumes and dynamically changing data structures and workload characteristics. These facts impact the engine's performance and reliability. To address these issues, we propose Strider, a hybrid, adaptive, distributed RDF Stream Processing engine that optimizes the logical query plan according to the state of the data streams. Strider has been designed to guarantee important industrial properties such as scalability, high availability, fault tolerance, high throughput, and acceptable latency. These guarantees are obtained by designing the engine's architecture with state-of-the-art Big Data components: Apache Spark and Apache Kafka.
Moreover, an increasing number of processing jobs executed over RSP engines require reasoning mechanisms. This usually comes at the cost of finding a trade-off between data throughput, latency, and the computational cost of expressive inferences. Therefore, we extend Strider to support real-time RDFS+ (i.e., RDFS + owl:sameAs) reasoning. We combine Strider with a query rewriting approach for SPARQL that benefits from an intelligent encoding of the knowledge base. The system is evaluated along different dimensions and over multiple datasets to highlight its performance. Finally, we take a step further and explore RDF stream reasoning with ontologies expressed in a fragment of Answer Set Programming (ASP). This part of our research is mainly motivated by the fact that more and more streaming applications require more expressive and complex reasoning tasks. The main challenge is to cope with the large-volume and high-velocity dimensions in a scalable and inference-enabled manner. Recent efforts in this area still do not address system scalability for stream reasoning. Thus, we aim to explore the ability of modern distributed computing frameworks to process highly expressive knowledge inference queries over Big Data streams. To do so, we consider queries expressed in a positive fragment of LARS (a temporal logic framework based on Answer Set Programming) and propose solutions to process such queries, based on the two main execution models adopted by major parallel and distributed execution frameworks: Bulk Synchronous Parallel (BSP) and Record-at-A-Time (RAT). We implement our solution, named BigSR, and conduct a series of evaluations. Our experiments show that BigSR achieves high throughput, beyond a million triples per second, using a rather small cluster of machines.
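The `owl:sameAs` part of RDFS+ reasoning can be illustrated with a small sketch that maps every member of an equivalence class to one representative before a window of stream triples is queried. This is a generic union-find approach assumed for illustration only; it is not Strider's encoding, and the names `SameAsIndex` and `canonicalize` are hypothetical. The sketch is single-machine and ignores distribution entirely.

```python
class SameAsIndex:
    """Union-find over resources declared equal via owl:sameAs."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        """Return the representative of x's equivalence class."""
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps later lookups short.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Record that a and b denote the same resource."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            # Keep the lexicographically smaller name as representative
            # so results are deterministic.
            if rb < ra:
                ra, rb = rb, ra
            self.parent[rb] = ra


def canonicalize(window, same_as):
    """Rewrite subjects and objects of a window to their representatives."""
    return {(same_as.find(s), p, same_as.find(o)) for s, p, o in window}
```

After canonicalization, a query phrased against any alias of a resource can be answered by first mapping its constants through `find` and then matching against the rewritten window, which avoids materializing one copy of each triple per alias.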
18

Görlitz, Olaf [Verfasser]. "Distributed query processing for federated RDF data management / Olaf Görlitz". Koblenz : Universitätsbibliothek Koblenz, 2015. http://d-nb.info/1065246986/34.

19

ARAUJO, SAMUR FELIPE CARDOSO DE. "EXPLORATOR: A TOOL FOR EXPLORING RDF DATA THROUGH DIRECT MANIPULATION". PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2009. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=13792@1.

Abstract:
CONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICO
In this dissertation we propose a tool for exploring data on the Semantic Web. Our goal was to develop an exploration model that allows users to explore an RDF database without any prior knowledge of its domain. To that end, we present an operation model which, supported by an interface based on the direct manipulation and query-by-example paradigms, allows users to explore a semi-structured RDF database in order to gain knowledge and answer specific questions about the domain through navigation, search, and other exploration mechanisms. We also developed a facet specification model and a mechanism for automatic facet generation that can be used to build faceted navigation systems over RDF data. The final product of this work is the Explorator tool, which we propose as an environment for exploring Semantic Web data.
20

PESCE, MARCIA LUCAS. "RDXEL: A TOOLKIT FOR RDF STATISTICAL DATA MANIPULATION THROUGH SPREADSHEETS". PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO, 2012. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=26259@1.

Abstract:
PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO
COORDENAÇÃO DE APERFEIÇOAMENTO DO PESSOAL DE ENSINO SUPERIOR
PROGRAMA DE EXCELENCIA ACADEMICA
Statistical data are one of the most important sources of information for human activities and organizations. However, accessing, querying, and correlating this kind of data demands a great deal of effort, especially in situations that involve different organizations. Solutions that facilitate access to and integration of large analytical databases therefore add considerable value in this scenario. In this dissertation we propose a software framework that allows statistical data to be efficiently transformed and represented as RDF triples. Based on the DataCube Vocabulary, the W3C standard used for the triplification process, the proposed solution makes it easy to query, analyze, and reuse statistical data in RDF format. The reverse process, from RDF to Excel, is also supported, so as to offer a solution for the integration and consumption of RDF data in spreadsheets.
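The spreadsheet-to-RDF direction can be sketched as follows. The sketch reads CSV text rather than an Excel file and emits plain tuples rather than a serialized graph. The `http://example.org/` base IRI, the property naming scheme, and the function name are assumptions; only `qb:Observation` and `qb:dataSet` come from the RDF Data Cube vocabulary itself. It is not the toolkit described in the dissertation.

```python
import csv
import io

QB = "http://purl.org/linked-data/cube#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/"  # hypothetical namespace for this sketch


def rows_to_observations(csv_text, dataset_id, measure_column):
    """Turn each CSV row into one qb:Observation described by triples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    dataset = f"{BASE}dataset/{dataset_id}"
    triples = []
    for n, row in enumerate(reader, start=1):
        obs = f"{dataset}/obs/{n}"
        triples.append((obs, RDF_TYPE, QB + "Observation"))
        triples.append((obs, QB + "dataSet", dataset))
        for column, value in row.items():
            # One column holds the measured value; the rest are dimensions.
            kind = "measure" if column == measure_column else "dimension"
            triples.append((obs, f"{BASE}{kind}/{column}", value))
    return triples
```

A complete conversion would also describe the data structure definition (the list of dimensions and measures) and type the literal values, both of which are left out here to keep the row-to-observation step visible.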
21

Li, Sheng. "Query Suggestion for Keyword Search over XML and RDF Data". Thesis, Griffith University, 2014. http://hdl.handle.net/10072/367139.

Abstract:
With the growing amount of XML and RDF data, keyword search over XML and RDF data has become an important and increasingly researched topic. In this thesis, we investigate several problems related to keyword search over XML and RDF data and provide solutions to them. The research consists of three main technical contributions: top-k nearest keyword (NK) search for XML data, query suggestion for XML data, and query suggestion for RDF data. We first study the top-k NK search problem for XML data, which provides an approach to exploring XML queries by the distance between keyword-matching nodes. A top-k NK query finds the top-k nearest neighbors of a given node, where each neighbor matches a certain keyword, and it can serve as a building block for many problems in XML data, such as keyword search. In our research, we design a method for constructing an extended compact TVP (ecTVP) index to efficiently find the top-k nearest neighbors in XML data. We build the index by constructing a variant of the Extended Compact Tree, which finds the top-k nearest neighbors during a bottom-up and a top-down process. Theoretical analysis and experiments indicate that our proposed method builds the ecTVP index efficiently. Moreover, we reduce the redundancy in the ecTVP index, so that the index requires less space and query time.
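To show what a top-k nearest keyword query asks for, the sketch below answers it with a plain breadth-first search over the document tree. This is a naive baseline written for illustration, not the ecTVP index from the thesis; it assumes the tree is given as an undirected adjacency mapping and that each node carries a set of keywords.

```python
from collections import deque


def top_k_nearest(adjacency, labels, start, keyword, k):
    """Return up to k (node, distance) pairs nearest to `start`.

    Only nodes whose label set contains `keyword` are reported, and the
    start node itself is never reported as its own neighbor.
    """
    seen = {start}
    queue = deque([(start, 0)])
    hits = []
    while queue and len(hits) < k:
        node, dist = queue.popleft()
        if node != start and keyword in labels.get(node, ()):
            hits.append((node, dist))
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return hits
```

Each query here may visit the whole tree, which is the cost a precomputed index is meant to avoid.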
Thesis (PhD Doctorate)
Doctor of Philosophy (PhD)
School of Information and Communication Technology
Science, Environment, Engineering and Technology
22

Huang, Xin. "Querying big RDF data : semantic heterogeneity and rule-based inconsistency". Thesis, Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB124/document.

Abstract:
The Semantic Web is the vision of the next generation of the Web proposed by Tim Berners-Lee in 2001. With the rapid development of Semantic Web technologies, large-scale RDF data already exist as Linked Open Data, and their volume is growing rapidly. Traditional Semantic Web querying and reasoning tools are designed to run in a centralized, stand-alone environment. Therefore, processing large-scale data with traditional solutions will inevitably run into memory and computational performance bottlenecks. Large volumes of heterogeneous data are collected from different data sources by different organizations. These sources often contain inconsistencies and uncertainties, whose detection and resolution become even more difficult in a big data setting. This thesis presents approaches and algorithms for better exploiting data in the context of big data and the Semantic Web. First, we developed an inference-based semantic entity resolution approach and a linking mechanism for cases in which the same entity is provided in multiple RDF resources described using different semantics and URI identifiers. We also developed a MapReduce-based SPARQL query rewriting engine over big RDF data that handles the implicit data described intentionally by inference rules during query evaluation. The rewriting approach also deals with transitive closure and cyclic rules in order to support richer rule languages such as RDFS and OWL. Several optimizations are proposed to improve the efficiency of the algorithms by reducing the number of MapReduce jobs. The second contribution concerns distributed inconsistency processing. We extend the approach presented in the first contribution by taking inconsistency in the data into account. This includes (1) rule-based inconsistency detection with the help of our query rewriting engine, and (2) consistent query evaluation under three different semantics. The third contribution concerns reasoning and querying over large-scale uncertain RDF data. We propose a MapReduce-based approach to large-scale reasoning with uncertainty. Unlike possible-worlds semantics, we propose an algorithm that generates an intensional SPARQL query plan over a probabilistic RDF graph to compute the probabilities of the query results.
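The general idea of answering a query over implicit data by rewriting it, rather than by materializing inferred triples, can be sketched for the simplest case: a class-membership query expanded over the transitive closure of `rdfs:subClassOf`. This single-machine sketch is an assumption-laden illustration and does not reproduce the thesis's MapReduce engine; the function names are hypothetical. The closure computation tolerates cyclic subclass declarations.

```python
def subclasses_of(cls, sub_class_of):
    """All classes that are transitively subclasses of `cls`, plus `cls`.

    `sub_class_of` is an iterable of (child, parent) pairs. The visited
    set makes the traversal terminate even on cyclic declarations.
    """
    pairs = list(sub_class_of)
    result, stack = {cls}, [cls]
    while stack:
        current = stack.pop()
        for child, parent in pairs:
            if parent == current and child not in result:
                result.add(child)
                stack.append(child)
    return result


def rewrite_type_query(var, cls, sub_class_of):
    """Expand `var a <cls>` into a UNION over cls and all its subclasses."""
    classes = sorted(subclasses_of(cls, sub_class_of))
    branches = [f"{{ {var} a <{c}> }}" for c in classes]
    return f"SELECT {var} WHERE {{ " + " UNION ".join(branches) + " }"
```

The rewritten query can then be run against the explicit triples only, which trades a larger query for not having to store inferred facts.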
23

Huang, Xin. "Querying big RDF data : semantic heterogeneity and rule-based inconsistency". Electronic Thesis or Diss., Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCB124.

24

Fernandez, Garcia Javier David, Sabrina Kirrane, Axel Polleres and Simon Steyskal. "HDT crypt: Compression and Encryption of RDF Datasets". IOS Press, 2018. http://epub.wu.ac.at/6489/1/HDTCrypt%2DSWJ.pdf.

Abstract:
The publication and interchange of RDF datasets online has experienced significant growth in recent years, promoted by different but complementary efforts, such as Linked Open Data, the Web of Things, and RDF stream processing systems. However, the current Linked Data infrastructure does not cater for the storage and exchange of sensitive or private data. On the one hand, data publishers need means to limit access to confidential data (e.g., health, financial, personal, or other sensitive data). On the other hand, the infrastructure needs to compress RDF graphs in a manner that minimises the amount of data that is both stored and transferred over the wire. In this paper, we demonstrate how HDT, a compressed serialization format for RDF, can be extended to support encryption. We propose a number of different graph partitioning strategies and discuss the benefits and tradeoffs of each approach.
25

Hazuza, Petr. "Ontologie přístupnosti budov". Master's thesis, Vysoká škola ekonomická v Praze, 2014. http://www.nusl.cz/ntk/nusl-202106.

Abstract:
Within the project Maps without Barriers, realized under the Charta 77 Foundation - Barriers Account, we intend in 2015 to map the accessibility of buildings and their premises from the perspective of people with limited mobility. We plan to inspect nearly 600 castles, palaces, and other tourist attractions in the Czech Republic. The acquired data will be gathered and published as an online map in the form of open, machine-readable data. It will also be published as Linked Open Data. However, the project will not end with mapping premises; the main objective is to provide a solid foundation for a unified database of the accessibility of buildings and their premises. Negotiations with institutions and organizations interested in mapping are in progress, and we are offering them our project platform for publishing their data. The required RDFS vocabulary will be designed and implemented as part of this diploma thesis. It will be tested on data from a number of forms describing existing objects. The data will be gathered by means of services designed as part of this thesis and made available to purchasers and users alike.
26

Abicht, Konrad, Georges Alkhouri, Natanael Arndt, Roy Meissner and Michael Martin. "CubeViz.js: A Lightweight Framework for Discovering and Visualizing RDF Data Cubes". Gesellschaft für Informatik, 2017. https://ul.qucosa.de/id/qucosa%3A32064.

Resumo:
In this paper we present CubeViz.js, the successor of CubeViz, as an approach for lightweight visualization and exploration of statistical data using the RDF Data Cube vocabulary. In several use cases in which we deployed CubeViz, such as the European Union's Open Data Portal, we were able to gather various requirements that eventually led to the decision to reimplement CubeViz as a JavaScript-only application. As part of this paper we showcase the major functionalities of CubeViz.js and its improvements over the prior version.
27

Lozano, Aparicio Jose Martin. "Data exchange from relational databases to RDF with target shape schemas". Thesis, Lille 1, 2020. http://www.theses.fr/2020LIL1I063.

Resumo:
Resource Description Framework (RDF) est un modèle de graphe utilisé pour publier des données sur le Web à partir de bases de données relationnelles. Nous étudions l'échange de données depuis des bases de données relationnelles vers des graphes RDF avec des schémas de formes cibles. Essentiellement, échange de données modélise un processus de transformation d'une instance d'un schéma relationnel, appelé schéma source, en un graphe RDF contraint par un schéma cible, selon un ensemble de règles, appelé tuple source-cible générant des dépendances. Le graphe RDF obtenu est appelé une solution. Étant donné que les dépendances générant des tuple définissent ce processus de manière déclarative, il peut y avoir de nombreuses solutions possibles ou aucune solution du tout. Nous étudions le système d'échange de données relationnel avec RDF constructive avec des schémas de formes cibles, qui est composé d'un schéma source relationnel, un schéma de formes pour le schéma cible, un ensemble de mappages utilisant des constructeurs IRI. De plus, nous supposons que deux constructeurs IRI ne se chevauchent pas. Nous proposons un langage visuel pour la spécification des correspondances (VML) qui aide les utilisateurs non experts à spécifier des mappages dans ce système. De plus, nous développons un outil appelé ShERML qui effectue l'échange de données avec l'utilisation de VML et pour les utilisateurs qui souhaitent comprendre le modèle derrière les mappages VML, nous définissons R2VML, un langage texte, qui capture VML et présente une syntaxe succincte pour définition des mappages.Nous étudions le problème de la vérification de la consistance: un système d'échange de données est consistent si pour chaque instance de source d'entrée, il existe au moins une solution. 
Nous montrons que le problème de consistance est coNP-complet et fournissons un algorithme d'analyse statique du système qui permet de décider si le système est consistent ou non.Nous étudions le problème du calcul de réponses certaines. Une réponse est certaine si la réponse tient dans chaque solution. En générale, réponses certaines sont calculées en utilisant d'une solution universelle. Cependant, dans notre contexte, une solution universelle pourrait ne pas exister. Ainsi, nous introduisons la notion de solution de simulation universelle, qui existe toujours et permet de calculer certaines réponses à n'importe quelle classe de requêtes robustes sous simulation. Une de ces classes sont les expressions régulières imbriquées (NRE) qui sont forward c'est-à-dire qui n'utilisent pas l’opération inverse. L'utilisation d'une solution de simulation universelle rend traitable le calcul de réponses certaines pour les NRE (data-complexity).Enfin, nous étudions le problème d'extraction de schéma des formes qui consiste à construire un schéma de formes cibles à partir d'un système constructif d'échange de données relationnel vers RDF sans le schéma de formes cibles. Nous identifions deux propriétés souhaitables d'un bon schéma cible, qui sont la correction c'est-à-dire que chaque graphe RDF produit est accepté par le schéma cible; et la complétude c'est-à-dire que chaque graphe RDF accepté par le schéma cible peut être produit. Nous proposons un algorithme d'extraction qui convient à tout système d'échange de données sans schéma, mais qui est également complet pour une grande classe pratique de systèmes sans schéma
Resource Description Framework (RDF) is a graph data model that has recently been used to publish data from relational databases on the Web. We investigate data exchange from relational databases to RDF graphs with target shapes schemas. Essentially, data exchange models the process of transforming an instance of a relational schema, called the source schema, into an RDF graph constrained by a target schema, according to a set of rules called source-to-target tuple-generating dependencies. The output RDF graph is called a solution. Because the tuple-generating dependencies define this process declaratively, there may be many possible solutions or none at all. We study the constructive relational-to-RDF data exchange setting with target shapes schemas, which consists of a relational source schema, a shapes schema as the target schema, and a set of mappings that use IRI constructors. Furthermore, we assume that any two IRI constructors are non-overlapping. We propose a visual mapping language (VML) that helps non-expert users specify mappings in this setting. Moreover, we develop a tool called ShERML that performs data exchange using VML, and, for users who want to understand the model behind VML mappings, we define R2VML, a text-based mapping language that captures VML and offers a succinct syntax for defining mappings. We investigate the problem of checking consistency: a data exchange setting is consistent if, for every input source instance, there is at least one solution. We show that the consistency problem is coNP-complete and provide a static analysis algorithm that decides whether the setting is consistent. We study the problem of computing certain answers. An answer is certain if it holds in every solution. Typically, certain answers are computed using a universal solution. However, in our setting a universal solution might not exist. 
Thus, we introduce the notion of a universal simulation solution, which always exists and allows certain answers to be computed for any class of queries that is robust under simulation. One such class is forward nested regular expressions (NREs), i.e., NREs that do not use the inverse operation. Using a universal simulation solution makes the computation of certain answers to forward NREs tractable (in data complexity). Finally, we investigate the shapes schema elicitation problem, which consists of constructing a target shapes schema from a constructive relational-to-RDF data exchange setting that lacks one. We identify two desirable properties of a good target schema: soundness, i.e., every produced RDF graph is accepted by the target schema; and completeness, i.e., every RDF graph accepted by the target schema can be produced. We propose an elicitation algorithm that is sound for any schema-less data exchange setting and complete for a large practical class of schema-less settings.
28

Chartrand, Tim. "Ontology-based extraction of RDF data from the World Wide Web". Diss., 2003. http://contentdm.lib.byu.edu/ETD/image/etd168.pdf.

29

Pérez, de Laborda Schwankhart Cristian. "Incorporating relational data into the Semantic Web". [S.l.] : [s.n.], 2006. http://deposit.ddb.de/cgi-bin/dokserv?idn=982420390.

30

Domin, Annika. "Konzeption eines RDF-Vokabulars für die Darstellung von COUNTER-Nutzungsstatistiken". Master's thesis, Universitätsbibliothek Leipzig, 2015. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-179416.

Resumo:
This master's thesis documents the creation of an RDF-based vocabulary for representing usage statistics of electronic resources compiled according to the COUNTER standard. The concrete application of this vocabulary is the Electronic Resource Management System (ERMS) currently being developed by the Universitätsbibliothek Leipzig as part of the cooperative project AMSL. The system is based on Linked Data, is intended to reflect the changed management processes for electronic resources, and is meant to be vendor-independent and flexible. The COUNTER vocabulary is, however, also intended to be usable beyond this application. The thesis is divided into two parts, foundations and modelling. The first part begins by establishing why libraries need ERM systems and then narrows the focus to usage statistics and the COUNTER standardisation. It then covers the technical foundations of the modelling, so that the thesis is also accessible to readers unfamiliar with Linked Data. The modelling part follows, beginning with a requirements analysis and an analysis of the XML schema underlying the COUNTER files. Next comes the modelling of the vocabulary using RDFS and OWL. Building on considerations about transferring XML statistics to RDF and about assigning URIs, real sample files are then converted manually and successfully checked in a short test. The thesis closes with a conclusion and an outlook on how the results will be taken forward. The resulting RDF vocabulary is available for reuse on GitHub at the following URL: https://github.com/a-nnika/counter.vocab
31

Neto, Luis Eufrasio Teixeira. "Uma abordagem para publicação de visões RDF de dados relacionais". Universidade Federal do Ceará, 2014. http://www.teses.ufc.br/tde_busca/arquivo.php?codArquivo=12676.

Resumo:
não há
A iniciativa Linked Data trouxe novas oportunidades para a construção da nova geração de aplicações Web. Entretanto, a utilização das melhores práticas estabelecidas por este padrão depende de mecanismos que facilitem a transformação dos dados armazenados em bancos relacionais em triplas RDF. Recentemente, o grupo de trabalho W3C RDB2RDF propôs uma linguagem de mapeamento padrão, denominada R2RML, para especificar mapeamentos customizados entre esquemas relacionais e vocabulários RDF. No entanto, a geração de mapeamentos R2RML não é uma tarefa fácil. É imperativo, então, definir: (a) uma solução para mapear os conceitos de um esquema relacional em termos de um esquema RDF; (b) um processo que suporte a publicação dos dados relacionais no formato RDF; e (c) uma ferramenta para facilitar a aplicação deste processo. Assertivas de correspondência são propostas para formalizar mapeamentos entre esquemas relacionais e esquemas RDF. Visões são usadas para publicar dados de uma base de dados em uma nova estrutura ou esquema. A definição de visões RDF sobre dados relacionais permite que esses dados possam ser disponibilizados em uma estrutura de termos de uma ontologia OWL, sem que seja necessário alterar o esquema da base de dados. Neste trabalho, propomos uma arquitetura em três camadas – de dados, de visões SQL e de visões RDF – onde a camada de visões SQL mapeia os conceitos da camada de dados nos termos da camada de visões RDF. A criação desta camada intermediária de visões facilita a geração dos mapeamentos R2RML e evita que alterações na camada de dados impliquem em alterações destes mapeamentos. Adicionalmente, definimos um processo em três etapas para geração das visões RDF. Na primeira etapa, o usuário define o esquema do banco de dados relacional e a ontologia OWL alvo e cria assertivas de correspondência que mapeiam os conceitos do esquema relacional nos termos da ontologia alvo. A partir destas assertivas, uma ontologia exportada é gerada automaticamente. 
O segundo passo produz um esquema de visões SQL gerado a partir da ontologia exportada e um mapeamento R2RML do esquema de visões para a ontologia exportada, de forma automatizada. Por fim, no terceiro passo, as visões RDF são publicadas em um SPARQL endpoint. Neste trabalho são detalhados as assertivas de correspondência, a arquitetura, o processo, os algoritmos necessários, uma ferramenta que suporta o processo e um estudo de caso para validação dos resultados obtidos.
The Linked Data initiative brought new opportunities for building the next generation of Web applications. However, the full potential of linked data depends on how easy it is to transform data stored in conventional, relational databases into RDF triples. Recently, the W3C RDB2RDF Working Group proposed a standard mapping language, called R2RML, to specify customized mappings between relational schemas and target RDF vocabularies. However, generating customized R2RML mappings is not an easy task. It is therefore necessary to define: (a) a solution that maps concepts from a relational schema to terms from an RDF schema; (b) a process to support the publication of relational data as RDF; and (c) a tool that implements this process. Correspondence assertions are proposed to formalize the mappings between relational schemas and RDF schemas. Views are created to publish data from a database in a new structure or schema. Defining RDF views over relational data makes it possible to provide this data in terms of an OWL ontology structure without having to change the database schema. In this work, we propose a three-tier architecture (database, SQL views and RDF views) in which the SQL views layer maps the database concepts into RDF terms. This intermediate layer facilitates the generation of R2RML mappings and prevents changes in the data layer from forcing changes to the R2RML mappings. Additionally, we define a three-step process to generate the RDF views of relational data. In the first step, the user defines the schema of the relational database and the target OWL ontology, and then defines correspondence assertions that formally specify the relational database in terms of the target ontology. Using these assertions, an exported ontology is generated automatically. The second step produces the SQL views that perform the mapping defined by the assertions and an R2RML mapping between these views and the exported ontology. Finally, in the third step, the RDF views are published in a SPARQL endpoint. 
This dissertation describes a formalization of the correspondence assertions, the three-tier architecture, the publishing process steps, the algorithms needed, a tool that supports the entire process and a case study to validate the results obtained.
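As a rough illustration of the three tiers described in this abstract (not the thesis's tool, and not R2RML itself), the following Python sketch uses an in-memory SQLite database: a base table, an SQL view that renames columns toward the target vocabulary, and a function that turns each view row into triples using an IRI template. The table, view, IRI template and predicate names are all invented for the example.

```python
import sqlite3

# Data layer and SQL view layer (all names here are hypothetical).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
    CREATE TABLE emp (id INTEGER PRIMARY KEY, nm TEXT, dept TEXT);
    INSERT INTO emp VALUES (1, 'Ana', 'R&D');
    INSERT INTO emp VALUES (2, 'Rui', NULL);
    CREATE VIEW person_view AS
        SELECT id, nm AS name, dept AS department FROM emp;
""")


def row_to_triples(row, subject_template, rdf_class, column_to_predicate):
    """Turn one view row into (subject, predicate, object) triples."""
    subject = subject_template.format(**row)
    triples = [(subject, "rdf:type", rdf_class)]
    for column, predicate in column_to_predicate.items():
        if row[column] is not None:  # a NULL column yields no triple here
            triples.append((subject, predicate, row[column]))
    return triples


def publish(connection):
    """RDF view layer: map every row of the SQL view to triples."""
    triples = []
    for row in connection.execute("SELECT * FROM person_view ORDER BY id"):
        triples += row_to_triples(
            dict(row),
            "http://example.org/person/{id}",  # assumed IRI template
            "foaf:Person",
            {"name": "foaf:name", "department": "ex:department"},
        )
    return triples


triples = publish(conn)
```

Because the mapping targets the view rather than the base table, renaming a column such as `emp.nm` would only require changing the view definition, which is the kind of decoupling the abstract attributes to the intermediate layer.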
32

Nohejl, Petr. "Transformace a publikace otevřených a propojitelných dat". Master's thesis, Vysoká škola ekonomická v Praze, 2013. http://www.nusl.cz/ntk/nusl-198076.

Resumo:
The principles of Open Data and Linked Data are of growing interest to many organizations, developers and even government institutions. This work aims to provide up-to-date information about the development of Open and Linked Data. It also introduces selected tools for creating, manipulating, transforming and otherwise working with Open and Linked Data. Finally, it describes the development of a Linked Data application based on the universal visualization system Payola.
33

Salas, Percy Enrique Rivera. "OLAP2DataCube: An On-Demand Transformation Framework from OLAP to RDF Data Cubes". Pontifícia Universidade Católica do Rio de Janeiro, 2015. http://www.maxwell.vrac.puc-rio.br/Busca_etds.php?strSecao=resultado&nrSeq=26120@1.

Resumo:
PONTIFÍCIA UNIVERSIDADE CATÓLICA DO RIO DE JANEIRO
COORDENAÇÃO DE APERFEIÇOAMENTO DO PESSOAL DE ENSINO SUPERIOR
CONSELHO NACIONAL DE DESENVOLVIMENTO CIENTÍFICO E TECNOLÓGICO
PROGRAMA DE SUPORTE À PÓS-GRADUAÇÃO DE INSTS. DE ENSINO
PROGRAMA DE EXCELENCIA ACADEMICA
Dados estatísticos são uma das mais importantes fontes de informações, relevantes para um grande número de partes interessadas nos domínios governamentais, científicos e de negócios. Um conjunto de dados estatísticos compreende uma coleção de observações feitas em alguns pontos através de um espaço lógico e muitas vezes é organizado como cubos de dados. A definição adequada de cubos de dados, especialmente das suas dimensões, ajuda a processar as observações e, mais importante, ajuda a combinar observações de diferentes cubos de dados. Neste contexto, os princípios de Linked Data podem ser proveitosamente aplicados na definição de cubos de dados, no sentido de que os princípios oferecem uma estratégia para fornecer a semântica ausentes nas dimensões, incluindo os seus valores. Nesta tese, descrevemos o processo e a implementação de uma arquitetura de mediação, chamada OLAP2DataCube On Demand Framework, que ajuda a descrever e consumir dados estatísticos, expostos como triplas RDF, mas armazenados em bancos de dados relacionais. O Framework possui um catálogo de descrições de Linked Data Cubes, criado de acordo com os princípios de Linked Data. O catálogo tem uma descrição padronizada para cada cubo de dados armazenado em bancos de dados (relacionais) estatísticos conhecidos pelo Framework. O Framework oferece uma interface para navegar pelas descrições dos Linked Data Cubes e para exportar os cubos de dados como triplas RDF geradas por demanda a partir das fontes de dados subjacentes. Também discutimos a implementação de operações sofisticadas de busca de metadados, operações OLAP em cubo de dados, tais como slice e dice, e operações de mashup sofisticadas de cubo de dados que criam novos cubos através da combinação de outros cubos.
Statistical data is one of the most important sources of information, relevant to a large number of stakeholders in the governmental, scientific and business domains alike. A statistical data set comprises a collection of observations made at some points across a logical space and is often organized as what is called a data cube. The proper definition of data cubes, especially of their dimensions, helps to process the observations and, more importantly, helps to combine observations from different data cubes. In this context, the Linked Data principles can be profitably applied to the definition of data cubes, in the sense that the principles offer a strategy for providing the missing semantics of the dimensions, including their values. In this thesis we describe the process and the implementation of a mediation architecture, called OLAP2DataCube On Demand, which helps describe and consume statistical data that is exposed as RDF triples but stored in relational databases. The tool features a catalogue of Linked Data Cube descriptions, created according to the Linked Data principles. The catalogue has a standardized description for each data cube stored in each statistical (relational) database known to the tool. The tool offers an interface to browse the Linked Data Cube descriptions and to export the data cubes as RDF triples, generated on demand from the underlying data sources. We also discuss the implementation of sophisticated metadata search operations, OLAP data cube operations such as slice and dice, and data cube mashup operations that create new cubes by combining other cubes.
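The slice and dice operations named in this abstract can be sketched on a toy cube held as a list of observations. This is only a minimal illustration in Python; the dimension names and values are invented, and the thesis's framework works over relational sources and RDF rather than in-memory dictionaries.

```python
# A toy data cube: each observation has dimension values and one measure.
# All dimension names and figures below are made up for illustration.
observations = [
    {"year": 2013, "region": "North", "sex": "F", "value": 10},
    {"year": 2013, "region": "South", "sex": "M", "value": 7},
    {"year": 2014, "region": "North", "sex": "M", "value": 12},
    {"year": 2014, "region": "South", "sex": "F", "value": 9},
]


def slice_cube(obs, dimension, value):
    """Slice: fix one dimension to a single value and drop that dimension."""
    return [
        {k: v for k, v in o.items() if k != dimension}
        for o in obs
        if o[dimension] == value
    ]


def dice_cube(obs, allowed):
    """Dice: keep observations whose dimensions fall in the allowed sets."""
    return [o for o in obs if all(o[d] in vals for d, vals in allowed.items())]


year_2013 = slice_cube(observations, "year", 2013)
north_f = dice_cube(observations, {"region": {"North"}, "sex": {"F", "M"}})
```

A slice lowers the dimensionality of the cube (here the `year` key disappears), whereas a dice returns a sub-cube with the same dimensions but restricted value ranges.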
34

Bongiovanni, Francesco. "Design, formalization and implementation of overlay networks : application to RDF data storage". Nice, 2012. http://www.theses.fr/2012NICE4021.

Resumo:
Les réseaux de recouvrement structurés sont une nouvelle classe de systèmes Pair-à-pair (P2P), qui sont utilisés pour des applications à grande échelle telles que le partage de fichiers, diffusion de l’information ; le stockage et la récupération des différentes ressources… Beaucoup de ces réseaux coexistent sur le Web mais ne coopèrent pas. Afin de promouvoir la coopération, nous proposons deux protocoles, Babelchord et Synapse, dont les objectifs sont de permettre l’interconnexion de réseaux de recouvrement structurés et hétérogènes grâce à des méta-protocoles. Babelchord vise à regrouper les petits réseaux de recouvrement structurés d’une manière non structurée , tandis que Synapse généralise ce concept et prévoit des mécanismes souples reposant sur des nœuds co-localisés, à savoir des nœuds qui appartiennent à plusieurs réseaux en même temps. Nous fournissons les algorithmes derrière ces deux protocoles, ainsi que les résultats des simulations montrant leurs comportements dans le contexte de recherche d’information. Nous avons également développé et expérimenté un prototype de JSynapse sur la plate-forme Grid’50000, confirmant les résultats de simulation obtenus. Une nouvelle génération de ces réseaux fut créée afin de stocker et de récupérer des données sémantiques dans des contextes à larges échelles. En effet, la communauté du Web sémantique a besoin de solutions capables de stocker et récupérer des données RDF, le modèle de données au centre du Web sémantique, passant à l’échelle. La première génération de ces systèmes est monolithique et fournit un support limité pour les requêtes expressives. Nous proposons la conception et l’implémentation d’un système modulaire basé sur du P2P afin de répondre à ces besoins. Nous l’avons construit avec RDF à l’esprit et avons utilisé une infrastructure à trois dimensions, reflétant la nature d’un triplet RDF. 
Nous avons également fait des choix de design qui permettent de préserver la localité des données mais qui soulèvent des challenges techniques intéressants. Notre conception modulaire réduit le couplage entre les composants formant l’infrastructure et peuvent donc être inter-changé avec d ‘autres. Nous avons expérimenté notre implémentation sur Grid’5000 et présentons les résultats de micro-benchmarks. Enfin, nous nous sommes intéressés de plus près aux performances de ces réseaux. En effet, ils ont une topologie géométrique spécifique qui peut être exploitée de manière à augmenter les performances des applications tournant au-dessus. A cet effet, nous proposons un algorithme de diffusion pour CAN qui est efficace en termes de messages échangés dans le réseau. Cet algorithme a été mis au point en réponse aux résultats trouvés au cours des expériences de notre infrastructure de stockage de données RDF. En parallèle de cet algorithme, nous proposons également un cadre de raisonnement, développé avec l’assistant de preuve Isabelle/HOL, afin de prouver des propriétés d’exactitudes des algorithmes de diffusion pour des réseaux à la CAN. Nous nous sommes concentrés, sur l’ensemble minimal d’abstractions nécessaires afin de concevoir des algorithmes de diffusion efficaces corrects par construction au-dessus de réseaux comme CAN
Structured Overlay Networks (SONs) are a new class of Peer-to-Peer (P2P) systems that are widely used for large-scale applications such as file sharing, information dissemination, and the storage and retrieval of different resources. Many different SONs coexist on the Web, yet they do not cooperate with each other. In order to promote cooperation, we propose two protocols, Babelchord and Synapse, whose goal is to enable the interconnection of structured and heterogeneous overlay networks through meta-protocols. Babelchord aims to aggregate small structured overlay networks in an unstructured fashion, while Synapse generalizes this concept and provides flexible mechanisms relying on co-located nodes, i.e., nodes that belong to multiple overlays at the same time. We provide the algorithms behind both protocols, as well as simulation results showing their behaviour in the context of information retrieval. We have also implemented and experimented with a prototype, JSynapse, on the Grid'5000 platform, confirming the simulation results and giving a proof of concept of our protocol. A new generation of SONs was created in order to store and retrieve semantic data in large-scale settings. The Semantic Web community needs scalable solutions that can store and retrieve RDF data, the core data model of the Semantic Web. The first generation of these systems is too monolithic and provides limited support for expressive queries. We propose the design and implementation of a new modular P2P-based system for these purposes. We built the system with RDF in mind and used a three-dimensional CAN overlay network, mirroring the structure of an RDF triple. We made specific design choices that preserve data locality but raise interesting technical challenges. Our modular design reduces the coupling between the underlying components, allowing them to be interchanged with others. We also ran micro-benchmarks on Grid'5000 and discuss the results. 
SONs have a specific geometrical topology that can be leveraged to increase the overall performance of the system. In this regard, we propose a new efficient broadcast algorithm for CAN, developed in response to the experiments on the RDF data store we built, in which broadcasting used too many messages. Alongside this algorithm, we also propose a reasoning framework, developed with the Isabelle/HOL proof assistant, for proving correctness properties of dissemination algorithms for CAN-like P2P systems. We focus on providing the minimal set of abstractions needed to devise efficient, correct-by-construction dissemination algorithms on top of such overlays.
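To make the idea of a three-dimensional overlay mirroring an RDF triple more concrete, here is a small Python sketch in which subject, predicate and object each give one coordinate in the unit cube, and each peer owns a box-shaped zone. The abstract does not say how terms are mapped to coordinates; the order-preserving (lexicographic) mapping below is an assumption chosen because it keeps lexically close terms close together, and the peer names and zone boundaries are hypothetical.

```python
def coordinate(term, width=8):
    """Map a string to [0, 1) so that lexicographic order is preserved
    (over the first `width` UTF-8 bytes). Assumed scheme, for illustration."""
    data = term.encode("utf-8")[:width]
    return sum(b / 256 ** (i + 1) for i, b in enumerate(data))


def triple_to_point(subject, predicate, obj):
    """One axis per triple component: (subject, predicate, object)."""
    return (coordinate(subject), coordinate(predicate), coordinate(obj))


def owner(point, zones):
    """Return the peer whose zone (a half-open box) contains the point."""
    for peer, (low, high) in zones.items():
        if all(lo <= x < hi for x, lo, hi in zip(point, low, high)):
            return peer
    return None


# Two hypothetical peers splitting the cube along the subject axis.
zones = {
    "peer-A": ((0.0, 0.0, 0.0), (0.42, 1.0, 1.0)),
    "peer-B": ((0.42, 0.0, 0.0), (1.0, 1.0, 1.0)),
}

first = owner(triple_to_point("alice", "knows", "zoe"), zones)
second = owner(triple_to_point("zoe", "knows", "alice"), zones)
```

With an order-preserving mapping, triples whose subjects share a prefix tend to land in the same zone, which is one way to read the data-locality remark in the abstract; a hash-based mapping would balance load better but would scatter such triples.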
35

Mixter, Jeffrey. "Linked Data in VRA Core 4.0: Converting VRA XML Records into RDF/XML". Kent State University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=kent1366642050.

36

Frommhold, Marvin, Sebastian Tramp, Natanael Arndt and Niklas Petersen. "Publish and subscribe for RDF in enterprise value networks". Universität Leipzig, 2016. https://ul.qucosa.de/id/qucosa%3A15776.

Resumo:
Sharing information securely between business partners and managing large supply chains efficiently will be a crucial competitive advantage for enterprises in the near future. In this paper, we present a concept that allows value networks to be built between business partners in a distributed manner. Companies can publish Linked Data that participants of the network can clone and subscribe to. Subscribers are notified as soon as new information becomes available. This provides a technical infrastructure for business communication acts such as supply chain communication or master data management. In addition to the conceptual analysis, we provide an implementation that enables companies to create such dynamic semantic value networks.
37

Ali, Liaquat [Verfasser], and Georg [Akademischer Betreuer] Lausen. "Efficient management and querying of RDF data in a P2P framework = Effiziente Verwaltung und Abfrage von RDF-Daten in einem P2P-Rahmen". Freiburg : Universität, 2014. http://d-nb.info/1123481075/34.

38

Albahli, Saleh Mohammad. "Ontology-based approaches to improve RDF Triple Store". Kent State University / OhioLINK, 2016. http://rave.ohiolink.edu/etdc/view?acc_num=kent1456063330.

39

Lesnikova, Tatiana. "Liage de données RDF : évaluation d'approches interlingues". Thesis, Université Grenoble Alpes (ComUE), 2016. http://www.theses.fr/2016GREAM011/document.

Resumo:
Le Web des données étend le Web en publiant des données structurées et liées en RDF. Un jeu de données RDF est un graphe orienté où les ressources peuvent être des sommets étiquetées dans des langues naturelles. Un des principaux défis est de découvrir les liens entre jeux de données RDF. Étant donnés deux jeux de données, cela consiste à trouver les ressources équivalentes et les lier avec des liens owl:sameAs. Ce problème est particulièrement difficile lorsque les ressources sont décrites dans différentes langues naturelles.Cette thèse étudie l'efficacité des ressources linguistiques pour le liage des données exprimées dans différentes langues. Chaque ressource RDF est représentée comme un document virtuel contenant les informations textuelles des sommets voisins. Les étiquettes des sommets voisins constituent le contexte d'une ressource. Une fois que les documents sont créés, ils sont projetés dans un même espace afin d'être comparés. Ceci peut être réalisé à l'aide de la traduction automatique ou de ressources lexicales multilingues. Une fois que les documents sont dans le même espace, des mesures de similarité sont appliquées afin de trouver les ressources identiques. La similarité entre les documents est prise pour la similarité entre les ressources RDF.Nous évaluons expérimentalement différentes méthodes pour lier les données RDF. En particulier, deux stratégies sont explorées: l'application de la traduction automatique et l'usage des banques de données terminologiques et lexicales multilingues. Dans l'ensemble, l'évaluation montre l'efficacité de ce type d'approches. Les méthodes ont été évaluées sur les ressources en anglais, chinois, français, et allemand. Les meilleurs résultats (F-mesure > 0.90) ont été obtenus par la traduction automatique. L'évaluation montre que la méthode basée sur la similarité peut être appliquée avec succès sur les ressources RDF indépendamment de leur type (entités nommées ou concepts de dictionnaires)
The Semantic Web extends the Web by publishing structured and interlinked data using RDF. An RDF data set is a graph in which resources are nodes labelled in natural languages. One of the key challenges of linked data is to discover links across RDF data sets. Given two data sets, equivalent resources should be identified and linked by owl:sameAs links. This problem is particularly difficult when resources are described in different natural languages. This thesis investigates the effectiveness of linguistic resources for interlinking RDF data sets. For this purpose, we introduce a general framework in which each RDF resource is represented as a virtual document containing the textual information of neighbouring nodes. The context of a resource consists of the labels of its neighbouring nodes. Once virtual documents are created, they are projected into the same space in order to be compared. This can be achieved by using machine translation or multilingual lexical resources. Once documents are in the same space, similarity measures are applied to find identical resources. Similarity between elements of this space is taken as similarity between RDF resources. We experimentally evaluate different cross-lingual methods for linking RDF data within the proposed framework. In particular, two strategies are explored: applying machine translation and using references to multilingual resources. Overall, the evaluation shows the effectiveness of cross-lingual string-based approaches for linking RDF resources expressed in different languages. The methods have been evaluated on resources in English, Chinese, French and German. The best performance (over 0.90 F-measure) was obtained by the machine translation approach. This shows that the similarity-based method can be successfully applied to RDF resources independently of their type (named entities or thesaurus concepts). 
The best experimental results involving just a pair of languages demonstrated the usefulness of such techniques for interlinking RDF resources cross-lingually.
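The pipeline this abstract describes (build a virtual document per resource, project documents into a common space, compare them, emit owl:sameAs links) can be sketched in a few lines of Python. This is a toy under stated assumptions, not the thesis's implementation: a tiny hand-written lexicon stands in for machine translation, the similarity measure is plain cosine over word counts, and the resources, labels and threshold are invented.

```python
import math
import re
from collections import Counter


def virtual_document(resource, labels, neighbours):
    """Bag of words from the resource's label and its neighbours' labels."""
    texts = [labels.get(resource, "")]
    texts += [labels.get(n, "") for n in neighbours.get(resource, [])]
    return Counter(re.findall(r"\w+", " ".join(texts).lower()))


def translate(doc, lexicon):
    """Toy stand-in for machine translation: word-by-word lookup."""
    out = Counter()
    for token, count in doc.items():
        out[lexicon.get(token, token)] += count
    return out


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def link(source_docs, target_docs, threshold=0.5):
    """Link each source resource to its most similar target, if similar enough."""
    links = []
    for s, sdoc in source_docs.items():
        best, score = None, 0.0
        for t, tdoc in target_docs.items():
            sim = cosine(sdoc, tdoc)
            if sim > score:
                best, score = t, sim
        if best is not None and score >= threshold:
            links.append((s, "owl:sameAs", best))
    return links


# Invented example data: one French resource, two English candidates.
fr_labels = {"fr:Paris": "Paris", "fr:France": "France", "fr:Capitale": "capitale"}
fr_neigh = {"fr:Paris": ["fr:France", "fr:Capitale"]}
en_labels = {"en:Paris": "Paris", "en:France": "France", "en:Capital": "capital",
             "en:Berlin": "Berlin", "en:Germany": "Germany"}
en_neigh = {"en:Paris": ["en:France", "en:Capital"],
            "en:Berlin": ["en:Germany", "en:Capital"]}
lexicon = {"capitale": "capital"}  # hypothetical French-to-English lexicon

source = {"fr:Paris": translate(virtual_document("fr:Paris", fr_labels, fr_neigh), lexicon)}
target = {r: virtual_document(r, en_labels, en_neigh) for r in ("en:Paris", "en:Berlin")}
links = link(source, target)
```

In this toy data the translated French document shares all three words with `en:Paris` and only one with `en:Berlin`, so only the first pair clears the threshold.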
40

Joshi, Amit Krishna. "Exploiting Alignments in Linked Data for Compression and Query Answering". Wright State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=wright1496142816700187.

41

Ianniello, Raffaele. "Linked Open Data per la pubblica amministrazione: conversione e utilizzo dei dati". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2013. http://amslaurea.unibo.it/5776/.

42

Garrido, García Camilo Fernando. "Resúmenes semiautomáticos de conocimiento : caso de RDF". Tesis, Universidad de Chile, 2013. http://www.repositorio.uchile.cl/handle/2250/113509.

Abstract:
Civil Engineer in Computing
Today, the amount of information generated in the world is immense. In science we have, for example, astronomical data with images of stars, weather forecast data, biological and genetic data, and so on. The phenomenon is not limited to science: a user browsing the Internet, for example, produces large amounts of information through forum comments, participation in social networks, or simply communication over the web. Managing and analyzing this volume of information brings major problems and costs. Therefore, before carrying out an analysis, it is useful to determine whether the data set at hand is suitable for the intended purpose or whether it covers the topics of interest. These questions could be answered if a summary of the data set were available. This gives rise to the problem this thesis addresses: creating semi-automatic summaries of formalized knowledge. In this thesis, a method for obtaining semi-automatic summaries of RDF data sets was designed and implemented. Given an RDF graph, one can obtain a set of nodes, whose size is determined by the user, that represents and conveys the most important topics within the complete set. The method was designed on the basis of the data sets provided by DBpedia. Resources within the data set were selected using two metrics widely used in other settings: betweenness centrality and degree. With them, the most important resources were detected both globally and locally. The tests carried out, which included user evaluation and automatic evaluation, indicated that the work meets the goal of producing summaries that convey and represent the data set.
The tests also showed that the summaries achieve a good balance of general topics, popular topics and the distribution with respect to the complete data set.
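As a rough illustration of metric-based node selection, the following Python sketch ranks the nodes of a toy RDF graph by degree and returns the top k as a "summary". It covers only the degree metric (betweenness centrality is omitted), and the triples, the function name and the tie-breaking rule are assumptions rather than the thesis's actual method.

```python
from collections import Counter

# Hypothetical toy triples (subject, predicate, object).
TRIPLES = [
    ("ex:Chile", "ex:capital", "ex:Santiago"),
    ("ex:Santiago", "ex:locatedIn", "ex:Chile"),
    ("ex:Neruda", "ex:bornIn", "ex:Chile"),
    ("ex:Mistral", "ex:bornIn", "ex:Chile"),
    ("ex:Neruda", "ex:occupation", "ex:Poet"),
    ("ex:Mistral", "ex:occupation", "ex:Poet"),
    ("ex:Parra", "ex:occupation", "ex:Poet"),
]


def degree_summary(triples, k):
    """Return the k nodes with the highest degree (in-degree plus out-degree)."""
    degree = Counter()
    for subject, _predicate, obj in triples:
        degree[subject] += 1
        degree[obj] += 1
    # Sort by descending degree; break ties alphabetically for determinism.
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return [node for node, _ in ranked[:k]]
```

Here the user-chosen size `k` plays the role of the summary size mentioned in the abstract.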
43

Teixeira, Neto Luis Eufrasio. "Uma abordagem para publicação de visões RDF de dados relacionais". Repositório Institucional da UFC, 2014. http://www.repositorio.ufc.br/handle/riufc/12694.

Abstract:
TEIXEIRA NETO, L. E. Uma abordagem para publicação de visões RDF de dados relacionais. 2014. 97 pp. Dissertation (Master's in Computer Science) - Centro de Ciências, Universidade Federal do Ceará, Fortaleza, 2014.
The Linked Data initiative brought new opportunities for building the next generation of Web applications. However, the full potential of linked data depends on how easy it is to transform data stored in conventional relational databases into RDF triples. Recently, the W3C RDB2RDF Working Group proposed a standard mapping language, called R2RML, to specify customized mappings between relational schemas and target RDF vocabularies. However, generating customized R2RML mappings is not an easy task. It is therefore necessary to define: (a) a solution that maps concepts from a relational schema to terms from an RDF schema; (b) a process that supports the publication of relational data as RDF; and (c) a tool that implements this process. Correspondence assertions are proposed to formalize the mappings between relational schemas and RDF schemas. Views are used to publish data from a database in a new structure or schema. Defining RDF views over relational data makes the data available in terms of an OWL ontology without changing the database schema. In this work, we propose a three-tier architecture (database, SQL views and RDF views) in which the SQL views layer maps the database concepts into RDF terms. This intermediate layer facilitates the generation of R2RML mappings and prevents changes in the data layer from forcing changes to the R2RML mappings. Additionally, we define a three-step process to generate the RDF views of relational data. In the first step, the user defines the schema of the relational database and the target OWL ontology, and then defines correspondence assertions that formally specify the relational database in terms of the target ontology. Using these assertions, an exported ontology is generated automatically. The second step automatically produces the SQL views that perform the mapping defined by the assertions and an R2RML mapping between these views and the exported ontology. Finally, in the third step, the RDF views are published at a SPARQL endpoint.
This dissertation describes a formalization of the correspondence assertions, the three-tier architecture, the publishing process steps, the algorithms needed, a tool that supports the entire process and a case study to validate the results obtained.
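The three-tier idea (database, SQL views, RDF views) can be sketched in Python with the standard `sqlite3` module. This is only an illustration under stated assumptions: the `emp` table, the `person_view` view and the `http://example.org/` namespace are hypothetical, and the triple generation is hand-written here rather than driven by an R2RML processor as in the dissertation.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace for generated subjects
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"


def publish_rdf_view():
    """Build a toy database, expose it through an SQL view, and emit N-Triples lines."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Tier 1: the relational schema (hypothetical).
        CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, full_name TEXT, dept_no INTEGER);
        INSERT INTO emp VALUES (1, 'Ana Lima', 10), (2, 'Rui Costa', 20);
        -- Tier 2: an SQL view that renames columns to match the target vocabulary.
        CREATE VIEW person_view AS
            SELECT emp_id AS id, full_name AS name FROM emp;
        """
    )
    triples = []
    # Tier 3: the RDF view reads only from the SQL view, never from the base table,
    # so changes to emp need only a new view definition, not a new mapping.
    for row_id, name in conn.execute("SELECT id, name FROM person_view ORDER BY id"):
        subject = f"<{BASE}person/{row_id}>"
        triples.append(f'{subject} <{FOAF_NAME}> "{name}" .')
    conn.close()
    return triples
```

Because the mapping code depends only on the columns `id` and `name` exposed by the view, renaming a column in `emp` would require redefining `person_view` but not the triple-generation step, which mirrors the motivation given for the intermediate layer.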
44

Escobar, Esteban María Pilar. "Un enfoque multidimensional basado en RDF para la publicación de Linked Open Data". Doctoral thesis, Universidad de Alicante, 2020. http://hdl.handle.net/10045/109950.

Abstract:
More and more data are publicly available on the Internet, and new knowledge bases known as Knowledge Graphs are emerging, based on Linked Open Data concepts, such as DBPedia, Wikidata, YAGO or Google Knowledge Graph, which cover a wide range of fields of knowledge. In addition, data coming from diverse sources such as smart devices or social networks are being incorporated. However, the fact that these data are public and accessible does not guarantee that they are useful to users; it is not always guaranteed that they are reliable or that they can be reused efficiently. Barriers to data reuse still exist today: formats poorly suited to automatic processing and publication of information, a lack of descriptive metadata and semantics, duplicates, ambiguity or even errors in the data themselves. To all these problems must be added the complexity of exploiting the information in a linked open data repository. The work and technical knowledge required to access, collect, normalize and prepare the data for reuse place an extra burden on the users and organizations that want to use them. To ensure that the data are exploited efficiently, it is essential to add value to them by establishing connections with other repositories that enrich them; to guarantee their value by evaluating and improving the quality of what is published; and to offer the mechanisms needed to facilitate their exploitation. This thesis proposes a model for publishing Linked Open Data that, starting from a set of data obtained from diverse sources, facilitates the publication, enrichment and validation of the data, generating useful, high-quality information aimed at expert and non-expert users.
45

Felix, Juan Manuel. "Esplorando i Linked Open Data con RSLT". Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020. http://amslaurea.unibo.it/20925/.

Abstract:
While it is true that the Linked Open Data accessible on the Web are by now very numerous, it is also true that making this information usable by a human audience is a particularly thorny issue. This work sets out to explore the possibilities of creating dynamic Web applications with the help of RSLT for visualizing Linked Open Data in RDF format.
46

Zamagni, Luca. "Analisi comparativa di DB a grafo in particolare RDF". Bachelor's thesis, Alma Mater Studiorum - Università di Bologna, 2020.

Abstract:
In the modern era of the World Wide Web, accessing information has become ever simpler thanks to the development of search engines that turn any user request on the Web into a procedure taking a few fractions of a second. The ease with which anyone can learn any kind of information, and thus gain access to the knowledge it carries, is made possible by the structure and organization of the data that reside in databases. These databases can store, archive at the physical and logical level, relate and share huge volumes of data with great ease and at high speed, but this is only possible thanks to the development, in recent decades, of increasingly efficient database models capable of relating extremely complex and unstructured data. Numerous database models exist, differing in how the data are represented and in the type of relationships among them: relational DBs, object DBs, NoSQL DBs and graph DBs. The analysis of the latter is the objective of this thesis, which aims to give a clear comparison of the individual parameters that characterize a graph DB, such as its cost and available licenses, performance, most distinctive use cases and query language used, so as to provide a more general classification of the main existing graph databases for those looking for the best one for their project. In particular, this thesis describes the main RDF databases, a specific type of graph DB that describes reality through structurally simple statements composed of two entities, subject and object, connected by a property: this structure is called a triple, and it represents the atomic unit of information in the RDF model.
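The triple structure described in this abstract can be made concrete with a small Python sketch. The `Triple` type, the sample data and the `match` helper are hypothetical names chosen for illustration; real RDF stores expose this kind of pattern matching through SPARQL rather than through a function like this one.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: two entities connected by a property."""
    subject: str
    predicate: str
    obj: str


# Hypothetical sample data.
STORE = [
    Triple("ex:Dante", "ex:bornIn", "ex:Florence"),
    Triple("ex:Dante", "ex:wrote", "ex:DivineComedy"),
    Triple("ex:Galileo", "ex:bornIn", "ex:Pisa"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every position that is not None."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]
```

For example, `match(STORE, predicate="ex:bornIn")` returns the two birthplace statements, treating the unspecified subject and object as wildcards.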
47

Gerber, Daniel. "Statistical Extraction of Multilingual Natural Language Patterns for RDF Predicates: Algorithms and Applications". Doctoral thesis, Universitätsbibliothek Leipzig, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-208759.

Abstract:
The Data Web has undergone a tremendous growth period. It currently consists of more than 3300 publicly available knowledge bases describing millions of resources from various domains, such as life sciences, government or geography, with over 89 billion facts. In the same way, the Document Web grew to the state where approximately 4.55 billion websites exist, and on average every day 300 million photos are uploaded to Facebook and 3.5 billion Google searches are performed. However, there is a gap between the Document Web and the Data Web, since for example knowledge bases available on the Data Web are most commonly extracted from structured or semi-structured sources, but the majority of information available on the Web is contained in unstructured sources such as news articles, blog posts, photos, forum discussions, etc. As a result, data on the Data Web not only misses a significant fragment of information but also suffers from a lack of timeliness, since typical extraction methods are time-consuming and can only be carried out periodically. Furthermore, provenance information is rarely taken into consideration and therefore gets lost in the transformation process. In addition, users are accustomed to entering keyword queries to satisfy their information needs. With the availability of machine-readable knowledge bases, lay users could be empowered to issue more specific questions and get more precise answers. In this thesis, we address the problem of Relation Extraction, one of the key challenges in closing the gap between the Document Web and the Data Web, by four means. First, we present a distant supervision approach that allows finding multilingual natural language representations of formal relations already contained in the Data Web. We use these natural language representations to find sentences on the Document Web that contain unseen instances of this relation between two entities.
Second, we address the problem of data timeliness by presenting a real-time data stream RDF extraction framework and use this framework to extract RDF from RSS news feeds. Third, we present a novel fact validation algorithm, based on natural language representations, that is able not only to verify or falsify a given triple, but also to find trustworthy sources for it on the Web and to estimate a time scope in which the triple holds true. The features used by this algorithm to determine whether a website is indeed trustworthy are used as provenance information and thereby help to create metadata for facts in the Data Web. Finally, we present a question answering system that uses the natural language representations to map natural language questions to formal SPARQL queries, allowing lay users to make use of the large amounts of data available on the Data Web to satisfy their information needs.
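The distant supervision step can be sketched in Python as follows. This is a deliberately simplified, monolingual illustration and not the thesis's algorithm: the facts, the corpus, the predicate name `ex:capitalOf` and both function names are assumptions, and the regular expressions only handle single-word entity labels.

```python
import re
from collections import Counter

# Hypothetical known facts (labels already resolved) and a toy text corpus.
KNOWN_FACTS = [
    ("Berlin", "ex:capitalOf", "Germany"),
    ("Paris", "ex:capitalOf", "France"),
]
CORPUS = [
    "Berlin is the capital of Germany and its largest city.",
    "Paris is the capital of France.",
    "Paris , located in France , hosts many museums.",
    "Rome is the capital of Italy.",
]


def extract_patterns(facts, corpus):
    """Count the text found between subject and object labels of known facts."""
    patterns = Counter()
    for subj, pred, obj in facts:
        regex = re.compile(re.escape(subj) + r"\s+(.+?)\s+" + re.escape(obj))
        for sentence in corpus:
            found = regex.search(sentence)
            if found:
                patterns[(pred, found.group(1))] += 1
    return patterns


def find_new_facts(pred, pattern, corpus, known):
    """Apply one textual pattern to the corpus and keep only unseen facts."""
    regex = re.compile(r"(\w+)\s+" + re.escape(pattern) + r"\s+(\w+)")
    candidates = []
    for sentence in corpus:
        found = regex.search(sentence)
        if found:
            fact = (found.group(1), pred, found.group(2))
            if fact not in known:
                candidates.append(fact)
    return candidates
```

On this toy corpus the most frequent pattern for `ex:capitalOf` is "is the capital of", and applying it surfaces the statement about Rome and Italy, which is not among the known facts. The less frequent ", located in" pattern shows why some frequency-based filtering is needed before patterns are trusted.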
48

Ansell, Peter. "A context sensitive model for querying linked scientific data". Thesis, Queensland University of Technology, 2011. https://eprints.qut.edu.au/49777/1/Peter_Ansell_Thesis.pdf.

Abstract:
This thesis provides a query model suitable for context sensitive access to a wide range of distributed linked datasets which are available to scientists using the Internet. The model is designed based on scientific research standards which require scientists to provide replicable methods in their publications. Although there are query models available that provide limited replicability, they do not contextualise the process whereby different scientists select dataset locations based on their trust and physical location. In different contexts, scientists need to perform different data cleaning actions, independent of the overall query, and the model was designed to accommodate this function. The query model was implemented as a prototype web application and its features were verified through its use as the engine behind a major scientific data access site, Bio2RDF.org. The prototype showed that it was possible to have context sensitive behaviour for each of the three mirrors of Bio2RDF.org using a single set of configuration settings. The prototype provided executable query provenance that could be attached to scientific publications to fulfil replicability requirements. The model was designed to make it simple to independently interpret and execute the query provenance documents using context specific profiles, without modifying the original provenance documents. Experiments using the prototype as the data access tool in workflow management systems confirmed that the design of the model made it possible to replicate results in different contexts with minimal additions, and no deletions, to query provenance documents.
49

Arndt, Natanael, and Norman Radtke. "Quit diff: calculating the delta between RDF datasets under version control". Universität Leipzig, 2016. https://ul.qucosa.de/id/qucosa%3A15780.

Abstract:
Distributed actors working on a common RDF dataset regularly need to compare the status of one graph with another or, more generally, to synchronize copies of a dataset. A versioning system helps to synchronize the copies of a dataset. Combined with a difference calculation system, it also makes it possible to compare versions in a log and to determine in which version a certain statement was introduced or removed. In this demo we present Quit Diff, a tool to compare versions of a Git versioned quad store, which is also applicable to simple unversioned RDF datasets. We follow an approach that abstracts from differences on a syntactical level to differences on the level of the RDF data model, while we leave further semantic interpretation on the schema and instance level to specialized applications. Quit Diff can generate patches in various output formats and can be directly integrated into the distributed version control system Git, which provides a foundation for a comprehensive co-evolution workflow on RDF datasets.
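A delta at the level of the RDF data model, as opposed to a line-based textual diff, can be illustrated with plain set operations in Python. This sketch is not Quit Diff's implementation: the function name and sample data are hypothetical, statements are modelled as plain tuples, and blank nodes (which need special handling in a real comparison) are ignored.

```python
def rdf_delta(old, new):
    """Return the statements added to and removed from a dataset.

    Statements are compared as whole tuples, so reordering or reformatting
    the serialized file does not show up as a change.
    """
    old_set, new_set = set(old), set(new)
    return {
        "added": sorted(new_set - old_set),
        "removed": sorted(old_set - new_set),
    }


# Hypothetical versions of the same graph; version 2 reorders one statement,
# drops one and introduces another.
VERSION_1 = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:age", '"30"'),
]
VERSION_2 = [
    ("ex:alice", "ex:age", '"31"'),
    ("ex:alice", "ex:knows", "ex:bob"),
]
```

Calling `rdf_delta(VERSION_1, VERSION_2)` reports only the changed age statement, even though the order of statements differs between the two versions.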
50

Perry, Matthew Steven. "A Framework to Support Spatial, Temporal and Thematic Analytics over Semantic Web Data". Wright State University / OhioLINK, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=wright1219267560.
