Theses on the topic "Graphes sémantiques"
Consult the top 50 theses for your research on the topic "Graphes sémantiques".
Auillans, Pascal. "Modélisation de réseaux sémantiques par des hypergraphes et applications". Bordeaux 1, 2005. http://www.theses.fr/2005BOR12966.
The goal of the Web evolutions planned by the W3C is to improve the quality of web services. To this end, the W3C has added to the Web architecture a knowledge management system named the Semantic Web, which implements a theoretical model relying on description logic. This thesis, whose research is more specifically applied to another knowledge representation system named Topic Maps, aims to provide an alternative to the use of description logic. We show how graph theory can be used to structure knowledge and hence benefit the field of knowledge representation. This thesis initially stood within the European project KePT, which aimed to implement a visualization interface for knowledge, structured according to the norm ISO 13250 Topic Maps, in Mondeca's ITM application. Research on graph clustering carried out for this project raised the need both for a better understanding of the topic map structure and for tools enabling the implementation of efficient processing. We therefore propose a formal model relying on graph theory that can express structural properties beyond the expressive power of first-order logic. Our model is suited not only to theoretical studies, but also to the adaptation of fast graph-theory algorithms to knowledge processing, processing that was previously hard to implement in industrial applications.
Hubert, Nicolas. "Mesure et enrichissement sémantiques des modèles à base d'embeddings pour la prédiction de liens dans les graphes de connaissances". Electronic Thesis or Diss., Université de Lorraine, 2024. http://www.theses.fr/2024LORR0059.
Knowledge graph embedding models (KGEMs) have gained considerable traction in recent years. These models learn a vector representation of knowledge graph entities and relations, a.k.a. knowledge graph embeddings (KGEs). This thesis specifically explores the advancement of KGEMs for the link prediction (LP) task, which is of utmost importance as it underpins several downstream applications such as recommender systems. In this thesis, various challenges around the use of KGEMs for LP are identified: the scarcity of semantically rich resources, the unidimensional nature of evaluation frameworks, and the lack of semantic considerations in prevailing machine learning-based approaches. Central to this thesis is the proposition of novel solutions to these challenges. Firstly, the thesis contributes to the development of semantically rich resources: mainstream datasets for link prediction are enriched using schema-based information, EducOnto and EduKG are proposed to overcome the paucity of resources in the educational domain, and PyGraft is introduced as an innovative open-source tool for generating synthetic ontologies and knowledge graphs. Secondly, the thesis proposes a new semantic-oriented evaluation metric, Sem@K, offering a multi-dimensional perspective on model performance. Importantly, popular models are reassessed using Sem@K, which reveals essential insights into their respective capabilities and highlights the need for multi-faceted evaluation frameworks. Thirdly, the thesis delves into the development of neuro-symbolic approaches, transcending traditional machine learning paradigms. These approaches not only demonstrate improved semantic awareness but also extend their utility to diverse applications such as recommender systems.
In summary, the present work not only redefines the evaluation and functionality of knowledge graph embedding models but also sets the stage for more versatile, interpretable AI systems, underpinning future explorations at the intersection of machine learning and symbolic reasoning.
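The intuition behind a semantic-oriented metric such as Sem@K can be illustrated with a minimal sketch (the exact definition of Sem@K is given in the thesis; the entities, relations and type assignments below are invented for illustration): for a query (head, relation, ?), it measures the fraction of the top-K ranked candidates whose type is compatible with the range expected by the relation.

```python
# Toy type assignments and relation ranges (invented for illustration).
ENTITY_TYPE = {
    "Paris": "City", "Lyon": "City", "France": "Country",
    "Einstein": "Person", "Curie": "Person",
}
RELATION_RANGE = {"capitalOf": "Country", "bornIn": "City"}

def sem_at_k(relation, ranked_tails, k):
    """Fraction of the top-k predicted tails whose type matches the
    range expected by the relation (a Sem@K-style score)."""
    expected = RELATION_RANGE[relation]
    valid = sum(1 for t in ranked_tails[:k] if ENTITY_TYPE.get(t) == expected)
    return valid / k

# A model ranking candidate tails for ("Paris", "capitalOf", ?):
ranking = ["France", "Lyon", "Einstein", "Curie"]
print(sem_at_k("capitalOf", ranking, 2))  # 0.5: only "France" is a Country
```

A classical Hits@2 would give the same 0.5 here, but Sem@K keeps rewarding a model that ranks type-compatible entities high even when they are not the gold answer.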
Puget, Dominique. "Aspects sémantiques dans les Systèmes de Recherche d'Informations". Toulouse 3, 1993. http://www.theses.fr/1993TOU30139.
Tailhardat, Lionel. "Anomaly detection using knowledge graphs and synergistic reasoning : application to network management and cyber security". Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUS293.
Incident management on telecom and computer networks, whether it is related to infrastructure or cybersecurity issues, requires the ability to simultaneously and quickly correlate and interpret a large number of heterogeneous technical information sources. In this thesis, we study the benefits of structuring this data into a knowledge graph, and examine how this structure helps to manage the complexity of networks, particularly for applications in anomaly detection on dynamic and large-scale networks. Through an ontology (a model of concepts and relationships to describe an application domain), knowledge graphs allow different-looking information to be given a common meaning. We first introduce a new ontology for describing network infrastructures, incidents, and operations. We also describe an architecture for transforming network data into a knowledge graph organized according to this ontology, using Semantic Web technologies to foster interoperability. The resulting knowledge graph allows for standardized analysis of network behavior. We then define three families of algorithmic techniques for using the graph data, and show how these techniques can be used to detect abnormal system behavior and assist technical support teams in incident diagnosis. Finally, we present a software architecture to facilitate the interactions of support teams with the knowledge graph and diagnostic algorithms through a specialized graphical user interface. Each proposal has been independently tested through experiments and demonstrations, as well as by a panel of expert users using the specialized graphical interface within an integrated solution.
Talon, Bénédicte. "Un système d'aide à l'acquisition de concepts nouveaux pour un outil d'analyse du langage naturel". Compiègne, 1991. http://www.theses.fr/1991COMPD378.
Gueffaz, Mahdi. "ScaleSem : model checking et web sémantique". Phd thesis, Université de Bourgogne, 2012. http://tel.archives-ouvertes.fr/tel-00801730.
Lajmi, Sonia. "Annotation et recherche contextuelle des documents multimédias socio-personnels". Phd thesis, INSA de Lyon, 2011. http://tel.archives-ouvertes.fr/tel-00668689.
Allani, Atig Olfa. "Une approche de recherche d'images basée sur la sémantique et les descripteurs visuels". Electronic Thesis or Diss., Paris 8, 2017. http://www.theses.fr/2017PA080032.
Image retrieval is a very active research area. Several image retrieval approaches that allow mapping between low-level features and high-level semantics have been proposed. Among these, one can cite object recognition, ontologies, and relevance feedback. However, their main limitation concerns their high dependence on reliable external resources and their lack of capacity to combine semantic and visual information. This thesis proposes a system based on a pattern graph combining semantic and visual features, relevant visual feature selection for image retrieval, and improved visualization of results. The idea is (1) to build a pattern graph composed of a modular ontology and a graph-based model, (2) to build visual feature collections to guide feature selection during the online retrieval phase, and (3) to improve the visualization of retrieval results by integrating semantic relations. During pattern graph building, the ontology modules associated with each domain are automatically built using textual corpora and external resources. The region graphs summarize the visual information in a condensed form and classify it given its semantics. The pattern graph is obtained by composing the modules. In building the visual feature collections, association rules are used to deduce best practices for the use of visual features in image retrieval. Finally, results visualization uses the rich information on images to improve the presentation of results. Our system has been tested on three image databases. The results show an improvement in the retrieval process, a better adaptation of the visual features to the domains, and a richer visualization of the results.
Dennemont, Yannick. "Une assistance à l'interaction 3D en réalité virtuelle par un raisonnement sémantique et une conscience du contexte". Thesis, Evry-Val d'Essonne, 2013. http://www.theses.fr/2013EVRY0010/document.
Tasks in immersive virtual environments are associated with 3D interaction techniques and devices (e.g. the selection of 3D objects with the virtual hand and a flystick). As environments and tasks become more and more complex, techniques cannot remain the same for each application, or even for every situation within a single application. A solution is to adapt the interaction to the situation in order to increase usability. These adaptations can be done manually by the designer or the user, or automatically by the system, thus creating an adaptive interaction. Formalising such assistance requires managing pertinent information about the situation; these items of information make the context emerge from the interaction. The adaptive assistance obtained by reasoning on this information is then context-aware, and numerous possibilities can be used to build one. Our objective is a context management that preserves a high degree of expressiveness and evolutivity while being easy to plug in. We have built a model for this issue using conceptual graphs based on an ontology, managed externally with a first-order logic engine. The engine is generic and uses a knowledge base of facts and rules which can be changed dynamically. We have added a notion of confidence in order to establish the similarity of a situation to the knowledge base. Reactions' confidences are compared to their impacts so as to keep only the pertinent ones while avoiding user overload. Applications have tools that can be controlled by the engine: sensors are used to extract semantic information for the context, and effectors are used to act upon the application and obtain adaptations. A tool set and a knowledge base have been created for 3D interaction. Numerous steps have been added in the knowledge base to obtain good combinations and a reasoning independent from specific tools.
Our first applications show the understanding of the situation, including user interests and difficulties, and the triggering of pertinent assistance. An off-line study illustrates the access to and evolution of the internal engine steps. The generic semantic reasoning we built is expressive, understandable, extensible and dynamically modifiable. For 3D interaction, it enables generic assistance for the user, which can be automatic, punctual or manual, as well as off-line activity or design analysis for the designers.
Mugnier, Marie-Laure. "Contributions algorithmiques pour les graphes d'héritage et les graphes conceptuels". Montpellier 2, 1992. http://www.theses.fr/1992MON20195.
Texto completoZneika, Mussab. "Interrogation du web sémantique à l'aide de résumés de graphes de données". Thesis, Cergy-Pontoise, 2019. http://www.theses.fr/2019CERG1010.
The amount of available RDF data is growing fast both in size and complexity, making RDF Knowledge Bases (KBs) with millions or even billions of triples commonplace; e.g. more than 1000 datasets are now published as part of the Linked Open Data (LOD) cloud, which contains more than 62 billion RDF triples, forming big and complex RDF data graphs. This explosion in the size, complexity and number of available RDF KBs and the emergence of Linked Datasets have made querying, exploring, visualizing, and understanding the data in these KBs difficult, both from a human perspective (when trying to visualize) and from a machine perspective (when trying to query or compute). To tackle this problem, we propose a method for summarizing large RDF KBs based on representing the RDF graph using the (best) top-k approximate RDF graph patterns. The method, named SemSum+, extracts the meaningful/descriptive information from RDF KBs and produces a succinct overview of them. It extracts from the RDF graph an RDF schema that describes the actual contents of the KB, which has several advantages even over an existing schema, which might be only partially used by the data in the KB. While computing the approximate RDF graph patterns, we also add information on the number of instances each pattern represents. So, when we query the RDF summary graph, we can easily identify whether the necessary information is present, and whether it is present in significant enough numbers to be included in a federated query result. The method we propose does not require the presence of the initial schema of the KB and works equally well when there is no schema information at all (a realistic situation with modern KBs that are constructed either ad hoc or by merging fragments of other existing KBs).
Additionally, the proposed method works equally well with homogeneous (having the same structure) and heterogeneous (having different structures, possibly resulting from data described under different schemas/ontologies) RDF graphs. Given that RDF graphs can be large and complex, methods that compute the summary by fitting the whole graph in the memory of a (however large) machine will not scale. To overcome this problem, we proposed, as part of this thesis, a parallel framework that provides a scalable parallel version of our method. This allows us to compute the summaries of any RDF graph regardless of its size. We actually generalized this framework so that it can be used by any approximate pattern mining algorithm that needs parallelization. Working on this problem introduced us to the issue of measuring the quality of the produced summaries. Given that various algorithms exist in the literature for summarizing RDF graphs, we need to understand which one is better suited to a specific task or a specific RDF KB. The literature lacks widely accepted evaluation criteria and extensive empirical evaluations, hence the need for a method to compare and evaluate the quality of the produced summaries. In this thesis, we therefore provide a comprehensive Quality Framework for RDF Graph Summarization to cover this gap. The framework allows a better, deeper and more complete understanding of the quality of the different summaries and facilitates their comparison. It is independent of the way RDF summarization algorithms work and makes no assumptions about the type or structure of either the input or the final results. We provide a set of metrics that help us understand not only whether a summary is valid, but also how it compares to another in terms of the specified quality characteristic(s).
The framework has the experimentally validated ability to capture subtle differences among summaries and to produce metrics that reflect them; it was used to provide an extensive experimental evaluation and comparison of our method.
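A heavily simplified sketch of the pattern-with-instance-counts idea (not the actual SemSum+ algorithm, which mines approximate graph patterns; the toy triples are invented): grouping subjects by the set of properties they carry yields candidate patterns, and counting the subjects behind each pattern gives the instance counts attached to the summary.

```python
from collections import Counter

# Toy RDF-like triples (subject, predicate, object); object values are dummies.
triples = [
    ("a1", "rdf:type", "Author"), ("a1", "name", "n1"), ("a1", "wrote", "b1"),
    ("a2", "rdf:type", "Author"), ("a2", "name", "n2"), ("a2", "wrote", "b2"),
    ("b1", "rdf:type", "Book"),   ("b1", "title", "t1"),
    ("b2", "rdf:type", "Book"),   ("b2", "title", "t2"),
]

def top_k_patterns(triples, k):
    """Group subjects by their property set and return the k most
    frequent property sets with the number of subjects they cover."""
    props = {}
    for s, p, _ in triples:
        props.setdefault(s, set()).add(p)
    counts = Counter(frozenset(ps) for ps in props.values())
    return counts.most_common(k)

# Two patterns, each covering two instances: "author-like" and "book-like".
summary = dict(top_k_patterns(triples, 2))
```

Querying such a summary instead of the full graph immediately tells a federated query planner whether a source holds, say, subjects with a `wrote` property, and roughly how many.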
Ribeyre, Corentin. "Méthodes d’analyse supervisée pour l’interface syntaxe-sémantique : de la réécriture de graphes à l’analyse par transitions". Sorbonne Paris Cité, 2016. http://www.theses.fr/2016USPCC119.
Nowadays, the amount of textual data has become so gigantic that it is no longer possible to deal with it manually; Natural Language Processing techniques are necessary to extract useful information from these data and understand their underlying meaning. In this thesis, we offer resources, models and methods to allow (i) the automatic annotation of deep syntactic corpora, to extract the argument structure that links (verbal) predicates to their arguments, and (ii) the use of these resources with the help of efficient methods. First, we develop a graph rewriting system and a set of manually designed rewriting rules to automatically annotate deep syntax in French. Thanks to this approach, two corpora were created: the DeepSequoia, a deep syntactic version of the Séquoia corpus, and the DeepFTB, a deep syntactic version of the dependency version of the French Treebank. Next, we extend two transition-based parsers and adapt them to deal with graph structures. We also develop a set of rich linguistic features extracted from various syntactic trees, which bring different kinds of topological information useful for accurately predicting predicate-argument structures. Used in an arc-factored second-order parsing model, this feature set gives the first state-of-the-art results on French and outperforms those established on the DM and PAS corpora for English. Finally, we briefly explore a method to automatically induce the transformation between a tree and a graph. This completes our set of coherent resources and models for automatically analyzing the syntax-semantics interface in French and English.
Pradel, Camille. "D'un langage de haut niveau à des requêtes graphes permettant d'interroger le web sémantique". Toulouse 3, 2013. http://thesesups.ups-tlse.fr/2237/.
Graph models are suitable candidates for knowledge representation on the Web, where everything is a graph: from the graph of machines connected to the Internet, the "Giant Global Graph" as described by Tim Berners-Lee, to RDF graphs and ontologies. In that context, the ontological query answering problem is the following: given a knowledge base composed of a terminological component and an assertional component, and a query, does the knowledge base entail the query, i.e., is there an answer to the query in the knowledge base? Recently, new description logic languages have been proposed in which ontological expressivity is restricted so that query answering becomes tractable; the most prominent members are the DL-Lite and EL families. In the same way, the OWL-DL language has been restricted, which led to OWL2, based on the DL-Lite and EL families. We work in the framework of graph formalisms for knowledge representation (RDF, RDF-S and OWL) and interrogation (SPARQL). Even if graph-based query languages have long been presented as a natural and intuitive way of expressing information needs, end-users do not think of their queries in terms of graphs. They need simple languages that are as close as possible to natural language, or at least mainly limited to keywords. We propose a generic way of translating a query expressed in a high-level language into the SPARQL query language, by means of query patterns. The beginning of this work coincides with the current activity of the W3C, which has launched an initiative to prepare a possible new version of RDF and is in the process of standardizing SPARQL 1.1 with entailments.
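The pattern-based translation can be sketched roughly as follows (a minimal illustration, not Pradel's actual system; the `ex:` vocabulary, pattern name and placeholder are invented): a query pattern is a parameterised SPARQL graph pattern, and the user's high-level input is mapped onto its placeholders to produce a concrete SPARQL query.

```python
# A query pattern: its placeholders plus a SPARQL template
# (doubled braces escape literal braces for str.format).
PATTERNS = {
    "works_of_author": (
        ["author"],
        'SELECT ?work WHERE {{ ?work ex:creator ?a . ?a ex:name "{author}" }}',
    ),
}

def build_sparql(pattern_name, **bindings):
    """Instantiate a query pattern with the user-provided bindings."""
    placeholders, template = PATTERNS[pattern_name]
    missing = [p for p in placeholders if p not in bindings]
    if missing:
        raise ValueError(f"unbound placeholders: {missing}")
    return template.format(**bindings)

query = build_sparql("works_of_author", author="Victor Hugo")
```

A real system would first select which pattern best matches the keywords or natural-language input; here that selection step is skipped.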
Lapalut, Stéphane. "Sémantique formelle et spécifications algébriques du raisonnement sur les graphes conceptuels simples et étendus". Nice, 1997. http://www.theses.fr/1997NICE5148.
Texto completoDib, Saker. "L'interrogation des bases de données relationnelles assistée par le graphe sémantique normalisé". Lyon 1, 1993. http://www.theses.fr/1993LYO10122.
Texto completoSaidouni, Djamel-Eddine. "Sémantique de maximalité : application au raffinement d'actions dans LOTOS". Toulouse 3, 1996. http://www.theses.fr/1996TOU30040.
Texto completoBuron, Maxime. "Raisonnement efficace sur des grands graphes hétérogènes". Thesis, Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAX061.
The Semantic Web offers knowledge representations which allow heterogeneous data from several sources to be integrated into a unified knowledge base. In this thesis, we investigate techniques for querying such knowledge bases. The first part is devoted to query answering techniques over a knowledge base represented by an RDF graph subject to ontological constraints. Implicit information entailed by reasoning, enabled by the set of RDFS entailment rules, has to be taken into account to correctly answer such queries. First, we present a sound and complete query reformulation algorithm for Basic Graph Pattern queries, which exploits a partition of the RDFS entailment rules into assertion and constraint rules. Second, we introduce a novel RDF storage layout that combines two well-known layouts. For both contributions, our experiments confirm our theoretical and algorithmic results. The second part considers the issue of querying heterogeneous data sources integrated into an RDF graph, using BGP queries. Following the Ontology-Based Data Access paradigm, we introduce a framework for data integration under an RDFS ontology, using Global-Local-As-View mappings, rarely considered in the literature. We present several query answering strategies, which may materialize the integrated RDF graph or leave it virtual, and which differ in how and when RDFS reasoning is handled. We implement these strategies in a platform in order to conduct experiments, which demonstrate the particular interest of one of the strategies based on mapping saturation. Finally, we show that mapping saturation can be extended to reasoning defined by a subset of existential rules.
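The reformulation idea can be sketched for the class-hierarchy fragment of RDFS (a toy illustration with an invented hierarchy; the thesis' algorithm covers the full set of RDFS entailment rules): a pattern asking for the instances of a class is rewritten into the union of the same pattern over the class and all its subclasses, so implicit answers are found without materializing the entailed graph.

```python
# Toy RDFS hierarchy: child class -> parent class.
SUBCLASS_OF = {"Novel": "Book", "Essay": "Book", "Book": "Document"}

def subclasses(cls):
    """cls together with all of its direct and indirect subclasses."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result

def reformulate(cls):
    """Union of type patterns equivalent, under RDFS subclass
    entailment, to the single pattern `?x rdf:type cls`."""
    return sorted(f"?x rdf:type {c}" for c in subclasses(cls))
```

Evaluating the union over the plain (unsaturated) data then returns the same answers as evaluating the original pattern over the saturated graph, which is the trade-off reformulation-based strategies exploit.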
Raad, Joe. "Gestion d'identité dans des graphes de connaissances". Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLA028/document.
In the absence of a central naming authority on the Web of data, it is common for different knowledge graphs to refer to the same thing by different names (IRIs). Whenever multiple names are used to denote the same thing, owl:sameAs statements are needed in order to link the data and foster reuse. Such identity statements have strict logical semantics, indicating that every property asserted of one name will also be inferred of the other, and vice versa. While such inferences can be extremely useful in enabling and enhancing knowledge-based systems such as search engines and recommendation systems, incorrect use of identity can have wide-ranging effects in a global knowledge space like the Web of data. With several studies showing that owl:sameAs is indeed misused for different reasons, a proper approach to the handling of identity links is required in order to make the Web of data succeed as an integrated knowledge space. This thesis investigates the identity problem at hand and provides different, yet complementary solutions. Firstly, it presents the largest dataset of identity statements gathered from the LOD Cloud to date, and a web service from which the data and its equivalence closure can be queried. Such a resource has both practical impact (it helps data users and providers find different names for the same entity) and analytical value (it reveals important aspects of the connectivity of the LOD Cloud). In addition, relying on this collection of 558 million identity statements, we show how network metrics such as the community structure of the owl:sameAs graph can be used to detect possibly erroneous identity assertions. For this, we assign an error degree to each owl:sameAs link based on the density of the community(ies) in which it occurs and on its symmetrical characteristics. One benefit of this approach is that it does not rely on any additional knowledge.
Finally, as a way to limit the excessive and incorrect use of owl:sameAs, we define a new relation for asserting the identity of two ontology instances in a specific context (a sub-ontology). This identity relation is accompanied by an approach for automatically detecting such links, with the ability to use certain expert constraints for filtering irrelevant contexts. As a first experiment, the detection and exploitation of the contextual identity links are conducted on two knowledge graphs for life sciences, constructed in a mutual effort with domain experts from the French National Institute of Agricultural Research (INRA).
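The density part of the error-degree idea can be sketched as follows (a simplification: the thesis' actual definition also weighs the symmetry of the statements, and the communities come from a community-detection step over the 558 million links): an owl:sameAs link supported by a dense community of mutually linked names is less suspicious than one inside a sparse, chain-like community.

```python
def density(nodes, edges):
    """Density of an undirected graph: |E| / (|V| * (|V| - 1) / 2)."""
    n = len(nodes)
    if n < 2:
        return 1.0
    return len(edges) / (n * (n - 1) / 2)

def error_degree(community_nodes, community_edges):
    """Higher when the community supporting the links is sparse."""
    return 1.0 - density(community_nodes, community_edges)

nodes = {"a", "b", "c", "d"}
# A 4-name community linked only in a chain vs. a fully connected one.
chain = [("a", "b"), ("b", "c"), ("c", "d")]
full = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d")]
print(error_degree(nodes, chain))  # 0.5 - more suspicious
print(error_degree(nodes, full))   # 0.0 - well supported
```

The appeal of this heuristic is exactly what the abstract notes: it needs no knowledge beyond the identity graph itself.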
Islam, Md Kamrul. "Explainable link prediction in large complex graphs - application to drug repurposing". Electronic Thesis or Diss., Université de Lorraine, 2022. http://www.theses.fr/2022LORR0203.
Many real-world complex systems can be well represented as graphs, where nodes represent objects or entities and links/relations represent interactions between pairs of nodes. Link prediction (LP) is one of the most interesting and long-standing problems in the field of graph mining; it predicts the probability of a link between two unconnected nodes based on the information available in the current graph. This thesis studies the LP problem in graphs. It consists of two parts: LP in simple graphs and LP in knowledge graphs (KGs). In the first part, the LP problem is defined as predicting the probability of a link between a pair of nodes in a simple graph. In the first study, several similarity-based and embedding-based LP approaches are evaluated and compared on simple graphs from various domains. The study also criticizes the traditional way of computing the precision metric for similarity-based approaches, since it requires tuning a threshold on the similarity score to decide link existence; we propose a new way of computing the precision metric. The results showed the expected superiority of embedding-based approaches; still, each of the similarity-based approaches is competitive on graphs with specific properties. We checked experimentally that similarity-based approaches are fully explainable but lack generalization due to their heuristic nature, whereas embedding-based approaches are general but not explainable. The second study tries to alleviate the unexplainability of embedding-based approaches by uncovering interesting connections between them and similarity-based approaches, to get an idea of what embedding-based approaches actually learn. The third study demonstrates how similarity-based approaches can be ensembled to design an explainable supervised LP approach.
Interestingly, the study shows high LP performance for the supervised approach across various graphs, competitive with embedding-based approaches. The second part of the thesis focuses on LP in KGs. A KG is represented as a collection of RDF triples (head, relation, tail), where the head and the tail are two entities connected by a specific relation. The LP problem in a KG is formulated as predicting the missing head or tail entity in a triple. LP approaches based on embeddings of the entities and relations of a KG have become very popular in recent years, and generating negative triples is an important task in KG embedding methods. The first study in this part discusses a new method, called SNS, to generate high-quality negative triples during the training of embedding methods for learning KG embeddings. Our results show better LP performance when SNS is injected into an embedding approach than when state-of-the-art negative triple sampling methods are injected. The second study discusses a new neuro-symbolic method of mining rules and an abduction strategy to explain LP by an embedding-based approach utilizing the learned rules. The third study applies explainable LP to a COVID-19 KG to develop a new drug repurposing approach for COVID-19. The approach learns "ensemble embeddings" of entities and relations in a COVID-19-centric KG, in order to get a better latent representation of the graph elements. For the first time to our knowledge, molecular docking is then used to evaluate the predictions obtained from drug repurposing using KG embeddings. Molecular evaluation and explanatory paths bring reliability to prediction results and constitute new complementary and reusable methods for assessing KG-based drug repurposing. The last study proposes a distributed architecture for learning KG embeddings in distributed and parallel settings.
The results of the study show that the computational time of embedding methods improves remarkably, without affecting LP performance, when they are trained in the proposed distributed setting rather than in the traditional centralized setting.
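Two of the classic similarity-based scores evaluated in the first part can be sketched on a toy graph (illustrative only; the graph is invented): Common Neighbours simply counts shared neighbours, and Adamic-Adar down-weights shared neighbours by the log of their degree, so a shared low-degree neighbour counts as stronger evidence of a link.

```python
import math

# Toy undirected graph as adjacency sets.
ADJ = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"a", "c"},
}

def common_neighbours(u, v):
    """Number of neighbours shared by u and v."""
    return len(ADJ[u] & ADJ[v])

def adamic_adar(u, v):
    """Shared neighbours weighted inversely by log-degree."""
    return sum(1 / math.log(len(ADJ[z])) for z in ADJ[u] & ADJ[v])

print(common_neighbours("b", "d"))  # 2 (shared neighbours: a and c)
```

Scores like these are fully explainable (the shared neighbours *are* the explanation), which is exactly the property the thesis trades off against the generality of embedding-based approaches.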
Khajeh, Nassiri Armita. "Expressive Rule Discovery for Knowledge Graph Refinement". Electronic Thesis or Diss., université Paris-Saclay, 2023. http://www.theses.fr/2023UPASG045.
Knowledge graphs (KGs) are heterogeneous graph structures representing facts in a machine-readable format. They find applications in tasks such as question answering, disambiguation, and entity linking. However, KGs are inherently incomplete, and refining them is crucial to improving their effectiveness in downstream tasks. It is possible to complete a KG by predicting missing links within it or by integrating external sources and KGs. By extracting rules from the KG, we can leverage them to complete the graph while providing explainability. Various approaches have been proposed to mine rules efficiently, yet the literature lacks effective methods for incorporating numerical predicates in rules. To address this gap, we propose REGNUM, which mines numerical rules with interval constraints. REGNUM builds upon the rules generated by an existing rule mining system and enriches them by incorporating numerical predicates, guided by quality measures. Additionally, the interconnected nature of web data offers significant potential for completing and refining KGs, for instance by data linking, i.e., the task of finding sameAs links between entities of different KGs. We introduce RE-miner, an approach that mines referring expressions (REs) for a class in a knowledge graph and uses them for data linking. REs are rules that apply to only one entity; they support knowledge discovery and serve as an explainable way to link data. We employ pruning strategies to explore the search space efficiently, and we define characteristics to generate REs that are more relevant for data linking. Furthermore, we explore the advantages and opportunities of fine-tuning language models to bridge the gap between KGs and textual data. We propose GilBERT, which leverages fine-tuning techniques on language models like BERT using a triplet loss.
GilBERT demonstrates promising results for refinement tasks of relation prediction and triple classification tasks. By considering these challenges and proposing novel approaches, this thesis contributes to KG refinement, particularly emphasizing explainability and knowledge discovery. The outcomes of this research open doors to more research questions and pave the way for advancing towards more accurate and comprehensive KGs
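To make the idea of a rule with an interval constraint concrete, the toy sketch below computes the confidence of one hand-written numerical rule over a handful of invented facts. It is illustrative only: the facts, predicate names, and the rule itself are made up, and this is not REGNUM's mining algorithm, which discovers such rules and their intervals automatically under quality measures.

```python
# Illustrative only: confidence of one rule with a numerical interval
# constraint over a toy set of KG facts (invented data, not REGNUM itself).

facts = {
    ("alice", "worksFor", "acme"), ("bob", "worksFor", "acme"),
    ("carol", "worksFor", "acme"), ("alice", "isSenior", "true"),
    ("bob", "isSenior", "true"),
}
age = {"alice": 52, "bob": 47, "carol": 29}  # a numerical predicate

def rule_confidence(lo, hi):
    """Confidence of: worksFor(x, acme) & age(x) in [lo, hi] => isSenior(x)."""
    body = [x for x, p, o in facts
            if p == "worksFor" and o == "acme" and lo <= age[x] <= hi]
    if not body:
        return 0.0
    head = [x for x in body if (x, "isSenior", "true") in facts]
    return len(head) / len(body)
```

Without a meaningful interval the rule covers all three employees and holds for two of them; narrowing the age interval to the older employees raises the confidence to 1.0, which is exactly the kind of enrichment an interval constraint buys.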
Simonet, Geneviève. "Héritage non monotone à base de chemins et de graphes partiels". Montpellier 2, 1994. http://www.theses.fr/1994MON20151.
Basse, Adrien. "Caractérisation incrémentale d'une base de triplets RDF". Nice, 2012. http://www.theses.fr/2012NICE4056.
Many Semantic Web applications address the issue of integrating data from distributed RDF triple stores. There are several solutions for distributed query processing, such as SPARQL 1.1 Federation, which defines extensions to the SPARQL query language to support distributed query execution. Such extensions make it possible to formulate a query that delegates parts of itself to a series of services, but one issue remains: how to automate the selection of the RDF triple stores containing data relevant to a query. This is especially true in the context of the Linking Open Data project, where numerous and very heterogeneous datasets are interlinked, allowing for interesting queries across several sources. To decompose queries and send them only to relevant stores, we need a means to describe each RDF triple store, i.e., an index structure that provides a complete and compact set of index items. In this thesis we present an approach to extract these graph patterns from an RDF triple store. For this purpose, we extend the Depth-First Search (DFS) coding of (Yan and Han, 2002) to RDF labeled and oriented multigraphs, and we provide a join operator between two DFS codes so as to sequentially build the successive levels of the index structure. Insertion or deletion of annotations in the triple store may cause changes to the index structure; to handle updates, we propose a procedure that identifies exactly the changes in the first level of the index structure and propagates them to the following levels. The DFSR coding makes it possible to manipulate graph patterns efficiently but is difficult to read (a succession of integers). To facilitate the reading of our index structure, we propose a visualization user interface and algorithms to turn a DFS code into a more legible format such as RDF. Our algorithm relies on the CORESE/KGRAM engine (Corby, 2008), and we have tested it on many datasets. During the building of index structures we keep a set of data to help us better understand the progress of our algorithm and improve it.
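The flavour of a DFS code can be shown on a tiny example: each traversed edge is encoded as a tuple of discovery indices plus node and edge labels. This is a simplified sketch in the spirit of Yan and Han's gSpan coding, not the DFSR coding defined in the thesis; the graph and labels are invented.

```python
# Minimal flavour of DFS coding for a labelled directed graph: each edge
# becomes a tuple (i, j, l_i, l_edge, l_j) of discovery indices and labels.
# Sketch only, not the thesis's DFSR coding for RDF multigraphs.

def dfs_code(edges, node_labels, start):
    """edges: dict src -> list of (edge_label, dst), in a fixed order."""
    index, code = {start: 0}, []

    def visit(u):
        for e_label, v in edges.get(u, []):
            if v not in index:                 # forward edge: new node
                index[v] = len(index)
                code.append((index[u], index[v],
                             node_labels[u], e_label, node_labels[v]))
                visit(v)
            else:                              # back edge: node already seen
                code.append((index[u], index[v],
                             node_labels[u], e_label, node_labels[v]))

    visit(start)
    return code

# Tiny RDF-like graph: a document typed as a class and linked to a person.
labels = {"doc": "Document", "art": "Class", "per": "Person"}
g = {"doc": [("rdf:type", "art"), ("dc:creator", "per")]}
code = dfs_code(g, labels, "doc")
```

The resulting integer-and-label sequence is compact and easy to join or compare, but hard for a human to read, which is exactly the motivation for the legibility tooling the abstract mentions.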
Destandau, Marie. "Path-Based Interactive Visual Exploration of Knowledge Graphs". Electronic Thesis or Diss., université Paris-Saclay, 2020. http://www.theses.fr/2020UPASG063.
Knowledge Graphs facilitate the pooling and sharing of information from different domains. They rely on small units of information named triples that can be combined to form higher-level statements. Producing interactive visual interfaces to explore collections in Knowledge Graphs is a complex, mostly unresolved problem. In this thesis, I introduce the concept of path outlines to encode aggregate information relative to a chain of triples. I demonstrate three applications of the concept with the design and implementation of three open source tools: S-Paths lets users browse meaningful overviews of collections; Path Outlines supports data producers in browsing the statements that can be produced from their data; and The Missing Path supports data producers in analysing incompleteness in their data. I show that the concept not only supports interactive visual interfaces for Knowledge Graphs but also helps improve their quality.
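The intuition behind aggregating information over a chain of triples can be sketched in a few lines: follow a path of predicates from a set of subjects and summarise the values found at the end. The triples, predicate names, and the choice of statistics below are invented for illustration; this is not the thesis's implementation.

```python
# Sketch: aggregating the values reachable through a chain of predicates,
# in the spirit of a "path outline". Toy data, invented predicates.

triples = [
    ("w1", "author", "p1"), ("w2", "author", "p1"), ("w3", "author", "p2"),
    ("p1", "birthYear", 1920), ("p2", "birthYear", 1954),
]

def follow(subjects, predicate):
    """Objects reachable from `subjects` through `predicate`."""
    return [o for s, p, o in triples if p == predicate and s in subjects]

def path_outline(subjects, path):
    """Summarise the values at the end of a predicate chain."""
    values = list(subjects)
    for predicate in path:
        values = follow(values, predicate)
    return {"count": len(values), "distinct": len(set(values)),
            "min": min(values), "max": max(values)}

outline = path_outline({"w1", "w2", "w3"}, ["author", "birthYear"])
```

Such a summary (counts, distinct values, value range) is the kind of aggregate a visual overview can render for a whole collection without enumerating every triple.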
Bonner, Chantal. "Classification et composition de services Web : une perspective réseaux complexes". Corte, 2011. http://www.theses.fr/2011CORT0008.
Web services are building blocks for modular applications independent of any software or hardware platform; they implement the service-oriented architecture (SOA). Research on Web services mainly focuses on discovery and composition. However, the complexity of the structure of the Web services space and of its development must also be taken into account, and this cannot be done without the science of complex systems, including the theory of complex networks. In this thesis, we define a set of networks based on Web service composition for services described syntactically (WSDL) and semantically (SAWSDL). The experimental exploration of these networks reveals characteristic properties of complex networks (the small-world property and scale-free distributions). It also shows that these networks have a community structure, which provides an alternative answer to the problem of classifying Web services by domain of interest. Indeed, communities do not gather Web services with similar functionalities, but Web services that share many interaction relationships. This organization can be used, among other things, to guide composition search algorithms. Furthermore, with respect to classification based on the functional similarity of Web services for discovery or substitution, we propose a set of network models for syntactic and semantic representations of Web services, reflecting various degrees of similarity. The topological analysis of these networks reveals a component structure and an internal organization of the components around elementary patterns. This property allows a two-level characterization of the notion of community of similar Web services that highlights the flexibility of this new organizational model. This work opens new perspectives on the issues of service-oriented architecture.
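One simple way to build the kind of interaction network the abstract describes is to add an edge from service A to service B whenever some output of A matches an input of B. The services and parameter names below are invented; the thesis works from real WSDL/SAWSDL descriptions and richer matching.

```python
# Sketch of a composition/interaction network: A -> B when an output of A
# can feed an input of B. Toy services with invented parameter names.

services = {
    "geocode": {"in": {"address"},     "out": {"lat", "lon"}},
    "weather": {"in": {"lat", "lon"},  "out": {"forecast"}},
    "format":  {"in": {"forecast"},    "out": {"report"}},
}

def composition_edges(services):
    """Directed edges of the interaction network."""
    return {(a, b)
            for a, sa in services.items()
            for b, sb in services.items()
            if a != b and sa["out"] & sb["in"]}

edges = composition_edges(services)
```

Community detection and degree-distribution analysis would then run on this graph rather than on the services' functional descriptions, which is the shift in perspective the thesis argues for.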
Bannour, Ines. "Recherche d’information sémantique : Graphe sémantico-documentaire et propagation d’activation". Thesis, Sorbonne Paris Cité, 2017. http://www.theses.fr/2017USPCD024/document.
Semantic information retrieval (SIR) aims to propose models that rely, beyond statistical calculations, on the meaning and semantics of the words of the vocabulary, in order to better represent documents relevant to users' needs and to better retrieve them. The aim is therefore to go beyond the classical, purely statistical « bag of words » approaches, based on string matching and the analysis of word frequencies and their distributions in the text. To do this, existing SIR approaches, through the exploitation of external semantic resources (thesauri, ontologies, etc.), proceed by injecting knowledge into the classical IR models (such as the vector space model) in order to disambiguate the vocabulary or to enrich the representation of documents and queries. These are usually adaptations of the classical IR models, leading to a « bag of concepts » approach that takes synonymy into account. The semantic resources thus exploited are « flattened », and the calculations are generally confined to semantic similarities. In order to better exploit semantics in IR, we propose a new model which unifies, in a coherent and homogeneous way, numerical (distributional) and symbolic (semantic) information without sacrificing the analytic power of either. The semantico-documentary network thus modeled is translated into a weighted graph, and the matching mechanism is provided by spreading activation in the graph. This new model can answer queries expressed as keywords, concepts, or even example documents. The propagation algorithm has the merit of preserving the well-tested characteristics of classical information retrieval models while allowing a better consideration of semantic models and their richness. Depending on whether semantics is introduced in the graph or not, this model either reproduces classical IR or provides, in addition, semantic functionalities. Co-occurrence in the graph then reveals an implicit semantics which improves precision by resolving some semantic ambiguities. The explicit exploitation of the concepts and links of the graph allows the resolution of problems of synonymy, term mismatch, semantic coverage, etc. These semantic features, as well as the scaling up of the model, are validated experimentally on a corpus in the medical field.
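A minimal spreading-activation loop makes the matching mechanism concrete: activation starts on the query nodes and flows along weighted edges with a decay factor, so documents reached through strongly weighted semantic links end up ranked higher. The graph, weights, and node names below are invented; the thesis's model has richer propagation control.

```python
# Minimal spreading activation over a weighted semantico-documentary graph.
# Illustrative sketch with an invented toy graph, not the thesis's model.

def spread(graph, seeds, decay=0.5, steps=2):
    """graph: node -> list of (neighbour, weight); seeds: node -> activation."""
    activation = dict(seeds)
    for _ in range(steps):
        nxt = dict(activation)
        for node, value in activation.items():
            for neighbour, weight in graph.get(node, []):
                nxt[neighbour] = nxt.get(neighbour, 0.0) + decay * weight * value
        activation = nxt
    return activation

# A query term activates a concept, which activates documents annotated by it.
graph = {
    "term:heart":    [("concept:Heart", 1.0)],
    "concept:Heart": [("doc:1", 0.8), ("doc:2", 0.4)],
}
act = spread(graph, {"term:heart": 1.0})
```

After two steps both documents carry activation, with the more strongly annotated one ranked first; replacing the semantic edges with plain term-document edges would reproduce a classical IR ranking, which mirrors the abstract's point.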
Sesboué, Matthias. "Knowledge graph-based system for technical document retrieval : a deductive reasoning-focused exploration". Electronic Thesis or Diss., Normandie, 2024. http://www.theses.fr/2024NORMIR17.
This industrial research explores Knowledge Graph-Based Systems (KGBS) for Information Retrieval (IR). It has been conducted in partnership with TraceParts, one of the world's leading Computer-Aided Design (CAD) content platforms for engineering, industrial equipment, and machine design. Our use case therefore considers a technical document corpus composed of CAD models and their descriptions; rather than leveraging the CAD models themselves, we focus on their descriptive texts. Knowledge Graphs (KGs) are ubiquitous in today's enterprise information systems and applications, and many academic research fields, such as IR, have adopted them. These digital knowledge artefacts aggregate heterogeneous data and represent knowledge in a machine-readable format: graphs intended to accumulate and convey knowledge of the real world, whose nodes represent entities of interest and whose edges represent relations between these entities. Architecture, Engineering and Construction projects produce a wealth of technical documents, and IR systems are critical for these industries to retrieve their complex, heterogeneous, specialised documents quickly; healthcare is another domain with a similar need. Though these industries manage documents with some textual content, that text and the associated metadata contain domain-specific concepts and vocabularies. Open KGs and existing ontologies often describe concepts that are too high-level and lack the fine-grained knowledge required by IR applications. Hence, companies' IR and knowledge management tools require domain-specific KGs built from scratch or extending existing ones. In our literature review, we first explore Knowledge Graphs and ontologies, how they relate, and derive our unifying KG definition.
We consider ontologies to be one component of a KG and take a Semantic Web perspective, proposing illustrative candidate technologies from the World Wide Web Consortium Semantic Web standards. We also explore the theoretical and practical meaning of the term "semantics". We then explore the literature on IR, focusing on KG-based IR, first reviewing work using the term "knowledge graph" and then work using the term "ontology", and thereby point out similarities and distinctions in KG usage. Our contributions first introduce a KGBS architecture relating knowledge acquisition, modelling, and consumption arranged around the KG, and we demonstrate that Semantic Web standards provide an approach for each KGBS component. We follow this system architecture to organise our work; hence, our contributions address knowledge acquisition, modelling, and consumption, respectively. For our work, we do not have a pre-built KG or access to domain experts to construct one. Hence, we address knowledge acquisition by designing our Ontology Learning Applied Framework (OLAF) collaboratively with members of our research group. We use OLAF to build pipelines that automatically learn an ontology from text, implement the framework as an open-source Python library, and build two ontologies to assess OLAF's pertinence, usability, and modularity. We then focus on knowledge modelling, presenting our IR ontology and demonstrating its usage with an OWL reasoning-powered IR system. While most IR systems leverage reasoning in an offline process, our approach explores OWL reasoning at runtime. While demonstrating our IR ontology, we illustrate a Semantic Web-based implementation of our KG definition by pointing out each KG component in the demonstration. Finally, we tackle the CAD model retrieval challenge our industrial partner TraceParts faces by implementing a KG-based approach at scale and using real-world data.
We illustrate moving from an existing text-based technical document retrieval system to a KG-based one. We leverage real-world TraceParts
Lully, Vincent. "Vers un meilleur accès aux informations pertinentes à l’aide du Web sémantique : application au domaine du e-tourisme". Thesis, Sorbonne université, 2018. http://www.theses.fr/2018SORUL196.
This thesis starts with the observation of increasing infobesity on the Web. The two main types of tools designed to help us explore Web data, the search engine and the recommender system, have several problems: (1) in helping users express their explicit information needs, (2) in selecting relevant documents, and (3) in valuing the selected documents. We propose several approaches using Semantic Web technologies to remedy these problems and to improve access to relevant information. In particular, we propose: (1) a semantic auto-completion approach which helps users formulate longer and richer search queries; (2) several recommendation approaches using the hierarchical and transversal links in knowledge graphs to improve the relevance of recommendations; (3) a semantic affinity framework integrating semantic and social data to yield recommendations that are qualitatively balanced in terms of relevance, diversity and novelty; (4) several recommendation explanation approaches aiming at improving relevance, intelligibility and user-friendliness; (5) two image-based user profiling approaches; and (6) an approach which selects the best images to accompany the recommended documents in recommendation banners. We implemented and applied our approaches in the e-tourism domain. They have been evaluated quantitatively with ground-truth datasets and qualitatively through user studies.
Morey, Mathieu. "Étiquetage grammatical symbolique et interface syntaxe-sémantique des formalismes grammaticaux lexicalisés polarisés". Phd thesis, Université de Lorraine, 2011. http://tel.archives-ouvertes.fr/tel-00640561.
Texto completoCharhad, Mbarek. "Modèles de documents vidéos basés sur le formalisme des graphes conceptuels pour l'indexation et la recherche par le contenu sémantique". Université Joseph Fourier (Grenoble), 2005. http://www.theses.fr/2005GRE10186.
In the case of video, there are a number of specificities due to its multimedia nature. For instance, a given concept (person, object...) can be present in different ways: it can be seen, heard, or talked about, and combinations of these representations can also occur. These distinctions matter to the user: queries involving a concept C such as "Show me a picture of C" or "I want to know what C2 has said about C" are likely to give quite different answers. The first would look for C in the image track, while the second would look in the audio track for a segment in which C2 is the speaker and C is mentioned in the speech. The context of this study is multimedia information modelling, indexing and retrieval. At the theoretical level, our contribution is a model for representing the semantic content of video documents. This model permits the synthetic and integrated handling of data elements from each medium (image, text, audio). The instantiation of this model is implemented using the conceptual graph (CG) formalism, a choice justified by its expressivity and its adequacy for content-based information indexing and retrieval. Our experimental contribution is the (partial) implementation of the CLOVIS prototype: we have integrated the proposed model into a content-based video indexing and retrieval system in order to evaluate its contributions in terms of effectiveness and precision.
Gillani, Syed. "Semantically-enabled stream processing and complex event processing over RDF graph streams". Thesis, Lyon, 2016. http://www.theses.fr/2016LYSES055/document.
There is a paradigm shift in the nature and processing of today's data: data used to be mostly static, stored in large databases to be queried. Today, with the advent of new applications and means of collecting data, most applications on the Web and in enterprises produce data in a continuous manner, in the form of streams, and users expect to process large volumes of data with fresh, low-latency results. This has led to Data Stream Management Systems (DSMSs) and the Complex Event Processing (CEP) paradigm, which have distinct aims: DSMSs mostly execute traditional (largely stateless) query operators, while CEP systems focus on temporal pattern matching (stateful operators) to detect changes in the data that can be thought of as events. In the past decade, a number of scalable and performant DSMSs and CEP systems have been proposed. Most of them, however, are based on relational data models, which raises the question of support for heterogeneous data sources, i.e., the variety of the data. Work on RDF stream processing (RSP) systems partly addresses the challenge of variety by promoting the RDF data model. Nonetheless, challenges like volume and velocity are overlooked by existing approaches; they require customised optimisations which treat RDF as a first-class citizen and scale the process of continuous graph pattern matching. To address these problems, this thesis focuses on developing scalable RDF graph stream processing and semantically-enabled CEP systems (i.e., Semantic Complex Event Processing, SCEP). In addition to our optimised algorithmic and data-structure methodologies, we also contribute to the design of a new query language for SCEP. Our contributions in these two fields are as follows. RDF Graph Stream Processing: we first propose an RDF graph stream model in which each data item/event within a stream is an RDF graph (a set of RDF triples); second, we implement customised indexing techniques and data structures to continuously process RDF graph streams in an incremental manner. Semantic Complex Event Processing: we extend the idea of RDF graph stream processing to enable SCEP over such streams, i.e., temporal pattern matching. Our first contribution in this context is a new query language that encompasses the RDF graph stream model and employs a set of expressive temporal operators such as sequencing, Kleene-+, negation, optional, conjunction, disjunction, and event selection strategies. Based on this, we implement a scalable system that employs a non-deterministic finite automata model to evaluate these operators in an optimised manner. We leverage techniques from diverse fields, such as relational query optimisation, incremental query processing, and sensor and social networks, to solve real-world problems. We have applied our techniques to a wide range of real-world and synthetic datasets to extract knowledge from RDF structured data in motion. Our experimental evaluations confirm our theoretical insights and demonstrate the viability of our proposed methods.
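The sequencing operator at the heart of CEP can be illustrated with a tiny state machine: detect event A followed by event B within a time window. The event names, timestamps, and the skip-till-next-match policy below are invented for illustration; the thesis's automaton model and query language are far more expressive (Kleene-+, negation, RDF graph events, etc.).

```python
# Toy flavour of a CEP SEQ operator: detect `first` followed by `second`
# within `window` time units. Invented events; not the thesis's system.

def match_seq(stream, first, second, window):
    """Yield (t1, t2) pairs where `first` at t1 is followed by `second`
    at t2 with t2 - t1 <= window (skip-till-next-match flavour)."""
    matches, pending = [], []
    for t, event in stream:
        # drop partial matches that have fallen out of the window
        pending = [t1 for t1 in pending if t - t1 <= window]
        if event == second and pending:
            matches.append((pending.pop(0), t))
        if event == first:
            pending.append(t)
    return matches

stream = [(1, "temp_high"), (2, "ok"), (3, "smoke"), (9, "smoke")]
hits = match_seq(stream, "temp_high", "smoke", window=5)
```

The second "smoke" event at t=9 produces no match because the only pending "temp_high" has expired, which is the stateful, window-bounded behaviour that distinguishes CEP operators from stateless stream filters.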
Liu, Jixiong. "Semantic Annotations for Tabular Data Using Embeddings : Application to Datasets Indexing and Table Augmentation". Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS529.
With the development of Open Data, a large number of data sources are made available to communities (including data scientists and data analysts). This data is a treasure for digital services as long as it is cleaned, unbiased, and combined with explicit, machine-processable semantics in order to foster exploitation. In particular, structured data sources (CSV, JSON, XML, etc.) are the raw material for many data science processes. However, this data derives from different domains with which consumers are not always familiar (a knowledge gap), which complicates its appropriation, while this is a critical step in creating machine learning models. Semantic models (in particular, ontologies) make it possible to explicitly represent the implicit meaning of data by specifying the concepts and relationships present in it. Providing semantic labels on datasets facilitates their understanding and reuse by supplying documentation that can easily be used by a non-expert. Moreover, semantic annotation opens the way to search modes that go beyond simple keywords and allow high-level conceptual queries over both the content and the structure of datasets, while overcoming the problems of syntactic heterogeneity encountered in tabular data. This thesis introduces a complete pipeline for the extraction, interpretation, and application of tables in the wild with the help of knowledge graphs. We first revisit the existing definition of tables from the perspective of table interpretation and develop systems for collecting and extracting tables from the Web and local files. Three table interpretation systems are then proposed, based either on heuristic rules or on graph representation models, addressing the challenges observed in the literature. Finally, we introduce and evaluate two table augmentation applications based on semantic annotations, namely data imputation and schema augmentation.
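A very common baseline in semantic table interpretation is to look up each cell of a column in a KG's entity labels and assign the column the type most of its cells agree on. The mini-catalogue and labels below are invented; real systems (including those in the thesis) resolve candidates against large KGs and use much richer disambiguation than majority voting.

```python
# Toy sketch of column-type annotation: map cells to candidate KG entities
# by label lookup, then pick the column type by majority vote.
# The catalogue is invented for illustration.

catalogue = {          # label -> (entity id, semantic type)
    "paris":  ("Q90",  "City"),
    "berlin": ("Q64",  "City"),
    "merkel": ("Q567", "Person"),
}

def annotate_column(cells):
    """Return the majority semantic type of the recognised cells, or None."""
    votes = {}
    for cell in cells:
        hit = catalogue.get(cell.strip().lower())
        if hit:
            votes[hit[1]] = votes.get(hit[1], 0) + 1
    return max(votes, key=votes.get) if votes else None

col_type = annotate_column(["Paris", "Berlin", "Merkel", "Unknown"])
```

Unmatched cells ("Unknown") simply abstain, so the noisy column is still annotated "City", which illustrates why voting-style heuristics are robust starting points before graph- or embedding-based interpretation.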
Lisena, Pasquale. "Knowledge-based music recommendation : models, algorithms and exploratory search". Electronic Thesis or Diss., Sorbonne université, 2019. http://www.theses.fr/2019SORUS614.
Representing information about music is a complex activity that involves different sub-tasks. This manuscript mostly focuses on classical music, researching how to represent and exploit its information. The main goal is the investigation of strategies of knowledge representation and discovery applied to classical music, involving subjects such as knowledge-base population, metadata prediction, and recommender systems. We propose a complete workflow for the management of music metadata using Semantic Web technologies. We introduce a specialised ontology and a set of controlled vocabularies for the different concepts specific to music. Then, we present an approach for converting data in order to go beyond the librarian practice currently in use, relying on mapping rules and interlinking with controlled vocabularies. Finally, we show how these data can be exploited. In particular, we study approaches based on embeddings computed on structured metadata, titles, and symbolic music for ranking and recommending music. Several demo applications have been realised to test the previous approaches and resources.
Raimbault, Thomas. "Transition de modèles de connaissances : un système de connaissance fondé sur OWL, graphes conceptuels et UML". Phd thesis, Nantes, 2008. https://archive.bu.univ-nantes.fr/pollux/show/show?id=4ef8d797-9884-4506-8973-e5bc095e2459.
The purpose of this thesis is to use multiple knowledge models for representing knowledge and reasoning on the knowledge thus represented. This thesis proposes transitions between several knowledge models: OWL, conceptual graphs and UML. Its originality lies both in the centralized modeling of knowledge within a knowledge system and in the ability to pass this knowledge from one model of the system to another, as required by modeling and reasoning needs. The main goal of these transitions between knowledge models is twofold. On the one hand, it is to combine the relatively easy-to-use expressiveness of the individual models so as to obtain a strong overall expressive power. On the other hand, it helps in the design and operation of a modeling, using the best-known or best-suited models. The tools of each model can then be used on the represented knowledge, providing complementary uses of these models.
Gandon, Fabien. "Graphes RDF et leur Manipulation pour la Gestion de Connaissances". Habilitation à diriger des recherches, Université de Nice Sophia-Antipolis, 2008. http://tel.archives-ouvertes.fr/tel-00351772.
In the second chapter, we recall how graph-based formalisms can be used to represent knowledge with a variable degree of formalization, depending on the needs identified in the application scenarios and on the processing to be carried out, in particular for building semantic webs. We briefly identify the characteristics of those formalisms used in our work and the opportunities for extension they offer. We also summarize an ongoing initiative to factor out the definition of the mathematical structures shared by these formalisms and to reuse the algorithms common to these structures.
In the third chapter, we explain that an ontology supports other kinds of reasoning than logical derivation. For instance, the hierarchy of notions contained in an ontology can be seen as a metric space, allowing distances to be defined for comparing the semantic proximity of two notions. We have applied this idea in several scenarios such as distributed allocation of annotations, approximate search, and clustering. In this third chapter we summarize various uses we have made of semantic distances and discuss our position on this field. We give the usage scenarios and the distances used in a representative sample of the projects we have led. For us, this first series of experiments demonstrated the interest and potential of distances, and also underlined the importance of the work that remains to be done to identify and characterize the existing families of distances and their respective suitability for the tasks for which our users wish to be assisted.
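The simplest member of this family of distances counts edges through the closest common ancestor in the hierarchy. The toy taxonomy below is invented, and many refined variants exist (depth-weighted, information-content-based, etc.); this sketch only shows the "hierarchy as metric space" idea.

```python
# Edge-counting semantic distance in a concept hierarchy: number of edges
# from each concept up to their closest common ancestor. Toy taxonomy.

parent = {"cat": "mammal", "dog": "mammal",
          "mammal": "animal", "trout": "fish", "fish": "animal"}

def ancestors(c):
    """The concept followed by its ancestors up to the root."""
    chain = [c]
    while c in parent:
        c = parent[c]
        chain.append(c)
    return chain

def distance(a, b):
    ca, cb = ancestors(a), ancestors(b)
    common = next(x for x in ca if x in cb)  # closest common ancestor
    return ca.index(common) + cb.index(common)

d1 = distance("cat", "dog")    # via "mammal"
d2 = distance("cat", "trout")  # via "animal"
```

Concepts sharing a close ancestor ("cat"/"dog") come out nearer than concepts that only meet at the root ("cat"/"trout"), which is exactly the proximity signal used for approximate search or clustering.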
In the fourth chapter, we recall that a semantic web such as those used in our scenarios, whether public or on a corporate intranet, generally relies on several web servers, each offering different ontologies and different annotation bases that use these ontologies to describe resources. Usage scenarios often lead a user to formulate queries whose answers combine annotation elements distributed across several of these servers.
This then requires being able to:
(1) identify the servers likely to hold elements of the answer;
(2) query remote servers about the elements they know without overloading the network;
(3) decompose the query and route the sub-queries to the appropriate servers;
(4) recompose the results from the partial answers.
With the semantic web, we have the building blocks of a distributed architecture. The fourth chapter summarizes a number of approaches we have proposed to take distribution into account and to manage distributed resources in the semantic webs we design.
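Step (1) of the list above, source selection, can be sketched with a predicate index: each server advertises the predicates it holds, and each triple pattern of a query is routed only to the servers that can contribute. Server names and predicate sets below are invented; this is a sketch of the idea, not the chapter's actual mechanisms.

```python
# Toy source-selection sketch: route each triple-pattern predicate to the
# servers whose advertised index contains it. Invented servers/predicates.

server_index = {
    "srvA": {"foaf:name", "foaf:knows"},
    "srvB": {"dc:title", "dc:creator"},
    "srvC": {"foaf:name", "dc:creator"},
}

def route(patterns):
    """Map each triple-pattern predicate to its candidate servers."""
    return {p: sorted(s for s, preds in server_index.items() if p in preds)
            for p in patterns}

plan = route(["foaf:name", "dc:creator"])
```

The resulting plan drives steps (3) and (4): sub-queries go only to their candidate servers, and the partial answers are then joined back together.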
Ontologies and knowledge representations are often at the technical core of our architectures, especially when they use formal representations. To interact with the semantic web and its applications, the fifth chapter recalls that we need interfaces that make them intelligible to end users. In our inference systems, knowledge elements are manipulated and combined, and even if the initial elements were intelligible, the intelligibility of the results is not preserved by these transformations.
Currently, and in the best case, interface designers implement ad hoc transformations of internal data structures into interface representations, often overlooking the reasoning capabilities these representations could provide for building such interfaces. In the worst case, and still too often, normally internal representation structures are directly exposed in widgets without justification and, instead of assisting the interaction, these representations burden the interfaces.
Since they receive contributions from an open world, semantic web interfaces will have to be, at least in part, generated dynamically and rendered for each structure that must come into contact with users. The fifth and last chapter highlights this growing opportunity to use ontology-based systems to assist interactions with our users.
Hossayni, Hicham. "Enabling industrial maintenance knowledge sharing by using knowledge graphs". Electronic Thesis or Diss., Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAS017.
Formerly considered part of general enterprise costs, industrial maintenance has become critical for business continuity and a real source of data. Despite the heavy investments made by companies in smart manufacturing, traditional maintenance practices still dominate the industrial landscape. In this Ph.D., we investigate maintenance knowledge sharing as a potential solution that can reverse this trend and enhance maintenance activity to comply with the Industry 4.0 spirit. We specifically consider knowledge graphs as an enabler for sharing maintenance knowledge among the different industry players. In the first contribution of this thesis, we conducted a field study through a campaign of interviews with experts with different profiles and from different industry domains. This allowed us to test the hypothesis of improving maintenance through knowledge sharing, which is quite a novel concept in many industries. The results clearly show a real interest in our approach and reveal the different requirements and challenges that need to be addressed. The second contribution is the concept, design, and prototype of SemKoRe, a vendor-agnostic solution relying on Semantic Web technologies to share maintenance knowledge. It gathers all machine-failure-related data in the knowledge graph and shares it among all connected customers to easily solve future failures of the same type. A flexible architecture was proposed to cover the varied needs of the different customers. SemKoRe received the approval of several Schneider clients located in several countries and from various segments. In the third contribution, we designed and implemented a novel solution for the automatic detection of sensitive data in maintenance reports. In fact, maintenance reports may contain confidential data that could compromise or negatively impact the company's activity if revealed.
This feature emerged as the make-or-break point for SemKoRe among the interviewed domain experts, since it prevents sensitive data from being disclosed during knowledge sharing. In this contribution, we relied on Semantic Web and natural language processing techniques to develop custom models for sensitive data detection. The construction and training of such models require a considerable amount of data, so we implemented several services for collaborative data collection, text annotation, and corpus construction. We also proposed an architecture and a simplified workflow for generating and deploying customizable sensitive-data detection models on edge gateways. In addition to these contributions, we worked on different peripheral features with strong value for the SemKoRe project, which has resulted in several patents. For instance, we prototyped and patented a novel method to query time-series data using semantic criteria; it combines ontologies and time-series databases to offer a useful set of querying capabilities, even on resource-constrained edge gateways. We also designed a novel tool that helps software developers interact with knowledge graphs with little or no knowledge of Semantic Web technologies. This solution has been patented and has proved useful for other ontology-based projects.
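The sensitive-data detection task itself can be illustrated with two crude patterns flagged by regular expressions. This is only a sketch of the task: the report text, the pattern names, and the serial-number format are invented, and the thesis trains NLP models rather than relying on regexes.

```python
# Toy illustration of sensitive-data flagging in a maintenance report.
# The thesis uses trained NLP models; these two regexes (email addresses,
# serial-number-like tokens) are invented placeholders for the task.
import re

PATTERNS = {
    "email":  re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "serial": re.compile(r"\bSN-\d{4,}\b"),
}

def flag_sensitive(text):
    """Return, per pattern kind, the matches found in `text`."""
    return {kind: rx.findall(text) for kind, rx in PATTERNS.items()
            if rx.findall(text)}

report = "Motor SN-88412 failed; contact j.doe@plant.example for access codes."
found = flag_sensitive(report)
```

Flagged spans would then be masked or withheld before a report's content is pushed into the shared knowledge graph.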
Sourty, Raphael. "Apprentissage de représentation de graphes de connaissances et enrichissement de modèles de langue pré-entraînés par les graphes de connaissances : approches basées sur les modèles de distillation". Electronic Thesis or Diss., Toulouse 3, 2023. http://www.theses.fr/2023TOU30337.
Natural language processing (NLP) is a rapidly growing field focused on developing algorithms and systems to understand and manipulate natural language data. The ability to effectively process and analyze such data has become increasingly important in recent years, as the volume of textual data generated by individuals, organizations, and society as a whole continues to grow significantly. One of the main challenges in NLP is representing and processing knowledge about the world. Knowledge graphs are structures that encode information about entities and the relationships between them; they are a powerful tool for representing knowledge in a structured and formalized way and provide a holistic understanding of the underlying concepts and their relationships. The ability to learn knowledge graph representations has the potential to transform NLP and other domains that rely on large amounts of structured data. The work conducted in this thesis explores the concept of knowledge distillation and, more specifically, mutual learning for learning distinct and complementary space representations. Our first contribution is a new framework for learning entities and relations on multiple knowledge bases called KD-MKB. The key objective of multi-graph representation learning is to empower the entity and relation models with different graph contexts that potentially bridge distinct semantic contexts. Our approach is based on the theoretical framework of knowledge distillation and mutual learning. It allows for efficient knowledge transfer between KBs while preserving the relational structure of each knowledge graph. We formalize entity and relation inference between KBs as a distillation loss over posterior probability distributions on aligned knowledge.
Grounded on this formalization, we propose a cooperative distillation framework where a set of KB models are jointly learned by using hard labels from their own context and soft labels provided by peers. Our second contribution is a method for incorporating rich entity information from knowledge bases into pre-trained language models (PLMs). We propose an original cooperative knowledge distillation framework to align the masked language modeling pre-training task of language models with the link prediction objective of KB embedding models. By leveraging the information encoded in knowledge bases, our approach provides a new direction for improving the ability of PLM-based slot-filling systems to handle entities.
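The soft-label side of such an objective can be sketched generically: one model's link-prediction scores are softened into a distribution that another model learns to match. This is an illustration of plain soft-label distillation between two scorers, not the exact KD-MKB loss; the function names and the temperature value are assumptions.

```python
import math

def softmax(scores):
    """Turn raw link-prediction scores over candidate entities into a
    probability distribution (numerically stabilized)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def distillation_loss(student_scores, teacher_scores, temperature=2.0):
    """Cross-entropy of the student's softened distribution against the
    teacher's softened distribution over the same candidates."""
    p_teacher = softmax([s / temperature for s in teacher_scores])
    p_student = softmax([s / temperature for s in student_scores])
    return -sum(pt * math.log(ps) for pt, ps in zip(p_teacher, p_student))
```

By Gibbs' inequality the loss is minimized exactly when the student's distribution matches the teacher's, which is what drives the mutual transfer.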
Raimbault, Thomas. "Transition de modèles de connaissances - Un système de connaissance fondé sur OWL, Graphes conceptuels et UML". Phd thesis, Université de Nantes, 2008. http://tel.archives-ouvertes.fr/tel-00482664.
Nel, François. "Suivi de mouvements informationnels : construction, modélisation et simulation de graphes de citations, application à la détection de buzz". Paris 6, 2011. http://www.theses.fr/2011PA066541.
Paris, Pierre-Henri. "Identity in RDF knowledge graphs : propagation of properties between contextually identical entities". Electronic Thesis or Diss., Sorbonne université, 2020. http://www.theses.fr/2020SORUS132.
Due to the large number of knowledge graphs and, more importantly, their even more numerous interconnections through the owl:sameAs property, it has become increasingly evident that this property is often misused: the entities it links must be identical in all possible and imaginable contexts. This is not always the case and leads to a deterioration of data quality. Identity must therefore be considered context-dependent. We first conducted a large-scale study on the presence of semantics in knowledge graphs, since specific semantic characteristics allow identity links to be deduced. This study naturally led us to build an ontology for describing the semantic content of a knowledge graph. We also proposed an interlinking approach based both on the logic allowed by semantic definitions and on the predominance of certain properties to characterize the identity relationship between two entities. We then looked at completeness and proposed an approach to generate a conceptual schema for measuring the completeness of an entity. Finally, building on this work, we proposed an approach based on sentence embeddings to compute the properties that can be propagated in a specific context. The resulting propagation framework allows SPARQL queries to be expanded and, ultimately, increases the completeness of query results.
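The core idea of context-dependent propagation can be shown with a toy sketch: values flow between contextually identical entities only for properties declared propagatable in that context. The tuple-based data model, the entity names, and the propagatable set below are all hypothetical; the thesis computes the propagatable properties with sentence embeddings rather than taking them as given.

```python
def propagate(triples, identical_pairs, propagatable):
    """Derive new (subject, property, value) triples by copying values of
    context-propagatable properties between contextually identical entities."""
    same = {}
    for a, b in identical_pairs:  # identity is treated as symmetric
        same.setdefault(a, set()).add(b)
        same.setdefault(b, set()).add(a)
    derived = set()
    for s, p, o in triples:
        if p in propagatable:
            for twin in same.get(s, ()):
                derived.add((twin, p, o))
    return derived - set(triples)  # only genuinely new facts
```

A SPARQL engine could use the same decision ("is this property propagatable here?") to rewrite a query over one entity into a query over its contextual twins.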
Ayed, Rihab. "Recherche d’information agrégative dans des bases de graphes distribuées". Thesis, Lyon, 2019. http://www.theses.fr/2019LYSE1305.
In this research, we investigate issues related to query evaluation and optimization in the framework of aggregated search. Aggregated search is a new paradigm for accessing massively distributed information. It aims to answer queries by combining fragments of information from different sources. The queries search for objects (documents) that do not exist as such in the targeted sources but are built from fragments extracted from the different sources. The sources might not be specified in the query expression; they are dynamically discovered at runtime. In our work, we exploit data dependencies to propose a framework for optimizing query evaluation over distributed graph-oriented data sources. For this purpose, we propose an approach for the document indexing/organizing process of aggregated search systems. We consider information retrieval systems that are graph-oriented (RDF graphs). Since our work uses graph relationships to aggregate fragments of information, it falls within relational aggregated search. Our goal is to optimize the access to sources of information in an aggregated search system. These sources contain fragments of information that are only partially relevant to the query. We aim to minimize the number of sources to query and to maximize the aggregation operations within a single source. To this end, we propose to reorganize the graph database(s) into partitions dedicated to aggregated queries, using a semantic or structural clustering of RDF predicates. For structural clustering, we propose to use frequent subgraph mining algorithms, for which we performed a comparative study of their performance. For semantic clustering, we use the descriptive metadata of RDF predicates and apply semantic textual similarity methods to calculate their relatedness. Following the clustering, we define query decomposition rules based on the semantic/structural aspects of RDF predicates.
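The semantic clustering step can be roughed out as follows. This is a minimal sketch: difflib's string similarity stands in for the semantic textual similarity methods studied in the thesis, and the threshold and greedy strategy are arbitrary assumptions.

```python
from difflib import SequenceMatcher

def cluster_predicates(labels, threshold=0.6):
    """Greedily group RDF predicate labels: each label joins the first
    cluster whose representative it resembles enough, else starts its own."""
    clusters = []
    for label in labels:
        for cluster in clusters:
            if SequenceMatcher(None, label, cluster[0]).ratio() >= threshold:
                cluster.append(label)
                break
        else:
            clusters.append([label])
    return clusters
```

Each resulting cluster would then back one partition of the graph database, so that fragments likely to be aggregated together live in the same partition.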
Ayats, H. Ambre. "Construction de graphes de connaissances à partir de textes avec une intelligence artificielle explicable et centrée-utilisateur·ice". Electronic Thesis or Diss., Université de Rennes (2023-....), 2023. http://www.theses.fr/2023URENS095.
With recent advances in artificial intelligence, the question of human control has become central. Today, this involves both research into explainability and designs centered on interaction with the user. Moreover, with the expansion of the Semantic Web and of automatic natural language processing methods, constructing knowledge graphs from texts has become an important issue. This thesis presents a user-centered system for the construction of knowledge graphs from texts, with several contributions. First, we introduce a user-centered workflow for this task, which progressively automates the user's actions while leaving them fine-grained control over the outcome. Next, we present our contributions in the field of formal concept analysis, used to design an explainable instance-based learning module for relation classification. Finally, we present our contributions in the field of relation extraction and how these fit into the presented workflow.
Baalbaki, Hussein. "Designing Big Data Frameworks for Quality-of-Data Controlling in Large-Scale Knowledge Graphs". Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS697.
Knowledge Graphs (KGs) are the most widely used representation of structured information about a particular domain, consisting of billions of facts in the form of entities (nodes) and relations (edges) between them. KGs also contain the semantic type information of the entities. The number of KGs has steadily increased over the past 20 years in a variety of fields, including government, academic research, the biomedical field, etc. Machine learning applications that use KGs include entity linking, question-answering systems, recommender systems, etc. Open KGs are typically produced heuristically and automatically from a variety of sources, including text, photos, and other resources, or are hand-curated. However, these KGs are often incomplete, i.e., there are missing links between entities and between entities and their corresponding entity types. In this thesis, we address one of the most challenging issues facing Knowledge Graph Completion (KGC), namely link prediction. General link prediction in KGs includes head and tail prediction and triple classification. In recent years, KG embedding (KGE) models have been trained to represent the entities and relations of a KG in a low-dimensional vector space preserving the graph structure. In most published works, such as translational models, neural network models, and others, the triple information is used to generate the latent representation of the entities and relations. In this dissertation, several methods are proposed for KGC and their effectiveness is shown empirically. First, a novel KG embedding model, TransModE, is proposed for link prediction. TransModE projects the contextual information of the entities into a modular space, while considering the relation as a translation vector that guides the head to the tail entity. Second, we worked on building a simple, low-complexity KGE model while preserving its efficiency.
KEMA is a novel KGE model among the lowest in complexity, yet it obtains promising results. Finally, KEMA++ is proposed as an upgrade of KEMA to predict missing triples in KGs using the product arithmetic operation in modular space. Extensive experiments and ablation studies show the efficiency of the proposed models, which compete with current state-of-the-art models and set new baselines for KGC. The proposed models establish a new way of solving the KGC problem beyond translational, neural network, or tensor factorization based approaches. The promising results and observations open up interesting directions for future research, such as exploiting the proposed models in domain-specific KGs for scholarly data, biomedical data, etc. Furthermore, the link prediction model can serve as a base model for the entity alignment task, as it considers the neighborhood information of the entities.
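For context, the translational family of KGE models mentioned above scores a triple by treating the relation as a translation in embedding space, so that for a plausible triple head + relation ≈ tail. A minimal TransE-style sketch with toy 2-D vectors (TransModE itself operates in modular space, which is not reproduced here):

```python
import math

def transe_score(head, relation, tail):
    """Negative Euclidean distance ||head + relation - tail||: the closer
    the translated head lands to the tail, the higher (less negative)
    the plausibility score."""
    return -math.sqrt(sum((h + r - t) ** 2
                          for h, r, t in zip(head, relation, tail)))
```

Link prediction then amounts to ranking all candidate tails (or heads) by this score and proposing the best-ranked ones as the missing entity.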
Mecharnia, Thamer. "Approches sémantiques pour la prédiction de présence d'amiante dans les bâtiments : une approche probabiliste et une approche à base de règles". Electronic Thesis or Diss., université Paris-Saclay, 2022. http://www.theses.fr/2022UPASG036.
Nowadays, knowledge graphs are used to represent all kinds of data, and they constitute scalable and interoperable resources that can be used by decision support tools. The Scientific and Technical Center for Building (CSTB) was asked to develop a tool to help identify materials containing asbestos in buildings. In this context, we created and populated the ASBESTOS ontology, which represents building data and the results of diagnostics carried out to detect the presence of asbestos in the products used. We then relied on this knowledge graph to develop two approaches that predict the presence of asbestos in products when the reference of the marketed product actually used is unknown. The first approach, called the hybrid approach, uses external resources describing the periods during which marketed products contained asbestos to calculate the probability of the existence of asbestos in a building component. This approach addresses conflicts between external resources and the incompleteness of the listed data by applying a pessimistic fusion approach that adjusts the calculated probabilities using a subset of diagnostics. The second approach, called CRA-Miner, is inspired by inductive logic programming (ILP) methods to discover rules from the knowledge graph describing buildings and asbestos diagnoses. Since the reference of the specific products used during construction is never specified, CRA-Miner considers temporal data, the semantics of the ASBESTOS ontology, product types, and contextual information such as part-of relations to discover a set of rules that can be used to predict the presence of asbestos in construction elements. The evaluation of the two approaches, carried out on the ASBESTOS ontology populated with the data provided by the CSTB, shows that the results obtained, in particular when the two approaches are combined, are quite promising.
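The period-based part of the hybrid approach can be illustrated with a toy estimate; the data layout and the formula below are assumptions for illustration, not the CSTB's actual computation.

```python
def asbestos_probability(install_year, product_variants):
    """Among product variants on the market in the installation year,
    return the fraction that contained asbestos. Each variant is a
    (first_year, last_year, has_asbestos) tuple from an external resource."""
    on_market = [v for v in product_variants
                 if v[0] <= install_year <= v[1]]
    if not on_market:
        return 0.0
    return sum(1 for v in on_market if v[2]) / len(on_market)
```

A pessimistic fusion step would then reconcile conflicting external resources, for instance by keeping the estimate most consistent with a subset of confirmed diagnostics.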
Ticona, Herrera Regina Paola. "Towards RDF normalization". Thesis, Pau, 2016. http://www.theses.fr/2016PAUU3015/document.
Over the past three decades, millions of people have been producing and sharing information on the Web. This information can be structured, semi-structured, and/or unstructured, such as blogs, comments, Web pages, and multimedia data, which require a formal description to help their publication and/or exchange on the Web. To help address this problem, the World Wide Web Consortium (W3C) introduced the RDF standard in 1999 as a data model designed to standardize the definition and use of metadata in order to better describe and handle data semantics, thus improving interoperability and scalability and promoting the deployment of new Web applications. Currently, billions of RDF descriptions are available on the Web through Linked Open Data cloud projects (e.g., DBpedia and LinkedGeoData). Several data providers have also adopted the principles and practices of Linked Data to share, connect, enrich, and publish their information using the RDF standard, e.g., governments (e.g., the Government of Canada), universities (e.g., the Open University), and companies (e.g., BBC and CNN). As a result, both individuals and organizations are increasingly producing huge collections of RDF descriptions and exchanging them through different serialization formats (e.g., RDF/XML, Turtle, N-Triples, etc.). However, many available RDF descriptions (i.e., graphs and serializations) are noisy in terms of structure, syntax, and semantics, and may thus cause problems when exploited (e.g., more storage, processing time, and loading time). In this study, we propose to clean RDF descriptions of redundancies and unused information, which we consider an essential stepping stone toward advanced RDF processing as well as the development of RDF databases and related applications (e.g., similarity computation, mapping, alignment, integration, versioning, clustering, and classification).
For that purpose, we have defined a framework entitled R2NR, which normalizes different RDF descriptions pertaining to the same information into one normalized representation; this representation can then be tuned both at the graph level and at the serialization level, depending on the target application and user requirements. We illustrate this approach with use cases (real and synthetic) that need to be normalized. The contributions of the thesis can be summarized as follows: (i) producing a normalized (output) RDF representation that preserves all the information in the source (input) RDF descriptions; (ii) eliminating redundancies and disparities in the normalized RDF descriptions, both at the logical (graph) and physical (serialization) levels; (iii) computing an RDF serialization output adapted to the target application requirements (faster loading, better storage, etc.); (iv) providing a mathematical formalization of the normalization process, with dedicated normalization functions, operators, and rules with provable properties; and (v) providing a prototype tool called RDF2NormRDF (desktop and online versions) to test and evaluate the approach's efficiency. To validate our framework, the RDF2NormRDF prototype has been tested through extensive experimentation. Experimental results are satisfactory and show significant improvements over existing approaches, namely regarding loading time and file size, while preserving all the information from the original description.
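At the graph level, the two simplest normalization effects (duplicate elimination and a canonical serialization order) can be sketched as follows. R2NR itself goes much further (disparity removal, application-driven serialization tuning); the toy N-Triples-like syntax below is an assumption.

```python
def to_ntriples(triples):
    """Drop duplicate (subject, predicate, object) triples and serialize
    them in a canonical sorted order, so that equivalent graphs produce
    byte-identical output regardless of input ordering."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in sorted(set(triples)))
```

A canonical byte-identical form is what makes downstream tasks like versioning, similarity computation, or storage comparison meaningful.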
Bongiovanni, Francesco. "Design, formalization and implementation of overlay networks : application to RDF data storage". Nice, 2012. http://www.theses.fr/2012NICE4021.
Structured Overlay Networks (SONs) are a new class of Peer-to-Peer (P2P) systems widely used for large-scale applications such as file sharing, information dissemination, and the storage and retrieval of different resources. Many different SONs co-exist on the Web, yet they do not cooperate with each other. In order to promote cooperation, we propose two protocols, Babelchord and Synapse, whose goals are to enable the interconnection of structured and heterogeneous overlay networks through meta-protocols. Babelchord aims to aggregate small structured overlay networks in an unstructured fashion, while Synapse generalizes this concept and provides flexible mechanisms relying on co-located nodes, i.e., nodes that belong to multiple overlays at the same time. We provide the algorithms behind both protocols, as well as simulation results showing their behaviour in the context of information retrieval. We have also implemented and experimented with a prototype of JSynapse on the Grid'5000 platform, confirming the simulation results and giving a proof of concept of our protocol. A novel generation of SONs was created in order to store and retrieve semantic data in large-scale settings. The Semantic Web community needs scalable solutions able to store and retrieve RDF data, the core data model of the Semantic Web. The first generation of these systems was too monolithic and provided limited support for expressive queries. We propose the design and implementation of a new modular P2P-based system for these purposes. We built the system with RDF in mind and used a three-dimensional CAN overlay network, mimicking the nature of an RDF triple. We made specific design choices that preserve data locality but raise interesting technical challenges. Our modular design reduces the coupling between the underlying components, allowing them to be interchanged with others.
We also ran micro-benchmarks on Grid'5000, which will be discussed. SONs have a specific geometrical topology that can be leveraged to increase the overall performance of the system. In this regard, we propose a new efficient broadcast algorithm for CAN, developed after the experiments on our RDF data store showed that the existing dissemination used too many messages. Alongside this algorithm, we also propose a reasoning framework, developed with the Isabelle/HOL proof assistant, for proving correctness properties of dissemination algorithms for CAN-like P2P systems. We focus on providing the minimal set of abstractions needed to devise efficient correct-by-construction dissemination algorithms on top of such overlays.
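The message-efficiency concern can be illustrated on a toy 2-D coordinate space: naive flooding delivers duplicate messages, whereas zone-splitting dissemination reaches each of the N nodes with exactly N - 1 messages. The sketch below is a simplification over a regular grid, not the thesis's CAN algorithm.

```python
def broadcast(grid_width, grid_height):
    """Duplicate-free dissemination: the initiator at (0, 0) delegates one
    column strip to the first node of each other column, and every strip
    owner covers the rest of its column, so each node gets one message."""
    messages = []  # list of (sender, receiver) pairs

    def cover_col(x, y0, y1):
        # the node at (x, y0) informs the rest of its column
        for y in range(y0 + 1, y1):
            messages.append(((x, y0), (x, y)))

    for x in range(1, grid_width):          # delegate each other column
        messages.append(((0, 0), (x, 0)))
        cover_col(x, 0, grid_height)
    cover_col(0, 0, grid_height)            # initiator covers its own column
    return messages
```

Counting confirms the spanning-tree bound: a width-by-height grid yields width * height - 1 messages with no node contacted twice.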
Cherifi, Chantal. "Classification et Composition de Services Web : Une Perspective Réseaux Complexes". Phd thesis, Université Pascal Paoli, 2011. http://tel.archives-ouvertes.fr/tel-00652852.
Dennemont, Yannick. "Une assistance à l'interaction 3D en réalité virtuelle par un raisonnement sémantique et une conscience du contexte". Phd thesis, Université d'Evry-Val d'Essonne, 2013. http://tel.archives-ouvertes.fr/tel-00903529.
El, Sayad Ismail. "Une représentation visuelle avancée pour l'apprentissage sémantique dans les bases d'images". Phd thesis, Université des Sciences et Technologie de Lille - Lille I, 2011. http://tel.archives-ouvertes.fr/tel-00666531.
Ngo, Duy Hoa. "Amélioration de l'alignement d'ontologies par les techniques d'apprentissage automatique, d'appariement de graphes et de recherche d'information". Phd thesis, Université Montpellier II - Sciences et Techniques du Languedoc, 2012. http://tel.archives-ouvertes.fr/tel-00767318.