Academic literature on the topic "Semi-structured documents (DSS)"
Thematic lists of journal articles, books, theses, conference papers, and other academic sources on the topic "Semi-structured documents (DSS)".
Journal articles on the topic "Semi-structured documents (DSS)"
Naji, Meriem, and Rachida Jehouani. "La relation avec l’argent dans l’ère digitale au Maroc et ses conséquences environnementales." SHS Web of Conferences 175 (2023): 01022. http://dx.doi.org/10.1051/shsconf/202317501022.
Rezende de Almeida, Debora. "RESILIÊNCIA INSTITUCIONAL: para onde vai a participação nos Conselhos Nacionais de Saúde e dos Direitos da Mulher?" Caderno CRH 33 (July 27, 2020): 020004. http://dx.doi.org/10.9771/ccrh.v33i0.33281.
Rodrigues, Jovenildo Cardoso, Rodrigo Luciano Macedo Machado, Luciano Rocha da Penha, and Adolfo Oliveira Neto. "INTERFACES DO RURAL E DO URBANO NA CIDADE DE BARCARENA, AMAZÔNIA PARAENSE." InterEspaço: Revista de Geografia e Interdisciplinaridade 5, no. 19 (January 18, 2020): 202016. http://dx.doi.org/10.18764/2446-6549.e202016.
Theses on the topic "Semi-structured documents (DSS)"
Martin, Stéphane. "Edition collaborative des documents semi-structurés." PhD thesis, Université de Provence - Aix-Marseille I, 2011. http://tel.archives-ouvertes.fr/tel-00684778.
Belhadj, Djedjiga. "Multi-GAT semi-supervisé pour l’extraction d’informations et son adaptation au chiffrement homomorphe." Electronic Thesis or Diss., Université de Lorraine, 2024. http://www.theses.fr/2024LORR0023.
This thesis was carried out as part of the BPI DeepTech project, in collaboration with the company Fair&Smart, which focuses primarily on the protection of personal data in accordance with the General Data Protection Regulation (GDPR). In this context, we propose a deep neural model for extracting information from semi-structured administrative documents (SSDs). Because public training datasets are lacking, we propose an artificial SSD generator that can produce several classes of documents with wide variation in content and layout. Documents are generated using random variables to manage content and layout, while respecting constraints that keep them similar to real documents. Metrics were introduced to evaluate the content and layout diversity of the generated SSDs. The evaluation showed that the generated datasets for three SSD types (payslips, receipts and invoices) present a high level of diversity, which helps avoid overfitting when training information extraction systems. Based on the specific format of SSDs, which consist of word pairs (keyword-information) located in spatially close neighborhoods, the document is modeled as a graph in which nodes represent words and edges represent neighborhood connections. The graph is fed into a multi-layer graph attention network (Multi-GAT), which applies the multi-head attention mechanism to learn the importance of each word's neighbors in order to classify it better. A first version of this model was used in supervised mode and obtained an F1 score of 96% on two generated invoice and payslip datasets, and 89% on a real receipt dataset (SROIE). We then enriched the Multi-GAT with a multimodal embedding of word-level information (textual, visual and positional) and combined it with a variational graph auto-encoder (VGAE). This model operates in semi-supervised mode and can learn from labeled and unlabeled data simultaneously. To further optimize graph node classification, we propose a semi-VGAE whose encoder shares its first layers with the Multi-GAT classifier. This is reinforced by a VGAE loss function managed by the classification loss. Using a small unlabeled dataset, we improved the F1 score obtained on a generated invoice dataset by over 3%. Because the model is intended to operate in a protected environment, we adapted its architecture to suit homomorphic encryption. We studied a dimensionality reduction method for the Multi-GAT model and then proposed a polynomial approximation approach for the non-linear functions in the model. To reduce the dimensionality of the model, we proposed a multimodal feature fusion method that requires few additional parameters and reduces the dimensions of the model while improving its performance. For the encryption adaptation, we studied low-degree polynomial approximations of the non-linear functions, using knowledge distillation and fine-tuning techniques to better adapt the model to the new approximations. We were able to minimize the approximation loss by around 3% on two invoice datasets as well as one payslip dataset and by 5% on SROIE.
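The graph modeling described in this abstract (words as nodes, spatial neighborhood as edges) can be illustrated with a minimal sketch. This is not the thesis's construction: the k-nearest-neighbor rule, the Euclidean distance on word centers, and the names `build_word_graph` and `k` are assumptions chosen for illustration, since the abstract only states that related words lie in spatially close neighborhoods.

```python
import math

def build_word_graph(words, k=2):
    """Build an undirected neighborhood graph over words.

    words: list of (text, x, y) tuples, where (x, y) is an assumed word center.
    Returns a dict mapping each word index to the set of its neighbor indices.
    """
    graph = {i: set() for i in range(len(words))}
    for i, (_, xi, yi) in enumerate(words):
        # Rank every other word by Euclidean distance to word i.
        ranked = sorted(
            (j for j in range(len(words)) if j != i),
            key=lambda j: math.hypot(words[j][1] - xi, words[j][2] - yi),
        )
        for j in ranked[:k]:
            # Store the edge in both directions so the graph is undirected.
            graph[i].add(j)
            graph[j].add(i)
    return graph

# Hypothetical receipt fragment: each keyword sits next to its value.
words = [("Total", 0, 0), ("42.00", 1, 0), ("Date", 0, 10), ("2024-01-05", 1, 10)]
graph = build_word_graph(words, k=1)
```

With `k=1`, each keyword in this toy layout is linked only to the value printed beside it, which is the kind of keyword-information pairing the abstract refers to. A graph attention network would then take such an adjacency structure, together with per-word features, as input.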
Harrathi, Rami. "Recherche d'information conceptuelle dans les documents semi-structurés." Lyon, INSA, 2010. http://theses.insa-lyon.fr/publication/2010ISAL0073/these.pdf.
With the advent of XML as the de facto standard for representing and exchanging semi-structured documents over the Web, several structured information retrieval (SIR) approaches for semi-structured documents have been proposed. These approaches have limitations at two levels: element/query matching and the query language. Element/query matching consists of assigning relevance scores to the elements of the documents. Most approaches for evaluating relevance rely on keyword-based indexing, where a document element and the query are each represented by a list of weighted keywords. Keyword-based indexing is generally imprecise because of the semantic ambiguity of words in natural language. To address this limitation, several studies have taken the semantics of indexing terms into account. This type of indexing, called semantic or conceptual indexing, uses the notion of concept in place of the notion of word. Query languages allow the user to query semi-structured documents by content and structure. Most query languages proposed for semi-structured documents are textual. Textual languages are unsuitable for users who are novices in computer science: they are characterized by a complex formalism and require training in their formal syntax. Visual languages overcome these limitations. In this context, our contributions focus on a conceptual IR approach for semi-structured documents and a visual querying model. Our contributions are evaluated through the INEX evaluation initiative and the development of a prototype.
Debarbieux, Denis. "Modélisation et requêtes des documents semi-structurés : exploitation de la structure de graphe." PhD thesis, Université des Sciences et Technologie de Lille - Lille I, 2005. http://tel.archives-ouvertes.fr/tel-00619303.
Pinel-Sauvagnat, Karen. "Modèle flexible pour la recherche d'information dans des corpus de documents semi-structurés." Toulouse 3, 2005. http://www.theses.fr/2005TOU30071.
Structural information contained in semi-structured documents can be used to focus on relevant information. The aim of an Information Retrieval System is then to retrieve relevant information units instead of whole documents. We propose the XFIRM model (XML Flexible Information Retrieval Model), which is based on: (i) a generic data representation model, allowing the modelling of documents with heterogeneous structures; (ii) a flexible query language that lets users express their needs with varying degrees of precision, with or without conditions on the document structure; (iii) a retrieval model based on a relevance propagation method, which aims at finding the most exhaustive and specific information units answering the query. The value of our proposals has been shown with the prototype we developed.
Decoster, Jean. "Programmation logique inductive pour la classification et la transformation de documents semi-structurés." Thesis, Lille 1, 2014. http://www.theses.fr/2014LIL10046/document.
The recent proliferation of XML documents in databases and web applications raises issues due to the volume and diversity of the data exchanged. To ease their use, smart techniques have been developed, such as automatic classification and transformation. This thesis has two goals: to propose a framework for the XML document classification task, and to study the learning of XML document transformations. We have chosen to use Inductive Logic Programming (ILP). The expressiveness of logic programs grants flexibility in specifying the learning task and understandability to the induced theories. This flexibility implies a high computational cost, which constrains the applicability of ILP systems. However, since XML documents are trees, a good compromise can be found. For our first contribution, we define clause languages that allow encoding XML trees. The definition of our classification framework follows from their study. It relies on a rewriting of standard ILP operations such as theta-subsumption and least general generalization [Plotkin1971]. Our algorithms are polynomial in time in the input size, whereas the standard ones are exponential. They grant identification in the limit [Gold1967] of our languages. Our second contribution is the construction of methods to learn XML document transformations. It begins with the definition of a class of clauses in the style of functional programs [Paulson91]. They are an ILP adaptation of edit scripts and allow a context. Their learning is possible thanks to two A*-like algorithms, a common ILP approach (HOC-Learner [Santos2009]).
Naffakhi, Najeh. "Un modèle de recherche d'information agrégée basée sur les réseaux bayésiens dans des documents semi-structurés." Toulouse 3, 2013. http://thesesups.ups-tlse.fr/2018/.
The work described in this thesis is concerned with aggregated search over XML elements. We propose new approaches to aggregation and pruning using different sources of evidence (content and structure). We propose a model based on Bayesian networks, in which the dependency relationships between query and terms and between terms and elements are quantified by probability measures. In this model, the user's query triggers a propagation process to find XML elements. We seek to return to the user an aggregate instead of a list of XML elements. The aggregate built from a document is considered the information unit (or portion of the document) that best meets the user's query. To answer the query, this aggregate must satisfy three criteria: relevance, non-redundancy and complementarity. The value of the returned aggregates is that they give the user an overview of the needed information available in the collection.
Sauvagnat, Karen. "Modèle flexible pour la Recherche d'Information dans des corpus de documents semi-structurés." PhD thesis, Université Paul Sabatier - Toulouse III, 2005. http://tel.archives-ouvertes.fr/tel-00359579.
[...] traditional "flat" documents containing only text are being enriched with structural and multimedia information. This evolution is accelerated by the expansion of the Web, and semi-structured documents such as XML (eXtensible Markup Language) tend to make up the majority of the digital documents made available to users. The development of automated tools providing efficient access to this new type of digital information appears to be a necessity. To make the best use of all the available information, existing Information Retrieval (IR) methods must be adapted. The structural information of documents can be used to refine the concept of document granule. The goal for Information Retrieval Systems (IRS) is then to retrieve information units (rather than documents) relevant to user queries. To address this fundamental issue, new models that take the structural information of documents into account, at the indexing, querying and retrieval levels, must be built.
The objective of our work is to propose a model for flexible retrieval in corpora of semi-structured documents. This led us to propose the XFIRM model (XML Flexible Information Retrieval Model), based on: (i) a generic data representation model, able to model documents with different structures; (ii) a flexible query language, allowing users to express their need with varying degrees of precision, with or without conditions on the document structure; (iii) a retrieval model based on a relevance propagation method. This model aims to find the most exhaustive and specific information units answering a user query, whether or not the query contains structural conditions. Semi-structured documents can be represented as trees, and the goal is then to find the minimal-size subtrees answering the query. Searches on document content alone are carried out by taking into account the varying importance of the leaves of the subtrees, and by placing the subtrees in their context, that is, by taking into account the relevance of the document. Searches on both the content and the structure of documents are carried out through several relevance propagations in the document tree, in order to perform a vague match between the document tree and the query tree.
The evaluation of our model, using the prototype we developed, shows the value of our proposals, both for searches on content alone and for searches on content and structure.
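The idea of propagating leaf relevance up a document tree can be illustrated with a minimal sketch. It is not the scoring function of the thesis, which the abstract does not give: the distance-decay factor `alpha`, the dictionary-based tree, and the names `leaf_scores`, `relevance` and `rank_elements` are assumptions made for illustration only.

```python
def leaf_scores(node, depth=0):
    """Yield (distance from the starting node, leaf score) for every leaf below node."""
    children = node.get("children", [])
    if not children:
        yield depth, node.get("score", 0.0)
    for child in children:
        yield from leaf_scores(child, depth + 1)

def relevance(node, alpha=0.5):
    # Assumed rule: each leaf contributes its score, damped by alpha for every
    # level beyond the first that separates it from the node.
    return sum(score * alpha ** max(dist - 1, 0) for dist, score in leaf_scores(node))

def rank_elements(root, alpha=0.5):
    """Score every element of the tree and return (score, name) pairs, best first."""
    ranked = []
    stack = [root]
    while stack:
        node = stack.pop()
        ranked.append((relevance(node, alpha), node["name"]))
        stack.extend(node.get("children", []))
    return sorted(ranked, reverse=True)

# Hypothetical document tree; leaf scores stand in for content-only matching.
doc = {"name": "article", "children": [
    {"name": "sec1", "children": [
        {"name": "p1", "score": 0.5},
        {"name": "p2", "score": 0.25},
    ]},
    {"name": "sec2", "children": [{"name": "p3", "score": 0.0}]},
]}
ranked = rank_elements(doc)
```

In this toy tree the section that groups both matching paragraphs ranks above the whole article, because the decay penalizes distant ancestors. That is one simple way to favor units that are both exhaustive and specific.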
Torjmen, Mouna. "Approches de recherche multimédia dans des documents semi-structurés : utilisation du contexte textuel et structurel pour la sélection d'objets multimédia." Toulouse 3, 2009. http://thesesups.ups-tlse.fr/673/.
The evolution of user needs and electronic documents raises new issues in the Information Retrieval (IR) domain. When considering semi-structured documents (XML), the document structure allows Information Retrieval Systems (IRS) to answer the user's information needs more precisely, by returning parts of documents instead of whole documents. With the emergence of structural information in documents, the integration of multimedia content, such as images, has also raised many issues. To make the best use of all the multimedia and structural information, the existing methods of Multimedia Retrieval (MR) must be adapted. Although the use of document structure in textual information retrieval has proven useful, only a few studies have investigated its impact on multimedia retrieval. In the literature, most existing work on structured multimedia retrieval consists either of combining XML textual search with content-based multimedia retrieval, or of running an XML textual search and then filtering the results to keep only those having a multimedia specification. The aim of our work is to propose methods that answer multimedia information needs by taking into account both the document structure and the multimedia specificity. Our approaches can be applied to any type of media (images, audio, video) because they are independent of the physical content of the media. However, we are particularly interested in image retrieval. For the retrieval of multimedia elements (images), the basic idea is to determine their relevance score from the scores of the other, non-multimedia elements. At this stage, the challenge is to select the elements used to evaluate the multimedia element scores. For this purpose, we proposed two approaches, based respectively on the implicit and explicit use of textual and structural context. For the retrieval of multimedia fragments, we use the multimedia elements retrieved by one of the two previous methods to determine the best multimedia fragment to return to the user.
Verdier, Maxime. "Effet de l’orientation et de l’état des surfaces/interfaces sur les propriétés thermiques des semi-conducteurs nano-structurés." Thesis, Université de Lorraine, 2018. http://www.theses.fr/2018LORR0138/document.
This study deals with heat transport in crystalline nanostructured silicon and the impact of amorphization. The thermal conductivity of various nanostructures is computed with two numerical methods: Molecular Dynamics and Monte Carlo resolution of the Boltzmann transport equation. First, materials with spherical nanopores are investigated and the importance of the surface density is highlighted. Then, nanofilms with periodic cylindrical pores, often called phononic crystals, are studied. The density of states computed with Molecular Dynamics does not show major modifications of the properties of the heat carriers (phonons). However, results show that the surface orientation, the pore distribution and the existence of native oxide or amorphous layers may have an important impact on the thermal conductivity. Then, heat transport in nanowires is studied, in particular the radial evolution of the thermal conductivity, which is maximal at the center of the nanowire and decreases when approaching the nanowire surface. Structures made from interconnected nanowires, called nanowire networks, are also studied; they have an extremely low thermal conductivity. Finally, the impact of surface roughness and amorphization on thermal transport is analyzed for different types of nanostructures. These two phenomena contribute strongly to the reduction of the thermal conductivity, which can reach very low values while keeping a large crystalline fraction. This opens new perspectives for controlling this property through material design.