Dissertations / Theses on the topic "Génération en langue naturelle"
Below are the top 50 dissertations (master's and doctoral theses) for research on the topic "Génération en langue naturelle".
Ponton, Claude. "Génération automatique de textes en langue naturelle : essai de définition d'un système noyau". Grenoble 3, 1996. http://www.theses.fr/1996GRE39030.
A feature common to many generation systems is their strong dependence on the application. Although a few attempts to define "non-dedicated" systems have been made, none of them makes it possible to take into account the characteristics of the application (such as its formalism) and the communication context (application field, user, ...). The purpose of this thesis is the definition of a generation system that is both non-dedicated and able to take these elements into account. Such a system is called a "kernel generation system". With this in mind, we studied 94 generation systems against a set of objective criteria; this survey serves as the basis for the rest of our work. Defining a kernel generator requires determining the frontier between the application and the kernel generator (generator tasks, inputs, outputs, data, ...): the role of each part and the ways they communicate must be clear before the kernel generator can be designed. This study leads our generator to accept as input any formal content representation, together with a set of constraints describing the communication context. The kernel generator then handles what is generally called the "how to say it?" question and can produce every solution satisfying the input constraints. This definition work is followed by a first generator prototype, which was tested on two applications that differ in every respect (formalism, field, type of texts, ...). Finally, this work opens onto some evolution perspectives for the generator, particularly regarding the knowledge representation formalism (cotopies d'objets) and the architecture (distributed architecture).
Balicco, Laurence. "Génération de repliques en français dans une interface homme-machine en langue naturelle". Grenoble 2, 1993. http://www.theses.fr/1993GRE21025.
This research takes place in the context of natural language generation, a field long neglected because it seemed a much easier phase than analysis. The thesis is the first work on generation in the CRISS team and places the problem of generation in the context of a man-machine dialogue in natural language. Some of its consequences are: generation from a logical content to be translated into natural language, with the translation kept as close as possible to the original content, ... After studying existing work, we decided to create our own generation system, reusing where possible the tools elaborated during the analysis process. This generation process is based on a linguistic model which uses syntactic and morphological information and in which linguistic transformations called operations (coordination, anaphorisation, thematisation, ...) are defined. These operations can be supplied by the dialogue or computed during the generation process. The model allows several versions of the same utterance to be created, and therefore better adaptation to different users. This thesis presents the studied works, essentially on French and English, the linguistic model developed, the computing model used, and a brief presentation of a European project which offers a possible application of our work.
Garcia-Fernandez, Anne. "Génération de réponses en langue naturelle orales et écrites pour les systèmes de question-réponse en domaine ouvert". Phd thesis, Université Paris Sud - Paris XI, 2010. http://tel.archives-ouvertes.fr/tel-00603358.
Bourcier, Frédéric. "Représentation des connaissances pour la résolution de problèmes et la génération d'explications en langue naturelle : contribution au projet AIDE". Compiègne, 1996. http://www.theses.fr/1996COMPD903.
Popesco, Liana. "Analyse et génération de textes à partir d'un seul ensemble de connaissances pour chaque langue naturelle et de meta-règles de structuration". Paris 6, 1986. http://www.theses.fr/1986PA066138.
Namer, Fiammetta. "Pronominalisation et effacement du sujet en génération automatique de textes en langues romanes". Paris 7, 1990. http://www.theses.fr/1990PA077249.
Perez, Laura Haide. "Génération automatique de phrases pour l'apprentissage des langues". Thesis, Université de Lorraine, 2013. http://www.theses.fr/2013LORR0062/document.
In this work, we explore how Natural Language Generation (NLG) techniques can be used to address the task of (semi-)automatically generating language learning material and activities in Computer-Assisted Language Learning (CALL). In particular, we show how a grammar-based Surface Realiser (SR) can be usefully exploited for the automatic creation of grammar exercises. Our surface realiser uses a wide-coverage reversible grammar, namely SemTAG, which is a Feature-Based Tree Adjoining Grammar (FB-TAG) equipped with a unification-based compositional semantics. More precisely, the FB-TAG grammar integrates a flat and underspecified representation of First Order Logic (FOL) formulae. In the first part of the thesis, we study the task of surface realisation from flat semantic formulae and propose an optimised FB-TAG-based realisation algorithm that supports the generation of longer sentences given a large-scale grammar and lexicon. The approach followed to optimise TAG-based surface realisation from flat semantics draws on the fact that an FB-TAG can be translated into a Feature-Based Regular Tree Grammar (FB-RTG) describing its derivation trees. The derivation tree language of TAG constitutes a simpler language than the derived tree language, and thus generation approaches based on derivation trees have already been proposed. Our approach departs from previous ones in that our FB-RTG encoding accounts for the feature structures present in the original FB-TAG, with important consequences regarding over-generation and the preservation of the syntax-semantics interface. The concrete derivation tree generation algorithm that we propose is an Earley-style algorithm integrating a set of well-known optimisation techniques: tabulation, sharing-packing, and semantic-based indexing. In the second part of the thesis, we explore how our SemTAG-based surface realiser can be put to work for the (semi-)automatic generation of grammar exercises.
Usually, teachers manually edit exercises and their solutions and classify them according to the degree of difficulty or expected learner level. A strand of research in Natural Language Processing (NLP) for CALL addresses the (semi-)automatic generation of exercises. Mostly, this work draws on texts extracted from the Web and uses machine learning and text analysis techniques (e.g. parsing, POS tagging, etc.). These approaches expose the learner to sentences that have a potentially complex syntax and diverse vocabulary. In contrast, the approach we propose in this thesis addresses the (semi-)automatic generation of grammar exercises of the type found in grammar textbooks. In other words, it deals with the generation of exercises whose syntax and vocabulary are tailored to specific pedagogical goals and topics. Because the grammar-based generation approach associates natural language sentences with a rich linguistic description, it permits the definition of a syntactic and morpho-syntactic constraint specification language for the selection of stem sentences in compliance with a given pedagogical goal. Further, it allows for the post-processing of the generated stem sentences to build grammar exercise items. We show how Fill-in-the-blank, Shuffle and Reformulation grammar exercises can be automatically produced. The approach has been integrated into the Interactive French Learning Game (I-FLEG), a serious game for learning French, and has been evaluated both through interactions with online players and in collaboration with a language teacher.
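The Fill-in-the-blank exercise construction mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's SemTAG pipeline: here the target words are given by hand, whereas the thesis derives them from the grammar's linguistic annotations.

```python
import re

def fill_in_the_blank(sentence, target_words):
    """Blank out each target word, producing one exercise item per word.

    A toy illustration of Fill-in-the-blank item construction; the list
    of target words is assumed to come from some pedagogical selection step.
    """
    items = []
    for word in target_words:
        # Replace the word (matched as a whole token) with a blank.
        stem = re.sub(rf"\b{re.escape(word)}\b", "____", sentence)
        items.append({"stem": stem, "solution": word})
    return items

items = fill_in_the_blank("Le chat mange la souris", ["mange"])
print(items[0]["stem"])      # Le chat ____ la souris
print(items[0]["solution"])  # mange
```

Shuffle and Reformulation items would similarly post-process the stem sentence rather than blank a token.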
Hadjadj, Mohammed. "Modélisation de la Langue des Signes Française : Proposition d’un système à compositionalité sémantique". Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLS560/document.
The recognition of French Sign Language (LSF) as a natural language in 2005 created an important need for tools to make information accessible to the deaf public. With this prospect, this thesis aims at a linguistic model for an LSF generation system. We first present the different linguistic approaches aimed at describing sign language (SL), and then the models proposed in computer science. In a second step, we propose an approach that takes the linguistic properties of SL into account while respecting the constraints of a formalisation process. By studying the links between semantic functions and their observed forms in LSF corpora, we identified several production rules. We finally present how these rules function as a system capable of modelling an entire utterance in LSF.
Shimorina, Anastasia. "Natural Language Generation : From Data Creation to Evaluation via Modelling". Electronic Thesis or Diss., Université de Lorraine, 2021. http://www.theses.fr/2021LORR0080.
Natural language generation is the process of generating a natural language text from some input. This input can be texts, documents, images, tables, knowledge graphs, databases, dialogue acts, meaning representations, etc. Recent methods in natural language generation, mostly based on neural modelling, have yielded significant improvements in the field. Despite this recent success, numerous issues with generation prevail, such as faithfulness to the source, the development of multilingual models, and few-shot generation. This thesis explores several facets of natural language generation, from creating training datasets and developing models to evaluating proposed methods and model outputs. We address the issue of multilinguality and propose possible strategies to semi-automatically translate corpora for data-to-text generation. We show that named entities constitute a major stumbling block in translation, as exemplified by the English-Russian translation pair. We proceed to handle rare entities in data-to-text modelling, exploring two mechanisms: copying and delexicalisation. We demonstrate that rare entities strongly impact performance and that the impact of these two mechanisms varies greatly depending on how datasets are constructed. Returning to multilinguality, we also develop a modular approach for shallow surface realisation in several languages. Our approach splits the surface realisation task into three submodules: word ordering, morphological inflection and contraction generation. We show, via delexicalisation, that the word ordering component mainly depends on syntactic information. Along with the modelling, we also propose a framework for error analysis, focused on word order, for the shallow surface realisation task. The framework makes it possible to provide linguistic insights into model performance at the sentence level and to identify patterns where models underperform. Finally, we also touch upon evaluation design, assessing automatic and human metrics and highlighting the difference between sentence-level and system-level evaluation.
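The delexicalisation mechanism discussed in the abstract can be sketched as follows: named entities are replaced by placeholders before modelling, then restored in the generated output. This is a simplified illustration under assumed inputs, not the thesis's actual pipeline; the entity list is taken as given.

```python
def delexicalise(text, entities):
    """Replace each entity with a numbered placeholder; return text + mapping."""
    mapping = {}
    for i, ent in enumerate(entities):
        placeholder = f"ENTITY_{i}"
        text = text.replace(ent, placeholder)
        mapping[placeholder] = ent
    return text, mapping

def relexicalise(text, mapping):
    """Restore the original entities in a generated text."""
    for placeholder, ent in mapping.items():
        text = text.replace(placeholder, ent)
    return text

delex, mapping = delexicalise("Alan Bean was born in Wheeler.", ["Alan Bean", "Wheeler"])
print(delex)                          # ENTITY_0 was born in ENTITY_1.
print(relexicalise(delex, mapping))   # Alan Bean was born in Wheeler.
```

The point of the mechanism is that the model only ever sees the placeholders, so rare entity strings no longer hurt its performance.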
Kervajan, Loïc. "Contribution à la traduction automatique français/langue des signes française (LSF) au moyen de personnages virtuels : Contribution à la génération automatique de la LSF". Thesis, Aix-Marseille 1, 2011. http://www.theses.fr/2011AIX10172.
Since the law of 11 February 2005 on equal rights and opportunities was passed, places open to the public (public places, shops, the internet, etc.) should welcome the Deaf in French Sign Language (FSL). We have worked on the development of technological tools to promote FSL, especially machine translation from written French to FSL. Our thesis begins with a presentation of knowledge on FSL (theoretical resources and ways to edit FSL) and continues with further concepts of descriptive grammar. Our working hypothesis is: FSL is a language and, therefore, machine translation is relevant. We describe the language specifications for automatic processing, based on scientific knowledge and on the proposals of our native FSL-speaking informants. We also expose our methodology and present the advancement of our work in the formalisation of linguistic data based on the specificities of FSL, some of which (verb schemes, adjective and adverb modification, organisation of nouns, agreement patterns) require further analysis. We present the application framework in which we worked: the machine translation system and the virtual character animation system of France Telecom R&D. After a short presentation of avatar technology, we explain how we control the gesture synthesis engine through the exchange format that we developed. Finally, we conclude with an evaluation and with research and development perspectives that could follow this thesis. Our approach has produced its first results, since we have achieved our goal of running the full translation chain: from the input of a sentence in French to the realisation of the corresponding sentence in FSL by a synthetic character.
Narayan, Shashi. "Generating and simplifying sentences". Thesis, Université de Lorraine, 2014. http://www.theses.fr/2014LORR0166/document.
Depending on the input representation, this dissertation investigates issues from two classes: meaning representation (MR) to text, and text-to-text generation. In the first class (MR-to-text generation, "Generating Sentences"), we investigate how to make symbolic grammar-based surface realisation robust and efficient. We propose an efficient approach to surface realisation using an FB-LTAG and taking shallow dependency trees as input. Our algorithm combines techniques and ideas from the head-driven and lexicalist approaches. In addition, the input structure is used to filter the initial search space, using a concept called local polarity filtering, and to parallelise processes. To further improve robustness, we propose two error mining algorithms: first, an algorithm for mining dependency trees rather than sequential data; second, an algorithm that structures the output of error mining into a tree to represent it in a more meaningful way. We show that our realisers, together with these error mining algorithms, improve on both efficiency and coverage by a wide margin. In the second class (text-to-text generation, "Simplifying Sentences"), we argue for using deep semantic representations (as opposed to syntax- or SMT-based approaches) to improve the sentence simplification task. We use Discourse Representation Structures as the deep semantic representation of the input. We propose two methods: a supervised approach (with state-of-the-art results) to hybrid simplification using deep semantics and SMT, and an unsupervised approach (with results competitive with state-of-the-art systems) to simplification using the comparable Wikipedia corpus.
Carroy, Bertrand. "La génération naturelle chez Thomas d’Aquin". Paris 4, 2007. http://www.theses.fr/2006PA040156.
The generation of bodies is a basic concept for anyone studying nature. Pervasive since the beginning of Greek philosophy, it was taken up by Christian thought, which gave it a double destiny: on the one hand it seems to be overshadowed for the benefit of the notion of Creation; on the other hand it is transformed in theological discourse to express the Trinitarian relation of the Father and the Son. The goal of this study is to show how Thomas Aquinas, a great witness of the thirteenth century and an actor in the reception of Aristotelian theoria into theological discourse, understands and uses the concept of natural generation. Through a precise study of the central philosophical topic and of its theological application, Thomas's project of unifying faith and reason appears clearly. The means used in this study are a thorough inventory of the texts and the ordering of the great notions composing his thought on natural generation: its principles, specificity and divisions (elements, inanimate bodies, plants, animals), and the case of human generation. These notions bring together some of the crucial medieval questions, particularly those of the eternity of the world and the plurality of forms. Through a reasoned use of Aristotle's corpus, giving intelligibility to nature and to the Revelation manifested through it, Thomas Aquinas shows both an absolute respect for Holy Scripture and a fine intellectual daring.
Patoz, Evelyne. "Génération de représentations topologiques à partir de requêtes en langage naturel". Besançon, 2006. http://www.theses.fr/2006BESA1031.
From the study of the reasoning and visual perception abilities that a human being uses to locate objects in space, we elaborate a theoretical model allowing a computing system to situate an object in space by means of linguistic signs. To this end, the role of linguistic activity is studied in its constructive role for spatial representation, but another cognitive factor also proves essential: visual perception. Since visual perception rests to a large extent on information produced as a function of an observer's knowledge of the universe, interpretation can lead to a mental representation. The notion of representation is thus linked to a reality of objects whose very existence depends on the perceptive aptitude of a particular individual. The representation is no longer examined as a construction over a given configuration, but relative to an environmental perception. We show that the dynamic generation of a spatial representation depends on several parameters, of which the most important is the identification of a point of reference. We develop a software application, integrating a speech factor, that permits a user to direct a robot in an area and thus to give an account of the state of the world as the robot can evaluate it.
Azeraf, Elie. "Classification avec des modèles probabilistes génératifs et des réseaux de neurones. Applications au traitement des langues naturelles". Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. https://theses.hal.science/tel-03880848.
Many probabilistic models, such as Naive Bayes or the Hidden Markov Chain, have been neglected for supervised classification tasks for several years. These models, called generative, are criticized because the induced classifier must learn the observations' law. This becomes too complex when the number of observation features is too large. It is especially the case with Natural Language Processing tasks, as recent embedding algorithms convert words into large numerical vectors to achieve better scores. This thesis shows that every generative model can define its induced classifier without using the observations' law. This proposition questions the usual categorization of probabilistic models and classifiers and allows many new applications; the Hidden Markov Chain can thus be efficiently applied to chunking, and Naive Bayes to sentiment analysis. We go further, as this proposition allows the classifier induced from a generative model to be defined with neural network functions. We "neuralize" the models mentioned above and many of their extensions. The models so obtained achieve relevant scores on many Natural Language Processing tasks while being interpretable, requiring little training data, and being easy to serve.
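As a reminder of what the abstract means by the induced classifier of a generative model, here is a minimal classical Naive Bayes sketch: prediction is argmax over y of p(y) times the product of p(x_i | y), which requires estimating the observations' law p(x_i | y), the very requirement the thesis shows can be dropped. The toy corpus and Laplace smoothing are illustrative, not from the thesis.

```python
from collections import Counter, defaultdict
import math

class NaiveBayes:
    """Classical Naive Bayes: predicts argmax_y p(y) * prod_i p(x_i | y)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)  # per-class word frequencies
        self.vocab = set()
        for doc, y in zip(docs, labels):
            for w in doc.split():
                self.word_counts[y][w] += 1
                self.vocab.add(w)
        return self

    def predict(self, doc):
        n = sum(self.class_counts.values())
        best, best_lp = None, -math.inf
        for y, cy in self.class_counts.items():
            lp = math.log(cy / n)  # log prior p(y)
            total = sum(self.word_counts[y].values())
            for w in doc.split():
                # Laplace-smoothed estimate of the observation law p(w | y):
                # this is the term a purely discriminative view would avoid.
                lp += math.log((self.word_counts[y][w] + 1) / (total + len(self.vocab)))
            if lp > best_lp:
                best, best_lp = y, lp
        return best

nb = NaiveBayes().fit(["good great film", "bad awful film"], ["pos", "neg"])
print(nb.predict("great film"))  # pos
```

The thesis's contribution is precisely that this per-class observation model can be replaced, e.g. by neural network functions, without changing the induced decision rule.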
Martineau, Claude. "Compression de textes en langue naturelle". Marne-la-Vallée, 2001. https://hal.archives-ouvertes.fr/tel-02076650.
In this PhD thesis we investigate several data compression methods for text in natural language. Our study focuses on algorithms that use the word as the basic unit; they are usually called word-based text compression algorithms. We have developed algorithms that divide the original size of the text by an average factor of 3.5 while keeping (by means of an index) direct access to the compressed form of the text. The set of words of a text (its lexicon) is not known a priori, and an efficient compression of the text requires an efficient compression of its lexicon. For this purpose, we have developed a compact representation of the lexicon that allows, by applying Markov-chain-based compression algorithms, very high compression rates to be reached. The early algorithms dedicated to compressing natural language text were elaborated to process very large text databases, in which the size of the lexicon is very small compared to that of the data. Our algorithms can also be applied to everyday text sizes (from some fifty KB up to a few MB), for which the lexicon makes up an important part of the size of the text.
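A minimal illustration of the word-based idea (not the author's algorithm): tokens are replaced by indices into a lexicon, so each occurrence of a word costs a small integer instead of its character string; an entropy coder over the indices would then do the actual size reduction.

```python
def compress(text):
    """Word-based coding: store the lexicon once, then one index per token."""
    tokens = text.split()
    lexicon = sorted(set(tokens))
    index = {w: i for i, w in enumerate(lexicon)}
    return lexicon, [index[t] for t in tokens]

def decompress(lexicon, codes):
    """Rebuild the text from the lexicon and the index stream."""
    return " ".join(lexicon[c] for c in codes)

lexicon, codes = compress("the cat and the dog and the bird")
print(codes)                        # [4, 2, 0, 4, 3, 0, 4, 1]
print(decompress(lexicon, codes))   # the cat and the dog and the bird
```

The thesis's point about the lexicon is visible even here: for short texts the stored lexicon dominates, so compressing the lexicon itself becomes essential.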
Moriceau, Véronique. "Intégration de données dans un système question-réponse sur le Web". Toulouse 3, 2007. http://www.theses.fr/2007TOU30019.
In the framework of question-answering systems on the Web, our main goals are to model, develop and evaluate a system which can, from a question in natural language, search for relevant answers on the Web and generate a synthetic answer, even when the search engine selects several candidate answers. We focused on temporal and numerical questions. Our system deals with: - the integration of data from candidate answers, using a knowledge base and knowledge extracted from the Web; this component detects data inconsistencies and deals with user expectations in order to produce a relevant answer; - the generation of synthetic answers in natural language which are relevant with respect to users. Indeed, generated answers have to be short and understandable, and have to express the cooperative know-how used to solve data inconsistencies. We also propose methods to evaluate our system from a technical and cognitive point of view.
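The integration step described in the abstract can be sketched with a toy stand-in (not the thesis's component, which uses a knowledge base): candidate answers from several pages are merged, and disagreement is flagged explicitly so the generated answer can explain the inconsistency instead of hiding it.

```python
from collections import Counter

def integrate_answers(candidates):
    """Merge candidate answers from several Web pages into one synthetic answer.

    Majority voting with an explicit consistency flag; the real system would
    also consult a knowledge base before deciding which candidate to trust.
    """
    counts = Counter(candidates)
    answer, freq = counts.most_common(1)[0]
    consistent = len(counts) == 1
    return {"answer": answer, "support": freq, "consistent": consistent}

result = integrate_answers(["1969", "1969", "1968"])
print(result)  # {'answer': '1969', 'support': 2, 'consistent': False}
```

When `consistent` is false, a cooperative answer generator can verbalise the conflict ("most sources say 1969, one says 1968") rather than silently pick one value.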
Petitjean, Simon. "Génération modulaire de grammaires formelles". Thesis, Orléans, 2014. http://www.theses.fr/2014ORLE2048/document.
The work presented in this thesis aims at facilitating the development of resources for natural language processing. Resources of this type take different forms, because of the existence of several levels of linguistic description (syntax, morphology, semantics, ...) and of several formalisms proposed for the description of natural languages at each of these levels. Since the formalisms feature different types of structures, a single description language is not enough: it is necessary to create a domain-specific language (or DSL) for every formalism, and to implement a new tool which uses this language, which is a long and complex task. For this reason, we propose in this thesis a method to assemble, in a modular way, development frameworks specific to tasks of linguistic resource generation. The frameworks assembled thanks to our method are based on the fundamental concepts of the XMG (eXtensible MetaGrammar) approach, allowing the generation of tree-based grammars. The method is based on assembling a description language from reusable bricks, according to a single specification file; the whole processing chain for the DSL is automatically assembled from the same specification. We first validated this approach by recreating the XMG tool from elementary bricks. Collaborations with linguists also led us to assemble compilers allowing the description of morphology and semantics.
Belec, Yves. "Des règles expertes pour une méthode applicative d'analyse ou de génération du langage naturel". Toulouse 3, 1990. http://www.theses.fr/1990TOU30136.
Froissart, Christel. "Robustesse des interfaces homme-machine en langue naturelle". Grenoble 2, 1992. http://www.theses.fr/1992GRE29053.
Having demonstrated that robustness is currently a crucial problem for systems based on a natural language man-machine interface, we show the extent of the problem through an analysis of the research carried out on error processing. We define a deviation as any element which violates academic use of the language and/or the system's expectations at any level of analysis. We then show that a robust strategy must solve the double bind between tolerance (relaxing constraints) and the selection of the most plausible solution (constriction). We propose to identify deviations (either real or potential) which are not detected by the natural language understanding system by questioning the validity of the user's input as early as possible. We suggest a strategy based on additional knowledge that must be modelled in order to put in place predictive mechanisms that control the robust processing, so as to direct suspicion towards the plausible deviation and to direct its processing towards the most likely hypothesis. This body of knowledge is derived from: - data provided by the very operation of the parser, thanks to a multi-agent structure; - external data (linguistic, cognitive, ergonomic) structured in five models constructed from a corpus of man-machine dialogue: the technological model, the field and application model, the language model (and its pitfalls), the dialogue model and the user's model.
Thévenon, Patrick. "Vers un assistant à la preuve en langue naturelle". Chambéry, 2006. http://www.theses.fr/2006CHAMS036.
This thesis is the conclusion of three years of work in a project named DemoNat, whose aim is to design a system able to analyse and validate mathematical proofs written in a natural language. The general scheme of the system is the following: 1. Analysis of the proof by means of linguistic tools; 2. Translation of the proof into a restricted language; 3. Interpretation of the translated text as a tree of deduction rules; 4. Validation of the deduction rules with an automatic prover. The project involved teams of linguists and logicians, the first two phases being the task of the linguists and the last two the task of the logicians. This thesis presents the project in more detail and mainly develops the following points: - the definition of the restricted language and its interpretation; - properties of the principal type of terms of a typed λ-calculus with two arrows, which is part of a linguistic tool, the ACGs; - a description of the automatic prover.
Membrado, Miguel. "Génération d'un système conceptuel écrit en langage de type semi-naturel en vue d'un traitment des données textuelles : application au langage médical". Paris 11, 1989. http://www.theses.fr/1989PA112004.
We present our research and our own realization of a KBMS (Knowledge-Based Management System) aiming at processing any kind of data, especially textual data, and the related knowledge. In this field of applied Artificial Intelligence, we propose a way of representing knowledge: describing it in a semi-natural language able to describe structures and relations as well as rules. Knowledge is managed as conceptual definitions recorded in a dictionary which represents the knowledge base. The power of this language allows the processing of many ambiguities, especially those coming from contextual polysemy, makes it possible to deal with metonymy and incomplete knowledge, and resolves several kinds of paraphrases. Simultaneous polyhierarchies as well as chunks are taken into account. The system has been specially designed for the automatic processing of medical reports; an application to neuroradiology is taken as an example, but it could be applied to any other field, including, outside medicine, any professional field. Text analysis is carried out in two steps: first a conceptual extraction, then a structural analysis. Only the first step is addressed in this thesis. It aims at retrieving relevant documents, matching them to a given question by comparing concepts, not character strings. An overview of the second step is presented. The final goal is to be able to retrieve the knowledge contained in the texts, i.e. the data themselves, and to manage it with respect to the knowledge represented in the dictionaries
Lemeunier, Thierry. "L'intentionnalité communicative dans le dialogue homme-machine en langue naturelle". Phd thesis, Université du Maine, 2000. http://tel.archives-ouvertes.fr/tel-00003771.
Our model rests on the idea that the meaning exchanged between the participants in a conversation does not pre-exist it but is, on the contrary, negotiated and co-constructed by the participants during the conversation. This co-construction relies on the hypothesis of a common ground, that is, a set of knowledge, assumptions and beliefs that the speaker believes to be shared.
Our work consisted in defining an interactional memory for the machine that supports the work of meaning negotiation. This memory contains elements in various states, organised as trees. These elements come from the interpretation of the user's illocutionary acts and from the results of the reasoning carried out by the different activities of the dialogue system. We distinguish the applicative activity, whose goal is to provide some service to the user; the linguistic activity, which consists in analysing the user's utterances and generating the system's utterances; and finally the dialogic activity, which consists in conversing with the user. The machine's communicative intentions are generated by recognising remarkable configurations that we defined by studying the trees it is normally possible to obtain. This generation principle, which underlies the machine's language acts, is general and application-independent. It relies only on the structural form of the memory elements (called UMM, for Unité Minimale de Mémoire, i.e. minimal memory unit) and on their status.
Derouault, Anne-Marie. "Modélisation d'une langue naturelle pour la désambiguation des chaînes phonétiques". Paris 7, 1985. http://www.theses.fr/1985PA077028.
Striegnitz, Kristina. "Génération d'expressions anaphoriques : Raisonnement contextuel et planification de phrases". Nancy 1, 2004. http://www.theses.fr/2004NAN10186.
This thesis investigates the contextual reasoning involved in the production of anaphoric expressions in natural language generation systems. More specifically, I propose generation strategies for two types of discourse anaphora which have not been treated in generation before: bridging descriptions and additive particles. To this end the contextual conditions that govern the use of these expressions have to be formalized. The formalization that I propose is based on notions from linguistics and extends previous approaches to the generation of co-referential anaphora. I then specify the reasoning tasks that have to be carried out in order to check the contextual conditions. I describe how they can be implemented using a state-of-the-art reasoning system for description logics, and I compare my proposal to alternative approaches using other kinds of reasoning tools. Finally, I describe an experimental implementation of the proposed approach
Pho, Van-Minh. "Génération automatique de questionnaires à choix multiples pédagogiques : évaluation de l'homogénéité des options". Thesis, Paris 11, 2015. http://www.theses.fr/2015PA112192/document.
Recent years have seen a revival of Intelligent Tutoring Systems. In order to make these systems widely usable by teachers and learners, they have to provide means of assisting teachers in their task of exercise generation. Among these exercises, multiple-choice tests are very common. However, writing Multiple-Choice Questions (MCQs) that correctly assess a learner's level is a complex task. Guidelines have been developed for writing MCQs manually, but an automatic evaluation of MCQ quality would be a useful tool for teachers. We are interested in the automatic evaluation of distractor (wrong answer choice) quality. To do this, we studied the characteristics of relevant distractors in multiple-choice test writing guidelines. This study led us to assume that homogeneity between the distractors and the answer is an important criterion for validating distractors, homogeneity being both syntactic and semantic. We validated this definition of homogeneity by analysing an MCQ corpus, and on the basis of this analysis we proposed methods for the automatic recognition of syntactic and semantic homogeneity. We then focused our work on the semantic homogeneity of distractors. To estimate it automatically, we proposed a machine-learning ranking model combining different semantic homogeneity measures. The evaluation of the model showed that our method estimates distractor semantic homogeneity more effectively than existing work
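The ranking idea described in this abstract can be sketched in a few lines. This is only an illustrative sketch: the feature names (`sim_to_answer`, `same_pos`), weights and candidates below are invented for the example and are not the thesis's actual model, which learns the combination by machine learning.

```python
# Toy sketch: rank candidate distractors by a weighted combination of
# semantic-homogeneity measures with respect to the correct answer.

def homogeneity_score(features, weights):
    """Combine per-distractor measures into a single score."""
    return sum(weights[name] * value for name, value in features.items())

def rank_distractors(candidates, weights):
    """Return candidates sorted from most to least homogeneous."""
    return sorted(candidates,
                  key=lambda c: homogeneity_score(c["features"], weights),
                  reverse=True)

# Hypothetical candidates for the answer "meiosis":
candidates = [
    {"text": "mitosis", "features": {"sim_to_answer": 0.81, "same_pos": 1.0}},
    {"text": "giraffe", "features": {"sim_to_answer": 0.12, "same_pos": 1.0}},
    {"text": "quickly", "features": {"sim_to_answer": 0.35, "same_pos": 0.0}},
]
weights = {"sim_to_answer": 0.7, "same_pos": 0.3}

ranking = [c["text"] for c in rank_distractors(candidates, weights)]
```

In the thesis's setting, the weights would be learned from annotated MCQ data rather than fixed by hand, and the measures would include both distributional similarity and syntactic compatibility.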
Cheminot, Eric. "Formalisation de spécifications de logiciels : traitement d'annotations en langue naturelle contrôlée". Grenoble INPG, 1999. http://www.theses.fr/1999INPG0171.
El, Kassas Dina. "Une étude contrastive de l'arabe et du français dans une perspective de génération multilingue". Paris 7, 2005. http://www.theses.fr/2005PA070034.
The present PhD research was conducted in a dependency grammar framework: the Meaning-Text theory. Its objective is twofold. First of all, it is meant to provide a syntactic analysis of Arabic. The theory we put forward is that the syntactic head of declarative sentences in Arabic is systematically the verb. The active valency of the Arabic verb is studied in order to establish a library of syntactic functions for Arabic. We start by identifying typical predicative units, the grammaticalization of which goes beyond the simple juxtaposition of propositions, and consider the analytical verb forms as well. We then propose a topological model dealing with the linearisation of dependency syntactic structures. Secondly, a contrastive study of syntactic structures in Arabic and their French equivalents is undertaken, with a view to making them converge at a more abstract level. We point out the extent to which lexical choices and the underlying logic of the language influence the representation of information, which makes multilingual approaches based on a pivot language seem utopian
LAVAUD, MARIE-PIERRE. "Pragmatique, logique naturelle et argumentation : le connecteur or et ses équivalents en espagnol". Dijon, 1994. http://www.theses.fr/1994DIJOL002.
This study deals with the French connective "or" and its Spanish equivalents (pero, ahora bien, pues bien, sin embargo, no obstante, y eso que, es así que), in a perspective which combines the pragmatic and logical aspects of the question and which aims at reconciling "new linguistics" with tradition (data from dictionaries and grammar books). As far as pragmatics is concerned, our methodology is mainly drawn from O. Ducrot's works and the conversational theories as defined by the Geneva school. This study also relies on the works of the mathematician and logician J.-B. Grize, in order to compare formal (mathematical) logic with a certain natural logic present in speech. The study consists of three parts: first an introduction to the theoretical apparatus, then the study of the French connective "or", and thirdly the study of its Spanish equivalents; the conclusion draws up a balance. Is it possible to give a single description of the French connective "or"? What are its equivalents in Spanish? How far does the equivalence relationship go? In other words, does Spanish follow the same steps in the thinking process as French does? These are the questions the present study tries to answer
Boutouhami, Sara. "Un système de générations de descriptions argumentées". Paris 13, 2010. http://www.theses.fr/2010PA132014.
In this thesis, we investigate the expression of arguments in natural language (NL). Our work has two motivations: a theoretical one, to understand and simulate the kind of reasoning that underlies the argumentative process and to clarify the intuition that distinguishes good arguments from bad ones; and a practical one, to eventually provide assistance in writing well-argued text descriptions. The objective of this thesis is the realization of a system that can generate a description arguing as well as possible for one of the protagonists of an accident. In this work, we make the reasoning and language components cooperate in various ways within the same architecture. The idea is to take advantage of advances in artificial intelligence in the formalization of reasoning in order to reproduce a basic form of argument used by people every day, one which draws much of its force from the flexibility and subjectivity of NL. For knowledge representation and reasoning, we defined a reified first-order language which takes into account some useful notions, temporal information and non-monotonic inferences, the latter expressed using a fragment of Reiter's default logic. For the implementation, we used the Answer Set Programming paradigm, translating our inference rules into extended logic programs. Finally, to validate the quality of the descriptions generated by our system, we conducted a psychological experiment with the help of specialists in cognitive psychology. The results of this experiment are encouraging and confirm the overall relevance of the argumentative strategies that we simulated
Hankach, Pierre. "Génération automatique de textes par satisfaction de contraintes". Paris 7, 2009. http://www.theses.fr/2009PA070027.
We address in this thesis the construction of a natural language generation system: a computer program that transforms a formal representation of information into a text in natural language. In our approach, we define the generation problem as a constraint satisfaction problem (CSP). The implemented system ensures an integrated processing of generation operations, as their different dependencies are taken into account and no type of operation is given priority over the others. In order to define the constraint satisfaction problem, we represent the construction operations of a text by decision variables. Individual operations that produce the same type of minimal expressions in the text form a generation task. We classify decision variables according to the type of operations they represent (e.g. content selection variables, document structuring variables, etc.). The linguistic rules that govern the operations are represented as constraints on the variables. A constraint can be defined over variables of the same type or of different types, capturing the dependency between the corresponding operations. Producing a text consists in solving the global system of constraints, that is, finding an assignment of the variables that satisfies all the constraints. As part of the grammar of constraints for generation, we formulate in particular the constraints that govern document structuring operations. We model the rhetorical structure of SDRT by constraints in order to yield coherent texts as the generator's output. Beforehand, in order to increase the generation capacities of our system, we extend the rhetorical structure to cover texts in non-canonical order. Furthermore, in addition to defining these coherence constraints, we formulate a set of constraints that makes it possible to control the form of the macrostructure through communicative goals. Finally, we propose a solution to the problem of the computational complexity of generating large texts.
This solution is based on generating the text by groups of clauses. The problem of generating a text is thus divided into several problems of reduced complexity, each concerned with generating one part of the text. These parts are of limited size, so that the complexity associated with their generation remains reasonable. The proposed partitioning of the generation process is motivated by linguistic considerations
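As a rough illustration of the CSP framing described in this abstract (the variables, domains and constraints below are invented for the example, not taken from the thesis), a tiny generation problem can be encoded as decision variables with linguistic rules as constraints and solved by exhaustive search:

```python
# Toy generation CSP: decide which facts to select and in what order,
# subject to coherence and structuring constraints.
from itertools import product

# Decision variables and their domains.
domains = {
    "select_cause":  [True, False],
    "select_effect": [True, False],
    "order":         ["cause_first", "effect_first"],
}

def satisfies(assign):
    # Coherence constraint: an effect may only be stated with its cause.
    if assign["select_effect"] and not assign["select_cause"]:
        return False
    # Structuring constraint: when both are stated, the cause comes first.
    if assign["select_cause"] and assign["select_effect"]:
        if assign["order"] != "cause_first":
            return False
    # Communicative-goal constraint: at least one fact must be expressed.
    return assign["select_cause"] or assign["select_effect"]

# Solve by enumerating all assignments and keeping the consistent ones.
names = list(domains)
solutions = [dict(zip(names, values))
             for values in product(*domains.values())
             if satisfies(dict(zip(names, values)))]
```

A real generator would of course use a constraint solver with propagation rather than brute force, and its variables would range over content selection, document structuring and lexical choice simultaneously, which is precisely what makes the integrated processing possible.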
Hue, Jean-François. "L'analyse contextuelle des textes en langue naturelle : les systèmes de réécritures typées". Nantes, 1995. http://www.theses.fr/1995NANT2034.
Tromeur, Laurent. "Mise en place d'une interface en langue naturelle pour la plateforme Ontomantics". Paris 13, 2011. http://scbd-sto.univ-paris13.fr/secure/ederasme_th_2011_tromeur.pdf.
Delumeau, Fabrice. "Une description linguistique du créole guadeloupéen dans la perspective de la génération automatique d'énoncés". Paris 10, 2006. http://www.theses.fr/2006PA100003.
The aim of this PhD thesis is to put forward a description of the Creole of Guadeloupe in the perspective of the automatic generation of utterances in Creole, using contemporary French as input. In phonology and morpho-phonology, the permanent features one observes point to rules accounting for what is called "synchronic creolisation". As regards the syntactic domain, the emphasis is laid on the differences between French and Creole, and a formalised description of the main structures of Guadeloupe Creole is presented
Aslanides, Sophie. "Syntaxe et structure d'un texte : les connecteurs du français dans un système de génération automatique". Paris 7, 1995. http://www.theses.fr/1995PA070081.
This study aims at defining the content and structure of the linguistic databases of an NLG system. More precisely, it concentrates on the lexical encoding of cue-phrases - among which we include the full stop, complex verb phrases, relativization and participles - and the evaluation of the potential ambiguities of a complex discourse structure. As demonstrated by Danlos (1985), the relevant item for lexical choice is not the connective by itself, but a set of constraints attached to it (henceforth, discourse structure, or DS). Defining the relevant DSs for a given semantic relation requires a thorough analysis of the linguistic properties of cue-phrases, and more specifically the determination of the differential syntactic properties that reflect semantic variation. Once the DS families are defined - i.e. all the possible DSs built around a given cue-phrase - they are organised in a hierarchy which can serve as an interface between the conceptual level and the lexicon. The ambiguities of complex discourse structures, however, are thus only partly controlled. We therefore study the possible scope ambiguities in P1 C1 P2 C2 P3 discourses, and show the various factors which interfere with the choice of cue-phrases to create ambiguity (subordinate clause movement, ellipsis, pronominalisation, causal inference). The last part of this work proposes a TAG-inspired tree representation for elementary DSs and discusses the linguistic relevance of possible tree-structure representations for complex DSs
Antoniadis, Georges. "Élaboration d'un système d'analyse morpho-syntaxique d'une langue naturelle : application en informatique documentaire /". Grenoble : Centre de recherche en informatique appliquée aux sciences sociales, 1987. http://catalogue.bnf.fr/ark:/12148/cb34976768x.
Fredj, Mounia. "Saphir : un système d'objets inférentiels : contribution à l'étude des raisonnements en langue naturelle". Grenoble 2, 1993. http://www.theses.fr/1993GRE21010.
This work falls within the general framework of natural language processing. It especially addresses the problem of representing the knowledge and reasoning "carried" by natural language. The goal of the SAPHIR system is to construct the network of objects arising from the discourse. This construction is done by describing some of the reasoning processes taking place during knowledge acquisition, in particular those that make it possible to resolve "associative anaphora". We define a knowledge representation model with a linguistic basis and cognitive elements. In order to support this model, we propose an object-oriented formalism whose theoretical foundations are Lesniewski's logical systems: ontology and mereology. The first system relies upon a primitive functor called "epsilon", meaning "is-a"; the second upon the "part-of" relation, called "ingredience". These logical systems constitute a more appropriate theoretical foundation than the traditional predicate calculus
Hajlaoui, Najeh. "Multilinguïsation des systèmes de e-commerce traitant des énoncés spontanés en langue naturelle". Phd thesis, Grenoble 1, 2008. http://www.theses.fr/2008GRE10118.
We are interested in the multilinguization, or "linguistic porting" (simpler than localization), of content-management services processing spontaneous utterances in natural language, often noisy, but constrained by the situation and constituting a restricted "sublanguage". Any service of this type (App) uses a specific content representation (CR-App) on which the functional kernel operates. Most often, this representation is produced from the "native" language L1 by a content extractor (CE-App). We identified three possible porting methods and illustrated them by porting to French a part of CATS, a system handling small ads sent by SMS (in Arabic) deployed in Amman, as well as IMRS, a music retrieval system whose native natural language interface is in Japanese and of which only the CR is accessible. They are: (1) "internal" localization, i.e. adaptation of the CE to L2, giving CE-App-L2; (2) "external" localization, i.e. adaptation of an existing CE for L2 to the domain and to the content representation of the App (CE-X-L2-App); (3) translation of the utterances from L2 to L1. The choice of strategy is constrained by the translational situation: the type and level of access possible (complete access to the source code, access limited to the internal representation, access limited to the dictionary, no access), the available resources (dictionaries, corpora), and the language and linguistic competences of the persons taking part in the multilinguization of the application. The three methods gave good results on the Arabic-to-French porting of the used-car part of CATS. For internal localization, the grammatical part was only very slightly modified, which proves that, despite the great distance between Arabic and French, these two sublanguages are very close to one another, a new illustration of R. Kittredge's analysis. External localization was experimented with on CATS and on IMRS by adapting the French content extractor, written initially by H. Blanchon for the tourism domain (CSTAR/Nespole! project), to the new domain, and then by changing the language for IMRS (English). Finally, porting by statistical MT also gave very good performance, with a very small training corpus (fewer than 10,000 words) and a complete dictionary. This proves that, in the case of very small sublanguages, statistical MT may be of sufficient quality starting from a corpus 100 to 500 times smaller than for the general language
Hajlaoui, Najeh. "Multilinguïsation des systèmes de e-commerce traitant des énoncés spontanés en langue naturelle". Phd thesis, Université Joseph Fourier (Grenoble), 2008. http://tel.archives-ouvertes.fr/tel-00337336.
A service of this type (App) uses a specific content representation (CR-App) on which the functional kernel operates. Most often, this representation is produced from the "native" language L1 by a content extractor (CE-App). We identified three possible porting methods and illustrated them by porting to French a part of CATS, a system for processing small ads sent by SMS (in Arabic) deployed in Amman, as well as on IMRS, a music retrieval system whose native interface is in Japanese and of which only the CR is accessible. They are: (1) "internal" localization, i.e. adaptation of the CE to L2, giving CE-App-L2; (2) "external" localization, i.e. adaptation of an existing CE for L2 to the domain and to the content representation of the App (CE-X-L2-App); (3) translation of the utterances from L2 to L1.
The choice of strategy is constrained by the translational situation: the types and level of access possible (complete access to the source code, access limited to the internal representation, access limited to the dictionary, and no access), the available resources (dictionaries, corpora), and the language and linguistic competences of the people involved in the multilinguization of the applications.
The three methods gave good results on the Arabic-to-French porting of the part of CATS dealing with used cars. In internal localization, the grammatical part was only very slightly modified, which proves that, despite the great distance between Arabic and French, these two sublanguages are very close to each other, a new illustration of R. Kittredge's analysis. External localization was tried on CATS and on IMRS by adapting the French content extractor, initially written by H. Blanchon for the tourism domain (CSTAR/Nespole! project), to the new domain under consideration, and then by changing the language for IMRS (English).
Finally, porting by statistical MT also achieved very good performance, with a very small training corpus (fewer than 10,000 words) and a complete dictionary. This proves that, in the case of very small sublanguages, statistical MT can be of sufficient quality starting from corpora 100 to 500 times smaller than those needed for general language.
Al, Haj Hasan Issam. "Alimentation automatique d'une base de connaissances à partir de textes en langue naturelle". Clermont-Ferrand 2, 2008. http://www.theses.fr/2008CLF21879.
Petrecca, Miguel Angel. "La langue en question : trois poètes chinois contemporains de la troisième génération". Thesis, Paris, INALCO, 2020. http://www.theses.fr/2020INAL0005.
Modern Chinese poetry or "new poetry" (xin shi 新诗) has had to fight from its beginnings to exist as a viable literary genre. Born out of the break with tradition and the rejection of the classical language, it has been haunted throughout its history by the question of its link with tradition, the tension between China and the West, and the problem of language. The objective of our thesis is twofold. On the one hand, we aim to place the issue of tradition, identity and language within the broader framework of the history of modern Chinese poetry in order to better understand its challenges in the current context, marked by the rise of nationalist discourses and the entry of Chinese poetry into the networks of world poetry. On the other hand, we want to see how this problem is embodied in the work of three poets of the third generation (di san dai shiren 第三代诗人), that is to say, those who began to write in the wake of the Misty Poets (menglong shiren 朦胧诗人). Our dissertation also aims to introduce the works of these three poets, who are among the most important figures of the current Chinese poetic scene (Yu Jian 于坚, Xi Chuan 西川 and Bai Hua 柏桦), and to help draw the attention of French researchers to contemporary Chinese poetry, which has not yet received the attention it deserves
Tzoukermann, Evelyne. "Morphologie et génération automatique du verbe français : implémentation d'un module conversationnel". Paris, INALCO, 1986. http://www.theses.fr/1986INAL0004.
El-Khoury, Sahar. "Approche mixte, analytique et par apprentissage, pour la synthèse d'une prise naturelle". Paris 6, 2008. http://www.theses.fr/2008PA066585.
Bouchet, François. "Conception d'une chaîne de traitement de la langue naturelle pour un agent conversationnel assistant". Phd thesis, Université Paris Sud - Paris XI, 2010. http://tel.archives-ouvertes.fr/tel-00607298.
Thollard, Franck. "Inférence grammaticale probabiliste pour l'apprentissage de la syntaxe en traitement de la langue naturelle". Saint-Etienne, 2000. http://www.theses.fr/2000STET4010.
Zablit, Patricia. "Construction de l'interprétation temporelle en langue naturelle : Un système fondé sur les graphes conceptuels". Paris 11, 1991. http://www.theses.fr/1991PA112380.
Bertrand, de Beuvron François de. "Un système de programmation logique pour la création d'interfaces homme-machine en langue naturelle". Compiègne, 1992. http://www.theses.fr/1992COMPD545.
Pouchot, Stéphanie. "L'analyse de corpus et la génération automatique de texte : méthodes et usages". Grenoble 3, 2003. http://www.theses.fr/2003GRE39006.
Popescu, Vladimir. "Formalisation des contraintes pragmatiques pour la génération des énoncés en dialogue homme-machine multi-locuteurs". Phd thesis, Grenoble INPG, 2008. http://www.theses.fr/2008INPG0175.
We have developed a framework for controlling utterance generation in multi-party human-computer dialogue. This process takes place in four stages: (i) the rhetorical structure for the dialogue is computed, by using an emulation of SDRT ("Segmented Discourse Representation Theory"); (ii) this structure is used for computing speakers' commitments; these commitments are used for driving the process of adjusting the illocutionary force degree of the utterances; (iii) the commitments are filtered and placed in a stack for each speaker; these stacks are used for performing semantic ellipses; (iv) the discourse structure drives the choice of concessive connectors (mais, quand même, pourtant and bien que) between utterances; to do this, the utterances are ordered from an argumentative viewpoint
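Stage (iii) of the pipeline described in this abstract, per-speaker commitment stacks used to license semantic ellipsis, might be sketched as follows. This is a hedged toy model: the class name, method names and propositions are illustrative inventions, not the thesis's actual data structures.

```python
# Toy model: each speaker has a stack of committed propositions; content
# already committed by some speaker is a candidate for semantic ellipsis.

class CommitmentStore:
    def __init__(self):
        self.stacks = {}  # speaker -> list (stack) of committed propositions

    def commit(self, speaker, proposition):
        """Push a filtered commitment onto the speaker's stack."""
        self.stacks.setdefault(speaker, []).append(proposition)

    def can_elide(self, proposition):
        """A proposition already committed by any speaker may be elided."""
        return any(proposition in stack for stack in self.stacks.values())

store = CommitmentStore()
store.commit("user", "train(paris, lyon)")

elide = store.can_elide("train(paris, lyon)")  # already committed
keep = store.can_elide("depart(9am)")          # new content, must be expressed
```

In the framework itself, the commitments are derived from the rhetorical structure computed by the SDRT emulation before being filtered and stacked, so ellipsis decisions remain sensitive to the dialogue's discourse relations.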