Doctoral dissertations on the topic "Traitement automatique du langage naturel – Linguistique – Informatique"
Create an accurate reference in APA, MLA, Chicago, Harvard and many other citation styles
Consult the top 50 doctoral dissertations for your research on the topic "Traitement automatique du langage naturel – Linguistique – Informatique".
Next to every work in the bibliography there is an "Add to bibliography" button. Use it and we will automatically generate a bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scholarly publication as a .pdf file and read its abstract online, whenever such details are provided in the work's metadata.
Browse doctoral dissertations from a wide range of disciplines and compile an accurate bibliography.
Mela, Augusta. "Traitement automatique de la coordination par et". Paris 13, 1992. http://www.theses.fr/1992PA132040.
Hagège, Caroline. "Analyse syntaxique automatique du portugais". Clermont-Ferrand 2, 2000. http://www.theses.fr/2000CLF20028.
Haddad, Afifa Le Guern Michel. "Traitement des nominalisations anaphoriques en indexation automatique". [S.l.] : [s.n.], 2001. http://theses.univ-lyon2.fr/sdx/theses/lyon2/intranet/haddad_a.
Al-Shafi, Bilal. "Traitement informatique des signes diacritiques : pour une application automatique et didactique". Université de Besançon, 1996. http://www.theses.fr/1996BESA1029.
Oh, Hyun-Gum. "Représentation des valeurs sémantiques du passé composé français en vue d'un traitement informatique". Paris 4, 1991. http://www.theses.fr/1991PA040070.
We present a model addressing the problem of tense and aspect in French, with a detailed study of the passé composé. The thesis has three parts: first, general and theoretical concepts; second, the values of the passé composé in French; third, the strategy of contextual exploration. It shows that a natural language processing system whose aim is to build semantic representations of tenses is feasible using linguistic data alone, without any other knowledge of the world. This linguistic approach has been implemented with an expert-system generator called SNARK.
Paumier, Sébastien. "De la reconnaissance des formes linguistiques à l'analyse syntaxique". Marne-la-Vallée, 2003. http://www.theses.fr/2003MARN0162.
Most descriptions of natural language are made of sets of rules modelling the behaviour of words. However, whereas many general rules have been established, exceptions to these rules are rarely studied. Consequently, these rules are incomplete, and even inaccurate when the number of particular cases is too large. To solve this problem, the LADL team has studied the basic sentences of French in detail. This work led to a very fine description of the syntactic properties of these sentences, stored in matrices called lexicon-grammar tables. In 1993, Emmanuel Roche showed that these data could be used to perform automatic parsing. We have studied a way to extend this work in order to take into account all the data contained in lexicon-grammar tables, so that any basic sentence of French can be analysed. As this study will take a long time, we had to address the issue of maintaining the data over a long period. We therefore tried to make the formalism used to design our grammars as simple as possible, so that they would be easy to maintain. In a first step, we verified that this formalism was powerful enough, through the examination of several syntactic structures. We have shown that this formalism, though simple, is well suited to syntactic description and parsing, which suggests that the difference between pattern matching and syntactic analysis is just a matter of scale. In return, we had to solve computational problems, mainly related to the huge amount of data we had to deal with. So, in a second step, we studied methods to handle these data in reasonable time, either by transforming grammars or by optimizing programs. Our results show that our model is reliable, and thus that it is possible to build an exploitable set of grammars describing all the basic sentences of French. They pave the way for efficient syntactic parsers for these constructions.
El Harouchy, Zahra. "Dictionnaire et grammaire pour le traitement automatique des ambiguites morphologiques des mots simples en francais". Besançon, 1997. http://www.theses.fr/1997BESA1010.
When carrying out the automatic analysis of a text, one of the first stages consists in determining the grammatical categories of the words. To do this, a dictionary has been designed which recognises the one or several grammatical categories of non-compound words from their endings. This dictionary, which we have called the automatic dictionary, is a collection of general rules (which may include sub-rules). A general rule states an ending, and an operator (the one or several grammatical categories) is associated with each rule. For example, we have the following general rule: "words ending in 'able' are adjectives". Examples of exceptions to (or sub-rules of) this general rule are nouns such as "cartable", conjugated verbs like "accable", morphological ambiguities such as noun/conjugated verb (like "sable", "table"...), and ambiguities such as adjectival nouns (for example "comptable"). Consequently, this sort of dictionary gives prominence to words possessing several grammatical categories. When the automatic dictionary detects a word possessing several categories, the grammar system is consulted, whose role is to resolve the morphological ambiguities by studying the immediate context. The rules in the grammar system describe the possible combinations of elements that can occur after and/or before the ambiguous form (for example, a rule states that an ambiguous "pronoun or article" form preceded by "à cause de" is, in fact, an article).
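As a rough illustration of the ending-based approach described above (a toy sketch with invented data, not the thesis's actual dictionary), general rules with exception sub-rules can be written as:

```python
# Toy sketch: ending-based category assignment with exception sub-rules.
GENERAL_RULES = {
    "able": {"adjective"},  # general rule: words ending in -able are adjectives
}

EXCEPTIONS = {
    "cartable": {"noun"},                  # exception: a noun
    "accable": {"conjugated verb"},        # exception: a verb form
    "sable": {"noun", "conjugated verb"},  # morphological ambiguity
    "table": {"noun", "conjugated verb"},
    "comptable": {"noun", "adjective"},    # adjectival noun
}

def categories(word: str) -> set:
    """Return the candidate grammatical categories of a word from its ending."""
    if word in EXCEPTIONS:
        return EXCEPTIONS[word]
    for ending, cats in GENERAL_RULES.items():
        if word.endswith(ending):
            return cats
    return {"unknown"}

if __name__ == "__main__":
    for w in ["aimable", "cartable", "sable"]:
        print(w, categories(w))
```

A word matching several categories (such as "sable") is exactly the case that the grammar system, consulting the immediate context, is meant to resolve.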
Diakité, Mohamed Lamine. "Relations entre les phrases : contribution à la représentation sémantique des textes pour la compréhension automatique du langage naturel". Dijon, 2005. http://www.theses.fr/2005DIJOS025.
The work described in this thesis presents an approach to the semantic representation of texts as a contribution to the automatic understanding of natural language. The proposed approach is based on the observation that knowledge about the analyzed texts is needed in order to discover their meaning. We therefore propose a semi-automatic approach for acquiring knowledge from texts. This acquisition is guided by a hierarchy of classes of entities organized in an ontology. Based on the principle of compositional semantics, we propose to identify relations between the different entities of the text. We were particularly interested in the problem of pronominal anaphora, for which we proposed a resolution method.
Timimi, Ismaïl. "De la paraphrase linguistique à la recherche d'information, le système 3 AD : théorie et implantation (aide à l'analyse automatique du discours)". Grenoble 3, 1999. http://www.theses.fr/1999GRE39025.
Fort, Karën. "Les ressources annotées, un enjeu pour l’analyse de contenu : vers une méthodologie de l’annotation manuelle de corpus". Paris 13, 2012. http://scbd-sto.univ-paris13.fr/intranet/edgalilee_th_2012_fort.pdf.
Manual corpus annotation has become a key issue for Natural Language Processing (NLP), as manually annotated corpora are used both to create and to evaluate NLP tools. However, the process of manual annotation remains under-described and the tools used to support it are often misused. This situation prevents the campaign manager from evaluating and guaranteeing the quality of the annotation. We propose in this work a unified vision of manual corpus annotation for NLP. It results from our experience of annotation campaigns, either as a manager or as a participant, as well as from collaborations with other researchers. We first propose a global methodology for managing manual corpus annotation campaigns, which relies on two pillars: an organization of annotation campaigns that puts evaluation at the heart of the process, and an innovative grid for analysing the complexity dimensions of an annotation campaign. A second part of our work concerns the tools of the campaign manager. We evaluated the precise influence of automatic pre-annotation on the quality and speed of human correction, through a series of experiments on part-of-speech tagging for English. Furthermore, we propose practical solutions for the evaluation of manual annotations, which provide the campaign manager with the means to select the most appropriate measures. Finally, we brought to light the processes and tools involved in an annotation campaign and instantiated the methodology that we described.
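Purely as an illustration of the kind of agreement measure a campaign manager might select (this particular implementation is not taken from the thesis), Cohen's kappa between two annotators can be computed as follows:

```python
# Cohen's kappa: chance-corrected agreement between two annotators
# labelling the same items (here, invented part-of-speech tags).
from collections import Counter

def cohen_kappa(ann1, ann2):
    assert len(ann1) == len(ann2)
    n = len(ann1)
    observed = sum(a == b for a, b in zip(ann1, ann2)) / n
    c1, c2 = Counter(ann1), Counter(ann2)
    # Expected agreement if both annotators labelled at random with their own label distributions.
    expected = sum(c1[label] * c2[label] for label in c1) / (n * n)
    return (observed - expected) / (1 - expected)

if __name__ == "__main__":
    annotator_a = ["NOUN", "VERB", "NOUN", "DET", "NOUN", "VERB"]
    annotator_b = ["NOUN", "VERB", "ADJ",  "DET", "NOUN", "NOUN"]
    print(round(cohen_kappa(annotator_a, annotator_b), 3))  # 0.5 for this toy data
```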
Delannoy, Jean-François. "Un système fondé sur les objets pour le suivi de situation à partir de textes en langage naturel". Aix-Marseille 3, 1991. http://www.theses.fr/1991AIX30063.
Constant, Mathieu. "Grammaires locales pour l'analyse automatique de textes : méthodes de construction et outils de gestion". Marne-la-Vallée, 2003. http://www.theses.fr/2003MARN0169.
Many researchers in the field of Natural Language Processing have shown the significance of descriptive linguistics, and especially the use of large-scale databases of fine-grained linguistic components composed of lexicons and grammars. This approach has a drawback: it requires long-term investment. It is therefore necessary to develop methods and computational tools to help build such data, which must be directly applicable to texts. This work focuses on a specific linguistic representation: local grammars, which describe precise and local constraints in the form of graphs. Two issues arise: how to efficiently build precise, complete and text-applicable grammars, and how to deal with their growing number and their dispersion. To handle the first problem, a set of simple, empirical methods is presented, based on M. Gross's (1975) lexicon-grammar methodology. The whole process of linguistic analysis and formal representation is described through two original phenomena: expressions of measurement (un immeuble d'une hauteur de 20 mètres) and locative prepositional phrases containing geographical proper names (à l'île de la Réunion). Each phenomenon has been reduced to elementary sentences, which makes it possible to classify them semantically according to formal criteria. The syntactic behaviour of these sentences has been systematically studied according to the lexical value of their elements. The observed properties were then encoded either directly as graphs with an editor, or as syntactic matrices that are semi-automatically converted into graphs following E. Roche (1993). These studies led to the development of new conversion algorithms for matrix systems in which linguistic information is encoded in several matrices. For the second issue, a prototype on-line library of local grammars has been designed and implemented. The objective is to centralize and distribute local grammars constructed within the RELEX network of laboratories. We developed a set of tools allowing users both to store new graphs and to search for graphs according to different criteria. The implementation of a grammar search engine led to an investigation into a new field of information retrieval: searching for linguistic information in sets of local grammars.
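For a rough sense of what a local pattern for measurement expressions can look like, here is a crude regular-expression stand-in, invented for this listing; the local grammars discussed above are graphs driven by lexicon-grammar tables, not regexes:

```python
# Crude regex approximation of a local pattern for French measurement expressions.
import re

pattern = re.compile(
    r"\bune?\s+(hauteur|longueur|largeur|profondeur)\s+de\s+\d+(?:,\d+)?\s+"
    r"(mètres?|centimètres?|kilomètres?)\b",
    re.IGNORECASE,
)

text = "Un immeuble d'une hauteur de 20 mètres domine une rue d'une largeur de 8 mètres."
print(pattern.findall(text))  # [('hauteur', 'mètres'), ('largeur', 'mètres')]
```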
Trouilleux, François. "Identification des reprises et interprétation automatique des expressions pronominales dans des textes en français". Clermont-Ferrand 2, 2001. https://hal.archives-ouvertes.fr/tel-01152394.
Stroppa, Nicolas. "Définitions et caractérisations de modèles à base d'analogies pour l'apprentissage automatique des langues naturelles /". Paris : École nationale supérieure des télécommunications, 2006. http://catalogue.bnf.fr/ark:/12148/cb40129220d.
Amoia, Marilisa. "Reconnaissance d'implications textuelles à forte composante linguistique". Phd thesis, Nancy 1, 2008. http://tel.archives-ouvertes.fr/tel-00338608.
Azzam, Saliha. "Traitement informatique des ambiguïtés (anaphores et rattachement du syntagme prépositionnel) du langage naturel : réalisation d'un prototype : clam". Paris 4, 1995. http://www.theses.fr/1995PA040063.
Our work presents a conceptual analysis methodology, i.e. a methodology to translate natural language texts into a target language understandable by the computer. The main aim is to exploit this representation and to ask questions about the semantic content of the texts. The resulting representation must strictly reflect the content of the texts. The main obstacles to achieving this objective are undoubtedly the several kinds of ambiguity present at each level of comprehension. In our study, we focused on two types of ambiguity that are the main causes of impoverished results: anaphora and prepositional attachment. The anaphora problem concerns implicit references to textual "entities", for example through pronouns. The problem of prepositional attachment is caused by the highly ambiguous nature of prepositions, which leads to several interpretations during text understanding. We propose a solution for each of these two problems and a methodology to coordinate both procedures efficiently. We present a methodology to integrate, in a "harmonious" way, the disambiguation process into the general strategy of conceptual analysis.
Fleury, Serge. "Polas fritas : prototypes oriented language has freed us. la programmation a prototypes, un outil pour une linguistique experimentale: mise en oeuvre de representations evolutives des connaissances pour le traitement automatique du langage naturel". Paris 7, 1997. http://www.theses.fr/1997PA070039.
This work deals with the confrontation between data-processing tools for natural language processing and the problems that arise when automatic processing attempts to construct meaning, and it aims to develop tools that respond to these problems. The problem to solve consists in defining the behaviours that one can assign to words in the framework of a word-based parser. NLP systems must represent linguistic information, and the representation processes are required to determine this information: they must anticipate what this knowledge can do. Language is always moving, and linguistic descriptions must be adjusted accordingly; the behaviours of words follow these permanent evolutions. Our work deals with a representation process which does not predefine all the knowledge that can be associated with words. We use the prototype, as found in prototype-based languages, as a tool for representing linguistic facts, insofar as we think it can offer an answer to the problem we face. This tool leads us to build simple and adjustable representation structures, suited to the adjustable dimension of natural languages. Prototype-based languages make it possible to build evolutive representation structures that can be adjusted when new information becomes available; they easily accommodate the representation of refinements, and they led us to construct an evolutive syntactic classification of words. Our work aims to develop a stronger and more effective connection between the work of the linguist and NLP systems; the manual refinement of the results produced here follows this initial aim. Prototype-based languages encourage an approach to representation which proceeds by successive steps and improves the quality of the representation that is produced. Furthermore, these processes go hand in hand with the "home-made" but necessary work of the linguist and his will to describe the way language events work.
Hankach, Pierre. "Génération automatique de textes par satisfaction de contraintes". Paris 7, 2009. http://www.theses.fr/2009PA070027.
We address in this thesis the construction of a natural language generation system: computer software that transforms a formal representation of information into a text in natural language. In our approach, we define the generation problem as a constraint satisfaction problem (CSP). The implemented system ensures an integrated processing of generation operations, as their different dependencies are taken into account and no priority is given to any type of operation over the others. In order to define the constraint satisfaction problem, we represent the construction operations of a text by decision variables. Individual operations that implement the same type of minimal expressions in the text form a generation task. We classify decision variables according to the type of operations they represent (e.g. content selection variables, document structuring variables...). The linguistic rules that govern the operations are represented as constraints on the variables. A constraint can be defined over variables of the same type or of different types, capturing the dependency between the corresponding operations. The production of a text consists in solving the global system of constraints, that is, finding an assignment of the variables that satisfies all the constraints. As part of the grammar of constraints for generation, we formulate in particular the constraints that govern document structuring operations, and we model the rhetorical structure by constraints in order to yield coherent texts as the generator's output. Beforehand, in order to increase the generation capacities of our system, we extend the rhetorical structure to cover texts in non-canonical order. Furthermore, in addition to defining these coherence constraints, we formulate a set of constraints that makes it possible to control the form of the macrostructure according to communicative goals. Finally, we propose a solution to the problem of the computational complexity of generating large texts. This solution is based on generating the text by groups of clauses: the problem of generating a text is divided into several problems of reduced complexity, each of them concerned with generating a part of the text. These parts are of limited size, so the complexity associated with their generation remains reasonable. The proposed partitioning of generation is motivated by linguistic considerations.
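To make the "generation as constraint satisfaction" idea concrete, here is a deliberately tiny sketch with invented decision variables and constraints; the thesis's actual variables, constraint grammar and solver are far richer than this brute-force search:

```python
# Toy CSP: two content-selection variables and one document-structuring variable,
# constrained by an invented communicative goal and an invented coherence rule.
from itertools import product

domains = {
    "select_fact_price": [True, False],
    "select_fact_date":  [True, False],
    "order":             ["price_first", "date_first"],
}

def satisfies(assignment):
    # Communicative goal: the price must be expressed.
    if not assignment["select_fact_price"]:
        return False
    # Coherence: mentioning the date first requires the date to be selected at all.
    if assignment["order"] == "date_first" and not assignment["select_fact_date"]:
        return False
    return True

solutions = [
    dict(zip(domains, values))
    for values in product(*domains.values())
    if satisfies(dict(zip(domains, values)))
]
print(solutions)  # every variable assignment that satisfies all constraints
```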
Wurbel, Nathalie. "Dictionnaires et bases de connaissances : traitement automatique de données dictionnairiques de langue française". Aix-Marseille 3, 1995. http://www.theses.fr/1995AIX30035.
Lebarbé, Thomas. "Hiérarchie Inclusive des Unités Linguistiques en Analyse Syntaxique Coopérative : Le segment, unité intermédiaire entre chunk et phrase dans le traitement linguistique par système multi-agents". Caen, 2002. http://www.theses.fr/2002CAEN2019.
Culioli-Atwood, Marie-Hélène. "Operations referentielles. Analyse de la determination en francais en vue d'un traitement informatise". Paris 7, 1992. http://www.theses.fr/1992PA070014.
The purpose of the thesis is (1) to gather a maximum of systematic and detailed observations concerning the occurrence of determiners in French (in the pattern det. + N); (2) to build a system of metalinguistic representation enabling the modelling of these facts; (3) to build reasoning procedures with an algorithmic treatment in mind, whether in generation or in analysis. The work provides the conceptual basis for modelling on both a formal and a semantic level. The thesis is made up of three parts: analysis of the problems related to paraphrastic manipulations; study of groups of nominalised predicates based on semantic classifications; and study of determiners in prepositional phrases. This research constitutes the preliminary step of any computerised treatment of determination as used in French texts.
Hassoun, Mohamed. "Conception d'un dictionnaire pour le traitement automatique de l'arabe dans différents contextes d'application". Lyon 1, 1987. http://www.theses.fr/1987LYO10035.
Besombes, Jérôme. "Un modèle algorithmique de la généralisation de structures dans le processus d'acquisition du langage". Nancy 1, 2003. http://www.theses.fr/2003NAN10156.
The subject of our study is the learning of regular tree languages for an algorithmic modeling of language acquisition. For this, we assume that the data are structured: the data are correct sentences that are heard, and learning is effective once a representation of the language to which these sentences belong has been built. From this representation, the learner is able to generate new sentences compatible with the language that were not presented as examples. Considering that heard sentences are translated into trees, the generalization of these tree structures appears to be a component of learning. We developed several models of this generalization in the form of algorithms taking into account various types of structures as input and various levels of information. These new models offer the advantage of unifying major results in the theory of grammatical inference, and of extending these results, in particular by considering new structures not previously studied from the learnability point of view.
Larouk, Omar. "Extraction de connaissances à partir de documents textuels : traitement automatique de la coordination (connecteurs et ponctuation)". Lyon 1, 1994. http://www.theses.fr/1994LYO10029.
Ameli, Samila. "Construction d'un langage de dictionnaire conceptuel en vue du traitement du langage naturel : application au langage médical". Compiègne, 1989. http://www.theses.fr/1989COMPD226.
This study deals with the realisation of a "new generation" information retrieval system that takes the meaning of texts into account. This system compares texts (questions and documents) by their content. A knowledge base being indispensable for text "comprehension", a dictionary of concepts has been designed in which concepts and their mutual relations are defined through a user-friendly language called SUMIX. SUMIX enables us (1) to resolve ambiguities due to polysemy by considering context dependencies, (2) to make use of property inheritance, which can greatly help knowledge engineers in creating the knowledge and inference base, and (3) to define subject-dependent relations between concepts, which makes metaknowledge handling possible. The dictionary of concepts is essentially used (1) to index concepts (and not character strings), which enables us to select a wide range of documents in the conceptual extraction phase, and (2) to filter the previously selected documents by comparing the structure of each document with that of the query in the structural analysis phase.
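As a purely illustrative sketch of property inheritance in a dictionary of concepts (the medical concepts below are invented, and SUMIX's actual syntax and relations are not shown), one might write:

```python
# Toy concept dictionary with is_a links and inherited properties.
concepts = {
    "disease":   {"is_a": None,        "properties": {"has_symptom": True}},
    "infection": {"is_a": "disease",   "properties": {"caused_by": "pathogen"}},
    "hepatitis": {"is_a": "infection", "properties": {"affects": "liver"}},
}

def inherited_properties(name):
    """Collect properties along the is_a chain; more specific concepts override ancestors."""
    chain = []
    while name is not None:
        chain.append(name)
        name = concepts[name]["is_a"]
    props = {}
    for ancestor in reversed(chain):
        props.update(concepts[ancestor]["properties"])
    return props

print(inherited_properties("hepatitis"))
# {'has_symptom': True, 'caused_by': 'pathogen', 'affects': 'liver'}
```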
Le, Kien Van. "Generation automatique de l'accord du participe passe". Paris 7, 1987. http://www.theses.fr/1987PA077257.
Hue, Jean-François. "L'analyse contextuelle des textes en langue naturelle : les systèmes de réécritures typées". Nantes, 1995. http://www.theses.fr/1995NANT2034.
Moreau, Fabienne Sébillot Pascale. "Revisiter le couplage traitement automatique des langues et recherche d'information". [S.l.] : [s.n.], 2006. ftp://ftp.irisa.fr/techreports/theses/2006/moreau.pdf.
Kostov, Jovan. "Le verbe macédonien : pour un traitement informatique de nature linguistique et applications didactiques (réalisation d'un conjugueur)". Institut National des Langues et Civilisations Orientales, 2013. http://www.theses.fr/2013INAL0033.
After the standardization of the Macedonian language in 1945, the description of its current standard variety has been carried out by several generations of experts working, most often, in Macedonian institutions. The fact that several manuals were published is undeniable proof of the significant efforts made to describe the Macedonian verbal system, and yet verbs remain the least exploited word category today. Inflection rules cannot cover all possible models of Macedonian conjugation, and their approach is too synthetic to be fully operational from a didactic point of view. For all these reasons, the purpose of this doctoral thesis is to study a large number of conjugated verbs in order to map stable patterns, opening up new avenues for the teaching of the Macedonian verbal system. Moreover, these patterns are used to produce computational models resulting in an automated conjugation tool which derives paradigms from lexical verb forms: FlexiMac 1.1.
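To give a flavour of the pattern-based idea (this sketch is not FlexiMac; it uses a single, heavily simplified present-tense pattern for а-group verbs and ignores other groups, tenses and exceptions), a paradigm can be derived from a lexical form roughly as follows:

```python
# Hypothetical, simplified sketch: present-tense endings attached to the stem
# (here taken to be the 3rd-person-singular lexical form of an a-group verb).
PRESENT_ENDINGS = {
    "1sg": "м", "2sg": "ш", "3sg": "",
    "1pl": "ме", "2pl": "те", "3pl": "ат",
}

def present_paradigm(lexical_form: str) -> dict:
    return {person: lexical_form + ending for person, ending in PRESENT_ENDINGS.items()}

if __name__ == "__main__":
    # 'играм, играш, игра, играме, играте, играат' for the verb 'игра' (to play)
    print(present_paradigm("игра"))
```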
Cardey-Greenfield, Sylviane. "Traitement algorithmique de la grammaire normative du français pour une utilisation automatique et didactique". Besançon, 1987. http://www.theses.fr/1987BESA1013.
Hathout, Nabil. "Théorie du gouvernement et du liage et programmation logique avec contraintes : une application à l'analyse automatique du français". Toulouse 3, 1992. http://www.theses.fr/1992TOU30200.
Lepage, Yves. "Un système de grammaires correspondancielles d'identification". Grenoble 1, 1989. http://www.theses.fr/1989GRE10059.
Hatmi, Mohamed. "Reconnaissance des entités nommées dans des documents multimodaux". Nantes, 2014. http://archive.bu.univ-nantes.fr/pollux/show.action?id=022d16d5-ad85-43fa-9127-9f1d9d89db14.
Named entity recognition is a subtask of information extraction. It consists in identifying textual objects such as person, location and organization names. This thesis focuses on the named entity recognition task for the oral modality. Difficulties arise for this task from the intrinsic characteristics of speech processing (lack of capitalisation, lack of punctuation, presence of disfluencies and of recognition errors...). In the first part, we study the characteristics of named entity recognition downstream of an automatic speech recognition system. We present a methodology for named entity recognition following a hierarchical and compositional taxonomy, and we measure the impact of the different phenomena specific to speech on the quality of named entity recognition. In the second part, we propose to study the tight coupling between the speech recognition task and the named entity recognition task. To that end, we strip a speech recognition system down to its basic functionalities in order to turn it into a named entity recognition system. By mobilising the knowledge inherent in speech processing for the named entity recognition task, we thus ensure a better synergy between the two tasks. We carry out different types of experiments to optimise and evaluate our approach.
Striegnitz, Kristina. "Génération d'expressions anaphoriques : Raisonnement contextuel et planification de phrases". Nancy 1, 2004. http://www.theses.fr/2004NAN10186.
This thesis investigates the contextual reasoning involved in the production of anaphoric expressions in natural language generation systems. More specifically, I propose generation strategies for two types of discourse anaphora which have not been treated in generation before: bridging descriptions and additive particles. To this end the contextual conditions that govern the use of these expressions have to be formalized. The formalization that I propose is based on notions from linguistics and extends previous approaches to the generation of co-referential anaphora. I then specify the reasoning tasks that have to be carried out in order to check the contextual conditions. I describe how they can be implemented using a state-of-the-art reasoning system for description logics, and I compare my proposal to alternative approaches using other kinds of reasoning tools. Finally, I describe an experimental implementation of the proposed approach
Dupont, Michel. "Une approche cognitive du calcul de la référence". Caen, 2003. http://www.theses.fr/2003CAEN2084.
Widlöcher, Antoine. "Analyse macro-sémantique des structures rhétoriques du discours : cadre théorique et modèle opératoire". Caen, 2008. http://www.theses.fr/2008CAEN2042.
In the general field of Natural Language Processing (NLP), this work concerns the analysis of the rhetorical structure of discourse, that is, the argumentative organization of texts through various stereotyped forms. Our main goal was to define a theoretical and computational framework allowing the formal modeling and automatic exploration of the various discursive structures involved in this textual organization. We notably propose to describe those structures using three elementary categories (units, relations and schemas), and we outline recurrent properties of discursive patterns and of the clues which signal their presence: variable granularity, fuzziness, possible non-linearity and non-sequentiality, local/global interactions... In order to give a formal description of the studied linguistic phenomena and to make their computational analysis possible in a corpus-based approach, we propose the CDML formalism (Constraint-based Discourse Modeling Language). It allows the design of formal models of discursive patterns by means of constraints expressed on textual objects whose nature (morphological, syntactic, semantic...) and granularity level may vary. A CDML parser has been implemented and may be used to apply such a formal description to a corpus and to automatically detect the textual structures satisfying the given constraints. In addition, we present two case studies dedicated to significantly different discursive patterns, illustrating our analysis principles, formal model and computational approach. The first concerns Charolles' discourse framing theory. The second considers contrastive relations between various kinds of textual objects, at different granularity levels.
Choumane, Ali. "Traitement générique des références dans le cadre multimodal parole-image-tactile". Rennes 1, 2008. ftp://ftp.irisa.fr/techreports/theses/2008/choumane.pdf.
We are interested in multimodal human-computer communication systems that use the following modes: speech, gesture and vision. The user communicates with the system through spoken natural language utterances and/or gestures. The user's request contains his or her goal and the designation of the objects (referents) required for achieving that goal. The system should identify the designated objects in a precise and unambiguous way. In this context, we aim to improve the understanding of multimodal requests. Hence, we propose a generic set of processing steps for the modalities, for their fusion and for reference resolution. The main aspects of the realisation consist in modeling natural language processing in a speech environment, gesture processing, and the visual context (use of visual salience), while taking into account the difficulties of the multimodal context: speech recognition errors, natural language ambiguity, gesture imprecision due to user performance, and designation ambiguity due to the perception of the displayed objects or to the display topology. To complete the interpretation of the user's request, we propose a method for fusing and verifying the results of modality processing in order to find the objects designated by the user.
Rossignol, Mathias. "Acquisition sur corpus d'informations lexicales fondées sur la sémantique différentielle". Rennes 1, 2005. https://tel.archives-ouvertes.fr/tel-00524299.
Chao, Hui Lan. "Comprehension automatique de phrases interrogatives francaises et chinoises : application dans le cadre de l'interrogation de bases de donnees". Besançon, 1998. http://www.theses.fr/1998BESA1005.
With a view to facilitating communication between human beings and machines, especially the extraction of information from databases, our research aims at building an interface permitting data retrieval in natural language. We propose a unified methodology: not only does it enable the processing of two distinct languages, French and Chinese, but it also achieves the fusion of two approaches, semantics-oriented and syntax-oriented. Our automatic analysis of interrogative sentences leads to their transcription into SQL syntax, taking their resemblances into account. SQL is a standard query language supported by a great many database management systems; this strategy will facilitate the generalization of our interface. Despite its friendliness, SQL remains a machine language intolerant of the imprecision inherent in natural language. We therefore introduce fuzzy logic to solve this problem. Our research finally leads to the implementation of a piece of software named sidbln, permitting natural language querying of databases.
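As a loose illustration of how fuzzy logic can soften a crisp SQL condition (the schema, the vague term "expensive" and the membership function below are invented for this listing, not taken from sidbln):

```python
# A vague word such as "expensive" becomes a graded predicate rather than a
# crisp SQL threshold like: SELECT name FROM products WHERE price > 100;
def membership_expensive(price, low=50.0, high=150.0):
    """Fuzzy membership: 0 below `low`, 1 above `high`, linear in between."""
    if price <= low:
        return 0.0
    if price >= high:
        return 1.0
    return (price - low) / (high - low)

rows = [("book", 12.0), ("phone", 90.0), ("laptop", 900.0)]  # toy PRODUCTS table

# Rank every row by its degree of membership instead of filtering it out.
ranked = sorted(((membership_expensive(p), name) for name, p in rows), reverse=True)
for degree, name in ranked:
    print(f"{name}: {degree:.2f}")
```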
Smits, Grégory. "Une approche par surclassement pour le contrôle d'un processus d'analyse linguistique". Caen, 2008. http://www.theses.fr/2008CAEN2014.
Pełny tekst źródłaNatural Language Processing (NLP) systems are continuously faced with the problem of generating concurrent hypotheses, of which some can be erroneous. In order to avoid the propagation of erroneous hypotheses, it appears to be essential to apply specific control strategies, which aim to distinguishing concurrent hypotheses based on their relevance. On most of observed indetermination cases, we have noticed that multiple heterogeneous knowledge sources have to be combined to determine the hypotheses relative relevance. According to this observation, we show that the control of the indetermination cases can be formalised as a decisional process based on multiple criteria. This decisional formalisation and our research of an adapted methodology have conducted us toward an outranking approach issued from the MultiCriteria Decision Aid (MCDA) paradigm. This approach differs from alternative methods by the importance granted to knowledge and preferences that an expert can express about a given problem. From this innovative intersection between NLP and MCDA, our work has been focalised on the development of a decisional module dedicated to multicriteria control. The integration of this module into a complete NLP system has allowed us to attest the feasibility of our approach and to perform experimentation on concrete indetermination cases
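The following toy sketch, with invented criteria, weights and scores, only conveys the basic outranking intuition of comparing two concurrent hypotheses criterion by criterion; the thesis relies on a full MCDA outranking method rather than this bare concordance index:

```python
# Concordance-style comparison of two concurrent analysis hypotheses.
weights = {"lexical_score": 0.4, "syntactic_fit": 0.4, "frequency": 0.2}

hyp_a = {"lexical_score": 0.9, "syntactic_fit": 0.3, "frequency": 0.7}
hyp_b = {"lexical_score": 0.6, "syntactic_fit": 0.8, "frequency": 0.5}

def concordance(x, y):
    """Sum of the weights of the criteria on which x is at least as good as y."""
    return sum(w for c, w in weights.items() if x[c] >= y[c])

print("C(a,b) =", concordance(hyp_a, hyp_b))  # support for "a outranks b" -> 0.6
print("C(b,a) =", concordance(hyp_b, hyp_a))  # support for "b outranks a" -> 0.4
```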
Trybocki, Christine. "Elaboration d'un modèle conceptuel pour les bases de données lexicales". Aix-Marseille 3, 1995. http://www.theses.fr/1995AIX30088.
Lin, Huei-Chi. "Un module NooJ pour le traitement automatique du chinois : formalisation du vocabulaire et des têtes de groupes nominaux". Besançon, 2010. http://www.theses.fr/2010BESA1025.
This study presents the development of a module for the automatic parsing of Chinese that automatically recognizes lexical units in modern Chinese, as well as central noun phrases, in texts. In order to reach these two principal objectives, we addressed the following problems: 1) identifying lexical units in modern Chinese; 2) determining their categories; 3) describing certain local syntactic structures as well as the structure of central noun phrases. First, we built a corpus bringing together literary and journalistic texts published in the 20th century, written in modern Chinese with traditional characters. From these textual data we collected linguistic information such as lexical units, syntagmatic structures and grammatical rules. We then built several electronic dictionaries in which each entry represents a lexeme, associated with linguistic information such as its lexical category, its semantic distributional class or certain formal properties. At this stage, we sought to identify the lexical units of the Chinese lexicon and their categories in order to list them. Thanks to this list, an automatic lexical analyzer can process various types of lexical units as wholes, without breaking them down into their components. For instance, the lexical parser processes the following lexical units as atomic units: 理髮 lǐfà / fǎ 'have a haircut'; 放假 fàngjià 'have vacation'; 刀子口 dāozikǒu 'straight talk'; 研究員 yánjiū / jiù yuán 'researcher'; 翻譯系統 fānyì xìtǒng 'translation system'; 浪漫主義 làngmàn zhŭyì 'romanticism'. We then formally described certain local syntagms and five types of central noun phrases. Finally, we used this Chinese module to study thematic evolution in literary texts.
Abdellatif, Emir. "Classification sémantico-syntaxique des adjectivaux prédicatifs". Paris 13, 2004. http://www.theses.fr/2004PA131005.
Pełny tekst źródłaComputationnal linguistics have questioned former studies on fixation. Indeed, it has become essential to describe set expressions given their frequency in texts. This study sets about examining predicate adjectivals in this setting. Taking object classes as a model, a semantico-syntactic typology has been worked out for these sequences. The final objective of our analysis is to create an electronic dictionary of predicate adjectivals. This dissertation communicates three essential points: fixation of adjectivals, their definitions and their classification based on common distributional properties
Mesfar, Slim. "Analyse morpho-syntaxique automatique et reconnaissance des entités nommées en arabe standard". Besançon, 2008. http://www.theses.fr/2008BESA1022.
Pełny tekst źródłaThe Arabic language, although very important by the number of its speakers, it presents special morpho-syntactic phenomena. This particularity is mainly related to the inflectional and agglutinative morphology, the lack of vowels in currents written texts, and the multiplicity of its forms; this induces a high level of lexical and syntactic ambiguity. It follows considerable difficulties for the automatic processing. The selection of a linguistic environment providing powerful tools and the ability to improve performance according to our needs has led us to use the platform language NooJ. We begin with a study followed by a large-coverage formalization of the Arabic lexicon. The built dictionary, baptised "El-DicAr" allows to link all the inflexional, morphological, syntactico-semantic information to the list of lemmas. Automatic inflexional and derivational routines applied to this list produce more than 3 million inflected forms. We propose a new finite state machine compiler that leads to an optimal storage through a combination of a sequential minimization algorithm and a dynamic compression routine for stored information. This dictionary acts as the linguistic engine for the automatic morpho-syntactic analyzer that we have developed. This analyzer includes a set of tools: a morphological analyzer that identifies the component morphemes of agglutinative forms using large coverage morphological grammars, a new algorithm for looking through finite-state transducers in order to deal with texts written in Arabic with regardless of their vocalisation statements, a corrector of the most frequent typographical errors, a named entities recognition tool based on a combination of the morphological analysis results and rules described into local grammar presented as Augmented Transition Networks ( ATNS), an automatic annotator and some tools for linguistic research and contextual exploration. In order to make our work available to the scientific community, we have developed an online concordance service “NooJ4Web: NooJ for the Web”. It provides instant results to different types of queries and displays statistical reports as well as the corresponding histograms. The listed services are offered in order to collect feedbacks and improve performance. This system is used to process Arabic, as well as French and English
Tannier, Xavier. "Extraction et recherche d'information en langage naturel dans les documents semi-structurés". Phd thesis, Ecole Nationale Supérieure des Mines de Saint-Etienne, 2006. http://tel.archives-ouvertes.fr/tel-00121721.
Information retrieval (IR) in semi-structured documents (in practice, documents written in XML) combines aspects of traditional IR with aspects of database querying. Structure is of primary importance, but the information need remains vague. The retrieval unit is variable (a paragraph, a figure, a whole article...). Moreover, the flexibility of the XML language allows manipulations of the content that sometimes cause arbitrary breaks in the natural flow of the text. The problems raised by these characteristics are numerous, both in the pre-processing of documents and in their querying. Faced with these problems, we studied the specific solutions that natural language processing (NLP) could provide. We proposed a theoretical framework and a practical approach allowing textual analysis techniques to be used while abstracting away from the structure. We also designed a natural language query interface for IR in XML documents, and proposed methods that take advantage of the structure to improve the retrieval of relevant elements.
Dehouck, Mathieu. "Multi-lingual dependency parsing : word representation and joint training for syntactic analysis". Thesis, Lille 1, 2019. http://www.theses.fr/2019LIL1I019/document.
While modern dependency parsers have become as good as human experts, they still rely heavily on hand-annotated training examples, which are available for only a handful of languages. Several methods, such as model transfer and annotation transfer, have been proposed to make high-quality syntactic analysis available to low-resource languages as well. In this thesis, we propose new approaches for sharing information across languages that rely on their shared morphological features. First, we propose to use shared morphological features to induce cross-lingual delexicalised word representations that help in learning syntactic analysis models. Then, we propose a new multi-task learning framework, called phylogenetic learning, which learns models for related tasks/languages guided by the evolutionary tree of those tasks/languages. Finally, with our new measure of morphosyntactic complexity, we investigate the intrinsic role of morphological information in dependency parsing.
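A schematic sketch of what a delexicalised, morphology-based word representation can look like (the feature inventory below is invented for illustration; the thesis's representations are considerably richer):

```python
# Delexicalised vectors over a shared inventory of morphological attributes:
# no lexical information, so the same vector can describe words of different languages.
FEATURES = ["Case=Nom", "Case=Acc", "Number=Sing", "Number=Plur", "Gender=Fem"]

def delexicalised_vector(morph_feats):
    """One-hot style vector over the shared feature inventory."""
    return [1 if f in morph_feats else 0 for f in FEATURES]

print(delexicalised_vector({"Case=Nom", "Number=Plur"}))
print(delexicalised_vector({"Case=Acc", "Number=Sing", "Gender=Fem"}))
```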
Bedaride, Paul. "Implication Textuelle et Réécriture". Phd thesis, Université Henri Poincaré - Nancy I, 2010. http://tel.archives-ouvertes.fr/tel-00541581.
Stroppa, Nicolas. "Définitions et caractérisations de modèles à base d'analogies pour l'apprentissage automatique des langues naturelles". Phd thesis, Télécom ParisTech, 2005. http://tel.archives-ouvertes.fr/tel-00145147.
In the context of the machine learning of linguistic data, alternative inferential models have been proposed that call into question the abstraction principle embodied by rules or probabilistic models. In this conception, linguistic knowledge remains implicitly represented in the accumulated corpus. In the field of Machine Learning, the methods following these principles are grouped under the label of "lazy learning". These methods generally rely on the following learning bias: if an object Y is "close" to an object X, then its analysis f(Y) is a good candidate for f(X). While this hypothesis is justified for the applications usually addressed in Machine Learning, the structured nature and the paradigmatic organisation of linguistic data suggest a slightly different approach. To account for this particularity, we study a model based on the notion of "analogical proportion". In this model, the analysis f(T) of a new object T is carried out by identifying an analogical proportion with objects X, Y and Z that are already known. The analogical hypothesis thus postulates that if X : Y :: Z : T, then f(X) : f(Y) :: f(Z) : f(T). To infer f(T) from the known f(X), f(Y) and f(Z), one solves the "analogical equation" with unknown I: f(X) : f(Y) :: f(Z) : I.
In the first part of this work, we present a study of this analogical proportion model within a more general framework that we call "learning by analogy". This framework is instantiated in a number of contexts: in cognitive science, it corresponds to analogical reasoning, an essential faculty at the heart of many cognitive processes; in traditional linguistics, it underpins mechanisms such as analogical creation, opposition and commutation; in machine learning, it corresponds to the family of lazy learning methods. This perspective sheds light on the nature of the model and on its underlying mechanisms.
The second part of our work proposes a unified algebraic framework defining the notion of analogical proportion. Starting from a model of analogical proportion between strings of symbols (elements of a free monoid), we present an extension to the more general case of semigroups. This generalisation leads directly to a definition that is valid for all sets deriving from the semigroup structure, thereby allowing analogical proportions to be modelled between the usual representations of linguistic entities such as strings, trees, feature structures and finite languages. Algorithms suited to processing analogical proportions between such structured objects are presented. We also propose some directions for enriching the model so that it can be used in more complex cases.
The inferential model under study, motivated by the needs of Natural Language Processing, is then explicitly interpreted as a Machine Learning method. This formalisation highlights several of its characteristic features. A notable particularity of the model lies in its ability to handle structured objects both as input and as output, whereas the classical classification task generally assumes an output space consisting of a finite set of classes. We then show how the learning bias of the method can be expressed by introducing the notion of analogical extension. Finally, we conclude with experimental results obtained by applying our model to several Natural Language Processing tasks: orthographic/phonetic transcription, inflectional analysis and derivational analysis.
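As a naive illustration of solving an analogical equation X : Y :: Z : ? on strings (this toy solver only handles a single suffix alternation; the solvers over free monoids discussed above are far more general):

```python
# Naive string-analogy solver for simple suffix alternations.
def solve_analogy(x, y, z):
    """Return t such that x : y :: z : t, when x and y differ only in their suffix."""
    # Longest common prefix of x and y gives the unchanged part.
    i = 0
    while i < min(len(x), len(y)) and x[i] == y[i]:
        i += 1
    suff_x, suff_y = x[i:], y[i:]
    if not z.endswith(suff_x):
        return None  # outside the scope of this toy solver
    return z[: len(z) - len(suff_x)] + suff_y

print(solve_analogy("walk", "walked", "jump"))       # -> jumped
print(solve_analogy("chanter", "chante", "parler"))  # -> parle
```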
Bouali, Monia. "L'actualisation aspectuelle des adjectivaux prédicatifs : le cas du changement d'état". Paris 13, 2007. http://www.theses.fr/2007PA131029.
A predicate can take several different morphological forms. However, each form has its own usage and actualization. A predicative adjectival (à la mode) is a polylexical entity which conforms to the definitional criteria of a simple adjective (élégant). Nonetheless, its actualization is far richer than what is possible with a simple adjective: it involves a set of actualizers which include light verbs, predicative verbs and adverbs. This work is a contribution to the ongoing effort of the LDI to build semantic classes of predicates. The attention paid to the actualization of predicative adjectivals, in order to design a typology of the markers of change of state, has made it possible to revise the semantic classes of predicates already defined, to incorporate new entities, and to link together predicates belonging to different semantic classes through common aspectual markers.
Fouqueré, Christophe. "Systèmes d'analyse tolérante du langage naturel". Paris 13, 1988. http://www.theses.fr/1988PA132003.