Tesis: "Simplification Automatique des Textes"

1

Hijazi, Rita. "Simplification syntaxique de textes à base de représentations sémantiques exprimées avec le formalisme Dependency Minimal Recursion Semantics (DMRS)". Electronic Thesis or Diss., Aix-Marseille, 2022. http://theses.univ-amu.fr.lama.univ-amu.fr/221214_HIJAZI_602vzfxdu139bxtesm225byk629aeqyvw_TH.pdf.

Texto completo

Resumen

La simplification de textes consiste à transformer un texte en une version plus simple à lire et/ou à comprendre et plus accessible à un public cible, tout en conservant son information, son contenu et son sens originaux. Cette thèse se concentre sur la simplification syntaxique de textes en anglais, une tâche pour laquelle les systèmes automatiques existants présentent certaines limites. Pour les dépasser, nous proposons tout d’abord une nouvelle méthode de simplification syntaxique exploitant des dépendances sémantiques exprimées en DMRS (Dependency Minimal Recursion Semantics), une représentation sémantique profonde sous forme de graphes combinant sémantique et syntaxe. La simplification syntaxique consiste alors à représenter la phrase complexe en un graphe DMRS, transformer selon des stratégies spécifiques ce graphe en d’autres graphes DMRS qui généreront des phrases plus simples. Cette méthode permet la simplification syntaxique de constructions complexes, en particulier des opérations de division basées sur des appositives, sur des coordinations et sur des subordinations ; ainsi que la transformation de formes passives en formes actives. Les résultats obtenus par ce système de simplification syntaxique sur ce corpus de référence sur les opérations de division de phrases surpassent ceux des systèmes existants du même type dans la production de phrases simples, grammaticales et conservant le sens, démontrant ainsi tout l’intérêt de notre approche de la simplification syntaxique à base de représentations sémantiques en DMRS
Text simplification is the task of making a text easier to read and understand and more accessible to a target audience. This goal can be reached by reducing the linguistic complexity of the text while preserving the original meaning as much as possible. This thesis focuses on the syntactic simplification of texts in English, a task for which these automatic systems have certain limitations. To overcome them, we first propose a new method of syntactic simplification exploiting semantic dependencies expressed in DMRS (Dependency Minimal Recursion Semantics), a deep semantic representation in the form of graphs combining semantics and syntax. Syntactic simplification enables to represent the complex sentence in a DMRS graph, transforming this graph according to specific strategies into other DMRS graphs, which will generate simpler sentences. This method allows the syntactic simplification of complex constructions, in particular division operations such as subordinate clauses, appositive clauses, coordination and also the transformation of passive forms into active forms. The results obtained by this system of syntactic simplification surpass those of the existing systems of the same type in the production of simple, grammatical sentences and preserving the meaning, thus demonstrating all the interest of our approach to syntactic simplification based on semantic representations in DMRS