Theses on the topic "Traitement de la parole et du langage"
Create an accurate citation in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 theses for your research on the topic "Traitement de la parole et du langage".
Next to every source in the list of references there is an "Add to bibliography" button. Click this button, and we will automatically generate the bibliographic reference for the chosen work in the citation style you need: APA, MLA, Harvard, Vancouver, Chicago, etc.
You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.
Explore theses on a wide variety of disciplines and organise your bibliography correctly.
Meunier, Fanny. "Morphologie et traitement du langage parlé". Paris 5, 1997. http://www.theses.fr/1997PA05H084.
A major issue in the study of human language concerns the way the stored representations of words are accessed during speech processing. The research work I have carried out so far approaches this question through the special case of morphologically complex words (e.g., 'undo', 'asymmetry'). Because of their structure, these words allow clearer insight than monomorphemic ones into the nature of representation and retrieval processes within the 'mental lexicon'. More specifically, we raised two critical questions: (a) how is morphological structure represented mentally? and (b) how are polymorphemic words accessed during spoken language recognition? The experiments I conducted within my PhD have shown that there are both representational and processing differences between derivationally prefixed (e.g., 'distrust') and suffixed forms (e.g., 'trustful'). The first series of experiments used a lexical decision task, performed on auditorily presented words. Our results clearly suggest that prefixed words, just like monomorphemic items, are processed in a temporally continuous way ('from left to right'), that is, without prelexical decomposition (i.e., without words being broken down into their constituent morphemes prior to lexical access). The processing of suffixed words, on the other hand, is influenced by the rest of their 'morphological family'. All suffixed relatives of a stem are listed fully within their stem's lexical entry, and the place they occupy within this list depends on their frequency (the most frequent ones coming 'on top'). Thus, the speed with which a suffixed form is accessed will depend on its 'place' within its family. A second series of experiments used a cross-modal priming paradigm (an auditory prime was immediately followed by the presentation of a visual target, on which subjects performed a lexical decision).
These experiments showed that there exist, between all three member types (stem, suffixed and prefixed forms), links of a purely morphological nature (that is, other than purely semantic or formal). All these experiments provide keys concerning the way morphologically complex words are processed and concerning their representational format. In the last part of our work, we proposed a model that takes all our results into account.
Nazzi, Thierry. "Du rythme dans l'acquisition et le traitement de la parole". Paris, EHESS, 1997. http://www.theses.fr/1997EHES0004.
Eshkol, Iris. "Typologie sémantique des prédicats de parole". Paris 13, 2002. http://www.theses.fr/2002PA131013.
With the enormous amount of electronically available information on the Internet, the development of information extraction technology becomes more and more important. It requires new tools to structure and analyse textual data that help users in accessing and evaluating the information they are looking for. It is the linguist's task to create terminological databases and thesauri for this application. The research presented in this thesis is situated in the domain of Natural Language Processing (NLP), in which the lexicon plays a central role. Apart from traditional lexicographical approaches, the majority of NLP applications rely on syntactic information encoded in the lexicon, as for example in the LADL electronic dictionaries. Although this is an appropriate approach, it is to some extent insufficient because it does not take into account the semantics of words and their ambiguity. For this, one needs to build dictionaries that relate syntax and semantics.
Regnault, Pascaline. "Musique et chant : approche comportementale et électrophysiologique du traitement de la musique et du langage". Aix-Marseille 1, 2001. http://www.theses.fr/2001AIX11020.
Deligne, Sabine. "Modeles de sequences de longueurs variables : application au traitement du langage ecrit et de la parole". Paris, ENST, 1996. http://www.theses.fr/1996ENST0029.
Deligne, Sabine. "Modèle de séquences de longueurs variables : application au traitement du langage écrit et de la parole /". Paris : École nationale supérieure des télécommunications, 1996. http://catalogue.bnf.fr/ark:/12148/cb36162106f.
Pouchot, Stéphanie. "L'analyse de corpus et la génération automatique de texte : méthodes et usages". Grenoble 3, 2003. http://www.theses.fr/2003GRE39006.
Texto completoLaurent, Antoine. "Auto-adaptation et reconnaissance automatique de la parole". Le Mans, 2010. http://cyberdoc.univ-lemans.fr/theses/2010/2010LEMA1009.pdf.
The first part of this thesis presents a computer-assisted speech transcription method. Every time the user corrects a word in the automatic transcription, this correction is immediately taken into account to re-evaluate the transcription of the words following it. The latter is obtained by reordering the confusion network hypotheses generated by the ASR system. The use of the reordering method allows an absolute gain of 3.4 points (19.2% to 15.8%) in terms of word stroke ratio (WSR) on the ESTER 2 corpus. In order to decrease the proper noun error rate, an acoustic-based phonetic transcription method is proposed in this manuscript. The use of SMT [Laurent 2009] associated with the proposed method allows a significant reduction in terms of word error rate (WER) and proper noun error rate (PNER).
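The interactive correction-and-reordering idea summarised in this abstract can be illustrated with a toy sketch (not the thesis's actual system): a confusion network is a sequence of slots, each holding alternative words with posterior probabilities, and a user correction can be propagated to later slots through, for instance, a bigram score. The network, bigram table and probabilities below are invented illustrative data.

```python
# Toy sketch of re-decoding a confusion network after a user correction.
# All data here is invented for illustration.

def redecode(confnet, bigram, corrections):
    """Pick one word per slot, combining the slot posterior with a
    bigram score conditioned on the previously chosen word. Slots the
    user has corrected are fixed to the corrected word."""
    output, prev = [], None
    for i, slot in enumerate(confnet):
        if i in corrections:
            best = corrections[i]  # the user's correction wins outright
        else:
            # score(word) = posterior * P(word | previous chosen word)
            best = max(slot, key=lambda wp: wp[1] * bigram.get((prev, wp[0]), 0.05))[0]
        output.append(best)
        prev = best
    return output

confnet = [
    [("the", 0.6), ("a", 0.4)],
    [("cat", 0.3), ("cap", 0.7)],   # the recognizer prefers the wrong word here
    [("sat", 0.4), ("sad", 0.6)],
]
bigram = {("cat", "sat"): 0.9, ("cat", "sad"): 0.05, ("cap", "sad"): 0.9}

print(redecode(confnet, bigram, {}))          # → ['the', 'cap', 'sad']
print(redecode(confnet, bigram, {1: "cat"}))  # → ['the', 'cat', 'sat']
```

Correcting slot 1 changes the best word in slot 2 as well, which is the behaviour the abstract describes.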
Tran, Ngoc Anaïs. "Perception de la parole sifflée : étude de la capacité de traitement langagier des musiciens". Electronic Thesis or Diss., Université Côte d'Azur, 2023. http://www.theses.fr/2023COAZ2052.
Speech perception is a process that must adapt to a large amount of variability. These variations, including differences in production that depend on the speaker, modify the speech signal. By using this modified speech signal in experimental studies, we can target certain aspects of speech and their role in the perceptive process. In this thesis, I considered a form of naturally modified speech known as "whistled speech" to further explore the role of acoustic phonological cues in the speech perception process. Variation, however, is not unique to speech production: it is also present among those perceiving speech and varies according to individual experience. Here, I analyzed the effect of classical music expertise on whistled speech perception. Whistled speech transposes the modal spoken speech signal into higher frequencies, corresponding to a register best perceived by human hearing. In our corpus, vowels are reduced to high whistled frequencies, in a pitch range specific to each vowel, and consonants modify these frequencies according to their articulation. First, we considered how naive listeners (who have never heard whistled speech before) perceive whistled speech. We targeted four vowels and four consonants: /i, e, a, o/ and /k, p, s, t/, which we considered in isolation or in VCV form, and in whistled words (chosen to incorporate the target phonemes). We then considered the effect of musical experience on these categorization tasks, also taking an interest in the transfer of knowledge and the effect of instrument expertise. In these studies, we observed that naive listeners categorize whistled phonemes and whistled words well above chance, with a preference for acoustic cues that characterize consonants and vowels with contrasting pitches. This preference is nonetheless affected by the context in which the phoneme is heard (especially in words).
We also observed an effect of musical expertise on categorization, which improved with more experience and was strongest for high-level classical musicians. We attributed these differences to better use of acoustic cues, allowing for a transfer of skills between musical knowledge and whistled speech perception, though the performance attributable to musical experience remains much lower than that of participants with prior knowledge of whistled speech. These acoustic skills were also found to be specific to the instrument played: flute players outperformed the other instrumentalists, particularly on consonant tasks. Thus, we suggest that the effect of training, such as musical training, improves one's performance on whistled speech perception according to the similarities between the sound signals, both in terms of acoustics and articulation.
Guilleminot, Christian. "Décomposition adaptative du signal de parole appliquée au cas de l'arabe standard et dialectal". Besançon, 2008. http://www.theses.fr/2008BESA1030.
The present work introduces into phonetics the atomic decomposition of the signal, also known as Matching Pursuit, then compresses groups of atoms losslessly, and finally measures the distance between the compressed atom lists using Kolmogorov-complexity-based algorithms. The calibration is based on an initial classical analysis of the co-articulation of sound sequences VCV and CV, where V ∈ {[i], [u], [a]} and C ∈ {[t], [d], [s], [δ]} ∪ {[tʕ], [dʕ], [sʕ], [δʕ]}, the excerpts culled from a corpus covering four Arabic-speaking areas. The locus equations of CV vs. CʕV make it possible to differentiate the varieties of the language. In the second analysis, an algorithm of adaptive atomic decomposition, or Matching Pursuit, is applied to the sequences VCV and VCʕV on the same corpus. The atomic sequences representing VCV and VCʕV are then compressed losslessly and the distances between them are computed with Kolmogorov-complexity-based algorithms. The classification of phonetic recordings obtained from these Arabic-speaking areas is equivalent to that of the first method. The findings of the study show how the introduction of Matching Pursuit into phonetics works, demonstrate the great robustness of the algorithms used, and suggest important possibilities for automating the processes put in place, while opening new grounds for further investigation.
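The Matching Pursuit algorithm named in this abstract can be sketched in a few lines: greedily pick the dictionary atom most correlated with the residual, subtract its projection, and repeat. The tiny dictionary and signal below are invented for illustration; the thesis applies the idea to time-frequency atoms on real speech signals.

```python
# Minimal Matching Pursuit sketch over a toy dictionary of unit-norm atoms.
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matching_pursuit(signal, atoms, n_iter):
    """Greedily pick the atom most correlated with the residual and
    subtract its projection; return (atom index, coefficient) pairs
    plus the final residual."""
    residual = list(signal)
    decomposition = []
    for _ in range(n_iter):
        coeffs = [dot(residual, a) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(coeffs[i]))
        decomposition.append((k, coeffs[k]))
        residual = [r - coeffs[k] * a for r, a in zip(residual, atoms[k])]
    return decomposition, residual

atoms = [(1.0, 0.0), (0.0, 1.0), (math.sqrt(0.5), math.sqrt(0.5))]
decomp, res = matching_pursuit((3.0, 4.0), atoms, 2)
print(decomp)  # the diagonal atom (index 2) is picked first
```

Each iteration strictly reduces the residual energy, which is what makes the greedy decomposition usable as a compact signal description.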
Bigi, Brigitte. "Contribution à la modélisation du langage pour des applications de recherche documentaire et de traitement de la parole". Avignon, 2000. http://www.theses.fr/2000AVIG0125.
Frath, Pierre. "Semantique, reference et acquisition automatique de connaissances a partir de textes". Strasbourg 2, 1997. http://www.theses.fr/1997STR20079.
Automatic knowledge acquisition from text ideally consists in generating a structured representation of a corpus, which a human or a machine should be able to query. Designing and realising such a system raises a number of difficulties, both theoretical and practical, which we intend to look into. The first part of this dissertation studies the two main approaches to the problem: automatic terminology retrieval, and model-driven knowledge acquisition. The second part studies the mostly implicit theoretical foundations of natural language processing, i.e. logical positivism and componential lexical semantics. We offer an alternative inspired by the work of Charles Sanders Peirce, Ludwig Wittgenstein and Georges Kleiber, i.e. a semantics based on the notions of sign, usage and reference. The third part is devoted to a detailed semantic analysis of a medical corpus. Reference is studied through two notions, denomination and denotation. Denominations allow for arbitrary, preconstructed and opaque reference; denotations, for discursive, constructed and transparent reference. In the fourth part, we manually construct a detailed representation of a fragment of the corpus. The aim is to study the relevance of the theoretical analysis and to set precise objectives for the system. The fifth part focuses on implementation. It is devoted to the construction of a terminological knowledge base capable of representing a domain corpus, and sufficiently structured for use by applications in terminology or domain modelling, for example. In a nutshell, this dissertation examines automatic knowledge acquisition from text from a theoretical and technical point of view, with the technology setting the guidelines for the theoretical discussions.
Huet, Stéphane, and Pascale Sébillot. "Informations morpho-syntaxiques et adaptation thématique pour améliorer la reconnaissance de la parole". [S.l.] : [s.n.], 2007. ftp://ftp.irisa.fr/techreports/theses/2007/huet-hyperref.pdf.
Dutrey, Camille. "Analyse et détection automatique de disfluences dans la parole spontanée conversationnelle". Thesis, Paris 11, 2014. http://www.theses.fr/2014PA112415/document.
Extracting information from linguistic data has gained more and more attention in the last decades, in relation with the increasing amount of information that has to be processed on a daily basis in the world. Since the 90's, this interest in information extraction has converged on the development of research on speech data. In fact, speech data involves problems beyond those encountered in written data, in particular due to many phenomena specific to human speech (e.g. hesitations, corrections, etc.), but also because automatic speech recognition systems applied to the speech signal potentially generate errors. Thus, extracting information from audio data requires taking into account the "noise" inherent to audio data and to the output of automatic systems, and cannot be a simple combination of methods that have proven themselves on the information extraction task for written data. It follows that the use of techniques dedicated to speech/audio data processing is mandatory, especially techniques which take into account the specificities of such data in relation with the corresponding signal and transcriptions (manual and automatic). This problem has given birth to a new area of research and raised new scientific challenges related to the management of the variability of speech and its spontaneous modes of expression. Furthermore, robust analysis of phone conversations is the subject of a large number of works, in whose continuity this thesis stands. More specifically, this thesis focuses on the analysis of edit disfluencies and their realisation in conversational data from EDF call centres, using the speech signal and both manual and automatic transcriptions. This work is linked to numerous domains, from robust analysis of speech data to the analysis and management of aspects related to speech expression.
The aim of the thesis is to propose appropriate methods to deal with speech data in order to improve text mining analyses of speech transcriptions (treatment of disfluencies). To address these issues, we have finely analysed the characteristic phenomena and behaviour of spontaneous speech (disfluencies) in conversational data from EDF call centres and developed an automatic method for their detection using linguistic, prosodic, discursive and para-linguistic features. The contributions of this thesis are structured in three areas of research. First, we proposed a specification of call centre conversations from the perspective of spontaneous speech and of the phenomena that characterise it. Second, we developed (i) an enrichment chain with effective processing of speech data on several levels of analysis (linguistic, acoustic-prosodic, discursive and para-linguistic); (ii) a system which automatically detects edit disfluencies, suitable for conversational data and based on the speech signal and transcriptions (manual or automatic). Third, from a "resource" point of view, we produced a corpus of automatic transcriptions of conversations taken from call centres, annotated with edit disfluencies (using a semi-automatic method).
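As a deliberately simplified illustration of surface-cue-based disfluency detection (not the multi-feature system described in this abstract), one of the strongest lexical cues to edit disfluencies is immediate word repetition; a crude baseline might flag those candidates before prosodic and discursive features are brought in. The utterance below is invented.

```python
def repetition_candidates(tokens):
    """Flag positions whose token repeats the previous one, a crude
    surface cue for edit disfluencies ('je je veux ...')."""
    return [i for i in range(1, len(tokens)) if tokens[i] == tokens[i - 1]]

utterance = "je je veux euh je veux changer de de forfait".split()
print(repetition_candidates(utterance))  # → [1, 8]
```

A real detector would combine such lexical cues with the acoustic-prosodic and discursive features the abstract lists, typically in a sequence classifier.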
Milhorat, Pierrick. "Une plate-forme ouverte pour la conception et l'implémentation de systèmes de dialogue vocaux en langage naturel". Electronic Thesis or Diss., Paris, ENST, 2014. http://www.theses.fr/2014ENST0087.
Recently, global tech companies have released so-called virtual intelligent personal assistants. This thesis takes a bi-directional approach to the domain of spoken dialog systems. On the one hand, parts of the work emphasize increasing the reliability and the intuitiveness of such interfaces. On the other hand, it also focuses on the design and development side, providing a platform made of independent specialized modules and tools to support the implementation and testing of prototypical spoken dialog system technologies. The topics covered by this thesis are centered around an open-source framework for supporting the design and implementation of natural-language spoken dialog systems. Continuous listening, where users are not required to signal their intent prior to speaking, has been and still is an active research area. Two methods are proposed here, analyzed and compared. In line with the two directions taken in this work, the natural language understanding subsystem of the platform has been designed to be intuitive to use, allowing natural language interaction. Finally, on the dialog management side, this thesis argues in favor of the deterministic modeling of dialogs. However, such an approach requires intense human labor, is prone to error and does not ease the maintenance, update or modification of the models. A new paradigm, the linked-form-filling language, aims to facilitate the design and maintenance tasks by shifting the modeling to an application specification formalism.
Grisvard, Olivier. "Modélisation et gestion du dialogue oral homme-machine de commande". Nancy 1, 2000. http://www.theses.fr/2000NAN10011.
Designing a spoken man-machine command dialogue system that can be used by the largest number of people, that is, even people who are not specialists in interacting with computers, is not an easy task. On the one hand, it requires taking into account some characteristics of human conversation in general, in order to provide the system with natural means of interacting with the user. On the other hand, it implies respecting constraints specific to task-based dialogue, that is, dialogue used to manage a definite computer task. Given such a framework, we propose a model for this class of dialogues. Although the model's main purpose is to be implemented in a real command system, its definition is based on an in-depth study of the principles and mechanisms of man-man dialogue. More precisely, our dialogue model comprises a structured representation formalism for task and dialogue data, which is based on the notion of eventuality, as well as a dialogue management procedure. This procedure includes pragmatic analysis of user utterances, effective management of the event-based dialogue representation, application management, and system utterance production. The model we propose is intended to be generic enough to be independent of the application.
Adda, Gilles. "Reconnaissance de grands vocabulaires : une étude syntaxique et lexicale". Paris 11, 1987. http://www.theses.fr/1987PA112386.
Huet, Stéphane. "Informations morpho-syntaxiques et adaptation thématique pour améliorer la reconnaissance de la parole". Phd thesis, Rennes 1, 2007. ftp://ftp.irisa.fr/techreports/theses/2007/huet-hyperref.pdf.
Our research aims at improving the outputs produced by automatic speech recognition (ASR) systems by integrating additional linguistic knowledge. In the first part, we propose a new mode of integration of parts of speech in a post-processing stage of speech decoding. To do this, we tag N-best sentence hypothesis lists with a morpho-syntactic tagger and then reorder these lists by modifying the score computed by the ASR system at the sentence level. Experiments done on French-speaking broadcast news exhibit an improvement of the word error rate and of confidence measures. In the second, more exploratory part, we are interested in thematically adapting the language model (LM) of an ASR system. We first segment the studied document into thematically homogeneous sections, proposing a new probabilistic framework to integrate different modalities. We then build adaptation corpora retrieved from the Web and finally modify the LM with these specific corpora.
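The post-processing stage this abstract describes can be sketched as a log-linear rescoring of the N-best list, combining the ASR score with a morpho-syntactic (POS) score. The hypotheses, scores and interpolation weight below are invented toy values, not those of the thesis.

```python
# Toy N-best reranking: combine ASR and POS-tagger log-scores.
# All scores below are invented for illustration.

def rerank(nbest, lam=0.6):
    """Reorder N-best hypotheses by a weighted sum of the ASR log-score
    and a POS-sequence log-score from a morpho-syntactic tagger."""
    return sorted(nbest, key=lambda h: lam * h["asr"] + (1 - lam) * h["pos"], reverse=True)

nbest = [
    {"text": "il mange une pomme", "asr": -10.2, "pos": -3.1},
    {"text": "il manger une pomme", "asr": -10.0, "pos": -8.7},  # implausible POS sequence
]
print(rerank(nbest)[0]["text"])  # → il mange une pomme
```

The second hypothesis wins on the ASR score alone, but its implausible POS sequence demotes it once the tagger score is mixed in, which is the effect the abstract reports on broadcast news.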
Huet, Stéphane. "Informations morpho-syntaxiques et adaptation thématique pour améliorer la reconnaissance de la parole". Phd thesis, Université Rennes 1, 2007. http://tel.archives-ouvertes.fr/tel-00524245.
Milhorat, Pierrick. "Une plate-forme ouverte pour la conception et l'implémentation de systèmes de dialogue vocaux en langage naturel". Thesis, Paris, ENST, 2014. http://www.theses.fr/2014ENST0087/document.
Al, Thonaiyan Abdullah. "Espaces de parole et stratégies d'individuation : repérage et analyse des mécanismes d'influences dans le traitement des évènements rapportés des journaux quotidiens français et saoudiens à propos de la guerre d'Irak". Rouen, 2010. http://www.theses.fr/2010ROUEL022.
This work focuses on the complex processes through which mechanisms of influence are established in the treatment of media events by some French and Saudi newspapers. The purpose of this work is to distinguish four strategies of news discourse in the press related to the kind of reported events, and to identify certain aspects of variation in discourse according to the required type, by invention and continuous calculation with respect to the others. It also deals with discourses that carry an intention to produce effects in the reader, highlighting the kinds of effects they are likely to produce in order to exert influence. This work highlights the importance of communicative processes in the construction and presentation of events reported in socially and culturally different news organs.
Woehrling, Cécile. "Accents régionaux en français : perception, analyse et modélisation à partir de grands corpus". Phd thesis, Université Paris Sud - Paris XI, 2009. http://tel.archives-ouvertes.fr/tel-00617248.
Texto completoRoussel, David. "Intégration de prédictions linguistiques issues d'applications à partir d'une grammaire d'arbres hors-contexte : contribution à l'analyse de la parole". Grenoble 1, 1999. http://www.theses.fr/1999GRE10209.
Texto completoHamon, Bérengère Beaumont Catherine. "Etude des traitements phonologique et visuo-attentionnel chez des collégiens normo-lecteurs et dyslexiques". Tours : SCD de l'université de Tours, 2007. http://www.applis.univ-tours.fr/scd/Orthophonie/2007ortho_hamon.pdf.
Texto completoServan, Christophe. "Apprentissage automatique et compréhension dans le cadre d'un dialogue homme-machine téléphonique à initiative mixte". Phd thesis, Université d'Avignon, 2008. http://tel.archives-ouvertes.fr/tel-00591997.
Texto completoMesfar, Slim. "Analyse morpho-syntaxique automatique et reconnaissance des entités nommées en arabe standard". Besançon, 2008. http://www.theses.fr/2008BESA1022.
The Arabic language, although very important in terms of its number of speakers, presents particular morpho-syntactic phenomena. This particularity is mainly related to its inflectional and agglutinative morphology, the lack of vowels in current written texts, and the multiplicity of its forms; this induces a high level of lexical and syntactic ambiguity, and considerable difficulties follow for automatic processing. The need for a linguistic environment providing powerful tools and the ability to improve performance according to our needs has led us to use the linguistic platform NooJ. We begin with a study, followed by a large-coverage formalization, of the Arabic lexicon. The resulting dictionary, baptised "El-DicAr", links all the inflectional, morphological and syntactico-semantic information to the list of lemmas. Automatic inflectional and derivational routines applied to this list produce more than 3 million inflected forms. We propose a new finite-state machine compiler that achieves optimal storage through a combination of a sequential minimization algorithm and a dynamic compression routine for the stored information. This dictionary acts as the linguistic engine for the automatic morpho-syntactic analyzer that we have developed. This analyzer includes a set of tools: a morphological analyzer that identifies the component morphemes of agglutinative forms using large-coverage morphological grammars; a new algorithm for looking through finite-state transducers in order to deal with texts written in Arabic regardless of their degree of vocalisation; a corrector of the most frequent typographical errors; a named-entity recognition tool based on a combination of the morphological analysis results and rules described in local grammars presented as Augmented Transition Networks (ATNs); an automatic annotator; and some tools for linguistic research and contextual exploration.
In order to make our work available to the scientific community, we have developed an online concordance service, "NooJ4Web: NooJ for the Web". It provides instant results for different types of queries and displays statistical reports as well as the corresponding histograms. The listed services are offered in order to collect feedback and improve performance. The system is used to process Arabic, as well as French and English.
Peri, Pauline. "Apprentissage des langues secondes : les processus de perception et de production de la parole : Perspectives phonétique et psycholinguistique". Thesis, Aix-Marseille, 2013. http://www.theses.fr/2013AIXM3074.
This thesis investigates the formation of L2 vowel categories in late learners, particularly regarding the influence of the first language on the acquisition process, as described in the theoretical predictions of models of L2 speech sound processing. In this study, two different experimental approaches were pursued. First, we examined, with electrophysiological and behavioral techniques, the perception of American English contrasts by French late learners of English at the acoustic, phonological and lexical levels as a function of linguistic experience. Second, fine-grained acoustic analyses were run on the perception and production of French vowels by American English late learners with a specific dialect, that of California, in order to (a) understand how both processes evolve and are linked during the first stages of learning an L2 in immersion and (b) examine the effect of the L1 on the production of L2 speech sounds due to possible lexical competition with homophonic words in French and English. The results show that new L2 vowel categories can be learned and the differences maintained at the lexical level even in late learners. Linguistic experience enables perceptual changes but does not guarantee cognitive processing as automatic as that of native speakers. In the phonetic part of the study, the results show that phonetic differences can be perceived and produced as a function of the pattern of assimilation described in models of L2 acquisition and of the phonological overlap between French and English words. Finally, it seems that the evolution of perception skills precedes that of production skills, in line with SLM predictions (Flege, 1995).
Rilliard, Albert. "Vers une mesure de l'intelligibilité linguistique de la prosodie : évaluation diagnostique des prosodies synthétique et naturelle". Grenoble INPG, 2000. http://www.theses.fr/2000INPG0156.
Texto completoNguyen, Tu Anh. "Spoken Language Modeling from Raw Audio". Electronic Thesis or Diss., Sorbonne université, 2024. http://www.theses.fr/2024SORUS089.
Speech has always been a dominant mode of social connection and communication. However, speech processing and modeling have been challenging due to its variability. Classic speech technologies rely on cascade modeling, i.e. transcribing speech to text with an Automatic Speech Recognition (ASR) system, processing the transcribed text using Natural Language Processing (NLP) methods, and converting text back to speech with a speech synthesis model. This method eliminates speech variability but requires large textual datasets, which are not always available for all languages. In addition, it removes all the expressivity contained in the speech itself. Recent advancements in self-supervised speech learning (SpeechSSL) have enabled the learning of good discrete speech representations from raw audio, bridging the gap between speech and text technologies. This makes it possible to train language models on discrete representations (discrete units, or pseudo-text) obtained from the speech and has given rise to a new domain called TextlessNLP, where the task is to learn the language directly on audio signals, bypassing the need for ASR systems. The resulting Spoken Language Models (SpeechLMs) have been shown to work and offer new possibilities for speech processing compared to cascade systems. The objective of this thesis is thus to explore and improve this newly formed domain. We analyze why these discrete representations work, discover new applications of SpeechLMs to spoken dialogues, extend TextlessNLP to more expressive speech, and improve the performance of SpeechLMs to reduce the gap between SpeechLMs and TextLMs.
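A toy illustration of the textless pipeline this abstract describes: once speech has been quantized into discrete units (the pseudo-text), any ordinary language model can be trained on the unit sequences. Here a count-based bigram model with add-alpha smoothing stands in for the neural SpeechLMs of the thesis; the unit IDs below are invented.

```python
# Toy "pseudo-text" language model: a smoothed bigram model over
# discrete speech unit IDs (invented data, not the thesis's model).
from collections import Counter
import math

def train_bigram(sequences):
    """Count bigrams and left-context unigrams over unit sequences."""
    pairs, unigrams = Counter(), Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pairs[(a, b)] += 1
            unigrams[a] += 1
    return pairs, unigrams

def logprob(seq, model, alpha=1.0, vocab=100):
    """Add-alpha smoothed log-probability of a unit sequence."""
    pairs, unigrams = model
    lp = 0.0
    for a, b in zip(seq, seq[1:]):
        lp += math.log((pairs[(a, b)] + alpha) / (unigrams[a] + alpha * vocab))
    return lp

# pseudo-text: sequences of discrete unit IDs produced by a quantizer
corpus = [[3, 7, 7, 2], [3, 7, 2], [3, 7, 7, 7, 2]]
model = train_bigram(corpus)
print(logprob([3, 7, 2], model) > logprob([2, 7, 3], model))  # → True
```

The point of the sketch is the interface, not the model class: because the quantizer turns audio into token sequences, the downstream modeling problem becomes ordinary language modeling, without any ASR transcripts.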
Vaudable, Christophe. "Analyse et reconnaissance des émotions lors de conversations de centres d'appels". Phd thesis, Université Paris Sud - Paris XI, 2012. http://tel.archives-ouvertes.fr/tel-00758650.
Texto completoCarlotti, Lisa Marie. "Traitement des variations phonologiques régionales en anglais britannique chez l'apprenant francophone". Aix-Marseille 1, 2007. http://www.theses.fr/2007AIX10031.
Texto completoTsai, Chien-Wen. "La compétence commnunicative en didactique des langues : étude des actes de parole rituels en français et en chinois mandarin et traitement en classe". Paris 3, 2007. http://www.theses.fr/2007PA030156.
The purpose of foreign language teaching and learning is the development of communicative ability. The communicative approach and speech acts have played an important role in language teaching for thirty years. At present, the cultural component of the target language, but also of the first language, constitutes a central issue on the subject. Through the study of speech acts in French and in Mandarin Chinese, this dissertation attempts to reread speech act theories and to offer some applications for foreign language classrooms.
Raybaud, Sylvain. "De l'utilisation de mesures de confiance en traduction automatique : évaluation, post-édition et application à la traduction de la parole". Electronic Thesis or Diss., Université de Lorraine, 2012. http://www.theses.fr/2012LORR0260.
In this thesis I deal with the issues of confidence estimation for machine translation and statistical machine translation of large-vocabulary spontaneous speech. I first formalize the problem of confidence estimation. I present experiments under the paradigm of multivariate classification and regression, review the performances yielded by different techniques, present the results obtained during the WMT2012 international evaluation campaign, and give the details of an application to post-editing of automatically translated documents. I then deal with the issue of speech translation. After going into the details of what makes it a very specific and particularly challenging problem, I present original methods to partially solve it, using phonetic confusion networks, confidence estimation techniques and speech segmentation. I show that the prototype I developed yields performances comparable to the state of the art of more standard designs.
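The confidence-estimation setting described above can be illustrated with a toy word-level classifier: each translated word receives a feature vector and a score in [0, 1]. The features and weights below are invented for illustration only; the thesis itself uses multivariate classification and regression techniques.

```python
# Minimal sketch of word-level confidence estimation for MT output:
# each target word gets a feature vector, and a linear model scores
# the probability that the word is a correct translation.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def confidence(features, weights, bias):
    """Probability that a translated word is correct."""
    return sigmoid(sum(f * w for f, w in zip(features, weights)) + bias)

# hypothetical features: language-model score, lexical prob., alignment score
weights, bias = [2.0, 1.5, 1.0], -2.0
good = confidence([0.9, 0.8, 0.7], weights, bias)   # high feature scores
bad = confidence([0.1, 0.2, 0.1], weights, bias)    # low feature scores
print(good > 0.5 > bad)  # True
```

Low-confidence words can then be flagged for post-editing rather than re-translated blindly.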
Kulkarni, Ajinkya. "Expressivity transfer in deep learning based text-to-speech synthesis". Electronic Thesis or Diss., Université de Lorraine, 2022. http://www.theses.fr/2022LORR0122.
Recently, text-to-speech (TTS) synthesis has achieved immense success in the human-computer interaction domain. Current TTS systems are monotonous due to the absence of expressivity. Expressivity in speech generally refers to suprasegmental speech characteristics represented by emotions, speaking styles, and the relationship between speech and gestures, facial expressions, etc. Expressive speech synthesis is thus likely to greatly improve the user experience with machines. The development of an expressive TTS system relies heavily on the speech data used to train it. This thesis aims at developing an expressive TTS system in a speaker's voice for which only neutral speech data is available. Its main focus is to investigate deep learning approaches for disentangling speaker information and expressivity in a multispeaker TTS setting. The work treats expressivity as an emotion attribute with well-defined emotion classes. We present various deep neural network architectures to create latent representations of speaker and expressivity in multispeaker TTS settings. During the expressivity transfer phase, the expressivity and speaker representations are interpolated to synthesize expressive speech in the desired speaker's voice. We present a deep metric learning framework that improves the latent representation of expressivity in a multispeaker TTS setting, resulting in improved expressivity transfer. The thesis also investigates the expressivity transfer capability of probability density estimation based on deep generative models. Deep generative models provide scalable modeling of complex, high-dimensional speech data together with tractability, resulting in high-quality speech synthesis.
The evaluation of the proposed systems is a challenging aspect of the thesis, as no reference expressive speech data was available in the target speaker's voice. Therefore, we propose two subjective evaluation metrics, speaker MOS and expressive MOS, which indicate how well the framework transfers the expressivity and retains the target speaker's voice. As conducting a subjective evaluation each time a system is developed is time-consuming, we also propose a cosine-similarity-based evaluation metric to measure the strength of the expressivity and of the speaker's voice. The obtained results demonstrate the ability of the proposed work to transfer the expressivity while maintaining the overall quality of the synthesized expressive speech in the target speaker's voice. It remains hard to identify which neural network parameters represent the attributes of speaker characteristics and expressivity; moreover, expressivity and speaker characteristics are bound together in the same prosody parameters.
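The cosine-similarity metric mentioned above can be sketched in a few lines: an embedding of the synthesized utterance is compared against reference speaker and expressivity embeddings. The vectors below are toy values; in the actual setting, the embeddings would come from the trained encoders.

```python
# Sketch of a cosine-similarity evaluation: how close is the synthesized
# utterance's embedding to the target speaker and emotion references?
# All vectors here are invented for illustration.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

speaker_ref = [0.9, 0.1, 0.2]      # target speaker embedding (hypothetical)
expressive_ref = [0.1, 0.8, 0.3]   # target emotion embedding (hypothetical)
synthesized = [0.7, 0.5, 0.3]      # embedding of a generated utterance

# Here the generated utterance is closer to the speaker reference:
print(cosine(synthesized, speaker_ref) > cosine(synthesized, expressive_ref))  # True
```

Such an objective proxy avoids rerunning a full MOS listening test after every system change.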
Burfin, Sabine. "L'apport des informations visuelles des gestes oro-faciaux dans le traitement phonologique des phonèmes natifs et non-natifs : approches comportementale, neurophysiologique". Thesis, Université Grenoble Alpes (ComUE), 2015. http://www.theses.fr/2015GRENS002/document.
During audiovisual speech perception, as in face-to-face conversations, we can take advantage of the visual information conveyed by the speaker's oro-facial gestures. This enhances the intelligibility of the utterance. The aim of this work was to determine whether this "audiovisual benefit" can improve the identification of phonemes that do not exist in our mother tongue. Our results revealed that the visual information contributes to overcoming the phonological deafness phenomenon we experience in an audio-only situation (Study 1). An ERP study indicates that this benefit could be due to the modulation of early processing in the primary auditory cortex (Study 2). The audiovisual presentation of non-native phonemes generates a P50 that is not observed for native phonemes. Linguistic background affects the way we use visual information: early bilinguals take less advantage of the visual cues during the processing of unfamiliar phonemes (Study 3). We examined the identification processes of native plosive consonants with a gating paradigm to evaluate the differential contribution of auditory and visual cues across time (Study 4). We observed that the audiovisual benefit is not systematic; phoneme predictability depends on the visual saliency of the speaker's articulatory movements.
Jochaut-Roussillon, Delphine. "Analyse comparée de la pathologie du traitement temporel auditif dans les troubles du spectre autistique et la dyslexie". Thesis, Paris 6, 2015. http://www.theses.fr/2015PA066723/document.
This research aimed at a better understanding of two language disorders: those associated with autism spectrum disorder and with dyslexia. Recent advances indicate how collective cortical neural behaviour intervenes in speech segmentation and decoding. Cortical oscillations provide temporal integration windows at the syllabic (4-7 Hz) and phonemic (25-35 Hz) time scales, chunking the continuous speech signal into linguistically relevant units. We measured slow fluctuations of rhythmic cortical activity and their topography in healthy subjects, in subjects with autism spectrum disorder and in dyslexic subjects using combined fMRI and EEG. We showed that the sensitivity to syllabic and phonemic density is atypical in dyslexia and in autism. In autism, gamma and theta activity do not engage synergistically in response to speech: theta activity in the left auditory cortex fails to track speech modulations and to down-regulate the gamma oscillations that encode the acoustic details of speech. The language disorder in autism thus results from an altered coupling of slow and fast oscillations that disrupts the temporal organization of the neural speech code. In dyslexia, theta activity is not altered and the theta-paced readout of gamma activity is preserved, enabling phonemic decoding, albeit atypical (faster). In both pathologies, auditory oscillatory anomalies lead to atypical oscillation-based connectivity between the auditory cortex and other language cortices.
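The two integration windows discussed above can be illustrated by measuring signal power in the theta (4-7 Hz) and gamma (25-35 Hz) bands. Below is a stdlib-only sketch on a synthetic signal using a plain DFT; real EEG analyses would of course use proper filtering and spectral estimation.

```python
# Illustrative sketch: power of a signal in the theta (4-7 Hz) and
# gamma (25-35 Hz) bands, computed with a plain DFT on a synthetic signal.
import math

def band_power(signal, fs, lo, hi):
    """Sum DFT power over bins whose frequency falls in [lo, hi] Hz."""
    n = len(signal)
    power = 0.0
    for k in range(n // 2):
        freq = k * fs / n
        if lo <= freq <= hi:
            re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
            im = sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
            power += (re * re + im * im) / n
    return power

fs = 100  # Hz sampling rate
xs = [i / fs for i in range(fs)]  # one second of signal
# strong 5 Hz (theta-range) component plus a weaker 30 Hz (gamma-range) one
signal = [math.sin(2 * math.pi * 5 * x) + 0.3 * math.sin(2 * math.pi * 30 * x) for x in xs]
theta = band_power(signal, fs, 4, 7)
gamma = band_power(signal, fs, 25, 35)
print(theta > gamma)  # True: the 5 Hz component dominates
```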
Minescu, Bogdan. "Construction et stratégie d'exploitation des réseaux de confusion en lien avec le contexte applicatif de la compréhension de la parole". Phd thesis, Université d'Avignon, 2008. http://tel.archives-ouvertes.fr/tel-00629195.
Derouault, Anne-Marie. "Modélisation d'une langue naturelle pour la désambiguation des chaînes phonétiques". Paris 7, 1985. http://www.theses.fr/1985PA077028.
Doukhan, David. "Synthèse de parole expressive au delà du niveau de la phrase : le cas du conte pour enfant : conception et analyse de corpus de contes pour la synthèse de parole expressive". Thesis, Paris 11, 2013. http://www.theses.fr/2013PA112165/document.
The aim of this thesis is to propose ways to improve the expressiveness of speech synthesis systems. One of its central propositions is to define, use and measure the impact of linguistic structures operating beyond the sentence level, as opposed to approaches that process sentences out of context. The scope of the study is restricted to storytelling for children. Tales have the distinction of having been the subject of a number of studies aimed at highlighting a narrative structure, and they involve a number of stereotypical characters (hero, villain, fairy) whose speech is often reported. These special features are used to model the prosodic properties of tales beyond the sentence level. The oral transmission of tales was often associated with musical practice (vocals, instruments), and their reading involves rich melodic properties whose reproduction remains a challenge for modern speech synthesizers. To address these issues, a first corpus of written tales is collected and annotated with information about the narrative structure of the stories, the identification and attribution of direct quotations, coreferences to characters, and named-entity and enumeration zones. The annotated corpus is described in terms of coverage and inter-annotator agreement. It is used to build systems for segmenting tales into episodes and for detecting direct quotations, dialogue acts and modes of communication. A second corpus, of tales read by a professional speaker, is presented. The speech is aligned with the lexical and phonetic transcriptions, the textual annotations of the corpus, and meta-information describing the characteristics of the characters involved in the story. The relationships between linguistic annotations and the prosodic properties observed in the speech corpus are described and modeled. Finally, a prototype controlling the expressive parameters of the Acapela unit-selection synthesizer is built. The prototype generates prosodic instructions operating beyond the sentence level, notably using information related to the structure of the story and the distinction between direct and reported speech. The prototype is validated through a perceptual experiment, which shows a significant improvement in the quality of the synthesis.
Hueber, Thomas. "Reconstitution de la parole par imagerie ultrasonore et vidéo de l'appareil vocal : vers une communication parlée silencieuse". Phd thesis, Université Pierre et Marie Curie - Paris VI, 2009. http://pastel.archives-ouvertes.fr/pastel-00005707.
Nguyen, Roselyne. "Un système multi-agent pour la machine à dicter vocale MAUD : conception et intégration d'une source de connaissances phonologiques". Nancy 1, 1996. http://www.theses.fr/1996NAN10321.
Fell, Michael. "Traitement automatique des langues pour la recherche d'information musicale : analyse profonde de la structure et du contenu des paroles de chansons". Thesis, Université Côte d'Azur, 2020. http://www.theses.fr/2020COAZ4017.
Applications in Music Information Retrieval and Computational Musicology have traditionally relied on features extracted from the music content in the form of audio, but have mostly ignored the song lyrics. More recently, improvements in fields such as music recommendation have been made by taking into account external metadata related to the song. In this thesis, we argue that extracting knowledge from the song lyrics is the next step in improving the user's experience when interacting with music. To extract knowledge from vast amounts of song lyrics, we show for different textual aspects (structure, content and perception) how Natural Language Processing methods can be adapted and successfully applied to lyrics. For the structural aspect, we derive a structural description of the lyrics by introducing a model that efficiently segments them into their characteristic parts (e.g. intro, verse, chorus). In a second stage, we represent the content of lyrics by summarizing them in a way that respects the characteristic lyrics structure. Finally, on the perception of lyrics, we investigate the problem of detecting explicit content in a song text. This task proves to be very hard, and we show that the difficulty partially arises from the subjective nature of perceiving lyrics one way or another depending on the context. Furthermore, we touch on another problem of lyrics perception by presenting our preliminary results on Emotion Recognition. As a result, during the course of this thesis we have created the annotated WASABI Song Corpus, a dataset of two million songs with NLP lyrics annotations on various levels.
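The structural segmentation idea can be illustrated with a toy heuristic: split lyrics into blocks at blank lines and flag repeated blocks as likely choruses. The thesis uses a trained segmentation model; this sketch only illustrates the underlying notion of characteristic repeated parts.

```python
# Toy lyrics structure labeling: blocks separated by blank lines,
# with exact repeats flagged as choruses. A stand-in for the trained
# segmenter described in the abstract, not its actual method.
def label_blocks(lyrics):
    blocks = [b.strip() for b in lyrics.split("\n\n") if b.strip()]
    counts = {}
    for b in blocks:
        counts[b] = counts.get(b, 0) + 1
    return [("chorus" if counts[b] > 1 else "verse", b) for b in blocks]

lyrics = "first verse line\n\nla la la\n\nsecond verse line\n\nla la la"
print([label for label, _ in label_blocks(lyrics)])
# → ['verse', 'chorus', 'verse', 'chorus']
```

A real model must also handle near-repeats and sections like intro or bridge, which is why a learned segmenter is needed.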
Vaglio, Andrea. "Leveraging lyrics from audio for MIR". Electronic Thesis or Diss., Institut polytechnique de Paris, 2021. http://www.theses.fr/2021IPPAT027.
Lyrics provide a lot of information about music, since they encapsulate much of the semantics of songs. Such information could help users navigate easily through a large collection of songs and could support recommending new music to them. However, this information is often unavailable in textual form. To get around this problem, singing voice recognition systems can be used to obtain transcripts directly from the audio. These approaches are generally adapted from speech recognition ones. Speech transcription is a decades-old domain that has lately seen significant advances due to developments in machine learning techniques. When applied to the singing voice, however, these algorithms give poor results, and for a number of reasons lyrics transcription remains difficult. In this thesis, we investigate several scientifically and industrially challenging Music Information Retrieval problems by utilizing lyrics information generated directly from audio. The emphasis is on making the approaches as relevant in real-world settings as possible. This entails testing them on vast and diverse datasets and investigating their scalability. To do so, a huge publicly available annotated lyrics dataset is used, and several state-of-the-art lyrics recognition algorithms are successfully adapted. We notably present, for the first time, a system that detects explicit content directly from audio. The first research on the creation of a multilingual lyrics-to-audio system is also described. The lyrics-to-audio alignment task is further studied in two experiments quantifying the perception of audio and lyrics synchronization. A novel phonotactic method for language identification is also presented. Finally, we provide the first cover song detection algorithm that makes explicit use of lyrics information extracted from audio.
Mahjoubi, Hanane. "Calcul de la référence dans les dialogues oraux et transfert du français vers l'arabe : modélisation simplifiée de la théorie du gouvernement et du liage". Besançon, 2009. http://www.theses.fr/2009BESA1029.
This thesis illustrates the realization and use of a model for anaphora resolution in oral dialogues, in the field of computational linguistics. The work is mainly based on the theory of government and binding, and manually transcribed oral corpora form its empirical basis. We do not claim to have discovered the definitive solution to the problem of anaphora in discourse; rather, we propose that the study and analysis of anaphoric structures requires complex formal theories such as government and binding. For many years the structure of oral discourse has been considered ambiguous and very difficult to study, yet oral constructions should be treated just like written ones: oral language has its own logic, and standard forms of the language appear widely within it.
Dupuis, Catherine. "Langage et parole chez l’enfant dysphasique". Paris 7, 1999. http://www.theses.fr/1999PA070042.
Nguyen, Viet Son. "Etude des caractéristiques de la langue vietnamienne en vue de sa synthèse et de sa reconnaissance automatique. Aspects statiques et dynamiques". Phd thesis, Telecom ParisTech, 2009. http://tel.archives-ouvertes.fr/tel-01064853.
Wu, Yaru. "Étude de la réduction segmentale en français parlé à travers différents styles : apports des grands corpus et du traitement automatique de la parole à l’étude du schwa, du /ʁ/ et des réductions à segments multiples". Thesis, Sorbonne Paris Cité, 2018. http://www.theses.fr/2018USPCA078.
This study of segmental reduction (i.e. deletion or temporal reduction) in spontaneous French allows us to propose two research methods for linguistic studies on large corpora, to investigate different factors of variation, and to bring new insights into the propensity for segmental reduction. We applied the descendant method, using forced alignment with pronunciation variants, when a specific reduction phenomenon was targeted; otherwise, we used the ascendant method, taking absent and short segments as indicators. Three reduction phenomena are studied: schwa elision, /ʁ/ deletion and the propensity for segmental reduction. The descendant method was used to analyze schwa elision and /ʁ/ deletion; factors common to the two studies are post-lexical context, speech style, sex and profession. Schwa elision in the initial syllable of polysyllabic words and post-consonantal /ʁ/ deletion in word-final position are not always conditioned by the same variation factors. Similarly, lexical schwa and epenthetic schwa are not influenced by the same variation factors. The study of the propensity for segmental reduction allows us to apply the ascendant method and to investigate segmental reduction in general. Results suggest that liquids and glides resist reduction less than other consonants, and that nasal vowels resist reduction better than oral vowels. Among oral vowels, high rounded vowels tend to be reduced more often than the others.
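The key ingredient of the descendant method, forced alignment with pronunciation variants, can be sketched by generating every pronunciation of a word with each schwa optionally deleted; the aligner then picks whichever variant matches the signal. The phone notation below is simplified ("@" stands for schwa) and the example word is illustrative.

```python
# Sketch of pronunciation-variant generation for forced alignment:
# every schwa ("@") in the canonical form may be kept or deleted.
def schwa_variants(phones):
    """All pronunciations obtained by optionally deleting each schwa."""
    variants = [[]]
    for p in phones:
        if p == "@":
            # keep the schwa in one copy of each variant, drop it in the other
            variants = [v + [p] for v in variants] + [list(v) for v in variants]
        else:
            variants = [v + [p] for v in variants]
    return [" ".join(v) for v in variants]

# e.g. 'semaine' /s @ m E n/: with or without the initial-syllable schwa
print(schwa_variants(["s", "@", "m", "E", "n"]))
# → ['s @ m E n', 's m E n']
```

Listing both forms in the alignment lexicon lets the corpus itself reveal which realization was produced.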
Hamdi, Ahmed. "Traitement automatique du dialecte tunisien à l'aide d'outils et de ressources de l'arabe standard : application à l'étiquetage morphosyntaxique". Thesis, Aix-Marseille, 2015. http://www.theses.fr/2015AIXM4089/document.
Developing natural language processing tools usually requires a large number of resources (lexica, annotated corpora, ...), which often do not exist for less-resourced languages. One way to overcome the lack of resources is to devote substantial effort to building new ones from scratch. Another approach is to exploit existing resources of closely related languages. Taking advantage of the closeness of standard Arabic and its dialects, one way to solve the problem of limited resources consists in converting Arabic dialects into standard Arabic in order to use the tools developed for the latter. In this work, we focus on processing the Tunisian Arabic dialect. We propose a system that converts Tunisian into a form close to standard Arabic, on which natural language processing tools designed for the latter give good results. To validate our approach, we focused on part-of-speech tagging. Our system achieved an accuracy of 89%, an absolute improvement of ∼20% over a baseline that applies a standard Arabic tagger directly.
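The conversion-based pipeline can be sketched as follows. The mini-lexicon and the tagger stub below are invented placeholders for illustration only, not the actual resources or method of the thesis.

```python
# Hedged sketch of the conversion idea: map Tunisian dialect tokens to a
# standard-Arabic-like form, then apply an MSA part-of-speech tagger.
TUNISIAN_TO_MSA = {"mchit": "dhahabtu", "lel": "ila"}  # hypothetical entries

def convert(tokens):
    """Replace dialect tokens by their standard-Arabic-like counterparts."""
    return [TUNISIAN_TO_MSA.get(t, t) for t in tokens]

def msa_tag(tokens):
    """Stand-in for a real MSA tagger; defaults unknown words to NOUN."""
    lexicon = {"dhahabtu": "VERB", "ila": "PREP"}
    return [(t, lexicon.get(t, "NOUN")) for t in tokens]

print(msa_tag(convert(["mchit", "lel", "souk"])))
# → [('dhahabtu', 'VERB'), ('ila', 'PREP'), ('souk', 'NOUN')]
```

The point of the design is that only the conversion step needs dialect-specific knowledge; the downstream tagger is reused unchanged.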
Choumane, Ali. Traitement générique des références dans le cadre multimodal parole-image-tactile. Rennes : [s.n.], 2008. ftp://ftp.irisa.fr/techreports/theses/2008/choumane.pdf.