Table of contents
Ready-made bibliography on the topic "Modélisation multilingue"
Create an accurate reference in APA, MLA, Chicago, Harvard, and many other citation styles
Browse lists of current articles, books, dissertations, abstracts, and other scholarly sources on the topic "Modélisation multilingue".
Journal articles on the topic "Modélisation multilingue"
Carsenty, Stéphane. "Quel rôle pour le corpus dans la modélisation ontoterminologique multilingue : l'exemple de la balance des paiements". Studia Romanica Posnaniensia 49, no. 4 (January 9, 2023): 9–25. http://dx.doi.org/10.14746/strop.2022.494.001.
Doctoral dissertations on the topic "Modélisation multilingue"
Tan, Tien Ping. "Reconnaissance automatique de la parole non-native". Grenoble 1, 2008. http://www.theses.fr/2008GRE10096.
Automatic speech recognition technology has reached maturity and is now widely integrated into many systems. However, speech recognition for non-native speakers still suffers from high error rates, due to the mismatch between non-native speech and the trained models. Recording sufficient non-native speech for training is time consuming and often difficult. In this thesis, we propose approaches to adapt acoustic and pronunciation models for non-native speakers under different resource constraints. A preliminary work on accent identification has also been carried out. Multilingual acoustic modeling is proposed for modeling the cross-lingual transfer of non-native speakers, to overcome the difficulty of obtaining non-native speech. In cases where multilingual acoustic models are available, a hybrid approach of acoustic interpolation and merging is proposed for adapting the target acoustic model; this approach has also proven useful for context modeling. If multilingual corpora are available instead, three interpolation methods are introduced for adaptation, two of which are supervised speaker adaptation methods that can be carried out with only a few non-native utterances. In terms of pronunciation modeling, two existing approaches that model pronunciation variants, one in the pronunciation dictionary and the other in the rescoring module, are revisited so that they can work with a limited amount of non-native speech. We also propose a speaker clustering approach called "latent pronunciation analysis" for clustering non-native speakers by pronunciation habits; this approach can also be used for pronunciation adaptation. Finally, a text-dependent accent identification method is proposed, which can build robust accent models from a small amount of non-native speech. This is made possible by the generalizability of decision trees and the use of multilingual resources to improve the performance of the accent models.
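As a rough illustration of the acoustic interpolation idea summarised above, the sketch below linearly interpolates the Gaussian means of a target-language acoustic model with those of a source-language model; the dictionary-of-means representation and the single interpolation weight are illustrative assumptions, not the formulation used in the thesis.

```python
import numpy as np

def interpolate_acoustic_models(target_means, source_means, weight=0.7):
    """Linearly interpolate Gaussian mean vectors of two acoustic models.

    target_means, source_means: dict mapping an HMM state (or phone) name
    to a (num_mixtures, feature_dim) array of Gaussian means.
    weight: interpolation weight given to the target-language model.
    States missing from the source model are copied from the target unchanged.
    """
    adapted = {}
    for state, t_mean in target_means.items():
        if state in source_means:
            adapted[state] = weight * t_mean + (1.0 - weight) * source_means[state]
        else:
            adapted[state] = t_mean
    return adapted

# Toy usage: both models share the phone /a/ with 2 mixtures of 3-dim features.
target = {"a": np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])}
source = {"a": np.array([[0.2, 0.1, 0.0], [0.3, 0.3, 0.3]])}
print(interpolate_acoustic_models(target, source, weight=0.6)["a"])
```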
Zhu, Dong. "Modélisation acoustique multilingue pour l'identification automatique de la langue et la transcription de la parole". Paris 11, 2007. http://www.theses.fr/2007PA112132.
Bella, Gábor. "Modélisation de texte numérique multilingue : vers un modèle général et extensible fondé sur le concept de textème". Télécom Bretagne, 2008. http://www.theses.fr/2008TELB0067.
This thesis is concerned with the modelling of electronic text. This modelling involves the definition both of the atomic text elements and of the way these elements join together to form textual structures. In response to the growing need for internationalisation of information systems, historical models of text, based on the concept of code tables, have been extended by semi-formalised knowledge related to the writing system, so that such knowledge is by now essential to even the simplest kind of text processing. Thus were born the Unicode character encoding and the so-called 'intelligent' font formats. Realising that this phenomenon marks only the beginning of a convergence towards models based on the principles of knowledge representation, we propose an alternative approach to text modelling that defines a text element not as a table entry but through the properties that describe the element. The formal framework that we establish, initially developed for the purposes of knowledge representation, provides a method by which precise formal definitions can be given to much-used but ill-defined notions such as character, glyph, or usage. The same framework allows us to define a generalised text element that we call a texteme, the atomic element on which a whole family of new text models is based. The study of these models then leads us to the understanding
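To make the property-based view of text elements more concrete, here is a minimal sketch of a texteme as a set of property-value pairs; the property names and the matching logic are illustrative assumptions, not the formal framework defined in the thesis.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Texteme:
    """A text element defined by its properties rather than by a code-table index."""
    properties: frozenset  # set of (property, value) pairs

    def get(self, prop):
        return dict(self.properties).get(prop)

    def matches(self, **constraints):
        """True if every given property has the required value."""
        props = dict(self.properties)
        return all(props.get(k) == v for k, v in constraints.items())

# Hypothetical properties: the same abstract letter realised with two glyphs.
a_regular = Texteme(frozenset({("char", "a"), ("script", "Latin"), ("glyph", "a.regular")}))
a_italic  = Texteme(frozenset({("char", "a"), ("script", "Latin"), ("glyph", "a.italic")}))

print(a_regular.matches(char="a", script="Latin"))   # True
print(a_regular.get("glyph"), a_italic.get("glyph"))  # a.regular a.italic
```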
Haton, Sébastien. "Analyse et modélisation de la polysémie verbale dans une perspective multilingue : le dictionnaire bilingue vu dans un miroir". Nancy 2, 2006. http://www.theses.fr/2006NAN21016.
Lexical asymmetry and hidden data, i.e. information not directly visible within a single lexical entry, are phenomena common to most bilingual dictionaries. Our purpose is to establish a methodology for highlighting both phenomena by extracting hidden data from the dictionary and re-establishing symmetry between its two parts. To this end, we studied a large number of verbs and integrated them into a single multilingual database. To offset some gaps in the lexicography, we also studied verb occurrences in a literary corpus. The aim is to expand the dictionaries' data without calling them into question. Finally, our database is turned into a "multilexical" graph by an algorithm that binds words from different languages into the same semantic space.
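A minimal sketch of how such a multilexical graph might be represented, with sense nodes linking words across languages; the sample entries and the adjacency-list representation are illustrative assumptions, not the thesis's actual algorithm or data.

```python
from collections import defaultdict

# Multilexical graph: sense nodes connect (language, lemma) pairs.
graph = defaultdict(set)

def link(sense, *words):
    """Attach (language, lemma) pairs to a shared sense node, in both directions."""
    for lang, lemma in words:
        graph[sense].add((lang, lemma))
        graph[(lang, lemma)].add(sense)

# Hypothetical entries for the polysemous French verb "voler".
link("sense:steal", ("fr", "voler"), ("en", "to steal"), ("de", "stehlen"))
link("sense:fly",   ("fr", "voler"), ("en", "to fly"),   ("de", "fliegen"))

def translations(lang, lemma, target_lang):
    """All target-language lemmas reachable from a word through shared senses."""
    return {w for sense in graph[(lang, lemma)]
              for (l, w) in graph[sense] if l == target_lang}

print(translations("fr", "voler", "en"))  # {'to steal', 'to fly'}
```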
Lam-Yee-Mui, Léa-Marie. "Modélisations pour la reconnaissance de la parole à données contraintes". Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASG075.
This thesis explores the development of speech recognition systems under low-resource conditions. Over the last decade, advances with deep neural networks have led to large improvements in the performance of speech-to-text systems. The success of deep learning methods relies on supervised training with very large annotated corpora, typically comprising thousands of hours of recordings with manual transcriptions, and on increasing the number of trainable parameters in the models. However, sufficient training corpora are not always available, due to the lengthy and costly process of data collection and annotation. Our aim is to build systems under low-resource conditions (a few hours) for the transcription of conversational speech. Recent research shows that state-of-the-art hybrid systems with distinct acoustic and linguistic models are more effective than neural end-to-end systems when less than ten hours of annotated speech are available. We therefore adopt hybrid models and investigate multilingual acoustic modeling to pool linguistic resources from multiple sources. For the multilingual models, we first investigate the impact of the amount of training data as well as the similarity between the training and target languages. The multilingual models are evaluated both without adaptation and after fine-tuning via transfer learning on conversational telephone speech data in four languages (Amharic, Assamese, Georgian, and Kurmanji) collected as part of the IARPA Babel program. These languages are linguistically varied and were chosen to cover several language families. Next, we study language adaptive training, in which the acoustic feature vector is augmented with a language embedding when training the multilingual acoustic model. Our multilingual models can be used to decode speech or to extract multilingual features. These features are evaluated both on the Babel corpus and on the South African Soap Operas corpus, composed of code-switched speech. We compare our hybrid models with publicly available multilingual self-supervised pretrained models, trained on large amounts of data from various domains. For every proposed method and for all target languages, we show that hybrid multilingual systems remain competitive and robust under low-resource conditions, while having the advantage of being deployable with low computational resource requirements. Lastly, we show the usefulness of multilingual acoustic modeling for keyword spotting when only a few hours of monolingual data are available.
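The language adaptive training idea mentioned above, in which each acoustic feature vector is augmented with a language embedding, can be sketched roughly as follows; the one-hot embedding and the feature dimensions are illustrative assumptions (a learned embedding would typically be used in practice).

```python
import numpy as np

LANGUAGES = ["amharic", "assamese", "georgian", "kurmanji"]

def language_embedding(lang, dim=len(LANGUAGES)):
    """One-hot language vector; a learned embedding could be substituted."""
    vec = np.zeros(dim, dtype=np.float32)
    vec[LANGUAGES.index(lang)] = 1.0
    return vec

def augment_features(frames, lang):
    """Append the language embedding to every acoustic frame.

    frames: (num_frames, feature_dim) array, e.g. filterbank features.
    Returns an array of shape (num_frames, feature_dim + embedding_dim).
    """
    emb = language_embedding(lang)
    return np.hstack([frames, np.tile(emb, (frames.shape[0], 1))])

# Toy usage: 100 frames of 40-dimensional features tagged as Georgian.
feats = np.random.randn(100, 40).astype(np.float32)
print(augment_features(feats, "georgian").shape)  # (100, 44)
```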
Morin, Emmanuel. "Synergie des approches et des ressources déployées pour le traitement de l'écrit". Habilitation à diriger des recherches, Université de Nantes, 2007. http://tel.archives-ouvertes.fr/tel-00482893.
Denoual, Etienne. "Méthodes en caractères pour le traitement automatique des langues". PhD thesis, Université Joseph Fourier (Grenoble), 2006. http://tel.archives-ouvertes.fr/tel-00107056.
This work promotes the use of methods operating at the level of the written signal: the character, a unit immediately accessible in any digitised language, makes it possible to dispense with word segmentation, a step that is currently unavoidable for languages such as Chinese or Japanese.
First, we transpose and apply at the character level a well-established method for the objective evaluation of machine translation, BLEU.
The encouraging results then allow us to tackle other linguistic data processing tasks: first, grammaticality filtering; then, the characterisation of the similarity and homogeneity of linguistic resources. In all these tasks, character-level processing obtains acceptable results, comparable to those obtained with words.
Third, we address linguistic data production tasks: analogical computation on character strings enables the production of paraphrases as well as machine translation.
This work shows that a complete machine translation system requiring no segmentation can be built, a fortiori for processing languages without an orthographic word separator.
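As an illustration of transposing BLEU to the character level, here is a minimal sketch in which n-grams are counted over characters rather than words; the unsmoothed geometric mean and the choice of n = 4 are simplifying assumptions, not the exact formulation used in the thesis.

```python
import math
from collections import Counter

def char_ngrams(text, n):
    """Character n-grams of a string, with spaces kept as ordinary characters."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def char_bleu(candidate, reference, max_n=4):
    """BLEU-style score computed over character n-grams instead of word n-grams."""
    precisions = []
    for n in range(1, max_n + 1):
        cand, ref = char_ngrams(candidate, n), char_ngrams(reference, n)
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if total == 0 or overlap == 0:
            return 0.0
        precisions.append(overlap / total)
    # Brevity penalty computed on character lengths.
    bp = 1.0 if len(candidate) >= len(reference) else math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

print(char_bleu("la maison bleue", "la maison bleue"))  # 1.0
print(char_bleu("la maison verte", "la maison bleue"))  # partial overlap, score < 1
```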
Schleider, Thomas. "Knowledge Modeling and Multilingual Information Extraction for the Understanding of the Cultural Heritage of Silk". Electronic Thesis or Diss., Sorbonne université, 2022. http://www.theses.fr/2022SORUS280.
Modeling any type of human knowledge is a complex effort that needs to consider all the specificities of its domain, including niche vocabulary. This thesis focuses on such an endeavour for knowledge about European silk object production, which can be considered obscure and therefore endangered. The fact that such Cultural Heritage data is heterogeneous, spread across many museums worldwide, sparse, and multilingual poses particular challenges, for which knowledge graphs have become more and more popular in recent years. Our main goal is not only to investigate knowledge representations, but also to study how such an integration process can be supported by enrichments, such as information reconciliation through ontologies and vocabularies, as well as metadata prediction to fill gaps in the data. We first propose a workflow for the management and integration of data about silk artifacts, and then present different classification approaches, with a special focus on unsupervised and zero-shot methods. Finally, we study ways of making the subsequent exploration of such metadata and images as easy as possible.
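A zero-shot classification step of the kind mentioned above could be prototyped along the following lines; the Hugging Face pipeline, the model choice, and the candidate labels are illustrative assumptions, not the thesis's actual setup.

```python
from transformers import pipeline

# Zero-shot classification: an NLI model scores candidate labels against the text
# without any task-specific training examples.
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Hypothetical museum record describing a silk artifact.
record = "Fragment of woven silk brocade with floral motifs, Lyon, 18th century."
candidate_labels = ["weaving technique", "material", "production place", "time period"]

result = classifier(record, candidate_labels, multi_label=True)
for label, score in zip(result["labels"], result["scores"]):
    print(f"{label}: {score:.2f}")
```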
Grosjean, Julien. "Modélisation, réalisation et évaluation d'un portail multi-terminologique multi-discipline, multi-lingue (3M) dans le cadre de la Plateforme d'Indexation Régionale (PlaIR)". Rouen, 2014. http://www.theses.fr/2014ROUES028.
Pełny tekst źródłaCossu, Jean-Valère. "Analyse de l’image de marque sur le Web 2.0". Thesis, Avignon, 2015. http://www.theses.fr/2015AVIG0207/document.
Analysis of entities' representation on the Web 2.0. Every day, millions of people publish their views on the Web 2.0 (social networks, blogs, etc.). These comments cover subjects as diverse as news, politics, sports results, consumer products, etc. The accumulation and aggregation of these opinions about an entity (be it a product, a company, or a public figure) give birth to the brand image of that entity. In recent years the Internet has become a privileged place for the emergence and dissemination of opinions, putting the Web 2.0 at the forefront of opinion observatories, the latter being a means of accessing what the world's population thinks. The image is understood here as the idea that a person or a group of people has of that entity. This idea bears a priori on a particular subject and is only valid in context, for a given time. This perceived image differs from the one the entity initially wanted to broadcast (e.g. via a communication campaign). Moreover, in reality, several images end up coexisting in parallel on the network, each specific to a community, and all evolve differently over time (imagine how two politicians from opposite sides would be perceived in each camp). Finally, beyond the controversy deliberately provoked by some entities to attract attention (think of staged or shocking declarations), it also happens that the dissemination of an image goes beyond the framework that governed it and sometimes turns against the entity (for example, « marriage for all » became « the demonstration for all »). The views expressed are then so many clues for understanding the logic by which these images are constructed and evolve. The aim is to know what is being talked about and how it is talked about, with the underlying possibility of knowing who is speaking.
In this thesis we propose to use several simple supervised statistical automatic methods to monitor an entity's online reputation based on the textual contents mentioning it. More precisely, we look at the most important contents and their authors (from a reputation manager's point of view). We introduce an optimization process allowing us to enrich the data using simulated relevance feedback (without any human involvement). We also compare content contextualization methods based on information retrieval and automatic summarization. We further propose a reflection and a new approach to model online reputation, and to improve and evaluate reputation monitoring methods, using Partial Least Squares Path Modelling (PLS-PM). In designing the system, we wanted to address the local and global context of the reputation, that is to say the features that can explain the decision and the correlation between topics and reputation. The goal of our work was to propose a different way to combine usual methods and features that may make reputation monitoring systems more accurate than existing ones. We evaluate and compare our systems using state-of-the-art frameworks: Imagiweb and RepLab. The performance of our proposals is comparable to the state of the art. In addition, the fact that we provide reputation models makes our methods even more attractive for reputation managers or scientists from various fields.