A selection of scholarly literature on the topic "Extraction d'entités": current articles, books, dissertations, and other scholarly sources.
Dissertations on the topic "Extraction d'entités":
Stern, Rosa. "Identification automatique d'entités pour l'enrichissement de contenus textuels." PhD thesis, Université Paris-Diderot - Paris VII, 2013. http://tel.archives-ouvertes.fr/tel-00939420.
Taillé, Bruno. "Contextualization and Generalization in Entity and Relation Extraction." Electronic Thesis or Diss., Sorbonne université, 2022. http://www.theses.fr/2022SORUS266.
Since 2018, the transfer of entire pretrained language models, which preserves their contextualization capacities, has enabled unprecedented performance on virtually every Natural Language Processing benchmark. However, even as models reach such impressive scores, their comprehension abilities remain shallow, which reveals the limitations of benchmarks in providing useful insight into the factors behind performance and in accurately measuring understanding. In this thesis, we study how state-of-the-art Entity and Relation Extraction models generalize to facts unseen during training. Traditional benchmarks exhibit substantial lexical overlap between the mentions and relations used for training and those used for evaluation, whereas the main interest of Information Extraction is to extract previously unknown information. We propose studies that separate performance according to mention and relation overlap with the training set, and find that pretrained language models are mainly beneficial for detecting unseen mentions, in particular out of domain. While this makes them suited to real use cases, a gap in performance between seen and unseen mentions remains and hurts generalization to new facts. In particular, even state-of-the-art ERE models rely on a shallow retention heuristic, basing their predictions more on argument surface forms than on context.
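The overlap-based evaluation split described in this abstract can be sketched as follows (a minimal illustration, not the thesis's exact protocol; the mention lists and lowercase matching policy are assumptions):

```python
def split_by_overlap(train_mentions, test_mentions):
    """Partition test mentions into those whose surface form was seen
    in the training set and those that are entirely unseen."""
    seen_surface = {m.lower() for m in train_mentions}
    seen, unseen = [], []
    for m in test_mentions:
        (seen if m.lower() in seen_surface else unseen).append(m)
    return seen, unseen
```

Scores such as F1 can then be reported separately on each partition, separating retention of training facts from genuine generalization.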
Wang, Zhen. "Extraction en langue chinoise d'actions spatiotemporalisées réalisées par des personnes ou des organismes." Thesis, Sorbonne Paris Cité, 2016. http://www.theses.fr/2016INAL0006.
We have developed an automatic analyser and an extraction module for Chinese language processing. The analyser performs automatic Chinese word segmentation based on linguistic rules and dictionaries, part-of-speech tagging based on n-gram statistics, and dependency grammar parsing. The module extracts information about named entities and activities. To achieve these goals, we tackled the following main issues: segmentation and part-of-speech ambiguity, unknown word identification in Chinese text, and attachment ambiguity in parsing. Chinese texts are analysed sentence by sentence. Given a sentence, the analyser begins with typographic processing to identify sequences of Latin characters and numbers. Dictionaries are then used for a preliminary segmentation into words. Linguistic rules, which take word context into account, create proper noun hypotheses and adjust the weight of some word categories. An n-gram language model, trained on a corpus, selects the best word segmentation and parts of speech. Dependency grammar parsing annotates relations between words. A first step of named entity recognition is performed after parsing; its goal is to identify single-word and noun-phrase-based named entities and to determine their semantic type. These named entities are then used in knowledge extraction, whose rules also validate named entities or change their types. Knowledge extraction consists of two steps: automatic content extraction and tagging from the analysed text, followed by control of the extracted contents and ontology-based co-reference resolution.
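The step of selecting the most probable segmentation can be illustrated with a simplified unigram version (the thesis uses richer n-gram statistics and rules; the toy dictionary below is invented for illustration):

```python
import math

def segment(sentence, word_probs):
    """Dictionary-based Chinese word segmentation choosing the most
    probable word sequence under a unigram model, via dynamic
    programming over split points. Returns [] if no full segmentation
    exists with the given dictionary."""
    n = len(sentence)
    best = [(-math.inf, [])] * (n + 1)  # best (log-prob, words) ending at each index
    best[0] = (0.0, [])
    for i in range(n):
        score_i, words_i = best[i]
        if score_i == -math.inf:
            continue  # index i is unreachable
        for j in range(i + 1, n + 1):
            w = sentence[i:j]
            if w in word_probs:
                s = score_i + math.log(word_probs[w])
                if s > best[j][0]:
                    best[j] = (s, words_i + [w])
    return best[n][1]
```

For the classic ambiguous string 研究生命, the model prefers 研究 + 生命 ("research" + "life") over 研究生 + 命 ("graduate student" + "fate") when the former's combined probability is higher.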
Ramdani, Halima. "Un système intelligent pour l'optimisation du processus de e-recrutement." Electronic Thesis or Diss., Université de Lorraine, 2021. http://www.theses.fr/2021LORR0366.
Decision support systems are commonly used to solve selection and decision-making problems in a variety of domains. As digital and computer systems evolve, decision-making environments become less familiar to decision-makers, resulting in (1) decisions made under uncertainty and influenced by external factors, and (2) hybrid decision-making contexts. This thesis proposes a generic decision support system for problems with the following characteristics: (1) the environment is uncertain and changes over time; (2) the decision-makers' objectives are multiple; and (3) the decision-making context is written in natural language. The system consists of several components. The first extracts and identifies information from a context written in natural language in order to classify it; this relies on our first contribution, DEEP, an entity extraction methodology based on the organizational patterns of a natural-language text. The second component creates semantically comparable groups of texts in order to fill data gaps for under-represented contexts; this is our second contribution, a matching method based on the type of information contained in two natural-language texts. Its results are used to aggregate temporal data related to semantically close decision contexts in order to forecast decision-maker choice factors. Given the dynamicity and uncertainty of the environment, a hybrid architecture of convolutional and recurrent neural networks was chosen to capture trends and correlations between items. Finally, these decision factors feed a multi-objective, multi-period optimization that provides the decision-maker with the best set of options given his or her goals and constraints.
The proposed decision support system is applied to the e-recruitment domain to assist the recruiter (decision-maker) in selecting (decision) the most appropriate (multi-objective optimization) channels (items) for a job offer (decision context). To evaluate it, we compared the results of a recruitment campaign implemented by a campaign manager with the results obtained when the decision support system recommended the channels. According to our experiments, the system saves the recruiter time on (1) preparing job-posting data using the DEEP contribution, (2) analysing historical data, (3) analysing current data, and (4) making decisions using the system's recommendations. The time forecasting and reinforcement component, based on continuous data rectification, saves money during periods when the recruiter's goals are not met, so the approach also reduces costs.
Caubriere, Antoine. "Du signal au concept : réseaux de neurones profonds appliqués à la compréhension de la parole." Thesis, Le Mans, 2021. https://tel.archives-ouvertes.fr/tel-03177996.
This thesis falls within the field of deep learning applied to spoken language understanding. Until now, this task was performed by a pipeline of components implementing, for example, a speech recognition system, then various natural language processing modules, before a language understanding system is applied to the enriched automatic transcriptions. Recently, work in speech recognition has shown that it is possible to produce a sequence of words directly from the acoustic signal. In this thesis, we exploit and extend these advances to design a system composed of a single neural model, fully optimized for the spoken language understanding task, from signal to concept. First, we present a state of the art describing the principles of deep learning, speech recognition, and speech understanding. We then describe contributions along three main axes. We propose a first system addressing this problem and apply it to a named entity recognition task. Next, we propose a transfer learning strategy guided by a curriculum learning approach; this strategy builds on generic learned knowledge to improve the performance of a neural system on a semantic concept extraction task. We then analyse the errors produced by our approach while studying the behaviour of the proposed neural architecture. Finally, we design a confidence measure to evaluate the reliability of a hypothesis produced by our system.
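As a simple illustration of a hypothesis-level confidence measure (not the specific measure developed in the thesis), one can average the per-token posterior probabilities of a decoded sequence:

```python
import math

def hypothesis_confidence(token_log_probs):
    """Mean posterior probability over the tokens of a decoded
    hypothesis; values near 1.0 indicate a hypothesis the model is
    confident about, values near 0.0 a doubtful one."""
    if not token_log_probs:
        return 0.0
    return sum(math.exp(lp) for lp in token_log_probs) / len(token_log_probs)
```

Such a score can be thresholded to decide whether an extracted concept is reliable enough to keep or should be flagged for review.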
Bravo, Serrano Àlex 1984. "BeFree : a text mining system for the extraction of biomedical information from literature." Doctoral thesis, Universitat Pompeu Fabra, 2016. http://hdl.handle.net/10803/398300.
Today, biomedical research must leverage the vast amount of information contained in scientific publications. Automatic text processing, commonly known as text mining, is an essential tool for identifying, extracting, organizing, and analyzing the most relevant biomedical information in the literature. This thesis presents BeFree, a text mining system for extracting biomedical information to support research on the genetic basis of diseases and on drug toxicity. BeFree can identify genes and diseases in a large repository of biomedical text. Moreover, using the linguistic information contained in the text, BeFree can detect relations between genes, diseases, and drugs with results comparable to the state of the art. As a result, BeFree has been used in several biomedical applications to provide structured biomedical information for the development of resources such as databases and corpora. These resources are available to the scientific community for the development of new text mining tools.