Academic literature on the topic 'Named Entity Classification'

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Named Entity Classification.'

Journal articles on the topic "Named Entity Classification"

1

Nadeau, David, and Satoshi Sekine. "A survey of named entity recognition and classification." Lingvisticæ Investigationes. International Journal of Linguistics and Language Resources 30, no. 1 (August 10, 2007): 3–26. http://dx.doi.org/10.1075/li.30.1.03nad.

Abstract:
This survey covers fifteen years of research in the Named Entity Recognition and Classification (NERC) field, from 1991 to 2006. We report observations about languages, named entity types, domains and textual genres studied in the literature. From the start, NERC systems have been developed using hand-made rules, but now machine learning techniques are widely used. These techniques are surveyed along with other critical aspects of NERC such as features and evaluation methods. Features are word-level, dictionary-level and corpus-level representations of words in a document. Evaluation techniques, ranging from intuitive exact match to very complex matching techniques with adjustable cost of errors, are an indisputable key to progress.
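The "intuitive exact match" evaluation mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the survey's own scoring code: it assumes an entity is represented as a hypothetical `(start, end, type)` tuple and counts a prediction as correct only when both boundaries and the type agree with the gold annotation.

```python
from typing import Set, Tuple

# Hypothetical representation: (start_token, end_token_exclusive, entity_type).
Entity = Tuple[int, int, str]

def exact_match_scores(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match scoring."""
    # A prediction is a true positive only if span and type both match.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy data: one fully correct entity, one with the wrong type, one with a wrong boundary.
gold = {(0, 2, "PER"), (5, 6, "LOC"), (9, 11, "ORG")}
pred = {(0, 2, "PER"), (5, 6, "ORG"), (9, 10, "ORG")}
print(exact_match_scores(gold, pred))
```

Under this scheme a type error and a boundary error are penalised identically, which is one reason the more flexible matching schemes the abstract alludes to exist.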
2

Ahmad, Muhammad Tayyab, Muhammad Kamran Malik, Khurram Shahzad, Faisal Aslam, Asif Iqbal, Zubair Nawaz, and Faisal Bukhari. "Named Entity Recognition and Classification for Punjabi Shahmukhi." ACM Transactions on Asian and Low-Resource Language Information Processing 19, no. 4 (July 7, 2020): 1–13. http://dx.doi.org/10.1145/3383306.

3

Steinberger, Ralf, and Bruno Pouliquen. "Cross-lingual Named Entity Recognition." Lingvisticæ Investigationes. International Journal of Linguistics and Language Resources 30, no. 1 (August 10, 2007): 135–62. http://dx.doi.org/10.1075/li.30.1.09ste.

Abstract:
Named Entity Recognition and Classification (NERC) is a known and well-explored text analysis application that has been applied to various languages. We are presenting an automatic, highly multilingual news analysis system that fully integrates NERC for locations, persons and organisations with document clustering, multi-label categorisation, name attribute extraction, name variant merging and the calculation of social networks. The proposed application goes beyond the state-of-the-art by automatically merging the information found in news written in ten different languages, and by using the aggregated name information to automatically link related news documents across languages for all 45 language pair combinations. While state-of-the-art approaches for cross-lingual name variant merging and document similarity calculation require bilingual resources, the methods proposed here are mostly language-independent and require a minimal amount of monolingual language-specific effort. The development of resources for additional languages is therefore kept to a minimum and new languages can be plugged into the system effortlessly. The presented online news analysis application is fully functional and has, at the end of the year 2006, reached average usage statistics of 600,000 hits per day.
4

Ekbal, Asif, Sriparna Saha, and Utpal Kumar Sikdar. "Multiobjective Optimization for Biomedical Named Entity Recognition and Classification." Procedia Technology 6 (2012): 206–13. http://dx.doi.org/10.1016/j.protcy.2012.10.025.

5

Shaalan, Khaled. "A Survey of Arabic Named Entity Recognition and Classification." Computational Linguistics 40, no. 2 (June 2014): 469–510. http://dx.doi.org/10.1162/coli_a_00178.

Abstract:
As more and more Arabic textual information becomes available through the Web in homes and businesses, via Internet and Intranet services, there is an urgent need for technologies and tools to process the relevant information. Named Entity Recognition (NER) is an Information Extraction task that has become an integral part of many other Natural Language Processing (NLP) tasks, such as Machine Translation and Information Retrieval. Arabic NER has begun to receive attention in recent years. The characteristics and peculiarities of Arabic, a member of the Semitic languages family, make dealing with NER a challenge. The performance of an Arabic NER component affects the overall performance of the NLP system in a positive manner. This article attempts to describe and detail the recent increase in interest and progress made in Arabic NER research. The importance of the NER task is demonstrated, the main characteristics of the Arabic language are highlighted, and the aspects of standardization in annotating named entities are illustrated. Moreover, the different Arabic linguistic resources are presented and the approaches used in Arabic NER field are explained. The features of common tools used in Arabic NER are described, and standard evaluation metrics are illustrated. In addition, a review of the state of the art of Arabic NER research is discussed. Finally, we present our conclusions. Throughout the presentation, illustrative examples are used for clarification.
6

Asbayou, Omar. "Automatic Arabic Named Entity Extraction and Classification for Information Retrieval." International Journal on Natural Language Computing 9, no. 6 (December 30, 2020): 1–22. http://dx.doi.org/10.5121/ijnlc.2020.9601.

Abstract:
This article explains our rule-based Arabic named entity recognition (NER) and classification system. It is based on lists of classified proper names (PN) and particularly on syntactico-semantic patterns resulting in fine classification of Arabic NE. These patterns use syntactico-semantic combination of morpho-syntactic and syntactic entities. The system also uses lexical classification of trigger words and NE extensions. These linguistic data are essential not only to named entity extraction but also to the taxonomic classification and to determining the NE boundaries. Our method is also based on contextualisation and on the notion of NE class attributes and values. Inspired by X-bar theory and immediate constituents, we built a rule-based NER system composed of five levels of syntactico-semantic combination. We also show how the fine NE annotations in our system output (XML database) are exploited in information retrieval and information extraction.
7

Tan, Chuanqi, Wei Qiu, Mosha Chen, Rui Wang, and Fei Huang. "Boundary Enhanced Neural Span Classification for Nested Named Entity Recognition." Proceedings of the AAAI Conference on Artificial Intelligence 34, no. 05 (April 3, 2020): 9016–23. http://dx.doi.org/10.1609/aaai.v34i05.6434.

Abstract:
Named entity recognition (NER) is a well-studied task in natural language processing. However, the widely-used sequence labeling framework usually has difficulty detecting entities with nested structures. The span-based method that can easily detect nested entities in different subsequences is naturally suitable for the nested NER problem. However, previous span-based methods have two main issues. First, classifying all subsequences is computationally expensive and very inefficient at inference. Second, the span-based methods mainly focus on learning span representations but lack explicit boundary supervision. To tackle the above two issues, we propose a boundary enhanced neural span classification model. In addition to classifying the span, we propose incorporating an additional boundary detection task to predict those words that are boundaries of entities. The two tasks are jointly trained under a multitask learning framework, which enhances the span representation with additional boundary supervision. In addition, the boundary detection model has the ability to generate high-quality candidate spans, which greatly reduces the time complexity during inference. Experiments show that our approach outperforms all existing methods and achieves 85.3, 83.9, and 78.3 scores in terms of F1 on the ACE2004, ACE2005, and GENIA datasets, respectively.
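The efficiency argument in this abstract can be made concrete with a small sketch. It is not the authors' model: the function names are hypothetical, and the predicted start and end positions are hand-written stand-ins for the output of a boundary detector. The sketch only contrasts exhaustive span enumeration with candidate generation restricted to predicted boundaries.

```python
from typing import List, Tuple

def all_spans(n_tokens: int, max_len: int) -> List[Tuple[int, int]]:
    """Enumerate every (start, end) span, end inclusive, of at most max_len tokens."""
    return [(s, e)
            for s in range(n_tokens)
            for e in range(s, min(s + max_len, n_tokens))]

def boundary_spans(starts: List[int], ends: List[int], max_len: int) -> List[Tuple[int, int]]:
    """Keep only spans that begin at a predicted start and finish at a predicted end."""
    return [(s, e) for s in starts for e in ends if s <= e < s + max_len]

tokens = ["The", "University", "of", "Edinburgh", "press", "office"]

# Exhaustive enumeration: every one of these spans would need to be classified.
print(len(all_spans(len(tokens), max_len=4)))

# Assumed boundary predictions: entities may start at tokens 1 or 3 and end at token 3.
print(boundary_spans(starts=[1, 3], ends=[3], max_len=4))
```

With these assumed boundaries only two candidates remain, and they are nested: "University of Edinburgh" and "Edinburgh". A span classifier would then assign each candidate a type or reject it.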
8

Choi, Yunsu, and Jeongwon Cha. "Korean Named Entity Recognition and Classification using Word Embedding Features." Journal of KIISE 43, no. 6 (June 15, 2016): 678–85. http://dx.doi.org/10.5626/jok.2016.43.6.678.

9

Goyal, Archana, Vishal Gupta, and Manish Kumar. "Recent Named Entity Recognition and Classification techniques: A systematic review." Computer Science Review 29 (August 2018): 21–43. http://dx.doi.org/10.1016/j.cosrev.2018.06.001.

10

Marchenko, O. O. "Machine-learning methods for text named entity recognition." Problems in Programming, no. 2-3 (June 2016): 150–57. http://dx.doi.org/10.15407/pp2016.02-03.150.

Abstract:
The article describes machine learning methods for named entity recognition. To build named entity classifiers, two basic machine learning models, Naïve Bayes and Conditional Random Fields, were used. A model for multi-class classification of named entities using Error Correcting Output Codes was also studied. The paper describes a method for training the classifiers and the results of test experiments. Conditional Random Fields outperform the other models in precision and recall.
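As a rough illustration of the Error Correcting Output Codes idea named in this abstract, the sketch below decodes the outputs of several binary classifiers into one entity class. The code matrix, the class labels, and the `decode` helper are assumptions chosen for the example and are not taken from the article.

```python
from typing import Dict, List

# Hypothetical code matrix: one 7-bit codeword per class. Each bit position
# corresponds to one binary classifier. Any two codewords differ in 4 bits,
# so a single wrong binary decision can still be corrected.
CODEBOOK: Dict[str, List[int]] = {
    "PER":  [1, 1, 1, 1, 1, 1, 1],
    "LOC":  [0, 0, 0, 0, 1, 1, 1],
    "ORG":  [0, 0, 1, 1, 0, 0, 1],
    "MISC": [0, 1, 0, 1, 0, 1, 0],
}

def hamming(a: List[int], b: List[int]) -> int:
    """Number of positions at which two bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def decode(bits: List[int]) -> str:
    """Return the class whose codeword is closest to the classifier outputs."""
    return min(CODEBOOK, key=lambda label: hamming(CODEBOOK[label], bits))

# The first binary classifier errs (1 instead of 0), yet decoding recovers LOC.
print(decode([1, 0, 0, 0, 1, 1, 1]))
```

The redundancy in the codewords is what gives the scheme its name: the final decision tolerates a limited number of mistakes by individual binary classifiers.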

Dissertations / Theses on the topic "Named Entity Classification"

1

Alasiry, Areej Mohammed. "Named entity recognition and classification in search queries." Thesis, Birkbeck (University of London), 2015. http://bbktheses.da.ulcc.ac.uk/154/.

Abstract:
Named Entity Recognition and Classification is the task of extracting from text, instances of different entity classes such as person, location, or company. This task has recently been applied to web search queries in order to better understand their semantics, where a search query consists of linguistic units that users submit to a search engine to convey their search need. Discovering and analysing the linguistic units comprising a search query enables search engines to reveal and meet users' search intents. As a result, recent research has concentrated on analysing the constituent units comprising search queries. However, since search queries are short, unstructured, and ambiguous, an approach to detect and classify named entities is presented in this thesis, in which queries are augmented with the text snippets of search results for search queries. The thesis makes the following contributions: 1. A novel method for detecting candidate named entities in search queries, which utilises both query grammatical annotation and query segmentation. 2. A novel method to classify the detected candidate entities into a set of target entity classes, by using a seed expansion approach; the method presented exploits the representation of the sets of contextual clues surrounding the entities in the snippets as vectors in a common vector space. 3. An exploratory analysis of three main categories of search refiners: nouns, verbs, and adjectives, that users often incorporate in entity-centric queries in order to further refine the entity-related search results. 4. A taxonomy of named entities derived from a search engine query log. By using a large commercial query log, experimental evidence is provided that the work presented herein is competitive with the existing research in the field of entity recognition and classification in search queries.
2

Rosvall, Erik. "Comparison of sequence classification techniques with BERT for named entity recognition." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-261419.

Abstract:
This thesis takes its starting point from the recent advances in Natural Language Processing built upon the Transformer model. One of the significant recent developments was the release of a deep bidirectional encoder called BERT that broke several state of the art results at its release. BERT utilises Transfer Learning to improve modelling language dependencies in texts. BERT is used for several different Natural Language Processing tasks; this thesis looks at Named Entity Recognition, sometimes referred to as sequence classification. This thesis compares the model architecture as it was presented in its original paper with a different classifier in the form of a Conditional Random Field. BERT was evaluated on the CoNLL-03 dataset, based on English news articles published by Reuters. The Conditional Random Field classifier overall outperforms the original Feed Forward classifier on the F1-score metric by a small margin of approximately 0.25 percentage points. While the thesis fails to reproduce the original report’s results, it compares the two model architectures across the hyperparameters proposed for fine-tuning. Conditional Random Fields achieve better scores at most hyperparameter combinations and are less sensitive to which parameters are chosen, creating an incentive for their use by reducing the effect of parameter search compared to a Feed Forward layer as the classifier. Comparing the two models shows a lower variance in the results for Conditional Random Fields.
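One way to see the difference between the two classifiers compared in this thesis is at decoding time. The sketch below is not the thesis code: the scores are invented, training is not shown, and the function names are hypothetical. It contrasts independent per-token decisions, as a feed-forward layer would make, with Viterbi decoding over a linear-chain model that also scores label transitions, as a CRF does.

```python
from typing import Dict, List, Tuple

def greedy(emissions: List[Dict[str, float]], labels: List[str]) -> List[str]:
    """Pick the best label for each token independently."""
    return [max(labels, key=lambda y: e[y]) for e in emissions]

def viterbi(emissions: List[Dict[str, float]],
            transitions: Dict[Tuple[str, str], float],
            labels: List[str]) -> List[str]:
    """Find the label sequence with the highest total emission + transition score."""
    # best[t][y] is the best score of any path that ends in label y at position t.
    best = [{y: emissions[0][y] for y in labels}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions.get((p, y), 0.0))
            scores[y] = best[-1][prev] + transitions.get((prev, y), 0.0) + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

labels = ["O", "B-PER", "I-PER"]
# Invented per-token scores for a two-token input.
emissions = [{"O": 1.0, "B-PER": 0.8, "I-PER": 0.0},
             {"O": 0.2, "B-PER": 0.0, "I-PER": 1.0}]
# Assumed transition score that penalises the invalid sequence O -> I-PER.
transitions = {("O", "I-PER"): -10.0}

print(greedy(emissions, labels))
print(viterbi(emissions, transitions, labels))
```

The independent decoder can emit an `I-PER` tag directly after `O`, which is not a valid BIO sequence, while the transition-aware decoder revises the first tag to `B-PER`. This kind of structural constraint is one plausible source of the lower variance the thesis reports for the CRF.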
3

Kliegr, Tomáš. "Unsupervised Entity Classification with Wikipedia and WordNet." Doctoral thesis, Vysoká škola ekonomická v Praze, 2007. http://www.nusl.cz/ntk/nusl-126861.

Abstract:
This dissertation addresses the problem of classification of entities in text represented by noun phrases. The goal of this thesis is to develop a method for automated classification of entities appearing in datasets consisting of short textual fragments. The emphasis is on unsupervised and semi-supervised methods that will allow for fine-grained character of the assigned classes and require no labeled instances for training. The set of target classes is either user-defined or determined automatically. Our initial attempt to address the entity classification problem is called Semantic Concept Mapping (SCM) algorithm. SCM maps the noun phrases representing the entities as well as the target classes to WordNet. Graph-based WordNet similarity measures are used to assign the closest class to the noun phrase. If a noun phrase does not match any WordNet concept, a Targeted Hypernym Discovery (THD) algorithm is executed. The THD algorithm extracts a hypernym from a Wikipedia article defining the noun phrase using lexico-syntactic patterns. This hypernym is then used to map the noun phrase to a WordNet synset, but it can also be perceived as the classification result by itself, resulting in an unsupervised classification system. SCM and THD algorithms were designed for English. While adaptation of these algorithms for other languages is conceivable, we decided to develop the Bag of Articles (BOA) algorithm, which is language agnostic as it is based on the statistical Rocchio classifier. Since this algorithm utilizes Wikipedia as a source of data for classification, it does not require any labeled training instances. WordNet is used in a novel way to compute term weights. It is also used as a positive term list and for lemmatization. A disambiguation algorithm utilizing global context is also proposed. We consider the BOA algorithm to be the main contribution of this dissertation. 
Experimental evaluation of the proposed algorithms is performed on the WordSim353 dataset, which is used for evaluation in the Word Similarity Computation (WSC) task, and on the Czech Traveler dataset, the latter being specifically designed for the purpose of our research. BOA performance on WordSim353 achieves Spearman correlation of 0.72 with human judgment, which is close to the 0.75 correlation for the ESA algorithm, to the author's knowledge the best performing algorithm for this gold-standard dataset, which does not require training data. The advantage of BOA over ESA is that it has smaller requirements on preprocessing of the Wikipedia data. While SCM underperforms on the WordSim353 dataset, it overtakes BOA on the Czech Traveler dataset, which was designed specifically for our entity classification problem. This discrepancy requires further investigation. In a standalone evaluation of THD on Czech Traveler dataset the algorithm returned a correct hypernym for 62% of entities.
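The Rocchio classifier underlying the BOA algorithm can be sketched in its simplest centroid form. This is a simplified illustration, not the dissertation's implementation: it uses raw term frequencies instead of the WordNet-based term weights described above, omits negative examples, and the short "articles" and class names are invented stand-ins for Wikipedia text.

```python
import math
from collections import Counter
from typing import Dict, List

def centroid(docs: List[str]) -> Counter:
    """Average the term-frequency vectors of the documents describing one class."""
    total = Counter()
    for doc in docs:
        total.update(doc.lower().split())
    return Counter({term: count / len(docs) for term, count in total.items()})

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def classify(text: str, centroids: Dict[str, Counter]) -> str:
    """Assign the class whose centroid is most similar to the text."""
    vec = Counter(text.lower().split())
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

# Invented toy "articles" per target class.
centroids = {
    "hotel": centroid(["hotel with rooms and breakfast", "rooms booked at the hotel"]),
    "river": centroid(["river flows through the valley", "bridge over the river water"]),
}
print(classify("cheap rooms near the station", centroids))
```

Because the class vectors are built from descriptive text rather than from labeled entity mentions, no annotated training instances are needed, which matches the motivation given in the abstract.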
4

Volkova, Svitlana. "Entity extraction, animal disease-related event recognition and classification from web." Thesis, Kansas State University, 2010. http://hdl.handle.net/2097/4593.

Abstract:
Master of Science
Department of Computing and Information Sciences
William H. Hsu
Global epidemic surveillance is an essential task for national biosecurity management and bioterrorism prevention. The main goal is to protect the public from major health threats. To perform this task effectively one requires reliable, timely and accurate medical information from a wide range of sources. Towards this goal, we present a framework for epidemiological analytics that can be used to extract and visualize infectious disease outbreaks from a variety of unstructured web sources automatically. More precisely, in this thesis, we consider several research tasks including document relevance classification, entity extraction and animal disease-related event recognition in the veterinary epidemiology domain. First, we crawl web sources and classify collected documents by topical relevance using supervised learning algorithms. Next, we propose a novel approach for automated ontology construction in the veterinary medicine domain. Our approach is based on semantic relationship discovery using syntactic patterns. We then apply our automatically-constructed ontology for the domain-specific entity extraction task. Moreover, we compare our ontology-based entity extraction results with an alternative sequence labeling approach. We introduce a sequence labeling method for the entity tagging that relies on syntactic feature extraction using a sliding window. Finally, we present our novel sentence-based event recognition approach that includes three main steps: entity extraction of animal diseases, species, locations, dates and the confirmation status n-grams; event-related sentence classification into two categories - suspected or confirmed; automated event tuple generation and aggregation. We show that our document relevance classification results as well as entity extraction and disease-related event recognition results are significantly better compared to the results reported by other animal disease surveillance systems.
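The sliding-window feature extraction mentioned in this abstract can be sketched as below. The feature names, the window size, and the use of plain word forms are assumptions made for illustration; the thesis describes syntactic features, which would require a tagger or parser not shown here.

```python
from typing import Dict, List, Union

def window_features(tokens: List[str], index: int, size: int = 2) -> Dict[str, Union[str, bool]]:
    """Collect the words around position `index` as features for a sequence labeler."""
    feats: Dict[str, Union[str, bool]] = {}
    for offset in range(-size, size + 1):
        pos = index + offset
        # Positions outside the sentence get a padding symbol.
        feats[f"word[{offset:+d}]"] = tokens[pos].lower() if 0 <= pos < len(tokens) else "<PAD>"
    # Capitalisation of the focus token, a common cue for names.
    feats["is_capitalized"] = tokens[index][0].isupper()
    return feats

tokens = ["Anthrax", "confirmed", "in", "Kansas", "cattle"]
print(window_features(tokens, index=3, size=1))
```

Each token's feature dictionary would then be passed to a sequence labeling model, which assigns tags such as disease, species, or location.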
5

Yosef, Mohamed Amir. "U-AIDA: a customizable system for named entity recognition, classification, and disambiguation." Supervisor: Gerhard Weikum. Saarbrücken: Saarländische Universitäts- und Landesbibliothek, 2016. http://d-nb.info/1083894722/34.

6

Mendes, Pablo N. "Adaptive Semantic Annotation of Entity and Concept Mentions in Text." Wright State University / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=wright1401665504.

7

Sidås, Albin, and Simon Sandberg. "Conversational Engine for Transportation Systems." Thesis, Linköpings universitet, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-176810.

Abstract:
Today's communication between operators and professional drivers takes place through direct conversations between the parties. This thesis project explores the possibility of supporting the operators in classifying the topic of incoming communications and identifying which entities are affected, through the use of named entity recognition and topic classification. By developing a synthetic training dataset, a NER model and a topic classification model were developed and evaluated, achieving F1-scores of 71.4 and 61.8 respectively. These results were explained by a low variance in the synthetic dataset in comparison to a transcribed dataset from the real world, which included anomalies not represented in the synthetic dataset. The aforementioned models were integrated into the dialogue framework Emora to seamlessly handle the back-and-forth communication and generate responses.
8

Urbansky, David. "Automatic Extraction and Assessment of Entities from the Web." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2012. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-97469.

Abstract:
The search for information about entities, such as people or movies, plays an increasingly important role on the Web. This information is still scattered across many Web pages, making it more time consuming for a user to find all relevant information about an entity. This thesis describes techniques to extract entities and information about these entities from the Web, such as facts, opinions, questions and answers, interactive multimedia objects, and events. The findings of this thesis are that it is possible to create a large knowledge base automatically using a manually-crafted ontology. The precision of the extracted information was found to be between 75–90 % (facts and entities respectively) after using assessment algorithms. The algorithms from this thesis can be used to create such a knowledge base, which can be used in various research fields, such as question answering, named entity recognition, and information retrieval.
9

Liaghat, Zeinab. "Quality-efficiency trade-offs in machine learning applied to text processing." Doctoral thesis, Universitat Pompeu Fabra, 2017. http://hdl.handle.net/10803/402575.

Abstract:
Nowadays, the amount of available digital documents is rapidly growing, expanding at a considerable rate and coming from a variety of sources. Sources of unstructured and semi-structured information include the World Wide Web, news articles, biological databases, electronic mail, digital libraries, governmental digital repositories, chat rooms, online forums, blogs, and social media such as Facebook, Instagram, LinkedIn, Pinterest, Twitter, YouTube, plus many others. Extracting information from these resources and finding useful information from such collections has become a challenge, which makes organizing massive amounts of data a necessity. Data mining, machine learning, and natural language processing are powerful techniques that can be used together to deal with this big challenge. Depending on the task or problem at hand, there are many different approaches that can be used. The methods that are being implemented are continuously being optimized, but not all these methods have been tested and compared for quality after training on large size corpora for supervised machine learning algorithms. The question is what happens to the quality of methods if we increase the data size from, say, 100 MB to over 1 GB? Moreover, are quality gains worth it when the rate of data processing diminishes? Can we trade quality for time efficiency and recover the quality loss by just being able to process more data? This thesis is a first attempt to answer these questions in a general way for text processing tasks, as not enough research has been done to compare those methods considering the trade-offs of data size, quality, and processing time. Hence, we propose a trade-off analysis framework and apply it to three important text processing problems: Named Entity Recognition, Sentiment Analysis, and Document Classification. These problems were also chosen because they have different levels of object granularity: words, passages, and documents.
For each problem, we select several machine learning algorithms and we evaluate the trade-offs of these different methods on large publicly available datasets (news, reviews, patents). We use different data subsets of increasing size ranging from 50 MB to a few GB, to explore these trade-offs. We conclude, as hypothesized, that just because the method has good performance in small data, it does not necessarily have the same performance for big data. For the two last problems, we consider similar algorithms and also consider two different data sets and two different evaluation techniques, to study the impact of the data and the evaluation technique on the resulting trade-offs. We find that the results do not change significantly.
Avui en dia, la quantitat de documents digitals disponibles està creixent ràpidament, expandint- se a un ritme considerable i procedint de diverses fonts. Les fonts d’informació no estructurada i semiestructurada inclouen la World Wide Web, articles de notícies, bases de dades biològiques, correus electrònics, biblioteques digitals, repositoris electrònics governamentals, , sales de xat, forums en línia, blogs i mitjans socials com Facebook, Instagram, LinkedIn, Pinterest, Twitter, YouTube i molts d’altres. Extreure’n informació d’aquests recursos i trobar informació útil d’aquestes col.leccions s’ha convertit en un desafiament que fa que l’organització d’aquesta enorme quantitat de dades esdevingui una necessitat. La mineria de dades, l’aprenentatge automàtic i el processament del llenguatge natural són tècniques poderoses que poden utilitzar-se conjuntament per fer front a aquest gran desafiament. Segons la tasca o el problema en qüestió existeixen molts emfo- caments diferents que es poden utilitzar. Els mètodes que s’estan implementant s’optimitzen continuament, però aquests mètodes d’aprenentatge automàtic supervisats han estat provats i comparats amb grans dades d’entrenament. La pregunta és : Què passa amb la qualitat dels mètodes si incrementem les dades de 100 MB a 1 GB? Més encara: Les millores en la qualitat valen la pena quan la taxa de processament de les dades minva? Podem canviar qualitat per eficiència, tot recuperant la perdua de qualitat quan processem més dades? Aquesta tesi és una primera aproximació per resoldre aquestes preguntes de forma gene- ral per a tasques de processament de text, ja que no hi ha hagut suficient investigació per a comparar aquests mètodes considerant el balanç entre el tamany de les dades, la qualitat dels resultats i el temps de processament. 
Per tant, proposem un marc per analitzar aquest balanç i l’apliquem a tres problemes importants de processament de text: Reconeixement d’Entitats Anomenades, Anàlisi de Sentiments i Classificació de Documents. Aquests problemes tam- bé han estat seleccionats perquè tenen nivells diferents de granularitat: paraules, opinions i documents complerts. Per a cada problema seleccionem diferents algoritmes d’aprenentatge automàtic i avaluem el balanç entre aquestes variables per als diferents algoritmes en grans conjunts de dades públiques ( notícies, opinions, patents). Utilitzem subconjunts de diferents tamanys entre 50 MB i alguns GB per a explorar aquests balanç. Per acabar, com havíem suposat, no perquè un algoritme és eficient en poques dades serà eficient en grans quantitats de dades. Per als dos últims problemes considerem algoritmes similars i també dos conjunts diferents de dades i tècniques d’avaluació per a estudiar l’impacte d’aquests dos paràmetres en els resultats. Mostrem que els resultats no canvien significativament amb aquests canvis.
Hoy en día, la cantidad de documentos digitales disponibles está creciendo rápidamente, ex- pandiéndose a un ritmo considerable y procediendo de una variedad de fuentes. Estas fuentes de información no estructurada y semi estructurada incluyen la World Wide Web, artículos de noticias, bases de datos biológicos, correos electrónicos, bibliotecas digitales, repositorios electrónicos gubernamentales, salas de chat, foros en línea, blogs y medios sociales como Fa- cebook, Instagram, LinkedIn, Pinterest, Twitter, YouTube, además de muchos otros. Extraer información de estos recursos y encontrar información útil de tales colecciones se ha convertido en un desafío que hace que la organización de esa enorme cantidad de datos sea una necesidad. La minería de datos, el aprendizaje automático y el procesamiento del lenguaje natural son técnicas poderosas que pueden utilizarse conjuntamente para hacer frente a este gran desafío. Dependiendo de la tarea o el problema en cuestión, hay muchos enfoques dife- rentes que se pueden utilizar. Los métodos que se están implementando se están optimizando continuamente, pero estos métodos de aprendizaje automático supervisados han sido probados y comparados con datos de entrenamiento grandes. La pregunta es ¿Qué pasa con la calidad de los métodos si incrementamos los datos de 100 MB a 1GB? Más aún, ¿las mejoras en la cali- dad valen la pena cuando la tasa de procesamiento de los datos disminuye? ¿Podemos cambiar calidad por eficiencia, recuperando la perdida de calidad cuando procesamos más datos? Esta tesis es una primera aproximación para resolver estas preguntas de forma general para tareas de procesamiento de texto, ya que no ha habido investigación suficiente para comparar estos métodos considerando el balance entre el tamaño de los datos, la calidad de los resultados y el tiempo de procesamiento. 
Por lo tanto, proponemos un marco para analizar este balance y lo aplicamos a tres importantes problemas de procesamiento de texto: Reconocimiento de En- tidades Nombradas, Análisis de Sentimientos y Clasificación de Documentos. Estos problemas fueron seleccionados también porque tienen distintos niveles de granularidad: palabras, opinio- nes y documentos completos. Para cada problema seleccionamos distintos algoritmos de apren- dizaje automático y evaluamos el balance entre estas variables para los distintos algoritmos en grandes conjuntos de datos públicos (noticias, opiniones, patentes). Usamos subconjuntos de distinto tamaño entre 50 MB y varios GB para explorar este balance. Para concluir, como ha- bíamos supuesto, no porque un algoritmo es eficiente en pocos datos será eficiente en grandes cantidades de datos. Para los dos últimos problemas consideramos algoritmos similares y tam- bién dos conjuntos distintos de datos y técnicas de evaluación, para estudiar el impacto de estos dos parámetros en los resultados. Mostramos que los resultados no cambian significativamente con estos cambios.
10

Skeppstedt, Maria. "Extracting Clinical Findings from Swedish Health Record Text." Doctoral thesis, Stockholms universitet, Institutionen för data- och systemvetenskap, 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-109254.

Abstract:
Information contained in the free text of health records is useful for the immediate care of patients as well as for medical knowledge creation. Advances in clinical language processing have made it possible to automatically extract this information, but most research has, until recently, been conducted on clinical text written in English. In this thesis, however, information extraction from Swedish clinical corpora is explored, particularly focusing on the extraction of clinical findings. Unlike most previous studies, Clinical Finding was divided into the two more granular sub-categories Finding (symptom/result of a medical examination) and Disorder (condition with an underlying pathological process). For detecting clinical findings mentioned in Swedish health record text, a machine learning model, trained on a corpus of manually annotated text, achieved results in line with the obtained inter-annotator agreement figures. The machine learning approach clearly outperformed an approach based on vocabulary mapping, showing that Swedish medical vocabularies are not extensive enough for the purpose of high-quality information extraction from clinical text. A rule and cue vocabulary-based approach was, however, successful for negation and uncertainty classification of detected clinical findings. Methods for facilitating expansion of medical vocabulary resources are particularly important for Swedish and other languages with less extensive vocabulary resources. The possibility of using distributional semantics, in the form of Random indexing, for semi-automatic vocabulary expansion of medical vocabularies was, therefore, evaluated. Distributional semantics does not require that terms or abbreviations are explicitly defined in the text, and it is, thereby, a method suitable for clinical corpora. Random indexing was shown to be useful for extending vocabularies with medical terms, as well as for extracting medical synonyms and abbreviation dictionaries.

Books on the topic "Named Entity Classification"

1

Kellum, John A. Diagnosis of oliguria and acute kidney injury. Oxford University Press, 2016. http://dx.doi.org/10.1093/med/9780199600830.003.0212.

Abstract:
Diagnosis and classification of acute pathology in the kidney is a major clinical problem. Azotemia and oliguria represent not only disease, but also normal responses of the kidney to extracellular volume depletion or a decreased renal blood flow. Clinicians routinely make inferences about both the presence of renal dysfunction and its cause. Pure prerenal physiology is unusual in hospitalized patients and its effects are not necessarily benign. Sepsis may alter renal function without the characteristic changes in urine indices. The clinical syndrome known as acute tubular necrosis does not actually manifest the histological changes that the name implies. Acute kidney injury (AKI) is a term proposed to encompass the entire spectrum of the syndrome from minor changes in renal function to a requirement for renal replacement therapy. Criteria based on both changes in serum creatinine and urine output represent a broad international consensus for diagnosing and staging AKI.

Book chapters on the topic "Named Entity Classification"

1

Harrando, Ismail, and Raphaël Troncy. "Named Entity Recognition as Graph Classification." In The Semantic Web: ESWC 2021 Satellite Events, 103–8. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-80418-3_19.

2

Nadeau, David, and Satoshi Sekine. "A survey of named entity recognition and classification." In Benjamins Current Topics, 3–28. Amsterdam: John Benjamins Publishing Company, 2009. http://dx.doi.org/10.1075/bct.19.03nad.

3

Barua, Jayendra, and Dhaval Patel. "Named Entity Classification Using Search Engine’s Query Suggestions." In Lecture Notes in Computer Science, 612–18. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-56608-5_56.

4

Jayashree, R., Basavaraj S. Anami, and S. Teju. "Impact of Named Entity Recognition on Kannada Documents Classification." In Communications in Computer and Information Science, 395–402. Singapore: Springer Singapore, 2018. http://dx.doi.org/10.1007/978-981-10-9059-2_35.

5

de Pablo-Sánchez, César, and Paloma Martínez. "Building a Graph of Names and Contextual Patterns for Named Entity Classification." In Lecture Notes in Computer Science, 530–37. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. http://dx.doi.org/10.1007/978-3-642-00958-7_47.

6

Moreno, Isabel, M. T. Romá-Ferri, and Paloma Moreda. "Named Entity Classification Based on Profiles: A Domain Independent Approach." In Natural Language Processing and Information Systems, 142–46. Cham: Springer International Publishing, 2017. http://dx.doi.org/10.1007/978-3-319-59569-6_15.

7

Gamallo, Pablo, and Marcos Garcia. "A Resource-Based Method for Named Entity Extraction and Classification." In Progress in Artificial Intelligence, 610–23. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011. http://dx.doi.org/10.1007/978-3-642-24769-9_44.

8

Abdallah, Sherief, Khaled Shaalan, and Muhammad Shoaib. "Integrating Rule-Based System with Classification for Arabic Named Entity Recognition." In Computational Linguistics and Intelligent Text Processing, 311–22. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-28604-9_26.

9

Ni, Yuan, Lei Zhang, Zhaoming Qiu, and Chen Wang. "Enhancing the Open-Domain Classification of Named Entity Using Linked Open Data." In Lecture Notes in Computer Science, 566–81. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010. http://dx.doi.org/10.1007/978-3-642-17746-0_36.

10

Petasis, G., S. Petridis, G. Paliouras, V. Karkaletsis, S. J. Perantonis, and C. D. Spyropoulos. "Symbolic and Neural Learning of Named-Entity Recognition and Classification Systems in Two Languages." In International Series in Intelligent Technologies, 193–210. Dordrecht: Springer Netherlands, 2002. http://dx.doi.org/10.1007/978-94-010-0324-7_14.


Conference papers on the topic "Named Entity Classification"

1

Manchanda, Pikakshi, Elisabetta Fersini, Matteo Palmonari, Debora Nozza, and Enza Messina. "Towards adaptation of named entity classification." In SAC 2017: Symposium on Applied Computing. New York, NY, USA: ACM, 2017. http://dx.doi.org/10.1145/3019612.3022188.

2

Ahmad, Faruk, and Md-Mizanur Rahoman. "Named entity classification using dependency grammar." In 2017 20th International Conference of Computer and Information Technology (ICCIT). IEEE, 2017. http://dx.doi.org/10.1109/iccitechn.2017.8281836.

3

Màrquez, Lluís, Adrià de Gispert, Xavier Carreras, and Lluís Padró. "Low-cost Named Entity Classification for Catalan." In the ACL 2003 workshop. Morristown, NJ, USA: Association for Computational Linguistics, 2003. http://dx.doi.org/10.3115/1119384.1119388.

4

Wu, Chuan-Yu, Bom Yi Lee, Jing Sun, Yin Yin Latt, Kim Shepherd, and Jared Watts. "Named Entity Extraction and Classification in Digital Publications." In The 29th International Conference on Software Engineering and Knowledge Engineering. KSI Research Inc. and Knowledge Systems Institute Graduate School, 2017. http://dx.doi.org/10.18293/seke2017-132.

5

"Using Named Entity Recognition as a Classification Heuristic." In iConference 2014 Proceedings: Breaking Down Walls. Culture - Context - Computing. iSchools, 2014. http://dx.doi.org/10.9776/14401.

6

Keretna, Sara, Chee Peng Lim, Doug Creighton, and Khaled Bashir Shaban. "Classification ensemble to improve medical Named Entity Recognition." In 2014 IEEE International Conference on Systems, Man and Cybernetics - SMC. IEEE, 2014. http://dx.doi.org/10.1109/smc.2014.6974324.

7

Vora, Komil, Avani Vasant, and Rachit Adhvaryu. "Named entity recognition and classification for Gujarati language." In 2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI). IEEE, 2016. http://dx.doi.org/10.1109/icacci.2016.7732390.

8

Kim, Jae-Ho, In-Ho Kang, and Key-Sun Choi. "Unsupervised named entity classification models and their ensembles." In the 19th international conference. Morristown, NJ, USA: Association for Computational Linguistics, 2002. http://dx.doi.org/10.3115/1072228.1072316.

9

St. Chifu, Emil, and Viorica R. Chifu. "A Neural Model for Unsupervised Named Entity Classification." In 2008 International Conference on Computational Intelligence for Modelling Control & Automation. IEEE, 2008. http://dx.doi.org/10.1109/cimca.2008.163.

10

Fritzler, Alexander, Varvara Logacheva, and Maksim Kretov. "Few-shot classification in named entity recognition task." In SAC '19: The 34th ACM/SIGAPP Symposium on Applied Computing. New York, NY, USA: ACM, 2019. http://dx.doi.org/10.1145/3297280.3297378.
