Dissertations / Theses on the topic 'Hierarchical Multi-label Text Classification'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 24 dissertations / theses for your research on the topic 'Hierarchical Multi-label Text Classification.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Press on it, and we will generate automatically the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as pdf and read online its abstract whenever available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Dendamrongvit, Sareewan. "Induction in Hierarchical Multi-label Domains with Focus on Text Categorization." Scholarly Repository, 2011. http://scholarlyrepository.miami.edu/oa_dissertations/542.
Full textBorggren, Lukas. "Automatic Categorization of News Articles With Contextualized Language Models." Thesis, Linköpings universitet, Artificiell intelligens och integrerade datorsystem, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177004.
Full textRazavi, Amir Hossein. "Automatic Text Ontological Representation and Classification via Fundamental to Specific Conceptual Elements (TOR-FUSE)." Thèse, Université d'Ottawa / University of Ottawa, 2012. http://hdl.handle.net/10393/23061.
Full textWei, Zhihua. "The research on chinese text multi-label classification." Thesis, Lyon 2, 2010. http://www.theses.fr/2010LYO20025/document.
Full textLa thèse est centrée sur la Classification de texte, domaine en pleine expansion, avec de nombreuses applications actuelles et potentielles. Les apports principaux de la thèse portent sur deux points : Les spécificités du codage et du traitement automatique de la langue chinoise : mots pouvant être composés de un, deux ou trois caractères ; absence de séparation typographique entre les mots ; grand nombre d’ordres possibles entre les mots d’une phrase ; tout ceci aboutissant à des problèmes difficiles d’ambiguïté. La solution du codage en «n-grams »(suite de n=1, ou 2 ou 3 caractères) est particulièrement adaptée à la langue chinoise, car elle est rapide et ne nécessite pas les étapes préalables de reconnaissance des mots à l’aide d’un dictionnaire, ni leur séparation. La classification multi-labels, c'est-à-dire quand chaque individus peut être affecté à une ou plusieurs classes. Dans le cas des textes, on cherche des classes qui correspondent à des thèmes (topics) ; un même texte pouvant être rattaché à un ou plusieurs thème. Cette approche multilabel est plus générale : un même patient peut être atteint de plusieurs pathologies ; une même entreprise peut être active dans plusieurs secteurs industriels ou de services. La thèse analyse ces problèmes et tente de leur apporter des solutions, d’abord pour les classifieurs unilabels, puis multi-labels. Parmi les difficultés, la définition des variables caractérisant les textes, leur grand nombre, le traitement des tableaux creux (beaucoup de zéros dans la matrice croisant les textes et les descripteurs), et les performances relativement mauvaises des classifieurs multi-classes habituels
文本分类是信息科学中一个重要而且富有实际应用价值的研究领域。随着文本分类处理内容日趋复杂化和多元化,分类目标也逐渐多样化,研究有效的、切合实际应用需求的文本分类技术成为一个很有挑战性的任务,对多标签分类的研究应运而生。本文在对大量的单标签和多标签文本分类算法进行分析和研究的基础上,针对文本表示中特征高维问题、数据稀疏问题和多标签分类中分类复杂度高而精度低的问题,从不同的角度尝试运用粗糙集理论加以解决,提出了相应的算法,主要包括:针对n-gram作为中文文本特征时带来的维数灾难问题,提出了两步特征选择的方法,即去除类内稀有特征和类间特征选择相结合的方法,并就n-gram作为特征时的n值选取、特征权重的选择和特征相关性等问题在大规模中文语料库上进行了大量的实验,得出一些有用的结论。针对文本分类中运用高维特征表示文本带来的分类效率低,开销大等问题,提出了基于LDA模型的多标签文本分类算法,利用LDA模型提取的主题作为文本特征,构建高效的分类器。在PT3多标签分类转换方法下,该分类算法在中英文数据集上都表现出很好的效果,与目前公认最好的多标签分类方法效果相当。针对LDA模型现有平滑策略的随意性和武断性的缺点,提出了基于容差粗糙集的LDA语言模型平滑策略。该平滑策略首先在全局词表上构造词的容差类,再根据容差类中词的频率为每类文档的未登录词赋予平滑值。在中英文、平衡和不平衡语料库上的大量实验都表明该平滑方法显著提高了LDA模型的分类性能,在不平衡语料库上的提高尤其明显。针对多标签分类中分类复杂度高而精度低的问题,提出了一种基于可变精度粗糙集的复合多标签文本分类框架,该框架通过可变精度粗糙集方法划分文本特征空间,进而将多标签分类问题分解为若干个两类单标签分类问题和若干个标签数减少了的多标签分类问题。即,当一篇未知文本被划分到某一类文本的下近似区域时,可以直接用简单的单标签文本分类器判断其类别;当未知文本被划分在边界域时,则采用相应区域的多标签分类器进行分类。实验表明,这种分类框架下,分类的精确度和算法效率都有较大的提高。本文还设计和实现了一个基于多标签分类的网页搜索结果可视化系统(MLWC),该系统能够直接调用搜索引擎返回的搜索结果,并采用改进的Naïve Bayes多标签分类算法实现实时的搜索结果分类,使用户可以快速地定位搜索结果中感兴趣的文本。
Burkhardt, Sophie [Verfasser]. "Online Multi-label Text Classification using Topic Models / Sophie Burkhardt." Mainz : Universitätsbibliothek Mainz, 2018. http://d-nb.info/1173911235/34.
Full textSendur, Zeynel. "Text Document Categorization by Machine Learning." Scholarly Repository, 2008. http://scholarlyrepository.miami.edu/oa_theses/209.
Full textArtmann, Daniel. "Applying machine learning algorithms to multi-label text classification on GitHub issues." Thesis, Högskolan i Halmstad, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-43097.
Full textLi, Xin. "Multi-label Learning under Different Labeling Scenarios." Diss., Temple University Libraries, 2015. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/350482.
Full textPh.D.
Traditional multi-class classification problems assume that each instance is associated with a single label from category set Y where |Y| > 2. Multi-label classification generalizes multi-class classification by allowing each instance to be associated with multiple labels from Y. In many real world data analysis problems, data objects can be assigned into multiple categories and hence produce multi-label classification problems. For example, an image for object categorization can be labeled as 'desk' and 'chair' simultaneously if it contains both objects. A news article talking about the effect of Olympic games on tourism industry might belong to multiple categories such as 'sports', 'economy', and 'travel', since it may cover multiple topics. Regardless of the approach used, multi-label learning in general requires a sufficient amount of labeled data to recover high quality classification models. However due to the label sparsity, i.e. each instance only carries a small number of labels among the label set Y, it is difficult to prepare sufficient well-labeled data for each class. Many approaches have been developed in the literature to overcome such challenge by exploiting label correlation or label dependency. In this dissertation, we propose a probabilistic model to capture the pairwise interaction between labels so as to alleviate the label sparsity. Besides of the traditional setting that assumes training data is fully labeled, we also study multi-label learning under other scenarios. For instance, training data can be unreliable due to missing values. A conditional Restricted Boltzmann Machine (CRBM) is proposed to take care of such challenge. Furthermore, labeled training data can be very scarce due to the cost of labeling but unlabeled data are redundant. We proposed two novel multi-label learning algorithms under active setting to relieve the pain, one for standard single level problem and one for hierarchical problem. Our empirical results on multiple multi-label data sets demonstrate the efficacy of the proposed methods.
Temple University--Theses
Průša, Petr. "Multi-label klasifikace textových dokumentů." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-412872.
Full textRios, Anthony. "Deep Neural Networks for Multi-Label Text Classification: Application to Coding Electronic Medical Records." UKnowledge, 2018. https://uknowledge.uky.edu/cs_etds/71.
Full textTsatsaronis, George. "An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition." BioMed Central, 2015. https://tud.qucosa.de/id/qucosa%3A29496.
Full textTsatsaronis, George. "An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition." Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-202687.
Full textRodríguez, Medina Samuel. "Multi-Label Text Classification with Transfer Learning for Policy Documents : The Case of the Sustainable Development Goals." Thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-395186.
Full textMetz, Jean. "Abordagens para aprendizado semissupervisionado multirrótulo e hierárquico." Universidade de São Paulo, 2011. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-13012012-144607/.
Full textIn machine learning, the task of classification consists on creating computational models that are able to automatically identify the class of objects belonging to a predefined domain from a set of examples whose class is known a priori. There are some classification scenarios in which each object can be associated to more than one class at the same time. Moreover, in such multilabeled scenarios, classes can be organized in a taxonomy that represents the generalization and specialization relationships among the different classes, which defines a class hierarchy, making the classification task, known as hierarchical classification, even more specific. The methods used to build such classification models are complex and highly dependent on the availability of an expressive quantity of previously classified examples. However, for a large number of applications, it is difficult to find a significant number of such examples. Moreover, when few examples are available, supervised learning algorithms are not able to build efficient classification models. In such situations it is possible to use semi-supervised learning, whose aim is to learn the classes of the domain using a few classified examples in conjunction to a considerable number of examples with no specified class. In this work, we propose methods that use the co-perspective disagreement based learning approach for both, the flat multilabel classification and the hierarchical classification tasks, among others. We also propose other methods that use active learning, aiming at improving the performance of semi-supervised learning algorithms. Additionally, two methods for the evaluation of multilabel and hierarchical learning algorithms are proposed. These methods define strategies for the identification of the majority multilabels, which are used to estimate the baseline evaluation measures. A framework for the experimental evaluation of the hierarchical classification was developed. This framework includes the implementations of the proposed methods as well as a complete module for the experimental evaluation of the hierarchical algorithms. The proposed methods were empirically evaluated considering datasets from various domains. From the analysis of the results, it can be observed that the methods based on co-perspective disagreement are not effective for complex classification tasks, such as the multilabel and hierarchical classification. It can also be observed that the main degradation problem of the models of the semi-supervised algorithms worsens for the multilabel and hierarchical classification due to the fact that, for these cases, there is an increase in the causes of the degradation of the models built using semi-supervised learning based on co-perspective disagreement
Santos, Araken de Medeiros. "Investigando a combina??o de t?cnicas de aprendizado semissupervisionado e classifica??o hier?rquica multirr?tulo." Universidade Federal do Rio Grande do Norte, 2012. http://repositorio.ufrn.br:8080/jspui/handle/123456789/18690.
Full textData classification is a task with high applicability in a lot of areas. Most methods for treating classification problems found in the literature dealing with single-label or traditional problems. In recent years has been identified a series of classification tasks in which the samples can be labeled at more than one class simultaneously (multi-label classification). Additionally, these classes can be hierarchically organized (hierarchical classification and hierarchical multi-label classification). On the other hand, we have also studied a new category of learning, called semi-supervised learning, combining labeled data (supervised learning) and non-labeled data (unsupervised learning) during the training phase, thus reducing the need for a large amount of labeled data when only a small set of labeled samples is available. Thus, since both the techniques of multi-label and hierarchical multi-label classification as semi-supervised learning has shown favorable results with its use, this work is proposed and used to apply semi-supervised learning in hierarchical multi-label classication tasks, so eciently take advantage of the main advantages of the two areas. An experimental analysis of the proposed methods found that the use of semi-supervised learning in hierarchical multi-label methods presented satisfactory results, since the two approaches were statistically similar results
A classifica??o de dados ? uma tarefa com alta aplicabilidade em uma grande quantidade de dom?nios. A maioria dos m?todos para tratar problemas de classifica??o encontrados na literatura, tratam problemas tradicionais ou unirr?tulo. Nos ?ltimos anos vem sendo identificada uma s?rie de tarefas de classifica??o nas quais os exemplos podem ser rotulados a mais de uma classe simultaneamente (classifica??o multirr?tulo). Adicionalmente, tais classes podem estar hierarquicamente organizadas (classifica??o hier?rquica e classifica??o hier?rquica multirr?tulo). Por outro lado, tem-se estudado tamb?m uma nova categoria de aprendizado, chamada de aprendizado semissupervisionado, que combina dados rotulados (aprendizado supervisionado) e dados n?o-rotulados (aprendizado n?o-supervisionado), durante a fase de treinamento, reduzindo, assim, a necessidade de uma grande quantidade de dados rotulados quando somente um pequeno conjunto de exemplos rotulados est? dispon?- vel. Desse modo, uma vez que tanto as t?cnicas de classifica??o multirr?tulo e hier?rquica multirr?tulo quanto o aprendizado semissupervisionado vem apresentando resultados favor ?veis ? sua utiliza??o, neste trabalho ? proposta e utilizada a aplica??o de aprendizado semissupervisionado em tarefas de classifica??o hier?rquica multirr?tulo, de modo a se atender eficientemente as principais necessidades das duas ?reas. Uma an?lise experimental dos m?todos propostos verificou que a utiliza??o do aprendizado semissupervisionado em m?todos de classifica??o hier?rquica multirr?tulo apresentou resultados satisfat?rios, uma vez que as duas abordagens apresentaram resultados estatisticamente semelhantes
Cerri, Ricardo. "Redes neurais e algoritmos genéticos para problemas de classificação hierárquica multirrótulo." Universidade de São Paulo, 2013. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-24032014-163900/.
Full textconventional classification problems, each example of a dataset is associated with just one among two or more classes. However, there are more complex classification problems where the classes are hierarchically structured, having subclasses and superclasses. In these problems, examples can be simultaneously assigned to classes belonging to two or more paths of a hierarchy, i.e., examples can be classified in many classes located in the same hierarchical level. Such a hierarchy can be structured as a tree or a directed acyclic graph. These problems are known as hierarchical multi-label classification problems, being more difficult due to the high complexity, diversity of solutions, modeling difficulty and data imbalance. Two main approaches are used to deal with these problems, called global and local. In the global approach, only one classifier is induced to deal with all classes simultaneously, and the classification of new examples is done in just one step. In the local approach, a set of classifiers is induced, where each classifier is responsible for the predictions of one class or a set of classes, and the classification of new examples is done in many steps, considering the predictions of all classifiers. In this Thesis, two methods for hierarchical multi-label classification are proposed and investigated. The first one is based on the local approach, and associates a Multi-Layer Perceptron (MLP) to each hierarchical level, being each MLP responsible for the predictions in its associated level. The method is called Hierarchical Multi-Label Classification with Local Multi-Layer Perceptrons (HMC-LMLP). The second method is based on the global approach, and induces hierarchical multi-label classification rules using a Genetic Algorithm. The method is called Hierarchical Multi-Label Classification with a Genetic Algorithm (HMC-GA). Experiments using hierarchies structured as trees showed that HMC-LMLP obtained classification performances superior to the state-of-the-art method in the literature, and superior or competitive performances when using graph-structured hierarchies. The HMC-GA method obtained competitive results with other methods of the literature in both tree and graph-structured hierarchies, being able of inducing, in many cases, smaller and in less quantity rules
Araújo, Hiury Nogueira de. "Utilizando aprendizado emissupervisionado multidescrição em problemas de classificação hierárquica multirrótulo." Universidade Federal Rural do Semi-Árido, 2017. http://bdtd.ufersa.edu.br:80/tede/handle/tede/839.
Full textApproved for entry into archive by Vanessa Christiane (referencia@ufersa.edu.br) on 2018-06-18T16:58:58Z (GMT) No. of bitstreams: 1 HiuryNA_DISSERT.pdf: 3188162 bytes, checksum: d40d42a78787557868ebc6d3cd5af945 (MD5)
Approved for entry into archive by Vanessa Christiane (referencia@ufersa.edu.br) on 2018-06-18T16:59:18Z (GMT) No. of bitstreams: 1 HiuryNA_DISSERT.pdf: 3188162 bytes, checksum: d40d42a78787557868ebc6d3cd5af945 (MD5)
Made available in DSpace on 2018-06-18T16:59:31Z (GMT). No. of bitstreams: 1 HiuryNA_DISSERT.pdf: 3188162 bytes, checksum: d40d42a78787557868ebc6d3cd5af945 (MD5) Previous issue date: 2017-11-17
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Data classification is a task applied in various areas of knowledge, therefore, the focus of ongoing research. Data classification can be divided according to the available data, which are labeled or not labeled. One approach has proven very effective when working with data sets containing labeled and unlabeled data, this called semi-supervised learning, your objective is to label the unlabeled data by using the amount of labeled data in the data set, improving their success rate. Such data can be classified with more than one label, known as multi-label classification. Furthermore, these data can be organized hierarchically, thus containing a relation therebetween, this called hierarchical classification. This work proposes the use of multi-view semi-supervised learning, which is one of the semissupervisionado learning aspects, in problems of hierarchical multi-label classification, with the objective of investigating whether semi-supervised learning is an appropriate approach to solve the problem of low dimensionality of data. An experimental analysis of the methods found that supervised learning had a better performance than semi-supervised approaches, however, semi-supervised learning may be a widely used approach, because, there is plenty to be contributed in this area
classificação de dados é uma tarefa aplicada em diversas áreas do conhecimento, sendo assim, foco de constantes pesquisas. A classificação de dados pode ser dividida de acordo com a disposição dos dados, sendo estes rotulados ou não rotulados. Uma abordagem vem se mostrando bastante eficiente ao se trabalhar com conjuntos de dados contendo dados rotulados e não rotulados, esta chamada de aprendizado semissupervisionado, seu objetivo é classificar os dados não rotulados através da quantidade de dados rotulados contidos no conjunto, melhorando sua taxa de acerto. Tais dados podem ser classificados com mais de um rótulo, conhecida como classificação multirrótulo. Além disso, estes dados podem estar organizados de forma hierárquica, contendo assim, uma relação entre os mesmos, esta, por sua vez, denominada classificação hierárquica. Neste trabalho é proposto a utilização do aprendizado semissupervisionado multidescrição, que é uma das vertentes do aprendizado semissupervisionado, em problemas de classificação hierárquica multirrótulo, com o objetivo de investigar se o aprendizado semissupervisionado é uma abordagem apropriada para resolver o problema de baixa dimensionalidade de dados. Uma análise experimental dos métodos verificou que o aprendizado supervisionado obteve melhor desempenho contra as abordagens semissupervisionadas, contudo, o aprendizado semissupervisionado pode vir a ser uma abordagem amplamente utilizada, pois, há bastante o que ser contribuído nesta área
2018-03-14
jia-liang, Chen, and 陳佳良. "Hierarchical Multi-class Text Classification Using Support Vector Machines." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/09344492149763276087.
Full text元智大學
資訊管理學系
96
This study presents a hierarchical multi-class text classification framework based on the characteristics of enterprise documents. The multi-class classifiers are based on Support Vector Machines using an one-against-one approach. The features used by each classifier are selected using DF (Document Frequency) and CC (Correlated Coefficient). We conducted experiments on two different datasets; one contains enterprise documents from IC a local equipment manufacture and the other contains mainland china news. The experimental results show that our proposed method performed well on both datasets and ran faster than a non-hierarchical approach.
Tsai, Shang-Chi, and 蔡尚錡. "Leveraging Hierarchical Category Knowledge for Multi-Label Diagnostic Text Understanding." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/24rczv.
Full text國立臺灣大學
資料科學學位學程
107
Clinical notes are essential medical documents to record each patient''s symptoms. Each record is typically annotated with medical diagnostic codes, which means diagnosis and treatment. This paper focuses on predicting diagnostic codes given the descriptive present illness in electronic health records by leveraging domain knowledge. We investigate various losses in a convolutional model to utilize hierarchical category knowledge of diagnostic codes in order to allow the model to share semantics across different labels under the same category. The proposed model not only considers the external domain knowledge but also addresses the issue about data imbalance. The MIMIC3 benchmark experiments show that the proposed methods can effectively utilize category knowledge and provide informative cues to improve the performance in terms of the top-ranked diagnostic codes which is better than the prior state-of-the-art. The investigation and discussion express the potential of integrating the domain knowledge in the current machine learning based models and guiding future research directions.
Rau, YI-EN, and 饒以恩. "An empirical study of multi-label text classification: word2vector vs traditional techniques." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/54syt9.
Full text國立中央大學
資訊管理學系在職專班
107
The development of the Internet has led to the rapid advancement of social media. Because the free speech and anonymity of social media characteristic, it causes abuse such as cyber harassment and Toxic Comments. Machine learning have changed many fields, for example computer vision, speech recognition and language processing. I will use the text classification of machine learning to effectively filter out Toxic Comments. The dataset is from the competition organized by Kaggle: Toxic Comment Classification Challenge, whose source is Wikipedia's comments. These comments have been flagged as malicious and toxic by human evaluators. I will use Machine Learning (ML) method to match different Document representations for data analysis and prediction. In this study, the Document representations of the text will use TF-IDF and Word2Vec for comparison and use KNN, SVM, ANN, Deep Learning as text classifier. This data set contains six multi-labels: toxic, severe_toxic, obscene, threat, insult, identity_hate, so the six labels are paired with different Document representations and text classifiers for comparative analysis. The results show that in the Precision section, there is best predictive performance in TF-IDF combined with the SVM classifier than Word2Vec. About the Recall section, there is best predictive performance in Word2vec combined with LSTM classifiers.
Ma, Long. "A Multi-label Text Classification Framework: Using Supervised and Unsupervised Feature Selection Strategy." 2017. http://scholarworks.gsu.edu/cs_diss/123.
Full textChen, Sung-En, and 陳頌恩. "A hierarchical Multi-label Classification System of K-12 Cross-Knowledge Points Math Question." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/tqz79a.
Full text國立中央大學
資訊工程學系
107
With the development and progress of science and technology, the learning patterns also evolve. In Question-Driven learning, students clarify and validate what they learn by answering questions. Such a large number of questions needs good management. A well-performed management can avoid the situation that learning materials with the same knowledge set are defined into different sections due to ambiguous expressions. In this work, a hierarchical classification system that focuses on K-12 learning materials is proposed. We test several combination of document representation with flatten label and hierarchical label of the knowledge points. In response to a current question that contains text and image, we also introduce a multi-modal preprocessing approach and a combined feature CNN model. The experiment shows that our hierarchical classification can outperform flatten multi-label classification methods.
George, Susanna Serene. "Emergency Medical Service EMR-Driven Concept Extraction From Narrative Text." Thesis, 2021. http://dx.doi.org/10.7912/C2/68.
Full textBeing in the midst of a pandemic with patients having minor symptoms that quickly become fatal to patients with situations like a stemi heart attack, a fatal accident injury, and so on, the importance of medical research to improve speed and efficiency in patient care, has increased. As researchers in the computer domain work hard to use automation in technology in assisting the first responders in the work they do, decreasing the cognitive load on the field crew, time taken for documentation of each patient case and improving accuracy in details of a report has been a priority. This paper presents an information extraction algorithm that custom engineers certain existing extraction techniques that work on the principles of natural language processing like metamap along with syntactic dependency parser like spacy for analyzing the sentence structure and regular expressions to recurring patterns, to retrieve patient-specific information from medical narratives. These concept value pairs automatically populates the fields of an EMR form which could be reviewed and modified manually if needed. This report can then be reused for various medical and billing purposes related to the patient.
(10947207), Susanna S. George. "EMERGENCY MEDICAL SERVICE EMR-DRIVEN CONCEPT EXTRACTION FROM NARRATIVE TEXT." Thesis, 2021.
Find full textThis paper presents an information extraction algorithm that custom engineers certain existing extraction techniques that work on the principles of natural language processing like metamap along with syntactic dependency parser like spacy for analyzing the sentence structure and regular expressions to recurring patterns, to retrieve patient-specific information from medical narratives. These concept value pairs automatically populates the fields of an EMR form which could be reviewed and modified manually if needed. This report can then be reused for various medical and billing purposes related to the patient.