A selection of scholarly literature on the topic "Hierarchical Multi-label Text Classification"


Consult the lists of current articles, books, dissertations, reports, and other scholarly sources on the topic "Hierarchical Multi-label Text Classification".

Next to every work in the list of references there is an "Add to bibliography" option. Use it, and the bibliographic reference for the chosen work will be formatted automatically in the required citation style (APA, MLA, Harvard, Chicago, Vancouver, etc.).

You can also download the full text of the scholarly publication as a PDF and read its online abstract, provided the relevant parameters are present in the work's metadata.

Journal articles on the topic "Hierarchical Multi-label Text Classification"

1

Ma, Yinglong, Xiaofeng Liu, Lijiao Zhao, Yue Liang, Peng Zhang, and Beihong Jin. "Hybrid embedding-based text representation for hierarchical multi-label text classification." Expert Systems with Applications 187 (January 2022): 115905. http://dx.doi.org/10.1016/j.eswa.2021.115905.

2

Yang, Zhenyu, and Guojing Liu. "Hierarchical Sequence-to-Sequence Model for Multi-Label Text Classification." IEEE Access 7 (2019): 153012–20. http://dx.doi.org/10.1109/access.2019.2948855.

3

Gargiulo, Francesco, Stefano Silvestri, Mario Ciampi, and Giuseppe De Pietro. "Deep neural network for hierarchical extreme multi-label text classification." Applied Soft Computing 79 (June 2019): 125–38. http://dx.doi.org/10.1016/j.asoc.2019.03.041.

4

Wang, Boyan, Xuegang Hu, Peipei Li, and Philip S. Yu. "Cognitive structure learning model for hierarchical multi-label text classification." Knowledge-Based Systems 218 (April 2021): 106876. http://dx.doi.org/10.1016/j.knosys.2021.106876.

5

Manoharan J, Samuel. "Capsule Network Algorithm for Performance Optimization of Text Classification." March 2021 3, no. 1 (April 3, 2021): 1–9. http://dx.doi.org/10.36548/jscp.2021.1.001.

Annotation:
In the domain of visual inference, capsule networks have demonstrated optimized performance on structured data. In this paper, hierarchical multi-label text classification is performed with a simple capsule network algorithm. It is compared to support vector machines (SVM), Long Short-Term Memory (LSTM), artificial neural networks (ANN), convolutional neural networks (CNN), and other neural and non-neural architectures to demonstrate its superior performance. The Blurb Genre Collection (BGC) and Web of Science (WOS) datasets are used for the experiments. The encoded latent data is combined with the algorithm to handle structurally diverse categories and rare events in hierarchical multi-label text applications.
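
The cited paper does not come with code; as a rough, generic illustration of the capsule idea it builds on, the sketch below implements the standard "squash" non-linearity from the dynamic-routing formulation of capsule networks, which compresses capsule activation vectors so that their lengths can be read as per-label presence scores. All shapes and the 0.5 threshold are assumptions for the example, not the author's implementation.

```python
import torch

def squash(s: torch.Tensor, dim: int = -1, eps: float = 1e-8) -> torch.Tensor:
    """Squash non-linearity used in capsule networks.

    Shrinks each vector so its length lies in [0, 1) while preserving its
    direction; the length of a label capsule can then be interpreted as the
    probability that the corresponding label applies to the document.
    """
    squared_norm = (s ** 2).sum(dim=dim, keepdim=True)
    scale = squared_norm / (1.0 + squared_norm)
    return scale * s / torch.sqrt(squared_norm + eps)

# Illustrative use: 32 label capsules of dimension 16 for a single document.
capsules = torch.randn(1, 32, 16)           # hypothetical capsule activations
lengths = squash(capsules).norm(dim=-1)     # per-label presence scores in [0, 1)
predicted = lengths > 0.5                   # simple multi-label thresholding
```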
6

Vogrincic, Sergeja, and Zoran Bosnic. "Ontology-based multi-label classification of economic articles." Computer Science and Information Systems 8, no. 1 (2011): 101–19. http://dx.doi.org/10.2298/csis100420034v.

Annotation:
The paper presents an approach to automatic document categorization in the field of economics. Since the documents can be annotated with multiple keywords (labels), we approach this task by applying and evaluating multi-label classification methods of supervised machine learning. We describe the construction of a test corpus of 1015 economic documents, which we classify automatically using a tool that integrates ontology construction with text mining methods. In our experimental work, we evaluate three groups of multi-label classification approaches: transformation to single-class problems, specialized multi-label models, and hierarchical/ranking models. The classification accuracies of all tested models indicate that every evaluated method has potential for this task. The results show the advantage of the more complex approaches, which exploit the dependence between labels; a good alternative is single-class naive Bayes classifiers coupled with the binary relevance transformation.
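
The binary relevance baseline with naive Bayes mentioned at the end of this abstract is easy to reproduce with scikit-learn. The toy documents and labels below are placeholders, not the paper's corpus of 1015 economic documents; the sketch only shows the general recipe of training one MultinomialNB classifier per label.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

# Toy stand-in for the annotated economic documents.
docs = [
    "inflation and monetary policy outlook",
    "labour market reform and unemployment",
    "monetary policy and exchange rates",
]
labels = [["macroeconomics", "monetary"], ["labour"], ["monetary", "finance"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)     # binary indicator matrix, one column per label

# Binary relevance: an independent naive Bayes classifier for every label.
model = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(MultinomialNB()))
model.fit(docs, Y)

pred = model.predict(["exchange rates under the new monetary policy"])
print(mlb.inverse_transform(pred))
```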
7

Gong, Jibing, Hongyuan Ma, Zhiyong Teng, Qi Teng, Hekai Zhang, Linfeng Du, Shuai Chen, Md Zakirul Alam Bhuiyan, Jianhua Li, and Mingsheng Liu. "Hierarchical Graph Transformer-Based Deep Learning Model for Large-Scale Multi-Label Text Classification." IEEE Access 8 (2020): 30885–96. http://dx.doi.org/10.1109/access.2020.2972751.

8

Sohrab, Mohammad Golam, Makoto Miwa, and Yutaka Sasaki. "IN-DEDUCTIVE and DAG-Tree Approaches for Large-Scale Extreme Multi-label Hierarchical Text Classification." Polibits 54 (July 31, 2016): 61–70. http://dx.doi.org/10.17562/pb-54-8.

9

Deng, Jiawen, and Fuji Ren. "Hierarchical Network with Label Embedding for Contextual Emotion Recognition." Research 2021 (January 6, 2021): 1–9. http://dx.doi.org/10.34133/2021/3067943.

Annotation:
Emotion recognition has been used widely in various applications such as mental health monitoring and emotional management. Usually, emotion recognition is regarded as a text classification task, but it is a more complex problem in which the relations between the emotions expressed in a text cannot be neglected. In this paper, a hierarchical model with label embedding is proposed for contextual emotion recognition. Specifically, a hierarchical model is used to learn the emotional representation of a given sentence from its contextual information. To exploit emotion correlations in recognition, a label embedding matrix is trained by joint learning, which contributes to the final prediction. Comparison experiments are conducted on the Chinese emotional corpus RenCECps, and the results indicate that our approach performs well on the textual emotion recognition task.
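
The jointly trained label embedding matrix described in this abstract can be illustrated with a small PyTorch module. This is a generic label-embedding scorer, not the authors' RenCECps model; all dimensions are made up for the example, and the sentence representation would come from the hierarchical encoder.

```python
import torch
import torch.nn as nn

class LabelEmbeddingScorer(nn.Module):
    """Score a sentence representation against a trainable label embedding matrix."""

    def __init__(self, hidden_dim: int, num_labels: int, label_dim: int = 64):
        super().__init__()
        self.project = nn.Linear(hidden_dim, label_dim)        # map text into label space
        self.label_emb = nn.Embedding(num_labels, label_dim)   # one vector per emotion label

    def forward(self, sent_repr: torch.Tensor) -> torch.Tensor:
        # sent_repr: (batch, hidden_dim), e.g. the output of a hierarchical encoder.
        text = self.project(sent_repr)                 # (batch, label_dim)
        return text @ self.label_emb.weight.t()        # (batch, num_labels) logits

scorer = LabelEmbeddingScorer(hidden_dim=256, num_labels=8)
logits = scorer(torch.randn(4, 256))
probs = torch.sigmoid(logits)   # multi-label emotion scores; train with BCEWithLogitsLoss
```

Because correlated labels end up with nearby embedding vectors, the matrix itself encodes the emotion correlations that the paper exploits.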
10

Liu, Zhenyu, Chaohong Lu, Haiwei Huang, Shengfei Lyu, and Zhenchao Tao. "Hierarchical Multi-Granularity Attention-Based Hybrid Neural Network for Text Classification." IEEE Access 8 (2020): 149362–71. http://dx.doi.org/10.1109/access.2020.3016727.


Dissertations on the topic "Hierarchical Multi-label Text Classification"

1

Dendamrongvit, Sareewan. "Induction in Hierarchical Multi-label Domains with Focus on Text Categorization." Scholarly Repository, 2011. http://scholarlyrepository.miami.edu/oa_dissertations/542.

Annotation:
Induction of classifiers from sets of preclassified training examples is one of the most popular machine learning tasks. This dissertation focuses on the techniques needed in the field of automated text categorization. Here, each document can be labeled with more than one class, sometimes with many classes. Moreover, the classes are hierarchically organized, the mutual relations being typically expressed in terms of a generalization tree. Both aspects (multi-label classification and hierarchically organized classes) have so far received inadequate attention. Existing literature work largely assumes that it is enough to induce a separate binary classifier for each class, and the question of class hierarchy is rarely addressed. This, however, ignores some serious problems. For one thing, induction of thousands of classifiers from hundreds of thousands of examples described by tens of thousands of features (a common case in automated text categorization) incurs prohibitive computational costs---even a single binary classifier in domains of this kind often takes hours, even days, to induce. For another, the circumstance that the classes are hierarchically organized affects the way we view the classification performance of the induced classifiers. The presented work proposes a technique referred to by the acronym "H-kNN-plus." The technique combines support vector machines and nearest neighbor classifiers with the intention to capitalize on the strengths of both. As for performance evaluation, a variety of measures have been used to evaluate hierarchical classifiers, including the standard non-hierarchical criteria that assign the same weight to different types of error. The author proposes a performance measure that overcomes some of their weaknesses. The dissertation begins with a study of (non-hierarchical) multi-label classification. One of the reasons for the poor performance of earlier techniques is the class-imbalance problem---a small number of positive examples being outnumbered by a great many negative examples. Another difficulty is that each of the classes tends to be characterized by a different set of characteristic features. This means that most of the binary classifiers are induced from examples described by predominantly irrelevant features. Addressing these weaknesses by majority-class undersampling and feature selection, the proposed technique significantly improves the overall classification performance. Even more challenging is the issue of hierarchical classification. Here, the dissertation introduces a new induction mechanism, H-kNN-plus, and subjects it to extensive experiments with two real-world datasets. The results indicate its superiority, in these domains, over earlier work in terms of prediction performance as well as computational costs.
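
The per-label recipe motivated in this abstract, majority-class undersampling followed by per-class feature selection before inducing each binary classifier, can be sketched as follows. This is an illustrative reconstruction under assumed parameters (a 3:1 negative-to-positive ratio, chi-squared selection, a linear SVM), not the dissertation's H-kNN-plus code.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.svm import LinearSVC

def train_binary_label_classifier(X, y, k_features=1000, neg_pos_ratio=3, seed=0):
    """Train one per-label binary classifier with majority-class undersampling
    and chi-squared feature selection (X must be non-negative, e.g. TF-IDF)."""
    rng = np.random.default_rng(seed)
    pos = np.where(y == 1)[0]
    neg = np.where(y == 0)[0]

    # Undersample the (usually huge) negative class to a fixed ratio.
    keep_neg = rng.choice(neg, size=min(len(neg), neg_pos_ratio * len(pos)), replace=False)
    idx = np.concatenate([pos, keep_neg])

    # Keep only the features most associated with this particular label.
    selector = SelectKBest(chi2, k=min(k_features, X.shape[1]))
    X_sel = selector.fit_transform(X[idx], y[idx])

    clf = LinearSVC().fit(X_sel, y[idx])
    return selector, clf
```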
2

Borggren, Lukas. "Automatic Categorization of News Articles With Contextualized Language Models." Thesis, Linköpings universitet, Artificiell intelligens och integrerade datorsystem, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-177004.

Annotation:
This thesis investigates how pre-trained contextualized language models can be adapted for multi-label text classification of Swedish news articles. Various classifiers are built on pre-trained BERT and ELECTRA models, exploring global and local classifier approaches. Furthermore, the effects of domain specialization, using additional metadata features and model compression are investigated. Several hundred thousand news articles are gathered to create unlabeled and labeled datasets for pre-training and fine-tuning, respectively. The findings show that a local classifier approach is superior to a global classifier approach and that BERT outperforms ELECTRA significantly. Notably, a baseline classifier built on SVMs yields competitive performance. The effect of further in-domain pre-training varies; ELECTRA’s performance improves while BERT’s is largely unaffected. It is found that utilizing metadata features in combination with text representations improves performance. Both BERT and ELECTRA exhibit robustness to quantization and pruning, allowing model sizes to be cut in half without any performance loss.
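
A minimal sketch of the kind of multi-label fine-tuning setup described here, using Hugging Face Transformers with one sigmoid output per label. The checkpoint, label count, and example sentences below are placeholders, not the thesis's Swedish news models or data.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

NUM_LABELS = 10  # placeholder for the size of the news category set

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-multilingual-cased",
    num_labels=NUM_LABELS,
    problem_type="multi_label_classification",  # BCE-with-logits loss during fine-tuning
)

texts = ["Riksbanken höjer räntan", "Allsvenskan avgörs i sista omgången"]
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    probs = torch.sigmoid(model(**batch).logits)   # (batch, NUM_LABELS)
predictions = probs > 0.5                          # independent per-label decisions
```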
3

Razavi, Amir Hossein. "Automatic Text Ontological Representation and Classification via Fundamental to Specific Conceptual Elements (TOR-FUSE)." Thesis, Université d'Ottawa / University of Ottawa, 2012. http://hdl.handle.net/10393/23061.

Annotation:
In this dissertation, we introduce a novel text representation method intended mainly for text classification. The representation is initially based on a variety of closeness relationships between pairs of words in text passages across the entire corpus. It then serves as the basis for our multi-level lightweight ontological representation method (TOR-FUSE), in which documents are represented according to their contexts and the goal of the learning task. This is unlike traditional representation methods, in which all documents are represented solely by their constituent words, in isolation from the goal for which they are represented. We believe that choosing the correct granularity of representation features is an important aspect of text classification. Interpreting data in a more general space with fewer dimensions can convey more discriminative knowledge and decrease the level of learning perplexity. The multi-level model allows data to be interpreted in a more conceptual space, rather than one containing only the scattered words occurring in texts. It aims to extract the knowledge tailored to the classification task by automatically creating a lightweight ontological hierarchy of representations. In the last step, we train a tailored ensemble learner over a stack of representations at different conceptual granularities. The final result is a mapping and weighting of the targeted concept of the original learning task over a stack of representations and the granular conceptual elements of its different levels (a hierarchical mapping instead of a linear mapping over a vector). Finally, the entire algorithm is applied to a variety of general text classification tasks, and its performance is evaluated in comparison with well-known algorithms.
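
TOR-FUSE itself is not reproduced here; as a loose analogy to the idea of stacking representations at different conceptual granularities, the sketch below concatenates a surface word representation with a corpus-level topic representation before training a single classifier. All components and parameters are assumptions for illustration.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline, make_pipeline

# Two granularities of representation: surface words and latent topics.
features = FeatureUnion([
    ("words", TfidfVectorizer(max_features=5000)),
    ("topics", make_pipeline(CountVectorizer(max_features=5000),
                             LatentDirichletAllocation(n_components=20, random_state=0))),
])
model = Pipeline([("features", features), ("clf", LogisticRegression(max_iter=1000))])
# model.fit(train_texts, train_labels)   # train_texts / train_labels are placeholders
```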
4

Wei, Zhihua. "The research on Chinese text multi-label classification." Thesis, Lyon 2, 2010. http://www.theses.fr/2010LYO20025/document.

Annotation:
Text Classification (TC), an important field in information technology, has many valuable applications. Faced with a sea of information resources, the objects of TC have become more complicated and diverse, so research into effective and practical TC techniques is quite challenging, and more and more researchers consider multi-label TC to be better suited to many applications. This thesis analyses the difficulties and problems in multi-label TC and Chinese text representation on the basis of a large number of algorithms for single-label and multi-label TC. Aiming at the high dimensionality of the feature space, the sparse distribution of text representations, and the poor performance of multi-label classifiers, the thesis puts forward corresponding algorithms from several angles. To address the dimensionality "disaster" that arises when Chinese texts are represented with n-grams, a two-step feature selection algorithm is constructed which combines filtering rare features within a class with selecting discriminative features across classes. Moreover, the proper value of n, the feature-weighting strategy, and the correlation among features are examined in extensive experiments, contributing some useful conclusions to research on n-gram representation of Chinese texts. In view of a disadvantage of the Latent Dirichlet Allocation (LDA) model, namely the arbitrary revision of variables during smoothing, a new smoothing strategy based on Tolerance Rough Sets (TRS) is put forward: it first constructs tolerance classes over the global vocabulary and then assigns values to out-of-vocabulary (OOV) words in each class according to the tolerance class. To improve the performance of multi-label classifiers and reduce computational complexity, a new TC method based on the LDA model is applied to Chinese text representation: topics are extracted statistically from the texts, which are then represented by their topic vectors; this shows competitive performance on both English and Chinese corpora. To further enhance classifier performance in multi-label TC, a compound classification framework is proposed that partitions the text space by computing upper and lower approximations, decomposing a multi-label TC problem into several single-label TC problems and several multi-label TC problems with fewer labels than the original: an unknown text is classified by a single-label classifier when it falls into the lower approximation space of some class, and by the corresponding multi-label classifier otherwise. Finally, an application system, TJ-MLWC (Tongji Multi-label Web Classifier), was designed; it can take results from search engines directly and classify them in real time using an improved naive Bayes classifier, making browsing more convenient: users can immediately locate the texts they are interested in according to the class information given by TJ-MLWC.
From the French abstract: The thesis focuses on text classification, a rapidly expanding field with many current and potential applications. Its main contributions concern two points. First, the specifics of encoding and automatically processing the Chinese language: words may consist of one, two, or three characters; there is no typographic separation between words; and many word orders are possible within a sentence, all of which leads to difficult ambiguity problems. Encoding texts as n-grams (sequences of n = 1, 2, or 3 characters) is particularly well suited to Chinese because it is fast and requires neither prior dictionary-based word recognition nor word segmentation. Second, multi-label classification, in which each instance may be assigned to one or several classes. For texts, the classes correspond to topics, and a single text may be attached to one or more of them. The multi-label approach is more general: a single patient may suffer from several pathologies; a single company may operate in several industrial or service sectors. The thesis analyses these problems and proposes solutions, first for single-label classifiers and then for multi-label ones. Among the difficulties addressed are the definition of the variables characterizing the texts, their large number, the handling of sparse tables (many zeros in the matrix crossing texts with descriptors), and the relatively poor performance of the usual multi-class classifiers.
From the Chinese abstract: Text classification is an important research area in information science with substantial practical value. As the content handled by text classification becomes more complex and diverse and classification goals multiply, developing effective classification techniques that meet practical application needs has become a challenging task, and research on multi-label classification has emerged in response. Building on an analysis of many single-label and multi-label text classification algorithms, this thesis addresses the problems of high-dimensional features in text representation, data sparseness, and the high complexity yet low accuracy of multi-label classification, applying rough set theory from different angles. The main contributions are: a two-step feature selection method (removing rare features within classes and then selecting features across classes) to counter the dimensionality explosion caused by using n-grams as features for Chinese text, together with extensive experiments on a large Chinese corpus concerning the choice of n, feature weighting, and feature correlation; a multi-label classification algorithm based on the LDA model that uses extracted topics as text features to build an efficient classifier, which under the PT3 problem-transformation method performs well on both Chinese and English datasets and is comparable to the best published multi-label methods; a smoothing strategy for the LDA language model based on tolerance rough sets, which first constructs tolerance classes over the global vocabulary and then assigns smoothing values to out-of-vocabulary words in each document class, significantly improving classification performance on balanced and especially on imbalanced corpora; and a compound multi-label classification framework based on variable-precision rough sets, which partitions the feature space so that a multi-label problem is decomposed into several two-class single-label problems and several multi-label problems with fewer labels, improving both accuracy and efficiency. Finally, a web search result visualization system based on multi-label classification (MLWC) was designed and implemented; it takes the results returned by a search engine and classifies them in real time with an improved naive Bayes multi-label algorithm, allowing users to quickly locate the texts they are interested in.
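
Among the techniques summarized in this annotation, the step of representing documents by LDA topic distributions before multi-label classification is easy to illustrate with scikit-learn. This is a generic sketch with assumed parameters, not the thesis's Chinese pipeline; for Chinese text the vectorizer would run over character n-grams or pre-segmented tokens, as discussed above.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline

# Represent each document by its LDA topic distribution, then classify per label.
pipeline = Pipeline([
    ("counts", CountVectorizer(analyzer="char", ngram_range=(1, 2))),  # crude n-gram stand-in
    ("lda", LatentDirichletAllocation(n_components=50, random_state=0)),
    ("clf", OneVsRestClassifier(LogisticRegression(max_iter=1000))),
])
# pipeline.fit(train_docs, train_label_matrix)   # placeholders for the thesis's corpora
```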
5

Burkhardt, Sophie [author]. "Online Multi-label Text Classification using Topic Models / Sophie Burkhardt." Mainz: Universitätsbibliothek Mainz, 2018. http://d-nb.info/1173911235/34.

6

Sendur, Zeynel. "Text Document Categorization by Machine Learning." Scholarly Repository, 2008. http://scholarlyrepository.miami.edu/oa_theses/209.

Annotation:
Because of the explosion of digital and online text information, automatic organization of documents has become a very important research area. There are two main machine learning approaches to enhancing the organization of digital documents: the supervised approach, where pre-defined category labels are assigned to documents based on the likelihood suggested by a training set of labeled documents, and the unsupervised approach, where no human intervention or labeled documents are needed at any point in the process. In this thesis, we concentrate on the supervised learning task of document classification. One of the most important tasks of information retrieval is to induce classifiers capable of categorizing text documents. The same document can belong to two or more categories, a situation referred to as multi-label classification. Multi-label classification domains are encountered in diverse fields. Most existing machine learning techniques for multi-label classification are extremely expensive because the documents are characterized by an extremely large number of features. In this thesis, we try to reduce these computational costs by applying different types of algorithms to documents characterized by large numbers of features. Another concern of this thesis is to achieve the highest possible accuracy while maintaining high computational performance in text document categorization.
7

Artmann, Daniel. "Applying machine learning algorithms to multi-label text classification on GitHub issues." Thesis, Högskolan i Halmstad, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-43097.

Annotation:
This report compares five machine learning algorithms in their ability to categorize code repositories. The focus of expanding software projects tends to shift from developing new software to maintaining the projects. Maintainers can label code repositories to organize the project, but this requires manual labor and time. This report evaluates how machine learning algorithms perform in automatically classifying code repositories; automatic classification can aid the management process by reducing both manual labor and human error. GitHub provides online hosting for both private and public code repositories. In these repositories, users can open issues and assign labels to them to keep track of bugs, enhancements, or requests. GitHub was used as the source for all data, as it contains millions of open-source repositories. The focus was on the most popular labels from GitHub, both default labels and those defined by users. The report investigated the algorithms linear regression (LR), convolutional neural network (CNN), recurrent neural network (RNN), random forest (RF), and k-nearest-neighbor (KNN) in multi-label text classification. The algorithms were implemented, trained, and tested with the Keras and Scikit-learn libraries. The training sets contained around 38 thousand rows and the test set around 12 thousand rows. Cross-validation was used to measure the performance of each algorithm, with precision, recall, and F1-score as metrics. The algorithms were empirically tested on different numbers of output labels. To maximize the F1-score, different neural network designs and different natural language processing (NLP) methods were evaluated, in order to see whether the algorithms could be used to organize code repositories efficiently. CNN displayed the best scores in all experiments, but LR, RNN, and RF also showed good results. LR, CNN, and RNN had the highest F1-scores, while RF achieved particularly high precision. KNN performed much worse than all other algorithms. The highest F1-score of 46.48% was achieved with a non-sequential CNN model that used text input with stemmed words. The highest precision of 89.17% was achieved by RF. It was concluded that LR, CNN, RNN, and RF are all viable for classifying labels in software-related texts such as GitHub issues, whereas KNN was not found to be a viable candidate for this purpose.
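
The evaluation protocol described here, per-label classifiers scored with precision, recall, and F1, can be sketched generically with scikit-learn. The synthetic data and the two models below are placeholders (logistic regression standing in for "LR"), not the GitHub issue corpus or the thesis's exact configurations.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

# Synthetic stand-in for vectorized issue texts with multiple labels each.
X, Y = make_multilabel_classification(n_samples=2000, n_features=100,
                                      n_classes=10, random_state=0)
X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.25, random_state=0)

models = {
    "LR": OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
}
for name, clf in models.items():
    pred = clf.fit(X_tr, Y_tr).predict(X_te)
    print(name,
          "precision:", round(precision_score(Y_te, pred, average="micro"), 3),
          "recall:", round(recall_score(Y_te, pred, average="micro"), 3),
          "F1:", round(f1_score(Y_te, pred, average="micro"), 3))
```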
8

Li, Xin. "Multi-label Learning under Different Labeling Scenarios." Diss., Temple University Libraries, 2015. http://cdm16002.contentdm.oclc.org/cdm/ref/collection/p245801coll10/id/350482.

Annotation:
Traditional multi-class classification problems assume that each instance is associated with a single label from a category set Y where |Y| > 2. Multi-label classification generalizes multi-class classification by allowing each instance to be associated with multiple labels from Y. In many real-world data analysis problems, data objects can be assigned to multiple categories, producing multi-label classification problems. For example, an image for object categorization can be labeled as 'desk' and 'chair' simultaneously if it contains both objects, and a news article about the effect of the Olympic games on the tourism industry might belong to several categories such as 'sports', 'economy', and 'travel', since it covers multiple topics. Regardless of the approach used, multi-label learning in general requires a sufficient amount of labeled data to recover high-quality classification models. However, due to label sparsity, i.e. each instance carrying only a small number of labels from the label set Y, it is difficult to prepare sufficient well-labeled data for each class. Many approaches have been developed in the literature to overcome this challenge by exploiting label correlation or label dependency. In this dissertation, we propose a probabilistic model that captures the pairwise interaction between labels so as to alleviate the label sparsity. Besides the traditional setting, which assumes fully labeled training data, we also study multi-label learning under other scenarios. For instance, training data can be unreliable due to missing values; a conditional Restricted Boltzmann Machine (CRBM) is proposed to address this challenge. Furthermore, labeled training data can be very scarce due to the cost of labeling while unlabeled data are abundant, so we propose two novel multi-label learning algorithms in the active learning setting, one for the standard single-level problem and one for the hierarchical problem. Our empirical results on multiple multi-label data sets demonstrate the efficacy of the proposed methods.
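
As a tiny illustration of the pairwise label interactions that this dissertation models probabilistically, one can start from the empirical label co-occurrence statistics of a training set. This is a generic starting point, not the proposed model; the toy matrix is made up.

```python
import numpy as np

# Y: binary label matrix, one row per instance, one column per label (toy example).
Y = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 1]])

counts = Y.sum(axis=0)              # marginal frequency of each label
cooc = Y.T @ Y                      # pairwise co-occurrence counts
# Crude view of label dependency: conditional probability P(label j | label i).
p_cond = cooc / np.maximum(counts[:, None], 1)
print(p_cond.round(2))
```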
9

Průša, Petr. "Multi-label klasifikace textových dokumentů." Master's thesis, Vysoké učení technické v Brně, Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-412872.

Annotation:
The master's thesis deals with the automatic classification of text documents. It explains basic terms and problems of text mining, introduces term clustering together with some basic clustering algorithms, and surveys several classification methods, with a particular focus on matrix regression. An application using matrix regression for classification was designed and developed. The experiments focused on normalization and thresholding.
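
The thresholding experiments mentioned in this annotation can be illustrated independently of matrix regression: given real-valued per-label scores from any model, choose one decision threshold per label that maximizes F1 on validation data. The grid and arrays below are assumptions for the sketch.

```python
import numpy as np
from sklearn.metrics import f1_score

def tune_thresholds(scores: np.ndarray, Y_true: np.ndarray, grid=None) -> np.ndarray:
    """Pick a per-label decision threshold maximizing F1 on validation data.

    scores: (n_samples, n_labels) real-valued model outputs.
    Y_true: (n_samples, n_labels) binary ground truth.
    """
    grid = np.linspace(0.05, 0.95, 19) if grid is None else grid
    thresholds = np.zeros(scores.shape[1])
    for j in range(scores.shape[1]):
        f1s = [f1_score(Y_true[:, j], scores[:, j] >= t, zero_division=0) for t in grid]
        thresholds[j] = grid[int(np.argmax(f1s))]
    return thresholds

# Usage with placeholder validation data:
# Y_pred = test_scores >= tune_thresholds(val_scores, val_Y)
```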
10

Rios, Anthony. "Deep Neural Networks for Multi-Label Text Classification: Application to Coding Electronic Medical Records." UKnowledge, 2018. https://uknowledge.uky.edu/cs_etds/71.

Annotation:
Coding Electronic Medical Records (EMRs) with diagnosis and procedure codes is an essential task for billing, secondary data analyses, and monitoring health trends. Both speed and accuracy of coding are critical. While coding errors could lead to more patient-side financial burden and misinterpretation of a patient’s well-being, timely coding is also needed to avoid backlogs and additional costs for the healthcare facility. Therefore, it is necessary to develop automated diagnosis and procedure code recommendation methods that can be used by professional medical coders. The main difficulty with developing automated EMR coding methods is the nature of the label space. The standardized vocabularies used for medical coding contain over 10 thousand codes. The label space is large, and the label distribution is extremely unbalanced - most codes occur very infrequently, with a few codes occurring several orders of magnitude more than others. A few codes never occur in training dataset at all. In this work, we present three methods to handle the large unbalanced label space. First, we study how to augment EMR training data with biomedical data (research articles indexed on PubMed) to improve the performance of standard neural networks for text classification. PubMed indexes more than 23 million citations. Many of the indexed articles contain relevant information about diagnosis and procedure codes. Therefore, we present a novel method of incorporating this unstructured data in PubMed using transfer learning. Second, we combine ideas from metric learning with recent advances in neural networks to form a novel neural architecture that better handles infrequent codes. And third, we present new methods to predict codes that have never appeared in the training dataset. Overall, our contributions constitute advances in neural multi-label text classification with potential consequences for improving EMR coding.

Book chapters on the topic "Hierarchical Multi-label Text Classification"

1

Zhao, Rui, Xiao Wei, Cong Ding, and Yongqi Chen. "Hierarchical Multi-label Text Classification: Self-adaption Semantic Awareness Network Integrating Text Topic and Label Level Information." In Knowledge Science, Engineering and Management, 406–18. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-82147-0_33.

2

Ma, Yinglong, Jingpeng Zhao, and Beihong Jin. "A Hierarchical Fine-Tuning Approach Based on Joint Embedding of Words and Parent Categories for Hierarchical Multi-label Text Classification." In Artificial Neural Networks and Machine Learning – ICANN 2020, 746–57. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-61616-8_60.

3

Slavkov, Ivica, Jana Karcheska, Dragi Kocev, Slobodan Kalajdziski, and Sašo Džeroski. "ReliefF for Hierarchical Multi-label Classification." In New Frontiers in Mining Complex Patterns, 148–61. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-08407-7_10.

4

Ananpiriyakul, Thanawut, Piyapan Poomsirivilai, and Peerapon Vateekul. "Label Correction Strategy on Hierarchical Multi-Label Classification." In Machine Learning and Data Mining in Pattern Recognition, 213–27. Cham: Springer International Publishing, 2014. http://dx.doi.org/10.1007/978-3-319-08979-9_17.

5

Alaydie, Noor, Chandan K. Reddy, and Farshad Fotouhi. "Exploiting Label Dependency for Hierarchical Multi-label Classification." In Advances in Knowledge Discovery and Data Mining, 294–305. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012. http://dx.doi.org/10.1007/978-3-642-30217-6_25.

6

Luo, Jiayu, Junqiao Hu, Yuman Zhang, Shuihuan Ye, and Xinyi Xu. "Multi-label Classification Based on Label Hierarchical Compression." In Advances in Natural Computation, Fuzzy Systems and Knowledge Discovery, 1464–71. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-70665-4_158.

7

Hrala, Michal, and Pavel Král. "Multi-label Document Classification in Czech." In Text, Speech, and Dialogue, 343–51. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013. http://dx.doi.org/10.1007/978-3-642-40585-3_44.

8

Madjarov, Gjorgji, Vedrana Vidulin, Ivica Dimitrovski, and Dragi Kocev. "Web Genre Classification via Hierarchical Multi-label Classification." In Intelligent Data Engineering and Automated Learning – IDEAL 2015, 9–17. Cham: Springer International Publishing, 2015. http://dx.doi.org/10.1007/978-3-319-24834-9_2.

9

Stepišnik, Tomaž, and Dragi Kocev. "Hyperbolic Embeddings for Hierarchical Multi-label Classification." In Lecture Notes in Computer Science, 66–76. Cham: Springer International Publishing, 2020. http://dx.doi.org/10.1007/978-3-030-59491-6_7.

10

da Silva, Luan V. M., and Ricardo Cerri. "Feature Selection for Hierarchical Multi-label Classification." In Advances in Intelligent Data Analysis XIX, 196–208. Cham: Springer International Publishing, 2021. http://dx.doi.org/10.1007/978-3-030-74251-5_16.


Conference papers on the topic "Hierarchical Multi-label Text Classification"

1

Huang, Wei, Enhong Chen, Qi Liu, Yuying Chen, Zai Huang, Yang Liu, Zhou Zhao, Dan Zhang, and Shijin Wang. "Hierarchical Multi-label Text Classification." In CIKM '19: The 28th ACM International Conference on Information and Knowledge Management. New York, NY, USA: ACM, 2019. http://dx.doi.org/10.1145/3357384.3357885.

2

Banerjee, Siddhartha, Cem Akkaya, Francisco Perez-Sorrosal, and Kostas Tsioutsiouliklis. "Hierarchical Transfer Learning for Multi-label Text Classification." In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/p19-1633.

3

Ren, Zhaochun, Maria-Hendrike Peetz, Shangsong Liang, Willemijn van Dolen, and Maarten de Rijke. "Hierarchical multi-label classification of social text streams." In SIGIR '14: The 37th International ACM SIGIR Conference on Research and Development in Information Retrieval. New York, NY, USA: ACM, 2014. http://dx.doi.org/10.1145/2600428.2609595.

4

Aly, Rami, Steffen Remus, and Chris Biemann. "Hierarchical Multi-label Classification of Text with Capsule Networks." In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/p19-2045.

5

Baker, Simon, and Anna Korhonen. "Initializing neural networks for hierarchical multi-label text classification." In BioNLP 2017. Stroudsburg, PA, USA: Association for Computational Linguistics, 2017. http://dx.doi.org/10.18653/v1/w17-2339.

6

Liang, Xin, Dawei Cheng, Fangzhou Yang, Yifeng Luo, Weining Qian, and Aoying Zhou. "F-HMTC: Detecting Financial Events for Investment Decisions Based on Neural Hierarchical Multi-Label Text Classification." In Twenty-Ninth International Joint Conference on Artificial Intelligence and Seventeenth Pacific Rim International Conference on Artificial Intelligence (IJCAI-PRICAI-20). California: International Joint Conferences on Artificial Intelligence Organization, 2020. http://dx.doi.org/10.24963/ijcai.2020/619.

Annotation:
The share prices of listed companies in the stock trading market are prone to be influenced by various events, and performing event detection can help people identify investment risks and opportunities accompanying these events in a timely manner. Financial events inherently present hierarchical structures, which can be represented as tree-structured schemes in real-life applications, and detecting events can be modeled as a hierarchical multi-label text classification problem in which an event is assigned to a tree node with a sequence of hierarchical event category labels. Conventional hierarchical multi-label text classification methods usually ignore the hierarchical relationships existing in the event classification scheme and treat the hierarchical labels associated with an event as uniform labels, where correct and wrong label predictions are assigned equal rewards or penalties. In this paper, we propose a neural hierarchical multi-label text classification method, namely F-HMTC, for a financial application scenario with a massive number of event category labels. F-HMTC learns latent features based on bidirectional encoder representations from transformers and maps them directly to hierarchical labels with a delicate hierarchy-based loss layer. We conduct extensive experiments on a private financial dataset with elaborately annotated labels, and F-HMTC consistently outperforms state-of-the-art baselines by substantial margins. We will release both the source code and the dataset in the first author's repository.
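
The paper's hierarchy-based loss layer is not reproduced here; as a generic illustration of a hierarchy-aware objective for this kind of model, the sketch below adds a soft penalty to per-label binary cross-entropy whenever a child label is predicted more strongly than its parent. The parent_of encoding, the weight alpha, and all shapes are assumptions for the example.

```python
import torch
import torch.nn.functional as F

def hierarchy_aware_loss(logits, targets, parent_of, alpha=0.1):
    """Per-label BCE plus a soft penalty for violating parent-child consistency.

    logits, targets: (batch, num_labels); parent_of[c] is the parent index of
    label c, or -1 for root labels. Illustrative only, not the loss of F-HMTC.
    """
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    probs = torch.sigmoid(logits)
    penalty = logits.new_zeros(())
    for child, parent in enumerate(parent_of):
        if parent >= 0:
            # A child label should not be predicted more strongly than its parent.
            penalty = penalty + torch.clamp(probs[:, child] - probs[:, parent], min=0).mean()
    return bce + alpha * penalty

# Example: labels 1 and 2 are children of 0, and label 3 is a child of 1.
loss = hierarchy_aware_loss(torch.randn(8, 4),
                            torch.randint(0, 2, (8, 4)).float(),
                            parent_of=[-1, 0, 0, 1])
```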
7

Shen, Jiaming, Wenda Qiu, Yu Meng, Jingbo Shang, Xiang Ren, and Jiawei Han. "TaxoClass: Hierarchical Multi-Label Text Classification Using Only Class Names." In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA, USA: Association for Computational Linguistics, 2021. http://dx.doi.org/10.18653/v1/2021.naacl-main.335.

8

Mao, Yuning, Jingjing Tian, Jiawei Han, and Xiang Ren. "Hierarchical Text Classification with Reinforced Label Assignment." In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP). Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/d19-1042.

9

Zhang, Qiang, Bo Chai, Bochuan Song, and Jingpeng Zhao. "A Hierarchical Fine-Tuning Based Approach for Multi-label Text Classification." In 2020 IEEE 5th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA). IEEE, 2020. http://dx.doi.org/10.1109/icccbda49378.2020.9095668.

10

Liu, Liqun, Funan Mu, Pengyu Li, Xin Mu, Jing Tang, Xingsheng Ai, Ran Fu, Lifeng Wang, and Xing Zhou. "NeuralClassifier: An Open-source Neural Hierarchical Multi-label Text Classification Toolkit." In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. Stroudsburg, PA, USA: Association for Computational Linguistics, 2019. http://dx.doi.org/10.18653/v1/p19-3015.

