Dissertations on the topic "Selective classifier"
Cite the source in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations for your research on the topic "Selective classifier."
Next to every entry in the list of references you will find an "Add to bibliography" button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the publication as a .pdf file and read its abstract online, whenever these are available in the item's metadata.
Browse dissertations from a wide variety of disciplines and compile your bibliography correctly.
Sayin, Günel Burcu. "Towards Reliable Hybrid Human-Machine Classifiers." Doctoral thesis, Università degli studi di Trento, 2022. http://hdl.handle.net/11572/349843.
BOLDT, F. A. "Classifier Ensemble Feature Selection for Automatic Fault Diagnosis." Universidade Federal do Espírito Santo, 2017. http://repositorio.ufes.br/handle/10/9872.
Повний текст джерела"An efficient ensemble feature selection scheme applied for fault diagnosis is proposed, based on three hypothesis: a. A fault diagnosis system does not need to be restricted to a single feature extraction model, on the contrary, it should use as many feature models as possible, since the extracted features are potentially discriminative and the feature pooling is subsequently reduced with feature selection; b. The feature selection process can be accelerated, without loss of classification performance, combining feature selection methods, in a way that faster and weaker methods reduce the number of potentially non-discriminative features, sending to slower and stronger methods a filtered smaller feature set; c. The optimal feature set for a multi-class problem might be different for each pair of classes. Therefore, the feature selection should be done using an one versus one scheme, even when multi-class classifiers are used. However, since the number of classifiers grows exponentially to the number of the classes, expensive techniques like Error-Correcting Output Codes (ECOC) might have a prohibitive computational cost for large datasets. Thus, a fast one versus one approach must be used to alleviate such a computational demand. These three hypothesis are corroborated by experiments. The main hypothesis of this work is that using these three approaches together is possible to improve significantly the classification performance of a classifier to identify conditions in industrial processes. Experiments have shown such an improvement for the 1-NN classifier in industrial processes used as case study."
Thapa, Mandira. "Optimal Feature Selection for Spatial Histogram Classifiers." Wright State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=wright1513710294627304.
Gustafsson, Robin. "Ordering Classifier Chains using filter model feature selection techniques." Thesis, Blekinge Tekniska Högskola, Institutionen för datalogi och datorsystemteknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-14817.
Duangsoithong, Rakkrit. "Feature selection and causal discovery for ensemble classifiers." Thesis, University of Surrey, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.580345.
Ko, Albert Hung-Ren. "Static and dynamic selection of ensemble of classifiers." Thèse, Montréal : École de technologie supérieure, 2007. http://proquest.umi.com/pqdweb?did=1467895171&sid=2&Fmt=2&clientId=46962&RQT=309&VName=PQD.
Повний текст джерела"A thesis presented to the École de technologie supérieure in partial fulfillment of the thesis requirement for the degree of the Ph.D. engineering". CaQMUQET Bibliogr. : f. [237]-246. Également disponible en version électronique. CaQMUQET
McCrae, Richard. "The Impact of Cost on Feature Selection for Classifiers." Thesis, Nova Southeastern University, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=13423087.
Supervised machine learning models are increasingly being used for medical diagnosis. The diagnostic problem is formulated as a binary classification task in which trained classifiers make predictions based on a set of input features. In diagnosis, these features are typically procedures or tests with associated costs. The cost of applying a trained classifier for diagnosis may be estimated as the total cost of obtaining values for the features that serve as inputs for the classifier. Obtaining classifiers based on a low cost set of input features with acceptable classification accuracy is of interest to practitioners and researchers. What makes this problem even more challenging is that costs associated with features vary with patients and service providers and change over time.
This dissertation aims to address this problem by proposing a method for obtaining low cost classifiers that meet specified accuracy requirements under dynamically changing costs. Given a set of relevant input features and accuracy requirements, the goal is to identify all qualifying classifiers based on subsets of the feature set. Then, for any arbitrary costs associated with the features, the cost of the classifiers may be computed and candidate classifiers selected based on cost-accuracy tradeoff. Since the number of relevant input features k tends to be large for typical diagnosis problems, training and testing classifiers based on all 2^k − 1 possible non-empty subsets of features is computationally prohibitive. Under the reasonable assumption that the accuracy of a classifier is no lower than that of any classifier based on a subset of its input features, this dissertation aims to develop an efficient method to identify all qualifying classifiers.
This study used two types of classifiers—artificial neural networks and classification trees—that have proved promising for numerous problems as documented in the literature. The approach was to measure the accuracy obtained with the classifiers when all features were used. Then, reduced accuracy thresholds were established arbitrarily that could be satisfied by subsets of the complete feature set. Threshold values for three measures—true positive rate, true negative rate, and overall classification accuracy—were considered for the classifiers. Two cost functions were used for the features: one used unit costs and the other random costs. Additional manipulation of costs was also performed.
The order in which features were removed was found to have a material impact on the effort required (removing the most important features first was most efficient, removing the least important features first was least efficient). The accuracy and cost measures were combined to produce a Pareto-Optimal Frontier. There were consistently few elements on this Frontier. At most 15 subsets were on the Frontier even when there were hundreds of thousands of acceptable feature sets. Most of the computational time is taken for training and testing the models. Given costs, models in the Pareto-Optimal Frontier can be efficiently identified and the models may be presented to decision makers. Both the Neural Networks and the Decision Trees performed in a comparable fashion suggesting that any classifier could be employed.
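The Pareto-Optimal Frontier mentioned above can be extracted from already-evaluated subsets with a simple dominance check. A small illustration follows; the subset names, costs, and accuracies are invented for the example.

```python
# Hypothetical sketch: keep only (cost, accuracy) points not dominated by any other,
# i.e. no other subset is both cheaper (or equal) and at least as accurate, with one strict.
def pareto_frontier(subsets):
    """subsets: list of (name, cost, accuracy). Returns the non-dominated ones."""
    frontier = []
    for name, cost, acc in subsets:
        dominated = any(
            (c <= cost and a >= acc) and (c < cost or a > acc)
            for _, c, a in subsets
        )
        if not dominated:
            frontier.append((name, cost, acc))
    return sorted(frontier, key=lambda t: t[1])

candidates = [("S1", 10.0, 0.91), ("S2", 4.0, 0.88), ("S3", 4.0, 0.85), ("S4", 12.0, 0.90)]
print(pareto_frontier(candidates))  # S2 and S1 survive; S3 and S4 are dominated
```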
McCrae, Richard Clyde. "The Impact of Cost on Feature Selection for Classifiers." Diss., NSUWorks, 2018. https://nsuworks.nova.edu/gscis_etd/1057.
Pinagé, Felipe Azevedo. "Handling Concept Drift Based on Data Similarity and Dynamic Classifier Selection." Universidade Federal do Amazonas, 2017. http://tede.ufam.edu.br/handle/tede/5956.
FAPEAM - Fundação de Amparo à Pesquisa do Estado do Amazonas
In real-world applications, machine learning algorithms can be employed to perform spam detection, environmental monitoring, fraud detection, web click stream, among others. Most of these problems present an environment that changes over time due to the dynamic generation process of the data and/or due to streaming data. The problem involving classification tasks of continuous data streams has become one of the major challenges of the machine learning domain in the last decades because, since data is not known in advance, it must be learned as it becomes available. In addition, fast predictions about data should be performed to support often real time decisions. Currently in the literature, methods based on accuracy monitoring are commonly used to detect changes explicitly. However, these methods may become infeasible in some real-world applications especially due to two aspects: they may need human operator feedback, and may depend on a significant decrease of accuracy to be able to detect changes. In addition, most of these methods are also incremental learning-based, since they update the decision model for every incoming example. However, this may lead the system to unnecessary updates. In order to overcome these problems, in this thesis, two semi-supervised methods based on estimating and monitoring a pseudo error are proposed to detect changes explicitly. The decision model is updated only after changing detection. In the first method, the pseudo error is calculated using similarity measures by monitoring the dissimilarity between past and current data distributions. The second proposed method employs dynamic classifier selection in order to improve the pseudo error measurement. As a consequence, this second method allows classifier ensemble online self-training. The experiments conducted show that the proposed methods achieve competitive results, even when compared to fully supervised incremental learning methods. The achievement of these methods, especially the second method, is relevant since they lead change detection and reaction to be applicable in several practical problems reaching high accuracy rates, where usually is not possible to generate the true labels of the instances fully and immediately after classification.
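A rough sketch of the first method's idea described above: monitor how dissimilar the current data window is from a reference window, without using true labels, and flag a drift when a threshold is crossed. The Kolmogorov-Smirnov statistic and the threshold value are stand-ins, not the similarity measures used in the thesis.

```python
# Hypothetical sketch: unsupervised drift detection by comparing data distributions.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=500)   # data seen when the model was trained
threshold = 0.15                             # assumed dissimilarity tolerance

def drift_detected(window, reference, threshold):
    stat, _ = ks_2samp(window, reference)    # dissimilarity between the two samples
    return stat > threshold

print(drift_detected(rng.normal(0.0, 1.0, 300), reference, threshold))  # False: same concept
print(drift_detected(rng.normal(1.5, 1.0, 300), reference, threshold))  # True: shifted concept
```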
デイビッド, ア., and David Ha. "Boundary uncertainty-based classifier evaluation." Thesis, https://doors.doshisha.ac.jp/opac/opac_link/bibid/BB13128126/?lang=0, 2019. https://doors.doshisha.ac.jp/opac/opac_link/bibid/BB13128126/?lang=0.
We propose a general method that enables accurate evaluation of any classifier model for realistic tasks, both in a theoretical sense despite the finiteness of the available data, and in a practical sense in terms of computation costs. The classifier evaluation challenge arises from the bias of the classification error estimate that is only based on finite data. We bypass this existing difficulty by proposing a new classifier evaluation measure called "boundary uncertainty" whose estimate based on finite data can be considered a reliable representative of its expectation based on infinite data, and demonstrate the potential of our approach on three classifier models and thirteen datasets.
Doctor of Philosophy in Engineering
Doshisha University
Miranda, Dos Santos Eulanda. "Static and dynamic overproduction and selection of classifier ensembles with genetic algorithms." Mémoire, École de technologie supérieure, 2008. http://espace.etsmtl.ca/110/1/MIRANDA_DOS_SANTOS_Eulanda.pdf.
Повний текст джерелаChrysostomou, Kyriacos. "The role of classifiers in feature selection : number vs nature." Thesis, Brunel University, 2008. http://bura.brunel.ac.uk/handle/2438/3038.
Повний текст джерелаOliveira, e. Cruz Rafael Menelau. "Methods for dynamic selection and fusion of ensemble of classifiers." Universidade Federal de Pernambuco, 2011. https://repositorio.ufpe.br/handle/123456789/2436.
Повний текст джерелаFaculdade de Amparo à Ciência e Tecnologia do Estado de Pernambuco
An Ensemble of Classifiers (EoC) is an alternative for achieving high recognition rates in pattern recognition systems. The use of ensembles is motivated by the fact that different classifiers recognize different patterns and are therefore complementary. In this work, EoC methodologies are explored with the aim of improving the recognition rate on different problems. The character recognition problem is addressed first. The work proposes a new methodology that uses multiple feature extraction techniques, each based on a different approach (edges, gradients, projections). Each technique is treated as a sub-problem with its own classifier, and the outputs of these classifiers are used as the input of a new classifier trained to combine (fuse) the results. Experiments show that the proposal achieved the best results in the literature for both digit recognition and letter recognition. The second part of the dissertation deals with dynamic classifier selection (DCS). This strategy is motivated by the fact that not every classifier in the ensemble is an expert for every test pattern. Dynamic selection tries to select only the classifiers that perform best in a given region close to the input pattern in order to classify it. A study of the behavior of DCS techniques shows that they are limited by the quality of the region around the input pattern. Based on this analysis, two dynamic classifier selection techniques are proposed. The first uses filters to reduce noise close to the test pattern. The second is a new proposal that extracts different kinds of information from the behavior of the classifiers and uses this information to decide whether or not a classifier should be selected. Experiments conducted on several pattern recognition problems show that the proposed techniques yield a significant performance improvement.
Almeida, Paulo Ricardo Lisboa de. "Adapting the dynamic selection of classifiers approach for concept drift scenarios." reponame:Repositório Institucional da UFPR, 2017. http://hdl.handle.net/1884/52771.
Co-advisors: Alceu de Souza Britto Jr.; Robert Sabourin
Doctoral thesis - Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defended: Curitiba, 09/11/2017.
Includes references: f. 143-154.
Abstract: Many environments may suffer from distributions or a posteriori probabilities changes over time, leading to a phenomenon known as concept drift. In these scenarios, it is crucial to implement a mechanism to adapt the classification system to the environment changes in order to minimize any accuracy loss. Under a static environment, a popular approach consists in using a Dynamic Classifier Selection (DCS)-based method to select a custom classifier/ensemble for each test instance according to its neighborhood in a validation set, where the selection can be considered region-dependent. In order to handle concept drifts, in this work the general idea of the DCS method is extended to be also time-dependent. Through this time-dependency, it is demonstrated that most neighborhood DCS-based methods can be adapted to handle concept drift scenarios and take advantage of the region-dependency, since classifiers trained under previous concepts may still be competent in some regions of the feature space. The time-dependency for the DCS methods is defined according to the concept drift nature, which may define if the changes affects the a posteriori probabilities or the distributions only. By taking the necessary modifications, the Dynse framework is proposed in this work as a modular tool capable of adapting the DCS approach to concept drift scenarios. A default configuration for the Dynse framework is proposed and an experimental protocol, containing seven well-known DCS methods and 12 concept drift problems with different properties, shows that the DCS approach can adapt to different concept drift scenarios. When compared to state-of-the-art concept drift methods, the DCS-based approach comes out ahead in terms of stability, i.e., it performs well in most cases, and requires almost no parameter tuning. Key-words: Pattern Recognition. Concept Drift. Virtual Concept Drift. Real Concept Drift. Ensemble. Dynamic Classifier Selection. Local Accuracy.
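For readers unfamiliar with region-dependent selection, here is a minimal OLA-style sketch of dynamic classifier selection (not the Dynse framework itself): each test instance is classified by the pool member with the highest accuracy among its nearest neighbors in a validation set. In a drift setting, the pool could hold classifiers trained on different past batches; here a single bootstrapped pool is used for illustration.

```python
# Hypothetical sketch of dynamic classifier selection by local (neighborhood) accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1500, n_features=10, random_state=0)
Xtr, Xrest, ytr, yrest = train_test_split(X, y, test_size=0.5, random_state=0)
Xval, Xte, yval, yte = train_test_split(Xrest, yrest, test_size=0.5, random_state=0)

# Pool of base classifiers (trees trained on bootstrap samples of the training data).
rng = np.random.default_rng(0)
pool = []
for _ in range(10):
    idx = rng.integers(0, len(Xtr), len(Xtr))
    pool.append(DecisionTreeClassifier(max_depth=5).fit(Xtr[idx], ytr[idx]))

knn = NearestNeighbors(n_neighbors=7).fit(Xval)
val_preds = np.array([clf.predict(Xval) for clf in pool])   # shape (n_classifiers, n_val)

preds = []
for x in Xte:
    neigh = knn.kneighbors(x.reshape(1, -1), return_distance=False)[0]
    local_acc = (val_preds[:, neigh] == yval[neigh]).mean(axis=1)
    best = int(np.argmax(local_acc))            # classifier most competent in this region
    preds.append(pool[best].predict(x.reshape(1, -1))[0])
print("accuracy:", (np.array(preds) == yte).mean())
```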
Samet, Asma. "Classifier ensemble under the belief function framework." Thesis, Artois, 2018. http://www.theses.fr/2018ARTO0203.
The work presented in this thesis concerns the construction of ensemble classifiers for addressing uncertain data, precisely data with evidential attributes. We start by developing new machine learning classifiers within an evidential environment and then tackle the ensemble construction process, which follows two important steps: base individual classifier selection and classifier combination. Regarding the selection step, diversity between the base individual classifiers is one of the important criteria impacting the ensemble performance, and it can be achieved by training the base classifiers on diverse feature subspaces. Thus, we propose a novel framework for feature subspace extraction from data with evidential attributes. We mainly rely on rough set theory for identifying all possible minimal feature subspaces, called reducts, allowing the same discrimination as the whole feature set. Then, we develop three methods enabling the selection of the most suitable diverse reducts for an ensemble of evidential classifiers. The proposed reduct selection methods are evaluated according to several assessment criteria and the best one is used for selecting the best individual classifiers. Concerning the integration level, we propose to select the most appropriate combination operator among some well-known ones, including the Dempster, the cautious and the optimized t-norm based rules.
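As background for the combination step mentioned above, a small illustration of Dempster's rule of combination for two mass functions over the same frame of discernment; the frame and the mass values are invented for the example.

```python
# Hypothetical sketch of Dempster's rule: combine two basic belief assignments defined
# over subsets (frozensets) of the same frame of discernment.
from itertools import product

def dempster_combine(m1, m2):
    combined, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y  # mass that falls on the empty set
    if conflict >= 1.0:
        raise ValueError("Totally conflicting evidence")
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

A, B = frozenset({"fraud"}), frozenset({"legit"})
m1 = {A: 0.6, A | B: 0.4}          # first classifier's evidence
m2 = {A: 0.3, B: 0.5, A | B: 0.2}  # second classifier's evidence
print(dempster_combine(m1, m2))
```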
Haning, Jacob M. "Feature Selection for High-Dimensional Individual and Ensemble Classifiers with Limited Data." University of Cincinnati / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1406810947.
Повний текст джерелаAla'raj, Maher A. "A credit scoring model based on classifiers consensus system approach." Thesis, Brunel University, 2016. http://bura.brunel.ac.uk/handle/2438/13669.
Повний текст джерелаMuhammad, Hanif Shehzad. "Feature selection and classifier combination: Application to the extraction of textual information in scene images." Paris 6, 2009. http://www.theses.fr/2009PA066521.
Повний текст джерелаAl-Ani, Ahmed Karim. "An improved pattern classification system using optimal feature selection, classifier combination, and subspace mapping techniques." Thesis, Queensland University of Technology, 2002.
Lima, Tiago Pessoa Ferreira de. "An authomatic method for construction of multi-classifier systems based on the combination of selection and fusion." Universidade Federal de Pernambuco, 2013. https://repositorio.ufpe.br/handle/123456789/12457.
In this dissertation, we present a methodology that aims the automatic construction of multi-classifiers systems based on the combination of selection and fusion. The presented method initially finds an optimum number of clusters for training data set and subsequently determines an ensemble for each cluster found. For model evaluation, the testing data set are submitted to clustering techniques and the nearest cluster to data input will emit a supervised response through its associated ensemble. Self-organizing maps were used in the clustering phase and multilayer perceptrons were used in the classification phase. Adaptive differential evolution has been used in this work in order to optimize the parameters and performance of the different techniques used in the classification and clustering phases. The proposed method, called SFJADE - Selection and Fusion (SF) via Adaptive Differential Evolution (JADE), has been tested on data compression of signals generated by artificial nose sensors and well-known classification problems, including cancer, card, diabetes, glass, heart, horse, soybean and thyroid. The experimental results have shown that the SFJADE method has a better performance than some literature methods while significantly outperforming most of the methods commonly used to construct Multi-Classifier Systems.
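A simplified, hypothetical sketch of the cluster-then-ensemble idea described above: KMeans stands in for the self-organizing map, a small bagged MLP ensemble is trained per cluster, each test point is routed to the ensemble of its nearest cluster, and the JADE parameter tuning is omitted.

```python
# Hypothetical sketch: cluster the training data, train one ensemble per cluster,
# and answer each test query with the ensemble of the nearest cluster.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(Xtr)
ensembles = {}
for c in range(km.n_clusters):
    idx = km.labels_ == c
    if len(np.unique(ytr[idx])) < 2:      # degenerate cluster: remember its single label
        ensembles[c] = int(ytr[idx][0])
        continue
    base = make_pipeline(StandardScaler(), MLPClassifier(max_iter=1000, random_state=0))
    ensembles[c] = BaggingClassifier(base, n_estimators=5, random_state=0).fit(Xtr[idx], ytr[idx])

def route(x, c):
    model = ensembles[c]
    return model if isinstance(model, int) else model.predict(x.reshape(1, -1))[0]

pred = np.array([route(x, c) for c, x in zip(km.predict(Xte), Xte)])
print("accuracy:", (pred == yte).mean())
```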
Ganapathy, Priya. "Development and Evaluation of a Flexible Framework for the Design of Autonomous Classifier Systems." Wright State University / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=wright1261335392.
Повний текст джерелаFaria, Fabio Augusto 1983. "A framework for pattern classifier selection and fusion = Um arcabouço para seleção e fusão de classificadores de padrão." [s.n.], 2014. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275503.
Повний текст джерелаTese (doutorado) - Universidade Estadual de Campinas, Instituto de Computação
Abstract: The frequent growth of visual data, either by countless available monitoring video cameras or the popularization of mobile devices that allow each person to create, edit, and share their own images and videos have contributed enormously to the so called ''big-data revolution''. This shear amount of visual data gives rise to a Pandora box of new visual classification problems never imagined before. Image and video classification tasks have been inserted in different and complex applications and the use of machine learning-based solutions has become the most popular approach to several applications. Notwithstanding, there is no silver bullet that solves all the problems, i.e., it is not possible to characterize all images of different domains with the same description method nor is it possible to use the same learning method to achieve good results in any kind of application. In this thesis, we aim at proposing a framework for classifier selection and fusion. Our method seeks to combine image characterization and learning methods by means of a meta-learning approach responsible for assessing which methods contribute more towards the solution of a given problem. The framework uses three different strategies of classifier selection which pinpoints the less correlated, yet effective, classifiers through a series of diversity measure analysis. The experiments show that the proposed approaches yield comparable results to well-known algorithms from the literature on many different applications but using less learning and description methods as well as not incurring in the curse of dimensionality and normalization problems common to some fusion techniques. Furthermore, our approach is able to achieve effective classification results using very reduced training sets
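One of the diversity measures such a selection strategy can rely on is the pairwise disagreement rate; a tiny illustration with invented predictions follows.

```python
# Hypothetical sketch: pairwise disagreement, the fraction of validation samples on which
# two classifiers predict differently; redundant (low-disagreement) pairs can be pruned.
import numpy as np

def disagreement(pred_a, pred_b):
    pred_a, pred_b = np.asarray(pred_a), np.asarray(pred_b)
    return float((pred_a != pred_b).mean())

# predictions of three classifiers on the same validation samples
p1 = [0, 1, 1, 0, 1, 0]
p2 = [0, 1, 0, 0, 1, 1]
p3 = [0, 1, 1, 0, 1, 0]
print(disagreement(p1, p2))  # 0.33...: a more diverse, less correlated pair
print(disagreement(p1, p3))  # 0.0    : redundant pair, a candidate for removal
```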
Doctorate
Computer Science
Doctor of Computer Science
Zoghi, Zeinab. "Ensemble Classifier Design and Performance Evaluation for Intrusion Detection Using UNSW-NB15 Dataset." University of Toledo / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1596756673292254.
Повний текст джерелаZhang, Qing Frankowski Ralph. "An empirical evaluation of the random forests classifier models for variable selection in a large-scale lung cancer case-control study /." See options below, 2006. http://proquest.umi.com/pqdweb?did=1324365481&sid=1&Fmt=2&clientId=68716&RQT=309&VName=PQD.
Повний текст джерелаHe, Jeannie. "Automatic Diagnosis of Parkinson’s Disease Using Machine Learning : A Comparative Study of Different Feature Selection Algorithms, Classifiers and Sampling Methods." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301616.
As one of the world's most common diseases, and one that tends to lead to disability, Parkinson's disease has long been a focus of research. To ensure that as many people as possible receive treatment before it is too late, several studies have proposed algorithms for the automatic diagnosis of Parkinson's disease. While every classifier seems to have been outperformed by another classifier in at least one study, there appears to be no study of how well different classifiers work with a given combination of feature selection algorithm and sampling method. Moreover, there appears to be no study in which the results of the proposed feature selection algorithm and/or sampling method are compared with the results of applying the classifier directly to the data without any feature selection or resampling. This leaves the question of which combination of classifier, feature selection algorithm, and sampling method to choose, and whether it is worth using a feature selection algorithm and an oversampling method at all. Given the importance of detecting Parkinson's disease quickly and accurately, a comparison was carried out to find the best combination of classifier, feature selection algorithm, and sampling method for the automatic diagnosis of Parkinson's disease.
Marin, Rodenas Alfonso. "Comparison of Automatic Classifiers’ Performances using Word-based Feature Extraction Techniques in an E-government setting." Thesis, KTH, Skolan för informations- och kommunikationsteknik (ICT), 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-32363.
Повний текст джерелаMilne, Linda Computer Science & Engineering Faculty of Engineering UNSW. "Machine learning for automatic classification of remotely sensed data." Publisher:University of New South Wales. Computer Science & Engineering, 2008. http://handle.unsw.edu.au/1959.4/41322.
Повний текст джерелаWatts-Willis, Tristan A. "Autonomous model selection for surface classification via unmanned aerial vehicle." Scholarly Commons, 2017. https://scholarlycommons.pacific.edu/uop_etds/224.
Повний текст джерелаWong, Kwok Wai Johnny. "Development of selection evaluation and system intelligence analytic models for the intelligent building control systems." Thesis, The Hong Kong Polytechnic University, 2007. https://eprints.qut.edu.au/20343/1/c20343.pdf.
Повний текст джерелаDuan, Cheng. "Imbalanced Data Classification with the K-Closest Resemblance Classifier for Remote Sensing and Social Media Texts". Thesis, Université d'Ottawa / University of Ottawa, 2020. http://hdl.handle.net/10393/41424.
Повний текст джерелаChang, Liang-Hao, and 張良豪. "Improving the performance of Naive Bayes Classifier by using Selective Naive Bayesian Algorithm and Prior Distributions." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/92613736217287175606.
Повний текст джерела國立成功大學
工業與資訊管理學系碩博士班
97
Naive Bayes classifiers have been widely used for data classification because of its computational efficiency and competitive accuracy. When all attributes are employed for classification, the accuracy of the naive Bayes classifier is generally affected by noisy attributes. A mechanism for attribute selection should be considered for improving its prediction accuracy. Selective naive Bayesian method is a very successful approach for removing noisy and/or redundant attributes. In addition, attributes are generally assumed to have prior distributions, such as Dirichlet or generalized Dirichlet distributions, for achieving a higher prediction accuracy. Many studies have proposed the methods for finding the best priors for attributes, but none of them takes attribute selection into account. Thus, this thesis proposes two models for combining prior distribution and feature selection together for increasing the accuracy of the naive Bayes classifier. Model I finds out the best prior for each attribute after all attributes have been determined by the selective naive Bayesian algorithm. Model II finds the best prior of the newest attribute determined by the selective naive Bayesian algorithm when all predecessors of the newest attribute have their best priors. The experimental result on 17 data sets form UCI data repository shows that Model I with the general Dirichlet prior generally and consistently achieves a higher classification accuracy.
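A hedged sketch of the selective naive Bayes idea underlying the models above: attributes are added greedily while cross-validated accuracy improves, and the per-attribute counts are smoothed with a symmetric Dirichlet (Laplace) prior via the alpha parameter. The synthetic data and the use of scikit-learn's CategoricalNB are assumptions for illustration, not the thesis's implementation.

```python
# Hypothetical sketch: greedy forward attribute selection for a smoothed naive Bayes.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import CategoricalNB

rng = np.random.default_rng(0)
n, n_attrs, n_cats = 600, 8, 4
X = rng.integers(0, n_cats, size=(n, n_attrs))
y = (X[:, 0] + X[:, 1] + rng.integers(0, 2, n) > 4).astype(int)  # only two attributes matter

def cv_acc(cols):
    nb = CategoricalNB(alpha=1.0, min_categories=n_cats)  # alpha = symmetric Dirichlet prior
    return cross_val_score(nb, X[:, cols], y, cv=5).mean()

selected, best = [], 0.0
while True:
    scores = {a: cv_acc(selected + [a]) for a in range(n_attrs) if a not in selected}
    if not scores:
        break
    attr, score = max(scores.items(), key=lambda kv: kv[1])
    if score <= best:
        break
    selected, best = selected + [attr], score
print("selected attributes:", selected, "cv accuracy:", round(best, 3))
```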
Bonnie, Ching-yi Lin. "A Production Experiment of Mandarin Classifier Selection." 2001. http://www.cetd.com.tw/ec/thesisdetail.aspx?etdun=U0021-2603200719114250.
Повний текст джерелаLin, Bonnie Ching-yi, and 林靜怡. "A Production Experiment of Mandarin Classifier Selection." Thesis, 2002. http://ndltd.ncl.edu.tw/handle/35300853562939351707.
Повний текст джерела國立臺灣師範大學
英語研究所
90
ABSTRACT The Chinese classifier system has always been an intriguing and interesting topic under discussion. In this study, we focus on the classifier selection of Mandarin Chinese speakers. We discuss three potential factors underlying the Mandarin classifier selection─semantic relation between classifiers and the following nouns, the syntactic environment where classifiers occur, and physical traits of the target objects. In the first part of the study, we specifically examine classifiers after numerals. The results indicate that when the semantic content of a particular classifier is close to the following noun, this classifier is more likely to be preserved. As the semantic relation between a noun and a classifier gets more distant, the classifier tends to be either neutralized to a general classifier ge or substituted to another specific classifier which has certain overlapping semantic feature with the original classifier. The second part of this thesis deals with classifier selection after demonstratives. We compare the neutralization of classifiers after numerals (with the data we obtained in the first part of the study) and demonstratives. The result shows that classifiers occurring after demonstratives are neutralized more often than those after numerals, and the difference reaches statistic significance. The last part of this research investigates the conceptual mechanism underlying classifier selection. With the change of physical traits of the same target objects, we expect subjects to react differently and choose different classifiers according to the most salient perceptual feature of the two pictures (of the same target object). However, the result is not as expected. Subjects seem not to be influenced by the change of shapes, sizes, etc., they nevertheless tend to choose the classifier in their lexicon that collocates with a particular noun most frequently. That is, collocation frequency seems to play a bigger role than conception in classifier selection.
Li, Jia-ling, and 李佳玲. "ACO-based Feature Selection and Classifier Parameter Optimization." Thesis, 2008. http://ndltd.ncl.edu.tw/handle/943bta.
Повний текст джерела國立高雄第一科技大學
資訊管理所
96
Support Vector Machines (SVM) are one of the newer techniques for pattern classification. The kernel parameter settings used when training an SVM affect the classification accuracy, and a proper feature subset can also improve classification efficiency and accuracy. This study hybridizes the SVM with ant colony optimization (ACO) to simultaneously optimize the kernel parameters and the feature subset without degrading the classification accuracy. Feature importance and pheromone information are used to determine the transition probability, while the classification accuracy and the feature weight vector provided by the SVM classifier are both used to update the pheromone information. Experimental results on five datasets showed that the proposed approach can successfully reduce the data dimensionality and maintain classification accuracy.
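A heavily simplified, hypothetical sketch of the ACO-plus-SVM loop described above: each "ant" samples a feature subset with probability biased by per-feature pheromone, the subset is scored by cross-validated SVM accuracy, and the pheromone of the best subset is reinforced. The update rule, ant count, and dataset are illustrative choices, not the thesis's actual design.

```python
# Hypothetical sketch: pheromone-biased feature subset search scored by an SVM.
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
rng = np.random.default_rng(0)
n_feat, pheromone, evaporation = X.shape[1], np.ones(X.shape[1]), 0.2

def score(mask):
    model = make_pipeline(StandardScaler(), SVC(C=1.0, gamma="scale"))
    return cross_val_score(model, X[:, mask], y, cv=5).mean()

best_mask = np.ones(n_feat, dtype=bool)
best_score = score(best_mask)
for _ in range(10):                      # iterations
    for _ in range(8):                   # ants per iteration
        prob = pheromone / pheromone.sum()
        mask = rng.random(n_feat) < prob * n_feat * 0.5   # pheromone-biased inclusion
        if mask.sum() == 0:
            continue
        s = score(mask)
        if s > best_score:
            best_mask, best_score = mask, s
    pheromone = (1 - evaporation) * pheromone
    pheromone[best_mask] += best_score   # reinforce features in the best subset so far
print("selected:", np.flatnonzero(best_mask), "cv accuracy:", round(best_score, 3))
```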
Wu, Pei-Tzu, and 吳珮慈. "Cancer Classifier – Feature Selection and Gene Feature Combinations." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/77421859242670457770.
Повний текст джерела國立交通大學
資訊科學與工程研究所
98
Breast cancer is the main cause of death for women. Many researchers dedicate to the investigation of cancer classifications, attempting to find malignant tumors and directing therapies in early stages. Therefore, we used feature selection methods and ensemble classifier models to identify and predict on breast cancer classifications. The diagnostic data of breast cancer provide informative and significant knowledge for cancerous classifications. Thus, we apply feature selection technique to retrieve and rank the importance of attributes. Use the attributes we obtained to classify by diversifying of attribute combinations. The study used breast cancer datasets, K-nearest neighbor, Quadratic Classifier, Support Vector Machine classification of individual classifier, ensemble models and combined model to classify. The goal is to construct an efficient classification model to improve the performance of accuracy and to obtain the most significant features identifying the malignant breast cancer.
Yu, Cheng-Lin, and 余晟麟. "A Generalized Image Classifier based on Feature Selection." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/64867408414754377176.
Повний текст джерела國立臺灣師範大學
資訊工程學系
103
Establishing an image classification system traditionally requires a series of complex procedures, including collecting training samples, feature extraction, training model and accuracy analysis. In general terms, the established image classification system should only be used to identify images of specific topics. The reason is that the system can apply the characteristics of knowledge within a specific image domain to train a model, which leads to higher accuracy. Most of the image classification methods of the earlier studies focus on specific domains, and the proposed method of the current research is otherwise that we do not specify the image domain in advance, while the image classification system can still be established. Regarding the actual application, it is not easy to collect the training images, and therefore the provided training samples are insufficient. We have built an image classifier with a small number of training samples and extracted numerous features of every variety. By so doing, the classifier is equipped with the ability to present images of different topics. To create a general classifier that can function without the need to identify a certain image domain, SVM classifier and F-score feature selection method are combined, and within the field of image classification, a specific feature has been selected to satisfy facilitate the classification tasks.
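A small sketch of F-score feature ranking followed by an SVM, in the spirit of the combination described above; one common binary F-score formulation is used, and the dataset is a stand-in rather than the image features from the thesis.

```python
# Hypothetical sketch: rank features by a binary F-score, keep the top ones, train an SVM.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
pos, neg = X[y == 1], X[y == 0]

def f_score(col_pos, col_neg, col_all):
    num = (col_pos.mean() - col_all.mean()) ** 2 + (col_neg.mean() - col_all.mean()) ** 2
    den = col_pos.var(ddof=1) + col_neg.var(ddof=1)
    return num / den

scores = np.array([f_score(pos[:, j], neg[:, j], X[:, j]) for j in range(X.shape[1])])
top = np.argsort(scores)[::-1][:10]          # keep the ten highest-scoring features
svm = make_pipeline(StandardScaler(), SVC(gamma="scale"))
print("cv accuracy on selected features:",
      round(cross_val_score(svm, X[:, top], y, cv=5).mean(), 3))
```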
Skalak, David Bingham. "Prototype selection for composite nearest neighbor classifiers." 1997. https://scholarworks.umass.edu/dissertations/AAI9737585.
Повний текст джерелаHwang, Cheng-wei, and 黃政偉. "A Neural Network Document Classifier with Linguistic Feature Selection." Thesis, 1999. http://ndltd.ncl.edu.tw/handle/11955938873619510783.
Повний текст джерела國立臺灣科技大學
電子工程系
87
In this paper, a neural network document classifier with linguistic feature selection and multi-category output is presented. The proposed classifier is capable of classifying documents that are unstructured and contain linguistic description. It consists of a feature selection unit and a hierarchical neural network classification unit. In feature selection unit, we extract terms from original documents by text processing, then we analyze the conformity and uniformity of each term by entropy function which is characterized to measure the significance. Terms with high significance will be selected as input features for the following classifiers. To reduce the input dimension, we perform a mechanism to merge synonyms. According to the uniformity analysis, we obtain a term similarity matrix by fuzzy relation operation and then construct a synonym thesaurus. As a result, synonyms can be grouped. In hierarchical neural network classification unit, we adopt the well-known back-propagation model to build this proper hierarchical classification unit. In our experiment, a product description database from an electronic commercial company is employed. The classification results have achieved a sufficient accuracy to aid artificial classification effectively; therefore, much manpower and working time can be saved.
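A toy sketch of the entropy-based selection idea described above: a term whose occurrences concentrate in few categories has low entropy across categories and is kept as a discriminative feature. The four tiny documents are invented.

```python
# Hypothetical sketch: rank terms by the entropy of their distribution over categories.
import math
from collections import Counter, defaultdict

docs = [("cheap usb cable", "electronics"), ("usb charger adapter", "electronics"),
        ("cotton summer dress", "clothing"), ("cheap denim dress", "clothing")]

term_cat = defaultdict(Counter)
for text, cat in docs:
    for term in set(text.split()):
        term_cat[term][cat] += 1

def entropy(counter):
    total = sum(counter.values())
    return -sum((c / total) * math.log2(c / total) for c in counter.values())

ranked = sorted(term_cat, key=lambda t: entropy(term_cat[t]))
print(ranked)  # category-specific terms like 'usb' and 'dress' come before the neutral 'cheap'
```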
Tan, Chia-Chen, and 談家珍. "An Intelligent Web-Page Classifier with Fair Feature-Subnet Selection." Thesis, 2000. http://ndltd.ncl.edu.tw/handle/80632212675707369174.
Повний текст джерела國立臺灣科技大學
電子工程系
88
The explosion of on-line information has given rise to many manually constructed topic hierarchies (such as Yahoo!!). But with the current growth rate in the amount of information manual classification in topic hierarchies creates an immense information bottleneck. Therefore, developing an automatic classifier is an urgent need. However, the classifiers suffer from the enormous dimensionality, since the dimensionality is determined by the number of distinct keywords in a document corpus. More seriously, most classifiers are either working slowly or they are constructed subjectively without learning ability. In this thesis, we address these problems with a novel evaluation function and an adaptive fuzzy learning network (AFLN). First, to reduce the enormous dimensionality, we employ a novel evaluation function to be used in feature subnet selection algorithm. Further, we develop AFLN for classifying new documents into existing manually generated hierarchies. In contract to approaches, the evaluation function is sound theoretical, give equal treatment to each category and has ability to identify both positive and negative features. On the other hand, the AFLN provide extremely fast training and testing time and, more importantly, it has leaning ability to learn the human knowledge. In short, our methods will allow large amounts of information to be organized and presented to users in a comprehensible way. By alleviating the information bottleneck, we hope to help users with the problems of information access on the
Wang, Ding-En, and 王鼎恩. "Features Selection and GMM Classifier for Multi-Pose Face Recognition." Thesis, 2015. http://ndltd.ncl.edu.tw/handle/03408850317662581389.
Повний текст джерела國立東華大學
資訊工程學系
103
Face recognition is widely used in security application, such as homeland security, video surveillance, law enforcement, and identity management. However, there are still some problems in face recognition system. The main problems include the light changes, facial expression changes, pose variations and partial occlusion. Although many face recognition approaches reported satisfactory performance, their successes are limited to the conditions of controlled environment. In fact, pose variation has been identified as one of the most current problems in the real world. Therefore, many algorithms focusing on how to handle pose variation have received much attention. To solve the pose variations problem, in this thesis, we propose a multi-pose face recognition system based on an effective design of classifier using SURF feature. In training phase, the proposed method utilizes SURF features to calculate similarity between two images from different poses of the same face. Face recognition model (GMM) is trained using the robust SURF features from different poses. In testing phase, feature vectors corresponding to the test images are input to all trained models for the decision of the recognized face. Experiment results show that the performance of the proposed method is better than other existing methods.
Lin, Ching-chiang, and 林靖強. "Selection of Relevant Features for Multi-Relational Naive Bayesian Classifier." Thesis, 2010. http://ndltd.ncl.edu.tw/handle/30511079503769361755.
Повний текст джерела國立中正大學
資訊管理所暨醫療資訊管理所
98
Most structured data is stored in relational databases, which is stored in multiple relations by their characters. To mine on data, we often join several relations to form a single relation through foreign key links. The process is often called “flatten”. Unfortunately, flatten may cause some problems such as time consuming and statistical skew on data. Hence, how to mine data directly on numerous relations become an arresting issue. Multi-relational data mining (MRDM) has been successfully applied in a variety of areas, such as marketing, sales, finance, fraud detection, and natural sciences. There has been many ILP-based methods proposed in previous researches, but there are still other problems unresolved such as scalability. Irrelevant or redundant attributes in a relation may not make contribution to classification accuracy. Thus, feature selection is an essential data processing step in multi-relational data mining. By filtering out irrelevant or redundant features from relations for data mining, we improve classification accuracy, achieve good time performance, and improve comprehensibility of the models. We propose a hybrid feature selection approach called Hybrid-BC, which train multi-relational naïve Bayesian classifier to classify or label unknown data. We set different cutoff values to filter features for the Hybrid-BC in order to observe the impact on classification accuracy. The experimental results shows that effectively choosing a small set of relevant features results in enhancing classification accuracy.
CHEN, YU-XUN, and 陳俞勳. "A Study on Feature Selection Methods for the Nonspecific Classifier." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/7kg6m4.
Повний текст джерела國立雲林科技大學
資訊管理系
106
With the rapid development of network technology and information technology, a large amount of data has been generated. In order to effectively process a large amount of data, reduce a number of features, and retain the accuracy of classification, feature selection is required. The feature selection is a process that can effectively select the more important features. It can also reduce the dimension and make learning algorithm operate faster and more smoothly. The feature selection methods are divided into three kinds: filter method, wrapper method, and embedded method. The filter method is independent of the classifier, and the feature selection method of this kind is used. The study uses many different feature selection methods to select the more important features (attributes) in datasets that are multivariate data type and categorical attribute types. Then, the only selected features (attributes) in datasets will be used to classify in many different classifiers to evaluate the performance of the many different feature selection methods by using the correct rate. The experiment uses six feature selection methods, six datasets, and three classifiers, and the result is that the integrated feature selection method of IG and FCBF can get the best performance in the following evaluations: the dataset facet, the classifier facet, and the integration of the two facets.
Lien, Tzu-Chien, and 連子建. "Feature selection methods with hybrid discretization for naive Bayesian classifiers." Thesis, 2012. http://ndltd.ncl.edu.tw/handle/95105249459952662675.
Повний текст джерела國立成功大學
資訊管理研究所
100
Naïve Bayesian classifier is widely used for classification problems, because of its computational efficiency and competitive accuracy. Discretization is one of the major approaches for processing continuous attributes for naïve Bayesian classifier. Hybrid discretization sets the method for discretizing each continuous attribute individually. A previous study found that hybrid discretization is a better approach to improve the performance of the naïve Bayesian classifier than unified discretization. Selective naïve Bayes, abbreviated as SNB, is an important feature selection method for naïve Bayesian classifiers. It improves the efficiency and the accuracy by reducing redundant and irrelevant attributes. The object of this study is to develop methods composed of hybrid discretization and feature selection, and three methods for this purposed are proposed. Method one that is the most efficient executes hybrid discretization after feature selection. Methods two and three generally perform hybrid discretization first followed by feature selection. Method two transforms continuous attributes without considering discrete attributes, while method three determines the best discretization methods for each continuous attribute by searching all possibilities. The experimental results shows that in general, the three methods with hybrid discretization and feature selection all have better performance than the method with unified discretization and feature selection, and method three is the best.
Chuang, Chun-Hsiang, and 莊鈞翔. "Subspace Selection based Multiple Classifier System for High Dimensional Data Classification." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/tjqxm6.
Повний текст джерела國立臺中教育大學
教育測驗統計研究所
97
In a typical supervised classification task, the size of training data fundamentally affects the generality of a classifier. Given a finite and fixed size of training data, the classification result may be degraded as the number of features (dimensionality) increase. Many researches have demonstrated that multiple classifier systems or so-called ensembles can alleviate small sample size and high dimensionality concern, and obtain more outstanding and robust results than single models. One of the effective approaches for generating an ensemble of diverse base classifiers is the use of different feature subsets such as random subspace method (RSM). Objectives of this research are to develop a novel ensemble technique named cluster based dynamic subspace method (CDSM) for strengthening RSM. This work is comprised of three phases. First, the relationships between feature vectors are explored by clustering algorithms. Second, two importance distributions are proposed to impose on the process of selecting subspaces. The functions of them provide rules for automatically selecting a suitable subspace dimensionality and the component dimensions, respectively. Finally, to utilize the spectral and spatial information contained in hyperspectral image data and enhance the performance and robustness of CDSM, two nonparametric contextual classifiers based on the Markov random field (MRF) are developed. The real data experimental results show that the proposed method obtains sound performances than the other conventional subspace methods especially when the ensemble size is small.
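For reference, a minimal random subspace ensemble, the baseline that the proposed CDSM strengthens; scikit-learn's bagging with feature subsampling is used here as a stand-in implementation.

```python
# Hypothetical sketch of the plain random subspace method: each base classifier sees a
# different random subset of the features and the ensemble votes.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=60, n_informative=8, random_state=0)
rsm = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=30,
    bootstrap=False,        # use all samples...
    max_features=0.3,       # ...but only 30% of the features per base classifier
    random_state=0,
)
print("RSM cv accuracy:", round(cross_val_score(rsm, X, y, cv=5).mean(), 3))
```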
Yu-Lu, Jou, and 周育祿. "An Efficient Fuzzy Classifier with Feature Selection Based on Fuzzy Entropy." Thesis, 1998. http://ndltd.ncl.edu.tw/handle/93257518701023507869.
Повний текст джерела國立臺灣科技大學
電子工程技術研究所
86
This thesis presents an efficient fuzzy classifier with the ability of feature selection based on fuzzy entropy measure. The fuzzy entropy is employed to evaluate the information of pattern distribution in the pattern space. With such information, we can apply it to partition the pattern space into non-overlapped decision regions for pattern classification. Since the decision regions do not overlap, the complexity and computational load of the classifier are reduced and thus the training time and classification time are extremely fast. Although the decision regions are partitioned as non-overlapped subspaces, we can also achieve good performance by the produced smooth boundaries since the decision regions are fuzzy subspaces. In addition, we also investigate a fuzzy entropy-based method to select the relevant features. The feature selection procedure not only reduces the dimension of a problem but also discards the noise-corrupted, redundant or unimportant features. As a result, the time consuming of the classifier is reduced whereas the classification performance is increased. Finally, we apply the proposed classifier on the Iris database and Wisconsin breast cancer database to evaluate the classification performance. Both of the results show that the proposed classifier can work well for the pattern classification applications.
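A hedged sketch in the spirit of the entropy-based ranking above: crisp quantile bins stand in for fuzzy membership functions, which reduces the measure to the conditional class entropy of each feature; features whose intervals are class-pure (low entropy) are preferred.

```python
# Hypothetical sketch: interval-based entropy scoring of features for selection.
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

def interval_entropy(col, y, bins=5):
    edges = np.quantile(col, np.linspace(0, 1, bins + 1))
    ids = np.clip(np.searchsorted(edges, col, side="right") - 1, 0, bins - 1)
    h = 0.0
    for b in range(bins):
        members = y[ids == b]
        if members.size == 0:
            continue
        p = np.bincount(members) / members.size
        p = p[p > 0]
        h += (members.size / y.size) * -(p * np.log2(p)).sum()
    return h

scores = [interval_entropy(X[:, j], y) for j in range(X.shape[1])]
print("feature ranking (most informative first):", np.argsort(scores))
```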
Makrehchi, Masoud. "Feature Ranking for Text Classifiers." Thesis, 2007. http://hdl.handle.net/10012/3250.
Повний текст джерела
Freeman, Cecille. "Feature selection and hierarchical classifier design with applications to human motion recognition." Thesis, 2014. http://hdl.handle.net/10012/8480.
Повний текст джерела
Feng, Kuan-Jen, and 馮冠仁. "An Efficient Hierarchical Metadata Classifier based on SVM and Feature Selection Methods." Thesis, 2006. http://ndltd.ncl.edu.tw/handle/71704686561687980607.
Повний текст джерела
國立暨南國際大學
資訊工程學系
94
Constructing a Web portal by integrating content from various information systems is crucial for providing public, popular and user-friendly services. In this thesis, we propose a hierarchical classifier system for fusing heterogeneous categories from various information systems. Traditional text classification methods, which classify documents into predefined categories, are a possible solution; however, they suffer from huge text feature sets and from flat classification that ignores hierarchical structure. Feature selection methods tend to favor features from large classes, so classification performance on small classes is poor. Flat classification treats hierarchical classes as a flat structure, so each category corresponds to a single classifier that selects features to distinguish that class from all the others; discriminative features are therefore hard to select effectively, because hierarchical knowledge is not exploited. To address these problems, we propose feature selection methods that prevent the selection process from being dominated by large classes. Based on SVM, we propose a hierarchical classification method that supports classification of hierarchical portal objects with metadata. We also employ domain concept hierarchies as background knowledge, using the portal's hierarchical structure to improve both feature selection and classification. The NMNS portal is used as the test bed. Experiments show that our hierarchical classifier achieves an outstanding 98.5% F-measure and is more efficient than a traditional flat classifier.
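A minimal local-classifier-per-node sketch of the idea, assuming a toy two-level hierarchy: a root SVM picks the coarse category, then a per-category SVM picks the fine one, and every node runs its own chi-square feature selection so large classes do not dominate. The toy documents, the hierarchy and the parameter k are illustrative assumptions; the NMNS portal data and its concept hierarchies are not reproduced.

```python
# Toy local-classifier-per-node hierarchy with per-node chi-square selection
# (an illustrative sketch, not the thesis' system).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

docs = [
    ("trilobite fossil from cambrian strata", "geology", "paleontology"),
    ("ammonite shell spiral growth pattern", "geology", "paleontology"),
    ("quartz crystal lattice and hardness", "geology", "mineralogy"),
    ("granite feldspar composition analysis", "geology", "mineralogy"),
    ("beetle wing morphology and taxonomy", "biology", "entomology"),
    ("butterfly larvae host plant selection", "biology", "entomology"),
    ("fern spore dispersal in humid forests", "biology", "botany"),
    ("moss gametophyte structure and habitat", "biology", "botany"),
]
texts = [t for t, _, _ in docs]
coarse = [c for _, c, _ in docs]

def node_classifier(node_texts, node_labels, k=10):
    """TF-IDF -> per-node chi-square feature selection -> linear SVM."""
    return make_pipeline(TfidfVectorizer(), SelectKBest(chi2, k=k),
                         LinearSVC()).fit(node_texts, node_labels)

root = node_classifier(texts, coarse)                        # level 1: coarse category
children = {c: node_classifier([t for t, cc, _ in docs if cc == c],
                               [f for _, cc, f in docs if cc == c])
            for c in set(coarse)}                            # level 2: one SVM per branch

def classify(text):
    c = root.predict([text])[0]
    return c, children[c].predict([text])[0]

print(classify("spiral ammonite fossil in cambrian strata"))
```

Because each node selects its own features, the fine-level classifiers only compete within their own branch instead of against every category in the portal.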
Nimbalkar, Prakash. "Optimal HySpex band selection for roof classification determined from supervised classifier efficiency." Doctoral thesis, 2022. https://depotuw.ceon.pl/handle/item/4155.
Повний текст джерела
The urban ecosystem is characterized by a great diversity of structures and building materials, which have a real influence on the functioning not only of the city itself but also of its surroundings. This calls for detailed mapping and monitoring methods. Some of the best tools are offered by hyperspectral remote sensing, which, owing to its very high spectral, radiometric and spatial resolution, enables accurate mapping and quantitative analysis of ongoing changes. However, because of their volume, hyperspectral data require substantial computing resources for image processing. This dissertation therefore aimed to develop a method for selecting the most informative bands representing the spectral characteristics of urban surfaces, using Białystok as a case study. Band selection was optimized by applying data-reduction methods and assessing the classification accuracy of the individual band sets. The algorithm was tested in two experiments identifying (a) roofing materials and (b) land cover. The research was based on airborne LiDAR and HySpex hyperspectral data. The first step of the research scheme was data correction (atmospheric and geometric); next, the PCA-BS (Principal Component Analysis-Band Selection) band-selection method was applied and compared with PCA and Minimum Noise Fraction (MNF). Several classifiers were then used to assess the classification accuracy of the selected bands: artificial neural networks (ANN), Support Vector Machine (SVM), Spectral Angle Mapper (SAM) and Spectral Information Divergence (SID). The results of both experiments confirmed that PCA-BS provides the best band selection, followed by MNF and PCA. The set of 30 bands obtained with PCA-BS yielded the highest accuracies: an overall accuracy of 94.34% for the ANN classifier and 88.72% for SVM. The overall accuracy of roof classification with ANN was 90.85%, with a kappa coefficient of 0.9. Among the classifiers, ANN and SVM performed best, while SAM and SID performed worse. In summary, the PCA-BS method allowed the selection of 10 optimal bands, which achieve accuracies of about 83.2% and 86.63% with the SVM and ANN classifiers, respectively. Adding LiDAR data to the HySpex set improved the results (overall accuracy) by 6%.
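As a generic illustration of band selection driven by principal components (a stand-in for the PCA-BS idea, not the dissertation's exact procedure), the sketch below ranks bands by their absolute loadings on the leading components and compares an SVM trained on all bands with one trained on the top 30; synthetic "pixels x bands" data replaces the HySpex imagery, and the ANN, SAM and SID classifiers are omitted.

```python
# PCA-loading band selection sketch on synthetic data (illustrative only).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_pixels, n_bands, n_classes = 600, 120, 4
y = rng.integers(0, n_classes, n_pixels)
base_spectra = np.cumsum(rng.normal(size=(n_classes, n_bands)), axis=1)   # smooth class spectra
X = base_spectra[y] + rng.normal(scale=2.0, size=(n_pixels, n_bands))     # noisy "pixels"

# Rank original bands by their summed absolute loadings on the first 10 components.
pca = PCA(n_components=10).fit(X)
band_scores = np.abs(pca.components_).sum(axis=0)
selected = np.argsort(band_scores)[::-1][:30]

full = cross_val_score(SVC(), X, y, cv=5).mean()
reduced = cross_val_score(SVC(), X[:, selected], y, cv=5).mean()
print(f"all {n_bands} bands: {full:.3f}   top 30 bands: {reduced:.3f}")
```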
Cheng, Shao-wei, and 鄭紹偉. "Ensemble classifier with feature selection and multi-words for disease code assignment." Thesis, 2009. http://ndltd.ncl.edu.tw/handle/85980912556252350164.
Повний текст джерела
雲林科技大學
資訊管理系碩士班
97
Since the National Health Insurance (NHI) program was implemented, the Bureau of National Health Insurance has required hospitals to report medical records with ICD-9-CM codes when applying for reimbursement of medical expenses. Hospitals that do not comply are not reimbursed; in particular, incorrect or omitted codes lead to reimbursement claims being rejected or deducted. To determine the correct ICD-9-CM codes for a discharge summary, medical staff must manually check each document, a labor-intensive task that wastes human resources and time. Prior research has studied the use of domain knowledge to extend the concepts of document terms, but those terms are limited to single words and do not include multi-word terms. In addition, codes in subcategories under the same category have similar meanings, and the data are imbalanced. This study focuses on keyword selection and multi-word term expansion, and combines them with an ensemble technique to improve the performance of SVM and Bayes classifiers in assigning disease codes to medical documents. The experimental results show that the chi-square method selects higher-quality keywords, that multi-word and extended terms capture more information from medical documents, and that the AdaBoost ensemble method improves the classification performance of the Bayes classifier.
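A toy sketch of the pipeline described above: unigram-plus-bigram counts for multi-word terms, chi-square keyword selection, and an AdaBoost-boosted naive Bayes compared with a linear SVM. The miniature "discharge summaries", the ICD-9-like labels and the parameter choices are invented for illustration only.

```python
# Chi-square keyword selection + multi-word terms + boosted naive Bayes vs. SVM
# (illustrative toy sketch, not the thesis' clinical data or exact setup).
from sklearn.ensemble import AdaBoostClassifier   # `estimator=` needs scikit-learn >= 1.2
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

summaries = [
    "chest pain and shortness of breath, troponin elevated",
    "acute myocardial infarction treated with stent placement",
    "fasting glucose high, started on insulin therapy",
    "type two diabetes with poor glycemic control",
    "productive cough and fever, infiltrate on chest x ray",
    "community acquired pneumonia treated with antibiotics",
]
codes = ["410", "410", "250", "250", "486", "486"]   # toy ICD-9-like labels

def build(classifier):
    return make_pipeline(CountVectorizer(ngram_range=(1, 2)),   # multi-word terms
                         SelectKBest(chi2, k=30),               # chi-square keyword selection
                         classifier).fit(summaries, codes)

boosted_nb = build(AdaBoostClassifier(estimator=MultinomialNB(), n_estimators=20))
svm = build(LinearSVC())

test = ["elevated troponin with chest pain at rest"]
print("boosted NB:", boosted_nb.predict(test), " SVM:", svm.predict(test))
```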