Dissertations / Theses on the topic 'ANN Classifiers'

Consult the top 50 dissertations / theses for your research on the topic 'ANN Classifiers.'

1

Eldud, Omer Ahmed Abdelkarim. "Prediction of protein secondary structure using binary classification trees, naive Bayes classifiers and the Logistic Regression Classifier." Thesis, Rhodes University, 2016. http://hdl.handle.net/10962/d1019985.

Abstract:
The secondary structure of proteins is predicted using various binary classifiers. The data are adopted from the RS126 database. The original data consist of protein primary and secondary structure sequences encoded using alphabetic letters. These data are encoded into unary vectors comprising ones and zeros only. Different binary classifiers, namely naive Bayes, logistic regression and classification trees using hold-out and 5-fold cross validation, are trained using the encoded data. For each of the classifiers three classification tasks are considered, namely helix against not helix (H/∼H), sheet against not sheet (S/∼S) and coil against not coil (C/∼C). The performance of these binary classifiers is compared using the overall accuracy in predicting the protein secondary structure for various window sizes. Our results indicate that hold-out cross validation achieved higher accuracy than 5-fold cross validation. The naive Bayes classifier, using 5-fold cross validation, achieved the lowest accuracy for predicting helix against not helix. The classification tree classifiers, using 5-fold cross validation, achieved the lowest accuracies for both coil against not coil and sheet against not sheet classifications. The logistic regression classifier's accuracy depends on the window size; there is a positive relationship between accuracy and window size. The logistic regression approach achieved the highest accuracy of the three classifiers on each task: 77.74 percent for helix against not helix, 81.22 percent for sheet against not sheet and 73.39 percent for coil against not coil. It is noted that comparing classifiers would be easier if the whole classification process could be carried out in R; alternatively, assessing the logistic regression classifiers would be easier if SPSS had a function to report their accuracy.
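As an illustration of the encoding step described above, the following minimal sketch (illustrative only, not code from the thesis; the window size, sequences and encoding details are assumptions) turns a primary-structure sequence into unary (one-hot) window vectors with binary labels for the H/∼H task:

    # Sketch: unary sliding-window encoding for helix / not-helix labels.
    AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

    def encode_windows(primary, secondary, window=13):
        """Yield one (one-hot vector, binary label) pair per window."""
        half = window // 2
        for i in range(half, len(primary) - half):
            vec = []
            for aa in primary[i - half:i + half + 1]:
                vec.extend(1 if aa == a else 0 for a in AMINO_ACIDS)
            yield vec, 1 if secondary[i] == "H" else 0   # helix vs not helix

    pairs = list(encode_windows("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
                                "CCHHHHHHHHHHCCCEEEECCHHHHHHHHHCCC"))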
2

Joo, Hyonam. "Binary tree classifier and context classifier." Thesis, Virginia Polytechnic Institute and State University, 1985. http://hdl.handle.net/10919/53076.

Abstract:
Two methods of designing a point classifier are discussed in this paper: one is a binary decision tree classifier based on Fisher's linear discriminant function as the decision rule at each nonterminal node, and the other is a contextual classifier which assigns each pixel the most probable label given a substantially sized context around the pixel. Experiments were performed on both a simulated image and real images to illustrate the improvement in classification accuracy over the conventional single-stage Bayes classifier under a Gaussian distribution assumption.
Master of Science
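The split rule named in the abstract above can be sketched in a few lines; this is a hypothetical illustration of Fisher's linear discriminant at one tree node, not the thesis code:

    # Sketch: Fisher's linear discriminant as the decision rule at a node.
    import numpy as np

    def fisher_split(X0, X1):
        """Return (direction w, threshold) separating two node classes."""
        m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
        Sw = (np.cov(X0, rowvar=False) * (len(X0) - 1)
              + np.cov(X1, rowvar=False) * (len(X1) - 1))  # pooled within-class scatter
        w = np.linalg.solve(Sw, m1 - m0)                    # Fisher direction
        threshold = w @ (m0 + m1) / 2.0                     # midpoint of projected means
        return w, threshold

    rng = np.random.default_rng(0)
    X0, X1 = rng.normal(0, 1, (100, 2)), rng.normal(2, 1, (100, 2))
    w, t = fisher_split(X0, X1)
    go_right = rng.normal(2, 1, (5, 2)) @ w > t             # route samples at the node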
3

Billing, Jeffrey J. (Jeffrey Joel) 1979. "Learning classifiers from medical data." Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/8068.

Abstract:
Thesis (M.Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002.
Includes bibliographical references (leaf 32).
The goal of this thesis was to use machine-learning techniques to discover classifiers in a database of medical data. Using two software programs, C5.0 and SVMLight, we analyzed a database of 150 patients who had been operated on by Dr. David Rattner of the Massachusetts General Hospital. C5.0 is an algorithm that learns decision trees from data, while SVMLight learns support vector machines. With both techniques we performed cross-validation analysis, and neither produced acceptable error rates: no classifiers could be found which performed well under cross-validation. Nonetheless, this paper provides a thorough examination of the issues that arise during the analysis of medical data, describes the techniques that were used, and discusses the characteristics of the data that affected the performance of these techniques.
by Jeffrey J. Billing.
M.Eng. and S.B.
4

Siegel, Kathryn I. (Kathryn Iris). "Incremental random forest classifiers in spark." Thesis, Massachusetts Institute of Technology, 2016. http://hdl.handle.net/1721.1/106105.

Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016.
Cataloged from PDF version of thesis.
Includes bibliographical references (page 53).
The random forest is a machine learning algorithm that has gained popularity due to its resistance to noise, good performance, and training efficiency. Random forests are typically constructed from a static dataset; to accommodate new data, they are usually regrown. This thesis presents two main strategies for updating random forests incrementally, rather than entirely rebuilding the forests. I implement these two strategies, incrementally growing existing trees and replacing old trees, in Spark Machine Learning (ML), a commonly used library for running ML algorithms in Spark. My implementation draws from existing methods in the online learning literature, but includes several novel refinements. I evaluate the two implementations, as well as a variety of hybrid strategies, by recording their error rates and training times on four different datasets. My benchmarks show that the optimal strategy for incremental growth depends on the batch size and the presence of concept drift in a data workload. I find that workloads with large batches should be classified using a strategy that favors tree regrowth, while workloads with small batches should be classified using a strategy that favors incremental growth of existing trees. Overall, the system demonstrates significant efficiency gains compared to the standard method of regrowing the random forest.
by Kathryn I. Siegel.
M. Eng.
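The tree-replacement strategy described in the abstract can be pictured with a minimal sketch; this is an illustration under assumed names and parameters (and a scikit-learn base learner), not the thesis's Spark ML code:

    # Sketch: replace the oldest trees with trees grown on the newest batch.
    import random
    from sklearn.tree import DecisionTreeClassifier

    class IncrementalForest:
        def __init__(self, n_trees=50, retrain_fraction=0.2):
            self.trees = []                               # (age, fitted tree) pairs
            self.n_trees = n_trees
            self.n_replace = max(1, int(n_trees * retrain_fraction))

        def update(self, X, y):
            self.trees.sort(key=lambda pair: pair[0])     # youngest first
            self.trees = self.trees[:self.n_trees - self.n_replace]
            for _ in range(self.n_replace):
                idx = [random.randrange(len(X)) for _ in X]   # bootstrap sample
                tree = DecisionTreeClassifier(max_features="sqrt")
                tree.fit([X[i] for i in idx], [y[i] for i in idx])
                self.trees.append((0, tree))
            self.trees = [(age + 1, t) for age, t in self.trees]

        def predict(self, x):
            votes = [t.predict([x])[0] for _, t in self.trees]
            return max(set(votes), key=votes.count)       # majority vote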
5

Palmer-Brown, Dominic. "An adaptive resonance classifier." Thesis, University of Nottingham, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.334802.

6

Xue, Jinghao. "Aspects of generative and discriminative classifiers." Thesis, Connect to e-thesis, 2008. http://theses.gla.ac.uk/272/.

Abstract:
Thesis (Ph.D.)--University of Glasgow, 2008. Submitted to the Department of Statistics, Faculty of Information and Mathematical Sciences. Includes bibliographical references. Print version also available.
7

Frankowsky, Maximilian, and Dan Ke. "Humanness and classifiers in Mandarin Chinese." Universitätsbibliothek Leipzig, 2017. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-224789.

Abstract:
Mandarin Chinese numeral classifiers receive considerable attention in linguistic research. The status of the general classifier 个 gè remains unresolved. Many linguists suggest that the use of 个 gè as a noun classifier is arbitrary. This view is challenged in the current study. Relying on the CCL-Corpus of Peking University and data from Google, we investigated which nouns for living beings are most likely to be classified by the general classifier 个 gè. The results suggest that the use of the classifier 个 gè is motivated by an anthropocentric continuum as described by Köpcke and Zubin in the 1990s. We tested Köpcke and Zubin's approach with Chinese native speakers, examining 76 animal expressions to explore the semantic interdependence of numeral classifiers and their nouns. Our study shows that nouns with the semantic feature [+ animate] are more likely to be classified by 个 gè if their denotatum is located either very close to or very far from the anthropocentric center. In contrast, animate nouns whose denotata are located at some intermediate distance from the anthropocentric center are less likely to be classified by 个 gè.
8

Lee, Yuchun. "Classifiers : adaptive modules in pattern recognition systems." Thesis, Massachusetts Institute of Technology, 1989. http://hdl.handle.net/1721.1/14496.

9

Chungfat, Neil C. (Neil Caye) 1979. "Context-aware activity recognition using TAN classifiers." Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/87220.

Abstract:
Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002.
Includes bibliographical references (p. 73-77).
by Neil C. Chungfat.
M.Eng.
10

Li, Ming. "Sequence and text classification : features and classifiers." Thesis, University of East Anglia, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.426966.

11

Semnani, Shahram. "Design and analysis of discriminant pattern classifiers." Thesis, Loughborough University, 1993. https://dspace.lboro.ac.uk/2134/14143.

Abstract:
In recent years pattern recognition has evolved into a mature discipline and has been successfully applied to various problems. A fundamental part of an automatic pattern recognition system is classification, where a pattern vector is assigned to one of a finite number of classes. This thesis reports on the development and design of pattern classifier algorithms, with particular emphasis on statistical algorithms which employ discriminant functions. The first part of this research investigates the use of linear discriminant functions as pattern classifiers. A comparison of some well known methods, including Perceptron, Widrow-Hoff and Ho-Kashyap, is presented. Using generalised linear modelling (GLM), a new method of training discriminant functions is developed. In this method the linear discriminant function is transformed by a non-linear link function which associates with each pattern vector a measure bounded in the range 0 to 1 according to the class membership of the pattern. In simulations the GLM approach is applied both to synthetic data and to experimental data from a binary pattern matching problem. It is seen that GLM exhibits faster and more reliable convergence than existing linear discriminant approaches. Extensions of this method to piecewise linear discriminant functions and to polynomial discriminant functions are explored. Application of self-organising methods for efficient generation of polynomial discriminant functions is also investigated. In the second part of the work a review of neural networks is presented, followed by an analysis and formulation of a popular neural network training algorithm, namely backpropagation (BP). The capabilities and deficiencies of BP and its variations are experimentally evaluated by computer simulations. An alternative formulation based on Empirical Maximum Likelihood (EML) is also proposed. This approach is shown to have a simpler error landscape than the original BP based on mean square error. Simulations show that the EML approach generally provides faster convergence, involves fewer calculations per iteration than conventional BP, and results in equally good classification performance.
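The GLM training idea summarised above, a linear discriminant passed through a link function bounded between 0 and 1, can be sketched as follows; this is a minimal illustration assuming a logistic link and plain gradient ascent, not the thesis's training scheme:

    # Sketch: train a linear discriminant through a logistic link function.
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def train_glm_discriminant(X, y, lr=0.1, epochs=500):
        """X: (n, d) patterns; y: (n,) class labels in {0, 1}."""
        Xb = np.hstack([X, np.ones((len(X), 1))])    # append a bias term
        w = np.zeros(Xb.shape[1])
        for _ in range(epochs):
            p = sigmoid(Xb @ w)                      # measure bounded in (0, 1)
            w += lr * Xb.T @ (y - p) / len(X)        # log-likelihood gradient step
        return w

    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
    y = np.array([0] * 50 + [1] * 50)
    w = train_glm_discriminant(X, y)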
12

Wang, Lianqing. "Origin and Development of Classifiers in Chinese." The Ohio State University, 1994. http://rave.ohiolink.edu/etdc/view?acc_num=osu1392056967.

13

Halberstadt, Andrew K. (Andrew King) 1970. "Heterogeneous acoustic measurements and multiple classifiers for speech recognition." Thesis, Massachusetts Institute of Technology, 1999. http://hdl.handle.net/1721.1/79971.

Abstract:
Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999.
Includes bibliographical references (p. 165-173).
by Andrew K. Halberstadt.
Ph.D.
14

Haque, Mahbuba. "Comparison of Distance-Based Classifiers for Elliptically Contoured Distributions." Thesis, Uppsala universitet, Statistiska institutionen, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-328026.

Abstract:
A simulation study is carried out to compare three distance-based classifiers in terms of their misclassification rates and asymptotic distributions when the data follow certain elliptically contoured distributions. The data are generated from multivariate normal, multivariate t and multivariate normal mixture distributions with varying covariance structures, sample sizes and dimensions. In many of the simulated cases, the dimension of the data is much larger than the sample size. The simulations show that for small dimensions, the centroid classifier generally performs better. The nearest neighbour classifier shows superior performance compared to the other classifiers when the covariance structure is of compound symmetry form. All three classifiers were shown to have asymptotically normal distributions, regardless of the underlying distribution of the data.
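Two of the distance-based classifiers compared in this thesis can be sketched briefly; this is an illustrative reading of "centroid" and "nearest neighbour" classification, and the exact variants studied may differ:

    # Sketch: centroid and nearest-neighbour distance-based classification.
    import numpy as np

    def centroid_classify(x, X0, X1):
        """Assign x to the class with the nearer mean vector."""
        d0 = np.linalg.norm(x - X0.mean(axis=0))
        d1 = np.linalg.norm(x - X1.mean(axis=0))
        return int(d1 < d0)

    def nearest_neighbour_classify(x, X0, X1):
        """Assign x to the class containing its nearest training point."""
        d0 = np.linalg.norm(X0 - x, axis=1).min()
        d1 = np.linalg.norm(X1 - x, axis=1).min()
        return int(d1 < d0)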
15

Öhman, Oscar. "Rating corruption within insurance companies using Bayesian network classifiers." Thesis, Umeå universitet, Statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-160810.

Abstract:
Bayesian network (BN) classifiers are a type of probabilistic model. The learning process consists of two steps: structure learning and parameter learning. Four BN classifiers will be learned: two different naive Bayes classifiers (NB), one tree augmented naive Bayes classifier (TAN) and one forest augmented naive Bayes classifier (FAN). The NB classifiers utilize two different parameter learning techniques: generative learning and discriminative learning. Generative learning uses maximum likelihood estimation (MLE) to optimize the parameters, while discriminative learning uses conditional likelihood estimation (CLE). The latter is more appropriate given the target at hand, while the former is less complicated. These models are created in order to find the model best suited for predicting/rating the corruption levels of different insurance companies, given their features. Multi-class area under the receiver operating characteristic (ROC) curve (AUC), as well as accuracy, is used to compare the predictive performance of the models. We observe that the classifiers learned by generative parameter learning performed remarkably well, even outperforming the NB classifier with discriminative parameter learning; unfortunately, this might indicate an optimization issue when learning the parameters discriminatively. Another unexpected result was that the CL-TAN classifier had the highest multi-class AUC, even though FAN is supposed to be an upgrade of CL-TAN. Further, the generatively learned NB performed about as well as the other two generative classifiers, which was also unexpected.
Bayesian networks (BN) are a type of probabilistic model used for classification. The learning process of such a model consists of two steps: structure learning and parameter learning. Four different BN classifiers will be estimated: two naive Bayes classifiers (NB), one tree augmented naive Bayes classifier (TAN) and one forest augmented naive Bayes classifier (FAN). The two NB classifiers differ in that one uses generative parameter estimation while the other uses discriminative parameter learning. Chow and Liu's (CL) well-known algorithm, which involves computing conditional mutual information (CMI), is often used to find the optimal tree structure; this variant of TAN is known as CL-TAN. FAN is another kind of upgrade of NB, which can be regarded as a strengthened variant of CL-TAN in which the explanatory variables are connected to each other in a way that yields a forest-like structure. The two parameter learning methods used are generative learning and discriminative learning. The former uses maximum likelihood estimation (MLE) to optimize the parameters, which is convenient, but does not estimate what is actually intended. The latter instead uses conditional maximum likelihood estimation (CLE), which yields a more correct, but also more complicated, estimate. These models are trained with the aim of finding the model that best rates the corruption levels within different insurance companies, given their characteristics in the form of explanatory variables. A multi-class variant of the area under the receiver operating characteristic (ROC) curve (AUC) is used to assess the prediction precision of each model. The analysis produced remarkable results for the generative models, which predicted more precisely than the discriminative NB model by a good margin. Unfortunately, this may indicate optimization problems in the discriminative parameter learning of NB. Another notable result was that among all generative models, CL-TAN had the highest AUC, even though FAN should in theory be an improved variant of CL-TAN. The generative NB model's result was also notable, as it had almost as high an AUC as the generative CL-TAN and FAN models.
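For readers unfamiliar with the generative (MLE) parameter learning mentioned above, a minimal naive Bayes sketch with categorical features is given below; this is an illustration under assumed names and Laplace smoothing for binary-valued features, whereas discriminative (CLE) learning would instead optimize the conditional likelihood numerically:

    # Sketch: generative MLE parameter learning for a naive Bayes classifier.
    from collections import Counter, defaultdict

    def fit_naive_bayes(X, y):
        """X: list of feature tuples; y: list of class labels."""
        prior = Counter(y)                     # class counts (MLE priors)
        cond = defaultdict(Counter)            # (class, feature index) -> value counts
        for xs, c in zip(X, y):
            for j, v in enumerate(xs):
                cond[(c, j)][v] += 1

        def predict(xs):
            def score(c):
                s = prior[c] / len(y)
                for j, v in enumerate(xs):
                    s *= (cond[(c, j)][v] + 1) / (prior[c] + 2)  # smoothed MLE
                return s
            return max(prior, key=score)
        return predict

    predict = fit_naive_bayes([("low", "yes"), ("high", "no"), ("high", "yes")],
                              ["clean", "corrupt", "corrupt"])
    print(predict(("high", "yes")))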
16

Song, Qing. "Features and statistical classifiers for face image analysis." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape4/PQDD_0035/NQ62459.pdf.

17

Duangsoithong, Rakkrit. "Feature selection and causal discovery for ensemble classifiers." Thesis, University of Surrey, 2012. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.580345.

Abstract:
With the rapid development of computer and information technology, which has improved a large number of applications such as web text mining, intrusion detection, biomedical informatics, gene selection in microarray data, medical data mining, and clinical decision support systems, many information databases have been created. However, in some applications, especially in the medical area, clinical data may contain hundreds to thousands of features with relatively few samples. A consequence of this problem is increased complexity that leads to degradation in efficiency and accuracy. Moreover, in this high dimensional feature space, many features are possibly irrelevant or redundant and should be removed in order to ensure good generalisation performance. Otherwise, the classifier may over-fit the data, that is, it may specialise on features which are not relevant for discrimination. To overcome this problem, feature selection and ensemble classification are applied. In this thesis, an empirical analysis of bootstrap and random subspace feature selection for multiple classifier systems is presented, and bootstrap feature selection and embedded feature ranking for ensemble MLP classifiers, along with a stopping criterion based on the out-of-bootstrap estimate, are proposed. Moreover, feature selection does not usually take causal discovery into account. However, in some cases, such as when the testing distribution is shifted by manipulation by an external agent, causal discovery can provide benefits for feature selection under these uncertainty conditions. It can also learn the underlying data structure, provide better understanding of the data generation process, and give better accuracy and robustness under uncertainty. Similarly, feature selection enables global causal discovery algorithms to deal with high dimensional data by eliminating irrelevant and redundant features before exploring the causal relationships between features. A redundancy-based ensemble causal feature selection approach using bootstrap and random subspace, and a comparison between correlation-based and causal feature selection for ensemble classifiers, are analysed. Finally, hybrid correlation-causal feature selection for multiple classifier systems is proposed in order to scale up causal discovery and deal with high dimensional features.
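The random subspace idea used above can be sketched briefly; this is an illustration with a simple linear base learner standing in for the ensemble MLP classifiers used in the thesis, and all names are assumptions:

    # Sketch: random subspace feature selection for an ensemble.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def random_subspace_ensemble(X, y, n_members=10, subspace=0.3, seed=0):
        rng = np.random.default_rng(seed)
        n_feats = max(1, int(X.shape[1] * subspace))
        members = []
        for _ in range(n_members):
            feats = rng.choice(X.shape[1], size=n_feats, replace=False)
            clf = LogisticRegression(max_iter=1000).fit(X[:, feats], y)
            members.append((feats, clf))                 # remember the subspace
        return members

    def ensemble_predict(members, X):
        votes = np.stack([clf.predict(X[:, f]) for f, clf in members])
        return np.round(votes.mean(axis=0))              # majority vote (0/1 labels)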
18

Li, Mengxin. "Vision-based neural network classifiers and their applications." Thesis, University of Bedfordshire, 2005. http://hdl.handle.net/10547/312055.

Abstract:
Visual inspection of defects is an important part of quality assurance in many fields of production. It plays a very useful role in industrial applications in order to relieve human inspectors and improve inspection accuracy, hence increasing productivity. Research has previously been done on defect classification of wood veneers using techniques such as neural networks, and a certain degree of success has been achieved. However, improved results in terms of both classification accuracy and running time are necessary if the techniques are to be widely adopted in industry, which has motivated this research. This research presents a method using a rough sets based neural network with fuzzy input (RNNFI). A variable precision rough set (VPRS) method is proposed to remove redundant features, utilising the characteristics of VPRS for data analysis and processing. The reduced data are fuzzified to represent the feature data in a form more suitable for input to an improved BP neural network classifier. The BP neural network classifier is improved in three aspects: additional momentum, self-adaptive learning rates and dynamic error segmenting. Finally, to further refine the classifier, a uniform design (UD) approach is introduced to optimise the key parameters, because UD can generate a minimal set of uniform and representative design points scattered within the experiment domain. Optimal factor settings are achieved using a response surface methodology (RSM) model and the nonlinear quadratic programming algorithm (NLPQL). Experiments have shown that the hybrid method is capable of classifying the defects of wood veneers with a fast convergence speed and high classification accuracy, compared with other methods such as a neural network with fuzzy input and a rough sets based neural network. The research has demonstrated a methodology for visual inspection of defects, especially for situations where there is a large amount of data and a fast running speed is required. It is expected that this method can be applied to automatic visual inspection on production lines for other products such as ceramic tiles and strip steel.
19

Suppharangsan, Somjet. "Comparison and performance enhancement of modern pattern classifiers." Thesis, University of Southampton, 2010. https://eprints.soton.ac.uk/170393/.

Abstract:
This thesis is a critical empirical study, using a range of benchmark datasets, of the performance of some modern machine learning systems and possible enhancements to them. When new algorithms and their performance are reported in the machine learning literature, most authors pay little attention to reporting the statistical significance of performance differences. We take Gaussian process classifiers as an example, for which the literature shows a disappointing number of performance evaluations. What is particularly ignored is any use of the uncertainties in the performance measures when making comparisons. This thesis makes a novel contribution by developing a methodology for formal comparisons that also includes performance uncertainties. Using the support vector machine (SVM) as the classification architecture, the thesis explores two potential enhancements for complexity reduction: (a) subset selection on the training data by pre-processing approaches, and (b) organising the classes of a multi-class problem in a tree structure for fast classification. The former is crucial, as dataset sizes have increased rapidly, and straightforward training using quadratic programming over all of the given data is prohibitively expensive. While some researchers focus on training algorithms that operate in a stochastic manner, we explore data reduction by cluster analysis. Multi-class problems in which the number of classes is very large are of increasing interest. Our contribution is to speed up training by removing as many irrelevant data as possible while preserving the points that are believed to be support vectors. The results show that too high a data reduction rate can degrade performance. However, on a subset of problems, the proposed methods produced results comparable to the full SVM despite the high reduction rate. The new learning tree structure can then be combined with the data selection methods to obtain a further increase in speed. Finally, we critically review SVM classification problems in which the input data are binary. In the chemoinformatics and bioinformatics literature, the Tanimoto kernel has been empirically shown to have good performance. The work we present, using carefully set up synthetic data of varying dimensions and dataset sizes, casts doubt on such claims. Improvements are noticeable, but not to the extent claimed in previous studies.
20

Ko, Albert Hung-Ren. "Static and dynamic selection of ensemble of classifiers." Thèse, Montréal : École de technologie supérieure, 2007. http://proquest.umi.com/pqdweb?did=1467895171&sid=2&Fmt=2&clientId=46962&RQT=309&VName=PQD.

Abstract:
Thesis (Ph.D.)--École de technologie supérieure, Montréal, 2007.
"A thesis presented to the École de technologie supérieure in partial fulfillment of the thesis requirement for the degree of Ph.D. engineering". Bibliography: leaves [237]-246. Also available in electronic version.
21

Lavesson, Niklas. "Evaluation and Analysis of Supervised Learning Algorithms and Classifiers." Licentiate thesis, Karlskrona : Blekinge Institute of Technology, 2006. http://www.bth.se/fou/Forskinfo.nsf/allfirst2/c655a0b1f9f88d16c125714c00355e5d?OpenDocument.

22

Au, Yeung Wai Hoo. "An interface program for parameterization of classifiers in Chinese." View abstract or full-text, 2005. http://library.ust.hk/cgi/db/thesis.pl?HUMA%202005%20AU.

23

McCrae, Richard Clyde. "The Impact of Cost on Feature Selection for Classifiers." Diss., NSUWorks, 2018. https://nsuworks.nova.edu/gscis_etd/1057.

Abstract:
Supervised machine learning models are increasingly being used for medical diagnosis. The diagnostic problem is formulated as a binary classification task in which trained classifiers make predictions based on a set of input features. In diagnosis, these features are typically procedures or tests with associated costs. The cost of applying a trained classifier for diagnosis may be estimated as the total cost of obtaining values for the features that serve as inputs to the classifier. Obtaining classifiers based on a low cost set of input features with acceptable classification accuracy is of interest to practitioners and researchers. What makes this problem even more challenging is that costs associated with features vary with patients and service providers and change over time. This dissertation addresses this problem by proposing a method for obtaining low cost classifiers that meet specified accuracy requirements under dynamically changing costs. Given a set of relevant input features and accuracy requirements, the goal is to identify all qualifying classifiers based on subsets of the feature set. Then, for any arbitrary costs associated with the features, the cost of the classifiers may be computed and candidate classifiers selected based on the cost-accuracy tradeoff. Since the number of relevant input features k tends to be large for typical diagnosis problems, training and testing classifiers based on all 2^k - 1 possible non-empty subsets of features is computationally prohibitive. Under the reasonable assumption that the accuracy of a classifier is no lower than that of any classifier based on a subset of its input features, this dissertation develops an efficient method to identify all qualifying classifiers. The study used two types of classifiers, artificial neural networks and classification trees, that have proved promising for numerous problems as documented in the literature. The approach was to measure the accuracy obtained when all features were used, then to establish reduced accuracy thresholds that could be satisfied by subsets of the complete feature set. Threshold values for three measures (true positive rate, true negative rate, and overall classification accuracy) were considered for the classifiers. Two cost functions were used for the features, one with unit costs and the other with random costs; additional manipulation of costs was also performed. The order in which features were removed was found to have a material impact on the effort required (removing the most important features first was most efficient, removing the least important features first was least efficient). The accuracy and cost measures were combined to produce a Pareto-optimal frontier, which consistently contained few elements: at most 15 subsets were on the frontier even when there were hundreds of thousands of acceptable feature sets. Most of the computational time is spent training and testing the models. Given costs, models on the Pareto-optimal frontier can be efficiently identified and presented to decision makers. Both the neural networks and the decision trees performed comparably, suggesting that either classifier could be employed.
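The Pareto-optimal frontier step described above is straightforward to compute once each candidate feature subset has a cost and an accuracy; the sketch below is an illustration with made-up numbers, not the dissertation's code:

    # Sketch: keep a model only if nothing is both cheaper and as accurate.
    def pareto_frontier(models):
        """models: list of (cost, accuracy, name) tuples."""
        frontier = []
        for cost, acc, name in sorted(models):           # cheapest first
            if not frontier or acc > frontier[-1][1]:    # strictly better accuracy
                frontier.append((cost, acc, name))
        return frontier

    candidates = [(3, 0.91, "A"), (5, 0.90, "B"), (7, 0.95, "C"), (4, 0.93, "D")]
    print(pareto_frontier(candidates))  # [(3, 0.91, 'A'), (4, 0.93, 'D'), (7, 0.95, 'C')]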
24

Ma, Kăichén. "Robust dynamic symbol recognition : the ClockSketch classifier." Thesis, Massachusetts Institute of Technology, 2013. http://hdl.handle.net/1721.1/91841.

Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, June 2014.
Cataloged from PDF version of thesis. "May 2013."
Includes bibliographical references (page 61).
I present an automatic classifier for the digitized clock drawing test, a neurological diagnostic exam used to assess patients' mental acuity by having them draw an analog clock face using a digitizing pen. This classifier assists human examiners in clock drawing interpretation by labeling several basic components of a drawing, including its outline, numerals, hands, and noise, thereby freeing examiners to concentrate on more complex labeling problems. This is a challenging problem despite its specificity, because the average user of the clock drawing test has a high likelihood of cognitive or motor impairment. As a result, mistakes such as crossed-out numerals, messiness, missing components, and noise will be common in drawings, and a well-designed classifier must be capable of handling and correcting for various types of error. I describe in this thesis the construction of a system that is both accurate and robust enough to handle variable input, laying out its components and the principles behind its design. I demonstrate that this system accurately recognizes and classifies the basic components of a drawing, even when applied to a wide range of clinical input, and that it is able to do so because it relies both on statistical analysis and on common-sense observations about the structure of the problem at hand.
by Kaichen Ma.
M. Eng.
25

Chung, Poy-san, and 鍾佩珊. "Acquisition of Cantonese sortal classifiers in Cantonese-English bilinguals." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2007. http://hub.hku.hk/bib/B38669808.

26

Tanaka, Mitsuru. "Classifier System Learning of Good Database Schema." ScholarWorks@UNO, 2008. http://scholarworks.uno.edu/td/859.

Abstract:
This thesis presents an implementation of a learning classifier system which learns good database schemas. The system is implemented in Java using the NetBeans development environment, which provides good control over the GUI components. The system contains four components: a user interface, a rule and message system, an apportionment of credit system, and genetic algorithms. The input to the system is a set of simple database schemas, and the objective of the classifier system is to keep the good database schemas, which are represented by classifiers. The learning classifier system is given some basic knowledge about database concepts and rules. The results showed that the system could weed out the bad schemas and keep the good ones.
27

Dias, De Macedo Filho Antonio. "Microwave neural networks and fuzzy classifiers for ES systems." Thesis, University College London (University of London), 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.244066.

28

Zhang, Ziming. "Efficient object detection via structured learning and local classifiers." Thesis, Oxford Brookes University, 2013. https://radar.brookes.ac.uk/radar/items/420cfbee-bf00-4d53-be8b-04f83389994f/1.

Abstract:
Object detection has made great strides recently. However, it still faces two big challenges: detection accuracy and computational efficiency. In this thesis, we present an automatic, efficient object detection framework to detect object instances in images using bounding boxes, which can be trained and tested easily on current personal computers. Our framework is a sliding-window based approach and consists of two major components: (1) efficient object proposal generation, predicting possible object bounding boxes, and (2) efficient object proposal verification, classifying each bounding box in a multiclass manner. For object proposal generation, we formulate the problem as a structured learning problem and investigate structural support vector machines (SSVMs) with our proposed scale/aspect-ratio quantization scheme and ranking constraints. A general ranking-order decomposition algorithm is developed for solving the formulation efficiently, and applied to generate proposals using a two-stage cascade. Using image gradients as features, our object proposal generation method achieves state-of-the-art results in terms of object recall at a low computational cost. For object proposal verification, we propose two locally linear and one locally nonlinear classifiers to approximate the nonlinear decision boundaries in the feature space efficiently. Inspired by the kernel trick, these classifiers map the original features into another feature space explicitly, where linear classifiers are employed for classification, and thus have linear computational complexity in both training and testing, similar to that of linear classifiers. Therefore, in general, our classifiers can achieve accuracy comparable to kernel based classifiers at lower computational cost. To demonstrate its efficiency and generality, our framework is applied to four different object detection tasks: VOC detection challenges, traffic sign detection, pedestrian detection, and face detection. In each task it performs reasonably well, with acceptable detection accuracy and good computational efficiency. For instance, on the VOC datasets with 20 object classes, our method achieved about 0.1 mean average precision (AP) within 2 hours of training and 0.05 seconds to test a 500 x 300 pixel image, using a mixture of MATLAB and C++ code on a current personal computer.
29

Lubenko, Ivans. "Towards robust steganalysis : binary classifiers and large, heterogeneous data." Thesis, University of Oxford, 2013. http://ora.ox.ac.uk/objects/uuid:c1ae44b8-94da-438d-b318-f038ad6aac57.

Abstract:
The security of a steganography system is defined by our ability to detect it. It is of no surprise then that steganography and steganalysis both depend heavily on the accuracy and robustness of our detectors. This is especially true when real-world data is considered, due to its heterogeneity. The difficulty of such data manifests itself in a penalty that has periodically been reported to affect the performance of detectors built on binary classifiers; this is known as cover source mismatch. It remains unclear how the performance drop that is associated with cover source mismatch is mitigated or even measured. In this thesis we aim to show a robust methodology to empirically measure its effects on the detection accuracy of steganalysis classifiers. Some basic machine-learning based methods, which take their origin in domain adaptation, are proposed to counter it. Specifically, we test two hypotheses through an empirical investigation. First, that linear classifiers are more robust than non-linear classifiers to cover source mismatch in real-world data and, second, that linear classifiers are so robust that given sufficiently large mismatched training data they can equal the performance of any classifier trained on small matched data. With the help of theory we draw several nontrivial conclusions based on our results. The penalty from cover source mismatch may, in fact, be a combination of two types of error; estimation error and adaptation error. We show that relatedness between training and test data, as well as the choice of classifier, both have an impact on adaptation error, which, as we argue, ultimately defines a detector's robustness. This provides a novel framework for reasoning about what is required to improve the robustness of steganalysis detectors. Whilst our empirical results may be viewed as the first step towards this goal, we show that our approach provides clear advantages over earlier methods. To our knowledge this is the first study of this scale and structure.
30

Tronci, Roberto. "Ensemble of binary classifiers: combination techniques and design issues." Doctoral thesis, Università degli Studi di Cagliari, 2008. http://hdl.handle.net/11584/265890.

Abstract:
In this thesis the problem of combining ensembles of binary classifiers is addressed. For each pattern, a binary classifier (or binary expert) assigns a similarity score, and according to a decision threshold a class is assigned to the pattern (i.e., if the score is higher than the threshold the pattern is assigned to the "positive" class, otherwise to the "negative" one). An example of this kind of classifier is a biometric authentication expert, where the expert must distinguish between "genuine" users and "impostor" users. The combination of different experts is currently investigated by researchers to increase the reliability of the decision. Thus in this thesis the following two aspects are investigated: a score "selection" methodology, and diversity measures of ensemble effectiveness. In particular, a theory of ideal score selection has been developed, and a number of selection techniques based on it have been deployed. Moreover, some of them use a classifier as a selection support, so different uses of these classifiers are analyzed. The influence of the characteristics of the individual experts on the final performance of the combined experts has been investigated. To this end, measures based on the characteristics of the individual experts were developed to evaluate ensemble effectiveness. The aim of these measures is to choose which of the individual experts from a pool of experts should be used in the combination. Finally, the methodologies developed were extensively tested on biometric datasets.
31

Alsharifi, Thamir. "Differential Mobility Classifiers in the Non-Ideal Assembly." VCU Scholars Compass, 2019. https://scholarscompass.vcu.edu/etd/6054.

Abstract:
The differential mobility classifier (DMC) is one of the core components in electrical mobility particle sizers for sizing sub-micrometer particles. Designing the DMC requires knowledge of geometrical and constructional imperfection (or tolerance). Studying the effects of geometrical imperfection on the performance of the DMC is necessary to establish manufacturing tolerances; it also helps to predict the performance of geometrically imperfect classifiers and provides a calibration curve for the DMC. This thesis studied the cylindrical classifier and the parallel-plate classifier. The numerical model was built using recent versions of COMSOL Multiphysics® and MATLAB®. For the cylindrical DMC, two major geometrical imperfections were studied: the eccentric annular classifying channel and the tilted inner cylinder/rod. For the parallel-plate DMC, the first study examined the perfectly designed plates to optimize their dimensions and working conditions, while the second study examined the plates' parallelism. For both DMCs, a parametric study was conducted for several tolerances under various geometrical factors (i.e., channel length, width, spacing, cylinder radii, etc.), flow conditions (i.e., sheath-to-aerosol flow ratio, total flow rate), and several particle sizes. The results show that the transfer function deteriorates as the geometrical imperfection increases (i.e., the peak is reduced and the width at half peak height is broadened). The parallel-plate DMC results show that the aspect ratio of the classifying channel cross-section (width-to-height) should be above 8. Particle diffusivity reduces the effect of geometrical imperfection, especially for particle sizes less than 10 nm.
32

Kang, Dae-Ki. "Abstraction, aggregation and recursion for generating accurate and simple classifiers." [Ames, Iowa : Iowa State University], 2006.

33

Jannah, Najlaa. "ECG analysis and classification using CSVM, MSVM and SIMCA classifiers." Thesis, University of Reading, 2017. http://centaur.reading.ac.uk/78068/.

Abstract:
Reliable ECG classification can potentially lead to better detection methods and increase accurate diagnosis of arrhythmia, thus improving quality of care. This thesis investigated the use of two novel classification algorithms, CSVM and SIMCA, and assessed their performance in classifying ECG beats. The project aimed to introduce a new way to interactively support patient care in and out of the hospital and to develop new classification algorithms for arrhythmia detection and diagnosis. Wave (P-QRS-T) detection was performed using the WFDB Software Package and multiresolution wavelets. Fourier coefficients and principal components (PCs) were selected as time-frequency features in the ECG signal; these provided the input to the classifiers in the form of DFT and PCA coefficients. ECG beat classification was performed using binary SVM, MSVM, CSVM, and SIMCA; these were subsequently used for simultaneously classifying either four or six types of cardiac conditions. Binary SVM classification with 100% accuracy was achieved when applied to feature-reduced ECG signals from well-established databases using PCA. The CSVM algorithm and MSVM were used to classify four ECG beat types: NORMAL, PVC, APC, and FUSION or PFUS; these were from the MIT-BIH arrhythmia database (precordial lead group and limb lead II). Different numbers of Fourier coefficients were considered in order to identify the optimal number of features to present to the classifier. SMO was used to compute hyper-plane parameters and threshold values for both MSVM and CSVM during the classifier training phase. The best classification accuracy was achieved using fifty Fourier coefficients. With the new CSVM classifier framework, accuracies of 99%, 100%, 98%, and 99% were obtained using datasets from one, two, three, and four precordial leads, respectively. In addition, using CSVM it was possible to successfully classify four types of ECG beat signals extracted from limb leads simultaneously with 97% accuracy, a significant improvement on the 83% accuracy achieved using the MSVM classification model. Further analysis was made of the following four beat types: NORMAL, PVC, SVPB, and FUSION, obtained from the European ST-T Database. Accuracies between 86% and 94% were obtained for MSVM and CSVM classification, respectively, using 100 Fourier coefficients for reconstructing individual ECG beats. Further analysis presented an effective ECG arrhythmia classification scheme consisting of PCA as a feature reduction method and a SIMCA classifier to differentiate between either four or six different types of arrhythmia. In separate studies, six and four types of beats (including NORMAL, PVC, APC, RBBB, LBBB, and FUSION beats) with time domain features were extracted from the MIT-BIH arrhythmia database and the St Petersburg INCART 12-lead Arrhythmia Database (incartdb), respectively. Between 10 and 30 PC coefficients were selected for reconstructing individual ECG beats in the feature selection phase. The average classification accuracy of the proposed scheme was 98.61% and 97.78% using the limb lead and precordial lead datasets, respectively. In addition, using MSVM and SIMCA classifiers with four ECG beat types achieved average classification accuracies of 76.83% and 98.33%, respectively. The effectiveness of the proposed algorithms was finally confirmed by successfully classifying both the six-beat and four-beat types of signal with a high accuracy ratio.
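The Fourier feature-extraction step described above can be sketched as follows; the coefficient count matches the fifty found optimal in the thesis, but the DC-removal choice and the stand-in signal are assumptions for illustration:

    # Sketch: fixed-length DFT magnitude features from one segmented ECG beat.
    import numpy as np

    def fourier_features(beat, n_coeffs=50):
        """Return magnitudes of the first n_coeffs DFT coefficients."""
        spectrum = np.fft.rfft(beat - np.mean(beat))   # remove the DC offset
        return np.abs(spectrum[:n_coeffs])

    beat = np.sin(np.linspace(0, 4 * np.pi, 360))      # stand-in for one beat
    features = fourier_features(beat)                  # input vector for the SVM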
34

Kubat, Rony Daniel. "A context-sensitive meta-classifier for color-naming." Thesis, Massachusetts Institute of Technology, 2008. http://hdl.handle.net/1721.1/43074.

Abstract:
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008.
Includes bibliographical references (p. 93-97).
Humans are sensitive to situational and semantic context when applying labels to colors. This is especially challenging for algorithms which attempt to replicate human categorization for communicative tasks. Additionally, mismatched color models between dialog partners can lead to a back-and-forth negotiation of terms to find common ground. This thesis presents a color-classification algorithm that takes advantage of a dialog-like interaction model to provide fast-adaptation for a specific exchange. The model learned in each exchange is then integrated into the system as a whole. This algorithm is an incremental meta-learner, leveraging a generic online-learner and adding context-sensitivity. A human study is presented, assessing the extent of semantic contextual effects on color naming. An evaluation of the algorithm based on the corpus gathered in this experiment is then tendered.
by Rony Daniel Kubat.
S.M.
35

Sembrant, Andreas. "Low Overhead Online Phase Predictor and Classifier." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-146661.

Abstract:
It is well known that programs exhibit time varying behavior. For example, some parts of the execution are memory bound while others are CPU bound. Periods of stable behavior are called program phases. Classifying the program behavior by the average over the whole execution can therefore be misleading, i.e., the program would appear to be neither CPU bound nor memory bound. As several important dynamic optimizations are done differently depending on the program behavior, it is important to keep track of what phase the program is currently executing and to predict what phase it will enter next. In this master thesis we develop a general purpose online phase prediction and classification library. It keeps track of what phase the program is currently executing and predicts what phase the program will enter next. Our library is non-intrusive, i.e., the program behavior is not changed by the presence of the library; transparent, i.e., it does not require the tracked application to be recompiled; and architecture-independent, i.e., the same phase will be detected regardless of the processor type. To keep the overhead at a minimum we use hardware performance counters to capture the required program statistics. Our evaluation shows that we can capture and classify program phase behavior with, on average, less than 1% overhead, and accurately predict which program phase the application will enter next.
36

Howard, Gerard David. "Constructivist and spiking neural learning classifier systems." Thesis, University of the West of England, Bristol, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.573442.

Abstract:
This thesis investigates the use of self-adaptation and neural constructivism within a neural Learning Classifier System framework. The system uses a classifier structure whereby each classifier condition is represented by an artificial neural network, which is used to compute an action in response to an environmental stimulus. We implement this neural representation in two modern Learning Classifier Systems, XCS and XCSF. A classic problem in neural networks revolves around network topology considerations: how many neurons should the network consist of? How should we configure their topological arrangement and inter-neural connectivity patterns to ensure high performance? Similarly, in Learning Classifier Systems, hand-tuning of parameters is sometimes necessary to achieve acceptable system performance. We employ a number of mechanisms to address these potential deficiencies. Neural constructivism is utilised to automatically alter network topology to reflect the complexity of the environment. It is shown that appropriate internal classifier complexity emerges during learning at a rate controlled by the learner. The resulting systems are applied to real-valued, noisy simulated maze environments and a simulated robotics platform. The main areas of novelty include the first use of self-adaptive constructivism within XCSF, the first implementation of temporally-sensitive spiking classifier representations within this constructive XCSF, and the demonstration of temporal functionality of such representations in noisy continuous-valued and robotic environments.
37

Geisinger, Nathan P. "Classification of digital modulation schemes using linear and nonlinear classifiers." Thesis, Monterey, California : Naval Postgraduate School, 2010. http://edocs.nps.edu/npspubs/scholarly/theses/2010/Mar/10Mar%5FGeisinger.pdf.

Abstract:
Thesis (Electrical Engineer and M.S. in Electrical Engineering)--Naval Postgraduate School, March 2010.
Thesis Advisor(s): Fargues, Monique P.; Cristi, Roberto; Robertson, Ralph C. "March 2010." Description based on title screen as viewed on April 27, 2010. Author's subject terms: Blind Modulation Classification, Cumulants, Principal Component Analysis, Linear Discriminant Analysis, Kernel-based functions. Includes bibliographical references (p. 211-212). Also available in print.
38

Abd, Rahman Mohd Amiruddin. "Kernel and multi-class classifiers for multi-floor WLAN localisation." Thesis, University of Sheffield, 2016. http://etheses.whiterose.ac.uk/13768/.

Abstract:
Indoor localisation techniques in multi-floor environments are emerging for location based service applications. Developing an accurate location determination and time-efficient technique is crucial for online location estimation of the multi-floor localisation system. The localisation accuracy and computational complexity of the localisation system mainly relies on the performance of the algorithms embedded with the system. Unfortunately, existing algorithms are either time-consuming or inaccurate for simultaneous determination of floor and horizontal locations in multi-floor environment. This thesis proposes an improved multi-floor localisation technique by integrating three important elements of the system; radio map fingerprint database optimisation, floor or vertical localisation, and horizontal localisation. The main focus of this work is to extend the kernel density approach and implement multi-class machine learning classifiers to improve the localisation accuracy and processing time of the each and overall elements of the proposed technique. For fingerprint database optimisation, novel access point (AP) selection algorithms which are based on variant AP selection are investigated to improve computational accuracy compared to existing AP selection algorithms such as Max-Mean and InfoGain. The variant AP selection is further improved by grouping AP based on signal distribution. In this work, two AP selection algorithms are proposed which are Max Kernel and Kernel Logistic Discriminant that implement the knowledge of kernel density estimate and logistic regression machine learning classification. For floor localisation, the strategy is based on developing the algorithm to determine the floor by utilising fingerprint clustering technique. The clustering method is based on simple signal strength clustering which sorts the signals of APs in each fingerprint according to the strongest value. Two new floor localisation algorithms namely Averaged Kernel Floor (AKF) and Kernel Logistic Floor (KLF) are studied. The former is based on modification of univariate kernel algorithm which is proposed for single-floor localisation, while the latter applies the theory kernel logistic regression which is similar to AP selection approach but for classification purpose. For horizontal localisation, different algorithm based on multi-class k-nearest neighbour ( NN) classifiers with optimisation parameter is presented. Unlike the classical kNN algorithm which is a regression type algorithm, the proposed localisation algorithms utilise machine learning classification for both linear and kernel types. The multi-class classification strategy is used to ensure quick estimation of the multi-class NN algorithms. The proposed algorithms are compared and analysed with existing algorithms to confirm reliability and robustness. Additionally, the algorithms are evaluated using six multi-floor and single-floor datasets to validate the proposed algorithms. In database optimisation, the proposed AP selection technique using Max Kernel could reduce as high as 77.8% APs compared to existing approaches while retaining similar accuracy as localisation algorithm utilising all APs in the database. In floor localisation, the proposed KLF algorithm at one time could demonstrate 93.4% correct determination of floor level based on the measured dataset. In horizontal localisation, the multi-class NN classifier algorithm could improve 19.3% of accuracy within fingerprint spacing of 2 meters compared to existing algorithms. 
All of the algorithms are then combined to provide device location estimation in a multi-floor environment. An improvement of 43.5% in within-2-metre location accuracy and a 15.2-fold reduction in computational time are observed compared with the existing multi-floor localisation techniques of Gansemer and Marques. The improved accuracy is due to the better performance of the proposed floor and horizontal localisation algorithms, while the computational time is reduced by the introduction of the AP selection algorithm.
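The horizontal-localisation step above rests on multi-class kNN classification over radio-map fingerprints. As a rough illustration only, here is a minimal Python sketch of fingerprint matching with a multi-class kNN classifier; the RSSI values, reference points, and parameters are invented placeholders, not the thesis's data or its optimised algorithms.

```python
# A minimal sketch of fingerprint-based horizontal localisation with a
# multi-class k-nearest-neighbour classifier. All values below are
# illustrative placeholders.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Offline phase: radio-map fingerprints (RSSI from 4 APs) collected at
# known reference points; each reference point is one class label.
fingerprints = np.array([
    [-45, -60, -72, -80],   # reference point 0
    [-50, -55, -70, -82],   # reference point 1
    [-65, -48, -60, -75],   # reference point 2
    [-70, -52, -55, -73],   # reference point 3
])
labels = np.array([0, 1, 2, 3])

knn = KNeighborsClassifier(n_neighbors=3, weights="distance")
knn.fit(fingerprints, labels)

# Online phase: classify a newly measured RSSI vector to a reference point.
measured = np.array([[-48, -58, -71, -81]])
print("estimated reference point:", knn.predict(measured)[0])
```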
APA, Harvard, Vancouver, ISO, and other styles
39

Zhang, Xu. "English quasi-numeral classifiers : a cognitive and corpus-based study." Thesis, Lancaster University, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.538610.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Danylenko, Antonina. "Decision Algebra: A General Approach to Learning and Using Classifiers." Doctoral thesis, Linnéuniversitetet, Institutionen för datavetenskap (DV), 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-43238.

Full text
Abstract:
Processing decision information is a vital part of the Computer Science fields in which pattern recognition problems arise. Decision information can be generalized as alternative decisions (or classes), attributes and attribute values, which are the basis for classification. Different classification approaches exist, such as decision trees, decision tables and Naïve Bayesian classifiers, which capture and manipulate decision information in order to construct a specific decision model (or classifier). These approaches are often tightly coupled to learning strategies, special data structures, the special characteristics of the decision information captured, etc. The approaches are also tied to the way certain problems are addressed, e.g., memory consumption, low accuracy, etc. This situation complicates the choice, comparison, combination and manipulation of different decision models learned over the same or different samples of decision information. The choice and comparison of decision models are not merely the choice of a model with higher prediction accuracy and a comparison of prediction accuracies, respectively. We also need to take into account that a decision model, when used in a certain application, often has an impact on the application's performance. The combination and manipulation of different decision models are often implementation- or application-specific, thus lacking the generality that would allow the construction of decision models with combined or modified decision information; they are also difficult to transfer from one application domain to another. In order to unify different approaches, we define Decision Algebra, a theoretical framework that presents decision models as higher-order decision functions that abstract from their implementation details. Decision Algebra defines the operations necessary to decide, combine, approximate, and manipulate decision functions, along with operation signatures and general algebraic laws. Due to its algebraic completeness (i.e., a complete algebraic semantics of operations) and its implementation efficiency, defining and developing decision models is simple: instances require implementing just one core operation, from which the other operations can be derived. Another advantage of Decision Algebra is composability: it allows for the combination of decision models constructed using different approaches. The accuracy and learning convergence properties of the combined model can be proven regardless of the actual approach. In addition, applications that process decision information can be defined using Decision Algebra regardless of the underlying classification approach. For example, we use Decision Algebra in a context-aware composition domain, where we show that context-aware applications improve performance when using Decision Algebra. In addition, we suggest an approach for integrating this context-aware component into legacy applications.
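To make the "one core operation" idea concrete, the following is a minimal Python sketch of a decision model wrapped as a higher-order decision function, with a combination operation derived from it by majority vote. The names and toy decision rules are illustrative assumptions, not the thesis's actual formalism.

```python
# A minimal sketch of the Decision Algebra idea: decision models appear as
# higher-order decision functions behind one interface, and combination is
# derived from the core decide operation. Names are illustrative only.
from collections import Counter
from typing import Callable, List

DecisionFunction = Callable[[dict], str]  # attributes -> decision (class)

def combine(models: List[DecisionFunction]) -> DecisionFunction:
    """Derive a combined decision function by majority vote."""
    def combined(x: dict) -> str:
        votes = Counter(m(x) for m in models)
        return votes.most_common(1)[0][0]
    return combined

# Two toy decision functions learned by different approaches
# (e.g., a decision-tree rule and a table lookup) behind one interface.
tree_like: DecisionFunction = lambda x: "spam" if x["links"] > 3 else "ham"
table_like: DecisionFunction = lambda x: "spam" if x["caps"] else "ham"

ensemble = combine([tree_like, table_like, tree_like])
print(ensemble({"links": 5, "caps": False}))  # -> "spam"
```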
APA, Harvard, Vancouver, ISO, and other styles
41

Fitzpatrick, Margo L. "Evaluating Bayesian Classifiers and Rough Sets for Corporate Bankruptcy Prediction." NSUWorks, 2004. http://nsuworks.nova.edu/gscis_etd/517.

Full text
Abstract:
Corporate failure or bankruptcy is costly to investors as well as to society in general. Given the high costs of corporate failure, there is much interest in improved methods for bankruptcy prediction. A promising approach to this problem is to provide auditors with a tool that aids in estimating the likelihood of bankruptcy. Recent studies indicate that some success has been achieved in identifying a model and good predictive variables, but the research has been limited to narrow industry segments or small samples. This research evaluated and contrasted two approaches for predicting corporate bankruptcy that were relatively successful in prior studies with narrow or small samples of corporations. The first approach used a Bayesian belief network that incorporated a naive Bayesian classification mechanism. The second approach used an expert system that incorporated rough sets. The contribution of this study is two-fold. First, this comparative evaluation extends the research by providing insights into the relative advantages of Bayesian classifiers and rough sets as tools for predicting corporate bankruptcy. One or more such tools could be useful to auditors and others concerned with forecasting the likely bankruptcy of corporations. Second, this research contributes to the literature by identifying a single set of predictor variables that have broad applicability to corporations and that can be used in both the rough sets and naive Bayesian models. Employing a single set of predictor variables in both models is essential for comparing the relative effectiveness of the models. The results of this study offer a set of predictor variables and a determination of which model has greater general applicability and effectiveness for forecasting corporate bankruptcies.
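As a rough illustration of the naive Bayesian side of this comparison, here is a minimal Python sketch of a Gaussian naive Bayes bankruptcy classifier over financial ratios; the ratios and toy firms are invented placeholders, not the study's predictor variables or data.

```python
# A minimal sketch of a naive Bayesian bankruptcy classifier over financial
# ratios, assuming Gaussian class-conditional densities. Values are
# illustrative placeholders only.
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Columns: working capital/assets, retained earnings/assets, debt/equity
X = np.array([
    [0.30, 0.25, 0.8],    # healthy firm
    [0.28, 0.30, 0.6],    # healthy firm
    [-0.10, -0.05, 3.2],  # bankrupt firm
    [-0.05, 0.02, 2.7],   # bankrupt firm
])
y = np.array([0, 0, 1, 1])  # 0 = solvent, 1 = bankrupt

model = GaussianNB().fit(X, y)
firm = np.array([[-0.02, 0.01, 2.9]])
print("P(bankrupt) =", model.predict_proba(firm)[0, 1])
```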
APA, Harvard, Vancouver, ISO, and other styles
42

Anaya, Leticia H. "Comparing Latent Dirichlet Allocation and Latent Semantic Analysis as Classifiers." Thesis, University of North Texas, 2011. https://digital.library.unt.edu/ark:/67531/metadc103284/.

Full text
Abstract:
In the Information Age, a proliferation of unstructured electronic text documents exists. Processing these documents is a daunting task for humans, who have limited cognitive abilities for handling large volumes of often extremely lengthy documents. To address this problem, text data computer algorithms are being developed. Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) are two text data computer algorithms that have each received much attention in the text data literature for topic extraction studies, but not for document classification or for comparison studies. Since classification is considered an important human function and has been studied in the areas of cognitive science and information science, this dissertation reports a research study comparing LDA, LSA and humans as document classifiers. The research questions posed in this study are: R1: How accurate are LDA and LSA in classifying documents in a corpus of textual data over a known set of topics? R2: How accurate are humans in performing the same classification task? R3: How does LDA classification performance compare to LSA classification performance? To address these questions, a classification study involving human subjects was designed in which humans were asked to generate and classify documents (customer comments) at two levels of abstraction for a quality assurance setting. Then two computer algorithms, LSA and LDA, were used to classify these documents. The results indicate that humans outperformed both computer algorithms, with an accuracy rate of 94% at the higher level of abstraction and 76% at the lower level. At the higher level of abstraction, the accuracy rates were 84% for both LSA and LDA; at the lower level, the accuracy rates were 67% for LSA and 64% for LDA. The findings of this research have strong implications for the improvement of information systems that process unstructured text. Document classifiers have potential applications in many fields (e.g., fraud detection, information retrieval, national security, and customer management). The development and refinement of algorithms that classify text is a fruitful area of ongoing research, and this dissertation contributes to this area.
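As a rough illustration of using topic models as classifiers, the following Python sketch assigns toy customer comments to their dominant LDA topic and dominant LSA dimension; the corpus, topic counts, and scikit-learn implementations are assumptions, not the dissertation's actual setup.

```python
# A minimal sketch of LSA and LDA used as document classifiers by assigning
# each comment to its dominant latent topic/dimension. Toy data only.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.decomposition import LatentDirichletAllocation, TruncatedSVD

comments = [
    "the delivery was late and the box was damaged",
    "shipping took weeks and the parcel arrived broken",
    "great product quality, works exactly as described",
    "excellent quality and very reliable in daily use",
]

# LDA route: dominant topic of the document-term counts.
counts = CountVectorizer().fit_transform(comments)
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(counts)
print("LDA topic per doc:", lda.transform(counts).argmax(axis=1))

# LSA route: dominant latent dimension of the TF-IDF matrix.
tfidf = TfidfVectorizer().fit_transform(comments)
lsa = TruncatedSVD(n_components=2, random_state=0).fit(tfidf)
print("LSA dim per doc:", abs(lsa.transform(tfidf)).argmax(axis=1))
```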
APA, Harvard, Vancouver, ISO, and other styles
43

Alorf, Abdulaziz Abdullah. "Primary/Soft Biometrics: Performance Evaluation and Novel Real-Time Classifiers." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/96942.

Full text
Abstract:
The relevance of faces in our daily lives is indisputable. We learn to recognize faces as newborns, and faces play a major role in interpersonal communication. The spectrum of computer vision research on face analysis includes, but is not limited to, face detection and facial attribute classification, which are the focus of this dissertation. The face is a primary biometric because by itself it reveals the subject's identity, while facial attributes (such as hair color and eye state) are soft biometrics because by themselves they do not reveal the subject's identity. In this dissertation, we proposed a real-time model for classifying 40 facial attributes, which preprocesses faces and then extracts 7 types of classical and deep features. These features were fused together to train 3 different classifiers. Our proposed model yielded 91.93% average accuracy, outperforming 7 state-of-the-art models. We also developed a real-time model for classifying the states of human eyes and mouth (open/closed), and the presence/absence of eyeglasses in the wild. Our method begins by preprocessing a face by cropping the regions of interest (ROIs), and then describing them using RootSIFT features. These features were used to train a nonlinear support vector machine for each attribute. Our eye-state classifier achieved the top performance, while our mouth-state and glasses classifiers were tied as the top performers with deep learning classifiers. We also introduced a new facial attribute related to Middle Eastern headwear (called igal) along with its detector. Our proposed idea was to detect the igal using a linear multiscale SVM classifier with a HOG descriptor. Thereafter, false positives were discarded using dense SIFT filtering, bag-of-visual-words decomposition, and nonlinear SVM classification. Due to the similarity in real-life applications, we compared the igal detector with state-of-the-art face detectors, where the igal detector significantly outperformed the face detectors with the lowest false positives. We also fused the igal detector with a face detector to improve the detection performance. Face detection is the first process in any facial attribute classification pipeline. As a result, we reported a novel study that evaluates the robustness of current face detectors based on: (1) diffraction blur, (2) image scale, and (3) the IoU classification threshold. This study enables users to pick a robust face detector for their intended applications.
Doctor of Philosophy
The relevance of faces in our daily lives is indisputable. We learn to recognize faces as newborns, and faces play a major role in interpersonal communication. Faces probably represent the most accurate biometric trait in our daily interactions. It is therefore not surprising that so much effort from computer vision researchers has been invested in the analysis of faces, and the automatic detection and analysis of faces within images has received much attention in recent years. The spectrum of computer vision research on face analysis includes, but is not limited to, face detection and facial attribute classification, which are the focus of this dissertation. The face is a primary biometric because by itself it reveals the subject's identity, while facial attributes (such as hair color and eye state) are soft biometrics because by themselves they do not reveal the subject's identity. Soft biometrics have many uses in the field of biometrics: (1) they can be utilized in a fusion framework to strengthen the performance of a primary biometric system; for example, fusing a face with voice accent information can boost the performance of face recognition. (2) They can also be used to create qualitative descriptions of a person, such as an "old bald male wearing a necktie and eyeglasses." Face detection and facial attribute classification are not easy problems because of many factors, such as image orientation, pose variation, clutter, facial expressions, occlusion, and illumination, among others. In this dissertation, we introduced novel techniques to classify more than 40 facial attributes in real time. Our techniques followed the general facial attribute classification pipeline, which begins by detecting a face and ends by classifying facial attributes. We also introduced a new facial attribute related to Middle Eastern headwear along with its detector. The new facial attribute was fused with a face detector to improve the detection performance. In addition, we proposed a new method to evaluate the robustness of face detection, which is the first process in the facial attribute classification pipeline. Detecting the states of human facial attributes in real time is highly desired by many applications. For example, the real-time detection of a driver's eye state (open/closed) can prevent severe accidents; such systems are usually called driver drowsiness detection systems. For classifying 40 facial attributes, we proposed a real-time model that preprocesses faces by localizing facial landmarks to normalize the faces, and then crops them based on the intended attribute; the face is cropped only if the intended attribute lies inside the face region. After that, 7 types of classical and deep features were extracted from the preprocessed faces. Lastly, these 7 feature sets were fused together to train three different classifiers. Our proposed model yielded 91.93% average accuracy, outperforming 7 state-of-the-art models. It also achieved state-of-the-art performance in classifying 14 out of 40 attributes. We also developed a real-time model that classifies the states of three human facial attributes: (1) eyes (open/closed), (2) mouth (open/closed), and (3) eyeglasses (present/absent). Our proposed method consisted of six main steps: (1) In the beginning, we detected the human face. (2) Then we extracted the facial landmarks. (3) Thereafter, we normalized the face, based on the eye location, to the full frontal view.
(4) We then extracted the regions of interest (i.e., the regions of the mouth, left eye, right eye, and eyeglasses). (5) We extracted low-level features from each region and then described them. (6) Finally, we learned a binary classifier for each attribute to classify it using the extracted features. Our developed model achieved 30 FPS with a CPU-only implementation, and our eye-state classifier achieved the top performance, while our mouth-state and glasses classifiers were tied as the top performers with deep learning classifiers. We also introduced a new facial attribute related to Middle Eastern headwear along with its detector, and fused it with a face detector to improve the detection performance. The traditional Middle Eastern headwear that men usually wear consists of two parts: (1) the shemagh or keffiyeh, a scarf that covers the head and usually has checkered or pure white patterns, and (2) the igal, a band or cord worn on top of the shemagh to hold it in place. The shemagh causes many unwanted effects on the face; for example, it usually occludes parts of the face and adds dark shadows, especially near the eyes. These effects substantially degrade the performance of face detection. To improve the detection of people who wear the traditional Middle Eastern headwear, we developed a model that can be used as a head detector or combined with current face detectors to improve their performance. Our igal detector consists of two main steps: (1) learning a binary classifier to detect the igal and (2) refining the classifier by removing false positives. Due to the similarity in real-life applications, we compared the igal detector with state-of-the-art face detectors, where the igal detector significantly outperformed the face detectors with the lowest false positives. We also fused the igal detector with a face detector to improve the detection performance. Face detection is the first process in any facial attribute classification pipeline. As a result, we reported a novel study that evaluates the robustness of current face detectors based on: (1) diffraction blur, (2) image scale, and (3) the IoU classification threshold. This study enables users to pick a robust face detector for their intended applications. Biometric systems that use face detection suffer from large performance fluctuations. For example, users of biometric surveillance systems that utilize face detection sometimes notice that state-of-the-art face detectors do not perform well compared with outdated detectors. Although state-of-the-art face detectors are designed to work in the wild (i.e., with no need to retrain, revalidate, and retest), they still depend heavily on the datasets they were originally trained on. This in turn leads to variation in the detectors' performance when they are applied to a different dataset or environment. To overcome this problem, we developed a novel optics-based blur simulator that automatically introduces diffraction blur at different image scales/magnifications. We then evaluated different face detectors on the output images using different IoU thresholds. Users first choose their own values for these three settings and then run our model to find the efficient face detector under the selected settings. That means our proposed model enables users of biometric systems to pick the efficient face detector based on their system setup.
Our results showed that sometimes outdated face detectors outperform state-of-the-art ones under certain settings and vice versa.
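As a rough illustration of the HOG-plus-linear-SVM detection step the igal detector builds on, here is a minimal Python sketch over random stand-in patches; the patch sizes, HOG parameters, and training data are invented placeholders, not the dissertation's detector.

```python
# A minimal sketch of HOG feature extraction and linear SVM scoring for a
# sliding-window detector, with random stand-in image patches. A real
# detector would train on labelled object/background crops at many scales.
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
patches = rng.random((20, 64, 64))          # stand-in 64x64 grayscale crops
labels = np.array([1] * 10 + [0] * 10)      # 1 = object, 0 = background

features = np.array([
    hog(p, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))
    for p in patches
])
svm = LinearSVC(C=1.0).fit(features, labels)

window = rng.random((64, 64))               # one sliding-window crop
score = svm.decision_function([hog(window, orientations=9,
                                   pixels_per_cell=(8, 8),
                                   cells_per_block=(2, 2))])
print("detection score:", score[0])
```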
APA, Harvard, Vancouver, ISO, and other styles
44

Na, Li. "Combination of supervised and unsupervised classifiers based on belief functions." Thesis, Rennes 1, 2020. http://www.theses.fr/2020REN1S041.

Full text
Abstract:
Land cover refers to the biophysical cover of the Earth's terrestrial surface, identifying vegetation, water, bare soil, impervious surfaces, etc. Identifying land cover is essential for planning and managing natural resources (e.g. development, protection), understanding the distribution of habitats, and modelling environmental variables. Identification of land cover types provides basic information for the generation of other thematic maps and establishes a baseline for monitoring activities. Therefore, land cover classification using satellite data is one of the most important applications of remote sensing. A great deal of ground information (e.g. labeled samples) is usually required to generate a high-quality land cover classification. However, in complex natural areas, collecting information on the ground can be time-consuming and extremely expensive. Nowadays, multi-sensor technologies have gained great attention in land cover classification: they bring different and complementary information about spectral characteristics that may help overcome the limitations caused by inadequate ground information. Another problem caused by the lack of ground information is the ambiguity of the relationship between land cover maps and land use maps. Land cover maps provide information on the natural features that can be directly observed at the Earth's surface, whereas land use maps refer to the ways people use the landscape for different purposes. Without adequate ground information, it is difficult to derive land use maps from land cover maps for complex areas. Therefore, when combining several heterogeneous land cover maps, one must consider how to let users synthesise the scheme of the resulting land use maps. In our research, we focus on the fusion of heterogeneous information from different sources. The combination system aims to solve the problems caused by limited labeled samples and can thus be used in land cover classification for hard-to-access areas. The semantic labels for the land cover classification from each sensor can be different, and may not correspond to the final scheme of labels that users expect. For instance, land cover classification methods of different sensors provide semantic labels for the ground, but based on these land cover maps, an accessibility map may have to be generated to meet users' needs. Therefore, another objective of the combination is to provide an interface with a final scheme that may differ from the input land cover maps.
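As a rough illustration of the belief-function machinery underlying such a combination system, here is a minimal Python sketch of Dempster's rule of combination for two mass functions over land-cover hypotheses; the mass values and hypothesis names are invented placeholders, not the thesis's sensors or fusion scheme.

```python
# A minimal sketch of Dempster's rule of combination, the core operation in
# belief-function (Dempster-Shafer) fusion of sources. Masses are toy values.
from itertools import product

def dempster(m1: dict, m2: dict) -> dict:
    """Combine two mass functions over frozenset focal elements."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb
    # Normalise by the non-conflicting mass (assumes conflict < 1).
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

V, W = frozenset({"vegetation"}), frozenset({"water"})
m_optical = {V: 0.6, W: 0.1, V | W: 0.3}   # source 1: optical classifier
m_radar   = {V: 0.5, W: 0.2, V | W: 0.3}   # source 2: radar classifier
print(dempster(m_optical, m_radar))
```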
APA, Harvard, Vancouver, ISO, and other styles
45

Çetin, Özgür. "Multi-rate modeling, model inference, and estimation for statistical classifiers." Thesis, Connect to this title online; UW restricted, 2004. http://hdl.handle.net/1773/5849.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Oliveira, e. Cruz Rafael Menelau. "Methods for dynamic selection and fusion of ensemble of classifiers." Universidade Federal de Pernambuco, 2011. https://repositorio.ufpe.br/handle/123456789/2436.

Full text
Abstract:
Fundação de Amparo à Ciência e Tecnologia do Estado de Pernambuco
Ensemble of Classifiers (EoC) is a new alternative for achieving high recognition rates in pattern recognition systems. The use of ensembles is motivated by the fact that different classifiers can recognise different patterns, and so they are complementary. In this work, EoC methodologies are explored with the aim of improving the recognition rate in different problems. First, the character recognition problem is addressed. This work proposes a new methodology that uses multiple feature extraction techniques, each following a different approach (edges, gradients, projections). Each technique is treated as a sub-problem with its own classifier, and the outputs of these classifiers are used as input to a new classifier trained to combine (fuse) the results. Experiments show that the proposal achieved the best results in the literature for both digit recognition and letter recognition. The second part of the dissertation deals with dynamic classifier selection (DCS). This strategy is motivated by the fact that not every classifier in the ensemble is an expert for every test pattern. Dynamic selection tries to select, for the classification of a given input pattern, only the classifiers that perform best in a region close to that pattern. A study of the behaviour of DCS techniques shows that they are limited by the quality of the region around the input pattern. Based on this analysis, two techniques for dynamic classifier selection are proposed. The first uses filters to reduce noise close to the test pattern. The second is a new proposal that extracts different types of information from the behaviour of the classifiers and uses this information to decide whether or not a classifier should be selected. Experiments conducted on several pattern recognition problems show that the proposed techniques deliver a significant performance improvement.
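As a rough illustration of dynamic classifier selection, the following Python sketch implements selection by overall local accuracy (OLA): for each test pattern, the base classifier that is most accurate on the pattern's nearest validation neighbours is chosen. The data, pool, and neighbourhood size are toy assumptions, not the dissertation's proposed techniques.

```python
# A minimal sketch of dynamic classifier selection by overall local
# accuracy (OLA) over a pool of base classifiers. Toy data only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# A small pool of diverse base classifiers.
pool = [DecisionTreeClassifier(max_depth=d, random_state=0).fit(X_tr, y_tr)
        for d in (1, 3, None)]

nn = NearestNeighbors(n_neighbors=7).fit(X_val)

def predict_ola(x):
    # Region of competence: the k nearest validation neighbours of x.
    _, idx = nn.kneighbors([x])
    region_X, region_y = X_val[idx[0]], y_val[idx[0]]
    # Select the classifier with the best accuracy in that region.
    local_acc = [clf.score(region_X, region_y) for clf in pool]
    return pool[int(np.argmax(local_acc))].predict([x])[0]

print(predict_ola(X_val[0]), "true:", y_val[0])
```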
APA, Harvard, Vancouver, ISO, and other styles
47

Mancini, Lorenzo <1989>. "Ordinal data supervised classification with Quantile-based and other classifiers." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amsdottorato.unibo.it/8543/1/phd%20thesis_mancini.pdf.

Full text
Abstract:
The aim of this research project is to propose a new method for supervised classification problems in which the input features are ordinal. Ordinal data are preponderant in many research fields. They arise directly when observations fall into separate, distinct but ordered categories, and they are very common in surveys where answers are given on Likert scales. Typically, they are coded as equally spaced values and are sometimes analyzed as numerical values; these choices do not necessarily correspond to the real distribution of the data. The objectives of the study were accomplished in several steps. The first phase consisted of an exhaustive analysis of the state of the art of the statistical literature, with the aim of identifying the various approaches to ordinal data analysis, their limitations, and their possible advantages. We then proposed to operate in the framework of Generalized Linear Latent Variable Models (GLLVM), considering the response function approach with a single Beta-distributed latent variable. Unlike the classical method, which assumes normally distributed latent variables, the Beta distribution offers specific advantages in terms of computational efficiency and fit to the data. Our purpose in using this method is to shift from a set of ordinal features to a single continuous feature that adapts well to the data, so that standard classification methods can be applied directly. A dedicated EM algorithm was developed on the basis of this theoretical framework using the statistical software R. Finally, we compared our approach with several scoring methods through a wide simulation study. The scoring methods considered in the simulation study are: the raw scores, the ridits, the Blom scores, the normal median scores and the conditional mean scores. These methods, although they have a long history in the literature, have never been used for classification purposes. In addition, we present an example of the application of the proposed approach to a real-world business data problem.
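As a rough illustration of two of the scoring methods named above, here is a minimal Python sketch that computes ridit scores and Blom normal scores for a toy Likert item; the frequencies are invented, and the formulas follow the standard definitions rather than the thesis's code.

```python
# A minimal sketch of ridit scores (cumulative-proportion based) and Blom
# normal scores for an ordinal Likert item. Sample values are toy data.
import numpy as np
from scipy.stats import norm, rankdata

likert = np.array([1, 1, 2, 2, 2, 3, 4, 4, 5, 5])  # toy ordinal responses

# Ridit: cumulative proportion below the category plus half its own share.
cats, counts = np.unique(likert, return_counts=True)
p = counts / counts.sum()
ridit = np.cumsum(p) - p / 2
print("ridit per category:", dict(zip(cats, ridit.round(3))))

# Blom normal scores: inverse-normal of adjusted ranks, (r - 3/8)/(n + 1/4).
r = rankdata(likert)            # mid-ranks handle the ties
blom = norm.ppf((r - 3 / 8) / (len(likert) + 1 / 4))
print("Blom scores:", blom.round(3))
```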
APA, Harvard, Vancouver, ISO, and other styles
48

Brosnan, Timothy Myers. "Neural network and vector quantization classifiers for recognition and inspection applications." Thesis, Georgia Institute of Technology, 1997. http://hdl.handle.net/1853/15378.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Leon, Pasqual Maria Lourdes de. "Noun and numeral classifiers in Mixtec and Tzotzil : a referential view." Thesis, University of Sussex, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.232945.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Komatsu, Hiroko. "Prototypes and Metaphorical Extensions: The Japanese Numeral Classifiers hiki and hatsu." Thesis, The University of Sydney, 2018. http://hdl.handle.net/2123/19648.

Full text
Abstract:
This study concerns the meaning of Japanese numeral classifiers (NCs) and, particularly, the elements which guide us in understanding the metaphorical meanings they can convey. In the typological literature, as well as in studies of Japanese, the focus is almost entirely on NCs that refer to entities. NCs are generally characterised as being matched with a noun primarily on semantic criteria such as the animacy, the physical characteristics, or the function of the referent concerned. However, in some languages, including Japanese, nouns allow a number of alternative NCs, so it is considered that NCs are not automatically matched with a noun but rather with the referent that the noun refers to in the particular context in which it occurs. This study examines data from the Balanced Corpus of Contemporary Written Japanese, and focuses on two NCs as case studies: hiki, an entity NC typically used to classify small animate beings, and hatsu, an NC used to classify both entities and events that are typically explosive in nature. The study employs the framework of Prototype Theory, along with the theory of conceptual metaphor and the theory of metonymy. The analysis of the data identified a number of semantic components for each of the target NCs; by drawing on these components, the speaker can subjectively add those meanings to modify the meaning of the referring noun or verb. Furthermore, the study revealed that the choice of NCs can be influenced by two factors. First, the choice of NC sometimes relates to the linguistic context in which the referring noun or verb occurs. For example, if a noun is used metaphorically, the NC is chosen to reinforce that metaphor, rather than to match the actual referent. Second, the meaning of an NC itself can be used as a vehicle of metaphor to contribute meaning to that of the referring noun or verb concerned. Through the analysis, it has been identified that extension in the range of referents of a single NC, beyond cases in which objectively observable characteristics are evident, occurs in two dimensions: (1) in terms of the typicality of referents and (2) across categories of referents (entities and events). Based on the findings, the study claims that, in both cases, non-literal factors account for extension in the range of referents of an NC in Japanese. Specifically, the non-literal devices of metaphor and metonymy appear to play a role in connecting an NC and its referent in contexts in which extension of the use of that NC occurs.
APA, Harvard, Vancouver, ISO, and other styles