Dissertations / Theses on the topic 'Fingerprints Classification Data processing'



Consult the top 50 dissertations / theses for your research on the topic 'Fingerprints Classification Data processing.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Deng, Huimin, and 鄧惠民. "Robust minutia-based fingerprint verification." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2006. http://hub.hku.hk/bib/B37036427.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Aygar, Alper. "Doppler Radar Data Processing And Classification." Master's thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12609890/index.pdf.

Full text
Abstract:
In this thesis, improving the performance of automatic recognition of Doppler radar targets is studied. The radar used in this study is a ground-surveillance Doppler radar. The target types are car, truck, bus, tank, helicopter, moving man and running man. The input to this thesis is the output of real Doppler radar signals which were normalized and preprocessed (TRP vectors: Target Recognition Pattern vectors) in the doctoral thesis by Erdogan (2002). TRP vectors are Doppler radar target signals normalized and homogenized with respect to target speed, target aspect angle and target range. Some target classes have repetitions in time in their TRPs, and the use of these repetitions to improve target type classification performance is studied. K-Nearest Neighbor (KNN) and Support Vector Machine (SVM) algorithms are used for Doppler radar target classification and the results are evaluated. Before classification, PCA (Principal Component Analysis), LDA (Linear Discriminant Analysis), NMF (Nonnegative Matrix Factorization) and ICA (Independent Component Analysis) are implemented and applied to the normalized Doppler radar signals for efficient feature extraction and dimension reduction. These techniques transform the input vectors, which are the normalized Doppler radar signals, into another space. The effects of these feature extraction algorithms, and of the use of repetitions in Doppler radar target signals, on Doppler radar target classification performance are studied.
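A minimal sketch of the classification stage described in this abstract: dimensionality reduction followed by KNN and SVM classifiers, here written with scikit-learn. The array shapes, class count and all parameter values are illustrative placeholders, not those used in the thesis.

```python
# Illustrative sketch: PCA for dimension reduction, then KNN and SVM classifiers.
# The TRP vectors and labels are placeholders (random data), not thesis data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(700, 128))          # placeholder TRP vectors (one per target)
y = rng.integers(0, 7, size=700)         # 7 target classes: car, truck, bus, ...

for name, clf in [("KNN", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC(kernel="rbf", C=1.0))]:
    pipe = make_pipeline(StandardScaler(), PCA(n_components=20), clf)
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{name}: mean cross-validated accuracy {scores.mean():.3f}")
```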
APA, Harvard, Vancouver, ISO, and other styles
3

Fernandez, Noemi. "Statistical information processing for data classification." FIU Digital Commons, 1996. http://digitalcommons.fiu.edu/etd/3297.

Full text
Abstract:
This thesis introduces new algorithms for analysis and classification of multivariate data. Statistical approaches are devised for the objectives of data clustering, data classification and object recognition. An initial investigation begins with the application of fundamental pattern recognition principles. Where such fundamental principles meet their limitations, statistical and neural algorithms are integrated to augment the overall approach for an enhanced solution. This thesis provides a new dimension to the problem of classification of data as a result of the following developments: (1) application of algorithms for object classification and recognition; (2) integration of a neural network algorithm which determines the decision functions associated with the task of classification; (3) determination and use of the eigensystem using newly developed methods with the objectives of achieving optimized data clustering and data classification, and dynamic monitoring of time-varying data; and (4) use of the principal component transform to exploit the eigensystem in order to perform the important tasks of orientation-independent object recognition, and dimensionality reduction of the data such as to optimize the processing time without compromising accuracy in the analysis of this data.
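A small illustration of how an eigensystem (the principal component transform mentioned above) can provide an orientation-independent object description. The 2-D point-cloud representation and the synthetic example are assumptions for illustration only.

```python
# Sketch: rotate an object's point cloud into its principal axes so that
# downstream features become independent of the object's orientation.
import numpy as np

def canonical_orientation(points):
    """Rotate a 2-D point cloud so its principal axes align with x/y."""
    centered = points - points.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigensystem of the covariance
    order = np.argsort(eigvals)[::-1]            # sort axes by decreasing variance
    return centered @ eigvecs[:, order]          # coordinates in principal axes

# Example: the same object rotated by 30 degrees maps (up to axis sign flips)
# to the same canonical representation.
rng = np.random.default_rng(1)
obj = rng.normal(size=(200, 2)) * [3.0, 1.0]
theta = np.deg2rad(30)
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
print(np.allclose(np.abs(canonical_orientation(obj)),
                  np.abs(canonical_orientation(obj @ R.T)), atol=1e-6))
```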
APA, Harvard, Vancouver, ISO, and other styles
4

Varnavas, Andreas Soteriou. "Signal processing methods for EEG data classification." Thesis, Imperial College London, 2008. http://hdl.handle.net/10044/1/11943.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Shen, Shan. "MRI brain tumour classification using image processing and data mining." Thesis, University of Strathclyde, 2004. http://oleg.lib.strath.ac.uk:80/R/?func=dbin-jump-full&object_id=21543.

Full text
Abstract:
Detecting and diagnosing brain tumour types quickly and accurately is essential to any effective treatment. The general brain tumour diagnosis procedure, biopsy, not only causes a great deal of pain to the patient but also raises operational difficulty for the clinician. In this thesis, a non-invasive brain tumour diagnosis system based on MR images is proposed. The first part is image preprocessing applied to original MR images from the hospital. Non-uniform intensity scales of MR images are standardized relying on their statistical characteristics, without requiring prior or post templates. This is followed by a non-brain region removal process using morphological operations and a contrast enhancement between white matter and grey matter by means of histogram equalization. The second part is image segmentation applied to the preprocessed MR images. A new image segmentation algorithm named IFCM is developed based on the traditional FCM algorithm. Neighbourhood attractions considered in IFCM make this new algorithm insensitive to noise, while a neural network model is designed to determine optimized degrees of attraction. This extension can also estimate inhomogeneities. Brain tissue intensities are acquired from the segmentation. The final part of the system is brain tumour classification. It extracts hidden diagnosis information from brain tissue intensities using a fuzzy-logic-based GP algorithm. This novel method introduces a fuzzy membership to implement multi-class classification directly, without converting it into several binary classification problems as with most other methods. Two fitness functions are defined to describe the features of medical data precisely. The superiority of the image analysis methods in each part was demonstrated on synthetic images and real MR images. Classification rules for three types and two grades of brain tumours were discovered. The final diagnosis accuracy was very promising. The feasibility and capability of the non-invasive diagnosis system were verified comprehensively.
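For reference, a compact sketch of standard fuzzy c-means (FCM), the algorithm that IFCM extends; the neighbourhood-attraction and neural-network components of IFCM are not reproduced. The synthetic intensity data and parameter values are illustrative.

```python
# Standard FCM: alternate between computing cluster centres from fuzzy
# memberships and updating memberships from distances to the centres.
import numpy as np

def fcm(x, c=3, m=2.0, n_iter=100, eps=1e-9):
    """x: (N, d) data, c: number of clusters, m: fuzziness exponent."""
    rng = np.random.default_rng(0)
    u = rng.random((c, x.shape[0]))
    u /= u.sum(axis=0)                               # memberships sum to 1 per point
    for _ in range(n_iter):
        um = u ** m
        centres = um @ x / um.sum(axis=1, keepdims=True)
        dist = np.linalg.norm(x[None, :, :] - centres[:, None, :], axis=2) + eps
        u = 1.0 / (dist ** (2.0 / (m - 1)))
        u /= u.sum(axis=0)
    return u, centres

# Example: voxel intensities of a (preprocessed) slice clustered into 3 tissues.
intensities = np.concatenate([np.random.normal(mu, 5, 500) for mu in (40, 100, 160)])
u, centres = fcm(intensities.reshape(-1, 1), c=3)
print(np.sort(centres.ravel()))                      # approx. the three tissue means
```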
APA, Harvard, Vancouver, ISO, and other styles
6

Kirkin, S., and K. V. Melnyk. "Intelligent Data Processing in Creating Targeted Advertising." Thesis, National Technical University "Kharkiv Polytechnic Institute", 2017. http://repository.kpi.kharkov.ua/handle/KhPI-Press/44710.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Pinheiro, Muriel Aline. "Processing, radiometric correction, autofocus and polarimetric classification of circular SAR data." Instituto Tecnológico de Aeronáutica, 2010. http://www.bd.bibl.ita.br/tde_busca/arquivo.php?codArquivo=1083.

Full text
Abstract:
The demand for high-resolution SAR systems, and for imaging techniques able to retrieve scene information in the third dimension, has stimulated the development of new acquisition modes and processing approaches. This work studies one of the newest SAR acquisition modes in use, namely circular SAR, in which the platform follows a non-linear circular trajectory. A brief introduction to the acquisition geometry is presented, along with the advantages of this acquisition mode, such as the volumetric reconstruction capability, higher resolutions and the possibility to retrieve target information from a wider range of observation angles. To deal with the non-linearity of the trajectory, a processing approach using the time-domain back-projection algorithm is suggested to focus and radiometrically correct the images, taking into account the antenna patterns and the loss due to propagation. An existing autofocus approach to correct motion errors is validated for the circular SAR context and a new frequency-domain approach is proposed. Once the images are processed and calibrated, a polarimetric analysis is presented. In this context, a new polarimetric classification methodology is proposed for the particular geometry under consideration. The method uses the H- plane and the information from the first eigenvalue to classify small sub-apertures of the circular trajectory and finally classify the entire 360° circular aperture. Using information from all sub-apertures it is possible to preserve information on directional targets and diminish the effects caused by topography defocusing on the classification. To reduce speckle and improve the classification algorithm, a Lee adaptive filter is implemented. The processing and calibration approaches and the classification methodology are validated with real circular SAR data acquired with the SAR systems of the German Aerospace Center (DLR).
APA, Harvard, Vancouver, ISO, and other styles
8

ALMEIDA, Marcos Antonio Martins de. "Statistical analysis applied to data classification and image filtering." Universidade Federal de Pernambuco, 2016. https://repositorio.ufpe.br/handle/123456789/25506.

Full text
Abstract:
Statistical analysis is a tool of wide applicability in several areas of scientific knowledge. This thesis makes use of statistical analysis in two different applications: data classification and image processing targeted at document image binarization. In the first case, this thesis presents an analysis of several aspects of the consistency of the classification of the senior researchers in computer science of the Brazilian research council, CNPq - Conselho Nacional de Desenvolvimento Científico e Tecnológico. The second application of statistical analysis developed in this thesis addresses filtering out the back-to-front interference which appears whenever a document is written or typed on both sides of translucent paper. In this topic, an assessment of the most important algorithms found in the literature is made, taking into account a large quantity of parameters such as the strength of the back-to-front interference, the diffusion of the ink in the paper, and the texture and hue of the paper due to aging. A new binarization algorithm is proposed, which is capable of removing the back-to-front noise in a wide range of documents. Additionally, this thesis proposes a new concept of “intelligent” binarization for complex documents, which besides text encompass several graphical elements such as figures, photos, diagrams, etc.
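As a point of comparison for the binarization work described above, a minimal global-threshold (Otsu) baseline is sketched below; the thesis's own back-to-front filtering algorithm is more elaborate and is not reproduced here. The synthetic "document" image is an assumption for illustration.

```python
# Otsu's method: choose the greyscale threshold that maximises the
# between-class variance of the resulting foreground/background split.
import numpy as np

def otsu_threshold(img):
    """Return the greyscale threshold maximising between-class variance."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                       # cumulative class probability
    mu = np.cumsum(prob * np.arange(256))         # cumulative mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    return int(np.nanargmax(sigma_b))

rng = np.random.default_rng(0)
page = rng.integers(180, 255, size=(64, 64))      # bright paper background
page[20:40, 10:50] = rng.integers(0, 60, (20, 40))  # dark "ink" strokes
binary = page > otsu_threshold(page)              # True = background, False = ink
print(otsu_threshold(page), binary.mean())
```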
APA, Harvard, Vancouver, ISO, and other styles
9

Schmidt, Sven. "Quality-of-Service-Aware Data Stream Processing." Doctoral thesis, Technische Universität Dresden, 2006. https://tud.qucosa.de/id/qucosa%3A23955.

Full text
Abstract:
Data stream processing in the industrial as well as in the academic field has gained more and more importance during the last years. Consider the monitoring of industrial processes as an example: there, sensors are mounted to gather large amounts of data within a short time range. Storing and post-processing these data may occasionally be useless or even impossible. On the one hand, only a small part of the monitored data is relevant; to efficiently use the storage capacity, only a preselection of the data should be considered. On the other hand, it may occur that the volume of incoming data is generally too high to be stored in time or, in other words, that the technical effort for storing the data in time would be out of scale. Processing data streams in the context of this thesis means applying database operations to the stream in an on-the-fly manner (without explicitly storing the data). The challenges for this task lie in the limited amount of resources while data streams are potentially infinite. Furthermore, data stream processing must be fast and the results have to be disseminated as soon as possible. This thesis focuses on the latter issue. The goal is to provide a so-called Quality of Service (QoS) for the data stream processing task. Therefore, adequate QoS metrics like maximum output delay or minimum result data rate are defined. Thereafter, a cost model for obtaining the required processing resources from the specified QoS is presented. On that basis, the stream processing operations are scheduled. Depending on the required QoS and on the available resources, the weight can be shifted among the individual resources and QoS metrics, respectively. Calculating and scheduling resources requires a lot of expert knowledge regarding the characteristics of the stream operations and of the incoming data streams. Often, this knowledge is based on experience and thus a revision of the resource calculation and reservation becomes necessary from time to time. This leads to occasional interruptions of the continuous data stream processing, of the delivery of the results, and thus of the negotiated Quality of Service. The proposed robustness concept supports the user and facilitates a decrease in the number of interruptions by providing more resources.
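A toy illustration of enforcing one QoS metric of the kind defined above (a maximum output delay) by shedding tuples that would miss their deadline. The operator, delay budget and shedding policy are assumptions made for illustration; the thesis's cost model and scheduling are not reproduced.

```python
# Toy QoS-aware stream loop: tuples whose processing would exceed the
# negotiated maximum output delay are dropped (load shedding).
import time
from collections import deque

MAX_OUTPUT_DELAY = 0.05            # assumed negotiated QoS metric: 50 ms

def process(item):                 # placeholder stream operator (e.g. a filter)
    time.sleep(0.002)
    return item * 2

def run(stream):
    queue = deque((time.monotonic(), x) for x in stream)
    results, dropped = [], 0
    while queue:
        arrival, x = queue.popleft()
        if time.monotonic() - arrival > MAX_OUTPUT_DELAY:
            dropped += 1           # result would miss its deadline: shed it
            continue
        results.append(process(x))
    return results, dropped

results, dropped = run(range(100))
print(f"emitted {len(results)} results, shed {dropped} tuples")
```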
APA, Harvard, Vancouver, ISO, and other styles
10

Kutzner, Kendy. "Processing MODIS Data for Fire Detection in Australia." Thesis, Universitätsbibliothek Chemnitz, 2002. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-200200831.

Full text
Abstract:
The aim of this work was to use remote sensing data from the MODIS instrument aboard the Terra satellite to detect bush fires in Australia. This included preprocessing the demodulator output, bit synchronization and reassembly of data packets. IMAPP was used to do the geolocation and data calibration. The fire detection used a combination of fixed-threshold techniques with difference tests and background comparisons. The results were projected onto a rectangular latitude/longitude map to remedy the bow-tie effect. Algorithms were implemented in C and Matlab. It proved to be possible to detect fires in the available data. The results were compared with fire detections done by NASA and fire detections based on other sensors, and were found to be very similar.
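A simplified sketch of the detection logic described above: a fixed brightness-temperature threshold combined with difference tests and a comparison against the local background. The channel names, thresholds and window size are illustrative assumptions, not the values used in this work.

```python
# Fixed-threshold test OR relative tests against the local background window.
import numpy as np

def detect_fires(t4, t11, win=10):
    """t4, t11: brightness temperatures (K) in the 4 and 11 micron channels."""
    dt = t4 - t11
    fires = np.zeros(t4.shape, dtype=bool)
    for i in range(t4.shape[0]):
        for j in range(t4.shape[1]):
            i0, i1 = max(0, i - win), i + win + 1
            j0, j1 = max(0, j - win), j + win + 1
            bg_t4 = np.mean(t4[i0:i1, j0:j1])      # background statistics
            bg_dt = np.mean(dt[i0:i1, j0:j1])
            fires[i, j] = (t4[i, j] > 360.0) or (
                t4[i, j] > 320.0 and dt[i, j] > 20.0
                and t4[i, j] > bg_t4 + 4.0 and dt[i, j] > bg_dt + 4.0)
    return fires

t4 = np.full((50, 50), 300.0); t11 = np.full((50, 50), 295.0)
t4[25, 25] = 340.0                                 # one hot pixel
print(int(detect_fires(t4, t11).sum()))            # -> 1
```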
APA, Harvard, Vancouver, ISO, and other styles
11

Mugtussids, Iossif B. "Flight Data Processing Techniques to Identify Unusual Events." Diss., Virginia Tech, 2000. http://hdl.handle.net/10919/28095.

Full text
Abstract:
Modern aircraft are capable of recording hundreds of parameters during flight. This fact not only facilitates the investigation of an accident or a serious incident, but also provides the opportunity to use the recorded data to predict future aircraft behavior. It is believed that, by analyzing the recorded data, one can identify precursors to hazardous behavior and develop procedures to mitigate the problems before they actually occur. Because of the enormous amount of data collected during each flight, it becomes necessary to identify the segments of data that contain useful information. The objective is to distinguish between typical data points, which are present in the majority of flights, and unusual data points that can only be found in a few flights. The distinction between typical and unusual data points is achieved by using classification procedures. In this dissertation, the application of classification procedures to flight data is investigated. It is proposed to use a Bayesian classifier that tries to identify the flight from which a particular data point came. If the flight from which the data point came is identified with a high level of confidence, then the conclusion can be made that the data point is unusual within the investigated flights. The Bayesian classifier uses the overall and conditional probability density functions together with a priori probabilities to make a decision. Estimating probability density functions is a difficult task in multiple dimensions. Because many of the recorded signals (features) are redundant or highly correlated, or are very similar in every flight, feature selection techniques are applied to identify those signals that contain the most discriminatory power. In the limited amount of data available to this research, twenty-five features were identified as the set exhibiting the best discriminatory power. Additionally, the number of signals is reduced by applying feature generation techniques to similar signals. To make the approach applicable in practice, when many flights are considered, a very efficient and fast sequential data clustering algorithm is proposed. The order in which the samples are presented to the algorithm is fixed according to the probability density function value. Accuracy and reduction level are controlled using two scalar parameters: a distance threshold value and a maximum compactness factor.
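A small sketch of the central idea described above: estimate a probability density per flight, assign each data point to the most probable flight with Bayes' rule, and flag points attributed to a single flight with high confidence as unusual. The kernel density estimator, priors and confidence level are illustrative assumptions, not the dissertation's exact procedure.

```python
# Per-flight densities + priors -> posterior flight membership for each point;
# points confidently tied to one flight are flagged as unusual.
import numpy as np
from sklearn.neighbors import KernelDensity

def flag_unusual(flights, x, confidence=0.99):
    """flights: list of (n_i, d) arrays of recorded samples; x: (m, d) points."""
    kdes = [KernelDensity(bandwidth=0.5).fit(f) for f in flights]
    priors = np.array([len(f) for f in flights], dtype=float)
    priors /= priors.sum()
    log_post = np.column_stack([k.score_samples(x) for k in kdes]) + np.log(priors)
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)        # Bayes' rule, normalised
    return post.max(axis=1) > confidence

rng = np.random.default_rng(0)
flights = [rng.normal(0, 1, size=(500, 3)) for _ in range(5)]
flights[0][:10] += 8.0                             # a few atypical points in flight 0
query = np.vstack([rng.normal(0, 1, (5, 3)), flights[0][:5]])
print(flag_unusual(flights, query))                # typical -> False, atypical -> True
```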
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
12

Cho, Hansang. "Classification of functional brain data for multimedia retrieval /." Thesis, Connect to this title online; UW restricted, 2005. http://hdl.handle.net/1773/5892.

Full text
APA, Harvard, Vancouver, ISO, and other styles
13

van, Schaik Sebastiaan Johannes. "A framework for processing correlated probabilistic data." Thesis, University of Oxford, 2014. http://ora.ox.ac.uk/objects/uuid:91aa418d-536e-472d-9089-39bef5f62e62.

Full text
Abstract:
The amount of digitally-born data has surged in recent years. In many scenarios, this data is inherently uncertain (or: probabilistic), such as data originating from sensor networks, image and voice recognition, location detection, and automated web data extraction. Probabilistic data requires novel and different approaches to data mining and analysis, which explicitly account for the uncertainty and the correlations therein. This thesis introduces ENFrame: a framework for processing and mining correlated probabilistic data. Using this framework, it is possible to express both traditional and novel algorithms for data analysis in a special user language, without having to explicitly address the uncertainty of the data on which the algorithms operate. The framework will subsequently execute the algorithm on the probabilistic input, and perform exact or approximate parallel probability computation. During the probability computation, correlations and provenance are succinctly encoded using probabilistic events. This thesis contains novel contributions in several directions. An expressive user language – a subset of Python – is introduced, which allows a programmer to implement algorithms for probabilistic data without requiring knowledge of the underlying probabilistic model. Furthermore, an event language is presented, which is used for the probabilistic interpretation of the user program. The event language can succinctly encode arbitrary correlations using events, which are the probabilistic counterparts of deterministic user program variables. These highly interconnected events are stored in an event network, a probabilistic interpretation of the original user program. Multiple techniques for exact and approximate probability computation (with error guarantees) of such event networks are presented, as well as techniques for parallel computation. Adaptations of multiple existing data mining algorithms are shown to work in the framework, and are subsequently subjected to an extensive experimental evaluation. Additionally, a use-case is presented in which a probabilistic adaptation of a clustering algorithm is used to predict faults in energy distribution networks. Lastly, this thesis presents techniques for integrating a number of different probabilistic data formalisms for use in this framework and in other applications.
APA, Harvard, Vancouver, ISO, and other styles
14

Phillips, Rhonda D. "A Probabilistic Classification Algorithm With Soft Classification Output." Diss., Virginia Tech, 2009. http://hdl.handle.net/10919/26701.

Full text
Abstract:
This thesis presents a shared memory parallel version of the hybrid classification algorithm IGSCR (iterative guided spectral class rejection), a novel data reduction technique that can be used in conjunction with PIGSCR (parallel IGSCR), a noise removal method based on the maximum noise fraction (MNF), and a continuous version of IGSCR (CIGSCR) that outputs soft classifications. All of the above are either classification algorithms or preprocessing algorithms necessary prior to the classification of high dimensional, noisy images. PIGSCR was developed to produce fast and portable code using Fortran 95, OpenMP, and the Hierarchical Data Format version 5 (HDF5) and accompanying data access library. The feature reduction method introduced in this thesis is based on the singular value decomposition (SVD). This feature reduction technique demonstrated that SVD-based feature reduction can lead to more accurate IGSCR classifications than PCA-based feature reduction. This thesis describes a new algorithm used to adaptively filter a remote sensing dataset based on signal-to-noise ratios (SNRs) once the maximum noise fraction (MNF) has been applied. The adaptive filtering scheme improves image quality as shown by estimated SNRs and classification accuracy improvements greater than 10%. The continuous iterative guided spectral class rejection (CIGSCR) classification method is based on the iterative guided spectral class rejection (IGSCR) classification method for remotely sensed data. Both CIGSCR and IGSCR use semisupervised clustering to locate clusters that are associated with classes in a classification scheme. This type of semisupervised classification method is particularly useful in remote sensing where datasets are large, training data are difficult to acquire, and clustering makes the identification of subclasses adequate for training purposes less difficult. Experimental results indicate that the soft classification output by CIGSCR is reasonably accurate (when compared to IGSCR), and the fundamental algorithmic changes in CIGSCR (from IGSCR) result in CIGSCR being less sensitive to input parameters that influence iterations.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
15

Kidane, Dawit K. "Rule-based land cover classification model : expert system integration of image and non-image spatial data." Thesis, Stellenbosch : Stellenbosch University, 2005. http://hdl.handle.net/10019.1/50445.

Full text
Abstract:
Thesis (MSc)--Stellenbosch University, 2005.
ENGLISH ABSTRACT: Remote sensing and image processing tools provide speedy and up-to-date information on land resources. Although remote sensing is the most effective means of land cover and land use mapping, it is not without limitations. The accuracy of image analysis depends on a number of factors, of which the image classifier used is probably the most significant. It is noted that there is no perfect classifier, but some robust classifiers achieve higher accuracy results than others. For certain land cover/uses, discrimination based only on spectral properties is extremely difficult and often produces poor results. The use of ancillary data can improve the classification process. Some classifiers incorporate ancillary data before or after the classification process, which limits the full utilization of the information contained in the ancillary data. Expert classification, on the other hand, makes better use of ancillary data by incorporating data directly into the classification process. In this study an expert classification model was developed based on spatial operations designed to identify a specific land cover/use, by integrating both spectral and available ancillary data. Ancillary data were derived either from the spectral channels or from other spatial data sources such as a DEM (Digital Elevation Model) and topographical maps. The model was developed in ERDAS Imagine image-processing software, using the expert engineer as a final integrator of the different constituent spatial operations. An attempt was made to identify the Level I land cover classes in the South African National Land Cover classification scheme hierarchy. Rules were determined on the basis of expert knowledge or statistical calculations of mean and variance on training samples. Although rules could be determined by using statistical applications, such as the classification and regression tree (CART), the absence of adequate and accurate training data for all land cover classes and the fact that all land cover classes do not require the same predictor variables make this option less desirable. The result of the accuracy assessment showed that the overall classification accuracy was 84.3% and the kappa statistic 0.829. Although this level of accuracy might be suitable for most applications, the model is flexible enough to be improved further.
APA, Harvard, Vancouver, ISO, and other styles
16

McKay, Cory. "Automatic genre classification of MIDI recordings." Thesis, McGill University, 2004. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=81503.

Full text
Abstract:
A software system that automatically classifies MIDI files into hierarchically organized taxonomies of musical genres is presented. This extensible software includes an easy to use and flexible GUI. An extensive library of high-level musical features is compiled, including many original features. A novel hybrid classification system is used that makes use of hierarchical, flat and round robin classification. Both k-nearest neighbour and neural network-based classifiers are used, and feature selection and weighting are performed using genetic algorithms. A thorough review of previous research in automatic genre classification is presented, along with an overview of automatic feature selection and classification techniques. Also included is a discussion of the theoretical issues relating to musical genre, including but not limited to what mechanisms humans use to classify music by genre and how realistic genre taxonomies can be constructed.
APA, Harvard, Vancouver, ISO, and other styles
17

Rossman, Mark A. "Automated Detection of Hematological Abnormalities through Classification of Flow Cytometric Data Patterns." FIU Digital Commons, 2011. http://digitalcommons.fiu.edu/etd/344.

Full text
Abstract:
Flow Cytometry analyzers have become trusted companions due to their ability to perform fast and accurate analyses of human blood. The aim of these analyses is to determine the possible existence of abnormalities in the blood that have been correlated with serious disease states, such as infectious mononucleosis, leukemia, and various cancers. Though these analyzers provide important feedback, it is always desired to improve the accuracy of the results. This is evidenced by the occurrences of misclassifications reported by some users of these devices. It is advantageous to provide a pattern interpretation framework that is able to provide better classification ability than is currently available. Toward this end, the purpose of this dissertation was to establish a feature extraction and pattern classification framework capable of providing improved accuracy for detecting specific hematological abnormalities in flow cytometric blood data. This involved extracting a unique and powerful set of shift-invariant statistical features from the multi-dimensional flow cytometry data and then using these features as inputs to a pattern classification engine composed of an artificial neural network (ANN). The contribution of this method consisted of developing a descriptor matrix that can be used to reliably assess if a donor’s blood pattern exhibits a clinically abnormal level of variant lymphocytes, which are blood cells that are potentially indicative of disorders such as leukemia and infectious mononucleosis. This study showed that the set of shift-and-rotation-invariant statistical features extracted from the eigensystem of the flow cytometric data pattern performs better than other commonly-used features in this type of disease detection, exhibiting an accuracy of 80.7%, a sensitivity of 72.3%, and a specificity of 89.2%. This performance represents a major improvement for this type of hematological classifier, which has historically been plagued by poor performance, with accuracies as low as 60% in some cases.
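A generic sketch of the classification stage described above: per-donor feature vectors feed an artificial neural network, and performance is reported as accuracy, sensitivity and specificity. The features, labels and network size are synthetic placeholders, not the descriptor matrix developed in the dissertation.

```python
# Train an ANN on feature vectors and report accuracy/sensitivity/specificity.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 12))               # placeholder per-donor feature vectors
y = (X[:, :3].sum(axis=1) + rng.normal(0, 0.5, 600) > 0).astype(int)  # 1 = abnormal

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
ann = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
ann.fit(X_tr, y_tr)

tn, fp, fn, tp = confusion_matrix(y_te, ann.predict(X_te)).ravel()
print("accuracy   ", (tp + tn) / (tp + tn + fp + fn))
print("sensitivity", tp / (tp + fn))
print("specificity", tn / (tn + fp))
```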
APA, Harvard, Vancouver, ISO, and other styles
18

Tristram, Uvedale Roy. "Classification of the difficulty in accelerating problems using GPUs." Thesis, Rhodes University, 2014. http://hdl.handle.net/10962/d1012978.

Full text
Abstract:
Scientists continually require additional processing power, as this enables them to compute larger problem sizes, use more complex models and algorithms, and solve problems previously thought computationally impractical. General-purpose computation on graphics processing units (GPGPU) can help in this regard, as there is great potential in using graphics processors to accelerate many scientific models and algorithms. However, some problems are considerably harder to accelerate than others, and it may be challenging for those new to GPGPU to ascertain the difficulty of accelerating a particular problem or seek appropriate optimisation guidance. Through what was learned in the acceleration of a hydrological uncertainty ensemble model, large numbers of k-difference string comparisons, and a radix sort, problem attributes have been identified that can assist in the evaluation of the difficulty in accelerating a problem using GPUs. The identified attributes are inherent parallelism, branch divergence, problem size, required computational parallelism, memory access pattern regularity, data transfer overhead, and thread cooperation. Using these attributes as difficulty indicators, an initial problem difficulty classification framework has been created that aids in GPU acceleration difficulty evaluation. This framework further facilitates directed guidance on suggested optimisations and required knowledge based on problem classification, which has been demonstrated for the aforementioned accelerated problems. It is anticipated that this framework, or a derivative thereof, will prove to be a useful resource for new or novice GPGPU developers in the evaluation of potential problems for GPU acceleration.
APA, Harvard, Vancouver, ISO, and other styles
19

Harrison, A. "The spatial resolution of remotely sensed data and its effect on classification accuracy." Thesis, University of Reading, 1985. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.353467.

Full text
APA, Harvard, Vancouver, ISO, and other styles
20

Lundgren, Andreas. "Data-Driven Engine Fault Classification and Severity Estimation Using Residuals and Data." Thesis, Linköpings universitet, Fordonssystem, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-165736.

Full text
Abstract:
Recent technological advances in the automotive industry have made vehicular systems increasingly complex in terms of both hardware and software. As the complexity of the systems increases, so does the complexity of efficient monitoring of these systems. With increasing computational power, the field of diagnostics is becoming ever more focused on software solutions for detecting and classifying anomalies in the supervised systems. Model-based methods utilize knowledge about the physical system to devise nominal models of the system to detect deviations, while data-driven methods use historical data to come to conclusions about the present state of the system in question. This study proposes a combined model-based and data-driven diagnostic framework for fault classification, severity estimation and novelty detection. An algorithm is presented which uses a system model to generate a candidate set of residuals for the system. A subset of the residuals is then selected for each fault using L1-regularized logistic regression. The time series training data from the selected residuals is labelled with fault and severity. It is then compressed using a Gaussian parametric representation, and data from different fault modes are modelled using 1-class support vector machines. The classification of data is performed by utilizing the support vector machine description of the data in the residual space, and the fault severity is estimated as a convex optimization problem of minimizing the Kullback-Leibler divergence (KLD) between the new data and training data of different fault modes and severities. The algorithm is tested with data collected from a commercial Volvo car engine in an engine test cell and the results are presented in this report. Initial tests indicate the potential of the KLD for fault severity estimation and that novelty detection performance is closely tied to the residual selection process.
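A sketch of one ingredient of the method described above: the Kullback-Leibler divergence between two Gaussian representations of residual data, used when matching new data against stored fault modes. The closed-form expression for multivariate Gaussians is shown; the convex severity-estimation step itself is not reproduced, and the example data are synthetic.

```python
# Closed-form KL divergence between two multivariate Gaussians.
import numpy as np

def kld_gaussian(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ) for d-dimensional Gaussians."""
    d = mu0.shape[0]
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - d
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

# Example: residuals from a new drive cycle vs. a stored fault-mode model.
rng = np.random.default_rng(0)
new = rng.normal(0.0, 1.0, size=(500, 3))
ref = rng.normal(0.2, 1.1, size=(500, 3))
kld = kld_gaussian(new.mean(0), np.cov(new.T), ref.mean(0), np.cov(ref.T))
print(f"KLD between new data and reference fault mode: {kld:.3f}")
```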
APA, Harvard, Vancouver, ISO, and other styles
21

Robert, Denis J. "Selection and analysis of optimal textural features for accurate classification of monochrome digitized image data /." Online version of thesis, 1989. http://hdl.handle.net/1850/11364.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Phillips, Peter. "A novel pre-processing method for the classification of data by a neural network." Thesis, University of Sussex, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.398348.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Huang, Heng. "Land cover classification from satellite imagery, and its applications in cellular network planning." Diss., Columbia, Mo. : University of Missouri-Columbia, 2005. http://hdl.handle.net/10355/5812.

Full text
Abstract:
Thesis (Ph.D.)--University of Missouri-Columbia, 2005.
The entire dissertation/thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file (which also appears in the research.pdf); a non-technical general description, or public abstract, appears in the public.pdf file. Title from title screen of research.pdf file, viewed on November 15, 2006. Vita. Includes bibliographical references.
APA, Harvard, Vancouver, ISO, and other styles
24

Palanisamy, Senthil Kumar. "Association rule based classification." Link to electronic thesis, 2006. http://www.wpi.edu/Pubs/ETD/Available/etd-050306-131517/.

Full text
Abstract:
Thesis (M.S.)--Worcester Polytechnic Institute.
Keywords: Itemset Pruning, Association Rules, Adaptive Minimal Support, Associative Classification, Classification. Includes bibliographical references (p.70-74).
APA, Harvard, Vancouver, ISO, and other styles
25

Nguyen, David P. "Classification of multisite electrode recordings via variable dimension Gaussian mixtures." Thesis, Georgia Institute of Technology, 2001. http://hdl.handle.net/1853/13929.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Lembke, Benjamin. "Bearing Diagnosis Using Fault Signal Enhancing Teqniques and Data-driven Classification." Thesis, Linköpings universitet, Fordonssystem, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-158240.

Full text
Abstract:
Rolling element bearings are a vital part of much rotating machinery, including vehicles. A defective bearing can be a symptom of other problems in the machinery and has a high failure rate. Early detection of bearing defects can therefore help to prevent malfunctions which ultimately could lead to a total collapse. The thesis was done in collaboration with Scania, which wants a better understanding of how external sensors such as accelerometers can be used for condition monitoring in their gearboxes. Defective bearings create vibrations with specific frequencies, known as Bearing Characteristic Frequencies, BCF [23]. A key component in the proposed method is the identification and extraction of these frequencies from vibration signals from accelerometers mounted near the monitored bearing. Three solutions are proposed for automatic bearing fault detection. Two are based on data-driven classification using a set of machine learning methods called Support Vector Machines, and one uses only the computed characteristic frequencies from the considered bearing faults. Two types of features are developed as inputs to the data-driven classifiers. One is based on the extracted amplitudes of the BCF and the other on statistical properties from Intrinsic Mode Functions generated by an improved Empirical Mode Decomposition algorithm. In order to enhance the diagnostic information in the vibration signals, two pre-processing steps are proposed. Separation of the bearing signal from masking noise is done with the Cepstral Editing Procedure, which removes discrete frequencies from the raw vibration signal. Enhancement of the bearing signal is achieved by band-pass filtering and amplitude demodulation. The frequency band is produced by the band selection algorithms Kurtogram and Autogram. The proposed methods are evaluated on two large public data sets considering bearing fault classification using accelerometer data, and on a smaller data set collected from a Scania gearbox. The produced features achieved significant separation on the public and collected data. Manual detection of the induced defect on the outer race of the bearing from the gearbox was achieved. Due to the small amount of training data, the automatic solutions were only tested on the public data sets. Isolation performance of the correct bearing and fault mode among multiple bearings was investigated. One of the best trade-offs achieved was a 76.39 % fault detection rate with an 8.33 % false alarm rate. Another was a 54.86 % fault detection rate with a 0 % false alarm rate.
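A sketch of the signal-enhancement chain described above: band-pass filtering followed by amplitude demodulation via the Hilbert envelope, after which the envelope spectrum is inspected near the bearing characteristic frequency. The frequency band, BCF value and simulated vibration signal are illustrative assumptions; Kurtogram/Autogram band selection is not reproduced.

```python
# Band-pass filter -> Hilbert envelope -> envelope spectrum around the BCF.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 20_000                                    # sampling rate [Hz]
t = np.arange(0, 1.0, 1 / fs)
bcf = 107.0                                    # assumed outer-race fault frequency [Hz]
# Simulated vibration: periodic impacts exciting a 3 kHz resonance, plus noise.
impacts = (np.sin(2 * np.pi * bcf * t) > 0.999).astype(float)
vib = np.convolve(impacts, np.sin(2 * np.pi * 3000 * t[:200]) *
                  np.exp(-t[:200] * 2000), mode="same")
vib += 0.2 * np.random.default_rng(0).normal(size=t.size)

b, a = butter(4, [2000, 4000], btype="bandpass", fs=fs)    # assumed selected band
envelope = np.abs(hilbert(filtfilt(b, a, vib)))            # amplitude demodulation
spectrum = np.abs(np.fft.rfft(envelope - envelope.mean()))
freqs = np.fft.rfftfreq(envelope.size, 1 / fs)
band = (freqs > 20) & (freqs < 1000)
peak = freqs[band][spectrum[band].argmax()]
print(f"dominant envelope frequency: {peak:.1f} Hz (assumed BCF = {bcf} Hz)")
```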
APA, Harvard, Vancouver, ISO, and other styles
27

Gendron, Marlin. "Algorithms and Data Structures for Automated Change Detection and Classification of Sidescan Sonar Imagery." ScholarWorks@UNO, 2004. http://scholarworks.uno.edu/td/210.

Full text
Abstract:
During Mine Warfare (MIW) operations, MIW analysts perform change detection by visually comparing historical sidescan sonar imagery (SSI) collected by a sidescan sonar with recently collected SSI in an attempt to identify objects (which might be explosive mines) placed at sea since the last time the area was surveyed. This dissertation presents a data structure and three algorithms, developed by the author, that are part of an automated change detection and classification (ACDC) system. MIW analysts at the Naval Oceanographic Office, to reduce the amount of time to perform change detection, are currently using ACDC. The dissertation introductory chapter gives background information on change detection, ACDC, and describes how SSI is produced from raw sonar data. Chapter 2 presents the author's Geospatial Bitmap (GB) data structure, which is capable of storing information geographically and is utilized by the three algorithms. This chapter shows that a GB data structure used in a polygon-smoothing algorithm ran between 1.3 – 48.4x faster than a sparse matrix data structure. Chapter 3 describes the GB clustering algorithm, which is the author's repeatable, order-independent method for clustering. Results from tests performed in this chapter show that the time to cluster a set of points is not affected by the distribution or the order of the points. In Chapter 4, the author presents his real-time computer-aided detection (CAD) algorithm that automatically detects mine-like objects on the seafloor in SSI. The author ran his GB-based CAD algorithm on real SSI data, and results of these tests indicate that his real-time CAD algorithm performs comparably to or better than other non-real-time CAD algorithms. The author presents his computer-aided search (CAS) algorithm in Chapter 5. CAS helps MIW analysts locate mine-like features that are geospatially close to previously detected features. A comparison between the CAS and a great circle distance algorithm shows that the CAS performs geospatial searching 1.75x faster on large data sets. Finally, the concluding chapter of this dissertation gives important details on how the completed ACDC system will function, and discusses the author's future research to develop additional algorithms and data structures for ACDC.
APA, Harvard, Vancouver, ISO, and other styles
28

Majd, Farjam. "Two new parallel processors for real time classification of 3-D moving objects and quad tree generation." PDXScholar, 1985. https://pdxscholar.library.pdx.edu/open_access_etds/3421.

Full text
Abstract:
Two related image processing problems are addressed in this thesis. First, the problem of identification of 3-D objects in real time is explored. An algorithm to solve this problem and a hardware system for parallel implementation of this algorithm are proposed. The classification scheme is based on the "Invariant Numerical Shape Modeling" (INSM) algorithm originally developed for 2-D pattern recognition such as alphanumeric characters. This algorithm is then extended to 3-D and is used for general 3-D object identification. The hardware system is an SIMD parallel processor, designed in bit slice fashion for expandability. It consists of a library of images coded according to the 3-D INSM algorithm and the SIMD classifier which compares the code of the unknown image to the library codes in a single clock pulse to establish its identity. The output of this system consists of three signals: U, for unique identification; M, for multiple identification; and N, for non-identification of the object. Second, the problem of real time image compaction is addressed. The quad tree data structure is described. Based on this structure, a parallel processor with a tree architecture is developed which is independent of the data entry process, i.e., data may be entered pixel by pixel or all at once. The hardware consists of a tree processor containing a tree generator and three separate memory arrays, a data transfer processor, and a main memory unit. The tree generator generates the quad tree of the input image in tabular form, using the memory arrays in the tree processor for storage of the table. This table can hold one picture frame at a given time. Hence, for processing multiple picture frames the data transfer processor is used to transfer their respective quad trees from the tree processor memory to the main memory. An algorithm is developed to facilitate the determination of the connections in the circuit.
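A small recursive software sketch of quad tree generation for a square binary image, the structure whose table-based, hardware-parallel construction is described above. The example image is illustrative.

```python
# Recursive quad tree: a uniform block becomes a leaf, otherwise split into
# four quadrants (NW, NE, SW, SE) and recurse.
import numpy as np

def quad_tree(img):
    """Return 0/1 for a uniform block, else a list of 4 child subtrees."""
    if img.min() == img.max():                 # uniform block -> leaf node
        return int(img[0, 0])
    h, w = img.shape[0] // 2, img.shape[1] // 2
    return [quad_tree(img[:h, :w]), quad_tree(img[:h, w:]),
            quad_tree(img[h:, :w]), quad_tree(img[h:, w:])]

image = np.zeros((8, 8), dtype=int)
image[4:, 4:] = 1                              # one quadrant set -> compact tree
print(quad_tree(image))                        # -> [0, 0, 0, 1]
```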
APA, Harvard, Vancouver, ISO, and other styles
29

Petersson, Henrik. "Multivariate Exploration and Processing of Sensor Data-applications with multidimensional sensor systems." Doctoral thesis, Linköpings universitet, Tillämpad Fysik, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-14879.

Full text
Abstract:
A sensor is a device that transforms a physical, chemical, or biological stimulus into a readable signal. Sensors form an integral part of modern technology, and many are trying to take the development of sensor technology further. Sensor systems are becoming more and more complex and may contain a wide range of different sensors, where each may deliver a multitude of signals. Although the data generated by modern sensor systems contain lots of information, the information may not be clearly visible. Appropriate handling of data becomes crucial to reveal what is sought, but unfortunately that process is not always straightforward and there are many aspects to consider. Therefore, analysis of multidimensional sensor data has become a science. The topic of this thesis is signal processing of multidimensional sensor data. Surveys are given of methods to explore data and to use the data to quantify or classify samples. It is also discussed how to avoid the rise of artifacts and how to compensate for sensor deficiencies. Special interest is put on methods that are practically applicable to chemical gas sensors. The merits and limitations of chemical sensors are discussed and it is argued that multivariate data analysis plays an important role when using such sensors. The contribution made by this thesis is primarily on techniques dealing with difficulties related to the operation of sensors in applications. In the second paper, a method is suggested that aims at suppressing the negative effects caused by unwanted sensor-to-sensor differences. If such differences are not suppressed sufficiently, systems where sensors occasionally must be replaced may degrade and lose performance. The strong point of the suggested method is its relative ease of use considering large-scale production of sensor components and the integration of sensors into mass-market products. The third paper presents a method that facilitates and speeds up the process of assembling an array of sensors that is optimal for a particular application. The method combines multivariate data analysis with the Scanning Light Pulse Technique. In the first and fourth papers, the problem of source separation is studied. In two separate applications, one using gas sensors for combustion control and one using acoustic sensors for ground surveillance, it has been identified that the current sensors output mixtures of both interesting and interfering signals. By different means, the two papers apply and evaluate methods to extract the relevant information under such circumstances.
APA, Harvard, Vancouver, ISO, and other styles
30

Cheriyadat, Anil Meerasa. "Limitations of principal component analysis for dimensionality-reduction for classification of hyperspectral data." Master's thesis, Mississippi State : Mississippi State University, 2003. http://library.msstate.edu/etd/show.asp?etd=etd-11072003-133109.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Alvarado, Mantecon Jesus Gerardo. "Towards the Automatic Classification of Student Answers to Open-ended Questions." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39093.

Full text
Abstract:
One of the main research challenges nowadays in the context of Massive Open Online Courses (MOOCs) is the automation of the evaluation process of text-based assessments effectively. Text-based assessments, such as essay writing, have been proved to be better indicators of higher level of understanding than machine-scored assessments (E.g. Multiple Choice Questions). Nonetheless, due to the rapid growth of MOOCs, text-based evaluation has become a difficult task for human markers, creating the need of automated systems for grading. In this thesis, we focus on the automated short answer grading task (ASAG), which automatically assesses natural language answers to open-ended questions into correct and incorrect classes. We propose an ensemble supervised machine learning approach that relies on two types of classifiers: a response-based classifier, which centers around feature extraction from available responses, and a reference-based classifier which considers the relationships between responses, model answers and questions. For each classifier, we explored a set of features based on words and entities. For the response-based classifier, we tested and compared 5 features: traditional n-gram models, entity URIs (Uniform Resource Identifier) and entity mentions both extracted using a semantic annotation API, entity mention embeddings based on GloVe and entity URI embeddings extracted from Wikipedia. For the reference-based classifier, we explored fourteen features: cosine similarity between sentence embeddings from student answers and model answers, number of overlapping elements (words, entity URI, entity mention) between student answers and model answers or question text, Jaccard similarity coefficient between student answers and model answers or question text (based on words, entity URI or entity mentions) and a sentence embedding representation. We evaluated our classifiers on three datasets, two of which belong to the SemEval ASAG competition (Dzikovska et al., 2013). Our results show that, in general, reference-based features perform much better than response-based features in terms of accuracy and macro average f1-score. Within the reference-based approach, we observe that the use of S6 embedding representation, which considers question text, student and model answer, generated the best performing models. Nonetheless, their combination with other similarity features helped build more accurate classifiers. As for response-based classifiers, models based on traditional n-gram features remained the best models. Finally, we combined our best reference-based and response-based classifiers using an ensemble learning model. Our ensemble classifiers combining both approaches achieved the best results for one of the evaluation datasets, but underperformed on the remaining two. We also compared the best two classifiers with some of the main state-of-the-art results on the SemEval competition. Our final embedded meta-classifier outperformed the top-ranking result on the SemEval Beetle dataset and our top classifier on SemEval SciEntBank, trained on reference-based features, obtained the 2nd position. In conclusion, the reference-based approach, powered mainly by sentence level embeddings and other similarity features, proved to generate the most efficient models in two out of three datasets and the ensemble model was the best on the SemEval Beetle dataset.
APA, Harvard, Vancouver, ISO, and other styles
32

Li, Yelei. "Heartbeat detection, classification and coupling analysis using Electrocardiography data." Case Western Reserve University School of Graduate Studies / OhioLINK, 2014. http://rave.ohiolink.edu/etdc/view?acc_num=case1405084050.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Bhonsle, Dhruvjit Vilas. "Development of an Automation Test Setup for Navigation Data Processing." Master's thesis, Universitätsbibliothek Chemnitz, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:ch1-qucosa-199331.

Full text
Abstract:
With the development of Advanced Driver Assistance Systems (ADAS), vehicles have gained improved safety, a better driving experience and enhanced vehicle systems. Today these systems are among the fastest growing in the automotive domain. Physical parameters such as map data, vehicle position and speed are crucial for the functionalities implemented for ADAS. All navigation map databases are stored in proprietary formats, so an appropriate interface has to be defined for ADAS applications to access this data. This is the main aim of the Advanced Driver Assistance Systems Interface Specifications (ADASIS) consortium. This specification allows a coordinated, cross-industry effort to improve comfort and fuel efficiency. The research in this master thesis focuses on two stages of the ADASIS test environment developed in our company, namely the XML Comparator and the CAN stream generation stage. In this test environment, our company's ADASIS Reconstructor is tested against the parameters of the Reference Reconstructor provided by the ADASIS consortium, with the aim of developing a Reconstructor that adheres to all the ADASIS specifications. Prior to this work, the XML Comparison and CAN Stream Generation stages lacked in-depth research and usability features.
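As a rough illustration of the XML comparison stage mentioned above, the sketch below recursively compares two parsed XML documents using Python's standard library; the element names and the equality criteria are assumptions for the example, not details taken from the thesis.

import xml.etree.ElementTree as ET

def elements_equal(a, b):
    # Compare tag, text, attributes and children recursively.
    if a.tag != b.tag or (a.text or "").strip() != (b.text or "").strip():
        return False
    if a.attrib != b.attrib or len(a) != len(b):
        return False
    return all(elements_equal(ca, cb) for ca, cb in zip(a, b))

# Hypothetical outputs from the reference and the tested Reconstructor.
ref = ET.fromstring("<route><segment id='1' speed='50'/></route>")
out = ET.fromstring("<route><segment id='1' speed='50'/></route>")
print("XML outputs match:", elements_equal(ref, out))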
APA, Harvard, Vancouver, ISO, and other styles
34

Kim, Dae Wook. "Data-Driven Network-Centric Threat Assessment." Wright State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=wright1495191891086814.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Eklund, Martin. "Comparing Feature Extraction Methods and Effects of Pre-Processing Methods for Multi-Label Classification of Textual Data." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-231438.

Full text
Abstract:
This thesis aims to investigate how different feature extraction methods applied to textual data affect the results of multi-label classification. Two different Bag of Words extraction methods are used, specifically the Count Vector and the TF-IDF approaches. A word embedding method is also investigated, called the GloVe extraction method. Multi-label classification can be useful for categorizing items, such as pieces of music or news articles, that may belong to multiple classes or topics. The effect of using different pre-processing methods is also investigated, such as the use of N-grams, stop-word elimination, and stemming. Two different classifiers, an SVM and an ANN, are used for multi-label classification using a Binary Relevance approach. The results indicate that the choice of extraction method has a meaningful impact on the resulting classifications, but that no one method consistently outperforms the others. Instead the results show that the GloVe extraction method performs the best for the recall metrics, while the Bag of Words methods perform the best for the precision metrics.
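As a rough sketch of the pipeline described above, the snippet below combines a TF-IDF extractor with a Binary Relevance (one-vs-rest) linear SVM in scikit-learn; the toy documents and labels are invented, and the exact pre-processing choices (N-grams, stop words, stemming) studied in the thesis are not reproduced.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Toy multi-label corpus: each document may belong to several topics.
docs = ["stock markets fell sharply", "the striker scored twice", "markets react to the match result"]
labels = [{"finance"}, {"sports"}, {"finance", "sports"}]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(labels)

vec = TfidfVectorizer(ngram_range=(1, 2), stop_words="english")
X = vec.fit_transform(docs)

clf = OneVsRestClassifier(LinearSVC())  # Binary Relevance: one binary SVM per label
clf.fit(X, Y)
print(mlb.inverse_transform(clf.predict(vec.transform(["the market fell"]))))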
APA, Harvard, Vancouver, ISO, and other styles
36

Whitbread, P. J. "Multi-spectral texture : improving classification of multi-spectral images by the integration of spatial information /." Title page, abstract and contents only, 1992. http://web4.library.adelaide.edu.au/theses/09PH/09phw5792.pdf.

Full text
Abstract:
Thesis (Ph. D.)--University of Adelaide, Dept. of Electrical and Electronic Engineering, 1994?
One computer disk in pocket inside back cover. System requirements for accompanying computer disk: Macintosh computer. Includes bibliographical references (leaves 148-160).
APA, Harvard, Vancouver, ISO, and other styles
37

Fiebrink, Rebecca. "An exploration of feature selection as a tool for optimizing musical genre classification /." Thesis, McGill University, 2006. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=99372.

Full text
Abstract:
The computer classification of musical audio can form the basis for systems that allow new ways of interacting with digital music collections. Existing music classification systems suffer, however, from inaccuracy as well as poor scalability. Feature selection is a machine-learning tool that can potentially improve both accuracy and scalability of classification. Unfortunately, there is no consensus on which feature selection algorithms are most appropriate or on how to evaluate the effectiveness of feature selection. Based on relevant literature in music information retrieval (MIR) and machine learning and on empirical testing, the thesis specifies an appropriate evaluation method for feature selection, employs this method to compare existing feature selection algorithms, and evaluates an appropriate feature selection algorithm on the problem of musical genre classification. The outcomes include an increased understanding of the potential for feature selection to benefit MIR and a new technique for optimizing one type of classification-based system.
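As a minimal sketch of how feature selection can precede genre classification (the specific algorithms compared in the thesis are not reproduced here), the snippet below ranks synthetic audio features with a univariate ANOVA criterion and keeps only the top-scoring ones before training a classifier; the feature matrix and genre labels are randomly generated for illustration.

import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))          # 200 tracks, 50 extracted audio features
y = rng.integers(0, 4, size=200)        # 4 hypothetical genre labels
X[:, 0] += y                            # make one feature genuinely informative

# Keep the 10 features with the highest ANOVA F-score, then classify.
model = make_pipeline(SelectKBest(f_classif, k=10), KNeighborsClassifier(n_neighbors=5))
model.fit(X, y)
print("training accuracy:", model.score(X, y))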
APA, Harvard, Vancouver, ISO, and other styles
38

Varde, Aparna S. "Graphical data mining for computational estimation in materials science applications." Link to electronic thesis, 2006. http://www.wpi.edu/Pubs/ETD/Available/etd-081506-152633/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Maime, Ratakane Baptista. "CHALLENGES AND OPPORTUNITIES OF ADOPTING MANAGEMENT INFORMATION SYSTEMS (MIS) FOR PASSPORT PROCESSING: COMPARATIVE STUDY BETWEEN LESOTHO AND SOUTH AFRICA." Thesis, Central University of Technology, Free State. Business Administration, 2014. http://hdl.handle.net/11462/237.

Full text
Abstract:
Thesis ( M. Tech. (Business Administration )) - Central University of Technology, Free State, 2014
Fast and secure public service delivery is not only a necessity, but a compulsory endeavour. However, it is close to impossible to achieve such objectives without the use of Information Technology (IT). It is equally important to find proper frameworks for sustaining that technology. Organisations do not only need technology for efficient public service; constant upgrading of systems and cautious migration to the newest IT developments are equally indispensable in today’s dynamic technological world. Conversely, countries in Africa are always lagging behind in technological progress. Such deficiencies have been identified in the passport processing of Lesotho and South Africa, where, to unequal extents, problems related to passport production systems have contributed to delays and have become fertile ground for corrupt practices. The study seeks to identify the main impediments in the adoption of Management Information Systems (MIS) for passport processing. Furthermore, the study explores the impact MIS might have in attempting to combat long queues and to avoid long waiting periods – from application to issuance of passports to citizens. The reasonable time frame between passport application and issuance, and specific passport management systems, have been extensively discussed along with various strategies that have been adopted by some of the world’s first movers in modern passport management technologies. In all cases and stages of this research, Lesotho and South Africa are compared. The research approach of the study was descriptive and explorative in nature. As a quantitative design, a structured questionnaire was used to solicit responses in Lesotho and South Africa. It was established that both Lesotho and South Africa have somewhat similar problems – although, to a greater extent, Lesotho needs much more urgent attention. Although the processes of South Africa need to be improved, the Republic releases a passport much faster and more efficiently than Lesotho. Economic issues are also revealed by the study as unavoidable factors that always affect technological developments in Africa. The study reveals that the latest MIS for passport processing has facilitated modern, automated border-control systems and resultant e-passports that incorporate more of citizens’ biometric information into passports, thanks to modern RFID technologies. One can anticipate that this study will provide simple, affordable and secure IT solutions for passport processing. Key words: Information Technology (IT); Management Information Systems (MIS); E-Government; E-Passport; Biometrics; and RFID.
APA, Harvard, Vancouver, ISO, and other styles
40

Sanden, Christopher, and University of Lethbridge Faculty of Arts and Science. "An empirical evaluation of computational and perceptual multi-label genre classification on music / Christopher Sanden." Thesis, Lethbridge, Alta. : University of Lethbridge, Dept. of Mathematics and Computer Science, c2010, 2010. http://hdl.handle.net/10133/2602.

Full text
Abstract:
Automatic music genre classification is a high-level task in the field of Music Information Retrieval (MIR). It refers to the process of automatically assigning genre labels to music for various tasks, including, but not limited to, categorization, organization and browsing. This is a topic which has seen an increase in interest recently as one of the cornerstones of MIR. However, due to the subjective and ambiguous nature of music, traditional single-label classification is inadequate. In this thesis, we study multi-label music genre classification from perceptual and computational perspectives. First, we design a set of perceptual experiments to investigate the genre-labelling behavior of individuals. The results from these experiments lead us to speculate that multi-label classification is more appropriate for classifying music genres. Second, we design a set of computational experiments to evaluate multi-label classification algorithms on music. These experiments not only support our speculation but also reveal which algorithms are more suitable for music genre classification. Finally, we propose and examine a group of ensemble approaches for combining multi-label classification algorithms to further improve classification performance.
viii, 87 leaves ; 29 cm
APA, Harvard, Vancouver, ISO, and other styles
41

Randrianarivony, Maharavo. "Geometric processing of CAD data and meshes as input of integral equation solvers." Doctoral thesis, [S.l. : s.n.], 2006. http://nbn-resolving.de/urn:nbn:de:swb:ch1-200601972.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Vila, Duran Marius. "Information theory techniques for multimedia data classification and retrieval." Doctoral thesis, Universitat de Girona, 2015. http://hdl.handle.net/10803/302664.

Full text
Abstract:
We are in the information age where most data is stored in digital format. Thus, the management of digital documents and videos requires the development of efficient techniques for automatic analysis. Among them, capturing the similarity or dissimilarity between different document images or video frames is extremely important. In this thesis, we first analyze, for several image resolutions, the behavior of three different families of image-based similarity measures applied to invoice classification. In these three sets of measures, the computation of the similarity between two images is based, respectively, on intensity differences, mutual information, and normalized compression distance. As the best results are obtained with mutual information-based measures, we proceed to investigate the application of three different Tsallis-based generalizations of mutual information for different entropic indexes. These three generalizations derive respectively from the Kullback-Leibler distance, the difference between entropy and conditional entropy, and the Jensen-Shannon divergence. In relation to digital video processing, we propose two different information-theoretic approaches based, respectively, on Tsallis mutual information and Jensen-Tsallis divergence to detect the abrupt shot boundaries of a video sequence and to select the most representative keyframe of each shot. Finally, Shannon entropy has been commonly used to quantify image informativeness. The main drawback of this measure is that it does not take into account the spatial distribution of pixels. In this thesis, we analyze four information-theoretic measures that overcome this limitation. Three of them (entropy rate, excess entropy, and erasure entropy) consider the image as a stationary stochastic process, while the fourth (partitional information) is based on an information channel between image regions and histogram bins.
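To make the mutual-information-based similarity concrete, the following sketch estimates the Shannon mutual information between two grayscale images from their joint histogram; the bin count and the toy images are assumptions, and the Tsallis generalizations studied in the thesis are not shown.

import numpy as np

def mutual_information(img_a, img_b, bins=32):
    # Estimate I(A;B) from the joint grey-level histogram of two images.
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

# Toy example: an image compared with a noisy copy of itself.
rng = np.random.default_rng(1)
a = rng.integers(0, 256, size=(64, 64)).astype(float)
b = a + rng.normal(0, 10, size=a.shape)
print("MI(a, a) =", mutual_information(a, a), " MI(a, b) =", mutual_information(a, b))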
APA, Harvard, Vancouver, ISO, and other styles
43

Dannenberg, Matthew. "Pattern Recognition in High-Dimensional Data." Scholarship @ Claremont, 2016. https://scholarship.claremont.edu/hmc_theses/76.

Full text
Abstract:
Vast amounts of data are produced all the time. Yet this data does not easily equate to useful information: extracting information from large amounts of high dimensional data is nontrivial. People are simply drowning in data. A recent and growing source of high-dimensional data is hyperspectral imaging. Hyperspectral images allow for massive amounts of spectral information to be contained in a single image. In this thesis, a robust supervised machine learning algorithm is developed to efficiently perform binary object classification on hyperspectral image data by making use of the geometry of Grassmann manifolds. This algorithm can consistently distinguish between a large range of even very similar materials, returning very accurate classification results with very little training data. When distinguishing between dissimilar locations like crop fields and forests, this algorithm consistently classifies more than 95 percent of points correctly. On more similar materials, more than 80 percent of points are classified correctly. This algorithm will allow for very accurate information to be extracted from these large and complicated hyperspectral images.
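A minimal sketch of the Grassmannian geometry involved: each class of hyperspectral spectra is represented by an orthonormal basis of a low-dimensional subspace, and a new set of spectra is assigned to the class whose subspace is closest in terms of principal angles. The data is synthetic and the subspace dimension is an assumed parameter, not the setting used in the thesis.

import numpy as np

def subspace_basis(points, dim=3):
    # Orthonormal basis (via SVD) of the subspace spanned by a set of spectra.
    u, _, _ = np.linalg.svd(points.T, full_matrices=False)
    return u[:, :dim]

def grassmann_distance(qa, qb):
    # Principal angles between two subspaces; distance = norm of the angle vector.
    sv = np.clip(np.linalg.svd(qa.T @ qb, compute_uv=False), -1.0, 1.0)
    return float(np.linalg.norm(np.arccos(sv)))

rng = np.random.default_rng(2)
bands = 50
class_a = rng.normal(size=(40, bands))            # 40 spectra of material A
class_b = rng.normal(size=(40, bands)) + 2.0      # 40 spectra of material B
test = rng.normal(size=(40, bands)) + 2.0         # unlabeled patch, actually material B

dists = {name: grassmann_distance(subspace_basis(test), subspace_basis(pts))
         for name, pts in {"A": class_a, "B": class_b}.items()}
print("predicted class:", min(dists, key=dists.get), dists)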
APA, Harvard, Vancouver, ISO, and other styles
44

Ali, Khan Syed Irteza. "Classification using residual vector quantization." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/50300.

Full text
Abstract:
Residual vector quantization (RVQ) is a 1-nearest neighbor (1-NN) type of technique. RVQ is a multi-stage implementation of regular vector quantization: an input is successively quantized to the nearest codevector in each stage codebook. In classification, nearest neighbor techniques are very attractive since they model the ideal Bayes class boundaries very accurately. However, nearest neighbor classification techniques require a large, representative dataset. Since a test input is assigned a class membership only after an exhaustive search of the entire training set, a reasonably large training set can make the implementation cost of a nearest neighbor classifier prohibitively high. Although the k-d tree structure offers a far more efficient implementation of 1-NN search, the cost of storing the data points can become prohibitive, especially in higher dimensionality. RVQ offers an attractive solution for a cost-effective implementation of 1-NN-based classification. Because of the direct-sum structure of the RVQ codebook, the memory and computational cost of a 1-NN-based system is greatly reduced. Although the multi-stage implementation of the RVQ codebook compromises the accuracy of the class boundaries compared to an equivalent 1-NN system, the classification error has been empirically shown to be within 3% to 4% of the performance of an equivalent 1-NN-based classifier.
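A small sketch of the multi-stage quantization described above, using randomly generated stage codebooks: each stage quantizes the residual left by the previous stage, and the reconstruction is the direct sum of the selected codevectors. The codebook sizes and input data are assumptions for illustration only.

import numpy as np

def rvq_encode(x, codebooks):
    # Successively quantize the residual against each stage codebook.
    indices, residual = [], x.astype(float)
    for cb in codebooks:
        i = int(np.argmin(np.linalg.norm(cb - residual, axis=1)))
        indices.append(i)
        residual = residual - cb[i]
    return indices

def rvq_decode(indices, codebooks):
    # Direct-sum reconstruction: add the chosen codevector from every stage.
    return sum(cb[i] for cb, i in zip(codebooks, indices))

rng = np.random.default_rng(3)
dim, stages, codewords = 8, 3, 16
codebooks = [rng.normal(size=(codewords, dim)) for _ in range(stages)]
x = rng.normal(size=dim)
idx = rvq_encode(x, codebooks)
print("stage indices:", idx, " reconstruction error:", np.linalg.norm(x - rvq_decode(idx, codebooks)))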
APA, Harvard, Vancouver, ISO, and other styles
45

Maguluri, Naga Sai Nikhil. "Multi-Class Classification of Textual Data: Detection and Mitigation of Cheating in Massively Multiplayer Online Role Playing Games." Wright State University / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=wright1494248022049882.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Pienaar, Harrison Hursiney. "Towards a classification system of significant water resources with a case study of the Thukela river." Thesis, University of the Western Cape, 2005. http://etd.uwc.ac.za/index.php?module=etd&amp.

Full text
Abstract:
The increasing demand for water to provide for South Africa's growing population has resulted in increasing pressure being placed on the country's limited water resources. Water developments, however, cannot be undertaken without considering the water resource base and the key policy frameworks that govern its use and protection. The Department of Water Affairs and Forestry, as the custodian of water resources in the country, initiated the implementation of the National Water Act during 1999. It therefore has the mandate to ensure that the protection, use, development, conservation, management and control of water resources be achieved in an equitable, efficient and sustainable manner, to the benefit of society at large. The National Water Act prescribes that the Minister of the Department of Water Affairs and Forestry develop a system for the classification of all significant water resources to ensure their protection and sustainable utilisation. The classification system is to be used to determine the class and resource quality objectives of all significant water resources. In the absence of a formal classification system, a framework was developed through this research study to guide both the development of a classification system and its implementation, hence ensuring an overarching structure within which integrated water resource management can be achieved. The main goal of this framework was to seek an appropriate balance between protecting significant water resources and at the same time promoting water resource utilisation in support of socio-economic development. This framework was executed in the preliminary determination of the Reserve for the Thukela River catchment to ensure that informed and calculated decision-making processes are followed once significant water resources are classified.
APA, Harvard, Vancouver, ISO, and other styles
47

Kim, Kye Hyun 1956. "Classification of environmental hydrologic behaviors in Northeastern United States." Thesis, The University of Arizona, 1989. http://hdl.handle.net/10150/277083.

Full text
Abstract:
Environmental response to acidic deposition occurs through water movement in the ecosystem. As part of environmental studies of acidic deposition, an output-based hydrologic classification was performed on basin hydrologies according to the distribution of baseflow, snowmelt and direct runoff sources. Because of differences in flow paths and exposure duration, these components were assumed to represent distinct geochemical responses. As a first step, user-friendly software was developed to calculate baseflow by separating annual hydrographs; it also generates the hydrograph for visual analysis using a trial separation slope. After the software was completed, about 1200 streamflow gauging stations in the northeastern United States were processed for flow separation and other hydrologic characteristics. In the final stage, cluster analysis was performed on the output of the streamflow analysis to classify streamflow behaviors with respect to acidic inflow. Based on regional baseflow properties, the resulting clusters define more efficient regional boundaries for the subregions than the current boundaries used by the U.S. Environmental Protection Agency (U.S.E.P.A.) for environmental management of acidic deposition.
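The thesis separates baseflow from annual hydrographs using a trial separation slope; as a rough illustration of the general idea, the sketch below applies a one-parameter recursive digital filter (a Lyne-Hollick style filter, a different but widely used separation technique) to a synthetic streamflow series.

import numpy as np

def baseflow_filter(q, alpha=0.925):
    # One-pass recursive filter: split streamflow into quickflow and baseflow.
    quick = np.zeros_like(q, dtype=float)
    for t in range(1, len(q)):
        quick[t] = alpha * quick[t - 1] + 0.5 * (1 + alpha) * (q[t] - q[t - 1])
        quick[t] = min(max(quick[t], 0.0), q[t])   # keep quickflow within [0, streamflow]
    return q - quick                                # baseflow = total flow minus quickflow

days = np.arange(365)
streamflow = 5 + 3 * np.sin(2 * np.pi * days / 365) + np.maximum(0, np.random.default_rng(4).normal(0, 2, 365))
baseflow = baseflow_filter(streamflow)
print("baseflow index:", baseflow.sum() / streamflow.sum())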
APA, Harvard, Vancouver, ISO, and other styles
48

Colak, Tufan, and Rami S. R. Qahwaji. "Automated McIntosh-Based Classification of Sunspot Groups Using MDI Images." Springer, 2007. http://hdl.handle.net/10454/4091.

Full text
Abstract:
This paper presents a hybrid system for automatic detection and McIntosh-based classification of sunspot groups on SOHO/MDI white-light images using active-region data extracted from SOHO/MDI magnetogram images. After sunspots are detected from MDI white-light images they are grouped/clustered using MDI magnetogram images. By integrating image-processing and neural network techniques, detected sunspot regions are classified automatically according to the McIntosh classification system. Our results show that the automated grouping and classification of sunspots is possible with a high success rate when compared to the existing manually created catalogues. In addition, our system can detect and classify sunspot groups in their early stages, which are usually missed by human observers.
EPSRC
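As a simplified illustration of the detection step only (not the authors' actual pipeline, which also uses magnetogram data and neural-network classification), the sketch below thresholds a synthetic white-light image to find dark sunspot candidates and groups connected pixels into regions.

import numpy as np
from scipy import ndimage

rng = np.random.default_rng(5)
disk = np.full((100, 100), 200.0) + rng.normal(0, 3, (100, 100))  # bright solar disk
disk[40:45, 60:66] = 80.0                                          # dark synthetic "sunspot"

# Pixels far darker than the quiet disk are sunspot candidates.
threshold = disk.mean() - 5 * disk.std()
candidates = disk < threshold

labels, n_regions = ndimage.label(candidates)      # group touching pixels into spots
sizes = ndimage.sum(candidates, labels, range(1, n_regions + 1))
print("detected sunspot regions:", n_regions, "sizes:", sizes)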
APA, Harvard, Vancouver, ISO, and other styles
49

Eberius, Julian. "Query-Time Data Integration." Doctoral thesis, Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden, 2015. http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-191560.

Full text
Abstract:
Today, data is collected in ever increasing scale and variety, opening up enormous potential for new insights and data-centric products. However, in many cases the volume and heterogeneity of new data sources precludes up-front integration using traditional ETL processes and data warehouses. In some cases, it is even unclear if and in what context the collected data will be utilized. Therefore, there is a need for agile methods that defer the effort of integration until the usage context is established. This thesis introduces Query-Time Data Integration as an alternative concept to traditional up-front integration. It aims at enabling users to issue ad-hoc queries on their own data as if all potential other data sources were already integrated, without declaring specific sources and mappings to use. Automated data search and integration methods are then coupled directly with query processing on the available data. The ambiguity and uncertainty introduced through fully automated retrieval and mapping methods is compensated by answering those queries with ranked lists of alternative results. Each result is then based on different data sources or query interpretations, allowing users to pick the result most suitable to their information need. To this end, this thesis makes three main contributions. Firstly, we introduce a novel method for Top-k Entity Augmentation, which is able to construct a top-k list of consistent integration results from a large corpus of heterogeneous data sources. It improves on the state-of-the-art by producing a set of individually consistent, but mutually diverse, set of alternative solutions, while minimizing the number of data sources used. Secondly, based on this novel augmentation method, we introduce the DrillBeyond system, which is able to process Open World SQL queries, i.e., queries referencing arbitrary attributes not defined in the queried database. The original database is then augmented at query time with Web data sources providing those attributes. Its hybrid augmentation/relational query processing enables the use of ad-hoc data search and integration in data analysis queries, and improves both performance and quality when compared to using separate systems for the two tasks. Finally, we studied the management of large-scale dataset corpora such as data lakes or Open Data platforms, which are used as data sources for our augmentation methods. We introduce Publish-time Data Integration as a new technique for data curation systems managing such corpora, which aims at improving the individual reusability of datasets without requiring up-front global integration. This is achieved by automatically generating metadata and format recommendations, allowing publishers to enhance their datasets with minimal effort. Collectively, these three contributions are the foundation of a Query-time Data Integration architecture, that enables ad-hoc data search and integration queries over large heterogeneous dataset collections.
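The Top-k Entity Augmentation method itself is considerably more sophisticated, but the toy sketch below conveys the flavour: given a set of entities missing an attribute and several candidate sources with partial coverage, it greedily builds k alternative, mutually diverse combinations of sources that each cover all entities. The source data and the selection criterion are invented for the example.

def augment_top_k(entities, sources, k=2):
    # sources: name -> {entity: value}. Greedily build k source combinations that
    # each cover all entities, preferring sources not used in earlier solutions.
    solutions, used_before = [], set()
    for _ in range(k):
        uncovered, chosen = set(entities), []
        while uncovered:
            useful = [s for s in sources if uncovered & sources[s].keys()]
            if not useful:
                break                                  # some entities cannot be covered
            best = max(useful, key=lambda s: (s not in used_before,
                                              len(uncovered & sources[s].keys())))
            chosen.append(best)
            uncovered -= sources[best].keys()
        solutions.append(chosen)
        used_before.update(chosen)
    return solutions

sources = {"wiki_gdp": {"France": 2.9, "Italy": 2.1},
           "imf_gdp": {"France": 3.0, "Italy": 2.0, "Spain": 1.4},
           "blog_gdp": {"Spain": 1.5}}
print(augment_top_k(["France", "Italy", "Spain"], sources, k=2))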
APA, Harvard, Vancouver, ISO, and other styles
50

Song, Xiaohui. "FPGA Implementation of a Support Vector Machine based Classification System and its Potential Application in Smart Grid." University of Toledo / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1376579033.

Full text
APA, Harvard, Vancouver, ISO, and other styles
