Dissertations / Theses on the topic 'Automatic classification Statistical methods'

To see the other types of publications on this topic, follow the link: Automatic classification Statistical methods.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Automatic classification Statistical methods.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Latino, Diogo Alexandre Rosa Serra. "Automatic learning for the classification of chemical reactions and in statistical thermodynamics." Doctoral thesis, FCT - UNL, 2008. http://hdl.handle.net/10362/1752.

Full text
Abstract:
This thesis describes the application of automatic learning methods to a) the classification of organic and metabolic reactions, and b) the mapping of Potential Energy Surfaces (PES). The classification of reactions was approached with two distinct methodologies: a representation of chemical reactions based on NMR data, and a representation of chemical reactions from the reaction equation based on the physico-chemical and topological features of chemical bonds.

NMR-based classification of photochemical and enzymatic reactions. Photochemical and metabolic reactions were classified by Kohonen Self-Organizing Maps (Kohonen SOMs) and Random Forests (RFs), taking as input the difference between the 1H NMR spectra of the products and the reactants. Such a representation can be applied to the automatic analysis of changes in the 1H NMR spectrum of a mixture and their interpretation in terms of the chemical reactions taking place. Examples of possible applications are the monitoring of reaction processes, the evaluation of the stability of chemicals, or even the interpretation of metabonomic data. A Kohonen SOM trained with a data set of metabolic reactions catalysed by transferases was able to correctly classify 75% of an independent test set in terms of the EC number subclass; Random Forests improved the correct predictions to 79%. With photochemical reactions classified into 7 groups, an independent test set was classified with 86-93% accuracy. The data set of photochemical reactions was also used to simulate mixtures with two reactions occurring simultaneously. Kohonen SOMs and Feed-Forward Neural Networks (FFNNs) were trained to classify the reactions occurring in a mixture based on the 1H NMR spectra of the products and reactants. Kohonen SOMs allowed the correct assignment of 53-63% of the mixtures (in a test set), and Counter-Propagation Neural Networks (CPNNs) gave similar results.
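The representation just described, the difference between the product and reactant 1H NMR spectra fed as a feature vector to a classifier, can be sketched as follows. The binned spectra, the two reaction classes, and the Random Forest settings are toy assumptions for illustration, not the thesis's SPINUS-simulated data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
N_BINS = 64  # assumed binning of the 1H NMR chemical-shift axis

def make_reaction(cls):
    """Toy reaction of class `cls`: a class-specific intensity change
    appears in one region of the product spectrum."""
    reactants = rng.random(N_BINS)
    products = reactants.copy()
    products[slice(0, 10) if cls == 0 else slice(40, 50)] += 1.0
    return products - reactants, cls  # descriptor = difference spectrum

data = [make_reaction(i % 2) for i in range(200)]
X = np.array([d for d, _ in data])
y = np.array([c for _, c in data])

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X[:150], y[:150])
accuracy = clf.score(X[150:], y[150:])
```

On this trivially separable toy set the forest reaches near-perfect test accuracy; the 75-93% figures in the abstract come from real reaction data and are the meaningful benchmark.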
The use of supervised learning techniques improved these results: to 77% of correct assignments when an ensemble of ten FFNNs was used, and to 80% when Random Forests were used. This study was performed with NMR data simulated from the molecular structure by the SPINUS program; in the design of one test set, simulated data were combined with experimental data. The results support the proposal of linking databases of chemical reactions to experimental or simulated NMR data for the automatic classification of reactions and mixtures of reactions.

Genome-scale classification of enzymatic reactions from their reaction equation. The MOLMAP descriptor relies on a Kohonen SOM that defines types of bonds on the basis of their physico-chemical and topological properties. The MOLMAP descriptor of a molecule represents the types of bonds available in that molecule. The MOLMAP descriptor of a reaction is defined as the difference between the MOLMAPs of the products and the reactants, and numerically encodes the pattern of bonds that are broken, changed, and made during a chemical reaction. The automatic perception of chemical similarities between metabolic reactions is required for a variety of applications, ranging from the computer validation of classification systems and the genome-scale reconstruction (or comparison) of metabolic pathways, to the classification of enzymatic mechanisms. Catalytic functions of proteins are generally described by EC numbers, which are simultaneously employed as identifiers of reactions, enzymes, and enzyme genes, thus linking metabolic and genomic information. Different methods should be available to automatically compare metabolic reactions and to automatically assign EC numbers to reactions not yet officially classified.
In this study, the genome-scale data set of enzymatic reactions available in the KEGG database was encoded by MOLMAP descriptors and submitted to Kohonen SOMs in order to compare the resulting map with the official EC number classification, to explore the possibility of predicting EC numbers from the reaction equation, and to assess the internal consistency of the EC classification at the class level. A general agreement with the EC classification was observed, i.e. a relationship between the similarity of MOLMAPs and the similarity of EC numbers. At the same time, MOLMAPs were able to discriminate between EC sub-subclasses. EC numbers could be assigned at the class, subclass, and sub-subclass levels with accuracies of up to 92%, 80%, and 70% for independent test sets. The correspondence between the chemical similarity of metabolic reactions and their MOLMAP descriptors was applied to identify reactions mapped into the same neuron but belonging to different EC classes, which demonstrated the ability of the MOLMAP/SOM approach to verify the internal consistency of classifications in databases of metabolic reactions. RFs were also used to assign the four levels of the EC hierarchy from the reaction equation: EC numbers were correctly assigned in 95%, 90%, 85%, and 86% of the cases (for independent test sets) at the class, subclass, sub-subclass, and full EC number levels, respectively. Experiments on the classification of reactions from the main reactants and products were also performed with RFs; EC numbers were assigned at the class, subclass, and sub-subclass levels with accuracies of 78%, 74%, and 63%, respectively. In the course of the experiments with metabolic reactions, we suggested that the MOLMAP/SOM concept could be extended to the representation of other levels of metabolic information, such as metabolic pathways.
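A minimal, hypothetical sketch of the MOLMAP idea: here the bond types are a fixed discrete list and the molecular MOLMAP is a simple count vector, whereas the thesis derives the bond types with a Kohonen SOM over physico-chemical and topological bond properties. The reaction descriptor is, as described above, the product map minus the reactant map:

```python
from collections import Counter

# A fixed list of bond types stands in for the SOM-derived types of the thesis.
BOND_TYPES = ["C-C", "C=C", "C-H", "C-O", "C=O", "O-H"]

def molmap(bonds):
    """Count vector of bond types in a molecule (stand-in for a SOM activation map)."""
    counts = Counter(bonds)
    return [counts[t] for t in BOND_TYPES]

def reaction_molmap(reactant_bonds, product_bonds):
    """Reaction descriptor: positive entries are bonds made, negative are bonds broken."""
    return [p - r for r, p in zip(molmap(reactant_bonds), molmap(product_bonds))]

# Toy alkene hydration: C=C and one O-H of water are broken;
# a C-C, a C-H, and a C-O are made.
desc = reaction_molmap(["C=C", "C-H", "C-H", "O-H", "O-H"],
                       ["C-C", "C-H", "C-H", "C-H", "C-O", "O-H"])
# desc == [1, -1, 1, 1, 0, -1]
```

Two reactions that break and make the same pattern of bond types get identical descriptors, which is what lets a SOM or Random Forest group them by EC class.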
Following the MOLMAP idea, the pattern of neurons activated by the reactions of a metabolic pathway is a representation of the reactions involved in that pathway - a descriptor of the metabolic pathway. This reasoning enabled the comparison of different pathways, the automatic classification of pathways, and a classification of organisms based on their biochemical machinery. The three levels of classification (from bonds to metabolic pathways) made it possible to map and perceive chemical similarities between metabolic pathways, even for pathways of different types of metabolism and pathways that do not share similarities in terms of EC numbers.

Mapping of PES by neural networks (NNs). In a first series of experiments, ensembles of Feed-Forward NNs (EnsFFNNs) and Associative Neural Networks (ASNNs) were trained to reproduce PES represented by the Lennard-Jones (LJ) analytical potential function. The accuracy of the method was assessed by comparing the results of molecular dynamics simulations (thermal, structural, and dynamic properties) obtained from the NN-based PES and from the LJ function. The results indicated that, for LJ-type potentials, NNs can be trained to generate PES accurate enough for use in molecular simulations. EnsFFNNs and ASNNs gave better results than single FFNNs, and the NN models showed a remarkable ability to interpolate between distant curves and to accurately reproduce potentials for molecular simulations. The purpose of the first study was to systematically analyse the accuracy of different NNs. Our main motivation, however, is reflected in the next study: the mapping of multidimensional PES by NNs in order to simulate, by Molecular Dynamics or Monte Carlo, the adsorption and self-assembly of solvated organic molecules on noble-metal electrodes. Indeed, for such complex and heterogeneous systems, the development of suitable analytical functions that fit quantum mechanical interaction energies is a non-trivial or even impossible task.
The data consisted of energy values, from Density Functional Theory (DFT) calculations, at different distances, for several molecular orientations and three electrode adsorption sites. The results indicate that NNs require a data set large enough to cover the diversity of possible interaction sites, distances, and orientations. NNs trained with such data sets can perform as well as, or even better than, analytical functions. They can therefore be used in molecular simulations, particularly for the ethanol/Au(111) interface, which is the case studied in the present thesis. Once properly trained, the networks are able to produce, as output, any required number of energy points for accurate interpolations.
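The first series of experiments, training networks to reproduce a Lennard-Jones PES, can be sketched with a single feed-forward regressor. The thesis uses ensembles of FFNNs and ASNNs; the network size, training range, and scikit-learn regressor below are illustrative assumptions:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """LJ pair potential V(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

# Train on energies sampled along the curve (reduced units)
r_train = np.linspace(0.95, 3.0, 400).reshape(-1, 1)
net = MLPRegressor(hidden_layer_sizes=(32, 32), activation="tanh",
                   solver="lbfgs", max_iter=5000, random_state=0)
net.fit(r_train, lennard_jones(r_train.ravel()))

# Interpolation error on distances not seen during training
r_test = np.linspace(1.0, 2.8, 57).reshape(-1, 1)
err = np.max(np.abs(net.predict(r_test) - lennard_jones(r_test.ravel())))
```

The same recipe generalizes to the multidimensional DFT-based surfaces of the second study, where the network input is the full set of distance and orientation coordinates rather than a single pair distance.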
APA, Harvard, Vancouver, ISO, and other styles
2

Arshad, Irshad Ahmad. "Using statistical methods for automatic classifications of clouds in ground-based photographs of the sky." Thesis, University of Essex, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.250129.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Shepherd, Gareth William. "Automating the aetiological classification of descriptive injury data." Awarded by: University of New South Wales, School of Safety Science, 2006. http://handle.unsw.edu.au/1959.4/24934.

Full text
Abstract:
Injury now surpasses disease as the leading global cause of premature death and disability, claiming over 5.8 million lives each year. However, unlike disease, which has been subjected to a rigorous epidemiologic approach, the field of injury prevention and control has been a relative newcomer to scientific investigation. With the distribution of injury now well described (i.e. 'who', 'what', 'where' and 'when'), the underlying hypothesis is that progress in understanding 'how' and 'why' lies in classifying injury occurrences aetiologically. The advancement of a means of classifying injury aetiology has so far been inhibited by two related limitations: 1. Structural limitation: the absence of a cohesive and validated aetiological taxonomy for injury; and 2. Methodological limitation: the need to manually classify large numbers of injury cases to determine aetiological patterns. This work is directed at overcoming these impediments to injury research. An aetiological taxonomy for injury was developed consistent with epidemiologic principles, along with clear conventions and a defined three-tier hierarchical structure. Validation testing revealed that the taxonomy could be applied with a high degree of accuracy (coder/gold-standard agreement was 92.5-95.0%) and with high inter- and intra-coder reliability (93.0-96.3% and 93.5-96.3%). Practical application demonstrated the emergence of strong aetiological patterns, which provided insight into causative sequences leading to injury and led to the identification of effective control measures to reduce injury frequency and severity. However, limitations related to the inefficient and error-prone manual classification process (an average processing time of 4.75 minutes per case and a 5.0-7.5% error rate) revealed the need for an automated approach.
To overcome these limitations, a knowledge acquisition (KA) software tool was developed, tested and applied, based on an expert-systems technique known as ripple-down rules (RDR). It was found that the KA system was able to acquire tacit knowledge from a human expert and apply the learned rules to efficiently and accurately classify large numbers of injury cases. Ultimately, coding error rates dropped to 3.1%, which, along with an average processing time of 2.50 minutes per case, compared favourably with the results from manual classification. As such, the developed taxonomy and KA tool offer significant advantages to injury researchers who need to deduce useful patterns from injury data and test hypotheses regarding causation and prevention.
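A minimal sketch of the ripple-down-rules idea behind such a KA tool: each rule's conclusion can be overridden by later exception (refinement) rules acquired from the expert, so the knowledge base grows incrementally without breaking earlier classifications. The injury keywords and labels below are invented for illustration:

```python
class RDRNode:
    """Single-classification ripple-down rule node: if the condition fires,
    exception rules (if_true) may refine the conclusion; otherwise control
    falls through to the next rule (if_false)."""
    def __init__(self, cond, conclusion, if_true=None, if_false=None):
        self.cond, self.conclusion = cond, conclusion
        self.if_true, self.if_false = if_true, if_false

    def classify(self, case):
        if self.cond(case):
            refined = self.if_true.classify(case) if self.if_true else None
            return refined if refined is not None else self.conclusion
        return self.if_false.classify(case) if self.if_false else None

# Toy rules over injury narratives; keywords and labels are invented.
root = RDRNode(lambda c: "fall" in c, "fall-related",
               if_true=RDRNode(lambda c: "ladder" in c, "fall from height"),
               if_false=RDRNode(lambda c: "burn" in c, "thermal"))

label = root.classify("fall from a ladder while painting")  # "fall from height"
```

When the expert corrects a misclassified case, a new exception node is attached at the point where the wrong conclusion was reached, which is what keeps maintenance cheap as the rule base grows.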
APA, Harvard, Vancouver, ISO, and other styles
4

Monroy, Chora Isaac. "An investigation on automatic systems for fault diagnosis in chemical processes." Doctoral thesis, Universitat Politècnica de Catalunya, 2012. http://hdl.handle.net/10803/77637.

Full text
Abstract:
Plant safety is the most important concern of chemical industries. Process faults can cause economic losses as well as human and environmental damage. Most operational faults are normally considered in the process design phase by applying methodologies such as Hazard and Operability Analysis (HAZOP). However, it should be expected that failures will occur in an operating plant. For this reason, it is of paramount importance that plant operators can promptly detect and diagnose such faults in order to take the appropriate corrective actions. In addition, preventive maintenance needs to be considered in order to increase plant safety. Fault diagnosis has been approached with both analytical and data-based models, using several techniques and algorithms. However, there is not yet a general fault diagnosis framework that combines the detection and diagnosis of faults, whether or not they are registered in historical records. Moreover, little effort has been devoted to automating and implementing the reported approaches in real practice. Against this background, this thesis proposes a general framework for data-driven Fault Detection and Diagnosis (FDD) that is applicable, and amenable to automation, in any industrial scenario in order to maintain plant safety. The main requirement for constructing this system is the existence of historical process data. In this sense, promising methods imported from the Machine Learning field are introduced as fault diagnosis methods. The learning algorithms, used as diagnosis methods, have proved capable of diagnosing not only the modeled faults, but also novel faults. Furthermore, Risk-Based Maintenance (RBM) techniques, widely used in the petrochemical industry, are proposed as part of the preventive maintenance in all industry sectors. The proposed FDD system, together with an appropriate preventive maintenance program, would represent a potential plant safety program to be implemented.
Chapter one presents a general introduction to the thesis topic, as well as its motivation and scope. Chapter two then reviews the state of the art of the related fields: the fault detection and diagnosis methods found in the literature are surveyed, and a taxonomy that unifies the classifications proposed in both Artificial Intelligence (AI) and Process Systems Engineering (PSE) is put forward. The assessment of fault diagnosis with performance indices is also reviewed, together with the state of the art of Risk Analysis (RA) as a tool for taking corrective actions against faults and of Maintenance Management for preventive actions. Finally, the benchmark case studies against which FDD research is commonly validated are examined in this chapter. The second part of the thesis, comprising chapters three to six, addresses the methods applied during the research work: chapter three deals with data pre-processing, chapter four with the feature processing stage, and chapter five with the diagnosis algorithms, while chapter six introduces the Risk-Based Maintenance techniques for addressing plant preventive maintenance. The third part includes chapter seven, which constitutes the core of the thesis. In this chapter the proposed general FDD system is outlined, divided into three steps: diagnosis model construction, model validation, and on-line application. This scheme includes a fault detection module and an Anomaly Detection (AD) methodology for the detection of novel faults. Furthermore, several approaches are derived from this general scheme for continuous and batch processes. The fourth part of the thesis presents the validation of the approaches: chapter eight presents the validation of the proposed approaches in continuous processes and chapter nine the validation of the batch process approaches. Chapter ten applies the AD methodology to real, scaled batch processes.
First, the methodology is applied to a laboratory heat exchanger and then to a Photo-Fenton pilot plant, which corroborates its potential and success in real practice. Finally, the fifth part, comprising chapter eleven, is dedicated to the final conclusions and the main contributions of the thesis. The scientific production achieved during the research period is also listed, and prospects for further work are outlined.
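A toy sketch of the detection-plus-novelty idea in the proposed FDD framework: known faults are diagnosed by proximity to their class, and a sample far from every modeled fault is flagged as novel. The nearest-centroid rule, the fixed radius, and the fault names are illustrative stand-ins for the thesis's machine-learning diagnosers and AD methodology:

```python
import numpy as np

class FDDWithNovelty:
    """Diagnose known faults by the nearest class centroid; flag a sample as a
    novel fault when it is farther than `radius` from every modeled class."""
    def __init__(self, radius=3.0):
        self.radius = radius

    def fit(self, X, y):
        self.labels = np.unique(y)
        self.centroids = np.array([X[y == k].mean(axis=0) for k in self.labels])
        return self

    def diagnose(self, x):
        d = np.linalg.norm(self.centroids - x, axis=1)
        k = int(np.argmin(d))
        return "novel fault" if d[k] > self.radius else str(self.labels[k])

rng = np.random.default_rng(0)
# Two modeled fault classes in a 2-D feature space (toy process measurements)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(5, 0.5, (50, 2))])
y = np.array(["valve stiction"] * 50 + ["sensor bias"] * 50)
fdd = FDDWithNovelty().fit(X, y)
```

The "novel fault" branch is what allows the system to handle faults not registered in historical records; in practice the diagnoser and the anomaly threshold would be learned from plant data rather than fixed.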
APA, Harvard, Vancouver, ISO, and other styles
5

Fu, Qiang. "A generalization of the minimum classification error (MCE) training method for speech recognition and detection." Diss., Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/22705.

Full text
Abstract:
The model training algorithm is a critical component of statistical pattern recognition approaches based on Bayes decision theory. Conventional applications of Bayes decision theory usually assume uniform error cost, resulting in the ubiquitous use of the maximum a posteriori (MAP) decision policy and the paradigm of distribution estimation as standard practice in the design of a statistical pattern recognition system. The minimum classification error (MCE) training method was proposed to overcome some substantial limitations of the conventional distribution estimation methods. In this thesis, three aspects of the MCE method are generalized. First, an optimal classifier/recognizer design framework is constructed, aiming at minimizing non-uniform error cost. A generalized training criterion named weighted MCE is proposed for pattern and speech recognition tasks with non-uniform error cost. Second, the MCE method for speech recognition tasks requires appropriate management of multiple recognition hypotheses for each data segment. A modified version of the MCE method with a new approach to selecting and organizing recognition hypotheses is proposed for continuous phoneme recognition. Third, the minimum verification error (MVE) method for detection-based automatic speech recognition (ASR) is studied. The MVE method can be viewed as a special version of the MCE method that aims at minimizing detection/verification errors. We present experiments on pattern recognition and speech recognition tasks to demonstrate the effectiveness of these generalizations.
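For background, the standard MCE criterion that these generalizations build on (the usual Juang-Katagiri formulation; the notation here may differ from the thesis) defines, for class $k$ with discriminant functions $g_j(x;\Lambda)$ over $M$ classes, a misclassification measure and a smooth loss:

```latex
d_k(x;\Lambda) = -g_k(x;\Lambda)
  + \log\!\left[\frac{1}{M-1}\sum_{j\neq k} e^{\,\eta\, g_j(x;\Lambda)}\right]^{1/\eta},
\qquad
\ell\bigl(d_k(x;\Lambda)\bigr) = \frac{1}{1 + e^{-\gamma\, d_k(x;\Lambda) + \theta}}
```

As $\eta \to \infty$ the bracketed term approaches the score of the best competing class, so $d_k > 0$ signals a misclassification; the sigmoid $\ell$ makes the empirical error count differentiable in $\Lambda$. The weighted MCE of this thesis, as described above, replaces the implicit uniform cost with a non-uniform error cost attached to each class pair.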
APA, Harvard, Vancouver, ISO, and other styles
6

Gorecki, Christophe. "Classification par échantillonnage de la densité spectrale d'énergie : Application à l'étude statistique des surfaces et à l'analyse de particules." Besançon, 1989. http://www.theses.fr/1989BESA2015.

Full text
Abstract:
Study of an optical profilometer based on the defocusing of a white-light beam, and of two opto-digital statistical analysis devices using optical Fourier techniques: a particle analyser and a device for the automatic classification of unpolished surfaces.
APA, Harvard, Vancouver, ISO, and other styles
7

Dimara, Euthalie. "L'agriculture grecque : une étude chronologique et régionale par l'analyse des correspondances et la classification automatique." Paris 6, 1988. http://www.theses.fr/1988PA066199.

Full text
Abstract:
In a statistical study of Greek agricultural production over the period 1970-1981, a three-way correspondence analysis is carried out between three data sets: the 53 Greek departments, 56 different products, and the set of ten years. It emerges that Greek agriculture has a traditional character in which the time factor has no significant influence. Two agrarian zones can be distinguished: sub-Mediterranean production in the north, and typically Mediterranean products in the south and on the islands.
APA, Harvard, Vancouver, ISO, and other styles
8

Sastre, Jurado Carlos. "Exploitation du signal pénétrométrique pour l'aide à l'obtention d'un modèle de terrain." Thesis, Université Clermont Auvergne (2017-2020), 2018. http://www.theses.fr/2018CLFAC003/document.

Full text
Abstract:
This research focuses on the site characterization of shallow soils using the Panda® dynamic cone penetrometer, which uses variable driving energy. The main purpose is to study and propose several techniques, as part of an overall method, for obtaining a ground model from a geotechnical campaign based on the Panda test. The work is divided into four parts, each focused on a specific topic. First, we introduce the main site characterization techniques, including the Panda dynamic penetrometer, and then present a brief overview of the geotechnical model and the mathematical methods for characterizing uncertainty in soil properties. The second part deals with the automatic identification of physically homogeneous soil units based on the soil's mechanical response to penetration in the Panda test. Following a study of soil layer identification based only on expert judgment, we propose statistical moving-window procedures for an objective assessment. The application of these statistical methods has been studied for laboratory and in situ Panda tests. The third part focuses on the automatic classification of the penetration curves within the homogeneous soil units identified using the statistical techniques proposed in part two. An automatic methodology to predict the soil grading from the dynamic cone resistance using artificial neural networks is proposed. The framework has been studied for two different problems: the classification of natural soils and the classification of several crushed aggregate-bentonite mixtures. Finally, the last part is devoted to modelling the spatial variability of the dynamic cone resistance qd based on random field theory and geostatistics. In order to reduce uncertainty in the field where Panda measurements are carried out, we propose the use of conditional simulation in three-dimensional space.
This approach has been applied to a real site investigation carried out in an alluvial Mediterranean deltaic environment in Spain. Complementary studies aimed at improving the proposed framework have been carried out for a second geotechnical campaign at an experimental site in France.
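The moving-window idea of part two, scanning the cone-resistance signal qd for statistically significant shifts that mark layer boundaries, can be sketched as below. The window size, the t-like contrast statistic, and the synthetic two-layer profile are illustrative assumptions, not the thesis's actual protocol or data:

```python
import numpy as np

def boundary_statistic(qd, window=20):
    """For each depth index, a t-like contrast between the mean cone
    resistance in the windows just above and just below that depth."""
    stats = np.zeros(len(qd))
    for i in range(window, len(qd) - window):
        above, below = qd[i - window:i], qd[i:i + window]
        pooled = np.sqrt((above.var() + below.var()) / window + 1e-12)
        stats[i] = abs(above.mean() - below.mean()) / pooled
    return stats

rng = np.random.default_rng(1)
# Synthetic two-layer profile: soft soil (qd ~ 2 MPa) over stiff soil (~ 8 MPa)
qd = np.concatenate([rng.normal(2.0, 0.3, 100), rng.normal(8.0, 0.8, 100)])
boundary = int(np.argmax(boundary_statistic(qd)))  # peaks near index 100
```

Peaks of the statistic above a chosen threshold delimit the homogeneous units that the later classification and random-field stages work on.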
APA, Harvard, Vancouver, ISO, and other styles
9

Wei, Yi. "Statistical methods on automatic aircraft recognition in aerial images." Thesis, University of Strathclyde, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.248947.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

Kim, Heeyoung. "Statistical methods for function estimation and classification." Diss., Georgia Institute of Technology, 2011. http://hdl.handle.net/1853/44806.

Full text
Abstract:
This thesis consists of three chapters. The first chapter focuses on adaptive smoothing splines for fitting functions with varying roughness. In the first part of the first chapter, we study an asymptotically optimal procedure to choose the value of a discretized version of the variable smoothing parameter in adaptive smoothing splines. With the choice given by the multivariate version of the generalized cross validation, the resulting adaptive smoothing spline estimator is shown to be consistent and asymptotically optimal under some general conditions. In the second part, we derive the asymptotically optimal local penalty function, which is subsequently used for the derivation of the locally optimal smoothing spline estimator. In the second chapter, we propose a Lipschitz regularity based statistical model, and apply it to coordinate measuring machine (CMM) data to estimate the form error of a manufactured product and to determine the optimal sampling positions of CMM measurements. Our proposed wavelet-based model takes advantage of the fact that the Lipschitz regularity holds for the CMM data. The third chapter focuses on the classification of functional data which are known to be well separable within a particular interval. We propose an interval based classifier. We first estimate a baseline of each class via convex optimization, and then identify an optimal interval that maximizes the difference among the baselines. Our interval based classifier is constructed based on the identified optimal interval. The derived classifier can be implemented via a low-order-of-complexity algorithm.
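The third chapter's interval-based classifier can be sketched in a simplified form: estimate each class's baseline by the pointwise mean (the thesis uses convex optimization for this step) and scan for the interval where the baselines differ most. The grid size, bump location, and noise level below are toy assumptions:

```python
import numpy as np

def best_interval(class_a, class_b, min_len=5):
    """Scan all intervals [i, j) and return the one where the two class
    baselines (pointwise means) differ most on average."""
    gap = np.abs(class_a.mean(axis=0) - class_b.mean(axis=0))
    best, best_score = (0, min_len), -np.inf
    for i in range(len(gap) - min_len + 1):
        for j in range(i + min_len, len(gap) + 1):
            score = gap[i:j].mean()
            if score > best_score:
                best_score, best = score, (i, j)
    return best

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
bump = np.exp(-((t - 0.5) / 0.05) ** 2)    # classes differ only near t = 0.5
A = rng.normal(0.0, 0.1, (30, 50)) + bump  # class A curves
B = rng.normal(0.0, 0.1, (30, 50))         # class B curves
i, j = best_interval(A, B)                 # interval straddles t = 0.5
```

A classifier built on the identified interval then only compares a new curve to the baselines over [i, j), which is why separability within a particular interval is the key assumption.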
APA, Harvard, Vancouver, ISO, and other styles
11

Sun, Felice (Felice Tzu-yun) 1976. "Integrating statistical and knowledge-based methods for automatic phonemic segmentation." Thesis, Massachusetts Institute of Technology, 1999. http://hdl.handle.net/1721.1/80127.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Sezgin, Ozge. "Statistical Methods In Credit Rating." Master's thesis, METU, 2006. http://etd.lib.metu.edu.tr/upload/12607625/index.pdf.

Full text
Abstract:
Credit risk is one of the major risks banks and financial institutions are faced with. With the New Basel Capital Accord, banks and financial institutions have the opportunity to improve their risk management process by using the Internal Rating Based (IRB) approach. In this thesis, we focused on the internal credit rating process. First, a short overview of credit scoring techniques and validation techniques was given. Using a real data set about manufacturing firms obtained from a Turkish bank, logistic regression, probit regression, discriminant analysis, and classification and regression tree models for default prediction were built. To improve the performances of the models, the optimum sample for logistic regression was selected from the data set and taken as the model construction sample. In addition, information was given on how to convert continuous variables to ordered-scale variables to avoid scale-difference problems. After the models were built, the performances of the models on the whole data set, including both in-sample and out-of-sample observations, were evaluated with validation techniques suggested by the Basel Committee. In most cases the classification and regression trees model dominated the other techniques. After the credit scoring models were constructed and evaluated, the cut-off values used to map probabilities of default obtained from logistic regression to rating classes were determined with dual-objective optimization. The cut-off values that gave the maximum area under the ROC curve and the minimum mean square error of the regression tree were taken as the optimum thresholds after 1000 simulations. Keywords: Credit Rating, Classification and Regression Trees, ROC curve, Pietra Index
APA, Harvard, Vancouver, ISO, and other styles
13

Towey, David John. "SPECT imaging and automatic classification methods in movement disorders." Thesis, Imperial College London, 2013. http://hdl.handle.net/10044/1/11182.

Full text
Abstract:
This work investigates neuroimaging as applied to movement disorders by the use of radionuclide imaging techniques. There are two focuses in this work: 1) the optimisation of the SPECT imaging process, including acquisition and image reconstruction; 2) the development and optimisation of automated analysis techniques. The first part has included practical measurements of camera performance using a range of phantoms. Filtered back projection and iterative methods of image reconstruction were compared and optimised. Compensation methods for attenuation and scatter are assessed. Iterative methods are shown to improve image quality over filtered back projection for a range of image quality indexes. Quantitative improvements are shown when attenuation and scatter compensation techniques are applied, but at the expense of increased noise. The clinical acquisition and processing procedures were adjusted accordingly. A large database of clinical studies was used to compare commercially available DaTSCAN quantification software programs. A novel automatic analysis technique was then developed by combining Principal Component Analysis (PCA) and machine learning techniques (including Support Vector Machines and Naive Bayes). The accuracy of the various classification methods under different conditions is investigated and discussed. The thesis concludes that the described method can allow automatic classification of clinical images with accuracy equal to or greater than that of commercially available systems.
APA, Harvard, Vancouver, ISO, and other styles
14

Kolesov, Ivan A. "Statistical methods for coupling expert knowledge and automatic image segmentation and registration." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/47739.

Full text
Abstract:
The objective of the proposed research is to develop methods that couple an expert user's guidance with automatic image segmentation and registration algorithms. Often, complex processes such as fire, anatomical changes/variations in human bodies, or unpredictable human behavior produce the target images; in these cases, creating a model that precisely describes the process is not feasible. A common solution is to make simplifying assumptions when performing detection, segmentation, or registration tasks automatically. However, when these assumptions are not satisfied, the results are unsatisfactory. Hence, removing these, oftentimes stringent, assumptions at the cost of minimal user input is considered an acceptable trade-off. Three milestones towards reaching this goal have been achieved. First, an interactive image segmentation approach was created in which the user is coupled in a closed-loop control system with a level set segmentation algorithm. The user's expert knowledge is combined with the speed of automatic segmentation. Second, a stochastic point set registration algorithm is presented. The point sets can be derived from simple user input (e.g. a thresholding operation), and time-consuming correspondence labeling is not required. Furthermore, common smoothness assumptions on the non-rigid deformation field are removed. Third, a stochastic image registration algorithm is designed to capture large misalignments. For future research, several improvements of the registration are proposed, and an iterative, landmark-based segmentation approach, which couples the segmentation and registration, is envisioned.
APA, Harvard, Vancouver, ISO, and other styles
15

Randolph, Tami Rochele. "Image compression and classification using nonlinear filter banks." Diss., Georgia Institute of Technology, 2001. http://hdl.handle.net/1853/13439.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Mohamed, Ghada. "Text classification in the BNC using corpus and statistical methods." Thesis, Lancaster University, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.658020.

Full text
Abstract:
The main part of this thesis sets out to develop a system of categories within a text typology. Although there exist many different approaches to the classification of text into categories, this research fills a gap in the literature, as most work on text classification is based on features external to the text such as the text's purpose, the aim of discourse, and the medium of communication. Text categories that have been set up based on some external features are not linguistically defined. In consequence, texts that belong to the same type are not necessarily similar in their linguistic forms. Even Biber's (1988) linguistically-oriented work was based on externally defined registers. Further, establishing text categories based on text-external features favours theoretical and qualitative approaches to text classification. These approaches can be seen as top-down approaches where external features are defined functionally in advance, and subsequently patterns of linguistic features are described in relation to each function. In such a case, the process of linking texts with a particular type is not done in a systematic way. In this thesis, I show how a text typology based on similarities in linguistic form can be developed systematically using a multivariate statistical technique; namely, cluster analysis. Following a review of various possible approaches to multivariate statistical analysis, I argue that cluster analysis is the most appropriate for systematising the study of text classification, because it has the distinctive feature of placing objects into distinct groupings based on their overall similarities across multiple variables. Cluster analysis identifies these groupings algorithmically. The objects to be clustered in my thesis are the written texts in the British National Corpus (BNC). I will make use of the written part only, since results of previous research which attempts to classify texts of this dataset were not very beneficial.
Takahashi (2006), for instance, identified merely a broad distinction between formal and informal styles in the written part; whereas in the spoken part, he could come up with insightful results. Thus, it seems justifiable to look at the part of the BNC which Takahashi found intractable, using a different multivariate technique, to see if this methodology allows patterns to emerge in the dataset. Further, there are two other reasons to use the written BNC. First, some studies (e.g. Akinnaso 1982; Chafe and Danielewicz 1987) suggest that distinctions between text varieties based on frequencies of linguistic features can be identified even within one mode of communication, i.e. writing. Second, analysing written text varieties has direct implications for pedagogy (Biber and Conrad 2009). The variables measured in the written texts of the BNC are linguistic features that have functional associations. However, any linguistic feature can be interpreted functionally; hence, we cannot say that there is an easy way to decide on a list of linguistic features to investigate text varieties. In this thesis, the list of linguistic features is informed by some aspects of Systemic Functional Theory (SFT) and characteristics identified in previous research on writing, as opposed to speech. SFT lends itself to the interpretation of how language is used through functional associations of linguistic features, treating meaning and form as two inseparable notions. This characteristic of SFT can, to some extent, inform my research, which assumes that a model of text-types can be established by investigating not only the linguistic features shared in each type, but also the functions served by these linguistic features in each type. However, there is no commitment in this study to aspects of SFT other than those I have discussed here.
Similarly, the linguistic features that reflect characteristics of speech and writing identified in previous research also have a crucial role in distinguishing between different texts. For instance, writing is elaborate, and this is associated with linguistic features such as subordinate clauses, prepositional phrases, adjectives, and so on. However, these characteristics do not only reflect the distinction between speech and writing; they can also distinguish between different spoken texts or different written texts (see Akinnaso 1982). Thus, the linguistic features seen as important from these two perspectives are included in my list of linguistic features. To make the list more principled and exhaustive, I also consult a comprehensive corpus-based work on the English language, along with some microscopic studies examining individual features in different registers. The linguistic features include personal pronouns, passive constructions, prepositional phrases, nominalisation, modal auxiliaries, adverbs, and adjectives. Computing a cluster analysis based on this data is a complex process with many steps. At each step, several alternative techniques are available. Choosing among the available techniques is a non-trivial decision, as multiple alternatives are in common use by statisticians. I demonstrate a process of testing several combinations of clustering methods in order to determine the most useful/stable clustering combination(s) for use in the classification of texts by their linguistic features. To test the robustness of the clustering techniques and to validate the cluster analysis, I use three validation techniques for cluster analysis, namely the cophenetic coefficient, the adjusted Rand index, and the AU p-value. The findings of the cluster analysis represent a plausible attempt to systematise the study of the diversity of texts by means of automatic classification. Initially, the cluster analysis resulted in 16 clusters/text types.
However, a thorough investigation of those 16 clusters reveals that some clusters represent quite similar text types. Thus, it is possible to establish overall headings for similar types, reflecting their shared linguistic features. The resulting typology contains six major text types: persuasion, narration, informational narration, exposition, scientific exposition, and literary exposition. Cluster analysis thus proves to be a powerful tool for structuring the data, if used with caution. The way it is implemented in this study constitutes an advance in the field of text typology. Finally, a small-scale case study of the validity of the text typology is carried out. A questionnaire is used to find out whether and to what extent my taxonomy corresponds to native speakers' understanding of textual variability, that is, whether the taxonomy has some mental reality for native speakers of English. The results showed that native speakers of English, on the one hand, are good at explicitly identifying the grammatical features associated with scientific exposition and narration; but on the other hand, they are not so good at identifying the grammatical features associated with literary exposition and persuasion. The results also showed that participants seem to have difficulties in identifying grammatical features of informational narration. The results of this small-scale case study indicate that the text typology in my thesis is, to some extent, a phenomenon that native speakers are aware of, and thus we can justify placing our trust in the results - at least in their general pattern, if not in every detail.
APA, Harvard, Vancouver, ISO, and other styles
17

Wei, Xuelian. "Statistical methods in classification problems using gene expression / proteomic signatures." Diss., Restricted to subscribing institutions, 2008. http://proquest.umi.com/pqdweb?did=1680042151&sid=2&Fmt=2&clientId=1564&RQT=309&VName=PQD.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Arif, Omar. "Robust target localization and segmentation using statistical methods." Diss., Georgia Institute of Technology, 2010. http://hdl.handle.net/1853/33882.

Full text
Abstract:
This thesis aims to contribute to the area of visual tracking, which is the process of identifying an object of interest through a sequence of successive images. The thesis explores kernel-based statistical methods, which map the data to a higher dimensional space. A pre-image framework is provided to find the mapping from the embedding space to the input space for several manifold learning and dimensionality reduction algorithms. Two algorithms are developed for visual tracking that are robust to noise and occlusions. In the first algorithm, a kernel PCA-based eigenspace representation is used. The de-noising and clustering capabilities of the kernel PCA procedure lead to a robust algorithm. This framework is extended to incorporate the background information in an energy-based formulation, which is minimized using graph cuts, and to track multiple objects using a single learned model. In the second method, a robust density comparison framework is developed and applied to visual tracking, where an object is tracked by minimizing the distance between a model distribution and given candidate distributions. The superior performance of kernel-based algorithms comes at the price of increased storage and computational requirements. A novel method is developed that takes advantage of the universal approximation capabilities of generalized radial basis function neural networks to reduce the computational and storage requirements for kernel-based methods.
APA, Harvard, Vancouver, ISO, and other styles
19

Tilley, Jason W. "A Comparison of Statistical Filtering Methods for Automatic Term Extraction for Domain Analysis." Thesis, Virginia Tech, 2008. http://hdl.handle.net/10919/30818.

Full text
Abstract:
Fourteen word frequency metrics were tested to evaluate their effectiveness in identifying vocabulary in a domain. Fifteen domain engineering projects were examined to measure how closely the vocabularies selected by the fourteen word frequency metrics matched the vocabularies produced by domain engineers. Six filtering mechanisms were also evaluated to measure their impact on selecting proper vocabulary terms. The results of the experiment show that stemming and stop word removal do improve overlap scores and that term frequency is a valuable contributor to overlap. Variations on term frequency, however, do not always significantly improve overlap.
Master of Science
APA, Harvard, Vancouver, ISO, and other styles
20

Anderson, Sarah G. "Statistical Methods for Biological and Relational Data." The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1365441350.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Southworth, Robin. "Classification of spatial data using neural networks and other statistical methods." Thesis, University of Leeds, 1997. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.679849.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Dambreville, Samuel. "Statistical and geometric methods for shape-driven segmentation and tracking." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/22707.

Full text
Abstract:
Thesis (Ph. D.)--Electrical and Computer Engineering, Georgia Institute of Technology, 2008.
Committee Chair: Allen Tannenbaum; Committee Member: Anthony Yezzi; Committee Member: Marc Niethammer; Committee Member: Patricio Vela; Committee Member: Yucel Altunbasak.
APA, Harvard, Vancouver, ISO, and other styles
23

Fürst, Elmar Wilhelm, Peter Oberhofer, Christian Vogelauer, Rudolf Bauer, and David Martin Herold. "Innovative methods in European road freight transport statistics: A pilot study." Taylor and Francis, 2019. http://dx.doi.org/10.1080/09720510.2019.

Full text
Abstract:
By using innovative methods, such as the automated transfer of corporate electronic data to National Statistical Institutions, official transport data can be significantly improved in terms of reliability, costs and the burden on respondents. In this paper, we show that the automated compilation of statistical reports is possible and feasible. Based on previous findings, a new method and tool were developed in cooperation with two business partners from the logistics sector in Austria. The results show that the prototype could successfully be implemented at the partner companies. Improved data quality can lead to more reliable analyses in various fields. Compared to actual volumes of investments into transport, the costs of transport statistics are limited. By using the new and innovative data collection techniques, these costs can even be reduced in the long run; at the same time, the risk of bad investments and wrong decisions caused by analyses relying on poor data quality can be reduced. This results in a substantial value for business, research, the economy and the society.
APA, Harvard, Vancouver, ISO, and other styles
24

Wu, Jian, and 武健. "Discriminative speaker adaptation and environmental robustness in automatic speech recognition." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2004. http://hub.hku.hk/bib/B31246138.

Full text
APA, Harvard, Vancouver, ISO, and other styles
25

Delmege, James W. "CLASS : a study of methods for coarse phonetic classification /." Online version of thesis, 1988. http://hdl.handle.net/1850/10449.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Tansel, Icten. "Differentiation And Classification Of Counterfeit And Real Coins By Applying Statistical Methods." Master's thesis, METU, 2012. http://etd.lib.metu.edu.tr/upload/12614417/index.pdf.

Full text
Abstract:
Tansel, Içten. M.Sc, Archaeometry Graduate Program. Supervisor: Assist. Prof. Dr. Zeynep Isil Kalaylioglu. Co-Supervisor: Prof. Dr. Sahinde Demirci. June 2012, 105 pages. In this study, forty coins obtained from the Museum of Anatolian Civilizations (MAC) in Ankara were investigated. Twenty-two of those coins were real and the remaining eighteen were fake. The forty coins were Greek coins dated to the middle of the fifth century BCE and to the reign of Alexander the Great (336–323 BCE). The major aims of this study can be summarized as follows.
APA, Harvard, Vancouver, ISO, and other styles
27

Leung, Yuk-yee, and 梁玉儀. "An integrated framework for feature selection and classification in microarray data analysis." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2009. http://hub.hku.hk/bib/B43278632.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

黃伯光 and Pak-kwong Wong. "Statistical language models for Chinese recognition: speech and character." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 1998. http://hub.hku.hk/bib/B31239456.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

McCormick, Neil Howie. "Bayesian methods for automatic segmentation and classification of SLO and SONAR data." Thesis, Heriot-Watt University, 2001. http://hdl.handle.net/10399/452.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Lamont, Morné Michael Connell. "Binary classification trees : a comparison with popular classification methods in statistics using different software." Thesis, Stellenbosch : Stellenbosch University, 2002. http://hdl.handle.net/10019.1/52718.

Full text
Abstract:
Thesis (MComm) -- Stellenbosch University, 2002.
ENGLISH ABSTRACT: Consider a data set with a categorical response variable and a set of explanatory variables. The response variable can have two or more categories and the explanatory variables can be numerical or categorical. This is a typical setup for a classification analysis, where we want to model the response based on the explanatory variables. Traditional statistical methods have been developed under certain assumptions such as: the explanatory variables are numeric only and/or the data follow a multivariate normal distribution. In practice such assumptions are not always met. Different research fields generate data that have a mixed structure (categorical and numeric) and researchers are often interested in using all these data in the analysis. In recent years robust methods such as classification trees have become the substitute for traditional statistical methods when the above assumptions are violated. Classification trees are not only an effective classification method, but offer many other advantages. The aim of this thesis is to highlight the advantages of classification trees. In the chapters that follow, the theory of and further developments on classification trees are discussed. This forms the foundation for the CART software which is discussed in Chapter 5, as well as other software in which classification tree modeling is possible. We will compare classification trees to parametric, kernel, and k-nearest-neighbour discriminant analyses. A neural network is also compared to classification trees, and finally we draw some conclusions on classification trees and their comparison with other methods.
AFRIKAANSE OPSOMMING: Beskou 'n datastel met 'n kategoriese respons veranderlike en 'n stel verklarende veranderlikes. Die respons veranderlike kan twee of meer kategorieë hê en die verklarende veranderlikes kan numeries of kategories wees. Hierdie is 'n tipiese opset vir 'n klassifikasie analise, waar ons die respons wil modelleer deur gebruik te maak van die verklarende veranderlikes. Tradisionele statistiese metodes is ontwikkel onder sekere aannames soos: die verklarende veranderlikes is slegs numeries en/of dat die data 'n meerveranderlike normaal verdeling het. In die praktyk word daar nie altyd voldoen aan hierdie aannames nie. Verskillende navorsingsvelde genereer data wat 'n gemengde struktuur het (kategories en numeries) en navorsers wil soms al hierdie data gebruik in die analise. In die afgelope jare het robuuste metodes soos klassifikasie bome die alternatief geword vir tradisionele statistiese metodes as daar nie aan bogenoemde aannames voldoen word nie. Klassifikasie bome is nie net 'n effektiewe klassifikasie metode nie, maar bied baie meer voordele. Die doel van hierdie werkstuk is om die voordele van klassifikasie bome uit te wys. In die hoofstukke wat volg word die teorie en verdere ontwikkelinge van klassifikasie bome bespreek. Hierdie vorm die fondament vir die CART sagteware wat bespreek word in Hoofstuk 5, asook ander sagteware waarin klassifikasie boom modelering moontlik is. Ons sal klassifikasie bome vergelyk met parametriese-, "kernel"- en "k-nearest-neighbour" diskriminant analise. 'n Neurale netwerk word ook vergelyk met klassifikasie bome en ten slotte word daar gevolgtrekkings gemaak oor klassifikasie bome en hoe dit vergelyk met ander metodes.
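For readers unfamiliar with the technique compared throughout this entry, the core of CART-style classification tree growing is the impurity-minimizing split. The sketch below is a generic one-split illustration (a decision stump chosen by Gini impurity on hypothetical toy data), not the CART software or any analysis from the thesis itself.

```python
# Illustrative sketch: a one-split "classification tree" (decision stump)
# grown by minimizing Gini impurity, in the spirit of CART.
# Toy data and threshold search are hypothetical, for illustration only.

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(xs, ys):
    """Find the threshold on one numeric feature that minimizes the
    weighted Gini impurity of the two child nodes."""
    best = (None, float("inf"))
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best[1]:
            best = (t, score)
    return best

xs = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
ys = ["A", "A", "A", "B", "B", "B"]
threshold, impurity = best_split(xs, ys)
print(threshold, impurity)  # the split separates the two classes cleanly
```

A full tree applies this split recursively to each child node until a stopping rule is met.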
APA, Harvard, Vancouver, ISO, and other styles
31

Ravindran, Sourabh. "Physiologically Motivated Methods For Audio Pattern Classification." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/14066.

Full text
Abstract:
Human-like performance by machines in tasks of speech and audio processing has remained an elusive goal. In an attempt to bridge the gap in performance between humans and machines there has been an increased effort to study and model physiological processes. However, the widespread use of biologically inspired features proposed in the past has been hampered mainly by either the lack of robustness across a range of signal-to-noise ratios or the formidable computational costs. In physiological systems, sensor processing occurs in several stages. It is likely the case that signal features and biological processing techniques evolved together and are complementary or well matched. It is precisely for this reason that modeling the feature extraction processes should go hand in hand with modeling of the processes that use these features. This research presents a front-end feature extraction method for audio signals inspired by the human peripheral auditory system. New developments in the field of machine learning are leveraged to build classifiers to maximize the performance gains afforded by these features. The structure of the classification system is similar to what might be expected in physiological processing. Further, the feature extraction and classification algorithms can be efficiently implemented using the low-power cooperative analog-digital signal processing platform. The usefulness of the features is demonstrated for tasks of audio classification, speech versus non-speech discrimination, and speech recognition. The low-power nature of the classification system makes it ideal for use in applications such as hearing aids, hand-held devices, and surveillance through acoustic scene monitoring.
APA, Harvard, Vancouver, ISO, and other styles
32

Liu, Lihong. "Molecular phylogeny, classification, evolution and detection of pestiviruses /." Uppsala : Dept. of Biomedical Sciences and Veterinary Public Health, Swedish University of Agricultural Sciences, 2009. http://epsilon.slu.se/20098.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Zhong, Xiao. "A study of several statistical methods for classification with application to microbial source tracking." Link to electronic thesis, 2004. http://www.wpi.edu/Pubs/ETD/Available/etd-0430104-155106/.

Full text
Abstract:
Thesis (M.S.)--Worcester Polytechnic Institute.
Keywords: classification; k-nearest-neighbor (k-n-n); neural networks; linear discriminant analysis (LDA); support vector machines; microbial source tracking (MST); quadratic discriminant analysis (QDA); logistic regression. Includes bibliographical references (p. 59-61).
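Among the methods listed in the keywords above, k-nearest-neighbor is the simplest to state concretely. The sketch below is a generic illustration of k-NN with Euclidean distance and majority voting on hypothetical two-class toy data; it is not the thesis's implementation or its microbial source tracking data.

```python
# Illustrative sketch: k-nearest-neighbour classification with
# Euclidean distance and majority vote. Toy data are hypothetical.
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label) pairs; query: feature vector.
    Returns the majority label among the k nearest training points."""
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

train = [((0.0, 0.0), "source1"), ((0.1, 0.2), "source1"),
         ((1.0, 1.0), "source2"), ((0.9, 1.1), "source2")]
print(knn_predict(train, (0.2, 0.1)))  # → source1
```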
APA, Harvard, Vancouver, ISO, and other styles
34

Lavrik, Ilya A. "Novel Wavelet-Based Statistical Methods with Applications in Classification, Shrinkage, and Nano-Scale Image Analysis." Diss., Georgia Institute of Technology, 2005. http://hdl.handle.net/1853/10424.

Full text
Abstract:
Given the recent popularity and clear evidence of wide applicability of wavelets, this thesis is devoted to several statistical applications of wavelet transforms. Statistical multiscale modeling has, in the most recent decade, become a well-established area in both theoretical and applied statistics, with impact on developments in statistical methodology. Wavelet-based methods are important in statistics in areas such as regression, density and function estimation, factor analysis, modeling and forecasting in time series analysis, assessing self-similarity and fractality in data, and spatial statistics. In this thesis we show the applicability of wavelets by considering three problems: First, we consider a binary wavelet-based linear classifier. Both consistency results and implementation issues are addressed. We show that under mild assumptions the wavelet-based classification rule is both weakly and strongly universally consistent. The proposed method is illustrated on synthetic data sets in which the truth is known and on applied classification problems from the industrial and bioengineering fields. Second, we develop wavelet shrinkage methodology based on testing multiple hypotheses in the wavelet domain. The shrinkage/thresholding approach by implicit or explicit simultaneous testing of many hypotheses has been considered by many researchers and goes back to the early 1990s. We propose two new approaches to wavelet shrinkage/thresholding based on local False Discovery Rate (FDR), Bayes factors, and ordering of posterior probabilities. Finally, we propose a novel method for the analysis of straight-line alignment of features in images based on Hough and wavelet transforms. The new method is designed to work specifically with Transmission Electron Microscope (TEM) images taken at the nanoscale to detect linear structure formed by the atomic lattice.
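The generic idea behind wavelet shrinkage, on which the abstract above builds, can be sketched in a few lines: transform, threshold the detail coefficients, invert. The code below is a minimal illustration using a hand-rolled one-level Haar transform and soft thresholding on a hypothetical toy signal; the thesis's actual FDR- and Bayes-factor-based thresholding rules are not reproduced here.

```python
# Illustrative sketch of wavelet shrinkage: one-level Haar transform,
# soft-threshold the detail coefficients, invert. Toy signal and the
# threshold value 0.15 are hypothetical, for illustration only.

def haar_forward(x):
    """One-level Haar DWT of an even-length signal (unnormalized)."""
    approx = [(a + b) / 2 for a, b in zip(x[0::2], x[1::2])]
    detail = [(a - b) / 2 for a, b in zip(x[0::2], x[1::2])]
    return approx, detail

def haar_inverse(approx, detail):
    """Exact inverse of haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out

def soft(c, t):
    """Soft-thresholding: shrink a coefficient towards zero by t."""
    return max(abs(c) - t, 0.0) * (1 if c > 0 else -1)

signal = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.8, 5.1]
approx, detail = haar_forward(signal)
denoised = haar_inverse(approx, [soft(d, 0.15) for d in detail])
print(denoised)  # small pairwise fluctuations are smoothed away
```

In practice the transform is applied over several levels and the threshold is chosen statistically, e.g. by the multiple-testing rules the abstract describes.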
APA, Harvard, Vancouver, ISO, and other styles
35

ALQADAH, HATIM FAROUQ. "OPTIMIZED TIME-FREQUENCY CLASSIFICATION METHODS FOR INTELLIGENT AUTOMATIC JETTISONING OF HELMET-MOUNTED DISPLAY SYSTEMS." University of Cincinnati / OhioLINK, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1185838368.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Bodin, Camilla. "Automatic Flight Maneuver Identification Using Machine Learning Methods." Thesis, Linköpings universitet, Reglerteknik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-165844.

Full text
Abstract:
This thesis proposes a general approach to solve the offline flight-maneuver identification problem using machine learning methods. The purpose of the study was to provide means for the aircraft professionals at the flight test and verification department of Saab Aeronautics to automate the procedure of analyzing flight test data. The suggested approach succeeded in generating binary classifiers and multiclass classifiers that identified six flight maneuvers of different complexity from real flight test data. The binary classifiers solved the problem of identifying one maneuver from flight test data at a time, while the multiclass classifiers solved the problem of identifying several maneuvers from flight test data simultaneously. To achieve these results, the difficulties that this time series classification problem entailed were simplified by using different strategies. One strategy was to develop a maneuver extraction algorithm that used handcrafted rules. Another strategy was to represent the time series data by statistical measures. There was also an issue of an imbalanced dataset, where one class far outweighed others in number of samples. This was solved by using a modified oversampling method on the dataset that was used for training. Logistic Regression, Support Vector Machines with both linear and nonlinear kernels, and Artificial Neural Networks were explored, where the hyperparameters for each machine learning algorithm were chosen during model estimation by 4-fold cross-validation and solving an optimization problem based on important performance metrics. A feature selection algorithm was also used during model estimation to evaluate how the performance changes depending on how many features were used. The machine learning models were then evaluated on test data consisting of 24 flight tests. The results given by the test data set showed that the simplifications done were reasonable, but the maneuver extraction algorithm could sometimes fail.
Some maneuvers were easier to identify than others and the linear machine learning models resulted in a poor fit to the more complex classes. In conclusion, both binary classifiers and multiclass classifiers could be used to solve the flight maneuver identification problem, and solving a hyperparameter optimization problem boosted the performance of the finalized models. Nonlinear classifiers performed the best on average across all explored maneuvers.
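The two simplification strategies described above, representing time-series windows by statistical measures and oversampling the minority classes, can be sketched as follows. This is a minimal illustration in Python, not the thesis's actual code; the naive random-duplication oversampler stands in for the modified oversampling method the author used, and the toy data shapes are assumptions.

```python
import numpy as np

def summarize_window(window):
    """Represent one time-series window (timesteps, channels) by statistics."""
    return np.concatenate([window.mean(axis=0), window.std(axis=0),
                           window.min(axis=0), window.max(axis=0)])

def oversample(X, y, rng):
    """Naive random oversampling: duplicate minority-class rows until balanced.
    (Stand-in for the modified oversampling method used in the thesis.)"""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    Xs, ys = [X], [y]
    for c, n in zip(classes, counts):
        idx = np.flatnonzero(y == c)
        extra = rng.choice(idx, size=target - n, replace=True)
        Xs.append(X[extra])
        ys.append(y[extra])
    return np.vstack(Xs), np.concatenate(ys)

rng = np.random.default_rng(0)
windows = rng.normal(size=(10, 50, 3))                # 10 windows, 50 steps, 3 signals
X = np.array([summarize_window(w) for w in windows])  # (10, 12) feature matrix
y = np.array([0] * 8 + [1] * 2)                       # imbalanced maneuver labels
Xb, yb = oversample(X, y, rng)
```

A more principled balancer (e.g. SMOTE-style interpolation between minority samples) would replace the duplication step in practice.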
APA, Harvard, Vancouver, ISO, and other styles
37

Lavrik, Ilya A. "Novel wavelet-based statistical methods with applications in classification, shrinkage, and nano-scale image analysis." Available online, Georgia Institute of Technology, 2006. http://etd.gatech.edu/theses/available/etd-11162005-131744/.

Full text
Abstract:
Thesis (Ph. D.)--Industrial and Systems Engineering, Georgia Institute of Technology, 2006.
Huo, Xiaoming, Committee Member; Heil, Chris, Committee Member; Wang, Yang, Committee Member; Hayter, Anthony, Committee Member; Vidakovic, Brani, Committee Chair.
APA, Harvard, Vancouver, ISO, and other styles
38

Moser, Paolo, Alexander Christian Vibrans, Ronald McRoberts, and Universidade Regional de Blumenau Programa de Pós-Graduação em Engenharia Ambiental. "Statistical and computational methods of forest attribute estimation and classification based on remotely sensed data." reponame:Biblioteca Digital de Teses e Dissertações FURB, 2018. http://www.bc.furb.br/docs/TE/2018/364705_1_1.pdf.

Full text
Abstract:
Orientador: Alexander Christian Vibrans.
Coorientador: Ronald McRoberts.
Tese (Doutorado em Engenharia Ambiental) - Programa de Pós-Graduação em Engenharia Ambiental, Centro de Ciências Tecnológicas, Universidade Regional de Blumenau, Blumenau.
APA, Harvard, Vancouver, ISO, and other styles
39

Gurol, Selime. "Statistical Learning And Optimization Methods For Improving The Efficiency In Landscape Image Clustering And Classification Problems." Master's thesis, METU, 2005. http://etd.lib.metu.edu.tr/upload/12606595/index.pdf.

Full text
Abstract:
Remote sensing techniques are vital for early detection of several problems such as natural disasters, ecological problems and collecting information necessary for finding optimum solutions to those problems. Remotely sensed information also has important uses in predicting future risks, urban planning, and communication. Recent developments in remote sensing instrumentation offered a challenge to the mathematical and statistical methods to process the acquired information. Classification of satellite images in the context of land cover classification is the main concern of this study. Land cover classification can be performed by statistical learning methods like additive models, decision trees, neural networks, and k-means methods, which are already popular in unsupervised classification and clustering of image scene inverse problems. Due to the degradation and corruption of satellite images, the classification performance is limited both by the accuracy of clustering and by the extent of the classification. In this study, we are concerned with understanding the performance of the available unsupervised methods with k-means and supervised methods with Gaussian maximum likelihood, which are very popular methods in land cover classification. A broader approach to the classification problem, based on finding the optimal discriminants from a larger range of functions, is also considered in this work. A novel method based on threshold decomposition and Boolean discriminant functions is developed as an implementable application of this approach. All methods are applied to BILSAT and Landsat satellite images using MATLAB software.
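As a minimal illustration of the unsupervised k-means method the abstract mentions for clustering pixel spectra, here is a Python sketch. The deterministic initialization on evenly spaced rows and the toy two-cover-type data are assumptions for illustration, not details from the thesis.

```python
import numpy as np

def kmeans(pixels, k, iters=20):
    """Minimal k-means for clustering pixel spectra of shape (n_pixels, n_bands)."""
    # deterministic init on evenly spaced rows; k-means++ is the usual refinement
    centroids = pixels[np.linspace(0, len(pixels) - 1, k, dtype=int)].copy()
    for _ in range(iters):
        # assign each pixel to the nearest centroid (Euclidean distance)
        d = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute centroids, keeping the old one if a cluster empties
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = pixels[labels == j].mean(axis=0)
    return labels, centroids

rng = np.random.default_rng(1)
water = rng.normal(0.0, 0.1, size=(30, 4))    # toy "spectra" for two cover types
forest = rng.normal(5.0, 0.1, size=(30, 4))
pixels = np.vstack([water, forest])
labels, centroids = kmeans(pixels, k=2)
```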
APA, Harvard, Vancouver, ISO, and other styles
40

Jianguo, Li. "Hybrid Methods for Acquisition of Lexical Information: the Case for Verbs." The Ohio State University, 2008. http://rave.ohiolink.edu/etdc/view?acc_num=osu1228259857.

Full text
APA, Harvard, Vancouver, ISO, and other styles
41

Chan, Oscar. "Prosodic features for a maximum entropy language model." University of Western Australia. School of Electrical, Electronic and Computer Engineering, 2008. http://theses.library.uwa.edu.au/adt-WU2008.0244.

Full text
Abstract:
A statistical language model attempts to characterise the patterns present in a natural language as a probability distribution defined over word sequences. Typically, they are trained using word co-occurrence statistics from a large sample of text. In some language modelling applications, such as automatic speech recognition (ASR), the availability of acoustic data provides an additional source of knowledge. This contains, amongst other things, the melodic and rhythmic aspects of speech referred to as prosody. Although prosody has been found to be an important factor in human speech recognition, its use in ASR has been limited. The goal of this research is to investigate how prosodic information can be employed to improve the language modelling component of a continuous speech recognition system. Because prosodic features are largely suprasegmental, operating over units larger than the phonetic segment, the language model is an appropriate place to incorporate such information. The prosodic features and standard language model features are combined under the maximum entropy framework, which provides an elegant solution to modelling information obtained from multiple, differing knowledge sources. We derive features for the model based on perceptually transcribed Tones and Break Indices (ToBI) labels, and analyse their contribution to the word recognition task. While ToBI has a solid foundation in linguistic theory, the need for human transcribers conflicts with the statistical model's requirement for a large quantity of training data. We therefore also examine the applicability of features which can be automatically extracted from the speech signal. We develop representations of an utterance's prosodic context using fundamental frequency, energy and duration features, which can be directly incorporated into the model without the need for manual labelling. 
Dimensionality reduction techniques are also explored with the aim of reducing the computational costs associated with training a maximum entropy model. Experiments on a prosodically transcribed corpus show that small but statistically significant reductions to perplexity and word error rates can be obtained by using both manually transcribed and automatically extracted features.
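The maximum entropy framework described above is a log-linear model: p(w | h) = exp(sum_i lambda_i f_i(h, w)) / Z(h), where the features f_i may mix standard n-gram indicators with prosodic features under a single weight vector. A minimal Python sketch of the scoring step follows; the weight values and feature activations are hypothetical, and the training of the weights is omitted.

```python
import numpy as np

def maxent_probs(weights, features):
    """Log-linear (maximum entropy) distribution over candidate next words:
    p(w | h) = exp(sum_i lambda_i * f_i(h, w)) / Z(h).
    features: (n_candidates, n_features) activations f_i(h, w), which may mix
    n-gram indicator features with prosodic features under one weight vector."""
    scores = features @ weights     # log-linear score for each candidate word
    scores -= scores.max()          # subtract the max for numerical stability
    p = np.exp(scores)
    return p / p.sum()              # division by Z(h) normalizes to a distribution

# hypothetical example: 3 candidate words, 2 features (one lexical, one prosodic)
weights = np.array([2.0, 0.5])
features = np.array([[1.0, 0.0],
                     [0.0, 1.0],
                     [1.0, 1.0]])
p = maxent_probs(weights, features)
```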
APA, Harvard, Vancouver, ISO, and other styles
42

Challa, Akkireddy. "Automatic Handwritten Digit Recognition On Document Images Using Machine Learning Methods." Thesis, Blekinge Tekniska Högskola, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-17656.

Full text
Abstract:
Context: The main purpose of this thesis is to build an automatic handwritten digit recognition method for the recognition of connected handwritten digit strings. To accomplish the recognition task, the digits were first segmented into individual digits. Then, a digit recognition module is employed to classify each segmented digit, completing the handwritten digit string recognition task. In this study, different machine learning methods, namely SVM, ANN and CNN architectures, are used to achieve high performance on the digit string recognition problem. In these methods, the models are trained on images of digit strings, using HOG feature vectors and deep learning structures, by sliding a fixed-size window through the images and labeling each sub-image as part of a digit or not. Each segmented digit is then classified to achieve the complete recognition of the handwritten digits. Objective: The main purpose of this thesis is to find out the recognition performance of the methods. In order to analyze the performance of the methods, data needs to be used for training with machine learning methods. Then digit data is tested on the desired machine learning technique. In this thesis, the following methods are performed: implementation of the HOG feature extraction method with SVM; implementation of the HOG feature extraction method with ANN; implementation of deep learning methods with CNN. Methods: This research is carried out using two methods. The first research method is the literature review and the second the experiment. Initially, a literature review is conducted to get a clear knowledge of the algorithms and techniques which will be used to answer the first research question, i.e., to know which type of data is required for the machine learning methods, and the data analysis is performed. Later on, with the knowledge of RQ1, experimentation is conducted to answer RQ2, RQ3 and RQ4.
Quantitative data is used to perform the experimentation because qualitative data, which is obtained from case studies and surveys, cannot be used for this experiment method as it contains non-numerical data. In this research, an experiment is conducted to find the best suitable machine learning method from the existing methods. As mentioned above in the objectives, an experiment is conducted using SVM, ANN, and CNN. By considering the results obtained from the experiment, a comparison is made on the metrics considered, which results in CNN as the best method suitable for document images. Results: The results for SVM and ANN with HOG feature extraction and for the CNN method are compared using the segmented results. Based on the experiment results it is found that SVM and ANN have some drawbacks, like low accuracy and low performance in the recognition of document images. The other method, CNN, has greater performance with high accuracy. The following are the recognition rates of each method: SVM performance 39%; ANN performance 37%; CNN performance 71%. Conclusion: This research concentrates on providing an efficient method for the automatic recognition of handwritten digits. Here a sample training dataset is treated with existing machine learning and deep learning methods, namely SVM, ANN, and CNN. The results obtained from the experimentation clearly show that the CNN method is much more efficient, with 71% performance, when compared to the ANN and SVM methods. Keywords: Handwritten Digit Recognition, Handwritten Digit Segmentation, Handwritten Digit Classification, Machine Learning Methods, Deep Learning, Image Processing on Document Images, Support Vector Machine, Convolutional Neural Networks, Artificial Neural Networks
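As a hedged illustration of the HOG features the thesis pairs with its SVM and ANN classifiers, here is a simplified descriptor in Python. Full HOG (Dalal and Triggs) adds overlapping block normalization, and the cell size and toy image here are arbitrary choices, not the thesis's configuration.

```python
import numpy as np

def hog_features(img, n_bins=9, cell=4):
    """Simplified HOG: gradient-orientation histograms over non-overlapping
    cells, L2-normalized. (Full HOG adds overlapping block normalization.)"""
    gy, gx = np.gradient(img.astype(float))           # row and column gradients
    mag = np.hypot(gx, gy)                            # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0      # unsigned orientation
    h, w = img.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            a = ang[i:i + cell, j:j + cell].ravel()
            m = mag[i:i + cell, j:j + cell].ravel()
            # magnitude-weighted orientation histogram for this cell
            hist, _ = np.histogram(a, bins=n_bins, range=(0, 180), weights=m)
            feats.append(hist)
    v = np.concatenate(feats)
    return v / (np.linalg.norm(v) + 1e-9)

# toy 16x16 "digit" image with a vertical edge
img = np.zeros((16, 16))
img[:, 8:] = 1.0
desc = hog_features(img)    # 4x4 cells x 9 bins -> 144-dimensional descriptor
```

An SVM or ANN would then be trained on these fixed-length descriptors, one per sliding window.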
APA, Harvard, Vancouver, ISO, and other styles
43

Azarbarzin, Ali. "Snoring sounds analysis: automatic detection, higher order statistics, and its application for sleep apnea diagnosis." IEEE, 2011. http://hdl.handle.net/1993/9593.

Full text
Abstract:
Snoring is a highly prevalent disorder affecting 20-40% of the adult population. Snoring is also a major indicator of obstructive sleep apnea (OSA). Despite considerable effort, the acoustical properties of snoring in relation to physiological states are not yet known. This thesis explores statistical properties of snoring sounds and their association with OSA. First, an unsupervised technique was developed to automatically extract the snoring sound segments from the lengthy recordings of respiratory sounds. This technique was tested on 5665 snoring sound segments of 30 participants and a detection accuracy of 98.6% was obtained. Second, the relationship between anthropometric parameters of snorers with different degrees of obstruction and their snoring sounds’ statistical characteristics was investigated. Snoring sounds are non-Gaussian in nature; thus second order statistical methods such as power spectral analysis would be inadequate to extract information from snoring sounds. Therefore, higher order statistical features, in addition to the second order ones, were extracted. Third, the variability of snoring sound segments within and between 57 snorers with and without OSA was investigated. It was found that the sound characteristics of non-apneic (when there is no apneic event), hypopneic (when there is hypopnea), and post-apneic (after apnea) snoring events were significantly different. Then, this variability of snoring sounds was used as a signature to discriminate the non-OSA snorers from OSA snorers. The accuracy was found to be 96.4%. Finally, it was observed that some snorers formed distinct clusters of snoring sounds in a multidimensional feature space. Hence, using polysomnography (PSG) information, the dependency of snoring sounds on body position, sleep stage, and blood oxygen level was investigated. It was found that all three variables affected snoring sounds.
However, body position was found to have the highest effect on the characteristics of snoring sounds. In conclusion, snoring sound analysis offers valuable information on the upper airway physiological state and pathology. Thus, snoring sound analysis may further find its use in determining the exact state and location of obstruction.
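The higher order statistical features the abstract contrasts with second order power spectral analysis can be as simple as standardized third and fourth moments of a segment. A minimal Python sketch follows; the thesis's actual feature set is richer, so treat this as an illustrative subset.

```python
import numpy as np

def hos_features(segment):
    """Second plus higher order summary statistics of one sound segment:
    variance (2nd order), skewness (3rd order), excess kurtosis (4th order)."""
    x = np.asarray(segment, dtype=float)
    mu = x.mean()
    sigma = x.std()
    z = (x - mu) / sigma
    return {"variance": sigma ** 2,
            "skewness": float(np.mean(z ** 3)),        # 0 for symmetric data
            "kurtosis": float(np.mean(z ** 4)) - 3.0}  # 0 for Gaussian data

# tiny deterministic example segment
f = hos_features([1.0, 2.0, 3.0, 4.0, 5.0])
```

Nonzero skewness or excess kurtosis flags exactly the non-Gaussian structure that a power spectrum alone would miss.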
APA, Harvard, Vancouver, ISO, and other styles
44

Brookey, Carla M. "Application of Machine Learning and Statistical Learning Methods for Prediction in a Large-Scale Vegetation Map." DigitalCommons@USU, 2017. https://digitalcommons.usu.edu/etd/6962.

Full text
Abstract:
Original analyses of a large vegetation cover dataset from Roosevelt National Forest in northern Colorado were carried out by Blackard (1998) and Blackard and Dean (1998; 2000). They compared the classification accuracies of linear and quadratic discriminant analysis (LDA and QDA) with artificial neural networks (ANN) and obtained an overall classification accuracy of 70.58% for a tuned ANN compared to 58.38% for LDA and 52.76% for QDA. Because there has been tremendous development of machine learning classification methods over the last 35 years in both computer science and statistics, as well as substantial improvements in the speed of computer hardware, I applied five modern machine learning algorithms to the data to determine whether significant improvements in the classification accuracy were possible using one or more of these methods. I found that only a tuned gradient boosting machine had a higher accuracy (71.62%) than the ANN of Blackard and Dean (1998), and the difference in accuracies was only about 1%. Of the other four methods, Random Forests (RF), Support Vector Machines (SVM), Classification Trees (CT), and AdaBoosted trees (ADA), a tuned SVM and RF had accuracies of 67.17% and 67.57%, respectively. The partition of the data by Blackard and Dean (1998) was unusual in that the training and validation datasets had equal representation of the seven vegetation classes, even though 85% of the data fell into classes 1 and 2. For the second part of my analyses I randomly selected 60% of the data for the training data and 20% for each of the validation data and test data. On this partition of the data a single classification tree achieved an accuracy of 92.63% on the test data and the accuracy of RF was 83.98%. Unsurprisingly, most of the gains in accuracy were in classes 1 and 2, the largest classes, which also had the highest misclassification rates under the original partition of the data.
By decreasing the size of the training data but maintaining the same relative occurrences of the vegetation classes as in the full dataset, I found that even for a training dataset of the same size as that of Blackard and Dean (1998) a single classification tree was more accurate (73.80%) than the ANN of Blackard and Dean (1998) (70.58%). The final part of my thesis was to explore the possibility that combining several of the machine learning classifiers' predictions could result in higher predictive accuracies. In the analyses I carried out, the answer seems to be that increased accuracies do not occur with a simple voting of five machine learning classifiers.
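The simple voting of classifiers examined in the final part of the thesis could be sketched as follows. This is an illustrative Python fragment with invented predictions, not the author's code.

```python
import numpy as np

def majority_vote(predictions):
    """Combine hard class predictions from several classifiers by voting.
    predictions: (n_classifiers, n_samples) integer labels; ties go to the
    lowest class label via argmax."""
    preds = np.asarray(predictions)
    n_classes = preds.max() + 1
    # per-sample vote counts, shape (n_classes, n_samples)
    votes = np.apply_along_axis(np.bincount, 0, preds, minlength=n_classes)
    return votes.argmax(axis=0)

# hypothetical predictions from three classifiers on three samples
combined = majority_vote([[0, 1, 2],
                          [0, 1, 1],
                          [1, 1, 2]])
```

Weighted voting or stacking (training a meta-classifier on the base predictions) are the usual refinements when plain voting fails to help, as it did here.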
APA, Harvard, Vancouver, ISO, and other styles
45

Gerber, Egardt. "The use of classification methods for gross error detection in process data." Thesis, Stellenbosch : Stellenbosch University, 2013. http://hdl.handle.net/10019.1/85856.

Full text
Abstract:
Thesis (MScEng)-- Stellenbosch University, 2013.
ENGLISH ABSTRACT: All process measurements contain some element of error. Typically, a distinction is made between random errors, with zero expected value, and gross errors with non-zero magnitude. Data Reconciliation (DR) and Gross Error Detection (GED) comprise a collection of techniques designed to attenuate measurement errors in process data in order to reduce the effect of the errors on subsequent use of the data. DR proceeds by finding the optimum adjustments so that reconciled measurement data satisfy imposed process constraints, such as material and energy balances. The DR solution is optimal under the assumed statistical random error model, typically Gaussian with zero mean and known covariance. The presence of outliers and gross errors in the measurements or imposed process constraints invalidates the assumptions underlying DR, so that the DR solution may become biased. GED is required to detect, identify and remove or otherwise compensate for the gross errors. Typically GED relies on formal hypothesis testing of constraint residuals or measurement adjustment-based statistics derived from the assumed random error statistical model. Classification methodologies are methods by which observations are classified as belonging to one of several possible groups. For the GED problem, artificial neural networks (ANN’s) have been applied historically to resolve the classification of a data set as either containing or not containing a gross error. The hypothesis investigated in this thesis is that classification methodologies, specifically classification trees (CT) and linear or quadratic classification functions (LCF, QCF), may provide an alternative to the classical GED techniques. This hypothesis is tested via the modelling of a simple steady-state process unit with associated simulated process measurements. DR is performed on the simulated process measurements in order to satisfy one linear and two nonlinear material conservation constraints. 
Selected features from the DR procedure and process constraints are incorporated into two separate input vectors for classifier construction. The performance of the classification methodologies developed on each input vector is compared with the classical measurement test in order to address the posed hypothesis. General trends in the results are as follows: - The power to detect and/or identify a gross error is a strong function of the gross error magnitude as well as location for all the classification methodologies as well as the measurement test. - For some locations there exist large differences between the power to detect a gross error and the power to identify it correctly. This is consistent over all the classifiers and their associated measurement tests, and indicates significant smearing of gross errors. - In general, the classification methodologies have higher power for equivalent type I error than the measurement test. - The measurement test is superior for small magnitude gross errors, and for specific locations, depending on which classification methodology it is compared with. There is significant scope to extend the work to more complex processes and constraints, including dynamic processes with multiple gross errors in the system. Further investigation into the optimal selection of input vector elements for the classification methodologies is also required.
AFRIKAANSE OPSOMMING: All process measurements contain some degree of measurement error. The error component of a process measurement is often expressed as consisting of a random error with zero expected value, together with a non-random error of significant magnitude. Data Reconciliation (DR) and Gross Error Detection (GED) are a collection of techniques whose goal is to reduce the effect of such errors in process data on the subsequent use of the data. DR is performed by making the optimal adjustments to the original process measurements so that the adjusted measurements obey certain process models, typically mass and energy balances. The DR solution is optimal provided the statistical assumptions about the random error component in the process data are valid. It is typically assumed that the error component is normally distributed, with zero expected value and a given covariance matrix. When non-random errors are present in the data, the results of DR can be biased. GED is therefore needed to find (Detection) and identify (Identification) non-random errors. GED usually relies on the statistical properties of the measurement adjustments made by the DR procedure, or on the residuals of the model equations, to test formal hypotheses about the presence of non-random errors. Classification techniques are used to determine the class membership of observations. For the GED problem, artificial neural networks (ANNs) have historically been applied to solve the Detection and Identification problems. The hypothesis of this thesis is that classification techniques, specifically classification trees (CT) and linear as well as quadratic classification functions (LCF and QCF), can be successfully applied to solve the GED problem. The hypothesis is investigated by means of a simulation around a simple steady-state process unit subject to one linear and two nonlinear equations.
Artificial process measurements are generated using random numbers so that the error component of each process measurement is known. DR is applied to the artificial data, and the DR results are used to create two different input vectors for the classification techniques. The performance of the classification methods is compared with the measurement test of classical GED in order to answer the stated hypothesis. The underlying trends in the results are as follows: - The ability to detect and identify a non-random error is strongly dependent on the magnitude as well as the location of the error, for all the classification techniques as well as the measurement test. - For certain locations of the non-random error there is a large difference between the ability to detect the error and the ability to identify it, which indicates smearing of the error. All the classification techniques as well as the measurement test exhibit this property. - In general, the classification methods show greater success than the measurement test. - The measurement test is more successful for relatively small non-random errors, as well as for certain locations of the non-random error, depending on the classification technique in question. There are several ways to extend the scope of this investigation. More complex, non-steady-state processes with strongly nonlinear process models and multiple non-random errors can be investigated. The possibility also exists to improve the performance of the classification methods through the appropriate choice of input vector elements.
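The DR step described in the abstract, for the special case of unit measurement variances and linear constraints, has a closed-form least-squares solution. A minimal Python sketch follows; the splitter mass balance and the flow values are invented for illustration, not taken from the thesis's simulated process unit.

```python
import numpy as np

def reconcile(y, A, b):
    """Least-squares data reconciliation with linear constraints A x = b:
    minimize ||x - y||^2 subject to A x = b (unit variances assumed).
    Closed form: x = y - A^T (A A^T)^{-1} (A y - b)."""
    r = A @ y - b                                # constraint residuals of raw data
    return y - A.T @ np.linalg.solve(A @ A.T, r)

# invented example: splitter mass balance F1 = F2 + F3
A = np.array([[1.0, -1.0, -1.0]])
b = np.array([0.0])
y = np.array([10.3, 6.0, 4.0])                   # raw measurements violate the balance
x = reconcile(y, A, b)                           # reconciled flows satisfy A x = b
```

A gross error in one measurement would show up as an unusually large adjustment x - y, which is what the measurement test and the classifiers discussed above try to detect.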
APA, Harvard, Vancouver, ISO, and other styles
46

Ohrnberger, Matthias. "Continuous automatic classification of seismic signals of volcanic origin at Mt. Merapi, Java, Indonesia." Phd thesis, [S.l. : s.n.], 2001. http://pub.ub.uni-potsdam.de/2001/0016/ohrnberg.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
47

Wu, Tsung-Lin. "Classification models for disease diagnosis and outcome analysis." Diss., Georgia Institute of Technology, 2011. http://hdl.handle.net/1853/44918.

Full text
Abstract:
In this dissertation we study the feature selection and classification problems and apply our methods to real-world medical and biological data sets for disease diagnosis. Classification is an important problem in disease diagnosis to distinguish patients from the normal population. DAMIP (discriminant analysis -- mixed integer program) was shown to be a good classification model, which can directly handle multigroup problems, enforce misclassification limits, and provide a reserved judgement region. However, DAMIP is NP-hard and presents computational challenges. Feature selection is important in classification to improve the prediction performance, prevent over-fitting, or facilitate data understanding. However, this combinatorial problem becomes intractable when the number of features is large. In this dissertation, we propose a modified particle swarm optimization (PSO), a heuristic method, to solve the feature selection problem, and we study its parameter selection in our applications. We derive theories and exact algorithms to solve the two-group DAMIP in polynomial time. We also propose a heuristic algorithm to solve the multigroup DAMIP. Computational studies on simulated data and data from the UCI machine learning repository show that the proposed algorithm performs very well. The polynomial solution time of the heuristic method allows us to solve DAMIP repeatedly within the feature selection procedure. We apply the PSO/DAMIP classification framework to several real-life medical and biological prediction problems. (1) Alzheimer's disease: We use data from several neuropsychological tests to discriminate subjects of Alzheimer's disease, subjects of mild cognitive impairment, and control groups.
(2) Cardiovascular disease: We use traditional risk factors and novel oxidative stress biomarkers to predict subjects who are at high or low risk of cardiovascular disease, in which the risk is measured by the thickness of the carotid intima-media and/or the flow-mediated vasodilation. (3) Sulfur amino acid (SAA) intake: We use 1H NMR spectral data of human plasma to classify plasma samples obtained with low SAA intake or high SAA intake. This shows that our method helps for metabolomics study. (4) CpG islands for lung cancer: We identify a large number of sequence patterns (in the order of millions), search candidate patterns from DNA sequences in CpG islands, and look for patterns which can discriminate methylation-prone and methylation-resistant (or in addition, methylation-sporadic) sequences, which relate to early lung cancer prediction.
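For illustration, a standard continuous PSO is sketched below in Python. The dissertation uses a modified PSO adapted to binary feature selection, so the continuous formulation, the parameter values, and the sphere test function here are all assumptions for the sake of a runnable example.

```python
import numpy as np

def pso(f, dim, n_particles=20, iters=200, seed=0):
    """Minimal continuous particle swarm optimization (minimization)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, size=(n_particles, dim))  # particle positions
    v = np.zeros_like(x)                                 # particle velocities
    pbest = x.copy()                                     # personal best positions
    pbest_val = np.array([f(p) for p in x])
    gbest = pbest[pbest_val.argmin()].copy()             # global best position
    w, c1, c2 = 0.7, 1.5, 1.5                            # inertia / cognitive / social
    for _ in range(iters):
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        vals = np.array([f(p) for p in x])
        better = vals < pbest_val                        # update personal bests
        pbest[better] = x[better]
        pbest_val[better] = vals[better]
        gbest = pbest[pbest_val.argmin()].copy()         # update global best
    return gbest, float(pbest_val.min())

best_x, best_val = pso(lambda z: float(np.sum(z ** 2)), dim=2)
```

A binary variant for feature selection would pass each position through a sigmoid and threshold it into a 0/1 feature mask before evaluating the classifier.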
APA, Harvard, Vancouver, ISO, and other styles
48

Bankefors, Johan. "Structural classification of Quillaja saponins by electrospray ionisation ion trap multiple-stage mass spectrometry in combination with multivariate analysis /." Uppsala : Department of Chemistry, Swedish University of Agricultural Sciences, 2006. http://epsilon.slu.se/10284550.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
49

Fairley, Jacqueline Antoinette. "Statistical modeling of the human sleep process via physiological recordings." Diss., Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/33912.

Full text
Abstract:
The main objective of this work was the development of a computer-based Expert Sleep Analysis Methodology (ESAM) to aid sleep care physicians in the diagnosis of pre-Parkinson's disease symptoms using polysomnogram data. ESAM is significant because it streamlines the analysis of the human sleep cycles and aids the physician in the identification, treatment, and prediction of sleep disorders. In this work four aspects of computer-based human sleep analysis were investigated: polysomnogram interpretation, pre-processing, sleep event classification, and abnormal sleep detection. A review of previous developments in these four areas is provided along with their relationship to the establishment of ESAM. Polysomnogram interpretation focuses on the ambiguities found in human polysomnogram analysis when using the rule based 1968 sleep staging manual edited by Rechtschaffen and Kales (R&K). ESAM is presented as an alternative to the R&K approach in human polysomnogram interpretation. The second area, pre-processing, addresses artifact processing techniques for human polysomnograms. Sleep event classification, the third area, discusses feature selection, classification, and human sleep modeling approaches. Lastly, abnormal sleep detection focuses on polysomnogram characteristics common to patients suffering from Parkinson's disease. The technical approach in this work utilized polysomnograms of control subjects and pre-Parkinsonian disease patients obtained from the Emory Clinic Sleep Disorders Center (ECSDC) as inputs into ESAM. The engineering tools employed during the development of ESAM included the Generalized Singular Value Decomposition (GSVD) algorithm, sequential forward and backward feature selection algorithms, Particle Swarm Optimization algorithm, k-Nearest Neighbor classification, and Gaussian Observation Hidden Markov Modeling (GOHMM). 
In this study polysomnogram data was preprocessed for artifact removal and compensation using band-pass filtering and the GSVD algorithm. Optimal features for characterization of polysomnogram data of control subjects and pre-Parkinsonian disease patients were obtained using the sequential forward and backward feature selection algorithms, Particle Swarm Optimization, and k-Nearest Neighbor classification. ESAM output included GOHMMs constructed for both control subjects and pre-Parkinsonian disease patients. Furthermore, performance evaluation techniques were implemented to make conclusions regarding the constructed GOHMM's reflection of the underlying nature of the human sleep cycle.
APA, Harvard, Vancouver, ISO, and other styles
50

Tiwari, Ayush. "Comparison of Statistical Signal Processing and Machine Learning Algorithms as Applied to Cognitive Radios." University of Toledo / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1533218513862248.

Full text
APA, Harvard, Vancouver, ISO, and other styles
