Dissertations / Theses on the topic 'Sparse features'




Consult the top 50 dissertations / theses for your research on the topic 'Sparse features.'

Next to every source in the list of references there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Strohmann, Thomas. "Very sparse kernel models: Predicting with few examples and few features." Diss., Connect to online resource, 2006. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3239405.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Radwan, Noha [Verfasser], and Wolfram [Akademischer Betreuer] Burgard. "Leveraging sparse and dense features for reliable state estimation in urban environments." Freiburg : Universität, 2019. http://d-nb.info/1190031361/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
3

Hata, Alberto Yukinobu. "Road features detection and sparse map-based vehicle localization in urban environments." Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-08062017-090428/.

Full text
Abstract:
Localization is one of the fundamental components of autonomous vehicles, enabling tasks such as overtaking, lane keeping and self-navigation. Urban canyons and bad weather interfere with the reception of GPS satellite signals, which prohibits the exclusive use of such technology for vehicle localization in urban areas. Alternatively, map-aided localization methods have been employed to enable position estimation without depending on GPS devices. In this solution, the vehicle position is given as the place that best matches the sensor measurement to the environment map. Before building the maps, features of the environment must be extracted from sensor measurements. In vehicle localization, curbs and road markings have been extensively employed as mapping features. However, most urban mapping methods rely on a street free of obstacles or require repetitive measurements of the same place to avoid occlusions. The construction of an accurate representation of the environment is necessary for a proper match of sensor measurements to the map during localization. To avoid the need for a manual process to remove occluding obstacles and unobserved areas, a vehicle localization method that supports maps built from partial observations of the environment is proposed. In this localization system, maps are formed by curbs and road markings extracted from multilayer laser sensor measurements. Curb structures are detected even in the presence of vehicles that occlude the roadsides, thanks to the use of robust regression. The road marking detector employs Otsu thresholding to analyze infrared remittance data, which makes the method insensitive to illumination. Detected road features are stored in two map representations: the occupancy grid map (OGM) and the Gaussian process occupancy map (GPOM). The first approach is a popular map structure that represents the environment through fine-grained grids. The second approach is a continuous representation that can estimate the occupancy of unseen areas. The Monte Carlo localization (MCL) method was adapted to support the obtained maps of the urban environment. In this sense, vehicle localization was tested in an MCL that supports OGM and an MCL that supports GPOM. Specifically, for the MCL based on GPOM, a new measurement likelihood based on the multivariate normal probability density function is formulated. Experiments were performed in real urban environments. Maps were built using sparse laser data to verify the reconstruction of non-observed areas. The localization system was evaluated by comparing the results with a high-precision GPS device. Results were also compared with localization based on OGM.
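As a rough illustration of the GPOM-based measurement model described above (a sketch, not the thesis implementation): a particle's weight can be computed as the multivariate normal density of the measured occupancy values given the map's predictive mean and variance at the queried cells. The function name and array interface below are assumptions.

```python
import numpy as np
from scipy.stats import multivariate_normal

def particle_weight(z, mu, var):
    """Weight one MCL particle: multivariate normal likelihood of the measured
    occupancy vector z given the GPOM predictive mean mu and variance var at
    the map cells hit by the current scan (hypothetical interface)."""
    return multivariate_normal.pdf(z, mean=mu, cov=np.diag(var))

# Toy example with four queried map cells.
z = np.array([0.9, 0.1, 0.8, 0.0])        # measured occupancy
mu = np.array([0.85, 0.05, 0.7, 0.1])     # GPOM predictive mean
var = np.array([0.02, 0.01, 0.05, 0.03])  # GPOM predictive variance
print(particle_weight(z, mu, var))
```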
APA, Harvard, Vancouver, ISO, and other styles
4

Pundlik, Shrinivas J. "Motion segmentation from clustering of sparse point features using spatially constrained mixture models." Connect to this title online, 2009. http://etd.lib.clemson.edu/documents/1252937182/.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Quadros, Alistair James. "Representing 3D shape in sparse range images for urban object classification." Thesis, The University of Sydney, 2013. http://hdl.handle.net/2123/10515.

Full text
Abstract:
This thesis develops techniques for interpreting 3D range images acquired in outdoor environments at a low resolution. It focuses on the task of robustly capturing the shapes that comprise objects, in order to classify them. With the recent development of 3D sensors such as the Velodyne, it is now possible to capture range images at video frame rates, allowing mobile robots to observe dynamic scenes in 3D. To classify objects in these scenes, features are extracted from the data, which allows different regions to be matched. However, range images acquired at this speed are of low resolution, and there are often significant changes in sensor viewpoint and occlusion. In this context, existing methods for feature extraction do not perform well. This thesis contributes algorithms for the robust abstraction from 3D points to object classes. Efficient region-of-interest and surface normal extraction are evaluated, resulting in a keypoint algorithm that provides stable orientations. These build towards a novel feature, called the 'line image', that is designed to consistently capture local shape, regardless of sensor viewpoint. It does this by explicitly reasoning about the difference between known empty space and space that has not been measured due to occlusion or sparse sensing. A dataset of urban objects scanned with a Velodyne was collected and hand-labelled, in order to compare this feature with several others on the task of classification. First, a simple k-nearest-neighbours approach was used, where the line image showed improvements. Second, more complex classifiers were applied, requiring the features to be clustered. The clusters were used in topic modelling, allowing specific sub-parts of objects to be learnt across multiple scales, improving accuracy by 10%. This work is applicable to any range image data. In general, it demonstrates the advantages of using the inherent density and occupancy information in a range image during 3D point cloud processing.
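The core idea of the line image described above can be sketched as follows: each ray through the local volume is labelled known-empty up to the laser return, occupied at the return, and unknown behind it. The numeric encoding below is an assumption for illustration only.

```python
import numpy as np

EMPTY, OCCUPIED, UNKNOWN = 0.0, 1.0, -1.0  # assumed encoding

def line_values(n_cells, hit_index):
    """Label the cells along one ray: known-empty before the laser return,
    occupied at the return, unknown behind it (occluded or never sensed)."""
    values = np.full(n_cells, UNKNOWN)
    values[:hit_index] = EMPTY
    values[hit_index] = OCCUPIED
    return values

print(line_values(8, 3))  # three empty cells, the return, then unknown cells
```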
APA, Harvard, Vancouver, ISO, and other styles
6

Mairal, Julien. "Sparse coding for machine learning, image processing and computer vision." PhD thesis, École normale supérieure de Cachan - ENS Cachan, 2010. http://tel.archives-ouvertes.fr/tel-00595312.

Full text
Abstract:
In this thesis we study a particular machine learning approach for representing signals, which consists of modelling data as linear combinations of a few elements from a learned dictionary. It can be viewed as an extension of the classical wavelet framework, whose goal is to design such dictionaries (often orthonormal bases) that are adapted to natural signals. An important success of dictionary learning methods has been their ability to model natural image patches and the performance of the image denoising algorithms they have yielded. We address several open questions related to this framework: How to efficiently optimize the dictionary? How can the model be enriched by adding a structure to the dictionary? Can current image processing tools based on this method be further improved? How should one learn the dictionary when it is used for a different task than signal reconstruction? How can it be used for solving computer vision problems? We answer these questions with a multidisciplinary approach, using tools from statistical machine learning, convex and stochastic optimization, image and signal processing, computer vision, but also optimization on graphs.
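For readers unfamiliar with the model, a minimal dictionary-learning example in scikit-learn is shown below. This is a generic sketch of the "sparse linear combination of learned atoms" idea, not the optimization methods developed in the thesis; the data and parameter values are placeholders.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 64))   # e.g. 8x8 image patches, flattened

# Learn an overcomplete dictionary; each patch is then approximated by a
# sparse linear combination of atoms (alpha controls the sparsity level).
dl = DictionaryLearning(n_components=100, alpha=1.0, max_iter=20,
                        random_state=0)
codes = dl.fit_transform(X)          # sparse coefficients, mostly zeros
D = dl.components_                   # learned dictionary atoms
recon = codes @ D                    # approximate reconstruction of X
```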
APA, Harvard, Vancouver, ISO, and other styles
7

Abbasnejad, Iman. "Learning spatio-temporal features for efficient event detection." Thesis, Queensland University of Technology, 2018. https://eprints.qut.edu.au/121184/1/Iman_Abbasnejad_Thesis.pdf.

Full text
Abstract:
This thesis has addressed the topic of event detection in videos, which is a challenging problem as the events to be detected can be complex, correlated, and may require the detection of different objects and human actions. To address these challenges, the thesis has developed effective strategies for learning the spatio-temporal features of events. Improved event detection performance has been demonstrated on several challenging real-world databases. The outcome of our research will be useful for a number of applications, including human computer interaction, robotics and video surveillance.
APA, Harvard, Vancouver, ISO, and other styles
8

Lakemond, Ruan. "Multiple camera management using wide baseline matching." Thesis, Queensland University of Technology, 2010. https://eprints.qut.edu.au/37668/1/Ruan_Lakemond_Thesis.pdf.

Full text
Abstract:
Camera calibration information is required in order for multiple camera networks to deliver more than the sum of many single camera systems. Methods exist for manually calibrating cameras with high accuracy. Manually calibrating networks with many cameras is, however, time consuming, expensive and impractical for networks that undergo frequent change. For this reason, automatic calibration techniques have been vigorously researched in recent years. Fully automatic calibration methods depend on the ability to automatically find point correspondences between overlapping views. In typical camera networks, cameras are placed far apart to maximise coverage. This is referred to as a wide baseline scenario. Finding sufficient correspondences for camera calibration in wide baseline scenarios presents a significant challenge. This thesis focuses on developing more effective and efficient techniques for finding correspondences in uncalibrated, wide baseline, multiple-camera scenarios. The project consists of two major areas of work. The first is the development of more effective and efficient view covariant local feature extractors. The second involves finding methods to extract scene information using the information contained in a limited set of matched affine features. Several novel affine adaptation techniques for salient features have been developed. A method is presented for efficiently computing the discrete scale space primal sketch of local image features. A scale selection method was implemented that makes use of the primal sketch. The primal sketch-based scale selection method has several advantages over existing methods: it allows greater freedom in how the scale space is sampled, enables more accurate scale selection, is more effective at combining different functions for spatial position and scale selection, and leads to greater computational efficiency. Existing affine adaptation methods make use of the second moment matrix to estimate the local affine shape of image features. In this thesis, it is shown that the Hessian matrix can be used in a similar way to estimate local feature shape. The Hessian matrix is effective for estimating the shape of blob-like structures, but is less effective for corner structures. It is simpler to compute than the second moment matrix, leading to a significant reduction in computational cost. A wide baseline dense correspondence extraction system, called WiDense, is presented in this thesis. It allows the extraction of large numbers of additional accurate correspondences, given only a few initial putative correspondences. It consists of the following algorithms: an affine region alignment algorithm that ensures accurate alignment between matched features; a method for extracting more matches in the vicinity of a matched pair of affine features, using the alignment information contained in the match; and an algorithm for extracting large numbers of highly accurate point correspondences from an aligned pair of feature regions. Experiments show that the correspondences generated by the WiDense system improve the success rate of computing the epipolar geometry of very widely separated views. This new method is successful in many cases where the features produced by the best wide baseline matching algorithms are insufficient for computing the scene geometry.
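A minimal sketch of the Hessian-based shape estimation mentioned above, assuming a grayscale patch stored as a NumPy array (illustrative only, not the thesis code): the eigen-decomposition of the Hessian at the patch centre gives the axes of a local elliptical shape, analogous to second-moment-matrix affine adaptation.

```python
import numpy as np

def hessian_shape(patch):
    """Estimate a local elliptical shape from the Hessian of image intensity,
    in the spirit of second-moment-matrix affine adaptation (sketch only)."""
    gy, gx = np.gradient(patch.astype(float))
    gyy, gyx = np.gradient(gy)
    gxy, gxx = np.gradient(gx)
    c = (patch.shape[0] // 2, patch.shape[1] // 2)
    H = np.array([[gxx[c], gxy[c]], [gyx[c], gyy[c]]])
    evals, evecs = np.linalg.eigh((H + H.T) / 2)  # axes of the local ellipse
    return evals, evecs

# Toy Gaussian blob: the Hessian is well suited to blob-like structures.
patch = np.fromfunction(
    lambda i, j: np.exp(-((i - 8) ** 2 + (j - 8) ** 2) / 10.0), (17, 17))
evals, evecs = hessian_shape(patch)
```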
APA, Harvard, Vancouver, ISO, and other styles
9

Umakanthan, Sabanadesan. "Human action recognition from video sequences." Thesis, Queensland University of Technology, 2016. https://eprints.qut.edu.au/93749/1/Sabanadesan_Umakanthan_Thesis.pdf.

Full text
Abstract:
This PhD research has proposed new machine learning techniques to improve human action recognition based on local features. Several novel video representation and classification techniques have been proposed to increase performance at lower computational complexity. The major contributions are new feature representation techniques based on advanced machine learning methods such as multiple-instance dictionary learning, Latent Dirichlet Allocation (LDA) and sparse coding. A binary-tree based classification technique was also proposed to deal with large numbers of action categories. These techniques not only improve classification accuracy under constrained computational resources but are also robust to challenging environmental conditions. The developed techniques can be easily extended to a wide range of video applications to provide near real-time performance.
APA, Harvard, Vancouver, ISO, and other styles
10

Dhanjal, Charanpal. "Sparse Kernel feature extraction." Thesis, University of Southampton, 2008. https://eprints.soton.ac.uk/64875/.

Full text
Abstract:
The presence of irrelevant features in training data is a significant obstacle for many machine learning tasks, since it can decrease accuracy, make it harder to understand the learned model and increase computational and memory requirements. One approach to this problem is to extract appropriate features. General approaches such as Principal Components Analysis (PCA) are successful for a variety of applications; however, they can be improved upon by targeting feature extraction towards more specific problems. More recent work has been more focused and considers sparser formulations which potentially have improved generalisation. However, sparsity is not always efficiently implemented and frequently requires complex optimisation routines. Furthermore, one often does not have direct control over the sparsity of the solution. In this thesis, we address some of these problems, first by proposing a general framework for feature extraction which possesses a number of useful properties. The framework is based on Partial Least Squares (PLS), and one can choose a user-defined criterion to compute projection directions. It draws together a number of existing results and provides additional insights into several popular feature extraction methods. More specific feature extraction is considered for three objectives: matrix approximation, supervised feature extraction and learning the semantics of two-view data. Computational and memory efficiency is prioritised, as well as sparsity in a direct manner and simple implementations. For the matrix approximation case, an analysis of different orthogonalisation methods is presented in terms of the optimal choice of projection direction. The analysis results in a new derivation for Kernel Feature Analysis (KFA) and the formation of two novel matrix approximation methods based on PLS. In the supervised case, we apply the general feature extraction framework to derive two new methods based on maximising covariance and alignment respectively. Finally, we outline a novel sparse variant of Kernel Canonical Correlation Analysis (KCCA) which approximates a cardinality-constrained optimisation. This method, as well as a variant which performs feature selection in one view, is applied to an enzyme function prediction case study.
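The PLS building block underlying the framework can be sketched in a few lines. Here the criterion is the classical covariance one, whereas the thesis allows a user-defined criterion and deflates the data before extracting subsequent directions; this is a sketch, not the proposed framework itself.

```python
import numpy as np

def pls_direction(X, y):
    """First PLS projection direction: the unit vector w maximising the
    sample covariance between the projected data Xw and the response y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc
    return w / np.linalg.norm(w)

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
y = X[:, 0] + 0.1 * rng.standard_normal(50)
w = pls_direction(X, y)   # close to the first coordinate axis here
```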
APA, Harvard, Vancouver, ISO, and other styles
11

Primadhanty, Audi. "Low-rank regularization for high-dimensional sparse conjunctive feature spaces in information extraction." Doctoral thesis, Universitat Politècnica de Catalunya, 2017. http://hdl.handle.net/10803/461682.

Full text
Abstract:
One of the challenges in Natural Language Processing (NLP) is the unstructured nature of texts, in which useful information is not easily identifiable. Information Extraction (IE) aims to alleviate this by enabling automatic extraction of structured information from such text sources. The resulting structured information facilitates easier querying, organizing, and analyzing of data from texts. In this thesis, we are interested in two IE-related tasks: (i) named entity classification and (ii) template filling. Specifically, this thesis examines the problem of learning classifiers of text spans and explores their application to extracting named entities and template slot-fillers. In general, our goal is to construct a method to learn classifiers that: (i) require less supervision, (ii) work well with high-dimensional sparse feature spaces and (iii) are able to classify unseen items (i.e. named entities/slot-fillers not observed in training data). The key idea of our contribution is the utilization of unseen conjunctive features. A conjunctive feature is a combination of features from different feature sets. For example, to classify a phrase, one might have one feature set for the context and another set for the phrase itself. When learning a classifier, only a fraction of these conjunctive features will be observed in the training set, leaving the rest (i.e. unseen features) unusable for predicting items at test time. We hypothesize that utilizing such unseen conjunctions is useful for addressing all aspects of this goal. We develop a general regularization framework specifically designed for sparse conjunctive feature spaces. Our strategy is based on employing tensors to represent the conjunctive feature space, and forcing the model to induce low-dimensional embeddings of the feature vectors via low-rank regularization on the tensor parameters. Such a compressed representation helps prediction by generalizing to novel examples where most of the conjunctions are unseen in the training set. We conduct experiments on learning named entity classifiers and template filling, focusing on extracting unseen items. We show that when learning classifiers under minimal supervision, our approach is more effective in controlling model capacity than standard techniques for linear classification.
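The low-rank scoring idea can be sketched as follows: the parameter tensor over context-span conjunctions is factored into two embedding matrices, so unseen conjunctions still receive sensible scores through the shared low-dimensional embeddings. Dimensions, names and the factorization below are illustrative assumptions (the thesis uses low-rank regularization rather than a fixed factorization).

```python
import numpy as np

rng = np.random.default_rng(0)
d_ctx, d_span, r = 1000, 800, 10   # feature-set sizes and rank (toy values)

# Low-rank parameter "tensor" W ~ U @ V.T instead of a full d_ctx x d_span
# matrix of per-conjunction weights.
U = rng.standard_normal((d_ctx, r)) * 0.01
V = rng.standard_normal((d_span, r)) * 0.01

def score(phi_ctx, phi_span):
    """Score of the conjunctive feature vector outer(phi_ctx, phi_span):
    (U^T phi_ctx) . (V^T phi_span), computed without forming the outer
    product, so unseen conjunctions generalize via the embeddings."""
    return (U.T @ phi_ctx) @ (V.T @ phi_span)

phi_ctx = np.zeros(d_ctx); phi_ctx[[3, 17]] = 1.0   # sparse context features
phi_span = np.zeros(d_span); phi_span[42] = 1.0     # sparse span features
print(score(phi_ctx, phi_span))
```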
APA, Harvard, Vancouver, ISO, and other styles
12

Behúň, Kamil. "Příznaky z videa pro klasifikaci." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2013. http://www.nusl.cz/ntk/nusl-236367.

Full text
Abstract:
This thesis compares hand-designed features with features learned by feature learning methods in video classification. Features learned by Principal Component Analysis whitening, Independent Subspace Analysis and sparse autoencoders were tested in a standard Bag of Visual Words classification paradigm, replacing hand-designed features (e.g. SIFT, HOG, HOF). Classification performance was measured on the Human Motion DataBase and the YouTube Action Data Set. The learned features showed better performance than the hand-designed features, and combining hand-designed and learned features via Multiple Kernel Learning performed better still, including in cases where neither feature type performed well on its own.
APA, Harvard, Vancouver, ISO, and other styles
13

Meghnoudj, Houssem. "Génération de caractéristiques à partir de séries temporelles physiologiques basée sur le contrôle optimal parcimonieux : application au diagnostic de maladies et de troubles humains." Electronic Thesis or Diss., Université Grenoble Alpes, 2024. http://www.theses.fr/2024GRALT003.

Full text
Abstract:
In this thesis, a novel methodology for feature generation from physiological signals (EEG, ECG) is proposed and used for the diagnosis of a variety of brain and heart diseases. Based on sparse optimal control, the generation of Sparse Dynamical Features (SDFs) is inspired by the functioning of the brain. The method's fundamental concept revolves around sparsely decomposing the signal into dynamical modes that can be switched on and off at the appropriate time instants with the appropriate amplitudes. This decomposition provides a new point of view on the data, giving access to informative features that are faithful to brain functioning. Nevertheless, the method remains generic and versatile, as it can be applied to a wide range of signals. The methodology's performance was evaluated on three use cases using openly accessible real-world data: (1) Parkinson's disease, (2) schizophrenia, and (3) various cardiac diseases. For all three applications, the results are highly conclusive, being comparable to state-of-the-art methods while using only a few features (one or two for the brain applications) and a simple linear classifier, which supports the significance and reliability of the findings. It is worth highlighting that special attention has been given to achieving significant and meaningful results with underlying explainability.
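A loose analogue of the sparse decomposition into dynamical modes, using a hypothetical dictionary of decaying oscillations and an off-the-shelf Lasso solver; the thesis derives its modes from a sparse optimal-control formulation, so everything below is an illustrative stand-in.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
T = 200
t = np.arange(T)

# Hypothetical dictionary of dynamical modes: decaying oscillations with
# different time constants tau and frequencies f.
modes = np.stack([np.exp(-t / tau) * np.sin(2 * np.pi * f * t / T)
                  for tau in (20, 50, 100) for f in (1, 3, 5)], axis=1)

# Synthetic signal built from two modes plus noise.
signal = 2.0 * modes[:, 1] - 0.5 * modes[:, 6] + 0.05 * rng.standard_normal(T)

# Sparse activation of modes: most coefficients are driven to zero, and the
# few non-zero ones act as the (sparse) features describing the signal.
lasso = Lasso(alpha=0.01).fit(modes, signal)
features = lasso.coef_
```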
APA, Harvard, Vancouver, ISO, and other styles
14

Nziga, Jean-Pierre. "Incremental Sparse-PCA Feature Extraction For Data Streams." NSUWorks, 2015. http://nsuworks.nova.edu/gscis_etd/365.

Full text
Abstract:
Intruders attempt to penetrate commercial systems daily and cause considerable financial losses for individuals and organizations. Intrusion detection systems monitor network events to detect computer security threats. An extensive amount of network data is devoted to detecting malicious activities. Storing, processing, and analyzing the massive volume of data is costly, which indicates the need for efficient methods of network data reduction that do not require the data to be first captured and stored. A better approach allows the extraction of useful variables from data streams in real time and in a single pass. The removal of irrelevant attributes reduces the data fed to the intrusion detection system (IDS) and shortens the analysis time while improving the classification accuracy. This dissertation introduces an online, real-time data processing method for knowledge extraction. This incremental feature extraction is based on two approaches. First, Chunk Incremental Principal Component Analysis (CIPCA) detects intrusion in data streams. Then, two novel incremental feature extraction methods, Incremental Structured Sparse PCA (ISSPCA) and Incremental Generalized Power Method Sparse PCA (IGSPCA), find malicious elements. Metrics helped compare the performance of all methods. IGSPCA was found to perform as well as or better than CIPCA overall in terms of dimensionality reduction, classification accuracy, and learning time. ISSPCA yielded better results for higher chunk values and greater accumulation ratio thresholds. CIPCA and IGSPCA reduced the IDS dataset to 10 principal components, as opposed to 14 eigenvectors for ISSPCA. ISSPCA is more expensive in terms of learning time in comparison to the other techniques. This dissertation presents new methods that perform feature extraction from continuous data streams to find the small number of features necessary to express the most data variance. Data subsets derived from a few important variables render their interpretation easier. Another goal of this dissertation was to propose incremental sparse PCA algorithms capable of processing data with concept drift and concept shift. Experiments using the WaveForm and WaveFormNoise datasets confirmed this ability. Similar to CIPCA, ISSPCA and IGSPCA updated eigen-axes as a function of the accumulation ratio value, forming an informative eigenspace with few eigenvectors.
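As a baseline sketch of the chunk-wise streaming scheme described above, plain incremental PCA in scikit-learn processes the stream one chunk at a time without storing it; the thesis' ISSPCA and IGSPCA add sparsity on top of this kind of update, so this is only the non-sparse skeleton with placeholder data.

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

rng = np.random.default_rng(0)
ipca = IncrementalPCA(n_components=10)

# Process the stream chunk by chunk; nothing is kept beyond the running model.
for _ in range(50):
    chunk = rng.standard_normal((500, 40))   # stand-in for network records
    ipca.partial_fit(chunk)

# New records are reduced to 10 components before reaching the IDS.
reduced = ipca.transform(rng.standard_normal((5, 40)))
```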
APA, Harvard, Vancouver, ISO, and other styles
15

Brunnegård, Oliver, and Daniel Wikestad. "Visual SLAM using sparse maps based on feature points." Thesis, Högskolan i Halmstad, Halmstad Embedded and Intelligent Systems Research (EIS), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-34681.

Full text
Abstract:
Visual Simultaneous Localisation And Mapping is a useful tool for creating 3D environments with feature points. These visual systems could be very valuable in autonomous vehicles to improve localisation, cameras being a fairly cheap sensor with the capability to gather a large amount of data. More efficient algorithms are still needed to better interpret the most valuable information. This paper analyses how much a feature-based map can be reduced without losing significant accuracy during localisation. Semantic segmentation created by a deep neural network is used to classify the features used to create the map; the map is then reduced by removing certain classes. The results show that feature-based maps can be significantly reduced without losing accuracy. The use of classes gave promising results: large numbers of features were removed, but the system could still localise accurately. Removing some classes gave the same or even better results in certain weather conditions compared to localisation with a full-scale map.
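The class-based map reduction can be sketched as a simple filter over semantically labelled map points. The class names and the tuple layout below are assumptions; the thesis obtains the labels from a segmentation network.

```python
# Keep only map points whose semantic class tends to be stable over time
# (hypothetical class names; a segmentation DNN supplies the labels).
STABLE = {"building", "pole", "traffic_sign", "road"}

def reduce_map(points):
    """points: iterable of (xyz, descriptor, semantic_class) tuples."""
    return [p for p in points if p[2] in STABLE]

demo = [((0, 1, 2), b"...", "building"),
        ((3, 4, 5), b"...", "vegetation"),   # dropped: appearance changes
        ((6, 7, 8), b"...", "pole")]
print(len(reduce_map(demo)))  # 2
```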
APA, Harvard, Vancouver, ISO, and other styles
16

O'Brien, Cian John. "Supervised feature learning via sparse coding for music information retrieval." Thesis, Georgia Institute of Technology, 2015. http://hdl.handle.net/1853/53615.

Full text
Abstract:
This thesis explores the ideas of feature learning and sparse coding for Music Information Retrieval (MIR). Sparse coding is an algorithm which aims to learn new feature representations from data automatically. In contrast to previous work using sparse coding in an MIR context, the concept of supervised sparse coding is also investigated, which makes use of the ground-truth labels explicitly during the learning process. Here sparse coding and supervised coding are applied to two MIR problems: classification of musical genre and recognition of the emotional content of music. A variation of Label Consistent K-SVD is used to add supervision during the dictionary learning process. In the case of Music Genre Recognition (MGR) an additional discriminative term is added to encourage tracks from the same genre to have similar sparse codes. For Music Emotion Recognition (MER) a linear regression term is added to learn an optimal classifier and dictionary pair. The results indicate that while sparse coding performs well for MGR, the additional supervision fails to improve the performance. In the case of MER, supervised coding significantly outperforms both standard sparse coding and commonly used designed features, namely MFCC and pitch chroma.
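For reference, the Label Consistent K-SVD family combines a reconstruction term, a label-consistency term and a classification term in one objective. The sketch below only evaluates such an objective on conforming arrays; shapes are illustrative, the sparsity constraint on the codes is omitted, and the thesis uses its own task-specific variants.

```python
import numpy as np

def lcksvd_objective(X, D, A, Q, B, H, W, alpha, beta):
    """LC-KSVD style objective (sketch): reconstruction error ||X - DA||^2,
    a label-consistency term ||Q - BA||^2 tying sparse codes to
    class-specific atoms, and a classification term ||H - WA||^2 learned
    jointly with the dictionary."""
    return (np.linalg.norm(X - D @ A) ** 2
            + alpha * np.linalg.norm(Q - B @ A) ** 2
            + beta * np.linalg.norm(H - W @ A) ** 2)

rng = np.random.default_rng(0)
d, k, c, n = 20, 30, 5, 100          # signal dim, atoms, classes, samples
X = rng.standard_normal((d, n)); D = rng.standard_normal((d, k))
A = rng.standard_normal((k, n)); Q = rng.standard_normal((k, n))
B = rng.standard_normal((k, k)); H = rng.standard_normal((c, n))
W = rng.standard_normal((c, k))
print(lcksvd_objective(X, D, A, Q, B, H, W, alpha=1.0, beta=1.0))
```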
APA, Harvard, Vancouver, ISO, and other styles
17

Zennaro, Fabio. "Feature distribution learning for covariate shift adaptation using sparse filtering." Thesis, University of Manchester, 2017. https://www.research.manchester.ac.uk/portal/en/theses/feature-distribution-learning-for-covariate-shift-adaptation-using-sparse-filtering(67989db2-b8a0-4fac-8832-f611e9236ed5).html.

Full text
Abstract:
This thesis studies a family of unsupervised learning algorithms called feature distribution learning and their extension to perform covariate shift adaptation. Unsupervised learning is one of the most active areas of research in machine learning, and a central challenge in this field is to develop simple and robust algorithms able to work in real-world scenarios. A traditional assumption of machine learning is the independence and identical distribution of data. Unfortunately, in realistic conditions this assumption is often unmet and the performance of traditional algorithms may be severely compromised. Covariate shift adaptation has since developed as a lively sub-field concerned with designing algorithms that can account for covariate shift, that is, for a difference in the distribution of training and test samples. The first part of this dissertation focuses on the study of a family of unsupervised learning algorithms that has been recently proposed and has shown promise: feature distribution learning; in particular, sparse filtering, the most representative feature distribution learning algorithm, has commanded interest because of its simplicity and state-of-the-art performance. Despite its success and its frequent adoption, sparse filtering lacks any strong theoretical justification. This research questions how feature distribution learning can be rigorously formalized and how the dynamics of sparse filtering can be explained. These questions are answered by first putting forward a new definition of feature distribution learning based on concepts from information theory and optimization theory; relying on this, a theoretical analysis of sparse filtering is carried out, which is validated on both synthetic and real-world data sets. In the second part, the use of feature distribution learning algorithms to perform covariate shift adaptation is considered. Indeed, because of their definition and apparent insensitivity to the problem of modelling data distributions, feature distribution learning algorithms seem particularly fit to deal with covariate shift. This research questions whether and how feature distribution learning may be fruitfully employed to perform covariate shift adaptation. After making explicit the conditions of success for performing covariate shift adaptation, a theoretical analysis of sparse filtering and of another novel algorithm, periodic sparse filtering, is carried out; this allows for the determination of the specific conditions under which these algorithms successfully work. Finally, a comparison of these sparse filtering-based algorithms against other traditional algorithms aimed at covariate shift adaptation is offered, showing that the novel algorithm is able to achieve competitive performance. In conclusion, this thesis provides a new rigorous framework to analyse and design feature distribution learning algorithms; it sheds light on the hidden assumptions behind sparse filtering, offering a clear understanding of its conditions of success; and it uncovers the potential and the limitations of sparse filtering-based algorithms in performing covariate shift adaptation. These results are relevant both for researchers interested in furthering the understanding of unsupervised learning algorithms and for practitioners interested in deploying feature distribution learning in an informed way.
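Sparse filtering itself is strikingly small, which is part of what the thesis sets out to explain. Below is a sketch of its objective (after Ngiam et al.): soft-absolute features, per-feature then per-sample normalisation, and an L1 penalty on the result; in practice it would be minimised over W with an off-the-shelf optimiser such as L-BFGS. The toy data are placeholders.

```python
import numpy as np

def sparse_filtering_objective(W, X, eps=1e-8):
    """Sparse filtering objective (sketch): soft-absolute features, row
    (per-feature) normalisation, column (per-sample) normalisation, then the
    L1 norm of the doubly-normalised feature matrix."""
    F = np.sqrt((W @ X) ** 2 + eps)                               # features x samples
    F = F / (np.linalg.norm(F, axis=1, keepdims=True) + eps)      # per feature
    F = F / (np.linalg.norm(F, axis=0, keepdims=True) + eps)      # per sample
    return np.abs(F).sum()

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 100))    # inputs x samples
W = rng.standard_normal((8, 20))      # 8 learned feature detectors
print(sparse_filtering_objective(W, X))
```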
APA, Harvard, Vancouver, ISO, and other styles
18

Friess, Thilo-Thomas. "Perceptrons in kernel feature spaces." Thesis, University of Sheffield, 2000. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.327730.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Chen, Jihong. "Sparse Modeling in Classification, Compression and Detection." Diss., Georgia Institute of Technology, 2004. http://hdl.handle.net/1853/5051.

Full text
Abstract:
The principal focus of this thesis is the exploration of sparse structures in a variety of statistical modelling problems. While more comprehensive models can be useful for solving a larger number of problems, their calculation may be ill-posed in most practical instances because of the sparsity of informative features in the data. If this sparse structure can be exploited, the models can often be solved very efficiently. The thesis is composed of four projects. First, feature sparsity is incorporated to improve the performance of support vector machines when many noise features are present. The second project is an empirical study of how to construct an optimal cascade structure. The third project involves the design of a progressive, rate-distortion optimized shape coder combining the zero-tree algorithm with the beamlet structure. Finally, the longest-run statistic is applied to the detection of a filamentary structure in a two-dimensional rectangular region. The fundamental idea of the above projects is common: extract an efficient summary from a large amount of data. The main contributions of this work are to develop and implement novel techniques for the efficient solution of several difficult problems that arise in statistical signal and image processing.
APA, Harvard, Vancouver, ISO, and other styles
20

Fourie, Christoff. "A one-class object-based system for sparse geographic feature identification." Thesis, Stellenbosch : University of Stellenbosch, 2011. http://hdl.handle.net/10019.1/6666.

Full text
Abstract:
Thesis (MSc (Geography and Environmental Studies))--University of Stellenbosch, 2011.
The automation of information extraction from earth observation imagery has become a field of active research. This is mainly due to the high volumes of remotely sensed data that remain unused and the possible benefits that the extracted information can provide to a wide range of interest groups. In this work an earth observation image processing system is presented and profiled that attempts to streamline the information extraction process for geographic object anomaly detection, without degrading the quality of the extracted information. The proposed system, implemented as a software application, combines recent research in automating image segment generation and automatically finding statistical classifier parameters and attribute subsets using evolutionary search algorithms. Exploratory research was conducted on the use of an edge metric as a fitness function for an evolutionary search heuristic to automate the generation of image segments for a region-merging segmentation algorithm with six control parameters. The edge metric for such an application is compared with an area-based metric. The use of attribute subset selection in conjunction with a free-parameter tuner for a one-class support vector machine (SVM) classifier, operating on high-dimensional object-based data, was also investigated. For common earth observation anomaly detection problems using typical segment attributes, such a combined free-parameter tuning and attribute subset selection system provided statistically significantly better results than a free-parameter-tuning-only process. In some extreme cases, due to the stochastic nature of the search algorithm employed, the free-parameter-only strategy provided slightly better results. The developed system was used in a case study to map a single class of interest on a 22.5 x 22.5 km subset of a SPOT 5 image and is compared with a multiclass classification strategy. The developed system generated slightly better classification accuracies than the multiclass classifier and only required samples from the class of interest.
APA, Harvard, Vancouver, ISO, and other styles
21

Byrne, Evan Michael. "Sparse Multinomial Logistic Regression via Approximate Message Passing." The Ohio State University, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=osu1437416281.

Full text
APA, Harvard, Vancouver, ISO, and other styles
22

Steiger, Edgar [Verfasser]. "Efficient Sparse-Group Bayesian Feature Selection for Gene Network Reconstruction / Edgar Steiger." Berlin : Freie Universität Berlin, 2018. http://d-nb.info/1170876633/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
23

Zeng, Yaohui. "Scalable sparse machine learning methods for big data." Diss., University of Iowa, 2017. https://ir.uiowa.edu/etd/6021.

Full text
Abstract:
Sparse machine learning models have become increasingly popular in analyzing high-dimensional data. With the evolving era of Big Data, ultrahigh-dimensional, large-scale data sets are constantly collected in many areas such as genetics, genomics, biomedical imaging, social media analysis, and high-frequency finance. Mining valuable information efficiently from these massive data sets requires not only novel statistical models but also advanced computational techniques. This thesis focuses on the development of scalable sparse machine learning methods to facilitate Big Data analytics. Built upon the feature screening technique, the first part of this thesis proposes a family of hybrid safe-strong rules (HSSR) that incorporate safe screening rules into the sequential strong rule to remove unnecessary computational burden when solving lasso-type models. We present two instances of HSSR, namely SSR-Dome and SSR-BEDPP, for the standard lasso problem. We further extend SSR-BEDPP to the elastic net and group lasso problems to demonstrate the generalizability of the hybrid screening idea. In the second part, we design and implement an R package called biglasso to extend lasso model fitting to Big Data in R. Our package biglasso utilizes memory-mapped files to store the massive data on disk, reading data into memory only when necessary during model fitting, and is thus able to handle data-larger-than-RAM cases seamlessly. Moreover, it is built upon our redesigned algorithm incorporating the proposed HSSR screening, making it much more memory- and computation-efficient than existing R packages. Extensive numerical experiments with synthetic and real data sets are conducted in both parts to show the effectiveness of the proposed methods. In the third part, we consider a novel statistical model, namely the overlapping group logistic regression model, that allows for selecting important groups of features associated with binary outcomes in the setting where the features belong to overlapping groups. We conduct systematic simulations and real-data studies to show its advantages in the application of genetic pathway selection. We implement an R package called grpregOverlap that has HSSR screening built in for fitting overlapping group lasso models.
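The sequential strong rule at the heart of HSSR can be sketched in a few lines: moving down the lasso path from lam_prev to lam, a feature is discarded whenever its correlation with the previous residual falls below 2*lam - lam_prev. The rule is heuristic, so the survivors are refit and checked against the KKT conditions; the safe rules (e.g. BEDPP) are layered on top. Variable names below are illustrative.

```python
import numpy as np

def sequential_strong_rule(X, residual, lam_prev, lam):
    """Sequential strong rule for the lasso: keep feature j only when
    |x_j^T r(lam_prev)| >= 2*lam - lam_prev; discards are heuristic and
    must be verified with a KKT check after fitting."""
    scores = np.abs(X.T @ residual)
    return np.flatnonzero(scores >= 2 * lam - lam_prev)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 1000))
residual = rng.standard_normal(100)          # residual at the previous lambda
kept = sequential_strong_rule(X, residual, lam_prev=1.0, lam=0.9)
```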
APA, Harvard, Vancouver, ISO, and other styles
24

Reese, Randall D. "Feature Screening of Ultrahigh Dimensional Feature Spaces With Applications in Interaction Screening." DigitalCommons@USU, 2018. https://digitalcommons.usu.edu/etd/7231.

Full text
Abstract:
Data for which the number of predictors exponentially exceeds the number of observations is becoming increasingly prevalent in fields such as bioinformatics, medical imaging, computer vision, and social network analysis. One of the leading questions statisticians must answer when confronted with such “big data” is how to reduce a set of exponentially many predictors down to a mere few predictors which have a truly causative effect on the response being modelled. This process is often referred to as feature screening. In this work we propose three new methods for feature screening. The first method we propose (TC-SIS) is specifically intended for use with data having both a categorical response and categorical predictors. The second method we propose (JCIS) is meant for screening for interactions between predictors. JCIS is rare among interaction screening methods in that it does not require first finding a set of causative main effects before screening for interactive effects. Our final method (GenCorr) is intended for use with data having a multivariate response. GenCorr is the only method for multivariate screening which can screen for both causative main effects and causative interactions. Each of these methods will be shown to possess both theoretical robustness and empirical agility.
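For orientation, a generic marginal-utility screener in the SIS family is sketched below: rank predictors by absolute marginal correlation with the response and keep the top d. TC-SIS, JCIS and GenCorr substitute their own screening statistics for this utility, so this is only the common skeleton.

```python
import numpy as np

def sis_screen(X, y, d):
    """SIS-flavoured screening: keep the d predictors with the largest
    absolute marginal correlation with the response."""
    Xs = (X - X.mean(0)) / X.std(0)
    ys = (y - y.mean()) / y.std()
    utility = np.abs(Xs.T @ ys) / len(y)
    return np.argsort(utility)[::-1][:d]

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5000))
y = X[:, 7] - 2 * X[:, 42] + rng.standard_normal(200)
top = sis_screen(X, y, d=50)   # indices 7 and 42 should rank highly
```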
APA, Harvard, Vancouver, ISO, and other styles
25

Pighin, Daniele. "Greedy Feature Selection in Tree Kernel Spaces." Doctoral thesis, Università degli studi di Trento, 2010. https://hdl.handle.net/11572/368779.

Full text
Abstract:
Tree Kernel functions are powerful tools for solving different classes of problems requiring large amounts of structured information. Combined with accurate learning algorithms, such as Support Vector Machines, they allow us to directly encode rich syntactic data in our learning problems without requiring an explicit feature mapping function or deep domain-specific knowledge. However, like other very high-dimensional kernel families, they come with two major drawbacks: first, the computational complexity induced by the dual representation makes them impractical for very large datasets or for situations where very fast classifiers are necessary, e.g. real-time systems or web applications; second, their implicit nature somewhat limits their scientific appeal, as the implicit models that we learn cannot cast new light on the studied problems. As a possible solution to these two problems, this thesis presents an approach to feature selection for tree kernel functions in the context of Support Vector learning, based on a greedy exploration of the fragment space. Features are selected according to a gradient norm preservation criterion, i.e. we select the heaviest features that account for a large percentage of the gradient norm, and these are explicitly modeled and represented. The result of the feature extraction process is a data structure that can be used to decode the input structured data, i.e. to explicitly describe a tree in terms of its more relevant fragments. We present theoretical insights that justify the adopted strategy and detail the algorithms and data structures used to explore the feature space and store the most relevant features. Experiments on three different multi-class NLP tasks and data sets, namely question classification, relation extraction and semantic role labeling, confirm the theoretical findings and show that the decoding process can produce very fast and accurate linear classifiers, along with an explicit representation of the most relevant structured features identified for each class.
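The gradient-norm preservation criterion itself is simple to sketch: keep the heaviest fragments until they account for a chosen fraction of the squared gradient norm. The selection step below is illustrative; mining the candidate fragments in tree kernel space is the hard part the thesis actually addresses.

```python
import numpy as np

def select_fragments(weights, fraction=0.95):
    """Greedily keep the heaviest fragments until they cover the given
    fraction of the squared gradient norm (selection step only)."""
    w2 = weights ** 2
    order = np.argsort(w2)[::-1]
    cum = np.cumsum(w2[order]) / w2.sum()
    k = int(np.searchsorted(cum, fraction)) + 1
    return order[:k]

w = np.array([3.0, -0.1, 0.02, 2.5, -1.0, 0.05])
print(select_fragments(w, fraction=0.95))   # keeps the three heavy fragments
```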
APA, Harvard, Vancouver, ISO, and other styles
26

Pighin, Daniele. "Greedy Feature Selection in Tree Kernel Spaces." Doctoral thesis, University of Trento, 2010. http://eprints-phd.biblio.unitn.it/359/1/thesis.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Sanchez Merchante, Luis Francisco. "Learning algorithms for sparse classification." PhD thesis, Université de Technologie de Compiègne, 2013. http://tel.archives-ouvertes.fr/tel-00868847.

Full text
Abstract:
This thesis deals with the development of estimation algorithms with embedded feature selection in the context of high-dimensional data, in both the supervised and unsupervised frameworks. The contributions of this work are materialized by two algorithms, GLOSS for the supervised domain and Mix-GLOSS for the unsupervised counterpart. Both algorithms are based on solving an optimal scoring regression regularized with a quadratic formulation of the group-Lasso penalty, which encourages the removal of uninformative features. The theoretical foundations proving that a group-Lasso penalized optimal scoring regression can be used to solve linear discriminant analysis were first developed in this work. The theory that adapts this technique to the unsupervised domain by means of the EM algorithm is not new, but it had never been clearly exposed for a sparsity-inducing penalty. This thesis solidly demonstrates that the use of group-Lasso penalized optimal scoring regression inside an EM algorithm is possible. Our algorithms have been tested on real and artificial high-dimensional databases, with impressive results in terms of parsimony without compromising prediction performance.
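The feature-removal behaviour comes from the group-Lasso proximal operator, sketched below: each group of coefficients is shrunk towards zero and dropped entirely when its norm falls below the threshold. This is an illustrative building block, not the GLOSS code.

```python
import numpy as np

def group_soft_threshold(w, groups, lam):
    """Proximal operator of the group-Lasso penalty: shrink each group and
    zero it out entirely when its Euclidean norm is below lam, which is what
    removes uninformative features."""
    out = np.zeros_like(w)
    for g in groups:                 # g: index array for one feature's group
        norm = np.linalg.norm(w[g])
        if norm > lam:
            out[g] = (1 - lam / norm) * w[g]
    return out

w = np.array([0.5, -0.2, 0.05, 0.01, 1.2])
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4])]
print(group_soft_threshold(w, groups, lam=0.3))  # middle group is zeroed out
```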
APA, Harvard, Vancouver, ISO, and other styles
28

Hjelmare, Fredrik, and Jonas Rangsjö. "Simultaneous Localization And Mapping Using a Kinect in a Sparse Feature Indoor Environment." Thesis, Linköpings universitet, Reglerteknik, 2012. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-81140.

Full text
Abstract:
Localization and mapping are two of the most central tasks when it comes to autonomous robots. They have often been performed using expensive, accurate sensors, but the fast development of consumer electronics has made similar sensors available at a more affordable price. In this master thesis a TurtleBot™ robot and a Microsoft Kinect™ camera are used to perform Simultaneous Localization And Mapping, SLAM. The thesis presents modifications to an already existing open source SLAM algorithm. The original algorithm, based on visual odometry, is extended so that it can also make use of measurements from wheel odometry and a single-axis gyro. Measurements are fused using an Extended Kalman Filter, EKF, operating in a multirate fashion. Both the SLAM algorithm and the EKF are implemented in C++ using the framework Robot Operating System, ROS. The implementation is evaluated on two different data sets. One set is recorded in an ordinary office room, which constitutes an environment with many landmarks. The other set is recorded in a conference room where one of the walls is flat and white, giving a partially sparse-featured environment. Providing additional sensor information results in a more robust algorithm: periods without credible visual information do not make the algorithm lose track, so the algorithm can be used in a larger variety of environments, including those where the possibility to extract landmarks is low. The results also show that the visual odometry can cancel out drift introduced by the wheel odometry and gyro sensors.
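A minimal multirate EKF skeleton matching the fusion scheme described above: predict at the fast (wheel odometry and gyro) rate and correct whenever a slower visual-odometry pose arrives. The models here are linear placeholders; the thesis uses the robot's actual kinematic model.

```python
import numpy as np

class MultirateEKF:
    """Minimal EKF: predict at the fast sensor rate, update at the slow one
    (placeholder linear models, illustrative only)."""
    def __init__(self, x0, P0, Q, R):
        self.x, self.P, self.Q, self.R = x0, P0, Q, R

    def predict(self, F, u, B):
        self.x = F @ self.x + B @ u
        self.P = F @ self.P @ F.T + self.Q

    def update(self, z, H):
        S = H @ self.P @ H.T + self.R
        K = self.P @ H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (z - H @ self.x)
        self.P = (np.eye(len(self.x)) - K @ H) @ self.P

ekf = MultirateEKF(x0=np.zeros(3), P0=np.eye(3),
                   Q=0.01 * np.eye(3), R=0.1 * np.eye(3))
F = np.eye(3); B = np.eye(3)
for _ in range(10):                                   # fast rate: odometry/gyro
    ekf.predict(F, u=np.array([0.1, 0.0, 0.01]), B=B)
ekf.update(z=np.array([1.0, 0.0, 0.1]), H=np.eye(3))  # slow rate: visual pose
```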
APA, Harvard, Vancouver, ISO, and other styles
29

Donini, Michele. "Exploiting the structure of feature spaces in kernel learning." Doctoral thesis, Università degli studi di Padova, 2016. http://hdl.handle.net/11577/3424320.

Full text
Abstract:
The problem of learning the optimal representation for a specific task has recently become an important and non-trivial topic in the machine learning community. In this field, deep architectures are the current gold standard among machine learning algorithms, generating models with several levels of abstraction that discover very complicated structures in large datasets. Kernels and Deep Neural Networks (DNNs) are the principal methods to handle the representation problem in a deep manner. A DNN uses the famous back-propagation algorithm, improving state-of-the-art performance in several different real-world applications, e.g. speech recognition, object detection and signal processing. Nevertheless, DNN algorithms have some drawbacks, inherited from standard neural networks, since they are theoretically not well understood. The main problems are: the complex structure of the solution, the unclear decoupling between the representation learning phase and the model generation, long training times, and the convergence to a sub-optimal solution (because of local minima and vanishing gradients). For these reasons, in this thesis, we propose new ideas to obtain an optimal representation by exploiting kernel theory. Kernel methods have an elegant framework that decouples learning algorithms from data representations. On the other hand, kernels also have some weaknesses: for example, they do not scale well and they generally yield a shallow representation. In this thesis, we propose new theory and algorithms to fill this gap and make kernel learning able to generate deeper representations and to be more scalable. Considering this scenario, we propose a different point of view regarding the Multiple Kernel Learning (MKL) framework, starting from the idea of a deeper kernel. An algorithm able to combine thousands of weak kernels with low computational and memory complexity is proposed. This procedure, called EasyMKL, outperforms state-of-the-art methods by combining the fragmented information to create an optimal kernel for the given task. Pursuing the idea of creating an optimal family of weak kernels, we create a new measure for the evaluation of kernel expressiveness, called spectral complexity. Exploiting this measure we are able to generate families of kernels with a hierarchical structure of the features by defining a new property concerning the monotonicity of the spectral complexity. We prove the quality of these weak families of kernels by developing a new methodology for Multiple Kernel Learning (MKL). Firstly we are able to create an optimal family of weak kernels by using the monotonically spectral-complex property; then we combine the optimal family of kernels by exploiting EasyMKL, obtaining a new kernel that is specific to the task; finally, we are able to generate the model by using a kernel machine. Moreover, we highlight the connection among distance metric learning, feature learning and kernel learning by proposing a method to learn the optimal family of weak kernels for an MKL algorithm in a different context, in which the combination rule is the element-wise product of kernel matrices. This algorithm is able to generate the best parameters for an anisotropic RBF kernel and, therefore, a connection naturally appears among feature weighting, combinations of kernels and metric learning.
Finally, the importance of the representation is also taken into account in three real-world tasks, where we tackle different issues such as noisy data, real-time applications and big data.
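A compact sketch of the MKL setting described above: a family of weak kernels is trace-normalized and combined with nonnegative weights into one task-specific kernel. EasyMKL learns the weights by margin optimization; the alignment-based heuristic below merely stands in for it, and the data, kernel family and weights are illustrative.

    import numpy as np

    def trace_normalize(K):
        return K / np.trace(K)

    def combine_kernels(kernels, y):
        yy = np.outer(y, y)                      # ideal target kernel
        aligns = np.array([np.sum(trace_normalize(K) * yy) for K in kernels])
        eta = np.maximum(aligns, 0.0)
        eta /= eta.sum() + 1e-12                 # nonnegative, sum to one
        return sum(e * trace_normalize(K) for e, K in zip(eta, kernels)), eta

    X = np.random.randn(20, 5)
    y = np.sign(np.random.randn(20))
    weak = [np.power(X @ X.T + 1, d) for d in (1, 2, 3)]   # polynomial family
    K, eta = combine_kernels(weak, y)
    print(eta)                                   # combination weights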
APA, Harvard, Vancouver, ISO, and other styles
30

Chen, Youqing. "Observation and analysis on features of microcracks and pore spaces in rocks." Kyoto University, 2002. http://hdl.handle.net/2433/150150.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Cirujeda, Santolaria Pol. "Covariance-based descriptors for pattern recognition in multiple feature spaces." Doctoral thesis, Universitat Pompeu Fabra, 2015. http://hdl.handle.net/10803/350033.

Full text
Abstract:
This dissertation explores the use of covariance-based descriptors in order to translate feature observations within regions of interest to a descriptor space using the feature covariance matrices as discriminative signatures. This space constitutes the particular manifold of symmetric positive definite matrices, with its own metric and analytical considerations, in which we can develop several machine learning algorithms for pattern recognition. Regardless of the feature domain, whether they are 2D image visual cues, 3D unstructured point cloud shape features, gesture and motion measurements from depth image sequences, or 3D tissue information in medical images, the covariance descriptor space acts as a unifying step in the task of keeping a common framework for several applications.
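The core of the representation can be sketched in a few lines: features observed inside a region are summarized by their covariance matrix, and two descriptors are compared with the log-Euclidean distance on the manifold of symmetric positive definite matrices. This is only an illustration of the descriptor itself, not of the learning methods the thesis builds on top of it; data and sizes are invented.

    import numpy as np
    from scipy.linalg import logm

    def covariance_descriptor(F, eps=1e-6):
        # F: (n_points, n_features) feature observations from one region.
        C = np.cov(F, rowvar=False)
        return C + eps * np.eye(C.shape[0])      # regularize to stay SPD

    def log_euclidean_distance(C1, C2):
        L1, L2 = logm(C1).real, logm(C2).real    # real for SPD inputs
        return np.linalg.norm(L1 - L2, ord='fro')

    A = covariance_descriptor(np.random.randn(500, 5))
    B = covariance_descriptor(np.random.randn(500, 5) * 2.0)
    print(log_euclidean_distance(A, B))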
APA, Harvard, Vancouver, ISO, and other styles
32

Boone, Gary Noel. "Extreme dimensionality reduction for text learning : cluster-generated feature spaces." Diss., Georgia Institute of Technology, 2000. http://hdl.handle.net/1853/8139.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

Van, Dyk Hendrik Oostewald. "Classification in high dimensional feature spaces / by H.O. van Dyk." Thesis, North-West University, 2009. http://hdl.handle.net/10394/4091.

Full text
Abstract:
In this dissertation we developed theoretical models to analyse Gaussian and multinomial distributions. The analysis is focused on classification in high-dimensional feature spaces and provides a basis for dealing with issues such as data sparsity and feature selection (for Gaussian and multinomial distributions, two frequently used models for high-dimensional applications). A Naïve Bayesian philosophy is followed to deal with issues associated with the curse of dimensionality. The core treatment of Gaussian and multinomial models consists of finding analytical expressions for classification error performance. Exact analytical expressions were found for calculating error rates of binary class systems with Gaussian features of arbitrary dimensionality, using any type of quadratic decision boundary (except for degenerate paraboloidal boundaries). Similarly, computationally inexpensive (and approximate) analytical error rate expressions were derived for classifiers with multinomial models. Additional issues with regard to the curse of dimensionality that are specific to multinomial models (feature sparsity) were dealt with and tested on a text-based language identification problem covering all eleven official languages of South Africa.
Thesis (M.Ing. (Computer Engineering))--North-West University, Potchefstroom Campus, 2009.
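The Gaussian setting analysed in the dissertation is easy to reproduce numerically; a Monte Carlo estimate such as the sketch below is the natural empirical check for exact analytical error-rate expressions of this kind. Dimensionality, means and covariances are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)
    d = 10                                   # feature dimensionality
    mu0, mu1 = np.zeros(d), np.full(d, 0.5)
    S0, S1 = np.eye(d), 2.0 * np.eye(d)      # unequal covariances induce a
                                             # quadratic decision boundary

    def log_gauss(x, mu, S):
        diff = x - mu
        _, logdet = np.linalg.slogdet(S)
        return -0.5 * (logdet + diff @ np.linalg.solve(S, diff))

    errors, n = 0, 20000
    for _ in range(n):
        c = int(rng.integers(2))
        x = rng.multivariate_normal(mu1 if c else mu0, S1 if c else S0)
        pred = int(log_gauss(x, mu1, S1) > log_gauss(x, mu0, S0))
        errors += (pred != c)
    print(errors / n)                        # empirical error rate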
APA, Harvard, Vancouver, ISO, and other styles
34

Pathical, Santhosh P. "Classification in High Dimensional Feature Spaces through Random Subspace Ensembles." University of Toledo / OhioLINK, 2010. http://rave.ohiolink.edu/etdc/view?acc_num=toledo1290024890.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

De, Deuge Mark. "Manifold Learning Approaches to Compressing Latent Spaces of Unsupervised Feature Hierarchies." Thesis, The University of Sydney, 2015. http://hdl.handle.net/2123/14551.

Full text
Abstract:
Field robots encounter dynamic unstructured environments containing a vast array of unique objects. In order to make sense of the world in which they are placed, they collect large quantities of unlabelled data with a variety of sensors. Producing robust and reliable applications depends entirely on the ability of the robot to understand the unlabelled data it obtains. Deep Learning techniques have had a high level of success in learning powerful unsupervised representations for a variety of discriminative and generative models. Applying these techniques to problems encountered in field robotics remains a challenging endeavour. Modern Deep Learning methods are typically trained with a substantial labelled dataset, while datasets produced in a field robotics context contain limited labelled training data. The primary motivation for this thesis stems from the problem of applying large-scale Deep Learning models to field robotics datasets that are label poor. While the lack of labelled ground truth data drives the desire for unsupervised methods, the need for improving model scaling is driven by two factors: performance and computational requirements. When utilising unsupervised layer outputs as representations for classification, classification performance increases with layer size. Scaling up models with multiple large layers of features is problematic, as the size of each hidden layer scales with the size of the previous layer. This quadratic scaling, and the associated time required to train such networks, has prevented the adoption of large Deep Learning models beyond cluster computing. The contributions in this thesis are developed from the observation that parameters or filter elements learnt in Deep Learning systems are typically highly structured and contain related elements. Firstly, the structure of unsupervised filters is utilised to construct a mapping from the high-dimensional filter space to a low-dimensional manifold. This creates a significantly smaller representation for subsequent feature learning. This mapping, and its effect on the resulting encodings, highlights the need for the ability to learn highly overcomplete sets of convolutional features. Driven by this need, the unsupervised pretraining of Deep Convolutional Networks is developed to include a number of modern training and regularisation methods. These pretrained models are then used to provide initialisations for supervised convolutional models trained on low quantities of labelled data. By utilising pretraining, a significant increase in classification performance on a number of publicly available datasets is achieved. In order to apply these techniques to outdoor 3D Laser Illuminated Detection And Ranging data, we develop a set of resampling techniques to provide uniform input to Deep Learning models. The features learnt in these systems outperform the high-effort hand-engineered features developed specifically for 3D data. The representation of a given signal is then reinterpreted as a combination of modes that exist on the learnt low-dimensional filter manifold. From this, we develop an encoding technique that allows the high-dimensional layer output to be represented as a combination of low-dimensional components. This allows the growth of subsequent layers to depend only on the intrinsic dimensionality of the filter manifold and not on the number of elements contained in the previous layer.
Finally, the resulting unsupervised convolutional model, the encoding frameworks and the embedding methodology are used to produce a new unsupervised learning strategy that is able to encode images in terms of overcomplete filter spaces without producing an explosion in the size of the intermediate parameter spaces. This model produces classification results on par with state-of-the-art models, yet requires significantly fewer computational resources and is suitable for use in the constrained computation environment of a field robot.
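The first contribution, mapping a structured filter set to a low-dimensional space, can be illustrated with plain PCA standing in for the manifold-learning step; the filter count, filter size and variance threshold below are illustrative assumptions.

    import numpy as np

    filters = np.random.randn(512, 11 * 11)   # 512 filters of 11x11 weights
    filters -= filters.mean(axis=0)

    # Rank-k PCA of the filter bank: keep enough components for 95% of
    # the variance, then use the projections as filter coordinates.
    U, s, Vt = np.linalg.svd(filters, full_matrices=False)
    k = np.searchsorted(np.cumsum(s**2) / np.sum(s**2), 0.95) + 1
    coords = filters @ Vt[:k].T                # low-dimensional coordinates
    print(filters.shape, '->', coords.shape)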
APA, Harvard, Vancouver, ISO, and other styles
36

Ocloo, Isaac Xoese. "Energy Distance Correlation with Extended Bayesian Information Criteria for feature selection in high dimensional models." Bowling Green State University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1625238661031258.

Full text
APA, Harvard, Vancouver, ISO, and other styles
37

Shealy, Elizabeth Carlisle. "Designing outdoor spaces to support older adult dog walkers: A multi-method approach to identify and prioritize features in the built environment." Diss., Virginia Tech, 2021. http://hdl.handle.net/10919/102931.

Full text
Abstract:
Associations between the built environment and walking are well understood among the general population, but far less is known about how features of the built environment influence walking in older adults. As compared to other age groups, older adults, defined as those 65 years of age and older, are more likely to experience declines in physical activity, social interaction, and loss of community connectivity. Animal companionship can provide older adults the motivation to stay physically active and help them mitigate the feelings of isolation. Built environments that align with the needs and abilities of older adults and their animal companions, like dogs, can encourage and help sustain walking habits. The aim of this study was to identify and prioritize features within the built environment pertinent to older adult dog walkers. Existing literature served as the basis for identifying neighborhood design features associated with general walking and dog walking. Through the use of a three-round Delphi study, 25 experts from urban planning and design, management of outdoor spaces, public health, gerontology, and human-animal relationships modified and rated the importance of the identified features as they pertain to older adult dog walkers. Following the Delphi study, 12 older adult dog owners from the Warm Hearth Village participated in a guided walk and interview using the Photovoice technique. The goal was to gather their perceptions of the outdoor walking environment. Among expert panelists, safety from motorized traffic, crime, unleashed dogs, and personal injury was paramount (mean (M) = 93.20, standard deviation (SD) = 11.54). Experts also saw the value of and agreed upon the importance of dog-supportive features within the built environment, like dog waste stations (desirable; M = 87.95, SD = 11.37) and dog policy signage (desirable; M = 79.91, SD = 11.22). Older adults also believed safety was important. They saw their dog as a protective safety factor against walking deterrents like aggressive or unleashed dogs. However, the feature that resonated most with older adult dog walkers in this study was their interaction with nature. They described the pleasure of observing seasons change and the connection with nature that came from the tree canopy cocooning the walking path. Path design is also a necessary consideration. Older adults emphasized the importance of having options between paved and unpaved walking paths. The panelists stressed the need for creating lines of sight (desirable; M = 66.46, SD = 20.71) and lighting (desirable; M = 77.92, SD = 19.77). Those who plan, develop, and maintain spaces that support older adults can prioritize the features I identified in my research. Incorporating these features into the design of spaces for older adults has the potential to translate into increased walking and opportunities to socialize, contributing to mental and physical health.
Doctor of Philosophy
Associations between the built environment and walking are well understood among the general population, but less is known about how features in the built environment influence older adults. As compared to other age groups, older adults are more likely to experience declines in physical activity and social interaction. Animal companionship can provide motivation to stay physically active and help older adults mitigate feelings of isolation. Built environments that align with the needs of older adults and their animal companions, like dogs, can encourage and help sustain walking habits. My research identified and prioritized features within the built environment pertinent to older adult dog walkers. I implemented an iterative three-round study to gain consensus among expert panelists, and guided walks and interviews with older adult dog walkers. Among expert panelists, safety from motorized traffic, crime, unleashed dogs, and personal injury was paramount. Experts also saw the value of dog-supportive features within the built environment, like dog waste stations. Older adults also believed safety was important. They saw their dog as a protective safety factor against walking deterrents like aggressive dogs. The feature that resonated most with older adults in this study was nature. They described the pleasure of observing seasons change and the connection with nature that came from the tree canopy cocooning the walking path. Path design is also a necessary consideration. Older adults emphasized the importance of having options between paved and unpaved walking paths. Those who plan, develop, and maintain spaces that support older adults can prioritize the features I identified in my research. Incorporating these features into outdoor spaces has the potential to translate into increased walking and opportunities to socialize, contributing to mental and physical health.
APA, Harvard, Vancouver, ISO, and other styles
38

Klement, Sascha [Verfasser]. "The support feature machine : an odyssey in high-dimensional spaces / Sascha Klement." Lübeck : Zentrale Hochschulbibliothek Lübeck, 2014. http://d-nb.info/1046751751/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Mohapatra, Prateeti. "Deriving Novel Posterior Feature Spaces For Conditional Random Field - Based Phone Recognition." The Ohio State University, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=osu1236784133.

Full text
APA, Harvard, Vancouver, ISO, and other styles
40

Calarco, Francesca Maria Assunta. "Soundscape design of water features used in outdoor spaces where road traffic noise is audible." Thesis, Heriot-Watt University, 2015. http://hdl.handle.net/10399/3084.

Full text
Abstract:
This research focused on the soundscape design of a wide range of small to medium-sized water features (waterfalls, fountains with upward jet(s), and streams) which can be used in gardens or parks for promoting peacefulness and relaxation in the presence of road traffic noise. Firstly, the thesis examined the audio-visual interaction and perceptual assessment of water features, including the semantic components and the qualitative categorisation and evocation of water sounds; and secondly, the thesis investigated the effectiveness of the water features tested in promoting relaxation through sound mapping. Different laboratory tests were carried out, and these included paired comparison tests (audio-only, visual-only and audio-visual tests), semantic differential tests, as well as tests aimed at the qualitative categorisation and evocation of water features. Sound maps of the water-generated sounds were developed through the use of propagation models based on either point or line sources. Three acoustic zones ('water sounds dominant zone', 'optimum zone' and 'RTN dominant zone' (RTN: road traffic noise)) were defined in the maps as the zones where relaxation/pleasantness can be promoted over a 20 m × 20 m area for different road traffic noise levels. Paired comparisons highlighted the interdependence between uni-modal (audio-only or visual-only) and bi-modal (audio-visual) perception, indicating that equal attention should be given to the design of both stimuli. In general, natural-looking features tended to increase preference scores (compared to audio-only paired comparison scores), while man-made-looking features decreased them. Semantic descriptors showed significant correlations with preferences and were found to be more reliable design criteria than physical parameters. A principal component analysis identified three components within the nine semantic attributes tested: "emotional assessment," "sound quality," and "envelopment and temporal variation." The first two showed significant correlations with audio-only preferences, "emotional assessment" being the most important predictor of preferences, and its attributes naturalness, relaxation, and freshness also being significantly correlated with preferences. Categorisation results indicated that natural stream sounds are easily identifiable (unlike waterfalls and fountains), while evocation results showed no unique relationship with preferences. The sound map results indicated that small to medium-sized water features can be used mainly in environments where road traffic noise levels are equal to or lower than 65 dBA.
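The three-zone construction lends itself to a compact numerical illustration. The sketch below assumes free-field spherical spreading from a point source and a +/-3 dB band around the road traffic noise level; the source level, noise level and grid are invented for illustration and are not the calibrated propagation models of the thesis.

    import numpy as np

    Lw_water = 85.0       # water feature sound power level, dB (assumed)
    L_rtn = 60.0          # road traffic noise level over the area, dBA

    x = np.linspace(-10, 10, 201)
    X, Y = np.meshgrid(x, x)                       # 20 m x 20 m area
    r = np.maximum(np.hypot(X, Y), 0.5)            # distance to the feature, m
    Lp_water = Lw_water - 20 * np.log10(r) - 11    # point-source propagation

    zone = np.where(Lp_water > L_rtn + 3, 'water dominant',
            np.where(Lp_water < L_rtn - 3, 'RTN dominant', 'optimum'))
    print((zone == 'optimum').mean())              # fraction of the area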
APA, Harvard, Vancouver, ISO, and other styles
41

Fu, Huanzhang. "Contributions to generic visual object categorization." Phd thesis, Ecole Centrale de Lyon, 2010. http://tel.archives-ouvertes.fr/tel-00599713.

Full text
Abstract:
This thesis is dedicated to the active research topic of generic Visual Object Categorization (VOC), which can be widely used in many applications such as video indexation and retrieval, video monitoring, security access control, automobile driving support, etc. Due to many realistic difficulties, it is still considered to be one of the most challenging problems in computer vision and pattern recognition. In this context, we have proposed in this thesis our contributions, especially concerning the two main components of methods addressing VOC problems, namely feature selection and image representation. Firstly, an Embedded Sequential Forward feature Selection algorithm (ESFS) has been proposed for VOC. Its aim is to select the most discriminant features for obtaining good categorization performance. It is mainly based on the commonly used sub-optimal search method Sequential Forward Selection (SFS), which relies on the simple principle of incrementally adding the most relevant features. However, ESFS not only adds the most relevant features incrementally at each step but also merges them in an embedded way, thanks to the concept of combined mass functions from evidence theory, which also offers the benefit of a computational cost much lower than that of the original SFS. Secondly, we have proposed novel image representations to model the visual content of an image, namely Polynomial Modeling and Statistical Measures based Image Representation, called PMIR and SMIR respectively. They allow overcoming the main drawback of the popular "bag of features" method, which is the difficulty of fixing the optimal size of the visual vocabulary. They have been tested along with our proposed region-based features and SIFT. Two different fusion strategies, early and late, have also been considered to merge information from different "channels" represented by the different types of features. Thirdly, we have proposed two approaches for VOC relying on sparse representation, including a reconstructive method (R_SROC) as well as a reconstructive and discriminative one (RD_SROC). Indeed, the sparse representation model was originally used in signal processing as a powerful tool for acquiring, representing and compressing high-dimensional signals. Thus, we have proposed to adapt these interesting principles to the VOC problem. R_SROC relies on the intuitive assumption that an image can be represented by a linear combination of training images from the same category. Therefore, the sparse representations of images are first computed by solving the ℓ1 norm minimization problem and then used as new feature vectors for images to be classified by traditional classifiers such as SVM. To improve the discrimination ability of the sparse representation to better fit the classification problem, we have also proposed RD_SROC, which adds a discrimination term, such as the Fisher discrimination measure or the output of an SVM classifier, to the standard sparse representation objective function in order to learn a reconstructive and discriminative dictionary. Moreover, we have also proposed to combine the reconstructive and discriminative dictionary with the adapted purely reconstructive dictionary for a given category so that the discrimination power can be further increased. The efficiency of all the methods proposed in this thesis has been evaluated on popular image datasets including SIMPLIcity, Caltech101 and Pascal2007.
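The reconstructive idea behind R_SROC can be sketched briefly: a test image is coded as a sparse linear combination of training images (an l1-penalized least-squares fit stands in here for the exact l1 minimization) and assigned to the category whose coefficients best reconstruct it. Data and parameters are illustrative.

    import numpy as np
    from sklearn.linear_model import Lasso

    def sparse_classify(D, labels, x, alpha=0.01):
        # D: (n_features, n_train) dictionary of training images (columns).
        code = Lasso(alpha=alpha, fit_intercept=False,
                     max_iter=5000).fit(D, x).coef_
        residuals = {}
        for c in np.unique(labels):
            part = np.where(labels == c, code, 0.0)  # class-c coefficients only
            residuals[c] = np.linalg.norm(x - D @ part)
        return min(residuals, key=residuals.get)

    rng = np.random.default_rng(1)
    D = rng.normal(size=(50, 40))
    labels = np.repeat([0, 1], 20)
    x = D[:, 3] + 0.05 * rng.normal(size=50)         # near a class-0 atom
    print(sparse_classify(D, labels, x))             # expected: 0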
APA, Harvard, Vancouver, ISO, and other styles
42

Winkler, Roland [Verfasser], and Rudolf [Akademischer Betreuer] Kruse. "Prototype based clustering in high-dimensional feature spaces / Roland Winkler. Betreuer: Rudolf Kruse." Magdeburg : Universitätsbibliothek, 2015. http://d-nb.info/1080560882/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Winkler, Roland [Verfasser], and Rudolf [Akademischer Betreuer] Kruse. "Prototype based clustering in high-dimensional feature spaces / Roland Winkler. Betreuer: Rudolf Kruse." Magdeburg : Universitätsbibliothek, 2015. http://nbn-resolving.de/urn:nbn:de:gbv:ma9:1-7159.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Tran, Antoine. "Object representation in local feature spaces : application to real-time tracking and detection." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLY010/document.

Full text
Abstract:
Visual representation is a fundamental problem in computer vision. The aim is to reduce the information to the strict necessary for a given task. Many types of representation exist, like color features (histograms, color attributes...), shape ones (derivatives, keypoints...) or filter banks. Low-level (and local) features are fast to compute. Their power of representation is limited, but their genericity is of interest for autonomous or multi-task systems, as higher-level features derive from them. We aim to build, then study the impact of, representations based only on low-level local features (color and spatial derivatives) for two tasks: generic object tracking, requiring features robust to changes in the aspect of the object and its environment over time; and object detection, for which the representation should describe an object class and cope with intra-class variations. Rather than building dedicated global object descriptors, we rely entirely on local features and on flexible statistical mechanisms to estimate their distribution (histograms) and their co-occurrences (Generalized Hough Transform). The Generalized Hough Transform (GHT), created for the detection of arbitrary shapes, consists in building a codebook modeling an object or a class, originally indexed by gradient orientation and later extended to other features. As we work on local features, we aim to remain close to the original GHT. In tracking, after presenting preliminary work combining the GHT with a particle filter (using a color histogram), we present a lighter and faster (100 fps) tracker that is more accurate and robust. We present a qualitative evaluation and study the impact of the features used (color space, formulation of the spatial derivatives). In detection, we used Gall's algorithm, known as Hough forests. Our goal is to reduce the feature space used by Gall by discarding the HOG features and keeping only the spatial derivatives and color features. To compensate for this reduction, we improved two training steps: the support of the local descriptors (patches) is partially chosen according to a geometrical measure, and node training is done by generating a specific probability map that takes into account the patches used at this step. With the reduced feature space, the detector is not more accurate; with the same features as Gall and the same training time, our work achieves identical results but with lower variance and thus better repeatability.
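The GHT voting mechanism at the heart of both parts can be sketched compactly: edge points index an R-table by quantized gradient orientation and cast displacement votes for the object centre. Synthetic points stand in here for real image gradients.

    import numpy as np

    def build_r_table(points, orientations, centre, n_bins=8):
        table = {b: [] for b in range(n_bins)}
        bins = (orientations / (2 * np.pi) * n_bins).astype(int) % n_bins
        for p, b in zip(points, bins):
            table[b].append(centre - p)            # displacement vote
        return table

    def vote(points, orientations, table, shape, n_bins=8):
        acc = np.zeros(shape, dtype=int)
        bins = (orientations / (2 * np.pi) * n_bins).astype(int) % n_bins
        for p, b in zip(points, bins):
            for d in table[b]:
                i, j = (p + d).astype(int)
                if 0 <= i < shape[0] and 0 <= j < shape[1]:
                    acc[i, j] += 1                 # vote for a centre location
        return acc

    pts = np.array([[10., 10.], [10., 20.], [20., 10.], [20., 20.]])
    ori = np.array([0.0, np.pi / 2, np.pi, 3 * np.pi / 2])
    table = build_r_table(pts, ori, centre=np.array([15., 15.]))
    acc = vote(pts + 5, ori, table, shape=(40, 40))    # object shifted by +5
    print(np.unravel_index(acc.argmax(), acc.shape))   # peak at (20, 20)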
APA, Harvard, Vancouver, ISO, and other styles
45

Van, der Walt Christiaan Maarten. "Maximum-likelihood kernel density estimation in high-dimensional feature spaces / C.M. van der Walt." Thesis, North-West University, 2014. http://hdl.handle.net/10394/10635.

Full text
Abstract:
With the advent of the internet and advances in computing power, the collection of very large high-dimensional datasets has become feasible; understanding and modelling high-dimensional data has thus become a crucial activity, especially in the field of pattern recognition. Since non-parametric density estimators are data-driven and do not require or impose a pre-defined probability density function on data, they are very powerful tools for probabilistic data modelling and analysis. Conventional non-parametric density estimation methods, however, originated from the field of statistics and were not originally intended to perform density estimation in high-dimensional feature spaces, as are often encountered in real-world pattern recognition tasks. Therefore we address the fundamental problem of non-parametric density estimation in high-dimensional feature spaces in this study. Recent advances in maximum-likelihood (ML) kernel density estimation have shown that kernel density estimators hold much promise for estimating non-parametric probability density functions in high-dimensional feature spaces. We therefore derive two new iterative kernel bandwidth estimators from the ML leave-one-out objective function and also introduce a new non-iterative kernel bandwidth estimator (based on the theoretical bounds of the ML bandwidths) for the purpose of bandwidth initialisation. We name the iterative kernel bandwidth estimators the minimum leave-one-out entropy (MLE) and global MLE estimators, and name the non-iterative kernel bandwidth estimator the MLE rule-of-thumb estimator. We compare the performance of the MLE rule-of-thumb estimator and conventional kernel density estimators on artificial data with data properties that are varied in a controlled fashion, and on a number of representative real-world pattern recognition tasks, to gain a better understanding of the behaviour of these estimators in high-dimensional spaces and to determine whether they are suitable for initialising the bandwidths of iterative ML bandwidth estimators in high dimensions. We find that there are several regularities in the relative performance of conventional kernel density estimators across different tasks and dimensionalities, and that the Silverman rule-of-thumb bandwidth estimator performs reliably across most tasks and dimensionalities of the pattern recognition datasets considered, even in high-dimensional feature spaces. Based on this empirical evidence and the intuitive theoretical motivation that the Silverman estimator optimises the asymptotic mean integrated squared error (assuming a Gaussian reference distribution), we select this estimator to initialise the bandwidths of the iterative ML kernel bandwidth estimators compared in our simulation studies. We then perform a comparative simulation study of the newly introduced iterative MLE estimators and other state-of-the-art iterative ML estimators on a number of artificial and real-world high-dimensional pattern recognition tasks. We illustrate with artificial data (guided by theoretical motivations) under what conditions certain estimators should be preferred, and we empirically confirm on real-world data that no estimator performs optimally on all tasks and that the optimal estimator depends on the properties of the underlying density function being estimated.
We also observe an interesting case of the bias-variance trade-off, where ML estimators with fewer parameters than the MLE estimator perform exceptionally well on a wide variety of tasks; however, in the cases where these estimators do not perform well, the MLE estimator generally does. The newly introduced MLE kernel bandwidth estimators prove to be a useful contribution to the field of pattern recognition, since they perform optimally on a number of the real-world pattern recognition tasks investigated and provide researchers and practitioners with two alternative estimators to employ for the task of kernel density estimation.
PhD (Information Technology), North-West University, Vaal Triangle Campus, 2014
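The leave-one-out ML objective that the MLE estimators optimise is easy to state in code; the sketch below evaluates it on a bandwidth grid for a single isotropic Gaussian kernel, whereas the thesis derives iterative estimators instead of a grid search. Data and grid are illustrative.

    import numpy as np

    def loo_log_likelihood(X, h):
        n, d = X.shape
        D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. dists
        K = np.exp(-D2 / (2 * h * h)) / ((2 * np.pi) ** (d / 2) * h ** d)
        np.fill_diagonal(K, 0.0)                             # leave one out
        return np.sum(np.log(K.sum(axis=1) / (n - 1)))

    X = np.random.default_rng(2).normal(size=(200, 5))
    grid = np.linspace(0.1, 2.0, 40)
    best = grid[np.argmax([loo_log_likelihood(X, h) for h in grid])]
    print(best)            # ML leave-one-out bandwidth on this sample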
APA, Harvard, Vancouver, ISO, and other styles
46

Baychev, Todor. "Pore space structure effects on flow in porous media." Thesis, University of Manchester, 2018. https://www.research.manchester.ac.uk/portal/en/theses/pore-space-structure-effects-on-flow-in-porous-media(5542173d-d6d1-4768-9f38-4b41254fa194).html.

Full text
Abstract:
Fluid flow in porous media is important for a number of fields including nuclear waste disposal, oil and gas, fuel cells, water treatment and civil engineering. The aim of this work is to improve the current understanding of how the pore space governs fluid flow in porous media in the context of nuclear waste disposal. The effects of biofilm formation on flow are also investigated. The thesis begins with a review of current porous media characterisation techniques and the means for converting the pore space into pore network models, along with their existing applications. Further, I review the current understanding of the biofilm lifecycle in the context of porous media and its interactions with fluid flow. The model porous medium used in this project is Hollington sandstone. The pore space of the material is characterised by X-ray CT, and the equivalent pore networks from two popular pore network extraction algorithms are compared comprehensively. The results indicate that different pore network extraction algorithms can interpret the same pore space rather differently. Despite these differences, the single-phase flow properties of the extracted networks are in good agreement with estimates from a direct approach. However, it is recommended that any flow or transport study using pore network modelling should entail a sensitivity study aiming to determine whether the model results are extraction-method specific. Following these results, a pore merging algorithm is introduced, aimed at improving the over-segmentation of long throats and hence the quality of the extracted statistics. The improved model is used to study quantitatively the pore space evolution of shale rock during pyrolysis. Next, the statistics extracted from one of the algorithms are used to explore the potential of regular pore network models for up-scaling the flow properties of porous materials. Analysis showed that the anisotropic flow properties observed in the irregular models are due to the different number of red (critical) features present along the flow direction. This observation is used to construct large regular models that can mimic that behaviour and to discuss the potential of estimating the flow properties of porous media based on their isotropic and anisotropic properties. Finally, a long-term flow-through column experiment is conducted to understand the effects of bacterial colonisation on flow in Hollington sandstone. The results show that such systems are quite complex and susceptible to perturbations. The flow properties of the sandstone were reduced significantly during the course of the experiment. The possible mechanisms responsible for the observed reductions in permeability are discussed, and the need to develop new imaging techniques that allow examining biofilm development in situ is underlined as necessary for drawing more definitive conclusions.
APA, Harvard, Vancouver, ISO, and other styles
47

Leeds, Daniel Demeny. "Assisted auscultation : creation and visualization of high dimensional feature spaces for the detection of mitral regurgitation." Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/36806.

Full text
Abstract:
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, June 2006.
"May 2006."
Includes bibliographical references (p. 83-84).
Cardiac auscultation, listening to the heart using a stethoscope, often constitutes the first step in the detection of common heart problems. Unfortunately, primary care physicians, who perform this initial screening, often lack the experience to correctly evaluate what they hear. False referrals are frequent, costing hundreds of dollars and hours of time for many patients. We report on a system we have built to aid medical practitioners in diagnosing Mitral Regurgitation (MR) based on heart sounds. Our work builds on the "prototypical beat" introduced by Syed in [17] to extract two different feature sets characterizing systolic acoustic activity. One feature set is derived from current medical knowledge. The other is based on unsupervised learning of systolic shapes, using component analysis. Our system employs self-organizing maps (SOMs) to depict the distribution of patients in each feature space as labels within a two-dimensional colored grid. A user screens new patients by viewing their projections onto the SOM and determining whether they are closer in space, and thus more similar, to patients with or without MR. We evaluated our system on 46 patients. Using a combination of the two feature sets, SOM-based diagnosis classified patients with accuracy similar to that of a cardiologist.
by Daniel Demeny Leeds.
M.Eng.
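The SOM projection step can be reproduced with a small self-contained implementation; the map size, feature count and training schedule below are illustrative assumptions (the abstract does not state them), and only the 46-patient count is taken from the text.

    import numpy as np

    rng = np.random.default_rng(3)
    data = rng.normal(size=(46, 12))           # 46 patients, 12 acoustic features
    H = W_ = 6                                 # 6x6 map
    W = rng.normal(size=(H, W_, 12))           # map weight vectors

    ii, jj = np.meshgrid(np.arange(H), np.arange(W_), indexing='ij')
    for t in range(2000):
        lr = 0.5 * (1 - t / 2000)              # decaying learning rate
        sigma = 3.0 * (1 - t / 2000) + 0.5     # shrinking neighbourhood
        x = data[rng.integers(len(data))]
        d2 = ((W - x) ** 2).sum(-1)
        bi, bj = np.unravel_index(d2.argmin(), d2.shape)   # best matching unit
        h = np.exp(-((ii - bi) ** 2 + (jj - bj) ** 2) / (2 * sigma ** 2))
        W += lr * h[..., None] * (x - W)

    x_new = rng.normal(size=12)                # screen a new patient
    print(np.unravel_index(((W - x_new) ** 2).sum(-1).argmin(), (H, W_)))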
APA, Harvard, Vancouver, ISO, and other styles
48

Bernard, Anne. "Développement de méthodes statistiques nécessaires à l'analyse de données génomiques : application à l'influence du polymorphisme génétique sur les caractéristiques cutanées individuelles et l'expression du vieillissement cutané." Phd thesis, Conservatoire national des arts et metiers - CNAM, 2013. http://tel.archives-ouvertes.fr/tel-00925074.

Full text
Abstract:
The new technologies developed in recent years in the field of genetics have made it possible to generate very high-dimensional databases, in particular of Single Nucleotide Polymorphisms (SNPs), these databases often being characterized by a number of variables far greater than the number of individuals. The objective of this work was to develop statistical methods suited to these high-dimensional datasets and able to select the most relevant variables with respect to the biological problem considered. In the first part of this work, a state of the art presents different unsupervised and supervised variable selection methods for two or more blocks of variables. In the second part, two new sparse unsupervised variable selection methods are proposed: Group Sparse Principal Component Analysis (GSPCA) and sparse Multiple Correspondence Analysis (sparse MCA). Viewed as regression problems with a group LASSO penalization, they lead to the selection of blocks of quantitative and qualitative variables, respectively. The third part is devoted to interactions between SNPs and, in this context, a specific interaction detection method, logic regression, is presented. Finally, the fourth part presents an application of these methods to a real SNP dataset in order to study the possible influence of genetic polymorphism on the expression of facial skin ageing in adult women. The developed methods gave promising results that met the biologists' expectations and open up interesting new research perspectives.
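The group-sparse principle behind GSPCA can be sketched as a rank-one PCA loading estimated by alternating least squares with a group soft-thresholding step, so that whole blocks of variables (e.g. the SNPs of one gene) are kept or dropped together. Groups, penalty and data below are illustrative, not the method's exact formulation.

    import numpy as np

    def group_sparse_loading(X, groups, lam, n_iter=100):
        u = np.linalg.svd(X, full_matrices=False)[0][:, 0]
        for _ in range(n_iter):
            v = X.T @ u
            for g in groups:                   # group soft-thresholding
                ng = np.linalg.norm(v[g])
                v[g] *= (max(0.0, 1 - lam / ng) if ng > 0 else 0.0)
            if not np.any(v):
                break
            u = X @ v
            u /= np.linalg.norm(u)
        return v / max(np.linalg.norm(v), 1e-12)   # sparse loading vector

    X = np.random.default_rng(4).normal(size=(60, 12))
    groups = [range(0, 4), range(4, 8), range(8, 12)]  # 3 blocks of 4 variables
    print(np.round(group_sparse_loading(X, groups, lam=2.0), 2))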
APA, Harvard, Vancouver, ISO, and other styles
49

Mamani, Gladys Marleny Hilasaca. "Empregando técnicas de projeção multidimensional para transformação interativa de espaços de características." Universidade de São Paulo, 2012. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-20022013-163023/.

Full text
Abstract:
Although current technology allows storing large volumes of data, their exploration and understanding remain challenges, not only due to the size of the produced datasets but also their complexity. In this sense, information visualization has proven to be an extremely powerful instrument to help users interpret and extract useful information from this universe of data. Among the existing approaches, multidimensional projection techniques are emerging as an important visualization tool in applications involving visual analysis of high-dimensional data, due to the analytical power these techniques offer in the exploration of similarity relations and abstract data correlations. However, the results obtained by these techniques are closely tied to the quality of the feature space describing the data being processed. If the space is well formed and reflects the similarity relations expected by a user, the final results will be satisfactory. Otherwise, the created visual representations will be of little use. In this master's project, multidimensional projection techniques are employed not only to explore multidimensional data sets, but also to serve as a guide in a process that aims to "mold" feature spaces. The proposed approach is based on the combination of projections of samples and local mappings, allowing the user to interactively transform the data attributes by modifying these projections. Specifically, the new similarity relations created by the user in manipulating the projections of the samples are propagated to the feature space that describes the data, transforming it into a new space that reflects these relationships, i.e., the user's point of view on the similarities and differences in the data. Experimental results show that the approach developed in this project can successfully transform feature spaces based on the manipulation of projections of small samples, improving the cohesion and separation of groups. Based on the created framework, a content-based image retrieval system is suggested, showing that the developed approach can be very useful in this type of application.
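The interaction loop can be illustrated with a deliberately simple stand-in for the local mappings: fit a linear transform so that the transformed features of the sample match the user-manipulated 2-D layout, then apply it to the whole dataset. The data and the plain least-squares fit are illustrative assumptions, not the thesis method.

    import numpy as np

    rng = np.random.default_rng(5)
    X_sample = rng.normal(size=(15, 8))    # projected sample, 8 features
    P_user = rng.normal(size=(15, 2))      # sample positions after user edits

    # 8x2 transform mapping features to the manipulated layout
    W, *_ = np.linalg.lstsq(X_sample, P_user, rcond=None)
    X_all = rng.normal(size=(500, 8))
    P_all = X_all @ W                      # propagate to the full dataset
    print(P_all.shape)                     # (500, 2)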
APA, Harvard, Vancouver, ISO, and other styles
50

Truong, Hoang Vinh. "Multi color space LBP-based feature selection for texture classification." Thesis, Littoral, 2018. http://www.theses.fr/2018DUNK0468/document.

Full text
Abstract:
Texture analysis has been extensively studied and a wide variety of description approaches have been proposed. Among them, the Local Binary Pattern (LBP) plays an essential part in most color image analysis and pattern recognition applications. Usually, devices acquire images and code them in the RGB color space. However, there are many color spaces for texture classification, each one having specific properties that impact performance. In order to avoid the difficulty of choosing a relevant space, the multi color space strategy allows using the properties of several spaces simultaneously. However, this strategy increases the number of features extracted from LBP applied to color images. This work is focused on reducing the dimensionality of this LBP-based feature space by means of feature selection methods. In this framework, we consider the LBP histogram representation of color textures and propose joint bin selection and multi-space histogram selection approaches for supervised texture classification. Extensive experiments conducted on several benchmark color texture databases demonstrate that the proposed approaches can improve on state-of-the-art classification results.
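The candidate feature set can be sketched with scikit-image: one LBP histogram per channel of several color spaces, concatenated into the high-dimensional vector that the selection methods then prune. The parameters (P=8, R=1, uniform patterns) are common defaults, not necessarily those of the thesis.

    import numpy as np
    from skimage import color
    from skimage.feature import local_binary_pattern

    def multi_space_lbp(rgb, P=8, R=1):
        spaces = [rgb, color.rgb2hsv(rgb), color.rgb2lab(rgb)]
        hists = []
        for img in spaces:
            for c in range(3):             # one histogram per channel
                lbp = local_binary_pattern(img[..., c], P, R, method='uniform')
                h, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2),
                                    density=True)
                hists.append(h)
        return np.concatenate(hists)       # candidate feature vector

    rgb = np.random.rand(64, 64, 3)
    print(multi_space_lbp(rgb).shape)      # (90,) = 9 channels x 10 bins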
APA, Harvard, Vancouver, ISO, and other styles