Dissertations / Theses on the topic 'Structured sparsity model'

Below are the top 15 dissertations / theses on the topic 'Structured sparsity model.'

1

Tillander, Annika. "Classification models for high-dimensional data with sparsity patterns." Doctoral thesis, Stockholms universitet, Statistiska institutionen, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-95664.

Full text
Abstract:
Today's high-throughput data collection devices, e.g. spectrometers and gene chips, create information in abundance. However, this poses serious statistical challenges, as the number of features is usually much larger than the number of observed units. Further, in this high-dimensional setting, only a small fraction of the features are likely to be informative for any specific project. In this thesis, three different approaches to two-class supervised classification in this high-dimensional, low-sample setting are considered. There are classifiers that are known to mitigate the issues of high dimensionality, e.g. distance-based classifiers such as Naive Bayes. However, these classifiers are often computationally intensive and are considerably less time-consuming on discrete data; hence, continuous features are often transformed into discrete features. In the first paper, a discretization algorithm suitable for high-dimensional data is suggested and compared with other discretization approaches. Further, the effect of discretization on misclassification probability in the high-dimensional setting is evaluated. Linear classifiers are more stable, which motivates adjusting the linear discriminant procedure to the high-dimensional setting. In the second paper, a two-stage estimation procedure for the inverse covariance matrix, applying Lasso-based regularization and Cuthill-McKee ordering, is suggested. The estimation gives a block-diagonal approximation of the covariance matrix, which in turn leads to an additive classifier. In the third paper, an asymptotic framework that represents sparse and weak block models is derived and a technique for block-wise feature selection is proposed. Probabilistic classifiers have the advantage of providing the probability of membership in each class for new observations, rather than simply assigning them to a class. In the fourth paper, a method is developed for constructing a Bayesian predictive classifier. Given the block-diagonal covariance matrix, the resulting Bayesian predictive and marginal classifier provides an efficient solution to the high-dimensional problem by splitting it into smaller tractable problems. The relevance and benefits of the proposed methods are illustrated using both simulated and real data.
With today's technology, for example spectrometers and gene chips, data are generated in large quantities. This abundance of data is not only an advantage; it also causes problems, since the number of variables (p) is typically much larger than the number of observations (n). This yields so-called high-dimensional data, which requires new statistical methods, as the traditional methods were developed for the reverse situation (p < n). Moreover, usually only very few of these variables are relevant for any given project, and the strength of the information in the relevant variables is often weak; this type of data is therefore described as sparse and weak. Identifying the relevant variables is usually likened to finding a needle in a haystack. This thesis considers three ways of performing classification in this type of high-dimensional data, where classification means that, given a data set containing both explanatory variables and an outcome variable, a function or algorithm is trained to predict the outcome variable from the explanatory variables alone. The real data used in the thesis are microarrays, i.e. cell samples showing the activity of the genes in the cell. The goal of the classification is to use the variation in activity across thousands of genes (the explanatory variables) to determine whether a cell sample comes from cancer tissue or normal tissue (the outcome variable). There are classification methods that can handle high-dimensional data, but these are often computationally intensive and thus work better for discrete data; by transforming continuous variables into discrete ones (discretization), the computation time can be reduced and the classification made more efficient. The thesis studies how discretization affects the predictive accuracy of the classification and proposes a very efficient discretization method for high-dimensional data. Linear classification methods have the advantage of being stable, but they require an invertible covariance matrix, which the covariance matrix of high-dimensional data is not. The thesis proposes a way of estimating the inverse of sparse covariance matrices by a block-diagonal matrix. Such a matrix also has the advantage of leading to an additive classifier, which makes it possible to select whole blocks of relevant variables, and a method for identifying and selecting the blocks is also presented. Finally, there are probabilistic classification methods, which have the advantage of giving the probability of each possible outcome for an observation instead of merely predicting the outcome; such a Bayesian method is proposed, given the block-diagonal matrix and normally distributed outcome classes. The relevance and advantages of the proposed methods are demonstrated by applying them to simulated and real high-dimensional data.
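The block-diagonal covariance approximation described in this abstract turns linear discriminant analysis into an additive classifier: the discriminant score is a sum of block-wise scores. The following is a minimal numpy sketch of that idea only, not the thesis code; the block partition, the ridge term, the pooled covariance estimate, and all variable names are illustrative assumptions.

```python
import numpy as np

def block_lda_scores(X_train, y_train, X_test, blocks, ridge=1e-3):
    """Additive LDA: sum of block-wise linear discriminant scores.

    blocks : list of index arrays partitioning the feature set.
    A small ridge term keeps each block-wise covariance invertible.
    """
    scores = np.zeros(len(X_test))
    for idx in blocks:
        X0, X1 = X_train[y_train == 0][:, idx], X_train[y_train == 1][:, idx]
        mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
        # pooled within-class covariance for this block only
        S = ((X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)) / (len(X0) + len(X1) - 2)
        S += ridge * np.eye(len(idx))
        w = np.linalg.solve(S, mu1 - mu0)            # block-wise discriminant direction
        scores += (X_test[:, idx] - (mu0 + mu1) / 2) @ w
    return scores                                     # classify as class 1 if score > 0

# toy example: 40 samples, 100 features split into 10 blocks
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 100)); y = np.repeat([0, 1], 20)
X[y == 1, :5] += 1.0                                  # informative features in the first block
blocks = np.array_split(np.arange(100), 10)
print(block_lda_scores(X, y, X, blocks)[:5])
```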
2

Vinyes, Marina. "Convex matrix sparsity for demixing with an application to graphical model structure estimation." Thesis, Paris Est, 2018. http://www.theses.fr/2018PESC1130/document.

Full text
Abstract:
The goal of machine learning is to learn, from some data, a model that will make accurate predictions on data it has not seen before. In order to obtain a model that generalizes to new data, and to avoid overfitting, we need to restrain the model. These restrictions usually encode some a priori knowledge of the structure of the model. The first approaches considered included regularization, initially ridge (Tikhonov) regression and later the Lasso, which induces sparsity in the solution. Sparsity, also known as parsimony, has emerged as a fundamental concept in machine learning. Parsimonious models are appealing since they provide more interpretability and better generalization (avoiding overfitting) through a reduced number of parameters. Beyond general sparsity, in many cases models are constrained structurally so that they have a simple representation in terms of some fundamental elements, consisting for example of a collection of specific vectors, matrices or tensors. These fundamental elements are called atoms. In this context, atomic norms provide a general framework for estimating such models. The goal of this thesis is to use the framework of convex sparsity provided by atomic norms to study a form of matrix sparsity. First, we develop an efficient algorithm based on Frank-Wolfe methods that is particularly suited to solving problems regularized by an atomic norm. Then, we focus on the structure estimation of Gaussian graphical models, where the structure of the graph is encoded in the precision matrix, and study the case with unobserved variables. We propose a convex formulation with an algorithmic approach and provide a theoretical result that states necessary conditions for recovering the desired structure. Finally, we consider the problem of demixing a signal into two or more components via the minimization of a sum of norms or gauges, each encoding a structural prior on the corresponding component to recover. In particular, we provide general exact recovery guarantees in the noiseless setting based on incoherence measures.
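A standard instance of the Frank-Wolfe approach mentioned in this abstract is minimizing a smooth loss over a nuclear-norm ball, where each linear minimization step only needs the top singular vector pair of the gradient. A minimal sketch for matrix completion under that constraint follows; the problem sizes, the radius `tau`, and the step-size rule are illustrative assumptions, not the algorithm developed in the thesis.

```python
import numpy as np

def frank_wolfe_completion(Y, mask, tau=50.0, iters=200):
    """Minimize 0.5*||mask*(X - Y)||_F^2 subject to ||X||_* <= tau."""
    X = np.zeros_like(Y)
    for k in range(iters):
        G = mask * (X - Y)                      # gradient of the smooth loss
        U, _, Vt = np.linalg.svd(-G)            # linear minimization oracle:
        S = tau * np.outer(U[:, 0], Vt[0])      # atom = tau * (top singular pair)
        X += 2.0 / (k + 2) * (S - X)            # standard Frank-Wolfe step size
    return X

rng = np.random.default_rng(0)
M = rng.normal(size=(30, 3)) @ rng.normal(size=(3, 20))   # rank-3 ground truth
mask = rng.random(M.shape) < 0.5                          # observe half the entries
X_hat = frank_wolfe_completion(mask * M, mask)
print("relative error:", np.linalg.norm(X_hat - M) / np.linalg.norm(M))
```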
3

Smith, Chandler B. "Sparsity Constrained Inverse Problems - Application to Vibration-based Structural Health Monitoring." ScholarWorks @ UVM, 2019. https://scholarworks.uvm.edu/graddis/1143.

Full text
Abstract:
Vibration-based structural health monitoring (SHM) seeks to detect, quantify, locate, and prognosticate damage by processing vibration signals measured while the structure is operational. The basic premise of vibration-based SHM is that damage will affect the stiffness, mass or energy dissipation properties of the structure and in turn alter its measured dynamic characteristics. In order to make SHM a practical technology, it is necessary to perform damage assessment using only a minimum number of permanently installed sensors. Deducing damage at unmeasured regions of the structural domain requires solving an inverse problem that is underdetermined and/or ill-conditioned. In addition, the effects of local damage on global vibration response may be overshadowed by the effects of modelling error, environmental changes, sensor noise, and unmeasured excitation. These theoretical and practical challenges render the damage identification inverse problem ill-posed, and in some cases unsolvable with conventional inverse methods. This dissertation proposes and tests a novel interpretation of the damage identification inverse problem. Since damage is inherently local and strictly reduces stiffness and/or mass, the underdetermined inverse problem can be made uniquely solvable by imposing either sparsity or non-negativity on the solution space. The goal of this research is to leverage this concept in order to prove that damage identification can be performed in practical applications using significantly fewer measurements than conventional inverse methods require. This dissertation investigates two sparsity-inducing methods, L1-norm optimization and non-negative least squares, in their application to identifying damage from eigenvalues, a minimal sensor-based feature that results in an underdetermined inverse problem. This work presents necessary conditions for solution uniqueness and a method to quantify the bounds on the non-unique solution space. The proposed methods are investigated using a wide range of numerical simulations and validated using a four-story lab-scale frame and a full-scale 17 m long aluminum truss. The findings of this study suggest that leveraging the attributes of both L1-norm optimization and non-negative constrained least squares can provide significant improvement over their standalone applications and over other existing methods of damage detection.
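The non-negativity idea above has a compact linear-algebra form: measured eigenvalue changes are approximately a sensitivity matrix times non-negative element-wise stiffness reductions, and the underdetermined system is solved with non-negative least squares. The sketch below illustrates this with a random sensitivity matrix; the dimensions, the noise-free setup, and the use of `scipy.optimize.nnls` are assumptions made for illustration, not the dissertation's implementation.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)
n_modes, n_elems = 6, 40                     # 6 measured eigenvalues, 40 candidate elements
S = rng.random((n_modes, n_elems))           # eigenvalue sensitivities d(lambda_i)/d(alpha_j)

alpha_true = np.zeros(n_elems)               # sparse, non-negative stiffness reductions
alpha_true[[12, 27]] = [0.30, 0.15]          # damage at elements 12 and 27
d_lambda = S @ alpha_true                    # "measured" eigenvalue changes (noise-free)

alpha_hat, residual = nnls(S, d_lambda)      # underdetermined, but non-negativity constrains it
print("candidate damaged elements:", np.nonzero(alpha_hat > 1e-6)[0], "residual:", residual)
```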
4

Kim, Yookyung. "Compressed Sensing Reconstruction Using Structural Dependency Models." Diss., The University of Arizona, 2012. http://hdl.handle.net/10150/238613.

Full text
Abstract:
Compressed sensing (CS) theory has demonstrated that sparse signals can be reconstructed from far fewer measurements than suggested by the Nyquist sampling theory. CS has received great attention recently as an alternative to the current paradigm of sampling followed by compression. Early CS methods operated under the implicit assumption that the sparsity-domain coefficients are independently distributed. Recent results, however, show that exploiting statistical dependencies in sparse signals improves the recovery performance of CS. This dissertation proposes model-based CS reconstruction techniques. Statistical dependency models for several CS problems are proposed and incorporated into different CS algorithms. These models allow incorporation of a priori information into the CS reconstruction problems. Firstly, we propose the use of a Bayes least squares-Gaussian scale mixtures (BLS-GSM) model for CS recovery of natural images. The BLS-GSM model is able to exploit dependencies inherent in wavelet coefficients. This model is incorporated into several recent CS algorithms. The resulting methods significantly reduce reconstruction errors and/or the number of measurements required to obtain a desired reconstruction quality, when compared to state-of-the-art model-based CS methods in the literature. The model-based CS reconstruction techniques are then extended to video. In addition to spatial dependencies, video sequences exhibit significant temporal dependencies as well. In this dissertation, a model for jointly exploiting spatial and temporal dependencies in video CS is also proposed. The proposed method enforces structural self-similarity of image blocks within each frame as well as across neighboring frames. By sparsely representing collections of similar blocks, dominant image structures are retained while noise and incoherent undersampling artifacts are eliminated. A new video CS algorithm which incorporates this model is then proposed. The proposed algorithm iterates between enforcement of the self-similarity model and consistency with measurements. By enforcing measurement consistency in the residual domain, sparsity is increased and CS reconstruction performance is enhanced. The proposed approach exhibits superior subjective image quality and significantly improves peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM). Finally, a model-based CS framework is proposed for super resolution (SR) reconstruction. The SR reconstruction is formulated as a CS problem and a self-similarity model is incorporated into the reconstruction. The proposed model enforces similarity of collections of blocks through shrinkage of their transform-domain coefficients. A sharpening operation is performed in the transform domain to emphasize edge recovery. The proposed method is compared with state-of-the-art SR techniques and provides high-quality SR images, both quantitatively and subjectively.
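The alternation described above (enforce a signal model, then enforce consistency with the measurements) has a simple unstructured ancestor: iterative soft thresholding for sparse recovery. The sketch below shows only that baseline; replacing the plain soft-threshold step with a BLS-GSM or self-similarity model would give the model-based variants discussed in the abstract. The sizes, step size, and regularization level are illustrative assumptions.

```python
import numpy as np

def ista(A, y, lam=0.05, iters=300):
    """Iterative soft thresholding for min 0.5*||A x - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x - (A.T @ (A @ x - y)) / L      # gradient step = measurement consistency
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # signal model (sparsity)
    return x

rng = np.random.default_rng(0)
n, m, k = 200, 80, 8                         # signal length, measurements, sparsity
x_true = np.zeros(n); x_true[rng.choice(n, k, replace=False)] = rng.normal(size=k)
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_hat = ista(A, A @ x_true)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```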
5

McGrady, Christopher Dwain. "Linking Rheological and Processing Behavior to Molecular Structure in Sparsely-Branched Polyethylenes Using Constitutive Relationships." Diss., Virginia Tech, 2009. http://hdl.handle.net/10919/37924.

Full text
Abstract:
This dissertation works towards the larger objective of identifying and assessing the key features of molecular structure that lead to desired polymer processing performance, with the ultimate goal of being able to tailor-make specific macromolecules that yield the desired processing response. A series of eight well-characterized, high-density polyethylene (HDPE) resins, with varying degrees of sparse long chain branching (LCB) content, is used to study the effect of both LCB content and distribution on the rheological and commercial processing response using the Pom-pom constitutive relationship. A flow instability known as ductile failure in extensional flow required the development of a novel technique known as encapsulation in order to carry out shear-free rheological characterization. Ductile failure prevents the rheological measurement of transient stress growth at higher strains for certain strain-hardening materials. This reduces the accuracy of nonlinear parameters for constitutive equations fit from transient stress growth data, as well as their effectiveness in modeling extensionally driven processes such as film casting. An experimental technique to overcome ductile failure, called encapsulation, in which the material that undergoes ductile failure is surrounded by a resin that readily deforms homogeneously at higher strains, is introduced. A simple parallel model is shown to calculate the viscosity of the core material. The effect of sparse long chain branching, LCB, on the film-casting process is analyzed at various drawdown ratios. A full rheological characterization in both shear and shear-free flows is also presented. At low drawdown ratios, the low-density polyethylenes, LDPE, exhibited the least degree of necking at distances less than the HDPE frostline. The sparsely-branched HDPE resin films had similar final film widths that were larger than those of the linear HDPE. As the drawdown ratio was increased, film width profiles separated based on branching level. Small amounts of LCB were found to reduce the amount of necking at intermediate drawdown ratios. At higher drawdown ratios, the sparsely-branched HDPE resins of lower LCB content had film widths that mimicked that of the linear HDPE, while the sparsely-branched HDPE resins of higher LCB content retained a larger film width. Molecular structural analysis via the Pom-pom constitutive model suggested that branching distributed across a larger range of backbone lengths serves to improve resistance to necking. As the drawdown ratio increased, the length of the backbones dominating the response decreased, so that the linear chains were controlling the necking behavior of the sparsely-branched resins of lower LCB content while remaining in the branched regime for higher LCB content HDPEs. Other processing variables such as shear viscosity magnitude, extrudate swell, and non-isothermal processing conditions were eliminated as contributing factors to the differences in the film width profile. The effect of sparse long chain branching, LCB, on the shear step-strain relaxation modulus is analyzed using a series of eight well-characterized, high-density polyethylene (HDPE) resins. The motivation for this work is in assessing the ability of step-strain flows to provide specific information about a material's branching architecture. Fundamental to this goal is proving the validity of relaxation moduli data at times shorter than the onset of time-strain separability.
Strains of 1% to 1250% are imposed on materials with LCB content ranging from zero to 3.33 LCB per 10,000 carbon atoms. All materials are observed to obey time-strain separation beyond some characteristic time, τk. The presence of LCB is observed to increase the value of τk relative to the linear resin. Furthermore, the amount of LCB content is seen to correlate positively with increasing τk. The behavior of the relaxation modulus at times shorter than τk is investigated by an analysis of the enhancement seen in the linear relaxation modulus, G0(t), as a function of strain and LCB content. This enhancement is seen to 1) increase with increasing strain in all resins, 2) be significantly larger in the sparsely-branched HDPE resins relative to the linear HDPE resin, and 3) increase in magnitude with increasing LCB content. The shape and smoothness of the damping function is investigated to rule out the presence of wall-slip and material rupture during testing. The finite rise time to impose the desired strain is carefully monitored and compared to the Rouse relaxation time of the linear HDPE resins studied. Sparse LCB is found to increase the magnitude of the relaxation modulus at short times relative to the linear resin. It is shown that these differences are due to variations in the material architecture, specifically LCB content, and not because of mechanical anomalies.
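Time-strain separability as used above means the relaxation modulus factors as G(t, γ) = h(γ) G0(t) for times beyond the characteristic time τk, so the damping function h(γ) can be read off as a ratio of moduli. A minimal numerical sketch with synthetic single-mode data is below; the modulus values, strains, damping form, and τk are invented purely for illustration.

```python
import numpy as np

# synthetic step-strain data: single-mode relaxation with an assumed damping function
t = np.logspace(-1, 2, 60)                      # time after the step [s]
strains = np.array([0.01, 1.0, 3.0, 6.0])       # imposed step strains
h_true = 1.0 / (1.0 + 0.2 * strains**2)         # assumed damping function
G0 = 1e5 * np.exp(-t / 10.0)                    # linear relaxation modulus G0(t) [Pa]
G = h_true[:, None] * G0[None, :]               # separable region: G(t, gamma) = h(gamma) G0(t)

tau_k = 1.0                                     # characteristic time for separability
sep = t > tau_k
# estimate the damping function by averaging G(t, gamma)/G0(t) over the separable window
h_est = (G[:, sep] / G0[sep]).mean(axis=1)
print(np.round(h_est / h_est[0], 3))            # normalized damping function vs. strain
```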
Ph. D.
6

Yan, Enxu. "Sublinear-Time Learning and Inference for High-Dimensional Models." Research Showcase @ CMU, 2018. http://repository.cmu.edu/dissertations/1207.

Full text
Abstract:
Across domains, the scale of data and the complexity of models have both been increasing greatly in recent years. For many models of interest, tractable learning and inference without access to expensive computational resources have become challenging. In this thesis, we approach efficient learning and inference by leveraging sparse structures inherent in the learning objective, which allows us to develop algorithms sublinear in the size of parameters without compromising the accuracy of models. In particular, we address the following three questions for each problem of interest: (a) how to formulate model estimation as an optimization problem with tractable sparse structure, (b) how to efficiently, i.e. in sublinear time, search, maintain, and utilize the sparse structures during training and inference, and (c) how to guarantee fast convergence of our optimization algorithm despite its greedy nature. By answering these questions, we develop state-of-the-art algorithms in varied domains. Specifically, in the extreme classification domain, we utilize primal and dual sparse structures to develop greedy algorithms of complexity sublinear in the number of classes, which obtain state-of-the-art accuracies on several benchmark data sets with one or two orders of magnitude speedup over existing algorithms. We also apply the primal-dual-sparse theory to develop a state-of-the-art trimming algorithm for Deep Neural Networks, which sparsifies neuron connections of a DNN with a task-dependent theoretical guarantee, resulting in models of smaller storage cost and faster inference speed. When it comes to structured prediction problems (i.e. graphical models) with inter-dependent outputs, we propose decomposition methods that exploit sparse messages to decompose a structured learning problem over large output domains into factorwise learning modules amenable to sublinear-time optimization methods, leading to practically much faster alternatives to existing learning algorithms. The decomposition technique is especially effective when combined with search data structures, such as those for Maximum Inner-Product Search (MIPS), to improve the learning efficiency jointly. Last but not least, we design novel convex estimators for a latent-variable model by reparameterizing it as a solution of sparse support in an exponentially high-dimensional space, and approximate it with a greedy algorithm, which yields the first polynomial-time approximation method for Latent-Feature Models and Generalized Mixed Regression without restrictive data assumptions.
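One building block behind the extreme-classification speedups described above is that, when the feature vector is sparse, scoring all classes only touches the weight columns of its nonzero features, and only a few most-violating classes need to be updated. A toy sketch of that scoring step follows (brute-force over classes; the thesis couples such steps with dual sparsity and MIPS data structures to make them sublinear). All sizes and names are illustrative assumptions.

```python
import numpy as np

def top_violating_classes(W, x_idx, x_val, y, k=5):
    """Score classes using only the nonzero features of x and return the
    k highest-scoring wrong classes (candidates for a dual update)."""
    scores = W[:, x_idx] @ x_val          # touches only |nnz(x)| columns of W
    scores[y] = -np.inf                   # exclude the true class
    return np.argpartition(-scores, k)[:k]

rng = np.random.default_rng(0)
n_classes, n_feats = 2_000, 1_000
W = rng.normal(size=(n_classes, n_feats)) * 0.01
x_idx = rng.choice(n_feats, size=20, replace=False)   # a 20-sparse example
x_val = rng.normal(size=20)
print(top_violating_classes(W, x_idx, x_val, y=3))
```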
7

Rösmann, Christoph [Verfasser], Torsten [Akademischer Betreuer] Bertram, and Martin [Gutachter] Mönnigmann. "Time-optimal nonlinear model predictive control : Direct transcription methods with variable discretization and structural sparsity exploitation / Christoph Rösmann ; Gutachter: Martin Mönnigmann ; Betreuer: Torsten Bertram." Dortmund : Universitätsbibliothek Dortmund, 2019. http://d-nb.info/1199106364/34.

Full text
8

Roulet, Vincent. "On the geometry of optimization problems and their structure." Thesis, Paris Sciences et Lettres (ComUE), 2017. http://www.theses.fr/2017PSLEE069/document.

Full text
Abstract:
In numerous fields such as machine learning, operational research or circuit design, a task is modeled by a set of parameters to be optimized in order to take the best possible decision. Formally, the problem amounts to minimizing a function describing the desired objective with iterative algorithms. The development of the latter then depends on the geometry of the function or the structure of the problem. In a first part, this thesis studies how the sharpness of a function around its minimizers can be exploited by restarting classical algorithms. Optimal schemes are presented for general convex problems. They require, however, a complete description of the function, which is rarely available. Adaptive strategies are therefore developed and shown to achieve nearly optimal rates. A specific analysis is then carried out for sparse problems that seek compressed representations of the variables of the problem. Their underlying conic geometry, which describes the sharpness of the objective, is shown to control both the statistical performance of the problem and the efficiency of dedicated optimization methods through a single quantity. A second part is dedicated to machine learning problems. These perform predictive analysis of data from large sets of examples. A generic framework is presented to both solve the prediction problem and simplify it by grouping either features, samples or tasks. Systematic algorithmic approaches are developed by analyzing the geometry induced by partitions of the data. A theoretical analysis is finally carried out for the case where features are grouped, by analogy with sparse methods.
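The restart idea discussed above can be illustrated with its simplest adaptive variant: run an accelerated gradient method and reset the momentum whenever the objective stops decreasing, which exploits sharpness without knowing the sharpness constants. A minimal sketch on a least-squares objective follows; the function-value restart test and the step size are standard textbook choices made for illustration, not the thesis' optimal or near-optimal schemes.

```python
import numpy as np

def accel_gradient_restart(grad, f, x0, step, iters=500):
    """Nesterov-style accelerated gradient with function-value adaptive restarts."""
    x, y_, t = x0.copy(), x0.copy(), 1.0
    f_prev = f(x)
    for _ in range(iters):
        x_new = y_ - step * grad(y_)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y_ = x_new + (t - 1) / t_new * (x_new - x)
        if f(x_new) > f_prev:                 # objective went up: restart the momentum
            y_, t_new = x_new.copy(), 1.0
        x, t, f_prev = x_new, t_new, f(x_new)
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 50)); b = rng.normal(size=100)
f = lambda x: 0.5 * np.linalg.norm(A @ x - b) ** 2
grad = lambda x: A.T @ (A @ x - b)
step = 1.0 / np.linalg.norm(A, 2) ** 2
x_star = accel_gradient_restart(grad, f, np.zeros(50), step)
print("gradient norm at solution:", np.linalg.norm(grad(x_star)))
```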
9

Kolar, Mladen. "Uncovering Structure in High-Dimensions: Networks and Multi-task Learning Problems." Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/229.

Full text
Abstract:
Extracting knowledge and providing insights into complex mechanisms underlying noisy high-dimensional data sets is of utmost importance in many scientific domains. Statistical modeling has become ubiquitous in the analysis of high-dimensional functional data in search of better understanding of cognition mechanisms, in the exploration of large-scale gene regulatory networks in the hope of developing drugs for lethal diseases, and in the prediction of stock market volatility in the hope of beating the market. Statistical analysis in these high-dimensional data sets is possible only if an estimation procedure exploits hidden structures underlying the data. This thesis develops flexible estimation procedures with provable theoretical guarantees for uncovering unknown hidden structures underlying the data generating process. Of particular interest are procedures that can be used on high-dimensional data sets where the number of samples n is much smaller than the ambient dimension p. Learning in high dimensions is difficult due to the curse of dimensionality; however, special problem structure makes inference possible. Due to its importance for scientific discovery, we put emphasis on consistent structure recovery throughout the thesis. Particular focus is given to two important problems, semi-parametric estimation of networks and feature selection in multi-task learning.
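A canonical example of the network structure recovery mentioned above is estimating the graph of a Gaussian graphical model by node-wise Lasso regressions (neighborhood selection): each variable is regressed on all others, and nonzero coefficients indicate edges. A minimal sketch is below; the use of scikit-learn's `Lasso`, the regularization level, and the OR-rule for combining neighborhoods are illustrative choices, not the procedures developed in the thesis.

```python
import numpy as np
from sklearn.linear_model import Lasso

def neighborhood_selection(X, alpha=0.1):
    """Estimate the edge set of a Gaussian graphical model by node-wise Lasso."""
    n, p = X.shape
    adj = np.zeros((p, p), dtype=bool)
    for j in range(p):
        others = np.delete(np.arange(p), j)
        coef = Lasso(alpha=alpha).fit(X[:, others], X[:, j]).coef_
        adj[j, others] = np.abs(coef) > 1e-8
    return adj | adj.T                        # OR-rule: keep an edge if either node selects it

# toy chain graph: X_j depends on X_{j-1}
rng = np.random.default_rng(0)
n, p = 400, 10
X = np.zeros((n, p)); X[:, 0] = rng.normal(size=n)
for j in range(1, p):
    X[:, j] = 0.6 * X[:, j - 1] + rng.normal(size=n)
print(neighborhood_selection(X).astype(int))
```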
10

Todeschini, Adrien. "Probabilistic and Bayesian nonparametric approaches for recommender systems and networks." Thesis, Bordeaux, 2016. http://www.theses.fr/2016BORD0237/document.

Full text
Abstract:
We propose two novel approaches for recommender systems and networks. In the first part, we first give an overview of recommender systems and concentrate on low-rank approaches for matrix completion. Building on a probabilistic approach, we propose novel penalty functions on the singular values of the low-rank matrix. By exploiting a mixture model representation of this penalty, we show that a suitably chosen set of latent variables enables us to derive an expectation-maximization algorithm to obtain a maximum a posteriori estimate of the completed low-rank matrix. The resulting algorithm is an iterative soft-thresholding algorithm which iteratively adapts the shrinkage coefficients associated to the singular values. The algorithm is simple to implement and can scale to large matrices. We provide numerical comparisons between our approach and recent alternatives showing the interest of the proposed approach for low-rank matrix completion. In the second part, we first introduce some background on Bayesian nonparametrics and in particular on completely random measures (CRMs) and their multivariate extension, the compound CRMs. We then propose a novel statistical model for sparse networks with overlapping community structure. The model is based on representing the graph as an exchangeable point process, and naturally generalizes existing probabilistic models with overlapping block-structure to the sparse regime. Our construction builds on vectors of CRMs and has interpretable parameters, each node being assigned a vector representing its level of affiliation to some latent communities. We develop methods for simulating this class of random graphs, as well as for performing posterior inference. We show that the proposed approach can recover interpretable structure from two real-world networks and can handle graphs with thousands of nodes and tens of thousands of edges.
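The iterative soft-thresholding scheme described above can be sketched as follows: impute the missing entries with the current estimate, take an SVD, and shrink each singular value with a coefficient that is adapted at every iteration (larger singular values are shrunk less). The shrinkage rule λ/(σ + ε) below is one simple adaptive choice made for illustration; it is not claimed to be the exact penalty derived in the thesis.

```python
import numpy as np

def adaptive_svt(Y, mask, lam=1.0, eps=1.0, iters=100):
    """EM-style matrix completion with adaptively weighted singular value shrinkage."""
    X = np.zeros_like(Y)
    for _ in range(iters):
        Z = np.where(mask, Y, X)                       # E-step: fill unobserved entries
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        w = lam / (s + eps)                            # adaptive shrinkage coefficients
        s_shr = np.maximum(s - w, 0.0)                 # soft-threshold each singular value
        X = (U * s_shr) @ Vt                           # M-step: low-rank update
    return X

rng = np.random.default_rng(0)
M = rng.normal(size=(40, 4)) @ rng.normal(size=(4, 30))
mask = rng.random(M.shape) < 0.4
X_hat = adaptive_svt(mask * M, mask)
print("relative error:", np.linalg.norm(X_hat - M) / np.linalg.norm(M))
```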
11

Suwanwimolkul, Suwichaya. "Adaptive Markov Random Fields for Structured Compressive Sensing." Thesis, 2018. http://hdl.handle.net/2440/117806.

Full text
Abstract:
Compressive sensing (CS) has underpinned recent developments in data compression and signal acquisition systems. The goal of CS is to recover a high-dimensional sparse signal from a few measurements. Recent progress in CS has attempted to further reduce the measurements by employing signal structures. This thesis presents a novel structured sparsity model, namely the adaptive Markov random field (MRF), to effectively extract the signal structures. The adaptive MRF achieves two desirable properties: flexibility (the ability to represent a wide range of structures) and adaptability (being adaptive to any structure). However, most existing work can achieve only one of these two properties. Previous MRF-based methods offer high flexibility but cannot adapt to new signal structures, while data-adaptive methods assume limited signal structures. The contribution of this thesis is therefore a set of novel and efficient signal recovery methods for CS. We propose to leverage the adaptability of the MRF by refining the MRF parameters based on a point estimate of the latent sparse signal, after which the sparse signal is estimated based on the resulting MRF. This method is termed Two-step-Adaptive MRF. To maximize adaptability, we also propose a new sparse signal estimation method that estimates the sparse signal, support, and noise parameters jointly. The point estimate of the latent sparse signal underpins the performance of MRF parameter estimation, but it cannot depict the statistical uncertainty of the latent sparse signal, which can lead to inaccurate parameter estimates and thus limit the ultimate signal recovery performance. Therefore, we reformulate the parameter estimation problem to offer better generalization over the latent sparse signals. We propose to obtain the MRF parameters from the given measurements by solving a maximum marginal likelihood (MML) problem. The resulting MML problem allows the MRF parameters to be estimated from measurements directly in one step; thus, we term this method One-step-Adaptive MRF. To solve the MML problem efficiently, we propose to approximate the MRF model with the product of two simpler distributions, which enables closed-form solutions for all unknown variables with low computational cost. Extensive experiments on three real-world datasets demonstrate the promising performance of Two-step-Adaptive MRF. One-step-Adaptive MRF further improves over the state-of-the-art methods. Motivated by this, we apply One-step-Adaptive MRF to collaborative-representation based classifications (CRCs) to extract the underlying information that can help identify the class label of the corresponding query sample. CRCs have offered state-of-the-art performance in wearable sensor-based human activity recognition when training samples are limited. Existing work is based on the shortest Euclidean distance to a query sample, which can be susceptible to noise and correlation in the training samples. To improve robustness, we employ the adaptive MRF to extract the underlying structure of a representation vector directly from the query sample to improve discriminative power, because the underlying structure is unique to its corresponding query sample and independent of the quality of the training samples. The adaptive MRF can be customized to further reduce the correlation in the training samples. Extensive experiments on two real-world datasets demonstrate the promising performance of the proposed method.
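The role the MRF plays above is to favor supports whose active entries cluster together. As a crude, heavily simplified stand-in for that idea, the sketch below runs iterative hard thresholding but selects the support using magnitudes smoothed over chain neighbors, so isolated spikes are discouraged. It is only meant to illustrate structure-aware support selection; it is not the adaptive MRF or the parameter estimation of the thesis, and all sizes and the smoothing weight are assumptions.

```python
import numpy as np

def structured_iht(A, y, k, beta=0.5, iters=100):
    """Iterative hard thresholding with a neighbor-smoothed support score
    (a toy stand-in for an MRF support prior on a 1-D signal)."""
    n = A.shape[1]
    x = np.zeros(n)
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    for _ in range(iters):
        z = x + step * (A.T @ (y - A @ x))              # measurement-consistency step
        mag = np.abs(z)
        score = mag + beta * (np.roll(mag, 1) + np.roll(mag, -1)) / 2   # favor clustered support
        support = np.argpartition(-score, k)[:k]
        x = np.zeros(n); x[support] = z[support]        # keep k entries, zero the rest
    return x

rng = np.random.default_rng(0)
n, m = 256, 80
x_true = np.zeros(n); x_true[100:110] = rng.normal(size=10)   # one contiguous block
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_hat = structured_iht(A, A @ x_true, k=10)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```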
Thesis (Ph.D.) -- University of Adelaide, School of Computer Science, 2018
12

Hwang, Sung Ju. "Discriminative object categorization with external semantic knowledge." 2013. http://hdl.handle.net/2152/21320.

Full text
Abstract:
Visual object category recognition is one of the most challenging problems in computer vision. Even assuming that we can obtain a near-perfect instance-level representation with the advances in visual input devices and low-level vision techniques, object categorization still remains a difficult problem because it requires drawing boundaries between instances in a continuous world, where the boundaries are solely defined by human conceptualization. Object categorization is essentially a perceptual process that takes place in a human-defined semantic space. In this semantic space, the categories reside not in isolation, but in relation to others. Some categories are similar, grouped, or co-occur, and some are not. However, despite this semantic nature of object categorization, most of today's automatic visual category recognition systems rely only on the category labels for training discriminative recognition with statistical machine learning techniques. In many cases, this could result in the recognition model being misled into learning incorrect associations between visual features and the semantic labels, essentially by overfitting to training set biases. This limits the model's prediction power when new test instances are given. Using semantic knowledge has great potential to benefit object category recognition. First, semantic knowledge could guide the training model to learn a correct association between visual features and the categories. Second, semantics provide much richer information beyond the membership information given by the labels, in the form of inter-category and category-attribute distances, relations, and structures. Finally, semantic knowledge scales well, as the relations between categories grow with an increasing number of categories. My goal in this thesis is to learn discriminative models for categorization that leverage semantic knowledge for object recognition, with a special focus on the semantic relationships among different categories and concepts. To this end, I explore three semantic sources, namely attributes, taxonomies, and analogies, and I show how to incorporate them into the original discriminative model as a form of structural regularization. In particular, for each form of semantic knowledge I present a feature learning approach that defines a semantic embedding to support the object categorization task. The regularization penalizes the models that deviate from the known structures according to the semantic knowledge provided. The first semantic source I explore is attributes, which are human-describable semantic characteristics of an instance. While existing work treated them as mid-level features which did not introduce new information, I focus on their potential as a means to better guide the learning of object categories, by enforcing the object category classifiers to share features with attribute classifiers in a multitask feature learning framework. This approach essentially discovers the common low-dimensional features that support predictions in both semantic spaces. Then, I move on to the semantic taxonomy, which is another valuable source of semantic knowledge. The merging and splitting criteria for the categories on a taxonomy are human-defined, and I aim to exploit this implicit semantic knowledge.
Specifically, I propose a tree of metrics (ToM) that learns metrics capturing granularity-specific similarities at different nodes of a given semantic taxonomy, and uses a regularizer to isolate granularity-specific disjoint features. This approach captures the intuition that the features used for the discrimination of the parent class should be different from the features used for the children classes. Such learned metrics can be used for hierarchical classification. The use of a single taxonomy can be limited in that its structure is not optimal for hierarchical classification, and there may exist no single optimal semantic taxonomy that perfectly aligns with visual distributions. Thus, I next propose a way to overcome this limitation by leveraging multiple taxonomies as semantic sources to exploit, combining the acquired complementary information across multiple semantic views and granularities. This allows us, for example, to synthesize semantics from both 'Biological'- and 'Appearance'-based taxonomies when learning the visual features. Finally, as a further exploration of more complex semantic relations beyond the previous two pairwise similarity-based models, I exploit analogies, which encode the relational similarities between two related pairs of categories. Specifically, I use analogies to regularize a discriminatively learned semantic embedding space for categorization, such that the displacements between the two category embeddings in both category pairs of the analogy are enforced to be the same. Such a constraint allows a more confusable pair of categories to benefit from the clear separation in the matched pair of categories that shares the same relation. All of these methods are evaluated on challenging public datasets, and are shown to effectively improve the recognition accuracy over purely discriminative models, while also guiding the recognition to be more semantic to human perception. Further, the applications of the proposed methods are not limited to visual object categorization in computer vision; they can be applied to any classification problem where there exists some domain knowledge about the relationships or structures between the classes. Possible applications of my methods outside the visual recognition domain include document classification in natural language processing, and gene-based animal or protein classification in computational biology.
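The feature sharing between category and attribute classifiers described above is commonly implemented with an l2,1 (row-sparse) penalty on a joint weight matrix, so the same few feature dimensions are used by all tasks. A minimal proximal-gradient sketch for multi-task least squares with that penalty follows; the sizes, penalty weight, and squared loss are illustrative assumptions, not the thesis' formulation.

```python
import numpy as np

def row_group_soft_threshold(W, thr):
    """Proximal operator of the l2,1 norm: shrink whole rows of W."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1 - thr / np.maximum(norms, 1e-12), 0.0)
    return W * scale

def multitask_feature_learning(X, Y, lam=0.5, iters=300):
    """min 0.5*||X W - Y||_F^2 + lam * sum over rows ||W_row||_2  (shared feature selection)."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    W = np.zeros((X.shape[1], Y.shape[1]))
    for _ in range(iters):
        W = row_group_soft_threshold(W - step * X.T @ (X @ W - Y), lam * step)
    return W

rng = np.random.default_rng(0)
n, d, tasks = 200, 50, 6                      # e.g. 6 tasks = a few categories plus attributes
W_true = np.zeros((d, tasks)); W_true[:8] = rng.normal(size=(8, tasks))   # 8 shared features
X = rng.normal(size=(n, d)); Y = X @ W_true + 0.1 * rng.normal(size=(n, tasks))
W_hat = multitask_feature_learning(X, Y)
print("selected feature rows:", np.flatnonzero(np.linalg.norm(W_hat, axis=1) > 1e-6))
```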
13

Mayrink, Vinicius Diniz. "Factor Models to Describe Linear and Non-linear Structure in High Dimensional Gene Expression Data." Diss., 2011. http://hdl.handle.net/10161/3865.

Full text
Abstract:

An important problem in the analysis of gene expression data is the identification of groups of features that are coherently expressed. For example, one often wishes to know whether a group of genes, clustered because of correlation in one data set, is still highly co-expressed in another data set. For some microarray platforms there are many, relatively short, probes for each gene of interest. In this case, it is possible that a given probe is not measuring its targeted transcript, but rather a different gene with a similar region (called cross-hybridization). Similarly, the incorrect mapping of short nucleotide sequences to a target gene is a common issue related to the young technology producing RNA-Seq data. The expression pattern across samples is a valuable source of information, which can be used to address distinct problems through the application of factor models. Our first study is focused on the identification of the presence/absence status of a gene in a sample. We compare our factor model to state-of-the-art detection methods; the results suggest superior performance of the factor analysis for detecting transcripts. In the second study, we apply factor models to investigate gene modules (groups of coherently expressed genes). Variation in the number of copies of regions of the genome is a well known and important feature of most cancers. Copy number alteration is detected for a group of genes in breast cancer; our goal is to examine this abnormality in the same chromosomal region for other types of tumors (Ovarian, Lung and Brain). In the third application, the expression pattern related to RNA-Seq count data is evaluated through a factor model based on the Poisson distribution. Here, the presence/absence of coherent patterns is closely associated with the number of incorrect read mappings. The final study of this dissertation is dedicated to the analysis of multi-factor models with linear and non-linear structure of interactions between latent factors. The interaction terms can have important implications in the model; they represent relationships between genes which cannot be captured in an ordinary analysis.
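A bare-bones version of the question "is this gene module coherently expressed?" raised above can be answered by fitting a one-factor model to the module's expression matrix and checking how much variance the single factor explains. The SVD-based sketch below is only that baseline, with simulated data; the thesis develops full Bayesian factor models, including a Poisson version for RNA-Seq counts.

```python
import numpy as np

def one_factor_coherence(E):
    """Fit a rank-1 factor model E ~ loadings * factor and return the
    fraction of (centered) variance explained by the single factor."""
    Ec = E - E.mean(axis=1, keepdims=True)        # center each gene across samples
    U, s, Vt = np.linalg.svd(Ec, full_matrices=False)
    return s[0] ** 2 / np.sum(s ** 2)             # proportion of variance explained

rng = np.random.default_rng(0)
factor = rng.normal(size=50)                      # latent activity over 50 samples
coherent = np.outer(rng.normal(size=20), factor) + 0.3 * rng.normal(size=(20, 50))
incoherent = rng.normal(size=(20, 50))
print("coherent module:", round(one_factor_coherence(coherent), 2),
      "| random genes:", round(one_factor_coherence(incoherent), 2))
```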


Dissertation
14

Xiong, Xin. "Efficient Jacobian Determination by Structure-Revealing Automatic Differentiation." Thesis, 2014. http://hdl.handle.net/10012/8197.

Full text
Abstract:
This thesis is concerned with the efficient computation of Jacobian matrices of nonlinear vector maps using automatic differentiation (AD). Specifically, we propose the use of two directed edge separator methods, the weighted minimum separator and natural order separator methods, to exploit the structure of the computational graph of the nonlinear system. This allows for the efficient determination of the Jacobian matrix using AD software. We will illustrate the promise of this approach with computational experiments.
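Forward-mode automatic differentiation, on which Jacobian computation like the above is built, can be demonstrated with a tiny dual-number class: each input is seeded in turn and derivatives propagate exactly through the arithmetic. The sketch below computes a dense Jacobian column by column and is purely illustrative; it does not implement the edge-separator structure exploitation proposed in the thesis, and the example map `F` is invented.

```python
import math

class Dual:
    """Dual number a + b*eps with eps^2 = 0; b carries the derivative."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.a + o.a, self.b + o.b)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.a * o.a, self.a * o.b + self.b * o.a)
    __rmul__ = __mul__

def sin(x):
    return Dual(math.sin(x.a), math.cos(x.a) * x.b)

def jacobian(F, x):
    """Dense Jacobian of F: R^n -> R^m via forward-mode AD, one column per input."""
    cols = []
    for i in range(len(x)):
        seeded = [Dual(v, 1.0 if j == i else 0.0) for j, v in enumerate(x)]
        cols.append([out.b for out in F(seeded)])
    return [list(row) for row in zip(*cols)]       # transpose columns into rows

F = lambda v: [v[0] * v[1], sin(v[0]) + 3.0 * v[2]]   # example nonlinear map R^3 -> R^2
print(jacobian(F, [1.0, 2.0, 0.5]))                   # [[2, 1, 0], [cos(1), 0, 3]]
```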
15

Sadhu, Ayan. "Decentralized Ambient System Identification of Structures." Thesis, 2013. http://hdl.handle.net/10012/7538.

Full text
Abstract:
Many of the existing ambient modal identification methods based on vibration data process information centrally to calculate the modal properties. Such methods demand relatively large memory and processing capabilities to interrogate the data. With the recent advances in wireless sensor technology, it is now possible to process information on the sensor itself. The decentralized information so obtained from individual sensors can be combined to estimate the global modal information of the structure. The main objective of this thesis is to present a new class of decentralized algorithms that can address the limitations stated above. The completed work in this regard involves casting the identification problem within the framework of underdetermined blind source separation (BSS). Time-frequency transformations of measurements are carried out, resulting in a sparse representation of the signals. Stationary wavelet packet transform (SWPT) is used as the primary means to obtain a sparse representation in the time-frequency domain. Several partial setups are used to obtain the partial modal information, which are then combined to obtain the global structural mode information. Most BSS methods in the context of modal identification assume that the excitation is white and does not contain narrow-band excitation frequencies. However, this assumption is not satisfied in many situations (e.g., pedestrian bridges) where the excitation is a superposition of narrow-band harmonic(s) and broad-band disturbance. Under such conditions, traditional BSS methods yield sources (modes) without any indication as to whether an identified source is a system mode or an excitation harmonic. In this research, a novel underdetermined BSS algorithm is developed involving statistical characterization of the sources, which is used to delineate the sources corresponding to external disturbances from the intrinsic modes of the system. Moreover, the issue of computational burden involving an over-complete dictionary of sparse bases is alleviated through a new underdetermined BSS method based on a tensor algebra tool called PARAllel FACtor (PARAFAC) decomposition. At the core of this method, the wavelet packet decomposition coefficients are used to form a covariance tensor, followed by PARAFAC tensor decomposition to separate the modal responses. Finally, the proposed methods are validated using measurements obtained from both wired and wireless sensors on laboratory-scale and full-scale buildings and bridges.
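The core BSS step described above can be illustrated with a much simpler pipeline: simulate two decaying modal responses, mix them through a mode-shape matrix as if they were measured at two sensors, and separate them with FastICA. This uses scikit-learn's FastICA only as a stand-in; the thesis instead works with sparse wavelet-packet representations and PARAFAC decomposition and handles underdetermined, decentralized settings. The frequencies, damping, and mode shapes below are invented for the example.

```python
import numpy as np
from sklearn.decomposition import FastICA

# two damped modal responses (free vibration), 2% damping
t = np.linspace(0, 10, 4000)
f1, f2, zeta = 1.5, 3.2, 0.02
q1 = np.exp(-zeta * 2 * np.pi * f1 * t) * np.cos(2 * np.pi * f1 * t)
q2 = np.exp(-zeta * 2 * np.pi * f2 * t) * np.cos(2 * np.pi * f2 * t)
Q = np.column_stack([q1, q2])                    # modal coordinates (the "sources")

Phi = np.array([[1.0, 0.8],                      # mode-shape (mixing) matrix
                [0.6, -1.1]])
X = Q @ Phi.T + 0.01 * np.random.default_rng(0).normal(size=(len(t), 2))  # sensor records

ica = FastICA(n_components=2, random_state=0)
Q_hat = ica.fit_transform(X)                     # recovered modal responses (up to scale/order)
Phi_hat = ica.mixing_                            # estimated mode shapes, one column per mode
print("estimated mode-shape matrix (unscaled):\n", Phi_hat)
```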