Dissertations / Theses on the topic 'Sparsification'


Consult the top 26 dissertations / theses for your research on the topic 'Sparsification.'


1

Camacho, Martin Ayalde. "Spectral sparsification." Thesis, Harvard University, 2014. http://nrs.harvard.edu/urn-3:HUL.InstRepos:12553868.

Abstract:
We survey recent literature focused on the following spectral sparsification question: given an integer n and ε > 0, does there exist a function N(n, ε) such that for every collection C_1, ..., C_m of n × n real symmetric positive semidefinite matrices whose sum is the identity, there exists a weighted subset of size N(n, ε) whose sum has eigenvalues lying between 1 - ε and 1 + ε? We present the algorithms for solving this problem given in [4, 8, 10]. These algorithms obtain N(n, ε) = O(n/ε^2), which is optimal up to constant factors, through use of the barrier method, a proof technique involving potential functions which control the locations of the eigenvalues of a matrix under certain matrix updates. We then survey the applications of this sparsification result and its proof techniques to graph sparsification [4, 10], low-rank matrix approximation [8], and estimating the covariance of certain distributions of random matrices [32, 26]. We end our survey by examining a multivariate generalization of the barrier method used in Marcus, Spielman, and Srivastava's recent proof [19] of the Kadison-Singer conjecture.
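For orientation, the barrier method mentioned above controls eigenvalues with two potential functions; the following is the standard formulation from the Batson-Spielman-Srivastava line of work, stated here as background rather than taken from this thesis:

```latex
% Upper and lower barrier potentials for a symmetric matrix A,
% with barrier values u > \lambda_{\max}(A) and l < \lambda_{\min}(A):
\Phi^{u}(A) = \operatorname{tr}\!\left[(uI - A)^{-1}\right], \qquad
\Phi_{l}(A) = \operatorname{tr}\!\left[(A - lI)^{-1}\right].
% Rank-one updates A + t\,vv^{T} are chosen so that both potentials never
% increase while the barriers u and l advance, which keeps every
% eigenvalue confined to the interval (l, u).
```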
2

Oliveira, Rafael (Rafael Mendes de Oliveira). "Spectral sparsification and spectrally thin trees." Thesis, Massachusetts Institute of Technology, 2012. http://hdl.handle.net/1721.1/88906.

Abstract:
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2012.
Cataloged from PDF version of thesis.
Includes bibliographical references (page 31).
We provide intensive experimental results investigating the existence of spectrally thin trees and unweighted spectral sparsifiers for graphs with small expansion. In addition, we survey and prove some partial results on the existence of spectrally thin trees in dense graphs with high enough expansion.
by Rafael Oliveira.
M. Eng.
3

Moitra, Ankur. "Vertex sparsification and universal rounding algorithms." Thesis, Massachusetts Institute of Technology, 2011. http://hdl.handle.net/1721.1/66019.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011.
Cataloged from PDF version of thesis.
Includes bibliographical references (p. 125-129).
Suppose we are given a gigantic communication network, but are only interested in a small number of nodes (clients). There are many routing problems we could be asked to solve for our clients. Is there a much smaller network - that we could write down on a sheet of paper and put in our pocket - that approximately preserves all the relevant communication properties of the original network? As we will demonstrate, the answer to this question is YES, and we call this smaller network a vertex sparsifier. In fact, if we are asked to solve a sequence of optimization problems characterized by cuts or flows, we can compute a good vertex sparsifier ONCE and discard the original network. We can run our algorithms (or approximation algorithms) on the vertex sparsifier as a proxy - and still recover approximately optimal solutions in the original network. This novel pattern saves both space (because the network we store is much smaller) and time (because our algorithms run on a much smaller graph). Additionally, we apply these ideas to obtain a master theorem for graph partitioning problems - as long as the integrality gap of a standard linear programming relaxation is bounded on trees, then the integrality gap is at most a logarithmic factor larger for general networks. This result implies optimal bounds for many well studied graph partitioning problems as a special case, and even yields optimal bounds for more challenging problems that had not been studied before. Morally, these results are all based on the idea that even though the structure of optimal solutions can be quite complicated, these solution values can be approximated by crude (even linear) functions.
by Ankur Moitra.
Ph.D.
4

Vial, John Francis Stephen. "Conservative Sparsification for Efficient Approximate Estimation." Thesis, The University of Sydney, 2013. http://hdl.handle.net/2123/9907.

Abstract:
Linear Gaussian systems often exhibit sparse structures. For systems which grow as a function of time, marginalisation of past states will eventually introduce extra non-zero elements into the information matrix of the Gaussian distribution. These extra non-zeros can lead to dense problems as these systems progress through time. This thesis proposes a method that can delete elements of the information matrix while maintaining guarantees about the conservativeness of the resulting estimate with a computational complexity that is a function of the connectivity of the graph rather than the problem dimension. This sparsification can be performed iteratively and minimises the Kullback Leibler Divergence (KLD) between the original and approximate distributions. This new technique is called Conservative Sparsification (CS). For large sparse graphs employing a Junction Tree (JT) for estimation, efficiency is related to the size of the largest clique. Conservative Sparsification can be applied to clique splitting in JTs, enabling approximate and efficient estimation in JTs with the same conservative guarantees as CS for information matrices. In distributed estimation scenarios which use JTs, CS can be performed in parallel and asynchronously on JT cliques. This approach usually results in a larger KLD compared with the optimal CS approach, but an upper bound on this increased divergence can be calculated with information locally available to each clique. This work has applications in large scale distributed linear estimation problems where the size of the problem or communication overheads make optimal linear estimation difficult.
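As a toy illustration of the conservativeness guarantee discussed above (not the KLD-minimizing update derived in the thesis), the sketch below deletes one off-diagonal entry of an information matrix and compensates on the diagonal, which is enough to keep the approximation conservative; the function name and the example matrix are hypothetical:

```python
import numpy as np

def sparsify_conservative(L, i, j):
    """Delete information-matrix entry (i, j) while staying conservative.

    Zeroes L[i, j] (and L[j, i]) and compensates by subtracting |L[i, j]|
    from the two affected diagonal entries, so that L' <= L in the PSD
    order; the resulting covariance inv(L') then upper-bounds inv(L).
    This simple diagonal compensation illustrates the conservativeness
    constraint only; it is not the KLD-optimal update of the thesis.
    """
    Lp = L.copy()
    b = Lp[i, j]
    Lp[i, j] = Lp[j, i] = 0.0
    Lp[i, i] -= abs(b)
    Lp[j, j] -= abs(b)
    return Lp

# toy 3x3 information matrix
L = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.0],
              [0.5, 0.0, 2.0]])
Lp = sparsify_conservative(L, 0, 2)
# L - L' is positive semidefinite, so the approximation is conservative
assert np.all(np.linalg.eigvalsh(L - Lp) >= -1e-12)
```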
5

Ortmann, Mark [Verfasser]. "Combinatorial Algorithms for Graph Sparsification / Mark Ortmann." Konstanz : Bibliothek der Universität Konstanz, 2017. http://d-nb.info/1173616438/34.

6

Di, Jinchao. "Gene Co-expression Network Mining Using Graph Sparsification." The Ohio State University, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=osu1367583964.

7

Dahlin, Oskar. "Implementing and Evaluating sparsification methods in probabilistic networks." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-428591.

Abstract:
Most queries on probabilistic networks assume possible-world semantics, which causes an exponential increase in execution time. Deterministic networks can apply sparsification methods to reduce their size while preserving some structural properties, but there were no equivalent methods for probabilistic networks until recently. As a first work in the field, Parchas, Papailiou, Papadias and Bonchi proposed sparsification methods for probabilistic networks by adapting a gradient descent and an expectation-maximization algorithm. In this report the two proposed algorithms, Gradient Descent Backbone (GDB) and Expectation-Maximization Degree (EMD), were implemented and evaluated on different input parameters by comparing how well general graph properties, expected vertex degrees and ego betweenness approximations are preserved after sparsifying different datasets. In the sparsified networks we found that the entropies had mostly gone down to zero, effectively producing a deterministic network. EMD generally showed better results than GDB, specifically when using relative discrepancies; however, at lower alpha values the EMD methods can generate disconnected networks, more so when using absolute discrepancies. The methods produced unexpected results at higher alpha values, which suggests they are not stable. Our evaluations confirmed that the proposed algorithms produce acceptable results in some cases, but finding the right input parameters for specific networks can be time-consuming. Therefore, further testing on diverse network structures with different input parameters is recommended.
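For context, the expected vertex degree that these evaluations track follows from linearity of expectation: it is simply the sum of incident edge probabilities. A minimal sketch with a hypothetical toy network:

```python
# Expected degree of each vertex in a probabilistic (uncertain) graph:
# by linearity of expectation it is the sum of incident edge probabilities.
edges = {("a", "b"): 0.9, ("b", "c"): 0.4, ("a", "c"): 0.25}  # toy network

expected_degree: dict[str, float] = {}
for (u, v), p in edges.items():
    expected_degree[u] = expected_degree.get(u, 0.0) + p
    expected_degree[v] = expected_degree.get(v, 0.0) + p

print(expected_degree)  # {'a': 1.15, 'b': 1.3, 'c': 0.65}
```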
8

Tran, Gia-Lac. "Advances in Deep Gaussian Processes : calibration and sparsification." Electronic Thesis or Diss., Sorbonne université, 2020. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2020SORUS410.pdf.

Abstract:
Gaussian Processes (GPs) are an attractive way of doing non-parametric Bayesian modeling in supervised learning: they provide predictions together with principled uncertainty estimates on a firm mathematical footing. However, practitioners often avoid GPs because of limits on kernel expressiveness and because of their computational requirements. Integrating (convolutional) neural networks with GPs is a promising way to enhance representational power. As our first contribution, we use reliability diagrams to show empirically that current combinations of CNNs and GPs are miscalibrated, leading to over-confident predictions, and we propose a novel well-calibrated solution for merging neural structures and GPs using random features and variational inference. This framework extends intuitively to structured random features, which improve accuracy and reduce computational cost. In terms of computational cost, exact GP inference scales cubically with the training-set size, which makes it unusable beyond a few thousand training points. Inducing-point GPs mitigate this bottleneck by selecting a small set of active (inducing) points through a global distillation of the available observations, following lines of work such as Titsias et al. (2009) [5] and Hensman et al. (2015) [6]. However, the general case remains difficult, and the required number of inducing points may still exceed a given computational budget. In our second study, we propose Sparse-within-Sparse Gaussian Processes (SWSGP), which enable approximation with a large number of inducing points without a prohibitive computational cost.
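To make the inducing-point idea concrete, here is a minimal subset-of-regressors (SoR) style sketch in NumPy; this is a generic textbook approximation under illustrative choices (RBF kernel, evenly spaced inducing points), not the SWSGP method proposed in the thesis:

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel matrix between row vectors in a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))          # training inputs
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)
Z = np.linspace(-3, 3, 20)[:, None]            # 20 inducing points (M << N)
Xs = np.linspace(-3, 3, 100)[:, None]          # test inputs
noise = 0.1**2

Kuu = rbf(Z, Z) + 1e-8 * np.eye(len(Z))        # M x M
Kuf = rbf(Z, X)                                # M x N
Ksu = rbf(Xs, Z)                               # test x M

# SoR predictive mean: solving an M x M system costs O(N M^2 + M^3)
# instead of the O(N^3) required by exact GP regression.
A = Kuu + Kuf @ Kuf.T / noise
mean = Ksu @ np.linalg.solve(A, Kuf @ y) / noise
```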
9

Kanapka, Joseph D. (Joseph Daniel) 1972. "Fast methods for extraction and sparsification of substrate coupling." Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/29236.

Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002.
Includes bibliographical references (p. 107-111).
Substrate coupling effects have had an increasing impact on circuit performance in recent years. As a result, there is strong demand for substrate simulation tools. Past work has concentrated on fast substrate solvers that are applied once per contact to get the dense conductance matrix G. We develop a method of using any underlying substrate solver a near-constant number of times to obtain a sparse approximate representation G ≈ Q Gwt Q' in a new basis. This method differs from previous matrix sparsification techniques in that it requires only a "black box" which can apply G quickly; it doesn't need an analytical representation of the underlying kernel or access to individual entries of G. The change-of-basis matrix Q is also sparse. For our largest example, with 10240 contacts, we obtained a Gwt with 130 times fewer nonzeros than the dense G (and Q more than twice as sparse as Gwt), with 20 times fewer solves than the naive method, and fewer than 4 percent of the Q Gwt Q' entries had relative error more than 10% compared to the exact G.
by Joseph Daniel Kanapka.
Ph.D.
10

Geppert, Jakob Alexander [Verfasser]. "Adaptive Sparsification Mechanisms in Signal Recovery / Jakob Alexander Geppert." Göttingen : Niedersächsische Staats- und Universitätsbibliothek Göttingen, 2021. http://d-nb.info/1231542098/34.

11

Shah, Shivani. "Graph sparsification and unsupervised machine learning for metagenomic binning." Thesis, Tours, 2019. http://theses.scd.univ-tours.fr/index.php?fichier=2019/shivani.shah_18225.pdf.

Abstract:
Metagenomics is the field of biology concerned with studying the genomic content of microbial communities directly in their natural environments. Metagenomic data are generated by sequencing technologies applied to environmental samples; the output consists of short DNA fragments (reads) originating from the genomes of all species present in the sample, with dataset sizes ranging from thousands to millions of reads. One key step of metagenomic data analysis is binning of the reads: groups (bins) are to be formed such that each bin is composed of reads likely to originate from the same species or family of species. Binning has essentially been treated as a clustering task in the metagenomic literature. One of its challenges stems from the large size of the datasets, which overwhelms the computational resources required for the task; hence binning approaches that scale to large datasets are needed. In this thesis, we address this issue by proposing a scalable binning method. We position our work among composition-based binning approaches (using short k-mers) and in a completely unsupervised context. To decrease the complexity of the binning task, we propose methods that sparsify the data prior to clustering. The approach was developed in two steps: first, the idea was evaluated on smaller metagenomic datasets (a few thousand points); second, we scaled it to larger datasets (millions of points) with similarity-based indexing methods (locality-sensitive hashing, LSH). The thesis makes three major contributions. First, we propose sparsifying the data with proximity graphs built to capture the pairwise relationships between data points that are relevant for clustering, and we leverage community detection algorithms on these graphs to identify clusters. An exploratory study with several proximity graphs and community detection algorithms on three metagenomic datasets led us to propose ProxiClust, a pipeline coupling a kNN graph with Louvain community detection. Second, to scale this approach to larger datasets, the distance matrix in the pipeline is replaced with hash tables built from the Sim-Hash LSH approach. We introduce two strategies to build proximity graphs from the hash tables: (1) the microclusters graph and (2) the approximate k-nearest-neighbour graph. Their performance and limitations are evaluated and discussed on large MC datasets, alongside baseline evaluations establishing the clustering difficulty of these datasets. Based on this study we propose the mutual-kNN graph as the most appropriate proximity graph for large datasets, a proposal also evaluated and confirmed on the CAMI benchmark metagenomic datasets. Lastly, we examine alternative hashing approaches to build better-quality hash tables: the data-dependent hashing approach ITQ and an orthogonal version of Sim-Hash are included, and two new data-dependent hashing approaches, ITQ-SH and ITQ-OrthSH, are introduced and evaluated with respect to their ability to hash the MC datasets with high precision and recall. The introduction of the mutual-kNN graph also raises new challenges for the pipeline: a large number of clusters are generated, owing to the many components of the mutual-kNN graph, so a strategy is needed to merge similar clusters in order to obtain an appropriate number of them; and an approach to build the mutual-kNN graph directly from hash tables remains to be designed to complete the ProxiClust pipeline for large datasets.
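A minimal sketch of the mutual-kNN-graph-plus-Louvain pattern described above, on toy stand-ins for k-mer composition profiles; this is an illustration rather than the ProxiClust implementation, and `louvain_communities` assumes a recent NetworkX release:

```python
import numpy as np
import networkx as nx
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(1)
# toy stand-in for k-mer composition profiles of reads (one row per read)
profiles = np.vstack([rng.normal(c, 0.1, size=(50, 16)) for c in (0.0, 1.0, 2.0)])

k = 10
nbrs = NearestNeighbors(n_neighbors=k + 1).fit(profiles)
_, idx = nbrs.kneighbors(profiles)            # idx[:, 0] is the point itself

# mutual-kNN sparsification: keep edge (u, v) only if each endpoint is in
# the other's k-nearest-neighbour list
G = nx.Graph()
G.add_nodes_from(range(len(profiles)))
knn = [set(row[1:]) for row in idx]
for u, neigh in enumerate(knn):
    for v in neigh:
        if u in knn[v]:
            G.add_edge(u, v)

bins = nx.community.louvain_communities(G, seed=0)
print([sorted(b)[:5] for b in bins])          # preview of each bin
```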
12

Muoh, Chibuike. "Sparsification for Topic Modeling and Applications to Information Retrieval." Kent State University / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=kent1259206719.

13

Machimada, Machaiah Chittiappa <1993>. "Graph Sparsification and Semi-Supervised Learning: an Experimental Study." Master's Degree Thesis, Università Ca' Foscari Venezia, 2021. http://hdl.handle.net/10579/19404.

Abstract:
This study compares various graph sparsification methods that have been devised and tests their efficiency against one another. In the latter part of the project, semi-supervised learning algorithms are trained on a combination of labeled and unlabeled data. Different sparsification methods are thus explored, and their effect on graph-based semi-supervised learning algorithms is evaluated.
14

Asathulla, Mudabir Kabir. "A Sparsification Based Algorithm for Maximum-Cardinality Bipartite Matching in Planar Graphs." Thesis, Virginia Tech, 2017. http://hdl.handle.net/10919/88080.

Abstract:
Matching is one of the most fundamental algorithmic graph problems. Many variants of matching problems have been studied on different classes of graphs, the one of special interest to us being Maximum Cardinality Bipartite Matching in Planar Graphs. In this work, we present a novel sparsification-based approach for computing a maximum/perfect bipartite matching in planar graphs. The overall complexity of our algorithm is O(n^{6/5} log^2 n), where n is the number of vertices in the graph, bettering the O(n^{3/2}) time achieved independently by the Hopcroft-Karp algorithm and by Lipton and Tarjan's divide-and-conquer approach using planar separators. Our algorithm combines the best of both these standard algorithms with our sparsification technique and rich planar graph properties to achieve the speedup. It is not the fastest known algorithm, however, as an O(n log^3 n) algorithm based on a max-flow reduction exists.
MS
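For reference, the Hopcroft-Karp baseline that the thesis improves upon is available off the shelf in NetworkX; a toy usage sketch (the graph and node sets are illustrative):

```python
import networkx as nx
from networkx.algorithms import bipartite

# toy bipartite graph: left nodes 0-2, right nodes "a"-"c"
G = nx.Graph([(0, "a"), (0, "b"), (1, "a"), (2, "c")])
left = {0, 1, 2}

# Hopcroft-Karp runs in O(E * sqrt(V)); on planar graphs this gives the
# O(n^{3/2}) bound that the sparsification-based algorithm improves on.
matching = bipartite.hopcroft_karp_matching(G, top_nodes=left)
print(len(matching) // 2, "matched pairs")    # -> 3
```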
15

Bonnet, Edouard. "Résultats Positifs et Négatifs en Approximation et Complexité Paramétrée." Thesis, Paris 9, 2014. http://www.theses.fr/2014PA090040/document.

Abstract:
Several real-life problems are NP-hard and cannot be solved in polynomial time. The two main paradigms for tackling them anyway are approximation and parameterized complexity. In this thesis, we present a new technique called greediness-for-parameterization and use it to establish or improve the parameterized complexity of many problems, as well as to obtain parameterized algorithms for some cardinality-constrained problems on bipartite graphs. Aiming at establishing negative results on approximability in subexponential time and in parameterized time, we introduce new methods of instance sparsification that preserve approximation. We combine these "sparsifiers" with known or new reductions to achieve our goal. Finally, we present hardness results for games such as Bridge and Havannah.
16

Wang, Guan. "STREAMING HYPERGRAPH PARTITION FOR MASSIVE GRAPHS." Kent State University / OhioLINK, 2013. http://rave.ohiolink.edu/etdc/view?acc_num=kent1385097649.

17

Chakeri, Alireza. "Scalable Unsupervised Learning with Game Theory." Scholar Commons, 2017. http://scholarcommons.usf.edu/etd/6616.

Abstract:
Recently dominant sets, a generalization of the notion of the maximal clique to edge-weighted graphs, have proven to be an effective tool for unsupervised learning and have found applications in different domains. Although they were initially established using optimization and graph theory concepts, recent work has shown fascinating connections with evolutionary game theory, leading to the clustering game framework. However, considering the size of today's data sets, existing methods need to be modified in order to handle massive data. Hence, in this research work, we first address the limitations of the clustering game framework for large data sets theoretically. We pose a new question for the clustering community: "How can a cluster of a subset of a dataset be a cluster of the entire dataset?" We show that this problem is coNP-hard in the clustering game framework. Thus, we modify the definition of a cluster from a stable concept to a non-stable but optimal one (Nash equilibrium). Experiments show that this relaxation does not change the quality of the clusters in practice. Following this alteration, and the fact that equilibria are generally compact subsets of vertices, we design an effective strategy to find equilibria representing well-distributed clusters. After finding such equilibria, a linear game-theoretic relation is proposed to assign vertices to the clusters and partition the graph. However, the method inherits a space complexity issue: the similarities between every pair of objects are required, which proves practically intractable for large data sets. To overcome this limitation, after establishing the necessary theoretical tools for a special type of graphs that we call vertex-repeated graphs, we propose the scalable clustering game framework. This approach divides a data set into disjoint chunks of tractable size; the exact clusters of the entire data set are then approximated by the clusters of the chunks, i.e., the exact equilibria of the entire graph are approximated by the equilibria of subsets of the graph. We prove theorems that enable significantly improved time complexity for the model. The applications include, but are not limited to, the maximum weight clique problem, large data clustering and image segmentation. Experiments have been run on random graphs and the DIMACS benchmark for the maximum weight clique problem, and on magnetic resonance images (MRI) of the human brain consisting of about 4 million examples for large data clustering. On the Berkeley Segmentation Dataset, the proposed method achieves results comparable to the state of the art, providing a parallel framework for image segmentation without any training phase. The results show the effectiveness and efficiency of our approach. In another part of this research work, we generalize the clustering game method to cluster uncertain data, where the similarities between the data points are not exactly known, leading to the uncertain clustering game framework. Here, contrary to ensemble clustering approaches, where the results of different similarity matrices are combined, we focus on the average utilities of an uncertain game. We show that the game-theoretic solutions provide stable clusters even in the presence of severe uncertainties. In addition, based on this framework, we propose a novel concept in uncertain data clustering whereby every subset of objects can have a "cluster degree". Extensive experiments on real-world data sets, as well as on the Berkeley segmentation dataset, confirm the performance of the proposed method. Finally, instead of dividing a graph into chunks to make the clustering scalable, we study the effect of spectral sparsification based on sampling by effective resistance on the clustering outputs. Through experimental and theoretical observations, we show that the clustering results obtained from sparsified graphs are very similar to the results on the original non-sparsified graphs; the Rand index is consistently between about 0.9 and 0.99 in our experiments, even under aggressive sparsification.
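A minimal sketch of spectral sparsification by effective-resistance sampling, in the spirit of the Spielman-Srivastava scheme used in the final experiments above; the dense Laplacian pseudoinverse is for clarity only, and real implementations approximate the resistances:

```python
import numpy as np
import networkx as nx

def sparsify_by_effective_resistance(G, q, seed=0):
    """Sample q edges with probability ~ w_e * R_eff(e) and reweight."""
    rng = np.random.default_rng(seed)
    nodes = list(G.nodes)
    index = {u: i for i, u in enumerate(nodes)}
    L = nx.laplacian_matrix(G, nodelist=nodes).toarray().astype(float)
    Lp = np.linalg.pinv(L)                      # dense pseudoinverse: O(n^3)

    edges = list(G.edges(data="weight", default=1.0))
    probs = np.array([
        w * (Lp[index[u], index[u]] + Lp[index[v], index[v]]
             - 2 * Lp[index[u], index[v]])      # w_e * effective resistance
        for u, v, w in edges
    ])
    probs /= probs.sum()

    H = nx.Graph()
    H.add_nodes_from(nodes)
    for e in rng.choice(len(edges), size=q, p=probs):
        u, v, w = edges[e]
        add = w / (q * probs[e])                # reweighting keeps L unbiased
        old = H.get_edge_data(u, v, {"weight": 0.0})["weight"]
        H.add_edge(u, v, weight=old + add)
    return H

H = sparsify_by_effective_resistance(nx.karate_club_graph(), q=60)
```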
18

Wang, Nan. "A Framework of Transforming Vertex Deletion Algorithm to Edge Deletion Algorithm." University of Cincinnati / OhioLINK, 2017. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1504878748832156.

19

Däubel, Karl [Verfasser], Torsten [Akademischer Betreuer] Mütze, Martin [Akademischer Betreuer] Skutella, Martin [Gutachter] Skutella, Torsten [Gutachter] Mütze, and Sebastian [Gutachter] Stiller. "Some aspects of graph sparsification in theory and practice / Karl Däubel ; Gutachter: Martin Skutella, Torsten Mütze, Sebastian Stiller ; Torsten Mütze, Martin Skutella." Berlin : Technische Universität Berlin, 2020. http://d-nb.info/1214296068/34.

20

Liang, Weifa. "Designing Efficient Parallel Algorithms for Graph Problems." The Australian National University. Department of Computer Science, 1997. http://thesis.anu.edu.au./public/adt-ANU20010829.114536.

Abstract:
Graph algorithms are concerned with the algorithmic aspects of solving graph problems. The problems are motivated from and have application to diverse areas of computer science, engineering and other disciplines. Problems arising from these areas of application are good candidates for parallelization since they often have both intense computational needs and stringent response time requirements. Motivated by these concerns, this thesis investigates parallel algorithms for these kinds of graph problems that have at least one of the following properties: the problems involve some type of dynamic updates; the sparsification technique is applicable; or the problems are closely related to communications network issues. The models of parallel computation used in our studies are the Parallel Random Access Machine (PRAM) model and the practical interconnection network models such as meshes and hypercubes.

Consider a communications network which can be represented by a graph G = (V, E), where V is a set of sites (processors), and E is a set of links which are used to connect the sites (processors). In some cases, we also assign weights and/or directions to the edges in E. Associated with this network, there are many problems such as (i) whether the network is k-edge (k-vertex) connected, for fixed k; (ii) whether there are k edge-disjoint (vertex-disjoint) paths between u and v for a pair of given vertices u and v after the network is dynamically updated by adding and/or deleting an edge, etc.; (iii) whether the sites in the network can communicate with each other when some sites and links fail; (iv) identifying the first k edges in the network whose deletion will result in the maximum increase in the routing cost in the resulting network, for fixed k; (v) how to augment the network at optimal cost, with a given feasible set of weighted edges, such that the augmented network is k-edge (k-vertex) connected; (vi) how to route messages through the network efficiently. In this thesis we answer the problems mentioned above by presenting efficient parallel algorithms to solve them. As far as we know, most of the proposed algorithms are the first ones in the parallel setting.

Even though most of the problems considered in this thesis are related to communications networks, we also study the classic edge-coloring problem. The outstanding difficulty in solving this problem in parallel is that we do not yet know whether or not it is in NC. In this thesis we present an improved parallel algorithm for the problem which needs O(Δ^{4.5} log^3 Δ log n + Δ^4 log^4 n) time using O(n^2 Δ + n Δ^3) processors, where n is the number of vertices and Δ is the maximum vertex degree. Compared with a previously known result on the same model, we improve by an O(Δ^{1.5}) factor in time. The non-trivial part is to reduce this problem to the edge-coloring update problem. We also generalize this problem to the approximate edge-coloring problem by giving a faster parallel algorithm for the latter case.

Throughout the design and analysis of parallel graph algorithms, we also find that the sparsification technique is very powerful in the design of efficient sequential and parallel algorithms on dense undirected graphs. We believe that this technique may be useful in its own right for guiding the design of efficient sequential and parallel algorithms for problems in other areas as well as in graph theory.
21

Will, Sebastian, and Hosna Jabbari. "Sparse RNA folding revisited." Universitätsbibliothek Leipzig, 2016. http://nbn-resolving.de/urn:nbn:de:bsz:15-qucosa-204163.

Abstract:
Background: RNA secondary structure prediction by energy minimization is the central computational tool for the analysis of structural non-coding RNAs and their interactions. Sparsification has been successfully applied to improve the time efficiency of various structure prediction algorithms while guaranteeing the same result; however, for many such folding problems, space efficiency is of even greater concern, particularly for long RNA sequences. So far, space-efficient sparsified RNA folding with fold reconstruction was solved only for simple base-pair-based pseudo-energy models. Results: Here, we revisit the problem of space-efficient free energy minimization. Whereas the space-efficient minimization of the free energy has been sketched before, the reconstruction of the optimum structure has not even been discussed. We show that this reconstruction is not possible in a trivial extension of the method for simple energy models. Then, we present the time- and space-efficient sparsified free energy minimization algorithm SparseMFEFold that guarantees MFE structure prediction. In particular, this novel algorithm provides efficient fold reconstruction based on dynamically garbage-collected trace arrows. The complexity of our algorithm depends on two parameters, the number of candidates Z and the number of trace arrows T; both are bounded by n^2, but are typically much smaller. The time complexity of RNA folding is reduced from O(n^3) to O(n^2 + nZ); the space complexity, from O(n^2) to O(n + T + Z). Our empirical results show more than 80% space savings over RNAfold [Vienna RNA package] on the long RNAs from the RNA STRAND database (≥2500 bases). Conclusions: The presented technique is intentionally generalizable to complex prediction algorithms; due to their high space demands, algorithms like pseudoknot prediction and RNA-RNA interaction prediction are expected to profit even more strongly than "standard" MFE folding. SparseMFEFold is free software, available at http://www.bioinf.uni-leipzig.de/~will/Software/SparseMFEFold.
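For orientation, the sketch below shows a plain, non-sparsified Nussinov-style base-pair maximization; candidate-list sparsification as in SparseMFEFold restricts the inner loop over k to a short candidate list (and garbage-collects trace arrows), which this toy version deliberately does not attempt:

```python
def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """O(n^3) Nussinov-style DP: maximum number of nested base pairs.

    Sparsified variants cut the k-loop down to a list of 'candidates',
    reducing time toward O(n^2 + nZ); this plain version keeps the full loop.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    W = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = W[i][j - 1]                    # j left unpaired
            for k in range(i, j - min_loop):      # j pairs with some k
                if (seq[k], seq[j]) in pairs:
                    left = W[i][k - 1] if k > i else 0
                    best = max(best, left + W[k + 1][j - 1] + 1)
            W[i][j] = best
    return W[0][n - 1]

print(max_base_pairs("GGGAAAUCC"))  # small toy RNA
```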
22

Dell, Holger. "Sparse instances of hard problems." Doctoral thesis, Humboldt-Universität zu Berlin, Mathematisch-Naturwissenschaftliche Fakultät II, 2011. http://dx.doi.org/10.18452/16375.

Abstract:
In this thesis, we use and refine methods of computational complexity theory to analyze the complexity of sparse instances, such as graphs with few edges or formulas with few constraints of bounded width. Two natural questions arise in this context: (a) Is there an efficient algorithm that reduces arbitrary instances of an NP-hard problem to equivalent, sparse instances? (b) Is there an algorithm that solves sparse instances of an NP-hard problem significantly faster than general instances can be solved? We formalize these questions for different problems and show that positive answers for these formalizations would lead to consequences in complexity theory that are considered unlikely. Question (a) is modeled by a communication process in which two players want to cooperatively decide an NP-hard language while communicating as little as possible. Under the complexity-theoretic hypothesis that coNP is not contained in NP/poly, our results imply surprisingly tight lower bounds for parameters of interest in several areas, namely sparsification, kernelization in parameterized complexity, lossy compression, and probabilistically checkable proofs. We study question (b) for counting problems in the exponential-time setting. Assuming (variants of) the exponential time hypothesis (ETH), we obtain asymptotically tight, exponential lower bounds for well-studied #P-hard problems: computing the number of satisfying assignments of a 2-CNF formula, computing the number of all independent sets in a graph, computing the permanent of a matrix with entries 0 and 1, and evaluating the Tutte polynomial at fixed evaluation points.
23

Curado, Manuel. "Structural Similarity: Applications to Object Recognition and Clustering." Doctoral thesis, Universidad de Alicante, 2018. http://hdl.handle.net/10045/98110.

Abstract:
In this thesis, we propose several developments in the context of Structural Similarity, addressing both node (local) similarity and graph (global) similarity. Concerning node similarity, we focus on improving the diffusive process used to compute it (e.g. Commute Times) by modifying or rewiring the structure of the graph (Graph Densification), although some advances in Laplacian-based ranking are also included in this document. Graph Densification is a particular case of what we call graph rewiring, i.e. a novel field (similar to image processing) where input graphs are rewired to be better conditioned for subsequent pattern recognition tasks (e.g. clustering). In the thesis, we contribute a scalable and effective method driven by Dirichlet processes, proposing both a completely unsupervised and a semi-supervised approach for Dirichlet densification. We also contribute new random walkers (Return Random Walks) that are useful structural filters as well as asymmetry detectors in directed brain networks, used to make early predictions of Alzheimer's disease (AD). Graph similarity is addressed by designing structural information channels as a means of measuring the Mutual Information between graphs. To this end, we first embed the graphs by means of Commute Times. Commute time embeddings have good properties for Delaunay triangulations (the typical representation for Graph Matching in computer vision), which means these embeddings can act as encoders in the channel as well as decoders (since they are invertible). Consequently, structural noise can be modelled by the deformation introduced in one of the manifolds to fit the other one. This methodology leads to a highly discriminative similarity measure, since the Mutual Information is measured on the manifolds (vectorial domain) through copulas and bypass entropy estimators. This is consistent with the methodology of decoupling the measurement of graph similarity into two steps: (a) linearizing the Quadratic Assignment Problem (QAP) by means of the embedding trick, and (b) measuring similarity in vector spaces. The QAP is also investigated in this thesis. More precisely, we analyze the behaviour of m-best Graph Matching methods. These methods usually start from a couple of best solutions and then locally expand the search space by excluding previously clamped variables. The next variable to clamp is usually selected randomly, but we show that this reduces performance when structural noise (outliers) arises. Alternatively, we propose several heuristics for spanning the search space and evaluate all of them, showing that they are usually better than random selection. These heuristics are particularly interesting because they exploit the structure of the affinity matrix, and efficiency is improved as well. Concerning the application domains explored in this thesis, we focus on object recognition (graph similarity), clustering (rewiring), compression/decompression of graphs (links with Extremal Graph Theory), 3D shape simplification (sparsification) and early prediction of AD.
Ministerio de Economía, Industria y Competitividad (Referencia TIN2012-32839 BES-2013-064482)
24

Liao, Zhenyu. "A regularization perspective on spectral sparsification." Thesis, 2019. https://hdl.handle.net/2144/38975.

Abstract:
In this thesis, we study how to obtain faster algorithms for spectral graph sparsification by applying continuous optimization techniques. Spectral sparsification is the task of reducing the number of edges in a graph while maintaining a spectral approximation to the original graph. Our key conceptual contribution is the connection between spectral sparsification and regret minimization in online matrix games, i.e., online convex programming over the positive semidefinite cone. While this connection was previously noted [24, 47], we formally reduce graph sparsification to a matrix regret minimization problem, which we solve by applying mirror descent with a non-entropic regularizer. In this way, we not only obtain a new proof of the existence of linear-sized spectral sparsifiers, originally given by [19], but improve the running time from Ω(n^4) ([19, 54]) to almost quadratic. More generally, our framework can also be applied to the matrix multi-armed bandit online learning problem to reduce the regret bound to the optimal O(√(nT)), compared to the O(√(nT log n)) given by the traditional matrix-entropy regularizer.
25

Wang, Sheng-Ping (王盛平). "Communication Usage Optimization of Gradient Sparsification with Aggregation." Thesis, 2018. http://ndltd.ncl.edu.tw/handle/ppyyqb.

Abstract:
Master's thesis
National Taiwan University
Graduate Institute of Computer Science and Information Engineering
Academic year 106 (2017/18)
Communication usage is a bottleneck when scaling to more workers in distributed deep learning. One solution is to compress the exchanged gradients into a sparse format via gradient sparsification. We found that the send cost of the server, which is the aggregated size of the sparse gradients, can be reduced by the gradient selection performed at the workers. Following the observation that only a few gradients are significantly large, and only for a short period of time, we propose several gradient selection algorithms based on different metrics. Experiments showed that our proposed method reduces the aggregated size at the server, and the resulting reduction in time per iteration makes convergence faster than with traditional sparsification.
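A minimal top-k gradient sparsification sketch with local error feedback, a common pattern in this line of work; the residual accumulator and the choice of k are illustrative assumptions, not the selection algorithms proposed in the thesis:

```python
import numpy as np

def topk_sparsify(grad, k, residual):
    """Keep the k largest-magnitude entries; accumulate the rest locally.

    The dense residual is added back before selection (error feedback),
    so small gradients are eventually transmitted rather than lost.
    """
    acc = grad + residual
    idx = np.argpartition(np.abs(acc), -k)[-k:]   # indices of top-k entries
    values = acc[idx]
    new_residual = acc.copy()
    new_residual[idx] = 0.0                       # transmitted mass leaves residual
    return idx, values, new_residual              # sparse (index, value) payload

rng = np.random.default_rng(0)
grad = rng.standard_normal(1_000_000)
residual = np.zeros_like(grad)
idx, values, residual = topk_sparsify(grad, k=10_000, residual=residual)
# the worker sends ~1% of the entries; the server aggregates sparse payloads
```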
26

Liang, Weifa. "Designing Efficient Parallel Algorithms for Graph Problems." Phd thesis, 1997. http://hdl.handle.net/1885/47660.

Abstract:
Graph algorithms are concerned with the algorithmic aspects of solving graph problems. The problems are motivated from and have application to diverse areas of computer science, engineering and other disciplines. Problems arising from these areas of application are good candidates for parallelization since they often have both intense computational needs and stringent response time requirements. Motivated by these concerns, this thesis investigates parallel algorithms for these kinds of graph problems that have at least one of the following properties: the problems involve some type of dynamic updates; the sparsification technique is applicable; or the problems are closely related to communications network issues. The models of parallel computation used in our studies are the Parallel Random Access Machine (PRAM) model and the practical interconnection network models such as meshes and hypercubes. ...