Dissertations / Theses on the topic 'Learning on graphs'
Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles
Consult the top 50 dissertations / theses for your research on the topic 'Learning on graphs.'
Next to every source in the list of references, there is an 'Add to bibliography' button. Press on it, and we will generate automatically the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the academic publication as pdf and read online its abstract whenever available in the metadata.
Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.
Vitale, F. "FAST LEARNING ON GRAPHS." Doctoral thesis, Università degli Studi di Milano, 2011. http://hdl.handle.net/2434/155500.
Full textIrniger, Christophe-André. "Graph matching filtering databases of graphs using machine learning techniques." Berlin Aka, 2005. http://deposit.ddb.de/cgi-bin/dokserv?id=2677754&prov=M&dok_var=1&dok_ext=htm.
Full textSimonovsky, Martin. "Deep learning on attributed graphs." Thesis, Paris Est, 2018. http://www.theses.fr/2018PESC1133/document.
Full textGraph is a powerful concept for representation of relations between pairs of entities. Data with underlying graph structure can be found across many disciplines, describing chemical compounds, surfaces of three-dimensional models, social interactions, or knowledge bases, to name only a few. There is a natural desire for understanding such data better. Deep learning (DL) has achieved significant breakthroughs in a variety of machine learning tasks in recent years, especially where data is structured on a grid, such as in text, speech, or image understanding. However, surprisingly little has been done to explore the applicability of DL on graph-structured data directly.The goal of this thesis is to investigate architectures for DL on graphs and study how to transfer, adapt or generalize concepts working well on sequential and image data to this domain. We concentrate on two important primitives: embedding graphs or their nodes into a continuous vector space representation (encoding) and, conversely, generating graphs from such vectors back (decoding). To that end, we make the following contributions.First, we introduce Edge-Conditioned Convolutions (ECC), a convolution-like operation on graphs performed in the spatial domain where filters are dynamically generated based on edge attributes. The method is used to encode graphs with arbitrary and varying structure.Second, we propose SuperPoint Graph, an intermediate point cloud representation with rich edge attributes encoding the contextual relationship between object parts. Based on this representation, ECC is employed to segment large-scale point clouds without major sacrifice in fine details.Third, we present GraphVAE, a graph generator allowing to decode graphs with variable but upper-bounded number of nodes making use of approximate graph matching for aligning the predictions of an autoencoder with its inputs. The method is applied to the task of molecule generation
Rosar, Kós Lassance Carlos Eduardo. "Graphs for deep learning representations." Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2020. http://www.theses.fr/2020IMTA0204.
Full textIn recent years, Deep Learning methods have achieved state of the art performance in a vast range of machine learning tasks, including image classification and multilingual automatic text translation. These architectures are trained to solve machine learning tasks in an end-to-end fashion. In order to reach top-tier performance, these architectures often require a very large number of trainable parameters. There are multiple undesirable consequences, and in order to tackle these issues, it is desired to be able to open the black boxes of deep learning architectures. Problematically, doing so is difficult due to the high dimensionality of representations and the stochasticity of the training process. In this thesis, we investigate these architectures by introducing a graph formalism based on the recent advances in Graph Signal Processing (GSP). Namely, we use graphs to represent the latent spaces of deep neural networks. We showcase that this graph formalism allows us to answer various questions including: ensuring generalization abilities, reducing the amount of arbitrary choices in the design of the learning process, improving robustness to small perturbations added to the inputs, and reducing computational complexity
Ghiasnezhad, Omran Pouya. "Rule Learning in Knowledge Graphs." Thesis, Griffith University, 2018. http://hdl.handle.net/10072/382680.
Full textThesis (PhD Doctorate)
Doctor of Philosophy (PhD)
School of Info & Comm Tech
Science, Environment, Engineering and Technology
Full Text
Fan, Shuangfei. "Deep Representation Learning on Labeled Graphs." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/96596.
Full textDoctor of Philosophy
Graphs are one of the most important and powerful data structures for conveying the complex and correlated information among data points. In this research, we aim to provide more robust and accurate models for some graph specific tasks, such as collective classification and graph generation, by designing deep learning models to learn better task-specific representations for graphs. First, we studied the collective classification problem in graphs and proposed recurrent collective classification, a variant of the iterative classification algorithm that is more robust to situations where predictions are noisy or inaccurate. Then we studied the problem of graph generation using deep generative models. We first proposed a deep generative model using the GAN framework that generates labeled graphs. Then in order to support more applications and also get more control over the generated graphs, we extended the problem of graph generation to conditional graph generation which can then be applied to various applications for modeling graph evolution and transformation.
Rommedahl, David, and Martin Lindström. "Learning Sparse Graphs for Data Prediction." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-295623.
Full textGrafstrukturer kan ofta användas för att beskriva komplex data. I många tillämpningar är grafstrukturen inte känd, utan måste läras från data. Vidare beskrivs verklig data ofta naturligt av glesa grafer. I detta projekt har vi försökt återskapa resultaten från ett tidigare forskningsarbete, nämligen att lära en graf som kan användas för prediktion med en ℓ1pennaliserad LASSO-metod. Vi föreslår även andra metoder för inlärning och utvärdering av grafen. Vi har testat metoderna på syntetisk data och verklig temperaturdata från Sverige. Resultaten visar att vi inte kan återskapa de tidigare forskarnas resultat, men vi lyckas lära in glesa grafer som kan användas för prediktion. Ytterligare arbete krävs för att verifiera våra resultat.
Kandidatexjobb i elektroteknik 2020, KTH, Stockholm
Xu, Keyulu. "Graph structures, random walks, and all that : learning graphs with jumping knowledge networks." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/121660.
Full textThesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 51-54).
Graph representation learning aims to extract high-level features from the graph structures and node features, in order to make predictions about the nodes and the graphs. Applications include predicting chemical properties of drugs, community detection in social networks, and modeling interactions in physical systems. Recent deep learning approaches for graph representation learning, namely Graph Neural Networks (GNNs), follow a neighborhood aggregation procedure, where the representation vector of a node is computed by recursively aggregating and transforming feature vectors of its neighboring nodes. We analyze some important properties of these models, and propose a strategy to overcome the limitations. In particular, the range of neighboring nodes that a node's representation draws from strongly depends on the graph structure, analogous to the spread of a random walk. To adapt to local neighborhood properties and tasks, we explore an architecture - jumping knowledge (JK) networks that flexibly leverages, for each node, different neighborhood ranges to enable better structure-aware representation. In a number of experiments on social, bioinformatics and citation networks, we demonstrate that our model achieves state-of-the-art performance. Furthermore, combining the JK framework with models like Graph Convolutional Networks, GraphSAGE and Graph Attention Networks consistently improves those models' performance.
by Keyulu Xu.
S.M.
S.M. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
Freeman, Guy. "Learning and predicting with chain event graphs." Thesis, University of Warwick, 2010. http://wrap.warwick.ac.uk/4529/.
Full textPasteris, S. U. "Efficient algorithms for online learning over graphs." Thesis, University College London (University of London), 2016. http://discovery.ucl.ac.uk/1516210/.
Full textSonntag, Dag. "Chain Graphs : Interpretations, Expressiveness and Learning Algorithms." Doctoral thesis, Linköpings universitet, Databas och informationsteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-125921.
Full textEid, Abdelrahman. "Stochastic simulations for graphs and machine learning." Thesis, Lille 1, 2020. http://www.theses.fr/2020LIL1I018.
Full textWhile it is impractical to study the population in many domains and applications, sampling is a necessary method allows to infer information. This thesis is dedicated to develop probability sampling algorithms to infer the whole population when it is too large or impossible to be obtained. Markov chain Monte Carlo (MCMC) techniques are one of the most important tools for sampling from probability distributions especially when these distributions haveintractable normalization constants.The work of this thesis is mainly interested in graph sampling techniques. Two methods in chapter 2 are presented to sample uniform subtrees from graphs using Metropolis-Hastings algorithms. The proposed methods aim to sample trees according to a distribution from a graph where the vertices are labelled. The efficiency of these methods is proved mathematically. Additionally, simulation studies were conducted and confirmed the theoretical convergence results to the equilibrium distribution.Continuing to the work on graph sampling, a method is presented in chapter 3 to sample sets of similar vertices in an arbitrary undirected graph using the properties of the Permanental Point processes PPP. Our algorithm to sample sets of k vertices is designed to overcome the problem of computational complexity when computing the permanent by sampling a joint distribution whose marginal distribution is a kPPP.Finally in chapter 4, we use the definitions of the MCMC methods and convergence speed to estimate the kernel bandwidth used for classification in supervised Machine learning. A simple and fast method called KBER is presented to estimate the bandwidth of the Radial basis function RBF kernel using the average Ricci curvature of graphs
Li, Qilin. "Affinity Learning on Graphs with Diffusion Processes." Thesis, Curtin University, 2020. http://hdl.handle.net/20.500.11937/80408.
Full textNguyen, Thi Kim Hue. "Structure learning of graphs for count data." Doctoral thesis, Università degli studi di Padova, 2018. http://hdl.handle.net/11577/3421952.
Full textI processi biologici che regolano le funzioni di base di una cellula sono caratterizzati da numerose interazioni tra geni. Da un punto di vista matematico, è possibile rappresentare queste interazioni attraverso grafi in cui i nodi e gli archi rappresentano, rispettivamente, i geni coinvolti e le loro interazioni. L’obiettivo principale di questa tesi è quello di sviluppare un approccio statistico alla modellazione delle interazioni tra geni quando questi sono misurati su scala discreta. Vengono a tal fine proposti vari algoritmi. La prima proposta è relativa ad un algoritmo disegnato per stimare la struttura di un grafo non orientato, per il quale si fornisce la dimostrazione di convergenza al crescere delle osservazioni. Altre tre proposte coinvolgono la definizione di algoritmi supervisionati per la stima della struttura di grafi direzionali aciclici, basati su una specificazione del modello statistico che ne garantisce l’identificabilità. Sempre con riferimento ai grafi direzionali aciclici, infine, si propone un algoritmo non supervisionato. Tutti gli algoritmi proposti mostrano risultati promettenti in termini di ricostruzione delle vere strutture quando applicati a dati simulati e dati reali.
Santacruz, Muñoz José Luis. "Error-tolerant Graph Matching on Huge Graphs and Learning Strategies on the Edit Costs." Doctoral thesis, Universitat Rovira i Virgili, 2019. http://hdl.handle.net/10803/668356.
Full textLos grafos son estructuras de datos abstractos que se utilizan para modelar problemas reales con dos entidades básicas: nodos y aristas. Cada nodo o vértice representa un punto de interés relevante de un problema, y cada arista representa la relación entre estos vértices. Los nodos y las aristas podrían incorporar atributos para aumentar la precisión del problema modelado. Debido a esta versatilidad, se han encontrado muchas aplicaciones en campos como la visión por computador, biomédicos, análisis de redes, etc. La Distancia de edición de grafos (GED) se ha convertido en una herramienta importante en el reconocimiento de patrones estructurales, ya que permite medir la disimilitud de los grafos. En la primera parte de esta tesis se presenta un método para generar una pareja grafos junto con su correspondencia en un coste computacional lineal. A continuación, se centra en cómo medir la disimilitud entre dos grafos enormes (más de 10.000 nodos), utilizando un nuevo algoritmo de emparejamiento de grafos llamado Belief Propagation. Tiene un coste computacional O(d^3.5n). Esta tesis también presenta un marco general para aprender los costos de edición implicados en los cálculos de GED automáticamente. Luego, concretamos este marco en dos modelos diferentes basados en redes neuronales y funciones de densidad de probabilidad. Se ha realizado una validación práctica exhaustiva en 14 bases de datos públicas. Esta validación muestra que la precisión es mayor con los costos de edición aprendidos, que con algunos costos impuestos manualmente u otros costos aprendidos automáticamente por métodos anteriores. Finalmente proponemos una aplicación del algoritmo Belief propagation utilizado en la simulación de la mecánica muscular.
Graphs are abstract data structures used to model real problems with two basic entities: nodes and edges. Each node or vertex represents a relevant point of interest of a problem, and each edge represents the relationship between these points. Nodes and edges could be attributed to increase the accuracy of the modeled problem, which means that these attributes could vary from feature vectors to description labels. Due to this versatility, many applications have been found in fields such as computer vision, bio-medics, network analysis, etc. Graph Edit Distance (GED) has become an important tool in structural pattern recognition since it allows to measure the dissimilarity of attributed graphs. The first part presents a method is presented to generate graphs together with an upper and lower bound distance and a correspondence in a linear computational cost. Through this method, the behaviour of the known -or the new- sub-optimal Error-Tolerant graph matching algorithm can be tested against a lower and an upper bound GED on large graphs, even though we do not have the true distance. Next, the present is focused on how to measure the dissimilarity between two huge graphs (more than 10.000 nodes), using a new Error-Tolerant graph matching algorithm called Belief Propagation algorithm. It has a O(d^3.5n) computational cost.This thesis also presents a general framework to learn the edit costs involved in the GED calculations automatically. Then, we concretise this framework in two different models based on neural networks and probability density functions. An exhaustive practical validation on 14 public databases has been performed. This validation shows that the accuracy is higher with the learned edit costs, than with some manually imposed costs or other costs automatically learned by previous methods. Finally we propose an application of the Belief propagation algorithm applied to muscle mechanics.
Ricatte, Thomas. "Hypernode graphs for learning from binary relations between sets of objects." Thesis, Lille 3, 2015. http://www.theses.fr/2015LIL30001/document.
Full textAmundsson, Karl. "Approximate Bayesian Learning of Partition Directed Acyclic Graphs." Thesis, KTH, Matematisk statistik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-192853.
Full textPartitionerade riktade acykliska grafer (engelska: PDAGs) är en modell där tabeller över betingade sannolikheter partitioneras i delar med lika sannolikhet. Detta gör att antalet parametrar som ska bestämmas kan reduceras, vilket i sin tur gör problemet beräkningsmässigt enklare. Ett känt samband mellan PDAGs och betecknade riktade acykliska grafer (engelska: LDAGs) sammanfattas här. Sedan jämförs en klustringsalgoritm med en algoritm som exakt bestämmer en PDAG. Klustringsalgoritmens pålitlighet kollas genom att använda den på simulerad data där det förväntade resultatet är känt.
Ma, Zongjie. "Searching on Massive Graphs and Regularizing Deep Learning." Thesis, Griffith University, 2018. http://hdl.handle.net/10072/385875.
Full textThesis (PhD Doctorate)
Doctor of Philosophy (PhD)
Inst Integrated&IntelligentSys
Science, Environment, Engineering and Technology
Full Text
Chandra, Nagasai. "Node Classification on Relational Graphs using Deep-RGCNs." DigitalCommons@CalPoly, 2021. https://digitalcommons.calpoly.edu/theses/2265.
Full textChamberlain, Benjamin Paul. "Practical challenges of learning and representation for large graphs." Thesis, Imperial College London, 2018. http://hdl.handle.net/10044/1/64783.
Full textUrry, Matthew. "Learning curves for Gaussian process regression on random graphs." Thesis, King's College London (University of London), 2013. https://kclpure.kcl.ac.uk/portal/en/theses/learning-curves-for-gaussian-process-regression-on-random-graphs(c1f5f395-0426-436c-989c-d0ade913423e).html.
Full textAraya, Valdivia Ernesto. "Kernel spectral learning and inference in random geometric graphs." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASM020.
Full textThis thesis has two main objectives. The first is to investigate the concentration properties of random kernel matrices, which are central in the study of kernel methods. The second objective is to study statistical inference problems on random geometric graphs. Both objectives are connected by the graphon formalism, which allows to represent a graph by a kernel function. We briefly recall the basics of the graphon model in the first chapter. In chapter two, we present a set of accurate concentration inequalities for individual eigenvalues of the kernel matrix, where our main contribution is to obtain inequalities that scale with the eigenvalue in consideration, implying convergence rates that are faster than parametric and often exponential, which hitherto has only been establish under assumptions which are too restrictive for graph applications. We specialized our results to the case of dot products kernels, highlighting its relation with the random geometric graph model. In chapter three, we study the problem of latent distances estimation on random geometric graphs on the Euclidean sphere. We propose an efficient spectral algorithm that use the adjacency matrix to construct an estimator for the latent distances, and prove finite sample guaranties for the estimation error, establishing its convergence rate. In chapter four, we extend the method developed in the previous chapter to the case of random geometric graphs on the Euclidean ball, a model that despite its formal similarities with the spherical case it is more flexible for modelling purposes. In particular, we prove that for certain parameter choices its degree profile is power law distributed, which has been observed in many real life networks. All the theoretical findings of the last two chapters are verified and complemented by numerical experiments
García, Durán Alberto. "Learning representations in multi-relational graphs : algorithms and applications." Thesis, Compiègne, 2016. http://www.theses.fr/2016COMP2271/document.
Full textInternet provides a huge amount of information at hand in such a variety of topics, that now everyone is able to access to any kind of knowledge. Such a big quantity of information could bring a leap forward in many areas if used properly. This way, a crucial challenge of the Artificial Intelligence community has been to gather, organize and make intelligent use of this growing amount of available knowledge. Fortunately, important efforts have been made in gathering and organizing knowledge for some time now, and a lot of structured information can be found in repositories called Knowledge Bases (KBs). A main issue with KBs is that they are far from being complete. This thesis proposes several methods to add new links between the existing entities of the KB based on the learning of representations that optimize some defined energy function. We also propose a novel application to make use of this structured information to generate questions in natural language
Lê-Huu, Dien Khuê. "Nonconvex Alternating Direction Optimization for Graphs : Inference and Learning." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLC005/document.
Full textThis thesis presents our contributions toinference and learning of graph-based models in computervision. First, we propose a novel class of decompositionalgorithms for solving graph and hypergraphmatching based on the nonconvex alternating directionmethod of multipliers (ADMM). These algorithms arecomputationally efficient and highly parallelizable. Furthermore,they are also very general and can be appliedto arbitrary energy functions as well as arbitraryassignment constraints. Experiments show that theyoutperform existing state-of-the-art methods on popularbenchmarks. Second, we propose a nonconvex continuousrelaxation of maximum a posteriori (MAP) inferencein discrete Markov random fields (MRFs). Weshow that this relaxation is tight for arbitrary MRFs.This allows us to apply continuous optimization techniquesto solve the original discrete problem withoutloss in accuracy after rounding. We study two populargradient-based methods, and further propose a more effectivesolution using nonconvex ADMM. Experimentson different real-world problems demonstrate that theproposed ADMM compares favorably with state-of-theartalgorithms in different settings. Finally, we proposea method for learning the parameters of these graphbasedmodels from training data, based on nonconvexADMM. This method consists of viewing ADMM iterationsas a sequence of differentiable operations, whichallows efficient computation of the gradient of the trainingloss with respect to the model parameters, enablingefficient training using stochastic gradient descent. Atthe end we obtain a unified framework for inference andlearning with nonconvex ADMM. Thanks to its flexibility,this framework also allows training jointly endto-end a graph-based model with another model suchas a neural network, thus combining the strengths ofboth. We present experiments on a popular semanticsegmentation dataset, demonstrating the effectivenessof our method
Zappella, G. "LEARNING ON GRAPHS: ALGORITHMS FOR CLASSIFICATION AND SEQUENTIAL DECISIONS." Doctoral thesis, Università degli Studi di Milano, 2014. http://hdl.handle.net/2434/234167.
Full textLee, John Boaz T. "Deep Learning on Graph-structured Data." Digital WPI, 2019. https://digitalcommons.wpi.edu/etd-dissertations/570.
Full textPineau, Edouard. "Contributions to representation learning of multivariate time series and graphs." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT037.
Full textMachine learning (ML) algorithms are designed to learn models that have the ability to take decisions or make predictions from data, in a large panel of tasks. In general, the learned models are statistical approximations of the true/optimal unknown decision models. The efficiency of a learning algorithm depends on an equilibrium between model richness, complexity of the data distribution and complexity of the task to solve from data. Nevertheless, for computational convenience, the statistical decision models often adopt simplifying assumptions about the data (e.g. linear separability, independence of the observed variables, etc.). However, when data distribution is complex (e.g. high-dimensional with nonlinear interactions between observed variables), the simplifying assumptions can be counterproductive. In this situation, a solution is to feed the model with an alternative representation of the data. The objective of data representation is to separate the relevant information with respect to the task to solve from the noise, in particular if the relevant information is hidden (latent), in order to help the statistical model. Until recently and the rise of modern ML, many standard representations consisted in an expert-based handcrafted preprocessing of data. Recently, a branch of ML called deep learning (DL) completely shifted the paradigm. DL uses neural networks (NNs), a family of powerful parametric functions, as learning data representation pipelines. These recent advances outperformed most of the handcrafted data in many domains.In this thesis, we are interested in learning representations of multivariate time series (MTS) and graphs. MTS and graphs are particular objects that do not directly match standard requirements of ML algorithms. They can have variable size and non-trivial alignment, such that comparing two MTS or two graphs with standard metrics is generally not relevant. Hence, particular representations are required for their analysis using ML approaches. The contributions of this thesis consist of practical and theoretical results presenting new MTS and graphs representation learning frameworks.Two MTS representation learning frameworks are dedicated to the ageing detection of mechanical systems. First, we propose a model-based MTS representation learning framework called Sequence-to-graph (Seq2Graph). Seq2Graph assumes that the data we observe has been generated by a model whose graphical representation is a causality graph. It then represents, using an appropriate neural network, the sample on this graph. From this representation, when it is appropriate, we can find interesting information about the state of the studied mechanical system. Second, we propose a generic trend detection method called Contrastive Trend Estimation (CTE). CTE learns to classify pairs of samples with respect to the monotony of the trend between them. We show that using this method, under few assumptions, we identify the true state underlying the studied mechanical system, up-to monotone scalar transform.Two graph representation learning frameworks are dedicated to the classification of graphs. First, we propose to see graphs as sequences of nodes and create a framework based on recurrent neural networks to represent and classify them. Second, we analyze a simple baseline feature for graph classification: the Laplacian spectrum. We show that this feature matches minimal requirements to classify graphs when all the meaningful information is contained in the structure of the graphs
Neumann, Marion [Verfasser]. "Learning with Graphs using Kernels from Propagated Information / Marion Neumann." Bonn : Universitäts- und Landesbibliothek Bonn, 2015. http://d-nb.info/1077289626/34.
Full textMoghadasin, Babak. "An Approach on Learning Multivariate Regression Chain Graphs from Data." Thesis, Linköpings universitet, Databas och informationsteknik, 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-94019.
Full textYou, Chang Hun. "Learning patterns in dynamic graphs with application to biological networks." Pullman, Wash. : Washington State University, 2009. http://www.dissertations.wsu.edu/Dissertations/Summer2009/c_you_072309.pdf.
Full textTitle from PDF title page (viewed on Aug. 19, 2009). "School of Electrical Engineering and Computer Science." Includes bibliographical references (p. 114-117).
Kulla-Mader, Julia. "Graphs via Ink: Understanding How the Amount of Non-data Ink in a Graph Affects Perception and Learning." Thesis, School of Information and Library Science, 2007. http://hdl.handle.net/1901/379.
Full textAdjei, Seth Akonor. "Refining Learning Maps with Data Fitting Techniques." Digital WPI, 2015. https://digitalcommons.wpi.edu/etd-theses/178.
Full textMayo, Quentin R. "Detection of Generalizable Clone Security Coding Bugs Using Graphs and Learning Algorithms." Thesis, University of North Texas, 2018. https://digital.library.unt.edu/ark:/67531/metadc1404548/.
Full textEzzeddine, Diala. "A contribution to topological learning and its application in Social Networks." Thesis, Lyon 2, 2014. http://www.theses.fr/2014LYO22011/document.
Full textSupervised Learning is a popular field of Machine Learning that has made recent progress. In particular, many methods and procedures have been developed to solve the classification problem. Most classical methods in Supervised Learning use the density estimation of data to construct their classifiers.In this dissertation, we show that the topology of data can be a good alternative in constructing classifiers. We propose using topological graphs like Gabriel graphs (GG) and Relative Neighborhood Graphs (RNG) that can build the topology of data based on its neighborhood structure. To apply this concept, we create a new method called Random Neighborhood Classification (RNC).In this method, we use topological graphs to construct classifiers and then apply Ensemble Methods (EM) to get all relevant information from the data. EM is well known in Machine Learning, generates many classifiers from data and then aggregates these classifiers into one. Aggregate classifiers have been shown to be very efficient in many studies, because it leverages relevant and effective information from each generated classifier. We first compare RNC to other known classification methods using data from the UCI Irvine repository. We find that RNC works very well compared to very efficient methods such as Random Forests and Support Vector Machines. Most of the time, it ranks in the top three methods in efficiency. This result has encouraged us to study the efficiency of RNC on real data like tweets. Twitter, a microblogging Social Network, is especially useful to mine opinion on current affairs and topics that span the range of human interest, including politics. Mining political opinion from Twitter poses peculiar challenges such as the versatility of the authors when they express their political view, that motivate this study. We define a new attribute, called couple, that will be very helpful in the process to study the tweets opinion. A couple is an author that talk about a politician. We propose a new procedure that focuses on identifying the opinion on tweet using couples. We think that focusing on the couples's opinion expressed by several tweets can overcome the problems of analysing each single tweet. This approach can be useful to avoid the versatility, language ambiguity and many other artifacts that are easy to understand for a human being but not automatically for a machine.We use classical Machine Learning techniques like KNN, Random Forests (RF) and also our method RNC. We proceed in two steps : First, we build a reference set of classified couples using Naive Bayes. We also apply a second alternative method to Naive method, sampling plan procedure, to compare and evaluate the results of Naive method. Second, we evaluate the performance of this approach using proximity measures in order to use RNC, RF and KNN. The expirements used are based on real data of tweets from the French presidential election in 2012. The results show that this approach works well and that RNC performs very good in order to classify opinion in tweets.Topological Learning seems to be very intersting field to study, in particular to address the classification problem. Many concepts to get informations from topological graphs need to analyse like the ones described by Aupetit, M. in his work (2005). Our work show that Topological Learning can be an effective way to perform classification problem
Landrieu, Loïc. "Learning structured models on weighted graphs, with applications to spatial data analysis." Thesis, Paris Sciences et Lettres (ComUE), 2016. http://www.theses.fr/2016PSLEE046/document.
Full textModeling complex processes often involve a high number of variables with anintricate correlation structure. For example, many spatially-localized processes display spatial regularity, as variables corresponding to neighboring regions are more correlated than distant ones. The formalism of weighted graphs allows us to capture relationships between interacting variables in a compact manner, permitting the mathematical formulation of many spatial analysis tasks. The first part of this manuscript focuses on optimization problems with graph-structure dregularizers, such as the total variation or the total boundary size. We first present the convex formulation and its resolution with proximal splitting algorithms. We introduce a new preconditioning scheme for the existing generalized forward-backward proximal splitting algorithm, specifically designed for graphs with high variability in neighbourhood configurations and edge weights. We then introduce a new algorithm, cut pursuit, which used the links between graph cuts and total variation in a working set scheme. We also present a variation of this algorithm which solved the problem regularized by the non convex total boundary length penalty. We show that our proposed approaches reach or outperform state-of-the-art for geostatistical aggregation as well as image recovery problems. The second part focuses on the development of a new model, expanding continuous-time Markov chain models to general undirected weighted graphs. This allows us to take into account the interactions between neighbouring nodes in structured classification, as demonstrated for a supervised land-use classification task from cadastral data
Preece, Jenny. "Interpreting trends in graphs : a study of 14 and 15 year olds." Thesis, n.p, 1985. http://ethos.bl.uk/.
Full textYang, Karren Dai. "Learning causal graphs under interventions and applications to single-cell biological data analysis." Thesis, Massachusetts Institute of Technology, 2021. https://hdl.handle.net/1721.1/130806.
Full textThesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2021
Cataloged from the official PDF version of thesis.
Includes bibliographical references (pages 49-51).
This thesis studies the problem of learning causal directed acyclic graphs (DAGs) in the setting where both observational and interventional data is available. This setting is common in biology, where gene regulatory networks can be intervened on using chemical reagents or gene deletions. The identifiability of causal DAGs under perfect interventions, which eliminate dependencies between targeted variables and their direct causes, has previously been studied. This thesis first extends these identifiability results to general interventions, which may modify the dependencies between targeted variables and their causes without eliminating them, by defining and characterizing the interventional Markov equivalence class that can be identified from general interventions. Subsequently, this thesis proposes the first provably consistent algorithm for learning DAGs in this setting. Finally, this algorithm as well as related work is applied to analyze biological datasets.
by Karren Dai Yang.
S.M.
S.M.
S.M. Massachusetts Institute of Technology, Department of Biological Engineering
S.M. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
Chávez, Escalante Diego Alonso 1988. "Semi-supervised learning with graphs methods using signal processing = Métodos de aprendizado semi-supervisionado com grafos usando processamento de sinais." [s.n.], 2014. http://repositorio.unicamp.br/jspui/handle/REPOSIP/275521.
Full textDissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Computação
Made available in DSpace on 2018-08-25T19:49:49Z (GMT). No. of bitstreams: 1 ChavezEscalante_DiegoAlonso_M.pdf: 1954210 bytes, checksum: c9a77d2f0545d5517700c34dd6cf3324 (MD5) Previous issue date: 2014
Resumo: No aprendizado de máquina, os problemas de classificação de padrões eram tradicionalmente abordados por algoritmos de aprendizado supervisionado que utilizam apenas dados rotulados para treinar-se. Entretanto, os dados rotulados são realmente difíceis de coletar em muitos domínios de problemas, enquanto os dados não rotulados são geralmente mais fáceis de recolher. Também em aprendizado de máquina só o aprendizado não supervisionado é capaz de aprender a topologia e propriedades de um conjunto de dados não rotulados. Portanto, a fim de conseguir uma classificação utilizando o conhecimento a partir de dados rotulados e não rotulados, é necessário o uso de conceitos de aprendizado supervisionado tanto como do não supervisionado. Este tipo de aprendizagem é chamado de aprendizado semi-supervisionado, que declara ter construído melhores classificadores que o tradicional aprendizado supervisionado em algumas condições especificas, porque não só aprende dos dados rotulados, mas também das propriedades naturais dos dados não rotulados como por exemplo a distribuição espacial deles. O aprendizado semi-supervisionado apresenta uma ampla coleção de métodos e técnicas para classificação, e um dos mais interessantes e o aprendizado semi-supervisionado baseado em grafos, o qual modela o problema da classificação semi-supervisionada utilizando a teoria dos grafos. Mas um problema que surge a partir dessa técnica é o custo para treinar conjuntos com grandes quantidades de dados, de modo que o desenvolvimento de algoritmos escaláveis e eficientes de aprendizado semi-supervisionado baseado em grafos e um problema muito interessante e prometedor para lidar com ele. Desta pesquisa foram desenvolvidos dois algoritmos, um para a construção do grafo usando redes neurais não supervisionadas e outro para a regularização do grafo usando processamento de sinais em grafos, especificamente usando filtros de resposta finita sobre o grafo. As duas soluções mostraram resultados comparáveis com os da literatura
Abstract: In machine learning, classification problems were traditionally addressed by supervised learning algorithms, which only use labeled data for training. However, labeled data in many problem domains are really hard to collect, while unlabeled data are usually easy to collect. Also, in machine learning, only unsupervised learning is capable to learn the topology and properties of a set of unlabeled data. In order to do a classification using knowledge from labeled and unlabeled data, it is necessary to use concepts from both supervised and unsupervised learning. This type of learning is called semi-supervised learning, which has claimed to build better classifiers than the traditional supervised learning in some specific conditions, because it does not only learn from the labeled data, but also from the natural properties of unlabeled data as for example spatial distribution. Semi-supervised learning presents a broad collection of methods and techniques for classification. Among them there is graph based semi-supervised learning, which model the problem of semi-supervised classification using graph theory. One problem that arises from this technique is the cost for training large data sets, so the development of scalable and efficient algorithms for graph based semi-supervised learning is a interesting and promising problem to deal with. From this research we developed two algorithms, one for graph construction using unsupervised neural networks; and other for graph regularization using graph signal processing theory, more specifically using FIR filters over a graph. Both solutions showed comparable performance to other literature methods in terms of accuracy
Mestrado
Ciência da Computação
Mestre em Ciência da Computação
GARBARINO, DAVIDE. "Acknowledging the structured nature of real-world data with graphs embeddings and probabilistic inference methods." Doctoral thesis, Università degli studi di Genova, 2022. http://hdl.handle.net/11567/1092453.
Full textFlores, Nicandro. "Counting directed acyclic graphs and its application to Monte Carlo learning of Bayesian networks." Connect to online resource, 2007. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:1447692.
Full textNavarin, Nicolò <1984>. "Learning with Kernels on Graphs: DAG-based kernels, data streams and RNA function prediction." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2014. http://amsdottorato.unibo.it/6578/1/navarin_nicolo_tesi.pdf.
Full textNavarin, Nicolò <1984>. "Learning with Kernels on Graphs: DAG-based kernels, data streams and RNA function prediction." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2014. http://amsdottorato.unibo.it/6578/.
Full textShah, Shivani. "Graph sparsification and unsupervised machine learning for metagenomic binning." Thesis, Tours, 2019. http://theses.scd.univ-tours.fr/index.php?fichier=2019/shivani.shah_18225.pdf.
Full textMetagenomics is the field biology that relates to the study of genomic content of microbial communities directly in their natural environments. The metagenomic data is generated by sequencing technology that take the enviormental samples as the input. The generated data is composed of short fragments of DNA (called reads), which originate from genomes of all species present in the sample. The datasets size range from thousands to millions of reads. One of the steps of metagenomic data analysis is binning of the reads. In binning groups (called bins) are to be formed such that each group is composed of reads which are likely to originate from the same specie or specie family. It has essentially been treated as a task of clustering in the metagenomic literature. One of the challenges in binning occurs due to the large size of the datasets. The method overwhelms the computational resources required while performing the task. Hence the development of binning approaches which are scalable to large datasets is required.In this thesis, we address this issue by proposing a scalable method to perform binning. We position our work among the compositional based binning approaches (use of short kmers) and in completely unsupervised context. On order to decrease the complexity of the binning task, methods are proposed to perform sparsification of the data prior to clustering. The development of the approach has been performed in two steps. First the idea has been evaluated on smaller metagenomic datasets (composed of few thousands of points). In the second step, we propose to scale this approach to larger datasets (composed of Millions of points) with similarity based indexing methods (LSH approaches). There are three major contributions of the thesis.First, we propose the idea of performing sparsification of the data with proximity graphs, prior to clustering. The proximity graphs are built on the data to capture pair-wise relationships between data points that are relevant for clustering. Then we leverage community detection algorithms on these graphs to identify clusters from the data. An exploratory study has been performed with several proximity graphs and community detection algorithm on three metagenomic datasets. Based on this study we propose an approach named ProxiClust with KNN graph and Louvain community detection to perform binning.Second, to scale this approach to larger datasets the distance matrix in the pipeline is replaced with hash tables built from Sim-hash LSH approach. We introduce two strategies to build proximity graphs from the hash tables: 1) Microclusters graph and 2) Approximate k nearest neighbour graph. The performance of these graphs have been evaluated on large MC datasets. The performance and limitations of these graphs are discussed. The baseline evaluation of these datasets have also been performed to determine their clustering difficulty. Based on this study we propose Mutual-KNN graph to be the appropriate proximity graph for the large datasets. This proposal has also evaluated and confirmed on the CAMI benchmark metagenomic datasets.Lastly, we examine alternative hashing approaches to build better quality hash tables. A data-dependent hashing approach ITQ and orthogonal version of Sim-hash have been included. Two new data dependent hashing approaches named ITQ-SH and ITQ-OrthSH are introduced. All the hashing approaches have been evaluated w.r.t their ability to hash the MC datasets with high precision and recall. AndThe introduction of Mutual-KNN as the appropriate proximity graph has led to new challenges in the pipeline. First, large number of clusters are generated due to high number of components in the Mutual-KNN graph. So, in order to obtain appropriate number of clusters, a strategy needs to be devised to merge the similar clusters. Also an approach to build Mutual-KNN graph from hash tables needs to be designed. This would complete the ProxiClust pipeline for the large datasets
Bodily, Robert Gordon. "Designing, Developing, and Implementing Real-Time Learning Analytics Student Dashboards." BYU ScholarsArchive, 2018. https://scholarsarchive.byu.edu/etd/7258.
Full textRichard, Émile. "Regularization methods for prediction in dynamic graphs and e-marketing applications." Phd thesis, École normale supérieure de Cachan - ENS Cachan, 2012. http://tel.archives-ouvertes.fr/tel-00906066.
Full textMorris, Christopher [Verfasser], Petra [Akademischer Betreuer] Mutzel, and Kristian [Gutachter] Kersting. "Learning with graphs: kernel and neural approaches / Christopher Morris ; Gutachter: Kristian Kersting ; Betreuer: Petra Mutzel." Dortmund : Universitätsbibliothek Dortmund, 2019. http://d-nb.info/1205157441/34.
Full textLimnios, Stratis. "Graph Degeneracy Studies for Advanced Learning Methods on Graphs and Theoretical Results Edge degeneracy: Algorithmic and structural results Degeneracy Hierarchy Generator and Efficient Connectivity Degeneracy Algorithm A Degeneracy Framework for Graph Similarity Hcore-Init: Neural Network Initialization based on Graph Degeneracy." Thesis, Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAX038.
Full textExtracting Meaningful substructures from graphs has always been a key part in graph studies. In machine learning frameworks, supervised or unsupervised, as well as in theoretical graph analysis, finding dense subgraphs and specific decompositions is primordial in many social and biological applications among many others.In this thesis we aim at studying graph degeneracy, starting from a theoretical point of view, and building upon our results to find the most suited decompositions for the tasks at hand.Hence the first part of the thesis we work on structural results in graphs with bounded edge admissibility, proving that such graphs can be reconstructed by aggregating graphs with almost-bounded-edge-degree. We also provide computational complexity guarantees for the different degeneracy decompositions, i.e. if they are NP-complete or polynomial, depending on the length of the paths on which the given degeneracy is defined.In the second part we unify the degeneracy and admissibility frameworks based on degree and connectivity. Within those frameworks we pick the most expressive, on the one hand, and computationally efficient on the other hand, namely the 1-edge-connectivity degeneracy, to experiment on standard degeneracy tasks, such as finding influential spreaders.Following the previous results that proved to perform poorly we go back to using the k-core but plugging it in a supervised framework, i.e. graph kernels. Thus providing a general framework named core-kernel, we use the k-core decomposition as a preprocessing step for the kernel and apply the latter on every subgraph obtained by the decomposition for comparison. We are able to achieve state-of-the-art performance on graph classification for a small computational cost trade-off.Finally we design a novel degree degeneracy framework for hypergraphs and simultaneously on bipartite graphs as they are hypergraphs incidence graph. This decomposition is then applied directly to pretrained neural network architectures as they induce bipartite graphs and use the coreness of the neurons to re-initialize the neural network weights. This framework not only outperforms state-of-the-art initialization techniques but is also applicable to any pair of layers convolutional and linear thus being applicable however needed to any type of architecture
Schwartz, Samuel David. "Machine Learning Techniques as Applied to Discrete and Combinatorial Structures." DigitalCommons@USU, 2019. https://digitalcommons.usu.edu/etd/7542.
Full textWeninger, Timothy Edwards. "Link discovery in very large graphs by constructive induction using genetic programming." Thesis, Manhattan, Kan. : Kansas State University, 2008. http://hdl.handle.net/2097/1087.
Full textBautista, Ruiz Esteban. "Laplacian Powers for Graph-Based Semi-Supervised Learning." Thesis, Lyon, 2019. http://www.theses.fr/2019LYSEN081.
Full textGraph-Based Semi-Supervised Learning (G-SSL) techniques learn from both labelled and unla- belled data to build better classifiers. Despite successes, its performance can still be improved, particularly in cases of graphs with unclear clusters or unbalanced labelled datasets. To ad- dress such limitations, the main contribution of this dissertation is a novel method for G-SSL referred to as the Lγ -PageRank method. It consists of a generalization of the PageRank algo- rithm based on the positive γ-th powers of the graph Laplacian matrix. The theoretical study of Lγ -PageRank shows that (i) for γ < 1, it corresponds to an extension of the PageRank algo- rithm to L´evy processes: where random walkers can now perform far-distant jumps in a single step; and (ii) for γ > 1, it operates on signed graphs: where nodes belonging to one same class are more likely to share positive edges while nodes from different classes are more likely to be connected with negative edges. We show the existence of an optimal γ-th power that maximizes performance, for which a method for its automatic estimation is devised and assessed. Exper- iments on several datasets demonstrate that the L´evy flight random walkers can enhance the detection of classes with complex local structures and that the signed graphs can significantly improve the separability of data and also override the issue of unbalanced labelled data. In addition, we study efficient implementations of Lγ -PageRank. Extensions of Power Iteration and Gauss-Southwell, successful algorithms to efficiently compute the solution of the standard PageRank algorithm, are derived for Lγ -PageRank. Moreover, the dynamic versions of Power Iteration and Gauss-Southwell, which can update the solution of standard PageRank in sub- linear complexity when the graph evolves or new data arrive, are also extended to Lγ -PageRank. Lastly, we apply Lγ -PageRank in the context of Internet routing. We address the problem of identifying the Autonomous Systems (AS) of inter-AS links from the network of IP addresses and AS public registers. Experiments on tracerout measurements collected from the Internet show that Lγ -PageRank can solve this inference task with no errors, even when the expert does not provide labelled examples of all classes