Dissertations / Theses on the topic 'Generalization bounds'
Consult the top 21 dissertations / theses for your research on the topic 'Generalization bounds.'
McDonald, Daniel J. "Generalization Error Bounds for Time Series." Research Showcase @ CMU, 2012. http://repository.cmu.edu/dissertations/184.
Kroon, Rodney Stephen. "Support vector machines, generalization bounds, and transduction." Thesis, Stellenbosch : University of Stellenbosch, 2003. http://hdl.handle.net/10019.1/16375.
Kelby, Robin J. "Formalized Generalization Bounds for Perceptron-Like Algorithms." Ohio University / OhioLINK, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1594805966855804.
Giulini, Ilaria. "Generalization bounds for random samples in Hilbert spaces." Thesis, Paris, Ecole normale supérieure, 2015. http://www.theses.fr/2015ENSU0026/document.
This thesis focuses on obtaining generalization bounds for random samples in reproducing kernel Hilbert spaces. The approach consists in first obtaining non-asymptotic dimension-free bounds in finite-dimensional spaces using some PAC-Bayesian inequalities related to Gaussian perturbations and then in generalizing the results to a separable Hilbert space. We first investigate the question of estimating the Gram operator by a robust estimator from an i.i.d. sample and we present uniform bounds that hold under weak moment assumptions. These results allow us to qualify principal component analysis independently of the dimension of the ambient space and to propose stable versions of it. In the last part of the thesis we present a new algorithm for spectral clustering. It consists in replacing the projection on the eigenvectors associated with the largest eigenvalues of the Laplacian matrix by a power of the normalized Laplacian. This iteration, justified by the analysis of clustering in terms of Markov chains, performs a smooth truncation. We prove non-asymptotic bounds for the convergence of our spectral clustering algorithm applied to a random sample of points in a Hilbert space; these are deduced from the bounds for the Gram operator in a Hilbert space. Experiments are done in the context of image analysis.
Rakhlin, Alexander. "Applications of empirical processes in learning theory : algorithmic stability and generalization bounds." Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/34564.
Includes bibliographical references (p. 141-148).
This thesis studies two key properties of learning algorithms: their generalization ability and their stability with respect to perturbations. To analyze these properties, we focus on concentration inequalities and tools from empirical process theory. We obtain theoretical results and demonstrate their applications to machine learning. First, we show how various notions of stability upper- and lower-bound the bias and variance of several estimators of the expected performance for general learning algorithms. A weak stability condition is shown to be equivalent to consistency of empirical risk minimization. The second part of the thesis derives tight performance guarantees for greedy error minimization methods - a family of computationally tractable algorithms. In particular, we derive risk bounds for a greedy mixture density estimation procedure. We prove that, unlike what is suggested in the literature, the number of terms in the mixture is not a bias-variance trade-off for the performance. The third part of this thesis provides a solution to an open problem regarding the stability of Empirical Risk Minimization (ERM). This algorithm is of central importance in Learning Theory.
By studying the suprema of the empirical process, we prove that ERM over Donsker classes of functions is stable in the L1 norm. Hence, as the number of samples grows, it becomes less and less likely that a perturbation of o(√n) samples will result in a very different empirical minimizer. Asymptotic rates of this stability are proved under metric entropy assumptions on the function class. Through the use of a ratio limit inequality, we also prove stability of expected errors of empirical minimizers. Next, we investigate applications of the stability result. In particular, we focus on procedures that optimize an objective function, such as k-means and other clustering methods. We demonstrate that stability of clustering, just like stability of ERM, is closely related to the geometry of the class and the underlying measure. Furthermore, our result on stability of ERM delineates a phase transition between stability and instability of clustering methods. In the last chapter, we prove a generalization of the bounded-difference concentration inequality for almost-everywhere smooth functions. This result can be utilized to analyze algorithms which are almost always stable. Next, we prove a phase transition in the concentration of almost-everywhere smooth functions. Finally, a tight concentration of empirical errors of empirical minimizers is shown under an assumption on the underlying space.
by Alexander Rakhlin.
Ph.D.
Nordenfors, Oskar. "A Literature Study Concerning Generalization Error Bounds for Neural Networks via Rademacher Complexity." Thesis, Umeå universitet, Institutionen för matematik och matematisk statistik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-184487.
This thesis presents some fundamental results from the theory of machine learning and neural networks, with the goal of ultimately discussing upper bounds on the generalization error of neural networks via Rademacher complexity.
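For orientation, the textbook form of a Rademacher-complexity generalization bound can be written as follows. This is the standard statement for a class of functions with values in [0, 1], not a result quoted from the thesis; the symbols (function class F, sample size n, confidence level δ, Rademacher signs σ_i) are notation assumed here for illustration. With probability at least 1 − δ over an i.i.d. sample Z_1, ..., Z_n, the inequality holds simultaneously for every f in F:

```latex
\mathbb{E}\,[f(Z)]
  \;\le\; \frac{1}{n}\sum_{i=1}^{n} f(Z_i)
  \;+\; 2\,\mathfrak{R}_n(\mathcal{F})
  \;+\; \sqrt{\frac{\ln(1/\delta)}{2n}},
\qquad
\mathfrak{R}_n(\mathcal{F})
  \;=\; \mathbb{E}\left[\,\sup_{f \in \mathcal{F}}
        \frac{1}{n}\sum_{i=1}^{n} \sigma_i f(Z_i)\right].
```

Here the σ_i are independent uniform ±1 signs and the expectation in the definition of the Rademacher complexity is over both the sample and the signs.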
Bellet, Aurélien. "Supervised metric learning with generalization guarantees." Phd thesis, Université Jean Monnet - Saint-Etienne, 2012. http://tel.archives-ouvertes.fr/tel-00770627.
Musayeva, Khadija. "Generalization Performance of Margin Multi-category Classifiers." Thesis, Université de Lorraine, 2019. http://www.theses.fr/2019LORR0096/document.
This thesis deals with the theory of margin multi-category classification, and is based on the statistical learning theory founded by Vapnik and Chervonenkis. We are interested in deriving generalization bounds with explicit dependencies on the number C of categories, the sample size m and the margin parameter gamma, when the loss function considered is a Lipschitz continuous margin loss function. Generalization bounds rely on the empirical performance of the classifier as well as its "capacity". In this work, the following scale-sensitive capacity measures are considered: the Rademacher complexity, the covering numbers and the fat-shattering dimension. Our main contributions are obtained under the assumption that the classes of component functions implemented by a classifier have polynomially growing fat-shattering dimensions and that the component functions are independent. In the context of the pathway of Mendelson, which relates the Rademacher complexity to the covering numbers and the latter to the fat-shattering dimension, we study the impact that decomposing at the level of one of these capacity measures has on the dependencies on C, m and gamma. In particular, we demonstrate that the dependency on C can be substantially improved over the state of the art if the decomposition is postponed to the level of the metric entropy or the fat-shattering dimension. On the other hand, this negatively impacts the rate of convergence (dependency on m), an indication of the fact that optimizing the dependencies on the three basic parameters amounts to looking for a trade-off.
Philips, Petra Camilla. "Data-Dependent Analysis of Learning Algorithms." The Australian National University. Research School of Information Sciences and Engineering, 2005. http://thesis.anu.edu.au./public/adt-ANU20050901.204523.
Katsikarelis, Ioannis. "Structurally Parameterized Tight Bounds and Approximation for Generalizations of Independence and Domination." Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLED048.
In this thesis we focus on the NP-hard problems (k, r)-CENTER and d-SCATTERED SET that generalize the well-studied concepts of domination and independence over larger distances. In the first part we maintain a parameterized viewpoint and examine the standard parameterization as well as the most widely-used graph parameters measuring the input's structure. We offer hardness results that show there is no algorithm of running time below certain bounds, subject to the (Strong) Exponential Time Hypothesis, produce essentially optimal algorithms of complexity that matches these lower bounds, and further attempt to offer an alternative to exact computation in significantly reduced running time by way of approximation algorithms. In the second part we consider the (super-)polynomial (in-)approximability of the d-SCATTERED SET problem, i.e. we determine the exact relationship between an achievable approximation ratio ρ, the distance parameter d, and the running time of any ρ-approximation algorithm expressed as a function of the above and the size of the input n. We then consider strictly polynomial running times and improve our understanding of the approximability characteristics of the problem on graphs of bounded maximum degree as well as bipartite graphs.
Fukihara, Yoji. "Generalization of Bounded Linear Logic and its Categorical Semantics." Doctoral thesis, Kyoto University, 2021. http://hdl.handle.net/2433/263441.
Schweighofer, Markus. "Iterated rings of bounded elements and generalizations of Schmüdgen's theorem." [S.l. : s.n.], 2002. http://www.bsz-bw.de/cgi-bin/xvms.cgi?SWB9911683.
Peel, Thomas. "Algorithmes de poursuite stochastiques et inégalités de concentration empiriques pour l'apprentissage statistique." Thesis, Aix-Marseille, 2013. http://www.theses.fr/2013AIXM4769/document.
The first part of this thesis introduces new algorithms for the sparse encoding of signals. Based on Matching Pursuit (MP), they focus on the following problem: how to reduce the computation time of the selection step of MP. As an answer, we sub-sample the dictionary in rows and columns at each iteration. We show that this theoretically grounded approach has good empirical performance. We then propose a block coordinate gradient descent algorithm for feature selection problems in the multiclass classification setting. Thanks to the use of error-correcting output codes, this task can be seen as a problem of simultaneous sparse encoding of signals. The second part presents new empirical Bernstein inequalities. Firstly, they concern the theory of U-statistics and are applied in order to design generalization bounds for ranking algorithms. These bounds take advantage of a variance estimator and we propose an efficient algorithm to compute it. Then, we present an empirical version of the Bernstein-type inequality for martingales by Freedman [1975]. Again, the strength of our result lies in the variance estimator computable from the data. This allows us to propose generalization bounds for online learning algorithms which improve the state of the art and pave the way to a new family of learning algorithms taking advantage of this empirical information.
Polat, Faruk. "On The Generalizations And Properties Of Abramovich-wickstead Spaces." Phd thesis, METU, 2008. http://etd.lib.metu.edu.tr/upload/12610166/index.pdf.
Chinot, Geoffrey. "Localization methods with applications to robust learning and interpolation." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAG002.
This PhD thesis deals with supervised machine learning and statistics. The main goal is to use localization techniques to derive fast rates of convergence, with a particular focus on robust learning and interpolation problems. Localization methods aim to analyze localized properties of an estimator to obtain fast rates of convergence, that is, rates of order O(1/n), where n is the number of observations. Under assumptions such as the Bernstein condition, such rates are attainable. A robust estimator is an estimator with good theoretical guarantees under as few assumptions as possible. This question is getting more and more popular in the current era of big data. Large datasets are very likely to be corrupted and one would like to build reliable estimators in such a setting. We show that the well-known regularized empirical risk minimizer (RERM) with Lipschitz loss function is robust with respect to heavy-tailed noise and outliers in the label. When the class of predictors is heavy-tailed, RERM is not reliable. In this setting, we show that minmax Median of Means (MOM) estimators can be a solution. By construction, minmax-MOM estimators are also robust to adversarial contamination. Interpolation problems study learning procedures with zero training error. Surprisingly, in high dimensions, interpolating the data does not necessarily imply over-fitting. We study a high-dimensional Gaussian linear model and show that sometimes the over-fitting may be benign.
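The median-of-means idea mentioned in the abstract above can be illustrated with a minimal sketch. This is the basic median-of-means estimate of a mean, not the minmax-MOM estimator studied in the thesis; the function name `median_of_means`, the block count and the sample data are all hypothetical choices made for illustration.

```python
import statistics


def median_of_means(values, num_blocks):
    """Basic median-of-means estimate of the mean of `values`.

    Illustrative sketch only: the data are split into consecutive blocks,
    so the input is assumed to be in random (exchangeable) order.
    """
    if not 1 <= num_blocks <= len(values):
        raise ValueError("num_blocks must be between 1 and len(values)")
    block_size = len(values) // num_blocks
    # Points beyond num_blocks * block_size are dropped.
    block_means = [
        statistics.fmean(values[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    # A single corrupted block cannot move the median of the block means far.
    return statistics.median(block_means)


# Hypothetical sample: values near 1.0 plus one gross outlier.
sample = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 1.3, 0.7, 1000.0]
mom_estimate = median_of_means(sample, 3)
plain_mean = statistics.fmean(sample)
```

With three blocks the outlier contaminates only one block mean, so the median stays close to 1.0 while the plain average is pulled above 100.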
Cherief-Abdellatif, Badr-Eddine. "Contributions to the theoretical study of variational inference and robustness." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAG001.
This PhD thesis deals with variational inference and robustness. More precisely, it focuses on the statistical properties of variational approximations and the design of efficient algorithms for computing them in an online fashion, and investigates Maximum Mean Discrepancy based estimators as learning rules that are robust to model misspecification. In recent years, variational inference has been extensively studied from the computational viewpoint, but until very recently little attention was paid in the literature to the theoretical properties of variational approximations. In this thesis, we investigate the consistency of variational approximations in various statistical models and the conditions that ensure it. In particular, we tackle the special cases of mixture models and deep neural networks. We also justify in theory the use of the ELBO maximization strategy, a model selection criterion that is widely used in the Variational Bayes community and is known to work well in practice. Moreover, Bayesian inference provides an attractive online-learning framework to analyze sequential data, and offers generalization guarantees which hold even under model mismatch and with adversaries. Unfortunately, exact Bayesian inference is rarely feasible in practice and approximation methods are usually employed, but do such methods preserve the generalization properties of Bayesian inference? In this thesis, we show that this is indeed the case for some variational inference algorithms. We propose new online, tempered variational algorithms and derive their generalization bounds. Our theoretical result relies on the convexity of the variational objective, but we argue that our result should hold more generally and present empirical evidence in support of this. Our work presents theoretical justifications in favor of online algorithms that rely on approximate Bayesian methods.
Another point that is addressed in this thesis is the design of a universal estimation procedure. This question is of major interest, in particular because it leads to robust estimators, a very hot topic in statistics and machine learning. We tackle the problem of universal estimation using a minimum distance estimator based on the Maximum Mean Discrepancy. We show that the estimator is robust both to dependence and to the presence of outliers in the dataset. We also highlight the connections that may exist with minimum distance estimators using the L2-distance. Finally, we provide a theoretical study of the stochastic gradient descent algorithm used to compute the estimator, and we support our findings with numerical simulations. We also propose a Bayesian version of our estimator, which we study from both theoretical and computational points of view.
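As a rough illustration of the Maximum Mean Discrepancy that the estimator above is built on, the sketch below computes the biased (V-statistic) estimate of the squared MMD between two samples. It shows only the discrepancy itself, not the thesis's minimum distance estimator or its stochastic gradient algorithm; the Gaussian kernel, the `bandwidth` value, the function names and the sample points are assumptions made for illustration.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two points given as equal-length tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    def mean_kernel(us, vs):
        # Average kernel value over all pairs (u, v), including u == v.
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical one-dimensional samples centred at 0 and at 3.
xs = [(0.0,), (0.1,), (-0.1,)]
ys = [(3.0,), (3.1,), (2.9,)]
same_sample = mmd_squared(xs, xs)
different_samples = mmd_squared(xs, ys)
```

The estimate is zero when a sample is compared with itself and clearly positive for the two well-separated samples. A minimum distance estimator would then search over model parameters for the distribution whose samples make this quantity smallest.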
Kevin Segundo Bello Medina. "Structured Prediction: Statistical and Computational Guarantees in Learning and Inference." Thesis, 2021.
Philips, Petra. "Data-Dependent Analysis of Learning Algorithms." Phd thesis, 2005. http://hdl.handle.net/1885/47998.
Makareh, Shireh Miad. "Topics in the Notion of Amenability and its Generalizations for Banach Algebras." 2010. http://hdl.handle.net/1993/4197.
Schweighofer, Markus [Verfasser]. "Iterated rings of bounded elements and generalizations of Schmüdgen's theorem / Markus Schweighofer." 2002. http://d-nb.info/964451824/34.
Nasre, Meghana. "Generalizations Of The Popular Matching Problem." Thesis, 2011. http://etd.iisc.ernet.in/handle/2005/2095.