Dissertations / Theses on the topic 'Kernel Inference'


Consult the top 26 dissertations / theses for your research on the topic 'Kernel Inference.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations / theses on a wide variety of disciplines and organise your bibliography correctly.

1

Fouchet, Arnaud. "Kernel methods for gene regulatory network inference." Thesis, Evry-Val d'Essonne, 2014. http://www.theses.fr/2014EVRY0058/document.

Abstract:
New technologies in molecular biology, in particular DNA microarrays, have greatly increased the quantity of available data. In this context, methods from mathematics and computer science have been actively developed to extract information from large datasets. In particular, the problem of gene regulatory network inference has been tackled using many different mathematical and statistical models, from the most basic (correlation, Boolean or linear models) to the most elaborate (regression trees, Bayesian models with latent variables). Despite their qualities when applied to similar problems, kernel methods have scarcely been used for gene network inference because of their lack of interpretability. In this thesis, two approaches are developed to obtain interpretable kernel methods. First, from a theoretical point of view, some kernel methods are shown to consistently estimate a transition function and its partial derivatives from a learning dataset. On realistic examples, these partial-derivative estimates identify the gene regulatory network better than standard methods. Second, an interpretable kernel method based on multiple kernel learning is presented. This method, called lockni, provides state-of-the-art results on real and realistically simulated networks.
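The first idea lends itself to a compact sketch. The toy example below (not the thesis's estimator; the kernel choice and all parameter values are placeholder assumptions) fits a kernel ridge regression with an RBF kernel and scores candidate regulators by the average magnitude of the fitted function's partial derivatives, which have a closed form for this kernel.

```python
import numpy as np

def rbf_kernel(A, B, ell):
    # Gaussian kernel k(a, b) = exp(-||a - b||^2 / (2 ell^2))
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * ell**2))

def fit_kernel_ridge(X, y, ell=1.0, lam=1e-2):
    # alpha solves (K + n*lam*I) alpha = y; the fit is f(x) = sum_i alpha_i k(x_i, x)
    n = len(X)
    K = rbf_kernel(X, X, ell)
    return np.linalg.solve(K + n * lam * np.eye(n), y)

def gradient_at(X, alpha, x, ell=1.0):
    # For the RBF kernel, df/dx_d = sum_i alpha_i k(x_i, x) (X_id - x_d) / ell^2
    k = rbf_kernel(X, x[None, :], ell)[:, 0]
    return (alpha * k) @ (X - x) / ell**2

# Toy use: expression of a target gene as a function of p candidate regulators.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                       # regulator expression levels
y = np.tanh(X[:, 0]) - 0.5 * X[:, 2] + 0.1 * rng.normal(size=200)
alpha = fit_kernel_ridge(X, y)
grads = np.array([gradient_at(X, alpha, x) for x in X])
print(np.abs(grads).mean(axis=0).round(2))          # genes 0 and 2 should dominate
```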
2

Chan, Karen Pui-Shan. "Kernel density estimation, Bayesian inference and random effects model." Thesis, University of Edinburgh, 1990. http://hdl.handle.net/1842/13350.

Abstract:
This thesis contains results of a study in kernel density estimation, Bayesian inference and random effects models, with application to forensic problems. Estimation of the Bayes factor in a forensic science problem involved the derivation of predictive distributions in non-standard situations. The distribution of the values of a characteristic of interest among different items in forensic science problems is often non-Normal. Background, or training, data were available to assist in the estimation of the distribution for measurements on cat and dog hairs. An informative prior, based on the kernel method of density estimation, was used to derive the appropriate predictive distributions. The training data may be considered to be derived from a random effects model; this was taken into consideration in modelling the Bayes factor. The usual assumption that the random factor is Normally distributed is unrealistic, so a kernel density estimate was used as the distribution of the unknown random factor. Two kernel methods were employed: the ordinary and the adaptive kernel method. The adaptive kernel method allowed for the longer tail, where little information was available. Formulae for the Bayes factor in a forensic science context were derived assuming the training data were grouped or not grouped (for example, hairs from one cat would be thought of as belonging to the same group), and that the within-group variance was or was not known. The Bayes factor, assuming known within-group variance, was extended to the multivariate case for grouped and ungrouped training data, and the method was applied to a practical example in a bivariate situation. Similar modelling of the Bayes factor was derived to cope with a particular form of mixture data, and boundary effects were also taken into consideration. The application of kernel density estimation to inference about the variance components under the random effects model was also studied. Using maximum likelihood estimation, it was shown that the between-group variance and the smoothing parameter of the kernel density estimate are related and cannot be identified separately. With the smoothing parameter fixed at some predetermined value, the within- and between-group variance estimates from the proposed model were equivalent to the usual ANOVA estimates. Within the Bayesian framework, posterior distributions for the variance components were derived under various prior distributions for the parameters, incorporating kernel density functions; the modes of these posterior distributions were used as estimates of the variance components. A Student-t form within a Bayesian framework was derived after introducing a prior for the smoothing parameter. Two methods of obtaining hyper-parameters for this prior were suggested, both involving empirical Bayes methods: a modified leave-one-out maximum likelihood method, and a method of moments based on the optimal smoothing parameter determined under a Normality assumption.
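The adaptive kernel method mentioned above has a standard textbook form that is easy to sketch. The snippet below is a minimal illustration of Silverman-style adaptive kernel density estimation, assuming a Gaussian kernel and the square-root local-bandwidth law; it is not the thesis's forensic Bayes-factor machinery.

```python
import numpy as np

def adaptive_kde(data, grid, h):
    """Adaptive KDE: a fixed-bandwidth pilot estimate sets local bandwidths
    h * lambda_i, with lambda_i larger where the pilot density is small,
    lengthening the tails exactly where data are scarce."""
    n = len(data)
    # Pilot (ordinary) KDE evaluated at the data points themselves.
    pilot = np.exp(-0.5 * ((data[:, None] - data[None, :]) / h) ** 2)
    pilot = pilot.sum(axis=1) / (n * h * np.sqrt(2 * np.pi))
    g = np.exp(np.mean(np.log(pilot)))        # geometric mean of pilot values
    lam = (pilot / g) ** -0.5                 # local bandwidth factors
    # Adaptive estimate on the grid.
    z = (grid[:, None] - data[None, :]) / (h * lam[None, :])
    dens = np.exp(-0.5 * z**2) / (h * lam[None, :] * np.sqrt(2 * np.pi))
    return dens.mean(axis=1)

x = np.random.default_rng(1).standard_t(df=3, size=300)   # heavy-tailed sample
grid = np.linspace(-8, 8, 400)
f_hat = adaptive_kde(x, grid, h=0.4)
```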
3

Araya, Valdivia Ernesto. "Kernel spectral learning and inference in random geometric graphs." Thesis, université Paris-Saclay, 2020. http://www.theses.fr/2020UPASM020.

Abstract:
This thesis has two main objectives. The first is to investigate the concentration properties of random kernel matrices, which are central in the study of kernel methods. The second is to study statistical inference problems on random geometric graphs. The two objectives are connected by the graphon formalism, which makes it possible to represent a graph by a kernel function. We briefly recall the basics of the graphon model in the first chapter. In chapter two, we present a set of sharp concentration inequalities for the individual eigenvalues of a kernel matrix; our main contribution is to obtain inequalities that scale with the eigenvalue under consideration, implying convergence rates that are faster than parametric and often exponential, which had hitherto been established only under assumptions too restrictive for graph applications. We specialize our results to dot product kernels, highlighting their relation with the random geometric graph model. In chapter three, we study the problem of latent distance estimation for random geometric graphs on the Euclidean sphere. We propose an efficient spectral algorithm that uses the adjacency matrix to construct an estimator of the latent distances, and we prove finite-sample guarantees for the estimation error, establishing its convergence rate. In chapter four, we extend this method to random geometric graphs on the Euclidean ball, a model that, despite its formal similarities with the spherical case, is more flexible for modelling purposes. In particular, we prove that for certain parameter choices its degree profile follows a power law, as has been observed empirically in many real-life networks. All the theoretical findings of the last two chapters are verified and complemented by numerical experiments.
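The spectral idea of chapter three can be caricatured in a few lines. The sketch below treats the leading eigenpairs of the scaled adjacency matrix as an estimate of the Gram matrix of the latent positions and reads pairwise distances off it; the scaling and truncation choices here are simplifying assumptions for illustration, not the thesis's algorithm.

```python
import numpy as np

def latent_distances(A, d):
    """Toy spectral estimator: approximate the latent Gram matrix from the
    top-d eigenpairs of A / n, then convert Gram entries to distances via
    ||x_i - x_j||^2 = G_ii + G_jj - 2 G_ij."""
    n = A.shape[0]
    vals, vecs = np.linalg.eigh(A / n)
    idx = np.argsort(-np.abs(vals))[:d]           # d leading eigenpairs
    G = vecs[:, idx] @ np.diag(vals[idx]) @ vecs[:, idx].T
    diag = np.diag(G)
    sq = np.clip(diag[:, None] + diag[None, :] - 2 * G, 0.0, None)
    return np.sqrt(sq)                            # estimated pairwise distances
```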
4

Jitkrittum, Wittawat. "Kernel-based distribution features for statistical tests and Bayesian inference." Thesis, University College London (University of London), 2017. http://discovery.ucl.ac.uk/10037987/.

Abstract:
The kernel mean embedding is known to provide a data representation which preserves full information of the data distribution. While typically computationally costly, its nonparametric nature has the advantage of requiring no explicit model specification of the data. At the other extreme are approaches which summarize data distributions into a finite-dimensional vector of hand-picked summary statistics. This explicit finite-dimensional representation offers a computationally cheaper alternative. Clearly, there is a trade-off between cost and sufficiency of the representation, and it is of interest to have a computationally efficient technique which can produce a data-driven representation, thus combining the advantages of both extremes. The main focus of this thesis is the development of linear-time mean-embedding-based methods to automatically extract informative features of data distributions, for statistical tests and Bayesian inference. In the first part, on statistical tests, several new linear-time techniques are developed. These include a new kernel-based distance measure for distributions, a new linear-time nonparametric dependence measure, and a linear-time discrepancy measure between a probabilistic model and a sample, based on a Stein operator. These new measures give rise to linear-time and consistent tests of homogeneity, independence, and goodness of fit, respectively. The key idea behind these new tests is to explicitly learn distribution-characterizing feature vectors, by maximizing a proxy for the probability of correctly rejecting the null hypothesis. We theoretically show that these new tests are consistent for any finite number of features. In the second part, we explore the use of random Fourier features to construct approximate kernel mean embeddings, for representing messages in the expectation propagation (EP) algorithm. The goal is to learn a message operator which predicts EP outgoing messages from incoming messages. We derive a novel two-layer random feature representation of the input messages, allowing online learning of the operator during EP inference.
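The 'learned feature' tests build on a linear-time statistic that is easy to state. Below is a minimal sketch of a mean-embedding two-sample test with fixed (rather than optimized) test locations; the Gaussian kernel, the covariance regularization, and the location choices are all illustrative assumptions.

```python
import numpy as np
from scipy import stats

def me_test(X, Y, T, gamma):
    """Linear-time two-sample test from differences of mean embeddings
    evaluated at J test locations T. Under H0: P = Q the statistic is
    asymptotically chi-squared with J degrees of freedom."""
    def k(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    Z = k(X, T) - k(Y, T)                 # (n, J) witness features, one data pass
    n, J = Z.shape
    zbar = Z.mean(axis=0)
    S = np.cov(Z, rowvar=False) + 1e-8 * np.eye(J)
    lam = n * zbar @ np.linalg.solve(S, zbar)
    return lam, stats.chi2.sf(lam, df=J)  # statistic and p-value

rng = np.random.default_rng(2)
X = rng.normal(0.0, 1.0, size=(500, 2))
Y = rng.normal(0.5, 1.0, size=(500, 2))   # mean-shifted sample
T = rng.normal(size=(3, 2))               # J = 3 fixed test locations
print(me_test(X, Y, T, gamma=0.5))        # small p-value expected
```

In the thesis the locations and kernel parameters are learned by maximizing a lower bound on test power; here they are simply fixed.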
5

Hsu, Yuan-Shuo Kelvin. "Bayesian Perspectives on Conditional Kernel Mean Embeddings: Hyperparameter Learning and Probabilistic Inference." Thesis, University of Sydney, 2020. https://hdl.handle.net/2123/24309.

Abstract:
This thesis presents the narrative of a journey towards discovering and developing Bayesian perspectives on conditional kernel mean embeddings. It is motivated by the desire and need to learn flexible and richer representations of conditional distributions for probabilistic inference in various contexts. While conditional kernel mean embeddings are able to achieve such representations, it is unclear how their hyperparameters can be learned for probabilistic inference in various settings. These hyperparameters govern the space of possible representations and critically influence the degree of inference accuracy. At its core, this thesis argues that Bayesian perspectives lead to principled ways of formulating frameworks that provide a holistic treatment of model, learning, and inference. The story begins by emulating required properties of Bayesian frameworks via learning-theoretic bounds. This is carried out through the lens of a probabilistic multiclass setting, resulting in the multiclass conditional embedding framework. By establishing convergence to multiclass probabilities and deriving learning-theoretic and Rademacher complexity bounds, the framework arrives at an expected risk bound whose realizations exhibit desirable properties for hyperparameter learning, such as the ever-crucial balance between data-fit error and model complexity, emulating marginal likelihoods. The probabilistic nature of this bound enables batch learning for scalability, and the generality of the model allows various architectures to be used and learned end-to-end. The narrative unfolds into forming approximate Bayesian inference frameworks directly for the likelihood-free Bayesian inference problem, leading to the kernel embedding likelihood-free inference framework. The core motivation is the natural suitability of conditional kernel mean embeddings for forming surrogate probabilistic models. By leveraging the structure of the likelihood-free Bayesian inference problem, surrogate models for both hyperparameter learning and posterior inference are developed. Finally, the journey concludes with a Bayesian regression framework that aligns the learning and inference with both the problem and the model. This begins with a careful formulation of the conditional mean and the novel deconditional mean problem, revealing the novel deconditional mean embeddings as core elements of the wider kernel mean embedding framework. They can further be established as a nonparametric Bayes' rule with applications to Bayesian inference. Crucially, by introducing the task transformed regression problem, they can be extended to the novel task transformed Gaussian processes as their Bayesian form, whose marginal likelihood can be used to learn hyperparameters in various forms and contexts. The perspectives and frameworks developed in this thesis shed light on creative ways in which conditional kernel mean embeddings can be learned and applied in existing problem domains, and further inspire elegant solutions in novel problem settings.
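The object at the center of this thesis has a short empirical form worth sketching. The snippet below computes the standard empirical conditional mean embedding weights, assuming an RBF kernel; the hyperparameters ell and lam are precisely the quantities whose principled learning the thesis addresses.

```python
import numpy as np

def cme_weights(X, x_star, lam, ell):
    """Empirical conditional mean embedding of Y given X = x_star, expressed
    as weights over the training sample: mu_{Y|x} = sum_i w_i(x) k_Y(y_i, .)
    with w(x) = (K + n*lam*I)^{-1} k_x."""
    def k(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2 * ell**2))
    n = len(X)
    K = k(X, X)
    kx = k(X, x_star[None, :])[:, 0]
    return np.linalg.solve(K + n * lam * np.eye(n), kx)   # (n,) weights

# Any expectation E[g(Y) | X = x*] is then approximated by w @ g(Y_train).
rng = np.random.default_rng(3)
X = rng.uniform(-2, 2, size=(300, 1))
Y = np.sin(2 * X[:, 0]) + 0.1 * rng.normal(size=300)
w = cme_weights(X, np.array([1.0]), lam=1e-3, ell=0.5)
print(w @ Y, np.sin(2.0))        # conditional mean estimate vs. ground truth
```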
6

Adams, R. P. "Kernel methods for nonparametric Bayesian inference of probability densities and point processes." Thesis, University of Cambridge, 2009. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.595350.

Abstract:
I propose two new kernel-based models that enable an exact generative procedure: the Gaussian process density sampler (GPDS) for probability density functions, and the sigmoidal Gaussian Cox process (SGCP) for the Poisson process. With generative priors, I show how it is now possible to construct two different kinds of Markov chains for inference in these models. These Markov chains have the desired posterior distribution as their equilibrium distributions and, despite a parameter space with uncountably many dimensions, require only a finite amount of computation to simulate. The GPDS and SGCP, and the associated inference procedures, are the first kernel-based nonparametric Bayesian methods that allow inference without a finite-dimensional approximation. I also present several additional kernel-based models that extend the Gaussian process density sampler and the sigmoidal Gaussian Cox process to other situations. The Archipelago model extends the GPDS to the task of semi-supervised learning, where a flexible density estimate can improve the performance of a classifier when unlabelled data are available. I also generalise the SGCP to enable a nonparametric inhomogeneous Neyman-Scott process, and present a soft-core generalisation of the Matérn repulsive process that similarly allows non-approximate inference via Markov chain Monte Carlo.
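The SGCP's exact generative procedure is simple enough to sketch. The draw below assumes an RBF covariance and a zero-mean GP; it illustrates why only a finite amount of computation is needed (the GP is evaluated only at the finitely many candidate points), and shows the generative model rather than the MCMC inference.

```python
import numpy as np

def sample_sgcp(lam_star, T, ell, rng):
    """Generative draw from a sigmoidal Gaussian Cox process on [0, T]:
    thin a homogeneous Poisson process of rate lam_star with the random
    probability sigmoid(g(s)), where g is a GP evaluated only at the
    candidate points."""
    n = rng.poisson(lam_star * T)
    s = np.sort(rng.uniform(0, T, size=n))            # candidate events
    # GP draw at the candidate points (RBF covariance, zero mean).
    sq = (s[:, None] - s[None, :]) ** 2
    C = np.exp(-sq / (2 * ell**2)) + 1e-8 * np.eye(n)
    g = np.linalg.cholesky(C) @ rng.standard_normal(n)
    keep = rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-g))
    return s[keep]                                    # retained events

events = sample_sgcp(lam_star=50.0, T=10.0, ell=1.0, rng=np.random.default_rng(4))
```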
7

Gogolashvili, Davit. "Global and local Kernel methods for dataset shift, scalable inference and optimization." Electronic Thesis or Diss., Sorbonne université, 2022. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2022SORUS363v2.pdf.

Abstract:
In many real-world problems, the training data and the test data have different distributions; this situation is commonly referred to as dataset shift. The settings most often considered in the literature are covariate shift and target shift. In this thesis, we investigate nonparametric models applied to the dataset shift scenario. We develop a novel framework to accelerate Gaussian process regression (GPR). In particular, we consider localization kernels at each data point to down-weigh the contributions of data points that are far away, and we derive the GPR model stemming from the application of this localization operation. Through a series of experiments, we demonstrate the competitive performance of the proposed approach compared to full GPR, other localized models, and deep Gaussian processes. Crucially, these performances are obtained with considerable speedups over standard global GPR, owing to the sparsification of the Gram matrix induced by the localization operation. Finally, we propose a new method for estimating the minimizer and the minimum value of a smooth, strongly convex regression function from observations contaminated by random noise.
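One plausible way to realize the localization idea (an assumption for illustration, not the thesis's derivation) is to multiply a base kernel by a compactly supported taper, which makes distant pairs contribute exactly zero and hence sparsifies the Gram matrix:

```python
import numpy as np

def localized_gram(X, ell, rho):
    """RBF kernel multiplied by a Wendland-type taper that vanishes beyond
    distance rho; the Schur product of the two PSD kernels is again PSD,
    and the resulting Gram matrix is sparse."""
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    K = np.exp(-d**2 / (2 * ell**2))
    taper = np.clip(1.0 - d / rho, 0.0, None) ** 2     # zero beyond rho
    K_loc = K * taper
    return K_loc, np.mean(K_loc == 0.0)

X = np.random.default_rng(5).uniform(0, 10, size=(500, 2))
K_loc, frac_zero = localized_gram(X, ell=1.0, rho=2.0)
print(f"{frac_zero:.0%} of Gram entries are exactly zero")
```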
8

Maity, Arnab. "Efficient inference in general semiparametric regression models." College Station, Tex.: Texas A&M University, 2008. http://hdl.handle.net/1969.1/ETD-TAMU-3075.

9

Minnier, Jessica. "Inference and Prediction for High Dimensional Data via Penalized Regression and Kernel Machine Methods." Thesis, Harvard University, 2012. http://dissertations.umi.com/gsas.harvard:10327.

Abstract:
Analysis of high dimensional data often seeks to identify a subset of important features and assess their effects on the outcome. Furthermore, the ultimate goal is often to build a prediction model with these features that accurately assesses risk for future subjects. Such statistical challenges arise in the study of genetic associations with health outcomes. However, accurate inference and prediction with genetic information remains challenging, in part due to the complexity of the genetic architecture of human health and disease. A valuable approach for improving prediction models with a large number of potential predictors is to build a parsimonious model that includes only important variables. Regularized regression methods are useful, though they often pose challenges for inference due to nonstandard limiting distributions or finite sample distributions that are difficult to approximate. In Chapter 1 we propose and theoretically justify a perturbation-resampling method to derive confidence regions and covariance estimates for marker effects estimated from regularized procedures with a general class of objective functions and concave penalties. Our methods outperform their asymptotic-based counterparts, even when effects are estimated as zero. In Chapters 2 and 3 we focus on genetic risk prediction. The difficulty of accurate risk assessment with genetic studies can in part be attributed to several potential obstacles: sparsity in marker effects, a large number of weak signals, and non-linear effects. Single-marker analyses often lack power to select informative markers and typically do not account for non-linearity. One approach to gain predictive power and efficiency is to group markers based on biological knowledge such as genetic pathways or gene structure. In Chapter 2 we propose and theoretically justify a multi-stage method for risk assessment that imposes a naive Bayes kernel machine (KM) model to estimate gene-set-specific risk models, and then aggregates information across all gene-sets by adaptively estimating gene-set weights via a regularization procedure. In Chapter 3 we extend these methods to meta-analyses by introducing sampling-based weights in the KM model. This permits building risk prediction models with multiple studies that have heterogeneous sampling schemes.
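The perturbation-resampling idea of Chapter 1 can be sketched for the lasso. The example below is illustrative only (the thesis covers a general class of objectives and concave penalties, with theory): each observation's squared-error loss is reweighted by unit-mean exponential multipliers, implemented exactly by rescaling rows, and confidence intervals are read off the percentiles of the perturbed fits.

```python
import numpy as np
from sklearn.linear_model import Lasso

def perturbation_ci(X, y, lam, B=500, level=0.95, seed=0):
    """Perturbation-resampling CIs for lasso coefficients.  Reweighting
    sum_i w_i (y_i - x_i' b)^2 is achieved exactly by scaling row i of
    (X, y) by sqrt(w_i); the l1 penalty is left untouched."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = []
    for _ in range(B):
        w = np.sqrt(rng.exponential(size=n))          # unit-mean exponential weights
        fit = Lasso(alpha=lam, fit_intercept=False)
        fit.fit(X * w[:, None], y * w)
        draws.append(fit.coef_)
    a = (1 - level) / 2
    return np.quantile(np.array(draws), [a, 1 - a], axis=0)  # (2, p) bounds
```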
10

Weller, Jennifer N. "Bayesian Inference In Forecasting Volcanic Hazards: An Example From Armenia." [Tampa, Fla.] : University of South Florida, 2004. http://purl.fcla.edu/fcla/etd/SFE0000485.

11

El, Ghouch Anouar. "Nonparametric statistical inference for dependent censored data." Université catholique de Louvain, 2007. http://edoc.bib.ucl.ac.be:81/ETD-db/collection/available/BelnUcetd-09262007-123927/.

Abstract:
A frequent problem in practical survival data analysis is censoring. A censored observation occurs when observation of the event time (duration or survival time) is prevented by the occurrence of an earlier competing event (the censoring time). Censoring may be due to different causes: the loss of some subjects under study, the end of the follow-up period, drop-out or termination of the study, or the limited sensitivity of a measurement instrument. The literature on censored data focuses on the i.i.d. case; however, in many real applications the data are collected sequentially in time or space, so the assumption of independence does not hold. Some typical examples from the literature involve correlated data subject to censoring. In clinical trials, it frequently happens that patients from the same hospital have correlated survival times due to unmeasured variables such as the quality of the hospital equipment. Censored correlated data are also a common problem in environmental and spatial (geographical or ecological) statistics: due to the process used in the data sampling procedure, e.g. the analytical equipment, only measurements which exceed certain thresholds, such as the method or instrumental detection limits, can be included in the data analysis. Many other examples can be found in fields like econometrics and financial statistics; observations on the duration of unemployment, for example, may be right censored and are typically correlated. When the data are dependent and subject to censoring, estimation and inference become more challenging mathematical problems with a wide area of applications. In this context, we propose some new and flexible tools based on a nonparametric approach. More precisely, allowing dependence between individuals, our main contributions to this domain concern the following aspects. First, we develop more suitable confidence intervals for a general class of functionals of a survival distribution via the empirical likelihood method. Second, we study the problem of conditional mean estimation using the local linear technique. Third, we develop and study a new estimator of the conditional quantile function, also based on the local linear method. For each proposed method, asymptotic results such as consistency and asymptotic normality are derived, and the finite-sample performance is evaluated in a simulation study.
12

Razavian, Narges Sharif. "Continuous Graphical Models for Static and Dynamic Distributions: Application to Structural Biology." Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/340.

Abstract:
Generative models of protein structure enable researchers to predict the behavior of proteins under different conditions. Continuous graphical models are powerful and efficient tools for modeling static and dynamic distributions, which can be used for learning generative models of molecular dynamics. In this thesis, we develop new and improved continuous graphical models for modeling protein structure. We first present von Mises graphical models and develop consistent and efficient algorithms for sparse structure learning, parameter estimation, and inference. We compare our model to sparse Gaussian graphical models (GGMs) and show that it outperforms them on synthetic and Engrailed-protein molecular dynamics datasets. Next, we develop algorithms to estimate mixtures of von Mises graphical models using Expectation Maximization, and show that these models outperform von Mises, Gaussian, and mixture-of-Gaussian graphical models in terms of prediction accuracy in imputation tests on non-redundant protein structure datasets. We then use non-paranormal and nonparametric graphical models, which have extensive representation power, and compare several state-of-the-art structure learning methods that can be used prior to nonparametric inference in reproducing kernel Hilbert space embedded graphical models. To take advantage of the nonparametric models, we also propose feature-space embedded belief propagation, using random Fourier feature approximations to scale the inference algorithm to larger datasets. To improve scalability further, we integrate a coreset selection algorithm with the nonparametric inference and show that the combined model scales to large datasets with very little adverse effect on prediction quality. Finally, we present time-varying sparse Gaussian graphical models to learn smoothly varying graphical models of molecular dynamics simulation data, and present results on the CypA protein.
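The random-Fourier-feature approximation used to scale the inference has a compact generic form, sketched below for the Gaussian kernel (this is the standard construction, not the thesis's two-layer message representation):

```python
import numpy as np

def rff_features(X, D, gamma, seed=0):
    """Random Fourier feature map z(x) such that z(x) @ z(y) approximates
    exp(-gamma * ||x - y||^2); W is drawn from the kernel's spectral
    density and b uniformly on [0, 2*pi]."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(p, D))
    b = rng.uniform(0, 2 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

X = np.random.default_rng(6).normal(size=(4, 3))
Z = rff_features(X, D=2000, gamma=0.5)
exact = np.exp(-0.5 * ((X[:, None] - X[None, :]) ** 2).sum(-1))
print(np.abs(Z @ Z.T - exact).max())       # small approximation error
```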
13

Boussaid, Haithem. "Efficient inference and learning in graphical models for multi-organ shape segmentation." Thesis, Châtenay-Malabry, Ecole centrale de Paris, 2015. http://www.theses.fr/2015ECAP0002/document.

Abstract:
This thesis explores the use of discriminatively trained deformable contour models (DCMs) for shape-based segmentation in medical images. We make contributions on two fronts: the learning problem, where the model is trained from a set of annotated images, and the inference problem, whose aim is to segment an image given a model. We demonstrate the merit of our techniques on a large X-ray image segmentation benchmark, where we obtain systematic improvements in accuracy and speedups over the current state of the art. For learning, we formulate training of the DCM scoring function as large-margin structured prediction and construct a training objective that aims at giving the highest score to the ground-truth contour configuration. We incorporate a loss function adapted to DCM-based structured prediction; in particular, we consider training with the Mean Contour Distance (MCD) performance measure, which amounts to scoring each candidate contour according to its mean contour distance to the ground-truth configuration. Training DCMs using structured prediction with the standard zero-one loss already outperforms the current state-of-the-art method [Seghers et al. 2007] on the considered medical benchmark [Shiraishi et al. 2000, van Ginneken et al. 2006], and we demonstrate that training with the MCD structured loss further improves on the zero-one loss results by a statistically significant amount.
For inference, we propose efficient solvers adapted to combinatorial problems with discretized spatial variables. Our contributions are threefold. First, we consider inference for loopy graphical models, making no assumption about the underlying graph topology: we decompose the model's graph into a set of open, chain-structured graphs and employ the Alternating Direction Method of Multipliers (ADMM) to fix the potential inconsistencies of the individual solutions. Even though ADMM is an approximate inference scheme, we show empirically that our implementation delivers the exact solution for the considered examples. Second, we accelerate the optimization of chain-structured graphical models by using the hierarchical A* search algorithm of [Felzenszwalb & Mcallester 2007] coupled with the pruning techniques developed in [Kokkinos 2011a], achieving an order-of-magnitude average speedup over the state-of-the-art technique based on Dynamic Programming (DP) coupled with Generalized Distance Transforms (GDTs) [Felzenszwalb & Huttenlocher 2004]. Third, we incorporate the hierarchical A* algorithm into the ADMM scheme to guarantee efficient optimization of the underlying chain-structured subproblems; the resulting algorithm is naturally adapted to solving the loss-augmented inference problem in structured prediction learning, and is hence used during both training and inference. In Appendix A, we consider the case of 3D data and develop an efficient method to find the mode of a 3D kernel density distribution: our algorithm has guaranteed convergence to the global optimum and scales logarithmically in the volume size by recursively subdividing the search space. We use this method to rapidly initialize 3D brain tumor segmentation, where we demonstrate substantial acceleration with respect to a standard mean-shift implementation. In Appendix B, we describe in more detail our extension of the hierarchical A* search algorithm of [Felzenszwalb & Mcallester 2007] to inference on chain-structured graphs.
14

Bon, Joshua J. "Advances in sequential Monte Carlo methods." Thesis, Queensland University of Technology, 2022. https://eprints.qut.edu.au/235897/1/Joshua%2BBon%2BThesis%284%29.pdf.

Abstract:
Estimating parameters of complex statistical models and their uncertainty from data is a challenging task in statistics and data science. This thesis developed novel statistical algorithms for efficiently performing statistical estimation, established the validity of these algorithms, and explored their properties with mathematical analysis. The new algorithms and their associated analysis are significant since they permit principled and robust fitting of statistical models that were previously intractable and will thus facilitate new scientific discoveries.
15

Priddle, Jacob William. "Efficient and flexible Bayesian synthetic likelihood via transformations." Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/205902/1/Jacob_Priddle_Thesis.pdf.

Abstract:
Simulator models are a type of stochastic model that is often used to approximate a real-life process. Current statistical methods for simulator models are computationally intensive, relying on a large number of model simulations. In this thesis, we develop new, efficient and flexible statistical methods that can be used for complex statistical models, such as simulator models. The new methods are theoretically justified and applied to a variety of simulated and real-life modelling scenarios from ecology and biology.
16

Jeunesse, Paulien. "Estimation non paramétrique du taux de mort dans un modèle de population générale : Théorie et applications. A new inference strategy for general population mortality tables. Nonparametric adaptive inference of birth and death models in a large population limit. Nonparametric inference of age-structured models in a large population limit with interactions, immigration and characteristics. Nonparametric test of time dependance of age-structured models in a large population limit." Thesis, Paris Sciences et Lettres (ComUE), 2019. http://www.theses.fr/2019PSLED013.

Abstract:
This work studies the mortality rate in general population models, with applications to demography and biology. The mathematical framework lies at the frontier of statistics of processes, nonparametric statistics and analysis. In the first part, motivated by an actuarial problem, an algorithm is proposed for estimating mortality tables, which are useful in insurance. The algorithm is founded on a deterministic population model, and the new estimates improve on current results by taking the global population dynamics into account: births are incorporated into the model when computing the mortality rate. These estimates are also linked with previous work, ensuring the theoretical continuity of our approach, a point of importance in actuarial science. In the second part, we are interested in estimating the mortality rate in a stochastic population model, using tools from statistics of processes and nonparametric estimation. Since the mortality rate is a function of two variables, time and age, we propose minimax-optimal and adaptive estimators, in an anisotropic framework, for the mortality rate and the population density. We also prove non-asymptotic concentration inequalities quantifying the deviation between the stochastic process and the deterministic limit model used in the first part, and we show that our estimators remain optimal in a model where the mortality rate depends on interactions, as in the case of the logistic population. In the third part, we consider a test for detecting the presence of interactions in the mortality rate; in effect, the test assesses the time dependence of this rate. Under the assumption that any time dependence in the mortality rate comes only from the interactions, we show that their presence can be detected, and we propose a practical algorithm for performing the test.
17

Verbyla, Petras. "Network inference using independence criteria." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/277912.

Abstract:
Biological systems are driven by complex regulatory processes. Graphical models play a crucial role in the analysis and reconstruction of such processes. It is possible to derive regulatory models using network inference algorithms from high-throughput data, for example from gene or protein expression data. A wide variety of network inference algorithms have been designed and implemented. Our aim is to explore the possibilities of using statistical independence criteria for biological network inference. The contributions of our work fall into four sections. First, we provide a detailed overview of some of the most popular general independence criteria: distance covariance (dCov), kernel canonical correlation (KCC), kernel generalized variance (KGV) and the Hilbert-Schmidt Independence Criterion (HSIC). We provide easy-to-understand geometrical interpretations for these criteria, and explicitly show the equivalence of dCov, KGV and HSIC. Second, we introduce a new criterion for measuring dependence based on the signal-to-noise ratio (SNRIC). SNRIC is significantly faster to compute than other popular independence criteria. SNRIC is an approximate criterion but becomes exact under many popular modelling assumptions, for example for data from an additive noise model. Third, we compare the performance of the independence criteria on biological experimental data within the framework of the PC algorithm. Since not all criteria are available in a version that allows for testing conditional independence, we propose and test an approach which relies on residuals and requires only an unconditional version of an independence criterion. Finally, we propose a novel method to infer networks with feedback loops. We use an MCMC sampler, which samples using a loss function based on an independence criterion. This allows us to find networks under very general assumptions, such as non-linear relationships, non-Gaussian noise distributions and feedback loops.
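Of the criteria surveyed, HSIC has a particularly compact empirical form. The snippet below is a reference implementation of the biased estimator with Gaussian kernels; in practice a permutation test supplies the null distribution.

```python
import numpy as np

def hsic(X, Y, gamma_x, gamma_y):
    """Biased empirical Hilbert-Schmidt Independence Criterion,
    HSIC = trace(K H L H) / (n - 1)^2, with Gaussian kernels K and L
    and centering matrix H."""
    def gram(A, gamma):
        sq = ((A[:, None, :] - A[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)
    n = len(X)
    H = np.eye(n) - np.ones((n, n)) / n
    K, L = gram(X, gamma_x), gram(Y, gamma_y)
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 1))
Y = X**2 + 0.1 * rng.normal(size=(300, 1))   # nonlinearly dependent, uncorrelated
print(hsic(X, Y, 0.5, 0.5) > hsic(X, rng.permutation(Y), 0.5, 0.5))  # typically True
```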
18

Massaroppe, Lucas. "Estimação da causalidade de Granger no caso de interação não-linear." Universidade de São Paulo, 2016. http://www.teses.usp.br/teses/disponiveis/3/3142/tde-20122016-083110/.

Abstract:
This thesis examines the problem of detecting connectivity between time series in the Granger sense when the nonlinear nature of the interactions makes their determination via linear vector autoregressive models impossible. Detection is shown to be feasible with the aid of the so-called kernel methods that have become popular in machine learning, since they allow generalized forms of the Granger test, partial directed coherence, and the directed transfer function to be defined. Simulated examples of detection show that asymptotic results originally derived for linear estimators generalize analogously and remain valid in the present kernelized context.
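A drastically reduced version of the kernelized Granger idea can be sketched as follows: compare kernel ridge predictions of a series from its own past against predictions from the joint past. This is an illustrative, assumption-laden toy, not the generalized PDC/DTF estimators developed in the thesis.

```python
import numpy as np

def kernel_granger_score(x, y, p, ell=1.0, lam=1e-2):
    """Log ratio of residual variances from kernel ridge regression of x_t
    on its own p past lags vs. the joint past of (x, y); a positive score
    suggests y Granger-causes x."""
    def krr_residuals(Z, target):
        sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
        K = np.exp(-sq / (2 * ell**2))
        n = len(target)
        alpha = np.linalg.solve(K + n * lam * np.eye(n), target)
        return target - K @ alpha
    T = len(x)
    past = lambda s: np.column_stack([s[p - k - 1:T - k - 1] for k in range(p)])
    tgt = x[p:]
    r_own = krr_residuals(past(x), tgt)
    r_joint = krr_residuals(np.hstack([past(x), past(y)]), tgt)
    return np.log(np.var(r_own) / np.var(r_joint))

rng = np.random.default_rng(8)
y = rng.normal(size=500)
x = np.zeros(500)
for t in range(1, 500):               # x driven nonlinearly by y's past
    x[t] = 0.4 * x[t - 1] + 0.5 * np.tanh(y[t - 1]) + 0.3 * rng.normal()
print(kernel_granger_score(x, y, p=2), kernel_granger_score(y, x, p=2))
```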
19

Akcin, Haci Mustafa. "NONPARAMETRIC INFERENCES FOR THE HAZARD FUNCTION WITH RIGHT TRUNCATION." Digital Archive @ GSU, 2013. http://digitalarchive.gsu.edu/math_diss/12.

Abstract:
Incompleteness is a major feature of time-to-event data. As one type of incompleteness, truncation refers to the unobservability of the time-to-event variable because it is smaller (or greater) than the truncation variable. A truncated sample always involves left and right truncation. Left truncation has been studied extensively while right truncation has not received the same level of attention. In one of the earliest studies on right truncation, Lagakos et al. (1988) proposed to transform a right truncated variable to a left truncated variable and then apply existing methods to the transformed variable. The reverse-time hazard function is introduced through this transformation; however, this quantity does not have a natural interpretation, and there exist gaps in the inferences for the regular forward-time hazard function with right truncated data. This dissertation discusses variance estimation of the cumulative hazard estimator, a one-sample log-rank test, and the comparison of hazard rate functions among finitely many independent samples in the context of right truncation. First, the relation between the reverse- and forward-time cumulative hazard functions is clarified. This relation leads to nonparametric inference for the cumulative hazard function. Jiang (2010) recently conducted research in this direction and proposed two variance estimators of the cumulative hazard estimator. Some revision to these variance estimators is suggested in this dissertation and evaluated in a Monte-Carlo study. Second, this dissertation studies hypothesis testing for right truncated data. A series of tests is developed with the hazard rate function as the target quantity. A one-sample log-rank test is first discussed, followed by a family of weighted tests for comparison between K independent samples. Particular weight functions lead to the log-rank, Gehan, and Tarone-Ware tests, and these three tests are evaluated in a Monte-Carlo study. Finally, this dissertation studies nonparametric inference for the hazard rate function with right truncated data. The kernel smoothing technique is utilized in estimating the hazard rate function. A Monte-Carlo study investigates the uniform-kernel smoothed estimator and its variance estimator. The uniform, Epanechnikov and biweight kernel estimators are implemented in an example using blood-transfusion-related AIDS data.
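The final smoothing step has a simple form worth sketching. The snippet below applies a uniform kernel to cumulative-hazard increments; constructing the correct at-risk counts under right truncation is the delicate part studied in the dissertation and is deliberately left to the caller here.

```python
import numpy as np

def smoothed_hazard(event_times, at_risk, grid, b):
    """Uniform-kernel smoothing of cumulative-hazard increments,
    h_hat(t) = (1/b) * sum_i K((t - t_i)/b) * d_i/n_i, shown with one
    event per time (d_i = 1) and caller-supplied at-risk counts n_i."""
    increments = 1.0 / at_risk                        # Nelson-Aalen jumps d_i/n_i
    u = (grid[:, None] - event_times[None, :]) / b
    K = 0.5 * (np.abs(u) <= 1.0)                      # uniform kernel on [-1, 1]
    return (K * increments[None, :]).sum(axis=1) / b
```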
20

Semolini, Robinson. "Support vector machines, inferencia transdutiva e o problema de classificação." [s.n.], 2002. http://repositorio.unicamp.br/jspui/handle/REPOSIP/262026.

Abstract:
Advisor: Fernando Jose Von Zuben
Master's dissertation - Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação
21

Dumora, Christophe. "Estimation de paramètres clés liés à la gestion d'un réseau de distribution d'eau potable : Méthode d'inférence sur les noeuds d'un graphe." Thesis, Bordeaux, 2020. http://www.theses.fr/2020BORD0325.

Abstract:
The rise of data generated by sensors and by the operational tools used to manage water distribution networks (WDNs) makes these systems increasingly complex and, in general, events harder to predict. The history of data related to the quality of the distributed water, crossed with knowledge of network assets, contextual data and temporal parameters, leads to the study of a complex system, both because of its volume and because of the interactions between these various types of data, which may vary in time and space. Mathematical graphs bring this diversity together and provide a complete representation of a WDN, together with the events that may arise in it or influence its proper functioning. Graph theory then allows a structural and spectral analysis of the networks so constructed, in order to answer concrete operational questions and improve existing internal processes. These graphs are then used to address the problem of inference on the nodes of a very large graph from partial observations on a small number of nodes. A graph optimization approach, based on flow algorithms and data measured in real time by network flowmeters, is used to construct a flow variable at every node of the graph (and hence at every point of the physical network). A kernel prediction approach based on a penalized Ridge-type estimator, which raises spectral analysis problems for large sparse matrices, then allows a signal observed on a few nodes to be inferred at any point of a WDN.
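A generic Laplacian-regularization sketch conveys the node-inference idea (the thesis's actual estimator and its large sparse-matrix treatment differ): observed values are propagated along the network's edges by penalizing non-smooth signals.

```python
import numpy as np

def infer_node_signal(W, observed_idx, observed_vals, lam=1.0):
    """Ridge-type inference of a signal on all nodes of a graph from values
    observed on a few nodes: minimize ||f[obs] - y||^2 + lam * f' L f,
    where L is the combinatorial graph Laplacian (dense solve, so small
    graphs only in this sketch)."""
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W
    S = np.zeros((n, n))
    S[observed_idx, observed_idx] = 1.0            # selection of observed nodes
    y = np.zeros(n)
    y[observed_idx] = observed_vals
    return np.linalg.solve(S + lam * L, y)         # f_hat on every node

# Toy network: a path graph with flow measured at its two ends.
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
print(infer_node_signal(W, np.array([0, n - 1]), np.array([10.0, 4.0]), lam=0.1))
```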
22

Allain, Cédric. "Temporal point processes and scalable convolutional dictionary learning : a unified framework for m/eeg signal analysis in neuroscience." Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASG008.

Abstract:
In the field of non-invasive brain imaging, magnetoencephalography and electroencephalography (M/EEG) offer invaluable insights into neural activity. The recorded data consist of multivariate time series that provide information about cognitive processes and are often complemented by auxiliary details related to the experimental paradigm, such as timestamps of external stimuli or of actions undertaken by the subjects. Additionally, a dataset may include recordings from multiple subjects, facilitating population-level analyses. This doctoral research presents a novel framework for M/EEG signal analysis that synergizes Convolutional Dictionary Learning (CDL) and Temporal Point Processes (TPPs). The work comprises two primary components: advances in temporal modeling and computational scalability. For temporal modeling, two novel point process models are introduced, with efficient inference methods to capture task-specific neural activities; the proposed Fast Discretized Inference for Hawkes Processes (FaDIn) method also has implications for broader applications. In addition, this work addresses the computational challenges of large-scale CDL-based M/EEG analysis by introducing a novel stochastic robust windowing CDL algorithm, which makes it possible to process artifact-ridden signals efficiently as well as to run large population studies. Population CDL was then applied to the large open-access Cam-CAN dataset, shedding light on age-related neural activity.
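For background, a univariate Hawkes process, the kind of temporal point process at issue here, can be simulated by Ogata thinning in a few lines; this generic sketch concerns the process itself, not FaDIn's discretized inference.

```python
import numpy as np

def simulate_hawkes(mu, alpha, beta, T, seed=0):
    """Ogata thinning simulation of a univariate Hawkes process with
    intensity lambda(t) = mu + sum_{t_i < t} alpha*beta*exp(-beta (t - t_i));
    mu + alpha*beta*len(events) upper-bounds the intensity because each
    excitation term is at most alpha*beta and decays afterwards."""
    rng = np.random.default_rng(seed)
    events, t = [], 0.0
    while t < T:
        lam_bar = mu + alpha * beta * len(events)
        t += rng.exponential(1.0 / lam_bar)           # candidate next event
        lam_t = mu + sum(alpha * beta * np.exp(-beta * (t - ti)) for ti in events)
        if t < T and rng.uniform() * lam_bar <= lam_t:
            events.append(t)                          # accept w.p. lam_t / lam_bar
    return np.array(events)

ev = simulate_hawkes(mu=0.5, alpha=0.6, beta=2.0, T=100.0)
print(len(ev), "events; expected ~", 0.5 * 100 / (1 - 0.6))
```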
23

Song, Song. "Confidence bands in quantile regression and generalized dynamic semiparametric factor models." Doctoral thesis, Humboldt-Universität zu Berlin, Wirtschaftswissenschaftliche Fakultät, 2010. http://dx.doi.org/10.18452/16341.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In many applications it is necessary to know the stochastic fluctuation of the maximal deviations of nonparametric quantile estimates, e.g., to check various parametric models. Uniform confidence bands are therefore constructed for nonparametric quantile estimates of regression functions. The first method is based on strong approximations of the empirical process and extreme value theory; the strong uniform consistency rate is also established under general conditions. The second method is based on bootstrap resampling, and it is proved that the bootstrap approximation provides a substantial improvement. The case of multidimensional and discrete regressor variables is handled with a partial linear model, and a labor market analysis illustrates the method. High-dimensional time series that exhibit nonstationary and possibly periodic behavior occur frequently in many fields of science, e.g., macroeconomics, meteorology, medicine, and financial engineering. A common approach is to split the modeling of a high-dimensional time series into the time propagation of a low-dimensional time series and high-dimensional time-invariant functions via dynamic factor analysis. We propose a two-step estimation procedure. In the first step, we detrend the time series by incorporating a time basis selected with a group Lasso-type technique and choose the space basis based on smoothed functional principal component analysis; we establish properties of these estimators under the dependent scenario. In the second step, we obtain the detrended low-dimensional stochastic process, which is stationary.
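The bootstrap construction of a uniform band can be sketched in a few lines: estimate a kernel-weighted quantile curve, re-estimate it on resampled pairs, and widen the original curve by the empirical quantile of the bootstrap sup-norm deviations. The NumPy sketch below assumes a one-dimensional predictor and a Gaussian kernel; the function names, bandwidth, and number of replicates are illustrative, not taken from the thesis.

```python
import numpy as np

def local_quantile(x0, X, Y, tau=0.5, h=0.3):
    """Kernel-weighted (local constant) tau-quantile estimate at x0."""
    w = np.exp(-0.5 * ((X - x0) / h) ** 2)      # Gaussian kernel weights
    order = np.argsort(Y)
    cdf = np.cumsum(w[order]) / w.sum()         # weighted CDF over sorted Y
    return Y[order][min(np.searchsorted(cdf, tau), len(Y) - 1)]

def uniform_band(X, Y, grid, tau=0.5, h=0.3, B=200, level=0.95, seed=0):
    """Bootstrap uniform confidence band around a nonparametric quantile curve."""
    rng = np.random.default_rng(seed)
    fit = np.array([local_quantile(g, X, Y, tau, h) for g in grid])
    sup_dev = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, len(X), size=len(X))   # resample (X, Y) pairs
        boot = np.array([local_quantile(g, X[idx], Y[idx], tau, h) for g in grid])
        sup_dev[b] = np.max(np.abs(boot - fit))      # sup-norm deviation
    c = np.quantile(sup_dev, level)                  # one width for the whole curve
    return fit, fit - c, fit + c
```

Using a single width c calibrated on the sup-norm deviation is what makes the band uniform over the grid, as opposed to pointwise intervals computed separately at each x.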
24

Pawlowski, Filip Igor. "High-performance dense tensor and sparse matrix kernels for machine learning." Thesis, Lyon, 2020. http://www.theses.fr/2020LYSEN081.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
In this thesis, we develop high-performance algorithms for certain computations involving dense tensors and sparse matrices. We address kernel operations that are useful for machine learning tasks, such as inference with deep neural networks (DNNs). We develop data structures and techniques to reduce memory use and improve data locality, and hence the cache reuse, of the kernel operations. We design both sequential and shared-memory parallel algorithms. In the first part of the thesis, we focus on dense tensor kernels. Tensor kernels include the tensor-vector multiplication (TVM), tensor-matrix multiplication (TMM), and tensor-tensor multiplication (TTM). Among these, TVM is the most bandwidth-bound and constitutes a building block for many algorithms; we focus on this operation and develop a data structure as well as sequential and parallel algorithms for it. We propose a novel data structure which stores the tensor as blocks, ordered using the space-filling curve known as the Morton curve (or Z-curve). The key idea consists of dividing the tensor into blocks small enough to fit in cache and storing them in Morton order, while keeping a simple, multi-dimensional order on the individual elements within them. Thus, high-performance BLAS routines can be used as microkernels for each block. The experimental results not only demonstrate superior performance of the proposed approach over state-of-the-art variants by up to 18%, but also show that the proposed approach induces 71% less sample standard deviation for the TVM across the d possible modes. We also show that our data structure naturally extends to other tensor kernels, yielding up to 38% higher performance for the higher-order power method. Finally, we investigate shared-memory parallel TVM algorithms which use the proposed data structure; several alternative parallel algorithms were characterized theoretically and implemented using OpenMP for experimental comparison. Our results on systems with up to 8 sockets show near-peak performance for the proposed algorithm on 2-, 3-, 4-, and 5-dimensional tensors. In the second part of the thesis, we explore sparse computations in neural networks, focusing on the high-performance sparse deep inference problem. Sparse DNN inference is the task of using a sparse DNN to classify a batch of data elements forming, in our case, a sparse feature matrix. The performance of sparse inference hinges on efficient parallelization of the sparse matrix-sparse matrix multiplication (SpGEMM) repeated for each layer in the inference function. We first characterize efficient sequential SpGEMM algorithms for our use case. We then introduce model-parallel inference, which uses a two-dimensional partitioning of the weight matrices obtained with hypergraph partitioning software; this variant uses barriers to synchronize at layers. Finally, we introduce tiling model-parallel and tiling hybrid algorithms, which increase cache reuse between the layers and use a weak synchronization module to hide load imbalance and synchronization costs. We evaluate our techniques on the large network data from the IEEE HPEC 2019 Graph Challenge on shared-memory systems and report up to 2x speed-up over the baseline.
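The Morton (Z-curve) ordering of blocks mentioned in the abstract amounts to interleaving the bits of the block coordinates, so that blocks close in the multi-dimensional index space also tend to be close in memory. A minimal sketch, assuming non-negative block coordinates that each fit in `bits` bits (the helper name is illustrative, not the thesis code):

```python
def morton_encode(coords, bits=10):
    """Interleave the bits of a block multi-index into a Morton (Z-curve) key."""
    code, d = 0, len(coords)
    for bit in range(bits):
        for j, c in enumerate(coords):
            code |= ((c >> bit) & 1) << (bit * d + j)
    return code

# Lay out the blocks of a 3-D tensor in Morton order; each block is small
# enough to fit in cache, so a BLAS microkernel can be applied per block.
blocks = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (1, 1, 1)]
print(sorted(blocks, key=morton_encode))
```

Because the Morton key treats all coordinates symmetrically, the layout favors no particular tensor mode, which is consistent with the reported low variance of TVM performance across modes.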
25

Chang, Chia-Hung, and 張嘉宏. "Design of an Inference Accelerator for CNN with Sparse Row-wise Kernel." Thesis, 2019. http://ndltd.ncl.edu.tw/handle/vgqj7n.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Su, Wanhua. "Efficient Kernel Methods for Statistical Detection." Thesis, 2008. http://hdl.handle.net/10012/3598.

Full text
APA, Harvard, Vancouver, ISO, and other styles
Abstract:
This research is motivated by a drug discovery problem: the AIDS anti-viral database from the National Cancer Institute. The objective of the study is to develop effective statistical methods to model the relationship between the chemical structure of a compound and its activity against the HIV-1 virus. The resulting structure-activity model can then be used to predict the activity of new compounds and thus helps identify active chemical compounds that can serve as drug candidates. Since active compounds are generally rare in a compound library, we treat the drug discovery problem as an application of the so-called statistical detection problem. In a typical statistical detection problem, we have data {Xi, Yi}, where Xi is the predictor vector of the ith observation and Yi ∈ {0, 1} is its class label. The objective is to identify class-1 observations, which are extremely rare. Besides drug discovery, other applications of statistical detection include direct marketing and fraud detection.

We propose a computationally efficient detection method called LAGO, which stands for "locally adjusted GO estimator"; the original idea is inspired by an ancient game known today as GO. The construction of LAGO consists of two steps. In the first step, we estimate the density of class 1 with an adaptive-bandwidth kernel density estimator. The kernel functions are located at, and only at, the class-1 observations; the bandwidth of the kernel centered at a given class-1 observation is the average distance between that observation and its K nearest class-0 neighbors. In the second step, we adjust the density estimated in the first step locally according to the density of class 0. It can be shown that the amount of adjustment in the second step is approximately inversely proportional to the bandwidth calculated in the first step. Application to the NCI data demonstrates that LAGO is superior to methods such as K nearest neighbors and support vector machines.

One drawback of LAGO is that it only provides a point estimate of a test point's probability of being class 1, ignoring the uncertainty of the model. In the second part of this thesis, we present a Bayesian framework for LAGO, referred to as BLAGO, which enables quantification of uncertainty. Non-informative priors are adopted. The posterior distribution is calculated over a grid of (K, alpha) pairs by integrating out beta0 and beta1 using the Laplace approximation, where K and alpha are the two parameters used to construct the LAGO score, and beta0 and beta1 are the coefficients of the logistic transformation that converts the LAGO score to the probability scale. BLAGO provides proper probabilistic predictions with support on (0, 1) and captures the uncertainty of the predictions as well. By avoiding Markov chain Monte Carlo algorithms and using the Laplace approximation, BLAGO is computationally very efficient; without the need for cross-validation, it is even more efficient than LAGO.
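The two-step construction described above translates almost directly into code: per-kernel bandwidths from class-0 nearest neighbors, then a kernel sum with an extra 1/bandwidth factor playing the role of the local class-0 adjustment. The NumPy sketch below assumes a one-dimensional predictor and a Gaussian kernel for readability; the function name and the exact form of the adjustment are illustrative, not the thesis implementation.

```python
import numpy as np

def lago_score(x_test, x1, x0, K=5, alpha=1.0):
    """Sketch of a LAGO-style score for 1-D predictors.

    Step 1: adaptive-bandwidth KDE over class-1 points, each bandwidth
    being the average distance to that point's K nearest class-0
    neighbors (scaled by the tuning parameter alpha).
    Step 2: local adjustment by the class-0 density, approximated here
    by a second 1/bandwidth factor, as suggested in the abstract.
    """
    dist = np.abs(x1[:, None] - x0[None, :])          # class-1 to class-0 distances
    r = alpha * np.sort(dist, axis=1)[:, :K].mean(axis=1)
    u = (x_test[:, None] - x1[None, :]) / r[None, :]  # standardized offsets
    return (np.exp(-0.5 * u ** 2) / r[None, :] ** 2).sum(axis=1)

# Toy usage: class-1 points clustered near 0, class-0 spread out.
x1 = np.array([-0.1, 0.0, 0.2])
x0 = np.linspace(-3, 3, 50)
scores = lago_score(np.array([0.0, 2.5]), x1, x0)     # higher score near class 1
```

Note how the kernels sit only at class-1 points, so the cost scales with the (small) number of rare-class observations rather than with the whole library, which is the source of LAGO's efficiency.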
