
Dissertations / Theses on the topic 'Density estimation'

Consult the top 50 dissertations / theses for your research on the topic 'Density estimation.'


1

Wang, Xiaoxia. "Manifold aligned density estimation." Thesis, University of Birmingham, 2010. http://etheses.bham.ac.uk//id/eprint/847/.

Full text
Abstract:
With the advent of information technology, the amount of data we face today is growing dramatically in both scale and dimensionality. This raises new challenges for some traditional machine learning tasks. This thesis is mainly concerned with manifold aligned density estimation problems. In particular, the work presented in this thesis includes efficiently learning the density distribution of very large-scale datasets and estimating the manifold aligned density through explicit manifold modeling. First, we propose an efficient and sparse density estimator, Fast Parzen Windows (FPW), which represents the density of a large-scale dataset by a mixture of locally fitted Gaussian components. The Gaussian components in the model are estimated in a "sloppy" way, which avoids very time-consuming "global" optimizations, keeps the density estimator simple and still assures estimation accuracy. Preliminary theoretical work shows that the performance of the locally fitted Gaussian components is related to the curvature of the true density and to the characteristics of the Gaussian model itself. A successful application of FPW to the principled calibration of galaxy simulations is also demonstrated in the thesis. Then, we investigate the problem of manifold (i.e., low-dimensional structure) aligned density estimation through explicit manifold modeling, which aims to obtain the embedded manifold and the density distribution simultaneously. A new manifold learning algorithm is proposed to capture the non-linear low-dimensional structure and to provide an improved initialization for the Generative Topographic Mapping (GTM) model. The GTM models are then employed in our proposed hierarchical mixture model to estimate the density of data aligned along multiple manifolds. Extensive experiments verify the effectiveness of the presented work.
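To make the Parzen-windows baseline mentioned above concrete, here is a minimal one-dimensional sketch of the classical Gaussian-kernel Parzen window estimator; FPW-style methods replace the per-data-point kernel sum with a much smaller mixture of locally fitted Gaussian components, so the code below is only the reference point, not the thesis's algorithm (function names and data are illustrative).

```python
import numpy as np

def parzen_window_density(x_train, x_query, bandwidth):
    """Classical one-dimensional Parzen-window estimate with a Gaussian kernel.

    FPW-style estimators replace the one-kernel-per-data-point sum below with a
    much smaller mixture of locally fitted Gaussian components; this sketch only
    shows the baseline estimator that such methods approximate.
    """
    x_train = np.asarray(x_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    diffs = (x_query[:, None] - x_train[None, :]) / bandwidth
    kernels = np.exp(-0.5 * diffs ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)

# Illustrative usage on simulated data.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.0, scale=1.0, size=1000)
grid = np.linspace(-4.0, 4.0, 9)
print(parzen_window_density(sample, grid, bandwidth=0.3))
```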
APA, Harvard, Vancouver, ISO, and other styles
2

Rademeyer, Estian. "Bayesian kernel density estimation." Diss., University of Pretoria, 2017. http://hdl.handle.net/2263/64692.

Full text
Abstract:
This dissertation investigates the performance of two-class classification on credit scoring data sets with low default ratios. The standard two-class parametric Gaussian and naive Bayes (NB) classifiers, as well as the non-parametric Parzen classifiers, are extended, using Bayes' rule, to include either a class imbalance or a Bernoulli prior. This is done with the aim of addressing the low default probability problem. Furthermore, the performance of Parzen classification with Silverman and Minimum Leave-one-out Entropy (MLE) Gaussian kernel bandwidth estimation is also investigated. It is shown that the non-parametric Parzen classifiers yield superior classification power. However, there is a longing for these non-parametric classifiers to possess a predictive power such as that exhibited by the odds ratio found in logistic regression (LR). The dissertation therefore dedicates a section to, amongst other things, studying the paper entitled "Model-Free Objective Bayesian Prediction" (Bernardo 1999). Since this approach to Bayesian kernel density estimation is only developed for the univariate and the uncorrelated multivariate case, the section develops a theoretical multivariate approach to Bayesian kernel density estimation. This approach is theoretically capable of handling both correlated as well as uncorrelated features in data. This is done through the assumption of a multivariate Gaussian kernel function and the use of an inverse Wishart prior.
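For orientation, the sketch below shows a plain two-class Parzen-window classifier combined through Bayes' rule with a class prior, using the normal-reference (Silverman) rule-of-thumb bandwidth; it illustrates the baseline only, not the class-imbalance or Bernoulli-prior extensions developed in the dissertation, and all names and data are illustrative.

```python
import numpy as np

def silverman_bandwidth(x):
    # Normal-reference (Silverman) rule of thumb for a Gaussian kernel.
    x = np.asarray(x, dtype=float)
    return 1.06 * x.std(ddof=1) * len(x) ** (-0.2)

def parzen_density(train, query, h):
    diffs = (np.asarray(query, float)[:, None] - np.asarray(train, float)[None, :]) / h
    return np.exp(-0.5 * diffs ** 2).mean(axis=1) / (h * np.sqrt(2.0 * np.pi))

def parzen_classifier(query, defaults, non_defaults, prior_default=0.5):
    # Bayes' rule with class-conditional Parzen densities and a class prior.
    f1 = parzen_density(defaults, query, silverman_bandwidth(defaults))
    f0 = parzen_density(non_defaults, query, silverman_bandwidth(non_defaults))
    posterior = prior_default * f1 / (prior_default * f1 + (1.0 - prior_default) * f0)
    return (posterior > 0.5).astype(int), posterior

# Illustrative usage with simulated scores and a low default prior.
rng = np.random.default_rng(1)
labels, post = parzen_classifier(rng.normal(0.5, 1.0, 10),
                                 defaults=rng.normal(1.0, 1.0, 50),
                                 non_defaults=rng.normal(-1.0, 1.0, 500),
                                 prior_default=0.1)
print(labels, post)
```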
Dissertation (MSc)--University of Pretoria, 2017.
The financial assistance of the National Research Foundation (NRF) towards this research is hereby acknowledged. Opinions expressed and conclusions arrived at, are those of the authors and are not necessarily to be attributed to the NRF.
Statistics
MSc
Unrestricted
APA, Harvard, Vancouver, ISO, and other styles
3

Stride, Christopher B. "Semi-parametric density estimation." Thesis, University of Warwick, 1995. http://wrap.warwick.ac.uk/109619/.

Full text
Abstract:
The local likelihood method of Copas (1995a) allows the incorporation into our parametric model of influence from data local to the point t at which we are estimating the true density function g(t). This is achieved through an analogy with censored data: we define the probability of a data point being considered observed, given that it has taken value x_i, in terms of a scaled kernel function K with smoothing parameter h. This leads to a likelihood function which gives more weight to observations close to t, hence the term 'local likelihood'. After constructing this local likelihood function and maximising it at t, the resulting density estimate f(t; θ̂_t) can be described as semi-parametric in terms of its limits with respect to h. As h → ∞, it approximates a standard parametric fit f(t; θ̂), whereas as h decreases towards 0 it approximates the non-parametric kernel density estimate. My thesis develops this idea, initially proving its asymptotic superiority over the standard parametric estimate under certain conditions. We then consider the improvements possible by making the smoothing parameter h a function of t, enabling our semi-parametric estimate to vary from approximating g(t) in regions of high density to f(t; θ̂) in regions where we believe the true density to be low. Our improvement in accuracy is demonstrated in both simulated and real data examples, and the limits with respect to h and the new adaptation parameter are examined. Methods for choosing h and the adaptation parameter are given and evaluated, along with a procedure for incorporating prior belief about the true form of the density into these choices. Further practical examples illustrate the effectiveness of these ideas when applied to a wide range of data sets.
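For orientation, one common way of writing a kernel-weighted (local) log-likelihood at the point t is sketched below; the exact form used by Copas (1995a) and in the thesis may differ in detail.

```latex
% One common form of a local (kernel-weighted) log-likelihood at a point t,
% for a parametric family f(x; \theta) and kernel K_h(u) = K(u/h)/h:
\ell_t(\theta) \;=\; \sum_{i=1}^{n} K_h(x_i - t)\,\log f(x_i;\theta)
\;-\; n \int K_h(u - t)\, f(u;\theta)\,\mathrm{d}u ,
\qquad
\hat{\theta}_t = \operatorname*{arg\,max}_{\theta}\ \ell_t(\theta),
\qquad
\hat{f}(t) = f\bigl(t; \hat{\theta}_t\bigr).
```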
APA, Harvard, Vancouver, ISO, and other styles
4

Rossiter, Jane E. "Epidemiological applications of density estimation." Thesis, University of Oxford, 1991. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.291543.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Sung, Iyue. "Importance sampling kernel density estimation /." The Ohio State University, 2001. http://rave.ohiolink.edu/etdc/view?acc_num=osu1486398528559777.

Full text
APA, Harvard, Vancouver, ISO, and other styles
6

Kile, Håkon. "Bandwidth Selection in Kernel Density Estimation." Thesis, Norwegian University of Science and Technology, Department of Mathematical Sciences, 2010. http://urn.kb.se/resolve?urn=urn:nbn:no:ntnu:diva-10015.

Full text
Abstract:

In kernel density estimation, the most crucial step is to select a proper bandwidth (smoothing parameter). There are two conceptually different approaches to this problem: a subjective and an objective approach. In this report, we only consider the objective approach, which is based upon minimizing an error defined by an error criterion. The most common objective bandwidth selection method is to minimize some squared error expression, but this method is not without its critics. This approach is said not to perform satisfactorily in the tail(s) of the density, and to put too much weight on observations close to the mode(s) of the density. An approach which minimizes an absolute error expression is thought to be without these drawbacks. We will provide a new explicit formula for the mean integrated absolute error. The optimal mean integrated absolute error bandwidth will be compared to the optimal mean integrated squared error bandwidth. We will argue that these two bandwidths are essentially equal. In addition, we study data-driven bandwidth selection, and we will propose a new data-driven bandwidth selector. Our new bandwidth selector has promising behavior with respect to the visual error criterion, especially in the case of limited sample sizes.
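For reference, the squared-error benchmark discussed above is usually summarised by the asymptotic mean integrated squared error of a kernel estimator and the bandwidth that minimises it (standard notation, with R(g) = ∫ g(u)² du and μ₂(K) = ∫ u²K(u) du):

```latex
% Asymptotic MISE of a kernel density estimator with kernel K and bandwidth h:
\mathrm{AMISE}(h) \;=\; \frac{R(K)}{nh} \;+\; \tfrac{1}{4}\, h^{4}\, \mu_2(K)^{2}\, R(f'') ,
\qquad
h_{\mathrm{AMISE}} \;=\; \left[ \frac{R(K)}{\mu_2(K)^{2}\, R(f'')\, n} \right]^{1/5} .
```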

APA, Harvard, Vancouver, ISO, and other styles
7

Achilleos, Achilleas. "Deconvolution kernel density and regression estimation." Thesis, University of Bristol, 2011. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.544421.

Full text
APA, Harvard, Vancouver, ISO, and other styles
8

Buchman, Susan. "High-Dimensional Adaptive Basis Density Estimation." Research Showcase @ CMU, 2011. http://repository.cmu.edu/dissertations/169.

Full text
Abstract:
In the realm of high-dimensional statistics, regression and classification have received much attention, while density estimation has lagged behind. Yet there are compelling scientific questions which can only be addressed via density estimation using high-dimensional data, such as the paths of North Atlantic tropical cyclones. If we cast each track as a single high-dimensional data point, density estimation allows us to answer such questions via integration or Monte Carlo methods. In this dissertation, I present three new methods for estimating densities and intensities for high-dimensional data, all of which rely on a technique called diffusion maps. This technique constructs a mapping for high-dimensional, complex data into a low-dimensional space, providing a new basis that can be used in conjunction with traditional density estimation methods. Furthermore, I propose a reordering of importance sampling in the high-dimensional setting. Traditional importance sampling estimates high-dimensional integrals with the aid of an instrumental distribution chosen specifically to minimize the variance of the estimator. In many applications, the integral of interest is with respect to an estimated density. I argue that in the high-dimensional realm, performance can be improved by reversing the procedure: instead of estimating a density and then selecting an appropriate instrumental distribution, begin with the instrumental distribution and estimate the density with respect to it directly. The variance reduction follows from the improved density estimate. Lastly, I present some initial results in using climatic predictors such as sea surface temperature as spatial covariates in point process estimation.
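As a point of reference for the reordering argument above, the sketch below shows standard importance sampling of an integral with respect to an (estimated) density f using an instrumental distribution q; the toy densities are illustrative, not the dissertation's cyclone application.

```python
import numpy as np

rng = np.random.default_rng(1)

# Generic importance sampling of E_f[g(X)] = \int g(x) f(x) dx, where f is an
# (estimated) density and q is an instrumental distribution we can sample from.
# The dissertation argues for estimating f *with respect to q* in high
# dimensions; this sketch only shows the standard estimator being reordered.
def importance_sampling_mean(g, f_pdf, q_sample, q_pdf, n=10_000):
    x = q_sample(n)
    weights = f_pdf(x) / q_pdf(x)
    return np.mean(g(x) * weights)

# Toy example: f is N(0, 1), q is N(0, 2^2), g(x) = x^2 (so the answer is 1).
f_pdf = lambda x: np.exp(-0.5 * x ** 2) / np.sqrt(2 * np.pi)
q_pdf = lambda x: np.exp(-0.5 * (x / 2) ** 2) / (2 * np.sqrt(2 * np.pi))
q_sample = lambda n: rng.normal(0.0, 2.0, size=n)
print(importance_sampling_mean(lambda x: x ** 2, f_pdf, q_sample, q_pdf))
```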
APA, Harvard, Vancouver, ISO, and other styles
9

Lu, Shan. "Essays on volatility forecasting and density estimation." Thesis, University of Aberdeen, 2019. http://digitool.abdn.ac.uk:80/webclient/DeliveryManager?pid=240161.

Full text
Abstract:
This thesis studies two subareas within the forecasting literature: volatility forecasting and risk-neutral density estimation, and asks how accurate volatility forecasts and risk-neutral density estimates can be made based on the given information. Two sources of information are employed to make those forecasts: historical information contained in time series of asset prices, and forward-looking information embedded in the prices of traded options. Chapter 2 tests the comparative performance of two volatility scaling laws - the square-root-of-time (√T) and an empirical law, T^H, characterized by the Hurst exponent (H) - where volatility is measured by the sample standard deviation of returns, for forecasting the volatility term structure of crude oil price changes and ten foreign currency changes. We find that the empirical law is overall superior for crude oil, whereas the selection of a superior model is currency-specific and relative performance differs substantially across currencies. Our results are particularly important for regulatory risk management using Value-at-Risk and suggest the use of the empirical law for volatility and quantile scaling. Chapter 3 studies the predictive ability of the corridor implied volatility (CIV) measure. By adding CIV measures to modified GARCH specifications, we show that narrow and mid-range CIVs outperform the wide CIVs, the market volatility index and the Black-Scholes implied volatility for horizons up to 21 days under various market conditions. Results of simulated trading reinforce our statistical findings. Chapter 4 compares six estimation methods for extracting risk-neutral densities (RND) from option prices. Using a pseudo-price based simulation, we find that the positive convolution approximation method provides the best performance, while the mixture of two lognormals is the worst. In addition, we show that both price and volatility jumps are important components for option pricing. Our results have practical applications for policymakers, as RNDs are important indicators for gauging market sentiment and expectations.
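To illustrate the two scaling laws compared in Chapter 2, the sketch below scales an assumed one-day volatility to longer horizons; the square-root-of-time law is the special case H = 0.5, and the Hurst exponent value shown is purely illustrative.

```python
import numpy as np

def scale_volatility(sigma_1d, horizon_days, hurst=0.5):
    # Square-root-of-time scaling is the special case hurst = 0.5;
    # the empirical law uses an estimated Hurst exponent H instead.
    return sigma_1d * horizon_days ** hurst

sigma_1d = 0.01                       # assumed one-day return standard deviation
for T in (5, 10, 21):
    print(T,
          scale_volatility(sigma_1d, T, hurst=0.5),    # sqrt-of-time law
          scale_volatility(sigma_1d, T, hurst=0.55))   # illustrative empirical law
```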
APA, Harvard, Vancouver, ISO, and other styles
10

Chan, Kwokleung. "Bayesian learning in classification and density estimation /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC IP addresses, 2002. http://wwwlib.umi.com/cr/ucsd/fullcit?p3061619.

Full text
APA, Harvard, Vancouver, ISO, and other styles
11

Suaray, Kagba N. "On kernel density estimation for censored data /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2004. http://wwwlib.umi.com/cr/ucsd/fullcit?p3144346.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Mao, Ruixue. "Road Traffic Density Estimation in Vehicular Network." Thesis, The University of Sydney, 2013. http://hdl.handle.net/2123/9467.

Full text
Abstract:
In recent decades, vehicular networks or intelligent transportation systems have been increasingly investigated and used to provide solutions for next generation traffic systems. Road traffic density estimation provides important information for road planning, intelligent road routing, road traffic control, and vehicular network traffic scheduling, routing and dissemination. The ever increasing number of vehicles equipped with wireless communication capabilities provides new means to estimate the road traffic density more accurately and in real time than traditionally used techniques. In this thesis, we consider two research problems on road traffic density estimation. The first research problem is the design of a road traffic density estimation algorithm in which each vehicle estimates its local road traffic density using only simple measurements, i.e. the number of neighboring vehicles. A maximum likelihood estimator of the traffic density is obtained based on a rigorous analysis of the joint distribution of the number of vehicles in each hop. Analysis is also conducted on the accuracy of the estimation and the amount of neighborhood information required for an accurate road traffic density estimation. Simulations are performed which validate the accuracy and the robustness of the proposed density estimation algorithm. Secondly, we consider the problem of road traffic density estimation based on the use of a stochastic geometry concept, the contact distribution function, which obtains density estimates from a probe vehicle traveling within the area of interest, measuring the numbers of inter-contact vehicles and the inter-contact lengths. A maximum likelihood estimator of the traffic density is applied. Analysis is also performed on the accuracy of the estimation, and the bias arising from small sample sizes is corrected. Simulations are performed which validate the accuracy and robustness of the proposed density estimation algorithm.
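As a toy illustration of the first problem, the sketch below assumes vehicle positions follow a one-dimensional Poisson process, in which case the maximum likelihood density estimate from one-hop neighbour counts has a simple closed form; the thesis's estimator is based on a more detailed multi-hop analysis, so this is only the simplest special case and all parameter values are assumptions.

```python
import numpy as np

# If vehicle positions on a road follow a 1-D Poisson process with density rho
# (vehicles per metre), the number of one-hop neighbours within radio range r
# of a vehicle is Poisson with mean 2*r*rho, and the maximum likelihood
# estimate from counts k_1..k_m is their mean divided by 2*r.
def density_mle_from_neighbour_counts(counts, radio_range_m):
    counts = np.asarray(counts, dtype=float)
    return counts.mean() / (2.0 * radio_range_m)

rng = np.random.default_rng(7)
true_rho = 0.02                       # vehicles per metre (assumed)
r = 250.0                             # radio range in metres (assumed)
counts = rng.poisson(2 * r * true_rho, size=200)
print(density_mle_from_neighbour_counts(counts, r))
```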
APA, Harvard, Vancouver, ISO, and other styles
13

Chee, Chew–Seng. "A mixture-based framework for nonparametric density estimation." Thesis, University of Auckland, 2011. http://hdl.handle.net/2292/10148.

Full text
Abstract:
The primary goal of this thesis is to provide a mixture-based framework for nonparametric density estimation. This framework advocates the use of a mixture model with a nonparametric mixing distribution to approximate the distribution of the data. The implementation of a mixture-based nonparametric density estimator generally requires the specification of parameters in a mixture model and the choice of the bandwidth parameter. Consequently, a nonparametric methodology consisting of both an estimation step and a selection step is described. For the estimation of parameters in mixture models, we employ the minimum disparity estimation framework, within which there exist several estimation approaches differing in the way smoothing is incorporated in the disparity objective function. For the selection of the bandwidth parameter, we study some popular methods such as cross-validation and information criteria-based model selection methods. Also, new algorithms are developed for the computation of the mixture-based nonparametric density estimates. A series of studies on mixture-based nonparametric density estimators is presented, ranging from the problem of nonparametric density estimation in general to estimation under constraints. The problem of estimating symmetric densities is investigated first, followed by an extension in which the interest lies in estimating finite mixtures of symmetric densities. The third study utilizes the idea of double smoothing in defining the least squares criterion for mixture-based nonparametric density estimation. For these problems, numerical studies using both simulated and real data examples suggest that the performance of the mixture-based nonparametric density estimators is generally better than, or at least competitive with, that of kernel-based nonparametric density estimators. The last concern is nonparametric estimation of continuous and discrete distributions under shape constraints. In particular, a new model, called the discrete k-monotone, is proposed for estimating the number of unknown species. In fact, the discrete k-monotone distribution is a mixture of specific discrete beta distributions. Empirical results indicate that the new model outperforms the commonly used nonparametric Poisson mixture model in the context of species richness estimation. Although there remain issues to be resolved, the promising results from our series of studies make the mixture-based framework a valuable tool for nonparametric density estimation.
APA, Harvard, Vancouver, ISO, and other styles
14

Kharoufeh, Jeffrey P. "Density estimation for functions of correlated random variables." Ohio : Ohio University, 1997. http://www.ohiolink.edu/etd/view.cgi?ohiou1177097417.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

Nasios, Nikolaos. "Bayesian learning for parametric and kernel density estimation." Thesis, University of York, 2006. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.428460.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Finch, Andrew M. "Density estimation for pattern recognition using neural networks." Thesis, University of York, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.261061.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Lee, Suhwon. "Nonparametric bayesian density estimation with intrinsic autoregressive priors /." free to MU campus, to others for purchase, 2003. http://wwwlib.umi.com/cr/mo/fullcit?p3115565.

Full text
APA, Harvard, Vancouver, ISO, and other styles
18

Li, Yuhao. "Multiclass Density Estimation Analysis in N-Dimensional Space featuring Delaunay Tessellation Field Estimation." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-301958.

Full text
Abstract:
Multiclass density estimation is a method that can both estimate the density of a field and classify a given point into its target class. Delaunay Tessellation Field Estimation is a tessellation-based multiclass density estimation technique that has recently resurfaced and has been applied in the fields of astronomy and computer science. In this paper, Delaunay Tessellation Field Estimation is compared with other traditional density estimation techniques such as Kernel Density Estimation, k-Nearest Neighbour Density, Local Reachability Density and the histogram, to deliver a detailed performance analysis. One of the main conclusions is that Delaunay Tessellation Field Estimation scales with the number of data points but not with the number of dimensions.
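A minimal sketch of the core DTFE idea (density at a sample point proportional to the inverse volume of its contiguous Delaunay cells) is shown below; it omits boundary handling and the multiclass extension discussed in the thesis, and the normalisation is only up to a constant.

```python
import math

import numpy as np
from scipy.spatial import Delaunay

def dtfe_density(points):
    """Basic Delaunay Tessellation Field Estimator evaluated at the sample points.

    The density at point i is taken proportional to (d + 1) divided by the total
    volume of the Delaunay simplices having i as a vertex.
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    tri = Delaunay(points)
    contiguous_volume = np.zeros(n)
    for simplex in tri.simplices:
        verts = points[simplex]
        # Volume of a d-simplex from the determinant of its edge matrix.
        vol = abs(np.linalg.det(verts[1:] - verts[0])) / math.factorial(d)
        contiguous_volume[simplex] += vol
    return (d + 1) / contiguous_volume

rng = np.random.default_rng(3)
sample = rng.normal(size=(500, 2))
print(dtfe_density(sample)[:5])
```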
APA, Harvard, Vancouver, ISO, and other styles
19

Leahy, Logan Patrick. "Estimating output torque via amplitude estimation and neural drive : a high-density sEMG study." Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/127134.

Full text
Abstract:
Thesis: S.M., Massachusetts Institute of Technology, Department of Mechanical Engineering, May, 2020
Cataloged from the official PDF of thesis.
Includes bibliographical references (pages 115-121).
The scope and relevance of wearable robotics spans across a number of research fields with a variety of applications. One such application is the augmentation of healthy individuals for improved performance. A challenge within this field is improving user-interface control. An established approach for improving user-interface control is neural control interfaces derived from surface electromyography (sEMG). This thesis presents an exploration of output joint torque estimation using high density surface electromyography (HDsEMG). The specific aims of this thesis were to implement a well-established amplitude estimation method for standard multi-electrode sEMG collection with an HDsEMG grid, and to take an existing blind source separation algorithm for HDsEMG decomposition and modify it in order to decompose a nonisometric contraction.
In order to meet our study objectives, a novel dataset of simultaneous HDsEMG collected from the tibialis anterior muscle and torque output measures during controlled ankle movements was acquired. This data collection was conducted at The Army Research Laboratory. Data was collected for six subjects across three test conditions. The three test conditions were an isometric ramp-and-hold contraction, a force-varying isometric sinusoidal contraction, and a dynamic isotonic contraction. The amplitude estimation method used has been well-established but has not yet been explored for HDsEMG grids. In the exploration, three factors were varied: the number of channels on the grid used, the spatial area covered by the grid, and the signal whitening condition (no whitening, conventional whitening, and adaptive whitening).
The findings were that (1) Reducing the number of channels used while covering a constant spatial area did not diminish the output torque estimate, (2) Reducing the spatial area covered for a constant number of channels did not diminish the output torque estimate, and (3) For higher levels of contraction, adaptive whitening performed worse than conventional whitening and no whitening. The results suggest adaptive whitening is not a suitable method for HDsEMG. These findings are encouraging for developing an improved signal for myoelectric control: smaller, less expensive grids that use computationally less taxing methods could be utilized to achieve comparable, if not better, results. A blind source separation method based on iterative deconvolution of HDsEMG using independent component analysis was implemented to identify individual motor unit spike trains. Two methods were then used to generate the neural drive profile: rate coding and kernel smoothing.
A looped decomposition method was implemented for estimating output torque during the isotonic contraction. Even in the most controlled setting for a primarily single joint muscle, the modification of the algorithm did not represent the full population of the active motor units; thus, torque estimation was poor. There are still significant limitations in moving towards predicting output torque during dynamic contractions using this neural drive method. Although the decomposition of a non-isometric contraction was not successful, a contribution of this thesis work was identifying that the decomposition algorithm implemented may be biased towards larger motor units. This independently substantiated the same observation reported in a study published during the course of this thesis.
by Logan Patrick Leahy.
S.M.
S.M. Massachusetts Institute of Technology, Department of Mechanical Engineering
APA, Harvard, Vancouver, ISO, and other styles
20

Minsker, Stanislav. "Non-asymptotic bounds for prediction problems and density estimation." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/44808.

Full text
Abstract:
This dissertation investigates the learning scenarios where a high-dimensional parameter has to be estimated from a given sample of fixed size, often smaller than the dimension of the problem. The first part answers some open questions for the binary classification problem in the framework of active learning. Given a random couple (X,Y) with unknown distribution P, the goal of binary classification is to predict a label Y based on the observation X. A prediction rule is constructed from a sequence of observations sampled from P. The concept of active learning can be informally characterized as follows: on every iteration, the algorithm is allowed to request a label Y for any instance X which it considers to be the most informative. The contribution of this work consists of two parts: first, we provide the minimax lower bounds for the performance of active learning methods. Second, we propose an active learning algorithm which attains nearly optimal rates over a broad class of underlying distributions and is adaptive with respect to the unknown parameters of the problem. The second part of this thesis is related to sparse recovery in the framework of dictionary learning. Let (X,Y) be a random couple with unknown distribution P. Given a collection of functions H, the goal of dictionary learning is to construct a prediction rule for Y given by a linear combination of the elements of H. The problem is sparse if there exists a good prediction rule that depends on a small number of functions from H. We propose an estimator of the unknown optimal prediction rule based on a penalized empirical risk minimization algorithm. We show that the proposed estimator is able to take advantage of the possible sparse structure of the problem by providing probabilistic bounds for its performance.
APA, Harvard, Vancouver, ISO, and other styles
21

Cule, Madeleine. "Maximum likelihood estimation of a multivariate log-concave density." Thesis, University of Cambridge, 2010. https://www.repository.cam.ac.uk/handle/1810/237061.

Full text
Abstract:
Density estimation is a fundamental statistical problem. Many methods are either sensitive to model misspecification (parametric models) or difficult to calibrate, especially for multivariate data (nonparametric smoothing methods). We propose an alternative approach using maximum likelihood under a qualitative assumption on the shape of the density, specifically log-concavity. The class of log-concave densities includes many common parametric families and has desirable properties. For univariate data, these estimators are relatively well understood, and are gaining in popularity in theory and practice. We discuss extensions for multivariate data, which require different techniques. After establishing existence and uniqueness of the log-concave maximum likelihood estimator for multivariate data, we see that a reformulation allows us to compute it using standard convex optimization techniques. Unlike kernel density estimation, or other nonparametric smoothing methods, this is a fully automatic procedure, and no additional tuning parameters are required. Since the assumption of log-concavity is non-trivial, we introduce a method for assessing the suitability of this shape constraint and apply it to several simulated datasets and one real dataset. Density estimation is often one stage in a more complicated statistical procedure. With this in mind, we show how the estimator may be used for plug-in estimation of statistical functionals. A second important extension is the use of log-concave components in mixture models. We illustrate how we may use an EM-style algorithm to fit mixture models where the number of components is known. Applications to visualization and classification are presented. In the latter case, improvement over a Gaussian mixture model is demonstrated. Performance for density estimation is evaluated in two ways. Firstly, we consider Hellinger convergence (the usual metric of theoretical convergence results for nonparametric maximum likelihood estimators). We prove consistency with respect to this metric and heuristically discuss rates of convergence and model misspecification, supported by empirical investigation. Secondly, we use the mean integrated squared error to demonstrate favourable performance compared with kernel density estimates using a variety of bandwidth selectors, including sophisticated adaptive methods. Throughout, we emphasise the development of stable numerical procedures able to handle the additional complexity of multivariate data.
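For reference, the estimator discussed above can be written as the following optimisation problem (a standard formulation in the log-concave estimation literature; notation assumed rather than taken from the thesis):

```latex
% The log-concave maximum likelihood estimator: over densities f = e^{\phi}
% with \phi concave, maximise the log-likelihood; equivalently, maximise the
% unconstrained functional on the right, whose maximiser integrates to one.
\hat{f}_n \;=\; \operatorname*{arg\,max}_{f = e^{\phi},\ \phi \text{ concave}}
\ \frac{1}{n}\sum_{i=1}^{n} \log f(X_i),
\qquad
\hat{\phi}_n \;=\; \operatorname*{arg\,max}_{\phi \text{ concave}}
\ \left\{ \frac{1}{n}\sum_{i=1}^{n} \phi(X_i) \;-\; \int_{\mathbb{R}^d} e^{\phi(x)}\,\mathrm{d}x \right\}.
```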
APA, Harvard, Vancouver, ISO, and other styles
22

Mulye, Apoorva. "Power Spectrum Density Estimation Methods for Michelson Interferometer Wavemeters." Thesis, Université d'Ottawa / University of Ottawa, 2016. http://hdl.handle.net/10393/35500.

Full text
Abstract:
In Michelson interferometry, many algorithms are used to detect the number of active laser sources at any given time. Conventional FFT-based non-parametric methods are widely used for this purpose. However, non-parametric methods are not the only possible option for distinguishing the peaks in a spectrum, and they are not the most suitable for short data records and for closely spaced wavelengths. This thesis aims to provide solutions to these problems. It puts forward the use of parametric methods such as autoregressive methods and harmonic methods, and proposes two new algorithms to detect closely spaced peaks for different scenarios of optical signals in wavemeters. Various parametric algorithms are studied, and their performances are compared with non-parametric algorithms against different criteria, e.g. absolute levels, frequency resolution, and accuracy of peak positions. Simulations are performed on synthetic signals produced from specifications provided by our sponsor, a wavemeter manufacturing company.
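To contrast the two families of estimators mentioned above, the sketch below compares an FFT-based periodogram with a simple autoregressive (Yule-Walker) spectral estimate on two closely spaced sinusoids; the signal parameters and model order are illustrative, and this is not the thesis's algorithm.

```python
import numpy as np
from scipy.linalg import toeplitz
from scipy.signal import periodogram

def ar_psd_yule_walker(x, order, n_freq=512, fs=1.0):
    # Fit an AR(order) model by solving the Yule-Walker equations, then
    # evaluate its power spectral density on a frequency grid.
    x = np.asarray(x, dtype=float) - np.mean(x)
    r = np.correlate(x, x, mode="full")[len(x) - 1:] / len(x)
    a = np.linalg.solve(toeplitz(r[:order]), r[1:order + 1])   # AR coefficients
    sigma2 = r[0] - a @ r[1:order + 1]                          # innovation variance
    freqs = np.linspace(0.0, fs / 2.0, n_freq)
    z = np.exp(-2j * np.pi * np.outer(freqs / fs, np.arange(1, order + 1)))
    psd = sigma2 / np.abs(1.0 - z @ a) ** 2 / fs
    return freqs, psd

# Two closely spaced sinusoids in noise, loosely mimicking the wavemeter setting.
fs, n = 1000.0, 256
t = np.arange(n) / fs
x = (np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 110 * t)
     + 0.1 * np.random.default_rng(0).normal(size=n))
f_np, p_np = periodogram(x, fs=fs)                  # non-parametric estimate
f_ar, p_ar = ar_psd_yule_walker(x, order=16, fs=fs) # parametric AR estimate
```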
APA, Harvard, Vancouver, ISO, and other styles
23

Sardo, Lucia. "Model selection in probability density estimation using Gaussian mixtures." Thesis, University of Surrey, 1997. http://epubs.surrey.ac.uk/842833/.

Full text
Abstract:
This thesis proposes Gaussian mixtures as a flexible semiparametric tool for density estimation and addresses the problem of model selection for this class of density estimators. First, a brief introduction to various techniques for model selection proposed in the literature is given. The most commonly used techniques are cross-validation and methods based on data reuse, and they are all either computationally very intensive or extremely demanding in terms of training set size. Another class of methods, known as information criteria, allows model selection at a much lower computational cost and for any sample size. The main objective of this study is to develop a technique for model selection that is not too computationally demanding, while capable of delivering an acceptable performance on a range of problems of various dimensionality. Another important issue addressed is the effect of the sample size. Large data sets are often difficult and costly to obtain, hence keeping the sample size within reasonable limits is also very important. Nevertheless, sample size is central to the problem of density estimation and one cannot expect good results with extremely limited samples. Information criteria are the most suitable candidates for a model selection procedure fulfilling these requirements. The well-known Schwarz Bayesian Information Criterion (BIC) is analysed and its deficiencies when used with data of large dimensionality are noted. A modification that improves on the BIC criterion is proposed and named the Maximum Penalised Likelihood (MPL) criterion. This criterion has the advantage that it can be adapted to the data, and its satisfactory performance is demonstrated experimentally. Unfortunately, all information criteria, including the proposed MPL, suffer from a major drawback: a strong assumption of simplicity of the density to be estimated. This can lead to badly underfitted estimates, especially for small sample size problems. As a solution to such deficiencies, a procedure for validating the different models, based on an assessment of the model's predictive performance, is proposed. The optimality criterion for model selection can be formulated as follows: if a model is able to predict the observed data frequencies within the statistical error, it is an acceptable model; otherwise it is rejected. An attractive feature of such a measure of goodness is the fact that it is an absolute measure, rather than a relative one, which would only provide a ranking between candidate models.
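As a baseline illustration of information-criterion model selection for Gaussian mixtures, the sketch below selects the number of components by BIC using scikit-learn; the thesis's MPL criterion and predictive validation procedure are not reproduced here, and the data are simulated.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# BIC-based selection of the number of Gaussian mixture components,
# i.e. the baseline criterion the thesis analyses and improves upon.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 0.5, 300),
                       rng.normal(2, 1.0, 700)]).reshape(-1, 1)

bic_scores = []
for k in range(1, 7):
    gmm = GaussianMixture(n_components=k, random_state=0).fit(data)
    bic_scores.append(gmm.bic(data))

best_k = int(np.argmin(bic_scores)) + 1
print(best_k, bic_scores)
```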
APA, Harvard, Vancouver, ISO, and other styles
24

Amghar, Mohamed. "Multiscale local polynomial transforms in smoothing and density estimation." Doctoral thesis, Universite Libre de Bruxelles, 2017. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/262040.

Full text
Abstract:
A major challenge in multiscale non-linear estimation methods, such as wavelet thresholding, is the extension of these methods to settings where the observations are irregular and non-equidistant. When applying these techniques to data smoothing or density estimation, it is crucial to work in a function space that imposes a certain degree of regularity. We therefore follow a different approach, using the so-called lifting scheme. In order to combine regularity and good numerical conditioning, we adopt a scheme similar to the Laplacian pyramid, which can be considered a slightly redundant wavelet transform. Whereas the classical lifting scheme relies on interpolation as its basic operation, this scheme allows the use of smoothing, for example with local polynomials. The kernel of the smoothing operation is chosen in a multiscale manner. The first chapter of this project develops the multiscale local polynomial transform, which combines the advantages of local polynomial smoothing with the sparsity of a multiscale decomposition. The contribution of this part is twofold. First, it focuses on the bandwidths used throughout the transform. These bandwidths act as user-controlled scales in a multiscale analysis, which is of particular interest in the case of non-equidistant data. This part presents both an optimal likelihood-based bandwidth selection and a fast heuristic approach. The second contribution is the combination of local polynomial smoothing with orthogonal prefilters in order to reduce the variance of the reconstruction. In the second chapter, the project addresses density estimation through the multiscale local polynomial transform, proposing a more advanced reconstruction, called weighted reconstruction, to control the propagation of the variance. The last chapter is concerned with the extension of the multiscale local polynomial transform to the bivariate case, while listing some advantages that can be exploited from this transform (sparsity, no triangulations) compared to the classical two-dimensional wavelet transform.
Doctorat en Sciences
info:eu-repo/semantics/nonPublished
APA, Harvard, Vancouver, ISO, and other styles
25

Inacio, Marco Henrique de Almeida. "Comparing two populations using Bayesian Fourier series density estimation." Universidade Federal de São Carlos, 2017. https://repositorio.ufscar.br/handle/ufscar/8920.

Full text
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Given two samples from two populations, one could ask how similar the populations are, that is, how close their probability distributions are. For absolutely continuous distributions, one way to measure the proximity of such populations is to use a measure of distance (metric) between the probability density functions (which are unknown, given that only samples are observed). In this work, we use the integrated squared distance as the metric. To measure the uncertainty of the integrated squared distance, we first model the uncertainty of each of the probability density functions using a nonparametric Bayesian method. The method consists of estimating the probability density function f (or its logarithm) using Fourier series {f0, f1, ..., fI}. Assigning a prior distribution to f is then equivalent to assigning a prior distribution to the coefficients of this series. We used the prior suggested by Scricciolo (2006) (sieve prior), which places a prior not only on these coefficients but also on I itself, so that in reality we work with a Bayesian mixture of finite-dimensional models. To obtain posterior samples of such a mixture, we marginalize out the discrete model index parameter I and use the statistical software Stan. We conclude that the Bayesian Fourier series method has good performance when compared to kernel density estimation, although both methods often have problems in the estimation of the probability density function near the boundaries. Lastly, we show how the Fourier series methodology can be used to assess the uncertainty regarding the similarity of two samples. In particular, we apply this method to a dataset of patients with Alzheimer's disease.
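A sketch of the ingredients described above, in assumed notation rather than the thesis's exact parameterisation: the log-density is expanded in a finite basis and the closeness of two densities is measured by the integrated squared distance.

```latex
% Finite-dimensional (sieve) expansion of the log-density in a basis
% \{\varphi_1,\dots,\varphi_I\}, with normalising constant c(\theta), and the
% integrated squared distance between two densities f and g:
\log f(x) \;=\; \sum_{i=1}^{I} \theta_i\, \varphi_i(x) \;-\; c(\theta),
\qquad
c(\theta) \;=\; \log \int \exp\Bigl( \sum_{i=1}^{I} \theta_i\, \varphi_i(x) \Bigr)\,\mathrm{d}x,
\qquad
d^2(f, g) \;=\; \int \bigl( f(x) - g(x) \bigr)^2 \,\mathrm{d}x .
```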
APA, Harvard, Vancouver, ISO, and other styles
26

Wright, George Alfred Jr. "Nonparametric density estimation and its application in communication theory." Diss., Georgia Institute of Technology, 1996. http://hdl.handle.net/1853/14979.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Chan, Karen Pui-Shan. "Kernel density estimation, Bayesian inference and random effects model." Thesis, University of Edinburgh, 1990. http://hdl.handle.net/1842/13350.

Full text
Abstract:
This thesis contains results of a study in kernel density estimation, Bayesian inference and random effects models, with application to forensic problems. Estimation of the Bayes' factor in a forensic science problem involved the derivation of predictive distributions in non-standard situations. The distribution of the values of a characteristic of interest among different items in forensic science problems is often non-Normal. Background, or training, data were available to assist in the estimation of the distribution for measurements on cat and dog hairs. An informative prior, based on the kernel method of density estimation, was used to derive the appropriate predictive distributions. The training data may be considered to be derived from a random effects model. This was taken into consideration in modelling the Bayes' factor. The usual assumption of the random factor being Normally distributed is unrealistic, so a kernel density estimate was used as the distribution of the unknown random factor. Two kernel methods were employed: the ordinary and adaptive kernel methods. The adaptive kernel method allowed for the longer tail, where little information was available. Formulae for the Bayes' factor in a forensic science context were derived assuming the training data were grouped or not grouped (for example, hairs from one cat would be thought of as belonging to the same group), and that the within-group variance was or was not known. The Bayes' factor, assuming known within-group variance, for the training data, grouped or not grouped, was extended to the multivariate case. The method was applied to a practical example in a bivariate situation. Similar modelling of the Bayes' factor was derived to cope with a particular form of mixture data. Boundary effects were also taken into consideration. The application of kernel density estimation to make inferences about the variance components under the random effects model was also studied. Employing the maximum likelihood estimation method, it was shown that the between-group variance and the smoothing parameter in the kernel density estimation are related; they are not identifiable separately. With the smoothing parameter fixed at some predetermined value, the within- and between-group variance estimates from the proposed model were equivalent to the usual ANOVA estimates. Within the Bayesian framework, posterior distributions for the variance components, using various prior distributions for the parameters, were derived incorporating kernel density functions. The modes of these posterior distributions were used as estimates for the variance components. A Student-t distribution within a Bayesian framework was derived after the introduction of a prior for the smoothing parameter. Two methods of obtaining hyper-parameters for the prior were suggested, both involving empirical Bayes methods: a modified leave-one-out maximum likelihood method and a method of moments based on the optimum smoothing parameter determined under a Normality assumption.
APA, Harvard, Vancouver, ISO, and other styles
28

Joshi, Niranjan Bhaskar. "Non-parametric probability density function estimation for medical images." Thesis, University of Oxford, 2008. http://ora.ox.ac.uk/objects/uuid:ebc6af07-770b-4fee-9dc9-5ebbe452a0c1.

Full text
Abstract:
The estimation of probability density functions (PDF) of intensity values plays an important role in medical image analysis. Non-parametric PDF estimation methods have the advantage of generality in their application. The two most popular estimators in image analysis methods to perform the non-parametric PDF estimation task are the histogram and the kernel density estimator. But these popular estimators crucially need to be ‘tuned’ by setting a number of parameters and may be either computationally inefficient or need a large amount of training data. In this thesis, we critically analyse and further develop a recently proposed non-parametric PDF estimation method for signals, called the NP windows method. We propose three new algorithms to compute PDF estimates using the NP windows method. One of these algorithms, called the log-basis algorithm, provides an easier and faster way to compute the NP windows estimate, and allows us to compare the NP windows method with the two existing popular estimators. Results show that the NP windows method is fast and can estimate PDFs with a significantly smaller amount of training data. Moreover, it does not require any additional parameter settings. To demonstrate utility of the NP windows method in image analysis we consider its application to image segmentation. To do this, we first describe the distribution of intensity values in the image with a mixture of non-parametric distributions. We estimate these distributions using the NP windows method. We then use this novel mixture model to evolve curves with the well-known level set framework for image segmentation. We also take into account the partial volume effect that assumes importance in medical image analysis methods. In the final part of the thesis, we apply our non-parametric mixture model (NPMM) based level set segmentation framework to segment colorectal MR images. The segmentation of colorectal MR images is made challenging due to sparsity and ambiguity of features, presence of various artifacts, and complex anatomy of the region. We propose to use the monogenic signal (local energy, phase, and orientation) to overcome the first difficulty, and the NPMM to overcome the remaining two. Results are improved substantially on those that have been reported previously. We also present various ways to visualise clinically useful information obtained with our segmentations in a 3-dimensional manner.
APA, Harvard, Vancouver, ISO, and other styles
29

Inácio, Marco Henrique de Almeida. "Comparing two populations using Bayesian Fourier series density estimation." Universidade de São Paulo, 2017. http://www.teses.usp.br/teses/disponiveis/104/104131/tde-12092017-083813/.

Full text
Abstract:
Given two samples from two populations, one could ask how similar the populations are, that is, how close their probability distributions are. For absolutely continuous distributions, one way to measure the proximity of such populations is to use a measure of distance (metric) between the probability density functions (which are unknown, given that only samples are observed). In this work, we use the integrated squared distance as the metric. To measure the uncertainty of the integrated squared distance, we first model the uncertainty of each of the probability density functions using a nonparametric Bayesian method. The method consists of estimating the probability density function f (or its logarithm) using Fourier series {f0, f1, ..., fI}. Assigning a prior distribution to f is then equivalent to assigning a prior distribution to the coefficients of this series. We used the prior suggested by Scricciolo (2006) (sieve prior), which places a prior not only on these coefficients but also on I itself, so that in reality we work with a Bayesian mixture of finite-dimensional models. To obtain posterior samples of such a mixture, we marginalize out the discrete model index parameter I and use the statistical software Stan. We conclude that the Bayesian Fourier series method has good performance when compared to kernel density estimation, although both methods often have problems in the estimation of the probability density function near the boundaries. Lastly, we show how the Fourier series methodology can be used to assess the uncertainty regarding the similarity of two samples. In particular, we apply this method to a dataset of patients with Alzheimer's disease.
APA, Harvard, Vancouver, ISO, and other styles
30

Ellis, Amanda Morgan. "An assessment of density estimation methods for forest ungulates." Thesis, Rhodes University, 2004. http://hdl.handle.net/10962/d1007830.

Full text
Abstract:
The development of conservation and management programs for an animal population relies on a knowledge of the number of individuals in an area. In order to achieve reliable estimates, precise and accurate techniques for estimating population densities are needed. This study compared the use of direct and indirect methods of estimating kudu (Tragelaphus strepsiceros), bushbuck (Tragelaphus scriptus), common duiker (Sylvicapra grimmia), and blue duiker (Philantomba monticola) densities on Shamwari Game Reserve in the Eastern Cape Province, South Africa. These species prefer habitats of dense forest and bush for concealment and are therefore not easily counted in open areas. Herein, direct observation counts were compared to indirect sampling via pellet group counts (clearance plots, line transects, variable-width transects, and strip transects). Clearance plots were examined every 2 weeks, while all other methods were conducted seasonally, from August 2002 until August 2003. The strip transect method provided the lowest density estimates (animals per ha), ranging from 0.001 for bushbuck to 0.025 for common duiker, while direct observations yielded the highest estimates, ranging from 0.804 for bushbuck to 4.692 for kudu. Also, a validation of methods was performed against a known population of kudu, during which the DISTANCE method yielded the most accurate results, with an estimated density of 0.261 that was within the actual density range of 0.246 to 0.282. In addition, the DISTANCE method was compared to helicopter counts of kudu and its estimates were found to be approximately 2.6 times greater than the helicopter count results. When the assessment of the methods was made, the cost, manpower and effort requirements, coefficient of variation, and performance against a known population for each method were taken into consideration. Overall, the DISTANCE method performed the best, with low cost, minimal manpower and effort requirements, and low coefficient of variation. On Shamwari Game Reserve, the DISTANCE method estimated 0.300 kudu, 0.108 bushbuck, 0.387 common duiker, and 0.028 blue duiker per ha, which, when extrapolated to the total number of animals present within subtropical thicket habitat, gave estimates of 1973 kudu, 710 bushbuck, 2545 common duiker, and 184 blue duiker.
APA, Harvard, Vancouver, ISO, and other styles
31

Thomas, Derek C. "Theory and Estimation of Acoustic Intensity and Energy Density." Diss., CLICK HERE for online access, 2008. http://contentdm.lib.byu.edu/ETD/image/etd2560.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Jawhar, Nizar Sami. "Adaptive Density Estimation Based on the Mode Existence Test." DigitalCommons@USU, 1996. https://digitalcommons.usu.edu/etd/7129.

Full text
Abstract:
The kernel persists as the most useful tool for density estimation. Although, in general, fixed kernel estimates have proven superior to the results of available variable kernel estimators, Minnotte's mode tree and mode existence test give us newfound hope of producing a useful adaptive kernel estimator that triumphs where fixed kernel methods fail. It improves on the fixed kernel for multimodal distributions where the sizes of the modes are unequal and where the degree of separation of the modes varies. When these latter conditions exist, they present a serious challenge to the best of fixed kernel density estimators. Capitalizing on the work of Minnotte in detecting multimodality adaptively, we found it possible to determine the bandwidth h adaptively in a most original fashion and to estimate mixtures of normals adaptively, using the normal kernel, with encouraging results.
APA, Harvard, Vancouver, ISO, and other styles
33

Baba, Harra M'hammed. "Estimation de densités spectrales d'ordre élevé." Rouen, 1996. http://www.theses.fr/1996ROUES023.

Full text
Abstract:
In this thesis we construct estimators of the cumulant spectral density for a strictly homogeneous, centred process, the time space being either the multidimensional real Euclidean space or the multidimensional space of p-adic numbers. In this construction we use trajectory smoothing combined with a time shift, or the spectral window method. Under certain regularity conditions, the proposed estimators are asymptotically unbiased and consistent. The estimation procedures presented can find applications in many scientific fields and can also provide partial answers to questions concerning certain statistical properties of random processes.
APA, Harvard, Vancouver, ISO, and other styles
34

Uria, Benigno. "Connectionist multivariate density-estimation and its application to speech synthesis." Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/15868.

Full text
Abstract:
Autoregressive models factorize a multivariate joint probability distribution into a product of one-dimensional conditional distributions. The variables are assigned an ordering, and the conditional distribution of each variable is modelled using all variables preceding it in that ordering as predictors. Calculating normalized probabilities and sampling has polynomial computational complexity under autoregressive models. Moreover, binary autoregressive models based on neural networks obtain statistical performances similar to those of some intractable models, like restricted Boltzmann machines, on several datasets. The use of autoregressive probability density estimators based on neural networks to model real-valued data, while proposed before, has never been properly investigated and reported. In this thesis we extend the formulation of neural autoregressive distribution estimators (NADE) to real-valued data: a model we call the real-valued neural autoregressive density estimator (RNADE). Its statistical performance on several datasets, including visual and auditory data, is reported and compared to that of other models. RNADE obtained higher test likelihoods than other tractable models, while retaining all the attractive computational properties of autoregressive models. However, autoregressive models are limited by the ordering of the variables inherent to their formulation. Marginalization and imputation tasks can only be solved analytically if the missing variables are at the end of the ordering. We present a new training technique that obtains a set of parameters that can be used for any ordering of the variables. By choosing a model with a convenient ordering of the dimensions at test time, it is possible to solve any marginalization and imputation task analytically. The same training procedure also makes it practical to train NADEs and RNADEs with several hidden layers. The resulting deep and tractable models display higher test likelihoods than the equivalent one-hidden-layer models for all the datasets tested. Ensembles of NADEs or RNADEs can be created inexpensively by combining models that share their parameters but differ in the ordering of the variables. These ensembles of autoregressive models obtain state-of-the-art statistical performances for several datasets. Finally, we demonstrate the application of RNADE to speech synthesis, and confirm that capturing the phone-conditional dependencies of acoustic features improves the quality of synthetic speech. Our model generates synthetic speech that was judged by naive listeners as being of higher quality than that generated by mixture density networks, which are considered a state-of-the-art synthesis technique.
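For reference, the autoregressive factorisation underlying NADE and RNADE is the following; in RNADE each one-dimensional conditional is modelled by a neural network (for example as a mixture of Gaussians), a detail assumed here rather than stated in the abstract.

```latex
% Autoregressive factorisation: with an ordering x_1, ..., x_D, the joint
% density is a product of one-dimensional conditionals, each conditioned on
% the variables preceding it in that ordering.
p(\mathbf{x}) \;=\; \prod_{d=1}^{D} p\bigl(x_d \mid x_{<d}\bigr),
\qquad x_{<d} = (x_1, \dots, x_{d-1}).
```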
APA, Harvard, Vancouver, ISO, and other styles
35

Pawluczyk, Olga. "Volumetric estimation of breast density for breast cancer risk prediction." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2001. http://www.collectionscanada.ca/obj/s4/f2/dsk3/ftp04/MQ58694.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
36

Kelly, Robert 1969. "Estimation of iceberg density in the Grand Banks of Newfoundland." Thesis, McGill University, 1996. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=23746.

Full text
Abstract:
Icebergs offshore Newfoundland represent hazards to both ships and constructed facilities, such as offshore oil production facilities. Collisions with icebergs represent hazards for both surface and sub-surface facilities. In the latter case, hazards are associated with seabed scouring by the iceberg keel. In both cases, hazard analysis requires estimation of the flux of icebergs and their size distribution. Estimates of the flux of icebergs can be obtained from separate estimates of iceberg densities and of the drift patterns of iceberg velocities. The objective of this thesis is to develop and apply estimation procedures for the density of icebergs using presently available data sets. The most comprehensive of these data sets has been compiled by the International Ice Patrol (IIP) since 1960. The IIP database comprises data from several sources and for icebergs of varying sizes. In addition, the spatial coverage of surveys does not appear to be uniform throughout the year. Several non-parametric density estimation procedures are investigated. The objective is to eliminate apparent high densities in the estimates caused by the non-uniform survey coverage of the region, while retaining statistically significant features in the spatial variation of densities.
Several kernel estimators are examined: (1) a uniform square kernel, (2) a uniform circular kernel, (3) a Normal kernel, and (4) an adaptive kernel. Uniform kernels have the advantage of computational efficiency; however, they do not account for spatial variation in the densities, over-smoothing in regions of peak iceberg density and under-smoothing in regions of low iceberg density. The adaptive kernel is computationally more demanding, but appears to fulfill all the desired requirements for preserving significant features and eliminating erratic estimates.
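As a sketch of the kind of estimators compared above, the code below implements a fixed-bandwidth two-dimensional Gaussian kernel estimator and an Abramson-style sample-point adaptive variant. The function names, the adaptive scaling rule, and the parameter `alpha` are illustrative assumptions, not the thesis's exact procedures.

```python
import numpy as np

def gaussian_kde_2d(points, grid, h):
    """Fixed-bandwidth 2-D Gaussian kernel density estimate, evaluated at the
    rows of `grid` from observed locations `points` (arrays of shape (m, 2))."""
    sq = np.sum((grid[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    return np.mean(np.exp(-0.5 * sq / h ** 2), axis=1) / (2 * np.pi * h ** 2)

def adaptive_kde_2d(points, grid, h0, alpha=0.5):
    """Sample-point adaptive variant: each observation carries its own bandwidth,
    inflated where a fixed-bandwidth pilot estimate is low (Abramson-style)."""
    pilot = gaussian_kde_2d(points, points, h0)            # pilot density at the data
    local_h = h0 * (pilot / np.exp(np.mean(np.log(pilot)))) ** (-alpha)
    sq = np.sum((grid[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    kernels = np.exp(-0.5 * sq / local_h ** 2) / (2 * np.pi * local_h ** 2)
    return np.mean(kernels, axis=1)
```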
APA, Harvard, Vancouver, ISO, and other styles
37

Hazelton, Martin Luke. "Method of density estimation with application to Monte Carlo methods." Thesis, University of Oxford, 1993. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.334850.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Bugrien, Jamal B. "Robust approaches to clustering based on density estimation and projection." Thesis, University of Leeds, 2005. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.418939.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

Braga, Ígor Assis. "Stochastic density ratio estimation and its application to feature selection." Universidade de São Paulo, 2014. http://www.teses.usp.br/teses/disponiveis/55/55134/tde-07042015-142545/.

Full text
Abstract:
The estimation of the ratio of two probability densities is an important statistical tool in supervised machine learning. In this work, we introduce new methods of density ratio estimation based on the solution of a multidimensional integral equation involving cumulative distribution functions. The resulting methods use the novel V-matrix, a concept that does not appear in previous density ratio estimation methods. Experiments demonstrate the good potential of this new approach compared with previous methods. Mutual Information - MI - estimation is a key component in feature selection and essentially depends on density ratio estimation. Using one of the methods of density ratio estimation proposed in this work, we derive a new estimator - VMI - and compare it experimentally to previously proposed MI estimators. Experiments conducted solely on mutual information estimation show that VMI compares favorably to previous estimators. Experiments applying MI estimation to feature selection in classification tasks show that better MI estimation leads to better feature selection performance. Parameter selection greatly impacts the classification accuracy of kernel-based Support Vector Machines - SVM. However, this step is often overlooked in experimental comparisons, for it is time-consuming and requires familiarity with the inner workings of SVM. In this work, we propose procedures for SVM parameter selection that are economical in running time. In addition, we propose the use of a non-linear kernel function - the min kernel - that can be applied to both low- and high-dimensional cases without adding another parameter to the selection process. The combination of the proposed parameter selection procedures and the min kernel yields a convenient way of economically extracting good classification performance from SVM. The Regularized Least Squares - RLS - regression method is another kernel method that depends on proper selection of its parameters. When training data is scarce, traditional parameter selection often leads to poor regression estimation. In order to mitigate this issue, we explore a kernel that is less susceptible to overfitting - the additive INK-splines kernel. Then, we consider alternatives to cross-validation for parameter selection that have been shown to perform well for other regression methods. Experiments conducted on real-world datasets show that the additive INK-splines kernel outperforms both the RBF kernel and the previously proposed multiplicative INK-splines kernel. They also show that the alternative parameter selection procedures fail to consistently improve performance. Still, we find that the Finite Prediction Error method with the additive INK-splines kernel performs comparably to cross-validation.
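To illustrate how a density-ratio estimate feeds a mutual-information estimate, the sketch below plugs simple kernel density estimates into the ratio p(x, y) / (p(x) p(y)) and averages its logarithm over the sample. This is only a plug-in illustration of the connection described in the abstract; it does not implement the V-matrix integral-equation approach or the VMI estimator proposed in the thesis, and the bandwidth `h` is an arbitrary assumed value.

```python
import numpy as np

def gaussian_kde(data, query, h):
    """Fixed-bandwidth Gaussian KDE; `data` and `query` have shape (n, d)."""
    d = data.shape[1]
    sq = np.sum((query[:, None, :] - data[None, :, :]) ** 2, axis=-1)
    return np.mean(np.exp(-0.5 * sq / h ** 2), axis=1) / (2 * np.pi * h ** 2) ** (d / 2)

def mi_from_density_ratio(x, y, h=0.3):
    """MI(X;Y) approximated as the sample average of
    log[ p(x, y) / (p(x) p(y)) ], with the ratio built from plug-in KDEs."""
    xy = np.column_stack([x, y])
    ratio = gaussian_kde(xy, xy, h) / (
        gaussian_kde(x[:, None], x[:, None], h) * gaussian_kde(y[:, None], y[:, None], h)
    )
    return float(np.mean(np.log(ratio)))
```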
APA, Harvard, Vancouver, ISO, and other styles
40

Alquier, Pierre. "Transductive and inductive adaptative inference for regression and density estimation." Paris 6, 2006. http://www.theses.fr/2006PA066436.

Full text
Abstract:
Adaptive, Inductive and Transductive Inference for Regression and Density Estimation (Pierre Alquier). This thesis studies the statistical properties of certain learning algorithms for regression and density estimation. It is divided into three parts. The first part generalizes Olivier Catoni's PAC-Bayesian theorems for classification to regression with a general loss function. The second part focuses on least-squares regression and proposes a new variable selection algorithm. This method can be applied, in particular, to a basis of orthonormal functions, in which case it achieves optimal convergence rates, and also to kernel-type functions, where it leads to a variant of support vector machine (SVM) methods. The third part extends the results of the second part to density estimation with quadratic loss.
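A generic illustration of variable selection over an orthonormal basis of functions, of the kind alluded to above, is to keep only the empirical coefficients that exceed a threshold. The sketch below implements that simple thresholding rule under the assumption that the basis is orthonormal with respect to the design distribution; it is not the PAC-Bayesian selection procedure of the thesis, and the function names and threshold are assumptions.

```python
import numpy as np

def threshold_basis_regression(x, y, basis_funcs, threshold):
    """Fit a regression function by estimating the coefficients of an orthonormal
    basis and keeping only those whose absolute value exceeds `threshold`
    (a generic selection rule, not the PAC-Bayesian procedure of the thesis)."""
    design = np.column_stack([phi(x) for phi in basis_funcs])   # (n, K) matrix
    coeffs = design.T @ y / len(y)                              # empirical coefficients
    kept = np.where(np.abs(coeffs) > threshold, coeffs, 0.0)
    def f_hat(t):
        return np.column_stack([phi(t) for phi in basis_funcs]) @ kept
    return f_hat
```

With, for example, the cosine basis (the constant function together with t ↦ √2 cos(πkt) on [0, 1]), the kept coefficients define the fitted regression function.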
APA, Harvard, Vancouver, ISO, and other styles
41

McDonagh, Steven George. "Building models from multiple point sets with kernel density estimation." Thesis, University of Edinburgh, 2015. http://hdl.handle.net/1842/10568.

Full text
Abstract:
One of the fundamental problems in computer vision is point set registration. Point set registration finds use in many important applications and in particular can be considered one of the crucial stages involved in the reconstruction of models of physical objects and environments from depth sensor data. The problem of globally aligning multiple point sets, representing spatial shape measurements from varying sensor viewpoints, into a common frame of reference is a complex task that is imperative due to the large number of critical functions that accurate and reliable model reconstructions contribute to. In this thesis we focus on improving the quality and feasibility of model and environment reconstruction through the enhancement of multi-view point set registration techniques. The thesis makes the following contributions. First, we demonstrate that employing kernel density estimation to reason about the unknown generating surfaces that range sensors measure allows us to express measurement variability and uncertainty, and also to separate the problems of model design and viewpoint alignment optimisation. Our surface estimates define novel view alignment objective functions that inform the registration process, and can be estimated from point clouds in a data-driven fashion. Through experiments on a variety of datasets we demonstrate that we have developed a novel and effective solution to the simultaneous multi-view registration problem. We then focus on constructing a distributed computation framework capable of solving generic high-throughput computational problems. We present a novel task-farming model that we call Semi-Synchronised Task Farming (SSTF), capable of modelling and subsequently solving computationally distributable problems that benefit from both independent and dependent distributed components and a level of communication between process elements. We demonstrate that this framework is a novel schema for parallel computer vision algorithms and evaluate its performance to establish the computational gains over serial implementations. We couple this framework with an accurate computation-time prediction model to contribute a novel structure appropriate for addressing expensive real-world algorithms, with substantial parallel performance and predictable time savings. Finally, we focus on a timely instance of the multi-view registration problem: modern range sensors provide large numbers of viewpoint samples that result in an abundance of depth data. The ability to use this abundance of depth data in a feasible and principled fashion is important to many emerging application areas that make use of spatial information. We develop novel methodology for the registration of depth measurements acquired from many viewpoints capturing physical object surfaces. By defining registration and alignment quality metrics based on our density estimation framework, we construct an optimisation methodology that implicitly considers all viewpoints simultaneously. We use a non-parametric, data-driven approach to handle varying object complexity and guide large view-set spatial transform optimisations. By aligning large numbers of partial, arbitrary-pose views, we evaluate this strategy quantitatively on large view-set range sensor data and find that we can improve registration accuracy over existing methods and increase registration robustness to the magnitude of the coarse seed alignment. This allows large-scale registration on problem instances exhibiting varying object complexity, with the added advantage of massive parallel efficiency.
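A minimal sketch of a kernel-density-based alignment objective for two point sets is given below: one set defines a Gaussian KDE, and a candidate rigid transform of the other set is scored by the log-density of its transformed points. This pairwise, two-dimensional sketch only illustrates the idea of using density estimates as registration objectives; it is not the simultaneous multi-view, data-driven objective developed in the thesis, and the function name and bandwidth are assumptions.

```python
import numpy as np

def kde_alignment_score(reference, moving, angle, translation, h=0.05):
    """Score a candidate 2-D rigid transform of `moving` against `reference`:
    build a Gaussian KDE from the reference points and sum the log-density of
    the transformed moving points.  Higher scores mean better overlap."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    transformed = moving @ rot.T + np.asarray(translation)
    sq = np.sum((transformed[:, None, :] - reference[None, :, :]) ** 2, axis=-1)
    density = np.mean(np.exp(-0.5 * sq / h ** 2), axis=1) / (2 * np.pi * h ** 2)
    return float(np.sum(np.log(density + 1e-12)))   # small constant guards against log(0)
```

A registration procedure would then maximise this score over the transform parameters, for example by grid search or a general-purpose optimiser.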
APA, Harvard, Vancouver, ISO, and other styles
42

Zhu, Hui. "Scatterer number density estimation for tissue characterization in ultrasound imaging." Online version of thesis, 1990. http://hdl.handle.net/1850/10882.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Wong, Kam-wah. "Efficient computation of global illumination based on adaptive density estimation." Hong Kong: University of Hong Kong, 2001. http://sunzi.lib.hku.hk/hkuto/record.jsp?B25151083.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Wang, Yi. "Latent tree models for multivariate density estimation: algorithms and applications." View abstract or full-text, 2009. http://library.ust.hk/cgi/db/thesis.pl?CSED%202009%20WANGY.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Esterhuizen, Gerhard. "Generalised density function estimation using moments and the characteristic function." Thesis, Link to the online version, 2003. http://hdl.handle.net/10019.1/1001.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Sain, Stephan R. "Adaptive kernel density estimation." Thesis, 1994. http://hdl.handle.net/1911/16743.

Full text
Abstract:
The need for improvements over the fixed kernel density estimator in certain situations has been discussed extensively in the literature, particularly in the application of density estimation to mode hunting. Problem densities often exhibit skewness or multimodality with differences in scale for each mode. By varying the bandwidth in some fashion, it is possible to achieve significant improvements over the fixed bandwidth approach. In general, variable bandwidth kernel density estimators can be divided into two categories: those that vary the bandwidth with the estimation point (balloon estimators) and those that vary the bandwidth with each data point (sample point estimators). For univariate balloon estimators, it can be shown that, in regions where f is convex (e.g. the tails), there exists a bandwidth such that the bias is exactly zero. Such a bandwidth leads to an MSE of $O(n^{-1})$ for points in the appropriate regions. A global implementation strategy using a local cross-validation algorithm to estimate such bandwidths is developed. The theoretical behavior of the sample point estimator is difficult to examine as the form of the bandwidth function is unknown. An approximation based on binning the data is used to study the behavior of the MISE and the optimal bandwidth function. A practical data-based procedure for determining bandwidths for the sample point estimator is developed using a spline function to estimate the unknown bandwidth function. Finally, the multivariate problem is briefly addressed by examining the shape and size of the optimal bivariate kernels suggested by Terrell and Scott (1992). Extensions of the binning and spline estimation ideas are also discussed.
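The two families of variable-bandwidth estimators contrasted above can be sketched as follows, using one-dimensional Gaussian kernels. The bandwidth function for the balloon estimator and the per-observation bandwidths for the sample-point estimator are left as user-supplied arguments, since their estimation (via local cross-validation or spline fitting in the thesis) is the substantive part of the work; the function names are illustrative.

```python
import numpy as np

def balloon_estimate(data, x, h_of_x):
    """Balloon estimator: the bandwidth h(x) varies with the estimation point x."""
    h = h_of_x(x)
    return np.mean(np.exp(-0.5 * ((x - data) / h) ** 2)) / (h * np.sqrt(2 * np.pi))

def sample_point_estimate(data, x, h_per_point):
    """Sample-point estimator: each observation carries its own bandwidth h_i."""
    kernels = np.exp(-0.5 * ((x - data) / h_per_point) ** 2) / (h_per_point * np.sqrt(2 * np.pi))
    return np.mean(kernels)
```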
APA, Harvard, Vancouver, ISO, and other styles
47

Gebert, Mark Allen. "Nonparametric density contour estimation." Thesis, 1998. http://hdl.handle.net/1911/19261.

Full text
Abstract:
Estimation of the level sets of an unknown probability density is done with no specific assumed form for that density, that is, non-parametrically. Methods for tackling this problem are presented. Earlier research showed the existence and properties of an estimator based on a kernel density estimate in one dimension. Monte Carlo methods further demonstrated the reasonableness of extending this approach to two dimensions. An alternative procedure is now considered that focuses on properties of the contour itself: we define and make use of an objective function based on the characterization of contours as boundaries of regions of minimum area subject to a constraint on enclosed probability. Restricting our attention to (possibly non-convex) polygons as candidate contours, numerical optimization of this difficult non-smooth objective function is accomplished using pdsopt (Parallel Direct Search OPTimization), a set of routines developed for minimization of a scalar-valued function over a high-dimensional domain. Motivation for this method is given, as well as results of simulations done to test it; these include an exploration of a Lagrange-multiplier penalty on area and of the resulting need for a penalty on the "roughness" of a polygonal contour.
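A common plug-in route to density contours, related to the "minimum area for a given probability" characterisation above, is to build a kernel density estimate and choose the level whose upper level set carries the required probability. The one-dimensional sketch below uses the shortcut of evaluating the KDE at the sample points and taking an empirical quantile; it is an illustrative baseline under an assumed bandwidth, not the polygonal-contour optimisation procedure of the thesis.

```python
import numpy as np

def hdr_threshold(data, p, h=0.3):
    """Plug-in density level c such that the set {x : f_hat(x) >= c} carries
    roughly probability p under the kernel estimate f_hat.  Uses the shortcut
    of evaluating f_hat at the sample points and taking an empirical quantile."""
    sq = (data[:, None] - data[None, :]) ** 2
    f_hat = np.mean(np.exp(-0.5 * sq / h ** 2), axis=1) / (h * np.sqrt(2 * np.pi))
    return float(np.quantile(f_hat, 1.0 - p))   # the estimated contour level
```

In one dimension the resulting contour is the boundary of a union of intervals where the estimated density exceeds this level; the thesis instead optimises polygonal contours directly in two dimensions.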
APA, Harvard, Vancouver, ISO, and other styles
48

莊宗霖. "An Approach on Function Estimation and Density Estimation." Thesis, 2003. http://ndltd.ncl.edu.tw/handle/92765889113139635622.

Full text
Abstract:
Master's thesis
National Chung Cheng University
Institute of Mathematical Statistics
91
A recent approach that uses an argument on expectations of random variables to estimate unknown functional values from known values of the function at various points is investigated by way of empirical simulation. The approach can also be applied to estimate probability density functions from random samples drawn from the assumed distribution. Theoretical formulations are presented to express the estimators in each case. Such estimators are more general than traditional kernel-type estimators for estimating unknown functions and probability density functions. The empirical simulations provide insight into the roles of the selected bandwidth, the size of the random sample, and the type of the underlying distribution.
APA, Harvard, Vancouver, ISO, and other styles
49

Yao, Bo-Yuan, and 姚博元. "Density Estimation by Spline Smoothing." Thesis, 2014. http://ndltd.ncl.edu.tw/handle/03542027112847534183.

Full text
APA, Harvard, Vancouver, ISO, and other styles
50

Lin, Mu. "Nonparametric density estimation via regularization." 2009. http://hdl.handle.net/10048/709.

Full text
Abstract:
Thesis (M. Sc.)--University of Alberta, 2009.
Title from pdf file main screen (viewed on Dec. 11, 2009). "A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Master of Science in Statistics, Department of Mathematical and Statistical Sciences, University of Alberta." Includes bibliographical references.
APA, Harvard, Vancouver, ISO, and other styles