
Dissertations / Theses on the topic 'Data approximation'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Data approximation.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever it is available in the metadata.

Browse dissertations / theses from a wide variety of disciplines and organise your bibliography correctly.

1

Ross, Colin. "Applications of data fusion in data approximation." Thesis, University of Huddersfield, 2002. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.247372.

Full text
APA, Harvard, Vancouver, ISO, and other styles
2

Deligiannakis, Antonios. "Accurate data approximation in constrained environments." College Park, Md. : University of Maryland, 2005. http://hdl.handle.net/1903/2681.

Full text
Abstract:
Thesis (Ph. D.) -- University of Maryland, College Park, 2005.
Thesis research directed by: Computer Science. Title from abstract of PDF. Includes bibliographical references. Published by UMI Dissertation Services, Ann Arbor, Mich. Also available in paper.
APA, Harvard, Vancouver, ISO, and other styles
3

Tomek, Peter. "Approximation of Terrain Data Utilizing Splines." Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2012. http://www.nusl.cz/ntk/nusl-236488.

Full text
Abstract:
For the optimization of flight trajectories at very low altitude, terrain features must be taken into account very accurately. Fast and efficient evaluation of terrain data is therefore essential, since the time needed for the optimization must be as short as possible. Moreover, flight-trajectory optimization relies on gradient-based methods, so the function approximating the terrain data must be continuous up to a certain order of derivatives. A very promising method for terrain-data approximation is the application of multivariate simplex polynomials. The goal of this thesis is to implement a function that evaluates given terrain data at specified points, together with the gradient, by means of multivariate splines. The program should evaluate many points at once and should work in $n$-dimensional space.
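To make the evaluation requirement concrete, the following is a minimal sketch that evaluates gridded terrain heights and their gradient at many query points at once. SciPy's bicubic spline stands in for the thesis's multivariate simplex splines, and the grid and height function are invented for the example.

```python
# Minimal sketch: batch evaluation of terrain heights and gradients from
# gridded data, using a bicubic spline as a stand-in for simplex splines.
import numpy as np
from scipy.interpolate import RectBivariateSpline

x = np.linspace(0.0, 10.0, 50)                 # easting grid
y = np.linspace(0.0, 10.0, 60)                 # northing grid
X, Y = np.meshgrid(x, y, indexing="ij")
Z = np.sin(X) * np.cos(Y)                      # synthetic terrain heights

terrain = RectBivariateSpline(x, y, Z, kx=3, ky=3)   # smooth bicubic surface

qx = np.array([1.3, 4.7, 8.2])                 # many query points at once
qy = np.array([2.1, 5.5, 9.0])
height = terrain.ev(qx, qy)
dz_dx = terrain.ev(qx, qy, dx=1)               # gradient components for
dz_dy = terrain.ev(qx, qy, dy=1)               # gradient-based optimization
print(height, dz_dx, dz_dy)
```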
APA, Harvard, Vancouver, ISO, and other styles
4

Cao, Phuong Thao. "Approximation of OLAP queries on data warehouses." PhD thesis, Université Paris Sud - Paris XI, 2013. http://tel.archives-ouvertes.fr/tel-00905292.

Full text
Abstract:
We study approximate answers to OLAP queries on data warehouses. We consider relative answers to OLAP queries on a schema, viewed as distributions with the L1 distance, and approximate the answers without storing the entire data warehouse. We first introduce three specific methods: uniform sampling, measure-based sampling, and a statistical model. We also introduce an edit distance between data warehouses, with edit operations adapted to data warehouses. Then, in the OLAP data-exchange setting, we study how to sample each source and combine the samples to approximate any OLAP query. We next consider a streaming context, where a data warehouse is built from streams of different sources. We show a lower bound on the size of the memory necessary to approximate queries, and approximate OLAP queries with a finite memory in this setting. We also describe a method to discover statistical dependencies, a new notion we introduce, searching for them with decision trees. We apply the method to two data warehouses. The first simulates sensor data, providing weather parameters over time and location from different sources. The second is a collection of RSS feeds from web sites on the Internet.
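To illustrate the simplest of the three methods, here is a minimal sketch of uniform sampling for an aggregate query over a made-up fact table of (region, measure) rows; scaling each sampled row by the inverse sampling rate is the entire idea, and the thesis's measure-based sampling and statistical model are not reproduced.

```python
# Minimal sketch: approximate a GROUP-BY SUM from a 1% uniform sample.
import random
from collections import defaultdict

random.seed(0)
fact_table = [(random.choice("NESW"), random.random()) for _ in range(100_000)]

rate = 0.01                                   # keep roughly 1% of the rows
sample = [row for row in fact_table if random.random() < rate]

approx = defaultdict(float)
for region, measure in sample:
    approx[region] += measure / rate          # scale up by the sampling rate

exact = defaultdict(float)
for region, measure in fact_table:
    exact[region] += measure

for region in sorted(exact):
    print(region, round(exact[region]), round(approx[region]))
```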
APA, Harvard, Vancouver, ISO, and other styles
5

Lehman, Eric (Eric Allen) 1970. "Approximation algorithms for grammar-based data compression." Thesis, Massachusetts Institute of Technology, 2002. http://hdl.handle.net/1721.1/87172.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002.
Includes bibliographical references (p. 109-113).
This thesis considers the smallest grammar problem: find the smallest context-free grammar that generates exactly one given string. We show that this problem is intractable, and so our objective is to find approximation algorithms. This simple question is connected to many areas of research. Most importantly, there is a link to data compression; instead of storing a long string, one can store a small grammar that generates it. A small grammar for a string also naturally brings out underlying patterns, a fact that is useful, for example, in DNA analysis. Moreover, the size of the smallest context-free grammar generating a string can be regarded as a computable relaxation of Kolmogorov complexity. Finally, work on the smallest grammar problem qualitatively extends the study of approximation algorithms to hierarchically-structured objects. In this thesis, we establish hardness results, evaluate several previously proposed algorithms, and then present new procedures with much stronger approximation guarantees.
by Eric Lehman.
Ph.D.
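For a flavor of the compression link described in the abstract, here is a minimal RePair-style sketch: repeatedly replace the most frequent adjacent symbol pair with a fresh nonterminal. This is a well-known baseline for grammar-based compression, not one of the thesis's approximation algorithms, and the input string is arbitrary.

```python
# Minimal sketch of grammar-based compression in the RePair style.
from collections import Counter

def repair(s):
    seq, rules, next_id = list(s), {}, 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:                          # no pair worth a new rule
            break
        nt = f"R{next_id}"
        next_id += 1
        rules[nt] = pair                       # new rule: nt -> pair
        out, i = [], 0
        while i < len(seq):                    # left-to-right replacement
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

start, rules = repair("abracadabra abracadabra")
print(start)    # compressed start production
print(rules)    # one rule per introduced nonterminal
```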
APA, Harvard, Vancouver, ISO, and other styles
6

Hou, Jun. "Function Approximation and Classification with Perturbed Data." The Ohio State University, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=osu1618266875924225.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Zaman, Muhammad Adib Uz. "Bicubic L1 Spline Fits for 3D Data Approximation." Thesis, Northern Illinois University, 2018. http://pqdtopen.proquest.com/#viewpdf?dispub=10751900.

Full text
Abstract:

Univariate cubic L1 spline fits have been successful in preserving the shapes of 2D data with abrupt changes. The reason is that the L1 norm of the fitting error is minimized, as opposed to the L2 norm. While univariate L1 spline fits for 2D data have been discussed by many, bivariate L1 spline fits for 3D data are yet to be fully explored. This thesis aims to develop bicubic L1 spline fits for 3D data approximation. This can be achieved by solving a bi-level optimization problem: one level is bivariate cubic spline interpolation and the other is L1 error minimization. In the first level, a bicubic interpolated spline surface is constructed on a rectangular grid, with the necessary first- and second-order derivative values estimated by a 5-point window algorithm for univariate L1 interpolation. In the second level, the absolute error (i.e., the L1 norm) is minimized using an iterative gradient search. This study may be extended to research on higher-dimensional cubic L1 spline fits.
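The shape-preservation claim is easy to see in one dimension. The sketch below fits a cubic polynomial (standing in for the thesis's bicubic splines) to step-like data containing one gross outlier, under both losses; the data are invented for the illustration.

```python
# Minimal sketch: L1 vs. L2 fitting of step-like data with one outlier.
import numpy as np
from scipy.optimize import minimize

x = np.linspace(0.0, 1.0, 40)
clean = np.where(x < 0.5, 0.0, 1.0)        # data with an abrupt change
y = clean.copy()
y[10] += 5.0                               # one gross outlier

A = np.vander(x, 4)                        # cubic design matrix

c_l2 = np.linalg.lstsq(A, y, rcond=None)[0]         # least squares
c_l1 = minimize(lambda c: np.abs(A @ c - y).sum(),  # least absolute error
                c_l2, method="Nelder-Mead").x

print("L2 fit, mean error vs clean data:", np.abs(A @ c_l2 - clean).mean())
print("L1 fit, mean error vs clean data:", np.abs(A @ c_l1 - clean).mean())
```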

APA, Harvard, Vancouver, ISO, and other styles
8

Cooper, Philip. "Rational approximation of discrete data with asymptotic behaviour." Thesis, University of Huddersfield, 2007. http://eprints.hud.ac.uk/id/eprint/2026/.

Full text
Abstract:
This thesis is concerned with the least-squares approximation of discrete data that appear to exhibit asymptotic behaviour. In particular, we consider using rational functions as they are able to display a number of types of asymptotic behaviour. The research is biased towards the development of simple and easily implemented algorithms that can be used for this purpose. We discuss a number of novel approximation forms, including the Semi-Infinite Rational Spline and the Asymptotic Polynomial. The Semi-Infinite Rational Spline is a piecewise rational function, continuous across a single knot, and may be defined to have different asymptotic limits at ±∞. The continuity constraints at the knot are implicit in the function definition, and it can be fitted to data without the use of constrained optimisation algorithms. The Asymptotic Polynomial is a linear combination of weighted basis functions, orthogonalised with respect to a rational weight function of nonlinear approximation parameters. We discuss an efficient and numerically stable implementation of the Gauss-Newton method that can be used to fit this function to discrete data. A number of extensions of the Loeb algorithm are discussed, including a simple modification for fitting Semi-Infinite Rational Splines, and a new hybrid algorithm that is a combination of the Loeb algorithm and the Lawson algorithm (including its Rice and Usow extension), for fitting ℓp rational approximations. In addition, we present an extension of the Rice and Usow algorithm to include ℓp approximation for values p < 2. Also discussed is an alternative representation of a polynomial ratio denominator that allows pole-free approximations to be fitted to data with the use of unconstrained optimisation methods. In all cases we present a large number of numerical applications of these methods to illustrate their usefulness.
APA, Harvard, Vancouver, ISO, and other styles
9

Schmid, Dominik. "Scattered data approximation on the rotation group and generalizations." Aachen: Shaker, 2009. http://d-nb.info/995021562/04.

Full text
APA, Harvard, Vancouver, ISO, and other styles
10

McQuarrie, Shane Alexander. "Data Assimilation in the Boussinesq Approximation for Mantle Convection." BYU ScholarsArchive, 2018. https://scholarsarchive.byu.edu/etd/6951.

Full text
Abstract:
Many highly developed physical models poorly approximate actual physical systems due to natural random noise. For example, convection in the earth's mantle—a fundamental process for understanding the geochemical makeup of the earth's crust and the geologic history of the earth—exhibits chaotic behavior, so it is difficult to model accurately. In addition, it is impossible to directly measure temperature and fluid viscosity in the mantle, and any indirect measurements are not guaranteed to be highly accurate. Over the last 50 years, mathematicians have developed a rigorous framework for reconciling noisy observations with reasonable physical models, a technique called data assimilation. We apply data assimilation to the problem of mantle convection with the infinite-Prandtl Boussinesq approximation to the Navier-Stokes equations as the model, providing rigorous conditions that guarantee synchronization between the observational system and the model. We validate these rigorous results through numerical simulations powered by a flexible new Python package, Dedalus. This methodology, including the simulation and post-processing code, may be generalized to many other systems. The numerical simulations show that the rigorous synchronization conditions are not sharp; that is, synchronization may occur even when the conditions are not met. These simulations also cast some light on the true relationships between the system parameters that are required in order to achieve synchronization. To conclude, we conduct experiments for two closely related data assimilation problems to further demonstrate the limitations of the rigorous results and to test the flexibility of data assimilation for mantle-like systems.
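The nudging idea at the heart of this kind of data assimilation can be sketched on a much smaller chaotic system. The toy below observes only the x-coordinate of a 'truth' Lorenz-63 trajectory and relaxes a model copy toward it; the Boussinesq mantle-convection setting, Dedalus, and the thesis's rigorous synchronization conditions are not reproduced, and all constants are arbitrary.

```python
# Toy sketch of data assimilation by nudging on Lorenz-63: the model copy
# sees only x from the truth and is relaxed toward it; the unobserved
# y and z synchronize as a consequence.
import numpy as np

def lorenz(s, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = s
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

dt, mu = 0.001, 50.0                       # time step, nudging strength
truth = np.array([1.0, 1.0, 1.0])
model = np.array([-5.0, 7.0, 20.0])        # badly wrong initial state

for _ in range(60_000):                    # forward Euler, 60 time units
    nudge = np.array([mu * (truth[0] - model[0]), 0.0, 0.0])
    truth = truth + dt * lorenz(truth)
    model = model + dt * (lorenz(model) + nudge)

print("state error after assimilation:", np.linalg.norm(truth - model))
```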
APA, Harvard, Vancouver, ISO, and other styles
11

Măndoiu, Ion I. "Approximation algorithms for VLSI routing." Diss., Georgia Institute of Technology, 2000. http://hdl.handle.net/1853/9128.

Full text
APA, Harvard, Vancouver, ISO, and other styles
12

Santin, Gabriele. "Approximation in kernel-based spaces, optimal subspaces and approximation of eigenfunctions." Doctoral thesis, Università degli studi di Padova, 2016. http://hdl.handle.net/11577/3424498.

Full text
Abstract:
Kernel-based approximation methods provide optimal recovery procedures in the native Hilbert spaces in which they are reproducing. Among others, kernels in the notable class of continuous and strictly positive definite kernels on compact sets possess a series decomposition in L2-orthonormal eigenfunctions of a particular integral operator. The interest in this decomposition is twofold. On one hand, the subspaces generated by eigenfunctions, or eigenbasis elements, are L2-optimal trial spaces in the sense of widths. On the other hand, this expansion is the fundamental tool of some state-of-the-art algorithms in kernel approximation. Although these reasons motivate great interest in the eigenbasis, for a given kernel this decomposition is generally completely unknown. With this in view, this thesis faces the problem of approximating the eigenbasis of general continuous and strictly positive definite kernels on general compact sets of Euclidean space, for any space dimension. We first define a new kind of optimality based on an error measurement closer to that of standard kernel interpolation. This new width is then analyzed, and we determine its value and characterize its optimal subspaces, which are spanned by the eigenbasis. Moreover, this optimality result can be scaled to particular subspaces of the native space, and this restriction allows us to prove new results on the construction of computable optimal trial spaces. This setting covers the standard case of point-based interpolation and provides algorithms to approximate the eigenbasis by means of standard kernel techniques. On the basis of the new theoretical results, asymptotic estimates on the convergence of the method are proven. These computations are translated into effective algorithms, and we test their behavior in the approximation of the eigenspaces. Moreover, two applications of kernel-based methods are analyzed.
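Because the eigenbasis is generally unknown, a standard computable surrogate is the Nyström idea: discretize the integral operator on a point set and take the eigenpairs of the weighted kernel matrix. The sketch below does this for a Gaussian kernel on [0, 1] with a uniform grid; it illustrates the object being approximated, not the thesis's new algorithms.

```python
# Minimal Nyström-style sketch: approximate Mercer eigenvalues and
# L2-orthonormal eigenfunctions of a Gaussian kernel on [0, 1].
import numpy as np

n = 400
x = np.linspace(0.0, 1.0, n)
w = 1.0 / n                                          # quadrature weight
K = np.exp(-((x[:, None] - x[None, :]) ** 2) / 0.1)  # kernel matrix

vals, vecs = np.linalg.eigh(w * K)                   # discrete integral operator
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

eigenfunctions = vecs / np.sqrt(w)                   # L2-normalized on [0, 1]
print("leading eigenvalues:", vals[:5])
```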
APA, Harvard, Vancouver, ISO, and other styles
13

Koufogiannakis, Christos. "Approximation algorithms for covering problems." Diss., [Riverside, Calif.] : University of California, Riverside, 2009. http://proquest.umi.com/pqdweb?index=0&did=1957320821&SrchMode=2&sid=1&Fmt=2&VInst=PROD&VType=PQD&RQT=309&VName=PQD&TS=1268338860&clientId=48051.

Full text
Abstract:
Thesis (Ph. D.)--University of California, Riverside, 2009.
Includes abstract. Title from first page of PDF file (viewed March 11, 2010). Available via ProQuest Digital Dissertations. Includes bibliographical references (p. 70-77). Also issued in print.
APA, Harvard, Vancouver, ISO, and other styles
14

Wiley, David F. "Approximation and visualization of scientific data using higher-order elements /." For electronic version search Digital dissertations database. Restricted to UC campuses. Access is free to UC campus dissertations, 2003. http://uclibs.org/PID/11984.

Full text
APA, Harvard, Vancouver, ISO, and other styles
15

Grishin, Denis. "Fast and efficient methods for multi-dimensional scattered data approximation /." For electronic version search Digital dissertations database. Restricted to UC campuses. Access is free to UC campus dissertations, 2004. http://uclibs.org/PID/11984.

Full text
APA, Harvard, Vancouver, ISO, and other styles
16

Fung, Ping-yuen, and 馮秉遠. "Approximation for minimum triangulations of convex polyhedra." Thesis, The University of Hong Kong (Pokfulam, Hong Kong), 2001. http://hub.hku.hk/bib/B29809964.

Full text
APA, Harvard, Vancouver, ISO, and other styles
17

Lewis, Cannada Andrew. "The Unreasonable Usefulness of Approximation by Linear Combination." Diss., Virginia Tech, 2018. http://hdl.handle.net/10919/83866.

Full text
Abstract:
Through the exploitation of data-sparsity (a catch-all term for savings gained from a variety of approximations) it is possible to reduce the computational cost of accurate electronic structure calculations to linear, meaning that the total time to solution grows at the same rate as the number of particles that are correlated. Multiple techniques for exploiting data-sparsity are discussed, with a focus on those that can be systematically improved by tightening numerical parameters, such that as the parameter approaches zero the approximation becomes exact. These techniques are first applied to Hartree-Fock theory, and then we attempt to design a linear-scaling, massively parallel electron correlation strategy based on second-order perturbation theory.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
18

Thomas, A. "Data structures, methods of approximation and optimal computation for pedigree analysis." Thesis, University of Cambridge, 1985. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.372922.

Full text
APA, Harvard, Vancouver, ISO, and other styles
19

Turner, David Andrew. "The approximation of Cartesian coordinate data by parametric orthogonal distance regression." Thesis, University of Huddersfield, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.323778.

Full text
Abstract:
This thesis is concerned with the approximation of Cartesian coordinate data by parametric curves and surfaces, with an emphasis upon a technique known as parametric orthogonal distance regression (parametric ODR). The technique has become increasingly popular in the literature over the past decade and has applications in a wide range of fields, including metrology (the science of measurement) and computer-aided design (CAD) modelling. Typically, the data are obtained by recording points measured on the surface of some physical artefact, such as a manufactured part. Parametric ODR involves minimizing the shortest distances from the data to the curve or surface in some norm. Under moderate assumptions, these shortest distances are orthogonal projections from the data onto the approximant, hence the nomenclature ODR. The motivation behind this type of approximation is that, by using a distance-based measure, the resulting best-fit curve or surface is independent of the position or orientation of the physical artefact from which the data are obtained. The thesis predominantly concerns itself with parametric ODR in a least-squares setting, although it is indicated how the techniques described can be extended to other error measures in a fairly straightforward manner. The parametric ODR problem is formulated mathematically, and a detailed survey of the existing algorithms for solving it is given. These algorithms are then used as the basis for developing new techniques, with an emphasis placed upon their efficiency and reliability. The algorithms (old and new) detailed in this thesis are illustrated by problems involving well-known geometric elements such as lines, circles, ellipses and ellipsoids, as well as spline curves and surfaces. Numerical considerations specific to these individual elements, including ones not previously reported in the literature, are addressed. We also consider a sub-problem of parametric ODR known as template matching, which involves mapping in an optimal way a set of data into the same frame of reference as a fixed curve or surface.
APA, Harvard, Vancouver, ISO, and other styles
20

Schmid, Dominik [Verfasser]. "Scattered Data Approximation on the Rotation Group and Generalizations / Dominik Schmid." Aachen : Shaker, 2009. http://d-nb.info/1161303006/34.

Full text
APA, Harvard, Vancouver, ISO, and other styles
21

Grimm, Alexander Rudolf. "Parametric Dynamical Systems: Transient Analysis and Data Driven Modeling." Diss., Virginia Tech, 2018. http://hdl.handle.net/10919/83840.

Full text
Abstract:
Dynamical systems are a commonly used and studied tool for simulation, optimization and design. In many applications such as inverse problems, optimal control, shape optimization and uncertainty quantification, these systems typically depend on a parameter. The need for high fidelity in the modeling stage leads to large-scale parametric dynamical systems. Since these models need to be simulated for a variety of parameter values, the computational burden they incur becomes increasingly difficult. To address these issues, parametric reduced models have enjoyed increased popularity in recent years. We are interested in constructing parametric reduced models that represent the full-order system accurately over a range of parameters. First, we define a global joint error measure in the frequency and parameter domain to assess the accuracy of the reduced model. Then, by assuming a rational form for the reduced model with poles both in the frequency and parameter domain, we derive necessary conditions for an optimal parametric reduced model in this joint error measure. Similar to the nonparametric case, Hermite interpolation conditions at the reflected images of the poles characterize the optimal parametric approximant. This result extends the well-known interpolatory H2 optimality conditions by Meier and Luenberger to the parametric case. We also develop a numerical algorithm to construct locally optimal reduced models. The theory and algorithm are data-driven, in the sense that only function evaluations of the parametric transfer function are required, not access to the internal dynamics of the full model. While this first framework operates on the continuous function level, assuming repeated transfer function evaluations are available, in some cases merely frequency samples might be given without an option to re-evaluate the transfer function at desired points; in other words, the function samples in parameter and frequency are fixed. In this case, we construct a parametric reduced model that minimizes a discretized least-squares error on the finite set of measurements. Towards this goal, we extend Vector Fitting (VF) to the parametric case, solving a global least-squares problem in both frequency and parameter. The output of this approach might be a reduced model of moderate size; in this case, we perform a post-processing step that further reduces the output of the parametric VF approach using H2-optimal model reduction for a special parametrization. The final model inherits the parametric dependence of the intermediate model but is of smaller order. A special case of a parameter in a dynamical system is a delay in the model equation, e.g., arising from a feedback loop, reaction time, delayed response and various other physical phenomena. Modeling such a delay comes with several challenges for the mathematical formulation, analysis, and solution. We address the issue of transient behavior for scalar delay equations. Besides the choice of an appropriate measure, we analyze the impact of the coefficients of the delay equation on the finite-time growth, which can be arbitrarily large purely through the influence of the delay.
Ph. D.
APA, Harvard, Vancouver, ISO, and other styles
22

Swingler, Kevin. "Mixed order hyper-networks for function approximation and optimisation." Thesis, University of Stirling, 2016. http://hdl.handle.net/1893/25349.

Full text
Abstract:
Many systems take inputs, which can be measured and sometimes controlled, and outputs, which can also be measured and which depend on the inputs. Taking numerous measurements from such systems produces data, which may be used either to model the system with the goal of predicting the output associated with a given input (function approximation, or regression) or to find the input settings required to produce a desired output (optimisation, or search). Approximating or optimising a function is central to the field of computational intelligence. There are many existing methods for performing regression and optimisation based on samples of data, but they all have limitations. Multi-layer perceptrons (MLPs) are universal approximators, but they suffer from the black box problem, which means their structure and the function they implement are opaque to the user. They also suffer from a propensity to become trapped in local minima or large plateaux in the error function during learning. A regression method with a structure that allows models to be compared, human knowledge to be extracted, optimisation searches to be guided and model complexity to be controlled is desirable. This thesis presents such a method: a single framework for both regression and optimisation, the mixed order hyper-network (MOHN). A MOHN implements a function f:{-1,1}^n → R to arbitrary precision. The structure of a MOHN makes explicit the ways in which input variables interact to determine the function output, which allows human insight and complexity control that are very difficult in neural networks with hidden units. The explicit structure representation also allows efficient algorithms for searching for an input pattern that leads to a desired output. A number of learning rules for estimating the weights from a sample of data are presented, along with a heuristic method for choosing which connections to include in a model. Several methods for searching a MOHN for inputs that lead to a desired output are compared. Experiments compare a MOHN to an MLP on regression tasks. The MOHN is found to achieve a comparable level of accuracy to an MLP but suffers less from local minima in the error function and shows less variance across multiple training trials. It is also easier to interpret and to combine into an ensemble. The trade-off between the fit of a model to its training data and to an independent set of test data is shown to be easier to control in a MOHN than in an MLP. A MOHN is also compared to a number of existing optimisation methods, including those using estimation of distribution algorithms, genetic algorithms and simulated annealing. The MOHN is able to find optimal solutions in far fewer function evaluations than these methods on tasks selected from the literature.
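Concretely, a MOHN of the kind described is a linear model over products of {-1, 1} inputs, so its weights can be estimated by ordinary least squares. A minimal sketch up to second-order interactions, with an invented target function, is:

```python
# Minimal sketch: fit a mixed-order model f:{-1,1}^n -> R by least squares,
# with bias, first-order terms and all pairwise products as features.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n, m = 6, 500
X = rng.choice([-1.0, 1.0], size=(m, n))

def target(x):                              # unknown function to recover
    return 2.0 * x[0] - x[3] + 1.5 * x[1] * x[4]

y = np.apply_along_axis(target, 1, X)

pairs = list(combinations(range(n), 2))
Phi = np.hstack([np.ones((m, 1)), X,
                 np.column_stack([X[:, i] * X[:, j] for i, j in pairs])])

w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
# The recovered weights name the interacting variables explicitly:
print("x1*x4 interaction weight:", w[1 + n + pairs.index((1, 4))])  # ~1.5
```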
APA, Harvard, Vancouver, ISO, and other styles
23

Kotsakis, Christophoros. "Multiresolution aspects of linear approximation methods in Hilbert spaces using gridded data." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape4/PQDD_0016/NQ54794.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Fu, Shuting. "Bayesian Logistic Regression Model with Integrated Multivariate Normal Approximation for Big Data." Digital WPI, 2016. https://digitalcommons.wpi.edu/etd-theses/451.

Full text
Abstract:
The analysis of big data is of great interest today, and this comes with challenges of improving precision and efficiency in estimation and prediction. We study binary data with covariates from numerous small areas, where direct estimation is not reliable, and there is a need to borrow strength from the ensemble. This is generally done using Bayesian logistic regression, but because there are numerous small areas, the exact computation for the logistic regression model becomes challenging. Therefore, we develop an integrated multivariate normal approximation (IMNA) method for binary data with covariates within the Bayesian paradigm, and this procedure is assisted by the empirical logistic transform. Our main goal is to provide the theory of IMNA and to show that it is many times faster than the exact logistic regression method with almost the same accuracy. We apply the IMNA method to the health status binary data (excellent health or otherwise) from the Nepal Living Standards Survey with more than 60,000 households (small areas). We estimate the proportion of Nepalese in excellent health condition for each household. For these data IMNA gives estimates of the household proportions as precise as those from the logistic regression model and it is more than fifty times faster (20 seconds versus 1,066 seconds), and clearly this gain is transferable to bigger data problems.
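The empirical logistic transform that assists the IMNA procedure maps binomial counts to approximately normal quantities with known variances, after which normal-theory computation is cheap. The sketch below applies it to simulated per-area counts and pools them with precision weights; this is a simplification for illustration, not the thesis's full hierarchical model, and all data are synthetic.

```python
# Minimal sketch: empirical logistic transform of binomial counts and a
# precision-weighted pooled estimate on the logit scale.
import numpy as np

rng = np.random.default_rng(1)
n = rng.integers(5, 30, size=1000)            # area (household) sizes
y = rng.binomial(n, 0.3)                      # per-area success counts

z = np.log((y + 0.5) / (n - y + 0.5))         # empirical logistic transform
v = 1.0 / (y + 0.5) + 1.0 / (n - y + 0.5)     # its approximate variance

logit_hat = np.sum(z / v) / np.sum(1.0 / v)   # precision-weighted pooling
print("pooled proportion:", 1.0 / (1.0 + np.exp(-logit_hat)))   # ~0.3
```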
APA, Harvard, Vancouver, ISO, and other styles
25

Pötzelberger, Klaus, and Klaus Felsenstein. "On the Fisher Information of Discretized Data." Department of Statistics and Mathematics, WU Vienna University of Economics and Business, 1991. http://epub.wu.ac.at/1700/1/document.pdf.

Full text
Abstract:
In this paper we study the loss of Fisher information in approximating a continuous distribution by a multinomial distribution coming from a partition of the sample space into a finite number of intervals. We describe and characterize the Fisher information as a function of the partition chosen, especially for location parameters. For a small number of intervals, the consequences of the choice are demonstrated by instructive examples. For an increasing number of individuals we give the asymptotically optimal partition. (author's abstract)
Series: Forschungsberichte / Institut für Statistik
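The quantity studied above has a short worked form: for cell probabilities p_i(theta), the Fisher information of the discretized data is the sum over cells of p_i'(theta)^2 / p_i(theta). The sketch below computes it for a normal location parameter under equiprobable partitions; the full-data information is 1, and the two-cell value is the classical 2/pi, roughly 0.637.

```python
# Worked sketch: Fisher information for a normal location parameter after
# partitioning the sample space into cells, versus the full-data value 1.
import numpy as np
from scipy.stats import norm

def partition_information(cuts, theta=0.0):
    edges = np.concatenate(([-np.inf], cuts, [np.inf]))
    p = norm.cdf(edges[1:] - theta) - norm.cdf(edges[:-1] - theta)
    dp = -(norm.pdf(edges[1:] - theta) - norm.pdf(edges[:-1] - theta))
    return np.sum(dp**2 / p)

for k in (1, 3, 7, 15):                       # interior cut points
    cuts = norm.ppf(np.arange(1, k + 1) / (k + 1))   # equiprobable cells
    print(f"{k + 1:2d} cells: I =", partition_information(cuts))
```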
APA, Harvard, Vancouver, ISO, and other styles
26

Lee, Dong-Wook. "Extracting multiple frequencies from phase-only data." Diss., Georgia Institute of Technology, 1992. http://hdl.handle.net/1853/15031.

Full text
APA, Harvard, Vancouver, ISO, and other styles
27

Hakimi, Sibooni J. "Application of extreme value theory." Thesis, University of Bradford, 1988. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.384263.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Furuhashi, Takeshi, Tomohiro Yoshikawa, Kanta Tachibana, and Minh Tuan Pham. "A Clustering Method for Geometric Data based on Approximation using Conformal Geometric Algebra." IEEE, 2011. http://hdl.handle.net/2237/20706.

Full text
APA, Harvard, Vancouver, ISO, and other styles
29

Qin, Hanzhang S. M. Massachusetts Institute of Technology. "Near-optimal data-driven approximation schemes for joint pricing and inventory control models." Thesis, Massachusetts Institute of Technology, 2018. http://hdl.handle.net/1721.1/119336.

Full text
Abstract:
Thesis: S.M. in Transportation, Massachusetts Institute of Technology, Department of Civil and Environmental Engineering, 2018.
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 95-96).
The thesis studies the classical multi-period joint pricing and inventory control problem in a data-driven setting. In the problem, a retailer makes periodic decisions on the prices and inventory levels of an item that the retailer wishes to sell. The objective is to match the inventory level with a random demand, which depends on the price in each period, while maximizing the expected profit over a finite horizon. In reality, the demand functions or the distribution of the random noise are usually unavailable, whereas past demand data are relatively easy to collect. A novel data-driven nonparametric algorithm is proposed, which uses the past demand data to solve the joint pricing and inventory control problem without assuming that the parameters of the demand functions and the noise distributions are known. Explicit sample complexity bounds are given on the number of data samples needed to guarantee a near-optimal profit. A simulation study suggests that the algorithm is efficient in practice.
by Hanzhang Qin.
S.M. in Transportation
S.M.
APA, Harvard, Vancouver, ISO, and other styles
30

Bingham, Jonathan D. "Comparison of Data Collection and Methods For the Approximation of Streambed Thermal Properties." DigitalCommons@USU, 2009. https://digitalcommons.usu.edu/etd/456.

Full text
Abstract:
When approximating heat transfer through a streambed, an understanding of the thermal properties of the sediments (e.g., thermal conductivity, specific heat capacity, and density) is essential. Even though considerable research has been completed in this field, little has been done to establish appropriate standard data collection approaches or to compare modeling methods for approximating these properties. Three mixture models were selected for comparison against each other and against a bed conduction model (SEDMOD). Typical data collection approaches were implemented for use in the mixture models, while numerous data collection approaches were employed for use within SEDMOD. Sediment samples were taken from the streambed to estimate the necessary parameters for the mixture models (e.g., sediment volume, density, porosity) and to identify the minerals present. To yield more accurate estimates of the thermal properties from SEDMOD, methods of obtaining sediment temperature profiles representing the influence of conduction only were developed through the use of a steel cylinder and different capping materials (e.g., geo-fabric or aluminum). In comparison to laboratory measurements of the thermal properties, it was found that the mixture model that provided the best estimates was a volume-weighted average. The method that best isolated conductive heating from advective heating was the steel cylinder with an aluminum cap; using these data to calibrate SEDMOD yielded thermal diffusivity values most similar to the laboratory measurements. Due to its ability both to estimate thermal diffusivity and to reproduce sediment temperature profiles, SEDMOD is recommended in combination with the aluminum isolation technique.
APA, Harvard, Vancouver, ISO, and other styles
31

Kim, Jung Hoon. "Performance Analysis and Sampled-Data Controller Synthesis for Bounded Persistent Disturbances." 京都大学 (Kyoto University), 2015. http://hdl.handle.net/2433/199317.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Wang, Hongyan. "Analysis of statistical learning algorithms in data dependent function spaces /." access full-text access abstract and table of contents, 2009. http://libweb.cityu.edu.hk/cgi-bin/ezdb/thesis.pl?phd-ma-b23750534f.pdf.

Full text
Abstract:
Thesis (Ph.D.)--City University of Hong Kong, 2009.
"Submitted to Department of Mathematics in partial fulfillment of the requirements for the degree of Doctor of Philosophy." Includes bibliographical references (leaves [87]-100)
APA, Harvard, Vancouver, ISO, and other styles
33

Mehl, Craig. "Developing a sorting code for Coulomb excitation data analysis." University of the Western Cape, 2015. http://hdl.handle.net/11394/4871.

Full text
Abstract:
Magister Scientiae - MSc
This thesis aims at developing a sorting code for Coulomb excitation studies at iThemba LABS. In Coulomb excitation reactions, the inelastic scattering of the projectile transfers energy to the partner nucleus (and vice versa) through a time-dependent electromagnetic field. At energies well below the Coulomb barrier, the particles interact solely through the well-known electromagnetic interaction, thereby excluding nuclear excitations from the process. The data can therefore be analyzed using a semiclassical approximation. The sorting code was used to process and analyze data acquired from the Coulomb excitation of 20Ne beams at 73 and 96 MeV onto a 194Pt target. The detection of gamma rays was done using the AFRODITE HPGe clover detector array, which consists of nine clover detectors, in coincidence with the 20Ne particles detected with an S3 double-sided silicon detector. The new sorting code includes Doppler-correction effects, charge sharing, energy and time conditions, kinematics and stopping powers, among others, and can be used for any particle-γ coincidence measurements at iThemba LABS. Results from other Coulomb excitation measurements at iThemba LABS will also be presented.
APA, Harvard, Vancouver, ISO, and other styles
34

Bennell, Robert Paul. "Continuous approximation methods for data smoothing and Fredholm integral equations of the first kind when the data are noisy." Thesis, Cranfield University, 1996. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.296023.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

Levy, Eythan. "Approximation algorithms for covering problems in dense graphs." Doctoral thesis, Universite Libre de Bruxelles, 2009. http://hdl.handle.net/2013/ULB-DIPOT:oai:dipot.ulb.ac.be:2013/210359.

Full text
Abstract:
We present a set of approximation results for several covering problems in dense graphs. These results show that for several problems, classical algorithms with constant approximation ratios can be analyzed in a finer way, and provide better constant approximation ratios under some density constraints. In particular, we show that the maximal matching heuristic approximates VERTEX COVER (VC) and MINIMUM MAXIMAL MATCHING (MMM) with a constant ratio strictly smaller than 2 when the proportion of edges present in the graph (weak density) is at least 3/4, or when the normalized minimum degree (strong density) is at least 1/2. We also show that this result can be improved by a greedy algorithm which provides a constant ratio smaller than 2 when the weak density is at least 1/2. We also provide tight families of graphs for all these approximation ratios. We then looked at several algorithms from the literature for VC and SET COVER (SC). We present a unified and critical approach to the Karpinski/Zelikovsky, Imamura/Iwama and Bar-Yehuda/Kehat algorithms, identifying the general scheme underlying these algorithms.

Finally, we look at the CONNECTED VERTEX COVER (CVC) problem,for which we proposed new approximation results in dense graphs. We first analyze Carla Savage's algorithm, then a new variant of the Karpinski-Zelikovsky algorithm. Our results show that these algorithms provide the same approximation ratios for CVC as the maximal matching heuristic and the Karpinski-Zelikovsky algorithm did for VC. We provide tight examples for the ratios guaranteed by both algorithms. We also introduce a new invariant, the "price of connectivity of VC", defined as the ratio between the optimal solutions of CVC and VC, and showed a nearly tight upper bound on its value as a function of the weak density. Our last chapter discusses software aspects, and presents the use of the GRAPHEDRON software in the framework of approximation algorithms, as well as our contributions to the development of this system.
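The maximal matching heuristic analyzed above is short enough to state in full: greedily build a maximal matching and take both endpoints of every matched edge; every edge is then covered, using at most twice the optimal number of vertices. A minimal sketch with an arbitrary edge list:

```python
# Minimal sketch: the classic factor-2 vertex cover from a maximal matching.
def matching_vertex_cover(edges):
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge still uncovered:
            cover.update((u, v))                # match it, take both ends
    return cover

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
print(matching_vertex_cover(edges))             # a cover of size <= 2 * OPT
```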

Doctorat en Sciences

APA, Harvard, Vancouver, ISO, and other styles
36

Basna, Rani. "Edgeworth Expansion and Saddle Point Approximation for Discrete Data with Application to Chance Games." Thesis, Linnaeus University, School of Computer Science, Physics and Mathematics, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-8681.

Full text
Abstract:

We investigate mathematical tools, the Edgeworth series expansion and the saddle point method, which are approximation techniques that help us estimate the distribution function of the standardized mean of independent, identically distributed random variables, taking the lattice case into consideration. We then describe an important application of these mathematical tools, whereby game-developing companies can use them to reduce the time needed to satisfy standard requirements before approving a game.
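The first of these tools admits a compact worked form: the one-term Edgeworth correction to the CDF of a standardized mean is Phi(x) - phi(x) * skew * (x^2 - 1) / (6 * sqrt(n)). The sketch below checks it against Monte Carlo for exponential summands; this is an assumed example, and the lattice-case continuity correction treated in the thesis is omitted.

```python
# Minimal sketch: one-term Edgeworth expansion for the CDF of the
# standardized mean of n i.i.d. exponential(1) variables (skewness 2).
import numpy as np
from scipy.stats import norm

def edgeworth_cdf(x, skew, n):
    return norm.cdf(x) - norm.pdf(x) * skew * (x**2 - 1.0) / (6.0 * np.sqrt(n))

n, x = 20, 0.5
rng = np.random.default_rng(2)
means = rng.exponential(size=(200_000, n)).mean(axis=1)
standardized = (means - 1.0) * np.sqrt(n)       # mean 1, sd 1 per summand
print("Monte Carlo:", np.mean(standardized <= x))
print("Normal     :", norm.cdf(x))
print("Edgeworth  :", edgeworth_cdf(x, skew=2.0, n=n))
```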

APA, Harvard, Vancouver, ISO, and other styles
37

Folia, Maria Myrto. "Inference in stochastic systems with temporally aggregated data." Thesis, University of Manchester, 2017. https://www.research.manchester.ac.uk/portal/en/theses/inference-in-stochastic-systems-with-temporally-aggregated-data(17940c86-e6b3-4f7d-8a43-884bbf72b39e).html.

Full text
Abstract:
The stochasticity of cellular processes and the small number of molecules in a cell make deterministic models inappropriate for modelling chemical reactions at the single-cell level. The Chemical Master Equation (CME) is widely used to describe the evolution of biochemical reactions inside cells stochastically, but is computationally expensive. The Linear Noise Approximation (LNA) is a popular method for approximating the CME in order to carry out inference and parameter estimation in stochastic models. Data from stochastic systems are often aggregated over time. One such example is luminescence bioimaging, where a luciferase reporter gene allows us to quantify the activity of proteins inside a cell. The luminescence intensity emitted in the luciferase experiments is collected from single cells and integrated over a time period (usually 15 to 30 minutes), which is then recorded as a single data point. In this work we consider stochastic systems that we approximate using the Linear Noise Approximation (LNA). We demonstrate our method by learning the parameters of three different models from which aggregated data were simulated: an Ornstein-Uhlenbeck model, a Lotka-Volterra model and a gene transcription model. We additionally compare our approach to the existing one and find that our method outperforms it. Finally, we apply our method to microscopy data from a translation inhibition experiment.
APA, Harvard, Vancouver, ISO, and other styles
38

Karimi, Belhal. "Non-Convex Optimization for Latent Data Models : Algorithms, Analysis and Applications." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLX040/document.

Full text
Abstract:
Many problems in machine learning pertain to tackling the minimization of a possibly non-convex and non-smooth function defined on a Euclidean space. Examples include topic models, neural networks and sparse logistic regression. Optimization methods used to solve those problems have been widely studied in the literature for convex objective functions and are extensively used in practice. However, recent breakthroughs in statistical modeling, such as deep learning, coupled with an explosion of data samples, require improvements of non-convex optimization procedures for large datasets. This thesis is an attempt to address those two challenges by developing algorithms with cheaper updates, ideally independent of the number of samples, and by improving the theoretical understanding of non-convex optimization, which remains rather limited. In this manuscript, we are interested in the minimization of such objective functions for latent data models, i.e., when the data is partially observed, which includes the conventional sense of missing data but is much broader than that. In the first part, we consider the minimization of a (possibly) non-convex and non-smooth objective function using incremental and online updates. To that end, we propose several algorithms exploiting the latent structure to efficiently optimize the objective and illustrate our findings with numerous applications. In the second part, we focus on the maximization of a non-convex likelihood using the EM algorithm and its stochastic variants. We analyze several faster and cheaper algorithms and propose two new variants aimed at speeding up the convergence of the estimated parameters.
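The EM algorithm at the center of the second part has a textbook instance worth keeping in mind: a two-component Gaussian mixture, where the E-step computes responsibilities and the M-step updates parameters in closed form. A minimal sketch on synthetic data (plain batch EM, not the thesis's incremental or stochastic variants):

```python
# Minimal EM sketch for a two-component Gaussian mixture with unit variances.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(2.0, 1.0, 700)])

pi, mu1, mu2 = 0.5, -1.0, 1.0                 # initial parameters
for _ in range(100):
    # E-step: posterior responsibility of component 1 for each point.
    a = pi * norm.pdf(x, mu1, 1.0)
    b = (1.0 - pi) * norm.pdf(x, mu2, 1.0)
    r = a / (a + b)
    # M-step: closed-form updates given the responsibilities.
    pi = r.mean()
    mu1 = np.sum(r * x) / np.sum(r)
    mu2 = np.sum((1.0 - r) * x) / np.sum(1.0 - r)

print(round(pi, 3), round(mu1, 3), round(mu2, 3))   # ~0.3, ~-2.0, ~2.0
```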
APA, Harvard, Vancouver, ISO, and other styles
39

Choudhury, Salimur Rashid, and University of Lethbridge Faculty of Arts and Science. "Approximation algorithms for a graph-cut problem with applications to a clustering problem in bioinformatics." Thesis, Lethbridge, Alta. : University of Lethbridge, Deptartment of Mathematics and Computer Science, 2008, 2008. http://hdl.handle.net/10133/774.

Full text
Abstract:
Clusters in protein interaction networks can potentially help identify functional relationships among proteins. We study the clustering problem by modeling it as graph-cut problems: given an edge-weighted graph, the goal is to partition the graph into a prescribed number of subsets, obeying some capacity constraints, so as to maximize the total weight of the edges that lie within a subset. Identification of a dense subset might shed some light on the biological function of all the proteins in the subset. We study integer programming formulations and exhibit large integrality gaps for various formulations. This is indicative of the difficulty of obtaining constant-factor approximation algorithms using the primal-dual schema. We propose three approximation algorithms for the problem. We evaluate the algorithms on the Database of Interacting Proteins and on randomly generated graphs. Our experiments show that the algorithms are fast and have good performance ratios in practice.
xiii, 71 leaves : ill. ; 29 cm.
APA, Harvard, Vancouver, ISO, and other styles
40

Lundberg, Oscar, Oskar Bjersing, and Martin Eriksson. "Approximation of ab initio potentials of carbon nanomaterials with machine learning." Thesis, Luleå tekniska universitet, Institutionen för teknikvetenskap och matematik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:ltu:diva-62568.

Full text
Abstract:
In this work, potentials of carbon nanomaterials calculated with Density Functional Theory (DFT) are approximated using an Artificial Neural Network (ANN). Previous work in this field has focused on estimating potential energies of bulk structures. We investigate the possibility of approximating both the potential energies and the forces of periodic carbon nanotubes (CNTs) and fullerenes. The results indicate that for test structures similar to those in the training set, the ANN approximates the energies to within 270 meV/atom (< 3.7% error, RMSE 40 meV/atom) and the forces to within 7.5 eV/Å (< 73% error, RMSE 1.34 eV/Å) per atom compared with DFT calculations. Furthermore, we investigate how well the ANN approximates the potentials and forces of structures that are combinations of CNTs and fullerenes (capped CNTs) and find that the ANN generalizes the potential energies to within 100 meV/atom (< 1.1% error, RMSE 78 meV/atom) and the forces to within 6 eV/Å (< 60% error, RMSE 0.55 eV/Å) per atom. The ANN-approximated potentials and forces are used to geometry-optimize CNTs, and we observe that the optimized periodic CNTs match DFT-calculated structures and energies, while the capped CNTs result in comparable energies but incorrect structures compared to DFT calculations. For geometry optimizations performed with the ANN on CNTs, the errors lie within 170 meV/atom (< 1.8% error) with an RMSE of 20 meV/atom; for the capped CNTs, the errors are within 430 meV/atom (< 5.5% error) with an RMSE of 14 meV/atom. All results are compared with empirical potentials (ReaxFF), and we find that the ANN-approximated potentials are more accurate than the best tested empirical potential. This work shows that machine learning may be used to approximate DFT calculations; however, for further applications our conclusion is that the error of the estimated forces must be reduced further. Finally, we investigate the computing time (number of core hours) required and find that the ANN is about two orders of magnitude faster than DFT and three to four orders of magnitude slower than ReaxFF. For unseen data, the ANN is still around two orders of magnitude quicker than DFT, but around four orders of magnitude slower than ReaxFF.

Supervisors: Daniel Hedman and Fredrik Sandin


F7042T - Project in Engineering Physics
APA, Harvard, Vancouver, ISO, and other styles
41

Eremic, John C. "Iterative methods for estimation of 2-D AR parameters using a data-adaptive Toeplitz approximation algorithm." Thesis, Monterey, California. Naval Postgraduate School, 1991. http://hdl.handle.net/10945/28321.

Full text
APA, Harvard, Vancouver, ISO, and other styles
42

Agarwal, Khushbu. "A partition based approach to approximate tree mining : a memory hierarchy perspective." The Ohio State University, 2007. http://rave.ohiolink.edu/etdc/view?acc_num=osu1196284256.

Full text
APA, Harvard, Vancouver, ISO, and other styles
43

Singhal, Kritika. "Geometric Methods for Simplification and Comparison of Data Sets." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu1587253879303425.

Full text
APA, Harvard, Vancouver, ISO, and other styles
44

Essegbey, John W. "Piece-wise Linear Approximation for Improved Detection in Structural Health Monitoring." University of Cincinnati / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1342729241.

Full text
APA, Harvard, Vancouver, ISO, and other styles
45

Duan, Xiuwen. "Revisiting Empirical Bayes Methods and Applications to Special Types of Data." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42340.

Full text
Abstract:
Empirical Bayes methods have been around for a long time and have a wide range of applications. These methods provide a way in which historical data can be aggregated to provide estimates of the posterior mean. This thesis revisits some of the empirical Bayes methods and develops new applications. We first look at a linear empirical Bayes estimator and apply it to ranking and symbolic data. Next, we consider Tweedie's formula and show how it can be applied to analyze a microarray dataset. The application of the formula is simplified with the Pearson system of distributions. Saddlepoint approximations enable us to generalize several results in this direction. The results show that the proposed methods perform well in applications to real data sets.
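Tweedie's formula itself fits in a few lines: for z ~ N(theta, sigma^2), the posterior mean is E[theta | z] = z + sigma^2 * d/dz log f(z), where f is the marginal density of the observations. The sketch below estimates f with a Gaussian kernel density estimate on synthetic data, rather than the Pearson-system approach taken in the thesis.

```python
# Minimal sketch of Tweedie's formula with a KDE-estimated marginal score.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(4)
theta = rng.normal(0.0, 2.0, 5000)            # unknown effects
z = theta + rng.normal(0.0, 1.0, 5000)        # observations, sigma = 1

f = gaussian_kde(z)
eps = 1e-3                                    # central difference for d/dz log f
score = (np.log(f(z + eps)) - np.log(f(z - eps))) / (2.0 * eps)
theta_hat = z + 1.0**2 * score                # Tweedie posterior-mean estimate

print("raw MSE    :", np.mean((z - theta) ** 2))          # ~1.0
print("Tweedie MSE:", np.mean((theta_hat - theta) ** 2))  # noticeably smaller
```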
APA, Harvard, Vancouver, ISO, and other styles
46

Qin, Xiao. "A Data-Driven Approach for System Approximation and Set Point Optimization, with a Focus in HVAC Systems." Diss., The University of Arizona, 2014. http://hdl.handle.net/10150/318828.

Full text
Abstract:
Dynamically determining input signals to a complex system, to increase performance and/or reduce cost, is a difficult task unless users are provided with feedback on the consequences of different input decisions. For example, users self-determine the set point schedule (i.e., temperature thresholds) of their HVAC system without any ability to predict cost: they select only comfort. Users are unable to optimize the set point schedule with respect to cost because cost feedback is provided at billing-cycle intervals. To provide rapid feedback (such as expected monthly or daily cost), mechanisms for system monitoring, data-driven modeling, simulation, and optimization are needed. Techniques from the literature require in-depth domain knowledge and/or significant investment in infrastructure or equipment to measure state variables, making these solutions difficult to implement or to scale down in cost. This work introduces methods to approximate the prediction and optimization of complex system behavior, based on dynamic data obtained from inexpensive sensors. Unlike many existing approaches, we do not extract an exact model that captures every detail of the system; rather, we develop an approximate model with the key predictive characteristics. Such a model makes estimation and prediction available to users, who can then make informed decisions; alternatively, these estimates can serve as input to an optimization tool that automatically provides Pareto-optimized set points. Moreover, the approximate nature of this model makes the determination of the prediction and optimization parameters computationally inexpensive, adaptive to system or environment change, and suitable for embedded system implementation. The effectiveness of these methods is first demonstrated on an HVAC system and then extended to a variety of complex system applications.
APA, Harvard, Vancouver, ISO, and other styles
47

Morel, Jules. "Surface reconstruction based on forest terrestrial LiDAR data." Thesis, Aix-Marseille, 2017. http://www.theses.fr/2017AIXM0039/document.

Full text
Abstract:
In recent years, the capacity of LiDAR technology to capture detailed information about forest structure has attracted increasing attention in the field of forest science. In particular, terrestrial LiDAR has emerged as a promising tool for retrieving the geometrical characteristics of trees at millimeter precision. This thesis studies the surface reconstruction problem for the scattered, unorganized point clouds captured in forested environments by a terrestrial LiDAR. We propose a sequence of algorithms dedicated to reconstructing models of forest plot attributes: the ground and the woody structure of the trees (i.e. the trunks and main branches). In practice, our approaches model the surfaces as implicit functions built from radial basis functions, in order to cope with the strong spatial heterogeneity and noise of terrestrial LiDAR point clouds.
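A minimal sketch of the radial-basis-function implicit surface idea the abstract describes, using SciPy's RBFInterpolator on synthetic points with made-up normals; the thesis pipeline for real LiDAR data is considerably more involved.

```python
# Implicit surface via RBF interpolation: on-surface points are constrained to 0
# and points offset along the normals to +/- d; the surface is the zero level set.
import numpy as np
from scipy.interpolate import RBFInterpolator

points = np.random.rand(200, 3)                 # stand-in for a LiDAR point cloud
normals = np.tile([0.0, 0.0, 1.0], (200, 1))    # stand-in for estimated normals

d = 0.01
centers = np.vstack([points, points + d * normals, points - d * normals])
values = np.concatenate([np.zeros(200), np.full(200, d), np.full(200, -d)])

f = RBFInterpolator(centers, values, kernel="thin_plate_spline")
# the reconstructed surface is the zero level set {x : f(x) = 0}
query = np.array([[0.5, 0.5, 0.5]])
print(f(query))
```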
APA, Harvard, Vancouver, ISO, and other styles
48

Morvan, Anne. "Contributions to unsupervised learning from massive high-dimensional data streams : structuring, hashing and clustering." Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLED033/document.

Full text
Abstract:
This thesis focuses on how to efficiently perform fundamental unsupervised machine learning tasks, namely the closely linked problems of nearest neighbor search and clustering, under time and space constraints on massive high-dimensional datasets. First, a new theoretical framework reduces the space cost and increases the throughput of the data-independent Cross-polytope LSH for approximate nearest neighbor search, with almost no loss of accuracy. Second, a novel streaming, data-dependent method is designed to learn compact binary codes from high-dimensional data points in only one pass. Besides some theoretical guarantees, the quality of the obtained embeddings is assessed on the approximate nearest neighbor search task. Finally, a space-efficient, parameter-free clustering algorithm is developed, based on the recovery of an approximate Minimum Spanning Tree of the sketched data dissimilarity graph, on which suitable cuts are performed.
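A minimal sketch of the MST-based clustering step described above; the thesis's parameter-free cut selection is replaced here with a fixed number of cuts, and the data are a synthetic stand-in for the sketched dissimilarity graph.

```python
# MST clustering sketch: build a minimum spanning tree of the dissimilarity graph
# and remove the longest edges so the remaining components form the clusters.
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree, connected_components
from scipy.spatial.distance import pdist, squareform

X = np.random.rand(50, 8)                       # stand-in for sketched high-dimensional data
dist = squareform(pdist(X))                     # dense dissimilarity graph

mst = minimum_spanning_tree(dist).toarray()
k = 3                                           # desired number of clusters -> k - 1 cuts
cut = np.sort(mst[mst > 0])[-(k - 1):].min()
mst[mst >= cut] = 0                             # drop the k - 1 longest MST edges

n_clusters, labels = connected_components(mst, directed=False)
print(n_clusters, labels)
```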
APA, Harvard, Vancouver, ISO, and other styles
49

Gorman, Joe, Glenn Takata, Subhash Patel, and Dan Grecu. "A Constraint-Based Approach to Predictive Maintenance Model Development." International Foundation for Telemetering, 2008. http://hdl.handle.net/10150/606187.

Full text
Abstract:
ITC/USA 2008 Conference Proceedings / The Forty-Fourth Annual International Telemetering Conference and Technical Exhibition / October 27-30, 2008 / Town and Country Resort & Convention Center, San Diego, California
Predictive maintenance is the combination of inspection and data analysis to perform maintenance when the need is indicated by unit performance. Significant cost savings are possible while preserving a high level of system performance and readiness. Identifying predictors of maintenance conditions requires expert knowledge and the ability to process large data sets. This paper describes a novel use of constraint-based data mining to model exceedance conditions. The approach extends the extract, transform, and load (ETL) process with domain aggregate approximation to encode expert knowledge. A data-mining workbench enables an expert to pose hypotheses that constrain a multivariate data-mining process.
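A toy illustration of the exceedance-flagging idea only, with hypothetical expert limits and a per-unit maximum standing in for the domain aggregate; the paper's ETL extension and data-mining workbench go well beyond this.

```python
# Exceedance flagging against expert-supplied limits; all data and limits are invented.
import pandas as pd

telemetry = pd.DataFrame({
    "unit": ["A", "A", "B", "B"],
    "vibration": [0.8, 1.9, 0.5, 2.4],
    "temperature": [70, 95, 68, 102],
})

limits = {"vibration": 1.5, "temperature": 90}  # hypothetical expert knowledge

agg = telemetry.groupby("unit").max()           # domain aggregate (here: per-unit maxima)
exceeds = (agg["vibration"] > limits["vibration"]) | (agg["temperature"] > limits["temperature"])
print(agg[exceeds])                             # units flagged for predictive maintenance
```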
APA, Harvard, Vancouver, ISO, and other styles
50

Hildebrandt, Filip, and Leonard Halling. "Identifiering av tendenser i data för prediktiv analys hos Flygresor.se." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209646.

Full text
Abstract:
With digitization, society changes faster than ever, and it is important for companies to stay up to date in order to adapt their business to a constantly evolving market. A wealth of business intelligence models exists for exactly this purpose, and predictive analytics is a central one among them. This report investigates to what extent three different predictive analytics methods are suitable for a specific assignment: producing monthly forecasts based on click data from Flygresor.se. The goal of the report is to establish which of the methods produces the most precise forecasts for the given data, and which characteristics of the data contribute to this result. We apply the predictive models Holt-Winters and ARIMA, as well as an extended linear approximation, to historical click data, describe the work process, and, based on the results, discuss the consequences of the characteristics of the Flygresor.se data.
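A minimal sketch of the two standard forecasting models the thesis compares, fitted with statsmodels on synthetic monthly counts standing in for the (non-public) Flygresor.se click data.

```python
# Holt-Winters and seasonal ARIMA forecasts on synthetic monthly data.
import numpy as np
import pandas as pd
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from statsmodels.tsa.arima.model import ARIMA

idx = pd.date_range("2014-01-01", periods=36, freq="MS")
clicks = pd.Series(1000 + 10 * np.arange(36)
                   + 100 * np.sin(np.arange(36) * 2 * np.pi / 12), index=idx)

hw = ExponentialSmoothing(clicks, trend="add", seasonal="add",
                          seasonal_periods=12).fit()
arima = ARIMA(clicks, order=(1, 1, 1), seasonal_order=(1, 0, 0, 12)).fit()

print(hw.forecast(3))      # three-month-ahead Holt-Winters forecast
print(arima.forecast(3))   # three-month-ahead ARIMA forecast
```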
APA, Harvard, Vancouver, ISO, and other styles