Dissertations / Theses on the topic 'Statistical learning theory'

Consult the top 50 dissertations / theses for your research on the topic 'Statistical learning theory.'

1

Liang, Annie. "Economic Theory and Statistical Learning." Thesis, Harvard University, 2016. http://nrs.harvard.edu/urn-3:HUL.InstRepos:33493561.

Abstract:
This dissertation presents three independent essays in microeconomic theory. Chapter 1 suggests an alternative to the common prior assumption, in which agents form beliefs by learning from data, possibly interpreting the data in different ways. In the limit as agents observe increasing quantities of data, the model returns strict solutions of a limiting complete information game, but predictions may diverge substantially for small quantities of data. Chapter 2 (with Jon Kleinberg and Sendhil Mullainathan) proposes use of machine learning algorithms to construct benchmarks for "achievable" predictive accuracy. The paper illustrates this approach for the problem of predicting human-generated random sequences. We find that leading models explain approximately 10-15% of predictable variation in the problem. Chapter 3 considers the problem of how to interpret inconsistent choice data, when the observed departures from the standard model (perfect maximization of a single preference) may emerge either from context-dependencies in preference or from stochastic choice error. I show that if preferences are "simple" in the sense that they consist only of a small number of context-dependencies, then the analyst can use a proposed optimization problem to recover the true number of underlying context-dependent preferences.
Economics
2

Deng, Xinwei. "Contributions to statistical learning and statistical quantification in nanomaterials." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/29777.

Abstract:
Thesis (Ph.D)--Industrial and Systems Engineering, Georgia Institute of Technology, 2009.
Committee Chair: Wu, C. F. Jeff; Committee Co-Chair: Yuan, Ming; Committee Member: Huo, Xiaoming; Committee Member: Vengazhiyil, Roshan Joseph; Committee Member: Wang, Zhonglin. Part of the SMARTech Electronic Thesis and Dissertation Collection.
3

Hill, S. "Applications of statistical learning theory to signal processing problems." Thesis, University of Cambridge, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.604048.

Abstract:
The dissertation focuses on the applicability of Support Vector Regression (SVR) in signal processing contexts. SVR is shown to be particularly well suited to filtering in alpha-stable noise environments, and a further slight modification is proposed to this end. The main work on SVR concerns its application to audio filtering based on perceptual criteria. SVR appears ideally suited to this problem because the loss function typically used by perceptual audio filtering practitioners incorporates a region of zero loss, as does SVR's. SVR is extended to complex-valued regression so that the audio filtering problem can be addressed in the frequency domain, with regions of zero loss that are either square or circular; the circular case is further extended to vector-valued regression. Three experiments are detailed, with a mix of good and poor results, and further refinements are proposed. Polychotomous, or multi-category, classification is then studied. Many previous approaches are reviewed and compared, and a new approach based on a geometrical structure is proposed. This is shown to overcome many of the problems identified with previous methods while being flexible and efficient to implement. The architecture is also derived, for the three-class case, using a complex-valued kernel function. The general architecture is used experimentally in three separate implementations to demonstrate the overall approach. The methodology achieves results comparable to those of many other methods and includes many of them as special cases. Further refinements are proposed which should drastically reduce optimisation times for so-called 'all-together' methods.
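The ε-insensitive loss at the heart of SVR, which the abstract likens to the zero-loss region used in perceptual audio filtering, can be sketched as follows. This is a minimal illustration with made-up numbers, not code from the thesis:

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, eps=0.1):
    """SVR loss: zero inside a tube of half-width eps, linear outside."""
    residual = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    return np.maximum(0.0, residual - eps)

# Residuals inside the eps-tube incur no loss at all, mirroring a
# perceptual masking threshold below which distortion is inaudible.
loss = epsilon_insensitive_loss([1.0, 2.0, 3.0], [1.05, 2.5, 3.0], eps=0.1)
```

Only the second prediction, which strays outside the tube, is penalized, and only by the amount it exceeds ε.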
4

Hu, Qiao Ph D. Massachusetts Institute of Technology. "Application of statistical learning theory to plankton image analysis." Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/39206.

Abstract:
Thesis (Ph. D.)--Joint Program in Applied Ocean Science and Engineering (Massachusetts Institute of Technology, Dept. of Mechanical Engineering; and the Woods Hole Oceanographic Institution), 2006.
Includes bibliographical references (leaves 155-173).
A fundamental problem in limnology and oceanography is the inability to quickly identify and map distributions of plankton. This thesis addresses the problem by applying statistical machine learning to video images collected by an optical sampler, the Video Plankton Recorder (VPR). The research is focused on development of a real-time automatic plankton recognition system to estimate plankton abundance. The system includes four major components: pattern representation/feature measurement, feature extraction/selection, classification, and abundance estimation. After an extensive study of a traditional learning vector quantization (LVQ) neural network (NN) classifier built on shape-based features and different pattern representation methods, I developed a classification system that combines multi-scale co-occurrence matrix features with a support vector machine classifier. This new method outperforms the traditional shape-based NN classifier by 12% in classification accuracy, and subsequent plankton abundance estimates improve by more than 50% in regions of low relative abundance. Both the NN and SVM classifiers lack rejection metrics, so two rejection metrics were developed in this thesis. One is based on Euclidean distance in the feature space for the NN classifier; the other uses dual-classifier (NN and SVM) voting as output. Using the dual-classification method alone yields almost as good abundance estimation as human labeling on a test-bed of real-world data. However, the distance rejection metric for the NN classifier may be more useful when the training samples are not "good", i.e., representative of the field data. In summary, this thesis advances the state of the art in plankton recognition by demonstrating that multi-scale texture-based features are better suited to classifying field-collected images. The system was verified on a very large real-world dataset in a systematic way for the first time. The accomplishments include developing a multi-scale co-occurrence matrix and support vector machine system, a dual-classification system, automatic correction in abundance estimation, and the ability to obtain accurate abundance estimates from real-time automatic classification. The methods developed are generic and are likely to work on a range of other image classification applications.
by Qiao Hu.
Ph.D.
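The co-occurrence-matrix texture features named in the abstract above can be illustrated with a bare-bones gray-level co-occurrence matrix (GLCM) for a single pixel displacement. This is a generic sketch of the concept, not the thesis's multi-scale implementation:

```python
import numpy as np

def cooccurrence_matrix(img, dx=1, dy=0, levels=4):
    """Count how often gray level j lies displaced (dx, dy) from level i."""
    img = np.asarray(img)
    glcm = np.zeros((levels, levels), dtype=int)
    rows, cols = img.shape
    for y in range(rows - dy):
        for x in range(cols - dx):
            glcm[img[y, x], img[y + dy, x + dx]] += 1
    return glcm

# Toy 3x3 "image" with gray levels 0..3; horizontal neighbour pairs only.
glcm = cooccurrence_matrix([[0, 0, 1],
                            [0, 0, 1],
                            [2, 2, 3]])
```

Statistics of such matrices (contrast, homogeneity, entropy) computed at several scales then serve as the texture features fed to a classifier.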
5

Shipitsyn, Aleksey. "Statistical Learning with Imbalanced Data." Thesis, Linköpings universitet, Filosofiska fakulteten, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139168.

Abstract:
In this thesis several sampling methods for Statistical Learning with imbalanced data have been implemented and evaluated with a new metric, imbalanced accuracy. Several modifications and new algorithms have been proposed for intelligent sampling: Border links, Clean Border Undersampling, One-Sided Undersampling Modified, DBSCAN Undersampling, Class Adjusted Jittering, Hierarchical Cluster Based Oversampling, DBSCAN Oversampling, Fitted Distribution Oversampling, Random Linear Combinations Oversampling, Center Repulsion Oversampling. A set of requirements on a satisfactory performance metric for imbalanced learning has been formulated and a new metric for evaluating classification performance has been developed accordingly. The new metric is based on a combination of the worst class accuracy and the geometric mean. In the testing framework the nonparametric Friedman's test and the post hoc Nemenyi's test have been used to assess the performance of classifiers, sampling algorithms, and combinations of classifiers and sampling algorithms on several data sets. A new approach for detecting algorithms with dominating and dominated performance has been proposed, with a new way of visualizing the results in a network.
From experiments on simulated and several real data sets we conclude that: i) different classifiers are not equally sensitive to sampling algorithms, ii) sampling algorithms have different performance within specific classifiers, iii) oversampling algorithms perform better than undersampling algorithms, iv) Random Oversampling and Random Undersampling outperform many well-known sampling algorithms, v) our proposed algorithms Hierarchical Cluster Based Oversampling, DBSCAN Oversampling with FDO, and Class Adjusted Jittering perform much better than other algorithms, vi) a few good combinations of a classifier and sampling algorithm may boost classification performance, while a few bad combinations may spoil the performance, but the majority of combinations are not significantly different in performance.
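The abstract says the new metric combines the worst class accuracy with the geometric mean, but does not give the exact formula. The sketch below uses one plausible combination, a plain average of the two, purely as an illustration of why such a metric punishes minority-class failure where overall accuracy does not:

```python
import numpy as np

def per_class_accuracy(y_true, y_pred):
    """Accuracy computed separately for each class present in y_true."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.array([np.mean(y_pred[y_true == c] == c)
                     for c in np.unique(y_true)])

def imbalanced_accuracy(y_true, y_pred):
    # Hypothetical combination (our assumption, not the thesis's formula):
    # average of the worst per-class accuracy and the geometric mean
    # of all per-class accuracies.
    acc = per_class_accuracy(y_true, y_pred)
    gmean = float(np.prod(acc)) ** (1.0 / len(acc))
    return 0.5 * (acc.min() + gmean)

# A classifier that ignores the minority class scores 0 here, even though
# its plain accuracy on this sample would be 0.75.
score = imbalanced_accuracy([0, 0, 0, 1], [0, 0, 0, 0])
```

Any metric built this way is driven to zero by a single fully misclassified class, which is exactly the behaviour one wants when evaluating imbalanced learning.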
6

Wang, Hongyan. "Analysis of statistical learning algorithms in data dependent function spaces /." access full-text access abstract and table of contents, 2009. http://libweb.cityu.edu.hk/cgi-bin/ezdb/thesis.pl?phd-ma-b23750534f.pdf.

Abstract:
Thesis (Ph.D.)--City University of Hong Kong, 2009.
"Submitted to Department of Mathematics in partial fulfillment of the requirements for the degree of Doctor of Philosophy." Includes bibliographical references (leaves [87]-100)
7

Gianvecchio, Steven. "Application of information theory and statistical learning to anomaly detection." W&M ScholarWorks, 2010. https://scholarworks.wm.edu/etd/1539623563.

Abstract:
In today's highly networked world, computer intrusions and other attacks are a constant threat. The detection of such attacks, especially attacks that are new or previously unknown, is important to secure networks and computers. A major focus of current research efforts in this area is on anomaly detection. In this dissertation, we explore applications of information theory and statistical learning to anomaly detection. Specifically, we look at two difficult detection problems in network and system security: (1) detecting covert channels, and (2) determining if a user is a human or a bot. We link both of these problems to entropy, a measure of randomness, information content, or complexity, a concept that is central to information theory. The behavior of bots is low in entropy when tasks are rigidly repeated, or high in entropy when behavior is pseudo-random. In contrast, human behavior is complex and medium in entropy. Similarly, covert channels either create regularity, resulting in low entropy, or encode extra information, resulting in high entropy; meanwhile, legitimate traffic is characterized by complex interdependencies and moderate entropy. In addition, we utilize statistical learning algorithms, Bayesian learning, neural networks, and maximum likelihood estimation, in both modeling and detection of covert channels and bots. Our results using entropy and statistical learning techniques are excellent. By using entropy to detect covert channels, we detected three different covert timing channels that were not detected by previous detection methods. Then, using entropy and Bayesian learning to detect chat bots, we detected 100% of chat bots with a false positive rate of only 0.05% in over 1400 hours of chat traces. Lastly, using neural networks and the idea of human observational proofs to detect game bots, we detected 99.8% of game bots with no false positives in 95 hours of traces.
Our work shows that a combination of entropy measures and statistical learning algorithms is a powerful and highly effective tool for anomaly detection.
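The entropy-based discrimination described above can be sketched with plain Shannon entropy over a discretized event stream, for example binned packet inter-arrival times; the binning into integer symbols is our illustrative assumption, not the dissertation's exact estimator:

```python
import math
from collections import Counter

def shannon_entropy(symbols):
    """Empirical Shannon entropy, in bits, of a discrete symbol stream."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A rigidly repeating, bot-like stream has low entropy; a varied,
# human-like stream sits at a higher, moderate entropy.
bot_like = [1, 1, 1, 1, 1, 1, 1, 1]
human_like = [1, 3, 2, 5, 4, 2, 7, 6]
```

A detector then flags streams whose entropy falls far below (too regular) or above (too random) the range typical of legitimate traffic.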
8

Srivastava, Santosh. "Bayesian minimum expected risk estimation of distributions for statistical learning /." Thesis, Connect to this title online; UW restricted, 2007. http://hdl.handle.net/1773/6765.

9

Wang, Ni. "Statistical Learning in Logistics and Manufacturing Systems." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/11457.

Abstract:
This thesis focuses on developing statistical methodology in reliability and quality engineering to assist decision-making at the enterprise, process, and product levels. In Chapter II, we propose a multi-level statistical modeling strategy to characterize data from spatial logistics systems. The model can support business decisions at different levels. The information available from higher hierarchies is incorporated into the multi-level model as constraint functions for lower hierarchies. The key contributions include proposing top-down multi-level spatial models which improve the estimation accuracy at lower levels, and applying spatial smoothing techniques to solve facility location problems in logistics. In Chapter III, we propose methods for modeling system service reliability in a supply chain, which may be disrupted by uncertain contingent events. This chapter applies an approximation technique for developing first-cut reliability analysis models. The approximation relies on multi-level spatial models to characterize patterns of store locations and demands. The key contributions in this chapter are to bring statistical spatial modeling techniques to approximate store location and demand data, and to build system reliability models covering various scenarios of DC location designs and DC capacity constraints. Chapter IV investigates the power law process, which has proved to be a useful tool in characterizing the failure process of repairable systems. This chapter presents a procedure for detecting and estimating a mixture of conforming and nonconforming systems. The key contributions in this chapter are to investigate the properties of parameter estimation in mixture repair processes, and to propose an effective way to screen out nonconforming products. The key contributions in Chapter V are to propose a new method for analyzing heavily censored accelerated life testing data, and to study its asymptotic properties. This approach flexibly and rigorously incorporates distribution assumptions and regression structures into estimating equations in a nonparametric estimation framework. Derivations of the asymptotic properties of the proposed method provide an opportunity to compare its estimation quality to commonly used parametric MLE methods in situations of mis-specified regression models.
10

Rydén, Otto. "Statistical learning procedures for analysis of residential property price indexes." Thesis, KTH, Matematisk statistik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-207946.

Abstract:
Residential Property Price Indexes (RPPIs) are used to study the price development of residential property over time. Modeling and analysing an RPPI is not straightforward, since residential property is a heterogeneous good. This thesis focuses on analysing the properties of the two most conventional hedonic index modeling approaches, the hedonic time dummy method and the hedonic imputation method. These two methods are analysed with statistical learning procedures from a regression perspective: ordinary least squares regression and a number of more advanced regression approaches, namely Huber regression, lasso regression, ridge regression and principal component regression. The analysis is based on data from 56,000 apartment transactions in Stockholm during the period 2013-2016 and results in several models of an RPPI. These suggested models are then validated using both qualitative and quantitative methods, specifically bootstrap re-sampling to obtain an empirical confidence interval for the index values and a mean squared error analysis of the different index periods. The main results of this thesis show that the hedonic time dummy methodology produces indexes with smaller variance and is more robust for smaller datasets. It is further shown that modeling RPPIs with robust regression generally results in a more stable index that is less affected by outliers in the underlying transaction data; this type of robust regression strategy is therefore recommended for a commercial implementation of an RPPI.
Residential property price indexes are used to study the price development of housing over time. Modelling a residential property price index is not always straightforward, since housing is a heterogeneous good. This thesis analyses the difference between the two main hedonic index modelling methods: the hedonic time dummy method and the hedonic imputation method. These methods are analysed with a statistical learning procedure from a regression perspective, covering ordinary least squares regression, Huber regression, lasso regression, ridge regression and principal component regression. The analysis is based on about 56,000 apartment transactions in Stockholm during the period 2013-2016 and is used to model several versions of a residential property price index. The modelled indexes are then analysed with both qualitative and quantitative methods, including a version of the bootstrap to compute an empirical confidence interval for the indexes and a mean squared error analysis of the index point estimates in each time period. This analysis shows that the hedonic time dummy method produces indexes with smaller variance and also gives more robust indexes for smaller datasets. The thesis also shows that using more robust regression methods leads to more stable indexes that are less affected by outliers, which is why robust regression methods are recommended for a commercial implementation of a residential property price index.
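The hedonic time dummy method discussed above can be sketched in a few lines: regress log price on hedonic characteristics plus one dummy per time period, and read the index off the exponentiated dummy coefficients. The numbers below are fabricated for illustration and are not from the Stockholm dataset:

```python
import numpy as np

# Fabricated data: log price driven by floor area plus a period effect.
area = np.array([50.0, 70.0, 60.0, 80.0, 55.0, 75.0])
period = np.array([0, 0, 1, 1, 2, 2])           # three index periods
log_price = 0.01 * area + np.array([0.0, 0.0, 0.05, 0.05, 0.10, 0.10])

dummies = np.eye(3)[period]                     # one-hot period indicators
X = np.column_stack([area, dummies])            # hedonic term + time dummies
beta, *_ = np.linalg.lstsq(X, log_price, rcond=None)

# Price index relative to the base period, from the dummy coefficients:
index = np.exp(beta[1:] - beta[1])
```

Because every transaction contributes to one joint regression, the dummy coefficients summarise pure price movement with the hedonic characteristics held fixed.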
11

Menke, Joshua E. "Improving machine learning through oracle learning /." Diss., CLICK HERE for online access, 2007. http://contentdm.lib.byu.edu/ETD/image/etd1726.pdf.

12

Yaman, Sibel. "A multi-objective programming perspective to statistical learning problems." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/26470.

Abstract:
Thesis (Ph.D)--Electrical and Computer Engineering, Georgia Institute of Technology, 2009.
Committee Chair: Chin-Hui Lee; Committee Member: Anthony Yezzi; Committee Member: Evans Harrell; Committee Member: Fred Juang; Committee Member: James H. McClellan. Part of the SMARTech Electronic Thesis and Dissertation Collection.
13

Frank, Ernest. "The effect of individual difference variables, learning environment, and cognitive task on statistical learning performance." Morgantown, W. Va. : [West Virginia University Libraries], 2000. http://etd.wvu.edu/templates/showETD.cfm?recnum=1383.

Abstract:
Thesis (Ed. D.)--West Virginia University, 2000.
Title from document title page. Document formatted into pages; contains xvi, 183 p. : ill. (some col.). Includes abstract. Includes bibliographical references (p. 160-173).
14

Li, Bin. "Statistical learning and predictive modeling in data mining." Columbus, Ohio : Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1155058111.

15

Agerberg, Jens. "Statistical Learning and Analysis on Homology-Based Features." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-273581.

Abstract:
Stable rank has recently been proposed as an invariant to encode the result of persistent homology, a method used in topological data analysis. In this thesis we develop methods for statistical analysis as well as machine learning methods based on stable rank. As stable rank may be viewed as a mapping to a Hilbert space, a kernel can be constructed from the inner product in this space. First, we investigate this kernel in the context of kernel learning methods such as support-vector machines. Next, using the theory of kernel embedding of probability distributions, we give a statistical treatment of the kernel by showing some of its properties and develop a two-sample hypothesis test based on the kernel. As an alternative approach, a mapping to a Euclidean space with learnable parameters can be conceived, serving as an input layer to a neural network. The developed methods are first evaluated on synthetic data. Then the two-sample hypothesis test is applied on the OASIS open access brain imaging dataset. Finally a graph classification task is performed on a dataset collected from Reddit.
Stable rank has been proposed as a data-level summary of the result of persistent homology, a method within topological data analysis. In this thesis we develop methods for statistical analysis and machine learning based on stable rank. Since stable rank can be viewed as a mapping into a Hilbert space, a kernel can be constructed from the inner product in this space. First we investigate the properties of this kernel when it is used within machine learning methods such as support vector machines (SVM). Then, grounded in the theory of embedding probability distributions in reproducing kernel Hilbert spaces, we examine how the kernel can be used to develop a statistical hypothesis test. Finally, as an alternative to kernel-based methods, a mapping into a Euclidean space with learnable parameters is developed, which can serve as an input layer to a neural network. The methods are first evaluated on synthetic data. A statistical test is then performed on OASIS, an open dataset in neuroradiology. Finally, the methods are evaluated on graph classification, based on a dataset collected from Reddit.

16

Lu, Yibiao. "Statistical methods with application to machine learning and artificial intelligence." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/44730.

Abstract:
This thesis consists of four chapters. Chapter 1 focuses on theoretical results on high-order Laplacian-based regularization in function estimation. We study iterated Laplacian regularization in the context of supervised learning in order to achieve both nice theoretical properties (like thin-plate splines) and good performance over complex regions (like the soap film smoother). In Chapter 2, we propose an innovative static path-planning algorithm called m-A* for environments full of obstacles. Theoretically, we show that m-A* reduces the number of vertices. In the simulation study, our approach outperforms A* armed with the standard L1 heuristic and stronger ones such as True-Distance Heuristics (TDH), yielding faster query times, adequate memory usage and reasonable preprocessing time. Chapter 3 proposes the m-LPA* algorithm, which extends m-A* to dynamic path-planning and achieves better performance than the benchmark, Lifelong Planning A* (LPA*), in terms of robustness and worst-case computational complexity. Employing the same beamlet graphical structure as m-A*, m-LPA* encodes the information of the environment in a hierarchical, multiscale fashion, and therefore produces a more robust dynamic path-planning algorithm. Chapter 4 focuses on an approach for the prediction of spot electricity spikes via a combination of boosting and wavelet analysis. Extensive numerical experiments show that our approach improves prediction accuracy compared to support vector machines, thanks to the fact that the gradient boosting trees method inherits the good properties of decision trees, such as robustness to irrelevant covariates, fast computation and good interpretability.
17

Verleyen, Wim. "Machine learning for systems pathology." Thesis, University of St Andrews, 2013. http://hdl.handle.net/10023/4512.

Abstract:
Systems pathology attempts to introduce more holistic approaches towards pathology and to integrate clinicopathological information with "-omics" technology. This doctorate researches two examples of a systems approach for pathology: (1) personalized patient outcome prediction for ovarian cancer and (2) an analytical approach that differentiates between individual and collective tumour invasion. In the ovarian cancer study, clinicopathological measurements and proteomic biomarkers are analysed with a set of newly engineered bioinformatic tools. These tools are based upon feature selection, survival analysis with Cox proportional hazards regression, and a novel Monte Carlo approach. Clinical and pathological data prove to have highly significant information content, as expected; however, molecular data has little information content alone, and is only significant when selected most-informative variables are placed in the context of the patient's clinical and pathological measures. Furthermore, classifiers based on support vector machines (SVMs) that predict one-year PFS and three-year OS with high accuracy show how the addition of carefully selected molecular measures to clinical and pathological knowledge can enable personalized prognosis predictions. Finally, the high performance of these classifiers is validated on an additional data set. The second study, which differentiates between individual and collective tumour invasion, analyses a set of morphological measures. These morphological measurements are collected with a newly developed process using automated imaging analysis for data collection in combination with a Bayesian network analysis to probabilistically connect morphological variables with tumour invasion modes. Between an individual and a collective invasion mode, cell-cell contact is the most discriminating morphological feature. Smaller invading groups were typified by smoother cellular surfaces than those invading collectively in larger groups. Interestingly, elongation was evident in all invading cell groups and was not a specific feature of single cell invasion as a surrogate of epithelial-mesenchymal transition. In conclusion, the combination of automated imaging analysis and Bayesian network analysis provides insight into morphological variables associated with the transition of cancer cells between invasion modes. We show that only two morphologically distinct modes of invasion exist. The two studies performed in this thesis illustrate the potential of a systems approach for pathology and the need for quantitative approaches to reveal the system behind pathology.
18

Yang, Ying. "Discretization for Naive-Bayes learning." Monash University, School of Computer Science and Software Engineering, 2003. http://arrow.monash.edu.au/hdl/1959.1/9393.

19

Karlaftis, Vasileios Misak. "Structural and functional brain plasticity for statistical learning." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/278790.

Abstract:
Extracting structure from initially incomprehensible streams of events is fundamental to a range of human abilities: from navigating in a new environment to learning a language. These skills rely on our ability to extract spatial and temporal regularities, often with minimal explicit feedback, an ability known as statistical learning. Despite the importance of statistical learning for making perceptual decisions, we know surprisingly little about the brain circuits involved and how they change when learning temporal regularities. In my thesis, I combine behavioural measurements, Diffusion Tensor Imaging (DTI) and resting-state fMRI (rs-fMRI) to investigate the structural and functional circuits that are involved in statistical learning of temporal structures. In particular, I compare structural connectivity as measured by DTI and functional connectivity as measured by rs-fMRI before vs. after training to investigate learning-dependent changes in human brain pathways. Further, I combine the two imaging modalities using graph theory and regression analyses to identify key predictors of individual learning performance. Using a prediction task in the context of sequence learning without explicit feedback, I demonstrate that individuals adapt to the environment's statistics as they change over time from simple repetition to probabilistic combinations. Importantly, I show that learning of temporal structures relates to decision strategy, which varies among individuals between two prototypical distributions: matching the exact sequence statistics or selecting the most probable outcome in a given context (i.e. maximising). Further, combining DTI and rs-fMRI, I show that learning-dependent plasticity in dissociable cortico-striatal circuits relates to decision strategy. In particular, matching relates to connectivity between visual cortex, hippocampus and caudate, while maximisation relates to connectivity between frontal and motor cortices and striatum.
These findings have potential translational applications, as alternate brain routes may be re-trained to support learning ability when specific pathways (e.g. memory-related circuits) are compromised by age or disease.
20

Guidolin, Massimo. "Asset prices on Bayesian learning paths /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2000. http://wwwlib.umi.com/cr/ucsd/fullcit?p9975886.

21

Huszár, Ferenc. "Scoring rules, divergences and information in Bayesian machine learning." Thesis, University of Cambridge, 2013. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.648333.

22

Yang, Liu. "Mathematical Theories of Interaction with Oracles." Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/559.

23

DENEVI, GIULIA. "Efficient Lifelong Learning Algorithms: Regret Bounds and Statistical Guarantees." Doctoral thesis, Università degli studi di Genova, 2019. http://hdl.handle.net/11567/986813.

Abstract:
We study the Meta-Learning paradigm, where the goal is to select an algorithm in a prescribed family – usually denoted as the inner or within-task algorithm – that is appropriate to address a class of learning problems (tasks) sharing specific similarities. More precisely, we aim at designing a procedure, called the meta-algorithm, that is able to infer the tasks' relatedness from a sequence of observed tasks and to exploit such knowledge in order to return a within-task algorithm in the class that is best suited to solve a new similar task. We are interested in the online Meta-Learning setting, also known as Lifelong Learning. In this scenario the meta-algorithm receives the tasks sequentially and incrementally adapts the inner algorithm on the fly as the tasks arrive. In particular, we refer to the framework in which also the within-task data are processed sequentially by the inner algorithm as Online-Within-Online (OWO) Meta-Learning, while we use the term Online-Within-Batch (OWB) Meta-Learning to denote the setting in which the within-task data are processed in a single batch. In this work we propose an OWO Meta-Learning method based on primal-dual Online Learning. Our method is theoretically grounded and able to cover various types of tasks' relatedness and learning algorithms. More precisely, we focus on the family of inner algorithms given by a parametrized variant of Follow The Regularized Leader (FTRL) aiming at minimizing the within-task regularized empirical risk. The inner algorithm in this class is incrementally adapted by a FTRL meta-algorithm using the within-task minimum regularized empirical risk as the meta-loss. In order to keep the process fully online, we use the online inner algorithm to approximate the subgradients used by the meta-algorithm and we show how to exploit an upper bound on this approximation error in order to derive a cumulative error bound for the proposed method.
Our analysis can be adapted to the statistical setting by two nested online-to-batch conversion steps. We also show how the proposed OWO method can provide statistical guarantees comparable to those of its natural, more expensive OWB variant, in which the inner online algorithm is substituted by the batch minimizer of the regularized empirical risk. Finally, we apply our method to two important families of learning algorithms parametrized by a bias vector or a linear feature map.
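The bias-vector family mentioned at the end of the abstract lends itself to a compact illustration. The sketch below is only a loose analogue of the method described above, with synthetic tasks, a plain averaging meta-update standing in for the FTRL meta-algorithm, and invented step sizes: an inner online gradient descent is regularized toward a meta-parameter h, and h is adapted as tasks arrive.

```python
import numpy as np

rng = np.random.default_rng(0)
true_bias = np.array([2.0, -1.0])  # parameter shared (up to noise) across tasks

def inner_ogd(X, y, h, lam=1.0, eta=0.1):
    """Within-task online gradient descent on the squared loss,
    regularized toward the current meta-parameter (bias) h."""
    w = h.copy()
    for x_t, y_t in zip(X, y):
        grad = (w @ x_t - y_t) * x_t + lam * (w - h)
        w -= eta * grad
    return w

# Meta-loop: tasks arrive sequentially; after each task is processed
# online, the bias is nudged toward the within-task solution
# (a simple averaging meta-step, not the thesis's FTRL meta-algorithm).
h = np.zeros(2)
for task in range(300):
    w_star = true_bias + 0.1 * rng.standard_normal(2)  # this task's target
    X = rng.standard_normal((30, 2))
    y = X @ w_star
    w = inner_ogd(X, y, h)
    h += 0.5 * (w - h)

print(np.round(h, 1))
```

When the tasks' target vectors are noisy copies of a common parameter, h drifts toward that parameter, so later tasks start the inner algorithm from a better bias.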
APA, Harvard, Vancouver, ISO, and other styles
24

Brodin, Kristoffer. "Statistical Machine Learning from Classification Perspective: : Prediction of Household Ties for Economical Decision Making." Thesis, KTH, Matematisk statistik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-215923.

Full text
Abstract:
In modern society, many companies keep large data records over their individual customers, containing information about attributes such as name, gender, marital status, address, etc. These attributes can be used to link customers together, depending on whether they share some sort of relationship with each other or not. In this thesis the goal is to investigate and compare methods to predict relationships between individuals in terms of what we define as a household relationship, i.e. we wish to identify which individuals share living expenses with one another. The objective is to explore the ability of three supervised statistical machine learning methods, namely logistic regression (LR), artificial neural networks (ANN) and the support vector machine (SVM), to predict these household relationships, and to evaluate their predictive performance for different settings of their corresponding tuning parameters. Data over a limited population of individuals, containing information about household affiliation and attributes, were available for this task. In order to apply these methods, the problem had to be formulated in a form enabling supervised learning, i.e. a target Y and input predictors X = (X1, …, Xp), based on the set of p attributes associated with each individual, had to be derived. We present a technique that forms pairs of individuals under the hypothesis H0 that they share a household relationship, after which a test of significance is constructed. This technique transforms the problem into a standard binary classification problem. A sample of observations could be generated by randomly pairing individuals and using the available data on each individual to code the corresponding outcome on Y and X for each random pair. For evaluation and tuning of the three supervised learning methods, the sample was split into a training set, a validation set and a test set.
We have seen that the prediction error, in terms of misclassification rate, is very small for all three methods, since the two classes, H0 is true and H0 is false, are far away from each other and well separable. The data have shown pronounced linear separability, generally resulting in minor differences in misclassification rate as the tuning parameters are modified. However, some variation in the prediction results due to tuning has been observed, and when also considering computational time and requirements on computational power, optimal settings of the tuning parameters could be determined for each method. Comparing LR, ANN and SVM using optimal tuning settings, the test results show that there is no significant difference between the three methods' performance, and they all predict well. Nevertheless, due to the difference in complexity between the methods, we conclude that SVM is the least suitable method to use, whereas LR is the most suitable. However, ANN handles complex and non-linear data better than LR; therefore, for future applications of the model, where data might not show such pronounced linear separability, we find it suitable to consider ANN as well. This thesis was written at Svenska Handelsbanken, one of the major banks in Sweden, with offices all around the world. Its headquarters are situated in Kungsträdgården, Stockholm. Computations were performed using SAS software and data were processed in the SQL relational database management system.
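The pair-construction step described above can be made concrete with a toy version of the resulting binary classification problem. Everything below is invented for illustration: synthetic attribute-agreement features, made-up effect sizes, and a logistic regression fitted by plain gradient descent rather than the thesis's SAS workflow.

```python
import numpy as np

rng = np.random.default_rng(1)

# Each row is a candidate pair of individuals; the features encode
# attribute agreement (same address, same surname, age gap), plus an
# intercept. y = 1 iff the pair shares a household.
n = 2000
same_addr = rng.integers(0, 2, n).astype(float)
same_name = rng.integers(0, 2, n).astype(float)
age_gap = rng.uniform(0, 5, n)
X = np.column_stack([np.ones(n), same_addr, same_name, age_gap])
true_w = np.array([-3.0, 4.0, 2.0, -0.5])        # invented effect sizes
p = 1 / (1 + np.exp(-X @ true_w))
y = (rng.uniform(size=n) < p).astype(float)

# Logistic regression fitted by plain gradient descent on the log-loss.
w = np.zeros(4)
for _ in range(3000):
    grad = X.T @ (1 / (1 + np.exp(-X @ w)) - y) / n
    w -= 0.2 * grad

acc = np.mean(((X @ w) > 0) == (y == 1))
print(f"training accuracy: {acc:.2f}")
```

The same pair features could equally be fed to an SVM or a neural network, which is exactly the comparison the thesis carries out.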
APA, Harvard, Vancouver, ISO, and other styles
25

Dearden, Richard W. "Learning and planning in structured worlds." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0020/NQ56531.pdf.

Full text
APA, Harvard, Vancouver, ISO, and other styles
26

Whalen, Andrew. "Computational, experimental, and statistical analyses of social learning in humans and animals." Thesis, University of St Andrews, 2016. http://hdl.handle.net/10023/8822.

Full text
Abstract:
Social learning is ubiquitous among animals and humans and is thought to be critical to the widespread success of humans and to the development and evolution of human culture. Evolutionary theory, however, suggests that social learning alone may not be adaptive but that individuals may need to be selective in who and how they copy others. One of the key findings of these evolutionary models (reviewed in Chapter 1) is that social information may be widely adaptive if individuals are able to combine social and asocial sources of information together strategically. However, up until this point the focus of theoretical models has been on the population-level consequences of different social learning strategies, and not on how individuals combine social and asocial information on specific tasks. In Chapter 2 I carry out an analysis of how animal learners might incorporate social information into a reinforcement learning framework and find that even limited, low-fidelity copying of actions in an action sequence may combine with asocial learning to result in high-fidelity transmission of entire action sequences. In Chapter 3 I describe a series of experiments that find that human learners flexibly use a conformity-biased learning strategy to learn from multiple demonstrators depending on demonstrator accuracy, either indicated by environmental cues or by past experience with these demonstrators. The chapter reveals close quantitative and qualitative matches between participants' performance and a Bayesian model of social learning. In both Chapters 2 and 3 I find, consistent with previous evolutionary findings, that by combining social and asocial sources of information together individuals are able to learn about the world effectively. Exploring how animals use social learning experimentally can be a substantially more difficult task than exploring human social learning.
In Chapter 4, I develop and present a refined version of Network-Based Diffusion Analysis to provide a statistical framework for inferring social learning mechanisms from animal diffusion experiments. In Chapter 5 I move from examining the effects of social learning at an individual level to examining their population-level outcomes, and provide an analysis of how fine-grained population structure may alter the spread of novel behaviours through a population. I find that although a learner's social learning strategy and the learnability of a novel behaviour strongly impact how likely the behaviour is to spread through the population, fine-grained population structure plays a much smaller role. In Chapter 6 I summarize the results of this thesis and provide suggestions for future work to understand how individuals, humans and other animals alike, use social information.
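The idea in Chapter 3, of combining social and asocial information strategically, can be illustrated with a generic Bayesian log-odds calculation. This is not the thesis's fitted model: the accuracies of the private cue and of the demonstrators are invented, and demonstrators are assumed independent.

```python
import numpy as np

def posterior_A(private_says_A, p_private, k, n, p_demo):
    """Posterior probability that option A is correct, combining a private
    (asocial) cue with the choices of n independent demonstrators, k of
    whom chose A. All accuracies are assumed known."""
    log_odds = (1 if private_says_A else -1) * np.log(p_private / (1 - p_private))
    log_odds += (2 * k - n) * np.log(p_demo / (1 - p_demo))
    return 1 / (1 + np.exp(-log_odds))

p_majority = posterior_A(True, 0.6, 3, 4, 0.7)   # weak cue + accurate majority for A
p_opposed = posterior_A(True, 0.6, 0, 4, 0.7)    # same cue, unanimous opposition
print(round(p_majority, 3), round(p_opposed, 3))
```

With a weak private cue, an accurate majority dominates the posterior, while unanimous opposition overturns the cue: the kind of flexible, accuracy-weighted conformity the chapter describes.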
APA, Harvard, Vancouver, ISO, and other styles
27

Frigola-Alcalde, Roger. "Bayesian time series learning with Gaussian processes." Thesis, University of Cambridge, 2016. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.709520.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Land, Walker, Dan Margolis, Ronald Gottlieb, Elizabeth Krupinski, and Jack Yang. "Improving CT prediction of treatment response in patients with metastatic colorectal carcinoma using statistical learning theory." BioMed Central, 2010. http://hdl.handle.net/10150/610011.

Full text
Abstract:
BACKGROUND: Significant interest exists in establishing radiologic imaging as a valid biomarker for assessing the response of cancer to a variety of treatments. To address this problem, we have chosen to study patients with metastatic colorectal carcinoma to learn whether statistical learning theory (SLT) can improve the performance of radiologists using CT in predicting patient treatment response to therapy compared with the more traditional RECIST (Response Evaluation Criteria in Solid Tumors) standard. RESULTS: Predictions of survival after 8 months in 38 patients with metastatic colorectal carcinoma using the Support Vector Machine (SVM) technique improved 30% when using additional information compared to WHO (World Health Organization) or RECIST measurements alone. With both Logistic Regression (LR) and SVM, there was no significant difference in performance between WHO and RECIST. The SVM and LR techniques also demonstrated that one radiologist consistently outperformed another. CONCLUSIONS: This preliminary research study has demonstrated that SLT algorithms, properly used in a clinical setting, have the potential to address questions and criticisms associated with both RECIST and WHO scoring methods. We also propose that tumor heterogeneity, shape, etc. obtained from CT and/or MRI scans be added to the SLT feature vector for processing.
APA, Harvard, Vancouver, ISO, and other styles
29

Robbin, Alice, and Lee Frost-Kumpf. "Extending theory for user-centered information systems: Diagnosing and learning from error in complex statistical data." John Wiley & Sons, Inc, 1997. http://hdl.handle.net/10150/105746.

Full text
Abstract:
Utilization of complex statistical data has come at great cost to individual researchers, the information community, and to the national information infrastructure. Dissatisfaction with the traditional approach to information system design and information services provision, and, by implication, the theoretical bases on which these systems and services have been developed, has led librarians and information scientists to propose that information is a user construct and that system designs should therefore place greater emphasis on user-centered approaches. This article extends Dervin's and Morris's theoretical framework for designing effective information services by synthesizing and integrating theory and research derived from multiple approaches in the social and behavioral sciences. These theoretical frameworks are applied to develop general design strategies and principles for information systems and services that rely on complex statistical data. The focus of this article is on factors that contribute to error in the production of high-quality scientific output and on failures of communication during the process of data production and data utilization. Such insights provide useful frameworks to diagnose, communicate, and learn from error. Strategies to design systems that support communicative competence and cognitive competence emphasize the utilization of information systems in a user-centered learning environment. This includes viewing cognition as a generative process and recognizing the continuing interdependence and active involvement of experts, novices, and technological gatekeepers.
APA, Harvard, Vancouver, ISO, and other styles
30

Van, der Merwe Rudolph. "Sigma-Point Kalman Filters for Probabilistic Inference in Dynamic State-Space Models." Full text open access at:, 2004. http://content.ohsu.edu/u?/etd,8.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Vang, Jee. "Using a model of human cognition of causality to orient arcs in structural learning of Bayesian networks." Fairfax, VA : George Mason University, 2008. http://hdl.handle.net/1920/3386.

Full text
Abstract:
Thesis (Ph.D.)--George Mason University, 2008.
Vita: p. 249. Thesis director: Farrokh Alemi. Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computational Sciences and Informatics. Title from PDF t.p. (viewed Mar. 16, 2009). Includes bibliographical references (p. 238-248). Also issued in print.
APA, Harvard, Vancouver, ISO, and other styles
32

Perrot, Michaël. "Theory and algorithms for learning metrics with controlled behaviour." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSES072/document.

Full text
Abstract:
Many Machine Learning algorithms make use of a notion of distance or similarity between examples to solve various problems such as classification, clustering or domain adaptation. Depending on the tasks considered, these metrics should have different properties, but manually choosing an adapted comparison function can be tedious and difficult. A natural trend is then to automatically tailor such metrics to the task at hand. This is known as Metric Learning and the goal is mainly to find the best parameters of a metric under some specific constraints. Standard approaches in this field usually focus on learning Mahalanobis distances or bilinear similarities, and one of the main limitations is that the control over the behaviour of the learned metrics is often limited. Furthermore, while some theoretical works exist to justify the generalization ability of the learned models, most approaches do not come with such guarantees. In this thesis we propose new algorithms to learn metrics with a controlled behaviour, and we put a particular emphasis on the theoretical properties of these algorithms. We propose four distinct contributions which can be separated in two parts, namely (i) controlling the metric with respect to a reference metric and (ii) controlling the underlying transformation corresponding to the learned metric. Our first contribution is a local metric learning method where the goal is to regress a distance proportional to the human perception of colors. Our approach is backed up by theoretical guarantees on the generalization ability of the learned metrics. In our second contribution we are interested in theoretically studying the interest of using a reference metric in a biased regularization term to help during the learning process. We propose to use three different theoretical frameworks allowing us to derive three different measures of goodness for the reference metric.
These measures give us some insight into the impact of the reference metric on the learned one. In our third contribution we propose a metric learning algorithm where the underlying transformation is controlled. The idea is that, instead of using similarity and dissimilarity constraints, we associate each learning example to a so-called virtual point belonging to the output space associated with the learned metric. We theoretically show that metrics learned in this way generalize well, but also that our approach is linked to a classic metric learning method based on pair constraints. In our fourth contribution we also try to control the underlying transformation of a learned metric. However, instead of a point-wise control, we consider a global one by forcing the transformation to follow the geometrical transformation associated with an optimal transport problem. From a theoretical standpoint we propose a discussion on the link between the transformation associated with the learned metric and the transformation associated with the optimal transport problem. On the practical side, we show the interest of our approach for domain adaptation, but also for a seamless-copying task in images.
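As a toy illustration of biased regularization toward a reference metric, the sketch below learns only a diagonal Mahalanobis metric, which is far simpler than the methods above; the data, margin, and step sizes are invented, and the identity plays the role of the reference metric.

```python
import numpy as np

rng = np.random.default_rng(2)

# Difference vectors of pairs in R^2: similar pairs differ mostly along
# dim 1 (a nuisance direction), dissimilar pairs differ along dim 0.
sim_diffs = np.column_stack([0.1 * rng.standard_normal(100),
                             rng.standard_normal(100)])
dis_diffs = np.column_stack([2.0 + 0.3 * rng.standard_normal(100),
                             rng.standard_normal(100)])

# Learn a diagonal metric d_m(u) = sum_j m_j * u_j**2 that keeps similar
# pairs close and dissimilar pairs beyond a unit margin, with a biased
# regularizer pulling m toward the reference metric m_ref = (1, 1).
m = np.ones(2)
beta = 0.01
for _ in range(500):
    grad = (sim_diffs ** 2).mean(axis=0)                     # pull similar pairs together
    active = (1.0 - (dis_diffs ** 2 * m).sum(axis=1)) > 0    # hinge-active dissimilar pairs
    grad -= (active[:, None] * dis_diffs ** 2).mean(axis=0)  # push them past the margin
    grad += beta * (m - 1.0)                                 # bias toward the reference
    m = np.clip(m - 0.05 * grad, 0.0, None)                  # keep the (diagonal) metric PSD

print(np.round(m, 2))
```

The learned metric down-weights the nuisance direction while keeping enough weight on the discriminative one, the kind of controlled behaviour the thesis formalizes for full Mahalanobis matrices.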
APA, Harvard, Vancouver, ISO, and other styles
33

Riggelsen, Carsten. "Approximation methods for efficient learning of Bayesian networks /." Amsterdam ; Washington, DC : IOS Press, 2008. http://www.loc.gov/catdir/toc/fy0804/2007942192.html.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Grimes, David B. "Learning by imitation and exploration : Bayesian models and applications in humanoid robotics /." Thesis, Connect to this title online; UW restricted, 2007. http://hdl.handle.net/1773/6879.

Full text
APA, Harvard, Vancouver, ISO, and other styles
35

PAGLIANA, NICOLO'. "On the Role of Regularization in Machine Learning: Classical Theory, Computational Aspects and Modern Regimes." Doctoral thesis, Università degli studi di Genova, 2022. http://hdl.handle.net/11567/1081700.

Full text
Abstract:
In this work we study the performance of different machine learning models, focusing on regularization properties in order to explain different phenomena that are observed in practice. We consider linear models on a possibly infinite-dimensional feature space that are trained by optimizing an empirical mean squared error. We study the regularization properties of accelerated methods such as Nesterov's method or the $\nu$-method, and the properties of interpolating estimators, where the main regularization sources vanish to zero, and explain different behaviours that can be seen in practical applications.
APA, Harvard, Vancouver, ISO, and other styles
36

Cardamone, Dario. "Support Vector Machine a Machine Learning Algorithm." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017.

Find full text
Abstract:
This thesis considers the Support Vector Machine classification algorithm. More specifically, it considers its formulation as a Mixed Integer Program optimization problem for the supervised binary classification of a data set.
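For context, a standard soft-margin linear SVM can be trained with a few lines of stochastic subgradient descent on the hinge loss (Pegasos-style). This is a different route from the mixed-integer programming formulation the thesis studies; the data and hyperparameters below are invented.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two Gaussian classes in R^2, separated along the first coordinate.
n = 200
y = rng.choice([-1.0, 1.0], size=n)
X = rng.standard_normal((n, 2)) + np.outer(y, [2.0, 0.0])

# Pegasos-style stochastic subgradient descent on the L2-regularized
# hinge loss (no bias term; the classes are symmetric about the origin).
w = np.zeros(2)
lam = 0.01
for t in range(1, 3001):
    i = rng.integers(n)
    margin = y[i] * (X[i] @ w)
    w *= 1.0 - 1.0 / t                  # shrink step: eta_t * lam with eta_t = 1/(lam*t)
    if margin < 1:
        w += y[i] * X[i] / (lam * t)    # hinge-loss subgradient step

acc = np.mean(np.sign(X @ w) == y)
print(f"training accuracy: {acc:.2f}")
```

The MIP formulation instead encodes the misclassification decisions with binary variables, which allows optimizing discrete objectives at a higher computational cost.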
APA, Harvard, Vancouver, ISO, and other styles
37

Shon, Aaron P. "Bayesian cognitive models for imitation /." Thesis, Connect to this title online; UW restricted, 2007. http://hdl.handle.net/1773/7013.

Full text
APA, Harvard, Vancouver, ISO, and other styles
38

Zhu, Shaojuan. "Associative memory as a Bayesian building block /." Full text open access at:, 2008. http://content.ohsu.edu/u?/etd,655.

Full text
APA, Harvard, Vancouver, ISO, and other styles
39

GUASTAVINO, SABRINA. "Learning and inverse problems: from theory to solar physics applications." Doctoral thesis, Università degli studi di Genova, 2020. http://hdl.handle.net/11567/998315.

Full text
Abstract:
The problem of approximating a function from a set of discrete measurements has been extensively studied since the seventies. Our theoretical analysis proposes a formalization of the function approximation problem which allows dealing with inverse problems and supervised kernel learning as two sides of the same coin. The proposed formalization takes into account arbitrary noisy data (deterministically or statistically defined) and arbitrary loss functions (possibly seen as a log-likelihood), handling both direct and indirect measurements. The core idea of this part relies on the analogy between statistical learning and inverse problems. One of the main pieces of evidence of the connection between these two areas is that regularization methods, usually developed for ill-posed inverse problems, can be used for solving learning problems. Furthermore, spectral regularization convergence rate analyses provided in these two areas share the same source conditions but are carried out with either increasing number of samples in learning theory or decreasing noise level in inverse problems. More generally, regularization via sparsity-enhancing methods is widely used in both areas and it is possible to apply well-known $\ell_1$-penalized methods for solving both learning and inverse problems. In the first part of the Thesis, we analyze such a connection at three levels: (1) at an infinite dimensional level, we define an abstract function approximation problem from which the two problems can be derived; (2) at a discrete level, we provide a unified formulation according to a suitable definition of sampling; and (3) at a convergence rates level, we provide a comparison between convergence rates given in the two areas, by quantifying the relation between the noise level and the number of samples. In the second part of the Thesis, we focus on a specific class of problems where measurements are distributed according to a Poisson law.
We provide a data-driven, asymptotically unbiased, and globally quadratic approximation of the Kullback-Leibler divergence, and we propose Lasso-type methods for solving sparse Poisson regression problems, named PRiL for Poisson Reweighted Lasso, together with an adaptive version of this method, named APRiL for Adaptive Poisson Reweighted Lasso, proving consistency properties in estimation and variable selection, respectively. Finally, we consider two problems in solar physics: 1) the problem of forecasting solar flares (learning application) and 2) the desaturation problem of solar flare images (inverse problem application). The first application concerns the prediction of solar storms using images of the magnetic field on the sun, in particular physics-based features extracted from active regions in data provided by the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO). The second application concerns the reconstruction problem of Extreme Ultra-Violet (EUV) solar flare images recorded by a second instrument on board SDO, the Atmospheric Imaging Assembly (AIA). We propose a novel sparsity-enhancing method, SE-DESAT, to reconstruct images affected by saturation and diffraction, without using any a priori estimate of the background solar activity.
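The reweighting idea behind PRiL/APRiL can be sketched in the simpler Gaussian (least-squares) setting, an illustrative assumption only, since the methods above target the Poisson likelihood through an approximated Kullback-Leibler divergence. A plain ISTA lasso pass is followed by one adaptive, data-driven reweighting pass.

```python
import numpy as np

rng = np.random.default_rng(5)

# Sparse linear model: 3 active coordinates out of 50.
n, d = 100, 50
X = rng.standard_normal((n, d))
w_true = np.zeros(d); w_true[[3, 17, 42]] = [2.0, -1.5, 1.0]
y = X @ w_true + 0.1 * rng.standard_normal(n)

def ista(weights, lam=0.1, steps=500):
    """ISTA for the weighted lasso: (1/2n)||y - Xw||^2 + lam * sum_j weights_j |w_j|."""
    L = np.linalg.norm(X, 2) ** 2 / n          # Lipschitz constant of the gradient
    w = np.zeros(d)
    for _ in range(steps):
        z = w - X.T @ (X @ w - y) / (n * L)
        w = np.sign(z) * np.maximum(np.abs(z) - lam * weights / L, 0.0)
    return w

w1 = ista(np.ones(d))                          # plain lasso pass
w2 = ista(1 / (np.abs(w1) + 1e-2))             # adaptive (reweighted) pass
support = np.flatnonzero(np.abs(w2) > 0.1)
print(support)
```

The reweighting penalizes coordinates the first pass left near zero much harder, which is the mechanism driving the variable-selection consistency discussed above.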
APA, Harvard, Vancouver, ISO, and other styles
40

Berlin, Daniel. "Multi-class Supervised Classification Techniques for High-dimensional Data: Applications to Vehicle Maintenance at Scania." Thesis, KTH, Matematisk statistik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209257.

Full text
Abstract:
In vehicle repair, locating the cause of a fault can often be more time-consuming than the repair itself. Hence a systematic way to accurately predict a fault-causing part would constitute a valuable tool, especially for errors that are difficult to diagnose. This thesis explores the predictive ability of Diagnostic Trouble Codes (DTCs), produced by the electronic system on Scania vehicles, as indicators of fault-causing parts. The statistical analysis is based on about 18800 observations of vehicles where both DTCs and replaced parts could be identified during the period March 2016 - March 2017. Two different approaches to forming classes are evaluated. Many classes had only a few observations and, to give the classifiers a fair chance, observations were omitted from classes based on their frequency in the data. After processing, the resulting data comprised 1547 observations on 4168 features, demonstrating very high dimensionality and making it impossible to apply standard methods of large-sample statistical inference. Two procedures of supervised statistical learning that are able to cope with high dimensionality and multiple classes, Support Vector Machines (SVM) and Neural Networks (NN), are exploited and evaluated. The analysis showed that on data with 1547 observations of 4168 features (unique DTCs) and 7 classes, SVM yielded an average prediction accuracy of 79.4% compared to 75.4% using NN. The conclusion of the analysis is that DTCs hold potential to be used as indicators of fault-causing parts in a predictive model, but in order to increase prediction accuracy the training data needs improvement. Scope for future research to improve and expand the model, along with practical suggestions for exploiting supervised classifiers at Scania, is provided. Keywords: Statistical learning, Machine learning, Neural networks, Deep learning, Supervised learning, High dimensionality
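A toy analogue of the DTC setting, with invented dimensions and firing probabilities, and a plain multinomial logistic regression rather than the SVM and NN models evaluated above, shows how sparse binary code vectors can identify a fault class.

```python
import numpy as np

rng = np.random.default_rng(6)

# Sparse binary "trouble code" vectors: each class (part to replace) has
# 3 signature codes that fire with probability 0.8; all 300 codes also
# fire as background noise with probability 0.01.
n, d, k = 1200, 300, 7
y = rng.integers(0, k, n)
X = (rng.uniform(size=(n, d)) < 0.01).astype(float)
for c in range(k):
    rows = (y == c)
    X[np.ix_(rows, range(10 * c, 10 * c + 3))] = (
        rng.uniform(size=(rows.sum(), 3)) < 0.8)

# Multinomial logistic regression fitted by full-batch gradient descent.
W = np.zeros((d, k))
Y = np.eye(k)[y]
for _ in range(300):
    Z = X @ W
    P = np.exp(Z - Z.max(axis=1, keepdims=True))  # numerically stable softmax
    P /= P.sum(axis=1, keepdims=True)
    W -= 0.5 * X.T @ (P - Y) / n

acc = np.mean((X @ W).argmax(axis=1) == y)
print(f"training accuracy: {acc:.2f}")
```

In the real data the feature dimension (4168) exceeds the sample size (1547), which is what pushes the thesis toward methods such as SVM that tolerate high dimensionality.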
APA, Harvard, Vancouver, ISO, and other styles
41

Machart, Pierre. "Coping with the Computational and Statistical Bipolar Nature of Machine Learning." Phd thesis, Aix-Marseille Université, 2012. http://tel.archives-ouvertes.fr/tel-00771718.

Full text
Abstract:
Machine Learning has its roots in a broad range of disciplines that includes Artificial Intelligence, Pattern Recognition, Statistics and Optimization. From the very beginnings of Machine Learning, computational questions and generalization properties have both been identified as central to the field. While the former concern questions of computability or complexity (at a fundamental level) and computational efficiency (from a more practical standpoint) of learning systems, the latter aim to understand and characterize how the solutions they provide will behave on new, previously unseen data. In recent years, the emergence of large-scale datasets in Machine Learning has profoundly reshaped the principles of Learning Theory. Once potential constraints on training time are taken into account, one faces a more complex trade-off than those classically handled in Statistics. A direct consequence is that the design of algorithms that are efficient (both in theory and in practice) and able to run on large-scale datasets must jointly take into account the statistical and computational aspects of Learning. This thesis aims to bring to light, analyze and exploit some of the connections that naturally exist between the statistical and computational aspects of Learning. More precisely, in a first part, we extend the stability analysis, which relates certain algorithmic properties to the generalization abilities of learning algorithms, to the confusion matrix, which we propose as a new (fine-grained) performance measure.
In a second part, we present a novel approach to learning a kernel-based regression function, in which the learned kernel directly serves the regression task, and which exploits the structure of the problem to provide an inexpensive optimization procedure. Finally, we study the trade-off between convergence rate and computational cost when minimizing a composite function with inexact proximal-gradient methods. In this context, we identify optimization strategies that are computationally optimal.
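The composite minimization setting studied in the final part of this thesis can be illustrated with the standard (exact) proximal-gradient method; the sketch below applies ISTA to an l1-regularized least-squares problem. The problem sizes and parameters are illustrative assumptions, and this exact baseline does not implement the thesis's inexact variant.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 20))            # design matrix (toy dimensions)
x_true = np.zeros(20)
x_true[:3] = [2.0, -1.0, 0.5]            # sparse ground truth
b = A @ x_true + 0.01 * rng.normal(size=50)

lam = 0.1                                 # l1 regularization weight
L = np.linalg.norm(A, 2) ** 2             # Lipschitz constant of the gradient

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# ISTA: alternate a gradient step on the smooth part ||Ax - b||^2 / 2
# with the proximal operator of the nonsmooth l1 part.
x = np.zeros(20)
for _ in range(500):
    grad = A.T @ (A @ x - b)
    x = soft_threshold(x - grad / L, lam / L)

print(np.round(x[:3], 1))  # close to the true nonzero coefficients
```

The trade-off the thesis analyzes arises when the gradient or the proximal step above is computed only approximately, trading per-iteration cost against convergence rate.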
APA, Harvard, Vancouver, ISO, and other styles
42

Ozogur-akyuz, Sureyya. "A Mathematical Contribution Of Statistical Learning And Continuous Optimization Using Infinite And Semi-infinite Programming To Computational Statistics." Phd thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/3/12610381/index.pdf.

Full text
Abstract:
A subfield of artificial intelligence, machine learning (ML) is concerned with the development of algorithms that allow computers to "learn". ML is the process of training a system with a large number of examples, extracting rules and finding patterns in order to make predictions on new data points (examples). The most common machine learning schemes are supervised, semi-supervised, unsupervised and reinforcement learning. These schemes are applied to natural language processing, search engines, medical diagnosis, bioinformatics, credit-fraud detection, stock market analysis, classification of DNA sequences, and speech and handwriting recognition in computer vision, to name just a few. In this thesis, we focus on Support Vector Machines (SVMs), one of the most powerful methods currently in machine learning. As a first motivation, we develop a model selection tool induced into SVM in order to solve a particular problem of computational biology, the prediction of eukaryotic pro-peptide cleavage sites, applied to real data collected from the NCBI data bank. Based on our biological example, a generalized model selection method is employed as a generalization for all kinds of learning problems. In ML algorithms, one of the crucial issues is the representation of the data. Discrete geometric structures and, especially, linear separability of the data play an important role in ML. If the data are not linearly separable, a kernel function transforms the nonlinear data into a higher-dimensional space in which they become linearly separable. As data become heterogeneous and large-scale, single-kernel methods become insufficient to classify nonlinear data. Convex combinations of kernels were developed to classify this kind of data [8]. Nevertheless, the selection of finite combinations of kernels is limited to a finite choice. In order to overcome this discrepancy, we propose a novel method of "infinite" kernel combinations for learning problems with the help of infinite and semi-infinite programming, regarding all elements in kernel space. This makes it possible to study variations of combinations of kernels when considering heterogeneous data in real-world applications. Combination of kernels can be done, e.g., along a homotopy parameter or a more specific parameter. Looking at all infinitesimally fine convex combinations of the kernels from the infinite kernel set, the margin is maximized subject to an infinite number of constraints with a compact index set and an additional (Riemann-Stieltjes) integral constraint due to the combinations. After a parametrization in the space of probability measures, the problem becomes semi-infinite. We analyze the regularity conditions which satisfy the Reduction Ansatz and discuss the type of distribution functions within the structure of the constraints and our bilevel optimization problem. Finally, we adapt well-known numerical methods of semi-infinite programming to our new kernel machine. We improve the discretization method for our specific model and propose two new algorithms. We prove the convergence of the numerical methods and analyze the conditions and assumptions of these convergence theorems, such as optimality and convergence.
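The finite, discretized counterpart of the infinite kernel combinations described above is a convex combination of Gram matrices, which remains a valid (positive semidefinite) kernel. The following minimal sketch uses assumed bandwidths and weights in a kernel ridge fit; it illustrates the building block, not the thesis's semi-infinite algorithm.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, size=(40, 1))
y = np.sin(3 * X[:, 0])                     # toy regression target

def rbf_gram(X, gamma):
    """Gram matrix of the RBF kernel exp(-gamma * ||x - x'||^2)."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

# Discretized stand-in for an "infinite" kernel family: a convex
# combination of RBF kernels over a grid of bandwidths.
gammas = [0.5, 2.0, 8.0]
weights = np.array([0.2, 0.5, 0.3])         # nonnegative, sums to 1
K = sum(w * rbf_gram(X, g) for w, g in zip(weights, gammas))

# Any convex combination of PSD Gram matrices is PSD, so K is a valid kernel.
alpha = np.linalg.solve(K + 1e-3 * np.eye(len(X)), y)   # kernel ridge fit
pred = K @ alpha
print(f"train MSE: {np.mean((pred - y) ** 2):.4f}")
```

The infinite version replaces the finite weight vector by a probability measure over the kernel parameter, which is what leads to the semi-infinite program discussed in the abstract.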
APA, Harvard, Vancouver, ISO, and other styles
43

Gonzales, Kalim. "Establishing a Learning Foundation in a Dynamically Changing World: Insights from Artificial Language Work." Diss., The University of Arizona, 2013. http://hdl.handle.net/10150/308884.

Full text
Abstract:
It is argued that infants build a foundation for learning about the world through their incidental acquisition of the spatial and temporal regularities surrounding them. A challenge is that learning occurs across multiple contexts whose statistics can differ greatly. Two artificial language studies with 12-month-olds demonstrate that infants come prepared to parse statistics across contexts using the temporal and perceptual features that distinguish one context from another. These results suggest that infants can organize their statistical input with a wider range of features than typically considered. Possible attention, decision making, and memory mechanisms are discussed.
APA, Harvard, Vancouver, ISO, and other styles
44

Amethier, Patrik, and André Gerbaulet. "Sales Volume Forecasting of Ericsson Radio Units - A Statistical Learning Approach." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-288504.

Full text
Abstract:
Demand forecasting is a well-established internal process at Ericsson, where employees from various departments collaborate to predict future sales volumes of specific products over horizons ranging from months to a few years. This study aims to evaluate Ericsson's current predictions for radio unit products, draw insights from historical volume data, and finally develop a novel statistical prediction approach. Specifically, a two-part statistical model consisting of a decision tree followed by a neural network is trained on previous sales data of radio units, and then evaluated (also on historical data) with regard to predictive accuracy. To test the hypothesis that mid-range volume predictions over a 1-3 year horizon made by data-driven statistical models can be more accurate than the existing process, the two-part model makes predictions per individual radio unit product based on several predictive attributes, mainly historical volume data and information relating to geography, country and customer trends. The majority of wMAPEs per product from the predictive model were shown to be less than 5% for the three prediction horizons, which can be compared to global wMAPEs from Ericsson's existing long-range forecast process of 9% for 1 year, 13% for 2 years and 22% for 3 years. These results suggest the strength of the data-driven predictive model. However, care must be taken when comparing the two error measures, and the large variances of the wMAPEs from the predictive model must be taken into account.
Ericsson has a well-established internal process for forecasting sales volumes, in which product-facing and customer-facing roles collaborate with the sourcing organization to secure accurate estimates of future demand. The purpose of this study is to evaluate previous forecasts and then develop a new predictive statistical model that forecasts based on historical data. The study focuses on the radio product category and develops a two-stage model consisting of a tree model and a neural network. To test the hypothesis that a 1-3 year forecast for a product can be made more accurate with a data-driven model, the model is trained on attributes linked to the product, for example historical volumes for the product and volume trends within the product's market areas and customer groups. This resulted in several forecasts over different horizons, namely 1-12 months, 13-24 months and 25-36 months. The majority of the wMAPE errors for these forecasts were shown to be below 5%, which can be compared with wMAPEs of 9% for Ericsson's existing 1-year forecasts, 13% for the 2-year forecasts and 22% for the 3-year forecasts. This indicates that data-driven statistical methods can be used to produce solid forecasts of future sales volumes, but consideration should be given to the comparison between the qualitative estimates and the statistical forecasts, as well as to the high variances in the errors.
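The abstracts above compare models by wMAPE. A common definition of weighted MAPE, total absolute error divided by total actual volume, can be sketched as below; this formula is an assumption on our part, since the thesis may define the measure slightly differently.

```python
import numpy as np

def wmape(actual, forecast):
    """Weighted MAPE: total absolute error over total actual volume."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.abs(actual - forecast).sum() / np.abs(actual).sum()

# Hypothetical monthly volumes for one product.
actual = np.array([100.0, 250.0, 40.0, 10.0])
forecast = np.array([90.0, 260.0, 55.0, 5.0])
print(f"wMAPE: {wmape(actual, forecast):.1%}")  # → wMAPE: 10.0%
```

Unlike plain MAPE, this weighting keeps low-volume products (where percentage errors explode) from dominating the aggregate measure.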
APA, Harvard, Vancouver, ISO, and other styles
45

Hazarika, Subhashis. "Statistical and Machine Learning Approaches For Visualizing and Analyzing Large-Scale Simulation Data." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1574692702479196.

Full text
APA, Harvard, Vancouver, ISO, and other styles
46

Lafon, Nicolas. "Statistical learning for geosciences : methods for extreme generation and data assimilation." Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASJ006.

Full text
Abstract:
The field of geosciences aims at a comprehensive understanding of the Earth system. It contributes to the understanding of major issues, notably the impact of climate change and the management of risks linked to extreme events. Geosciences benefit considerably from the massive influx of large-scale data, which makes them well suited to the use of Machine Learning (ML) algorithms. Because of its specific features, the analysis of geoscience data requires innovative ML formulations and methodologies. The work carried out in this thesis provides new ML-based tools adapted to the challenges of geosciences, with potential for broader applications beyond the geosciences domain. In the first part of this thesis, we propose an ML approach to estimate the distribution of dynamic spatio-temporal variables from noisy and irregular observations. Specifically, we introduce a learning framework to estimate both the state of a dynamical system and its uncertainties in the form of a covariance matrix. This method finds applications in data assimilation problems, in which noisy and sparse observations are available together with knowledge of the physical dynamics. Weather and oceanographic forecast models are concerned. The second part of this thesis presents a generative ML model producing new samples of an unknown multivariate distribution from examples. Our simulator provides samples outside the training data and makes extrapolation possible. This approach has direct applications in the study of environmental hazards, since it enables the numerical simulation of rare extreme samples.
The field of geosciences aims to comprehensively understand the Earth system. It addresses critical challenges, including the impact of climate change and the management of risks from extreme events. Geosciences benefit significantly from the influx of large-scale data, making the field conducive to machine learning (ML) applications. Because of its specific features, the analysis of geoscience data requires innovative ML formulations and methodologies. The work in this thesis contributes novel ML-based tools tailored to geoscience challenges, with the potential for broader applications beyond the geosciences domain. In the first part of this thesis, we propose an ML approach to estimate the distribution of dynamically driven spatio-temporal variables from noisy and irregular observations. Indeed, we introduce a learning framework to estimate both the state of a dynamical system and the associated uncertainties as a covariance matrix. Such a method finds applications in data assimilation problems, in which noisy and sparse observations are available coupled with knowledge about the physical dynamics. Weather and oceanographic forecast models are concerned. The second part of this thesis presents an ML-based generative model which produces new samples of an unknown multivariate distribution given examples. Our simulator provides samples outside of the training data and allows extrapolation. This approach has direct applications in the study of environmental hazards since it allows numerical simulation of rare extreme samples.
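For readers unfamiliar with the data assimilation setting described above, the classical way to combine a dynamical forecast with a noisy observation, yielding both a state estimate and its uncertainty, is the Kalman update. The one-dimensional textbook version is sketched below as a baseline for the same estimation problem; it is not the learned framework proposed in the thesis.

```python
def kalman_update(x_prior, P_prior, z, R):
    """Assimilate observation z (noise variance R) into a forecast with
    mean x_prior and variance P_prior; return posterior mean and variance."""
    K = P_prior / (P_prior + R)          # Kalman gain
    x_post = x_prior + K * (z - x_prior) # pull the forecast toward the data
    P_post = (1.0 - K) * P_prior         # uncertainty shrinks after update
    return x_post, P_post

# Forecast of 1.0 with variance 4.0, observation of 2.0 with variance 1.0:
x, P = kalman_update(x_prior=1.0, P_prior=4.0, z=2.0, R=1.0)
print(round(x, 3), round(P, 3))  # → 1.8 0.8
```

In the multivariate case the variances become covariance matrices, which is exactly the uncertainty object the thesis's learning framework estimates.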
APA, Harvard, Vancouver, ISO, and other styles
47

Saers, Markus. "Translation as Linear Transduction : Models and Algorithms for Efficient Learning in Statistical Machine Translation." Doctoral thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-135704.

Full text
Abstract:
Automatic translation has seen tremendous progress in recent years, mainly thanks to statistical methods applied to large parallel corpora. Transductions represent a principled approach to modeling translation, but existing transduction classes are either not expressive enough to capture structural regularities between natural languages or too complex to support efficient statistical induction on a large scale. A common approach is to severely prune search over a relatively unrestricted space of transduction grammars. These restrictions are often applied at different stages in a pipeline, with the obvious drawback of committing to irrevocable decisions that should not have been made. In this thesis we will instead restrict the space of transduction grammars to a space that is less expressive, but can be efficiently searched. First, the class of linear transductions is defined and characterized. They are generated by linear transduction grammars, which represent the natural bilingual case of linear grammars, as well as the natural linear case of inversion transduction grammars (and higher order syntax-directed transduction grammars). They are recognized by zipper finite-state transducers, which are equivalent to finite-state automata with four tapes. By allowing this extra dimensionality, linear transductions can represent alignments that finite-state transductions cannot, and by keeping the mechanism free of auxiliary storage, they become much more efficient than inversion transductions. Secondly, we present an algorithm for parsing with linear transduction grammars that allows pruning. The pruning scheme imposes no restrictions a priori, but guides the search to potentially interesting parts of the search space in an informed and dynamic way. Being able to parse efficiently allows learning of stochastic linear transduction grammars through expectation maximization. 
All the above work would be for naught if linear transductions were too poor a reflection of the actual transduction between natural languages. We test this empirically by building systems based on the alignments imposed by the learned grammars. The conclusion is that stochastic linear inversion transduction grammars learned from observed data stand up well to the state of the art.
APA, Harvard, Vancouver, ISO, and other styles
48

Scuderi, Marco Giovanni. "Bayesian approaches to learning from data how to untangle the travel behavior and land use relationships." College Park, Md. : University of Maryland, 2005. http://hdl.handle.net/1903/3201.

Full text
Abstract:
Thesis (Ph. D.)--University of Maryland, College Park, 2005.
"Bayesian scoring is used to evaluate and compare results from actual data collected for the Baltimore Metropolitan Area with the set of predominant conceptual frameworks linking travel behavior and land use obtained from the literature"--Abstract. Includes bibliographical references (p. 167-176) and abstract.
APA, Harvard, Vancouver, ISO, and other styles
49

Vogel, Robin. "Similarity ranking for biometrics : theory and practice." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT031.

Full text
Abstract:
The rapid increase in population, combined with the growing mobility of individuals, has created the need for sophisticated identity management systems. To this end, the term biometrics generally refers to methods for identifying individuals using biological or behavioral characteristics. The most popular methods, namely fingerprint, iris and face recognition, are all based on computer vision. The adoption of deep convolutional networks, made possible by general-purpose computing on graphics processors, has driven the recent advances in computer vision. These advances have led to a drastic improvement in the performance of conventional biometric methods, which has accelerated their adoption for practical use and sparked a public debate on the use of these techniques. In this context, the designers of biometric systems face a large number of challenges in training these networks. In this thesis, we consider these challenges from the point of view of statistical learning theory, which leads us to propose or sketch concrete solutions. First, we respond to a proliferation of work on similarity learning for deep networks that optimizes objective functions disconnected from the natural ranking goal sought in biometrics. Specifically, we introduce the notion of similarity ranking, highlighting the relationship between bipartite ranking and the search for a similarity suited to biometric identification. We then extend the theory of bipartite ranking to this new problem, while adapting it to the specificities of learning on pairs, in particular with regard to its computational cost.
The usual objective functions optimize predictive performance, but recent work has highlighted the need to take other factors into account when training a biometric system, such as biases present in the data, the robustness of predictions, or questions of fairness. The thesis addresses these three examples, offering a careful statistical study of each, as well as practical methods that give biometric system designers the tools needed to address these issues without compromising the performance of their algorithms.
The rapid growth in population, combined with the increased mobility of people, has created a need for sophisticated identity management systems. For this purpose, biometrics refers to the identification of individuals using behavioral or biological characteristics. The most popular approaches, i.e. fingerprint, iris or face recognition, are all based on computer vision methods. The adoption of deep convolutional networks, enabled by general-purpose computing on graphics processing units, made the recent advances in computer vision possible. These advances have led to drastic improvements for conventional biometric methods, which boosted their adoption in practical settings and stirred up public debate about these technologies. In this respect, biometric systems providers face many challenges when learning those networks. In this thesis, we consider those challenges from the angle of statistical learning theory, which leads us to propose or sketch practical solutions. First, we respond to the proliferation of papers on similarity learning for deep neural networks that optimize objective functions disconnected from the natural ranking aim sought in biometrics. Precisely, we introduce the notion of similarity ranking, highlighting the relationship between bipartite ranking and the requirements for similarities that are well suited to biometric identification. We then extend the theory of bipartite ranking to this new problem, adapting it to the specificities of pairwise learning, particularly those regarding its computational cost. Usual objective functions optimize for predictive performance, but recent work has underlined the necessity of considering other aspects when training a biometric system, such as dataset bias, prediction robustness or notions of fairness.
The thesis tackles all three of these examples by proposing their careful statistical analysis, as well as practical methods that provide biometric systems manufacturers with the necessary tools to address those issues without jeopardizing the performance of their algorithms.
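The similarity-ranking objective described above, that same-identity pairs should score above different-identity pairs, can be illustrated with a simple pairwise hinge loss over cosine similarities. This sketch is a generic illustration with an assumed margin and toy data, not the thesis's estimator.

```python
import numpy as np

rng = np.random.default_rng(3)

def pairwise_ranking_loss(emb, labels, margin=0.5):
    """Mean hinge loss over all (positive pair, negative pair) combinations:
    the cosine similarity of same-identity pairs should exceed that of
    different-identity pairs by at least `margin`."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = emb @ emb.T
    iu = np.triu_indices(len(labels), k=1)          # all unordered pairs
    same = labels[iu[0]] == labels[iu[1]]
    pos, neg = sim[iu][same], sim[iu][~same]
    gaps = margin - (pos[:, None] - neg[None, :])   # margin violations
    return np.maximum(gaps, 0.0).mean()

emb = rng.normal(size=(12, 8))          # toy embeddings
labels = np.repeat(np.arange(4), 3)     # 4 identities, 3 samples each
print(f"loss: {pairwise_ranking_loss(emb, labels):.3f}")
```

Enumerating every positive/negative pair combination is quadratic in the number of pairs, which hints at the computational cost issues of pairwise learning that the thesis addresses.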
APA, Harvard, Vancouver, ISO, and other styles
50

Nelson, Jonathan David. "Optimal experimental design as a theory of perceptual and cognitive information acquisition /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2005. http://wwwlib.umi.com/cr/ucsd/fullcit?p3191765.

Full text
APA, Harvard, Vancouver, ISO, and other styles