Dissertations / Theses on the topic 'Statistical learning theory'
Consult the top 50 dissertations / theses for your research on the topic 'Statistical learning theory.'
Liang, Annie. "Economic Theory and Statistical Learning." Thesis, Harvard University, 2016. http://nrs.harvard.edu/urn-3:HUL.InstRepos:33493561.
Economics
Deng, Xinwei. "Contributions to statistical learning and statistical quantification in nanomaterials." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2009. http://hdl.handle.net/1853/29777.
Committee Chair: Wu, C. F. Jeff; Committee Co-Chair: Yuan, Ming; Committee Member: Huo, Xiaoming; Committee Member: Vengazhiyil, Roshan Joseph; Committee Member: Wang, Zhonglin. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Hill, S. "Applications of statistical learning theory to signal processing problems." Thesis, University of Cambridge, 2003. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.604048.
Hu, Qiao Ph D. Massachusetts Institute of Technology. "Application of statistical learning theory to plankton image analysis." Thesis, Massachusetts Institute of Technology, 2006. http://hdl.handle.net/1721.1/39206.
Includes bibliographical references (leaves 155-173).
A fundamental problem in limnology and oceanography is the inability to quickly identify and map distributions of plankton. This thesis addresses the problem by applying statistical machine learning to video images collected by an optical sampler, the Video Plankton Recorder (VPR). The research is focused on the development of a real-time automatic plankton recognition system to estimate plankton abundance. The system includes four major components: pattern representation/feature measurement, feature extraction/selection, classification, and abundance estimation. After an extensive study of a traditional learning vector quantization (LVQ) neural network (NN) classifier built on shape-based features and different pattern representation methods, I developed a classification system combining multi-scale co-occurrence matrix features with a support vector machine (SVM) classifier. This new method outperforms the traditional shape-based NN classifier by 12% in classification accuracy. Subsequent plankton abundance estimates are improved in regions of low relative abundance by more than 50%. Neither the NN nor the SVM classifier has a rejection metric. In this thesis, two rejection metrics were developed.
One was based on the Euclidean distance in the feature space for the NN classifier. The other used dual-classifier (NN and SVM) voting as output. Using the dual-classification method alone yields abundance estimates almost as good as human labeling on a test bed of real-world data. However, the distance rejection metric for the NN classifier might be more useful when the training samples are not "good", i.e., representative of the field data. In summary, this thesis advances the current state-of-the-art plankton recognition system by demonstrating that multi-scale texture-based features are more suitable for classifying field-collected images. The system was verified on a very large real-world dataset in a systematic way for the first time. The accomplishments include developing a multi-scale co-occurrence matrix and support vector machine system, a dual-classification system, automatic correction in abundance estimation, and the ability to obtain accurate abundance estimates from real-time automatic classification. The methods developed are generic and are likely to work on a range of other image classification applications.
by Qiao Hu.
Ph.D.
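The dual-classifier voting rule mentioned in the abstract above can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not spell out the voting rule or the abundance correction, so the code assumes the simplest reading (keep a label only when both classifiers agree, otherwise reject the sample). The function names and the class labels are hypothetical.

```python
# Minimal sketch of dual-classifier voting used as a rejection rule.
# Assumption: a sample is accepted only if both classifiers output the
# same label; disagreements are rejected (returned as None).

REJECT = None


def dual_vote(label_nn, label_svm):
    """Return the agreed label, or REJECT when the two classifiers disagree."""
    return label_nn if label_nn == label_svm else REJECT


def estimate_abundance(labels_nn, labels_svm):
    """Count accepted samples per class; rejected samples are not counted."""
    counts = {}
    for a, b in zip(labels_nn, labels_svm):
        label = dual_vote(a, b)
        if label is not REJECT:
            counts[label] = counts.get(label, 0) + 1
    return counts


# Hypothetical predictions from the two classifiers on four images.
nn_labels = ["copepod", "diatom", "copepod", "other"]
svm_labels = ["copepod", "copepod", "copepod", "other"]
print(estimate_abundance(nn_labels, svm_labels))  # {'copepod': 2, 'other': 1}
```

Under this reading, rejecting disagreements trades coverage for precision, which matters most for classes with low relative abundance.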
Shipitsyn, Aleksey. "Statistical Learning with Imbalanced Data." Thesis, Linköpings universitet, Filosofiska fakulteten, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-139168.
Wang, Hongyan. "Analysis of statistical learning algorithms in data dependent function spaces /." access full-text access abstract and table of contents, 2009. http://libweb.cityu.edu.hk/cgi-bin/ezdb/thesis.pl?phd-ma-b23750534f.pdf.
"Submitted to Department of Mathematics in partial fulfillment of the requirements for the degree of Doctor of Philosophy." Includes bibliographical references (leaves [87]-100).
Gianvecchio, Steven. "Application of information theory and statistical learning to anomaly detection." W&M ScholarWorks, 2010. https://scholarworks.wm.edu/etd/1539623563.
Srivastava, Santosh. "Bayesian minimum expected risk estimation of distributions for statistical learning /." Thesis, Connect to this title online; UW restricted, 2007. http://hdl.handle.net/1773/6765.
Wang, Ni. "Statistical Learning in Logistics and Manufacturing Systems." Diss., Georgia Institute of Technology, 2006. http://hdl.handle.net/1853/11457.
Rydén, Otto. "Statistical learning procedures for analysis of residential property price indexes." Thesis, KTH, Matematisk statistik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-207946.
Residential property price indexes are used to study the price development of housing over time. Modelling a residential property price index is not always easy, since housing is a heterogeneous good. This thesis analyses the difference between the two main hedonic index modelling methods: the hedonic time dummy variable method and the hedonic imputation method. These methods are analysed with a statistical learning procedure from a regression perspective, which includes ordinary least squares regression, Huber regression, lasso regression, ridge regression and principal component regression. The analysis is based on about 56,000 apartment transactions in Stockholm during the period 2013-2016 and is used to model several versions of a residential property price index. The modelled indexes are then analysed using both qualitative and quantitative methods, including a version of the bootstrap to compute an empirical confidence interval for the indexes and a standard error analysis of the index point estimates in each time period. The analysis shows that the hedonic time dummy variable method produces price indexes with smaller variance and also gives more robust indexes for a smaller data set. The thesis also shows that the use of more robust regression methods leads to more stable indexes that are less affected by outliers; robust regression methods are therefore recommended for a commercial implementation of a residential property price index.
Menke, Joshua E. "Improving machine learning through oracle learning /." Diss., CLICK HERE for online access, 2007. http://contentdm.lib.byu.edu/ETD/image/etd1726.pdf.
Yaman, Sibel. "A multi-objective programming perspective to statistical learning problems." Diss., Atlanta, Ga. : Georgia Institute of Technology, 2008. http://hdl.handle.net/1853/26470.
Committee Chair: Chin-Hui Lee; Committee Member: Anthony Yezzi; Committee Member: Evans Harrell; Committee Member: Fred Juang; Committee Member: James H. McClellan. Part of the SMARTech Electronic Thesis and Dissertation Collection.
Frank, Ernest. "The effect of individual difference variables, learning environment, and cognitive task on statistical learning performance." Morgantown, W. Va. : [West Virginia University Libraries], 2000. http://etd.wvu.edu/templates/showETD.cfm?recnum=1383.
Title from document title page. Document formatted into pages; contains xvi, 183 p. : ill. (some col.). Includes abstract. Includes bibliographical references (p. 160-173).
Li, Bin. "Statistical learning and predictive modeling in data mining." Columbus, Ohio : Ohio State University, 2006. http://rave.ohiolink.edu/etdc/view?acc%5Fnum=osu1155058111.
Agerberg, Jens. "Statistical Learning and Analysis on Homology-Based Features." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-273581.
Stable rank has been proposed as a data-level summary of the result of persistent homology, a method in topological data analysis. In this thesis we develop methods in statistical analysis and machine learning based on stable rank. Since stable rank can be viewed as a mapping into a Hilbert space, a kernel can be constructed from the inner product in that space. First, we investigate the properties of this kernel when it is used within machine learning methods such as the support vector machine (SVM). Then, building on the theory of embedding probability distributions in reproducing kernel Hilbert spaces, we investigate how the kernel can be used to develop a test for statistical hypothesis testing. Finally, as an alternative to kernel-based methods, a mapping into a Euclidean space with optimizable parameters is developed, which can be used as an input layer in a neural network. The methods are first evaluated on synthetic data. In addition, a statistical test is performed on OASIS, an open dataset in neuroradiology. Finally, the methods are evaluated on graph classification, based on a dataset collected from Reddit.
Lu, Yibiao. "Statistical methods with application to machine learning and artificial intelligence." Diss., Georgia Institute of Technology, 2012. http://hdl.handle.net/1853/44730.
Verleyen, Wim. "Machine learning for systems pathology." Thesis, University of St Andrews, 2013. http://hdl.handle.net/10023/4512.
Yang, Ying. "Discretization for Naive-Bayes learning." Monash University, School of Computer Science and Software Engineering, 2003. http://arrow.monash.edu.au/hdl/1959.1/9393.
Karlaftis, Vasileios Misak. "Structural and functional brain plasticity for statistical learning." Thesis, University of Cambridge, 2018. https://www.repository.cam.ac.uk/handle/1810/278790.
Guidolin, Massimo. "Asset prices on Bayesian learning paths /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2000. http://wwwlib.umi.com/cr/ucsd/fullcit?p9975886.
Huszár, Ferenc. "Scoring rules, divergences and information in Bayesian machine learning." Thesis, University of Cambridge, 2013. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.648333.
Yang, Liu. "Mathematical Theories of Interaction with Oracles." Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/559.
Denevi, Giulia. "Efficient Lifelong Learning Algorithms: Regret Bounds and Statistical Guarantees." Doctoral thesis, Università degli studi di Genova, 2019. http://hdl.handle.net/11567/986813.
Brodin, Kristoffer. "Statistical Machine Learning from Classification Perspective: Prediction of Household Ties for Economical Decision Making." Thesis, KTH, Matematisk statistik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-215923.
In modern society, many companies hold large data collections on their individual customers, containing information about attributes such as name, gender, marital status and address. These attributes can be used to link customers together depending on whether or not they share some form of relationship with each other. In this thesis, the goal is to investigate and compare methods for predicting relationships between individuals in terms of what we define as a household relationship, i.e. we want to identify which individuals share living expenses with each other. The aim is to investigate the ability of three supervised statistical machine learning methods, namely logistic regression (LR), artificial neural networks (ANN) and the support vector machine (SVM), to predict these household relationships, and to evaluate their predictive performance for different settings of their respective tuning parameters. Data on a limited set of individuals, containing information about household relationship and attributes, was available for this task. To apply these methods, the problem must be formulated in a form that enables supervised learning, i.e. a target variable Y and predictors X = (X1,…,Xp), based on the set of p attributes associated with each individual, must be derived. We have presented a technique that consists of forming pairs of individuals under the hypothesis H0 that they share a household relationship, and then constructing a significance test. This technique transforms the problem into a standard binary classification problem. A sample of observations for training the methods could be generated by randomly pairing individuals and using the information from the data collections to encode the corresponding outcomes of Y and X for each random pair. For evaluation and tuning of the three supervised learning methods, the observations in the sample were split into a training set, a validation set and a test set.
We have seen that the prediction error, in terms of misclassification rate, is very small for all methods, and that the two classes, H0 true and H0 false, lie far apart and are well separable. The data have proven to have a pronounced linear separability, which generally results in very small differences in misclassification rate as the tuning parameters are modified. Nevertheless, some variation in predictive performance due to the tuning configuration has been observed, and when computation time and computational power are also taken into account, optimal tuning parameters could still be determined for each method. When LR, ANN and SVM are then compared with optimal parameter settings, the test results show that there is no significant difference between the methods' performance and that they all predict well. Because of the difference in complexity between the methods, however, it has been concluded that SVM is the least suitable method to use while LR is the most suitable. ANN, however, handles complex and non-linear data better than LR; therefore, for future applications of the model, where the data may not exhibit the same linear separability, we find it appropriate to consider ANN as well. This thesis was written at Svenska Handelsbanken, one of the major banks in Sweden, with offices all over the world. Its head office is located at Kungsträdgården, Stockholm. Computations were carried out in the SAS software and data handling in the SQL database manager.
Dearden, Richard W. "Learning and planning in structured worlds." Thesis, National Library of Canada = Bibliothèque nationale du Canada, 2000. http://www.collectionscanada.ca/obj/s4/f2/dsk1/tape3/PQDD_0020/NQ56531.pdf.
Whalen, Andrew. "Computational, experimental, and statistical analyses of social learning in humans and animals." Thesis, University of St Andrews, 2016. http://hdl.handle.net/10023/8822.
Frigola-Alcalde, Roger. "Bayesian time series learning with Gaussian processes." Thesis, University of Cambridge, 2016. https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.709520.
Land, Walker, Dan Margolis, Ronald Gottlieb, Elizabeth Krupinski, and Jack Yang. "Improving CT prediction of treatment response in patients with metastatic colorectal carcinoma using statistical learning theory." BioMed Central, 2010. http://hdl.handle.net/10150/610011.
Robbin, Alice, and Lee Frost-Kumpf. "Extending theory for user-centered information systems: Diagnosing and learning from error in complex statistical data." John Wiley & Sons, Inc, 1997. http://hdl.handle.net/10150/105746.
Van der Merwe, Rudolph. "Sigma-Point Kalman Filters for Probabilistic Inference in Dynamic State-Space Models." Full text open access at:, 2004. http://content.ohsu.edu/u?/etd,8.
Vang, Jee. "Using a model of human cognition of causality to orient arcs in structural learning of Bayesian networks." Fairfax, VA : George Mason University, 2008. http://hdl.handle.net/1920/3386.
Vita: p. 249. Thesis director: Farrokh Alemi. Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computational Sciences and Informatics. Title from PDF t.p. (viewed Mar. 16, 2009). Includes bibliographical references (p. 238-248). Also issued in print.
Perrot, Michaël. "Theory and algorithms for learning metrics with controlled behaviour." Thesis, Lyon, 2016. http://www.theses.fr/2016LYSES072/document.
Many Machine Learning algorithms make use of a notion of distance or similarity between examples to solve various problems such as classification, clustering or domain adaptation. Depending on the tasks considered, these metrics should have different properties, but manually choosing an adapted comparison function can be tedious and difficult. A natural trend is then to automatically tailor such metrics to the task at hand. This is known as Metric Learning, and the goal is mainly to find the best parameters of a metric under some specific constraints. Standard approaches in this field usually focus on learning Mahalanobis distances or bilinear similarities, and one of the main limitations is that the control over the behaviour of the learned metrics is often limited. Furthermore, although some theoretical works exist to justify the generalization ability of the learned models, most of the approaches do not come with such guarantees. In this thesis we propose new algorithms to learn metrics with a controlled behaviour, and we put a particular emphasis on the theoretical properties of these algorithms. We propose four distinct contributions which can be separated into two parts, namely (i) controlling the metric with respect to a reference metric and (ii) controlling the underlying transformation corresponding to the learned metric. Our first contribution is a local metric learning method where the goal is to regress a distance proportional to the human perception of colors. Our approach is backed up by theoretical guarantees on the generalization ability of the learned metrics. In our second contribution we study theoretically the value of using a reference metric in a biased regularization term to help during the learning process. We propose to use three different theoretical frameworks, allowing us to derive three different measures of goodness for the reference metric. 
These measures give us some insight into the impact of the reference metric on the learned one. In our third contribution we propose a metric learning algorithm where the underlying transformation is controlled. The idea is that, instead of using similarity and dissimilarity constraints, we associate each learning example with a so-called virtual point belonging to the output space associated with the learned metric. We show theoretically that metrics learned in this way generalize well, but also that our approach is linked to a classic metric learning method based on pair constraints. In our fourth contribution we also try to control the underlying transformation of a learned metric. However, instead of considering a point-wise control, we consider a global one by forcing the transformation to follow the geometrical transformation associated with an optimal transport problem. From a theoretical standpoint, we discuss the link between the transformation associated with the learned metric and the transformation associated with the optimal transport problem. On a more practical side, we show the value of our approach for domain adaptation but also for a task of seamless copy in images.
Riggelsen, Carsten. "Approximation methods for efficient learning of Bayesian networks /." Amsterdam ; Washington, DC : IOS Press, 2008. http://www.loc.gov/catdir/toc/fy0804/2007942192.html.
Grimes, David B. "Learning by imitation and exploration : Bayesian models and applications in humanoid robotics /." Thesis, Connect to this title online; UW restricted, 2007. http://hdl.handle.net/1773/6879.
Pagliana, Nicolò. "On the Role of Regularization in Machine Learning: Classical Theory, Computational Aspects and Modern Regimes." Doctoral thesis, Università degli studi di Genova, 2022. http://hdl.handle.net/11567/1081700.
Cardamone, Dario. "Support Vector Machine a Machine Learning Algorithm." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017.
Shon, Aaron P. "Bayesian cognitive models for imitation /." Thesis, Connect to this title online; UW restricted, 2007. http://hdl.handle.net/1773/7013.
Zhu, Shaojuan. "Associative memory as a Bayesian building block /." Full text open access at:, 2008. http://content.ohsu.edu/u?/etd,655.
Guastavino, Sabrina. "Learning and inverse problems: from theory to solar physics applications." Doctoral thesis, Università degli studi di Genova, 2020. http://hdl.handle.net/11567/998315.
Berlin, Daniel. "Multi-class Supervised Classification Techniques for High-dimensional Data: Applications to Vehicle Maintenance at Scania." Thesis, KTH, Matematisk statistik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-209257.
In vehicle repair, troubleshooting is often more time-consuming than the repair itself. A systematic method for accurately predicting the source of a fault would therefore be a valuable tool for diagnosing repair actions. This thesis investigates the possibility of using Diagnostic Trouble Codes (DTCs), which are generated by the electronic systems in Scania's vehicles, as indicators for pinpointing the cause of a fault. The analysis was based on about 18,800 observations of vehicles for which both DTCs and replaced parts could be identified during the period March 2016 - March 2017. Two different strategies for generating classes were evaluated. Many of the classes had only a few observations, and to give the predictive models good conditions, only classes with sufficiently many observations in the training data were used. After processing, the data comprised 1547 observations of 4168 attributes, which demonstrates the high dimensionality of the problem and makes it impossible to apply standard methods of statistical analysis for large data sets. Two supervised statistical learning methods suitable for high-dimensional data with multiple classes, support vector machines (SVM) and neural networks (NN), are implemented and their results evaluated. The analysis showed that on data with 1547 observations of 4168 attributes (unique DTCs) and 7 classes, SVM could assign observations to the classes with 79.4% accuracy, compared with 75.4% for NN. The conclusions drawn from the analysis were that DTCs appear to have potential as indicators of fault causes in a predictive model, but that the underlying data should be improved to increase the accuracy of the predictive models. Future research opportunities for further improving and developing the model, together with suggestions for how supervised classification models can be used at Scania, have been identified.
Machart, Pierre. "Coping with the Computational and Statistical Bipolar Nature of Machine Learning." Phd thesis, Aix-Marseille Université, 2012. http://tel.archives-ouvertes.fr/tel-00771718.
Ozogur-akyuz, Sureyya. "A Mathematical Contribution Of Statistical Learning And Continuous Optimization Using Infinite And Semi-infinite Programming To Computational Statistics." Phd thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/3/12610381/index.pdf.
[…] "learn". ML is the process of training a system with a large number of examples, extracting rules and finding patterns in order to make predictions on new data points (examples). The most common machine learning schemes are supervised, semi-supervised, unsupervised and reinforcement learning. These schemes apply to natural language processing, search engines, medical diagnosis, bioinformatics, credit fraud detection, stock market analysis, classification of DNA sequences, and speech and handwriting recognition in computer vision, to name just a few. In this thesis, we focus on Support Vector Machines (SVMs), one of the most powerful methods currently used in machine learning. As a first motivation, we develop a model selection tool built into the SVM in order to solve a particular problem in computational biology: prediction of eukaryotic pro-peptide cleavage sites, applied to real data collected from the NCBI data bank. Based on our biological example, a generalized model selection method is employed for all kinds of learning problems. In ML algorithms, one of the crucial issues is the representation of the data. Discrete geometric structures and, especially, linear separability of the data play an important role in ML. If the data are not linearly separable, a kernel function maps them into a higher-dimensional space in which they are linearly separable. As the data become heterogeneous and large-scale, single-kernel methods become insufficient for classifying nonlinear data. Convex combinations of kernels were developed to classify this kind of data [8]. Nevertheless, the selection of combinations of kernels is limited to a finite choice. To overcome this limitation, we propose a novel method of "infinite" kernel combinations for learning problems with the help of infinite and semi-infinite programming over all elements of the kernel space. This makes it possible to study variations of combinations of kernels when considering heterogeneous data in real-world applications. Combination of kernels can be done, e.g., along a homotopy parameter or a more specific parameter. Looking at all infinitesimally fine convex combinations of the kernels from the infinite kernel set, the margin is maximized subject to an infinite number of constraints with a compact index set and an additional (Riemann-Stieltjes) integral constraint due to the combinations. After a parametrization in the space of probability measures, it becomes semi-infinite. We analyze the regularity conditions which satisfy the Reduction Ansatz and discuss the type of distribution functions within the structure of the constraints and our bilevel optimization problem. Finally, we adapted well-known numerical methods of semi-infinite programming to our new kernel machine. We improved the discretization method for our specific model and proposed two new algorithms. We proved the convergence of the numerical methods and analyzed the conditions and assumptions of these convergence theorems, such as optimality and convergence.
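As background for the abstract above, the following sketch shows the finite case that the thesis sets out to generalize: a convex combination of a fixed, finite set of kernels. It is only an illustration under stated assumptions and not the thesis's infinite-kernel method. The choice of a linear and an RBF kernel, the weights, the `gamma` value and all function names are hypothetical.

```python
import math

# Sketch of a FINITE convex combination of kernels:
#   k(x, y) = sum_i w_i * k_i(x, y),  with w_i >= 0 and sum_i w_i = 1.
# A convex combination of valid kernels is again a valid kernel.


def linear_kernel(x, y):
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative choice."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def combined_kernel(x, y, weights, kernels):
    """Evaluate the convex combination of the given kernels at (x, y)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return sum(w * k(x, y) for w, k in zip(weights, kernels))


def gram_matrix(points, weights, kernels):
    """Gram matrix of the combined kernel over a list of points."""
    return [[combined_kernel(p, q, weights, kernels) for q in points]
            for p in points]


points = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
G = gram_matrix(points, [0.3, 0.7], [linear_kernel, rbf_kernel])
for row in G:
    print(row)
```

The resulting Gram matrix could be passed to any kernel method that accepts a precomputed kernel; choosing the weights is the model selection problem that the abstract describes.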
Gonzales, Kalim. "Establishing a Learning Foundation in a Dynamically Changing World: Insights from Artificial Language Work." Diss., The University of Arizona, 2013. http://hdl.handle.net/10150/308884.
Amethier, Patrik, and André Gerbaulet. "Sales Volume Forecasting of Ericsson Radio Units - A Statistical Learning Approach." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-288504.
Ericsson has a well-established internal process for forecasting sales volumes, in which product-facing and customer-facing roles collaborate with the sourcing organisation to ensure accurate estimates of future demand. The purpose of this study is to evaluate previous forecasts and then develop a new predictive statistical model that forecasts based on historical data. The study focuses on the radio product category and develops a two-stage model consisting of a tree model and a neural network. To test the hypothesis that a 1-3 year forecast for a product can be made more accurate with a data-driven model, the model is trained on attributes linked to the product, for example historical volumes of the product and volume trends within the product's market areas and customer groups. This resulted in several forecasts over different time horizons, namely 1-12 months, 13-24 months and 25-36 months. The majority of the wMAPE errors for these forecasts were found to lie below 5%, which can be compared with a wMAPE of 9% for Ericsson's existing 1-year forecasts, 13% for the 2-year forecasts and 22% for the 3-year forecasts. This indicates that data-driven statistical methods can be used to produce solid forecasts of future sales volumes, but consideration should be given to the comparison between the qualitative estimates and the statistical forecasts, as well as to the high variances of the errors.
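The abstract above reports errors as wMAPE without defining it. A minimal sketch follows, assuming the common definition (total absolute error divided by total absolute actual volume); the thesis may weight the errors differently. The function name and the sample volumes are hypothetical.

```python
def wmape(actual, forecast):
    """Weighted MAPE: sum of absolute errors over sum of absolute actuals.

    Assumption: this is the common volume-weighted definition; it is not
    taken from the thesis itself.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    total = sum(abs(a) for a in actual)
    if total == 0:
        raise ValueError("wMAPE is undefined when all actual values are zero")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / total


# Hypothetical monthly sales volumes and forecasts.
actual = [100, 200, 50]
forecast = [90, 210, 55]
print(f"{wmape(actual, forecast):.4f}")  # 25 / 350, prints 0.0714
```

Unlike the plain MAPE, this measure does not blow up on low-volume periods, because each error is weighted by the actual volume.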
Hazarika, Subhashis. "Statistical and Machine Learning Approaches For Visualizing and Analyzing Large-Scale Simulation Data." The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1574692702479196.
Lafon, Nicolas. "Statistical learning for geosciences : methods for extreme generation and data assimilation." Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASJ006.
The field of geosciences aims to comprehensively understand the Earth system. It addresses critical challenges, including the impact of climate change and the management of risks from extreme events. Geosciences benefit significantly from the influx of large-scale data, which makes the field well suited to machine learning (ML) applications. Because of its specific features, the analysis of geoscience data requires innovative ML formulations and methodologies. The work in this thesis contributes novel ML-based tools tailored to geoscience challenges, with the potential for broader applications beyond the geosciences domain. In the first part of this thesis, we propose an ML approach to estimate the distribution of dynamically driven spatio-temporal variables from noisy and irregular observations. We introduce a learning framework to estimate both the state of a dynamical system and its associated uncertainty, expressed as a covariance matrix. Such a method can find applications in data assimilation problems, in which noisy and sparse observations are available together with knowledge about the physical dynamics; weather and oceanographic forecast models are examples. The second part of this thesis presents an ML-based generative model that produces new samples of an unknown multivariate distribution given examples. Our simulator provides samples outside of the training data and allows extrapolation. This approach has direct applications to the study of environmental hazards, since it allows numerical simulation of rare extreme samples.
Saers, Markus. "Translation as Linear Transduction : Models and Algorithms for Efficient Learning in Statistical Machine Translation." Doctoral thesis, Uppsala universitet, Institutionen för lingvistik och filologi, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-135704.
Scuderi, Marco Giovanni. "Bayesian approaches to learning from data how to untangle the travel behavior and land use relationships." College Park, Md. : University of Maryland, 2005. http://hdl.handle.net/1903/3201.
"Bayesian scoring is used to evaluate and compare results from actual data collected for the Baltimore Metropolitan Area with the set of predominant conceptual frameworks linking travel behavior and land use obtained from the literature"--Abstract. Includes bibliographical references (p. 167-176) and abstract.
Vogel, Robin. "Similarity ranking for biometrics : theory and practice." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAT031.
The rapid growth in population, combined with the increased mobility of people, has created a need for sophisticated identity management systems. For this purpose, biometrics refers to the identification of individuals using behavioral or biological characteristics. The most popular approaches, i.e. fingerprint, iris or face recognition, are all based on computer vision methods. The adoption of deep convolutional networks, enabled by general-purpose computing on graphics processing units, made the recent advances in computer vision possible. These advances have led to drastic improvements for conventional biometric methods, which boosted their adoption in practical settings and stirred up public debate about these technologies. In this respect, biometric systems providers face many challenges when learning those networks. In this thesis, we consider those challenges from the angle of statistical learning theory, which leads us to propose or sketch practical solutions. First, we respond to the proliferation of papers on similarity learning for deep neural networks that optimize objective functions disconnected from the natural ranking aim sought in biometrics. Precisely, we introduce the notion of similarity ranking by highlighting the relationship between bipartite ranking and the requirements for similarities that are well suited to biometric identification. We then extend the theory of bipartite ranking to this new problem by adapting it to the specificities of pairwise learning, particularly those regarding its computational cost. Usual objective functions optimize for predictive performance, but recent work has underlined the necessity of considering other aspects when training a biometric system, such as dataset bias, prediction robustness or notions of fairness. 
The thesis tackles all three of those examples by proposing their careful statistical analysis, as well as practical methods that give biometric systems manufacturers the necessary tools to address those issues without jeopardizing the performance of their algorithms.
Nelson, Jonathan David. "Optimal experimental design as a theory of perceptual and cognitive information acquisition /." Diss., Connect to a 24 p. preview or request complete full text in PDF format. Access restricted to UC campuses, 2005. http://wwwlib.umi.com/cr/ucsd/fullcit?p3191765.