Dissertations / Theses on the topic 'Machine Learning Model Robustness'
Consult the top 50 dissertations / theses for your research on the topic 'Machine Learning Model Robustness.'
Adams, William A. "Analysis of Robustness in Lane Detection using Machine Learning Models." Ohio University / OhioLINK, 2015. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1449167611.
Lundström, Linnea. "Formally Verifying the Robustness of Machine Learning Models: A Comparative Study." Thesis, Linköpings universitet, Programvara och system, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-167504.
Mauri, Lara. "Data Partitioning and Compensation Techniques for Secure Training of Machine Learning Models." Doctoral thesis, Università degli Studi di Milano, 2022. http://hdl.handle.net/2434/932387.
Rado, Omesaad A. M. "Contributions to evaluation of machine learning models. Applicability domain of classification models." Thesis, University of Bradford, 2019. http://hdl.handle.net/10454/18447.
Full textMinistry of Higher Education in Libya
Cherief-Abdellatif, Badr-Eddine. "Contributions to the theoretical study of variational inference and robustness." Electronic Thesis or Diss., Institut polytechnique de Paris, 2020. http://www.theses.fr/2020IPPAG001.
This PhD thesis deals with variational inference and robustness. More precisely, it focuses on the statistical properties of variational approximations and the design of efficient algorithms for computing them in an online fashion, and investigates Maximum Mean Discrepancy based estimators as learning rules that are robust to model misspecification. In recent years, variational inference has been extensively studied from the computational viewpoint, but until very recently little attention had been paid in the literature to its theoretical properties. In this thesis, we investigate the consistency of variational approximations in various statistical models and the conditions that ensure it. In particular, we tackle the special cases of mixture models and deep neural networks. We also justify in theory the use of the ELBO maximization strategy, a model selection criterion that is widely used in the Variational Bayes community and is known to work well in practice. Moreover, Bayesian inference provides an attractive online-learning framework for analyzing sequential data, and offers generalization guarantees which hold even under model mismatch and with adversaries. Unfortunately, exact Bayesian inference is rarely feasible in practice and approximation methods are usually employed, but do such methods preserve the generalization properties of Bayesian inference? In this thesis, we show that this is indeed the case for some variational inference algorithms. We propose new online, tempered variational algorithms and derive their generalization bounds. Our theoretical result relies on the convexity of the variational objective, but we argue that it should hold more generally and present empirical evidence in support of this. Our work presents theoretical justifications in favor of online algorithms that rely on approximate Bayesian methods. Another point addressed in this thesis is the design of a universal estimation procedure. This question is of major interest, in particular because it leads to robust estimators, a very active topic in statistics and machine learning. We tackle the problem of universal estimation using a minimum distance estimator based on the Maximum Mean Discrepancy. We show that the estimator is robust both to dependence and to the presence of outliers in the dataset. We also highlight the connections that may exist with minimum distance estimators using the L2-distance. Finally, we provide a theoretical study of the stochastic gradient descent algorithm used to compute the estimator, and we support our findings with numerical simulations. We also propose a Bayesian version of our estimator, which we study from both theoretical and computational points of view.
Ilyas, Andrew. "On practical robustness of machine learning systems." Thesis, Massachusetts Institute of Technology, 2018. https://hdl.handle.net/1721.1/122911.
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 71-79).
We consider the importance of robustness in evaluating machine learning systems, and in particular systems involving deep learning. We consider these systems' vulnerability to adversarial examples--subtle, crafted perturbations to inputs which induce large changes in output. We show that these adversarial examples are not only a theoretical concern, by designing the first 3D adversarial objects, and by demonstrating that such examples can be constructed even when malicious actors have little power. We suggest a potential avenue for building robust deep learning models by leveraging generative models.
by Andrew Ilyas.
M. Eng.
Ishii, Shotaro, and David Ljunggren. "A Comparative Analysis of Robustness to Noise in Machine Learning Classifiers." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-302532.
Data originating from real-world measurements often contain distortions to some extent. Such distortions can in some cases lead to degraded classification accuracy. This study compares three classification algorithms with respect to how robust they are when the data presented to them contain synthetic distortions. More specifically, random forests, support vector machines, and artificial neural networks were trained and compared on four different datasets with varying levels of synthetic distortion. In summary, the random forest performed best and was the most robust classifier at eight of the ten distortion levels, closely followed by the artificial neural network. At the two remaining distortion levels, the support vector machine with a linear kernel performed best and was the most robust classifier.
Ebrahimi, Javid. "Robustness of Neural Networks for Discrete Input: An Adversarial Perspective." Thesis, University of Oregon, 2019. http://hdl.handle.net/1794/24535.
Fagogenis, Georgios. "Increasing the robustness of autonomous systems to hardware degradation using machine learning." Thesis, Heriot-Watt University, 2016. http://hdl.handle.net/10399/3378.
Haussamer, Nicolai. "Model Calibration with Machine Learning." Master's thesis, University of Cape Town, 2018. http://hdl.handle.net/11427/29451.
Zhao, Yajing. "Chaotic Model Prediction with Machine Learning." BYU ScholarsArchive, 2020. https://scholarsarchive.byu.edu/etd/8419.
Rudraraju, Nitesh Varma, and Varun Boyanapally. "Data Quality Model for Machine Learning." Thesis, Blekinge Tekniska Högskola, Institutionen för programvaruteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-18498.
Harte, Thomas James. "Discrete-time model-based Iterative Learning Control: stability, monotonicity and robustness." Thesis, University of Sheffield, 2007. http://etheses.whiterose.ac.uk/3624/.
Ferdowsi, Khosrowshahi Aidin. "Distributed Machine Learning for Autonomous and Secure Cyber-physical Systems." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/99466.
Doctor of Philosophy
In order to deliver innovative technological services to their residents, smart cities will rely on autonomous cyber-physical systems (CPSs) such as cars, drones, sensors, power grids, and other networks of digital devices. Maintaining stability, robustness, and security (SRS) of those smart city CPSs is essential for the functioning of our modern economies and societies. SRS can be defined as the ability of a CPS, such as an autonomous vehicular system, to operate without disruption in its quality of service. In order to guarantee SRS of CPSs one must overcome many technical challenges such as CPSs' vulnerability to various disruptive events such as natural disasters or cyber attacks, limited resources, scale, and interdependency. Such challenges must be considered for CPSs in order to design vehicles that are controlled autonomously and whose motion is robust against unpredictable events in their trajectory, to implement stable Internet of digital devices that work with a minimum communication delay, or to secure critical infrastructure to provide services such as electricity, gas, and water systems. The primary goal of this dissertation is, thus, to develop novel foundational analytical tools, that weave together notions from machine learning, game theory, and control theory, in order to study, analyze, and optimize SRS of autonomous CPSs which eventually will improve the quality of service provided by smart cities. To this end, various frameworks and effective algorithms are proposed in order to enhance the SRS of CPSs and pave the way toward the practical deployment of autonomous CPSs and applications. The results show that the developed solutions can enable a CPS to operate efficiently while maintaining its SRS. As such, the outcomes of this research can be used as a building block for the large deployment of smart city technologies that can be of immense benefit to tomorrow's societies.
Tigreat, Philippe. "Sparsity, redundancy and robustness in artificial neural networks for learning and memory." Thesis, Ecole nationale supérieure Mines-Télécom Atlantique Bretagne Pays de la Loire, 2017. http://www.theses.fr/2017IMTA0046/document.
The objective of research in Artificial Intelligence (AI) is to reproduce human cognitive abilities by means of modern computers. The results of the last few years seem to announce a technological revolution that could profoundly change society. We focus our interest on two fundamental cognitive aspects, learning and memory. Associative memories offer the possibility to store information elements and to retrieve them using a sub-part of their content, thus mimicking human memory. Deep learning allows a transition from an analog perception of the outside world to a sparse and more compact representation. In Chapter 2, we present a neural associative memory model inspired by Willshaw networks, with constrained connectivity. This brings a performance improvement in message retrieval and a more efficient storage of information. In Chapter 3, a convolutional architecture was applied to a task of reading partially displayed words under similar conditions as in a former psychology study on human subjects. This experiment put in evidence the similarities in behavior between the network and the human subjects regarding various properties of the display of words. Chapter 4 introduces a new method for representing categories using neuron assemblies in deep networks. For problems with a large number of classes, this allows a significant reduction in the dimensions of a network. Chapter 5 describes a method for interfacing deep unsupervised networks with clique-based associative memories.
Menke, Joshua Ephraim. "Improving Machine Learning Through Oracle Learning." BYU ScholarsArchive, 2007. https://scholarsarchive.byu.edu/etd/843.
Full textWang, Gang. "Solution path algorithms : an efficient model selection approach /." View abstract or full-text, 2007. http://library.ust.hk/cgi/db/thesis.pl?CSED%202007%20WANGG.
Full textWang, Jiahao. "Vehicular Traffic Flow Prediction Model Using Machine Learning-Based Model." Thesis, Université d'Ottawa / University of Ottawa, 2021. http://hdl.handle.net/10393/42288.
Full textFerreira, E. (Eija). "Model selection in time series machine learning applications." Doctoral thesis, Oulun yliopisto, 2015. http://urn.fi/urn:isbn:9789526209012.
Model selection is an essential part of solving any practical modelling problem. Since the true model underlying the behaviour of the phenomenon being modelled cannot be known, the purpose of model selection is to choose, from a set of candidate models, the one closest to it. This doctoral thesis addresses model selection in applications involving time-series data through four steps commonly followed in a machine learning process: data preprocessing, algorithm selection, feature selection, and validation. The thesis examines how the properties and amount of available data should be taken into account when choosing an algorithm, and how the data should be divided into training, testing, and validation sets to optimise the generalisability and future performance of the model. Particular constraints and requirements for applying conventional machine learning methods to time-series data are also discussed. In particular, the work aims to highlight the problems of model over-fitting and over-selection that can follow from careless or uninformed use of model selection methods. The practical results of the work are based on applying machine learning methods to time-series modelling in three research areas: spot welding, estimation of energy expenditure during physical exercise, and modelling of cognitive load. Building on these results, the thesis offers general guidelines that can be used as an aid when starting to solve a new machine learning problem, especially with regard to the properties and amount of data, the available computational resources, and the possible time-series nature of the problem. The influence of practical considerations and constraints imposed by the model's final operating environment on algorithm selection is also discussed.
Uziela, Karolis. "Protein Model Quality Assessment : A Machine Learning Approach." Doctoral thesis, Stockholms universitet, Institutionen för biokemi och biofysik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:su:diva-137695.
At the time of the doctoral defense, the following paper was unpublished and had a status as follows: Paper 3: Manuscript.
de la Rúa Martínez, Javier. "Scalable Architecture for Automating Machine Learning Model Monitoring." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280345.
In recent years, the concept of MLOps has become increasingly popular due to the advent of more sophisticated tools for exploratory data analysis, data management, model training, and model serving in production. As an attempt to bring DevOps processes to the Machine Learning (ML) lifecycle, MLOps aims at more automation in the execution of the diverse and repetitive tasks along the cycle, and at smoother interoperability between the teams and tools involved. In this context, the major cloud providers have built their own ML platforms [4, 34, 61], offered as services in their cloud solutions. In addition, several frameworks have been developed to solve concrete problems such as data testing, data labelling, distributed training, or prediction interpretation, and new monitoring methods have been proposed [32, 33, 65]. Of all the stages in the ML lifecycle, model monitoring is often overlooked despite its relevance. Recently, cloud providers have presented their own tools for use within their platforms [4, 61], while work is ongoing to integrate existing frameworks [72] with open-source model-serving solutions [38]. Most of these frameworks are either built as an add-on to an existing platform (i.e. they lack portability), follow a scheduled batch-processing approach with a minimum rate of a few hours, or impose limitations on certain outlier and drift algorithms due to the design of the platform architecture in which they are integrated. In this work, a scalable automated cloud-native architecture for ML model monitoring with a streaming approach is designed and evaluated. An experiment performed on a 7-node cluster with 250,000 requests at different concurrency levels shows maximum latencies of 5.9, 29.92, and 30.86 seconds after request time for 75% of the distance-based outlier detection, windowed statistics, and distribution-based data drift detection requests respectively, using windows of 15 seconds' length and a watermark delay of 6 seconds.
Kothawade, Rohan Dilip. "Wine quality prediction model using machine learning techniques." Thesis, Högskolan i Skövde, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-20009.
Full textChida, Anjum A. "Protein Tertiary Model Assessment Using Granular Machine Learning Techniques." Digital Archive @ GSU, 2012. http://digitalarchive.gsu.edu/cs_diss/65.
Full textLee, Wei-En. "Visualizations for model tracking and predictions in machine learning." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/113133.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (pages 82-84).
Building machine learning models is often an exploratory and iterative process. A data scientist frequently builds and trains hundreds of models with different parameters and feature sets in order to find one that meets the desired criteria. However, it can be difficult to keep track of all the parameters and metadata that are associated with the models. ModelDB, an end-to-end system for managing machine learning models, is a tool that solves this problem of model management. In this thesis, we present a graphical user interface for ModelDB, along with an extension for visualizing model predictions. The core user interface for model management augments the ModelDB system, which previously consisted only of native client libraries and a backend. The interface provides new ways of exploring, visualizing, and analyzing model data through a web application. The prediction visualizations extend the core user interface by providing a novel prediction matrix that displays classifier outputs in order to convey model performance at the example level. We present the design and implementation of both the core user interface and the prediction visualizations, discussing at each step the motivations behind key features. We evaluate the prediction visualizations through a pilot user study, which produces preliminary feedback on the practicality and utility of the interface. The overall goal of this research is to provide a powerful, user-friendly interface that leverages the data stored in ModelDB to generate effective visualizations for analyzing and improving models.
by Wei-En Lee.
M. Eng.
Bagheri, Rajeoni Alireza. "Analog Circuit Sizing Using Machine Learning Based Transistor Circuit Model." University of Akron / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=akron1609428170125214.
Full textSharma, Sagar. "Towards Data and Model Confidentiality in Outsourced Machine Learning." Wright State University / OhioLINK, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=wright1567529092809275.
Lanctot, J. Kevin (Joseph Kevin). "Discrete estimator algorithms: a mathematical model of machine learning." Dissertation (Mathematics), Carleton University, Ottawa, 1989.
Kokkonen, H. (Henna). "Effects of data cleaning on machine learning model performance." Bachelor's thesis, University of Oulu, 2019. http://jultika.oulu.fi/Record/nbnfioulu-201911133081.
Full textBadayos, Noah Garcia. "Machine Learning-Based Parameter Validation." Diss., Virginia Tech, 2014. http://hdl.handle.net/10919/47675.
Ph. D.
Abdurahiman, Vakulathil. "Towards inducing a simulation model description." Thesis, Brunel University, 1994. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.239138.
Full textGeras, Krzysztof Jerzy. "Exploiting diversity for efficient machine learning." Thesis, University of Edinburgh, 2018. http://hdl.handle.net/1842/28839.
Full textStroulia, Eleni. "Failure-driven learning as model-based self-redesign." Diss., Georgia Institute of Technology, 1994. http://hdl.handle.net/1853/8291.
Full textCaceres, Carlos Antonio. "Machine Learning Techniques for Gesture Recognition." Thesis, Virginia Tech, 2014. http://hdl.handle.net/10919/52556.
Master of Science
Follett, Stephen James. "A computational model of learning in Go." Thesis, University of South Wales, 2001. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.343412.
Full textNath, Gourabmoy. "A Model of Situation Learning in Design." Thesis, The University of Sydney, 1999. https://hdl.handle.net/2123/25096.
Full textMurray-Smith, Roderick. "A local model network approach to nonlinear modelling." Thesis, University of Strathclyde, 1994. http://oleg.lib.strath.ac.uk:80/R/?func=dbin-jump-full&object_id=27067.
Full textAbdullah, Siti Norbaiti binti. "Machine learning approach for crude oil price prediction." Thesis, University of Manchester, 2014. https://www.research.manchester.ac.uk/portal/en/theses/machine-learning-approach-for-crude-oil-price-prediction(949fa2d5-1a4d-416a-8e7c-dd66da95398e).html.
Full textViswanathan, Srinidhi. "ModelDB : tools for machine learning model management and prediction storage." Thesis, Massachusetts Institute of Technology, 2017. http://hdl.handle.net/1721.1/113540.
Full textCataloged from PDF version of thesis.
Includes bibliographical references (pages 99-100).
Building a machine learning model is often an iterative process. Data scientists train hundreds of models before finding a model that meets acceptable criteria. But tracking these models and remembering the insights obtained from them is an arduous task. In this thesis, we present two main systems for facilitating better tracking, analysis, and querying of scikit-learn machine learning models. First, we introduce our scikit-learn client for ModelDB, a novel end-to-end system for managing machine learning models. The client allows data scientists to easily track diverse scikit-learn workflows with minimal changes to their code. Then, we describe our extension to ModelDB, PredictionStore. While the ModelDB client enables users to track the different models they have run, PredictionStore creates a prediction matrix to tackle the remaining piece of the puzzle: facilitating better exploration and analysis of model performance. We implement a query API to assist in analyzing predictions and answering nuanced questions about models. We also implement a variety of algorithms to recommend particular models to ensemble, utilizing the prediction matrix. We evaluate ModelDB and PredictionStore on different datasets and determine that ModelDB successfully tracks scikit-learn models, and that most complex model queries can be executed in a matter of seconds using our query API. In addition, the workflows demonstrate significant improvement in accuracy using the ensemble algorithms. The overall goal of this research is to provide a flexible framework for training scikit-learn models, storing their predictions/models, and efficiently exploring and analyzing the results.
by Srinidhi Viswanathan.
M. Eng.
Adhikari, Bhisma. "Intelligent Simulink Modeling Assistance via Model Clones and Machine Learning." Miami University / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=miami1627040347560589.
Full textAnam, Md Tahseen. "Evaluate Machine Learning Model to Better Understand Cutting in Wood." Thesis, Uppsala universitet, Institutionen för informationsteknologi, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-448713.
Full textZhou, Wei. "Analysing the Robustness of Semantic Segmentation for Autonomous Vehicles." Thesis, University of Sydney, 2020. https://hdl.handle.net/2123/22699.
Full textPouilly-Cathelain, Maxime. "Synthèse de correcteurs s’adaptant à des critères multiples de haut niveau par la commande prédictive et les réseaux de neurones." Electronic Thesis or Diss., université Paris-Saclay, 2020. http://www.theses.fr/2020UPASG019.
This PhD thesis deals with the control of nonlinear systems subject to nondifferentiable or nonconvex constraints. The objective is to design a control law that takes into account any type of constraint that can be evaluated online. To achieve this goal, model predictive control has been used in addition to barrier functions included in the cost function. A gradient-free optimization algorithm has been used to solve this optimization problem. Besides, a cost function formulation has been proposed to ensure stability and robustness against disturbances for linear systems. The proof of stability is based on invariant sets and Lyapunov theory. In the case of nonlinear systems, dynamic neural networks have been used as a predictor for model predictive control. Machine learning algorithms and the nonlinear observers required for the use of neural networks have been studied. Finally, our study has focused on improving neural network prediction in the presence of disturbances. The synthesis method presented in this work has been applied to obstacle avoidance by an autonomous vehicle.
Darwiche, Aiman A. "Machine Learning Methods for Septic Shock Prediction." Diss., NSUWorks, 2018. https://nsuworks.nova.edu/gscis_etd/1051.
Full textChapala, Usha Kiran, and Sridhar Peteti. "Continuous Video Quality of Experience Modelling using Machine Learning Model Trees." Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 1996. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-17814.
Full textWu, Michael (Michael Q. ). "The synthetic student : a machine learning model to simulate MOOC data." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/100681.
Full textCataloged from PDF version of thesis.
Includes bibliographical references (page 103).
It's now possible to take all of your favorite courses online. With growing popularity, Massive Open Online Courses (MOOCs) offer a learning opportunity to anyone with a computer - as well as an opportunity for researchers to investigate student learning through the accumulation of data about student-course interactions. Unfortunately, efforts to mine student data for information are currently limited by privacy concerns over how the data can be distributed. In this thesis, we present a generative model that learns from student data at the click-by-click level. When fully trained, this model is able to generate synthetic student data at the click-by-click level that can be released to the public. To develop a model at such granularity, we had to learn problem submission tendencies, characterize time spent viewing webpages and problem submission grades, and analyze how student activity transitions from week to week. We further developed a novel multi-level time-series model that goes beyond the classic Markov model and HMM methods used by most state-of-the art ML methods for weblogs, and showed that our model performs better than these methods. After training our model on a 6.002x course on edX, we generated synthetic data and found that a classifier that predicts student dropout is 93% as effective (by AUC) when trained on the simulated data as when trained on the real data. Lastly, we found that using features learned by our model improves dropout prediction performance by 9.5%.
by Michael Wu.
M. Eng.
Shen, Yingzhen. "Forecasting Twitter topic popularity using bass diffusion model and machine learning." Thesis, Massachusetts Institute of Technology, 2015. http://hdl.handle.net/1721.1/99575.
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 91-93).
Today social network websites like Twitter are important information sources for a company's marketing, logistics and supply chain. Sometimes a topic about a product will "explode" at a "peak day," suddenly being talked about by a large number of users. Predicting the diffusion process of a Twitter topic is meaningful for a company to forecast demand and plan ahead to dispatch its products. In this study, we collected Twitter data on 220 topics, covering a wide range of fields, and created 12 features for each topic at each time stage, e.g. number of tweets mentioning this topic per hour, number of followers of users already mentioning this topic, and percentage of root tweets among all tweets. The task in this study is to predict the total mention count within the whole time horizon, 180 days, as early and accurately as possible. To complete this task, we applied two models: fitting the curve denoting topic popularity (the mention count curve) with the Bass diffusion model, and using machine learning models including K-nearest-neighbor, linear regression, bagged tree, and ensemble to learn the topic popularity as a function of the features we created. The results of this study reveal that the basic Bass model captures the underlying mechanism of the Twitter topic development process, so that the adoption of a Twitter topic can be treated as analogous to the diffusion of a new product. Using only the mention count over the whole time horizon, the Bass model has much better predictive accuracy compared to machine learning models with extra features. However, even with the best model (the Bass model) and focusing on the subset of topics with better predictability, predictive accuracy is still not good enough before the "explosion day." This is because an "explosion" is usually triggered by news outside Twitter, and therefore is hard to predict without information outside Twitter.
by Yingzhen Shen.
S.M. in Transportation
Essaidi, Moez. "Model-Driven Data Warehouse and its Automation Using Machine Learning Techniques." Paris 13, 2013. http://scbd-sto.univ-paris13.fr/secure/edgalilee_th_2013_essaidi.pdf.
This thesis aims at proposing an end-to-end approach which allows the automation of the process of model transformations for the development of data warehousing components. The main idea is to reduce as much as possible the intervention of human experts by reusing the traces of transformations produced on similar projects. The goal is to use supervised learning techniques to handle concept definitions with the same expressive level as the manipulated data. The nature of the manipulated data leads us to choose relational languages for the description of examples and hypotheses. These languages have the advantage of being expressive, giving the possibility to express relationships between the manipulated objects, but they have the major disadvantage of lacking algorithms that scale to large industrial applications. To solve this problem, we have proposed an architecture that allows the full exploitation of the knowledge obtained from transformation invariants between models and metamodels. This way of proceeding has highlighted the dependencies between the concepts to learn and has led us to propose a learning paradigm, called dependent-concept learning. Finally, this thesis presents various aspects that may influence the next generation of data warehousing platforms. The latter suggests, in particular, an architecture for business-intelligence-as-a-service based on the most recent and promising industrial standards and technologies.
Oskarsson, Emma. "Machine Learning Model for Predicting the Repayment Rate of Loan Takers." Thesis, Umeå universitet, Institutionen för fysik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-184154.
Full textQader, Aso, and William Shiver. "Developing an Advanced Internal Ratings-Based Model by Applying Machine Learning." Thesis, KTH, Matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-273418.
Since the regulatory framework Basel II was implemented in 2007, banks have been allowed to develop internal risk models to calculate the capital requirement. Using data on defaulted consumer loans from Hoist Finance, the thesis evaluates the advanced internal ratings-based model. In particular, the work focuses on how banks active in the non-performing loan sector can risk-classify their loans despite limited data availability about the borrowers. In addition, the effect of the maximum collection period on the capital requirement is analysed. In summary, a comparison of five models based on previous research in the field showed that the loans can be modelled by a two-stage tree model with logistic regression and zero-inflated beta regression, resulting in a maximum collection period of eight years. At the same time, it is worth noting the difficulty of distinguishing between low-risk and high-risk borrowers when mainly elementary data about the borrowers are analysed. A recommended addition to the analysis in future research is to include macroeconomic variables in order to better incorporate the effect of economic downturns.
Ferrer, Martínez Claudia. "Machine Learning for Solar Energy Prediction." Thesis, Högskolan i Gävle, Avdelningen för elektronik, matematik och naturvetenskap, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:hig:diva-27423.