Dissertations / Theses on the topic 'Machine learning, Global Optimization'

Consult the top 50 dissertations / theses for your research on the topic 'Machine learning, Global Optimization.'

1

Nowak, Hans II (Hans Antoon). "Strategic capacity planning using data science, optimization, and machine learning." Thesis, Massachusetts Institute of Technology, 2020. https://hdl.handle.net/1721.1/126914.

Full text
Abstract:
Thesis: M.B.A., Massachusetts Institute of Technology, Sloan School of Management, in conjunction with the Leaders for Global Operations Program at MIT, May, 2020
Thesis: S.M., Massachusetts Institute of Technology, Department of Mechanical Engineering, in conjunction with the Leaders for Global Operations Program at MIT, May, 2020
Cataloged from the official PDF of thesis.
Includes bibliographical references (pages 101-104).
Raytheon's Circuit Card Assembly (CCA) factory in Andover, MA is Raytheon's largest factory and the largest Department of Defense (DOD) CCA manufacturer in the world. With over 500 operations, it manufactures over 7000 unique parts with a high degree of complexity and varying levels of demand. Recently, the factory has seen an increase in demand, making the ability to continuously analyze factory capacity and strategically plan for future operations essential. This study seeks to develop a sustainable strategic capacity optimization model and capacity visualization tool that integrates demand data with historical manufacturing data. Through automated data mining of factory data sources, capacity utilization and overall equipment effectiveness (OEE) for factory operations are evaluated. Machine learning methods are then assessed to obtain an accurate estimate of cycle time (CT) throughout the factory. Finally, a mixed-integer nonlinear program (MINLP) integrates the capacity utilization framework and machine learning predictions to compute the optimal strategic capacity planning decisions. It is shown that the capacity utilization and OEE models can be generated through automated data mining algorithms. The machine learning models achieve a mean absolute error (MAE) of 1.55 on predictions for new data, which is 76.3% lower than the current CT prediction error. Finally, the MINLP is solved to optimality within a tolerance of 1.00e-04 and generates resource and production decisions that can be acted upon.
by Hans Nowak II.
M.B.A.
S.M.
M.B.A. Massachusetts Institute of Technology, Sloan School of Management
S.M. Massachusetts Institute of Technology, Department of Mechanical Engineering
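
As a concrete illustration of the CT-prediction step described in the abstract, here is a minimal hedged sketch (not the thesis code) of fitting a regressor and reporting the mean absolute error; the data is a synthetic stand-in for the factory's operation features and measured cycle times.

```python
# Hedged sketch: fit a cycle-time (CT) regressor and report held-out MAE.
# The dataset is a synthetic stand-in, not the thesis's factory data.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=8, noise=5.0, random_state=0)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
model = GradientBoostingRegressor().fit(X_tr, y_tr)
mae = mean_absolute_error(y_te, model.predict(X_te))
print(f"held-out MAE: {mae:.2f}")
# The fitted CT predictions can then enter a MINLP capacity model as parameters.
```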
2

Veluscek, Marco. "Global supply chain optimization : a machine learning perspective to improve caterpillar's logistics operations." Thesis, Brunel University, 2016. http://bura.brunel.ac.uk/handle/2438/13050.

Full text
Abstract:
Supply chain optimization is one of the key components of the effective management of a company with a complex manufacturing process and distribution network. Companies with a global presence in particular are motivated to optimize their distribution plans in order to keep their operating costs low and competitive. Changing conditions in the global market and volatile energy prices increase the need for an automatic decision and optimization tool. In recent years, many techniques and applications have been proposed to address the problem of supply chain optimization. However, such techniques are often too problem-specific or too knowledge-intensive to be implemented as inexpensive, easy-to-use computer systems. The effort required to implement an optimization system for a new instance of the problem appears to be quite significant. The development process necessitates the involvement of expert personnel and the level of automation is low. The aim of this project is to develop a set of strategies capable of increasing the level of automation when developing a new optimization system. An increased level of automation is achieved by focusing on three areas: multi-objective optimization, optimization algorithm usability, and optimization model design. A literature review highlighted the great level of interest in the problem of multi-objective optimization in the research community. However, the review emphasized a lack of standardization in the area and insufficient understanding of the relationship between multi-objective strategies and problems. Experts in the area of optimization and artificial intelligence are interested in improving the usability of the most recent optimization algorithms. They have raised the concern that the large number of variants and parameters that characterizes such algorithms affects their potential applicability in real-world environments. Such characteristics are seen as the root cause of the low success of the most recent optimization algorithms in industrial applications. A crucial task in the development of an optimization system is the design of the optimization model. It is one of the most complex tasks in the development process; however, it is still performed mostly manually. The importance and the complexity of the task strongly suggest the development of tools to aid the design of optimization models. In order to address these challenges, the problem of multi-objective optimization is considered first, and the most widely adopted techniques to solve it are identified. Such techniques are analyzed and described in detail to increase the level of standardization in the area. Empirical evidence is highlighted to suggest what type of relationship exists between strategies and problem instances. Regarding the optimization algorithm, a classification method is proposed to improve its usability and computational requirements by automatically tuning one of its key parameters, the termination condition. The algorithm assesses the problem complexity and automatically assigns the best termination condition to minimize runtime. The runtime of the optimization system has been reduced by more than 60%. Arguably, the usability of the algorithm has been improved as well, as one of the key configuration tasks can now be completed automatically. Finally, a system is presented to aid the definition of the optimization model through regression analysis.
The purpose of the method is to gather as much knowledge about the problem as possible so that the task of defining the optimization model requires less user involvement. It is estimated that applying the proposed algorithm could have saved almost 1000 man-weeks in completing the project. The developed strategies have been applied to the problem of Caterpillar's global supply chain optimization. This thesis also describes the process of developing an optimization system for Caterpillar and highlights the challenges and research opportunities identified while undertaking this work. It describes the optimization model designed for Caterpillar's supply chain and the implementation details of the Ant Colony System, the algorithm selected to optimize the supply chain. The system is now used to design the distribution plans of more than 7,000 products. The system improved Caterpillar's marginal profit on such products by 4.6% on average.
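
The thesis names the Ant Colony System (ACS) as the selected algorithm. As a rough, hedged illustration of its core mechanics, here is a sketch of the ACS state-transition and local pheromone-update rules; all names and parameter values are illustrative and not taken from the thesis.

```python
import random

def acs_choose_next(current, candidates, tau, eta, q0=0.9, beta=2.0):
    """ACS state-transition rule: exploit the best edge with probability q0,
    otherwise sample proportionally to tau * eta**beta (biased exploration).
    tau (pheromone) and eta (heuristic desirability) are indexed by edge."""
    scores = {j: tau[current, j] * eta[current, j] ** beta for j in candidates}
    if random.random() < q0:
        return max(scores, key=scores.get)
    r, acc = random.random() * sum(scores.values()), 0.0
    for j, s in scores.items():
        acc += s
        if acc >= r:
            return j
    return j

def acs_local_update(tau, edge, rho=0.1, tau0=0.01):
    """Local pheromone decay applied as each ant traverses an edge."""
    tau[edge] = (1 - rho) * tau[edge] + rho * tau0
```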
3

Schweidtmann, Artur M. [Verfasser], Alexander [Akademischer Betreuer] Mitsos, and Andreas [Akademischer Betreuer] Schuppert. "Global optimization of processes through machine learning / Artur M. Schweidtmann ; Alexander Mitsos, Andreas Schuppert." Aachen : Universitätsbibliothek der RWTH Aachen, 2021. http://d-nb.info/1240690924/34.

Full text
4

Taheri, Mehdi. "Machine Learning from Computer Simulations with Applications in Rail Vehicle Dynamics and System Identification." Diss., Virginia Tech, 2016. http://hdl.handle.net/10919/81417.

Full text
Abstract:
The application of stochastic modeling for learning the behavior of multibody dynamics models is investigated. The stochastic modeling technique is also known as Kriging or the random function approach. Post-processing data from a simulation run is used to train the stochastic model that estimates the relationship between model inputs, such as the suspension relative displacement and velocity, and the output, for example, the sum of suspension forces. The computational efficiency of Multibody Dynamics (MBD) models can be improved by replacing their computationally intensive subsystems with stochastic predictions. The stochastic modeling technique is able to learn the behavior of a physical system and integrate that behavior into MBD models, resulting in improved real-time simulations and reduced computational effort in models with repeated substructures (for example, modeling a train with a large number of rail vehicles). Since the sampling plan greatly influences the overall accuracy and efficiency of the stochastic predictions, various sampling plans are investigated, and a space-filling Latin Hypercube sampling plan based on the traveling salesman problem (TSP) is suggested for efficiently representing the entire parameter space. The simulation results confirm the expected increase in modeling efficiency, although further research is needed to improve the accuracy of the predictions. The prediction accuracy is expected to improve through employing a sampling strategy that considers the discrete nature of the training data and uses infill criteria that consider the shape of the output function and detect sample spaces with high prediction errors. It is recommended that future efforts consider quantifying the computational efficiency of the proposed approach by overcoming the inefficiencies associated with transferring data between multiple software packages, which proved to be a limiting factor in this study. These limitations can be overcome by using the user subroutine functionality of SIMPACK and adding the stochastic modeling technique to its force library.
Ph. D.
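
For readers unfamiliar with the workflow the abstract describes, here is a hedged sketch of a Kriging (Gaussian-process) surrogate trained on a space-filling Latin Hypercube sampling plan; the force function and bounds are synthetic stand-ins for the expensive multibody subsystem.

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def suspension_force(x):
    """Placeholder for the expensive multibody subsystem being replaced."""
    disp, vel = x[:, 0], x[:, 1]
    return -50_000.0 * disp - 3_000.0 * vel

# Space-filling Latin Hypercube plan over (displacement, velocity) bounds.
sampler = qmc.LatinHypercube(d=2, seed=0)
X = qmc.scale(sampler.random(n=50), [-0.05, -1.0], [0.05, 1.0])
y = suspension_force(X)

# Kriging surrogate: a Gaussian process fitted to the sampled responses.
surrogate = GaussianProcessRegressor(kernel=RBF(length_scale=0.1),
                                     normalize_y=True).fit(X, y)
pred, std = surrogate.predict(np.array([[0.01, 0.2]]), return_std=True)
```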
5

Gabere, Musa Nur. "Prediction of antimicrobial peptides using hyperparameter optimized support vector machines." Thesis, University of the Western Cape, 2011. http://etd.uwc.ac.za/index.php?module=etd&action=viewtitle&id=gen8Srv25Nme4_7345_1330684697.

Full text
Abstract:

Antimicrobial peptides (AMPs) play a key role in the innate immune response. They can be found ubiquitously in a wide range of eukaryotes including mammals, amphibians, insects, plants, and protozoa. In lower organisms, AMPs function merely as antibiotics by permeabilizing cell membranes and lysing invading microbes. Prediction of antimicrobial peptides is important because the experimental methods used in characterizing AMPs are costly, time-consuming and resource-intensive, and identification of AMPs in insects can serve as a template for the design of novel antibiotics. To this end, data on antimicrobial peptides are first extracted from UniProt, manually curated and stored in a centralized database called the Dragon Antimicrobial Peptide Database (DAMPD). Secondly, based on the curated data, models to predict antimicrobial peptides are created using support vector machines with optimized hyperparameters. In particular, global optimization methods such as grid search, pattern search and derivative-free methods are utilised to optimize the SVM hyperparameters. These models are useful in characterizing unknown antimicrobial peptides. Finally, a webserver is created that will be used to predict antimicrobial peptides in haematophagous insects such as Glossina morsitans and Anopheles gambiae.
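
A minimal sketch of the hyperparameter-optimization step the abstract describes, using a grid search over the two standard RBF-SVM hyperparameters; the feature matrix here is a synthetic stand-in for the curated peptide data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the AMP feature matrix and labels.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Exhaustive grid search over C and gamma with 5-fold cross-validation.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [1e-3, 1e-2, 1e-1, 1.0]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print("best hyperparameters:", search.best_params_)
```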

6

Belkhir, Nacim. "Per Instance Algorithm Configuration for Continuous Black Box Optimization." Thesis, Université Paris-Saclay (ComUE), 2017. http://www.theses.fr/2017SACLS455/document.

Full text
Abstract:
This PhD thesis focuses on automated algorithm configuration, which aims at finding the best parameter setting for a given problem or class of problems. The algorithm configuration problem thus amounts to a meta-optimization problem in the space of parameters, whose meta-objective is the performance measure of the given algorithm with a given parameter configuration. However, in the continuous domain, such methods can only be empirically assessed at the cost of running the algorithm on some problem instances. More recent approaches rely on a description of problems in some feature space, and try to learn a mapping from this feature space onto the space of parameter configurations of the algorithm at hand. Along these lines, this PhD thesis focuses on Per Instance Algorithm Configuration (PIAC) for solving continuous black-box optimization problems, where only a limited budget of function evaluations is available. We first survey evolutionary algorithms for continuous optimization, with a focus on the two algorithms that we have used as target algorithms for PIAC, DE and CMA-ES. Next, we review the state of the art of algorithm configuration approaches, and the different features that have been proposed in the literature to describe continuous black-box optimization problems. We then introduce a general methodology to empirically study PIAC for the continuous domain, so that all the components of PIAC can be explored in real-world conditions. To this end, we also introduce a new continuous black-box test bench, distinct from the famous BBOB benchmark, composed of several multi-dimensional test functions with different problem properties, gathered from the literature. The methodology is finally applied to two EAs. First we use Differential Evolution as the target algorithm and explore all the components of PIAC in order to empirically assess the best ones. Second, based on the results for DE, we empirically investigate PIAC with the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) as the target algorithm. Both use cases empirically validate the proposed methodology on the new black-box test bench for dimensions up to 100.
7

Liu, Liu. "Stochastic Optimization in Machine Learning." Thesis, The University of Sydney, 2019. http://hdl.handle.net/2123/19982.

Full text
Abstract:
Stochastic optimization has received extensive attention in recent years due to its great potential for solving large-scale optimization problems. However, classical optimization algorithms and original stochastic methods can prove inefficient because: 1) the cost per iteration is computationally challenging, and 2) the convergence rates and complexity bounds are poor. In this thesis, we exploit stochastic optimization at three "orders" of optimization to address these problems. For stochastic zeroth-order optimization, we introduce a novel variance-reduction-based method under Gaussian smoothing and establish its complexity for optimizing non-convex problems. With variance reduction on both the sample space and the search space, the complexity of our algorithm is sublinear in d and strictly better than current approaches, in both the smooth and non-smooth cases. Moreover, we extend the proposed method to the mini-batch version. For stochastic first-order optimization, we consider two kinds of functions, with one finite sum and with two finite sums. For the first structure, we apply dual coordinate ascent and acceleration to propose a general scheme for a doubly accelerated stochastic method that deals with ill-conditioned problems. For the second structure, we address stochastic composition, involving inner and outer finite-sum functions with a large number of component functions, via variance reduction, which significantly improves the query complexity when the number of inner component functions is sufficiently large. For stochastic second-order optimization, we study a family of stochastic trust region and cubic regularization methods in which gradient, Hessian and function values are computed inexactly, and show that the iteration complexity to achieve $\epsilon$-approximate second-order optimality is of the same order as in previous work in which gradient and function values are computed exactly. The mild conditions on inexactness can be achieved in finite-sum minimization using random sampling.
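
As a pointer to the zeroth-order setting the abstract describes, here is a sketch of the basic Gaussian-smoothing gradient estimator on which such methods build; this is the standard estimator, illustrative only, not the thesis's variance-reduced algorithm.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, n_dirs=10, seed=0):
    """Estimate grad f(x) from function values only, via the Gaussian-smoothing
    identity E[(f(x + mu*u) - f(x)) / mu * u] with u ~ N(0, I)."""
    rng = np.random.default_rng(seed)
    fx, g = f(x), np.zeros_like(x)
    for _ in range(n_dirs):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - fx) / mu * u
    return g / n_dirs

f = lambda x: np.sum(x ** 2)      # toy smooth objective
x = np.ones(5)
x -= 0.1 * zo_gradient(f, x)      # one zeroth-order descent step
```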
8

Leblond, Rémi. "Asynchronous optimization for machine learning." Thesis, Paris Sciences et Lettres (ComUE), 2018. http://www.theses.fr/2018PSLEE057/document.

Full text
Abstract:
The impressive breakthroughs of the last two decades in the field of machine learning can be in large part attributed to the explosion of computing power and available data. These two limiting factors have been replaced by a new bottleneck: algorithms. The focus of this thesis is thus on introducing novel methods that can take advantage of high data quantity and computing power. We present two independent contributions. First, we develop and analyze novel fast optimization algorithms which take advantage of the advances in parallel computing architecture and can handle vast amounts of data. We introduce a new framework of analysis for asynchronous parallel incremental algorithms, which enables correct and simple proofs. We then demonstrate its usefulness by performing the convergence analysis for several methods, including two novel algorithms. Asaga is a sparse asynchronous parallel variant of the variance-reduced algorithm Saga which enjoys fast linear convergence rates on smooth and strongly convex objectives. We prove that it can be linearly faster than its sequential counterpart, even without sparsity assumptions. ProxAsaga is an extension of Asaga to the more general setting where the regularizer can be non-smooth. We prove that it can also achieve a linear speedup. We provide extensive experiments comparing our new algorithms to the current state of the art. Second, we introduce new methods for complex structured prediction tasks. We focus on recurrent neural networks (RNNs), whose traditional training algorithm, based on maximum likelihood estimation (MLE), suffers from several issues. The associated surrogate training loss notably ignores the information contained in structured losses and introduces discrepancies between train and test times that may hurt performance. To alleviate these problems, we propose SeaRNN, a novel training algorithm for RNNs inspired by the "learning to search" approach to structured prediction. SeaRNN leverages test-alike search space exploration to introduce global-local losses that are closer to the test error than the MLE objective. We demonstrate improved performance over MLE on three challenging tasks, and provide several subsampling strategies to enable SeaRNN to scale to large-scale tasks, such as machine translation. Finally, after contrasting the behavior of SeaRNN models to MLE models, we conduct an in-depth comparison of our new approach to the related work.
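
Asaga builds on the sequential SAGA update, which is compact enough to sketch here on a toy least-squares instance; the asynchronous sparse variant analyzed in the thesis runs this update in parallel without locks. Step size and problem data are illustrative.

```python
import numpy as np

def saga(grad_i, w0, n, lr=0.01, iters=5000, seed=0):
    """Sequential SAGA: keep a table of the last gradient seen for each of the
    n components and correct each stochastic step with it."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    table = np.array([grad_i(w, i) for i in range(n)])   # gradient memory
    avg = table.mean(axis=0)
    for _ in range(iters):
        i = rng.integers(n)
        g = grad_i(w, i)
        w -= lr * (g - table[i] + avg)                   # variance-reduced step
        avg += (g - table[i]) / n                        # keep the mean in sync
        table[i] = g
    return w

# Toy least-squares instance: f(w) = (1/n) * sum_i (a_i . w - b_i)^2.
A = np.random.default_rng(1).standard_normal((100, 5))
b = A @ np.ones(5)
w_hat = saga(lambda w, i: 2 * (A[i] @ w - b[i]) * A[i], np.zeros(5), n=100)
```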
9

Bai, Hao. "Machine learning assisted probabilistic prediction of long-term fatigue damage and vibration reduction of wind turbine tower using active damping system." Thesis, Normandie, 2021. http://www.theses.fr/2021NORMIR01.

Full text
Abstract:
This dissertation is devoted to the development of an active damping system for vibration reduction of wind turbine towers under gusty and turbulent wind. The presence of vibrations often leads either to an ultimate deflection at the top of the wind tower or to a failure due to material fatigue near the bottom of the tower. Furthermore, given the random nature of wind conditions, it is indispensable to look at this problem from a probabilistic point of view. In this work, a probabilistic framework for fatigue analysis is developed and improved by using a residual neural network. A damping system employing an active damper, the Twin Rotor Damper, is designed for the NREL 5MW reference wind turbine. The design is optimized by an evolutionary algorithm with an automatic parameter tuning method based on exploitation and exploration.
10

Chang, Allison An. "Integer optimization methods for machine learning." Thesis, Massachusetts Institute of Technology, 2012. http://hdl.handle.net/1721.1/72643.

Full text
Abstract:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2012.
This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.
Cataloged from student-submitted PDF version of thesis.
Includes bibliographical references (p. 129-137).
In this thesis, we propose new mixed integer optimization (MIO) methods to address problems in machine learning. The first part develops methods for supervised bipartite ranking, which arises in prioritization tasks in diverse domains such as information retrieval, recommender systems, natural language processing, bioinformatics, and preventative maintenance. The primary advantage of using MIO for ranking is that it allows for direct optimization of ranking quality measures, as opposed to current state-of-the-art algorithms that use heuristic loss functions. We demonstrate using a number of datasets that our approach can outperform other ranking methods. The second part of the thesis focuses on reverse-engineering ranking models. This is an application of a more general ranking problem than the bipartite case. Quality rankings affect business for many organizations, and knowing the ranking models would allow these organizations to better understand the standards by which their products are judged and help them to create higher quality products. We introduce an MIO method for reverse-engineering such models and demonstrate its performance in a case study with real data from a major ratings company. We also devise an approach to find the most cost-effective way to increase the rank of a certain product. In the final part of the thesis, we develop MIO methods to first generate association rules and then use the rules to build an interpretable classifier in the form of a decision list, which is an ordered list of rules. These are both combinatorially challenging problems because even a small dataset may yield a large number of rules and a small set of rules may correspond to many different orderings. We show how to use MIO to mine useful rules, as well as to construct a classifier from them. We present results in terms of both classification accuracy and interpretability for a variety of datasets.
by Allison An Chang.
Ph.D.
11

Reddi, Sashank Jakkam. "New Optimization Methods for Modern Machine Learning." Research Showcase @ CMU, 2017. http://repository.cmu.edu/dissertations/1116.

Full text
Abstract:
Modern machine learning systems pose several new statistical, scalability, privacy and ethical challenges. With the advent of massive datasets and increasingly complex tasks, scalability has especially become a critical issue in these systems. In this thesis, we focus on fundamental challenges related to scalability, such as computational and communication efficiency, in modern machine learning applications. The underlying central message of this thesis is that classical statistical thinking leads to highly effective optimization methods for modern big data applications. The first part of the thesis investigates optimization methods for solving large-scale nonconvex Empirical Risk Minimization (ERM) problems. Such problems have surged into prominence, notably through deep learning, and have led to exciting progress. However, our understanding of optimization methods suitable for these problems is still very limited. We develop and analyze a new line of optimization methods for nonconvex ERM problems, based on the principle of variance reduction. We show that our methods exhibit fast convergence to stationary points and improve the state-of-the-art in several nonconvex ERM settings, including nonsmooth and constrained ERM. Using similar principles, we also develop novel optimization methods that provably converge to second-order stationary points. Finally, we show that the key principles behind our methods can be generalized to overcome challenges in other important problems such as Bayesian inference. The second part of the thesis studies two critical aspects of modern distributed machine learning systems: asynchronicity and communication efficiency of optimization methods. We study various asynchronous stochastic algorithms with fast convergence for convex ERM problems and show that these methods achieve near-linear speedups in sparse settings common to machine learning. Another key factor governing the overall performance of a distributed system is its communication efficiency. Traditional optimization algorithms used in machine learning are often ill-suited for distributed environments with high communication cost. To address this issue, we discuss two different paradigms to achieve communication efficiency of algorithms in distributed environments and explore new algorithms with better communication complexity.
12

Etheve, Marc. "Solving repeated optimization problems by Machine Learning." Thesis, Paris, HESAM, 2021. http://www.theses.fr/2021HESAC040.

Full text
Abstract:
This thesis aims at using machine learning techniques in the context of Mixed Integer Linear Programming instances generated from stochastic data. Rather than solve these instances independently using the Branch and Bound algorithm (B&B), we propose to leverage the similarities between instances by learning inner strategies of this algorithm, such as node selection and branching. The main approach developed in this work is to use reinforcement learning to discover, by trial and error, strategies which minimize the B&B tree size. To properly adapt to the B&B environment, we define a new kind of tree-based transition, and elaborate on different cost models in the corresponding Markov Decision Processes. We prove the optimality of the unitary cost model under both classical and tree-based transitions, for either branching or node selection. However, we experimentally show that it may be beneficial to bias the cost so as to improve the learning stability. Regarding node selection, we formally exhibit an optimal strategy which can be learnt more efficiently by supervised learning. In addition, we propose to exploit the structure of the studied problems. To this end, we propose a decomposition-coordination methodology, a branching heuristic based on a graph representation of a B&B node and, finally, an approach for learning perturbations of the objective function.
13

Van, Mai Vien. "Large-Scale Optimization With Machine Learning Applications." Licentiate thesis, KTH, Reglerteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-263147.

Full text
Abstract:
This thesis aims at developing efficient algorithms for solving some fundamental engineering problems in data science and machine learning. We investigate a variety of acceleration techniques for improving the convergence times of optimization algorithms. First, we investigate how problem structure can be exploited to accelerate the solution of highly structured problems such as generalized eigenvalue problems and elastic net regression. We then consider Anderson acceleration, a generic and parameter-free extrapolation scheme, and show how it can be adapted to accelerate practical convergence of proximal gradient methods for a broad class of non-smooth problems. For all the methods developed in this thesis, we design novel algorithms, perform mathematical analysis of convergence rates, and conduct practical experiments on real-world data sets.
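
Anderson acceleration itself is compact enough to sketch; below is a minimal illustrative implementation (not the thesis code) for a fixed-point iteration x <- g(x), mixing the last m iterates with least-squares weights on the residuals.

```python
import numpy as np

def anderson(g, x0, m=5, iters=50):
    """Anderson acceleration of the fixed-point iteration x <- g(x): mix the
    last m iterates with least-squares weights on the residuals g(x) - x."""
    X, G = [x0], [g(x0)]
    for _ in range(iters):
        lo = max(0, len(X) - m)
        R = np.array([G[i] - X[i] for i in range(lo, len(X))])  # residual window
        # Minimize ||alpha^T R|| subject to sum(alpha) = 1, via normal equations.
        M = R @ R.T + 1e-10 * np.eye(len(R))
        alpha = np.linalg.solve(M, np.ones(len(R)))
        alpha /= alpha.sum()
        x_new = sum(a * gi for a, gi in zip(alpha, G[lo:]))
        X.append(x_new)
        G.append(g(x_new))
    return X[-1]

print(anderson(np.cos, np.array([1.0])))   # converges to the fixed point ~0.739
```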


14

Cardamone, Dario. "Support Vector Machine a Machine Learning Algorithm." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017.

Find full text
Abstract:
This master's thesis considers the Support Vector Machine classification algorithm. In particular, it considers its formulation as a Mixed Integer Programming optimization problem for the supervised binary classification of a data set.
15

Tugnoli, Riccardo. "MVA Calculation and Optimization with Machine Learning Techniques." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018.

Find full text
Abstract:
In the last few years a new margin requirement, the Initial Margin (IM), was introduced by Central Counterparties for cleared derivatives and by BCBS and IOSCO for Over-The-Counter derivatives, in order to reduce counterparty credit risk when trading derivatives. Moreover, because the IM is segregated, funding it always represents a cost. Consequently, a pricing adjustment, called the (Initial) Margin Valuation Adjustment (MVA), needs to be applied. Since the IM is based on a risk measure (i.e. Value at Risk or Expected Shortfall), the MVA calculation involves a nested Monte Carlo simulation, which is computationally intractable by "brute force". First, we analyze several approaches discussed in the recent literature for solving the computational problem. Then we test machine learning algorithms to minimize the MVA from a monetary point of view. In conclusion, we show two different methods for optimizing the computational time.
16

Dahlberg, Leslie. "Evolutionary Computation in Continuous Optimization and Machine Learning." Thesis, Mälardalens högskola, Akademin för innovation, design och teknik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-35674.

Full text
Abstract:
Evolutionary computation is a field which uses natural computational processes to optimize mathematical and industrial problems. Differential Evolution, Particle Swarm Optimization and the Estimation of Distribution Algorithm are some of the newer emerging varieties which have attracted great interest among researchers. This work has compared these three algorithms on a set of mathematical and machine learning benchmarks, and has also synthesized a new algorithm from the other three and compared it to them. The results from the benchmarks show which algorithm is best suited to handle various machine learning problems and present the advantages of using the new algorithm. The new algorithm, called DEDA (Differential Estimation of Distribution Algorithms), has shown promising results on both machine learning and mathematical optimization tasks.
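
For a quick, concrete reference point, Differential Evolution can be exercised on a standard multimodal benchmark via SciPy's reference implementation; this is illustrative and not the benchmark suite used in the thesis.

```python
import numpy as np
from scipy.optimize import differential_evolution

def rastrigin(x):
    """Standard multimodal benchmark; global minimum 0 at the origin."""
    return 10 * len(x) + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

result = differential_evolution(rastrigin, bounds=[(-5.12, 5.12)] * 5, seed=0)
print(result.x, result.fun)
```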
17

Konečný, Jakub. "Stochastic, distributed and federated optimization for machine learning." Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/31478.

Full text
Abstract:
We study optimization algorithms for the finite-sum problems frequently arising in machine learning applications. First, we propose novel variants of stochastic gradient descent with a variance reduction property that enables linear convergence for strongly convex objectives. Second, we study the distributed setting, in which the data describing the optimization problem does not fit into a single computing node. In this case, traditional methods are inefficient, as the communication costs inherent in distributed optimization become the bottleneck. We propose a communication-efficient framework which iteratively forms local subproblems that can be solved with arbitrary local optimization algorithms. Finally, we introduce the concept of Federated Optimization/Learning, where we try to solve machine learning problems without having data stored in any centralized manner. The main motivation comes from industry when handling user-generated data. The current prevalent practice is that companies collect vast amounts of user data and store them in datacenters. An alternative we propose is not to collect the data in the first place, and instead occasionally use the computational power of users' devices to solve the very same optimization problems, while alleviating privacy concerns at the same time. In such a setting, minimization of communication rounds is the primary goal, and we demonstrate that solving the optimization problems in such circumstances is conceptually tractable.
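
The federated setting the abstract introduces can be illustrated with a single local-update-and-average communication round in the FedAvg spirit; this is a hedged toy sketch on least squares, with names, step sizes and data all illustrative rather than taken from the thesis.

```python
import numpy as np

def federated_round(global_w, client_data, local_steps=10, lr=0.1):
    """One communication round: each client takes local gradient steps on its
    own data, then the server averages the resulting models."""
    updates = []
    for A, b in client_data:                           # data stays on-device
        w = global_w.copy()
        for _ in range(local_steps):
            w -= lr * 2 * A.T @ (A @ w - b) / len(b)   # local least-squares step
        updates.append(w)
    return np.mean(updates, axis=0)                    # only models travel

rng = np.random.default_rng(0)
w_true = np.ones(3)
clients = [(A, A @ w_true)
           for A in (rng.standard_normal((20, 3)) for _ in range(5))]
w = np.zeros(3)
for _ in range(20):
    w = federated_round(w, clients)                    # w approaches w_true
```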
18

Hedberg, Karolina. "Optimization of Insert-Tray Matching using Machine Learning." Thesis, Uppsala universitet, Avdelningen för systemteknik, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-452871.

Full text
Abstract:
The manufacturing process for carbide inserts at Sandvik Coromant consists of several operations. During some of these, the inserts are positioned on trays. For some inserts the trays are pre-defined, but for others the insert-tray matching is partly improvised. The goal of this thesis project is to examine whether machine learning can be used to predict which tray to use for a given insert. It is also investigated which insert features determine the choice of tray. The study is done with insert and tray data from four blasting operations and considers a set of standardized inserts, since it is assumed that the tray matching for these is well tuned. The algorithm used for the predictions is the supervised learning algorithm k-nearest neighbors. The problem of identifying the determining features is regarded as a feature selection problem and is addressed with the ReliefF algorithm. From the classification results it is seen that the classifiers are overfitting. The main reason for this is probably that the datasets contain features that together uniquely determine which tray is used. This was not detected during the feature selection, since ReliefF identifies features that are individually relevant to the output. One idea for avoiding overfitting is to exclude these defining features from the dataset; further work is thus recommended.
19

Singh, Karanpreet. "Accelerating Structural Design and Optimization using Machine Learning." Diss., Virginia Tech, 2020. http://hdl.handle.net/10919/104114.

Full text
Abstract:
Machine learning techniques promise to greatly accelerate structural design and optimization. In this thesis, deep learning and active learning techniques are applied to different non-convex structural optimization problems. Finite Element Analysis (FEA) based standard optimization methods for aircraft panels with bio-inspired curvilinear stiffeners are computationally expensive. The main reason for employing many of these standard optimization methods is the ease of their integration with FEA. However, each optimization requires multiple computationally expensive FEA evaluations, making their use impractical at times. To accelerate optimization, the use of Deep Neural Networks (DNNs) is proposed to approximate the FEA buckling response. The results show that DNNs obtained an accuracy of 95% for evaluating the buckling load. The DNN accelerated the optimization by a factor of nearly 200. The presented work demonstrates the potential of DNN-based machine learning algorithms for accelerating the optimization of bio-inspired curvilinearly stiffened panels. However, the approach has potential disadvantages: it is specific to similar structural design problems, and it requires large datasets for DNN training. An adaptive machine learning technique called active learning is used in this thesis to accelerate the evolutionary optimization of complex structures. The active learner helps the Genetic Algorithm (GA) by predicting whether a candidate design will satisfy the required constraints. The approach does not need a trained surrogate model prior to the optimization; the active learner adaptively improves its own accuracy during the optimization, reducing the required number of FEA evaluations. The results show that the approach has the potential to reduce the total required FEA evaluations by more than 50%. Lastly, machine learning is used to make recommendations for modeling choices when analyzing a structure using FEA. Decisions about the selection of appropriate modeling techniques are usually based on an analyst's judgement, drawing on knowledge and intuition from past experience. The machine learning-based approach provides recommendations within seconds, thus saving significant computational resources for making accurate design choices.
Doctor of Philosophy
This thesis presents an innovative application of artificial intelligence (AI) techniques for designing aircraft structures. An important objective for the aerospace industry is to design robust and fuel-efficient aerospace structures. State-of-the-art research suggests that future aircraft structures could mimic organic cellular structures. However, the design of these new panels with arbitrary structures is computationally expensive. For instance, the standard optimization methods currently applied to aerospace structures can take anywhere from a few days to months to design an aircraft. The presented research demonstrates the potential of AI for accelerating the optimization of aircraft structures. This will provide an efficient way for aircraft designers to design futuristic fuel-efficient aircraft, which will have a positive impact on the environment and the world.
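
A hedged sketch of the active-learning idea from the abstract: a classifier screens candidate designs so that the expensive FEA runs only on promising ones. Here run_fea and the two-dimensional design space are illustrative stand-ins, not the thesis's models.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def run_fea(design):
    """Placeholder for the expensive FEA constraint check (1.0 = satisfied)."""
    return float(design.sum() > 1.0)

rng = np.random.default_rng(0)
X_seen = rng.random((20, 2))                       # designs evaluated so far
y_seen = np.array([run_fea(d) for d in X_seen])
learner = RandomForestClassifier(random_state=0).fit(X_seen, y_seen)

candidates = rng.random((200, 2))                  # e.g., a GA offspring pool
probs = learner.predict_proba(candidates)[:, 1]
promising = candidates[probs > 0.5]                # only these go to FEA
# Append the FEA results for `promising` to (X_seen, y_seen) and refit, so the
# learner's accuracy adapts as the optimization proceeds.
```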
20

Dabert, Geoffrey. "Application of Machine Learning techniques to Optimization algorithms." Thesis, KTH, Optimeringslära och systemteori, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-207471.

Full text
Abstract:
Optimization problems were immune to any attempt at combining them with machine learning until a decade ago, but this is now an active research field. This thesis has studied the potential implementation of a machine-learning heuristic to improve the resolution of optimization scheduling problems based on a Constraint Programming solver. Some scheduling problems, known as NP-hard problems, suffer from large computational cost (a large number of jobs to schedule) and consequent human effort (well-suited heuristics need to be derived). Moreover, industrial scheduling problems obviously evolve over time, but many features and the basic structure remain the same. Hence they have potential for the implementation of a supervised-learning-based heuristic. The first part of the study was to model a given benchmark of instances and implement some well-known heuristics (such as earliest due date, combined with largest duration) in order to solve the benchmark. Based on the non-optimality of the returned solutions, primary instances were chosen on which to implement our method. The second part presents the procedure set up to design a supervised-learning-based heuristic. An instance generator was first built to map the potential industrial evolutions of the instances; it returned secondary instances constituting the learning database. Then a CP-suited node extraction scheme was set up to collect relevant information from the resolution of the search tree: it collects data from nodes of the search tree according to a proper criterion. These nodes are next projected onto a constant-dimensional space describing the system, the underlying subtree and the impact of the assignments. Upon these features and designed target values, statistical models are implemented: linear and gradient boosting regressions have been implemented, calibrated and tuned on the data. The last step was to integrate the supervised-learning model into a heuristic framework. This was done via a soft propagation module that tries the instantiation of all the children of the considered node and applies the learned model to them. The selection decision rule was based upon a reconstructed score. The third part was to test the implemented procedure. New secondary instances were generated and the supervised-learning-based heuristic was tested against the earliest-due-date one. The procedure was tested on two different instances, and the integrated heuristic returned positive results for both. For the first one (10 jobs to schedule), gains of 18% in the first solution found and 13% in the number of backtracks were realized. For the second instance (90 jobs to schedule), a gain in the first solution found of at least 16% was achieved. These results validate the implemented procedure and the methodology used.
21

Giarimpampa, Despoina. "Blind Image Steganalytic Optimization by using Machine Learning." Thesis, Högskolan i Halmstad, Akademin för informationsteknologi, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-38150.

Full text
Abstract:
Since antiquity, steganography has been used to protect sensitive information against unauthorized disclosure. Nevertheless, the evolution of digital media reveals that steganography has also been used as a tool for activities such as terrorism or child pornography. Given this background, steganalysis arises as an antidote to steganography. Steganalysis can be divided into two main approaches: universal (also called blind) and specific. Specific methods require prior knowledge of the steganographic technique under analysis. On the other hand, universal methods, which can be applied across a variety of algorithms, are more adaptable to real-world applications. Thus, it is necessary to establish even more accurate steganalysis techniques capable of detecting the hidden information produced by diverse steganographic methods. Considering this, a universal steganalysis method specialized in images is proposed. The method is based on the typical steganalysis process, where feature extractors and classifiers are used. The experiments were implemented with different embedding rates and for various steganographic techniques. It turns out that the proposed method succeeds for the most part, providing solid results on color images and promising results on gray-scale images.
22

Nguyen, Thanh Tan. "Selected non-convex optimization problems in machine learning." Thesis, Queensland University of Technology, 2020. https://eprints.qut.edu.au/200748/1/Thanh_Nguyen_Thesis.pdf.

Full text
Abstract:
Non-convex optimization is an important and rapidly growing research area. It is tied to the latest successes of deep learning, reinforcement learning, matrix factorization, and more. As a contribution to this area, this thesis provides analyses and algorithms for three important problems. The first is optimization of noisy functions defined on a large graph, which is useful for A/B testing and digital marketing. The second is learning a convex ensemble of basis models, with applications in regression and classification. The last is optimization of ResNet with restricted residual modules, which leads to better performance than standard ResNet.
23

Detassis, Fabrizio <1991&gt. "Methods for integrating machine learning and constrained optimization." Doctoral thesis, Alma Mater Studiorum - Università di Bologna, 2022. http://amsdottorato.unibo.it/10360/1/Detassis_Fabrizio_Thesis_Final.pdf.

Full text
Abstract:
In the framework of industrial problems, the application of Constrained Optimization is known to have very good overall modeling capability and performance, and it stands as one of the most powerful, explored, and exploited tools to address prescriptive tasks. The number of applications is huge, ranging from logistics to transportation, packing, production, telecommunication, scheduling, and much more. The main reason behind this success is the remarkable effort put in over the last decades by the OR community to develop realistic models and devise exact or approximate methods to solve the largest variety of constrained or combinatorial optimization problems, together with the spread of computational power and easily accessible OR software and resources. On the other hand, technological advancements have led to a wealth of data never seen before and increasingly push towards methods able to extract useful knowledge from it; among data-driven methods, Machine Learning techniques appear to be among the most promising, thanks to their successes in domains like Image Recognition, Natural Language Processing and game playing, and to the amount of research involved. The purpose of the present research is to study how Machine Learning and Constrained Optimization can be used together to achieve systems able to leverage the strengths of both methods: this would open the way to exploiting decades of research on resolution techniques for COPs while constructing models able to adapt and learn from available data. In the first part of this work, we survey the existing techniques and classify them according to the type, method, or scope of the integration; subsequently, we introduce Moving Target, a novel and general algorithm devised to inject knowledge into learning models through constraints. In the last part of the thesis, two applications stemming from real-world projects, done in collaboration with Optit, are presented.
24

Hill, Jerry L., and Randall P. Mora. "An Autonomous Machine Learning Approach for Global Terrorist Recognition." International Foundation for Telemetering, 2012. http://hdl.handle.net/10150/581675.

Full text
Abstract:
ITC/USA 2012 Conference Proceedings / The Forty-Eighth Annual International Telemetering Conference and Technical Exhibition / October 22-25, 2012 / Town and Country Resort & Convention Center, San Diego, California
A major intelligence challenge we face in today's national security environment is the threat of terrorist attack against our national assets, especially our citizens. This paper addresses global reconnaissance incorporating an autonomous Intelligent Agent/Data Fusion solution for recognizing the potential risk of terrorist attack. The solution identifies and reports imminent persona-oriented terrorist threats based on data reduction/compression of a large volume of low-latency data, possibly from hundreds or even thousands of data points.
25

Kindestam, Anton. "Graph-based features for machine learning driven code optimization." Thesis, KTH, Skolan för datavetenskap och kommunikation (CSC), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-211444.

Full text
Abstract:
In this paper we present a method of using the Shortest-Path Graph Kernel on graph-based features of computer programs to train a Support Vector Regression model which, given an unseen program and a point in optimization space, predicts execution-time speedup over baseline, based on a method proposed in Using Graph-Based Program Characterization for Predictive Modeling by Park et al. The optimization space is represented by command-line parameters to the polyhedral C-to-C compiler PoCC, and PolyBench is used to generate the data set of speedups over baseline. The model is found to produce results reasonable by some metrics, but due to the large error and the pseudo-random behaviour of the output, the method, in its current form, must reluctantly be rejected.
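
The shape of the pipeline, a Support Vector Regression trained on a precomputed graph-kernel Gram matrix, can be sketched as follows. The kernel below is a stand-in; the paper uses the Shortest-Path Graph Kernel on program graphs, available for example in the GraKeL library.

```python
import numpy as np
from sklearn.svm import SVR

n = 30
rng = np.random.default_rng(0)
F = rng.random((n, 4))            # stand-in per-program graph summaries
K = F @ F.T                       # stand-in Gram matrix (must be PSD)
speedups = rng.random(n) * 2      # stand-in speedups over baseline

model = SVR(kernel="precomputed").fit(K, speedups)
K_new = F[:5] @ F.T               # kernel between unseen programs and training set
pred = model.predict(K_new)       # predicted speedup for the unseen programs
```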
26

Patvarczki, Jozsef. "Layout Optimization for Distributed Relational Databases Using Machine Learning." Digital WPI, 2012. https://digitalcommons.wpi.edu/etd-dissertations/291.

Full text
Abstract:
A common problem when running Web-based applications is how to scale up the database. The solution to this problem usually involves having a smart Database Administrator determine how to spread the database tables out amongst computers that will work in parallel. Laying out database tables across multiple machines so they can act together as a single efficient database is hard. Automated methods are needed to help eliminate the time required for database administrators to create optimal configurations. We consider four operators that can create a search space of possible database layouts: 1) denormalizing, 2) horizontally partitioning, 3) vertically partitioning, and 4) fully replicating. Textbooks offer general advice that is useful for dealing with extreme cases - for instance, you should fully replicate a table if the ratio of inserts to selects is close to zero. But even this seemingly obvious statement is not necessarily one that will lead to a speed-up once you take into account that some nodes might be a bottleneck. There can be complex interactions between the four operators, which makes it even more difficult to predict the best course of action. Instead of using best practices to design database layouts, we need a system that collects empirical data on when these four operators are effective. We have implemented a state-based search technique to try different operators, and then we used the empirically measured data to see if any speed-up occurred. We recognized that the costs of creating the physical database layout are potentially large, but doing so is necessary since we want to know the "ground truth" about what is effective and under what conditions. After creating a dataset where these four operators have been applied to make different databases, we can employ machine learning to induce rules to help govern the physical design of the database across an arbitrary number of computer nodes. This learning process, in turn, would allow the database placement algorithm to get better over time as it trains over a set of examples. The algorithm tries to learn 1) "What is a good database layout for a particular application given a query workload?" and 2) "Can this algorithm automatically improve itself in making recommendations by using machine-learned rules to generalize when it makes sense to apply each of these operators?" There has been considerable research done in parallelizing databases where large amounts of data are shipped from one node to another to answer a single query. Since the costs of shipping the data back and forth might be high, in this work we assume that it might be more efficient to create a database layout where each query can be answered by a single node. This assumption requires that all the incoming query templates are known beforehand. This requirement can easily be satisfied in the case of a Web-based application, since users typically interact with the system through a web interface such as web forms. In this case, unseen queries are not necessarily answerable without first reconstructing the data on a single machine. Prior knowledge of these exact query templates allows us to select the best possible database table placements across multiple nodes. But in the case of trying to improve the efficiency of a Web-based application, a web site provider might feel that they are willing to suffer the inconvenience of not being able to answer an arbitrary query if they are in turn provided with a system that runs more efficiently.
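A minimal sketch of the state-based search idea described above, assuming a hypothetical `benchmark` routine that empirically measures a layout's cost on the workload (the expensive "ground truth" step the abstract mentions); the operator names follow the abstract, but their implementations are placeholders:

```python
import itertools

# The four layout operators named in the abstract; each is represented here
# only by a tag attached to a table. Real implementations and the benchmark()
# routine that deploys and times a layout are hypothetical.
OPERATORS = ["denormalize", "partition_horizontally",
             "partition_vertically", "replicate_fully"]

def apply_operator(layout, table, op):
    """Return a copy of the layout with `op` applied to `table`."""
    new_layout = dict(layout)
    new_layout[table] = op
    return new_layout

def greedy_layout_search(tables, benchmark, max_steps=10):
    """State-based search: at each step, empirically benchmark every
    single-operator change and keep the one with the best measured cost."""
    layout = {t: "baseline" for t in tables}     # start: no operators applied
    best_cost = benchmark(layout)                # empirical measurement
    for _ in range(max_steps):
        candidates = [apply_operator(layout, t, op)
                      for t, op in itertools.product(tables, OPERATORS)]
        costs = [(benchmark(c), c) for c in candidates]
        cost, candidate = min(costs, key=lambda pair: pair[0])
        if cost >= best_cost:                    # no operator helps any more
            break
        best_cost, layout = cost, candidate
    return layout, best_cost
```

The (layout, measured cost) pairs gathered by such a search form exactly the kind of empirical dataset from which rules could later be induced by a learner.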
APA, Harvard, Vancouver, ISO, and other styles
27

Valenzuela, Michael Lawrence. "Machine Learning, Optimization, and Anti-Training with Sacrificial Data." Diss., The University of Arizona, 2016. http://hdl.handle.net/10150/605111.

Full text
Abstract:
Traditionally, the machine learning community has viewed the No Free Lunch (NFL) theorems for search and optimization as a limitation. I review, analyze, and unify the NFL theorems with many related frameworks to arrive at necessary conditions for improving black-box optimization, model selection, and machine learning in general. I review the meta-learning literature to determine when and how meta-learning can benefit machine learning. I generalize meta-learning, in the context of the NFL theorems, to arrive at a novel technique called Anti-Training with Sacrificial Data (ATSD). My technique applies at the meta level to arrive at domain-specific algorithms and models. I also show how to generate sacrificial data. An extensive case study is presented along with simulated annealing results to demonstrate the efficacy of the ATSD method.
APA, Harvard, Vancouver, ISO, and other styles
28

Mahajan, Ankush. "Machine learning assisted QoT estimation for optical networks optimization." Doctoral thesis, Universitat Politècnica de Catalunya, 2021. http://hdl.handle.net/10803/672665.

Full text
Abstract:
The tremendous increase in data traffic has spurred a rapid evolution of optical networks toward a reliable, affordable, cost-effective and scalable network infrastructure. To meet some of these requirements, network operators are pushing toward disaggregation. Network disaggregation focuses on decoupling the traditional monolithic optical transport hardware into independent functional blocks that interoperate. This enables a relatively free market where the network operators/owners could choose the best-in-class equipment from different vendors, overcoming vendor lock-in, at better prices. In this multi-vendor disaggregation context, the equipment used impacts the physical layer and the overall network behavior. This increases the uncertainty in performance when compared to a traditional single-vendor aggregated approach. For effective optical network planning, operation and optimization, it is necessary to estimate the Quality of Transmission (QoT) of the connections. Network designers are interested in accurate and fast QoT estimation for services to be established in a future or existing network. Typically, QoT estimation is performed using a Physical Layer Model (PLM) which is included in the QoT estimation tool, or Qtool. A design margin is generally included in a Qtool to account for modeling and parameter inaccuracies, to assure acceptable performance. PLM accuracy is highly important, as modeling errors translate into a higher design margin, which in turn translates into wasted capacity or unwanted regeneration. Recently, monitoring and machine learning (ML) techniques have been proposed to account for the actual network conditions and improve the accuracy of the PLM in single-vendor networks. This in turn results in more accurate QoT estimation. The first part of the thesis focuses on ML-assisted accurate QoT estimation techniques. In this regard, we developed a model that uses monitoring information from an operating network combined with supervised ML regression techniques to understand the network conditions. In particular, we model the penalties generated due to (i) the EDFA gain ripple effect, and (ii) filter spectral shape uncertainties at ROADM nodes. Furthermore, with the aim of improving the Qtool estimation accuracy in multi-vendor networks, we propose PLM extensions. In particular, we introduce four transponder (TP) vendor-dependent performance factors that capture the performance variations of multi-vendor TPs. To verify the potential improvement, we studied the following two use cases with the proposed PLM: i) optimizing the launch power of the TPs; and ii) reducing the design margin in incremental planning. The last part of this thesis investigates and addresses the accuracy limitations of the Qtool in dynamic optimization tasks. To keep the models aligned with real conditions, the digital twin (DT) concept is gaining significant attention in the research community. The DT is more than a model of the system; it includes an evolving set of data and a means to dynamically adjust the model. Based on the DT fundamentals, we devised and implemented an iterative closed control loop process that, after several intermediate iterations of the optimization algorithm, configures the network, monitors it, and retrains the Qtool. For the Qtool retraining, we adopt an ML-based nonlinear regression fitting technique. The key advantage of this novel scheme is that, while the network operates, the Qtool parameters are retrained according to the monitored information with the adopted ML model. Hence, the Qtool tracks the intermediate states projected by the optimization algorithm. This reduces the optimization time as opposed to directly probing and monitoring the network.
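A minimal sketch of such a closed monitor-retrain loop, assuming hypothetical `monitor`, `optimizer_step`, and `configure` hooks (none of these names come from the thesis); a scikit-learn gradient-boosted regressor stands in for the ML-fitted Qtool:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def closed_loop_optimization(monitor, optimizer_step, configure,
                             n_iterations=20):
    """Iteratively: monitor the network, retrain the Qtool on all telemetry
    collected so far, take one intermediate optimization step against the
    retrained model, and push the projected configuration to the network.
    `monitor` is assumed to return a batch (features, measured_qot)."""
    qtool = GradientBoostingRegressor()      # stands in for the ML-fitted PLM
    X_hist, y_hist = [], []
    config = None
    for _ in range(n_iterations):
        features, measured_qot = monitor()   # e.g. per-lightpath SNR samples
        X_hist.append(features)
        y_hist.append(measured_qot)
        qtool.fit(np.vstack(X_hist), np.hstack(y_hist))  # retrain the Qtool
        config = optimizer_step(qtool, config)  # next intermediate step
        configure(config)                    # apply the projected state
    return config, qtool
```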
Teoria del senyal i comunicacions
APA, Harvard, Vancouver, ISO, and other styles
29

Crouch, Ingrid W. M. "A knowledge-based simulation optimization system with machine learning." Diss., Virginia Tech, 1992. http://hdl.handle.net/10919/37245.

Full text
APA, Harvard, Vancouver, ISO, and other styles
30

Zhou, Yi. "Nonconvex Optimization in Machine Learning: Convergence, Landscape, and Generalization." The Ohio State University, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=osu1533554879269658.

Full text
APA, Harvard, Vancouver, ISO, and other styles
31

Adurti, Devi Abhiseshu, and Mohit Battu. "Optimization of Heterogeneous Parallel Computing Systems using Machine Learning." Thesis, Blekinge Tekniska Högskola, Institutionen för datavetenskap, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-21834.

Full text
Abstract:
Background: Heterogeneous parallel computing systems utilize a combination of different resources, CPUs and GPUs, to achieve high performance as well as reduced latency and energy consumption. Programming applications that target various processing units requires employing different tools and programming models/languages. Furthermore, selecting the most optimal implementation, which may either target different processing units (i.e., CPU or GPU) or implement the various algorithms, is not trivial for a given context. In this thesis, we investigate the use of machine learning to address the problem of selecting among various implementation variants for an application running on a heterogeneous system. Objectives: This study is focused on providing an approach for optimization of heterogeneous parallel computing systems at runtime by building the most efficient machine learning model to predict the optimal implementation variant of an application. Methods: Six machine learning models, KNN, XGBoost, DTC, Random Forest Classifier, LightGBM, and SVM, are trained and tested using stratified k-fold cross-validation on a dataset generated from a matrix multiplication application, for square matrix input dimensions ranging from 16x16 to 10992x10992. Results: The findings for each machine learning algorithm are presented through accuracy, a confusion matrix, and a classification report covering precision, recall, and F1 score, and a comparison between the machine learning models in terms of accuracy, training time, and prediction time is provided to determine the best model. Conclusions: The XGBoost, DTC, and SVM algorithms achieved 100% accuracy. In comparison to the other machine learning models, the DTC is found to be the most suitable due to its low training and prediction times when predicting the optimal implementation variant of the heterogeneous system application. Hence, the DTC is the best-suited algorithm for the optimization of heterogeneous parallel computing.
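A minimal sketch of this model-comparison protocol, assuming synthetic stand-in data (the thesis's workload features and labels are not reproduced); only the scikit-learn subset of the six models is shown, since XGBoost and LightGBM live in separate packages:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

# Stand-in data: features describing a matrix-multiplication workload
# (e.g. input dimension), labels naming the fastest implementation variant.
X, y = make_classification(n_samples=500, n_features=8, n_classes=3,
                           n_informative=5, random_state=0)

models = {
    "KNN": KNeighborsClassifier(),
    "DTC": DecisionTreeClassifier(random_state=0),
    "RandomForest": RandomForestClassifier(random_state=0),
    "SVM": SVC(),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv)  # stratified k-fold accuracy
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```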
APA, Harvard, Vancouver, ISO, and other styles
32

Ekman, Björn. "Machine Learning for Beam Based Mobility Optimization in NR." Thesis, Linköpings universitet, Kommunikationssystem, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-136489.

Full text
Abstract:
One option for enabling mobility between 5G nodes is to use a set of area-fixed reference beams in the downlink direction from each node. To save power, these reference beams should be turned on only on demand, i.e., only if a mobile needs them. A User Equipment (UE) moving out of a beam's coverage will require a switch from one beam to another, preferably without having to turn on all possible beams to find out which one is the best. This thesis investigates how to transform the beam selection problem into a format suitable for machine learning and how good such solutions are compared to baseline models. The baseline models considered were beam overlap and average Reference Signal Received Power (RSRP), both building beam-to-beam maps. The emphasis in the thesis was on handovers between nodes and finding the beam with the highest RSRP. Beam-hit-rate and RSRP-difference (selected minus best) were key performance indicators and were compared for different numbers of activated beams. The problem was modeled as a Multiple Output Regression (MOR) problem and as a Multi-Class Classification (MCC) problem. Both problems are possible to solve with the random forest model, which was the learning model of choice during this work. An Ericsson simulator was used to simulate and collect data from a seven-site scenario with 40 UEs. The primary features available were the current serving beam index and its RSRP. Additional features, like position and distance, were suggested, though many ended up being limited either by the simulated scenario or by the cost of acquiring the feature in a real-world scenario. Using primary features only, the learned models' performance was equal to or worse than the baseline models' performance. Adding distance improved the performance considerably, beating the baseline models, but still leaving room for more improvements.
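A minimal sketch of the MCC formulation with a random forest, using synthetic stand-in data (the Ericsson simulator data and exact feature set are not public); the top-k ranking at the end reflects the idea of activating only a few candidate beams:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_samples, n_beams = 2000, 12

# Primary features from the abstract: current serving beam index and its RSRP.
serving_beam = rng.integers(0, n_beams, n_samples)
serving_rsrp = rng.normal(-80.0, 10.0, n_samples)   # dBm, synthetic
X = np.column_stack([serving_beam, serving_rsrp])
# Synthetic stand-in target: index of the best beam after the switch.
y = (serving_beam + rng.integers(-1, 2, n_samples)) % n_beams

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Rank candidate beams by predicted probability and activate only the top k,
# trading beam-hit-rate against the number of reference beams switched on.
proba = clf.predict_proba(X[:5])                    # columns follow clf.classes_
top3 = clf.classes_[np.argsort(proba, axis=1)[:, -3:]]
print(top3)
```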
APA, Harvard, Vancouver, ISO, and other styles
33

Zhang, Tianfang. "Machine learning multicriteria optimization in radiation therapy treatment planning." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-257509.

Full text
Abstract:
In radiation therapy treatment planning, recent works have used machine learning based on historically delivered plans to automate the process of producing clinically acceptable plans. Compared to traditional approaches such as repeated weighted-sum optimization or multicriteria optimization (MCO), automated planning methods have, in general, the benefits of low computational times and minimal user interaction, but on the other hand lack the flexibility associated with general-purpose frameworks such as MCO. Machine learning approaches can be especially sensitive to deviations in their dose prediction due to certain properties of the optimization functions usually used for dose mimicking and, moreover, suffer from the fact that there exists no general causality between prediction accuracy and optimized plan quality. In this thesis, we present a means of unifying ideas from machine learning planning methods with the well-established MCO framework. More precisely, given prior knowledge in the form of either a previously optimized plan or a set of historically delivered clinical plans, we are able to automatically generate Pareto optimal plans spanning a dose region corresponding to plans which are achievable as well as clinically acceptable. For the former case, this is achieved by introducing dose-volume constraints; for the latter case, this is achieved by fitting a weighted-data Gaussian mixture model on pre-defined dose statistics using the expectation-maximization algorithm, modifying it with exponential tilting and using specially developed optimization functions to take into account prediction uncertainties. Numerical results for conceptual demonstration are obtained for a prostate cancer case with treatment delivered by a volumetric-modulated arc therapy technique, where it is shown that the methods developed in the thesis are successful in automatically generating Pareto optimal plans of satisfactory quality and diversity, while excluding clinically irrelevant dose regions. For the case of using historical plans as prior knowledge, the computational times are significantly shorter than those typical of conventional MCO.
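A minimal sketch of the weighted-data EM step for a one-dimensional Gaussian mixture (the exponential tilting and the uncertainty-aware optimization functions from the thesis are omitted); the dose-statistic values and weights below are toy stand-ins:

```python
import numpy as np

def weighted_gmm_em(x, w, k=2, n_iter=100, seed=0):
    """EM for a 1-D Gaussian mixture with per-sample weights w, as used when
    some historical plans should count more than others."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, size=k, replace=False)
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities, scaled by the data weights.
        dens = (pi / np.sqrt(2 * np.pi * var)
                * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var))
        resp = dens / dens.sum(axis=1, keepdims=True)
        resp *= w[:, None]
        # M-step: weighted updates of mixture weights, means and variances.
        nk = resp.sum(axis=0)
        pi = nk / w.sum()
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

# Toy example: two clusters of dose-statistic values with uneven weights.
x = np.concatenate([np.random.normal(20, 2, 100), np.random.normal(60, 5, 50)])
w = np.concatenate([np.ones(100), 2 * np.ones(50)])
print(weighted_gmm_em(x, w))
```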
APA, Harvard, Vancouver, ISO, and other styles
34

Sedig, Victoria, Evelina Samuelsson, Nils Gumaelius, and Andrea Lindgren. "Greenhouse Climate Optimization using Weather Forecasts and Machine Learning." Thesis, Uppsala universitet, Avdelningen för beräkningsvetenskap, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-391045.

Full text
Abstract:
It is difficult for a small-scale local farmer to support him- or herself. In this investigation, a program was developed to help the small-scale farmer Janne from Sala to keep an energy-efficient greenhouse. The program applied machine learning to make predictions of future temperatures in the greenhouse. When the temperature was predicted to be dangerously low for the plants and crops, Janne was warned via an HTML web page. To make the predictions as accurate as possible, different machine learning algorithms were evaluated. XGBoost was the most efficient and accurate method, with a cross-validation value of 2.33, and was used to make the predictions. The training data consisted of historical measurements from inside and outside the greenhouse, provided by the consultancy Bitroot and by SMHI. To make predictions in real time, weather forecasts were collected from SMHI via their API. The program can be useful for a farmer and can be further developed in the future.
APA, Harvard, Vancouver, ISO, and other styles
35

Bompaire, Martin. "Machine learning based on Hawkes processes and stochastic optimization." Thesis, Université Paris-Saclay (ComUE), 2019. http://www.theses.fr/2019SACLX030/document.

Full text
Abstract:
The common thread of this thesis is the study of Hawkes processes. These point processes decrypt the cross-causality that occurs across several event series. Namely, they retrieve the influence that the events of one series have on the future events of all series. For example, in the context of social networks, they describe how likely an action of a certain user (such as a Tweet) will trigger reactions from the others. The first chapter consists of a general introduction to point processes followed by a focus on Hawkes processes, and more specifically on the properties of the widely used exponential kernel parametrization. In the following chapter, we introduce an adaptive penalization technique to model, with Hawkes processes, information propagation on social networks. This penalization is able to take into account prior knowledge of the social network characteristics, such as the sparse interactions between users or the community structure, and to reflect them in the estimated model. Our technique uses data-driven weighted penalties induced by a careful analysis of the generalization error. Next, we focus on convex optimization and recall the recent progress made with stochastic first-order methods using variance reduction techniques. The fourth chapter is dedicated to an adaptation of these techniques to optimize the most commonly used goodness-of-fit of Hawkes processes. Indeed, this goodness-of-fit does not meet the gradient-Lipschitz assumption that is required by the latest first-order methods. Thus, we work under another smoothness assumption, and obtain a linear convergence rate for a shifted version of Stochastic Dual Coordinate Ascent that improves on the current state of the art. Besides, such objectives include many linear constraints that are easily violated by classic first-order algorithms, but in the Fenchel-dual problem these constraints are easier to deal with. Hence, our algorithm's robustness is comparable to that of second-order methods, which are very expensive in high dimensions. Finally, the last chapter introduces a new statistical learning library for Python 3 with a particular emphasis on time-dependent models, tools for generalized linear models, and survival analysis. Called tick, this library relies on a C++ implementation and state-of-the-art optimization algorithms to provide very fast computations in a single-node multi-core setting. Open-sourced and published on GitHub, this library has been used throughout this thesis to perform benchmarks and experiments.
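A minimal sketch of the exponential-kernel parametrization, with conditional intensity lambda(t) = mu + sum over past events t_i of alpha * exp(-beta (t - t_i)), simulated by Ogata's thinning algorithm; this is textbook material, not code from the tick library:

```python
import numpy as np

def hawkes_intensity(t, events, mu, alpha, beta):
    """lambda(t) for a univariate Hawkes process with exponential kernel."""
    past = events[events < t]
    return mu + alpha * np.exp(-beta * (t - past)).sum()

def simulate_hawkes(mu, alpha, beta, t_max, seed=0):
    """Ogata's thinning: propose candidates under an upper bound on the
    (decaying) intensity, accept with probability lambda(t) / bound."""
    rng = np.random.default_rng(seed)
    events, t = [], 0.0
    while t < t_max:
        lam_bar = hawkes_intensity(t, np.array(events), mu, alpha, beta) + alpha
        t += rng.exponential(1.0 / lam_bar)   # candidate inter-arrival time
        if t >= t_max:
            break
        lam_t = hawkes_intensity(t, np.array(events), mu, alpha, beta)
        if rng.uniform() <= lam_t / lam_bar:  # thinning acceptance test
            events.append(t)
    return np.array(events)

# A stable regime requires the branching ratio alpha / beta < 1.
events = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, t_max=100.0)
print(len(events), "events")
```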
APA, Harvard, Vancouver, ISO, and other styles
36

Del, Testa Davide. "Stochastic Optimization and Machine Learning Modeling for Wireless Networking." Doctoral thesis, Università degli studi di Padova, 2017. http://hdl.handle.net/11577/3424825.

Full text
Abstract:
In the last years, the telecommunications industry has seen an increasing interest in the development of advanced solutions that enable communicating nodes to exchange large amounts of data. Indeed, well-known applications such as VoIP, audio streaming, video on demand, real-time surveillance systems, vehicular safety requirements, and remote computing have increased the demand for the efficient generation, utilization, management and communication of larger and larger data quantities. New transmission technologies have been developed to permit more efficient and faster data exchanges, including multiple-input multiple-output architectures or software-defined networking: as an example, the next generation of mobile communication, known as 5G, is expected to provide data rates of tens of megabits per second for tens of thousands of users and only 1 ms latency. In order to achieve such demanding performance, these systems need to effectively model the considerable level of uncertainty related to fading transmission channels, interference, or the presence of noise in the data. In this thesis, we will present how different approaches can be adopted to model these kinds of scenarios, focusing on wireless networking applications. In particular, the first part of this work will show how stochastic optimization models can be exploited to design energy management policies for wireless sensor networks. Traditionally, transmission policies are designed to reduce the total amount of energy drawn from the batteries of the devices; here, we consider energy harvesting wireless sensor networks, in which each device is able to scavenge energy from the environment and charge its battery with it. In this case, the goal of the optimal transmission policies is to efficiently manage the energy harvested from the environment, avoiding both energy outage (i.e., no residual energy in a battery) and energy overflow (i.e., the impossibility to store scavenged energy when the battery is already full). In the second part of this work, we will explore the adoption of machine learning techniques to tackle a number of common wireless networking problems. These algorithms are able to learn from and make predictions on data, avoiding the need to follow limited static program instructions: models are built from sample inputs, thus allowing for data-driven predictions and decisions. In particular, we will first design an on-the-fly prediction algorithm for the expected time of arrival related to WiFi transmissions. This predictor only exploits those network parameters available at each receiving node and does not require additional knowledge from the transmitter, hence it can be deployed without modifying existing standard transmission protocols. Secondly, we will investigate the usage of particular neural network instances known as autoencoders for the compression of biosignals, such as electrocardiographic and photoplethysmographic sequences. A lightweight lossy compressor will be designed, able to be deployed in wearable battery-equipped devices with limited computational power. Thirdly, we will propose a predictor for the long-term channel gain in a wireless network. Differently from other works in the literature, this predictor will only exploit past channel samples, without resorting to additional information such as GPS data. An accurate estimate of this gain would make it possible to, e.g., efficiently allocate resources and anticipate future handover procedures. Finally, although not strictly related to wireless networking scenarios, we will show how deep learning techniques can be applied to the field of autonomous driving. This final section will deal with state-of-the-art machine learning solutions, showing how these techniques can considerably outperform traditional approaches.
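A minimal PyTorch sketch of such a lossy autoencoder compressor for fixed-length biosignal windows; the window length, code size, and architecture below are illustrative assumptions, not the dissertation's design:

```python
import torch
import torch.nn as nn

# Compress 128-sample ECG windows into an 8-dimensional code (16x compression).
class BiosignalAutoencoder(nn.Module):
    def __init__(self, window=128, code=8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(window, 32), nn.Tanh(),
                                     nn.Linear(32, code))
        self.decoder = nn.Sequential(nn.Linear(code, 32), nn.Tanh(),
                                     nn.Linear(32, window))

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = BiosignalAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()                  # reconstruction error drives training

x = torch.randn(256, 128)               # stand-in for normalized ECG windows
for _ in range(100):
    opt.zero_grad()
    loss = loss_fn(model(x), x)
    loss.backward()
    opt.step()

# On the wearable, only encoder(x) is transmitted; the receiver runs the decoder.
codes = model.encoder(x)
print(codes.shape)                      # torch.Size([256, 8])
```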
APA, Harvard, Vancouver, ISO, and other styles
37

Wu, Anjian M. B. A. Sloan School of Management. "Performance modeling of human-machine interfaces using machine learning." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/122599.

Full text
Abstract:
Thesis: M.B.A., Massachusetts Institute of Technology, Sloan School of Management, 2019, In conjunction with the Leaders for Global Operations Program at MIT
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019, In conjunction with the Leaders for Global Operations Program at MIT
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 70-71).
As the popularity of online retail expands, world-class electronic commerce (e-commerce) businesses are increasingly adopting collaborative robotics and Internet of Things (IoT) technologies to enhance fulfillment efficiency and operational advantage. E-commerce giants like Alibaba and Amazon are known to have smart warehouses staffed by both machines and human operators. The robotics systems specialize in transporting and maneuvering heavy shelves of goods to and from operators. Operators are left to the higher-level cognitive tasks needed to process goods, such as identification and complex manipulation of individual objects. Achieving high system throughput in these systems requires harmonized interaction between humans and machines. The robotics systems must minimize the time operators wait for new work (idle time), and operators need to minimize the time spent processing items (takt time). Over time, these systems will naturally generate extensive amounts of data. Our research provides insights both into using this data to design a machine learning (ML) model of takt time and into methods of interpreting insights from such a model. We start by presenting our iterative approach to developing an ML model that predicts the average takt time of a group of operators at hourly intervals. Our final XGBoost model reached an out-of-sample performance of 4.01% mean absolute percent error (MAPE) using over 250,000 hours of historic data across multiple warehouses around the world. Our research will share methods to cross-examine and interpret the relationships learned by the model for business value. This can allow organizations to effectively quantify system trade-offs as well as identify root causes of takt performance deviations. Finally, we will discuss the implications of our empirical findings.
by Anjian Wu.
M.B.A.
S.M.
M.B.A. Massachusetts Institute of Technology, Sloan School of Management
S.M. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
APA, Harvard, Vancouver, ISO, and other styles
38

Armond, Kenneth C. Jr. "Distributed Support Vector Machine Learning." ScholarWorks@UNO, 2008. http://scholarworks.uno.edu/td/711.

Full text
Abstract:
Support Vector Machines (SVMs) are used for a growing number of applications. A fundamental constraint on SVM learning is the management of the training set. This is because the order of computations grows as the square of the size of the training set. Typically, training sets of 1000 (500 positives and 500 negatives, for example) can be managed on a PC without hard-drive thrashing. Training sets of 10,000, however, simply cannot be managed with PC-based resources. For this reason, most SVM implementations must contend with some kind of chunking process to train parts of the data at a time (10 chunks of 1000, for example, to learn the 10,000). Sequential and multi-threaded chunking methods provide a way to run the SVM on large datasets while retaining accuracy. The multi-threaded distributed SVM described in this thesis is implemented using Java RMI, and has been developed to run on a network of multi-core/multi-processor computers.
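A minimal sketch of sequential chunking, carrying only the support vectors from one chunk to the next; written in Python with scikit-learn for brevity, so the thesis's Java RMI distribution layer is out of scope here:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)

def chunked_svm(X, y, chunk_size=1000):
    """Sequential chunking: train on one chunk at a time, carrying only the
    support vectors forward so memory stays bounded by the active set."""
    sv_X = np.empty((0, X.shape[1]))
    sv_y = np.empty(0, dtype=y.dtype)
    svm = SVC(kernel="rbf")
    for start in range(0, len(X), chunk_size):
        chunk_X = np.vstack([sv_X, X[start:start + chunk_size]])
        chunk_y = np.concatenate([sv_y, y[start:start + chunk_size]])
        svm.fit(chunk_X, chunk_y)
        # Keep only the support vectors as the carry-over working set.
        sv_X, sv_y = svm.support_vectors_, chunk_y[svm.support_]
    return svm

model = chunked_svm(X, y)
print(model.score(X, y))
```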
APA, Harvard, Vancouver, ISO, and other styles
39

Ouyang, Hua. "Optimal stochastic and distributed algorithms for machine learning." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/49091.

Full text
Abstract:
Stochastic and data-distributed optimization algorithms have received considerable attention from the machine learning community due to the tremendous demand from large-scale learning and big-data-related optimization. Many stochastic and deterministic learning algorithms have been proposed recently under various application scenarios. Nevertheless, many of these algorithms are based on heuristics and their optimality in terms of the generalization error is not sufficiently justified. In this thesis, I explain the concept of an optimal learning algorithm, and show that, given a time budget and a proper hypothesis space, only those algorithms achieving the lower bounds of the estimation error and the optimization error are optimal. Guided by this concept, we investigated the stochastic minimization of nonsmooth convex loss functions, a central problem in machine learning. We proposed a novel algorithm named Accelerated Nonsmooth Stochastic Gradient Descent, which exploits the structure of common nonsmooth loss functions to achieve optimal convergence rates for a class of problems including SVMs. It is the first stochastic algorithm that can achieve the optimal O(1/t) rate for minimizing nonsmooth loss functions. The fast rates are confirmed by empirical comparisons with state-of-the-art algorithms including the averaged SGD. The Alternating Direction Method of Multipliers (ADMM) is another flexible method to exploit function structure. In the second part we proposed a stochastic ADMM that can be applied to a general class of convex and nonsmooth functions, beyond the smooth and separable least squares loss used in the lasso. We also demonstrate the rates of convergence for our algorithm under various structural assumptions on the stochastic function: O(1/sqrt{t}) for convex functions and O(log t/t) for strongly convex functions. A novel application named Graph-Guided SVM is proposed to demonstrate the usefulness of our algorithm. We also extend the scalability of stochastic algorithms to nonlinear kernel machines, where the problem is formulated as a constrained dual quadratic optimization. The simplex constraint can be handled by the classic Frank-Wolfe method. The proposed stochastic Frank-Wolfe methods achieve comparable or even better accuracies than state-of-the-art batch and online kernel SVM solvers, and are significantly faster. The last part investigates the problem of data-distributed learning. We formulate it as a consensus-constrained optimization problem and solve it with ADMM. It turns out that the underlying communication topology is a key factor in achieving a balance between a fast learning rate and computation resource consumption. We analyze the linear convergence behavior of consensus ADMM so as to characterize the interplay between the communication topology and the penalty parameters used in ADMM. We observe that given optimal parameters, the complete bipartite and the master-slave graphs exhibit the fastest convergence, followed by bi-regular graphs.
APA, Harvard, Vancouver, ISO, and other styles
40

CESARI, TOMMASO RENATO. "ALGORITHMS, LEARNING, AND OPTIMIZATION." Doctoral thesis, Università degli Studi di Milano, 2020. http://hdl.handle.net/2434/699354.

Full text
Abstract:
This thesis covers some algorithmic aspects of online machine learning and optimization. In Chapter 1 we design algorithms with state-of-the-art regret guarantees for the problem of dynamic pricing. In Chapter 2 we move on to an asynchronous online learning setting in which only some of the agents in the network are active at each time step. We show that when information is shared among neighbors, knowledge about the graph structure might have a significantly different impact on learning rates depending on how agents are activated. In Chapter 3 we investigate the online problem of multivariate non-concave maximization under weak assumptions on the regularity of the objective function. In Chapter 4 we introduce a new performance measure and design an efficient algorithm to learn optimal policies in repeated A/B testing.
APA, Harvard, Vancouver, ISO, and other styles
41

Bhat, Sooraj. "Syntactic foundations for machine learning." Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/47700.

Full text
Abstract:
Machine learning has risen in importance across science, engineering, and business in recent years. Domain experts have begun to understand how their data analysis problems can be solved in a principled and efficient manner using methods from machine learning, with its simultaneous focus on statistical and computational concerns. Moreover, the data in many of these application domains has exploded in availability and scale, further underscoring the need for algorithms which find patterns and trends quickly and correctly. However, most people actually analyzing data today operate far from the expert level. Available statistical libraries and even textbooks contain only a finite sample of the possibilities afforded by the underlying mathematical principles. Ideally, practitioners should be able to do what machine learning experts can do: employ the fundamental principles to experiment with the practically infinite number of possible customized statistical models as well as alternative algorithms for solving them, including advanced techniques for handling massive datasets. This would lead to more accurate models, the ability in some cases to analyze data that was previously intractable, and, if the experimentation can be greatly accelerated, huge gains in human productivity. Fixing this state of affairs involves mechanizing and automating these statistical and algorithmic principles. This task has received little attention because we lack a suitable syntactic representation that is capable of specifying machine learning problems and solutions, so there is no way to encode the principles in question, which are themselves a mapping between problem and solution. This work focuses on providing the foundational layer for enabling this vision, with the thesis that such a representation is possible. We demonstrate the thesis by defining a syntactic representation of machine learning that is expressive, promotes correctness, and enables the mechanization of a wide variety of useful solution principles.
APA, Harvard, Vancouver, ISO, and other styles
42

Alcoverro, Vidal Marcel. "Stochastic optimization and interactive machine learning for human motion analysis." Doctoral thesis, Universitat Politècnica de Catalunya, 2014. http://hdl.handle.net/10803/285337.

Full text
Abstract:
The analysis of human motion from visual data is a central issue in the computer vision research community, as it enables a wide range of applications, and it still remains a challenging problem when dealing with unconstrained scenarios and general conditions. Human motion analysis is used in the entertainment industry for movie or videogame production, and in medical applications for rehabilitation or biomechanical studies. It is also used for human-computer interaction in any kind of environment, and moreover, it is used for big data analysis from social networks such as Youtube or Flickr, to mention some of its use cases. In this thesis we have studied human motion analysis techniques with a focus on their application in smart room environments. That is, we have studied methods that support the analysis of people's behavior in the room, allowing interaction with computers in a natural manner, and in general, methods that introduce computers into human activity environments to enable new kinds of services in an unobtrusive way. The thesis is structured in two parts, where we study the problem of 3D pose estimation from multiple views and the recognition of gestures using range sensors. First, we propose a generic framework for hierarchically layered particle filtering (HPF), especially suited for motion capture tasks. Human motion capture problems generally involve tracking or optimization of high-dimensional state vectors, where one also has to deal with multi-modal pdfs. HPF overcomes this problem by means of multiple passes through substate space variables. Then, based on the HPF framework, we propose a method to estimate the anthropometry of the subject, which ultimately yields a human body model adjusted to the subject. Moreover, we introduce a new weighting function strategy for approximate partitioning of observations (APO) and a method that employs body part detections to improve particle propagation and weight evaluation (DD-HPF), both integrated within the HPF framework. The second part of this thesis is centered on the detection of gestures, and we have focused on the problem of reducing the annotation and training efforts required to train a specific gesture. In order to reduce the efforts required to train a gesture detector, we propose a solution based on online random forests that allows training in real time, while receiving new data in sequence. The main aspect that makes the solution effective is the method we propose to collect the hard negative examples while training the forests. The method uses the detector trained up to the current frame to test on that frame, and then collects samples based on the response of the detector, such that they will be more relevant for training. In this manner, training is more effective in terms of the number of annotated frames required.
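A minimal sketch of the hard-negative collection loop, with a periodically refit scikit-learn random forest standing in for true online random forests; the per-frame features, thresholds, and refit schedule below are assumptions for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

def hard_negative_training(frames, labels, threshold=0.5, refit_every=50):
    """Collect negatives on which the current detector fires (hard negatives)
    and refit periodically. frames: (n, d) per-frame feature vectors,
    labels: 1 for the gesture, 0 for background."""
    pos = frames[labels == 1][:10]          # a few annotated frames to start
    neg = frames[labels == 0][:10]
    clf = RandomForestClassifier(n_estimators=50, random_state=0)
    clf.fit(np.vstack([pos, neg]), np.r_[np.ones(len(pos)), np.zeros(len(neg))])
    for i, (x, yi) in enumerate(zip(frames, labels)):
        score = clf.predict_proba(x[None])[0, 1]   # test on the incoming frame
        if yi == 0 and score > threshold:           # false alarm: hard negative
            neg = np.vstack([neg, x])
        elif yi == 1:
            pos = np.vstack([pos, x])
        if i % refit_every == 0:                    # periodic refit stands in
            clf.fit(np.vstack([pos, neg]),          # for true online updates
                    np.r_[np.ones(len(pos)), np.zeros(len(neg))])
    return clf

frames = rng.normal(size=(500, 16))         # stand-in gesture features
labels = rng.integers(0, 2, 500)
detector = hard_negative_training(frames, labels)
```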
APA, Harvard, Vancouver, ISO, and other styles
43

Shahriari, Bobak. "Practical Bayesian optimization with application to tuning machine learning algorithms." Thesis, University of British Columbia, 2016. http://hdl.handle.net/2429/59104.

Full text
Abstract:
Bayesian optimization has recently emerged in the machine learning community as a very effective automatic alternative to the tedious task of hand-tuning algorithm hyperparameters. Although it is a relatively new aspect of machine learning, it has roots in Bayesian experimental design (Lindley, 1956; Chaloner and Verdinelli, 1995), the design and analysis of computer experiments (DACE; Sacks et al., 1989), Kriging (Krige, 1951), and multi-armed bandits (Gittins, 1979). In this thesis, we motivate and introduce the model-based optimization framework and provide some historical context for the technique, which dates back as far as 1933 with an application to clinical drug trials (Thompson, 1933). Contributions of this work include a Bayesian gap-based exploration policy, inspired by Gabillon et al. (2012); a principled information-theoretic portfolio strategy, outperforming the portfolio of Hoffman et al. (2011); and a general practical technique circumventing the need for an initial bounding box. Each of these contributions addresses an existing practical challenge standing in the way of more widespread adoption of probabilistic model-based optimization techniques. Finally, we conclude this thesis with important directions for future research, emphasizing the scalability and computational feasibility of the approach as a general-purpose optimizer.
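A minimal sketch of the core model-based optimization loop (Gaussian process surrogate plus an expected improvement acquisition); the objective below is a toy stand-in, and the thesis's gap-based policy, portfolio strategy, and unbounded search are not shown:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def f(x):                                   # expensive black box, e.g. a
    return np.sin(3 * x) + 0.1 * x ** 2     # hyperparameter's validation loss

X = np.array([[-2.0], [0.5], [2.0]])        # a few initial evaluations
y = f(X).ravel()
grid = np.linspace(-3, 3, 500)[:, None]

for _ in range(15):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                  normalize_y=True).fit(X, y)
    mu, sd = gp.predict(grid, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(sd, 1e-9)
    ei = (best - mu) * norm.cdf(z) + sd * norm.pdf(z)  # expected improvement
    x_next = grid[np.argmax(ei)]            # most promising point to try next
    X = np.vstack([X, x_next])
    y = np.append(y, f(x_next[0]))
print("best found:", X[np.argmin(y)], y.min())
```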
Science, Faculty of
Computer Science, Department of
Graduate
APA, Harvard, Vancouver, ISO, and other styles
44

Mazzieri, Diego. "Machine Learning for combinatorial optimization: the case of Vehicle Routing." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021. http://amslaurea.unibo.it/24688/.

Full text
Abstract:
The Vehicle Routing Problem (VRP) is one of the most intensively studied combinatorial optimization problems in the Operations Research (OR) community. Its relevance is related not only to the various real-world applications it deals with, but also to its inherent complexity as an NP-hard problem. Since its original formulation more than 60 years ago, numerous mathematical models and algorithms have been proposed to solve the VRP. The most recent trend is to leverage Machine Learning (ML) in conjunction with these traditional approaches to enhance their performance. In particular, this work investigates the use of ML-driven components as destroy or repair methods inside the Large Neighborhood Search (LNS) metaheuristic, trying to understand if, where, and when it is effective to apply them in the context of the VRP. For these purposes, we propose NeuRouting, an open-source hybridization framework aimed at facilitating the integration between ML and LNS. Regarding the destroy phase, we adopt a Graph Neural Network (GNN) assisted heuristic, which we hybridize with a neural repair methodology taken from the literature. We investigate this integration both on its own and as part of an Adaptive Large Neighborhood Search (ALNS), performing an empirical study on instances of various sizes and against some traditional solvers.
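A minimal skeleton of the LNS loop into which such learned components slot; the `destroy` and `repair` callables (e.g. a GNN-assisted removal and a neural reinsertion) are left as hypothetical hooks rather than NeuRouting code:

```python
import random

def large_neighborhood_search(solution, cost, destroy, repair,
                              n_iterations=1000, seed=0):
    """Generic LNS: repeatedly destroy part of the incumbent solution and
    repair it into a new complete solution, keeping improvements."""
    rng = random.Random(seed)
    current, current_cost = solution, cost(solution)
    best, best_cost = current, current_cost
    for _ in range(n_iterations):
        partial = destroy(current, rng)       # e.g. remove some customers
        candidate = repair(partial, rng)      # reinsert them into routes
        candidate_cost = cost(candidate)
        if candidate_cost < current_cost:     # greedy acceptance criterion
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost
```

An ALNS variant would additionally maintain weights over several destroy/repair pairs and adapt them according to each pair's recent success.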
APA, Harvard, Vancouver, ISO, and other styles
45

Zhu, Zhanxing. "Integrating local information for inference and optimization in machine learning." Thesis, University of Edinburgh, 2016. http://hdl.handle.net/1842/20980.

Full text
Abstract:
In practice, machine learners often care about two key issues: one is how to obtain a more accurate answer with limited data, and the other is how to handle large-scale data (often referred to as "Big Data" in industry) for efficient inference and optimization. One solution to the first issue might be aggregating learned predictions from diverse local models. For the second issue, integrating the information from subsets of the large-scale data is a proven way of achieving computational savings. In this thesis, we have developed novel frameworks and schemes to handle several scenarios arising in each of these two salient issues. For aggregating diverse models – in particular, aggregating probabilistic predictions from different models – we introduce a spectrum of compositional methods, Rényi divergence aggregators, which are maximum entropy distributions subject to biases from individual models, with the Rényi divergence parameter dependent on the bias. Experiments are implemented on various simulated and real-world datasets to verify the findings. We also show the theoretical connections between Rényi divergence aggregators and machine learning markets with isoelastic utilities. The second issue involves inference and optimization with large-scale data. We consider two important scenarios: one is optimizing a large-scale Convex-Concave Saddle Point problem with a Separable structure, referred to as Sep-CCSP; the other is large-scale Bayesian posterior sampling. Two settings of the Sep-CCSP problem are considered, with strongly convex functions and with non-strongly convex functions. We develop efficient stochastic coordinate descent methods for both of these cases, which allow fast parallel processing of large-scale data. Both theoretically and empirically, it is demonstrated that the developed methods perform comparably to, or more often better than, state-of-the-art methods. To handle the scalability issue in Bayesian posterior sampling, the stochastic approximation technique is employed, i.e., only touching a small mini-batch of data items to approximate the full likelihood or its gradient. In order to deal with the subsampling error introduced by stochastic approximation, we propose a covariance-controlled adaptive Langevin thermostat that can effectively dissipate parameter-dependent noise while maintaining a desired target distribution. This method achieves a substantial speedup over popular alternative schemes for large-scale machine learning applications.
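A minimal sketch of plain stochastic-gradient Langevin dynamics, the baseline that such thermostat methods refine; the gradient hooks are hypothetical, and the covariance-controlled correction from the thesis is deliberately not implemented:

```python
import numpy as np

def sgld(grad_log_prior, grad_log_lik, theta0, data,
         n_steps=5000, batch_size=32, step=1e-3, seed=0):
    """Stochastic Gradient Langevin Dynamics: estimate the full-data gradient
    from a mini-batch, then take a noisy gradient step whose injected Gaussian
    noise turns optimization into approximate posterior sampling. The
    mini-batch estimate adds parameter-dependent noise that plain SGLD
    ignores; the thesis's thermostat is designed to dissipate it."""
    rng = np.random.default_rng(seed)
    n = len(data)                            # data: ndarray of observations
    theta = np.asarray(theta0, dtype=float).copy()
    samples = []
    for _ in range(n_steps):
        batch = data[rng.choice(n, size=batch_size, replace=False)]
        g = grad_log_prior(theta) + (n / batch_size) * grad_log_lik(theta, batch)
        theta = theta + 0.5 * step * g + rng.normal(0.0, np.sqrt(step), theta.shape)
        samples.append(theta.copy())
    return np.array(samples)
```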
APA, Harvard, Vancouver, ISO, and other styles
46

Bergkvist, Markus, and Tobias Olandersson. "Machine learning in simulated RoboCup." Thesis, Blekinge Tekniska Högskola, Institutionen för programvaruteknik och datavetenskap, 2003. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-3827.

Full text
Abstract:
An implementation of the Electric Field Approach applied to simulated RoboCup is presented, together with a demonstration of a learning system. Results are reported from the optimization of the Electric Field parameters in a limited situation, using the learning system. Learning techniques used in contemporary RoboCup research are also described, including a brief presentation of their results.
APA, Harvard, Vancouver, ISO, and other styles
47

Narasimhan, Mukund. "Applications of submodular minimization in machine learning." Thesis, Connect to this title online; UW restricted, 2007. http://hdl.handle.net/1773/5983.

Full text
APA, Harvard, Vancouver, ISO, and other styles
48

Casavant, Matt(Matt Stephen). "Predicting competitor restructuring using machine learning methods." Thesis, Massachusetts Institute of Technology, 2019. https://hdl.handle.net/1721.1/122595.

Full text
Abstract:
Thesis: M.B.A., Massachusetts Institute of Technology, Sloan School of Management, 2019, In conjunction with the Leaders for Global Operations Program at MIT
Thesis: S.M., Massachusetts Institute of Technology, Department of Mechanical Engineering, 2019, In conjunction with the Leaders for Global Operations Program at MIT
Cataloged from PDF version of thesis.
Includes bibliographical references (pages 59-60).
Increasing competition in the defense industry risks contract margin degradation and increases the need for new avenues to margin expansion. One such area of opportunity is take-away bids for under-performing competitor sole-source contracts. Post financial crisis, the government has been more willing to entertain conversations with outside firms about existing contracts in the execution phase if the contracted firm is underperforming on budgetary and schedule terms. The contracted firm has the opportunity to defend its performance, though, so in order to maximize the likelihood of a successful take-away, the bid would ideally be submitted when the contracted firm is distracted and cannot put together as strong a defense as would be typical. Corporate restructuring is an example of such a time; employees are distracted and leadership, communication, and approval chains are disrupted. Because the government contracting process is long and detailed, often taking on the order of one year, if restructuring at competitor firms could be predicted up to a year in advance, resources could be shifted ahead of time to align bid submittal with the public restructuring announcement and therefore increase the likelihood of take-away success. The subject of this thesis is the development of the necessary dataset and the application of various machine learning methods to predict future restructuring. The literature review emphasizes understanding the benefits and shortcomings of current methods in relation to forecasting, and the proposed method seeks to fill in the gaps. Depending on the competitor, the resulting models predict future restructuring on blind historical test set data with an accuracy of 80-90%. While blind historical test set data are not necessarily indicative of future data, one of the firms under assessment recently announced a future restructuring in the same quarter that the model predicted.
by Matt Casavant.
M.B.A.
S.M.
M.B.A. Massachusetts Institute of Technology, Sloan School of Management
S.M. Massachusetts Institute of Technology, Department of Mechanical Engineering
APA, Harvard, Vancouver, ISO, and other styles
49

Addis, Antonio. "Deep reinforcement learning optimization of video streaming." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2019.

Find full text
Abstract:
This thesis addresses the optimization of video streaming performance over the internet, which has become particularly challenging with the advent of the new ultraHD resolutions and 360-degree video for virtual reality. The performance obtained is compared against the current state-of-the-art algorithms, and a reinforcement learning model is developed that can make decisions to improve the QoE (quality of experience) during a streaming session. For 360-degree video, the snapchange technique is also implemented; with this method it is possible to reduce the bandwidth used during streaming by forcing the user's gaze to be repositioned toward an area of greater interest in the video.
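Agents of this kind are commonly trained, in the adaptive-bitrate RL literature generally rather than in this thesis specifically, with a per-chunk QoE reward that trades video quality against rebuffering and quality switches. The weight values below are illustrative assumptions.

    def qoe_reward(bitrate_mbps, prev_bitrate_mbps, rebuffer_s,
                   w_quality=1.0, w_rebuffer=4.3, w_smooth=1.0):
        """Per-chunk QoE reward for an adaptive-bitrate agent: reward higher
        bitrates, penalize stalls and abrupt quality switches. The weights
        are illustrative, not taken from the thesis."""
        quality = w_quality * bitrate_mbps
        stall_penalty = w_rebuffer * rebuffer_s
        smooth_penalty = w_smooth * abs(bitrate_mbps - prev_bitrate_mbps)
        return quality - stall_penalty - smooth_penalty

    # Example: a 4 Mbps chunk after a 2.5 Mbps chunk with 0.3 s of stall.
    r = qoe_reward(4.0, 2.5, 0.3)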
APA, Harvard, Vancouver, ISO, and other styles
50

Torkamani, MohamadAli. "Robust Large Margin Approaches for Machine Learning in Adversarial Settings." Thesis, University of Oregon, 2016. http://hdl.handle.net/1794/20677.

Full text
Abstract:
Machine learning algorithms are invented to learn from data and to use data to perform predictions and analyses. Many agencies now use machine learning algorithms to provide services and to perform tasks that used to be done by humans, including making high-stakes decisions. Determining the right decision relies strongly on the correctness of the input data, which gives criminals a tempting incentive to deceive machine learning algorithms by manipulating the data fed to them. Yet traditional machine learning algorithms are not designed to be safe when confronting unexpected inputs. In this dissertation, we address the problem of adversarial machine learning; i.e., our goal is to build safe machine learning algorithms that are robust in the presence of noisy or adversarially manipulated data.

Many complex questions to which a machine learning system must respond have complex answers. Such outputs can have internal structure, with exponentially many possible values, and adversarial machine learning is more challenging when the output to be predicted has a complex structure itself. A significant focus of this dissertation is therefore adversarial machine learning for predicting structured outputs.

First, we develop a new algorithm that reliably performs collective classification: it jointly assigns labels to the nodes of graph data and is robust to malicious changes that an adversary can make in the properties of the different nodes. The learning method is highly efficient and is formulated as a convex quadratic program. Empirical evaluations confirm that this technique not only secures the prediction algorithm in the presence of an adversary but also generalizes better to future inputs, even when there is no adversary.

While our robust collective classification method is efficient, it is not applicable to generic structured prediction problems. Next, we investigate parameter learning for robust structured prediction models. This method constructs regularization functions based on the limitations of the adversary in altering the feature space of the structured prediction algorithm. The proposed regularization techniques secure the algorithm against adversarial data changes with little additional computational cost. We prove that robustness to adversarial manipulation of data is equivalent to certain regularization for large-margin structured prediction, and vice versa, confirming some previous results for simpler problems.

In practice, an ordinary adversary often either lacks the computational power to design the optimal attack or lacks sufficient information about the learner's model to do so, and therefore applies many random changes to the input in the hope of making a breakthrough. This implies that minimizing the expected loss function under adversarial noise yields robustness against such mediocre adversaries. Dropout training resembles this noise-injection scenario. Originally proposed as a regularization technique for neural networks, the procedure is simple: at each training iteration, randomly selected features are set to zero. We derive a regularization method for large-margin parameter learning based on dropout by calculating the expected loss function under all possible dropout values. This results in a simple objective function that is efficient to optimize. We extend dropout regularization to non-linear kernels in several directions: we define the concept of dropout for the input space, the feature space, and input dimensions, and we introduce methods for approximate marginalization over the feature space, even when it is infinite-dimensional. Empirical evaluations show that our techniques consistently outperform the baselines on different datasets.
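The dissertation derives closed-form expectations; as rough intuition for what "expected loss under dropout" means, here is a Monte Carlo sketch of the marginalized hinge loss for a linear large-margin model. All names and the dropout rate are assumptions of this illustration, not the dissertation's derivation.

    import numpy as np

    def expected_dropout_hinge(w, X, y, p_drop=0.2, n_samples=100, rng=None):
        """Monte Carlo estimate of the hinge loss marginalized over Bernoulli
        feature dropout: each feature is zeroed with probability p_drop and
        the survivors rescaled by 1/(1 - p_drop), so the margin is unbiased
        in expectation. Minimizing this acts as a data-dependent regularizer,
        in the spirit of the dissertation's closed-form derivation."""
        if rng is None:
            rng = np.random.default_rng(0)
        losses = []
        for _ in range(n_samples):
            mask = rng.random(X.shape) >= p_drop
            X_noisy = X * mask / (1.0 - p_drop)
            margins = y * (X_noisy @ w)
            losses.append(np.maximum(0.0, 1.0 - margins).mean())
        return float(np.mean(losses))

    # Toy usage: compare the clean hinge loss with its dropout expectation.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 10))
    w = rng.normal(size=10)
    y = np.sign(X @ w + 0.1 * rng.normal(size=50))
    clean = np.maximum(0.0, 1.0 - y * (X @ w)).mean()
    robust = expected_dropout_hinge(w, X, y)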
APA, Harvard, Vancouver, ISO, and other styles
