Theses on the topic "Distributed optimization and learning"
Cite a source in APA, MLA, Chicago, Harvard, and many other citation styles
Browse the top 50 dissertations / theses for research on the topic "Distributed optimization and learning".
Next to every source in the list of references there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scientific publication in .pdf format and read its abstract online, when one is included in the record's metadata.
Browse theses from many areas of scholarship and compile an accurate bibliography.
Funkquist, Mikaela, and Minghua Lu. "Distributed Optimization Through Deep Reinforcement Learning". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-293878.
Reinforcement learning methods allow self-learning units to play video and board games autonomously. This project aims to study the effectiveness of the reinforcement learning methods Q-learning and deep Q-learning on dynamic problems. The goal is to train robots to move through a warehouse in the best possible way without colliding. A virtual environment was created in which the algorithms were tested by simulating moving agents. The effectiveness of the algorithms was evaluated by how quickly the agents learned to perform predetermined tasks. The results show that Q-learning works well for simple problems with few agents, where systems with two active agents were solved quickly. Deep Q-learning works better for more complex systems with more agents, although cases of suboptimal movement occurred. Both algorithms showed good potential within their respective domains, but improvements are needed before they can be used in practice.
Bachelor's thesis project in electrical engineering, 2020, KTH, Stockholm
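As a rough illustration of the tabular method that the abstract above contrasts with deep Q-learning, here is a minimal Q-learning sketch in Python; the action set, rates, reward signal, and state encoding are placeholders, not the thesis's actual warehouse environment:

```python
import random
from collections import defaultdict

# Tabular Q-learning: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
ACTIONS = ["up", "down", "left", "right", "wait"]
Q = defaultdict(float)  # maps (state, action) -> estimated value

def choose_action(state):
    # Epsilon-greedy exploration over the discrete action set.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    # One temporal-difference backup toward the greedy target.
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

update(("cell_0_0",), "right", -1.0, ("cell_0_1",))
print(choose_action(("cell_0_0",)))
```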
Konečný, Jakub. "Stochastic, distributed and federated optimization for machine learning". Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/31478.
Armond, Kenneth C. Jr. "Distributed Support Vector Machine Learning". ScholarWorks@UNO, 2008. http://scholarworks.uno.edu/td/711.
Patvarczki, Jozsef. "Layout Optimization for Distributed Relational Databases Using Machine Learning". Digital WPI, 2012. https://digitalcommons.wpi.edu/etd-dissertations/291.
Ouyang, Hua. "Optimal stochastic and distributed algorithms for machine learning". Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/49091.
El, Gamal Mostafa. "Distributed Statistical Learning under Communication Constraints". Digital WPI, 2017. https://digitalcommons.wpi.edu/etd-dissertations/314.
Dai, Wei. "Learning with Staleness". Research Showcase @ CMU, 2018. http://repository.cmu.edu/dissertations/1209.
Lu, Yumao. "Kernel optimization and distributed learning algorithms for support vector machines". Diss., Restricted to subscribing institutions, 2005. http://uclibs.org/PID/11984.
Dinh, The Canh. "Distributed Algorithms for Fast and Personalized Federated Learning". Thesis, The University of Sydney, 2023. https://hdl.handle.net/2123/30019.
Reddi, Sashank Jakkam. "New Optimization Methods for Modern Machine Learning". Research Showcase @ CMU, 2017. http://repository.cmu.edu/dissertations/1116.
Shi, Shaohuai. "Communication optimizations for distributed deep learning". HKBU Institutional Repository, 2020. https://repository.hkbu.edu.hk/etd_oa/813.
Wang, Sinong. "Coded Computation for Speeding up Distributed Machine Learning". The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1555336880521062.
Sitta, Alessandro. "Privacy-Preserving Distributed Optimization via Obfuscated Gradient Tracking". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.
Costantini, Marina. "Optimization methods over networks for popular content delivery and distributed machine learning". Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS182.
The ever-increasing number of users and applications on the Internet sets a number of challenges for network operators and engineers in order to keep up with the high traffic demands. In this scenario, making efficient use of the available resources has become imperative. In this thesis, we develop optimization methods to improve the utilization of the network in two specific applications enabled by the Internet: network edge caching and distributed model training. Network edge caching is a recent technique that proposes to store at the network edge copies of contents that have a high probability of being requested, in order to reduce latency and improve the overall user experience. Traditionally, when a user requests a web page or application, the request is sent to a remote server that stores the data. The server retrieves the requested data and sends it back to the user. This process can be slow and can lead to latency and congestion issues, especially when multiple users access the same data simultaneously. To address this issue, network operators can deploy edge caching servers close to end users. These servers are then filled during off-peak hours with contents that have a high probability of being requested, so that during times of high traffic the user can still retrieve them quickly and at high quality. On the other hand, distributed model training, or more generally distributed optimization, is a method for training large-scale machine learning models using multiple agents that work together to find the optimal parameters of the model. In such settings, the agents interleave local computation steps with communication steps to train a model that takes into account the data of all agents. To achieve this, agents may exchange optimization values (parameters, gradients) but not the data. Here we consider two such distributed training settings: the decentralized and the federated. In the decentralized setting, agents are interconnected in a network and communicate their optimization values only to their direct neighbors. In the federated setting, the agents communicate with a central server that regularly averages the most recent values of (usually a subset of) the agents and broadcasts the result to all of them. Naturally, the success of such techniques relies on frequent communication among the agents or with the server. Therefore, there is great interest in designing distributed optimization algorithms that achieve state-of-the-art performance at lower communication costs. In this thesis, we propose algorithms that improve the performance of existing methods for popular content delivery and distributed machine learning by making better use of the network resources. In Chapter 2, we propose an algorithm that exploits recommendation engines to jointly design the contents cached at the network edge and the recommendations shown to the user. This algorithm achieves a higher fraction of requests served by the cache than its competitors, and thus requires less communication with the remote server. In Chapter 3, we design an asynchronous algorithm for decentralized optimization that requires minimal coordination between the agents and thus tolerates connection interruptions and link failures.
We then show that, if the agents are allowed to increase the amount of data they transmit by a factor equal to their node degree, the convergence of this algorithm can be made much faster by letting the agents decide their communication scheme according to the gains provided by communicating with each of their neighbors. Finally, in Chapter 4 we propose an algorithm that exploits inter-agent communication within the classical federated learning setup (where, in principle, agents communicate only with the server), and which can achieve the same convergence speed as the classical setup with fewer communication rounds with the server, these rounds being the main bottleneck in this setting.
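The decentralized setting sketched above, in which each agent mixes its neighbors' parameters and then takes a local gradient step, can be illustrated in a few lines. The ring topology, quadratic local losses, and step size below are assumptions made for the example, not the algorithms proposed in the thesis:

```python
import numpy as np

# One loop of decentralized gradient descent: each agent averages its
# neighbors' parameters through a mixing matrix W, then takes a local
# gradient step on its own data. All problem data here is illustrative.
n_agents, dim, lr = 4, 3, 0.1
rng = np.random.default_rng(0)
targets = rng.normal(size=(n_agents, dim))   # each agent's local optimum
x = np.zeros((n_agents, dim))                # one parameter copy per agent

# Ring topology with a doubly stochastic mixing matrix.
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

for _ in range(200):
    grads = x - targets      # gradient of 0.5 * ||x_i - target_i||^2
    x = W @ x - lr * grads   # neighbor communication, then local step

# Agents reach consensus near the minimizer of the sum of local losses.
print(np.round(x.mean(axis=0), 2), np.round(targets.mean(axis=0), 2))
```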
Singh, Manish Kumar. "Optimization, Learning, and Control for Energy Networks". Diss., Virginia Tech, 2021. http://hdl.handle.net/10919/104064.
Doctor of Philosophy
Massive infrastructure networks play a pivotal role in everyday human lives. A minor service disruption occurring locally in electric power, natural gas, or water networks is considered a significant loss. Uncertain demands, equipment failures, regulatory stipulations, and most importantly complicated physical laws render managing these networks an arduous task. Oftentimes, the first-principles mathematical models for these networks are well known. Nevertheless, the computations needed in real time to make spontaneous decisions frequently surpass the available resources. Explicitly identifying such problems, this dissertation extends the state of the art on three fronts: First, efficient models enabling the operators to tractably solve some routinely encountered problems are developed using fundamental and diverse mathematical tools; Second, quickly trainable machine learning based solutions are developed that enable spontaneous decision making while learning offline from sophisticated mathematical programs; and Third, control mechanisms are designed that ensure a safe and autonomous network operation without human intervention. These novel solutions are bolstered by mathematical guarantees and extensive simulations on benchmark power, water, and natural gas networks.
Khirirat, Sarit. "First-Order Algorithms for Communication Efficient Distributed Learning". Licentiate thesis, KTH, Reglerteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-263738.
The development of communication technology and data storage has made large collections of data more accessible than ever. This shift leads to numerical optimization problems with data sets of large scale in both volume and dimension. In response to this trend, the popularity of high-performance computing architectures has increased more than ever before. Scalable optimization tools can achieve high efficiency by distributing the computational load across multiple machines. In practice, however, they come at a price in the form of significant communication overhead, which shifts the performance bottleneck from computation to communication. When real-world optimization problems with a large number of parameters are solved, communication between machines can dominate almost 80% of the training time. To reduce the communication load, several compression techniques have been proposed in the literature. Although optimization algorithms using these compressors are reported to be as competitive as their full-precision counterparts, they suffer a loss of accuracy. To address this, we develop in this thesis theory and techniques for designing communication-efficient optimization algorithms that use only low-precision information. In the first part, we analyze the convergence of optimization algorithms with direct compression. First, we give an overview of compression techniques covering many compressors of practical interest. Then, we present a unified analysis framework for optimization algorithms with compressors, which may be either deterministic or randomized. In particular, we show how parameter choices in compressed optimization algorithms are determined by the compressor's parameters so as to guarantee convergence. Our convergence guarantees show the dependence on the compressor's accuracy and on delay effects due to the asynchrony of the algorithms. This allows us to characterize the trade-off between iteration and communication complexity when compression is used. In the second part, we study the high performance of error-compensation methods for compressed optimization algorithms. Although convergence guarantees with error compensation have been established, there is very limited theoretical support explaining their competitive performance. We therefore develop theoretical explanations showing that the use of error compensation guarantees arbitrarily high solution accuracy from compressed information. In particular, error compensation helps remove accumulated compression errors and thereby improves solution accuracy, especially for ill-conditioned quadratic optimization problems. We also present strong convergence analysis for error compensation applied to stochastic gradient methods over a communication network with multiple machines. In contrast to direct compression, the error-compensated algorithms achieve a significant reduction of the compression error. Simulations of the algorithms in this thesis on real-world problems with benchmark data sets validate our theoretical results.
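A minimal sketch of the error-compensation idea the abstract above describes (accumulate what the compressor discards and re-inject it later), assuming a top-k sparsifier and a toy quadratic objective; this illustrates the generic error-feedback pattern, not the thesis's exact algorithms:

```python
import numpy as np

def top_k(v, k):
    # Keep the k largest-magnitude entries, zero out the rest (a common compressor).
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def error_feedback_step(x, grad, memory, lr=0.1, k=2):
    # Compress the error-corrected step; keep what the compressor dropped
    # in `memory` so accumulated compression error is re-injected later.
    corrected = lr * grad + memory
    update = top_k(corrected, k)
    memory = corrected - update   # residual carried to the next iteration
    return x - update, memory

x, memory = np.ones(10), np.zeros(10)
for _ in range(100):
    grad = x  # gradient of 0.5 * ||x||^2
    x, memory = error_feedback_step(x, grad, memory)
print(np.linalg.norm(x))  # shrinks toward 0 despite aggressive compression
```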
Jeon, Sung-eok. "Near-Optimality of Distributed Network Management with a Machine Learning Approach". Diss., Georgia Institute of Technology, 2007. http://hdl.handle.net/1853/16136.
Esteves, José Jurandir Alves. "Optimization of network slice placement in distributed large-scale infrastructures : from heuristics to controlled deep reinforcement learning". Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS325.
This PhD thesis investigates how to optimize Network Slice Placement in distributed large-scale infrastructures, focusing on online heuristic and Deep Reinforcement Learning (DRL) based approaches. First, we rely on Integer Linear Programming (ILP) to propose a data model for enabling on-Edge and on-Network Slice Placement. Contrary to most studies related to placement in the NFV context, the proposed ILP model considers complex Network Slice topologies and pays special attention to the geographic location of Network Slice Users and its impact on the End-to-End (E2E) latency. Extensive numerical experiments show the relevance of taking the user location constraints into account. Then, we rely on an approach called the "Power of Two Choices" (P2C) to propose an online heuristic algorithm for the problem which is adapted to support placement on large-scale distributed infrastructures while integrating Edge-specific constraints. The evaluation results show the good performance of the heuristic, which solves the problem in a few seconds under a large-scale scenario. The heuristic also improves the acceptance ratio of Network Slice Placement Requests when compared against a deterministic online ILP-based solution. Finally, we investigate the use of ML methods, more specifically DRL, for increasing the scalability and automation of Network Slice Placement, considering a multi-objective optimization approach to the problem. We first propose a DRL algorithm for Network Slice Placement which relies on the Advantage Actor Critic algorithm for fast learning and on Graph Convolutional Networks for automated feature extraction. Then, we propose an approach we call Heuristically Assisted Deep Reinforcement Learning (HA-DRL), which uses heuristics to control the learning and execution of the DRL agent. We evaluate this solution through simulations under stationary, cycle-stationary and non-stationary network load conditions. The evaluation results show that heuristic control is an efficient way of speeding up the learning process of DRL, achieving a substantial gain in resource utilization, reducing performance degradation, and being more reliable under unpredictable changes in network load than non-controlled DRL algorithms.
Salim, Adil. "Random monotone operators and application to stochastic optimization". Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLT021/document.
This thesis mainly studies optimization algorithms. Programming problems arising in signal processing and machine learning are composite in many cases, i.e. they exhibit constraints and non-smooth regularization terms. Proximal methods are known to be efficient for solving such problems. However, in modern applications of data science, the functions to be minimized are often represented as statistical expectations whose evaluation is intractable. This covers the case of online learning, big data problems and distributed computation problems. To solve these problems, we study in this thesis stochastic proximal methods, which generalize proximal algorithms to the case of cost functions written as expectations. Stochastic proximal methods are first studied with a constant step size, using stochastic approximation techniques. More precisely, the Ordinary Differential Equation method is adapted to the case of differential inclusions. In order to study the asymptotic behavior of the algorithms, the stability of the sequences of iterates (seen as Markov chains) is studied. Then, generalizations of the stochastic proximal gradient algorithm with decreasing step sizes are designed to solve composite problems. All the quantities used to define the optimization problem are written as expectations. This includes a primal-dual algorithm for regularized and linearly constrained problems and an algorithm for optimization over large graphs.
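As an illustration of one stochastic proximal gradient step on a composite objective of the kind described above, here is a sketch for F(x) = E[f(x, ξ)] + λ‖x‖₁ with soft-thresholding as the proximal operator; the problem data and step-size schedule are assumptions made for the example:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(1)
dim, lam = 5, 0.1
x = np.zeros(dim)
x_star = np.array([1.0, 0.0, -2.0, 0.0, 0.5])

for k in range(1, 2001):
    step = 1.0 / (10 + k)                 # decreasing step size
    noise = rng.normal(scale=0.1, size=dim)
    stoch_grad = (x - x_star) + noise     # unbiased sample of the smooth gradient
    x = soft_threshold(x - step * stoch_grad, step * lam)

print(np.round(x, 2))  # sparse estimate near the soft-thresholded optimum
```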
Mhanna, Elissa. "Beyond gradients : zero-order approaches to optimization and learning in multi-agent environments". Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASG123.
The rise of connected devices and the data they produce has driven the development of large-scale applications. These devices form distributed networks with decentralized data processing. As the number of devices grows, challenges like communication overhead and computational costs increase, requiring optimization methods that work under strict resource constraints, especially where derivatives are unavailable or costly. This thesis focuses on zero-order (ZO) optimization methods, which are ideal for scenarios where explicit function derivatives are inaccessible. ZO methods estimate gradients based only on function evaluations, making them highly suitable for distributed and federated learning environments where devices collaborate to solve global optimization tasks with limited information and noisy data. In the first chapter, we address distributed ZO optimization for strongly convex functions across multiple agents in a network. We propose a distributed zero-order projected gradient descent algorithm that uses one-point gradient estimates, where the function is queried only once per stochastic realization and noisy function evaluations estimate the gradient. The chapter establishes the almost sure convergence of the algorithm and derives theoretical upper bounds on the convergence rate. With constant step sizes, the algorithm achieves a linear convergence rate. This is the first time this rate has been established for one-point (and even two-point) gradient estimates. We also analyze the effects of diminishing step sizes, establishing a convergence rate that matches the lower bounds of centralized ZO methods. The second chapter addresses the challenges of federated learning (FL), which is often hindered by the communication bottleneck: the high cost of transmitting large amounts of data over limited-bandwidth networks. To address this, we propose a novel zero-order federated learning (ZOFL) algorithm that reduces communication overhead using one-point gradient estimates. Devices transmit scalar values instead of large gradient vectors, lowering the data sent over the network. Moreover, the algorithm incorporates wireless communication disturbances directly into the optimization process, eliminating the need for explicit knowledge of the channel state. This approach is the first to integrate wireless channel properties into a learning algorithm, making it resilient to real-world communication issues. We prove the almost sure convergence of this method in nonconvex optimization settings, establish its convergence rate, and validate its effectiveness through experiments. In the final chapter, we extend the ZOFL algorithm to include two-point gradient estimates. Unlike one-point estimates, which rely on a single function evaluation, two-point estimates query the function twice, providing a more accurate gradient approximation and enhancing the convergence rate. This method maintains the communication efficiency of one-point estimates, where only scalar values are transmitted, and relaxes the assumption that the objective function must be bounded. The chapter demonstrates that the proposed two-point ZO method achieves linear convergence rates for strongly convex and smooth objective functions. For nonconvex problems, the method shows improved convergence speed, particularly when the objective function is smooth and K-gradient-dominated, where a linear rate is also achieved.
We also analyze the impact of constant versus diminishing step sizes and provide numerical results showing the method's communication efficiency compared to other federated learning techniques.
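The one-point and two-point gradient estimates discussed above can be written down compactly. The sketch below uses the common (d/δ)-scaled sphere-sampling form on a noisy quadratic oracle; this is an illustrative convention, not necessarily the thesis's exact estimator:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4

def f(x):
    # Noisy zeroth-order oracle for 0.5 * ||x||^2 (true gradient is x itself).
    return 0.5 * x @ x + rng.normal(scale=0.01)

def sphere_direction():
    u = rng.normal(size=d)
    return u / np.linalg.norm(u)

def one_point_estimate(x, delta=0.1):
    # Single query per realization: (d / delta) * f(x + delta u) * u.
    u = sphere_direction()
    return (d / delta) * f(x + delta * u) * u

def two_point_estimate(x, delta=0.1):
    # Two queries: (d / (2 delta)) * (f(x + delta u) - f(x - delta u)) * u.
    u = sphere_direction()
    return (d / (2 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u

x = np.ones(d)  # true gradient here is (1, 1, 1, 1)
g1 = np.mean([one_point_estimate(x) for _ in range(20000)], axis=0)
g2 = np.mean([two_point_estimate(x) for _ in range(2000)], axis=0)
print(np.round(g1, 1), np.round(g2, 2))  # one-point is far noisier per sample
```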
Zecchin, Matteo. "Robust Machine Learning Approaches to Wireless Communication Networks". Electronic Thesis or Diss., Sorbonne université, 2022. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2022SORUS397.pdf.
Artificial intelligence is widely viewed as a key enabler of sixth-generation wireless systems. In this thesis we target fundamental problems arising from the interaction between these two technologies, with the end goal of paving the way towards the adoption of reliable AI in future wireless networks. We develop distributed training algorithms that allow collaborative learning at the edge of wireless networks despite communication bottlenecks, unreliable workers, and data heterogeneity. We then take a critical look at the application of the standard frequentist learning paradigm to wireless communication problems and propose an extension of generalized Bayesian learning that concurrently counteracts three prominent challenges arising in this application domain: data scarcity, the presence of outliers, and model misspecification.
Alsayasneh, Maha. "On the identification of performance bottlenecks in multi-tier distributed systems". Thesis, Université Grenoble Alpes, 2020. http://www.theses.fr/2020GRALM009.
Today's distributed systems are made of various software components with complex interactions and a large number of configuration settings. Pinpointing the performance bottlenecks is generally a non-trivial task, which requires human expertise as well as trial and error. Moreover, the same software stack may exhibit very different bottlenecks depending on factors such as the underlying hardware, the application logic, the configuration settings, and the operating conditions. This work aims to (i) investigate whether it is possible to identify a set of key metrics that can be used as reliable and general indicators of performance bottlenecks, (ii) identify the characteristics of these indicators, and (iii) build a tool that can automatically and accurately determine if the system reaches its maximum capacity in terms of throughput. In this thesis, we present three contributions. First, we present an analytical study of a large number of realistic configuration setups of multi-tier distributed applications, more specifically focusing on data processing pipelines. By analyzing a large number of metrics at the hardware and at the software level, we identify the ones that exhibit changes in their behavior at the point where the system reaches its maximum capacity. We consider these metrics as reliable indicators of performance bottlenecks. Second, we leverage machine learning techniques to build a tool that can automatically identify performance bottlenecks in the data processing pipeline. We consider different machine learning methods, different selections of metrics, and different cases of generalization to new setups. Third, to assess the validity of the results obtained for the data processing pipeline with both the analytical and the learning-based approaches, the two approaches are applied to the case of a Web stack. From our research, we draw several conclusions. First, it is possible to identify key metrics that act as reliable indicators of performance bottlenecks for a multi-tier distributed system. More precisely, the point at which the server reaches its maximum capacity can be identified based on these reliable metrics. Contrary to the approach adopted by many existing works, our results show that a combination of metrics of different types is required to ensure reliable identification of performance bottlenecks in a large number of setups. We also show that approaches based on machine learning techniques to analyze metrics can identify performance bottlenecks in a multi-tier distributed system. The comparison of different models shows that the ones based on the reliable metrics identified by our analytical study are the ones that achieve the best accuracy. Furthermore, our extensive analysis shows the robustness of the obtained models, which can generalize to new setups, to new numbers of clients, and to both new setups and new numbers of clients. Extending the analysis to a Web stack confirms the main findings obtained through the study of the data processing pipeline. These results pave the way towards a general and accurate tool to identify performance bottlenecks in distributed systems.
Samarakoon, S. (Sumudu). "Learning-based methods for resource allocation and interference management in energy-efficient small cell networks". Doctoral thesis, Oulun yliopisto, 2017. http://urn.fi/urn:isbn:9789526216874.
Resource allocation and interference management in wireless small cell networks have been important research topics in recent years. A great deal of research exists, but new fifth-generation (5G) networks demand high capacity, reliability, and energy efficiency. Methods must therefore be developed that focus on ultra-dense, self-organizing small cell networks (SCNs). The main goal of this thesis is to present a set of distributed methods for shared resource allocation and interference management in small cell networks under different network architectures. The thesis proposes and studies distributed methods for managing wireless small cell networks in three settings: a backhaul-aware interference management method for wireless small cell networks, a dynamic clustering based model for maximizing the energy efficiency of dense wireless small cell networks, and a joint power control and user allocation method for optimizing the energy efficiency of ultra-dense small cell networks. Optimizing ultra-dense small cell networks is highly challenging due to interference and the strong coupling between queues and channel states. Moreover, since the clusters/base stations do not communicate with one another, distributed learning algorithms are needed so that optimal transmission strategies can be selected autonomously using only local information. To this end, several distributed algorithms are developed that exploit machine learning, Lyapunov optimization, and mean-field theory. All of the above methods are validated with extensive simulations, which verify their performance against conventional models. Conventional models cannot account for the constraints imposed by network scale, queue and channel state dynamics, heterogeneous backhauls and limited capacity, or the lack of coordination between network elements. The proposed methods yield substantial improvements in energy savings, throughput, and delay reduction compared with the conventional models considered in the literature.
Rodio, Angelo. "Hétérogénéité des clients dans les systèmes d'apprentissage fédérés". Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ4029.
Federated Learning (FL) stands as a collaborative framework where clients (mobile devices) train a machine learning model under a central server's orchestration, preserving data decentralization. Client participation heterogeneity stems from varied device capabilities in hardware specifications (CPU power, memory capacity), network connectivity types (3G, 4G, 5G, WiFi), and power availability (battery levels), and it is generally beyond server control. This thesis focuses on providing a theoretical understanding of federated learning systems under heterogeneous client participation, specifically analyzing the impact of this heterogeneity on the convergence of federated learning algorithms, and proposing practical solutions for a more efficient use of the system and its resources. The first part of the thesis focuses on tackling challenges associated with temporal and spatial correlation in client participation, the former due to the correlated client participation dynamics over time, the latter due to the clients' correlated geographic distributions. In this chapter, we first observe that heterogeneous client participation can potentially bias the learning process. We formalize the bias-variance tradeoff induced by heterogeneous client participation by decomposing the optimization error into variance (related to convergence speed) and bias (indicative of model quality). By minimizing these two errors, we demonstrate that assigning larger aggregation weights to frequently participating clients can accelerate convergence. Moreover, we study the impact of temporal and spatial correlation in client participation through finite-state Markov chain modeling. We show that correlation slows down convergence within a logarithmic factor related to the Markov chain's geometric mixing time. Minimizing the bias-variance tradeoff, we also find that lower aggregation weights for highly correlated clients accelerate convergence. We finally propose an algorithm, Correlation-Aware Federated Learning (CA-Fed), to optimize the bias-variance tradeoff and thus achieve faster convergence. The second part of the thesis considers the more applied scenario of lossy communication channels. Network conditions, particularly packet losses, represent a main, uncontrollable source of heterogeneity in client participation. In this chapter, challenging the conventional mitigation strategies for packet losses such as retransmission or error correction, we show that federated learning algorithms can still learn over asymmetric, lossy channels. Our proposed solution modifies traditional federated learning approaches by transmitting model updates in place of models and correcting the averaging step to account for the heterogeneity of the communication channels. Experimental results confirm that our algorithm, under lossy channels, matches the performance of ideal, lossless conditions within a limited number of communication rounds. The third part investigates leveraging variance reduction methods, specifically stale updates, to compensate for the heterogeneity in client participation. Recent research considered similar strategies to mitigate the effects of partial client participation in federated learning. These methods involve retaining the last computed, potentially stale, update for each client to replace unavailable current updates for non-participating clients. However, existing analyses rely on the assumption of uniform client participation, which is restrictive in real-world scenarios.
By broadening the analysis to heterogeneous client participation, we discover that convergence is significantly influenced by the least participating clients. This suggests that existing algorithms are not optimally designed for such environments, and we propose a more robust approach, FedStale, to exploit stale model updates under heterogeneous client participation.
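A toy sketch of the stale-update idea discussed above: the server retains each client's last computed update and averages stale entries in place of missing ones. The participation probabilities, local losses, and uniform weighting are illustrative assumptions, not the exact FedStale rule:

```python
import numpy as np

rng = np.random.default_rng(3)
n_clients, dim, lr = 5, 3, 0.1
participation = np.array([0.9, 0.7, 0.5, 0.3, 0.1])  # heterogeneous availability
targets = rng.normal(size=(n_clients, dim))          # each client's local optimum
x = np.zeros(dim)
stale = np.zeros((n_clients, dim))                   # last update received per client

for _ in range(500):
    active = rng.random(n_clients) < participation
    for i in np.where(active)[0]:
        stale[i] = targets[i] - x        # fresh update: -grad of 0.5*||x - t_i||^2
    x = x + lr * stale.mean(axis=0)      # average fresh updates with stale ones

# The iterate approaches the average of the local optima despite rare clients.
print(np.round(x, 2), np.round(targets.mean(axis=0), 2))
```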
Bokiš, Daniel. "Optimalizační algoritmy v logistických kombinatorických úlohách". Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2015. http://www.nusl.cz/ntk/nusl-234978.
Martinez Medina, Lourdes. "Optimisation des requêtes distribuées par apprentissage". Thesis, Grenoble, 2014. http://www.theses.fr/2014GRENM015.
Distributed data systems are becoming increasingly complex. They interconnect devices (e.g. smartphones, tablets, etc.) that are heterogeneous, autonomous, either static or mobile, and subject to physical limitations. Such devices run applications (e.g. virtual games, social networks, etc.) for the online interaction of users producing and consuming data on demand or continuously. The characteristics of these systems add new dimensions to the query optimization problem, such as multiple optimization criteria, scarce information on data, and the lack of a global system view, among others. Traditional query optimization techniques target semi-autonomous (or non-autonomous) systems. They rely on information about data and make strong assumptions about the system behavior. Moreover, most of these techniques are centered on the optimization of execution time only. The difficulty of evaluating queries efficiently in today's applications motivates this work to revisit traditional query optimization techniques. This thesis faces these challenges by adapting the Case Based Reasoning (CBR) paradigm to query processing, providing a way to optimize queries when there is no prior knowledge of data. It focuses on optimizing queries using cases generated from the evaluation of similar past queries. A query case comprises: (i) the query, (ii) the query plan and (iii) the measures (computational resources consumed) of the query plan. The thesis also concerns the way the CBR process interacts with the query plan generation process. This process uses classical heuristics and makes decisions randomly (e.g. when there are no statistics for join ordering and selection of algorithms, routing protocols). It also (re)uses cases (existing query plans) for similar query parts, improving the query optimization and therefore evaluation efficiency. The propositions of this thesis have been validated within the CoBRa optimizer developed in the context of the UBIQUEST project.
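The case structure enumerated above (query, plan, measures) and a similarity-based retrieval step can be sketched as follows; the Jaccard similarity over query signatures is a hypothetical choice made for illustration, not necessarily the one used by the CoBRa optimizer:

```python
from dataclasses import dataclass, field

@dataclass
class QueryCase:
    # A case as described above: the query, the plan used, and measured costs.
    query_signature: frozenset          # e.g. relations and predicates involved
    plan: str                           # serialized query plan
    measures: dict = field(default_factory=dict)  # e.g. {"time_ms": ..., "io": ...}

def retrieve(case_base, new_signature):
    # Reuse the plan of the most similar past query (Jaccard similarity here).
    def sim(case):
        inter = len(case.query_signature & new_signature)
        union = len(case.query_signature | new_signature)
        return inter / union if union else 0.0
    return max(case_base, key=sim, default=None)

base = [QueryCase(frozenset({"R", "S", "sel_a"}), "hashjoin(R,S)", {"time_ms": 40})]
best = retrieve(base, frozenset({"R", "S", "sel_b"}))
print(best.plan if best else "no case")
```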
Friend, Daniel. "Cognitive Networks: Foundations to Applications". Diss., Virginia Tech, 2009. http://hdl.handle.net/10919/26449.
Ph. D.
Jankee, Christopher. "Optimisation par métaheuristique adaptative distribuée en environnement de calcul parallèle". Thesis, Littoral, 2018. http://www.theses.fr/2018DUNK0480/document.
To solve discrete optimization problems of the black box type, many stochastic algorithms such as evolutionary algorithms or metaheuristics exist, and they can prove particularly effective depending on the problem to be solved. Given the observed properties of a problem, choosing the most relevant algorithm is itself a difficult problem. In the original framework of parallel and distributed computing environments, we propose and analyze different adaptive strategies for selecting optimization algorithms. These selection strategies are based on automatic reinforcement learning methods from the field of artificial intelligence and on information sharing between computing nodes. We compare and analyze the selection strategies in different situations. Two types of synchronous distributed computing environment are discussed: the island model and the master-slave model. At each iteration, synchronously across the set of nodes, the adaptive selection strategy chooses an algorithm according to the state of the search for a solution. In the first part, two problems, OneMax and NK, one unimodal and the other multimodal, are used as benchmarks for this work. Then, to better understand and improve the design of adaptive selection strategies, we propose a model of the optimization problem and its local search operator. In this model, an important characteristic is the average gain of an operator as a function of the fitness of the candidate solution. The model is used in the synchronous framework of the master-slave model. A selection strategy is broken down into three main components: the aggregation of the exchanged rewards, the learning scheme, and the distribution of the algorithms over the computing nodes. In the final part, we study three scenarios and give keys to understanding when adaptive selection strategies are preferable to naïve strategies. In the framework of the master-slave model, we study the different ways of aggregating the rewards on the master node, the distribution of the optimization algorithms over the computing nodes, and the communication time. This thesis ends with perspectives in the field of distributed adaptive stochastic optimization.
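As a sketch of the learning-scheme component described above, here is an epsilon-greedy selector that aggregates rewards as running means and picks the next algorithm to run; the operator names and reward model are invented for illustration:

```python
import random

class EpsilonGreedySelector:
    # Chooses among candidate algorithms from aggregated rewards (running means).
    def __init__(self, algorithms, epsilon=0.1):
        self.algorithms = algorithms
        self.epsilon = epsilon
        self.counts = {a: 0 for a in algorithms}
        self.means = {a: 0.0 for a in algorithms}

    def select(self):
        if random.random() < self.epsilon:
            return random.choice(self.algorithms)
        return max(self.algorithms, key=lambda a: self.means[a])

    def feedback(self, algorithm, reward):
        # Aggregate rewards reported by the computing nodes (incremental mean).
        self.counts[algorithm] += 1
        self.means[algorithm] += (reward - self.means[algorithm]) / self.counts[algorithm]

selector = EpsilonGreedySelector(["one_flip", "three_flip", "crossover"])
for _ in range(100):
    algo = selector.select()
    reward = {"one_flip": 0.2, "three_flip": 0.5, "crossover": 0.4}[algo] + random.gauss(0, 0.05)
    selector.feedback(algo, reward)
print(selector.means)  # the best operator accumulates the highest mean reward
```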
Capps, Michael V. "Fidelity optimization in distributed virtual environments". Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2000. http://handle.dtic.mil/100.2/ADA381349.
Dissertation supervisor: Zyda, Michael. "June 2000." Includes bibliographical references (p. 199-206). Also available online.
Ray, William J. "Optimization of distributed, object-oriented architectures". Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2001. http://handle.dtic.mil/100.2/ADA397283.
Dissertation supervisor: Berzins, Valdis. "September 2001." Includes bibliographical references (p. 299-302). Also available in print.
Johansson, Björn. "On Distributed Optimization in Networked Systems". Doctoral thesis, KTH, Reglerteknik, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-9801.
Liu, Ying. "Query optimization for distributed stream processing". [Bloomington, Ind.] : Indiana University, 2007. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3274258.
Source: Dissertation Abstracts International, Volume: 68-07, Section: B, page: 4597. Adviser: Beth Plale. Title from dissertation home page (viewed Apr. 21, 2008).
Li, Lijun. "Optimization techniques for distributed Verilog simulation". Thesis, McGill University, 2008. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=21936.
Moore's Law states that processor power doubles approximately every 18 months. For semiconductor manufacturers, this amounts to a constant challenge of bringing ever larger and more complex ICs (Integrated Circuits) to market. It is well known that the bottleneck in circuit design lies in simulation. Single-processor simulators may not keep up with the growing demands for speed and memory. This thesis presents a Verilog simulation environment with several optimization techniques. Verilog is a widely used digital design language. A distributed Verilog simulation can be executed on a cluster of workstations using a message-passing library such as MPI (Message Passing Interface). We describe event reconstruction, a technique that reduces the overhead caused by event saving, and compare its memory consumption and execution time with the results obtained by dynamic checkpointing. As its name suggests, event reconstruction reconstructs input events and anti-events from the difference between adjacent states, and does not save input events in the event queue. We propose a design-oriented, redundant multi-way partitioning algorithm for Verilog based on module instances, in order to take advantage of the hierarchical design information contained in modules and their instances. A Verilog instance is represented by a vertex in a circuit hypergraph. This vertex can be split into several vertices when instance-based partitioning does not produce an adequate load; in that case the algorithm splits the largest instance and moves the port
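A loose sketch of the state-difference idea behind event reconstruction as described above: rather than saving input events for rollback, the simulator derives them from differences between adjacent saved states. The dictionary-based state model below is a simplification of what a Verilog simulator actually stores:

```python
# Derive the effect of the event applied between two saved states.
def diff_states(before, after):
    return {sig: after[sig] for sig in after if before.get(sig) != after[sig]}

# Walk saved states pairwise and rebuild one (anti-)event per transition,
# instead of keeping the input events themselves in the event queue.
def reconstruct_anti_events(state_history):
    events = []
    for before, after in zip(state_history, state_history[1:]):
        events.append(diff_states(before, after))
    return events

history = [{"a": 0, "b": 0}, {"a": 1, "b": 0}, {"a": 1, "b": 1}]
print(reconstruct_anti_events(history))  # [{'a': 1}, {'b': 1}]
```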
Xu, Qing. "Optimization techniques for distributed logic simulation". Thesis, McGill University, 2011. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=96665.
Gate-level simulation is a necessary step to verify the conformity of a circuit design before fabrication. It is a time-consuming process, particularly given the current size of circuits. As circuits keep growing in size and complexity, there is a real need for more efficient simulation techniques in order to keep circuit verification times reasonably short. One such technique is parallel or distributed simulation. When executed on a network of workstations, distributed simulation also proves to be a very cost-effective technique. This research focuses on optimizing Time Warp based gate-level logic simulation techniques. The techniques described in this thesis are oriented towards distributed platforms. The first major contribution of this thesis is the creation of an object-oriented distributed simulator, XTW. It uses an optimistic synchronization algorithm and incorporates a number of known optimization techniques targeting different aspects of distributed logic simulation. XEQ, an O(1) event-scheduling algorithm, was developed for use in XTW. XEQ lets us run gate-level simulations up to 9.4 times faster than the same simulator using a skip-list event queue (O(lg n)). The rb-message mechanism, which lowers the cost of rollbacks in Time Warp, was also developed for use in XTW. Our experiments revealed that the rb-message mechanism reduced the number of anti-messages sent during a Time Warp based logic simulation by 76% on average. Furthermore, based on the observations that (1) some circuits should not be simulated in parallel and (2) different circuits achieve their best parallel simulation performance with different numbers of compute nodes, an algorithm using the K-NN machine learning algorithm was designed to determine the most efficient combination of software and hardware for a given logic simulation. After extensive training, it was able to predict with 99% accuracy when to use a parallel or a sequential simulator. The predicted number of nodes to use on a parallel platform was shown to yield an average execution time within 12% of the shortest execution time, and the configuration yielding the minimal execution time was selected in 61% of the cases. The final contribution of this thesis is the ability to link with commercial single-processor simulators through the Verilog PLI.
Mota, Joao F. C. "Communication-Efficient Algorithms For Distributed Optimization". Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/283.
Sevinc, Ender. "Genetic Algorithms For Distributed Database Design And Distributed Database Query Optimization". PhD thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/3/12611194/index.pdf.
Testo completoTerelius, Håkan. "Distributed Multi-Agent Optimization via Dual Decomposition". Thesis, KTH, Reglerteknik, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-46501.
Testo completoPayberah, Amir H. "Distributed Optimization of P2P Media Delivery Overlays". Licentiate thesis, KTH, Programvaru- och datorsystem, SCS, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-33287.
Bruce, Craig Steven. "Performance optimization for distributed-shared-data systems". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp02/NQ32819.pdf.
Testo completoMagnússon, Sindri. "Distributed Optimization with Nonconvexities and Limited Communication". Licentiate thesis, KTH, Reglerteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-181111.
Das, Sanghamitra. "Application semantics based optimization of distributed algorithm". Diss., Kansas State University, 2012. http://hdl.handle.net/2097/15056.
Department of Computing and Information Sciences
Gurdip Singh
To increase their applicability, distributed algorithms are typically written to work with any application on any network. This flexibility comes at the cost of performance, since these 'general purpose' algorithms are written with the worst-case scenario in mind. A distributed algorithm written for a specific application, or a class of applications, is fine-tuned to the properties of the application and can perform better than the general-purpose one. In this work, we propose two mechanisms by which we can optimize a general-purpose algorithm Alg based on the application App using it. In the first approach, we analyze the specification of App to identify patterns of communication in its communication topology. These properties are then used to customize the behavior of the underlying distributed algorithm Alg. To demonstrate this approach, we study applications specified as component-based systems, where application components communicate via events, together with distributed algorithms that enforce ordering requirements on these events. We show how our approach can be used to optimize event ordering algorithms based on communication patterns in the applications. In the second approach, rather than analyzing the application specification, we assume that the developer provides application properties I_App which serve as invariants for the optimization process. We assume that the algorithm is written and annotated in a format that is amenable to analysis. Our analysis algorithm then takes as input the application invariants and the annotated algorithm and looks for potential functions in the algorithm which are redundant in the context of the given application. In particular, we first look for function invocations in the algorithm whose post-conditions are already satisfied as a result of the application invariants. Each such invocation is considered a potentially redundant module. We further analyze the distributed algorithm to identify the impact of the removal of a specific invocation on the rest of the algorithm. We describe an implementation of this approach and demonstrate its applicability using a distributed termination detection algorithm.
Shah, Arpit. "DISTRIBUTED BIOGEOGRAPHY BASED OPTIMIZATION FOR MOBILE ROBOTS". Cleveland State University / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=csu1335969537.
Testo completoGuerreiro, Igor Moaco. "Distributed optimization techniques for 4G and beyond". reponame:Repositório Institucional da UFC, 2016. http://www.repositorio.ufc.br/handle/riufc/23343.
Testo completoSubmitted by Renato Vasconcelos (ppgeti@ufc.br) on 2017-06-14T15:00:33Z No. of bitstreams: 1 2016_tese_imguerreiro.pdf: 2003633 bytes, checksum: a1a392147f3f5f0c8946e15f33186715 (MD5)
Rejected by Marlene Sousa (mmarlene@ufc.br), reason: Renato, Favor enviar ao autor as seguintes alterações que ele deve providenciar. Prezado Igor: Existe uma orientação para que normalizemos as dissertações e teses da UFC, em suas paginas pré-textuais e lista de referencias, pelas regras da ABNT. Por esse motivo, sugerimos consultar o modelo de template, para ajudá-lo nesta tarefa, disponível em: http://www.biblioteca.ufc.br/educacao-de-usuarios/1234-templates Vamos agora as correções sempre de acordo com o template: 1. Na capa, a UFC solicitou que nos fizéssemos a opção de colocar a identificação da instituição e sua hierarquia em língua portuguesa. Sendo assim a ordem é: nome da Universidade, Centro, Departamento e nome do Programa em língua portuguesa. Na sua tese faltou o Centro. 2. Na capa e folha de rosto retire o nome do estado e deixe só a cidade da defesa e o ano 3. Na folha de rosto coloque o titulo de sua tese em ingles, 4.No sumário as seções terciárias (com 3 dígitos) ficam em negrito e itálico. As quaternárias (4 dígitos) ficam só em itálico sem negrito. Corrigir em todo o sumário. Atenciosamente, Marlene Rocha 3366-9620 on 2017-06-16T13:10:37Z (GMT)
Submitted by Renato Vasconcelos (ppgeti@ufc.br) on 2017-06-16T14:20:51Z No. of bitstreams: 1 2016_tese_imguerreiro.pdf: 2004320 bytes, checksum: b65ef686e66a34877a8b00018ddc6449 (MD5)
Approved for entry into archive by Marlene Sousa (mmarlene@ufc.br) on 2017-06-16T14:53:04Z (GMT) No. of bitstreams: 1 2016_tese_imguerreiro.pdf: 2004320 bytes, checksum: b65ef686e66a34877a8b00018ddc6449 (MD5)
Made available in DSpace on 2017-06-16T14:53:04Z (GMT). No. of bitstreams: 1 2016_tese_imguerreiro.pdf: 2004320 bytes, checksum: b65ef686e66a34877a8b00018ddc6449 (MD5) Previous issue date: 2016-06-30
In today’s and future wireless telecommunication systems, the use of multiple antenna elements on the transmitter side, and also on the receiver side, provides users a better experience in terms of data rate, coverage and energy efficiency. In the fourth generation of such systems, precoding emerged as a relevant problem to be optimally solved so that network capacity can be increased by exploiting the characteristics of the channel. In this case, transmitters are equipped with few antenna elements, say up to eight, which means there are a few tens of precoding matrices, assuming a discrete codebook, to be coordinated per transmitter. As the number of antenna elements increases at communication nodes, conditions to keep in providing good experience for users become challenging. That’s one of the challenges of the fifth generation. Every transmitter must deal with narrow beams when a massive number of antenna elements is adopted. The hard part, regarding the narrowness of beams, is to keep the spatial alignment between transmitter and receiver. Any misalignment makes the received signal quality drop significantly so that users no longer experience good coverage. In particular, to provide initial access to unsynchronized devices, transmitters need to find the best beams to send synchronization signals consuming as little power as possible without any knowledge on unsynchronized devices’ channel states. This thesis thus addresses both precoding and beam finding as parameter coordination problems. The goal is to provide methods to solve them in a distributed manner. For this purpose, two types of iterative algorithms are presented for both. The first and simplest method is the greedy solution in which each communication node in the network acts selfishly. The second method and the focus of this work is based on a message-passing algorithm, namely min-sum algorithm, in factor graphs. The precoding problem is modeled as a discrete optimization one whose discrete variables comprise precoding matrix indexes. As for beam finding, a beam sweep procedure is adopted and the total consumed power over the network is optimized. Numerical results show that the graph-based solution outperforms the baseline/greedy one in terms of the following performance metrics: a) system capacity for the precoding problem, and b) power consumption for the beam finding one. Although message-passing demands more signaling load and more iterations to converge compared to baseline method, it usually provides a near-optimal solution in a more efficient way than the centralized solution.
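The min-sum message passing mentioned above can be illustrated on the simplest non-trivial factor graph, a chain of discrete variables (think per-transmitter precoder indices), where min-sum reduces to a forward-backward dynamic program and is exact; the costs below are random placeholders, not the thesis's system model:

```python
import numpy as np

# Min-sum on a chain: minimize sum of node costs plus pairwise coupling costs
# over discrete assignments. On a tree-structured graph, min-sum is exact.
rng = np.random.default_rng(4)
n_vars, n_states = 4, 3
node_cost = rng.random((n_vars, n_states))
pair_cost = rng.random((n_vars - 1, n_states, n_states))

# Forward pass: m holds the best prefix cost ending in each state of variable i.
m = node_cost[0].copy()
argmin_back = []
for i in range(1, n_vars):
    total = m[:, None] + pair_cost[i - 1]      # (prev_state, cur_state)
    argmin_back.append(total.argmin(axis=0))   # best predecessor per state
    m = total.min(axis=0) + node_cost[i]

# Backward pass: recover the jointly optimal assignment.
assignment = [int(m.argmin())]
for back in reversed(argmin_back):
    assignment.append(int(back[assignment[-1]]))
assignment.reverse()
print(assignment, round(float(m.min()), 3))
```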
Ibrahim, Sarmad Khaleel. "DISTRIBUTION SYSTEM OPTIMIZATION WITH INTEGRATED DISTRIBUTED GENERATION". UKnowledge, 2018. https://uknowledge.uky.edu/ece_etds/116.
Testo completoHa, Viet Hai. "Optimization of memory management on distributed machine". Phd thesis, Institut National des Télécommunications, 2012. http://tel.archives-ouvertes.fr/tel-00814630.
Testo completoHa, Viet Hai. "Optimization of memory management on distributed machine". Electronic Thesis or Diss., Evry, Institut national des télécommunications, 2012. http://www.theses.fr/2012TELE0042.
In order to further explore the capabilities of parallel computing architectures such as grids, clusters, multi-processors and, more recently, clouds and multi-cores, an easy-to-use parallel language is an important and challenging issue. From the programmer's point of view, OpenMP is very easy to use, with its ability to support incremental parallelization and its features for dynamically setting the number of threads and scheduling strategies. However, as it was initially designed for shared memory systems, OpenMP is usually limited on distributed memory systems to intra-node computations. Many attempts have been made to port OpenMP to distributed systems. The most prominent approaches mainly focus on exploiting the capabilities of a special network architecture and therefore cannot provide an open solution. Others are based on an already available software solution such as DSM, MPI or Global Array and, as a consequence, they meet difficulties in becoming a fully compliant and high-performance implementation of OpenMP. As yet another attempt to build an OpenMP-compliant implementation for distributed memory systems, CAPE, which stands for Checkpointing Aided Parallel Execution, has been developed with the following idea: when reaching a parallel section, the master thread is dumped and its image is sent to the slaves; then, each slave executes a different thread; at the end of the parallel section, slave threads extract and return to the master thread the list of all modifications that have been performed locally; the master includes these modifications and resumes its execution. In order to prove the feasibility of this paradigm, the first version of CAPE was implemented using complete checkpoints. However, preliminary analysis showed that the large amount of data transferred between threads and the extraction of the list of modifications from complete checkpoints lead to weak performance. Furthermore, this version was restricted to parallel problems satisfying Bernstein's conditions, i.e. it did not address the requirements of shared data. This thesis presents the approaches we proposed to improve CAPE's performance and to overcome the restrictions on shared data. First, we developed DICKPT, which stands for Discontinuous Incremental Checkpointing, an incremental checkpointing technique that supports the ability to save incremental checkpoints discontinuously during the execution of a process. Based on DICKPT, the execution speed of the new version of CAPE was significantly increased. For example, the time to compute a large matrix-matrix product on a desktop cluster became very similar to the execution time of the same optimized MPI program. Moreover, the speedup associated with this new version for various numbers of threads is quite linear for different problem sizes. On the side of shared data, we proposed UHLRC, which stands for Updated Home-based Lazy Release Consistency, a modified version of the Home-based Lazy Release Consistency (HLRC) memory model, to make it more appropriate to the characteristics of CAPE. Prototypes and algorithms to implement the synchronization and OpenMP data-sharing clauses and directives are also specified. These two works ensure the ability of CAPE to respect shared-data behavior.
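A toy model of the parallel-section protocol summarized above (checkpoint the master, run each iteration on a copy, send back only the modifications, merge them at the master); real CAPE operates on process images and remote slaves, whereas this sketch only mimics the dataflow:

```python
from copy import deepcopy

# Checkpoint the master's state, run each work item on a copy (standing in for
# a remote slave), collect only the modifications, and merge them at the master.
def run_parallel_section(master_state, work_items, worker_fn):
    checkpoint = deepcopy(master_state)
    diffs = []
    for item in work_items:                 # each item stands for one slave
        local = deepcopy(checkpoint)
        worker_fn(local, item)
        diff = {k: v for k, v in local.items() if checkpoint.get(k) != v}
        diffs.append(diff)                  # slave returns only its modifications
    for diff in diffs:                      # master merges all modifications
        master_state.update(diff)
    return master_state

def body(state, i):
    state[f"result_{i}"] = i * i            # disjoint writes (Bernstein conditions)

state = {"shared": 0}
print(run_parallel_section(state, range(4), body))
```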
Thompson, Simon Giles. "Distributed boosting algorithms". Thesis, University of Portsmouth, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.285529.