Theses on the topic "Distributed optimization and learning"
Cite a source in APA, MLA, Chicago, Harvard, and many other citation styles
Browse the top 50 dissertations / theses for research on the topic "Distributed optimization and learning".
Next to every source in the list of references there is an "Add to bibliography" button. Press it, and we will automatically generate the bibliographic citation of the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.
You can also download the full text of the scientific publication in .pdf format and read its abstract online, when one is included in the record's metadata.
Browse theses from many areas of scholarship and compile an accurate bibliography.
Funkquist, Mikaela, and Minghua Lu. "Distributed Optimization Through Deep Reinforcement Learning". Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-293878.
Reinforcement learning methods allow self-learning units to play video and board games autonomously. This project aims to study the effectiveness of the reinforcement learning methods Q-learning and deep Q-learning on dynamic problems. The goal is to train robots to move through a warehouse in the best possible way without colliding. A virtual environment was created in which the algorithms were tested by simulating moving agents. The effectiveness of the algorithms was evaluated by how quickly the agents learned to perform predetermined tasks. The results show that Q-learning works well for simple problems with few agents, where systems with two active agents were solved quickly. Deep Q-learning works better for more complex systems with more agents, although cases of suboptimal movement occurred. Both algorithms showed good potential within their respective domains, but improvements are needed before they can be used in practice.
Bachelor's thesis project in electrical engineering, 2020, KTH, Stockholm
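As a rough illustration of the tabular method that the abstract above contrasts with deep Q-learning, here is a minimal Q-learning sketch in Python; the action set, rates, reward signal, and state encoding are placeholders, not the thesis's actual warehouse environment:

```python
import random
from collections import defaultdict

# Tabular Q-learning: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
ACTIONS = ["up", "down", "left", "right", "wait"]
Q = defaultdict(float)  # maps (state, action) -> estimated value

def choose_action(state):
    # Epsilon-greedy exploration over the discrete action set.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    # One temporal-difference backup toward the greedy target.
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

update(("cell_0_0",), "right", -1.0, ("cell_0_1",))
print(choose_action(("cell_0_0",)))
```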
Konečný, Jakub. "Stochastic, distributed and federated optimization for machine learning". Thesis, University of Edinburgh, 2017. http://hdl.handle.net/1842/31478.
Armond, Kenneth C. Jr. "Distributed Support Vector Machine Learning". ScholarWorks@UNO, 2008. http://scholarworks.uno.edu/td/711.
Patvarczki, Jozsef. "Layout Optimization for Distributed Relational Databases Using Machine Learning". Digital WPI, 2012. https://digitalcommons.wpi.edu/etd-dissertations/291.
Ouyang, Hua. "Optimal stochastic and distributed algorithms for machine learning". Diss., Georgia Institute of Technology, 2013. http://hdl.handle.net/1853/49091.
El, Gamal Mostafa. "Distributed Statistical Learning under Communication Constraints". Digital WPI, 2017. https://digitalcommons.wpi.edu/etd-dissertations/314.
Dai, Wei. "Learning with Staleness". Research Showcase @ CMU, 2018. http://repository.cmu.edu/dissertations/1209.
Lu, Yumao. "Kernel optimization and distributed learning algorithms for support vector machines". Diss., Restricted to subscribing institutions, 2005. http://uclibs.org/PID/11984.
Dinh, The Canh. "Distributed Algorithms for Fast and Personalized Federated Learning". Thesis, The University of Sydney, 2023. https://hdl.handle.net/2123/30019.
Reddi, Sashank Jakkam. "New Optimization Methods for Modern Machine Learning". Research Showcase @ CMU, 2017. http://repository.cmu.edu/dissertations/1116.
Shi, Shaohuai. "Communication optimizations for distributed deep learning". HKBU Institutional Repository, 2020. https://repository.hkbu.edu.hk/etd_oa/813.
Wang, Sinong. "Coded Computation for Speeding up Distributed Machine Learning". The Ohio State University, 2019. http://rave.ohiolink.edu/etdc/view?acc_num=osu1555336880521062.
Sitta, Alessandro. "Privacy-Preserving Distributed Optimization via Obfuscated Gradient Tracking". Master's thesis, Alma Mater Studiorum - Università di Bologna, 2021.
Costantini, Marina. "Optimization methods over networks for popular content delivery and distributed machine learning". Electronic Thesis or Diss., Sorbonne université, 2023. http://www.theses.fr/2023SORUS182.
The ever-increasing number of users and applications on the Internet sets a number of challenges for network operators and engineers in order to keep up with the high traffic demands. In this scenario, making efficient use of the available resources has become imperative. In this thesis, we develop optimization methods to improve the utilization of the network in two specific applications enabled by the Internet: network edge caching and distributed model training. Network edge caching is a recent technique that proposes to store at the network edge copies of contents that have a high probability of being requested, in order to reduce latency and improve the overall user experience. Traditionally, when a user requests a web page or application, the request is sent to a remote server that stores the data. The server retrieves the requested data and sends it back to the user. This process can be slow and can lead to latency and congestion issues, especially when multiple users access the same data simultaneously. To address this issue, network operators can deploy edge caching servers close to end users. These servers are then filled during off-peak hours with contents that have a high probability of being requested, so that during times of high traffic the user can still retrieve them quickly and at high quality. On the other hand, distributed model training, or more generally distributed optimization, is a method for training large-scale machine learning models using multiple agents that work together to find the optimal parameters of the model. In such settings, the agents interleave local computation steps with communication steps to train a model that takes into account the data of all agents. To achieve this, agents may exchange optimization values (parameters, gradients) but not the data. Here we consider two such distributed training settings: the decentralized and the federated. In the decentralized setting, agents are interconnected in a network and communicate their optimization values only to their direct neighbors. In the federated setting, the agents communicate with a central server that regularly averages the most recent values of (usually a subset of) the agents and broadcasts the result to all of them. Naturally, the success of such techniques relies on frequent communication among the agents or with the server. Therefore, there is great interest in designing distributed optimization algorithms that achieve state-of-the-art performance at lower communication costs. In this thesis, we propose algorithms that improve the performance of existing methods for popular content delivery and distributed machine learning by making better use of the network resources. In Chapter 2, we propose an algorithm that exploits recommendation engines to jointly design the contents cached at the network edge and the recommendations shown to the user. This algorithm achieves a higher fraction of requests served by the cache than its competitors, and thus requires less communication with the remote server. In Chapter 3, we design an asynchronous algorithm for decentralized optimization that requires minimal coordination between the agents and thus tolerates connection interruptions and link failures.
We then show that, if the agents are allowed to increase the amount of data they transmit by a factor equal to their node degree, the convergence of this algorithm can be made much faster by letting the agents decide their communication scheme according to the gains provided by communicating with each of their neighbors. Finally, in Chapter 4 we propose an algorithm that exploits inter-agent communication within the classical federated learning setup (where, in principle, agents communicate only with the server), and which can achieve the same convergence speed as the classical setup with fewer communication rounds with the server, these rounds being the main bottleneck in this setting.
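The decentralized setting sketched above, in which each agent mixes its neighbors' parameters and then takes a local gradient step, can be illustrated in a few lines. The ring topology, quadratic local losses, and step size below are assumptions made for the example, not the algorithms proposed in the thesis:

```python
import numpy as np

# One loop of decentralized gradient descent: each agent averages its
# neighbors' parameters through a mixing matrix W, then takes a local
# gradient step on its own data. All problem data here is illustrative.
n_agents, dim, lr = 4, 3, 0.1
rng = np.random.default_rng(0)
targets = rng.normal(size=(n_agents, dim))   # each agent's local optimum
x = np.zeros((n_agents, dim))                # one parameter copy per agent

# Ring topology with a doubly stochastic mixing matrix.
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

for _ in range(200):
    grads = x - targets      # gradient of 0.5 * ||x_i - target_i||^2
    x = W @ x - lr * grads   # neighbor communication, then local step

# Agents reach consensus near the minimizer of the sum of local losses.
print(np.round(x.mean(axis=0), 2), np.round(targets.mean(axis=0), 2))
```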
Singh, Manish Kumar. "Optimization, Learning, and Control for Energy Networks". Diss., Virginia Tech, 2021. http://hdl.handle.net/10919/104064.
Doctor of Philosophy
Massive infrastructure networks play a pivotal role in everyday human lives. A minor service disruption occurring locally in electric power, natural gas, or water networks is considered a significant loss. Uncertain demands, equipment failures, regulatory stipulations, and most importantly complicated physical laws render managing these networks an arduous task. Oftentimes, the first-principles mathematical models for these networks are well known. Nevertheless, the computations needed in real time to make spontaneous decisions frequently surpass the available resources. Explicitly identifying such problems, this dissertation extends the state of the art on three fronts: First, efficient models enabling the operators to tractably solve some routinely encountered problems are developed using fundamental and diverse mathematical tools; Second, quickly trainable machine learning based solutions are developed that enable spontaneous decision making while learning offline from sophisticated mathematical programs; and Third, control mechanisms are designed that ensure a safe and autonomous network operation without human intervention. These novel solutions are bolstered by mathematical guarantees and extensive simulations on benchmark power, water, and natural gas networks.
Khirirat, Sarit. "First-Order Algorithms for Communication Efficient Distributed Learning". Licentiate thesis, KTH, Reglerteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-263738.
The development of communication technology and data storage has made large collections of data more accessible than ever. This shift leads to numerical optimization problems with data sets of large scale in both volume and dimension. In response to this trend, the popularity of high-performance computing architectures has increased more than ever before. Scalable optimization tools can achieve high efficiency by distributing the computational load across multiple machines. In practice, however, they come at a price in the form of significant communication overhead, which shifts the performance bottleneck from computation to communication. When real-world optimization problems with a large number of parameters are solved, communication between machines can dominate almost 80% of the training time. To reduce the communication load, several compression techniques have been proposed in the literature. Although optimization algorithms using these compressors are reported to be as competitive as their full-precision counterparts, they suffer a loss of accuracy. To address this, we develop in this thesis theory and techniques for designing communication-efficient optimization algorithms that use only low-precision information. In the first part, we analyze the convergence of optimization algorithms with direct compression. First, we give an overview of compression techniques covering many compressors of practical interest. Then, we present a unified analysis framework for optimization algorithms with compressors, which may be either deterministic or randomized. In particular, we show how parameter choices in compressed optimization algorithms are determined by the compressor's parameters so as to guarantee convergence. Our convergence guarantees show the dependence on the compressor's accuracy and on delay effects due to the asynchrony of the algorithms. This allows us to characterize the trade-off between iteration and communication complexity when compression is used. In the second part, we study the high performance of error-compensation methods for compressed optimization algorithms. Although convergence guarantees with error compensation have been established, there is very limited theoretical support explaining their competitive performance. We therefore develop theoretical explanations showing that the use of error compensation guarantees arbitrarily high solution accuracy from compressed information. In particular, error compensation helps remove accumulated compression errors and thereby improves solution accuracy, especially for ill-conditioned quadratic optimization problems. We also present strong convergence analysis for error compensation applied to stochastic gradient methods over a communication network with multiple machines. In contrast to direct compression, the error-compensated algorithms achieve a significant reduction of the compression error. Simulations of the algorithms in this thesis on real-world problems with benchmark data sets validate our theoretical results.
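A minimal sketch of the error-compensation idea the abstract above describes (accumulate what the compressor discards and re-inject it later), assuming a top-k sparsifier and a toy quadratic objective; this illustrates the generic error-feedback pattern, not the thesis's exact algorithms:

```python
import numpy as np

def top_k(v, k):
    # Keep the k largest-magnitude entries, zero out the rest (a common compressor).
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def error_feedback_step(x, grad, memory, lr=0.1, k=2):
    # Compress the error-corrected step; keep what the compressor dropped
    # in `memory` so accumulated compression error is re-injected later.
    corrected = lr * grad + memory
    update = top_k(corrected, k)
    memory = corrected - update   # residual carried to the next iteration
    return x - update, memory

x, memory = np.ones(10), np.zeros(10)
for _ in range(100):
    grad = x  # gradient of 0.5 * ||x||^2
    x, memory = error_feedback_step(x, grad, memory)
print(np.linalg.norm(x))  # shrinks toward 0 despite aggressive compression
```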
Jeon, Sung-eok. "Near-Optimality of Distributed Network Management with a Machine Learning Approach". Diss., Georgia Institute of Technology, 2007. http://hdl.handle.net/1853/16136.
Esteves, José Jurandir Alves. "Optimization of network slice placement in distributed large-scale infrastructures : from heuristics to controlled deep reinforcement learning". Electronic Thesis or Diss., Sorbonne université, 2021. http://www.theses.fr/2021SORUS325.
This PhD thesis investigates how to optimize Network Slice Placement in distributed large-scale infrastructures, focusing on online heuristic and Deep Reinforcement Learning (DRL) based approaches. First, we rely on Integer Linear Programming (ILP) to propose a data model for enabling on-Edge and on-Network Slice Placement. Contrary to most studies related to placement in the NFV context, the proposed ILP model considers complex Network Slice topologies and pays special attention to the geographic location of Network Slice Users and its impact on the End-to-End (E2E) latency. Extensive numerical experiments show the relevance of taking the user location constraints into account. Then, we rely on an approach called the "Power of Two Choices" (P2C) to propose an online heuristic algorithm for the problem which is adapted to support placement on large-scale distributed infrastructures while integrating Edge-specific constraints. The evaluation results show the good performance of the heuristic, which solves the problem in a few seconds under a large-scale scenario. The heuristic also improves the acceptance ratio of Network Slice Placement Requests when compared against a deterministic online ILP-based solution. Finally, we investigate the use of ML methods, more specifically DRL, for increasing the scalability and automation of Network Slice Placement, considering a multi-objective optimization approach to the problem. We first propose a DRL algorithm for Network Slice Placement which relies on the Advantage Actor Critic algorithm for fast learning and on Graph Convolutional Networks for automated feature extraction. Then, we propose an approach we call Heuristically Assisted Deep Reinforcement Learning (HA-DRL), which uses heuristics to control the learning and execution of the DRL agent. We evaluate this solution through simulations under stationary, cycle-stationary and non-stationary network load conditions. The evaluation results show that heuristic control is an efficient way of speeding up the learning process of DRL, achieving a substantial gain in resource utilization, reducing performance degradation, and being more reliable under unpredictable changes in network load than non-controlled DRL algorithms.
Salim, Adil. "Random monotone operators and application to stochastic optimization". Thesis, Université Paris-Saclay (ComUE), 2018. http://www.theses.fr/2018SACLT021/document.
This thesis mainly studies optimization algorithms. Programming problems arising in signal processing and machine learning are composite in many cases, i.e. they exhibit constraints and non-smooth regularization terms. Proximal methods are known to be efficient for solving such problems. However, in modern applications of data science, the functions to be minimized are often represented as statistical expectations whose evaluation is intractable. This covers the case of online learning, big data problems and distributed computation problems. To solve these problems, we study in this thesis stochastic proximal methods, which generalize proximal algorithms to the case of cost functions written as expectations. Stochastic proximal methods are first studied with a constant step size, using stochastic approximation techniques. More precisely, the Ordinary Differential Equation method is adapted to the case of differential inclusions. In order to study the asymptotic behavior of the algorithms, the stability of the sequences of iterates (seen as Markov chains) is studied. Then, generalizations of the stochastic proximal gradient algorithm with decreasing step sizes are designed to solve composite problems. All the quantities used to define the optimization problem are written as expectations. This includes a primal-dual algorithm for regularized and linearly constrained problems and an algorithm for optimization over large graphs.
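As an illustration of one stochastic proximal gradient step on a composite objective of the kind described above, here is a sketch for F(x) = E[f(x, ξ)] + λ‖x‖₁ with soft-thresholding as the proximal operator; the problem data and step-size schedule are assumptions made for the example:

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

rng = np.random.default_rng(1)
dim, lam = 5, 0.1
x = np.zeros(dim)
x_star = np.array([1.0, 0.0, -2.0, 0.0, 0.5])

for k in range(1, 2001):
    step = 1.0 / (10 + k)                 # decreasing step size
    noise = rng.normal(scale=0.1, size=dim)
    stoch_grad = (x - x_star) + noise     # unbiased sample of the smooth gradient
    x = soft_threshold(x - step * stoch_grad, step * lam)

print(np.round(x, 2))  # sparse estimate near the soft-thresholded optimum
```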
Mhanna, Elissa. "Beyond gradients : zero-order approaches to optimization and learning in multi-agent environments". Electronic Thesis or Diss., université Paris-Saclay, 2024. http://www.theses.fr/2024UPASG123.
The rise of connected devices and the data they produce has driven the development of large-scale applications. These devices form distributed networks with decentralized data processing. As the number of devices grows, challenges like communication overhead and computational costs increase, requiring optimization methods that work under strict resource constraints, especially where derivatives are unavailable or costly. This thesis focuses on zero-order (ZO) optimization methods, which are ideal for scenarios where explicit function derivatives are inaccessible. ZO methods estimate gradients based only on function evaluations, making them highly suitable for distributed and federated learning environments where devices collaborate to solve global optimization tasks with limited information and noisy data. In the first chapter, we address distributed ZO optimization for strongly convex functions across multiple agents in a network. We propose a distributed zero-order projected gradient descent algorithm that uses one-point gradient estimates, where the function is queried only once per stochastic realization and noisy function evaluations estimate the gradient. The chapter establishes the almost sure convergence of the algorithm and derives theoretical upper bounds on the convergence rate. With constant step sizes, the algorithm achieves a linear convergence rate. This is the first time this rate has been established for one-point (and even two-point) gradient estimates. We also analyze the effects of diminishing step sizes, establishing a convergence rate that matches the lower bounds of centralized ZO methods. The second chapter addresses the challenges of federated learning (FL), which is often hindered by the communication bottleneck: the high cost of transmitting large amounts of data over limited-bandwidth networks. To address this, we propose a novel zero-order federated learning (ZOFL) algorithm that reduces communication overhead using one-point gradient estimates. Devices transmit scalar values instead of large gradient vectors, lowering the data sent over the network. Moreover, the algorithm incorporates wireless communication disturbances directly into the optimization process, eliminating the need for explicit knowledge of the channel state. This approach is the first to integrate wireless channel properties into a learning algorithm, making it resilient to real-world communication issues. We prove the almost sure convergence of this method in nonconvex optimization settings, establish its convergence rate, and validate its effectiveness through experiments. In the final chapter, we extend the ZOFL algorithm to include two-point gradient estimates. Unlike one-point estimates, which rely on a single function evaluation, two-point estimates query the function twice, providing a more accurate gradient approximation and enhancing the convergence rate. This method maintains the communication efficiency of one-point estimates, where only scalar values are transmitted, and relaxes the assumption that the objective function must be bounded. The chapter demonstrates that the proposed two-point ZO method achieves linear convergence rates for strongly convex and smooth objective functions. For nonconvex problems, the method shows improved convergence speed, particularly when the objective function is smooth and K-gradient-dominated, where a linear rate is also achieved.
We also analyze the impact of constant versus diminishing step sizes and provide numerical results showing the method's communication efficiency compared to other federated learning techniques.
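The one-point and two-point gradient estimates discussed above can be written down compactly. The sketch below uses the common (d/δ)-scaled sphere-sampling form on a noisy quadratic oracle; this is an illustrative convention, not necessarily the thesis's exact estimator:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4

def f(x):
    # Noisy zeroth-order oracle for 0.5 * ||x||^2 (true gradient is x itself).
    return 0.5 * x @ x + rng.normal(scale=0.01)

def sphere_direction():
    u = rng.normal(size=d)
    return u / np.linalg.norm(u)

def one_point_estimate(x, delta=0.1):
    # Single query per realization: (d / delta) * f(x + delta u) * u.
    u = sphere_direction()
    return (d / delta) * f(x + delta * u) * u

def two_point_estimate(x, delta=0.1):
    # Two queries: (d / (2 delta)) * (f(x + delta u) - f(x - delta u)) * u.
    u = sphere_direction()
    return (d / (2 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u

x = np.ones(d)  # true gradient here is (1, 1, 1, 1)
g1 = np.mean([one_point_estimate(x) for _ in range(20000)], axis=0)
g2 = np.mean([two_point_estimate(x) for _ in range(2000)], axis=0)
print(np.round(g1, 1), np.round(g2, 2))  # one-point is far noisier per sample
```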
Zecchin, Matteo. "Robust Machine Learning Approaches to Wireless Communication Networks". Electronic Thesis or Diss., Sorbonne université, 2022. https://accesdistant.sorbonne-universite.fr/login?url=https://theses-intra.sorbonne-universite.fr/2022SORUS397.pdf.
Artificial intelligence is widely viewed as a key enabler of sixth-generation wireless systems. In this thesis we target fundamental problems arising from the interaction between these two technologies, with the end goal of paving the way towards the adoption of reliable AI in future wireless networks. We develop distributed training algorithms that allow collaborative learning at the edge of wireless networks despite communication bottlenecks, unreliable workers, and data heterogeneity. We then take a critical look at the application of the standard frequentist learning paradigm to wireless communication problems and propose an extension of generalized Bayesian learning that concurrently counteracts three prominent challenges arising in this application domain: data scarcity, the presence of outliers, and model misspecification.
Alsayasneh, Maha. "On the identification of performance bottlenecks in multi-tier distributed systems". Thesis, Université Grenoble Alpes, 2020. http://www.theses.fr/2020GRALM009.
Today's distributed systems are made of various software components with complex interactions and a large number of configuration settings. Pinpointing the performance bottlenecks is generally a non-trivial task, which requires human expertise as well as trial and error. Moreover, the same software stack may exhibit very different bottlenecks depending on factors such as the underlying hardware, the application logic, the configuration settings, and the operating conditions. This work aims to (i) investigate whether it is possible to identify a set of key metrics that can be used as reliable and general indicators of performance bottlenecks, (ii) identify the characteristics of these indicators, and (iii) build a tool that can automatically and accurately determine if the system reaches its maximum capacity in terms of throughput. In this thesis, we present three contributions. First, we present an analytical study of a large number of realistic configuration setups of multi-tier distributed applications, more specifically focusing on data processing pipelines. By analyzing a large number of metrics at the hardware and at the software level, we identify the ones that exhibit changes in their behavior at the point where the system reaches its maximum capacity. We consider these metrics as reliable indicators of performance bottlenecks. Second, we leverage machine learning techniques to build a tool that can automatically identify performance bottlenecks in the data processing pipeline. We consider different machine learning methods, different selections of metrics, and different cases of generalization to new setups. Third, to assess the validity of the results obtained for the data processing pipeline with both the analytical and the learning-based approaches, the two approaches are applied to the case of a Web stack. From our research, we draw several conclusions. First, it is possible to identify key metrics that act as reliable indicators of performance bottlenecks for a multi-tier distributed system. More precisely, the point at which the server reaches its maximum capacity can be identified based on these reliable metrics. Contrary to the approach adopted by many existing works, our results show that a combination of metrics of different types is required to ensure reliable identification of performance bottlenecks in a large number of setups. We also show that approaches based on machine learning techniques to analyze metrics can identify performance bottlenecks in a multi-tier distributed system. The comparison of different models shows that the ones based on the reliable metrics identified by our analytical study are the ones that achieve the best accuracy. Furthermore, our extensive analysis shows the robustness of the obtained models, which can generalize to new setups, to new numbers of clients, and to both new setups and new numbers of clients. Extending the analysis to a Web stack confirms the main findings obtained through the study of the data processing pipeline. These results pave the way towards a general and accurate tool to identify performance bottlenecks in distributed systems.
Samarakoon, S. (Sumudu). "Learning-based methods for resource allocation and interference management in energy-efficient small cell networks". Doctoral thesis, Oulun yliopisto, 2017. http://urn.fi/urn:isbn:9789526216874.
Resource allocation and interference management in wireless small cell networks have been important research topics in recent years. A great deal of research exists, but new fifth-generation (5G) networks demand high capacity, reliability, and energy efficiency. Methods must therefore be developed that focus on ultra-dense, self-organizing small cell networks (SCNs). The main goal of this thesis is to present a set of distributed methods for shared resource allocation and interference management in small cell networks under different network architectures. The thesis proposes and studies distributed methods for managing wireless small cell networks in three settings: a backhaul-aware interference management method for wireless small cell networks, a dynamic clustering based model for maximizing the energy efficiency of dense wireless small cell networks, and a joint power control and user allocation method for optimizing the energy efficiency of ultra-dense small cell networks. Optimizing ultra-dense small cell networks is highly challenging due to interference and the strong coupling between queues and channel states. Moreover, since the clusters/base stations do not communicate with one another, distributed learning algorithms are needed so that optimal transmission strategies can be selected autonomously using only local information. To this end, several distributed algorithms are developed that exploit machine learning, Lyapunov optimization, and mean-field theory. All of the above methods are validated with extensive simulations, which verify their performance against conventional models. Conventional models cannot account for the constraints imposed by network scale, queue and channel state dynamics, heterogeneous backhauls and limited capacity, or the lack of coordination between network elements. The proposed methods yield substantial improvements in energy savings, throughput, and delay reduction compared with the conventional models considered in the literature.
Rodio, Angelo. "Hétérogénéité des clients dans les systèmes d'apprentissage fédérés". Electronic Thesis or Diss., Université Côte d'Azur, 2024. http://www.theses.fr/2024COAZ4029.
Federated Learning (FL) stands as a collaborative framework where clients (mobile devices) train a machine learning model under a central server's orchestration, preserving data decentralization. Client participation heterogeneity stems from varied device capabilities in hardware specifications (CPU power, memory capacity), network connectivity types (3G, 4G, 5G, WiFi), and power availability (battery levels), and it is generally beyond server control. This thesis focuses on providing a theoretical understanding of federated learning systems under heterogeneous client participation, specifically analyzing the impact of this heterogeneity on the convergence of federated learning algorithms, and proposing practical solutions for a more efficient use of the system and its resources. The first part of the thesis focuses on tackling challenges associated with temporal and spatial correlation in client participation, the former due to the correlated client participation dynamics over time, the latter due to the clients' correlated geographic distributions. In this chapter, we first observe that heterogeneous client participation can potentially bias the learning process. We formalize the bias-variance tradeoff induced by heterogeneous client participation by decomposing the optimization error into variance (related to convergence speed) and bias (indicative of model quality). By minimizing these two errors, we demonstrate that assigning larger aggregation weights to frequently participating clients can accelerate convergence. Moreover, we study the impact of temporal and spatial correlation in client participation through finite-state Markov chain modeling. We show that correlation slows down convergence within a logarithmic factor related to the Markov chain's geometric mixing time. Minimizing the bias-variance tradeoff, we also find that lower aggregation weights for highly correlated clients accelerate convergence. We finally propose an algorithm, Correlation-Aware Federated Learning (CA-Fed), to optimize the bias-variance tradeoff and thus achieve faster convergence. The second part of the thesis considers the more applied scenario of lossy communication channels. Network conditions, particularly packet losses, represent a main, uncontrollable source of heterogeneity in client participation. In this chapter, challenging the conventional mitigation strategies for packet losses such as retransmission or error correction, we show that federated learning algorithms can still learn over asymmetric, lossy channels. Our proposed solution modifies traditional federated learning approaches by transmitting model updates in place of models and correcting the averaging step to account for the heterogeneity of the communication channels. Experimental results confirm that our algorithm, under lossy channels, matches the performance of ideal, lossless conditions within a limited number of communication rounds. The third part investigates leveraging variance reduction methods, specifically stale updates, to compensate for the heterogeneity in client participation. Recent research considered similar strategies to mitigate the effects of partial client participation in federated learning. These methods involve retaining the last computed, potentially stale, update for each client to replace unavailable current updates for non-participating clients. However, existing analyses rely on the assumption of uniform client participation, which is restrictive in real-world scenarios.
By broadening the analysis to heterogeneous client participation, we discover that convergence is significantly influenced by the least participating clients. This suggests that existing algorithms are not optimally designed for such environments, and we propose a more robust approach, FedStale, to exploit stale model updates under heterogeneous client participation.
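A toy sketch of the stale-update idea discussed above: the server retains each client's last computed update and averages stale entries in place of missing ones. The participation probabilities, local losses, and uniform weighting are illustrative assumptions, not the exact FedStale rule:

```python
import numpy as np

rng = np.random.default_rng(3)
n_clients, dim, lr = 5, 3, 0.1
participation = np.array([0.9, 0.7, 0.5, 0.3, 0.1])  # heterogeneous availability
targets = rng.normal(size=(n_clients, dim))          # each client's local optimum
x = np.zeros(dim)
stale = np.zeros((n_clients, dim))                   # last update received per client

for _ in range(500):
    active = rng.random(n_clients) < participation
    for i in np.where(active)[0]:
        stale[i] = targets[i] - x        # fresh update: -grad of 0.5*||x - t_i||^2
    x = x + lr * stale.mean(axis=0)      # average fresh updates with stale ones

# The iterate approaches the average of the local optima despite rare clients.
print(np.round(x, 2), np.round(targets.mean(axis=0), 2))
```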
Bokiš, Daniel. "Optimalizační algoritmy v logistických kombinatorických úlohách". Master's thesis, Vysoké učení technické v Brně. Fakulta informačních technologií, 2015. http://www.nusl.cz/ntk/nusl-234978.
Martinez Medina, Lourdes. "Optimisation des requêtes distribuées par apprentissage". Thesis, Grenoble, 2014. http://www.theses.fr/2014GRENM015.
Distributed data systems are becoming increasingly complex. They interconnect devices (e.g. smartphones, tablets, etc.) that are heterogeneous, autonomous, either static or mobile, and subject to physical limitations. Such devices run applications (e.g. virtual games, social networks, etc.) for the online interaction of users producing and consuming data on demand or continuously. The characteristics of these systems add new dimensions to the query optimization problem, such as multiple optimization criteria, scarce information on data, and the lack of a global system view, among others. Traditional query optimization techniques target semi-autonomous (or non-autonomous) systems. They rely on information about data and make strong assumptions about the system behavior. Moreover, most of these techniques are centered on the optimization of execution time only. The difficulty of evaluating queries efficiently in today's applications motivates this work to revisit traditional query optimization techniques. This thesis faces these challenges by adapting the Case Based Reasoning (CBR) paradigm to query processing, providing a way to optimize queries when there is no prior knowledge of data. It focuses on optimizing queries using cases generated from the evaluation of similar past queries. A query case comprises: (i) the query, (ii) the query plan and (iii) the measures (computational resources consumed) of the query plan. The thesis also concerns the way the CBR process interacts with the query plan generation process. This process uses classical heuristics and makes decisions randomly (e.g. when there are no statistics for join ordering and selection of algorithms, routing protocols). It also (re)uses cases (existing query plans) for similar query parts, improving the query optimization and therefore evaluation efficiency. The propositions of this thesis have been validated within the CoBRa optimizer developed in the context of the UBIQUEST project.
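The case structure enumerated above (query, plan, measures) and a similarity-based retrieval step can be sketched as follows; the Jaccard similarity over query signatures is a hypothetical choice made for illustration, not necessarily the one used by the CoBRa optimizer:

```python
from dataclasses import dataclass, field

@dataclass
class QueryCase:
    # A case as described above: the query, the plan used, and measured costs.
    query_signature: frozenset          # e.g. relations and predicates involved
    plan: str                           # serialized query plan
    measures: dict = field(default_factory=dict)  # e.g. {"time_ms": ..., "io": ...}

def retrieve(case_base, new_signature):
    # Reuse the plan of the most similar past query (Jaccard similarity here).
    def sim(case):
        inter = len(case.query_signature & new_signature)
        union = len(case.query_signature | new_signature)
        return inter / union if union else 0.0
    return max(case_base, key=sim, default=None)

base = [QueryCase(frozenset({"R", "S", "sel_a"}), "hashjoin(R,S)", {"time_ms": 40})]
best = retrieve(base, frozenset({"R", "S", "sel_b"}))
print(best.plan if best else "no case")
```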
Friend, Daniel. "Cognitive Networks: Foundations to Applications". Diss., Virginia Tech, 2009. http://hdl.handle.net/10919/26449.
Ph. D.
Jankee, Christopher. "Optimisation par métaheuristique adaptative distribuée en environnement de calcul parallèle". Thesis, Littoral, 2018. http://www.theses.fr/2018DUNK0480/document.
To solve discrete optimization problems of the black box type, many stochastic algorithms such as evolutionary algorithms or metaheuristics exist, and they can prove particularly effective depending on the problem to be solved. Given the observed properties of a problem, choosing the most relevant algorithm is itself a difficult problem. In the original framework of parallel and distributed computing environments, we propose and analyze different adaptive strategies for selecting optimization algorithms. These selection strategies are based on automatic reinforcement learning methods from the field of artificial intelligence and on information sharing between computing nodes. We compare and analyze the selection strategies in different situations. Two types of synchronous distributed computing environment are discussed: the island model and the master-slave model. At each iteration, synchronously across the set of nodes, the adaptive selection strategy chooses an algorithm according to the state of the search for a solution. In the first part, two problems, OneMax and NK, one unimodal and the other multimodal, are used as benchmarks for this work. Then, to better understand and improve the design of adaptive selection strategies, we propose a model of the optimization problem and its local search operator. In this model, an important characteristic is the average gain of an operator as a function of the fitness of the candidate solution. The model is used in the synchronous framework of the master-slave model. A selection strategy is broken down into three main components: the aggregation of the exchanged rewards, the learning scheme, and the distribution of the algorithms over the computing nodes. In the final part, we study three scenarios and give keys to understanding when adaptive selection strategies are preferable to naïve strategies. In the framework of the master-slave model, we study the different ways of aggregating the rewards on the master node, the distribution of the optimization algorithms over the computing nodes, and the communication time. This thesis ends with perspectives in the field of distributed adaptive stochastic optimization.
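As a sketch of the learning-scheme component described above, here is an epsilon-greedy selector that aggregates rewards as running means and picks the next algorithm to run; the operator names and reward model are invented for illustration:

```python
import random

class EpsilonGreedySelector:
    # Chooses among candidate algorithms from aggregated rewards (running means).
    def __init__(self, algorithms, epsilon=0.1):
        self.algorithms = algorithms
        self.epsilon = epsilon
        self.counts = {a: 0 for a in algorithms}
        self.means = {a: 0.0 for a in algorithms}

    def select(self):
        if random.random() < self.epsilon:
            return random.choice(self.algorithms)
        return max(self.algorithms, key=lambda a: self.means[a])

    def feedback(self, algorithm, reward):
        # Aggregate rewards reported by the computing nodes (incremental mean).
        self.counts[algorithm] += 1
        self.means[algorithm] += (reward - self.means[algorithm]) / self.counts[algorithm]

selector = EpsilonGreedySelector(["one_flip", "three_flip", "crossover"])
for _ in range(100):
    algo = selector.select()
    reward = {"one_flip": 0.2, "three_flip": 0.5, "crossover": 0.4}[algo] + random.gauss(0, 0.05)
    selector.feedback(algo, reward)
print(selector.means)  # the best operator accumulates the highest mean reward
```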
Capps, Michael V. "Fidelity optimization in distributed virtual environments". Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2000. http://handle.dtic.mil/100.2/ADA381349.
Dissertation supervisor: Zyda, Michael. "June 2000." Includes bibliographical references (p. 199-206). Also available online.
Ray, William J. "Optimization of distributed, object-oriented architectures". Thesis, Monterey, Calif. : Springfield, Va. : Naval Postgraduate School ; Available from National Technical Information Service, 2001. http://handle.dtic.mil/100.2/ADA397283.
Dissertation supervisor: Berzins, Valdis. "September 2001." Includes bibliographical references (p. 299-302). Also available in print.
Johansson, Björn. "On Distributed Optimization in Networked Systems". Doctoral thesis, KTH, Reglerteknik, 2008. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-9801.
Liu, Ying. "Query optimization for distributed stream processing". [Bloomington, Ind.] : Indiana University, 2007. http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqdiss&rft_dat=xri:pqdiss:3274258.
Source: Dissertation Abstracts International, Volume: 68-07, Section: B, page: 4597. Adviser: Beth Plale. Title from dissertation home page (viewed Apr. 21, 2008).
Li, Lijun. "Optimization techniques for distributed Verilog simulation". Thesis, McGill University, 2008. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=21936.
Moore's Law states that processor power doubles approximately every 18 months. For semiconductor manufacturers, this amounts to a constant challenge of bringing ever larger and more complex ICs (Integrated Circuits) to market. It is well known that the bottleneck in circuit design lies in simulation. Single-processor simulators may not keep up with the growing demands for speed and memory. This thesis presents a Verilog simulation environment with several optimization techniques. Verilog is a widely used digital design language. A distributed Verilog simulation can be executed on a cluster of workstations using a message-passing library such as MPI (Message Passing Interface). We describe event reconstruction, a technique that reduces the overhead caused by event saving, and compare its memory consumption and execution time with the results obtained by dynamic checkpointing. As its name suggests, event reconstruction reconstructs input events and anti-events from the difference between adjacent states, and does not save input events in the event queue. We propose a design-oriented, redundant multi-way partitioning algorithm for Verilog based on module instances, in order to take advantage of the hierarchical design information contained in modules and their instances. A Verilog instance is represented by a vertex in a circuit hypergraph. This vertex can be split into several vertices when instance-based partitioning does not produce an adequate load; in that case the algorithm splits the largest instance and moves the port
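A loose sketch of the state-difference idea behind event reconstruction as described above: rather than saving input events for rollback, the simulator derives them from differences between adjacent saved states. The dictionary-based state model below is a simplification of what a Verilog simulator actually stores:

```python
# Derive the effect of the event applied between two saved states.
def diff_states(before, after):
    return {sig: after[sig] for sig in after if before.get(sig) != after[sig]}

# Walk saved states pairwise and rebuild one (anti-)event per transition,
# instead of keeping the input events themselves in the event queue.
def reconstruct_anti_events(state_history):
    events = []
    for before, after in zip(state_history, state_history[1:]):
        events.append(diff_states(before, after))
    return events

history = [{"a": 0, "b": 0}, {"a": 1, "b": 0}, {"a": 1, "b": 1}]
print(reconstruct_anti_events(history))  # [{'a': 1}, {'b': 1}]
```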
Xu, Qing. "Optimization techniques for distributed logic simulation". Thesis, McGill University, 2011. http://digitool.Library.McGill.CA:80/R/?func=dbin-jump-full&object_id=96665.
Gate-level simulation is a necessary step to verify the conformity of a circuit design before fabrication. It is a time-consuming process, particularly given the current size of circuits. As circuits keep growing in size and complexity, there is a real need for more efficient simulation techniques in order to keep circuit verification times reasonably short. One such technique is parallel or distributed simulation. When executed on a network of workstations, distributed simulation also proves to be a very cost-effective technique. This research focuses on optimizing Time Warp based gate-level logic simulation techniques. The techniques described in this thesis are oriented towards distributed platforms. The first major contribution of this thesis is the creation of an object-oriented distributed simulator, XTW. It uses an optimistic synchronization algorithm and incorporates a number of known optimization techniques targeting different aspects of distributed logic simulation. XEQ, an O(1) event-scheduling algorithm, was developed for use in XTW. XEQ lets us run gate-level simulations up to 9.4 times faster than the same simulator using a skip-list event queue (O(lg n)). The rb-message mechanism, which lowers the cost of rollbacks in Time Warp, was also developed for use in XTW. Our experiments revealed that the rb-message mechanism reduced the number of anti-messages sent during a Time Warp based logic simulation by 76% on average. Furthermore, based on the observations that (1) some circuits should not be simulated in parallel and (2) different circuits achieve their best parallel simulation performance with different numbers of compute nodes, an algorithm using the K-NN machine learning algorithm was designed to determine the most efficient combination of software and hardware for a given logic simulation. After extensive training, it was able to predict with 99% accuracy when to use a parallel or a sequential simulator. The predicted number of nodes to use on a parallel platform was shown to yield an average execution time within 12% of the shortest execution time, and the configuration yielding the minimal execution time was selected in 61% of the cases. The final contribution of this thesis is the ability to link with commercial single-processor simulators through the Verilog PLI.
Mota, Joao F. C. "Communication-Efficient Algorithms For Distributed Optimization". Research Showcase @ CMU, 2013. http://repository.cmu.edu/dissertations/283.
Sevinc, Ender. "Genetic Algorithms For Distributed Database Design And Distributed Database Query Optimization". PhD thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/3/12611194/index.pdf.
Testo completoTerelius, Håkan. "Distributed Multi-Agent Optimization via Dual Decomposition". Thesis, KTH, Reglerteknik, 2010. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-46501.
Testo completoPayberah, Amir H. "Distributed Optimization of P2P Media Delivery Overlays". Licentiate thesis, KTH, Programvaru- och datorsystem, SCS, 2011. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-33287.
Bruce, Craig Steven. "Performance optimization for distributed-shared-data systems". Thesis, National Library of Canada = Bibliothèque nationale du Canada, 1998. http://www.collectionscanada.ca/obj/s4/f2/dsk2/ftp02/NQ32819.pdf.
Testo completoMagnússon, Sindri. "Distributed Optimization with Nonconvexities and Limited Communication". Licentiate thesis, KTH, Reglerteknik, 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-181111.
Das, Sanghamitra. "Application semantics based optimization of distributed algorithm". Diss., Kansas State University, 2012. http://hdl.handle.net/2097/15056.
Department of Computing and Information Sciences
Gurdip Singh
To increase their applicability, distributed algorithms are typically written to work with any application on any network. This flexibility comes at the cost of performance, since these 'general purpose' algorithms are written with the worst-case scenario in mind. A distributed algorithm written for a specific application, or a class of applications, is fine-tuned to the properties of the application and can perform better than the general-purpose one. In this work, we propose two mechanisms by which we can optimize a general-purpose algorithm Alg based on the application App using it. In the first approach, we analyze the specification of App to identify patterns of communication in its communication topology. These properties are then used to customize the behavior of the underlying distributed algorithm Alg. To demonstrate this approach, we study applications specified as component-based systems, where application components communicate via events, together with distributed algorithms that enforce ordering requirements on these events. We show how our approach can be used to optimize event ordering algorithms based on communication patterns in the applications. In the second approach, rather than analyzing the application specification, we assume that the developer provides application properties I_App which serve as invariants for the optimization process. We assume that the algorithm is written and annotated in a format that is amenable to analysis. Our analysis algorithm then takes as input the application invariants and the annotated algorithm and looks for potential functions in the algorithm which are redundant in the context of the given application. In particular, we first look for function invocations in the algorithm whose post-conditions are already satisfied as a result of the application invariants. Each such invocation is considered a potentially redundant module. We further analyze the distributed algorithm to identify the impact of the removal of a specific invocation on the rest of the algorithm. We describe an implementation of this approach and demonstrate its applicability using a distributed termination detection algorithm.
Shah, Arpit. "DISTRIBUTED BIOGEOGRAPHY BASED OPTIMIZATION FOR MOBILE ROBOTS". Cleveland State University / OhioLINK, 2012. http://rave.ohiolink.edu/etdc/view?acc_num=csu1335969537.
Testo completoGuerreiro, Igor Moaco. "Distributed optimization techniques for 4G and beyond". reponame:Repositório Institucional da UFC, 2016. http://www.repositorio.ufc.br/handle/riufc/23343.
Testo completoSubmitted by Renato Vasconcelos (ppgeti@ufc.br) on 2017-06-14T15:00:33Z No. of bitstreams: 1 2016_tese_imguerreiro.pdf: 2003633 bytes, checksum: a1a392147f3f5f0c8946e15f33186715 (MD5)
Rejected by Marlene Sousa (mmarlene@ufc.br), reason: Renato, Favor enviar ao autor as seguintes alterações que ele deve providenciar. Prezado Igor: Existe uma orientação para que normalizemos as dissertações e teses da UFC, em suas paginas pré-textuais e lista de referencias, pelas regras da ABNT. Por esse motivo, sugerimos consultar o modelo de template, para ajudá-lo nesta tarefa, disponível em: http://www.biblioteca.ufc.br/educacao-de-usuarios/1234-templates Vamos agora as correções sempre de acordo com o template: 1. Na capa, a UFC solicitou que nos fizéssemos a opção de colocar a identificação da instituição e sua hierarquia em língua portuguesa. Sendo assim a ordem é: nome da Universidade, Centro, Departamento e nome do Programa em língua portuguesa. Na sua tese faltou o Centro. 2. Na capa e folha de rosto retire o nome do estado e deixe só a cidade da defesa e o ano 3. Na folha de rosto coloque o titulo de sua tese em ingles, 4.No sumário as seções terciárias (com 3 dígitos) ficam em negrito e itálico. As quaternárias (4 dígitos) ficam só em itálico sem negrito. Corrigir em todo o sumário. Atenciosamente, Marlene Rocha 3366-9620 on 2017-06-16T13:10:37Z (GMT)
Submitted by Renato Vasconcelos (ppgeti@ufc.br) on 2017-06-16T14:20:51Z No. of bitstreams: 1 2016_tese_imguerreiro.pdf: 2004320 bytes, checksum: b65ef686e66a34877a8b00018ddc6449 (MD5)
Approved for entry into archive by Marlene Sousa (mmarlene@ufc.br) on 2017-06-16T14:53:04Z (GMT) No. of bitstreams: 1 2016_tese_imguerreiro.pdf: 2004320 bytes, checksum: b65ef686e66a34877a8b00018ddc6449 (MD5)
Made available in DSpace on 2017-06-16T14:53:04Z (GMT). No. of bitstreams: 1 2016_tese_imguerreiro.pdf: 2004320 bytes, checksum: b65ef686e66a34877a8b00018ddc6449 (MD5) Previous issue date: 2016-06-30
In today’s and future wireless telecommunication systems, the use of multiple antenna elements on the transmitter side, and also on the receiver side, provides users a better experience in terms of data rate, coverage and energy efficiency. In the fourth generation of such systems, precoding emerged as a relevant problem to be optimally solved so that network capacity can be increased by exploiting the characteristics of the channel. In this case, transmitters are equipped with few antenna elements, say up to eight, which means there are a few tens of precoding matrices, assuming a discrete codebook, to be coordinated per transmitter. As the number of antenna elements increases at communication nodes, conditions to keep in providing good experience for users become challenging. That’s one of the challenges of the fifth generation. Every transmitter must deal with narrow beams when a massive number of antenna elements is adopted. The hard part, regarding the narrowness of beams, is to keep the spatial alignment between transmitter and receiver. Any misalignment makes the received signal quality drop significantly so that users no longer experience good coverage. In particular, to provide initial access to unsynchronized devices, transmitters need to find the best beams to send synchronization signals consuming as little power as possible without any knowledge on unsynchronized devices’ channel states. This thesis thus addresses both precoding and beam finding as parameter coordination problems. The goal is to provide methods to solve them in a distributed manner. For this purpose, two types of iterative algorithms are presented for both. The first and simplest method is the greedy solution in which each communication node in the network acts selfishly. The second method and the focus of this work is based on a message-passing algorithm, namely min-sum algorithm, in factor graphs. The precoding problem is modeled as a discrete optimization one whose discrete variables comprise precoding matrix indexes. As for beam finding, a beam sweep procedure is adopted and the total consumed power over the network is optimized. Numerical results show that the graph-based solution outperforms the baseline/greedy one in terms of the following performance metrics: a) system capacity for the precoding problem, and b) power consumption for the beam finding one. Although message-passing demands more signaling load and more iterations to converge compared to baseline method, it usually provides a near-optimal solution in a more efficient way than the centralized solution.
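The min-sum message passing mentioned above can be illustrated on the simplest non-trivial factor graph, a chain of discrete variables (think per-transmitter precoder indices), where min-sum reduces to a forward-backward dynamic program and is exact; the costs below are random placeholders, not the thesis's system model:

```python
import numpy as np

# Min-sum on a chain: minimize sum of node costs plus pairwise coupling costs
# over discrete assignments. On a tree-structured graph, min-sum is exact.
rng = np.random.default_rng(4)
n_vars, n_states = 4, 3
node_cost = rng.random((n_vars, n_states))
pair_cost = rng.random((n_vars - 1, n_states, n_states))

# Forward pass: m holds the best prefix cost ending in each state of variable i.
m = node_cost[0].copy()
argmin_back = []
for i in range(1, n_vars):
    total = m[:, None] + pair_cost[i - 1]      # (prev_state, cur_state)
    argmin_back.append(total.argmin(axis=0))   # best predecessor per state
    m = total.min(axis=0) + node_cost[i]

# Backward pass: recover the jointly optimal assignment.
assignment = [int(m.argmin())]
for back in reversed(argmin_back):
    assignment.append(int(back[assignment[-1]]))
assignment.reverse()
print(assignment, round(float(m.min()), 3))
```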
Ibrahim, Sarmad Khaleel. "DISTRIBUTION SYSTEM OPTIMIZATION WITH INTEGRATED DISTRIBUTED GENERATION". UKnowledge, 2018. https://uknowledge.uky.edu/ece_etds/116.
Testo completoHa, Viet Hai. "Optimization of memory management on distributed machine". Phd thesis, Institut National des Télécommunications, 2012. http://tel.archives-ouvertes.fr/tel-00814630.
Testo completoHa, Viet Hai. "Optimization of memory management on distributed machine". Electronic Thesis or Diss., Evry, Institut national des télécommunications, 2012. http://www.theses.fr/2012TELE0042.
In order to further explore the capabilities of parallel computing architectures such as grids, clusters, multi-processors and, more recently, clouds and multi-cores, an easy-to-use parallel language is an important and challenging issue. From the programmer's point of view, OpenMP is very easy to use, with its ability to support incremental parallelization and its features for dynamically setting the number of threads and scheduling strategies. However, as it was initially designed for shared memory systems, OpenMP is usually limited on distributed memory systems to intra-node computations. Many attempts have been made to port OpenMP to distributed systems. The most prominent approaches mainly focus on exploiting the capabilities of a special network architecture and therefore cannot provide an open solution. Others are based on an already available software solution such as DSM, MPI or Global Array and, as a consequence, they meet difficulties in becoming a fully compliant and high-performance implementation of OpenMP. As yet another attempt to build an OpenMP-compliant implementation for distributed memory systems, CAPE, which stands for Checkpointing Aided Parallel Execution, has been developed with the following idea: when reaching a parallel section, the master thread is dumped and its image is sent to the slaves; then, each slave executes a different thread; at the end of the parallel section, slave threads extract and return to the master thread the list of all modifications that have been performed locally; the master includes these modifications and resumes its execution. In order to prove the feasibility of this paradigm, the first version of CAPE was implemented using complete checkpoints. However, preliminary analysis showed that the large amount of data transferred between threads and the extraction of the list of modifications from complete checkpoints lead to weak performance. Furthermore, this version was restricted to parallel problems satisfying Bernstein's conditions, i.e. it did not address the requirements of shared data. This thesis presents the approaches we proposed to improve CAPE's performance and to overcome the restrictions on shared data. First, we developed DICKPT, which stands for Discontinuous Incremental Checkpointing, an incremental checkpointing technique that supports the ability to save incremental checkpoints discontinuously during the execution of a process. Based on DICKPT, the execution speed of the new version of CAPE was significantly increased. For example, the time to compute a large matrix-matrix product on a desktop cluster became very similar to the execution time of the same optimized MPI program. Moreover, the speedup associated with this new version for various numbers of threads is quite linear for different problem sizes. On the side of shared data, we proposed UHLRC, which stands for Updated Home-based Lazy Release Consistency, a modified version of the Home-based Lazy Release Consistency (HLRC) memory model, to make it more appropriate to the characteristics of CAPE. Prototypes and algorithms to implement the synchronization and OpenMP data-sharing clauses and directives are also specified. These two works ensure the ability of CAPE to respect shared-data behavior.
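A toy model of the parallel-section protocol summarized above (checkpoint the master, run each iteration on a copy, send back only the modifications, merge them at the master); real CAPE operates on process images and remote slaves, whereas this sketch only mimics the dataflow:

```python
from copy import deepcopy

# Checkpoint the master's state, run each work item on a copy (standing in for
# a remote slave), collect only the modifications, and merge them at the master.
def run_parallel_section(master_state, work_items, worker_fn):
    checkpoint = deepcopy(master_state)
    diffs = []
    for item in work_items:                 # each item stands for one slave
        local = deepcopy(checkpoint)
        worker_fn(local, item)
        diff = {k: v for k, v in local.items() if checkpoint.get(k) != v}
        diffs.append(diff)                  # slave returns only its modifications
    for diff in diffs:                      # master merges all modifications
        master_state.update(diff)
    return master_state

def body(state, i):
    state[f"result_{i}"] = i * i            # disjoint writes (Bernstein conditions)

state = {"shared": 0}
print(run_parallel_section(state, range(4), body))
```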
Thompson, Simon Giles. "Distributed boosting algorithms". Thesis, University of Portsmouth, 1999. http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.285529.