Academic literature on the topic 'Auto-Scaling policies'

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles

Consult the lists of relevant articles, books, theses, conference reports, and other scholarly sources on the topic 'Auto-Scaling policies.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online whenever these are available in the metadata.

Journal articles on the topic "Auto-Scaling policies":

1. Vemasani, Preetham, Sai Mahesh Vuppalapati, Suraj Modi, and Sivakumar Ponnusamy. "Achieving Agility through Auto-Scaling: Strategies for Dynamic Resource Allocation in Cloud Computing." International Journal for Research in Applied Science and Engineering Technology 12, no. 4 (April 30, 2024): 3169–77. http://dx.doi.org/10.22214/ijraset.2024.60566.

Abstract:
Auto-scaling is a crucial aspect of cloud computing, allowing for the efficient allocation of computational resources in response to immediate demand. This article delves into the concept of auto-scaling, its key components, and the strategies used to effectively manage resources in cloud environments. This study emphasizes the importance of auto-scaling in the cloud computing landscape by exploring its benefits, including cost efficiency, performance optimization, high availability, and scalability [1]. The article explores the various factors to consider when implementing scaling policies, such as selecting the right approach for scaling, whether predictive or reactive, and the availability of auto-scaling services provided by major cloud platforms like Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure [2, 3]. In addition, the paper addresses the challenges and complexities related to configuring auto-scaling systems, cost management, and latency in resource provisioning [4]. The article also showcases case studies that illustrate the successful implementation of auto-scaling in different industries, along with valuable insights and recommended approaches [5]. Lastly, the paper discusses future trends and research directions in auto-scaling techniques, integration with emerging technologies, and potential research areas [6].
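As a concrete illustration of the reactive approach surveyed in this abstract, a threshold-based scaling rule fits in a few lines. This is a hedged sketch of our own, not code from the paper; the function name, thresholds, and replica bounds are illustrative assumptions:

```python
def desired_replicas(current: int, cpu_util: float,
                     scale_out_at: float = 0.75, scale_in_at: float = 0.30,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Reactive rule: add a replica above the high-water mark,
    remove one below the low-water mark, otherwise hold steady."""
    if cpu_util > scale_out_at:
        return min(current + 1, max_replicas)
    if cpu_util < scale_in_at:
        return max(current - 1, min_replicas)
    return current
```

A predictive policy would replace the instantaneous `cpu_util` reading with a forecast of future load; managed services such as AWS Auto Scaling expose both styles as target-tracking and predictive scaling policies.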
2. Guo, Yuan Yuan, Jing Li, Xin Chun Liu, and Wei Wei Wang. "Batch Job Based Auto-Scaling System on Cloud Computing Platform." Advanced Materials Research 756-759 (September 2013): 2386–90. http://dx.doi.org/10.4028/www.scientific.net/amr.756-759.2386.

Abstract:
With the rapid development of information science, it has become much harder to deal with data at large scale. Cloud computing has therefore become a hot topic as a new computing model because of its good scalability. It enables customers to acquire and release computing resources from and to cloud service providers according to the current workload. Scaling is carried out automatically by the system according to auto-scaling policies reserved by customers in advance, which greatly decreases users' operating burden. In this paper, we propose a new architecture for an auto-scaling system, apply auto-scaling technology to a batch-job-based system, and consider task deadlines and VM setup time, besides substrate resource utilisation, as factors affecting the auto-scaling policy.
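The two policy factors this abstract highlights, task deadlines and VM setup time, can be folded into a back-of-the-envelope capacity estimate. The sketch below is our own illustration under simplifying assumptions (uniform task lengths, one task slot at a time per VM), not the paper's algorithm:

```python
import math

def vms_needed(pending_tasks: int, task_seconds: float,
               deadline_seconds: float, vm_setup_seconds: float,
               tasks_per_vm: int = 1) -> int:
    """Estimate how many VMs are needed so that queued batch jobs
    finish before their deadline, discounting the time lost while
    a freshly provisioned VM boots (its setup time)."""
    usable = deadline_seconds - vm_setup_seconds
    if usable <= 0:
        raise ValueError("deadline shorter than VM setup time")
    # sequential task slots each VM can serve before the deadline
    slots_per_vm = max(1, int(usable // task_seconds)) * tasks_per_vm
    return math.ceil(pending_tasks / slots_per_vm)
```

The key point the paper makes survives even this toy form: ignoring `vm_setup_seconds` systematically underestimates the fleet size when deadlines are tight.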
3. Evangelidis, Alexandros, David Parker, and Rami Bahsoon. "Performance modelling and verification of cloud-based auto-scaling policies." Future Generation Computer Systems 87 (October 2018): 629–38. http://dx.doi.org/10.1016/j.future.2017.12.047.

4. Rajput, R. S., Dinesh Goyal, Rashid Hussain, and Pratham Singh. "Provisioning of Virtual Machines in the Context of an Auto-Scaling Cloud Computing Environment." Journal of Computational and Theoretical Nanoscience 17, no. 6 (June 1, 2020): 2430–34. http://dx.doi.org/10.1166/jctn.2020.8912.

Abstract:
The cloud computing environment handles its workload by distributing it between several nodes or shifting it to a larger resource so that no computing resource is overloaded. Several techniques are used for the management of computing workload in the cloud environment, but it remains an exciting domain of investigation and research. Control of the workload and scaling of cloud resources are essential aspects of the cloud computing environment. A well-organized load-balancing plan ensures adequate resource utilization. Auto-scaling is a technique to add or terminate computing resources based on scaling policies without involving human effort. In the present paper, we developed a method for optimal use of cloud resources by implementing a modified auto-scaling feature. We also incorporated an auto-scaling controller for the optimal use of cloud resources.
5. Wei, Yi, Daniel Kudenko, Shijun Liu, Li Pan, Lei Wu, and Xiangxu Meng. "A Reinforcement Learning Based Auto-Scaling Approach for SaaS Providers in Dynamic Cloud Environment." Mathematical Problems in Engineering 2019 (February 3, 2019): 1–11. http://dx.doi.org/10.1155/2019/5080647.

Abstract:
Cloud computing is an emerging paradigm which provides a flexible and diversified trading market for Infrastructure-as-a-Service (IaaS) providers, Software-as-a-Service (SaaS) providers, and cloud-based application customers. Taking the perspective of SaaS providers, they offer various SaaS services using rental cloud resources supplied by IaaS providers to their end users. In order to maximize their utility, the best behavioural strategy is to reduce renting expenses as much as possible while providing sufficient processing capacity to meet customer demands. In reality, public IaaS providers such as Amazon offer different types of virtual machine (VM) instances with different pricing models. Moreover, service requests from customers always change as time goes by. In such heterogeneous and changing environments, how to realize application auto-scaling becomes increasingly significant for SaaS providers. In this paper, we first formulate this problem and then propose a Q-learning based self-adaptive renting plan generation approach to help SaaS providers make efficient IaaS facilities adjustment decisions dynamically. Through a series of experiments and simulation, we evaluate the auto-scaling approach under different market conditions and compare it with two other resource allocation strategies. Experimental results show that our approach could automatically generate optimal renting policies for the SaaS provider in the long run.
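The core of a Q-learning based renting approach like the one this abstract describes is the tabular update rule. The following is a generic sketch under our own naming; the state, action set, and the reward encoding of rental cost versus SLA penalty are assumptions for illustration, not the paper's formulation:

```python
from collections import defaultdict

def q_learning_step(Q, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.9):
    """Tabular Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    For a SaaS provider, `state` could be a discretised workload level,
    `actions` could be {"rent", "release", "hold"} on VM instances, and
    `reward` the negative of rental cost plus any SLA-violation penalty."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    return Q
```

Iterating this update over many simulated billing periods, and then acting greedily with respect to `Q`, yields a self-adaptive renting plan of the kind the paper evaluates against fixed allocation strategies.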
6. Bhattacharjee, Brijit, Bikash Debnath, Jadav Chandra Das, Subhashis Kar, Nandan Banerjee, Saurav Mallik, Hong Qin, and Debashis De. "Predicting the Future Appearances of Lost Children for Information Forensics with Adaptive Discriminator-Based FLM GAN." Mathematics 11, no. 6 (March 10, 2023): 1345. http://dx.doi.org/10.3390/math11061345.

Abstract:
This article proposes an adaptive discriminator-based GAN (generative adversarial network) model architecture with different scaling and augmentation policies to investigate and identify cases of lost children even after several years (as human facial morphology changes over the years). A uniform probability distribution with combined random and auto-augmentation techniques to generate the future appearance of lost children’s faces is analyzed. X-flip and rotation are applied periodically during pixel blitting to improve pixel-level accuracy. The images were generated by the generator with anisotropic scaling. Bilinear interpolation was carried out during up-sampling by setting the padding reflection during geometric transformation. The color transformation was applied with the Luma flip on the rotation matrices, spread log-normally for saturation. The various scalings and modifications, combined with the StyleGAN-ADA architecture, were implemented on an NVIDIA V100 GPU. The FLM method yields a BRISQUE score of between 10 and 30. The article uses the MSE, RMSE, PSNR, and SSIM parameters for comparison with state-of-the-art models. Under the Universal Quality Index (UQI), the FLM model-generated output maintains high quality. The proposed model obtains overall scores of ERGAS (12 k–23 k), SCC (0.001–0.005), RASE (1 k–4 k), SAM (0.2–0.5), and VIFP (0.02–0.09).
7. Bhargavi, K., and B. Sathish Babu. "Uncertainty Aware Resource Provisioning Framework for Cloud Using Expected 3-SARSA Learning Agent: NSS and FNSS Based Approach." Cybernetics and Information Technologies 19, no. 3 (September 1, 2019): 94–117. http://dx.doi.org/10.2478/cait-2019-0028.

Abstract:
Efficiently provisioning the resources in a large computing domain like the cloud is challenging due to uncertainty in resource demands and in the computation ability of the cloud resources. Inefficient provisioning of the resources leads to several issues in terms of a drop in Quality of Service (QoS), violation of Service Level Agreements (SLA), over-provisioning of resources, under-provisioning of resources, and so on. The main objective of the paper is to formulate optimal resource provisioning policies by efficiently handling the uncertainties in the jobs and resources with the application of the Neutrosophic Soft-Set (NSS) and Fuzzy Neutrosophic Soft-Set (FNSS). Compared to existing fuzzy auto-scaling work, the proposed work achieves a throughput of 80% with a learning rate of 75% on homogeneous and heterogeneous workloads, considering the RUBiS, RUBBoS, and Olio benchmark applications.
8. Russo Russo, Gabriele, Valeria Cardellini, and Francesco Lo Presti. "Hierarchical Auto-Scaling Policies for Data Stream Processing on Heterogeneous Resources." ACM Transactions on Autonomous and Adaptive Systems, May 16, 2023. http://dx.doi.org/10.1145/3597435.

Abstract:
Data Stream Processing (DSP) applications analyze data flows in near real-time by means of operators, which process and transform incoming data. Operators handle high data rates by running parallel replicas across multiple processors and hosts. To guarantee consistent performance without wasting resources in the face of variable workloads, auto-scaling techniques have been studied to adapt operator parallelism at run-time. However, most of the effort has been spent under the assumption of homogeneous computing infrastructures, neglecting the complexity of modern environments. We consider the problem of deciding both how many operator replicas should be executed and which types of computing nodes should be acquired. We devise heterogeneity-aware policies by means of a two-layered hierarchy of controllers. While application-level components steer the adaptation process for whole applications, aiming to guarantee user-specified requirements, lower-layer components control auto-scaling of single operators. We tackle the fundamental challenge of performance and workload uncertainty, exploiting Bayesian optimization and reinforcement learning to devise policies. The evaluation shows that our approach is able to meet users’ requirements in terms of response time and adaptation overhead, while minimizing the cost due to resource usage, outperforming state-of-the-art baselines. We also demonstrate how partial model information is exploited to reduce training time for learning-based controllers.
9. Tournaire, Thomas, Hind Castel-Taleb, and Emmanuel Hyon. "Efficient Computation of Optimal Thresholds in Cloud Auto-Scaling Systems." ACM Transactions on Modeling and Performance Evaluation of Computing Systems, June 6, 2023. http://dx.doi.org/10.1145/3603532.

Abstract:
We consider a horizontal and dynamic auto-scaling technique in a cloud system where virtual machines hosted on a physical node are turned on and off in order to minimise energy consumption while meeting performance requirements. Finding cloud management policies that adapt the system to the load is not straightforward, and we consider here that virtual machines are turned on and off depending on queue load thresholds. We want to compute the optimal threshold values that minimize consumption costs and penalty costs (when performance requirements are not met). To solve this problem, we propose several optimisation methods, based on two different mathematical approaches. The first one is based on queueing theory and uses local search heuristics coupled with the stationary distributions of Markov Chains. The second approach tackles the problem using a Markov Decision Process (MDP) in which we assume that the policy is of a special multi-threshold type called hysteresis. We improve the heuristics of the former approach with the aggregation of Markov Chains and queue approximation techniques. We assess the benefit of threshold-aware algorithms for solving MDPs. Then, we carry out theoretical analyses of the two approaches. We also compare them numerically, and we show that all of the presented MDP algorithms strongly outperform the local search heuristics. Finally, we propose a cost model for a real scenario of a cloud system to apply our optimisation algorithms and to show their practical relevance. The major scientific contribution of the paper is a set of fast (almost real-time) load-based threshold computation methods that can be used by a cloud provider to optimize its financial costs.
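The multi-threshold hysteresis policies analysed in this paper can be made concrete with a small sketch. The paper's contribution is computing the optimal threshold values; the sketch below takes them as given, and the particular threshold vectors and queue-length state are illustrative assumptions of ours:

```python
def hysteresis_policy(active_vms: int, queue_len: int,
                      up_thresholds, down_thresholds):
    """Hysteresis scaling: power a VM on when the queue reaches the
    activation threshold for the current level, and off when it falls
    to the (strictly lower) deactivation threshold. The gap between
    the two thresholds prevents oscillation around a single value.
    up_thresholds[k]: queue length that triggers going from k to k+1 VMs.
    down_thresholds[k-1]: queue length that triggers going from k to k-1 VMs."""
    if active_vms < len(up_thresholds) and queue_len >= up_thresholds[active_vms]:
        return active_vms + 1
    if active_vms > 0 and queue_len <= down_thresholds[active_vms - 1]:
        return active_vms - 1
    return active_vms
```

With `up_thresholds = [5, 10, 15]` and `down_thresholds = [2, 6, 11]`, a queue of length 4 with one active VM changes nothing: it sits in the hysteresis band between switch-off (2) and switch-on (10).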

Dissertations / Theses on the topic "Auto-Scaling policies":

1. Adolfsson, Henrik. "Comparison of Auto-Scaling Policies Using Docker Swarm." Thesis, Linköpings universitet, Databas och informationsteknik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-154160.

Abstract:
When deploying software engineering applications in the cloud, two similar software components are used: Virtual Machines and Containers. In recent years containers have seen an increase in popularity and usage, in part because of tools such as Docker and Kubernetes. Virtual Machines (VMs) have also seen an increase in usage as more companies move to solutions in the cloud with services like Amazon Web Services, Google Compute Engine, Microsoft Azure and DigitalOcean. There are also some solutions using auto-scaling, a technique where VMs are commissioned and deployed as load increases in order to increase application performance. As the application load decreases, VMs are decommissioned to reduce costs. In this thesis we implement and evaluate auto-scaling policies that use both Virtual Machines and Containers. We compare four different policies, including two baseline policies. For the non-baseline policies we define a policy where we use a single Container for every Virtual Machine and a policy where we use several Containers per Virtual Machine. To compare the policies we deploy an image-serving application and run workloads to test them. We find that the choice of deployment strategy and policy matters for response time and error rate. We also find that deploying applications as described in the method is estimated to take roughly 2 to 3 minutes.
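The capacity arithmetic behind the thesis's two non-baseline policies (one container per VM versus several containers per VM) can be sketched as follows; the function and the idle-slot metric are our own illustration, not taken from the thesis:

```python
import math

def compare_strategies(replicas: int, containers_per_vm: int):
    """Contrast two container-to-VM mapping policies for a given
    number of container replicas:
      - "single": one container per VM (no wasted slots, most VMs)
      - "packed": up to `containers_per_vm` containers per VM
                  (fewer VMs, possibly idle container slots).
    Returns (vm_count, idle_container_slots) for each policy."""
    single = (replicas, 0)
    vms = math.ceil(replicas / containers_per_vm)
    packed = (vms, vms * containers_per_vm - replicas)
    return {"single": single, "packed": packed}
```

The trade-off the thesis measures empirically (response time and error rate per policy) sits on top of exactly this arithmetic: packing reduces the VM bill but couples the containers' fates and resource contention.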
2. Tournaire, Thomas. "Model-based reinforcement learning for dynamic resource allocation in cloud environments." Electronic Thesis or Diss., Institut polytechnique de Paris, 2022. http://www.theses.fr/2022IPPAS004.

Abstract:
The emergence of new technologies (Internet of Things, smart cities, autonomous vehicles, health, industrial automation, ...) requires efficient resource allocation to satisfy the demand. These new offers are compatible with the new 5G network infrastructure, since it can provide low latency and reliability. However, these new needs require high computational power, implying more energy consumption, in particular in cloud infrastructures and more particularly in data centers. It is therefore critical to find new solutions that can satisfy these needs while still reducing the power usage of resources in cloud environments. In this thesis we propose and compare new AI solutions (reinforcement learning) to orchestrate virtual resources in virtual network environments such that performance is guaranteed and operational costs are minimised. We consider queuing systems as a model for cloud IaaS infrastructures and bring learning methodologies to efficiently allocate the right number of resources for the users. Our objective is to minimise a cost function considering performance costs and operational costs. We go through different types of reinforcement learning algorithms (from model-free to relational model-based) to learn the best policy. Reinforcement learning is concerned with how a software agent ought to take actions in an environment to maximise some cumulative reward. We first develop a queuing model of a cloud system with one physical node hosting several virtual resources. In this first part we assume the agent perfectly knows the model (the dynamics of the environment and the cost function), giving it the opportunity to use dynamic programming methods for optimal policy computation. Since the model is known in this part, we also concentrate on the properties of the optimal policies, which are threshold-based and hysteresis-based rules. This allows us to integrate the structural property of the policies into MDP algorithms. 
After providing a concrete cloud model with exponential arrivals with real intensities and energy data for a cloud provider, we compare in this first approach the efficiency and computation time of MDP algorithms against heuristics built on top of the stationary distributions of the queuing Markov chain. In a second part we consider that the agent does not have access to the model of the environment and concentrate our work on reinforcement learning techniques, especially model-based reinforcement learning. We first develop model-based reinforcement learning methods where the agent can re-use its experience replay to update its value function. We also consider online MDP techniques where the autonomous agent approximates the environment model to perform dynamic programming. This part is evaluated in a larger network environment with two physical nodes in tandem, and we assess the convergence time and accuracy of different reinforcement learning methods, mainly model-based techniques versus state-of-the-art model-free methods (e.g. Q-learning). The last part focuses on model-based reinforcement learning techniques with a relational structure between environment variables. As these tandem networks have structural properties due to their infrastructure shape, we investigate factored and causal approaches built into reinforcement learning methods to integrate this information. We provide the autonomous agent with a relational knowledge of the environment so that it can understand how variables are related to each other. The main goal is to accelerate convergence: first by having a more compact representation with factorisation, where we devise an online factored-MDP algorithm that we evaluate and compare with model-free and model-based reinforcement learning algorithms; second by integrating causal and counterfactual reasoning, which can tackle environments with partial observations and unobserved confounders.

Book chapters on the topic "Auto-Scaling policies":

1. Kumari, Anisha, and Bibhudatta Sahoo. "Serverless Architecture for Healthcare Management Systems." In Advances in Healthcare Information Systems and Administration, 203–27. IGI Global, 2022. http://dx.doi.org/10.4018/978-1-6684-4580-8.ch011.

Abstract:
Serverless computing is an emerging cloud service architecture for executing distributed applications, where services are provided on a pay-as-you-go basis. It allows developers to deploy and run their applications without worrying about the underlying architecture. Serverless architecture has become popular due to its cost-effective policies, auto-scaling, and independent, simplified code deployment. A healthcare service can be made available as a serverless application that consists of distributed cloud services meeting the various requirements of the healthcare industry. The services offered may be made available on premises or on a cloud infrastructure to users and health service providers. This chapter presents a novel model of serverless architecture for healthcare systems, where the services are provided as functional units to the various stakeholders of the healthcare system. Two case studies related to healthcare systems that have adopted serverless frameworks are discussed.

Conference papers on the topic "Auto-Scaling policies":

1. Evangelidis, Alexandros, David Parker, and Rami Bahsoon. "Performance Modelling and Verification of Cloud-Based Auto-Scaling Policies." In 2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID). IEEE, 2017. http://dx.doi.org/10.1109/ccgrid.2017.39.

2. Gandhi, Anshul, Mor Harchol-Balter, Ram Raghunathan, and Michael A. Kozuch. "Distributed, Robust Auto-Scaling Policies for Power Management in Compute Intensive Server Farms." In 2011 6th Open Cirrus Summit (OCS). IEEE, 2011. http://dx.doi.org/10.1109/ocs.2011.6.
