Remit / autoscaling-simulator

The autoscaling simulation toolbox provides tools to experiment with policies for multilayered autoscaling (inc. VM clusters and applications). Multiverse simulator is the core of the toolbox. Using it, one can evaluate autoscaling policies under different conditions, including various applications, platforms, and workloads.

Multilayered Autoscaling Policies Simulation Toolbox

General Information

Multilayered autoscaling is a process of automatic adaptation of both the deployed application and the virtual infrastructure that it is running on. A real-world example of multilayered autoscaling is the combination of Horizontal Pod Autoscaler (HPA) with Cluster Autoscaler (CA) in Kubernetes.

The simulation toolbox in this repository is a work-in-progress set of tools to simulate and analyze multilayered autoscaling policies. The toolbox is written in Python and includes the following tools:

  • Multiverse is an autoscaling simulator and the core of the toolbox. It simulates autoscaling on both the application and the VM cluster level. Multiverse models the application as a network of services with buffers: requests travel from one service to the next, possibly waiting in a buffer until a service instance can process them. Requests are simulated either individually or in batches. Simulating individual requests quickly becomes a performance bottleneck as the generated load grows, but it allows tail latency effects to be studied in more detail. Under high load, e.g. 10-40 kRPS, it is recommended to use batches of 1,000-2,000 requests with a simulation step of 50 ms. A minimal sketch of this request-flow model appears after this list.

  • Stethoscope is a simulation data visualization tool. It leverages Python's matplotlib package to produce two categories of plots. The first category represents the quality of an autoscaling policy as experienced both by the user and by the application owner. The second category characterizes the autoscaling behavior, i.e. the impact of an autoscaling policy on the internal application state.

  • Cruncher is an autoscaling simulation automation tool that combines the simulation capabilities of Multiverse with the visualization options offered by Stethoscope. Based on the alternative configuration files provided to it, the tool determines which concrete simulations to run. Alternatives can be supplied for any configuration file, including the application, the scaling process, and the scaling policy. Cruncher explores every possible combination of the provided alternatives, feeding each one into Multiverse and retrieving the results once they are ready. The evaluation results for the alternative configurations are then combined and passed to Stethoscope to produce comparative plots (see the combination-expansion sketch after this list).

  • Praxiteles is a tool that generates sets of configuration files for autoscaling simulations based on meta-configuration files (so-called recipes), cloud provider traces, and models of particular simulation aspects such as the application topology model. The recipes determine what the generated simulation configuration files will contain. They specify either concrete parameter values (e.g. the number of services in an application) or probability distributions over the parameter values (e.g. the memory consumption of an application is uniformly distributed between 100 and 200 MB). Some parameters may be derived directly from data, i.e. from the traces that some cloud service providers make publicly available; these are mostly load traces and resource utilization traces. Currently, Praxiteles supports only two Azure cloud traces recently published by Microsoft [1][2]. Support for new traces can be added by writing custom classes that implement the same interface. Some simulation aspects can also be derived from models: as of now, Praxiteles supports only model-based application topology generation [3] and model-based generation of booting/termination time intervals. The tool's output is fed either into Cruncher or directly into Multiverse. A sketch of recipe-driven parameter sampling follows below.

  • Training Ground is a tool that helps to obtain justified results from short simulations whose autoscaling policy relies on online learning-based models. The challenge for such policies is that the simulation time is often insufficient for the model to reach an accuracy suitable for making autoscaling decisions. To overcome this, Training Ground pre-trains the model used in an autoscaling policy by running the simulation over a limited set of load patterns presented to it. These 'training' simulations are repeated multiple times, with a load trace drawn at random from the set on each repetition, to adjust the model. Once the achieved accuracy is considered sufficient, the path to the pre-trained model can be supplied as a parameter to the evaluated autoscaling policy. An additional advantage of this tool is that the models can be shipped directly into the production environment if it uses Python, Keras, and TensorFlow for cloud operations automation. The pre-training loop is sketched below.
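
To make the Multiverse request-flow model concrete, here is a minimal Python sketch of a service chain with buffers and batched requests. It illustrates the idea only and is not Multiverse's actual API; all names (Service, simulate, the capacity parameters) are hypothetical.

```python
# Minimal sketch of batched request flow through a chain of buffered services.
# Hypothetical names; not Multiverse's real API.
from collections import deque
from dataclasses import dataclass, field

STEP_MS = 50        # simulation step of 50 ms, as recommended for high load
BATCH_SIZE = 1000   # each simulated batch stands for 1,000 requests

@dataclass
class Service:
    name: str
    instances: int               # adjusted by the autoscaling policy
    batches_per_instance: int    # batches one instance processes per step
    buffer: deque = field(default_factory=deque)

    def step(self):
        """Drain as many batches as the current capacity allows."""
        capacity = self.instances * self.batches_per_instance
        drained = []
        while self.buffer and len(drained) < capacity:
            drained.append(self.buffer.popleft())
        return drained

def simulate(chain, batches_per_step, steps):
    """Push request batches through a chain of services, one step at a time."""
    completed_requests = 0
    for _ in range(steps):
        chain[0].buffer.extend(range(batches_per_step))  # arriving load
        for svc, nxt in zip(chain, chain[1:] + [None]):
            done = svc.step()
            if nxt is None:
                completed_requests += len(done) * BATCH_SIZE
            else:
                nxt.buffer.extend(done)                  # hand over downstream
    return completed_requests

frontend = Service("frontend", instances=4, batches_per_instance=1)
backend = Service("backend", instances=2, batches_per_instance=2)
# 2 batches/step * 1,000 requests / 0.05 s = 40 kRPS of offered load
print(simulate([frontend, backend], batches_per_step=2, steps=100))
```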
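
Cruncher's exploration of every combination of alternatives amounts to a Cartesian product over the provided configuration files. The sketch below shows one way to express that; run_simulation is a stand-in for an actual Multiverse invocation, and the file names are made up.

```python
# Hypothetical sketch of expanding alternative configuration files into
# concrete simulation runs, in the spirit of Cruncher.
import itertools

alternatives = {
    "application":     ["app_small.json", "app_large.json"],
    "scaling_policy":  ["reactive.json", "predictive.json"],
    "scaling_process": ["default_booting.json"],
}

def run_simulation(config):
    """Stand-in for a Multiverse invocation; returns dummy metrics here."""
    return {"mean_latency_ms": 42.0}

def expand_combinations(alternatives):
    """Yield one concrete configuration per combination of alternatives."""
    keys = list(alternatives)
    for values in itertools.product(*(alternatives[key] for key in keys)):
        yield dict(zip(keys, values))

# Every combination is simulated; the results would then be handed to
# Stethoscope to produce the comparative plots.
results = [(cfg, run_simulation(cfg))
           for cfg in expand_combinations(alternatives)]
```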
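
The recipe mechanism of Praxiteles, where a parameter is either a concrete value or a distribution to sample from, can be sketched as follows. The recipe schema shown here is an assumption for illustration, not the tool's actual file format.

```python
# Hypothetical sketch of recipe-driven parameter generation: a recipe entry
# is either a concrete value or a distribution specification.
import random

recipe = {
    "service_count": 5,                                          # concrete
    "memory_mb": {"dist": "uniform", "low": 100, "high": 200},   # distribution
}

def sample_parameter(spec):
    """Return the concrete value, or draw one from the declared distribution."""
    if isinstance(spec, dict) and spec.get("dist") == "uniform":
        return random.uniform(spec["low"], spec["high"])
    return spec

generated_config = {name: sample_parameter(spec)
                    for name, spec in recipe.items()}
print(generated_config)  # e.g. {'service_count': 5, 'memory_mb': 153.2...}
```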
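
Finally, the Training Ground pre-training loop, repeating short simulations with a randomly drawn load trace until the model is accurate enough, can be sketched like this. The classes and functions below are toy stand-ins, not the tool's real API.

```python
# Hypothetical sketch of the Training Ground pre-training loop.
import random

class OnlineModel:
    """Toy stand-in for an online learning-based model (e.g. a Keras model)."""
    def __init__(self):
        self.error = 1.0
    def fit(self, observations):
        self.error *= 0.9            # pretend each training round helps
    def validation_error(self):
        return self.error

def simulate_with_policy(model, load_trace):
    """Stand-in for a short simulation run under the given load trace."""
    return [load * 1.1 for load in load_trace]

def pretrain(model, load_traces, target_error=0.05, max_rounds=1000):
    for _ in range(max_rounds):
        trace = random.choice(load_traces)            # random draw from the set
        model.fit(simulate_with_policy(model, trace))
        if model.validation_error() <= target_error:  # accuracy is sufficient
            break
    # In the toolbox, the saved model's path would then be supplied as a
    # parameter to the evaluated autoscaling policy.
    return model

pretrain(OnlineModel(), load_traces=[[10, 20, 30], [5, 50, 5]])
```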

References

[1] Eli Cortez, Anand Bonde, Alexandre Muzio, Mark Russinovich, Marcus Fontoura, and Ricardo Bianchini. 2017. Resource Central: Understanding and Predicting Workloads for Improved Resource Management in Large Cloud Platforms. In Proceedings of the 26th Symposium on Operating Systems Principles (SOSP '17). Association for Computing Machinery, New York, NY, USA, 153–167. DOI:https://doi.org/10.1145/3132747.3132772

[2] Mohammad Shahrad, Rodrigo Fonseca, Inigo Goiri, Gohar Chaudhry, Paul Batum, Jason Cooke, Eduardo Laureano, Colby Tresness, Mark Russinovich, and Ricardo Bianchini. 2020. Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider. In 2020 USENIX Annual Technical Conference (USENIX ATC 20). USENIX Association, 205–218.

[3] V. Podolskiy, M. Patrou, P. Patros, M. Gerndt, and K. B. Kent. 2020. The Weakest Link: Revealing and Modeling the Architectural Patterns of Microservice Applications. In Proceedings of the 30th Annual International Conference on Computer Science and Software Engineering (CASCON '20). ACM, New York, NY, USA, 113–122. DOI:https://doi.org/10.5555/3432601.3432616

Languages

Python 53.4%, Jupyter Notebook 46.6%