INTEGRATING AI-DRIVEN AUTOSCALING MECHANISMS IN KUBERNETES-BASED MICROSERVICES ARCHITECTURES

Venkatramana Reddy Panyala

doi:10.15662/9zf67r46

Authors

Venkatramana Reddy Panyala, Production Engineer, Yahoo, USA

DOI:

https://doi.org/10.15662/9zf67r46

Keywords:

Kubernetes, Autoscaling, Microservices, Machine Learning, LSTM, Reinforcement Learning

Abstract

The growing adoption of cloud-native software has increased the need for efficient resource management in containerized environments. Kubernetes, the de facto standard for orchestrating container workloads, provides built-in autoscaling through the Horizontal Pod Autoscaler (HPA) and the Vertical Pod Autoscaler (VPA). However, these mechanisms cope poorly with the dynamic, bursty load typical of modern microservice architectures. This paper presents an approach that combines Kubernetes-based autoscaling with workload prediction based on Artificial Intelligence and Machine Learning. In particular, we focus on combining Long Short-Term Memory (LSTM) networks, Autoregressive Integrated Moving Average (ARIMA) models, and Reinforcement Learning (RL) to predict upcoming workload peaks. The approach is implemented and tested in a sample microservices environment consisting of auth, order, and payment services. We discuss its technical design, its challenges, and its integration possibilities. Early findings indicate that the proposed architecture outperforms traditional reactive autoscalers, achieving higher SLO compliance and reducing the risk of latency peaks. We conclude by outlining directions for further research.
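The contrast between reactive and predictive scaling can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper. The reactive rule follows the proportional formula documented for the HPA. The function names, the per-pod capacity of 50 requests per second, the replica bounds, and the linear-extrapolation forecaster (a stand-in for a trained LSTM or ARIMA model) are all assumptions made for illustration.

```python
import math


def reactive_replicas(current_replicas, current_metric, target_metric):
    """HPA-style reactive rule: scale in proportion to the observed metric."""
    return math.ceil(current_replicas * current_metric / target_metric)


def naive_forecast(history, horizon=3):
    """Hypothetical placeholder forecaster: extrapolate the last observed slope.

    A trained LSTM or ARIMA model would take the place of this function.
    """
    slope = history[-1] - history[-2]
    return [history[-1] + slope * (step + 1) for step in range(horizon)]


def predictive_replicas(forecast_rps, rps_per_pod, min_replicas=1, max_replicas=20):
    """Size the deployment for the forecast peak instead of the current load."""
    needed = math.ceil(max(forecast_rps) / rps_per_pod)
    return max(min_replicas, min(max_replicas, needed))


if __name__ == "__main__":
    history = [100, 150, 220]            # requests per second, assumed samples
    forecast = naive_forecast(history)   # extrapolated load for the next 3 steps
    print("reactive:", reactive_replicas(4, 75, 50))
    print("predictive:", predictive_replicas(forecast, rps_per_pod=50))
```

The reactive rule only responds once the observed metric has already exceeded its target, whereas the predictive rule provisions for the highest forecast value within the horizon, which is the behavior a prediction-driven autoscaler aims for under bursty load.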
