Resilience by Design: Site Reliability Engineering in Financial Platforms

Authors

  • Vasudevan Subramani, Development Manager and Solution Architect

DOI:

https://doi.org/10.15662/987sx245

Keywords:

Fintech, Site Reliability Engineering, Automation, High Availability, CI/CD Pipelines, Telemetry, AI/ML

Abstract

This paper examines the application of Site Reliability Engineering (SRE) principles to enhance 
stability, customer experience, and operational efficiency in financial platforms. Modern information systems 
demand extremely high availability, as even minor outages can lead to revenue loss, regulatory fines, and 
reputational damage. The case study demonstrates that automation and SRE practices prevented over $1 million 
in penalties, minimized failed transactions, and improved root cause analysis through custom ETL file 
management and heat maps. Additionally, the MyAccount portal was redesigned to reduce errors and improve 
usability, while operational improvements cleared 7,000 backlog tickets and reduced daily ticket volume to fewer 
than 68. Telemetry and failover automation further increased system availability to 99.95%. Findings confirm that 
SRE is not only a technical methodology but also a customer-focused approach, enabling organizations to reduce 
costs, improve efficiency, and deliver services reliably. The conclusions highlight the strategic importance of SRE in 
fintech and its potential to shape robust, scalable, and cost-effective platforms.
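
To put the 99.95% availability figure in concrete terms, the sketch below converts an availability target into the downtime it permits. The helper name, the 30-day and 365-day windows, and the assumption of round-the-clock service are illustrative only; the abstract does not state how availability was measured.

```python
# Illustrative sketch: downtime permitted by an availability target.
# The function name and the measurement windows are assumptions,
# not details taken from the paper.
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(availability_pct: float, days: int) -> float:
    """Return the minutes of downtime permitted over `days` of 24/7 service."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return (1 - availability_pct / 100) * days * MINUTES_PER_DAY


monthly = allowed_downtime_minutes(99.95, 30)
yearly = allowed_downtime_minutes(99.95, 365)
print(f"30-day budget: {monthly:.1f} min")      # 30-day budget: 21.6 min
print(f"365-day budget: {yearly / 60:.2f} h")   # 365-day budget: 4.38 h
```

Under these assumptions, a 99.95% target leaves roughly 21.6 minutes of downtime per 30 days, which suggests why automated failover matters more than manual recovery at this level.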

Published

2025-12-19

How to Cite

Subramani, V. (2025). Resilience by Design: Site Reliability Engineering in Financial Platforms. International Journal of Engineering & Extended Technologies Research (IJEETR), 7(6), 11210-11218. https://doi.org/10.15662/987sx245