Nisum built a comprehensive SRE Portal Framework.
Clients now have a reliable and scalable SRE portal framework, which measures service availability using machine learning in addition to providing end-to-end visibility across the production ecosystem, leading to:
- 99.999%
- An increase in cross-functional collaboration with shared accountability and ownership
Business Challenge
A Fortune 500 premium goods retailer lacked a comprehensive framework to measure its digital operations and end-to-end visibility against business and technical KPIs, specifically on eCommerce systems. This led to:
- Impacted customers because:
  - Churn between business units created downtime in services
  - Manual operations caused a delay in deliveries
- Increased operational expenditures due to a linear increase in team sizes
- Decreased revenue due to downtime
Solution
Nisum built a comprehensive SRE Portal Framework to measure everything, understand the current state of Service Level Indicators (SLIs), and define Service Level Objectives (SLOs). As a result, the client can balance reliability and scalability against the velocity of feature delivery.
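To make the SLI/SLO relationship concrete, here is a minimal sketch of an availability SLI checked against an SLO target, with the remaining error budget. It is not taken from the portal itself: the class name `SloSketch`, the request counts, and the five-nines target are illustrative assumptions.

```java
public final class SloSketch {

    /** Availability SLI: the fraction of requests that succeeded. */
    static double availability(long goodRequests, long totalRequests) {
        if (totalRequests <= 0) {
            throw new IllegalArgumentException("totalRequests must be positive");
        }
        return (double) goodRequests / totalRequests;
    }

    /**
     * Fraction of the error budget still unspent.
     * 1.0 means no failures yet; 0.0 means the budget is exhausted;
     * a negative value means the SLO has been violated.
     */
    static double errorBudgetRemaining(long goodRequests, long totalRequests, double sloTarget) {
        if (sloTarget <= 0.0 || sloTarget >= 1.0) {
            throw new IllegalArgumentException("sloTarget must be strictly between 0 and 1");
        }
        double allowedFailures = (1.0 - sloTarget) * totalRequests;
        long actualFailures = totalRequests - goodRequests;
        return 1.0 - actualFailures / allowedFailures;
    }

    public static void main(String[] args) {
        // Hypothetical measurement window: 10 million requests, 40 of them failed.
        long total = 10_000_000L;
        long good = 9_999_960L;
        double slo = 0.99999; // illustrative five-nines target

        System.out.printf("SLI (availability): %.6f%n", availability(good, total));
        System.out.printf("SLO met: %b%n", availability(good, total) >= slo);
        System.out.printf("Error budget remaining: %.1f%%%n",
                100.0 * errorBudgetRemaining(good, total, slo));
    }
}
```

The error budget is what links reliability to delivery velocity: while budget remains, teams can ship features, and once it is spent, the focus shifts back to stability.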
- Developed a Spring Boot web application with a responsive Angular front end and built-in schedulers, database connectors, and REST API integration capabilities to:
  - Fetch source data from disparate systems across the enterprise (internal and external)
  - Provide unified 360-degree visibility into business and technical operations
- The analyzed KPIs, in turn, helped SRE engineers connect the dots and cut response (MTTA) and resolution (MTTR) times, leading to:
  - A decrease in the operational budget
  - Better focus on automation of menial tasks
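As an illustration of the response and resolution KPIs mentioned above, the following sketch computes MTTA and MTTR from incident timestamps. It rests on assumptions not stated in the case study: the `Incident` type and its fields are hypothetical, and both intervals are measured from the moment the incident was opened.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.function.Function;

public final class IncidentMetrics {

    /** Hypothetical incident record with the three timestamps the KPIs need. */
    static final class Incident {
        final Instant opened;
        final Instant acknowledged;
        final Instant resolved;

        Incident(Instant opened, Instant acknowledged, Instant resolved) {
            this.opened = opened;
            this.acknowledged = acknowledged;
            this.resolved = resolved;
        }
    }

    /** Averages one interval per incident, with millisecond precision. */
    private static Duration mean(List<Incident> incidents, Function<Incident, Duration> interval) {
        if (incidents.isEmpty()) {
            throw new IllegalArgumentException("at least one incident is required");
        }
        long totalMillis = 0;
        for (Incident incident : incidents) {
            totalMillis += interval.apply(incident).toMillis();
        }
        return Duration.ofMillis(totalMillis / incidents.size());
    }

    /** Mean time to acknowledge: opened -> acknowledged. */
    static Duration mtta(List<Incident> incidents) {
        return mean(incidents, i -> Duration.between(i.opened, i.acknowledged));
    }

    /** Mean time to resolve: opened -> resolved (an assumed definition). */
    static Duration mttr(List<Incident> incidents) {
        return mean(incidents, i -> Duration.between(i.opened, i.resolved));
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2024-01-01T00:00:00Z");
        List<Incident> incidents = List.of(
                new Incident(t0, t0.plusSeconds(120), t0.plusSeconds(1800)),
                new Incident(t0, t0.plusSeconds(240), t0.plusSeconds(3600)));

        System.out.println("MTTA: " + mtta(incidents));
        System.out.println("MTTR: " + mttr(incidents));
    }
}
```

Tracking these two averages over successive periods is one simple way to see whether response and resolution times are actually trending down.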
Testimonial
“Amazing to see the result of measuring everything over time to achieve operational excellence.”
-VP of Site Reliability Engineering
Feel free to contact us for more information on how Nisum can drive results for your company.