SRE is a mindset

Applying Stephen Covey’s ‘Seven Habits of Highly Effective People’ to organisational functions

Vipin Luthra | January 23, 2022


(Illustration: Ashish Asthana)

In 2010, I attended a self-development program called ‘Seven Habits of Highly Effective People’. The program had a deep impact on me; it was a life-changing experience. Stephen Covey writes, “The way we see the problem is the problem.” We must allow ourselves to undergo paradigm shifts – to change ourselves fundamentally and not just alter our attitudes and behaviours on the surface. That’s where the seven habits of highly effective people come in.

Through this program, I was introduced to the three fundamentals that impactful leaders adopt.

One: they focus on personal effectiveness, also referred to as Private Victory or self-mastery. This should be achieved first. The habits that help to accomplish this are “Be Proactive”, “Begin with the End in Mind”, and “Put First Things First”. Earning trust and being reliable are also essential foundations of all human relationships. Habits 1, 2, and 3 focus on self-mastery and on moving from dependence to independence.

Two: they emphasise team effectiveness, also referred to as Public Victory (working with others). The habits that help to accomplish this are “Think Win-Win”, “Seek First to Understand, Then to Be Understood”, and “Synergize”. Habits 4, 5, and 6 focus on developing teamwork, collaboration, and communication skills, and on moving from independence to interdependence.

Three: they pursue continuous improvement, also referred to as “Sharpen the Saw”. Habit 7 is focused on continuous growth and advancement and embodies all the other habits.

Now let me turn to SRE – site reliability engineering – a topic very close to my heart. In my career, I have performed several roles in IT infrastructure and operations, application design, and software development. The basic need of any organisation that aims for digital transformation is the reliability of its IT infrastructure, including network stability and application resilience. I believe that reliability and trust are not only desirable human traits but are also necessary in machines and software engineering. This is an eternal truth, not something restricted to the domain of information technology alone.

What is SRE?
In 2003, Google coined the term SRE, with the goal of “Keeping Google up and running”. At Google, SRE is the practice of continually defining reliability goals, measuring against those goals, and working to improve services as needed. The main goal is to create scalable and highly reliable software systems.

How I see SRE through the prism of Seven Habits of Highly Effective People

As mentioned above, “Seven Habits of Highly Effective People” profoundly impacted me, and I started applying its lessons in my professional and personal life.

Let me elaborate.

Now let us try and see if we can extrapolate this and apply it in the wider organisational perspective.

Every organisation has IT functions and subfunctions. Let us start with the presumption that Covey’s individual is a particular IT function of the organisation. Now let us apply all seven habits to organisational functions, just as Covey applies them to individuals.

    Step one – to make a particular IT function effective (i.e., private victory)

    Step two – to make all the different IT functions collectively effective (i.e., public victory)

    Step three – to continuously improve steps one and two

Let’s see how organisations can adopt SRE practices using the seven habits:

STEP ONE: TO MAKE A PARTICULAR IT FUNCTION EFFECTIVE (I.E., PRIVATE VICTORY)

Begin with the End in Mind
    This principle is focused on People Excellence within an IT function.

Any organisation adopting SRE must first define the end goal it wants to achieve. It should then help its people understand what SRE means to them. Each IT function may have different objectives as part of SRE adoption, all of which should ultimately map to the organisation’s goal. Hence, it is important to set goals and expectations for every IT function. The functions should have a shared sense of purpose and should develop an SRE mindset. Here are some examples of how various IT functions can develop a shared purpose.

1)    “Design for Reliability”: The “Platform & Architecture” function should design applications for reliability. The design should cover all IT infrastructure, networks, compute, databases, and applications, with high availability and resilience in mind. Architectural mistakes become more difficult to fix later in the development cycle.

2)    “Effective Coding for Reliability”: Development teams should write efficient code with an effective transaction logging framework, monitoring, alerting, and performance tuning. Effective transaction logging helps in the implementation of AIOps and other predictive analytics tools.

3)    “From a Monitoring Mindset to an Observability One”: The IT operations team should develop an observability mindset. Modern infrastructure practice has evolved from monitoring to observability. As a mindset, observability is the degree to which a team or company values the ability to inspect and understand systems, their workloads, and their behaviour. It enables us to quickly and easily understand how the whole system runs, and even to pre-empt issues.

Recommendation
An organisation should set up a centre of excellence for SRE. This team should work closely with every function, help it understand its role in adopting the SRE mindset, and define objectives at the IT function level. Ultimately, the goal should be that SRE becomes a part of every function in the organisation.

Be Proactive

    This principle is focused on Product Excellence and on enabling intelligent, automated operations. Every IT function should develop its capability to detect issues proactively and fix them. Here are some examples.

Organisations should invest in implementing end-to-end business process monitoring solutions. This includes monitoring networks, compute, databases, the application layer, the integration layer, workloads, public URLs, etc. This should be followed by implementing self-healing solutions that make the necessary changes to restore themselves to normal operations.

Organisations should also develop a predictive AI strategy and implement predictive analytics tools. This will reduce operational inefficiencies and improve digital experiences. Predictive IT is a powerful approach that uses machine learning (ML) and artificial intelligence (AI) to predict incidents before they impact customers and end users.
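The self-healing idea described above can be sketched as a small check-and-remediate loop. This is a minimal illustration only: the function name `self_heal`, its parameters, and the retry limit are assumptions made for the example, and a real implementation would depend on the monitoring and orchestration tools in use.

```python
from typing import Callable


def self_heal(
    is_healthy: Callable[[], bool],
    remediate: Callable[[], None],
    max_attempts: int = 3,  # assumed limit; tune per service
) -> bool:
    """Try to restore a service to normal operation.

    Returns True if the service is healthy (possibly after remediation),
    or False if the attempts are exhausted and a human should be paged.
    """
    for _ in range(max_attempts):
        if is_healthy():
            return True
        # Hypothetical remediation, e.g. restart a process or clear a queue.
        remediate()
    # Final check after the last remediation attempt.
    return is_healthy()
```

A caller would pass its own health probe and remediation action, and escalate to the on-call team when the function returns False.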

Put First Things First
    This principle is focused on Process Excellence, especially around alert management.

The team should understand what is important versus what is urgent, and alert management is where this distinction matters most.

Let’s map alert management to the four quadrants of time management.
1)    Quadrant I: Urgent and important (Do). These are alerts that must be fixed immediately.
2)    Quadrant II: Not urgent but important (Plan). These alerts are like warnings: they need not be fixed urgently but should be actioned in due course, e.g., a continuously high CPU load that is not causing downtime now but could cause a major incident in the future.
3)    Quadrant III: Urgent but not important (Delegate). These alerts may not be important for one team but need to be fixed by another group. Hence, a hot handover is critical, e.g., it is essential to engage the business urgently for any master data failure.
4)    Quadrant IV: Not urgent and not important (Eliminate). These are false alerts: noise that should be eliminated to increase effectiveness. I have seen too many false alerts take a significant psychological toll on people. In such cases, the true alerts generally get lost, or the teams lose interest in taking adequate and timely action.
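The four quadrants above amount to a simple decision rule on two flags. The Python sketch below is one hypothetical way to express it; the `Alert` fields and the `Action` names are illustrative assumptions and do not correspond to any particular alerting tool.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    DO = "fix immediately"                     # Quadrant I
    PLAN = "schedule a fix"                    # Quadrant II
    DELEGATE = "hand over to the owning team"  # Quadrant III
    ELIMINATE = "suppress as noise"            # Quadrant IV


@dataclass(frozen=True)
class Alert:
    name: str
    urgent: bool
    important: bool


def triage(alert: Alert) -> Action:
    """Map an alert to a quadrant using its urgency and importance."""
    if alert.urgent and alert.important:
        return Action.DO
    if alert.important:
        return Action.PLAN
    if alert.urgent:
        return Action.DELEGATE
    return Action.ELIMINATE
```

For instance, the sustained high CPU warning from Quadrant II would be modelled as `Alert("high_cpu", urgent=False, important=True)` and triaged to `Action.PLAN`. In practice the hard part is deciding the two flags for each alert, which is a judgement each team has to agree on.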

STEP TWO: TO MAKE ALL THE DIFFERENT FUNCTIONS COLLECTIVELY EFFECTIVE (I.E., PUBLIC VICTORY)

Think Win-Win
    This habit focuses on cross-team excellence and on establishing shared metrics and alert thresholds across all the functions

Common metrics sources include:
• System metrics (CPU, memory, disk)
• Infrastructure metrics (Azure, AWS)
• Web tracking scripts (Google Analytics, Digital Experience Management)
• Application agents/collectors (APM, error tracking)
• Business metrics (Order to Cash, Load Out, Load In, etc.)

Here is an example: an application’s performance degrades once CPU utilisation rises above 95%, but the data centre compute team configures the high-CPU threshold at 99%. The application and infrastructure teams have set mismatched CPU utilisation thresholds. The thresholds to be monitored should be standard across all functions.
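One way to catch such a mismatch is to keep the agreed thresholds in a single shared place and compare each team's configuration against it. The sketch below assumes hypothetical names (`SHARED_THRESHOLDS`, `find_mismatches`, and the team and metric keys); the 95% and 99% values are taken from the example above.

```python
# Agreed, organisation-wide thresholds (single source of truth).
SHARED_THRESHOLDS = {"cpu_percent": 95.0}


def find_mismatches(shared, team_configs):
    """Return (team, metric, configured, agreed) for every divergent threshold."""
    mismatches = []
    for team, config in team_configs.items():
        for metric, configured in config.items():
            agreed = shared.get(metric)
            if agreed is not None and configured != agreed:
                mismatches.append((team, metric, configured, agreed))
    return mismatches


# Hypothetical team configurations mirroring the example in the text.
team_configs = {
    "application": {"cpu_percent": 95.0},
    "datacentre_compute": {"cpu_percent": 99.0},
}

print(find_mismatches(SHARED_THRESHOLDS, team_configs))
# [('datacentre_compute', 'cpu_percent', 99.0, 95.0)]
```

A check like this could run whenever a team changes its alerting configuration, so that divergence is flagged before it causes a missed alert.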

Seek first to understand, then to be understood
    This habit focuses on building more observable systems and on the ability to quickly and easily understand how the whole system runs. This is achievable through the Three Pillars of Observability: logs, metrics, and traces.

Observability is all about service reliability in order to provide the best customer experience. It means instrumenting your systems with tools that collect actionable data so that you know when errors occur. While having access to logs, metrics, and traces doesn’t necessarily make systems more observable, these are powerful tools that, if understood well, can unlock the ability to build better systems. Logs, metrics, and traces each serve a unique purpose and are complementary. In unison, they provide maximum visibility into the behaviour of distributed systems.
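To make the three pillars concrete, the toy sketch below records a log line, a metric, and a trace span for one operation, all linked by a shared trace ID. It uses only the Python standard library and in-memory lists; the names `traced`, `logs`, `metrics`, and `spans` are assumptions made for illustration, and a production system would send this data to dedicated observability tooling instead.

```python
import json
import time
from collections import Counter
from contextlib import contextmanager

logs = []            # pillar 1: structured log lines
metrics = Counter()  # pillar 2: simple counters
spans = []           # pillar 3: timed trace spans


@contextmanager
def traced(operation, trace_id):
    """Record a log line, a counter, and a span for one operation."""
    start = time.perf_counter()
    status = "ok"
    try:
        yield
    except Exception:
        status = "error"
        raise
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        spans.append({"trace_id": trace_id, "operation": operation,
                      "duration_ms": duration_ms, "status": status})
        metrics[f"{operation}.{status}"] += 1
        logs.append(json.dumps({"trace_id": trace_id,
                                "event": operation, "status": status}))
```

Wrapping a unit of work in `with traced("create_order", trace_id):` produces all three signals at once. Because they share the same trace ID, an engineer can move from a metric spike to the matching spans and log lines, which is what makes the pillars complementary.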

"Synergize"
    Synergize is the habit of creative cooperation. It is teamwork, open-mindedness, and the adventure of finding new solutions to old problems, including managing error budgets.

Various functions should work together to resolve alerts and major incidents and follow common approaches and platforms for solving problems, with a focus on the more complex problems. They should also maintain effective relationships among the various teams and with their partner teams. Effective communication is a high priority in SRE.

Under this habit, Stephen Covey also talks about the Emotional Bank Account. This is very much like a checking account at a bank. You can make deposits and improve the relationship, or make withdrawals and weaken it.

Similarly, organisations can spend their error budget in any way they like. If the product is currently running well within its error budget, they can launch innovations freely. Conversely, if they have used up or exceeded the error budget and are operating at or below the defined SLA, all innovations or launches are put on hold until they reduce the number of errors to a level that allows launches to proceed.
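The arithmetic behind an error budget is simple: the budget is whatever share of requests the reliability target allows to fail. The sketch below assumes a request-based target of 99.9% and hypothetical function names; the figures are illustrative and not taken from any real service. `Decimal` is used so that the small fractions are computed exactly.

```python
from decimal import Decimal


def allowed_failures(target: str, total_requests: int) -> int:
    """Error budget: failed requests permitted by the target (e.g. "0.999")."""
    return int((1 - Decimal(target)) * total_requests)


def can_launch(target: str, total_requests: int, failed_requests: int) -> bool:
    """Launches proceed only while some error budget remains."""
    return failed_requests < allowed_failures(target, total_requests)


# Assumed example: 99.9% target over 1,000,000 requests.
print(allowed_failures("0.999", 1_000_000))   # 1000
print(can_launch("0.999", 1_000_000, 400))    # True: budget remains
print(can_launch("0.999", 1_000_000, 1_000))  # False: budget used up
```

With 400 failures the team still has 600 failures of budget to "spend" on risky changes; at 1,000 failures the budget is exhausted and launches pause until reliability recovers.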

STEP THREE: CONTINUOUSLY IMPROVE STEPS ONE AND TWO

Sharpen the saw
 “We must never become too busy sawing to take time to sharpen the saw.”
–Dr. Stephen R. Covey

This habit focuses on governance and on measuring Key Performance Indicators. This helps organisations measure where they are today, improve continuously, and ensure that they are moving in the right direction towards adopting the SRE principles.


The above diagram shows the Maturity Continuum of SRE. In this way, any organisation can fulfil its dream of adopting SRE by following the seven habits of Dr Stephen Covey.

P.S.: A couple of years ago, I was asked to lead IT operations of the ‘Go To Market’ area that had many challenges related to system reliability and availability. I tried to apply all these principles and achieved fantastic results. I will share those details with you in my upcoming blogs.

Let me know how your organisations are embarking on their journey of SRE adoption.

Ms. Vipin Luthra is a highly respected IT leader with nearly 25 years of experience. She hails from Roorkee in Uttarakhand and is a graduate of IIT Roorkee. Currently, she is working as Senior Director at PepsiCo.
