SRE is a mindset

Applying Stephen Covey’s ‘Seven Habits of Highly Effective People’ to organisational functions

Vipin Luthra | January 23, 2022

#self-development   #psychology   #Stephen Covey   #management   #Business  
(Illustration: Ashish Asthana)

In 2010, I attended a self-development program called ‘Seven Habits of Highly Effective People’.  This program deeply impacted me, and I underwent a life-changing experience. Stephen Covey writes, “The way we see the problem is the problem.” We must allow ourselves to undergo paradigm shifts – to change ourselves fundamentally and not just alter our attitudes and behaviours on the surface. That's where the seven habits of highly effective people come in.

Through this program, I was introduced to three fundamentals that impactful leaders adopt.

One: they focus on personal effectiveness, also referred to as Private Victory or Self Mastery. This should be achieved first. The habits that help to accomplish this are “Be proactive”, “Begin with the end in mind”, and “Put first things first”. Trust and reliability are also essential foundations of all human relationships. Habits 1, 2, and 3 focus on self-mastery and moving from dependence to independence.
Two: they emphasise team effectiveness, also referred to as Public Victory (working with others). The habits that help to accomplish this are “Think Win-Win”, “Seek first to understand, then to be understood”, and “Synergize”. Habits 4, 5, and 6 focus on developing teamwork, collaboration, and communication skills, and on moving from independence to interdependence.
Three: they pursue continuous improvement, also referred to as “Sharpen the Saw”. Habit 7 focuses on continuous growth and advancement and embodies all the other habits.

Meanwhile, let me talk about SRE – site reliability engineering – a topic very close to my heart. In my career, I have performed several roles spanning IT infrastructure and operations, application design, and software development. The basic need of any organisation that aims for a digital transformation is the reliability of its IT infrastructure, including network stability and application resilience. I believe that reliability and trust are not only desirable human traits but are also necessary for machines and software engineering. This is an eternal truth – not something restricted to the domain of information technology alone.

What is SRE?
In 2003, Google coined the term SRE, and their goal was “Keeping Google up and running”. At Google, SRE is a practice of continually defining reliability goals, measuring performance against those goals, and working to improve services as needed. The main goal is to create scalable and highly reliable software systems.

How I see SRE through the prism of Seven Habits of Highly Effective People

As mentioned above, “Seven Habits of Highly Effective People” profoundly impacted me, and I started replicating this in my professional and personal space.

Let me elaborate.

Now let us try and see if we can extrapolate this and apply it in the wider organisational perspective.

Every organisation has IT functions and subfunctions. Let us start with the presumption that Covey’s individual is a particular IT function of the organisation. Now let us apply all seven habits to organisational functions just as Covey applies them to individuals.

    Step one – to make a particular IT function effective (i.e., private victory)

    Step two – to make all the different IT functions collectively effective (i.e., public victory)

    Step three – continuously improve steps one and two

Let’s understand how organisations can adopt SRE practices using the seven habits:


Begin with the end in mind
    This principle is focused on People Excellence within an IT function

It is imperative for any organisation to define its end goal to be achieved by the adoption of SRE. This should be followed by making their people understand what SRE means to them. Every IT function may have different objectives as part of SRE adoption that may ultimately map to the organisation’s goal. Hence, it is important to set the goals and expectations of every IT function for the adoption of SRE. They should have a shared sense of purpose and should develop an SRE mindset.  Here are some examples of how various IT functions can develop a shared purpose.  

1)    “Design for Reliability”: It is important for the “Platform & Architecture” function to design the application for reliability. The design covers all the IT infrastructure, networks, compute, databases, and applications, and targets high availability and resilience. Fixing architectural mistakes becomes more difficult later in the development cycle.

2)    “Effective coding for Reliability”: Development teams should write efficient code with an effective transaction logging framework, monitoring, alerting, and performance tuning. Effective transaction logging helps in the implementation of AIOps and other predictive analytics tools.

3)    “From a monitoring mindset to an observability one”: The IT operations team should develop an observability mindset. Modern infrastructure has evolved from a monitoring mindset to an observability one. Observability as a mindset is the degree to which a team or company values the ability to inspect and understand systems, their workload, and their behaviour. Observability enables us to quickly and easily understand how the whole system runs, and even to preempt issues.

An organisation should set up a centre of excellence for SRE. This team works closely with every function, helps them understand their role in adopting the SRE mindset, and defines the objectives at the IT function level. Ultimately, the goal should be that SRE becomes a part of every function in the organisation.

Be Proactive

    This principle is focused on Product Excellence and enabling Intelligent and automated operations. Every IT function should focus on developing its capability to detect issues/problems proactively and fix them. Here are some examples.

Organisations should invest in implementing end-to-end business process monitoring solutions. This includes monitoring networks, compute, databases, the application layer, the integration layer, workloads, public URLs, etc. This should be followed by implementing self-healing solutions that make the necessary changes to restore themselves to normal operations.

Organisations should also develop a predictive AI strategy and implement predictive analytics tools. This will reduce operational inefficiencies and improve digital experiences. Predictive IT is a powerful new approach that uses machine learning (ML) and artificial intelligence (AI) to predict incidents before they impact customers and end-users.

Put first things first
    This principle is focused on Process excellence, especially around alert management.

The team should understand what is important versus what is urgent, especially when it comes to alert management.

Let’s try and map alert management to the four quadrants of time management.
1)    Quadrant I: Urgent and important (Do). These are the alerts that must be fixed immediately.
2)    Quadrant II: Not urgent but important (Plan). These alerts are like warnings: they need not be fixed urgently but should be actioned in due course, e.g., a sustained high CPU load that is not causing system downtime now but can cause a major incident in the future.
3)    Quadrant III: Urgent but not important (Delegate). These alerts may not be important for one team but need to be fixed by another group. Hence, a hot handover is critical. E.g., it is essential to engage the business on an urgent basis for any master data failure.
4)    Quadrant IV: Not urgent and not important (Eliminate). These are false alerts that act as noise and should be eliminated to increase effectiveness. I have seen a significant psychological impact on people when they receive too many false alerts. Generally, in such cases, the true alerts get lost, or the teams lose interest in taking adequate and timely action.
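The quadrant mapping above can be sketched in a few lines of Python. This is only an illustration: the `Alert` fields, the `classify` function, and the sample alerts are hypothetical names chosen for this example, and deciding whether a real alert is urgent or important remains a judgement for each team.

```python
from dataclasses import dataclass
from enum import Enum


class Quadrant(Enum):
    DO = "I: urgent and important"
    PLAN = "II: not urgent but important"
    DELEGATE = "III: urgent but not important"
    ELIMINATE = "IV: not urgent and not important"


@dataclass
class Alert:
    name: str
    urgent: bool      # needs action right now
    important: bool   # matters to the team that received it


def classify(alert: Alert) -> Quadrant:
    """Map an alert to one of the four time-management quadrants."""
    if alert.urgent and alert.important:
        return Quadrant.DO
    if alert.important:
        return Quadrant.PLAN
    if alert.urgent:
        return Quadrant.DELEGATE
    return Quadrant.ELIMINATE


# Hypothetical sample alerts, one per quadrant.
alerts = [
    Alert("service down", urgent=True, important=True),
    Alert("sustained high CPU", urgent=False, important=True),
    Alert("master data failure", urgent=True, important=False),
    Alert("known false alarm", urgent=False, important=False),
]
for alert in alerts:
    print(f"{alert.name}: {classify(alert).value}")
```

Once alerts carry these two flags, Quadrant IV alerts can be counted and reviewed regularly, which makes the noise visible before it dulls the team's response.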


Think Win-Win
    This habit focuses on cross-team excellence and on establishing shared metrics and alert thresholds across all the functions

Common metrics sources include:
• System metrics (CPU, memory, disk)
• Infrastructure metrics (Azure, AWS)
• Web tracking scripts (Google Analytics, Digital Experience Management)
• Application agents/collectors (APM, error tracking)
• Business metrics (Order to Cash, Load Out, Load In, etc.)

Here is an example: an application's performance starts to degrade once CPU utilisation rises above 95%, but the data centre compute team configures the high-CPU alert threshold at 99%. The application and infrastructure teams have therefore set mismatched CPU utilisation thresholds. The thresholds to be monitored should be standard across all functions.
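A mismatch like this is easy to detect once every team's thresholds are kept in one place. The sketch below is a minimal illustration, assuming each team publishes its thresholds as a simple mapping; the function name, the team names, and the memory figure are made up for the example, while the 95% and 99% CPU values come from the case above.

```python
def find_threshold_mismatches(thresholds):
    """Return metrics whose alert thresholds differ between teams.

    `thresholds` maps team name -> {metric name -> threshold in percent}.
    """
    by_metric = {}
    for team, metrics in thresholds.items():
        for metric, value in metrics.items():
            by_metric.setdefault(metric, {})[team] = value
    # Keep only metrics where teams disagree on the value.
    return {
        metric: teams
        for metric, teams in by_metric.items()
        if len(set(teams.values())) > 1
    }


# Illustrative configuration; the memory threshold is an assumption.
team_thresholds = {
    "application": {"cpu_percent": 95, "memory_percent": 90},
    "infrastructure": {"cpu_percent": 99, "memory_percent": 90},
}
print(find_threshold_mismatches(team_thresholds))
```

Running a check of this kind whenever thresholds change gives the teams a shared, factual starting point for agreeing on a common value.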

Seek first to understand, then to be understood
    This habit focuses on building better observable systems and the ability to quickly and easily understand how the whole system runs. This is achievable through the three pillars of observability: logs, metrics, and traces.

Observability is all about service reliability to provide the best customer experience. Observability is instrumenting your systems with tools to collect actionable data to know when errors occur. While having access to logs, metrics, and traces doesn’t necessarily make systems more observable, these are powerful tools that, if understood well, can unlock the ability to build better systems. Logs, metrics, and traces serve their unique purpose and are complementary. In unison, they provide maximum visibility into the behaviour of distributed systems.
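To make the three pillars concrete, here is a small sketch that uses only the Python standard library. It is not how a production system would be instrumented (dedicated observability tooling would normally do this work), and the names `handle_order`, `span`, `metrics`, and `spans` are hypothetical. It simply shows one request producing a log line, a metric, and a trace that share the same trace ID.

```python
import logging
import time
import uuid
from collections import Counter
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("orders")

metrics = Counter()   # metrics: running counts per metric name
spans = []            # traces: (trace_id, span name, duration in seconds)


@contextmanager
def span(trace_id, name):
    """Record how long a named step took within one trace."""
    start = time.perf_counter()
    try:
        yield
    finally:
        spans.append((trace_id, name, time.perf_counter() - start))


def handle_order(order_id):
    trace_id = uuid.uuid4().hex  # ties logs and spans of one request together
    with span(trace_id, "handle_order"):
        metrics["orders_received"] += 1
        log.info("trace=%s order=%s received", trace_id, order_id)
        with span(trace_id, "validate"):
            valid = order_id > 0
        if not valid:
            metrics["orders_rejected"] += 1
            log.warning("trace=%s order=%s rejected", trace_id, order_id)
        return valid


handle_order(42)
handle_order(-1)
print(dict(metrics))
```

The metric shows that something was rejected, the log line says which order, and the spans show where the time went; the shared trace ID is what lets the three be read together.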

Synergize
    This is the habit of creative cooperation. It covers teamwork, open-mindedness, the adventure of finding new solutions to old problems, and managing error budgets.

Various functions should work together to resolve alerts and major incidents, following common approaches and platforms so that they can focus on solving more complex problems. They should also maintain effective relationships among the various teams and with their partner teams. Effective communication is a high priority in SRE.

Under this habit, Stephen Covey also talks about the Emotional Bank Account. This is very much like a checking account at a bank: you can make deposits and improve the relationship, or make withdrawals and weaken it.

Similarly, organisations can spend their error budget in any way they like. If the product is running well within its budget, they can launch new features freely. Conversely, if they have met or exceeded the error budget and are operating at or below the defined SLA, all innovations or launches are put on hold until the number of errors falls to a level that allows launches to proceed.
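The arithmetic behind an error budget is simple. The sketch below assumes a request-based reliability target and uses made-up figures (a 99.9% target over one million requests); the function names are hypothetical, and real targets and measurement windows are for each organisation to define.

```python
def error_budget_remaining(target, total_requests, failed_requests):
    """Return the fraction of the error budget left (negative if overspent).

    `target` is the reliability goal as a fraction, e.g. 0.999 for 99.9%.
    """
    allowed_failures = (1 - target) * total_requests
    if allowed_failures <= 0:
        # No budget at all: any failure overspends it.
        return 1.0 if failed_requests == 0 else float("-inf")
    return 1 - failed_requests / allowed_failures


def can_launch(target, total_requests, failed_requests):
    """Launches proceed only while some error budget remains."""
    return error_budget_remaining(target, total_requests, failed_requests) > 0


# Illustrative figures: a 99.9% target allows about 1,000 failures
# per 1,000,000 requests.
for failed in (400, 1200):
    left = error_budget_remaining(0.999, 1_000_000, failed)
    print(f"{failed} failures: budget left {left:.0%}, "
          f"launch allowed: {can_launch(0.999, 1_000_000, failed)}")
```

Like the Emotional Bank Account, the budget is a balance: every failed request is a withdrawal, and launches resume once the balance is positive again.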


Sharpen the saw
 “We must never become too busy sawing to take time to sharpen the saw.”
–Dr. Stephen R. Covey

This habit focuses on governance and measuring the Key Performance Indicators. This helps the organisations measure where they are today and continuously improve and ensure that they are moving in the right direction to achieve their goal towards the adoption of the SRE principles.

The above diagram shows the Maturity Continuum of SRE. In this way, any organisation can fulfil its dream of adopting SRE by following the seven habits of Dr Stephen Covey.

P.S.: A couple of years ago, I was asked to lead IT operations of the ‘Go To Market’ area that had many challenges related to system reliability and availability. I tried to apply all these principles and achieved fantastic results. I will share those details with you in my upcoming blogs.

Let me know how your organisations are embarking on their journey of SRE adoption.

Ms. Vipin Luthra is a highly respected IT leader with nearly 25 years of experience. She hails from Roorkee in Uttarakhand and is a graduate of IIT Roorkee. Currently, she is working as Senior Director at PepsiCo.


