DevOps

We live and work in exciting times, when technology can play a key role in delivering real value and competitive advantage to any business. Technology now provides opportunities to deliver features to customers in new ways and at speeds not possible before. But organisations built on traditional IT and waterfall methods often find themselves struggling to keep up.

DevOps

DevOps is a new term that emerged from two related major trends.

The first trend, also called “agile infrastructure” or “agile operations”, originated from applying Agile and Lean to operations work.

The second trend came from a better understanding of the value of collaboration between development and operations teams and how important operations has become in our increasingly service-oriented world.

A development team creates the product or service, and it has different incentives from the operations team that keeps the product or service running.

For example, a developer wants to deliver as many working features as possible in the shortest possible time. That is the work customers pay them for.

An operator, however, wants as few new features as possible, because each new feature is a change and every change is a risk.

As a result of this misalignment, DevOps was born.

DevOps can be defined as the practice of operations and development teams participating together in the entire product or service lifecycle, from design through the development process to production support.

DevOps ultimately means building awesome, revenue-generating products (not merely automating infrastructure and improving its efficiency) through digital pipelines (processes and tools) that take code from a development environment all the way to a valuable product or service.

That is a complicated definition; it could perhaps be summarised as:

quality through collaboration.

Agile and Lean roots

DevOps values are effectively the same as those captured in the Agile Manifesto, but they focus on the overall service or product fully delivered to the customer instead of simply “working software”.

Also, many Lean values and principles apply to operations.

Lean can be defined as the pursuit of perfection: the maximisation of customer value through the systematic identification and minimisation of waste.
In particular, the principle of eliminating waste wherever possible (Muda) is applied in DevOps all the time, by automating and removing manual steps.

Another one is moving toward flow (Mura).
A lot of what DevOps does in its pipelines is about making the flow consistent, following the Theory of Constraints: making work visible, understanding bottlenecks in the pipeline, and understanding that global optimisation always trumps local optimisation.
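
To make the bottleneck argument concrete, here is a minimal sketch in Python. The stage names and hourly capacities are invented for illustration; the only point is that a serial pipeline moves at the speed of its slowest stage, so speeding up any other stage gains nothing.

```python
# Hypothetical pipeline: stage name -> changes it can process per hour.
# The numbers are illustrative assumptions, not measurements.
stages = {"build": 12, "test": 4, "review": 8, "deploy": 10}


def throughput(capacities):
    """A serial pipeline can only move as fast as its slowest stage."""
    return min(capacities.values())


def bottleneck(capacities):
    """Return the name of the stage that constrains the whole flow."""
    return min(capacities, key=capacities.get)


baseline = throughput(stages)

# Local optimisation: double the speed of a stage that is not the constraint.
faster_build = {**stages, "build": 24}

# Global optimisation: invest the same effort in the constraint instead.
faster_test = {**stages, "test": 8}

print(bottleneck(stages))        # test
print(baseline)                  # 4
print(throughput(faster_build))  # 4, nothing gained
print(throughput(faster_test))   # 8, the whole pipeline speeds up
```

Making the capacities visible in one place is what allows the constraint to be found at all, which is why "making work visible" comes first.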

Waste and flow work brilliantly together: you are reducing waste, but you are doing it evenly. You are not just fixing some local waste; you are looking at the global performance of the flow. This reduces stress on the system (Muri) and allows everything to flow quicker.

And then, we have the classic Kaizen, which is basically continuous improvement.

Most of the methods are the same; you can use Scrum or Kanban.

Practices too, such as continuous integration and deployment, using configuration management, metrics and monitoring schemes, a toolchain approach to tooling… Even using virtualisation and cloud computing is a common practice used to accelerate change in the modern infrastructure world.

But specific DevOps techniques, such as Infrastructure Automation and Continuous Everything (build, test, deploy, monitor, analyse), are used as part of implementing the above concepts and processes.

The three ways

This is summarised in the three basic principles of DevOps, otherwise known as the “Three Ways”, which outline its values and philosophies:

  • The First Way – This is a set of principles that accelerate the delivery of digital services. The focus is on Continuous Delivery.
  • The Second Way – This is a set of principles that amplify feedback loops. The concept is to create a problem-solving culture, as well as to understand and monitor business metrics.
  • The Third Way – Finally, these patterns cover the concepts of organisational learning and safety culture. They include principles like blameless postmortems, resilience engineering and systems thinking.

Why is DevOps hard?

Adam Jacob, a co-founder of Chef (an infrastructure automation company), defined DevOps as “a cultural and professional movement”.

When we change culture, we fundamentally shift how people respond to a situation.
And changing how people behave is always hard.

Which kind of culture change is it?

It’s a culture of continuous improvement.
Everything is an improvement exercise, and automation is the means of delivery.

Then we have learning, and continuous learning depends on measurement: if you can’t measure, you can’t improve. A successful DevOps implementation will measure everything it can, as often as it can.

And then, finally, we have sharing, the feedback loop. Creating a culture where people share ideas and problems is critical.

This is really the old Deming Cycle: Plan-Do-Check-Act.

DevOps Patterns

There is no One True Way of doing DevOps. But we can see what a successful DevOps culture looks and acts like and how good principles can be applied, and we can describe commonly seen misconceptions and anti-patterns.

High-performing organisations take testing very seriously, all the way through the process, from code to how they deliver the software. The idea is to build quality in by working in small batches, building an automated software pipeline and aiming for full test coverage.

A common pattern is peer review of pull requests before the code is merged into the code base; some organisations make sure that the reviewer is not on the same development team as the author. This is not mandated for every change and is often left to individual discretion: in a high-trust, blameless environment, people are given the trust and authority to decide whether a code review is necessary.

Another standard pattern is trunk-based deploys. In general, you are always committing to and deploying from a main branch (the trunk); the idea is that this lowers complexity while also fitting the small-batch, fast-moving way of working.

One important assumption of trunk-based deploys is that when developers commit their code, it is fully tested and could go to production.

This leads to the pattern “iterate fast”: each feature created or changed is put in the system and could be delivered right to production.

Continuous Delivery is the ability to get changes of all types—including new features, configuration changes, bug fixes and experiments—into production, or into the hands of users, safely and quickly in a sustainable way at any time.

This is achieved by ensuring that code is always in a deployable state, even in the face of teams of thousands of developers making changes on a daily basis. The integration, testing and hardening phases that traditionally followed “dev complete”, as well as code freezes, are completely eliminated.
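
As a rough illustration of what "always in a deployable state" implies, here is a small Python sketch of a fail-fast pipeline. The stages are hypothetical lambdas standing in for real build, test and deploy steps; a real pipeline would call out to actual tools.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[str], bool]]


def run_pipeline(change: str, stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure (fail fast)."""
    log = []
    for name, check in stages:
        passed = check(change)
        log.append(f"{name}: {'ok' if passed else 'FAILED'}")
        if not passed:
            return False, log
    return True, log


# Hypothetical stand-ins for real build, test and deploy steps.
stages: List[Stage] = [
    ("build", lambda change: True),
    ("unit tests", lambda change: "bug" not in change),
    ("deploy to staging", lambda change: True),
]

good = run_pipeline("add login page", stages)
bad = run_pipeline("add search with bug", stages)
print(good)  # (True, ['build: ok', 'unit tests: ok', 'deploy to staging: ok'])
print(bad)   # (False, ['build: ok', 'unit tests: FAILED'])
```

Because every change goes through the same checks on every commit, there is no separate hardening phase left to schedule: a change either reaches the end of the pipeline or is stopped early.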

There are a couple of things that are core to Continuous Delivery.
One is that everything starts in source control: not only the application source but also the meta-definitions. When using products like Docker, for example, the definition files that build the images should be in source control. The code used to build the infrastructure should also go into source control (infrastructure as code).
Another is to automate anything that you possibly can: automate everything that makes sense.
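
The idea behind infrastructure as code can be sketched in a few lines of Python. This is not how any particular tool works; it only assumes that the desired state is declared as data (which can live in source control) and that a program computes the difference from the actual state. Package names and versions are made up.

```python
# Desired state, as it might be declared in a version-controlled file.
# Package names and versions are hypothetical.
desired = {"nginx": "1.24", "python": "3.11", "redis": "7.2"}

# Actual state reported by a (hypothetical) server.
actual = {"nginx": "1.22", "python": "3.11", "memcached": "1.6"}


def plan(desired, actual):
    """Compute the changes needed to move `actual` to `desired`."""
    return {
        "install": sorted(set(desired) - set(actual)),
        "change": sorted(
            name for name in set(desired) & set(actual)
            if desired[name] != actual[name]
        ),
        "remove": sorted(set(actual) - set(desired)),
    }


changes = plan(desired, actual)
print(changes)
# {'install': ['redis'], 'change': ['nginx'], 'remove': ['memcached']}

# Planning against a server that already matches the declaration
# yields no changes, which is what makes repeated runs safe.
print(plan(desired, desired))
# {'install': [], 'change': [], 'remove': []}
```

Since the declaration is plain data, it can be reviewed, versioned and rolled back like any other code.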

Everyone is responsible – this is another principle.
Werner Vogels, the CTO of Amazon, put it as “You build it, you run it”.
The idea is that everybody is responsible for the pipeline. Developers don’t just check in the code and consider themselves done with it; they own it for the life of the service or the feature.

Done means released.
This goes back to shared responsibility: a developer no longer just commits code and says “I’m done”. Done means that the code is actually in production.

A summary of best practices

High-performing organisations:

  • deploy more frequently
  • have shorter lead times (the time to get an idea into production)
  • have fewer failures related to change
  • recover faster (their Mean Time To Recovery is lower)
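
These four measures can be computed from very simple records. The Python sketch below assumes a hypothetical list of deployments, each with a commit time, a deploy time, a failure flag and a restore time; the dates and values are invented purely for illustration.

```python
from datetime import datetime
from statistics import mean

# Hypothetical deployment records: (committed, deployed, failed, restored).
# `restored` is only set when the deployment caused a failure.
deployments = [
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 11), False, None),
    (datetime(2024, 1, 2, 9), datetime(2024, 1, 2, 13), True,
     datetime(2024, 1, 2, 14)),
    (datetime(2024, 1, 3, 9), datetime(2024, 1, 3, 10), False, None),
    (datetime(2024, 1, 4, 9), datetime(2024, 1, 4, 12), False, None),
]
period_days = 4  # length of the observation window

# Lead time: hours from commit to running in production.
lead_times_h = [
    (deployed - committed).total_seconds() / 3600
    for committed, deployed, _, _ in deployments
]

# Recovery time: hours from a failed deployment to restored service.
recoveries_h = [
    (restored - deployed).total_seconds() / 3600
    for _, deployed, failed, restored in deployments
    if failed
]

metrics = {
    "deploys_per_day": len(deployments) / period_days,
    "mean_lead_time_h": mean(lead_times_h),
    "change_failure_rate": len(recoveries_h) / len(deployments),
    # mean() raises on an empty list; this sample has one failure.
    "mttr_h": mean(recoveries_h),
}
print(metrics)
```

With this sample data the result is one deploy per day, a mean lead time of 2.5 hours, a change failure rate of 0.25 and a mean time to recovery of 1 hour.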

How they do it:

  • make work visible
  • manage Work In Progress (WIP) and Flow
  • create high trust work environments
  • learn and embrace failure (e.g., Chaos Monkey)
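
Why managing Work In Progress matters can be shown with Little's Law, which relates average lead time, WIP and throughput in a stable system. The sketch below is a simplification with made-up numbers, not a model of any real team.

```python
def average_lead_time(wip: float, throughput_per_day: float) -> float:
    """Little's Law: average lead time = work in progress / throughput."""
    return wip / throughput_per_day


# Hypothetical team that finishes 5 items per day.
print(average_lead_time(wip=30, throughput_per_day=5))  # 6.0 (days)
print(average_lead_time(wip=10, throughput_per_day=5))  # 2.0 (days)
```

With throughput unchanged, cutting the number of items in flight from 30 to 10 cuts the average wait from six days to two, which is the reasoning behind WIP limits.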

They use the CAMS (Culture, Automation, Measurement, Sharing) principles:

  • build quality in
  • work in small batches
  • automate repeatable tasks
  • pursue continuous improvements
  • everyone is responsible

DevOps Anti-patterns

Manual takes too long

The worst anti-patterns are testing that takes too long and anything manual: manual regression tests, manual acceptance tests, checklists and long lead times, where it just takes forever to get a change into the system.
Remember: if a test is not automated, it doesn’t exist.

The solution is to focus on tools such as infrastructure automation and continuous integration. In both of these cases, automation is a result of improved technology.
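
As a small example of turning a manual check into an automated one, here is a Python sketch using the standard `unittest` module. The `apply_discount` function is a hypothetical business rule, included only so that there is something to test.

```python
import unittest


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical business rule used only to have something to test."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTest(unittest.TestCase):
    def test_regular_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_invalid_percent_is_rejected(self):
        with self.assertRaises(ValueError):
            apply_discount(200.0, 120)


# Run the tests programmatically, as a pipeline step might do.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ApplyDiscountTest)
result = unittest.TextTestRunner().run(suite)
print(result.wasSuccessful())
```

Once a check like this exists, a continuous integration server can run it on every commit, so the regression check no longer depends on someone remembering a checklist.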

Silo culture. No sharing, no feedback

A blame culture is one that tends toward blaming and punishing people or teams when mistakes are made.
In a blameless culture or a learning organisation, a human error is seen as a starting point rather than an ending one, sparking a discussion on the context surrounding the decision and why it made sense at the time.

Outside of incident response, a culture of blame that calls people out (e.g., which developers or teams introduced the most bugs, or which team closed the fewest tickets) will contribute to an atmosphere of hostility between coworkers as everyone tries to avoid blame.

While this sort of self-preservation is understandable in such an environment, it doesn’t lend itself well to a culture of openness and collaboration. People will begin withholding information in an effort to keep themselves from being blamed.

An organisational silo describes the mentality of teams that do not share their knowledge with other teams in the same company. Instead of having common goals or responsibilities, silo-ed teams have very distinct and segregated roles.

Combined with a blameful culture, this can lead to information hoarding as a form of job security, difficulty or slowness in completing work that involves multiple teams, and decreases in morale as teams or silos start to see each other as adversaries.

Cross-functional teams are not enough, and just because a team serves only one function, it is not necessarily a silo. Silos come from a lack of communication and collaboration between teams, not simply from a separation of duties.

Wrong culture

Another big anti-pattern: measuring and automating are important, but without the right culture, behaviour or direction they will just make things go really fast in the wrong direction.

You need to be careful, as there are many small anti-patterns that prevent the right culture from being adopted.

Timeline

One of the questions often asked about the cultural transformation is how long it will take. The problem with this question is that it assumes that DevOps is an easily definable or measurable state and once that state is reached then the work is done.

In reality, DevOps is an ongoing process; it is the journey, not the destination.

Because so much of DevOps is cultural, it is harder to predict how long some of those changes will take: how long will it take people to break old silo-ed habits and replace them with new collaborative ones? This cannot be easily predicted.

Roles

Creating a team called DevOps is neither necessary nor sufficient for creating a DevOps culture. If the development and operations teams cannot communicate with each other, an additional team is likely to cause more communication issues, not fewer.

Moreover, the concept of a DevOps engineer (part admin, part developer), in addition to being totally unrealistic, doesn’t scale well.

Certifications

How do you certify culture? There is no 60-minute exam that can certify how effectively you communicate with other people.

DevOps doesn’t have required technology or one-size-fits-all solutions. Certification exams are testing knowledge where there are clear right or wrong answers, which DevOps generally does not have.

Minor anti-patterns

Other, smaller anti-patterns are found in organisations where the environments are completely different, from the development machines to the test harness systems and production, running different infrastructure and different operating systems across the board.

One way to improve that is containerisation, with tools like Docker.

A current trend is immutable infrastructure, where everything (the system, the application and the middleware) is packaged in a binary that runs exactly the same, bit for bit, on the development machines, in integration and in production.
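
One simple way to check the "bit for bit" claim is to compare content hashes of the artifact in each environment. The Python sketch below uses the standard `hashlib` module; the artifact contents and environment names are invented, and in practice the bytes would come from the actual image or package.

```python
import hashlib


def digest(artifact: bytes) -> str:
    """Content hash that identifies an artifact bit for bit."""
    return hashlib.sha256(artifact).hexdigest()


# Hypothetical artifact bytes as found in each environment.
built = b"app-1.4.2 + middleware + base system"
environments = {
    "development": built,
    "integration": built,
    "production": b"app-1.4.2 + middleware + patched base system",
}

expected = digest(built)
drifted = [
    name for name, blob in environments.items()
    if digest(blob) != expected
]
print(drifted)  # ['production']
```

Any environment whose digest differs from the one produced at build time is no longer running the same artifact, however small the difference.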

A final word

Just because a company presents its successful DevOps strategy does not mean that the same processes will be the right way of working in every environment. Cargo-culting processes and tools into an environment can lead to the creation of additional silos and resistance to change.

DevOps encourages critical thinking about processes, tools, and practices; being a learning organisation requires questioning and iterating on processes, not accepting things as the “one true way” or the way that things have always been done.

One should also beware of people – including myself – saying that anyone who isn’t following their example is doing DevOps “the wrong way”.
