Why DevOps might be the wrong term

We all know by now what DevOps is. Or at least we think we do. It cannot be said that it is one specific thing; it is more of a union of different things, including people. And that is where it gets a bit complex...

The whole idea of DevOps was to demolish the barriers between Developers (the Dev part) and Operations (the Ops part). In most cases, it ended up relabeling the Ops people as DevOps. And the barriers remained.

This article will not delve into the endless battle over what DevOps is, what it represents, or who should do it. We should all be aware of what it represents and "do" DevOps. But that's not the point, at least not in this post.

In this article, I want to concentrate on the reason the barriers exist in the first place, and on the fact that the ability to fail, on all sides of the equation, is an important, if not the most important, aspect of creating something. Whatever we decide to call that something.

I have written about failure before, but from a rock-climbing perspective, and about how to apply it in life. This time I want to concentrate on the importance of failure in projects, and on how the fear of failing might be connected to the barriers between project teams.

If we don't try, we will not learn. If we fail while trying, we mustn't beat ourselves up over it, but shift our perspective, learn from the attempt, and try again. If we are afraid of failing, well, this is where the problem begins.

The foundation of barriers

These barriers go beyond the physical, real world. One side could say something like: "We finished our part, tested it, and it works. Now, if it fails, it's no longer our fault." The other side could respond with: "If it fails to run, we know it's not on our side, so it must be on their side."

What do both sides have in common, besides being human and prone to mistakes? Both are shifting the blame: if it fails, it's not us, it's them. And shifting the blame is only one expression of the fear of failure that is behind it all: if something fails, let's blame the other side, and that way we will not get punished.

This could be the initial spark that built the walls between teams, the walls of confusion. The walls we all throw the blame over. The same spark that is still smoldering, keeping those same walls alive.

And why are we afraid to fail? The fear of failure comes, first of all, from the fear of getting punished. That we are somehow going to be punished if something doesn't work as expected. That the failure might end with us losing our jobs.

Shaking the foundation

Instead of concentrating on tearing down the walls, let's attack what possibly caused them in the first place.

One way to tackle this fear, and to start shaking the foundations of the barriers, is to accept failure as a normal part of the process. We don't want to shift the blame to others and throw it over the wall. We need, as a team, to be aware of the failure, own it together, and work out the best way to learn from it.

At the risk of sounding corny, the catchphrase "There is no I in team" should be extended to failure as well: when we fail, we fail together.

How can we do that?

Easy. If failure is something we don't want to see in a production environment, well, too bad, because, at some point, things will fail. Be it the application, the infrastructure behind it, or the whole system.

So, instead of thinking of failure as an exception, we should consider it a normal part of the process, and count on it from the start. We should build our systems with failure in mind. When they fail (when, not if), instead of looking for someone to blame, let's have mechanisms in place that help us recover quickly and learn from the failure for future iterations.
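
To make that a bit more concrete, here is a minimal sketch in Python of what one such mechanism could look like. It is only an illustration, not a recipe: the names FailureLog and call_with_recovery, the retry count, and the delays are all made up for this example. The idea is simply that the failure is expected, so the code retries, falls back to something usable, and keeps a record of what went wrong (not of who did it) for the team to learn from later.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, TypeVar

T = TypeVar("T")


@dataclass
class FailureLog:
    """Hypothetical helper: records what failed, not who to blame."""
    entries: List[str] = field(default_factory=list)

    def record(self, attempt: int, error: Exception) -> None:
        self.entries.append(f"attempt {attempt}: {type(error).__name__}: {error}")


def call_with_recovery(
    operation: Callable[[], T],
    fallback: Callable[[], T],
    log: FailureLog,
    max_attempts: int = 3,      # assumed value, tune for your system
    base_delay: float = 0.1,    # assumed value, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry with exponential backoff, then degrade to the fallback."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as error:
            # Failure is expected: record it for a later review and move on.
            log.record(attempt, error)
            if attempt < max_attempts:
                sleep(base_delay * 2 ** (attempt - 1))
    # Every attempt failed: return something usable instead of crashing.
    return fallback()


# Usage: a made-up operation that fails twice before it succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("backend unavailable")
    return "fresh data"


log = FailureLog()
result = call_with_recovery(flaky, lambda: "cached data", log, base_delay=0.01)
```

After the run, log.entries holds one line per failed attempt. That is the raw material for a review that asks "what failed and how do we handle it better next time?" instead of "whose fault was it?".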

Yes, you might think: it's easier said than done, and what does this guy know about it in the first place? Well, to be honest, it is easier said than done, but that shouldn't be a reason not to do it. And what do I know about it? Not much, but whenever I or someone else tried shifting the blame, it didn't end well.

Thinking of failure as a normal part of the process can be one step toward tearing down the barriers. Barriers that are still there, even if we don't see them (or don't want to). Or even if we call them something else.

Instead of DevOps, we should have called it DevOops! With the accent on the oops! And considered that oops! an expected part of the process.

And what if I'm wrong?

I might be. But if you end up considering failure as a normal part of the process and you build around that, I would count that as a win for everyone nevertheless.