DevOps is a concept that we’ve all started coming across more and more in the last few months. Critically it’s taken a bit of a leap just lately because people have started to: (a) define it formally and (b) actually agree to a decent extent on what the definition is.
So, for what it’s worth, Wikipedia talks of DevOps as: “A culture, movement or practice that emphasizes the collaboration and communication of both software developers and other information-technology (IT) professionals while automating the process of software delivery and infrastructure changes.”
Webopedia, for its part, calls DevOps: “A type of agile relationship between Development and IT Operations” whose aim is “to change and improve the relationship by advocating better communication and collaboration between the two business units”.
Okay, all well and good. But is it actually a proper thing or is it just a relatively young term that people throw about without either understanding it or meaning it?
Well, I’ll start by saying that I don’t really like the second half of Wikipedia’s definition: I fail to see what specific relationship DevOps has with automation. I do, however, get a warm feeling from the fact that these two definitions (which happened to be the first two I picked at random) both include the words communication and collaboration. So perhaps there’s something in the concept.
Over the last few years I’ve worked in both halves of the DevOps definition, so I’m fortunate in having seen first-hand what problems exist between the development world and the operational world. The operations guys eye the developers with an ill-concealed layer of suspicion, expecting them to try to sneak their new products into production without proper handover or support.
In return the developers are trying to get the fruit of their labours into production: they’ve spent thousands of person-hours putting it together because the business case says it’s going to improve productivity or profitability, yet all they get from the ops team is a demand for more testing, more support, more documentation, more of everything.
Is this fair? Much of the time, yes. Have I come across cowboy developers who’ve lashed something together and fudged it into production? Of course I have, and I bet you have too. Have I come across ops managers who will find any excuse they can to bat off the developers so that they don’t have something new to support? Yup, no doubt about it.
Interestingly, a single word sums up my frustration as an ops manager when developers come to me to make something live: Agile.
Agile is, I reckon, a good thing. If you can produce something in bursts and get the benefit of the first while you’re building the second, and from the second while you’re building the third, that’s a great thing and much better than waiting forever for a single-hit delivery of a socking big monolithic monster. Not least because the world has probably moved on while you were building the monster and nobody wants it any more. Trouble is, Agile is far too frequently (and wrongly) used to describe poorly designed, poorly implemented crap that doesn’t satisfy the requirement because nobody bothered to define the requirement in the first place.
What about as a developer? Well, my frustration with the ops guys is that while we’re producing new software and systems that do new and exciting things, they have an overwhelming desire to try to fit it into their operations model with their fleet of processes and policies, thereby thwarting our attempts to do stuff differently and innovatively.
Both of these scenarios are understandable, of course. The ops guys have service levels to abide by and a finite amount of staff and time to spend on learning new systems, so you can’t blame them for resisting change. Similarly the developers will generally underestimate the amount of support and training required for a new product because they know it inside out and have become blind to how complex it actually is.
But what actually is the barrier to taking something live? Errr… none of the above, actually – they’re just symptoms of the problem. The problem usually boils down to a lack of common sense.
I mentioned a while back that the business case states that the new development is “going to improve productivity or profitability.” But what is actually in the business case? Chances are it’s all about the cost of the tools to build the product, the staff costs for the developers and testers, and so on.
But unless it’s a big new product (which makes it impossible to miss the obvious) how many business cases include six weeks of contractor coverage for the IT service desk while the team go on training courses for the new product? Or the budget for overtime to cover the retainer because the new development is the first you’ve made that needs to be supported out of hours?
Or the competitive bids from third-party support agencies for the back-end database that you’ve had to implement with something non-standard because it has a unique, essential feature?
Business cases for new developments aren’t written by ops managers. Which means that the post-deployment bit doesn’t usually get the focus it deserves. Which in turn frustrates the hell out of the ops guys when yet again they have to work miracles with bugger all resource to take on the new system.
It’s a complete no-brainer, then, to suggest that to make a successful new development you need to involve the operations team from the start. Because frankly the only people who know operations properly are the operations team. Look at the ITIL framework – it starts with strategy then works through design and transition before arriving at operations (with continuous improvement catching the result at the end). Designers seldom have experience in operations, but a critical part of the design process is to understand what obstacles stand in the way of the new product working properly – everything from available computing resource to likely failure points – and how to design around those obstacles. And the way to do that as early as possible in the project is, fairly obviously, to embrace the people who know it all and live it daily.
If you have the ops team on board as part of the project, not only do you get the benefit of their knowledge while you’re building the product but they can also benefit from the developers looking in on ops from outside. It’s easy to be blinkered and have the “but we always do it this way” mentality when you have standard operational processes; having non-ops people in there means the status quo gets questioned and, in some cases, changed.
Think of the case where the ops team says: “We don’t have the resource to support that” and the dev team says: “So why not outsource it then?” If you ask that at deployment time, you’re stuffed. If you ask it at business case time, you budget for it and it’s taken into account from the beginning: occasionally the extra cost makes the project unviable, but usually it doesn’t.
Involving ops staff from the start also brings the final benefit of the concept of DevOps: working together makes it easier to stamp out the obstacles. One of the most eye-opening things you can experience is the first time you sit in a project approval meeting with the developers and the ops guys pulling on the same end of the metaphorical rope.
That’s partly because it’s amusing to see the surprise on everyone’s face when the ops manager smiles and says: “Yup, we’re happy with the support model and they’re including X, Y and Z in the design so it’ll interface into our monitoring suite properly.”
But it’s also because, for the first time ever, the business case is realistic: it includes all the ops and support costs and is seen to have been thought out properly.
It also inspires confidence in the approval committee. For example, I was once asked by a senior manager why I, as the ops lead, was happy to accept a service into production despite the third-party support contract not yet being quite finalised. Easy answer: the developers had kept me informed all along and we’d all agreed (including the business owner of the service) that the benefit of going live immediately was well worth the risk of accepting a best-efforts support model for a few weeks.
One thing you learn over the years is that you don’t have to have the best solution every time – what matters is that you can prove that you’ve considered the alternatives and made a positive decision to go with the chosen idea.
As a word, DevOps is at that tricky age where it has got a spotty face and braces on its teeth (not to mention an embarrassing camel-case infection) and sort of shuffles about looking awkward and trying not to make eye contact.
But is it, as a concept, a proper thing? Heck, yes. ®