We'll see | Matt Zimmerman

a potpourri of mirth and madness

DevOps and Cloud


I first heard about DevOps from Lindsay Holmwood at linux.conf.au 2010. Since then, I’ve been following the movement with interest. It seems to be about cross-functional involvement in software teams, specifically between software development and system administration (or operations). In many organizations, especially SaaS shops, these two groups are placed in opposition to each other: developers are driven to deliver new features to users, while system administrators are held accountable for the operation of the service. In the best case, they maintain a healthy balance by pushing in opposite directions, but more typically, they resent each other for getting in the way, as a result of this dichotomy:

                        | Development                  | Operations
is responsible for…     | creating products            | offering services
is measured on…         | delivery of new features     | high reliability
optimizes by…           | increasing velocity          | controlling change
and so is perceived as… | reckless and irresponsible   | obstructing progress

Of course, both functions are essential to a viable service, and so DevOps aims to replace this opposition with cooperation. By removing this friction from the organization, we hope to improve efficiency, lower costs, and generally get more work done.

So, DevOps promotes the formation of cross-functional teams, where individuals still take on specialist “development” or “operations” roles, but work together toward the common goal of delivering a great experience to users. By working as teammates, rather than passing work “over the wall”, they can both contribute to development, deployment and maintenance according to their skills and expertise. The team becomes a “devops” team, and is responsible for the entire product life cycle. Particular tasks may be handled by specialists, but when there’s a problem, it’s the team’s problem.

Some take it a step further, and feel that what’s needed is to combine the two disciplines, so that individuals contribute in both ways. Rather than thinking of themselves as “developers” or “sysadmins”, these folks consider themselves “devops”. They work to become proficient in both roles, and to synthesize new ways of working by drawing on both types of skills and experience. A common crossover activity is the development of sophisticated tools for automating deployment, monitoring, capacity management and failure resolution.

DevOps meets Cloud

Like DevOps, cloud is not a specific technology or method, but a reorganization of the model (as I’ve written previously). It’s about breaking down the problem in a different way, splitting and merging its parts, and creating a new representation which doesn’t correspond piece-for-piece to the old one.

DevOps drives cloud adoption because cloud offers a richer toolkit for the way DevOps teams work: fast, flexible, efficient. Tools like Amazon EC2 and Google App Engine solve the right sorts of problems. Cloud also drives DevOps because it calls into question the traditional way of organizing software teams. A development/operations division just doesn’t “fit” cloud as well as a DevOps model.

Deployment is a classic duty of system administrators. In many organizations, only the IT department can implement changes in the production environment. Reaping the benefits of an IaaS environment requires deploying through an API, and therefore deployment requires development. While it is already common practice for system administrators to develop tools for automating deployment, and tools like Puppet and Chef are gaining momentum, IaaS makes this a necessity, and raises the bar in terms of sophistication. Doing this well requires skills and knowledge from both sides of the “fence” between development and operations, and can accelerate development as well as promote stability in production.
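To make "deployment requires development" a little more concrete, here is a minimal sketch of a deployment step written as code. It does not use any real provider's API: `FakeCloud`, `launch`, `terminate` and `deploy` are all hypothetical names, and the "cloud" is just an in-memory dictionary standing in for an IaaS service. The point is only that the deployment logic is ordinary software, which can be reviewed, tested and run repeatedly.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeCloud:
    """In-memory stand-in for an IaaS API (hypothetical, for illustration only)."""
    instances: dict = field(default_factory=dict)  # instance id -> role
    _ids: count = field(default_factory=lambda: count(1))

    def launch(self, role: str) -> str:
        instance_id = f"i-{next(self._ids)}"
        self.instances[instance_id] = role
        return instance_id

    def terminate(self, instance_id: str) -> None:
        del self.instances[instance_id]


def deploy(cloud: FakeCloud, desired: dict) -> list:
    """Make the running instances match the desired count for each listed role.

    Returns the list of actions taken, so running it twice with the same
    input does nothing the second time. Roles not listed are left untouched.
    """
    actions = []
    for role, wanted in sorted(desired.items()):
        running = sorted(i for i, r in cloud.instances.items() if r == role)
        # Launch whatever is missing (range() of a negative number is empty).
        for _ in range(wanted - len(running)):
            actions.append(("launch", cloud.launch(role)))
        # Terminate any surplus beyond the desired count.
        for instance_id in running[wanted:]:
            cloud.terminate(instance_id)
            actions.append(("terminate", instance_id))
    return actions
```

Calling `deploy(cloud, {"web": 2, "db": 1})` on an empty `FakeCloud` launches three instances; calling it again with the same arguments returns an empty list. That repeatability is the kind of property that operations experience tends to insist on and development skills make easy to test.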

This is exemplified by infrastructure service providers like Amazon Web Services, where customers pay by the hour for “black box” access to computing resources. How those resources are provisioned and maintained is entirely Amazon’s problem, while its customers must decide how to deploy and manage their applications within Amazon’s IaaS framework. In this scenario, some operations work has been explicitly outsourced to Amazon, but IaaS is not a substitute for system administration. Deployment, monitoring, failure recovery, performance management, OS maintenance, system configuration, and more are still needed. A development team which is lacking the experience or capacity for this type of work cannot simply “switch” to an IaaS model and expect these needs to be taken care of by their service provider.

With platform service providers, the boundaries are different. Developers who build their application on an appropriate platform can outsource most of the management of the production environment to their service provider. The operating system is abstracted away, and its maintenance becomes someone else’s problem. For applications which can be built with the available facilities, many organizations will find this a very attractive option. The customers of these services may be traditional developers, who have no need for operations expertise. PaaS providers, though, will require deep expertise in both disciplines in order to build and improve their platform and services, and will likely benefit from a DevOps approach.

Technical architecture draws on both development and operations expertise, because design goals like performance and robustness are affected by all layers of the stack, from hardware, power and cooling all the way up to application code. DevOps itself promotes greater collaboration on architecture, by involving experts in both disciplines, but cloud is a great catalyst because cloud architecture can be described in code. Rather than talking to each other about their respective parts of the system, they can work together on the whole system at once. Developers, sysadmins and hybrids can all contribute to a unified source tree, containing both application code and a description of the production environment: how many virtual servers to deploy, their specifications, which components run on which servers, how they are configured, and so on. In this way, system and network architecture can evolve in lockstep with application architecture.
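As a rough illustration of a production environment described in code, the sketch below keeps the server layout in the same source tree as the application and checks it like any other code. Everything here is an assumption made for the example: the `ServerGroup` fields, the group and component names, and the `validate` and `total_memory_gb` helpers are hypothetical, not the format of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerGroup:
    """One kind of virtual server in the environment (hypothetical schema)."""
    name: str
    count: int          # how many virtual servers to deploy
    memory_gb: int      # their specification
    components: tuple   # which components run on these servers


# The environment description, versioned alongside the application code.
ENVIRONMENT = (
    ServerGroup("web", count=3, memory_gb=2, components=("frontend", "app")),
    ServerGroup("data", count=1, memory_gb=8, components=("database",)),
)


def validate(groups, required_components):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    deployed = {c for g in groups for c in g.components}
    for missing in sorted(set(required_components) - deployed):
        problems.append(
            f"component {missing!r} is not assigned to any server group"
        )
    for g in groups:
        if g.count < 1:
            problems.append(f"group {g.name!r} has no servers")
    return problems


def total_memory_gb(groups):
    """Total memory across all servers, e.g. for a quick cost estimate."""
    return sum(g.count * g.memory_gb for g in groups)
```

With a description like this, a developer who adds a new component and a sysadmin who resizes a server group are both editing the same file, and a check such as `validate(ENVIRONMENT, ["app", "cache"])` can flag a component that nobody has placed on a server before anything is deployed.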

Cloudy promises such as dynamic scaling and fault tolerance call for a DevOps approach in order to be realized in a real-world scenario. These systems involve dynamically manipulating production infrastructure in response to changing conditions, and the application must adapt to these changes. Whether this takes the form of an active, intelligent response or a passive crash-only approach, development and operational considerations need to be aligned.
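A dynamic scaling decision can be sketched in a few lines. This is a deliberately simple proportional rule, not how any particular provider's autoscaling works; the function name, the 60% target and the minimum and maximum bounds are assumptions chosen for the example. Load is expressed as an integer percentage so the arithmetic stays exact.

```python
def desired_servers(current, load_pct, target_pct=60, minimum=2, maximum=20):
    """Suggest a server count that brings average load near the target.

    current:    number of servers running now
    load_pct:   average load per server, as an integer percentage
    target_pct: the load per server we would like to see (assumed value)
    """
    total_load = current * load_pct
    # Ceiling division on integers: enough servers to carry the total load
    # at no more than target_pct each.
    needed = -(-total_load // target_pct)
    # Never go below the floor needed for fault tolerance, nor above budget.
    return max(minimum, min(maximum, needed))
```

For example, `desired_servers(4, 90)` suggests 6 servers, while `desired_servers(4, 15)` falls back to the minimum of 2. The operational inputs (targets, floors, budgets) and the application's ability to cope when servers appear and disappear have to be designed together, which is exactly where the two disciplines meet.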

So what?

DevOps and cloud will continue to reinforce each other and gain momentum. Both individuals and organizations will need to adapt in order to take advantage of the opportunities provided by these new models. Because they’re complementary, it makes sense to adopt them together, so those with expertise in both will be at an advantage.


Written by Matt Zimmerman

June 8, 2010 at 10:28

11 Responses

  1. Thanks to Adam Fletcher for his feedback on the first draft of this post

    Matt Zimmerman

    June 8, 2010 at 10:55

  2. @Matt:
    The fun part about DevOps is that this system has already been working for years.
    Yes, in the past there was a strict line between those two departments, but ever since Dev and Ops moved into large-scale enterprises with a lot of hardware, network and application infrastructure, Development and Operations have worked closely together. Operations tells Development about the infrastructure, and how Development needs to design its software with regard to hardware and network infrastructure.

    This workflow is old school; only now has it gained momentum and become a topic of public discussion.




    June 8, 2010 at 12:22

    • The idea of working together is not novel, but there is a meaningful difference between “I talk to those people” on the one hand and “we are on a team together” or “we are the same person” on the other. The latter two are what I see in DevOps.

      Matt Zimmerman

      June 8, 2010 at 13:13

      • Hi Matt,

        Right, since the arrival of agile software development and agile system administration (Scrum etc.), Development and Ops can only work as a full-fledged team.
        Ops system architects and Dev application architects need to work closely together.

        But as I said, it’s just old school in new words ;)




        June 8, 2010 at 14:35

  3. This flickr deck:

    which was on High Scalability a few months back, covers some of the same points, and also has some concrete suggestions for moving two split, distrustful teams in that direction. In a previous workplace, which had this problem to an extreme degree, this helped us at least get a few people from the opposing team (in that case, dev) working more closely with us.

    Aaron Z

    June 9, 2010 at 19:29

    • Yes, that deck is a good introduction to devops and stands on its own without a speaker.

      In this post, I tried to give just enough of an overview to make my point about the relationship between devops and cloud computing. There’s a lot more to say about both topics.

      Matt Zimmerman

      June 9, 2010 at 20:16

  4. […] Matt Zimmerman: DevOps and Cloud Really like Matt's point of view here. He hits the nail pretty much on the head, and I've tried to promote a lot of the values he touches on here. […]

  5. “Cloud also drives DevOps because it calls into question the traditional way of organizing software teams.”

    This is a very important point.

    I agree that, provided you have an IaaS/PaaS setup, you are able to use a DevOps cross-functional team.

    However, I am afraid it requires relieving the DevOps team from duties “chaotic by nature”, like those related to major issues with availability, load, storage… (at least if predictability is important for you)

    During the last fifteen years, we have struggled to control the chaos inherent to software development. Well-managed iterative lifecycles isolate the team from external distractions (feature requests, changes on existing features, non-critical bugs…). These distractions are delayed until the ongoing iteration is completed, so the team stays focused, and the actual outcome is closer to the plan.

    The problem is: How do we fit typical sysadmin “firefights” here? They have the potential to ruin your velocity, and if key people are involved, even delay entire releases…

    In a DevOps scenario, either “the Cloud services” handle these issues, or predictability will be sacrificed in order to keep teams agile and cross-functional.

    Well… we’ll see :)


    June 13, 2010 at 16:02

  6. […] Amazon provides. These developer and operations roles (sometimes referred to collectively as “DevOps“) may be filled by technical folks internal to an organization or may be from third parties. […]

  7. […] Per Day: Dev and Ops Cooperation at Flickr Also see Matt Zimmerman’s excellent post on Devops and Cloud. […]
