DevOps and Cloud
I first heard about DevOps from Lindsay Holmwood at linux.conf.au 2010. Since then, I’ve been following the movement with interest. It seems to be about cross-functional involvement in software teams, specifically between software development and system administration (or operations). In many organizations, especially SaaS shops, these two groups are placed in opposition to each other: developers are driven to deliver new features to users, while system administrators are held accountable for the operation of the service. In the best case, they maintain a healthy balance by pushing in opposite directions, but more typically, they resent each other for getting in the way, as a result of this dichotomy:
|is responsible for…||creating products||offering services|
|is measured on…||delivery of new features||high reliability|
|optimizes by…||increasing velocity||controlling change|
|and so is perceived as…||reckless and irresponsible||obstructing progress|
Of course, both functions are essential to a viable service, and so DevOps aims to replace this opposition with cooperation. By removing this friction from the organization, we hope to improve efficiency, lower costs, and generally get more work done.
So, DevOps promotes the formation of cross-functional teams, where individuals still take on specialist “development” or “operations” roles, but work together toward the common goal of delivering a great experience to users. By working as teammates, rather than passing work “over the wall”, they can both contribute to development, deployment and maintenance according to their skills and expertise. The team becomes a “devops” team, and is responsible for the entire product life cycle. Particular tasks may be handled by specialists, but when there’s a problem, it’s the team’s problem.
Some take it a step further, and feel that what’s needed is to combine the two disciplines, so that individuals contribute in both ways. Rather than thinking of themselves as “developers” or “sysadmins”, these folks consider themselves “devops”. They work to become proficient in both roles, and to synthesize new ways of working by drawing on both types of skills and experience. A common crossover activity is the development of sophisticated tools for automating deployment, monitoring, capacity management and failure resolution.
DevOps meets Cloud
Like DevOps, cloud is not a specific technology or method, but a reorganization of the model (as I’ve written previously). It’s about breaking down the problem in a different way, splitting and merging its parts, and creating a new representation which doesn’t correspond piece-for-piece to the old one.
DevOps drives cloud because it offers a richer toolkit for the way they work: fast, flexible, efficient. Tools like Amazon EC2 and Google App Engine solve the right sorts of problems. Cloud also drives DevOps because it calls into question the traditional way of organizing software teams. A development/operations division just doesn’t “fit” cloud as well as a DevOps model.
Deployment is a classic duty of system administrators. In many organizations, only the IT department can implement changes in the production environment. Reaping the benefits of an IaaS environment requires deploying through an API, and therefore deployment requires development. While it is already common practice for system administrators to develop tools for automating deployment, and tools like Puppet and Chef are gaining momentum, IaaS makes this a necessity, and raises the bar in terms of sophistication. Doing this well requires skills and knowledge from both sides of the “fence” between development and operations, and can accelerate development as well as promote stability in production.
This is exemplified by infrastructure service providers like Amazon Web Services, where customers pay by the hour for “black box” access to computing resources. How those resources are provisioned and maintained is entirely Amazon’s problem, while its customers must decide how to deploy and manage their applications within Amazon’s IaaS framework. In this scenario, some operations work has been explicitly outsourced to Amazon, but IaaS is not a substitute for system administration. Deployment, monitoring, failure recovery, performance management, OS maintenance, system configuration, and more are still needed. A development team which is lacking the experience or capacity for this type of work cannot simply “switch” to an IaaS model and expect these needs to be taken care of by their service provider.
With platform service providers, the boundaries are different. Developers, if they build their application on the appropriate platform, can effectively outsource (mostly) the management of the entire production environment to their service provider. The operating system is abstracted away, and its maintenance can be someone else’s problem. For applications which can be built with the available facilities, this will be a very attractive option for many organizations. The customers of these services may be traditional developers, who have no need for operations expertise. PaaS providers, though, will require deep expertise in both disciplines in order to build and improve their platform and services, and will likely benefit from a DevOps approach.
Technical architecture draws on both development and operations expertise, because design goals like performance and robustness are affected by all layers of the stack, from hardware, power and cooling all the way up to application code. DevOps itself promotes greater collaboration on architecture, by involving experts in both disciplines, but cloud is a great catalyst because cloud architecture can be described in code. Rather than talking to each other about their respective parts of the system, they can work together on the whole system at once. Developers, sysadmins and hybrids can all contribute to a unified source tree, containing both application code and a description of the production environment: how many virtual servers to deploy, their specifications, which components run on which servers, how they are configured, and so on. In this way, system and network architecture can evolve in lockstep with application architecture.
Cloudy promises such as dynamic scaling and fault tolerance call for a DevOps approach in order to be realized in a real-world scenario. These systems involve dynamically manipulating production infrastructure in response to changing conditions, and the application must adapt to these changes. Whether this takes the form of an active, intelligent response or a passive crash-only approach, development and operational considerations need to be aligned.
DevOps and cloud will continue to reinforce each other and gain momentum. Both individuals and organizations will need to adapt in order to take advantage of the opportunities provided by these new models. Because they’re complementary, it makes sense to adopt them together, so those with expertise in both will be at an advantage.