Posts Tagged ‘Software quality’
We’ve packaged all of the free software…what now?
Today, virtually all of the free software available can be found in packaged form in distributions like Debian and Ubuntu. Users of these distributions have access to a library of thousands of applications, ranging from trivial to highly sophisticated software systems. Developers can find a vast array of programming languages, tools and libraries for constructing new applications.
This is possible because we have a mature system for turning free software components into standardized modules (packages). Some software is more difficult to package and maintain, and I’m occasionally surprised to find something very useful which isn’t packaged yet, but in general, the software I want is packaged and ready before I realize I need it. Even the “long tail” of niche software is generally packaged very effectively.
Thanks to coherent standards, sophisticated management tools, and the principles of software freedom, these packages can be mixed and matched to create complete software stacks for a wide range of devices, from netbooks to supercomputing clusters. These stacks are tightly integrated, and can be tested, released, maintained and upgraded as a unit. The Debian system is unparalleled for this purpose, which is why Ubuntu is based on it. The vision of a free software operating system which is highly modular and customizable has been achieved.
Rough edges
This is a momentous achievement, and the Debian packaging system fulfills its intended purpose very well. However, there are a number of areas where it introduces friction, because the package model doesn’t quite fit some new problems. Most of these are becoming more common over time as technology evolves and changes shape.
- Embedded systems need to be pared down to the essentials to minimize storage, distribution, computation and maintenance costs. Standardized packaging introduces excessive code, data and interdependency which make the system larger than necessary. Tight integration makes it difficult to bootstrap the system from scratch for custom hardware. Projects like Embedded Debian aim to adapt the Debian system to be more suitable for use in these environments, with varying degrees of success. Meanwhile, smart phones will soon become the most common type of computer globally.
- Data, in contrast to software, has simple requirements. It just needs to be up to date and accessible to programs. Packaging and distributing it through the standardized packaging process is awkward, doesn’t offer tangible benefits, and introduces overhead. There have been extensive debates in Debian about how to handle large data sets. Meanwhile, this problem is becoming increasingly important as data science catalyzes a new wave of applications.
- Client/server and other types of distributed applications are notoriously tricky to package. The packaging system works within the context of a single OS instance, and so relationships which span multiple OS instances (e.g. a server application which depends on a database running on another server) are not straightforward. Meanwhile, the web has become a first-class application development platform, and this kind of interdependency is extremely common on both clients and servers.
- Cross-platform applications such as Firefox, Chromium and OpenOffice.org have long struggled with packaging. In order to be portable, they tend to bundle the components they depend on, such as libraries. Packagers strive for normalization, and want these applications to use the packaged versions of these libraries instead. Application developers build, test and ship one set of dependencies, but their users receive a different stack when they use the packaged version of the application. Developers on both sides are in constant tension as they expect their configuration to be the canonical one, and want it to be tightly integrated. Cross-platform application developers want to provide their own, application-specific cross-platform update mechanism, while distributions want to use the same mechanism for all their components.
- Virtual appliances aim to combine application and operating system into a portable bundle. While a modular OS is definitely called for, appliances face some of the same problems as embedded systems as they need to be minimized. Furthermore, the appliance becomes a component in itself, and requires metadata, distribution mechanisms and so on. If someone wants to “install” a virtual appliance, how should that work? Packaging them up as .debs doesn’t make much sense for the same reasons that apply to large data sets. I haven’t seen virtual appliances really taking off, but I expect cloud to change that.
- Runtime libraries for languages such as Perl, Python and Ruby provide their own packaging systems, which manage dependencies and other metadata, installation, upgrades and removal in a standardized way. Because these operate independently of the OS package manager, all sorts of problems arise. Projects such as GoboLinux have attempted to tie them together, with varying degrees of success. Meanwhile, each new programming language we invent comes with a different, incompatible package manager, and distribution developers need to spend time repackaging them into their preferred format.
Why are we stuck?
I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.
– Abraham Maslow
The packaging ecosystem is very strong. Not only do we have powerful tools for working with packages, we also benefit from packages being a well-understood concept, and having established processes for developing, exchanging and talking about them. Once something is packaged, we know what it is and how to work with it, and it “fits” into everything else. So, it is tempting to package everything in sight, as we already know how to make sense of packages. However, this may not always be the right tool for the job.
Various attempts have been made to extend the packaging concept to make it more general, for example:
- Portage, of Gentoo fame, offers impressive flexibility by building packages with a custom configuration, tailored for the needs of the target system.
- Conary, from rPath, offers finer-grained dependencies, powerful revision control and object-oriented build recipes.
- Nix provides a consistent build and runtime environment, ensuring that programs are run with the same dependencies used to build them, by keeping the relevant versions installed. I don’t know much about it, but it sounds like all dependencies implicitly refer to an exact version.
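
To make the idea of dependencies that "implicitly refer to an exact version" a little more concrete, here is a minimal sketch in Python. It is not Nix's actual algorithm or store layout. The `Package` class, the `identity` function and the sample packages are all hypothetical, made up purely for illustration. The point is only that if a package's identity is derived from its own version plus the identities of everything it depends on, then changing any dependency yields a different package.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Hypothetical package record: a name, a version, and exact dependencies."""
    name: str
    version: str
    deps: tuple[Package, ...] = ()


def identity(pkg: Package) -> str:
    """Derive an identity from the package and, recursively, its dependencies."""
    h = hashlib.sha256()
    h.update(f"{pkg.name}={pkg.version}".encode())
    # Sort by name so the result does not depend on declaration order.
    for dep in sorted(pkg.deps, key=lambda d: d.name):
        h.update(identity(dep).encode())
    return f"{h.hexdigest()[:12]}-{pkg.name}-{pkg.version}"


# Illustrative data only: the same application built against two libc versions.
libc_old = Package("libc", "2.11")
libc_new = Package("libc", "2.12")
app_old = Package("app", "1.0", (libc_old,))
app_new = Package("app", "1.0", (libc_new,))

print(identity(app_old) != identity(app_new))  # True
```

Because the two builds of `app` get different identities, both can be kept installed side by side, which is roughly what "keeping the relevant versions installed" implies.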
Other package managers aim to solve a specific problem, such as providing lightweight package management for embedded systems, or lazy dependency installation, or fixing the filesystem hierarchy. There is a long list of package managers, operating at different levels of the system, each solving a different problem.
Most of these systems suffer from an important fundamental tradeoff: they are designed to manage the entire system, from the kernel through applications, and so they must be used wholesale in order to reap their full benefit. In other words, in their world, everything is a package, and anything which is not a package is out of scope. Therefore, each of these systems requires a separate collection of packages, and each time we invent a new one, its adherents set about packaging everything in the new format. It takes a very long time to do this, and most of them lose momentum before a mature ecosystem can form around them.
This lock-in effect makes it difficult for new packaging technologies to succeed.
Divide and Conquer
No single package management framework is flexible enough to accommodate all of the needs we have today. Even more importantly, a generic solution won’t account for the needs we will have tomorrow. I propose that in order to move forward, we must make it possible to solve packaging problems separately, rather than attempting to solve them all within a single system.
- Decouple applications from the platform. Debian packaging is an excellent solution for managing the network of highly interdependent components which make up the core of a modern Linux distribution. It falls short, however, for managing the needs of modern applications: fast-moving, cross-platform and client/server (especially web). Let’s stop trying to fit these square pegs into round holes, and adopt a different solution for this space, preferably one which is comprehensible and useful to application developers so that they can do most of the work.
- Treat data as a service. It’s no longer useful to package up documentation in order to provide local copies of it on every Linux system. The web is a much, much richer and more effective solution to that problem. The same principle is increasingly applicable to structured data. From documents and contacts to anti-virus signatures and PCI IDs, there’s much better data to be had “out there” on the web than “down here” on the local filesystem.
- Simplify integration between packaging systems in order to enable a heterogeneous model. When we break the assumption that everything is a package, we will need new tools to manage the interfaces between different types of components. Applications will need to introspect their dependency chain, and system management tools will need to be able to interrogate applications. We’ll need thoughtfully designed interfaces which provide an appropriate level of abstraction while offering sufficient flexibility to solve many different packaging problems. There is unarguably a cost to this heterogeneity, but I believe it would easily outweigh the shortcomings of our current model.
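
As a rough illustration of the last point, here is one shape such an interface could take, sketched in Python. Everything in it is an assumption for the sake of the example: the `ComponentSource` protocol, the `SystemPackages` and `BundledApp` classes, and the sample component names and versions do not correspond to any existing tool. The idea is simply that each packaging system answers the same question ("what do you have installed?") so that a system management tool can interrogate all of them uniformly.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Protocol


@dataclass(frozen=True)
class Component:
    name: str
    version: str
    source: str  # which packaging system manages this copy


class ComponentSource(Protocol):
    """Hypothetical common interface that every packaging system would expose."""

    def installed(self) -> Iterable[Component]: ...


class SystemPackages:
    """Stand-in for the OS package manager."""

    def __init__(self, packages: dict[str, str]) -> None:
        self._packages = packages

    def installed(self) -> list[Component]:
        return [Component(n, v, "system") for n, v in self._packages.items()]


class BundledApp:
    """Stand-in for an application which ships its own copies of libraries."""

    def __init__(self, name: str, bundled: dict[str, str]) -> None:
        self._name = name
        self._bundled = bundled

    def installed(self) -> list[Component]:
        return [Component(n, v, f"bundled:{self._name}") for n, v in self._bundled.items()]


def find_copies(sources: Iterable[ComponentSource], name: str) -> list[Component]:
    """Ask every source for its components and collect all copies of one of them."""
    return [c for s in sources for c in s.installed() if c.name == name]


# Illustrative data only.
sources = [
    SystemPackages({"libpng": "1.2.42", "zlib": "1.2.3"}),
    BundledApp("browser", {"libpng": "1.2.37"}),
]
for copy in find_copies(sources, "libpng"):
    print(copy.source, copy.version)
```

With something like this in place, a question such as "which copies of this library need a security update?" can be answered without forcing the bundled application into the system's package format.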
But I like things how they are!
We don’t have a choice. The world is changing around us, and distributions need to evolve with it. If we don’t adapt, we will eventually give way to systems which do solve these problems.
Take, for example, modern web browsers like Firefox and Chromium. Arguably the most vital application for users, the browser is coming under increasing pressure to keep up with the breakneck pace of innovation on the web. The next wave of real-time collaboration and multimedia applications relies on the rapid development of new capabilities in web browsers. Browser makers are responding by accelerating deployment in the field: both aggressively push new releases to their users. A report from Google found that Chrome upgrades 97% of their users within 21 days of a new release, and Firefox 85% (both impressive numbers). Mozilla recently changed their maintenance policies, discontinuing maintenance of stable releases and forcing Ubuntu to ship new upstream releases to users.
These applications are just the leading edge of the curve, and the pressure will only increase. Equally powerful trends are pressing server applications, embedded systems, and data to adapt as well. The ideas I’ve presented here are only one possible way forward, and I’m sure there are more and better ideas brewing in distribution communities. I’m sure that I’m not the only one thinking about these problems.
Whatever it looks like in the end, I have no doubt that change is ahead.
DevOps and Cloud
DevOps
I first heard about DevOps from Lindsay Holmwood at linux.conf.au 2010. Since then, I’ve been following the movement with interest. It seems to be about cross-functional involvement in software teams, specifically between software development and system administration (or operations). In many organizations, especially SaaS shops, these two groups are placed in opposition to each other: developers are driven to deliver new features to users, while system administrators are held accountable for the operation of the service. In the best case, they maintain a healthy balance by pushing in opposite directions, but more typically, they resent each other for getting in the way, as a result of this dichotomy:
| | Development | Operations |
|---|---|---|
| is responsible for… | creating products | offering services |
| is measured on… | delivery of new features | high reliability |
| optimizes by… | increasing velocity | controlling change |
| and so is perceived as… | reckless and irresponsible | obstructing progress |
Of course, both functions are essential to a viable service, and so DevOps aims to replace this opposition with cooperation. By removing this friction from the organization, we hope to improve efficiency, lower costs, and generally get more work done.

So, DevOps promotes the formation of cross-functional teams, where individuals still take on specialist “development” or “operations” roles, but work together toward the common goal of delivering a great experience to users. By working as teammates, rather than passing work “over the wall”, they can both contribute to development, deployment and maintenance according to their skills and expertise. The team becomes a “devops” team, and is responsible for the entire product life cycle. Particular tasks may be handled by specialists, but when there’s a problem, it’s the team’s problem.
Some take it a step further, and feel that what’s needed is to combine the two disciplines, so that individuals contribute in both ways. Rather than thinking of themselves as “developers” or “sysadmins”, these folks consider themselves “devops”. They work to become proficient in both roles, and to synthesize new ways of working by drawing on both types of skills and experience. A common crossover activity is the development of sophisticated tools for automating deployment, monitoring, capacity management and failure resolution.
DevOps meets Cloud
Like DevOps, cloud is not a specific technology or method, but a reorganization of the model (as I’ve written previously). It’s about breaking down the problem in a different way, splitting and merging its parts, and creating a new representation which doesn’t correspond piece-for-piece to the old one.
DevOps drives cloud because cloud offers a richer toolkit for the way DevOps teams work: fast, flexible, efficient. Tools like Amazon EC2 and Google App Engine solve the right sorts of problems. Cloud also drives DevOps because it calls into question the traditional way of organizing software teams. A development/operations division just doesn’t “fit” cloud as well as a DevOps model.
Deployment is a classic duty of system administrators. In many organizations, only the IT department can implement changes in the production environment. Reaping the benefits of an IaaS environment requires deploying through an API, and therefore deployment requires development. While it is already common practice for system administrators to develop tools for automating deployment, and tools like Puppet and Chef are gaining momentum, IaaS makes this a necessity, and raises the bar in terms of sophistication. Doing this well requires skills and knowledge from both sides of the “fence” between development and operations, and can accelerate development as well as promote stability in production.

This is exemplified by infrastructure service providers like Amazon Web Services, where customers pay by the hour for “black box” access to computing resources. How those resources are provisioned and maintained is entirely Amazon’s problem, while its customers must decide how to deploy and manage their applications within Amazon’s IaaS framework. In this scenario, some operations work has been explicitly outsourced to Amazon, but IaaS is not a substitute for system administration. Deployment, monitoring, failure recovery, performance management, OS maintenance, system configuration, and more are still needed. A development team which is lacking the experience or capacity for this type of work cannot simply “switch” to an IaaS model and expect these needs to be taken care of by their service provider.

With platform service providers, the boundaries are different. Developers, if they build their application on the appropriate platform, can effectively outsource (mostly) the management of the entire production environment to their service provider. The operating system is abstracted away, and its maintenance can be someone else’s problem. For applications which can be built with the available facilities, this will be a very attractive option for many organizations. The customers of these services may be traditional developers, who have no need for operations expertise. PaaS providers, though, will require deep expertise in both disciplines in order to build and improve their platform and services, and will likely benefit from a DevOps approach.
Technical architecture draws on both development and operations expertise, because design goals like performance and robustness are affected by all layers of the stack, from hardware, power and cooling all the way up to application code. DevOps itself promotes greater collaboration on architecture, by involving experts in both disciplines, but cloud is a great catalyst because cloud architecture can be described in code. Rather than talking to each other about their respective parts of the system, they can work together on the whole system at once. Developers, sysadmins and hybrids can all contribute to a unified source tree, containing both application code and a description of the production environment: how many virtual servers to deploy, their specifications, which components run on which servers, how they are configured, and so on. In this way, system and network architecture can evolve in lockstep with application architecture.
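
A small sketch may help show what "a description of the production environment" in the source tree could look like. This is plain Python rather than the format of any particular tool, and the `ServerSpec` class, the roles, counts and component names are all invented for illustration. It records how many servers to deploy, their specifications, and which components run where, in a form that both developers and sysadmins can review and change alongside the application code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical description of one class of virtual server."""
    role: str
    count: int
    memory_gb: int
    components: tuple[str, ...]


# Illustrative environment: names and numbers are assumptions, not recommendations.
ENVIRONMENT = (
    ServerSpec(role="web", count=3, memory_gb=2, components=("proxy", "app")),
    ServerSpec(role="db", count=1, memory_gb=8, components=("database",)),
)


def total_servers(env: tuple[ServerSpec, ...]) -> int:
    """How many virtual servers the description asks for."""
    return sum(spec.count for spec in env)


def placement(env: tuple[ServerSpec, ...]) -> dict[str, list[str]]:
    """Map each component to the server roles which run it."""
    result: dict[str, list[str]] = {}
    for spec in env:
        for component in spec.components:
            result.setdefault(component, []).append(spec.role)
    return result


print(total_servers(ENVIRONMENT))
print(placement(ENVIRONMENT))
```

Because the description is ordinary data under revision control, a change such as moving a component to a different role shows up in the same diff as the application change which requires it.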
Cloudy promises such as dynamic scaling and fault tolerance call for a DevOps approach in order to be realized in a real-world scenario. These systems involve dynamically manipulating production infrastructure in response to changing conditions, and the application must adapt to these changes. Whether this takes the form of an active, intelligent response or a passive crash-only approach, development and operational considerations need to be aligned.
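
To give a flavor of the operational side of dynamic scaling, here is a deliberately simple sketch of a scaling decision in Python. The proportional rule, the `desired_servers` function, and the default target and limits are assumptions chosen for illustration, not a description of how any real cloud service decides. Utilization is expressed as a whole-number percentage to keep the arithmetic exact.

```python
def desired_servers(
    current: int,
    utilization_pct: int,
    target_pct: int = 60,
    minimum: int = 2,
    maximum: int = 20,
) -> int:
    """Return how many servers to run so that average utilization nears the target.

    `current` is the number of servers running now, and `utilization_pct` is
    their average utilization (0-100). The result is clamped to [minimum, maximum].
    """
    total_load = current * utilization_pct
    # Integer ceiling division: enough servers to carry the load at the target level.
    wanted = -(-total_load // target_pct)
    return max(minimum, min(maximum, wanted))


print(desired_servers(current=4, utilization_pct=90))  # 6
print(desired_servers(current=4, utilization_pct=10))  # 2
```

The operations half of the problem is choosing the target and the limits. The development half is making sure the application keeps working when the number of servers changes underneath it, which is why the two sets of considerations need to be aligned.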
So what?
DevOps and cloud will continue to reinforce each other and gain momentum. Both individuals and organizations will need to adapt in order to take advantage of the opportunities provided by these new models. Because they’re complementary, it makes sense to adopt them together, so those with expertise in both will be at an advantage.
The behavioral economics of free software
People who use and promote free software cite various reasons for their choice, but do those reasons tell the whole story? If, as a community, we want free software to continue to grow in popularity, especially in the mainstream, we should understand better the true reasons for choosing it—especially our own.
Some believe that it offers higher quality, that the availability of source code results in a better product with higher reliability. Although it’s difficult to do an apples-to-apples comparison of software, there are certainly instances where free software components have been judged superior to their proprietary counterparts. I’m not aware of any comprehensive analysis of the general case, though, and there is plenty of anecdotal evidence on both sides of the debate.
Others prefer it for humanitarian reasons, because it’s better for society or brings us closer to the world we want to live in. These are more difficult to analyze objectively, as they are closely linked to the individual, their circumstances and their belief system.
For developers, a popular reason is the possibility of modifying the software to suit their needs, as enshrined in the Free Software Foundation’s freedom 1. This is reasonable enough, though the practical value of this opportunity will vary greatly depending on the software and circumstances.
The list goes on: cost savings, educational benefits, universal availability, social rewards, etc.
The wealth of evidence of cognitive bias indicates that we should not take these preferences at face value. Not only are human choices seldom rational, they are rarely well understood even by the people who make them. When asked to explain our preferences, we often have a ready answer (indeed, we may never run out of reasons), but they may not withstand analysis. We have many different ways of fooling ourselves with regard to our own past decisions and held beliefs, as well as those of others.
Behavioral economics explores the way in which our irrational behavior affects economies, and the results are curious and subtle. Consider, for example, the riddle of experience versus memory (TED video), or the several examples in “The Marketplace of Perception” (Harvard Magazine article). I think it would be illuminating to examine free software through this lens, and consider that the vagaries of human perception may have a very strong influence on our choices.
Some questions for thought:
- Does using free software make us happier? If so, why? If not, why do we use it anyway?
- Do we believe in free software because we have a great experience using it, or because we feel good about having used it? (Daniel Kahneman explains the difference)
- Why do we want other people to use free software? Is it only because we want them to share our preference, or because we will benefit ourselves, or do we believe they will appreciate it for their own reasons?
If you’re aware of any studies along these lines, I would be interested to read about them.
Ubuntu 10.10 (Maverick) Developer Summit
I spent last week at the Ubuntu Developer Summit in Belgium, where we kicked off the 10.10 development cycle.
Due to our time-boxed release cycle, not everything discussed here will necessarily appear in Ubuntu 10.10, but this should provide a reasonable overview of the direction we’re taking.
Presentations
While most of our time at UDS is spent in small group meetings to discuss specific topics, there are also a handful of presentations to convey key information and stimulate creative thinking.
A few of the more memorable ones for me were:
- Mark Shuttleworth talked about the desktop, in particular the introduction of the new Unity shell for Ubuntu Netbook Edition
- Fanny Chevalier presented Diffamation, a tool for visualizing and navigating the history of a document in a very flexible and intuitive way
- Rick Spencer talked about the development process for 10.10 and some key changes in it, including a greater focus on meeting deadlines for freezes (and making fewer exceptions)
- Stefano Zacchiroli, the current Debian project leader, gave an overview of how Ubuntu and Debian developers are working together today, and how this could be improved. He has posted a summary on the debian-project mailing list.
The talks were all recorded, though they may not all be online yet.
Foundations
The Foundations team provides essential infrastructure, tools, packages and processes which are central to the development of all Ubuntu products. They make it possible for the desktop and server teams to focus on their areas of expertise, building on a common base system and development procedures.
Highlights from their track:
- Early on in the week, they conducted a retrospective to discuss how things went during the 10.04 cycle and how we can improve in the future
- One of their major projects has been about revision control for all of Ubuntu’s source code, and they talked last week about what’s next
- We’re aiming to provide btrfs as an install-time option in 10.10
- In order to keep Ubuntu moving forward, the foundations team is always on the lookout for stale bits which we don’t need to keep around anymore. At UDS, they discussed culling unattended packages, retiring the IA64 and SPARC ports and other spring cleaning
- There was a lot of discussion about Upstart, including its further development, implications for servers, desktops and the kernel, and the migration of server init scripts to upstart jobs
- After maintaining two separate x86 boot loaders for years, it looks like we may be ready to replace isolinux with GRUB2 on Ubuntu CDs
Desktop
The desktop team manages both Desktop Edition and Netbook Edition, on a mission to provide a top-notch experience to users across a range of client computing devices.
Highlights from their track:
- A key theme for 10.10 is to help developers to create applications for Ubuntu, by providing a native development environment, improving Quickly, improving desktopcouch, making it easier to get started with desktopcouch, and enabling developers to deliver new applications to Ubuntu users continuously
- With more and more touch screen devices appearing, Ubuntu will grow some new features to support touch oriented applications
- The web browser is a staple application for Ubuntu, and as such we are always striving for the best experience for our users. The team is looking ahead to Chromium, using apport to improve browser bug reports, and providing a web-oriented document capability via Zoho
- Building on work done in 10.04, we will aim to make simple things simple for basic photo editing
- Security-conscious users may rest easier knowing that the X window system will run without root privileges where kernel modesetting is supported
Server/Cloud
The server team is charging ahead with making Ubuntu the premier server OS for cloud computing environments.
Highlights from their track:
- Providing more powerful tools for managing Ubuntu in EC2 and Ubuntu Enterprise Cloud infrastructure, including boot-time configuration, image and instance management, and kernel upgrades
- Improving Ubuntu Enterprise Cloud by adding new Eucalyptus features (such as LXC container support, monitoring, rapid provisioning, and load balancing). If you ever wanted to run a UEC demo from a USB stick, that’s possible too.
- Providing packaged solutions for cloud building blocks such as hadoop and pig, Drupal, ehcache, Spring, various NOSQL databases, web frameworks, and more
- Providing turn-key solutions for free software applications like Alfresco and Kolab
- Making Puppet easier to deploy, easier to configure, and easier to scale in the cloud
ARM
Kiko Reis gave a talk introducing ARM and the corresponding opportunity for Ubuntu. The ARM team ran a full track during the week on all aspects of their work, from the technical details of the kernel and toolchain, to the assembly of a complete port of Netbook Edition 10.10 for several ARM platforms.
Kernel
The kernel team provided essential input and support for the above efforts, and also held their own track where they selected 2.6.35 as their target version, agreed on a variety of changes to the Ubuntu kernel configuration, and created a plan for providing backports of newer kernels to LTS releases to support deployment on newer hardware.
Security
Like the kernel team, the security team provided valuable input into the technical plans being developed by other teams, and also organized a security track to tackle some key security topics such as clarifying the duration of maintenance for various collections of packages, and the ongoing development of AppArmor and Ubuntu’s AppArmor profiles.
QA
The QA team focuses on testing, test automation and bug management throughout the project. While quality is everyone’s responsibility, the QA team helps to coordinate these activities across different teams, establish common standards, and maintain shared infrastructure and tools.
Highlights from their track include:
- There was a strong sense of broadening and deepening our testing efforts, mobilizing testers for specific testing projects, streamlining the ISO testing process by engaging Ubuntu derivatives and fine-tuning ISO test cases, and reactivating the community-based laptop testing program
- In support of this effort, there will be projects to improve test infrastructure, including enabling tests to target specific hardware and tracking test results in Launchpad
- There is a continuous effort to improve high-volume processing of bug reports, and two focus areas for this cycle will be tracking regressions (as these are among the most painful bugs for users) and improving our response to kernel bugs (as the kernel faces some special challenges in bug management)
Design
The design team organized a track at UDS for the first time this cycle, and team manager Ivanka Majic gave a presentation to help explain its purpose and scope.
Toward the end of the week, I joined in a round table discussion about some of the challenges faced by the team in engaging with the Ubuntu community and building support for their work. This is a landmark effort in mating professional design with the free software community, and there is still much to learn about how to do this well.
Community
The community track discussed the usual line-up of events, outreach and advocacy programs, organizational tools, and governance housekeeping for the 10.10 cycle, as well as goals for improving the translation of Ubuntu and related resources into many languages.
One notable project is an initiative to aggressively review patches submitted to the bug tracker, to show our appreciation for these contributions by processing them more quickly and completely.
Lucid ruminations
A few months ago, I wrote about changes in our development process for Ubuntu 10.04 LTS in order to meet our goals for this long-term release. So, how has it turned out?
Well, the development teams are still very busy preparing for the upcoming release, so there hasn’t been too much time for retrospection yet. Here are some of my initial thoughts, though.
- Merge from Debian testing – Martin Pitt has started a discussion on ubuntu-devel about how this went. For my part, I found that Lucid included fewer surprises than Karmic.
- Add fewer features – This is difficult to evaluate objectively, but my gut feeling is that we kept this largely under control. As usual, a few surprise desktop features were implemented that not everyone is happy about, myself included.
- Avoid major infrastructure changes – I think we did reasonably well here, though Plymouth is a notable exception. It resulted (unsurprisingly) in some nasty bugs which we’ve had to spend time dealing with.
- Extend beta testing – This will be difficult to assess, though if 10.04 beta was at least as good as 9.10 or 9.04 beta, then it will have arguably been a success.
- Freeze with Debian – Although early indications were good, this didn’t work out so well, as Debian’s freeze was delayed.
- Visualize progress – The feature status page provided a lot of visual progress information, and the system behind it allowed us to keep track of work slippage throughout the cycle, both of which seemed like a firm step in the right direction. I’m looking forward to hearing from development teams how this information helped them (or not).
A more complete set of retrospectives on Lucid should give us some good ideas for how to improve further in Maverick and beyond.
Update: Fixed broken link.
Ubuntu Inside Out – Free Software and Linux Days 2010 in Istanbul
In early April, I visited Istanbul to give a keynote at the Free Software and Linux Days event. This was an interesting challenge, because this was my first visit to Turkey, and my first experience presenting with simultaneous translation.
In my new talk, Ubuntu Inside Out, I spoke about:
- What Ubuntu is about, and where it came from
- Some of the challenges we face as a growing project with a large community
- Some ways in which we’re addressing those challenges
- How to get involved in Ubuntu and help
- What’s coming next in Ubuntu
The organizers have made a video available if you’d like to watch it (WordPress.com won’t let me embed it here).
Afterward, Calyx and I wandered around Istanbul, with the help of our student guide, Oğuzhan. We don’t speak any Turkish, apart from a few vocabulary words I learned on the way to Turkey, so we were glad to have his help as we visited restaurants, cafes and shops, and wandered through various neighborhoods. We enjoyed a variety of delicious food, and the unexpected company of many friendly stray cats.
It was only a brief visit, but I was grateful for the opportunity to meet the local free software community and to see some of the city.
A story in numbers
During a 24 hour period:
- One person whose virtual server instance was destroyed when they invoked an init script without the required parameter
- One bug report filed about the problem
- 6 bug metadata changes reclassifying the bug report
- 286 word rant on Planet Ubuntu about how the bug was classified
- 5 comments on the blog post
- 22 comments on the bug report
- 2 minutes on the command line
- 6 commands using UDD (bzr branch, cd, vi, bzr commit, bzr push, bzr lp-open)
- One line patch
- 99% wasted energy
QCon London 2010: Day 3
The tracks which interested me today were “How do you test that?”, which dealt with scenarios where testing (especially automation) is particularly challenging, and “Browser as a Platform”, which is self-explanatory.
Joe Walker: Introduction to Bespin, Mozilla’s Web Based Code Editor
I didn’t make it to this talk, but Bespin looks very interesting. It’s “a Mozilla Labs Experiment to build a code editor in a web browser that Open Web and Open Source developers could love”.
I experimented briefly with the Mozilla hosted instance of Bespin. It seems mostly oriented for web application development, and still isn’t nearly as nice as desktop editors. However, I think something like this, combined with Bazaar and Launchpad, could make small code changes in Ubuntu very fast and easy to do, like editing a wiki.
Doron Reuveni: The Mobile Testing Challenge
Why Mobile Apps Need Real-World Testing Coverage and How Crowdsourcing Can Help
Doron explained how the unique testing requirements of mobile handset applications are well suited to a crowdsourcing approach. As the founder of uTest, he described their approach to connecting their customers (application vendors) with a global community of testers with a variety of mobile devices. Customers evaluate the quality of the testers’ work, and this data is used to grade them and select testers for future testing efforts in a similar domain. The testers earn money for their efforts, based on test case coverage (starting at about $20 each), bug reports (starting at about $5 each), and so on. Their highest performers earn thousands per month.
uTest also has a system, uTest Remote Access, which allows developers to “borrow” access to testers’ devices temporarily, for the purpose of reproducing bugs and verifying fixes. Doron gave us a live demo of the system, which (after verifying a code out of band through Skype) displayed a mockup of a BlackBerry device with the appropriate hardware buttons and a screenshot of what was displayed on the user’s screen. The updates were not quite real-time, but were sufficient for basic operation. He demonstrated taking a picture with the phone’s camera and seeing the photo within a few seconds.
Dylan Schiemann: Now What?
Dylan did a great job of extrapolating a future for web development based on the trend of the past 15 years. He began with a review of the origin of web technologies, which were focused on presentation and layout concerns, then on to JavaScript, CSS and DHTML. At this point, there was clear potential for rich applications, though there were many roadblocks: browser implementations were slow, buggy or nonexistent, security models were weak or missing, and rich web applications were generally difficult to engineer.
Things got better as more browsers came on the scene, with better implementations of CSS, DOM, XML, DHTML and so on. However, we’re still supporting an ancient implementation in IE. This is a recurring refrain among web developers, for whom IE seems to be the bane of their work. Dylan added something I hadn’t heard before, though, which was that Microsoft states that anti-trust restrictions were a major factor which prevented this problem from being fixed.
Next, there was an explosion of innovation around Ajax and related toolkits, faster JavaScript implementations, infrastructure as a service, and rich web applications like GMail, Google Maps, Facebook, etc.
Dylan believes that web applications are what users and developers really want, and that desktop and mobile applications will fall by the wayside. App stores, he says, are a short term anomaly to avoid the complexities of paying many different parties for software and services. I’m not sure I agree on this point, but there are massive advantages to the web as an application platform for both parties. Web applications are:
- fast, easy and cheap to deploy to many users
- relatively affordable to build
- relatively easy to link together in useful ways
- increasingly remix-able via APIs and code reuse
There are tradeoffs, though. I have an article brewing on this topic which I hope to write up sometime in the next few weeks.
Dylan pointed out that different layers of the stack exhibit different rates of change: browsers are slowest, then plugins (such as Flex and Silverlight), then toolkits like Dojo, and finally applications which can update very quickly. Automatically updating browsers are accelerating this, and Chrome in particular values frequent updates. This is good news for web developers, as the slow pace of browser updates seems to be one of the key constraints for rolling out new web technologies today.
Dylan feels that technological monocultures are unhealthy, and prefers to see a set of competing implementations converging on standards. He acknowledged that this is less true where the monoculture is based on free software, though this can still inhibit innovation somewhat if it leads to everyone working from the same point of view (by virtue of sharing a code base and design). He mentioned that de facto standardization can move fairly quickly; if 2-3 browsers implement something, it can start to be adopted by application developers.
Comparing the different economics associated with browsers, he pointed out that Mozilla is dominated by search through the chrome (with less incentive to improve the rendering engine), Apple is driven by hardware sales, and Google by advertising delivered through the browser. It’s a bit of a mystery why Microsoft continues to develop Internet Explorer.
Dylan summarized the key platform considerations for developers:
- choice and control
- taste (e.g. language preferences, what makes them most productive)
- performance and scalability
- security
and surmised that the best way to deliver these is through open web technologies, such as HTML 5, which now offers rich media functionality including audio, video, vector graphics and animations. He closed with a few flashy demos of HTML 5 applications showing what could be done.
QCon London 2010: Day 2
I was talk-hopping today, so none of these are complete summaries, just enough to capture my impressions from the time I was there. I may go back and watch the video for the ones which turned out to be most interesting.
Yesterday, I noticed a couple of practices employed by the QCon organizers which I wanted to record, so that we can consider trying them out at Canonical and Ubuntu events:
- As participants leave each talk, they pass a basket with a red, a yellow and a green square attached to it. Next to the basket are three small stacks of colored paper, also red, yellow and green. There are no instructions, indeed no words at all, but the intent seemed clear enough: drop a card in the basket to give feedback.
- The talks were spread across multiple floors in the conference center, which I find is usually awkward. They mitigated this somewhat by posting a directory of the rooms inside each lift.
Chris Read: The Cloud Silver Bullet
Which calibre is right for me?
Chris offered some familiar warnings about cloud technologies: that they won’t solve all problems, that effort must be invested to reap the benefits, and that no one tool or provider will meet all needs. He then classified various tools and services according to their suitability for long or short processing cycles, and high or low “data sensitivity”.
Simon Wardley: Situation Normal, Everything Must Change
I actually missed Simon’s talk this time, but I’ve seen him speak before and talk with him every week about cloud topics as a colleague at Canonical. I highly recommend his talks to anyone trying to make sense of cloud technology and decide how to respond to it.
In some of the talks yesterday, there was a murmur of anti-cloud sentiment, with speakers asserting it was not meaningful, or they didn’t know what it was, or that it was nothing new. Simon’s material is the perfect antidote to this attitude, as he makes it very clear that there is a genuinely important and disruptive trend in progress, and explains what it is.
Jesper Boeg: Kanban
Crossing the line, pushing the limit or rediscovering the agile vision?
Jesper shared experiences and lessons learned with Kanban, and some of the problems it addresses which are present in other methodologies. His material was well balanced and insightful, and I’d like to go back and watch the full video when it becomes available.
Here again was a clear and pragmatic focus on matching tools and processes to the specific needs of the team, business and situation.
Ümit Yalcinalp: Development Model for the Cloud
Paradigm Shift or the Same Old Same Old?
Ümit focused on the PaaS (platform as a service) layer, and the experience offered to developers who build applications for these platforms. An evangelist from Salesforce.com, she framed the discussion as a comparison between force.com, Google App Engine and Microsoft Azure.
Eric Evans: Folding Design into an Agile Process
Eric tackled the question of how to approach the problem of design within the agile framework. As an outspoken advocate of domain-driven design, he presented his view in terms of this school and its terminology.
He emphasized the importance of modeling “when the critical complexity of the project is in understanding and communicating about the domain”. The “expected” approach to modeling is to incorporate an up-front analysis phase, but Eric argues that this is misguided. Because “models are distilled knowledge”, and teams are relatively ignorant at the start of a project, modeling in this way captures that ignorance and makes it persist.
Instead, he says, we should employ a “pull” approach (in the Lean sense), and decide to work on modeling when:
- communication with stakeholders deteriorates
- solutions are more complex than the problems
- velocity slows (because completed work becomes a burden)
Eric illustrated his points in part by showing video clips of engineers and business people engaged in dialog (here again, the focus on people rather than tools and process). He used this material as the basis for showing how models underlie these interactions, but are usually implicit. These dialogs were full of hints that the people involved were working from different models, and the software model needed to be revised. An explicit model can be a very powerful communication tool on software projects.
He outlined the process he uses for modeling, which is highly iterative and involves identifying business scenarios, using them to develop and evaluate abstract models, and testing those models by experimenting with code (“code probes”). Along the way, he emphasized the importance of making mistakes, not only as a learning tool but as a way to encourage creative thinking, which is essential to modeling work. In order to encourage the team to “think outside the box” and improve their conceptual model, he goes as far as to require that several “bad ideas” be proposed along the way, as a precondition for completing the process.
Eric is working on a white paper describing this process. A first draft is available on his website, and he is looking for feedback on it.
Modeling work, he suggested, can be incorporated into:
- a stand up meeting
- a spike
- an iteration zero
- release planning
He pointed out that not all parts of a system are created equal, and some of them should be prioritized for modeling work:
- areas of the system which seem to require frequent change across projects/features/etc.
- strategically important development efforts
- user experiences which are losing coherence
This was a very compelling talk, whose concepts were clearly applicable beyond the specific problem domain of agile development.
QCon London 2010: Day 1
For the first time in several years, I had the opportunity to attend a software conference in the city where I lived at the time. I’ve benefited from many InfoQ articles in the past couple of years, and watched recordings of some excellent talks from previous QCon events, so I jumped at the opportunity to attend QCon London 2010. It is being held in the Queen Elizabeth II Conference Center, conveniently located a short walk away from Canonical’s London office.
Whenever I attend conferences, I can’t help taking note of which operating systems are in use, and this tells me something about the audience. I was surprised to notice that in addition to the expected Mac and Windows presence, there was a substantial Ubuntu contingent and some Fedora as well.
Today’s tracks included two of particular interest to me at the moment: Dev and Ops: A single team and the unfortunately gendered Software Craftsmanship.
Jason Gorman: Beyond Masters and Apprentices
A Scalable, Peer-led Model For Building Good Habits In Large & Diverse Development Teams
Jason explained the method he uses to coach software developers.
I got a bad seat on the left side of the auditorium, where it was hard to see the slides because they were blocked by the lectern, so I may have missed a few points.
He began by outlining some of the primary factors which make software more difficult to change over time:
- Readability: developers spend a lot of their time trying to understand code that they (or someone else) have written
- Complexity: as well as making code more difficult to understand, complexity increases the chance of errors. More complex code can fail in more ways.
- Duplication: when code is duplicated, it’s more difficult to change because we need to keep track of the copies and often change them all
- Dependencies and the “ripple effect”: highly interdependent code is more difficult to change, because a change in one place requires corresponding changes elsewhere
- Regression Test Assurance: I didn’t quite follow how this fit into the list, to be honest. Regression tests are supposed to make it easier to change the code, because errors can be caught more easily.
He then outlined the fundamental principles of his method:
- Focus on Learning over Teaching – a motivated learner will find their own way, so focus on enabling them to pull the lesson rather than pushing it to them (“there is a big difference between knowing how to do something and being able to do it”)
- Focus on Ability over Knowledge – learn by doing, and evaluate progress through practice as well (“how do you know when a juggler can juggle?”)
…and went on to outline the process from start to finish:
- Orientation, where peers agree on good habits related to the subject being learned. The goal seemed to be to draw out knowledge from the group, allowing them to define their own school of thought with regard to how the work should be done. In other words, learn to do what they know, rather than trying to inject knowledge.
- Practice programming, trying to exercise these habits and learn “the right way to do it”
- Evaluation through peer review, where team members pair up and observe each other. Over the course of 40-60 hours, they watch each other program and check off where they are observed practicing the habits.
- Assessment, where learners practice a time-boxed programming exercise, which is recorded. The focus is on methodical correctness, not speed of progress. Observers watch the recording (which only displays the code), and note instances where the habit was not practiced. The assessment is passed only if fewer than three errors are noticed.
- Recognition, which comes through a certificate issued by the coach, but also through admission to a networking group on LinkedIn, promoting peer recognition
Jason noted that this method of assessing was good practice in itself, helping learners to practice pairing and observation in a rigorous way.
After the principal coach coaches a pilot group, the pilot group then goes on to coach others while they study the next stage of material.
To conclude, Jason gave us a live demo of the assessment technique, by launching Eclipse and writing a simple class using TDD live on the projector. The audience were provided with worksheets containing a list of the habits to observe, and instructed to note instances where he did not practice them.
Julian Simpson: Siloes are for farmers
Production deployments using all your team
After a brief introduction to the problems targeted by the devops approach, Julian offered some advice on how to do it right.
He began with the people issues, reminding us of Weinberg’s second law, which is “no matter what they tell you, it’s always a people problem”.
His people tips:
- In keeping with a recent trend, he criticized email as a severely flawed communication medium, best avoided.
- respect everyone
- have lunch with people on the other side of the wall
- discuss your problems with other groups (don’t just ask for a specific solution)
- invite everyone to stand-ups and retrospectives
- co-locate the sysadmins and developers (Thomas Allen)
Next, a few process suggestions:
- Avoid code ownership generally (or rather, promote joint/collective ownership)
- Pair developers with sysadmins
- It’s done when the code is in production (I would rephrase as: it’s not done until the code is in production)
and then tools:
- Teach your sysadmins to use version control
- Help your developers write performant code
- Help developers with managing their dev environment
- Run your deploy scripts via continuous integration (leading toward continuous deployment)
- Use Puppet or Chef (useful as a form of documentation as well as deployment tools, and on developer workstations as well as servers)
- Integrate monitoring and continuous integration (test monitoring in the development environment)
- Deliver code as OS packages (e.g. RPM, DEB)
- Separate binaries and configuration
- Harden systems immediately and enable logging for tuning security configuration (i.e. configure developer workstations with real security, making the development environment closer to production)
- Give developers access to production logs and data
- Re-create the developer environment often (to clear out accumulated cruft)
I agreed with a lot of what was said, objected to some, and lacked clarity on a few points. I think this kind of material is well suited to a multi-way BOF style discussion rather than a presentation format, and would have liked more opportunity for discussion.
Lars George and Fabrizio Schmidt: Social networks and the Richness of Data
Getting distributed webservices done with Nosql
Lars and Fabrizio described the general “social network problem”, and how they went about solving it. This problem space involves the processing, aggregation and dissemination of notifications for a very high volume of events, as commonly manifest in social networking websites such as Facebook and Twitter which connect people to each other to share updates. Apparently simple functionality, such as displaying the most recent updates from one’s “friends”, quickly becomes complex at scale.
As an example of the magnitude of the problem, they explained that they process 18 million events per day, and that in the course of storing and sharing these across the social graph, some operations peak as high as 150,000 per second. Such large and rapidly changing data sets represent a serious scaling challenge.
They originally built a monolithic, synchronous system called Phoenix, built on:
- LAMP frontends: Apache+PHP+APC (500 of them)
- Sharded MySQL multi-master databases (150 of them)
- memcache nodes with 1TB+ (60 of them)
They then added on asynchronous services alongside this, to handle things like Twitter and mobile devices, using Java (Tomcat) and RabbitMQ. The web frontend would send out AMQP messages, which would then be picked up by the asynchronous services, which would (where applicable) communicate back to Phoenix through an HTTP API call.
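To make that hand-off concrete, here is a minimal sketch in Python of the same shape: a frontend that enqueues an event and returns immediately, and a separate worker that consumes it. This is not their code. An in-process `queue.Queue` stands in for RabbitMQ, a plain list stands in for the HTTP API call back to Phoenix, and the names `frontend_handle_request`, `async_service` and `callbacks` are made up for illustration.

```python
import queue
import threading

# Stand-in for the message broker (RabbitMQ in the talk); a real system would
# publish over AMQP rather than use an in-process queue.
broker = queue.Queue()
callbacks = []  # stand-in for HTTP API calls back to the monolith


def frontend_handle_request(user, text):
    """Web frontend: enqueue an event and return without waiting for the worker."""
    broker.put({"user": user, "text": text})
    return "accepted"


def async_service():
    """Asynchronous service: consume events until a None sentinel arrives."""
    while True:
        event = broker.get()
        if event is None:
            break
        # Hypothetical processing step, e.g. forwarding the update elsewhere,
        # followed by a (simulated) call back to the main application.
        callbacks.append(f"delivered:{event['user']}")


worker = threading.Thread(target=async_service)
worker.start()
frontend_handle_request("alice", "hello")
frontend_handle_request("bob", "hi")
broker.put(None)  # tell the worker to stop
worker.join()
print(callbacks)
```

The point of the pattern is that the frontend never blocks on the slow work; it only pays the cost of publishing a message.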
When the time came to re-architect their activity system, they identified the following requirements:
- endless scalability
- storage- and cloud-independent
- fast
- flexible and extensible data model
This led them to an architecture based on:
- Nginx + Janitor
- Embedded Jetty + RESTeasy
- NoSQL storage backends (no fewer than three: Redis, Voldemort and Hazelcast)
They described this architecture in depth. The things which stood out for me were:
- They used different update strategies (push vs. pull) depending on the level of fan-out for the node (i.e. number of “friends”)
- They implemented a time-based activity filter which recorded a global timeline, from minutes out to days. Rather than traversing all of the user’s “friends” looking for events, they just scan the most recent events to see if their friends appear there.
- They created a distributed, scalable concurrent ID generator based on Hazelcast, which uses distributed locking to assign ranges to nodes, so that nodes can then quickly (locally) assign individual IDs
- It’s interesting how many of the off-the-shelf components had native scaling, replication, and sharding features. This sort of thing is effectively standard equipment now.
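The ID generator idea is easy to sketch. The following Python is my own illustration of range-based allocation, not their implementation: a `threading.Lock` stands in for the Hazelcast distributed lock, and the class names `RangeAllocator` and `NodeIdGenerator` and the `block_size` parameter are assumptions.

```python
import threading


class RangeAllocator:
    """Stand-in for the shared coordinator; the lock plays the distributed lock."""

    def __init__(self, block_size=1000):
        self._lock = threading.Lock()
        self._next_start = 0
        self._block_size = block_size

    def reserve_block(self):
        # Only this step needs coordination between nodes.
        with self._lock:
            start = self._next_start
            self._next_start += self._block_size
        return start, start + self._block_size


class NodeIdGenerator:
    """Hands out IDs locally, contacting the allocator only when a block runs out.

    Not thread-safe within a node; a real node would guard next_id as well.
    """

    def __init__(self, allocator):
        self._allocator = allocator
        self._next, self._end = allocator.reserve_block()

    def next_id(self):
        if self._next >= self._end:
            self._next, self._end = self._allocator.reserve_block()
        value = self._next
        self._next += 1
        return value


allocator = RangeAllocator(block_size=3)
node_a = NodeIdGenerator(allocator)  # reserves 0..2
node_b = NodeIdGenerator(allocator)  # reserves 3..5
ids_a = [node_a.next_id() for _ in range(4)]  # fourth call reserves 6..8
ids_b = [node_b.next_id() for _ in range(2)]
print(ids_a, ids_b)  # [0, 1, 2, 6] [3, 4]
```

The trade-off is that IDs are unique but not globally ordered by time, and a node that dies takes the unused remainder of its block with it.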
Their list of lessons learned:
- Start benchmarking and profiling your app early
- A fast and easy deployment keeps motivation high
- Configure Voldemort carefully (especially on large heap machines)
- Read the mailing lists of the NoSQL system you use
- No solution in docs? – read the sources
- At some point stop discussing and just do it
Andres Kitt: Building Skype
Learnings from almost five years as a Skype Architect
Andres began with an overview of Skype, which serves 800,000 registered users per employee (521 million users vs. 650 employees). Their core team is based in Estonia. Their main functionality is peer-to-peer, but they do need substantial server infrastructure (PHP, C, C++, PostgreSQL) for things like peer-to-peer supporting glue, e-commerce and SIP integration. Skype uses PostgreSQL heavily in some interesting ways, in a complex multi-tiered architecture of databases and proxies.
His first lesson was that technical rules of thumb can lead us astray. It is always tempting to use patterns that have worked for us previously, in a different project, team or company, but they may not be right for another context. They can and should be used as a starting point for discussion, but not presumed to be the solution.
Second, he emphasized the importance of paying attention to functional architecture, not only technical architecture. As an example, he showed how the Skype web store, which sells only 4 products (SkypeIn, SkypeOut, voicemail, and subscription bundles of the previous three) became incredibly complex, because no one was responsible for its functional architecture. Complex functional architecture leads to complex technical architecture, which is undesirable as he noted in his next point.
Keep it simple: minimize functionality, and minimize complexity. He gave an example of how their queuing system’s performance and scalability were greatly enhanced by removing functionality (the guarantee to deliver messages exactly once), which enabled the simplification of the system.
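Andres didn’t go into how consumers cope once the exactly-once guarantee is gone. One common answer, which I’m assuming here rather than reporting from the talk, is at-least-once delivery paired with an idempotent consumer that ignores duplicates. A toy Python sketch, with hypothetical names (`AtLeastOnceQueue`, `IdempotentConsumer`) throughout:

```python
class AtLeastOnceQueue:
    """Toy queue that redelivers a message until it is acknowledged."""

    def __init__(self):
        self._pending = {}  # message id -> payload

    def publish(self, msg_id, payload):
        self._pending[msg_id] = payload

    def deliver(self):
        # Everything still unacknowledged is handed out again.
        return list(self._pending.items())

    def ack(self, msg_id):
        self._pending.pop(msg_id, None)


class IdempotentConsumer:
    """Ignores message ids it has already processed."""

    def __init__(self):
        self.seen = set()
        self.total = 0

    def handle(self, msg_id, amount):
        if msg_id in self.seen:
            return False  # duplicate, skip it
        self.seen.add(msg_id)
        self.total += amount
        return True


q = AtLeastOnceQueue()
consumer = IdempotentConsumer()
q.publish("m1", 5)
q.publish("m2", 7)

for msg_id, amount in q.deliver():
    consumer.handle(msg_id, amount)
q.ack("m1")  # pretend the ack for m2 was lost, so m2 will be redelivered

for msg_id, amount in q.deliver():
    consumer.handle(msg_id, amount)  # m2 arrives again and is ignored
    q.ack(msg_id)

print(consumer.total)  # 12
```

The queue itself stays simple because it only has to remember what has not been acknowledged; the burden of handling duplicates moves to the consumer, where it is often cheap.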
He also shared some organizational learnings, which I appreciated. Maybe my filters are playing tricks on me, but it seems as if more and more discussion of software engineering is focusing on organizing people. I interpret this as a sign of growing maturity in the industry, which (as Andres noted) has its roots in a somewhat asocial culture.
He noted that architecture needs to fit your organization. Designs need to be measured primarily by how well they solve business problems, rather than by beauty or elegance.
He stressed the importance of communication, a term which I think is becoming so overused and diluted in organizations that it is not very useful. It’s used to refer to everything from roles and responsibilities, to personal relationships, to cultural norming, and more. In the case of Skype, what Andres learned was the importance of organizing and empowering people to facilitate alignment, information flow and understanding between different parts of the business. Skype evolved an architecture team which interfaces between (multiple) business units and (multiple) engineering teams, helping each to understand the other and taking responsibility for the overall system design.
Conclusion
Overall, I thought the day’s talks gave me new insight into how Internet applications are being developed and deployed in the real world today. They affirmed some of what I’ve been wondering about, and gave me some new things to think about as well. I’m looking forward to tomorrow.
