We'll see | Matt Zimmerman

a potpourri of mirth and madness

We’ve packaged all of the free software…what now?

Today, virtually all of the free software available can be found in packaged form in distributions like Debian and Ubuntu. Users of these distributions have access to a library of thousands of applications, ranging from trivial to highly sophisticated software systems. Developers can find a vast array of programming languages, tools and libraries for constructing new applications.

This is possible because we have a mature system for turning free software components into standardized modules (packages). Some software is more difficult to package and maintain, and I’m occasionally surprised to find something very useful which isn’t packaged yet, but in general, the software I want is packaged and ready before I realize I need it. Even the “long tail” of niche software is generally packaged very effectively.

Thanks to coherent standards, sophisticated management tools, and the principles of software freedom, these packages can be mixed and matched to create complete software stacks for a wide range of devices, from netbooks to supercomputing clusters. These stacks are tightly integrated, and can be tested, released, maintained and upgraded as a unit. The Debian system is unparalleled for this purpose, which is why Ubuntu is based on it. The vision of a free software operating system which is highly modular and customizable has been achieved.
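
As a concrete (and deliberately simplified) illustration of what makes “upgraded as a unit” possible, the toy sketch below shows the core idea: each package declares its dependencies, and tools compute a consistent order over the whole graph. This is not how dpkg or APT are implemented; the package names and metadata are invented, and real tools also handle versions, conflicts and much more.

    # Toy model of dependency-ordered installation (illustrative only; not
    # how dpkg or APT work, and the package metadata here is invented).

    PACKAGES = {
        "libc6": [],
        "libssl": ["libc6"],
        "python3": ["libc6", "libssl"],
        "web-app": ["python3", "libssl"],
    }

    def install_order(name, metadata, ordered=None, seen=None):
        """Return a list in which every package appears after its dependencies."""
        ordered = [] if ordered is None else ordered
        seen = set() if seen is None else seen
        if name in seen:
            return ordered
        seen.add(name)
        for dep in metadata[name]:
            install_order(dep, metadata, ordered, seen)
        ordered.append(name)  # only after all dependencies have been emitted
        return ordered

    print(install_order("web-app", PACKAGES))
    # ['libc6', 'libssl', 'python3', 'web-app']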

Rough edges

This is a momentous achievement, and the Debian packaging system fulfills its intended purpose very well. However, there are a number of areas where it introduces friction, because the package model doesn’t quite fit some new problems. Most of these are becoming more common over time as technology evolves and changes shape.

  • Embedded systems need to be pared down to the essentials to minimize storage, distribution, computation and maintenance costs. Standardized packaging introduces excessive code, data and interdependency which make the system larger than necessary. Tight integration makes it difficult to bootstrap the system from scratch for custom hardware. Projects like Embedded Debian aim to adapt the Debian system to be more suitable for use in these environments, with varying degrees of success. Meanwhile, smart phones will soon become the most common type of computer globally.
  • Data, in contrast to software, has simple requirements. It just needs to be up to date and accessible to programs. Packaging and distributing it through the standardized packaging process is awkward, doesn’t offer tangible benefits, and introduces overhead. There have been extensive debates in Debian about how to handle large data sets. Meanwhile, this problem is becoming increasingly important as data science catalyzes a new wave of applications.
  • Client/server and other types of distributed applications are notoriously tricky to package. The packaging system works within the context of a single OS instance, and so relationships which span multiple OS instances (e.g. a server application which depends on a database running on another server) are not straightforward. Meanwhile, the web has become a first-class application development platform, and this kind of interdependency is extremely common on both clients and servers.
  • Cross-platform applications such as Firefox, Chromium and OpenOffice.org have long struggled with packaging. In order to be portable, they tend to bundle the components they depend on, such as libraries. Packagers strive for normalization, and want these applications to use the packaged versions of these libraries instead. Application developers build, test and ship one set of dependencies, but their users receive a different stack when they use the packaged version of the application. Developers on both sides are in constant tension as they expect their configuration to be the canonical one, and want it to be tightly integrated. Cross-platform application developers want to provide their own, application-specific cross-platform update mechanism, while distributions want to use the same mechanism for all their components.
  • Virtual appliances aim to combine application and operating system into a portable bundle. While a modular OS is definitely called for, appliances face some of the same problems as embedded systems, since they too need to be minimized. Furthermore, the appliance becomes a component in itself, and requires metadata, distribution mechanisms and so on. If someone wants to “install” a virtual appliance, how should that work? Packaging them up as .debs doesn’t make much sense for the same reasons that apply to large data sets. I haven’t seen virtual appliances really take off yet, but I expect cloud computing to change that.
  • Runtime libraries for languages such as Perl, Python and Ruby provide their own packaging systems, which manage dependencies and other metadata, installation, upgrades and removal in a standardized way. Because these operate independently of the OS package manager, all sorts of problems arise. Projects such as GoboLinux have attempted to tie them together, with varying degrees of success. Meanwhile, each new programming language we invent comes with a different, incompatible package manager, and distribution developers need to spend time repackaging them into their preferred format.
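
As a small illustration of that last point, the sketch below (assuming a Debian-style system with dpkg-query on the PATH, and Python 3.8+ for importlib.metadata) lists the components known to the OS package database alongside those known to Python’s own packaging machinery. The two sets of records are maintained independently, and neither consults the other, which is exactly where the trouble starts.

    # Two package databases that don't know about each other (illustration).
    import subprocess
    from importlib import metadata

    def dpkg_packages():
        """Package names known to the OS package manager."""
        out = subprocess.run(
            ["dpkg-query", "-W", "--showformat=${Package}\n"],
            capture_output=True, text=True, check=True,
        ).stdout
        return set(out.split())

    def python_distributions():
        """Distributions known to Python's own packaging machinery,
        whether installed by pip, easy_install or a .deb."""
        return {dist.metadata["Name"] for dist in metadata.distributions()}

    debs = dpkg_packages()
    pips = python_distributions()
    # A module installed straight from PyPI shows up only on the Python side;
    # dpkg has no idea it exists, and the reverse holds for C libraries.
    print(f"{len(debs)} dpkg packages, {len(pips)} Python distributions")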

Why are we stuck?

I suppose it is tempting, if the only tool you have is a hammer, to treat everything as if it were a nail.
– Abraham Maslow

The packaging ecosystem is very strong. Not only do we have powerful tools for working with packages, we also benefit from packages being a well-understood concept, and having established processes for developing, exchanging and talking about them. Once something is packaged, we know what it is and how to work with it, and it “fits” into everything else. So, it is tempting to package everything in sight, as we already know how to make sense of packages. However, this may not always be the right tool for the job.

Various attempts have been made to extend the packaging concept to make it more general, for example:

  • Portage, of Gentoo fame, offers impressive flexibility by building packages with a custom configuration, tailored for the needs of the target system.
  • Conary, from rPath, offers finer-grained dependencies, powerful revision control and object-oriented build recipes.
  • Nix provides a consistent build and runtime environment, ensuring that programs are run with the same dependencies used to build them, by keeping the relevant versions installed. I don’t know much about it, but it sounds like all dependencies implicitly refer to an exact version.
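
Since I don’t know Nix well (as noted above), the following is only a concept sketch of exact-version pinning: derive an installation prefix from a component and the exact versions of its dependencies, so that builds against different dependency sets can coexist without interfering. It is not Nix’s actual scheme, and all of the names are invented.

    # Concept sketch: content-addressed install prefixes (not Nix itself).
    import hashlib

    def store_path(name, version, dependencies):
        """Derive a unique install prefix from a package and its exact deps."""
        ident = "\n".join([f"{name}-{version}"] + sorted(dependencies))
        digest = hashlib.sha256(ident.encode()).hexdigest()[:16]
        return f"/store/{digest}-{name}-{version}"

    # Two builds of the same application against different libraries land in
    # different prefixes, so neither can silently break the other.
    a = store_path("myapp", "1.0", ["libssl-1.0.0", "libc-2.11"])
    b = store_path("myapp", "1.0", ["libssl-0.9.8", "libc-2.11"])
    print(a)
    print(b)
    assert a != b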

Other package managers aim to solve a specific problem, such as providing lightweight package management for embedded systems, or lazy dependency installation, or fixing the filesystem hierarchy. There is a long list of package managers, operating at various levels, each of which solves a different set of problems.

Most of these systems suffer from the same fundamental tradeoff: they are designed to manage the entire system, from the kernel through applications, and so they must be used wholesale in order to reap their full benefit. In other words, in their world, everything is a package, and anything which is not a package is out of scope. Therefore, each of these systems requires a separate collection of packages, and each time we invent a new one, its adherents set about packaging everything in the new format. It takes a very long time to do this, and most of them lose momentum before a mature ecosystem can form around them.

This lock-in effect makes it difficult for new packaging technologies to succeed.

Divide and Conquer

No single package management framework is flexible enough to accommodate all of the needs we have today. Even more importantly, a generic solution won’t account for the needs we will have tomorrow. I propose that in order to move forward, we must make it possible to solve packaging problems separately, rather than attempting to solve them all within a single system.

  • Decouple applications from the platform. Debian packaging is an excellent solution for managing the network of highly interdependent components which make up the core of a modern Linux distribution. It falls short, however, for managing the needs of modern applications: fast-moving, cross-platform and client/server (especially web). Let’s stop trying to fit these square pegs into round holes, and adopt a different solution for this space, preferably one which is comprehensible and useful to application developers so that they can do most of the work.
  • Treat data as a service. It’s no longer useful to package up documentation in order to provide local copies of it on every Linux system. The web is a much, much richer and more effective solution to that problem. The same principle is increasingly applicable to structured data. From documents and contacts to anti-virus signatures and PCI IDs, there’s much better data to be had “out there” on the web than “down here” on the local filesystem. (A rough sketch of this approach follows this list.)
  • Simplify integration between packaging systems in order to enable a heterogeneous model. When we break the assumption that everything is a package, we will need new tools to manage the interfaces between different types of components. Applications will need to introspect their dependency chain, and system management tools will need to be able to interrogate applications. We’ll need thoughtfully designed interfaces which provide an appropriate level of abstraction while offering sufficient flexibility to solve many different packaging problems. (A hypothetical sketch of such an interface also follows this list.) There is unarguably a cost to this heterogeneity, but I believe the shortcomings of our current model easily outweigh it.
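
Here is the rough sketch promised for the “data as a service” point: fetch a data set on demand and keep a local cache for offline use, instead of shipping it as a package that goes stale between releases. The PCI ID database is used as the example; the URL is the upstream location that update-pciids has traditionally fetched from, and the cache path and refresh policy are arbitrary choices for illustration.

    # Sketch: treat a data set as a service with a local cache (illustrative).
    import os
    import time
    import urllib.request

    PCI_IDS_URL = "https://pci-ids.ucw.cz/v2.2/pci.ids"  # assumed upstream location
    CACHE = os.path.expanduser("~/.cache/pci.ids")       # arbitrary cache path
    MAX_AGE = 7 * 24 * 3600                               # refresh weekly

    def pci_ids():
        """Return the PCI ID database, refreshing the cache when stale and
        falling back to the cached copy when offline."""
        try:
            stale = (time.time() - os.path.getmtime(CACHE)) > MAX_AGE
        except OSError:
            stale = True  # no cache yet
        if stale:
            try:
                os.makedirs(os.path.dirname(CACHE), exist_ok=True)
                with urllib.request.urlopen(PCI_IDS_URL, timeout=10) as resp:
                    data = resp.read()
                with open(CACHE, "wb") as f:
                    f.write(data)
            except OSError:
                pass  # offline: fall back to whatever cache exists
        # Raises if we have never been online; a packaged seed copy could cover that.
        with open(CACHE, "rb") as f:
            return f.read().decode("utf-8", errors="replace")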
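
And here is the hypothetical sketch of the kind of interface the third point calls for: a thin abstraction that lets system management tools interrogate any component source (dpkg, a language package manager, an application’s own updater) in a uniform way. Every name in it is invented; it is meant to suggest a shape, not to specify an API.

    # Hypothetical interface between heterogeneous packaging systems.
    from abc import ABC, abstractmethod
    from dataclasses import dataclass

    @dataclass
    class Component:
        name: str
        version: str
        origin: str  # e.g. "dpkg", "pypi", "vendor-updater"

    class ComponentSource(ABC):
        """Minimal contract a packaging system offers to the host system."""

        @abstractmethod
        def installed(self) -> list[Component]:
            """Components this source currently manages."""

        @abstractmethod
        def requires(self, component: Component) -> list[str]:
            """Names of components (possibly managed elsewhere) it needs."""

        @abstractmethod
        def updates_available(self) -> list[Component]:
            """Components with newer versions available from this source."""

    def inventory(sources: list[ComponentSource]) -> list[Component]:
        """A management tool can aggregate across sources without caring
        whether a component is a .deb, a gem, or a vendor-shipped app."""
        return [c for s in sources for c in s.installed()]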

But I like things how they are!

We don’t have a choice. The world is changing around us, and distributions need to evolve with it. If we don’t adapt, we will eventually give way to systems which do solve these problems.

Take, for example, modern web browsers like Firefox and Chromium. Arguably the most vital application for users, the browser is coming under increasing pressure to keep up with the breakneck pace of innovation on the web. The next wave of real-time collaboration and multimedia applications relies on the rapid development of new capabilities in web browsers. Browser makers are responding by accelerating deployment in the field: both aggressively push new releases to their users. A report from Google found that Chrome upgrades 97% of their users within 21 days of a new release, and Firefox 85% (both impressive numbers). Mozilla recently changed their maintenance policies, discontinuing maintenance of stable releases and forcing Ubuntu to ship new upstream releases to users.

These applications are just the leading edge of the curve, and the pressure will only increase. Equally powerful trends are pressing server applications, embedded systems, and data to adapt as well. The ideas I’ve presented here are only one possible way forward, and I’m sure there are more and better ideas brewing in distribution communities; I’m certainly not the only one thinking about these problems.

Whatever it looks like in the end, I have no doubt that change is ahead.


Written by Matt Zimmerman

July 6, 2010 at 15:31

61 Responses


  1. If you really got the impression that all (relevant) free software is available in Debian (or Ubuntu), please get in touch with the Debian-Java team. We still have plenty of very important projects to package:

    http://wiki.debian.org/Java/RequestedPackages

    Thomas Koch

    July 6, 2010 at 16:50

    • A bit further down in the article, I actually highlighted runtime libraries as an area which isn’t addressed well enough by the current packaging model, so I agree with you. :-)

      My point is not that packaging is a solved problem, but that packaging everything is not the right solution.

      Matt Zimmerman

      July 6, 2010 at 17:02

      • What happens if a CPAN module requires a certain C library to build? What if a certain program requires some other program and not just Perl modules from CPAN?

        What if the latest version of the module from CPAN breaks the current version of the app?

        Another point: package management systems are great for managing the files on the disk. They are less so for managing the data in a local database.

        Tzafrir Cohen

        July 7, 2010 at 07:10

        • In a heterogeneous model, interdependencies like those would be more complex to handle, but not impossible. In practice, at least Debian and Ubuntu already model the system as concentric layers, where each layer can depend only on the layer below (i.e. package priorities), and we try to minimize cycles in the dependency graph.

          I agree that local databases are a good example of where package managers fall short, i.e. managing data rather than program code.

          Matt Zimmerman

          July 7, 2010 at 10:24

          • If we want Eclipse packaged, we need quite a few Java “libraries”. Or do we want to give it up completely? (yes, please!)

            Do we want to package Frozen Bubbles? That requires Perl-SDL.

            Now what happens if some new “unpackaged” application requires a version of perl-SDL that breaks the one frozen-bubble uses? Should that app use a private copy of that CPAN module?

            Tzafrir Cohen

            July 7, 2010 at 13:06

            • Rather than thinking about the underlying technical concerns, look at this from the perspective of a distribution’s “customers”: their users who run the distro, and the application developers who create their content.

              From the user’s perspective, they should be able to run whichever version of Frozen Bubble they want, without knowing what Perl-SDL is. The system should make it easy for them to manage that, and make it somewhat hard (but perhaps not impossible) to screw up.

              From an application developer’s perspective, they should be able to select which runtime components they want to use, and use them. They shouldn’t have to agree with every other application developer on the planet on which version of Perl-SDL to use. They should be able to take responsibility for the experience they provide for their users, without learning all the lore of how distributions work. Maven, in the Java domain, is a good example of how developers think about this problem.

              Distributions, of course, want to provide value to users as well, and they do this by standardizing and integrating components.

              These needs aren’t fundamentally incompatible, but none of them are served well by a “one size fits all” model. The optimal tradeoff will vary from one case to the next.

              Matt Zimmerman

              July 7, 2010 at 13:15

  2. I like your problem statement, and I’ve seen you broach the topic before so obviously you’ve thought about it a lot.

    Have you ever come up with any alternatives or solutions (even if not perfect)?

    Jonathan Carter

    July 6, 2010 at 17:13

  3. One thing you need to be careful of when treating data as a service: what if the installation (embedded or otherwise) doesn’t have Internet access? While local data is often incomplete or stale, a cache is usually better than nothing! And even in this day of broadband connections, occasionally a local version is better simply because it’s “faster”.

    Ben Love

    July 6, 2010 at 20:14

    • A cache is not packaged, which demonstrates the point in the article.

      Marc

      July 7, 2010 at 08:14

  4. I’ve thought about this, too. I definitely agree that dpkg/APT is great for the core system stuff. I don’t really know how we should deal with applications. The semi-centralized nature of the current system is nice in that we only have one update system begging for our attention. I guess the most important thing we can do is have a uniform way of specifying dependencies so that the person who writes the software is also the person who specifies dependencies, but he doesn’t have to know about all of the different packaging systems.

    Alex

    July 6, 2010 at 22:40

  5. Since you mention, “It’s no longer useful to package up documentation in order to provide local copies of it on every Linux system,” are there any technologies in place now that you see as possible tools to help drive a change to having more web-based content for user help?

    In general, I do think it’s important to have offline help content available, as users won’t always be connected to the web, but I think it should be easier to get more docs, and to update the docs that are available to a user. In pursuit of this, I asked one of the couchdb guys at UDS-L whether couchdb would work well for storing docs, but he said no. Also, if I understand correctly, the HTML 5 local store settings for Firefox are limited to about 5 MB per site. I’m not even sure if HTML 5 local storage would work well for this, but I think 5 MB is too small a cache for any reasonable amount of docs.

    MS is going toward having some local help content, but also downloadable content. What they are going toward is basically a set of xhtml files (plus a special index file) placed into a zipped folder (the folder is given a unique file extension, but it’s basically a zip file).

    Given what the Doc and Manual teams are set to discuss this coming weekend ( https://wiki.ubuntu.com/ubuntu-support-and-learning-center ), any ideas you have bouncing around that you’re willing to share would be helpful.

    Jim Campbell

    July 6, 2010 at 23:57

    • To be sure, it’s still useful to have offline access to this type of content. There is a balance to be struck, and I think it’s getting to be time to adjust it.

      Yes, we all spend some time offline, but does that mean we should package documentation for every component of the system, from coreutils to OpenOffice.org, whether the user needs it or not?

      A documentation service could provide the flexibility needed to implement a policy such as “keep a local copy of documentation for applications I use frequently, or which I use when I’m offline”. Most computers are connected to the Internet at least some of the time, so why optimize for the least common denominator (entirely offline)?

      Matt Zimmerman

      July 7, 2010 at 10:18

      • What is definitely happening is that disks are getting ever larger, so there isn’t that much of an excuse for “offline contents” anymore.

        Horst H. von Brand

        July 7, 2010 at 19:31

      • I see a few things brewing here:
        (1) I don’t think it’s fair to treat all systems alike; each presents a separate and unique problem. Ultimately documentation, and perhaps other components, can be packaged separately to account for system differences, so that the user decides, per his/her use case, whether to include the documentation dependencies, rather than the packager or maintainer.
        (2) The systems most likely to be connected or remain online are also the ones least likely to need extensive local documentation stores (for instance servers, netbooks, cell phones and routers), but they do need various types of data, unique to each case, to remain up to date, like anti-virus and IDS signatures.

        (3) Perhaps decoupling data will incentivise keeping data up to date, particularly on systems that tend to be notoriously out of date in many respects, as in the case of embedded devices, many of which are operated offline, like cash registers, in-car entertainment, and even many consumer routers.

        F. Fellini

        July 11, 2010 at 17:03

    • “Since you mention, “It’s no longer useful to package up documentation in order to provide local copies of it on every Linux system,” are there any technologies in place now that you see as possible tools to help drive a change to having more web-based content for user help?”

      I think there’s something missed here about the “why” for local docs: it is not so much so you can have access to docs on unconnected machines (are there still unconnected machines out there? and I don’t mean not connected to the Internet, but completely disconnected: no LAN, no USB->PDA, no nothing), but to gain access to the *proper* documentation.

      While you can go to the app site and look for more detailed information, forums, mailing lists, etc., where can you find information for *my* X.Y.Z version, which is the one I have installed? The upstream maintainer’s web site, docs, etc. focus on the new and flashy X+3 version, which doesn’t even support my Debian one.

      Having docs aligned with what is installed has been a blessing for me in quite a few situations.

      noname

      July 11, 2010 at 16:17

  6. I think, in terms of documentation and ‘generic’ data, that a service like wuala could be useful for utilizing and extending the functionality of document stores. By having an initial cached copy of the data you can then get incidental updates as and when they become available. It may also reduce the total storage space taken up by said documents and therefore is also environmentally friendlier.

    Andy loughran

    July 7, 2010 at 00:12

  7. There are truckloads of useful software that isn’t packaged yet or has been removed from the distribution because it was unmaintained upstream or just in Debian.

    http://wiki.debian.org/PaulWise/InterestingSoftware

    Paul Wise

    July 7, 2010 at 01:57

    • It’s true, the title was a bit cheeky. My point is more that we need to look beyond packaging everything which seems useful, and evolve the general model for how we provide users with the components they need.

      Matt Zimmerman

      July 7, 2010 at 10:28

    • On the other hand one could say that being unmaintained upstream and/or in Debian is a good indication of how popular/maintained something is, and helps me decide whether I want to use the software or not.

      jorge

      July 7, 2010 at 16:29

  8. Hi,

    I develop and maintain a couple of apps in Debian (and Ubuntu – Tux Math and Tux Typing). While these are nowhere near comparable to Chromium, FF, or OO.o, they are developed as cross-platform software, and we encounter some of the same issues as the “big boys”.

    From my perspective, Debian packaging is close to ideal – all the needed libs will be easily available, and everything “just works”. I think that cross-platform projects tend to bundle in libraries (and fonts, etc.) just to be sure the win32 build has the needed resources – not so much to lock in specific library versions.

    The “fast-moving” part is an issue, to be sure. By definition, everything will be a few months old at the time of an Ubuntu release, and six months older at the end. For stable Debian, packages will become even older. But this isn’t really a function of the packaging system per se – it just reflects a policy to “protect” users from new packages until they have been thoroughly tested within the distro. If users want to have new packages almost as soon as they are released upstream, they can do that with Debian packaging – that is what Sid is. In this context, “unstable” means “constantly being updated”, not “crash-prone”.

    So, I think Ubuntu could “decouple” selected rapidly-moving packages (such as the above-mentioned browsers) by pulling them from Sid (or something similar), while pinning the core of the system to the current stable release. I don’t think the package manager is really the problem here.

    David Bruce

    July 7, 2010 at 01:59

    • The package manager isn’t the problem, I agree. It is more the implicit model of “everything should be a package” which I’m calling into question. I think this model contributes significantly to the perception that users need to be “protected” in this way: in some ways, we are actually making it harder for them to get what they want(!).

      Although it’s technically possible to mix and match packages from Debian stable and unstable, it’s pretty awkward in practice. Source-based package managers can make this much easier, I think, but introduce their own problems. Most Linux applications are pretty portable across distributions and releases, as they rely on a fairly stable ABI subset. Package managers can actually get in the way here, when conservative dependencies (such as minimum library versions) prevent installation of software which is actually compatible.

      This is a good example of a design pattern which works well for system components, but introduces unnecessary friction for user applications.

      Matt Zimmerman

      July 7, 2010 at 10:12

      • Is it always the case? The release notes of OpenOffice.org 3.2.1 stated that it fixes 2 important security issues.

        One of them was, in fact, a Java issue. It was fixed in Linux distributions a month or two before that. The other was, from what I see, an issue fixed in the Debian OOo package 3 months previously.

        Tzafrir Cohen

        July 7, 2010 at 12:56

  9. I have a question for you. What do you think about creating the GRand Unified Managed Packager – GRUMP? A way application devs could package their app to be able to install on practically all Linux machines. Not to mention it would free up a lot of time for the packaging teams.

    Or at least just unify Deb and RPM – DRM – which would cover a very large %.

    Munky

    July 7, 2010 at 02:48

  10. I have yet to find a package-managing system that can survive the punishment I give it. Ubuntu’s package manager is included amongst the ones I’ve managed to fry.

    One problem I found is that when there is an error in a package in Ubuntu, it requires manually editing the package management database. That means there is no means of rolling back failed transactions. It also means there is not much in the way of error handling. Trapping errors is not much use if it leaves the system in an undetermined state.

    Another problem is that different users have different needs. It’s not just different types of system. This means that there is not much value in merely having different repositories for different classes of system. You need the ability to add configuration options to the list of what a package supplies and depends upon, so that user requirements are met without having to have unnecessary code installed (always bad for security) and without having to inhibit the users by asserting the packagers’ preferences over and above user requirements.

    Lastly, given the increasing need for host intrusion detection systems, would it not make sense to have the package manager provide an API that HIDS systems can use? After all, the package already knows what files contain code and the hashes can be either supplied with the package or generated by the package manager during installation.

    Jonathan Day

    July 7, 2010 at 03:00

  11. You might have missed the GoboLinux talk at LCA where it was mentioned that their system deals with external packaging systems like Gems, CPAN etc.

    Paul Wise

    July 7, 2010 at 03:31

  12. A packaging system should place minimal requirements on local hardware.

    It should allow software to be installed even on headless computers with no Internet access.

    “Live” data (via internet) should be used to supplement package data, not replace it.

    Glenn

    July 7, 2010 at 04:54

    • >A packaging system should place minimal requirements on local hardware.
      +1

      >It should allow software to be installed even on headless computers with no Internet access.
      +1

      >“Live” data (via internet) should be used to supplement package data, not replace it.
      Here I disagree. Sometimes you need data locally, but sometimes you don’t. There should be a way to do both.

      Luka Marinko

      July 7, 2010 at 06:06

    • I think we are in violent agreement. “Data as a service” doesn’t mean “no data should ever be available locally”.

      Perhaps in my view, a lot more of this data could be “supplemented” rather than “packaged” compared to where we are today.

      Matt Zimmerman

      July 7, 2010 at 10:04

  13. It’s no longer useful to package up documentation in order to provide local copies of it on every Linux system. The web is a much, much richer and more effective solution to that problem.

    The trouble with that if you’re trying to maintain a “stable” (as in, “doesn’t change”) system is that a lot of upstreams do not keep documentation for old releases easily available on the website.[0] If you’ve got version 2 of something, are happy with it, and want to hold off using version 3 for a while due to some perfectly good reason of your own, then you might have a problem if the only documentation on the vendor’s website is for version 3, and not relevant for the software you actually have.

    [0] There are some notable exceptions, such as gcc.

    Karellen

    July 7, 2010 at 07:11

    • This is a good point. Some of the “data services” (such as online documentation) will need to get better in order to meet users’ needs. Still, I think this is worth pursuing as a long term direction.

      Matt Zimmerman

      July 7, 2010 at 10:06

  14. If you start reducing the number of packages in a distribution you will end up with a system like Windows. With Windows you only have the bare minimum to get a computer running, plus Minesweeper. It’s the user’s responsibility to find, get and update the software (s)he wants or needs. Are you envisioning such a Linux distribution? (Start here a discussion about the consequences of users failing to update their systems and thus inviting the black hats to use their computers for sending spam etc. Users of Linux distributions, and especially Debian-based ones, know that they will receive updates for all the software they use.)

    On the other hand, as a Debian user I have to agree with you. In my opinion Debian should stop trying to get a release ready and leave that to others like Ubuntu. I mean, Debian really expects their users to use the same version of the web browser for two years? Just recently I ended up updating my desktop to testing, after an odyssey that brought me past debian-desktop.org, only because I wanted a more recent version of PyQt. I somehow didn’t like the idea of installing a more recent version in /usr/local and so having two versions installed. But this is probably what I will do next time. Same with Firefox, Thunderbird, et cetera.

    Pjodrr

    July 7, 2010 at 10:26

    • I certainly don’t think that the Windows model is best. It is in some ways the opposite extreme.

      Windows says “applications are entirely somebody else’s problem”. Linux distributions say “applications are solely our responsibility”.

      I think what we need is a middle ground.

      I definitely don’t envision a world where we forgo updates. Taking Chrome and Firefox as examples, they certainly don’t lack updates. In fact, users who run the non-packaged versions will actually get updates faster than those who run the packaged versions (and with a smaller download). This is good for users, and good for the application developers, but awkward for distributions because they’re getting bypassed.

      I don’t think we have to choose between these models, but find a way to use each where it makes the most sense, and make the appropriate tradeoffs for each case.

      Matt Zimmerman

      July 7, 2010 at 10:37

  15. Any innovation about package management is a good thing for all of us.

    But about documentation… I think it needs more ‘to be written, translated, updated and maintained’ than ‘new ways to release it’.

    Traditionally documentation has been available locally as well as online… -doc packages, debian.org/doc, html, compressed, pdf, text etc.

    Local documentation has saved me from more than one crisis/issue… and, more than once, reading time while offline.

    I would prefer more human-power/automated-scripts updating the debian wiki/site than… new? ways to store/load the debian documentation.

    One cool thing to provide documentation from 2010 on… in new ways… could be to use other senses than the eyes: podcast/text2speech, official video-tutorials, manuals that include examples/exercises, child games… :)

    poisonbit

    July 7, 2010 at 10:30

  16. There is one thing with respect to documentation that wasn’t mentioned, namely why it actually makes a *LOT* of sense to have it locally: if new features are added, or if things change, the documentation on the web won’t be able to help you (anymore) with your locally installed version. I can point you to a Launchpad bug report for the abook package where someone complained that the documentation on the web didn’t work with their package.

    Scanning back through the comments I noticed this was pointed out already – but it isn’t only about “improving documentation” so that one universal manual could adapt to every version; such a manual would just grow in size and become useless over time because of all the historical material you wouldn’t want removed for that reason.

    So indeed, there is a very valid and reasonable place for local documentation, especially for people who tend to sit on the train and like to be able to work offline, too. Not many people can (or want to) afford the cost involved in a requirement to be online 24/7 just for the sake of being able to read documentation.

    Gerfried Fuchs

    July 7, 2010 at 11:08

  17. […] We’ve packaged all of the free software…what now? This post, effectively a manifesto for the next generation of packaging, is well worth reading as Matt articulately describes the same issues that led the OpenSolaris team to develop IPS. His solution differs – not one ring to bind them all, but rather a decoupling of cooperating package management approaches so that appropriate solutions can address specific needs. This is a call to order that deserves a serious, collective, non-partisan response. (tags: Packaging APT Provisioning FOSS) […]

  18. As an exclusive Linux user for 5+ years, I can say that what is missing is a stable rolling-release distribution, backed by a big name. That might sound weird, but it’s not.
    IMHO, a good model would be a more or less stable base system (kernel, bootsplash, xorg) with, let’s say, 1 or 2 updates per year, and bleeding-edge (according to upstream) applications (browsers, DE, general apps). So users would be freed from having to upgrade the system version every 6 months (or every 2 years, but using very old applications), and would always have an up-to-date system.
    I have used sidux for a while, but reverted to Kubuntu after having trouble.

    • That’s something that I have been thinking about, too. Generally, users want the latest and greatest applications and a rock-solid core. This may require some cooperation from the application developers, though. Some software requires a recent udev instead of HAL, for example. I guess now that the udev/HAL situation has stabilized, we could specify a stable core that includes a recent version of udev and perhaps HAL for backwards compatibility.

      Alex

      July 7, 2010 at 17:20

  19. Great problem statement, something I had had thoughts about too ..
    What about the following model:

    1- Only a minimal “core” Ubuntu OS is served to users via normal packages. The definition of minimal can be argued .. let’s move on
    2- Users are offered extra software from perhaps 3rd party vendors via a “market place”. The offered software comes as an add-on “image” that mounts over the core OS via perhaps unionfs, in the process adding/over-riding any libraries the application needs. Of course if another 3rd party app needs another version of the library, it should not be a problem; both can co-exist in the 2 different unionfs image mounts, such that each app gets its own preferred image
    3- The 3rd party software offered should be run in an isolated LXC-like environment, where it can never damage the core system, but will still have access to 3D acceleration and full hardware access for example. Guaranteeing system security will of course be a challenging task. Perhaps a permission-based apparmor/selinux system is needed here

    That’s “desktop” problem solved :) For “server” however, a very similar approach can work. The interesting part however is that “data” such as for example mysql’s actual DB files need to live inside that mounted image, and need to be portable such that the admin can extract mysql image with its data and move it over to a new linux server. In essence turning the server platform into a stateless machine (beautiful) and turning over the problem of delivering “mysql” updates and data format upgrades to the mysql image vendor

    Would love to hear your opinion, Matt

    Ahmed Kamal

    July 7, 2010 at 18:33

  20. It would be nice if someone could add kissd (http://www.popies.net/kissd/index.html) to the debian repository.

    thanks,
    paul

    Paul Cobbaut

    July 7, 2010 at 19:26

  21. I found this today: http://blog.nlinux.org/2010/07/listaller-0-4b-released/
    This project seems to be very promising, as it aims to provide one setup file per application (no packages). It is also completely decoupled from the system’s package manager. (The author talks about a “system packages layer” (providing system components and apps; this is APT or Yum) and an “applications layer” (the Listaller project) in some of his German posts.)
    You may want to look at this.

    Qualin

    July 7, 2010 at 21:18

  22. I’d like to add another requirement: the ability to install and manage applications without having to be ‘root’.

    For the past decade I have been working in environments where I am responsible for software on hosts where I don’t have root access. If I could install non-system packages as a non-root user, perhaps in another tree like /usr/local, I could get access to do so, but this is not possible as far as I can tell. Currently I am using local installs I have compiled myself, managed using graft and distributed using rsync. While this works well, it is less than ideal.

    – Keith

    erwbgy

    July 8, 2010 at 12:41

  23. Almost every day, OMG Ubuntu features interesting free software for Ubuntu. Today, for example, it was Burg. Yesterday, it was GmailWatcher. The day before that, it was Pinta. None of those three are available in the Ubuntu repositories — and that’s typical. So, personally, I find the suggestion that “virtually all of the free software available can be found in packaged form in distributions like Debian and Ubuntu” not so much “cheeky” as risible.

    This is not a failure of Ubuntu’s developers; it’s a sign of Ubuntu’s popularity. Packaging applications is what OS developers do when the OS is unpopular. Ubuntu is now popular enough that application developers are increasingly interested in packaging their software for Ubuntu themselves (as they already do with Windows and Mac OS X). This does not mean they’re the slightest bit interested in becoming a MOTU or a DD; they want to package their own software, not anyone else’s. That’s why we’ll have a new channel for these independent packagers in Maverick.

    It is great that application developers want to do this, because the current scheme of a separate set of developers packaging most things would never scale. Most people aren’t interested in what proportion of “the free software” is available. They’re interested in whether software that does what they want is available. To provide enough software to meet that need, we need not thousands of applications (like Ubuntu currently has), but hundreds of thousands of applications (like Windows and Mac OS X and iOS currently have). And while MOTUs do fantastic work, that work just isn’t exciting enough that we could expect their ranks ever to increase a hundredfold. Growing the user base a hundredfold would reach a billion users, and the developer base will always grow slower than the user base.

    So I agree with your first suggestion, to “Decouple applications from the platform” — not just for the reasons you state, but for the simple reason that the current scheme can’t last.

    I’ve also been thinking for a while about your third idea, to “Simplify integration between packaging systems”, from the perspective of Ubuntu Software Center. Imagine if USC transparently allowed installation and removal of items from cross-platform package catalogues such as addons.mozilla.org, rubygems.org, CPAN, CTAN, CRAN, and CEAN. Would that be cool, or what?

    mpt

    July 8, 2010 at 12:45

  24. Is it not exactly this problem which the Psys library is supposed to solve? (http://gitorious.org/libpsys/pages/Home)

    That project allows a developer to write an installer that connects to the Psys library. The Psys library in turn connects to the system’s package management system (be it APT, RPM or whatever). In this way, a single installer binary will work on every supported distribution.

    This will allow for simple, independent installations of software that integrates nicely with the current package system.

    It will work on any package system as long as a back-end is written for that package system.

    Pär Lidén

    July 8, 2010 at 19:32

  25. Wrt. decoupling, I agree. The current model of freezing the world for a release seems to be more of an outgrowth of the software-distribution-by-CD model that was maybe common at some point in the ’90s. However, IMHO it to some extent does a disservice to users and upstream. Consider a package in Ubuntu: at the moment there are apparently 6 supported Ubuntu releases, and hence there might be 6 different versions of it to support. Add the other major distributions, and you might have dozens of versions in use and in need of security and regression fixes. Needless to say, this is not how upstream works: in addition to development branches, there might typically be one or two maintained stable branches. It’s hardly surprising that upstream throws up their hands in exasperation and resorts to providing binary tarballs. Similarly, most users are probably quite happy with an older and proven stable core system (as long as it supports the hardware they have), but would like to use more recent versions of some specific apps, like Firefox or OO.org, or some specific apps or libraries for developers etc.

    One slightly different way to think about the packaging problem would be to have the package management system straightforwardly support multiple branches for each package; in other words, shifting the thinking from distribution branches with a single package version per branch to instead focus on individual packages and their branches. Each package could then provide one or more stable branches, some testing/unstable branches etc., with compatibility with one or preferably multiple core OS releases (thus avoiding the problem of having to support a different version for each OS release). Then the user should be able to specify the update policy on a per-package basis, e.g. use the latest version of the latest stable branch for those who want to be close to the bleeding edge. Or upon initial installation pick the branch with the longest time remaining until end of support (EOS), then when that time arrives switch to the stable branch with the longest remaining time until EOS.

    For a few successful examples of the above, consider e.g. the Apple App Store or the Android Market. According to Wikipedia, the Apple App Store launched in 2008 and currently has over 225000 applications, and the Android Market also launched in 2008 and currently has over 85000 apps, growing at 10000 per month. For comparison, consider that the current Debian release, again according to Wikipedia, has around 23000 packages, and Debian dates back to 1993. Similarly to the above, the lifecycle of apps for these two systems is not synchronized with the OS release, but rather there is apparently quite a simple dependency system that just specifies which core OS version is needed. Of course, the difference in growth rate cannot be solely attributed to the package management policy. One reason might be that it’s somewhat easy to monetize, as buying apps on those app stores is easy and there is a 70/30 revenue sharing agreement between the developer and the store operators. That might not go down very well with free software purists, but perhaps it’s something Canonical would be interested in doing as part of their Ubuntu One effort, for instance. As an aside, the next versions of both Windows and Mac OS X are reportedly going to include app stores, thus bringing some sort of package management to 3rd party applications on those platforms.

    jb

    July 8, 2010 at 20:46

    • Freezing is (almost) the only way to do integrated testing. You simply don’t know how packages are going to interact unless you actually test them together. Which is a part of why I find it so easy to break software – I almost always want to use configurations that were never tested together.

      It is also (almost) the only way to make sure packages are in sync. It is a VERY frequent problem that package X and package Y depend on incompatible versions of package Z. Sometimes this is because one or the other failed to rebuild against an update or the package dependencies were too tightly constrained. Other times it’s because package Z both broke compatibility and had no support for multiple versions being installed.

      I say almost to both of those cases, because it’s entirely possible for packages to include automated tests for integration issues, it’s entirely possible to set up automatic package regeneration and it’s entirely possible to handle multiple versions.

      But that’s not how packages are currently provided and the multi-versioning isn’t exactly something the package manager handles very well.

      In fact, QA in general is an area Linux distros have an enormous problem with, which is why freezes are such a vital part of the distro system. It’s not just for CDs, it’s for making sure things actually work. Even the Linux kernel has soft and hard freezes before a new release, to do the QAing.

      (And, frankly, for all that the Linux kernel has had its share of Brown Paper Bag releases, the kernel developers do an amazing job on the QA side. All too often, QA is left to end-users. Never a good strategy.)

      Jonathan Day

      July 8, 2010 at 21:28

      • I wasn’t arguing against the utility of freezing, I was arguing against the model where every package in the distro has to be forced into the same life cycle.

        The problem with packages X, Y and Z are solvable. In the common case that Z is a shared library, the author should strive to provide ABI compatibility through various means, as described e.g. in Drepper’s DSO howto. Failing that, the packager should provide multiple parallel installable versions of Z. If all else fails, X and Y can always bundle a version of Z (like firefox for ubuntu nowadays does for some libraries).

        And yes, indeed multiple branches per package is not something that package managers support very well today. But I see no reason why such functionality couldn’t be added, if the community decides that such features are needed. While I’m personally involved in a few open source projects, I’m neither a Debian nor an Ubuntu developer, so ultimately it’s not my decision to make; I can only offer my opinions and it’s up to the Debian/Ubuntu projects to decide upon the worthiness of those opinions. But Mr. Zimmerman’s blog post here suggests that even within the projects there are people who think the status quo is not the ultimate packaging policy for all eternity.

        jb

        July 10, 2010 at 16:46

  26. Check out http://www.portablelinuxapps.org – what you call the “Windows Model”, but improved.

    Dan

    July 8, 2010 at 21:29

  27. http://tinyurl.com/2udtev5

    1868+ packages need to be packaged.
    You have NOT packaged all of the free software.

    Anonymous

    July 9, 2010 at 11:37

    • I have either subscribed on Freshmeat to, monitor the pages of, or otherwise know of probably another couple of thousand actively maintained unpackaged free software projects that would likely be of interest to a reasonably significant number of people.

      That would actually be no big deal, if the package manager was capable of supporting any GNU autoconf-based source tarball that used conventional makefile parameters for compilation and installation and which permitted replacement versions of install.

      (The package manager could provide an installer that supplied the package manager with a list of files that would be installed, the package manager could then check to see if that should be ok, and if so the install could proceed and a package database entry auto-generated.)

      That would not be true packaging but it would, at least, provide some level of support for code that is currently entirely outside the packaging system.

      One of the big dangers of running a hybrid system with a mix of packaged and unpackaged software is that filenames are rarely unique to a package. This can lead to inadvertent replacement, mistaken deletion, and all kinds of other unwanted side-effects.

      If you are installing entirely from source, it’s usually (not always, but usually) possible to manually verify filename safety. If you are installing entirely from packages, the package manager ensures it. On a hybrid system, all bets are off.

      Jonathan Day

      July 9, 2010 at 17:05

      • There’s a program called “checkinstall” that purports to do what you suggest. Last I used it, which was quite a while ago, it worked pretty well with autotools style installations.

        jb

        July 10, 2010 at 16:51

  28. Surprised to see no mention of http://0install.net here. It integrates the use of out-of-prefix installs with the distribution packages (using PackageKit).

    For example, you could specify that your program depends on Java >= 1.5 (using the XML metadata format). If a recent enough version of Java is installed as a native package it will use that, otherwise it can download a newer version and install to a separate directory, setting the appropriate environment variable when the program is run.

    That makes it pretty easy to release a single package that works across multiple Linux/Unix distributions, with automatic dependency handling, updates, digital signatures, etc.

    Thomas Leonard

    July 10, 2010 at 12:40

  29. More “new solutions” needed to old problems :O

    Mozilla did nothing to “force” Ubuntu to ship major version changes as updates; the whole point of a distro is to smooth out things like that. If upstream drops support, then you guys pick it up and support it for the rest of the 1-2 years that users need it. Not everybody wants the latest and greatest; some of us want to install a 10.04 system and know it will still be the same, supported, secure system until 13.04 (hint: this is even more apparent when you look at users rather than outspoken geeks, e.g. http://gs.statcounter.com/#browser_version-ww-monthly-200807-201007 shows a lot of people still on IE6 and IE7). I want to be able to install Ubuntu on a PC, leave it running apt-cron and KNOW it will still work when I next go to my parents’ house in 6-7 months, and that is very much what Ubuntu wants to offer if you’re looking to expand (especially expand into corporate desktops).

    Now there are a lot of loud geeks wanting in-release updates (hell, I run the latest kernel, latest Firefox and latest xorg), but what these geeks need are PPAs, not wholesale reform. What would be nice would be a change in the system so multiple lib versions can be installed and packages pick the right library (e.g. no more /usr/lib/xulrunner, just /usr/lib/xulrunner-1.9, and all packages that need 1.9 know where to get it); that allows PPAs (or even 1st party PPAs) to offer multiple versions of an app side by side without crippling package management by packaging the dependencies with the package itself (this plays in very nicely with btrfs being used to save space).

    I’m not too sure about data, maybe you have a point, but couldn’t -data packages be excluded from freezes, thus letting package maintainers issue updates much more easily and not requiring increased complication in the programs themselves (e.g. why have update-pciids, when a package called pciids-data could exist and be updated regularly; hell, for data packages it could be automated and would be just as reliable as update-pciids)? There are cases where these solutions will not work, but I’d rather see apt-get-data or something similar at work so that the files are still checked/hashed and if needed can be easily rolled back and verified.

    Last time I looked, the way to download a Python package was apt-get install python-$name (there are ~1200 such packages). Now I’m no expert on Perl or Ruby, but why can’t automated build systems or wrappers to/from apt be used to move all those dependencies through apt/dpkg rather than cpan/gem? If both were available as options, a user would be able to either install the latest CPAN modules (unsupported) through the wrapper or those supported with the Ubuntu release through static packages (it’s my understanding that this is what happens with Python, and it seems to work well). With even 2nd/3rd parties offering deb installers (e.g. when you compile your own kernel there is that option; when you install Opera there is a repo for it), why are you so keen to go in the opposite direction and move away from apt/dpkg rather than integrate other tools with it?

    As for embedded systems not being properly served by package management, well, I think you’re probably right, but then again there is no need for “One Linux”. Besides, it’s more to do with the packages themselves; you can always recompile with less support or build a huge repository of packages with different cflags and then match the cflags at update time. Sure, if Ubuntu were to do this it would need new package management, but last I checked that wasn’t what Ubuntu was about to do (you also lose out on a lot of testing, as fewer people will have any given package combo).

    Sorry to disagree entirely, but I think you asked some good questions and came to the wrong answers:
    “How can package managers cope with embedded systems/virtual appliances?” Why does this matter to Ubuntu?
    “How can packaging deal with constantly updating data?” By constantly updating.
    “How can packaging deal with complex ecosystems out of its control?” Erm, if the systems are out of its control there is nothing it can do; if there is any way that package management can help, it’s by abstracting something bigger on top.
    “How can Ubuntu’s package manager cope with constantly updating apps?” They shouldn’t; there are already plenty of rolling release distros. Multiple library versions would allow for nicer updating from PPAs, though.
    “What can we do about uncontrolled package management systems?” Take control of them through automated builds and wrappers.

    Rioting_pacifist

    July 10, 2010 at 13:01

  30. These are all Debian-specific issues. Gentoo Linux solves all of them by letting the user decide to handle them. Gentoo Linux’s package manager, portage, is a tool that automates the processes of dependency management, fetching software, compiling software, installing software and uninstalling software. Anything that does not fit into one of those roles is something that the user handles, and it works very well.

    Richard

    July 10, 2010 at 20:53

    • I seem to have made one too many edits before I clicked “Submit Comment”. I omitted “how” in “Gentoo Linux solves all of them by letting the user decide how to handle them.” :/

      Richard

      July 10, 2010 at 20:54

  31. I would like to differentiate between desktop and server contexts. With regard to desktops, I think that there is a lot of truth with your arguments and all the mentioned solutions can be seen as an improvement for the desktop user.

    For the server administrator the whole problem looks IMHO totally different. I believe that the server administrator actually profits from an “everything is a package” approach with easy validation and easy management of all components on a single node. Of course *local* packaging cannot (and IMHO will never) solve the problems between servers. But a solid view on the state of a single node is a very valuable tool!

    ATM I am working on a new data center management approach that follows an “everything is a file” idea and proposes to deliver all local files on a single node via packaging (RPM, DEB …). Including configuration, data (where locally stored), applications etc. The question of how the packages are created is IMHO the key to achieving any degree of flexibility required, not loosening the concept of packaging all files on a server.

    So far, this approach works very well (see http://schapiro.org/schlomo/papers/LinuxTag%202010%20-%20%20Advanced%20Software%20Management%20with%20RPM.pdf for an early version of the idea and ping me if you want to discuss it further). It obviously requires a central management solution that handles system interdependencies and obviously the application software should be more robust with regard to version mismatches.

    In my opinion, the strength of “everything is a package” is the unbeatable strong control over a server and the validation of what is going on there. I would not want to miss that for my data center. And yes, the hurdle of packaging software with its dependency is a welcome stage in the quality assurance for the data center :-)

    Schlomo Schapiro

    July 12, 2010 at 09:51

  32. […] This article discusses the advantages and disadvantages of the package management systems used by […]

  33. […] be investing my time in research, experimentation and imagination. This includes considering how we package and distribute software, how we adapt to technological shifts, and highlighting opportunities to cooperate with other open […]

  34. I would also include commercial packages. Apple’s App Store is a model similar to most packaging ecosystems except developers actually get paid. Being able to buy a package has been neglected within most Linux package managers.

    sneakin

    August 21, 2012 at 14:56

  35. […] while back mdz blogged about challenges facing Ubuntu and other Linux distributions. He raises the point that runtime libraries for Python / Ruby etc […]


