Split the kernel up

December 20, 2005

I’ve had this idea for quite a while, but only in the last few weeks has it materialized enough to put into readable text.

Now, we all know Linux on the desktop and Linux in the embedded world didn’t take off as well as many would’ve hoped for. In the server world, Linux has put its feet firmly into the ground, but outside of that, Linux means fairly little. The desktop market is ruled by Windows (OS X doesn’t mean all that much at this point), and the embedded world (phones, PDAs) is pretty much ruled by Symbian, Windows and PalmOS (the last being on its way out though).

I have a very simple explanation for why Linux ain’t getting there: the monolithic nature of the kernel. The vanilla Linux kernel tries to be a jack-of-all-trades; it tries to accommodate every possible usage scenario, whether it be supercomputers or mobile phones, and that simply ain’t working.

So, what do I advise Linus Torvalds to do? Well, besides learning some manners, and besides stopping breaking driver compatibility each point release, and besides stopping treating 2.6 as a dev. branch, I advise him to split the main branch up into three separate branches: kernel-2.6-embedded, kernel-2.6-server, and kernel-2.6-desktop. Then, optimize the fcuk out of each of those branches.

For the desktop branch, work closely with the GNOME/KDE/Xfce/etc guys, let those teams have more say in what goes in the -desktop branch and what doesn’t. This way, the desktop environments’ lives get easier, and thus the overall desktop experience will improve.

Do the same for the other branches (e.g. work together with Red Hat on the -server branch), and it will mean a huge leap forward for the Linux world.

Another solution would be to turn the Linux kernel into a proper microkernel: the advantages of a microkernel (i.e. stability) far outweigh the disadvantages (overhead). 15 years ago, when computing power was limited, it made sense to make a monolithic kernel. But in this day and age, with all the computing power going to waste, the overhead is negligible.

You can’t seriously expect a kernel with 6 million lines of code to work flawlessly.
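
To make the line-count argument concrete, here is a rough back-of-the-envelope sketch in Python. It assumes defects are scattered independently through the code at a fixed rate, so the chance of a code base being completely bug-free follows a simple Poisson model. The defect rate is a made-up placeholder, not a measurement of Linux or any other kernel, and the function name is hypothetical. The 3,800-line figure is the one quoted for Minix3 in the comments.

```python
import math

# Hypothetical defect rate, chosen only for illustration (NOT measured data).
ASSUMED_DEFECTS_PER_KLOC = 0.1


def p_bug_free(lines_of_code: int, defects_per_kloc: float) -> float:
    """Probability of zero latent defects under a Poisson model."""
    expected_defects = lines_of_code / 1000 * defects_per_kloc
    return math.exp(-expected_defects)


if __name__ == "__main__":
    code_bases = [
        ("monolithic kernel", 6_000_000),   # figure used in the post
        ("microkernel core", 3_800),        # figure quoted for Minix3
    ]
    for name, loc in code_bases:
        p = p_bug_free(loc, ASSUMED_DEFECTS_PER_KLOC)
        print(f"{name:18s} {loc:>9,d} lines  P(bug-free) = {p:.3g}")
```

With the assumed rate, the 6-million-line code base is expected to carry 600 latent defects against 0.38 for the small one. The exact numbers mean nothing; the size of the gap is the point.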

13 Messages »

  1. All good Thom, except the Microkernel stuff.

    I’m not sure what real advantages there are in a microkernel. Arguments like “if you have a crash in the graphics code you don’t take the kernel down” don’t hold much water; for practical purposes the machine is dead anyway. So making all these architectural changes to something that already works today and is pretty stable is not a good decision. If Linux was supposed to be a microkernel, it should have been one from Day 1. Now it’s too late, except maybe for some minor further modularization.

    Comment by Eugenia — December 20, 2005 @ 8:10 pm

  2. The advantage is really there. If the graphical system fails, it can automatically restart itself; in fact, any driver can do that. You just have to configure it that way.

    In a monolithic kernel, all drivers live in kernelspace and, as such, pose a huge threat to stability. For a monolithic kernel to have infinite uptime, you need 6 million lines of bug-free code (let’s keep Linux as the example). In a microkernel, for instance Minix3, you only need 3800 lines of bug-free code: namely the small part that lives in kernelspace and handles message passing between the parts living outside of kernelspace (and, in Minix3’s case, the hardware clock :D).

    When a bug crashes a driver in a monolithic kernel, the entire system goes down, and must be restarted. After fixing the bug, you need to restart the system AGAIN, because you must load the kernel with the fixed driver.

    In a microkernel, when a bug crashes a driver, the driver will restart itself, without any downtime at all. Then you can fix the driver, drop it into the system, and reload it, this time using your new, fixed one, also without taking the system down (obviously you do lose the functionality that the driver provided).

    As you can see, for mission-critical systems, a microkernel makes a lot more sense. It’s just that the microkernel design suffers from a serious image problem (the so-called overhead).

    In any case, of course I know this isn’t a viable option at all. I just said it to make a case for the microkernel design, which is inherently better than a monolithic kernel in this day and age (of excessive computing power).

    Comment by Administrator — December 20, 2005 @ 8:29 pm

  3. In the embedded world, the three you mentioned (Symbian, Windows and PalmOS) are hardly players. Those are PDA/cellphone OSes. The embedded world is all QNX and VxWorks. Windows just can’t fit in this space, a space in which the OS has to be less than 400k.

    I do agree with the general idea about Linux. Linux can never be the real-time OS that QNX or VxWorks are; it’s not that it’s monolithic, it’s the way it handles drivers and user space applications. Embedded OSes are designed from the ground up to be embedded; Linux has to be hacked.

    Comment by Chris — December 21, 2005 @ 3:14 am

  4. >I advise him to split the main branch up into three separate branches: kernel-2.6-embedded, kernel-2.6-server, and kernel-2.6-desktop. Then, optimize the fcuk out of each of those branches.

    I’m not sure that this is really needed. Personally I think that one of the biggest issues with say “desktop Linux” is the countless DEs, and packaging formats, and the lack of consistency between the various Linux distributions, as opposed to being a fault of trying to make a kernel that can work well in all three of the situations you’ve cited (embedded, desktop, server).

    That’s not to say I don’t have issues with Linux or some of the decisions that were made by its developers. Breaking driver compatibility between point versions is, IMO, a terrible failing on the part of the developers, leaving users to mess around with binary drivers and spend hours making something work. If the developers had just been a little more thoughtful when they designed the driver interfaces, we’d not have this problem.

    I do like the idea of microkernels, I always have. The idea of everything living in one address space has always seemed to me like “building a castle on a swamp” and in many cases, I’ve been burned because of it. Buggy drivers happen, and as much as the fan boys would like to have us all believe, Linux isn’t immune to them.

    Unfortunately, Linus is a stubborn creature, and I seriously doubt that he would ever allow his tree to become based on a true microkernel. He’s even dead set against maintaining driver compatibility.

    Comment by Trent Townsend — December 21, 2005 @ 9:23 pm

  5. >I’m not sure that this is really needed.

    Actually there are cases where this is needed. For example, Linus has consistently refused to include in his mainline kernel some changes from Robert Love that fix multimedia and audio performance (for desktop/workstation usage) because they hurt server performance. Additionally, other stuff like ACPI, 3D GL drivers, inotify etc. are things that a desktop needs but a server or an embedded system doesn’t really need.

    It might be a good idea to split it up, and it might not be a good idea at the same time. But the bottom line is that neither the desktop nor the embedded version of Linux gets as much love and attention as the server-friendly patches do. It makes sense to focus on the server more, as it makes more money and it’s the strong point of Linux. But there are gains to be made in other market areas too, and Linus does not put a lot of weight on them, despite OSDL’s efforts to help the desktop in a number of ways.

    Linus just doesn’t care much about the desktop. If he did, he wouldn’t break API and binary driver compatibility so easily.

    Comment by Eugenia — December 21, 2005 @ 9:44 pm

  6. >Actually there are cases where this is needed. For example, Linus has consistently refused to include in his mainline kernel some changes from Robert Love that fix multimedia and audio performance (for desktop/workstation usage) because they hurt server performance.

    I don’t follow the LKML nearly as often as I do the various BSD lists, so could you perhaps fill me in a bit? Are these patches scheduler related? If so, could they not do what various other groups are doing, and implement swappable schedulers? IIRC, both NetBSD and DragonFly for example are working on the ability to do so on the fly, and IIRC, Solaris has had this ability for ages.

    I don’t know enough about inotify to ask anything intelligent about why it isn’t good for a server.

    Comment by Trent Townsend — December 21, 2005 @ 10:05 pm

  7. I can’t remember the actual name of the patch. It was a very popular issue, discussed back in 2001-3. It was Robert Love’s attempt at better multimedia/desktop performance, but these things do impact server performance because they optimize some parts of the code using resources from other parts. It’s the same reason why BeOS was so good at multimedia but sucked as a server. Can’t remember the details now, it’s been too long.

    Comment by Eugenia — December 21, 2005 @ 10:09 pm

  8. >but these things do impact the server performance because they optimize some parts of the code using resources from other parts.

    You’ve been at this longer than I have, so I’m going to trust your judgement here in the case of Linux.

    I suppose it all boils down to limitations of the architecture, and the stubbornness of its maintainers ;^)

    But I remain unconvinced that a kernel cannot be designed to work well at any given task, so long as it has both well thought out and stable KPIs, and is designed such that various components are truly hot-swappable (like in the case of schedulers, drivers etc.).

    Comment by Trent Townsend — December 21, 2005 @ 10:26 pm

  9. The problem lies in the complexity of the kernel. The more stuff you add to the kernel ( = the more code), the larger the chance of a bug (as more code = more chances of bugs).

    By keeping the kernel smaller, you can decrease the chance of bugs occurring. Now, a *desktop* user can be bothered by a *server* bug. If you split the kernels up the way I described, you reduce the amount of code in each kernel, and thus decrease the chance of bugs.

    It’s so simple, I’m actually wondering why Linus hasn’t thought of it himself, yet.

    Comment by Administrator — December 21, 2005 @ 10:31 pm

  10. >The problem lies in the complexity of the kernel. The more stuff you add to the kernel ( = the more code), the larger the chance of a bug (as more code = more chances of bugs).

    Yes, that’s the thing about microkernels that I’ve always found appealing.

    >By keeping the kernel smaller, you can decrease the chance of bugs occurring. Now, a *desktop* user can be bothered by a *server* bug. If you split the kernels up the way I described, you reduce the amount of code in each kernel, and thus decrease the chance of bugs.

    Perhaps I’m misunderstanding your POV here. If you’re saying that a microkernel with modules A B C would be for a server, and the same microkernel with modules X Y Z for a desktop, and the same microkernel with module N for embedded, then yeah, I can agree with you 100%. But to have three completely different kernels? Yuck.

    >It’s so simple, I’m actually wondering why Linus hasn’t thought of it himself, yet.

    He’s a much better coder than I plan to ever be (sooo boring!), but he’s a terrible software architect from what I see.

    Comment by Trent Townsend — December 21, 2005 @ 10:40 pm

  11. I disagree strongly on the uKernel part. The few uKernels out there are raving disasters: impossibly hard to debug (just ask RMS) and too slow to bear. The successful uKernel (which happens to be commercial) is only good at real-time issues; server performance would suck beyond this world due to how its scheduler prioritizes I/O and CPU.

    With the userspace driver API coming up in Linux, the uKernel “advantage” of every driver living in userspace is also moot.

    Comment by God — July 26, 2007 @ 7:06 pm

  12. What kind of suggestion is that??

    Thom, do you ever follow LKML?
    I doubt it.

    Essentially the vanilla kernel is the core for all the stuff you mentioned.
    It needs to be stable and running.

    Once it is up and running, you can always tweak it to your choice(s).

    Distros give you a heavily patched kernel for desktop performance, don’t they?
    They are at liberty to make changes, build a new kernel and release it.

    Maybe sometime in the future Linus may include it in the official tree too.
    If not, it will still reach the users through the distros.

    E.g. DRBD worked outside of the kernel for years; now it is being considered for future releases. Does this make sense?
    I guess it does.

    There is a lot more in life than blogging.
    Try hacking the kernel rather than pointing fingers at developers.

    Thanks!!!

    Comment by rautela — July 27, 2007 @ 4:06 am

  13. I strongly disagree that “microkernel linux” would be much better.

    You say that the problem with linux is that it tries to be a jack-of-all-trades. Presumably, this means you must sacrifice server performance to gain desktop performance.

    If performance is such an issue, then why are you willing to overlook the performance costs of a microkernel in the first place? You’re confusing the argument: the added stability does NOT address the same problem as branching the kernel into embedded / server / desktop.

    If anything, Linux is an argument AGAINST microkernels. It shows that it’s possible to scale to production quality without show-stopping stability issues and *still* retain performance. Moreover, the overwhelming majority of kernel bugs reside in device drivers. This can be addressed with the new userspace Linux API (as mentioned above by someone else).

    [As an aside, just because device drivers are in userspace doesn’t guarantee they can’t cause havoc. Consider, for example, DMA transfers that bypass virtual memory.]

    Lastly, ACPI and inotify have both been merged into the mainstream kernel. I don’t know that Linus was ever blocking 3D drivers from mainstream. Although ck’s SD scheduler never made it, CFS took ideas from SD, and has been incorporated into the mainstream kernel. So as long as the needed patches eventually make it in, is this really an issue?

    Comment by will — August 1, 2007 @ 12:01 am
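
The restart behaviour argued over in the comments (a crashed driver gets restarted without taking the rest of the system down) can be sketched in userspace. This is only an analogy in Python, assuming the "driver" is an ordinary child process and a "crash" is a non-zero exit status. It is not how Minix3 or any real microkernel implements it, and `DRIVER_SOURCE` and `supervise` are hypothetical names made up for the illustration.

```python
import subprocess
import sys

# Hypothetical "driver": a tiny program run in its own process.
# It simulates a bug by exiting with a non-zero status on its first two starts.
DRIVER_SOURCE = """
import sys
attempt = int(sys.argv[1])
sys.exit(1 if attempt < 2 else 0)
"""


def supervise(max_restarts: int = 5) -> int:
    """Run the driver in a separate process and restart it when it crashes.

    Returns the number of restarts that were needed before a clean exit.
    """
    for attempt in range(max_restarts + 1):
        result = subprocess.run(
            [sys.executable, "-c", DRIVER_SOURCE, str(attempt)]
        )
        if result.returncode == 0:
            return attempt
        # The driver died, but the supervisor (this process) keeps running.
        print(f"driver crashed (exit {result.returncode}), restarting")
    raise RuntimeError("driver kept crashing, giving up")


if __name__ == "__main__":
    print(f"driver came up after {supervise()} restart(s)")
```

The point of the design is only that the failure stays inside the child process. As comment 13 notes, real drivers can still do damage from userspace (for example through DMA), which this toy does not model.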
