lca2011 – Thur Session 1 – Kernel Development

Kernel Development – How things go wrong – Jonathan Corbet

  • Kernel dev a success – ~4.5 releases/year
  • Linux coming soon to your toothbrush
  • Why talk about failure? It gives the kernel process a bad name, scares people off and generates bad press, but failure can teach us
  • Note: The kernel community does not lack for clowns – won’t be covering them, names named here are people he respects
  • Example 1 – Tux3 file system by Daniel Phillips
  • 2008/07 announced, 2008/11 booting, 2009/08 last commit – project now dead
  • Do NOT add stuff to out-of-tree project, merge early
  • Lesson – out-of-tree code has few users and few contributors, is invisible, and has to be kept in sync with mainline, which is harder work
  • Lesson – get in mainline early
  • Example 2 – em28xx video4linux driver
  • 2005/11 merged, 2008/01 last patch by original author, 2009/08 author leaves completely
  • “People submitting code have to be aware they will lose control”
  • e.g. 2004/05 Hans Reiser tries to block updates to the reiser3 fs; he wanted to keep control and concentrate on reiser4
  • Maintainership does NOT mean ownership
  • Example 3 – 2.5.x IDE
  • 2002/02 Martin Dalecki’s IDE cleanups, 2002/03 IDE18 subsystem takeover, 2002/08 IDE115 merged, 2002/08 Martin quits and all IDE work is reverted
  • What happened? – “Breakage is the price you have to pay for advancements” – Martin Dalecki
  • Kernel intolerant towards regressions, listen to people who complain
  • Example 4 – Deadline Scheduler
  • 2007/03 first post, 2 days later Linus likes it, 2 weeks later Linus gets irritated (due to regressions), 2007/04 Molnar posts CFS, 2007/07 CFS merged, 2007/07 Con leaves
  • Con leaves unhappily
  • Lessons – Improve for everybody, not just a small set, at least don’t make it worse
  • Lessons – Some parts of the kernel are hard to change
  • Lessons – Participate in the wider kernel mailing list discussion, not just the list for your own changes or subsystem
  • Lessons: Aim for the solution to the problem, not for the inclusion of specific code
  • Example 4 – reiser4
  • 2002/10 first code post, 2003/07 merge request, 2004/08 added to -mm, 2005/09 push for 2.6.14, 2006/07 push for 2.6.19, 2006/10 arrested
  • Problems – non-POSIX behavior, numerous technical difficulties, hard-to-reproduce benchmarks, antagonistic approach to others, memories of reiser3
  • lessons – Linux is not a research system
  • Lesson – visionary brilliance will not excuse poor implementation
  • lessons – it’s best not to accuse others of conspiring against you
  • lessons – community remembers the past and thinks far into the future
  • Example 5 – systemtap
  • 2003/11 DTrace debuts, 2005/10 RHEL4 introduces SystemTap, 2008/07 ftrace merged, 2009/06 perf events merged, 2009/09 SystemTap 1.0 released; not yet merged, possibly never
  • 2008 kernel summit: 50% had used SystemTap, 20% had succeeded
  • bad sign if even kernel devs can’t make it work
  • lesson – if the kernel dev community doesn’t see value, it won’t go in
  • Example 6 – Talpa – provide hooks for anti-virus
  • posted in Aug 2008, never merged as such
  • Problems – kernel devs didn’t like it: broken security model, badly expressed requirements
  • But – fanotify is accepted, same authors, similar code
  • What changed – cleaned up file event notification (replace inotify and dnotify)
  • What changed – Enable virus scanners to hook into file ops without using rootkit techniques
  • lesson – sell to developers, not to managers or customers
  • lesson – user-space API matters, hard to change later
  • Example 6-15 posted
  • Why bother? – “Why not hack on CMSes or something else where the standards are a little lower”
  • Fun!
  • Elite club
  • you will get job offers
  • Influence – how the kernel meets your needs
  • VFS filesystem patches by Nick Piggin – worked – intrusive and tricky code – good code – listened to people – didn’t change behavior for people, all behind the scenes