LCA2012 – Friday Morning

Bloat: How and Why UNIX Grew Up (and Out) – Matt Evans and Rusty Russell

  • Cool projects: spark, plover, Homebrew Cray-1A
  • Compare PDP-11 Unix vs Modern Ubuntu 11.10
  • Binary sizes: cat 152 bytes vs 531 KB
  • grep command: 2176 bytes vs 687 KB
  • ls command: 4904 bytes vs 628 KB
  • V6 cat command is just 12 lines of assembler, with two 512-byte buffers; the a.out header adds only 16 bytes of overhead
  • Binaries are 30% larger because we chose speed over size, for only a ~9% speed gain
  • V6 runtime coverage: cat 99%, grep 78%, ls 85%
  • V7 has reduced coverage; some commands were converted from assembler to C
  • x86 runtime coverage with dietlibc: cat 11%, grep 23%, ls 39%
  • Statically linked x86 cat pulls in 700 KB of libc dependencies: 17% of libc, spread across 313 objects
  • libc is 1.7 MB, but it is widely shared among hundreds of processes
  • Dynamically linked ls touches only 90 KB of libc, but 476 KB gets paged in
  • On a sample system, libwebkit is 8.5 MB with 5 MB of it wasted, adding up to about 33 MB of wasted real RAM
  • What about a 64-bit version of a PDP-11 – a “PDP-44”? Various assumptions on how binary size would increase
  • PDP-44 – binaries would be around 50% larger
  • 32-bit Ubuntu binaries are about 9% smaller than 64-bit Ubuntu ones
  • Forward-porting V6 binaries to x86: V6 cat ends up almost the same size as the dietlibc version
  • Forward-porting V6 ls was more work: lots of assumptions no longer hold and old code tricks no longer work. 20% larger because of ELF and mmap; a 120% penalty due to modern infrastructure (e.g. malloc/realloc)
  • Backport x86 “ls” and “cat” to V6. Only backport some options
  • cat: remove old options and error reporting. Kept some features.
  • ls: remove lots of options.
  • Binaries were 60% larger due to flexibility
  • 440% bloat due to new features
  • asmutils – a reimplementation of current Linux utilities in x86 assembler
  • The talk is online; it was hard to take notes since it jumped around a lot and the graphs were hard to read


Open vSwitch – Simon Horman

  • A switch contains ports, each port has one or more interfaces, and packets are forwarded by flow
  • Flows may be identified by many combinations of fields: address, VLAN, ports, ToS
  • The first packet in a flow is sent to the userspace controller; the controller makes a decision, tells the datapath what to do with future packets, and resends the first packet back to the datapath. Later packets the datapath handles itself, since it knows what to do from a hash-table lookup
  • Configured via a JSON database, which persists across restarts
  • The database is controlled via a Unix socket or via TCP; a change action won’t return until the database update has been performed
  • Cute “--may-exist” option when creating things, which does nothing if what you requested already exists
  • He demoed standard sorts of things – trunk interfaces, port mirroring – all fairly simple commands
  • Does VXLAN and GRE tunnels
  • Oracle is looking to put it into Oracle Linux soon to replace the current bridging code
  • Can do millions of packets per second, though there are some bottlenecks in the tunneling code
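The demos roughly corresponded to ovs-vsctl commands like the following (a sketch on my part: the bridge br0, interface eth1, and remote IP are illustrative, not from the talk):

```shell
# --may-exist makes the command a no-op if the object already exists.
ovs-vsctl --may-exist add-br br0
ovs-vsctl --may-exist add-port br0 eth1

# Port mirroring: send a copy of all br0 traffic out of eth1.
ovs-vsctl -- --id=@p get port eth1 \
          -- --id=@m create mirror name=m0 select-all=true output-port=@p \
          -- set bridge br0 mirrors=@m

# GRE tunnel port to a remote endpoint (address is illustrative).
ovs-vsctl add-port br0 gre0 \
          -- set interface gre0 type=gre options:remote_ip=192.0.2.1
```

The `--` separators chain several database operations into one transaction, which matches the note above that a change doesn’t return until the database update has actually been performed.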