- Building a Website to 200 Million pageviews and beyond. – ( slides and video ) Very interesting talk about youporn.com migrating to a new architecture. The main link is to a summary on highscalability.com
- Why Jonny can’t ride – Why biking to school is banned at many US schools.
- Meet the New Boss, Worse than the old boss – Musician David Lowery compares economics of music now and in the past
- A ticking time-bomb – How the lack of time-synchronisation of medical devices can kill
Author: simon
Links: Scaling Pinterest, NYT and SF charts, Flickr
- How Yahoo Killed Flickr and Lost the Internet – Could Yahoo have grown the Flickr community from 2005 and beaten out Facebook?
- A Chart that Reveals How Science Fiction Futures Changed Over Time
- Amanda Cox and countrymen chart the Facebook I.P.O – Seriously cool behind-the-scenes look at the charts in the New York Times
- Pinterest Architecture Update – 18 Million Visitors, 10x Growth, 12 Employees, 410 TB of Data
Links: safecrackers, media, olympics, biography, singularity
- Interviews With People Who Have Interesting or Unusual Jobs: Ken Doyle, Safecracker
- Fungible A treatise on fungibility, or, a framework for understanding the mess the news industry is in and the opportunities that lie ahead
- Dear New York Times & Wall Street Journal: How About Some Sensible Digital Subscription Pricing?
- Can London Afford the $14.5 Billion Price Tag of the Summer 2012 Olympic Games? I think Vanity Fair gets a bad rap, it has good articles and lots of pictures of pretty people
- Is the multi-volume biography dead? I’ll admit I have attempted the 8-part Winston Churchill biography but ran out of steam with a couple of parts to go.
- Welcome to Life « Tom Scott A science fiction story about what you see when you die. Or: the Singularity, ruined by lawyers.
Links: China vs UK, glasses, unlimited flying & Online Ed.
- What My 11 Year Old’s Stanford Course Taught Me About Online Education
- Recycling Eyeglasses Is a Feel-Good Waste of Money If nothing else I need to pay less when I get some new glasses.
- A Tale of Two Terminals comparison between the launches of Terminal 3 at Beijing Capital Airport and Heathrow Terminal 5 from the perspective of someone used to handling complex software implementations
- The frequent fliers who flew too much This reminds me a lot of Internet (especially ISP) based unlimited accounts that fail to take account of people’s usage patterns when additional usage is free
- When half a million Americans died and nobody noticed Has the death toll from the drug Vioxx been greatly underestimated?
Interesting links for May 5th 2012
- A Relevant Tale: How Google Killed Inktomi – Overview on how early search engine Inktomi was knocked out by Google. The Hacker News discussion is quite good and includes a link to a video talk by Inktomi’s co-founder.
- Reddit interview – IAmA Part Time Hooker in New Zealand – Raw Interview or Summary of Q/A . Prostitution is legal in New Zealand so some people in other countries find it interesting how it works. NSFW obviously.
- Are Shakespeare’s Plays Encoded within Pi? – YouTube . The full text is in the text section of the page.
- Gather – Auckland BarCamp has decided to rebrand itself as “Gather”. Not sure of the point, especially since they don’t even own gather.co.nz (or gathernz.com or something). Anyway it’s a pretty good unconference that happens each year. This year it is on June 30th
New Year’s Resolutions – 3 month progress report
At the start of this year I made some New Year’s Resolutions. I thought I’d review how I was going after 3 months.
- Weight – Unchanged. Doing a bit of work here but obviously not enough, at least it is not going up.
- Driver License – Not started yet
- Chess – Done a lot of work here, plenty of practice and I’ve scored some good results. Feel I’ve made some improvement
- Programming – Not started yet
Overall it doesn’t look so good but I’m actually pretty happy. The chess is going well and I’m intending to start the programming course later this month.
LCA2012 – Friday after lunch
Codec 2 – David Rowe
- Open speech Codec. Low bitrate 2400 b/s down to 1400 b/s
- Applications for digital radio
- Fills <5000 b/s gap
- http://rowetel.com/codec2.html
- Not a DSP talk
- Can send 45 calls inside a 64 kb/s channel
- Not useful for VoIP due to IP/UDP overhead of 8 kb/s on 1400 b/s of data
- Main use is radio spectrum. Less data = less power required, since your power gets concentrated on fewer bits
- doesn’t matter too much if odd packet dropped
- proprietary codecs slowing digital voice over radio
- Proprietary codecs: come in hardware or licensed software form, difficult to distribute, can’t modify
- Example g729 license $40k. Doesn’t believe closed source codecs benefit society
- Authors of proprietary/patented codecs borrowed heavily from the public domain. Perhaps 5% is original. Good news is only 5% needs to be replaced
- Speech coding: e.g. 16-bit samples at 8 kHz, compressed to 1400-2400 b/s. What can we throw away while retaining intelligible, natural speech? Use a model of speech and send the model parameters, which is more efficient than coding the waveform
- Model: example is pitch, humans 50-500 Hz, can be represented with 7 bits, updated every 20 ms: 7/0.02 = 350 b/s to represent pitch
- Codec 2 uses sinusoidal speech coding: multiple sine waves added together
- Bit allocation: 56bits every 40ms. Of these: Amplitude 32 , Frame energy 10 , voicing 4, pitch 10
- Developing codecs: complex DSP algorithms, run codecs in non-realtime, dump values from codecs every “frame” (80 samples, 10 ms of speech). GNU Octave
- Banned exports list includes “Speech codecs below 2400 b/s”. Have been advised by DECO that Codec 2 has been “assessed as not controlled” but waiting for certificate
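The bit-rate figures quoted in these notes can be checked with a little arithmetic. This is only a sketch using the numbers from the talk notes above; the function and variable names are made up for illustration and have nothing to do with the real Codec 2 source.

```python
# Bit-rate arithmetic using the figures from the notes above.
# Names here are illustrative only, not taken from Codec 2 itself.

def bit_rate(bits_per_frame: int, frame_ms: int) -> float:
    """Bits per second for a fixed number of bits sent every frame_ms milliseconds."""
    return bits_per_frame * 1000 / frame_ms

# Pitch alone: 7 bits updated every 20 ms.
pitch_rate = bit_rate(7, 20)

# Bit allocation per 40 ms frame, as listed in the notes.
allocation = {"amplitude": 32, "frame_energy": 10, "voicing": 4, "pitch": 10}
frame_bits = sum(allocation.values())
codec_rate = bit_rate(frame_bits, 40)

# Uncompressed input: 16-bit samples at 8 kHz.
raw_rate = 16 * 8000

# Whole calls that fit inside a 64 kb/s channel at the lowest rate.
calls_per_channel = int(64000 // codec_rate)

print(f"pitch: {pitch_rate:.0f} b/s")
print(f"codec: {frame_bits} bits / 40 ms = {codec_rate:.0f} b/s")
print(f"compression vs raw PCM: {raw_rate / codec_rate:.1f}x")
print(f"calls per 64 kb/s channel: {calls_per_channel}")
```

The same function applied to the 8 kb/s IP/UDP overhead figure shows why the codec is a poor fit for VoIP: the headers would be several times larger than the speech payload.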
UEFI and Linux – Matthew Garrett
- Replacement for PC BIOS
- BSD licensed core
- Adds standardized support for new hardware features
- Platform init
- EFI image load – loaded drivers
- EFI OS loader load – boot from ordered list of EFI OS loaders
- Boot services terminate -> OS handover
- Boot services – memory allocation, timers, image loading, GUIDs.
- Runtime services – non-volatile variable store, boot data, system information, crash dumps (already in Linux 3.2)
- Able to update firmware by reset and grab new firmware out of variables on bootup
- GPT – GUID partition table – no practical restrictions on size and number – more metadata about partition type and service
- That all sounds good …. but ….
- TianoCore – Intel’s open-source UEFI reference implementation, 7061 files, >100MB of code, 10% of size of Linux kernel. Bigger than Linux core kernel
- Large codebase, some bugs
- UEFI is poorly tested in the real world. UEFI contains a lot of code. UEFI contains a lot of bugs
- Some problems with secure boot 🙂
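The "boot from ordered list of EFI OS loaders" step can be pictured as walking a list of boot entry numbers and taking the first one that can actually be loaded. The sketch below is a toy model only: UEFI does keep a BootOrder variable pointing at numbered boot entries, but the function name, the dictionary layout and the loader paths here are assumptions made for the example, not firmware code.

```python
from typing import Callable, Dict, List, Optional

def pick_loader(
    boot_order: List[int],
    entries: Dict[int, str],
    can_load: Callable[[str], bool],
) -> Optional[str]:
    """Return the first loader path in boot_order that exists and loads."""
    for number in boot_order:
        path = entries.get(number)
        # Skip entries that are missing or whose image cannot be loaded.
        if path is not None and can_load(path):
            return path
    return None

# Hypothetical boot entries and a hypothetical set of files actually on disk.
entries = {0: r"\EFI\ubuntu\grubx64.efi", 1: r"\EFI\BOOT\BOOTX64.EFI"}
boot_order = [2, 0, 1]          # entry 2 does not exist at all
present = {r"\EFI\BOOT\BOOTX64.EFI"}

chosen = pick_loader(boot_order, entries, lambda p: p in present)
print(chosen)
```

Here entry 2 is skipped because it is undefined, entry 0 because its file is absent, and entry 1 is the one that gets booted.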
LCA2012 – Friday Morning
Bloat: How and Why UNIX Grew Up (and Out) – Matt Evans and Rusty Russell
- Cool projects: spark, plover, Homebrew Cray-1A
- Compare PDP-11 Unix vs Modern Ubuntu 11.10
- Binary sizes: cat 152 bytes vs 531 KB
- grep command: 2176 bytes vs 687 KB
- ls command: 4904 bytes vs 628 KB
- V6 cat command just 12 lines of assembler, 2 * 512bytes buffers, a.out 16 bytes overhead
- Binaries 30% larger because we chose speed over size. ~9% speed gain
- V6 Runtime coverage: cat 99% , grep 78%, ls 85%
- V7 has reduced coverage. some commands converted from assembler to C
- x86 runtime with dietlibc coverage: cat 11% , grep 23% , ls 39%
- x86 static cat has 700k of libc dependencies, 17% of libc, 313 objects it depends on
- libc 1.7M but widely shared among hundreds of processes
- dynamic ls accesses 90k of libc but 476kB paged in.
- For sample system. libwebkit 8.5MB , 5MB wasted, 33MB wasted real RAM
- What about a 64-bit version of a PDP-11 – a “PDP-44”? Various assumptions on how binary size would increase
- PDP-44 – binaries around 50% larger
- 32 bit ubuntu binaries are 9% smaller than 64 bit ubuntu
- Forward port V6 binaries to x86 . V6 cat almost same size as dietlibc version
- More work to forward port V6 ls, lots of assumptions no longer true. Code tricks no longer work. 20% larger because of ELF and mmap. 120% penalty due to modern infrastructure (e.g. malloc, realloc)
- Backport x86 “ls” and “cat” to V6. Only backport some options
- cat: remove old options and error reporting. Kept some features.
- ls: remove lots of options.
- Binaries 60% larger due to flexibility
- 440% bloat due to new features
- Asmutils – reimplementation of current Linux utils in x86 assembler.
- The talk is online, hard to do notes since it jumped around a lot and graphs hard to read
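To get a feel for the size comparison at the top of these notes, the growth factors can be worked out directly. This is a rough sketch only: it reads the garbled cat figure as 531 KB and assumes 1 KB = 1024 bytes, neither of which the notes confirm.

```python
# Binary sizes from the notes: V6 size in bytes, modern size in KB.
# Assumptions: cat is read as 531 KB, and 1 KB is taken as 1024 bytes.
KB = 1024

sizes = {
    "cat": (152, 531 * KB),
    "grep": (2176, 687 * KB),
    "ls": (4904, 628 * KB),
}

# Growth factor = modern size divided by V6 size.
growth = {name: modern / v6 for name, (v6, modern) in sizes.items()}

for name, factor in sorted(growth.items(), key=lambda item: -item[1]):
    print(f"{name}: about {factor:.0f}x larger")
```

The smallest original binary shows by far the largest relative growth, since the modern sizes are all of the same order.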
Open vSwitch – Simon Horman
- Switch contains ports, each port has one or more interfaces, packets are forwarded by flow
- Flows may be identified by lots of combos, address, vlan, ports, TOS
- 1st packet in flow gets sent to userspace controller, controller makes decision, tells datapath what to do with future packets, resends first packet back to datapath. Later packets the datapath knows what to do (from hashtable lookup) and handles itself
- Configured by JSON database, persists across restarts
- database controlled via Unix socket or via TCP. Change action won’t return until database update performed
- cute “--may-exist” option when creating stuff, which does nothing if what you are requesting already exists
- He did some demos of the standard sort of stuff: trunk interfaces, port mirroring. Fairly simple commands to do
- Does VXLAN and GRE tunnels
- Oracle looking to put in Oracle Linux soon to replace current bridging code
- Can do millions of packets per second. Some bottlenecks in tunneling code
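The first-packet behaviour described above (miss in the datapath, ask the controller, cache the answer, handle later packets from a hash table) can be sketched as a toy model. This is not Open vSwitch code: the class names, the flow key fields chosen and the action strings are all invented for the example.

```python
from typing import Callable, Dict, NamedTuple

class FlowKey(NamedTuple):
    """A few of the fields a flow may be identified by (illustrative subset)."""
    src: str
    dst: str
    vlan: int
    tos: int

class Datapath:
    """Toy datapath: a hash table of flows plus a slow-path controller."""

    def __init__(self, controller: Callable[[FlowKey], str]) -> None:
        self.controller = controller
        self.flows: Dict[FlowKey, str] = {}
        self.upcalls = 0  # how many packets had to go to the controller

    def receive(self, key: FlowKey) -> str:
        action = self.flows.get(key)
        if action is None:
            # First packet of a flow: ask the controller, then cache the decision.
            self.upcalls += 1
            action = self.controller(key)
            self.flows[key] = action
        return action

def simple_controller(key: FlowKey) -> str:
    # Hypothetical policy: drop one VLAN, forward everything else to port 1.
    return "drop" if key.vlan == 99 else "output:1"

dp = Datapath(simple_controller)
key = FlowKey("10.0.0.1", "10.0.0.2", vlan=10, tos=0)
results = [dp.receive(key) for _ in range(3)]
print(results, "upcalls:", dp.upcalls)
```

Three packets of the same flow cost only one trip to the controller; the other two are answered from the table.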
LCA2012 – Thursday last session
Challenges for the Linux plumbing community – Jonathan Corbet
- Good news is boring, so how about some “high quality problems”
- Security
- Stuxnet , kernel.org , RSA hack , DigiNotar
- The scary part is that there must be others we haven’t heard about
- The bad guys are: motivated, capable, well funded. Not just script kiddies
- Not just about money anymore; with governments hacking, lives are at stake
- We are on the front line. Not just security software, all code security critical
- Is your code secure? Who reviews it? What sort of testing? Plans for dealing with vulnerabilities?
- Is your infrastructure secure? – Who has access, who can change files? Are security updates applied? What are your plans in case of a breach?
- Are your processes secure? Who can commit? Who can sign releases? Can you detect tampering? What do they know about the code’s provenance?
- Tools
- Lockdep, valgrind, fault injection, sparse, smatch
- GCC python plugin, MELT, LLVM static analyzer
- and need to actually use the tools that exist
- Hardware
- hardware complexity leads to software complexity
- Complex interfaces: example V4L2 media controller interface.
- Control over our hardware
- Life is okay (could be better, could be worse)
- What is our influence over manufacturer?
- Example: Chasing tablet manufacturers, no influence on design, have to port after device launched
- Example: By the time Rockbox runs on a device, the device is obsolete and not in shops
- How can we be more involved in conception and design of hardware in the first place?
- Linux Only
- Once upon a time we depended heavily on portability
- The DRM tree deemphasized BSD support. This hurt BSD but… would we rather do without kernel mode setting?
- Might be inevitable but try not to be too arrogant
- The platform problem
- Code you control vs Black box
- The kernel’s ARM subtree (re-implements stuff from elsewhere in kernel)
- XFree86 (tried to keep everything in user space)
- Opportunistic suspend (Android decided “too hard” to fix rest of kernel)
- Async I/O (implemented multiple times, no comprehensive implementation)
- Example: wireless drivers each had their own 802.11 implementation, replaced with mac80211
- Example: PowerTOP was used to find a wide range of random things causing high power usage in laptops
- Ongoing examples: Bufferbloat, marvell-cam drivers, User-Space TCP, Control groups, Android
What is in a tiny Linux installation? by Malcolm Tredinnick
- Skipping bootloader portion
- Kernel is big – 9.6M lines of C, 250k lines of assembler
- Booting the kernel
- “make allnoconfig” , smallish, 222 “y” ‘s. 842KB bzimage, build time under 15s, no file systems, no fancy hardware, ISA, no PCI
- “make allyesconfig” , 5177 y options. 39MB bzimage, over 1h to build, includes drivers/staging
- booting allnoconfig via qemu-kvm . Gets to “unable to mount root file system”
- Kernel components – hardware arch, drivers, subsystems, others
- need root filesystem in memory, initrd / initramfs. init process just in cpio archive, can just be hello world
- need initrd, initramfs , RAM disk block device, ELF binary support. 889KB bzimage (up 50kb)
- Now boots, use “rdinit=/hello” option in qemu , just prints out hello world
- Transition to userspace
- initrd loads some modules etc, runs pivot_root , run startup scripts
- Userspace
- Why are you doing this? Single purpose system, USB stick (rescue, Puppy Linux, Damn Small Linux), tiny memory, tiny storage usage, fast power on. Trade-off of options
- We have to run something, need some binaries, shared libraries, large binaries with multiple purposes (busybox)
- Busybox – one binary – acts differently depending on calling name, installed as symlinks
- Busybox: fairly small, default utilities, 2MB without networking, easy to test
- C libraries – glibc (probably not a good idea), eglibc (easier to build, binary compatible with glibc, can take things out), uClibc (alternative, very small, some overlap with busybox, source code compatible with glibc)
- Device and process management will need: procfs, sysfs, tmpfs, udev, cgroups
- Build environments: you are cross compiling (build root), binutils, C libraries & cross compiling, test. Cross compiling x86 on x86 is harder
- See links in slides for some help
- Emdebian is something to look at
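The BusyBox trick noted above, one binary that behaves differently depending on the name it was called by, comes down to dispatching on the program name in argv[0]. The sketch below illustrates the idea in Python; the applet functions and the table are made up for the example and are not how BusyBox itself is written.

```python
import os
from typing import Callable, Dict, List

def applet_echo(args: List[str]) -> str:
    """Join the arguments with spaces, like a minimal echo."""
    return " ".join(args)

def applet_basename(args: List[str]) -> str:
    """Strip the directory part from the first argument."""
    return os.path.basename(args[0])

# Table of applet names to implementations (hypothetical subset).
APPLETS: Dict[str, Callable[[List[str]], str]] = {
    "echo": applet_echo,
    "basename": applet_basename,
}

def dispatch(argv: List[str]) -> str:
    """Pick the applet from the name the program was invoked as."""
    name = os.path.basename(argv[0])
    applet = APPLETS.get(name)
    if applet is None:
        raise SystemExit(f"{name}: applet not found")
    return applet(argv[1:])

# With symlinks /bin/echo and /bin/basename pointing at the same file,
# argv[0] differs per invocation and selects the behaviour.
print(dispatch(["/bin/echo", "hello", "world"]))
print(dispatch(["basename", "/usr/bin/ls"]))
```

In a real multi-call binary the list passed to dispatch would be the process's own argument vector, and each symlink in the filesystem would simply point back at the single installed file.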
LCA2012 – Thursday after lunch
Women in Open Technology & Culture – Valerie Aurora and Mary Gardiner
- Very umbrella term including fan fiction, open data, wikipedia, open access
- Why – important areas – women’s participation (especially in charge) very low
- Important for women to be in charge, creating, designing, building, not just as users
- 5 kinds of groups – project specific (debian women), feminist activism, teaching technical skill, networking, majority women projects.
- Community / project specific
- Linuxchix, owoot, pyladies, wikichix, etc ( linuxchix spawned several)
- low participation, poor replacement rate of leaders (often after they get FT jobs), low communication between, sometimes tension between.
- Feminist advocacy
- geek feminism, ada initiative, mind the gap
- growing and active – the new hotness, sharing best practices, paid work more common, some conferences
- Teaching women technical skills
- usually one day or evening courses.
- Growing hugely, vary widely in topics and skills, sharing best practices
- In person networking socialization
- Women in code, girl geek coffees, girl geek dinner
- try not to be dominated by marketing women ( use of “geek” term helps)
- Growing, easy to start local chapters
- Majority Women Groups
- Dreamwidth, Organization for Transformative Works
- Often fan-fiction support, protect against takedown, let author control commercialisation
- Survey
- In person vs Online
- Activist vs non-activist
- Community vs technical
- Focussed vs broad topics
- Projects with broad focus within a narrow group seem not to work
- Projects with very technical focus but across different technologies seem not to work either (lack common language)
- Why Start – recruit and retain, networking, role models, safe space, feel normal
- Lessons on starting
- Don’t – join an existing one
- If you are a man, don’t do it on women’s behalf – “Nothing about us without us”
- Don’t expect women to start a group
- Find 3 or more women to start a group
- Don’t use girl/chix/ladies – use women
- Go broad instead of narrow on topic
- have clear defined goals and scope
- Start small, be realistic about work
- Consider one-off event rather than group
- Avoid NIH , reuse best practices
- be prepared to moderate any public forums you create
- Failure modes
- Become “the nice place” that everybody goes to
- Loses focus on women
- Safe Space moderation too many hours
- Ran low on time, slides will be online
Hacking Everything – Matt Evans
- Reuse things, not just hacking things like Arduino that are supposed to be hacked
- reuse, need, art & design
- Gambiarra – Brazilian art of an improvised fix
- In the 1940s radio and TV owners could fix their gear. Today people are more passive
- wants people to tinker with things.
- Save resources
- Save money
- take apart things, learn by example
- Low cost manufacturing makes hacking hard ( solid state everything )
- Cheap development makes hacking easier ( reuses common technology, extra bits on devices unused )
- Some products are open hardware designs
- Things to look for – similar to ref design, debug code left in, unused features, factory test points/ports
- Ports that are wired up but unused often serial ports
- “My CD player has a serial port” , common on many devices
- Acquire a “logic level” USB-serial cable
- Other ports – JTAG, In-System Programming
- Example: Picture frame, derived from sample board for camera, serial interface, built in CLI
- Old Wifi, ADSL boxes good with OpenWRT
- Don’t just consume – re-consume
- Teach others and tell the world
- Collaborate at a local hackerspace
- support companies that make things hackable