How to run Kubernetes on your spare hardware at home, and save the world Angus Lees
- Mainframe ->
- PC ->
- Rackmount PC
- But the rackmount PC, even with built-in redundancy, will still fail. Or the location will go offline, or your data spreads across multiple machines
- Since you need distribution/redundancy anyway: new model (2005). Grid computing. Clever software, dumb hardware. Loosely coupled servers
- Libraries -> RPC / Microservices
- Threadpool -> hadoop
- SQL -> key/value store
- NFS -> Object store
- In-place upgrades -> “Immutable” image-based build from scratch
- Computers in clouds
- No cases. No redundant Power, journaling on filesystems turned off, etc
- Everything is in clouds – Secondary effects
- Corporate driven
- Apache license over GPL
- Centralised services rather than federated protocols
- Profit-driven rather than scratching itches
- Summary
- Problem
- Distributed Systems hard to configure
- Solutions scale down poorly
- Most homes don’t have racks of servers
- Implication
- Home Free Software “stuck” at single-machine architecture
- Problem
- Kubernetes (lots of stuff, but I use it already so just doing unique bits)
- “Unix Process as a service”
- Inverts the stack. Data is the most important, then the app. Kernel and hardware are unimportant.
- Easy upgrades, everything is an upgrade
- Declarative API, command line interface
- “We’ve conducted this experiment for decades now, and I have news for you, Hardware fails”
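The "declarative API" and "everything is an upgrade" points can be illustrated with a toy reconcile step: you state what should exist, and software works out the actions needed to get there. This is only a sketch of the general idea, not Kubernetes code. The function name, the workload names and the image tags are all made up for illustration.

```python
def reconcile(desired, observed):
    """Return the actions that move the observed state to the desired state.

    Both arguments map a workload name to its spec (a plain dict).
    This is a toy model of a declarative control loop, not a real API.
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in observed:
            actions.append(("create", name))
        elif observed[name] != spec:
            # An "upgrade" is just replacing the old spec with the new one.
            actions.append(("replace", name))
    for name in sorted(observed):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Hypothetical workloads, invented for this example.
desired = {
    "web": {"image": "nginx:1.25", "replicas": 2},
    "print": {"image": "cups:2", "replicas": 1},
}
observed = {
    "web": {"image": "nginx:1.24", "replicas": 2},
    "old": {"image": "legacy:1", "replicas": 1},
}

for action in reconcile(desired, observed):
    print(action)
```

Running the same step again once the observed state matches the desired state yields no actions, which is what makes repeated application safe.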
Hardware at Home
- RAID used to be “enterprise”, now normal for home
- Elastic compute for home too
- Kubernetes for Home
- Budget $100
- ARM master nodes
- Mixed architecture
- Assume single layer-2 home ethernet
- Worker nodes – old $500 laptops
- x86-64
- CoreOS
- Broken screens, dead batteries
- 3 * $30 Banana Pis
- Raspberry Pi2
- armv7a
- containOS
- PersistentVolumes
- NFS mount from RAID server
- Service – keepalived-vip
- Ingress
- keepalived and nginx-ingress, Let's Encrypt
- Wildcard DNS
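As a rough illustration of the "PersistentVolumes: NFS mount from RAID server" item above, the sketch below builds a PersistentVolume manifest as a Python dict and prints it as JSON. The volume name, server address, export path and size are placeholders invented for this example, and the talk's actual manifests (see the linked repository) may look quite different.

```python
import json


def nfs_persistent_volume(name, server, path, size="10Gi"):
    """Build a Kubernetes PersistentVolume manifest backed by an NFS export."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            # NFS exports can be mounted read-write by many nodes at once.
            "accessModes": ["ReadWriteMany"],
            "persistentVolumeReclaimPolicy": "Retain",
            "nfs": {"server": server, "path": path},
        },
    }


# Placeholder values: substitute the RAID server's address and export path.
manifest = nfs_persistent_volume("home-data", "192.168.0.10", "/export/k8s")
print(json.dumps(manifest, indent=2))
```

Since `kubectl apply -f -` accepts JSON on standard input, the printed manifest could be piped straight into it.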
- Status
- Works!
- Printing works
- Install: PXE boot and run coreos-install
- Status – ungood
- Banana Pis a bit too slow.
- github.com/anguslees/k8s-home
- Budget $100
Is the 370 the worst bus route in Sydney? Katie Bell
- The 370 bus
- Goes between UNSW and Sydney University, around the city
- If a bus runs every 15 minutes, you should not be able to see 3 at once
- Newspaper articles and Facebook group about how bad it is.
- Two Questions
- Is bus privatisation better or worse?
- Is the 370 really the worst?
- Data provided
- Lots of stuff but nothing on reliability
- But they do have realtime data, e.g. for the Tripetime app (done via a 3rd party)
- They have an API (with a key) that uses the standard GTFS format
- But they only publish “realtime” data, not the old data
- So collected the realtime data, once a minute for 4 months
- 557 GB
- Format
- zipfile of csv files
- IDs sometimes ephemeral
- Had to match timetable data and realtime data
- Data had to be tidied up – lots
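For the "zipfile of csv files" format, a minimal sketch of reading one table out of a static GTFS archive using only the standard library might look like the following. The sample feed is built in memory and its rows are invented. `trips.txt` with `route_id`, `service_id` and `trip_id` columns follows the GTFS schedule convention, but this is not the speaker's actual loader.

```python
import csv
import io
import zipfile


def read_gtfs_table(zip_bytes, member):
    """Return the rows of one CSV file inside a GTFS zip as a list of dicts."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(member) as raw:
            # utf-8-sig drops a byte-order mark if the feed has one.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))


def make_sample_feed():
    """Build a tiny in-memory GTFS-like zip (invented rows, for demo only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "trips.txt",
            "route_id,service_id,trip_id\n"
            "370,weekday,trip-1\n"
            "370,weekday,trip-2\n",
        )
    return buffer.getvalue()


trips = read_gtfs_table(make_sample_feed(), "trips.txt")
for trip in trips:
    print(trip["route_id"], trip["trip_id"])
```

All values come back as strings, so any tidying (types, ephemeral IDs, matching against realtime updates) would happen in a later step.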
- Processing realtime data
- Download once a minute
- Parse
- Match each of ~7000 trips in timetable (across all of NSW)
- Write ~20000 realtime updates to the DB
- Running 5 EC2 instances at peak
- Writing up to 40MB/s to the DB
- Is the 370 the worst?
- Define “worst”
- Found NSW definition of what an on-time bus is.
- No more than 5:59 late or 1:59 early. Measured at start/middle/end
- Victoria's definition is stricter
- She defined:
- Early: more than 2 min early
- On time: 2 min early – 5 min late
- Late: more than 5 min late
- Very late: more than 20 min late
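The four categories above translate into a small classification function. This is a sketch that assumes delay is measured in seconds (negative meaning early) and that the boundary values themselves (exactly 2 min early, exactly 5 min late) count as on time. The notes do not say how the boundaries were actually handled.

```python
def classify_delay(delay_seconds):
    """Classify a delay in seconds; negative values mean the bus was early.

    Boundary handling is an assumption, not taken from the talk.
    """
    if delay_seconds < -120:
        return "early"
    if delay_seconds <= 300:
        return "on time"
    if delay_seconds <= 1200:
        return "late"
    return "very late"


for sample in (-180, 0, 400, 1500):
    print(sample, classify_delay(sample))
```

Here "very late" is treated as its own bucket rather than a subset of "late", which keeps the four categories mutually exclusive.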
- Across all trips
- 3.7 million trips
- On time 31%
- More than 20m late 2.86%
- Best routes
- Night-time buses
- Outside of Sydney
- Shorter routes
- 86% – 97% or better
- Worst
- Less than 5% on time
- Longer routes
- 370 is the 22nd worst
- 8.79% on time
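A ranking like "370 is the 22nd worst" comes from computing an on-time percentage per route and sorting. A hedged sketch of that aggregation is below. The input shape (pairs of route and delay in seconds), the sample numbers and the route name "X1" are invented, and the on-time window reuses the 2 min early / 5 min late definition from earlier in the talk.

```python
from collections import defaultdict


def on_time_share_by_route(trips):
    """Return (route, percent on time) pairs, worst route first.

    trips is an iterable of (route, delay_seconds) pairs.
    """
    totals = defaultdict(int)
    on_time = defaultdict(int)
    for route, delay in trips:
        totals[route] += 1
        if -120 <= delay <= 300:
            on_time[route] += 1
    shares = {route: 100.0 * on_time[route] / totals[route] for route in totals}
    return sorted(shares.items(), key=lambda item: item[1])


# Made-up observations, purely for illustration.
sample_trips = [
    ("370", 900), ("370", 60), ("370", 1500), ("370", 400),
    ("X1", 0), ("X1", 30),
]
ranking = on_time_share_by_route(sample_trips)
for route, share in ranking:
    print(route, share)
```

The same shape of aggregation, with a different predicate (delay above 20 minutes), would give the "percent very late" ranking.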
- Worst routes ( percent > 20 min late)
- 23% of 370 trips (6th worst)
- Lots of Wollongong
- Worst agencies
- No obvious difference between agencies and private companies
- Conclusion
- Privatisation could go either way
- 370 is close to the worst (277 could be worse) in Sydney
- bus-shaming.com
- github.com/katharosada/bus-shaming
Questions
- Used Spot instances to keep cost down
- $200 month on AWS
- Buses better/worse according to time? Not checked yet
- Wanted to calculate the “wait time”, not done yet.
- Another feed of bus locations and some other data out there too.
- Lots of other questions