Simon Lyall's Blog – Page 27 – New Zealand, Sysadmin, Linux, Curry, Transport

Linux.conf.au 2016 – Wednesday – Session 2

Welcoming Everyone: Five Years of Inclusion and Outreach Programmes at PyCon Australia by Christopher Neugebauer

How to bring more people to community run events
Talk is not about diversity in tech
Talk is about “Outreach and Inclusion in Events”
Outreach = getting them in, Inclusion = making them feel welcome
About funding programmes for events
FOSS happens over the Internet; face-to-face is less common than in other areas/communities
Events are where you can see the community
BUT: Going to a conference costs money – travel, rego, parking, leave from job
Events have equality of access problem
Inequity of access is a problem with diversity
Solution: Run outreach programmes
Money can reduce the barriers, just spending money can help solve the problem
Pycon Australia has had outreach for last 5 years
FOSS vs other outreach programmes
- Events have easy goals, define ppl/numbers to target, exact things to spend on, time period defined
- Similar every year, similar result each year
- Long-term results are ill-defined
- Engagement is hard to track
Pycon Australia
- Fairly independent of Python software foundation
- Biggest Pycon within 9 hours of flying
- Pycon US – 2500 attendees, $200k on financial assistance
- Pycon Aus 2015 – 450 attendees , 5-8% of budget on funding
2011
- Harassment and Codes of Conduct were a big thing
- Gender diversity policy, code of conduct, 20% of speakers were women, first gender diversity grants
- 2 grants: 1 ticket only and 1 ticket + $500, funded out of general conf budget
- 7 strong applicants at time when numbers were looking low (later picked up)
- Sponsor found and funded all 7 applicants
2012
- 1st of 2 years running conf in Hobart
- Moving from Sydney is hard. Australia big and people have to fly between cities (especially to Hobart)
- Hobart long way away for many people and small number of locals
- Sponsor increased funding to $700, funded 10 people for $500 + ticket
- Previous grant recipient from 2011 was speaking in 2012
2013
- Finding more speakers from more places
- Outreach and Speaker support run out of the same budget, cap removed on grants so International travel possible.
- Anyone could apply; the purely gender-based limit was removed, so other people who needed funding could apply. Eg students, teachers, geographic minorities
- $12,500 allocated
- As more signups and more money came in more could go to the assistance budget
- If you remove gender targeting then what happens to diversity?
- Got groups like GeekGirlDinners to target people that needed grants rather than directly chasing people to apply.
- Over half aid budget going to women
- Teachers are good force multipliers
2014
- Lost previous diversity Sponsor
- Previously $5k from Sponsor + $7k from general fund.
- Pycon US – Everybody pays to attend ( See Essay by Jesse Noller – Everybody Pays )
- Most speakers have FOSS-friendly employers or can claim money
- Argument: Some confs make everybody pay no matter their ability.
- Told speakers that by default they would be charged, but the charge would be waived if they just asked. Also said where the money was going and prioritised speakers for assistance. Also all organisers paid
- Extra money from about $7000
- Simplified structure of grants, less paperwork, just gave people a budget. Worked well since many people went with good deals.
- Caters better for diverse needs
- Also had Education Miniconf, covered under teacher training budget. Offered to underwrite costs of substitute teachers for schools since that is not covered by normal school professional-dev budget
Results
- Every time at least one funding recipient has spoken at next conference
- Many fundees come back when get professional jobs
- Evangelize to their friends
Discovery
- expanding fund gets people you might not expect
- Diverse people have diverse needs
- Avoid making people do paperwork, just give them money
- Sponsors can make boot-strapping starting a programme easier
- Don’t expect 100% success
- Budget liberally, disburse conservatively
- Watch out for immigration scams
- Decline requests compassionately
Questions
- Weekend hard for Childcare – Not heavily targeted
- Targeting Speakers for funding rather than giving all of them means it gets to go a lot further. Better Bang for buck

Sentrifarm – open hardware telemetry system for Australian farming conditions by Andrew McDonnell

Great time to be a maker, everybody is able to make something
Neighbour had problem with having to measure grass fire danger in each paddock before going out with machinery during summer
Needs Wind Speed, temperature, humidity
Sentrifarm
- Low power, solar
- distributed
- Works in area with slow internet, sim card expense adds up however
- Easy to use for farmer, access via their farm.
- Data should not be owned by cloud provider
Hackaday Prize
- Build “something that matters”
- Prizes just for participating
- Document progress, produce a video
Our Goals
- Cheap and Cheerful
- Aussie “bush mechanic” ethos
- Enjoy the adventure
Used stuff from 24+ other opensource projects
Prototyping
- Tried out various micro-controllers and other equipment
- Most you could buy for only a few dollars
- Tools – Bus Pirate
Radio links
- ISM-band radio module “LoRa” technology
- SPI interface, well documented SX1276
- $20 for the module
- Proprietary radio protocol, long range, low power, but open interface on top of it
Eagle used (alt is KiCAD) to design circuit
- Build own shields to plug sensors and various controllers into
platformio.org – run one command, creates an Arduino project and builds with one command for multiple micro-controllers
MQTT-SN – communications protocol for low-bw links.
Breakdown of his stack, see his slides for details
Backend Software
- Ubuntu
- Docker
- Carbon + Whisper + Graphite, Grafana
- “Great time to be a hacker, using who knows how many lines of code and only had to write 7 to get it to work together”
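Carbon, the ingest daemon in the Graphite stack listed above, accepts data points as plain text lines of the form "path value timestamp". A minimal sketch of how a sensor reading could be formatted and sent follows; the metric names, the host and the use of port 2003 (Carbon's usual plaintext port) are assumptions for illustration and are not taken from the talk.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one data point in Carbon's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_to_carbon(lines, host="localhost", port=2003):
    """Send already-formatted lines to a Carbon plaintext listener."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


# Hypothetical metric names for a paddock weather station.
reading = [
    carbon_line("farm.paddock1.temperature_c", 31.5, 1454457600),
    carbon_line("farm.paddock1.humidity_pct", 18, 1454457600),
]
```

Calling `send_to_carbon(reading)` would push both points to a local Carbon instance, if one is listening.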
Grafana hard to setup but found a nice docker container
Data kept separately from the container
Goal to get power down
Used 3D-printer to create some parts for mounting bits.
- OpenSCAD – Language to design the parts
Range of LoRa of 5km un-elevated, 9km up a tower with a simple home-built antenna
Won a top-100 prize at Hackaday of a t-shirt
You can do it
- Document
- Focus on one part at a time
- Have fun
- http://hackaday.io/project/4758
Questions
- Asked how it survives weather? – Not a lot of experience yet, some options
- How likely are others to use it? – Maybe but main goal was building it

Linux.conf.au 2016 – Wednesday – Session 1

Going Faster: Continuous Delivery for Firefox by Laura Thomson

Works for Cloud services web operations team
Web Dev and Continuous delivery lover
“Continuous delivery is for webapps” – Maybe not just Webapps? Maybe Firefox too
But Firefox is complicated
Process very complicated – “down from 5 source control systems to 3”
But plenty of web apps are very complicated (eg Netflix)
How do we continuous deliver Firefox
How it works currently
- Release every 6 weeks
- 4 channels – Nightly -> Aurora -> Beta -> release
- Mercurial Repo for each channel
Release Models
- Critical Mass – When enough is done and it is stable
- Single Hard deadline – eg for games being mass released
- Train Model – fixed intervals
- Continuous Delivery
Deployment Maturity Model
Updates
- New Build -> Generate a diff -> FF calls back -> downloads and updates
- Hotfixes
- Addons automatically updated
Currently pipeline around 12 hours long, lots of tests and gatekeeping
“Go Faster”
- System add-ons
- Test Pilot
- Data Separate from code
- Downloadable content
- Features delivered as web apps
System addons
- Part of core FF, modularized into an add-on
- Build/test against existing FF build, a lot smaller test
- Updated up to daily(for now) on any release channel
- signed and trusted
- Restartless updates
  - install or update without a browser restart
  - Restarts suck
  - Restartless coming soon for system add-ons
- Good for rapid iteration, particularly on the front-end
- Wrappers for services
- Replacing hotfixes
Problems with add-ons
- Localisation
- Optimizing UX : Better browser faster vs update fatigue
- Upfront telemetry requirements
- Dependency mngt on firefox
- Dependency management between system add-ons (coming soon)
Add-ons in flight
- Firefox hello is already an add-on
- Currently in beta in 45
- First beta updates before 46
Test Pilot
- Release channel users opt in to new features
- Release channel users different from pre-release ones
- Developed as regular add-ons (not system add-ons)
- Can graduate to system add-ons by flipping a bit
Data should be separate from code
- Sec policy
- blocklists
- tracking protection list
- dictionaries
- fonts
Many times Data update == release , this is broken
Also some have their own updaters
Kinto
- Lightweight JSON storage with sync, sharing, signing
- Native JSON over HTTP
- niceties of CouchDB backed by PostgreSQL
How Kinto Works
- pings for updates
- balrog supplies link to kinto
- signed data downloaded, checked, applied
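The "downloaded, checked, applied" step can be sketched as below. This is only an illustration of the idea: it uses an HMAC from the Python standard library as a stand-in for whatever signature scheme Kinto really uses (the notes do not say), and the key, payload and function names are all hypothetical.

```python
import hashlib
import hmac
import json


def sign(payload: bytes, key: bytes) -> str:
    """Stand-in signature: hex HMAC-SHA256 of the payload."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def apply_update(current: dict, payload: bytes, signature: str, key: bytes) -> dict:
    """Return the new data if the signature checks out, else keep the old data."""
    expected = sign(payload, key)
    if not hmac.compare_digest(expected, signature):
        return current  # reject tampered or corrupt downloads
    return json.loads(payload)


key = b"demo-key"  # hypothetical shared secret, for illustration only
blocklist = {"blocked": []}
payload = json.dumps({"blocked": ["bad-addon@example.com"]}).encode()

good = apply_update(blocklist, payload, sign(payload, key), key)
bad = apply_update(blocklist, payload, "0" * 64, key)
```

The point of the design is that data which fails the check is never applied, so the client keeps running on its last known-good copy.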
Kinto good for
- Add-ons block list
Downloadable Content
- Some parts of the browser may not be needed frequently
- May not be needed on startup
- eg languages packs, fonts for Firefox on Android
Features delivered remotely
- Browser features delivered as web apps
- Pull in content from the server
- in an early stage
Futures
- Easy for projects to implement
- Better “knobs and dials” (canaries, A/B, data viz)
- Push-based updates
- Simpler localisation
Questions
- They support rollbacks
- Worst case: Firefox has a startup crash
- Not sure how Iceweasel would fit in.
- How will this affect the ESR channel? – Won’t change, they will stay security-only
- Bad add-ons – Hate ones that report user data, crashers (eg Skype toolbar at one point), and ones that hijack your browser and change settings
- There is much collaboration between [open source] browsers
- You are avoiding the release cycle, planning to speed it up – Lots of tests that can’t get rid of all, working on it but not a simple thing to solve.

Linux.conf.au 2016 – Sysadmin Miniconf – Session 3

The life of a Sysadmin in a research environment – Eric Burgueno

Everything must be reproducible
Keeping systems up as long as possible, rather than having an overall uptime percentage
One person needs to cover lots of roles rather than specialise
2 Servers with 2TB of RAM. Others smaller according to need
Lots of varied tools mostly bioinformatics software
90TB to over 200TB of data over 2 years. Lots of large files. Big files, big servers.
Big job using 2TB of RAM taking 8 days to run.
The 2*2TB servers can be joined together to create a single 4TB server
Have to customize environment for each tool, hard when there are lots of tools and also want to compare/collaborate against other places where software is being run.
Reproducible(?) Research

Creating bespoke logging systems and dashboards with Grafana, in fifteen minutes – Andrew McDonnell

Live Demo

Order in the chaos: or lessons learnt on planning in operations – Peter Hall

Lead of an Ops team at REA group. Looks after dev teams for 10-15 applications
Ops is not a project, but works with many projects
Many sources of work, dev, security, incidents, infrastructure improvement
Understand the work
- Document your work
- Talk about it, 15min standup
Schedule things
- and prepare for the unplanned
- Perhaps 2 weeks
- Leave lots of slack
Interruptions
- Assign team members to each ops teams
- Rotating “ops goal keeper”
- Developers on pager
Review Often
Longer term goals for your team
Failure demand vs value demand.
- Make sure [at least some of] what you are doing is adding value to the environment

From Commit to Cloud – Daniel Hall

Deployments should be:
- fast – 10 minutes
- small – only one feature change and person doing should be aware of all of what is changing
- easy – little human work as possible, simple to understand
We believe this because
- less to break
- devs should focus on dev
- each project should be really easy to learn, devs can switch between projects easily
- Don’t want anyone to be afraid to deploy
Able to rollback
- 30 microservices
- 2 devs plus some work from others
How to do it
- Microservices arch (optional but helps)
- git , build agent, packaging format with dependencies
- something to run your stuff
code -> git -> built -> auto test -> package -> staging -> test -> deploy to prod
Application build is triggered by git
- build.sh script in each repo
Auto test after build, don’t do end-to-end testing, do that in staging
Package app – they use docker – push to internal docker repo
Deploy to staging – they use curl to push JSON to Mesos/Marathon, which pulls the container. Testing is run there
Single Click approval to deploy to staging
Deploy to prod – should be same as how you deploy to staging.
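The commit-to-production flow described above can be sketched as a list of stages where a failure stops everything after it. This is a toy model rather than the speaker's tooling: the stage names come from the notes, but the lambdas are hypothetical stand-ins for the real build, test and deploy commands.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order; stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, step in stages:
        if not step():
            break
        completed.append(name)
    return completed


# Hypothetical stand-ins for real build, test and deploy commands.
stages = [
    ("build", lambda: True),
    ("auto test", lambda: True),
    ("package", lambda: True),
    ("deploy to staging", lambda: True),
    ("test in staging", lambda: False),  # simulate a failed staging test
    ("deploy to prod", lambda: True),
]

print(run_pipeline(stages))
```

With the simulated staging failure, the run stops before "deploy to prod", which is the property that makes frequent small deploys safe.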

LNAV – Paul Wayper

Point at a dir. read all the files. sort all the lines together in timestamp order
Colour codes, machines, different facilities (daemons). Highlights IP addresses
Error lines in red, warning lines in yellow
Regular expressions highlighted. Fully PCRE compatible
Able to move back and forth an hour or a day at a time with special keys
Histogram of error lines, number per minute etc
more complete (SQL like) queries
compiles as a static binary
Ability to add your own log file formats
Ability to share format filters with others
Doesn’t deal with journald logs
Available for EPEL, Fedora, Debian but under a lot of active development.
acts like tail -f to spot updates to logs.
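The core idea of reading several files and sorting all the lines together by timestamp can be sketched with a k-way merge. This is not how lnav itself is implemented (that is not covered in the talk notes); it assumes each log is already in time order and starts every line with an ISO-8601 timestamp, and the sample log lines are made up.

```python
import heapq
from datetime import datetime


def parse(line):
    """Split an ISO-8601 timestamp prefix from the rest of the line."""
    stamp, _, message = line.partition(" ")
    return datetime.fromisoformat(stamp), message


def merge_logs(*logs):
    """Merge several already-sorted logs into one timestamp-ordered list."""
    parsed = ([parse(line) for line in log] for log in logs)
    return [f"{ts.isoformat()} {msg}" for ts, msg in heapq.merge(*parsed)]


# Hypothetical log content; real logs use many timestamp formats.
syslog = ["2016-02-02T10:00:00 sshd: session opened",
          "2016-02-02T10:00:05 sshd: session closed"]
weblog = ["2016-02-02T10:00:02 nginx: GET /index.html"]

merged = merge_logs(syslog, weblog)
```

Here `merged` interleaves the nginx line between the two sshd lines, which is the view you want when chasing an incident across daemons.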

Linux.conf.au 2016 – Sysadmin Miniconf – Session 2

Site Reliability Engineering at Dropbox – Tammy Butow

Having an SLA, measuring against it. Caps on ops work, Blameless Post Mortem, Always be coding
400M customers, billion files every day
Very hard to find people to scale, so build tools to scale instead
Team looks at 6,000 DB machines, look after whole machines not just the app
Build a lot of tools in python and go
PygerDuty – python library for pagerduty API
- Easy to find the top things paging, write tools to reduce these
- Regular weekly meeting to review those problems and make them better
- If work is happening on machines then turn off monitoring on them so people don’t get woken up for things they don’t need to.
- Going for days without any pages
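Finding the top things paging, while ignoring machines that are deliberately under maintenance, can be sketched as a simple count. The alert records below are hypothetical and nothing here calls the PagerDuty API or the PygerDuty library; a real tool would fetch the alerts from the paging service first.

```python
from collections import Counter


def top_pagers(alerts, in_maintenance=frozenset(), n=3):
    """Count pages per alert name, ignoring hosts that are under maintenance."""
    counts = Counter(
        alert["name"] for alert in alerts if alert["host"] not in in_maintenance
    )
    return counts.most_common(n)


# Hypothetical alert records; a real tool would fetch these from the paging API.
alerts = [
    {"name": "disk_full", "host": "db1"},
    {"name": "replication_lag", "host": "db2"},
    {"name": "disk_full", "host": "db3"},
    {"name": "disk_full", "host": "db4"},
    {"name": "replication_lag", "host": "db4"},
]

print(top_pagers(alerts))
print(top_pagers(alerts, in_maintenance={"db4"}))
```

The output of the first call is the agenda for the weekly review meeting: whatever pages most is the first thing to automate away.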
Self-Healing and auto-remediation scripts
Hermes
- Allocate and track tasks to groups
Automation of DB tasks
Bot copies pagerduty alerts in slack
Aim Higher
- Create a roadmap for next 12 months
- Building a rocketship while it is flying through the sky
Follow the Sun so people are working days
Post Mortem for every page
Frequent DR testing
Take time out to celebrate

I missed out writing up the next couple of talks due to technical problems

Linux.conf.au 2016 – Sysadmin Miniconf – Session 1

Is that a Cloud in your pocket – Steven Ellis

What if you could have a demo of a stack on a phone
or on a memory stick or a mini raspberry-pi type PC
Nested Virtualisation
Hardware
- Using Linux as host env, not so good on Win and Mac
- Thinkpad, fedora or Centos, 128GB SSD
Nested Virtualisation
- Huge performance boost over qemu
- Use SSD
- enable options in modules kvm-intel or kvm-amd
- Confirm SSD perf 1st – hdparm -t /dev/sdX
- Create base env for VMs, enable vmx in features
- Make sure it uses a different network so doesn’t badly interact with ones further out
Thin LVM
- Create thin pool for all envs
- Thin on LVM: set “issue_discards = 1”
Base image
- Doesn’t have to be minimal
- update the base regularly
- How do you build your base image?
  - Thin may go weirdly wrong
  - Always use kickstart to re-create it.
- Think of your use case, don’t skimp on the disk (eg 40G disk for image)
- ssh keys, Enable yum cache
- Patch once kicked
- keep a content cache, maybe with rsync or mrepo
Turn off VM and then use fstrim and partx to make it nice and smaller.
virt-manager can’t manage thin volumes, DONT manually add the path
use virsh to manually add the path.
snapshots of snapshots: great performance on SSD
Thin no longer activates automatically on distros
packstack simple way to install simple openstack setup
LVM vs QCOW
- qcow okay for smaller images
- cloud-init with atomic
- do not snapshot a qcow image when it is running

Revisiting Unix principles for modern system automation – Martin Krafft

SSH Botnet
OSI of System Automation
Transport unix style, both push and pull
uses socat for low level data moving
autossh <- restarts ssh connection automatically
creates control socket

A Gentle Introduction to Ceph – Tim Serong

Ceph gives a storage cluster that is self healing and self managed
3 interfaces, object, block, distributed fs
OSD with files on them, monitor nodes
OSD will forward writes to other replicas of the data
clients can read from any OSD
Software defined storage vs legacy appliances
Network
- Fastest you can, separate public and cluster networks
- cluster faster than public
Nodes
- 1-2G ram per TB of storage
- read recommendations
SSD journals to cache writes
Redundancy
- Replications – capacity impact but usually good performance
- Erasure coding – Like raid – better space efficiency but impact in most other areas
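The capacity trade-off between the two redundancy schemes can be made concrete with a little arithmetic. This sketch ignores filesystem and metadata overhead, and the cluster size, replica count and erasure-coding parameters (k data chunks plus m coding chunks) are example values chosen here, not figures from the talk.

```python
def usable_replicated(raw_tb, copies):
    """Usable capacity when every object is stored `copies` times."""
    return raw_tb / copies


def usable_erasure_coded(raw_tb, k, m):
    """Usable capacity with k data chunks and m coding chunks per object."""
    return raw_tb * k / (k + m)


raw = 120  # hypothetical raw cluster capacity in TB
print(usable_replicated(raw, copies=3))       # 3 copies of everything
print(usable_erasure_coded(raw, k=4, m=2))    # 4 data + 2 coding chunks
```

Under these example numbers erasure coding yields twice the usable space of three-way replication, which is the space efficiency being traded against performance.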
Adding more nodes
- tends to work
- temp impact during rebalancing
How to size
- understand your workload
- make a guess
- Build a 10% pilot
- refine until perf is achieved
- scale up the pilot

Keeping Pinterest running – Joe Gordon

Software vs service
- No stable versions
- Only one version is live
- Devs support their own service – aligns incentives, eg monitoring built in
- Testing against production traffic
SRE at Pinterest
- Like a pit crew in F1
- firefighting at scale
- changing tires while moving
Operation Maturity
Operation Excellence
- Have the best practices, docs, process, improvements
- Repeatable deploys
Visibility
- data driven company
- Lots of Time series data – TSDB
- Using ELK
Deployments
- no impact to end user
- easy to do, every few minutes
Canary vs Staging
- Send dark (copies) of traffic to canary box without sending anything back to user
- Bounce back to starting if problems
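Sending dark traffic to a canary can be sketched as follows: the user is always answered by production, while a copy of the request goes to the canary and its reply (or crash) is only recorded. This is a simplified illustration with hypothetical handler functions; a real setup would normally mirror traffic asynchronously so the canary cannot slow down the user.

```python
def handle(request, production, canary, canary_errors):
    """Serve from production; mirror a copy to the canary and discard its reply."""
    try:
        canary(dict(request))  # copy, so the canary cannot alter the real request
    except Exception as exc:  # canary failures must never reach the user
        canary_errors.append(repr(exc))
    return production(request)


def production(request):
    return f"v1 says hello to {request['user']}"


def broken_canary(request):
    raise RuntimeError("v2 crashed")


errors = []
response = handle({"user": "alice"}, production, broken_canary, errors)
```

After this call the user has a normal response from the current version, and `errors` holds the evidence that the new build is not ready.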
Teletraan
- Rollback, hotfix, rolling deploy, starting and testing, visibility and useability
- client-server model
- pre/post download, restart, etc scripts included with every deployment
- pause/resume various testing
Postmortems and Production readiness reviews
Cloud is not infinite, often will hit AWS capacity limits or even no available stuff in the region
Need to be able to make sure you know what you are running and if it is efficiently used
Open sourced tools
- mysql_utils – lots of tools to manage many DBs
- Thrift tools
- Teletraan – open sourced in Feb 2016
- github.com/pinterest

Linux.conf.au 2016 – Tuesday – Keynote: George Fong

George Fong – Chair of Internet Australia

The Challenges of the Changing Social Significance of the Nerd

“This is the first conference I’ve been to where there’s an extremely high per capita number of ponytails”
Linux not just running web server and other servers but also network devices
Linux and the Web aren’t the same thing, but they’ve grown symbiotically and neither would be the same without the other
“One of the lessons we’ve learned in Australia is that when you mix technology with politics, you get into trouble”
“We have proof in Australia that if you take guns away from people, people stop getting killed”

Linux.conf.au 2016 – Monday – Session 3

Cloud Anti-Patterns – Casey West

The 5 stages of Cloud Native
Deploying my apps to the cloud is painful – why?
Denial
- “Containers are like tiny VMs”
- Anti-Pattern 1 – do not assume what you have now is what you want to put into the cloud or a container
- “We don’t need to automate continuous delivery”
- We shouldn’t automate what we have until it is perfect. Automate to make things consistent (not always perfect at least at the start)
Anger
- “works on my machine”
- Devs just push straight from dev boxes to production
- Not about making worse code go to production faster
- Aim for repeatable, testable builds, just faster
Bargaining
- “We crammed the monolith into a container and called it a microservice”
- Anti-Pattern: Critically think on what you need to re-factor (or “re-platforming” )
- ” Bi-modal IT “
- Some stuff on fast lane, some stuff on old-way slow lane
- Anti-pattern: legacy products put into slow lane, these are often the ones that really need to be fixed.
- “Micro-services” talking to same data-source, not APIs
Depression
- “200 microservices but forgot to setup Jenkins”
- “We have an automated build pipeline but only release twice per year”
Acceptance
- All software sucks, even the stuff we write
- Respect CAP theorem
- Respect Conway’s Law
- Small batch sizes work for replatforming too
Microservices architecture, Devops culture, Continuous delivery – Pick all three

Cloud Crafting – Public / Private / Hybrid – Steven Ellis

What does Hybrid mean to you?
What is private Cloud (IAAS)
Hybrid – communicate to public cloud and manage local stuff
ManageIQ – single pane of glass for hardware, VMs, clouds, containers
What does it do?
- Brownfields as well as Greenfields, gathers current setup
- Discovery, API presentations, control and detect when env non-compliant (eg not fully patched)
- Premise or public cloud
- Supplied as a virtual appliance, HA, scale out
- Platform – CentOS 7, Rails, PostgreSQL, GUI, some dashboards out of the box.
Get involved
- Online, roadmap is public
- Various contributors
DEMO
Just put in credentials to allow access and then it can gather the data straight away

Live Migration of Linux Containers by Tycho Andersen

LXC / LXD
LXD is a REST API that you use to control the container system
tool -> REST -> Daemon -> lxc -> Kernel
“lxc move host1:c1 host2: ” – Live migrations
- Needs a bit of work since lots moving, lots of ways it could fail
- 3 channels created, control, filesystem, container processes state
CRIU
- 5 years of check-pointing
- Lots based off open-VZ initial work
- All sorts of things need to support check-pointing and moving (eg selinux)
- Iterative migration added
- Lots of hooks needed for very privileged kernel features
Filesystems
- btrfs, lvm, zfs, (swift, nfs), have special support for migration that it hooks into
- rsync between incompatible hosts
Memory State
- Stop the world and move it all
- Iterative incremental transfer (via p.haul) being worked on.
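The difference between "stop the world" and iterative transfer can be illustrated with a toy model: each round copies the currently dirty pages while the container keeps running, some of those pages get dirtied again, and the container is only frozen once the remainder is small. This does not reflect how CRIU or p.haul work internally; the page counts, the re-dirty percentage and the threshold are made-up parameters.

```python
def precopy_rounds(total_pages, redirty_percent, stop_threshold, max_rounds=10):
    """Toy model of iterative memory migration.

    Each round copies the currently dirty pages while the container keeps
    running; a percentage of the copied pages is dirtied again meanwhile.
    Once few enough pages remain, the container is frozen for a final copy.
    Returns (pages copied per live round, pages copied while frozen).
    """
    dirty = total_pages
    live_rounds = []
    while dirty > stop_threshold and len(live_rounds) < max_rounds:
        live_rounds.append(dirty)
        dirty = dirty * redirty_percent // 100
    return live_rounds, dirty


rounds, frozen = precopy_rounds(10000, redirty_percent=10, stop_threshold=50)
```

In this example only the last handful of pages is copied while the container is paused, instead of all 10000, which is why the iterative approach shortens downtime.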
LXC + LXD 2.0 should be in Ubuntu 16.04 LTS
Need to use latest versions and latest kernels for best results.

Linux.conf.au 2016 – Monday – Session 1

Open Cloud Miniconf – Continuous Delivery using blue-green deployments and immutable infrastructure by Ruben Rubio Rey

Lots of things can go wrong in a deployment
Often hard to do rollbacks once upgrade happens
Blue-Green deployment is running several envs at the same time, each potentially with different versions
Immutable infrastructure , split between data (which changes) and everything else only gets replaced fully by deployments, not changed
When you use docker don’t store data in the container, makes it immutable. But containers are not required to do this.
Rule 1 – Never modify the infrastructure
Rule 2 – Instead of modifying – always create from ground up everything that is not data.
Advantages
- Rollbacks easy
- Avoid Configuration drift
- Updated and accurate infrastructure documentation
Split things up
- No State – LBs, Web servers, App Servers
- Temp data , Volatile State – message queues, email servers
- Persistent data – Databases, Filesystems, slow warming cache
In case of temp data you have to be able to drain
Use LBs and multiple servers to split up infrastructure; more bits give more room to split up the upgrades.
If pending jobs require old/new version of app then route to servers that have/not been upgraded yet.
Put toy rocket launcher in devs office, shoots person who broke the build.
Need to “use activity script” to bleed traffic off section of the “temp data” layer of infrastructure, determine when it is empty and then re-create.
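The blue-green idea for the stateless layer can be sketched as a router with two environments, where an upgrade builds the idle one from scratch and a rollback is just switching back. This is a toy model with hypothetical names; it deliberately ignores the draining of temp data and the persistent data layer discussed above.

```python
class BlueGreenRouter:
    """Toy load balancer that sends all traffic to one of two environments."""

    def __init__(self, blue_version):
        self.envs = {"blue": blue_version, "green": None}
        self.live = "blue"

    @property
    def idle(self):
        return "green" if self.live == "blue" else "blue"

    def deploy(self, version):
        """Build the idle environment from scratch; the live one is untouched."""
        self.envs[self.idle] = version

    def switch(self):
        """Point traffic at the other environment."""
        if self.envs[self.idle] is None:
            raise RuntimeError("nothing deployed to the idle environment")
        self.live = self.idle

    def serving(self):
        return self.envs[self.live]


router = BlueGreenRouter("1.0")
router.deploy("1.1")   # green is created alongside blue
router.switch()        # cut over
after_upgrade = router.serving()
router.switch()        # rollback is just switching back
after_rollback = router.serving()
```

Because the old environment is never modified, the rollback path is exactly as cheap as the upgrade path.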

Priorities for 2016

This is almost a New Year’s resolutions page but not quite. It is a list of the stuff that will take priority over other things in 2016

Chess – Aim to play regularly in tournaments, do weekly coaching and study at least 7 hours per week on tactics, endgames and openings.
Programming – Continue improving my programming skills, finish the book I am on, do a few exercises and create a few things
Blogging – At least 1 post each month to both my personal blog and the Auckland Chess Centre website
Driving – Get my Restricted Driver License
Reading – Read books (not online) at least half an hour per day
Health – 7500 steps every weekday plus get to goal weight
Conference – Run successful Sysadmin Miniconf at Linux.conf.au 2016

Stretch Goals – If I am keeping up with the above

Start working my way through Shakespeare’s plays
Do a couple of new website projects I’ve been putting off
Watch 2-3 hours of TV each week.

Studying for Driver license test with Anki

In 2014 I decided to do a bit of work to finally get my New Zealand driver license. The first step towards this was passing the theory test which is a 35 question test given on computer. You have to get at least 32 questions right to pass.

After spending a bit of time looking at the roadcode book I decided to go with just learning the questions. I did this by:

Buying some of the official practice exams
Grabbing other questions from unofficial sites
Entering some other questions manually from the books

I took all these questions and created an Anki deck. Anki is spaced repetition software that I use to learn things. I tell it to ask me a few new questions every day; if I get them wrong it asks me again tomorrow, if I get them right it asks me again next week. Gradually as I learn something it asks me less often (see the more technical explanation here)

A typical question on an Anki deck looks like these screenshots:

The screenshot on the left shows me being asked the question. Once I pick my answer I look at the actual answer (see rightmost screenshot)

If I get it wrong I get the card again in 10 minutes and depending on how easy I judged it if I got it right I’ll only see it again in months.
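The scheduling idea can be sketched in a few lines. This is a much simplified model and not Anki's actual algorithm: the 10 minute retry comes from the description above, while the one day starting gap and the 2.5 multiplier are assumptions picked for illustration.

```python
from datetime import timedelta


def next_interval(previous, correct, ease=2.5):
    """Return how long to wait before showing a card again.

    A wrong answer brings the card back in 10 minutes; a right answer
    multiplies the previous gap, starting from one day.
    """
    if not correct:
        return timedelta(minutes=10)
    if previous < timedelta(days=1):
        return timedelta(days=1)
    return previous * ease


interval = timedelta(0)
history = []
for answer in [True, True, True, False, True]:
    interval = next_interval(interval, answer)
    history.append(interval)
```

Three correct answers in a row stretch the gap from one day to just over six, and a single miss resets the card to the short loop, which is why well-known cards quickly stop taking up review time.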

I ended up entering just on 400 questions and told Anki to give me 5 new cards every day plus whatever old ones I had to review. After a few months I had gone through all the questions and had a good feel for them. I also did some of the official practice exams.

Eventually in December 2014 I sat the exam and got 100 percent correct.

I’ll make my deck available at the link below. There are just over 400 cards in it, some with pictures. There are a few duplications but no errors as far as I am aware. They are current as of late 2014 (including the give-way rules change that year).

To use them you’ll need a copy of Anki and it is probably easiest to use the desktop edition to import the file and then use an Ankiweb account to Synchronize to a copy on your phone.

Download NZ Driver license Theory Anki Deck (2MB .apkg file)