2019 – Wednesday – Session 1

Filesender: Sending large files across facilities – Ben Martin

Dr Ben Martin
  • 10-year-old project
  • Web based File Sharing
    • Quick 1-to-few file sharing between people
    • Files go away after a month by default
    • Simpler to run than anon-ftp etc
    • Stats of downloads available to sharer
    • User only needs web browser
    • Upload resume, important with TB sized files
    • Notifications
    • share with explicit groups
    • Browser-to-browser encryption of data AES-256
    • SAML for auth scale
    • GDPR by default, about privacy page
  • Overview
    • Server side is PHP
    • Client side JS with light widgets
    • MariaDB
  • Server Storage
    • Chunked 5MB files
    • Ceph used at AARNet
  • Downloading
    • On the fly zip64 archive creation
    • One or more files listed per transfer
    • Links for console download if needed
  • Dragons
    • Auto Downloads and fast uploads cross browser is HARD
    • Mixed browsers
    • Long uploads can exceed auth session times
    • Web crypto support w3c
    • People use ancient versions of databases
  • Lots of details on the Database and Encryption. Sounds like both have improved to a good state
  • Future
    • UI refresh
    • Mobile App
    • E2E Encryption
    • Docker image for easy setup
    • More SAML info, apache config?
    • Integration of Endpoints ( auto youtube etc )
    • Session Clone to investigate problems (but privacy?)
    • Run the whole thing in the cloud
  • Questions
    • Command line? – REST API (php and client)
    • RSYNC for slightly changing files – Being investigated
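
The notes mention 5MB chunks and upload resume. As a rough illustration only (FileSender itself is PHP and JS, and this is not its code), resuming could be sketched as working out which chunks the server is still missing. The function names and the idea that the server reports already-stored offsets are assumptions made for this example.

```python
CHUNK_SIZE = 5 * 1024 * 1024  # 5 MB, the chunk size mentioned in the talk


def chunk_ranges(file_size, chunk_size=CHUNK_SIZE):
    """Return (offset, length) pairs covering a file of file_size bytes."""
    return [
        (offset, min(chunk_size, file_size - offset))
        for offset in range(0, file_size, chunk_size)
    ]


def remaining_chunks(file_size, uploaded_offsets, chunk_size=CHUNK_SIZE):
    """Return the chunks still to send, given offsets the server already has."""
    done = set(uploaded_offsets)  # hypothetical: reported by the server
    return [c for c in chunk_ranges(file_size, chunk_size) if c[0] not in done]
```

After an interrupted upload the client would only need to send what `remaining_chunks` returns, which is why resume matters so much for TB-sized files.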

Hot Potato – James Forman and Callum Dickinson

James and Callum
  • What is Hot potato
    • Not a monitoring System
    • monitoring System -> Hot potato -> On-call person
    • Web app in python and flask
    • Tells you things and stays out of the way
  • Why ?
    • Spark shutdown paging Network
    • Needed quick version
  • Goals
    • Don’t get in the way
    • Alert reduction
    • Highly available
    • Support any System – Nagios family now, Prometheus later
    • Support methods – Pushover, SMS, Paging
  • What else can it do
    • Failure notifications when contacts are not working
    • Heartbeats so know when monitoring system is down
  • Planning stuff to add to it
    • Teams – put everyone on call
    • Team escalations
    • Planned work ( go to person working before oncall, extend windows )
    • Support Hotline integration
    • Mobile App
    • Adding German and Italian
  • How it works
    • Flask App
    • RabbitMQ
    • Database (cockroachDB)
    • Apps talk via the Databases
    • Alert -> Object in DB -> Put on Queue -> Worker
    • Worker -> Get details to send to -> Try to send -> Store result in DB
    • If a send fails then it works its way through the list.
  • Questions:
    • ACK can use pushover so don’t have to login to app
    • Looking at teams functions
    • CockroachDB picked since it seems very reliable
    • Not sure about rostering/calendaring features going in since need to make it generic?
    • Endpoints fairly modular so should be extendable to new ones.
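
The "try to send, store the result, work through the list" flow described above can be sketched roughly as below. This is not Hot Potato's code; the `Notification` class, the `deliver` function and the stand-in senders are all hypothetical, and a real worker would store attempts in the database rather than in memory.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Notification:
    message: str
    # (method name, success) for every attempt, in order
    attempts: List[Tuple[str, bool]] = field(default_factory=list)


def deliver(notification, methods):
    """Try each (name, send) pair in order until one reports success."""
    for name, send in methods:
        try:
            ok = bool(send(notification.message))
        except Exception:
            ok = False  # a crashing sender counts as a failed attempt
        notification.attempts.append((name, ok))
        if ok:
            return name
    return None  # every method failed, so the caller should escalate


# Stand-in senders: Pushover is "down", SMS works, paging is never reached.
methods = [
    ("pushover", lambda msg: False),
    ("sms", lambda msg: True),
    ("pager", lambda msg: True),
]
alert = Notification("disk full on db1")
used = deliver(alert, methods)
```

Keeping every attempt on the notification is what makes the "failure notifications when contacts are not working" feature possible.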

2019 – Wednesday – Keynote

#WeAreNotWaiting: how open source is changing healthcare
– Dana Lewis

Dana Lewis
  • Getting diagnosed with a chronic disease is like being struck by lightning.
  • Insulin takes a while to kick in
  • Manual diabetes
    • Have to check level over and over again
    • Have to judge trend and decide more insulin, food, exercise, etc
    • Constantly
  • Device
    • Windows-only interface software to access
    • Alarm not great, various other limitations
  • Idea
    • Pull data off the device, create smarter app
    • Hard to do
  • First version
    • Device -> WinPC -> dropbox -> app -> pushover -> phone
  • 2nd Version
    • Press button to indicate what she is doing (eating, sleeping) when she gets an alert
  • 3rd version
    • Hook into Insulin pump to do it automatically
    • Replace the human doing the same thing over and over again in the loop.
    • all portable
    • Person doesn’t have to wake up to adjust, better sleep
  • OpenAPS
    • Open Source
    • Created a list of ways it could fail (battery fails, wires come out)
    • Focusing on safety
    • Limiting dosing ability in hardware and software
    • Failing back safely to standard device operation
  • Approx 1000 users
    • 9 million+ hours of DIY closed loop experience
    • Anonymized dataset available
  • Sample child user
    • Before: 4.5 manual interventions per day by parents
    • After: 0.7 per day
  • School Child ( 5th vs 6th Grade )
    • 420 visits to school nurse ( 2.3 /day ) – 66 visits for events
    • 5 visits with OpenAPS – 3 visits (Gym related)
  • Intel Edison Platform
    • Smaller than Raspberry Pi
    • But discontinued (looking for old ones to replace)
  • Going back to Pi, now smaller
    • Has built in display with “Explorer Hat”
  • Outsiders are building stuff because they can and the traditional companies are not meeting the need. Innovate with small solution and build it up
  • Ask what little things you can do for people with similar problems. This started with just “Make a louder alarm”.

2019 – Tuesday – Session 3 – Docs Down Under

How to Avoid Meetings – Maia Sauren

Maia Sauren
  • What Distributed/International Teams involves
    • Lots of late meetings
    • Not up on the inclusive languages
    • Weird language quirks from ESL speakers
    • Taxonomy changes between fields
    • Stereotypes are incomplete
    • Ask Culture vs Guess Culture
    • Micro-cultures , down to schools, extra years.
    • When you make a private joke without context, someone is left out
    • What governance model wins (who do they raise barriers for)
    • It’s harder to change a relationship over the phone than maintain one
    • A relationship with a CoC is less fragile (shouldn’t be figured out on the fly during conflict)
    • “How do you want to have arguments?”
    • Set standards early and resolve swiftly
    • Normalise conflict resolution
    • Adulting: it’s for people who don’t want to cry even more later
    • What have you done this week you are proud of?

Disaster Recovery Book – Svetlana Marina

Svetlana Marina
  • Software Development Process
    • Spending all the time “firefighting” so no time to improve processes
  • Operation Support Model
  • Incident Model
    • Alert Raise
    • Initial Response
      • Runbook should have: Alert type -> Investigation help, exact queries into log search, links
    • Impact, escalations, SLA
      • Send email to stakeholders, keep their trust
      • Include “communication required” in runbook
    • Investigation and Damage control
      • Do what is required to fix problem, not fix the root cause
      • Message to stakeholders again
    • Thorough Investigation (depends on situation)
    • Fix root cause
    • Post Incident Review
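
One way to picture the runbook structure described above (alert type -> investigation help, exact log queries, links, whether communication is required) is a simple lookup table. Everything in this sketch is made up for illustration: the alert names, query syntax and URLs are hypothetical and not from the talk.

```python
# Hypothetical runbook entries keyed by alert type.
RUNBOOK = {
    "disk_full": {
        "investigate": "Find which directory is growing before deleting anything",
        "log_query": 'service:"example-api" AND message:"No space left"',
        "links": ["https://wiki.example.com/runbooks/disk-full"],
        "communication_required": True,
    },
    "cert_expiring": {
        "investigate": "Check the expiry date and which hosts serve the cert",
        "log_query": 'service:"example-proxy" AND message:"certificate"',
        "links": ["https://wiki.example.com/runbooks/certs"],
        "communication_required": False,
    },
}

# Fallback so an unknown alert still tells the responder what to do next.
DEFAULT_ENTRY = {
    "investigate": "No runbook entry yet: escalate and write one afterwards",
    "log_query": "",
    "links": [],
    "communication_required": True,
}


def runbook_for(alert_type):
    """Return the runbook entry for an alert type, or the fallback entry."""
    return RUNBOOK.get(alert_type, DEFAULT_ENTRY)
```

Defaulting unknown alerts to "communication required" is a design choice in this sketch: it errs on the side of keeping stakeholders informed.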

The Bus Plan: Junior Staff Training – Andrew Jeffree

Andrew Jeffree
  • Staff coming into industries with lots of automation, but what when automation fails?
  • Staff only know about automation, can only run a playbook, don’t understand what it does
  • Disclaimer: Automation isn’t bad.
    • Config Management, Custom tooling, CICD, Infrastructure as code
  • Buttons
    • Need to understand what they do
    • Sometimes they are not there
    • Can be Hard to implement a button that doesn’t exist
  • Do it the manual way
  • Training
    • Do something manually (install wordpress) and document
    • Expand what you have done
    • Tweak it
    • Break and fix it
    • Don’t be afraid to change something manually
  • Questions
    • Split for new people – 30% training , 70% work
    • Manual -> ansible -> puppet
    • Don’t want them to be stuck in the mindset that everything we do is the best way

When Agile Doesn’t Work Anymore: Managing a Large Documentation Project – Lana Brindley

Lana Brindley
  • We all age
    • Sometimes old docs should just be binned
  • Old docs base, massive changes needed, but wanted to save it
    • Changes to toolchain too
  • Proof of concept
    • Define how big the job will be, what is involved
    • Get buy-in
  • Plan, Plan, Plan
    • Create a real timeline
    • Advertise your plan, tell everybody about it
    • Do a presentation to team for each phase
  • Research
    • Who is your audience?
    • No, really, who is your audience? – eg Sales and support may use docs more than customers
    • Lets the customers know that “somebody cares about these docs”
  • User/task analysis
    • Where do we need to focus work
    • What are the big tasks
  • Do the thing
    • You have to sit down and write it
    • You are breaking the agile system
    • You need to track your work and who is doing what
    • eg this case, moving from big chapters to small topic-based docs.
    • Agree ahead of time on process, have buy-in
  • Review
    • What went on?
    • How not to stuff it up next time
    • Try not to blame people too much
  • Agile vs … something else
    • Sometimes Agile is not the best model
  • Tips for working outside Agile
    • Outreach
      • Especially product owners.
      • Get people on board
    • Track your work
      • Create Epics of sprints
      • Don’t go overboard
    • Shout about it
      • outreach never stops
      • Present at sprint reviews
      • Brownbags

2019 – Tuesday – Session 2 – Picking a community & Mycroft AI

Finding Your Tribe: Choosing open source Communities – Cintia Del Rio

  • When she started out she couldn’t find info on picking which project to work on. Crowdsourced some opinions from others
  • You need to work out why you want to volunteer for a project.
  • 3 types of code on Github
    • Source Available
      • Backed by companies (without open source as their business model)
      • Core devs from company, roadmap controlled by them
      • Limited influence from externals
      • Most communication not on public channels
      • You will be seen as guest/outsider
    • One Person Band
      • Single core maintainer, working in spare time
      • Common even for very popular libraries and tools (lots of examples from node and java ecosystem)
      • Few resources
      • Conflicts might not be handled well
    • Communities
      • Communication Channels – forums, mailing list, chat
      • Multiple core devs
      • Github org
      • Code of conduct
  • Some things to check beforehand
    • Is it dead yet?
      • Communication channels, Commits, issues, pull requests – how old, recent updates
    • How aggressive is the community?
      • Look at how a clueless user is handled
      • Declined pull requests. HOW did they handle the decline
      • “Is the mailing list/IRC/Slack a trash fire?”
      • “Can you please rule” – add “Can you please” in front of a comment and does it sound nice or still mean/sarcasm?
    • Is non-coding work valued
    • Communities with translations?
    • Look at photos from the conferences
    • Grammar mistakes and typos – how are they handled?
    • Newbie tags and Getting started docs
    • Cool languages tend to attract toxic people
    • Look for Jerks in leadership
    • Ask around
  • Does it spark joy? – If not let it go.

Intro to the Open Source Voice Stack By Kathy Reid

Kathy Reid
  • In the past we have taught spreadsheets etc. We need to teach the latest thing and that is now voice interfaces
  • Worked for Mycroft, mainly using that for demo
  • Voice stack
    • Wake Word
    • Speech 2 Text (utterance)
    • Action ( command )
    • Text to Speech (dialogue )
  • Request response life-Cycle
    • Wake word
    • Intent parser over utterance
  • When Kathy just started she was first woman and first Australian so few/no samples in database and had problems understanding her.
  • Text to Speech
    • Needs a good speaker speaking for 40-60 hours
    • Mimic Recording Studio – List of Phrases that people need to speak to train the output

2019 – Tuesday – Session 1 – Docs Down Under Miniconf

Being Kind to 3am You – Katie McLaughlin

Katie McLaughlin
  • Not productive and operating at her best at 3am
  • But 3am’s will happen and they probably will be important
  • Essentials
    • You should have documentation, don’t keep it in your head since people are not available at 3am
    • Full doc management system might take a while
    • Must be Editable
      • Must be updateable at any time
    • Searchable
    • Have browser keywords that search confluence or github
    • Secure but discoverable by co-workers
  • Your Tools
    • Easy cache commands to use
    • Not dangerous
  • Stepping Up
    • Integrate your docs so it’ll be available and visible when people need it.
    • Alerts could link to docs for service
  • Post Mortem
    • List of commands you typed to fix it
  • Reoccurring Issues
    • Sometimes the quick fix is all you can do or is good enough. You can get back to sleep.
    • Maybe just log rotate to clean the disk. Or restart process once a week
    • Make your fix an Ansible playbook you can just click
  • Learning
    • Learn new stuff so when you have the chance you can do it right
  • Flag Changes
    • Hand over changes to everyone else
  • Show empathy towards the other people (and they may show it back)
  • Audience
    • One guy gave anyone who got paged overnight a $100 bill on their desk the next day ( although he charged customers $150 )
    • From Fire Depts – Label everything, Have the docs come with the alert. Practice during the daylight.
    • Project iPXE – every single error message is a link to a wiki page
    • Advice: Write down every command, everything you did, every output you saw. So useful for next day.

Making yourself Redundant on Day One – Alexandra Perkins

Alexandra Perkins
  • Experiences
    • All Docs as facebook posts
    • All docs as code comments
    • Word documents hidden in folders
  • Why you should document in your first weeks
    • You know what the relevant questions are for new people
    • You won’t remember it the first time you hear it
    • Easier for the next person
    • Inclusive and diverse workplace
  • What should you document
    • Document the stuff you find hard
    • Think about who else can use your docs
    • Stuff like: How to book leave, Who to ask about what topics, Info on workplace social events. Where lunch is.
  • How to document from the start
    • In wiki or Sharepoint
    • Word docs locally and copy them to the official place once you have access
    • Saved support tickets
    • Notes to yourself on slack
    • Screenshots of slack conversations
    • Keep it simple, informal content is your friend. All the Memes!
    • Example Tutorial: “Send yourself an email and trace it through the logs”
  • Future Proofing
    • Create or Improve the place for Internal documentation
    • Everyone should be able to access and edit (regardless of technical expertise)
    • Must be searchable and editable so can be updated
    • Transfer all docs you did on your personal PC to company-wide documentation
    • Make others aware of the work you have done
    • Foster a culture of strong documentation.
      • Policy to document all newly announced changes
      • Have a rotation for the documentation person
    • Quality Internal Docs should be
      • Accessible
      • Editable
      • Searchable
      • Peer Reviewed

JIT Learning: It’s great until it isn’t – Tessa Bradbury

Tessa Bradbury
  • What should we learn?
    • There is a lot to learn
  • What is JIT Learning?
    • Write Code -> Hit an issue -> Define Problem -> Find a Solution
  • Assumption – You will ask the required questions (hit the issue)
    • Counter example: accessibility, you might not hit the problem yourself
  • Assumption – You can figure out the problem
    • Sometimes you can’t easily, you might not have the experience and/or training
  • Assumption – You can find the solution
    • Sometimes you can’t find the solution on google
    • Sometimes you are not in Open Source, you can’t just read the code and the docs may be lacking
  • Assumption – You might not be sure the best way to write your fix
    • Best way to implement the code, if you should fix it in code
    • Or if your code has actually fixed the whole problem
  • Assumption – The benefit of getting it done now outweighs the cost of getting it wrong
    • Counter example – Security

2019 – Tuesday – Keynote: Rory Aronson


  • Had the idea to automate small-scale gardening
  • Wrote up a proposal: FarmBot – 3d printer for growing your garden
    • Open Source
    • Impact > Money
    • Based on 3D printer frame (with moving arms)
  • Created prototype 2015-2016
    • Add Web App
  • Crowdfunding and Video campaign
    • 58 Million views
    • $800k pre-sales, 300 orders
  • Created and shipped first version
  • But still open source

Open Source Hardware

    • All the CAD models
    • Bill of materials
    • Everything Versioned
    • Mods and add-ons “for inspiration only” (not officially supported)
  • Open Source Community
    • Code of Conduct

Open Source Company

  • If you have a business you will have competitors
  • Competitors -> Collaborators
  • Lots of numbers online (profit, bills, etc), Stuff that would be on internal wiki (like how to handle orders or do taxes) at other company is public
  • Compensation formula
  • See “Buffer’s Transparency Dashboard”


Donations 2018

Each year I do the majority of my Charity donations in early December (just after my birthday) spread over a few days (so as not to get my credit card suspended).

I also blog about it to hopefully inspire others. See: 2017, 2016, 2015

All amounts this year are in $US

My main donation was to GiveWell (to allocate to projects as they prioritize). I’m happy that they are making efficient use of donations.

I gave some money to the Software Conservancy to allocate across the projects (mostly open source software) they support and also to Mozilla to support the Firefox browser (which I use) and other projects.

Next were three advocacy and infrastructure projects.

And finally I gave some money to a couple of outlets whose content I consume. Signum University produces various educational material around science-fiction, fantasy and medieval literature. In my case I’m following their lectures on YouTube about the Lord of the Rings. The West Wing Weekly is a podcast doing an episode-by-episode review of the TV series The West Wing.



DevOpsDaysNZ 2018 – Day 2 – Session 4

Allen Geer, Amanda Baker – Continuously Testing

  • Various sites
  • All Silverstripe and Common Web Platform
  • Many sites out of date, no automated testing, no test metrics, manual testing
  • Micro-waterfall agile
  • Specification by example (prod owner, DevOps, QA) created Gherkin tests
  • Standardised on CircleCI
  • Visualised – Spec by example
  • Prioritised feature tests
  • Ghirkinse
  • Test at start of dev process. Bake Quality in at the start
  • Visualise and display metrics, people could then improve.
  • Path to automation isn’t binary
  • Involve everyone in the team
  • Automation only works if humanised

Jules Clements – Configuration Pipeline : Ruling the One Ring

  • Desired state
  • I didn’t quite understand what he was saying

Nigel Charman – Keep Calm and Carry On Organising

  • 71 Conferences worldwide this year
  • NZ following the rules
  • Lots of help from people
  • Stuff stuff stuff

Jessica DeVita – Retrospecting our Retrospectives

  • Works on Azure DevOps
  • Post-mortems
  • What does it mean to have robust systems and resilience? Is resilience even a property? It just is. When we fly on planes, we’re trusting machines and automation. Even planes require regular reboots to avoid catastrophic failures, and we just trust that it happens
  • CEO after a million dollar outage said “Can you get me a million dollars of learning out of this?”
  • After the US Navy had accidents caused by sleep deprivation, it switched to a new watch structure
  • Postmortems are not magic, they don’t automatically make things change
  • We dedicate a lot of time to below the line, looking at the technology. Not a lot of conversation about above-the-line things like mental models.
  • Resilience is above the line
  • Catching the Apache SNAFU
  • The Ironies of Automation – Lisanne Bainbridge
  • Well facilitated debriefings support recalibration of mental models
  • US Forest Service – Learning Review – Blame discourages people speaking up about problems
  • We never know where the accident boundary is, only when we have crossed it.
    • SRE, Chaos Engineering and Human Factors help handle this
  • In postmortems please be mindful of judging timelines without context. Saying something happened in a short or long period of time is damaging
  • Ask “what made it hard to get that team on the phone?” , “What were you trying to achieve”
  • Etsy Debriefing Guide – lots of important questions.
  • “Moving past shallow incident data” – Adaptive Capacity Labs
  • Safety is a characteristic of Systems and not of their components
  • Ask people about their history, ask every person about what they do and how they got there because that is what shapes your culture as an organisation

DevOpsDaysNZ 2018 – Day 2 – Session 3


I’ll fill this in later.


  • Honeycomb, Sumologic. Use AI to look at what happened at same time and magically correlate
  • Expensive or hard to send all logs as volumes go up
  • What if the logging is wrong or missing?
  • Metrics
    • Export in prometheus format
    • Read RED and USE paper
    • Create a company schema with half a dozen metrics that all services expose
  • Have an event or transaction ID that flows across all the microservices so logs can be correlated
  • Non technical solutions
    • Refer to previous incident logs
    • Part of deliverables for product is SLA stats which require logs etc
  • Testing logs
    • Make sure certain events produce a log
  • Chaos Monkey
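
The point about a transaction ID flowing across microservices can be illustrated with Python's standard `logging` module. This is only a sketch: the service names, the `txn_id` field name and the shared in-memory stream standing in for a log aggregator are all assumptions, not anything shown in the session.

```python
import io
import logging
import uuid


def make_logger(service, stream):
    """Build a logger whose every line carries a txn= field."""
    logger = logging.getLogger(service)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s txn=%(txn_id)s %(message)s"))
    logger.handlers.clear()
    logger.addHandler(handler)
    return logger


stream = io.StringIO()  # stands in for a central log store
frontend = make_logger("frontend", stream)
billing = make_logger("billing", stream)

# The ID is created once at the edge and passed to every downstream call.
txn_id = uuid.uuid4().hex
frontend.info("order received", extra={"txn_id": txn_id})
billing.info("card charged", extra={"txn_id": txn_id})

# Correlating is then a plain search on the shared field.
correlated = [
    line for line in stream.getvalue().splitlines() if f"txn={txn_id}" in line
]
```

The same idea doubles as a log test: a known event should always produce a line with the expected ID.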

ANZ Drivetrain

  • Change control cares about
    • Availability
    • Risk
    • Dependencies
    • Rollback
  • But the team doing the change knows about these all
  • Saw tools out there that seem very opinionated
  • Drivetrain
    • Automated Checklist
    • Work with Change people to create checklist
    • Pipeline talks to Drivetrain and tells it what has been done
    • Slack messages sent for manual changes (they login to app to approve)
  • Looked at some other tools (eg chef automate, udeploy )
    • Forced team to work in a certain pattern
  • But use ServiceNow tool as official corporate standard
    • Looking at making Drivetrain fill in ServiceNow forms
  • People worried about stages in tool often didn’t realise the existing process had same limitations
  • Risk assessed at the Story and Feature level. Not release level
  • Not suitable for products that do huge releases every few months with a massive number of changes.
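
The automated checklist idea (the pipeline reports what has been done, manual items wait for an approval) might look something like the following. This is a guess at the general shape only, not how Drivetrain is built; the `Checklist` class and the item names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Checklist:
    required: set                      # items agreed with the change team
    done: set = field(default_factory=set)

    def record(self, item):
        """Called by the pipeline (or an approver) when an item is complete."""
        if item not in self.required:
            raise KeyError(f"unknown checklist item: {item}")
        self.done.add(item)

    def outstanding(self):
        return sorted(self.required - self.done)

    def ready(self):
        """The change may proceed only when nothing is outstanding."""
        return not self.outstanding()


checklist = Checklist({"tests_passed", "rollback_plan", "manual_approval"})
checklist.record("tests_passed")    # reported by the pipeline
checklist.record("rollback_plan")   # reported by the pipeline
blocked_on = checklist.outstanding()
```

Here the change stays blocked on `manual_approval` until someone logs in and approves it, which mirrors the Slack-message step in the notes.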




DevOpsDaysNZ 2018 – Day 2 – Session 2

Interesting article I read today

Why Doctors Hate their Computers by Atul Gawande

Mrinal Mukherjee – A DevOps Confessional

  • Not about accidents, it is about Planned Blunders that people are doing in DevOps
  • One Track DevOps
    • From Infrastructure background
    • Job going into places, automated the low hanging fruit, easy wins
    • Column of tools on resume
    • Started becoming the bottleneck, his team was the only one who knew how the infrastructure worked.
    • Not able to “DevOps” a company since only able to fix the infrastructure, not able to fix testing etc so not delivering everything that company expected
    • If you are the only person who understands the infrastructure you are the only one blamed when it goes wrong
    • Fixes
      • Need to take all team on a journey
      • But need to have right expectations set
      • Need to do learning in areas where you have gaps
      • DevOps is not about individual glory, Devops is about delivering value
      • HR needs to make sure they don’t reward the wrong thing
  • MVP-Driven Devops
    • Mostly working on Greenfields products that need to be delivered quickly
    • MVP = Maximum Technical Debt
    • MVP = Delays later and Security audits = Your name attached to the problem
    • Minimum Standard of Engineering
      • Test cases, Documentation, Modular
      • Peer review
    • Evolve architecture, not re-architect
  • Judgemental Devops
    • That team sucks, they are holding things up, playing a different game from us
    • Laughing at other teams
    • Consequence – Stubbornness from the other team
    • Empathy
      • Find out why things are the way they are
    • Collaborate to find common ground and improve
    • Design my system so I plan to work within constraints of the other team