Audiobooks – December 2018

The Casebook of Sherlock Holmes by Sir Arthur Conan Doyle. Read by Stephen Fry

A bit weaker than the other volumes. The author tries mixing the style in places but several stories feel like repeats. Lacking excitement 7/10

Children of Dune by Frank Herbert

A good entry for the 2nd tier of Dune Books (behind 1, even with 2 & 4). Good mix of story, philosophy and politics. Plot a little plodding though.  8/10

Modern Romance: An Investigation by Aziz Ansari

Lots of hardish data and good advice for people looking to date online. Around 3-4 years old so fairly uptodate. Funny in a lot of places & some audiobook extra bits. 8/10

The Day of Battle: The War in Sicily and Italy, 1943-1944 by Rick Atkinson

Part 2 of a trilogy. I liked this a little more than the first edition but it is still hard to follow in the audiobook format without maps etc. Individual stories lift it up. 7/10

The End of Night: Searching for Natural Darkness in an Age of Artificial Light
by Paul Bogard

A book about how darkness is being lost for most people. How we are missing out on the stars, a good sleep and much else. I really liked this, writing good and topics varied. 8/10

President Carter: The White House Years by Stuart E. Eizenstat

Eizenstat was Chief Domestic Policy Advisor to Carter & took extensive notes everywhere. The book covers just above everything, all the highs and lows. Long but worth it. 7/10

Share

Donations 2018

Each year I do the majority of my Charity donations in early December (just after my birthday) spread over a few days (so as not to get my credit card suspended).

I also blog about it to hopefully inspire others. See: 2017, 2016, 2015

All amounts this year are in $US

My main donations was to Givewell (to allocate to projects as they prioritize). I’m happy that they are are making efficient uses of donations.

I gave some money to the Software Conservancy to allocate across the projects (mostly open source software) they support and also to Mozilla to support the Firefox browser (which I use) and other projects.

Next were three advocacy and infrastructure projects.

and finally I gave some money to a couple of outlets whose content I consume. Signum University produce various education material around science-fiction, fantasy and medieval literature. In my case I’m following their lectures on Youtube about the Lord of the Rings. The West Wing Weekly is a podcast doing a episode-by-episode review of the TV series The West Wing.

 

Share

Audiobooks – November 2018

The Vanity Fair Diaries 1983-1992 by Tina Brown

Well written although I forgot who was who at times. The author came over very real and it is interesting to feel what has/hasn’t changed since the 1980s. 7/10

His Last Bow and The Valley of Fear by Sir Arthur Conan Doyle. Read by Stephen Fry

The Valley of Fear is solid. The short stories are not among my favorites but everything is well produced 7/10

First Man: The Life of Neil A. Armstrong by James R. Hansen

I read this prompted by the movie. Unlike the movie covers his family, early and post-moon life and has a lot more detail everywhere. Not overly long however 8/10

Don’t Make Me Pull Over! : An Informal History of the Family Road Trip by Richard Ratay

Nice combination of the author’s childhood experiences in the early-70s along with a history of the hotel, highway and related topics. 8/10

Giants’ Star by James P. Hogan

3rd book in the trilogy. Worth reading if you read and liked the first two. 6/10

U.S.S. Seawolf: Submarine Raider of the Pacific by Joseph Eckberg

First person account of a crew-member of a US Sub before & during the first year (up to Jan 1943) of US involvement in WW2. Published during the war and solely sourced for one person, so missing some details due to wartime censorship and lack of reference to other sources. Engaging though. 8/10

Mind of the Raven: Investigations and Adventures with Wolf-Birds by Bernd Heinrich

I didn’t like these quite as much as “Summer World” and “Winter World” since 100% ravens got a bit much but still it was well written & got me interested in the birds. 7/10

The Greater Journey: Americans in Paris by David McCullough

Covering American visitors (mostly artists, writers and doctors) to Paris mainly from 1830 to 1900. Covering how they lived and how Paris influenced them along with some history of the city. 9/10

Share

DevOpsDaysNZ 2018 – Day 2 – Session 4

Allen Geer, Amanda Baker – Continuously Testing govt.nz

  • Various .govt.nz sites
  • All Silverstripe and Common Web Platform
  • Many sites out of date, no automated testing, no test metrics, manual testing
  • Micro-waterfall agile
  • Specification by example (prod owner, Devops, QA)  created Gherkin tests
  • Standardised on CircleCI
  • Visualised – Spec by example
  • Prioritised feature tests
  • Ghirkinse
  • Test at start of dev process. Bake Quality in at the start
  • Visualise and display metrics, people could then improve.
  • Path to automation isn’t binary
  • Involve everyone in the team
  • Automation only works if humanised

Jules Clements – Configuration Pipeline : Ruling the One Ring

  • Desired state
  • I didn’t quite understand what he was saying

Nigel Charman – Keep Calm and Carry On Organising

  • 71 Conferences worldwide this year
  • NZ following the rules
  • Lots of help from people
  • Stuff stuff stuff

Jessica DeVita – Retrospecting our Retrospectives

  • Works on Azure DevOps
  • Post-mortems
  • What does it mean to have robust systems and resilience? Is resilience even a property? It just Is. When we fly on planes, we’re trusting machines and automation. Even planes require regular reboots to avoid catastrophic failures, and we just trust that it happen
  • CEO after a million dollar outage said “Can you get me a million dollars of learning out of this?”
  • After US Navy had accidents caused by slept deprivation switched to new watch structure
  • Postmortems are not magic, they don’t automatically make things change
  • http://stella.report
  • We dedicate a lot of time to to below the line, looking at the technology. Not a lot of conversation about above-the-line things like mental models.
  • Resilience is above the line
  • Catching the Apache SNAFU
  • The Ironies of Automation – Lisanne Bainbridge
  • Well facilitated debriefings support recalibration of mental models
  • US Forest Service – Learning Review – Blame discourages people speaking up about problems
  • We never know where the accident boundary is, only when we have crossed it.
    • SRE, Chaos Engineer and Human Factors help hadle
  • In postmortems please be mindful of judging timelines without context. Saying something happened in a short or long period of time is damanging
  • Ask “what made it hard to get that team on the phone?” , “What were you trying to achieve”
  • Etsy Debriefing Guide – lots of important questions.
  • “Moving post shallow incident data” – Adaptive Capacity Labs
  • Safety is a characteristics of Systems and not of their components
  • Ask people about their history, ask every person about what they do and how they got there because that is what shapes your culture as an organisation
Share

DevOpsDaysNZ 2018 – Day 2 – Session 3

Kubernetes

I’ll fill this in later.

Observability

  • Honeycomb, Sumologic. Use AI to look at what happened at same time and magically correlate
  • Expensive or hard to send all logs as volumes go up
  • What is the logging is wrong or missing?
  • Metrics
    • Export in prometheus format
    • Read RED and USE paper
    • Create a company schema with half a dozen metrics that all services expose
  • Had and event or transaction ID that flows across all the microservices sorry logs can be correlated
  • Non technical solutions
    • Refer to previous incident logs
    • Part of deliverables for product is SLA stats which require logs etc
  • Testing logs
    • Make sure certain events produce a log
  • Chaos Monkey

ANZ Drivetrain

  • Change control cares about
    • Avaiability
    • Risk
    • Dependencies
    • Rollback
  • But the team doing the change knows about these all
  • Saw tools out there that seem very opinated
  • Drivetrain
    • Automated Checklist
    • Work with Change people to create checklist
    • Pipeline talks to drivetrain and tells it what has been down
    • Slack messages sent for manual changes (they login to app to approve)
  • Looked at some other tools (eg chef automate, udeploy )
    • Forced team to work in a certain pattern
  • But use ServiceNow tool as official corporate standard
    • Looking at making DriveTrail fill in ServiceNow forms
  • People worried about stages in tool often didn’t realise the existing process had same limitations
  • Risk assessed at the Story and Feature level. Not release level
  • Not suitable for products that due huge released every few months with a massive number of changes.

 

 

Share

DevOpsDaysNZ 2018 – Day 2 – Session 2

Interesting article I read today

Why Doctors Hate their Computers by Atul Gawande

Mrinal Mukherjee – A DevOps Confessional

  • Not about accidents, it is about Planned Blunders that people are doing in DevOps
  • One Track DevOps
    • From Infrastructure background
    • Job going into places, automated the low hanging fruit, easy wins
    • Column of tools on resume
    • Started becoming the bottleneck, his team was the only one who knew how the infrastructure worked.
    • Not able to “DevOps” a company since only able to fix the infrastructure, not able to fix testing etc so not dilvering everything that company expected
    • If you are the only person who understands the infrastructure you are the only one blamed when it goes wrong
    • Fixes
      • Need to take all team on a journey
      • But need to have right expectations set
      • Need to do learning in areas where you have gaps
      • DevOps is not about individual glory, Devops is about delivering value
      • HR needs to make sure they don’t reward the wrong thing
  • MVP-Driven Devops
    • Mostly working on Greenfields products that need to be delivered quickly
    • MVP = Maximum Technical Debt
    • MVP = Delays later and Security audits = Your name attached to the problem
    • Minimum Standard of Engineering
      • Test cases, Documentation, Modular
      • Peer review
    • Evolve architecture, not re-architect
  • Judgemental Devops
    • That team sucks, they are holding things up, playing a different game from us
    • Laughing at other teams
    • Consequence – Stubbornness from the other team
    • Empathy
      • Find out why things are they way they are
    • Collaborate to find common ground and improve
    • Design my system to I plan to work within constraints of the other team
Share

DevOpsDaysNZ 2018 – Day 2 – Session 1

Alison Polton-Simon – The DevOps Experiments: Reflections From a Scaling Startup

  • Software engineer at Angaza, previously Thoughtworks, “Organizational Anthropologist”
  • Angaza
    • Enable sales of life-changing products (eg solar chargers, water pumps, cook stoves in 3rd world countries)
    • Originally did hardware, changed to software company doing pay-as-you-go of existing devices
    • ~50 people. Kenya and SF, 50% engineering
    • No dedicated Ops
    • Innovate with empathy, Maximise impact
    • Model is to provide software tools to activate/deactivate products sold to peopel with low credit-scores. Plus out software around the activity like reports for distributors.
  • Reliability
    • Platform is business critical
    • Outages disrupt real people (households without light, farmers without irrigation)
    • Buildkite, Grafana, Zendesk
  • Constraints
    • Operate in 30+ countries
    • Low connectivity, 2G networks best case
    • Rural and peri-urban areas
    • Team growing by 50% in 2018 (2 eng teams in Kenya + 1 QA)
    • Most customers in timezone where day = SF night
  • Eras of experimentation
    • Ad Hoc
    • Tributes (sacrifice yourself for the stake of the team)
    • Collectives (multiple teams)
    • Product teams
  • ad Hoc – 5 eng
    • 1 eng team
    • Ops by day – you broke, you fix
    • Ops by night – Pagerduty rotation
    • Paged on all backend exception, 3 pages = amnesty
    • Good
      • Small but senior
      • JIT maturity
      • Everyone sitting next to each other
    • Bad
      • Each incident higherly disruptive
      • prioritized necessity over sustainability
  • Tribute – 5-12 eng
    • One person protecting team from interuptions
    • Introduced support person and triage
    • Expanded PD rotation
    • Good
      • More sustainable
      • Blue-Green deploys
      • Clustered workloads
    • Not
      • Headcount != horizontal scaling
      • Hard to hire
      • Customer service declined
  • Collectives 13-20 engs
    • Support and Ops teams – Ops staffed with devs
    • Other teams build roadmaps and requests
    • Teams rotate quarterly – helps onboarding
    • Good
      • Cross train ppl
      • Allow for focus
      • allowed ppl to get depth
    • Bad
      • Teams don’t op what they built
      • Quarter flies by quickly
      • Context switch is costly
      • Still a juggling act
      • 1m ramping up, 6w going okay, 2w winging down
  • Product teams  21 -? eng
    • 5 teams, 2 in Nairobi
    • Teams allighned with business virticals, KPIs
    • Dev, own and maintain services
    • Per-team tributes
    • No [Dev]Ops team
    • Intended goals
      • Independent teams
      • own what build
      • Support biz KPIs
      • cross team coordination
    • Expected Chellenges
      • Ownership without responsbility
      • Global knowledge sharing
      • Return to tribute system (2w out of the workflow)
  • Next
    • Keep growing team
    • Working groups
    • Eventual SRE
    • 24h global coverage
  • Case a “Constitution” of values that everybody who is hired signs
  • Takeaways
    • Maximise impact
      • Dependable tools over fashionable ones
      • Prefer industry-std tech
      • But get creative when necessary
    • Define what reliability means for your system
    • Evolve with Empathy
      • Don’t be dogmatic without structure
      • Serve your customers and your team
      • Adapt when necessary
      • Talk to people
      • Be explicit as to the tradeoffs you are making

Anthony Borton – Four lessons learnt from Microsoft’s DevOps Transformation

  • Microsoft starting in 1975
  • 93k odd engineers at Microsoft
    • 78k deployments per day
    • 2m commits per month
    • 4.4 builds/month
    • 500 million tests/day
  • 2018 State of Devops reports looks at Elite performers in the space
  • TFS – Team Foundation Server
    • Move product to the cloud
    • Moved on-prem to one instance
    • Each account had it’s own DB (broke stuff at 11k DBs)
  • 4 lessons
    • Customer focussed
      • Listen to customers, uservoice.com
      • Lots of team members keep eye on it
      • Stackoverflow
      • Embed with customers
      • Feedback inside product
      • Have to listen in all the channels
    • Failure is an opportunity to learn
    • Definition of done
      • Live in prod, collecting telemetary that examines hypotheses that it was created to prove
    • “For those of you who don’t know who Encarta is, look it up on Wikipedia”
  • Team Structure
    • Combined engineering = devs + testing
      • Some testers left team or organisation
    • Feature team
      • Physical team rooms
      • Cross discipline
      • 10-12 people
      • self managing
      • Clear charter and goals
      • Intact for 12-18 months
    • Sticky note exercise, people pick which teams they would like to join (first 3 choices)
      • 20% choose to change
      • 100% get the choice
  • New constants and requirements
    • Problems
      • Tests took too long – 22h to 2days
      • Tests failed frequently – On 60% passed 100%
      • Quality signal unreliable in master
    • Publish VSTS Quality vision
      • Sorted by exteranl dependancies
      • Unit tests
        • L0 – in-memory unit tests
        • L1 – More with SQL
      • Functional Tests
        • L2 – Functional tests against testable service deployment
        • L3 – Restricted class integration tests that run against production
      • 83k L0 tests run agains all pulls very fast
  • Deploy to rings of users
    • Ring 0 – Internal Only
    • Ring 1 – Small Datacentre 1-1.5m accounts in Brazil (same TZ as US)
    • Ring 2 – Public accounts, medium-large data centre
    • Ring 3 – Large internal accounts
    • Ring 4-5 – everyone else
    • Takes about a week for normal releases.
    • Binaries go out and then the database changes
    • Delays of minutes (up to 75) during the deploys to allow bugs to manafest
    • Some customers have access to feature flags
    • Customers who are risk tolerant can opt in to early deploys. Allows them to get faster feedback from people who are able to provide it
  • More features delivered in 2016 than previous 4 years. 50% more in 2017

 

Share

DevOpsDaysNZ 2018 – Day 1 – Session 4

Everett Toews – A Trip Down CI/CD Lane

I missed most of this talk. Sounded Good.

Jeff Smith – Creating Shared Contexts

  • Ideas and viewpoints are different from diff people
  • Happens in organisation, need to make sure everybody is on the same page
  • Build a shared context via conversations
  • Info exchange
  • Communications tools
  • Context Tools
  • X/Y Problem
  • Data can bridge conversations. Shared reality.
  • Use the same tools as other teams so you know what they are doing
  • Give the context to your requests, ask for them and it will automatically happen eventually.

Peter Sellars – 2018: A Build Engineers Odyssey

  • Hungry, Humble and Smart

Katrina Clokie – Testing in DevOps for Engineers

  • We can already write, so how hard can it be to write a novel?
  • Hopefully some of you are doing testing already
  • Problem is that people overestimate their testing skills, not interested in finding out anything else.
  • The testing you are doing now might be with one tool, in one spot. You are probably finding stuff but missing other things
  • Why important
    • Testing is part of you role
    • In Devops testing goes though Operations as well
    • Testing is DevOps is like air, it is all around you in every role
    • Roles of testers is to tech people to breath continuously and naturally.
  • Change the questions that you ask
    • How do you know that something is okay? What questions are you asking of your product?
    • Oracles are the ways that we recognise a problem
    • Devs ask: “Does it work how we think it should?”
    • Ops ask: “Does it work how it usually works?”
    • Devs on claims, Ops on history
    • Does it work like our competors, does it meet it’s purpose without harmful side effects, doesn’t it meet legal requirements, Does it work like similar services.
    • HICCUPPS – Testing without a Map – Michael Bolton, 2005
    • How do we compare to what other people are doing?  ( Not just a BA’s job , cause the customer will be asking a question and so should you)
    • Flip the Oracle, compare them against other things not just the usual.
    • Audit – Continuous compliance, Always think about if it works like the standards say it should.
    • These are things that the business is asking. If you ask then gain confidence of business
  • Look for Answers in Other Places
    • Number of tests: UI <  Service < Unit
    • The Test Pyramid as a bug catcher. Catch the Simple bugs first and then the subtle ones
  • Testing mesh
    • Unit tests – fine mesh
    • Intergration – Bigger/Fewer tests but cover more
    • Next few layers: End to End, Alerting, Monitoring, Logging. Each stops different types of bugs
    • Conversation should be “Where do we put our mesh?”, “How far can this bug fall?”.
    • If another layer will pick the bug up do we need a test.
  • Use Monitoring as testing
    • Push risk really late, no in all cases but can often work
  • A/B testing
    • Ops needs to monitor
    • Dev needs framework to role out and put in different options
  • Chaos Engineering
    • Start with something small, communicate well and do during daylight hours.
    • Yours customers are testing in production all the time, so why arn’t you too?
  • https://leanpub.com/testingindevops

 

Share

DevOpsDaysNZ 2018 – Day 1 – Session 3

Open Space 1 – Prod Support, who’s responsible

  • Problem that Ops doesn’t know products, devs can’t fix, product support owners not technical enough
  • Xero have embedded Ops and dev in teams. Each person oncall maybe 2 weeks in 20
  • Customer support team does everything?
  • “Ops have big graphs on screens, BI have a couple of BI stats on screens, Devs have …youtube videos”
  • Tiers support vs Product team vs Product support team
  • Tiered support
    • Single point of entry
    • lower paid person can handle easy stuff
    • Context across multiple apps
  • Product Team
    • Buck stops with someone
    • More likely to be able to action
    • Ownership of issues
    • Everyone must be enabled to do stuff
    • Everyone needs to be upskilled
  • Prod Support
    • Big skilled can fix anything team
    • Devs not keen
    • Even the best teams don’t know everything

Open Space 2 – DevOps at NZ Scale

  • Devops team, 3rd silo
    • Sometimes they are the new team doing cool stuff
    • One model is evangelism team
  • Do you want devops culture or do you just want somebody to look after your pipeline?
  • Companies often don’t know what they want to hire
  • Companies get some benefit with the tools (pipelines, agile)  but not the culture. But to get the whole benefit they need to adopt everything.
  • The Way of Ways article by John Cutler

Open Space 3 – Responding Quickly

I was taking notes on the board.

Share

DevOpsDaysNZ 2018 – Day 1 – Session 2

Mark Simpson, Carlie Osborne – Transforming the Bank: pipelines not paperwork

  • Change really is possible even in the least likely places
  • Big and risk adverse
    • Lots of paperwork and progress, very slow
  • Needed to change – In the beginning – 18 months ago
    • 6 months talking about how we could change things
  • Looked for a pilot project – Internet Banking team – ~80 people – Go-money platform
    • Big monolith, 1m lines of code
    • New release every 6 weeks
    • 10 weeks for feature from start to production
    • Release on midnight on a Friday night, 4-5 hours outage, 20-25 people.
    • Customer complaints about outage at midnight, move to 2am Sunday morning
  • Change to release every single week
    • Has to be middle of the day, no outage
    • How do we do this?
  • Took whole Internet banking team for 12 weeks to create process, did nothing else.
  • What we didn’t do
    • Didn’t replatform, no time
  • What we did
    • Jenkins – Created a single Pipeline, from commit to master all the way to projection
    • Got rid of selenium tests
    • Switched to cypress.io
      • Just tested 5 key customer journeys
    • Drivetrain – Internal App
      • Wanted to empower the teams, but lots of limits within industry/regulations
      • Centralise decision making
      • Lightweight Rules engine, checks that all the requirements have been done by the team before going to the next stage.
    • Cannery Deployments
      • Two versions running, ability to switch users to one or other
  • Learning to Break things down into small chunks
  • Change Process
    • Lots of random rules, eg mandatory standdown times
    • New change process for teams using Drivetrain, certified process no each release
  • Lots of times spent talking to people
    • Had to get lots of signoffs
  • Result
    • Successful
    • 16 weeks rather than 12
    • 28 releases in less than 6 months (vs approx 4 previously)
    • 95% less toil for each release
  • Small not Big changes
    • Now takes just 4-5 weeks to cycle though a feature
    • Don’t like saying MVP. Pitch is as quickly delivering a bit of value
    • and iterating
    • 2 week pilot, not iterations -> 8 week pilot, 4 iterations
    • Solution at start -> Solution derived over time
  • Sooner, not later
    • Previously
      • Risk, operations people not engaged until too late
      • Dev team disengaged from getting things into production
    • Now
      • Everybody engaged earlier
  • Other teams adopting similar approach

Ryan McCarvill – Fighting fires with DevOps

  • Lots of information coming into a firetruck, displayed on dashboard
  • Old System was 8-degit codes
  • Rugged server in each each truck
    • UPS
    • Raspberry Pi
    • Storage
    • Lots of different networking
  • Requirements
    • Redundant Comms
    • Realtime
    • Offline Mpas
    • Offline documentation, site reports, photos, building info
    • Offline Hazzards
    • Allow firefighters to update
    • Track appliance and firefighter status
    • Be a hub for an incident
    • Needs to be very secure
  • Stack on the Truck
    • Ansible, git, docker, .netcode, redis, 20 micoservices
  • What happens if update fails?
  • More than 1000 trucks, might be offline for months at a time
  • How to keep secure
  • AND iterate quickly
  • Pipeline
    • Online update when truck is at home
    • Don’t update if moving
    • Blue/Green updates
    • Health probes
  • Visual Studio Team Services -> Azure cont registry
  • Playbooks on git , ansible pull,
  • Nginx in front of blue/green
  • Built – there were problems
    • Some overheating
    • Server in truck taken out of scope, lost offline strategy
    • No money or options to buy new solution
  • MVP requirements
    • Lots of gigs of data, made some so only online
    • But many gigs still needed online
    • Create virtual firetruck in the sky, worked for online
    • Still had communication device – 1 core, minimum storage, locked down Linux
  • Put a USB stick in the back device and updated it
    • Can’t use a lot of resources or will inpact comms
    • Hazard search
      • Java/python app, no much impact on system
      • Re-wrote in rust, low impact and worked
      • Changed push to rsync and bash
  • Lessons
    • Automation gots us flexability to change
    • Automation gave us flexability to grow
    • Creativity can solve any problem
    • You can solve new problems with old technology
    • Sometimes the only way to get buy in is to just do it.
Share