DevOpsDaysNZ 2018 – Day 2 – Session 3

Kubernetes

I’ll fill this in later.

Observability

  • Honeycomb, Sumologic. Use AI to look at what happened at same time and magically correlate
  • Expensive or hard to send all logs as volumes go up
  • What is the logging is wrong or missing?
  • Metrics
    • Export in prometheus format
    • Read RED and USE paper
    • Create a company schema with half a dozen metrics that all services expose
  • Had and event or transaction ID that flows across all the microservices sorry logs can be correlated
  • Non technical solutions
    • Refer to previous incident logs
    • Part of deliverables for product is SLA stats which require logs etc
  • Testing logs
    • Make sure certain events produce a log
  • Chaos Monkey

ANZ Drivetrain

  • Change control cares about
    • Avaiability
    • Risk
    • Dependencies
    • Rollback
  • But the team doing the change knows about these all
  • Saw tools out there that seem very opinated
  • Drivetrain
    • Automated Checklist
    • Work with Change people to create checklist
    • Pipeline talks to drivetrain and tells it what has been down
    • Slack messages sent for manual changes (they login to app to approve)
  • Looked at some other tools (eg chef automate, udeploy )
    • Forced team to work in a certain pattern
  • But use ServiceNow tool as official corporate standard
    • Looking at making DriveTrail fill in ServiceNow forms
  • People worried about stages in tool often didn’t realise the existing process had same limitations
  • Risk assessed at the Story and Feature level. Not release level
  • Not suitable for products that due huge released every few months with a massive number of changes.

 

 

Share