DevOps Days Auckland 2017 – Wednesday Session 3

Sanjeev Sharma – When DevOps met SRE: From Apollo 13 to Google SRE

  • Author of two DevOps books
  • Apollo 13
    • Who were the real heroes? The guys back at mission control. The astronauts just had to keep breathing and not die
  • Best Practice for Incident management
    • Prioritize
    • Prepare
    • Trust
    • Introspect
    • Consider Alternatives
    • Practice
    • Change it around
  • Big Hurdles to adoption of DevOps in Enterprise
    • The literature only looks at one delivery platform at a time
    • Big enterprises have hundreds of platforms with completely different technologies, maturity levels and speeds, all interdependent
    • He divides platforms into
      • Industrialised Core – Value High, Risk Low, MTBF
      • Agile/Innovation Edge – Value Low, Risk High, Rapid change and delivery, MTTR
      • Need normal distribution curve of platforms across this range
      • Need to be able to maintain products at both ends in one IT organisation
  • 6 capabilities needed in IT Organisation
    • Planning and architecture.
      • Your delivery pipeline will only be as fast as the slowest delivery pipeline it depends on
    • APIs
      • Modernizing to Microservices based architecture: Refactoring code and data and defining the APIs
    • Application Deployment Automation and Environment Orchestration
      • Devs are paid to code, not to maintain deployment and config scripts
      • Ops must provide environments that require zero setup scripts from devs
    • Test Service and Environment Virtualisation
      • If you are doing 2-week sprints but it takes 3 weeks to get a test server, how long are your sprints?
    • Release Management
      • No good if 99% of the software works but the last 1% is vital for the business function
    • Operational Readiness for SRE
      • Shift from MTBF to MTTR
      • MTTR = Mean time to detect + Mean time to triage + Mean time to restore
      • + Mean time to pass blame
    • Antifragile Systems
      • Things that are neither fragile nor robust, but rather thrive on chaos
      • Cattle not pets
      • Servers may go red, but services are always green
    • DevOps: “Everybody is responsible for delivery to production”
    • SRE: “(Everybody) is responsible for delivering Continuous Business Value”
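
The MTTR breakdown in the Operational Readiness notes above can be sketched in a few lines of Python. This is only an illustration, not something shown in the talk: the Incident class, its field names and the sample durations (in minutes) are all made up. It shows that averaging each phase separately and adding the averages gives the same MTTR as averaging the total recovery time per incident, while also showing which phase dominates.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    # Hypothetical record; all durations are in minutes.
    detect: float   # time until the problem was noticed
    triage: float   # time to work out what was wrong
    restore: float  # time to get the service back

    def time_to_recover(self) -> float:
        return self.detect + self.triage + self.restore


def mean(values):
    values = list(values)
    return sum(values) / len(values)  # assumes at least one incident


def mttr_breakdown(incidents):
    """Mean time per phase, plus their sum as the MTTR."""
    parts = {
        "detect": mean(i.detect for i in incidents),
        "triage": mean(i.triage for i in incidents),
        "restore": mean(i.restore for i in incidents),
    }
    parts["mttr"] = parts["detect"] + parts["triage"] + parts["restore"]
    return parts


# Made-up sample data, for illustration only.
incidents = [Incident(5, 20, 35), Incident(15, 10, 25)]
breakdown = mttr_breakdown(incidents)
print(breakdown)
```

Keeping the three phases separate, rather than tracking a single recovery number, makes it easier to see whether detection, triage or restoration is the part worth improving first.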