2017 SysAdmin Miniconf – Session 3

Turtles all the way down – Thin LVM + KVM tips and Tricks – Steven Ellis

  • ssd -> partition -> encryption -> LVM -> [..] -> filesystem
  • Lots of examples – see the online slides
  • https://github.com/steven-ellis/ansible-playpen
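
A minimal sketch of building that layered stack by hand – device names and sizes are illustrative, not from the talk:

  # assume the SSD partition is /dev/sda2 (illustrative throughout)
  cryptsetup luksFormat /dev/sda2
  cryptsetup open /dev/sda2 crypt0                 # encryption layer
  pvcreate /dev/mapper/crypt0                      # LVM on top of the crypt device
  vgcreate vg0 /dev/mapper/crypt0
  lvcreate --type thin-pool -L 100G -n pool0 vg0   # thin pool
  lvcreate --thin -V 20G -n vm1 vg0/pool0          # thin volume for a KVM guest
  mkfs.xfs /dev/vg0/vm1                            # filesystem on top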

Samba and the road to 100,000 users – Andrew Bartlett

  • Release cycle is every 6 months
  • Samba 4.0 is 4 years old
  • 4.2 and older are out of security support from the Samba team (distros sometimes still provide support)
  • Much faster adding users to AD DC. 55k users added in 50 minutes
  • Performance issues, not bugs, are now the biggest area of work
    • Customer deploying SAMBA at scale
  • Looking for volunteers running AD who are willing to run a tshark script
    • What does your busy hour look like?
    • What is the pattern of requests?

The School for Sysadmins Who Can’t Timesync Good and Wanna Learn To Do Other Stuff Good Too – Paul Gear

  • Aim is 1-10ms accuracy
  • Using a standard Linux distribution as the reference platform, etc
  • Why care?
    • Some apps need time sync
    • Log matching
  • Network Time Foundation needs support
  • NTP
    • Not widely understood
    • Unglamorous
    • Daunting documentation
    • Old protocol with a chequered security history
    • The first Google result may not be accurate
  • Setting the clock
    • step – jump the clock to the new time
    • slew – gradually adjust the time (see the chrony sketch after these notes)
  • NTP Assumptions
    • There is one true time – UTC
    • Nobody really has it
    • bad time servers may be present
    • networks change
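
A hedged example of how step vs slew looks in practice with chrony – the makestep directive is real, the thresholds are illustrative:

  # /etc/chrony.conf
  server 0.pool.ntp.org iburst
  # step the clock if the offset exceeds 1s during the first 3 updates,
  # otherwise always slew gradually
  makestep 1.0 3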

I ran out of power on my laptop at this point so not many more notes. Paul gave a very good set of recommendations and myth-busting for those running NTP though. His notes will be online on the Sysadmin Miniconf site and he has also posted more detail online.


2017 Sysadmin Miniconf – Session 2

Running production workloads in a programmable infrastructure – Alejandro Tesch

Managing performance parameters through systemd – Sander van Vugt

  • Mostly Demos in this talk too.
  • Using the CPUShares parameter as an example
  • systemd-cgtop and systemd-cgls
  • “systemctl show stress1.service” will show available parameters
  • “man 5 systemd.resource-control” gives a lot more details.
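
For example – a minimal sketch (not from the talk) using the stress1.service name above:

  # give stress1.service half the default weight (default CPUShares is 1024)
  systemctl set-property stress1.service CPUShares=512
  # watch the effect per control group
  systemd-cgtop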

Go for DevOps – Caskey L. Dickson

  • SideBar: The Platform Wars are over
    • Hint: We all won
    • As long as it has an API we are all cool
  • Go always builds statically linked binaries that should work on just about any Linux system. Just one file.
  • Built-in cross compiler (eg for Windows, Mac) via just environment variables: “GOOS=darwin”, and “GOARCH=386” for 32-bit x86 (see the sketch after this list)
  • Bash is great, Python is great, Go is better
  • Microservices are Services
  • No Small Systems
    • Our Scripts are no longer dozens of lines long, they are thousands of lines long
    • Need full software engineering
  • Sysops pushing buttons and running scripts are dying
  • Platform Specific Code
    • main_linux.go, main_windows.go – the compiler picks the right file automatically
    • // +build linux darwin     <– At the top of the file
  • “Once I got my head around channels Go really opened up for me”
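
A hedged sketch of the cross-compile workflow described above (output names are illustrative; 32-bit x86 is GOARCH=386):

  GOOS=darwin  GOARCH=amd64 go build -o mytool-mac .   # Mac binary from any host
  GOOS=windows GOARCH=386   go build -o mytool.exe .   # 32-bit Windows binary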
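
And to illustrate that last point, a minimal channel example (mine, not the speaker's):

  package main

  import "fmt"

  // worker reads jobs from in and sends squared results to out.
  func worker(in <-chan int, out chan<- int) {
      for n := range in {
          out <- n * n
      }
  }

  func main() {
      in := make(chan int)
      out := make(chan int)
      go worker(in, out) // runs concurrently; the channels do the synchronisation
      for i := 1; i <= 3; i++ {
          in <- i
          fmt.Println(<-out)
      }
      close(in)
  }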

 


2017 Sysadmin Miniconf – Session 1

The Opposite of the Cloud – Tom Eastman

  • Koordinates’ data gateway – an appliance onsite at customers
  • Requirements
    • Bootable images: OVA, AMI/cloud images
    • Needs network access
    • Sounds like an IoT device
  • Opposite of cloud is letting somebody else outsource their stuff onto your infrastructure
  • Tom’s job has been making a nice and tidy appliance
  • What does IoT get wrong
    • Don’t do updates, security patches
    • Don’t treat network as hostile
    • Hard to remotely admin
  • How to make them secure
    • no default or static credentials
    • reduce the attack surface
    • secure all networks comms
    • ensure it fails securely
  • Solution
    • Don’t treat appliances like appliances
    • Treat like tightly orchestrated Linux Servers
  • Stick to a conservative architecture
    • Use standard distribution like Debian
    • You can trust the standard security updates
  • Solution Components
    • aspen: A customized Debian machine image built with Packer
    • pando: orchestration server/C&C network
    • hakea: A Django/REST microservice API in charge of the appliances
  • saltstack command and control (a minimal minion-side sketch follows this section)
    • Normal orchestration stuff
    • Can work as distributed command execution
    • The minions on each appliance connect out to the central node, so you never need to connect in to a remote appliance (no incoming connections needed)
    • OpenVPN as Internet transport
    • Outgoing traffic is just the OpenVPN protocol on port 443; everything else travels inside the VPN
  • What is the Appliance
    • A lightly mangled Debian Jessie VM image
    • Easy to maintain by customer, just reboot, activate or reinstall to fix any problems.
    • Appliance is running a bunch of docker containers
  • Appliance authentication
    • Needs to connect via 443 with activation code to download VPN and Salt short-lived certificates to get started
    • Auth keys only last for 24 hours.
    • If it can’t reach the control network it kills itself.
  • Hakea: REST control
    • Django REST framework microservices
    • Self documenting using DRF and CoreAPI Schema
  • DevOps principles apply beyond the cloud
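
A rough sketch of the outbound-only minion configuration described above (the hostname is hypothetical):

  # /etc/salt/minion on the appliance – the minion dials out to the C&C master
  # over the OpenVPN tunnel; no inbound ports are opened on the appliance
  master: pando.example.com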

Inventory Management with Pallet Jack – Karl-Johan Karlsson

  • Goals
    • Single source of truth
    • Version control
    • Scalable (to around 1000 machines, 10k objects)
  • Stuff stored as just a file structure
  • Some tools to access it
  • Tools to export, eg to Kea DHCP config
  • Tools as post-commit hooks for git that push out updates via salt etc (sketch below)
  • Various Integrations
    • API
    • Salt
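
A sketch of what such a post-commit hook might look like – the export command is illustrative, not Pallet Jack’s actual tool name:

  #!/bin/sh
  # .git/hooks/post-commit in the inventory repo:
  # re-export the DHCP config and push it out whenever the inventory changes
  inventory-export-kea /srv/inventory > /etc/kea/kea-dhcp4.conf  # illustrative exporter
  salt '*' state.apply dhcp                                      # push the update via salt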

Continuous Dashboard – Your DevOps Airbag – Christopher Biggs

  • Dashboards traditionally targeted at Ops
  • Also need to target Devs
    • KPIs and
  • Sales and Support need to know everything too
  • Management wants reassurance; when shipping a new feature you have a hotline to the CEO
  • Customer, do you have something you are ashamed of?
    • Take notice of load spikes
    • Assume customers’ errors are being acted on, with the option to notify them when a fix happens
    • What is relevant to a support call: the most recent outages affecting this customer
    • Remember the recent behaviour of this customer
  • What kinds of data?
    • Traditionally: system load indicators, transaction numbers etc
    • Now: Business Goals, unavoidable errors, spikes of errors, location of errors, user experience metrics, health of 3rd party interfaces, App and product reviews
  • What should I put in dashboards
    • Understand the Status-quo
    • Continuously
    • Look at trends over time and releases
    • Think about features holistically
  • How to get there
    • Like your data as much as your code
    • Experiment with your data
    • tools: nodered.org, blynk.cc, elastic
  • Insert Dashboards into your dev pipeline
    • Code Review, CI, Unit Test, Confirm that alarms actually work via test errors
    • Automate deployment
  • Tools
    • ELK – off the shelf images, good import/export
    • Node-RED – Flow based data processing, nice visual editor, built in dashboarding
    • Blynk – Nice dashboards on iOS or Android. Interactive dashboard editor. Easy to share
  • Social Media integration
    • Receive from Twitter, Facebook, app store reviews
    • Post to slack and monitoring channels
    • Forward to internal groups

The Sound of Silencing – Julien Goodwin

  • Humans know to ignore “expected” alerts during maintenance
    • Hard to know what is expected vs unexpected
    • Major events can lead to alert overload
  • Level 1 – Turn it all off
    • Can work on small scale
  • Level 2 – Turn off a location while working on it.
    • What if something happens while you are doing the work?
    • May work with single-service deployments
  • Level 3 – Turn off the expect alerts
    • Hard to get exactly right
  • Level 4 – Change management integration
    • Link the alert generator up to the change management automation system
    • What about changes too small to track?
    • What about changes too big for a simple silence?
  • Level 5 – Inhibiting Alerts
    • Use service level indicators to avoid alerting on expected failures (see the sketch after this list)
    • Fire a “goes nowhere” alert instead
  • Level 6 – Global monitoring and preventing over-silencing
    • Alert if too many sites down
    • Need to have explicit alerts to spot when somebody silences “*”
  • How to get there from here
    • Incrementally
    • Choose a bad alert and change it to make it better
    • Regularly
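
The talk was tool-agnostic, but as a concrete sketch, “Level 5” maps naturally onto Prometheus Alertmanager’s inhibit rules – the alert names here are assumptions:

  # alertmanager.yml
  inhibit_rules:
    - source_match:
        alertname: SiteDown        # the expected, already-known failure
      target_match:
        alertname: ServiceDown     # per-service alerts it should suppress
      equal: ['site']              # only inhibit alerts for the same site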

 


Linux.conf.au 2017 – Conference Opening

  • Wear sunscreen
  • Karen Sandler introduces Outreachy and it is announced as the raffle cause for 2017
  • Overview of people
    • 462 From Aus
    • 43 from NZ
    • 62 From USA
    • Lots of other countries
    • Gender breakdown: lots of “no answer”s so the stats are a bit rough
  • Talks
    • 421 Proposals
    • 80-ish talks and 6 tutorials
    • Questions
      • Please ask questions during the question time
  • Looking for Volunteers – look at a session and click to signup
  • Keynotes – A quick profile
  • All the rooms are booked till 11pm for BoF sessions!
  • Lightning talks, Coffee, Lunch, dinners

 

 


DevOpsDays Wellington 2016 – Day 2, Session 3

Ignites

Mrinal Mukherjee – How to choose a DevOps tool

Right Tool
– Does the job
– People will accept

Wrong tool
– Never-ending PoC
– Doesn’t do the job

How to pick
– Budget / Licensing
– does it address your pain points
– Learning cliff
– Community support
– API
– Enterprise acceptability
– Config in version control?

Central tooling team
– Pros: standardisation, education
– Cons: constant bottleneck, delays, stifles innovation, not in sync with teams

DevOps != Tool
Tools != DevOps

Tools facilitate it not define it.

Howard Duff – Eric and his blue boxes

Physical example of Kanban in an underwear factory

Lindsay Holmwood – Deepening people to weather the organisation

Note: Lindsay presents really fast so I missed recording a lot from the talk

His Happy, High performing Team -> He left -> 6 months later half of team had left

How do you create a resilient culture?

What is culture?
– Lots of research in organisational psychology
– Edgar Schein – 3 levels of culture
– Artefacts, Values, Assumptions

Artefacts
– Physical manifestations of our culture
– Standups, Org charts, desk layout, documentation
– actual software written
– Easiest to see and adopt

Values
– Goals, strategies and philosophies
– “we will dominate the market”
– “Management is available”
– “nobody is going to be fired for making a mistake”
– lived values vs aspirational values (people have a good nose for bullshit)
– Example, cores values of Enron vs reality
– Work as imagined vs work as actually done

Assumptions
– beliefs, perceptions, thoughts and feelings
– exist on an unconscious level
– hard to discern
– “bad outcomes come from bad people”
– “it is okay to withhold information”
– “we can’t trust that team”
– “profits over people”

If we can change our people, we can change our culture

What makes a good team member?

Trust
– Vulnerability
– Assume the best of others
– Aware of their cognitive bias
– Aware of the fundamental attribution error (judge others by actions, judge ourselves by our intentions)
– Aware of hindsight bias. Hindsight bias is your culture killer
– When bad things happen explain in terms of foresight
– Regular 1:1s
– Eliminate performance reviews
– Willing to play devil’s advocate

Committing and acting
– Shared goal settings
– Don’t solutioneer
– Provide context about strategy, about the desired outcome

What makes a good team?

Influence of hiring process
– Willingness to adapt and adopt working in new team
– Qualify team fit, tech talent then rubber stamp from team lead
– have a consistent script, but be prepared to improvise
– Everyone has the veto power
– If leadership is vetoing at the last minute, that’s a systemic problem with team alignment, not with the hiring system
– Benefit: team talks to candidate (without leadership present)
– Many different perspectives
– unblock management bottlenecks
– Risk: uncovering dysfunctions and misalignment in your teams
– Hire good people, get out of their way

Diversity and inclusion
– includes: race, gender, sexual orientation, location, disability, level of experience, work hours
– Seek out diverse candidates.
– Sponsor events and meetups
– Make job description clear you are looking for diverse background
– Must include and embrace differences once they actually join
– Safe mechanisms for people to raise criticisms, and act on them

Leadership and Absence of leadership
– Having a title isn’t required
– If the leader steps away things should continue working right
– Team is their own shit umbrella
– empowerment vs authority
– empowerment is giving permission from above (potentially temporary)
– authority is giving power (granting autonomy)

Part of something bigger than the team
– help people build up for the next job
– Guilds in the Spotify model
– Run them like meetups
– Get senior management to come and observe
– What we’re talking about is tech culture

We can change tech culture
– How to make it resist the culture of the rest of the organisation
– Artefacts influence behaviour
– Artefact: fast builds -> Value: make quality better
– Artefact: post incident reviews -> Value: failure is an opportunity for learning

Q: What is a pre-incident review
A: Brainstorm beforehand (eg before a big rollout) what you think might go wrong if something is coming up
then afterwards do another review of what just went wrong

Q: what replaces performance reviews
A: One on ones

Q: Overcoming Resistance
A: Do it and point back at the evidence. Hard to argue with an artifact

Q: First step?
A: One on 1s

Getting started, reading books by Patrick Lencioni:
– Silos, Politics and Turf Wars
– 5 Dysfunctions of a team


DevOpsDays Wellington 2016 – Day 2, Session 2

Troy Cornwall & Alex Corkin – Health is hard: A Story about making healthcare less hard, and faster!

Maybe title should be “Culture is Hard”

@devtroy @4lexNZ

Working at HealthLink
– Windows running Java stuff
– Out of date and poorly managed
– Deployments manual, thrown over the wall by devs to ops

Team Death Star
– Destroy bad processes
– Change deployment process

Existing Stack
– VMware
– Windows
– Puppet
– PRTG

CD and CI Requirements
– Goal: time to regression test under 2 mins, time to deploy under 2 mins (from 2 weeks each)
– Puppet too slow to deploy code in a minute or two. App deployment vs config management
– Couldn’t use containers on Windows (at the time) so not an option

New Stack
– VMware
– Ubuntu
– Puppet for Server config
– Docker
– rancher

Smashed the 2 minute target!

But…
– We focused on the tech side and let the people side slip
– Windows shop, hard work even to get a Linux VM at the start
– Devs scared to run on Linux. Some initial deploy problems burnt people
– Lots of different new technologies at once all pushed to devs, no pull from them.

Blackout where we weren’t allowed to talk to them for four weeks
– Should have been a warning sign…

We thought we were ready.
– Ops was not ready

“5 dysfunctions of a team”
– Trust is at the bottom; we didn’t have that

Empathy
– We were aware of this, but didn’t follow through
– We were used to disruption but other teams were not

Note: I’m not sure how the story ended up, they sort of left it hanging.

Pavel Jelinek – Kubernetes in production

Works at Movio
– Software for Cinema chains (eg Loyalty cards)
– 100 million emails per month, millions of SMS and push notifications (fewer push because people hate those)

Old Stack
– Started with a MySQL and PHP application
– AWS from the beginning
– On the largest AWS instance but still slow.

Decided to go with Microservices
– Put stuff in Docker
– Used Jenkins, Puppet, own docker registry, Rundeck (see blog post)
– Devs didn’t like writing puppet code and other manual setup

Decided to go to new container management at start of 2016
– Was pushing for Nomad but devs liked Kubernetes

Kubernetes
– Built in ports, HA, LB, Health-checks

Concepts in Kubernetes
– Pod – one or more containers
– Deployment, DaemonSet, PetSet – scaling of a Pod
– Service – resolvable name, load balancing
– ConfigMap, Volume, Secret – extended Docker volume

Devs look after some Kubernetes config files
– Brings them closer to how stuff is really working

Demo
– Using kubectl to create a pod in his work’s lab environment
– Add a load balancer in front of it
– Add a ConfigMap to update the container’s nginx config
– Make it public
– LB replicas, rolling updates (rough equivalents sketched below)
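
Roughly the same steps with kubectl (2016-era syntax; names are illustrative, not the actual demo):

  kubectl run web --image=nginx                                 # create a pod via a deployment
  kubectl create configmap web-conf --from-file=nginx.conf      # the container's nginx config
  kubectl expose deployment web --port=80 --type=LoadBalancer   # make it public behind a LB
  kubectl scale deployment web --replicas=3                     # LB replicas
  kubectl set image deployment/web web=nginx:1.11               # rolling update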

Best Practices
– lots of small containers are better
– log on container stdout, preferably via JSON
– Test and know your resource requirements (at movio devs teams specify, check and adjust)
– Be aware of the node sizes
– Stateless please
– if not stateless then clustered please
– Must handle unexpected immediate restarts


DevOpsDays Wellington 2016 – Day 2, Session 1

Jethro Carr – Powering stuff.co.nz with DevOps goodness

Stuff.co.nz
– “News” Website
– 5 person DevOps team

Devops
– “Something you do because Gartner said it’s cool”
– Sysadmin -> InfraCoder/SRE -> Dev Shepherd -> Dev
– Stuff in the middle somewhere
– DevSecOps

Company Structure drives DevOps structure
– Lots of products – one team != one product
– Dev teams with very specific focus
– Scale – too big, yet too small

About our team
– Mainly Ops focus
– small number compared to developers
– Operate like an agency model for developers
– “If you buy the Dom Post it would help us grow our team”
– Lots of different vendors with different skill levels and technology

Work process
– Use Kanban with Jira
– Works for Ops focussed team
– Not so great for long running projects

War Against OnCall
– Biggest cause of burnout
– focus on minimising callouts
– Zero alarm target
– Love pagerduty

Commonalities across platforms
– Everyone using compute
– Mostly Java and JavaScript
– Using Public Cloud
– Using off the shelf version control, deployment solutions
– Don’t get overly creative and make things too complex
– Proven technology that is well tried and tested and skills available in marketplace
– Classic technologies like Nginx, Java, Varnish still have their place. Don’t always need the latest fashion

Stack
– AWS
– Linux, ubuntu
– Adobe AEM Java CMS
– AWS 14x c4.2xlarge
– Varnish in front, used by everybody else. Makes ELB and ALB look like toys

How they use Varnish
– Retries against backends on 500 replies, serves old copies (VCL sketch below)
– Split routes to various backends
– Control CDN via header
– Dynamic configuration via Puppet
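
A hedged VCL sketch of that retry / serve-stale behaviour (Varnish 4 syntax, not Stuff’s actual config):

  sub vcl_backend_response {
      set beresp.grace = 24h;     # keep old copies around to serve while stale
      if (beresp.status >= 500) {
          return (retry);         # retry the fetch against a backend
      }
  }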

CDN
– Akamai
– Keeps online during breaking load
– 90% cache offload
– Management is a bit slow and manual

Lambda
– Small batch jobs
– Check mail reputation score
– “Download file from a vendor” type stuff
– Purge cache when static file changes
– Lambda webapps – hopefully soon, a bit immature

Increasing number of microservices

Standards are vital for microservices
– Simple and reasonable
– Shareable vendors and internal
– flexible
– grow organically
– Needs to be detailed
– 12 factor App
– 3 languages: Node, Java, Ruby
– Common deps (SQL, Varnish, memcache, Redis)
– Build pipeline standardised, using Codeship
– Standardised server builds
– Everything automated with Puppet
– Puppet building docker containers (with puppet + puppetstry)
– Standard application deployment

Init systems
– Had a proliferation
– pm2, god, supervisord, sysvinit are out
– systemd and upstart are in

Always exceptions
– “Enterprise ___” is always bad
– Educating the business is a forever job
– Be reasonable, set boundaries

More Stuff at
http://tinyurl.com/notclickbaithonest

Q: Pull request workflow
A: Largely replaced traditional review

Q: DR eg AWS outage
A: Documented process; if Codeship dies we can manually push. Rest runs in 2 AZs, with snapshots

Q: Dev teams structure
A: Project specific rather than product specific.

Q: Puppet code tested?
A: Not really. Kinda tested via the pre-prod environment. Would prefer result-level (serverspec) testing rather than low-level testing of each line
A: Code team have good test coverage though. 80-90% in many cases.

Q: Load testing, APM
A: Use New Relic. Not much luck with external load testing companies

Q: What if somebody wants something non-standard?
A: Case-by-case. Allowed if needed but needs a good reason.

Q: What happens when automation breaks?
A: Documentation is actually pretty good.


DevOpsDays Wellington 2016 – Day 1, Session 3

Owen Evans – DevOps is Dead, long live DevOps

Theory: DevOps is a role that never existed.

In the old days
– Shipping used to be hard and expensive, eg on physical media
– High cost of release
– but everybody else was the same.
– Lots of QA and red tape, no second chances

Then we got the Internet
– Speed became everything
– You just shipped enough

But Hardware still was a limiting factor
– Virtual machines
– IaaS
– Containers

This led to complacency
– Still had a physical server under it all

Birth of devops
– Software got faster but still had to have hardware under there somewhere
– Disparity between operations cadence and devs cadence
– things got better
– But we didn’t free ourselves from hardware
– Now everything is much more complex

Developers are now divorced from the platform
– Everything is abstracted
– It is leaky buckets all the way down

Solutions
– Education of developers as to what happens below the hood
– Stop reinventing the wheel
– Harmony is much more productive
– Lots of tools means that you don’t have enough expertise on each
– Reduce fiefdoms
– Push responsibility but not ownership (you own it but the devs make some of the changes)
– Live with the code
– Pit of success, easy ways to fail that don’t break stuff (eg test environments, by default it will do the right thing)
– Be Happy. Everybody needs to be a bit devops and know a bit of everything.


DevOpsDays Wellington 2016 – Day 1, Session 2

Martina Iglesias – Automatic Discovery of Service metadata for systems at scale

Backend developer at Spotify

Spotify Scale
– 100m active users
– 800+ tech employees
– 120 teams
– Microservices architecture

Walk through a sample artist’s page
– each component (playlist, play count, discography) is a separate service
– Aggregated to send result back to client

Hard to co-ordinate between services as scale grows
– 1000+ services
– Each need to use each others APIs
– Dev teams all around the world

Previous Solution
– Teams had docs in different places
– Some in Wiki, Readme, markdown, all different

Current Solution – System Z
– Centralise in one place, as automated as possible
– Internal application
– Web app, catalog of all systems and its parts
– Well integrated with Apollo service

Web Page for each service
– Various tabs
– Configuration (showing versions of build and uptimes)
– API – list of all endpoints for the service, schema, error codes, etc (automatically populated)
– System tab – Overview on how service is connected to other services, dependencies (generated automatically)

Registration
– System Z gets information from Apollo and prod servers about each service that has been registered

Apollo
– Java libs for writing microservices
– Open source

Apollo-meta
– Metadata module
– Exposes an endpoint with metadata for each service:
– instance info – versions, uptime
– configuration – currently loaded config of the service
– endpoints
– call information – monitors the service, learns what incoming and outgoing calls it actually makes, and to/from which other services
– Automatically builds dependencies

Situation Now
– Quicker access to relevant information
– Automated boring stuff
– All in one place

Learnings
– Think about growth and scaling at the start of the project

Documentation generators
– Apollo
– swagger.io
– raml.org

Blog: labs.spotify.com
Jobs: spotify.com/jobs

Q: How to handle breaking APIs
A: We create new version of API endpoint and encourage people to move over.

Bridget Cowie – The story of a performance outage, and how we could have prevented it

– Works for Datacom
– Consultant in Application performance management team

Story from Start of 2015

– Friday night phone calls from your boss are never good.
– Dropped in application monitoring tools (Dynatrace) on Friday night, watch over weekend
– Prev team pretty sure problem is a memory leak but had not been able to find it (for two weeks)
– If somebody tells you they know what is wrong but can’t give details or fix it, be suspicious

Book: Java Enterprise performance

– Monday prod load goes up and app starts crashing
– Told the ops team, but since the crash wasn’t visible yet, was not believed. Waited.

Tech Stack
– Java App, Jboss on Linux
– Multiple JVMs
– Oracle DBs, Mulesoft ESB, ActiveMQ, HornetQ

Ah Ha moment
– Had a look at import process
– 2.3 million DB queries per half hour
– With max of 260 users, seems way more than what is needed
– Happens even when nobody is logged in
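
For scale: 2.3 million queries per half hour is roughly 1,300 queries per second sustained – nearly 9,000 queries per user per half hour even if all 260 users were active at once.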

Tip: Typically 80% of all issues can be detected in dev or test if you look for them.

Where did this code come from?
– Process to import a csv into the database
– 1 call to Mule -> 12 calls to AMQ -> 12 calls to the App -> 102 DB queries
– Passes all the tests… But
– Still shows huge growth in queries as we go through layers
– DB queries grow bigger with each run

Tip: Know how your code behaves and track how this behaviour changes with each code change (or even with no code change)

Q: Why Dynatrace?
A: Quick to deploy, useful info back in only a couple of hours


DevOpsDays Wellington 2016 – Day 1, Session 1

Ken Mugrage – What we’re learning from burnout and how DevOps culture can help

Originally in the Marines, environment where burnout not tolerated
Works for Thoughtworks – not a mental health professional

Devops could make this worse
Some clichéd places say: “Teach the devs puppet and fire all the Ops people”

Why should we address burnout?
– Google found psychological safety was the number 1 indicator of an effective team
– Not just a negative; people do a better job when feeling good.

What is burnout
– The Truth about burnout – Maslach and Leiter
– The Dimensions of Burnout
– Exhaustion
– Cynicism
– Mismatch between work and the person
– Work overload
– Lack of control
– Insufficient reward
– Breakdown of communication

Work overload
– Various prioritisation methods
– More load sharing
– Less deploy marathons
– Some orgs see devops as a cost saving
– There is no such thing as a full stack engineer
– team has skills, not a person

Lack of Control
– Team is ultimately responsible for the decisions
– Use the right technology and tools for the team
– This doesn’t mean a “DevOps team” controlling what others do

Insufficient Reward
– Actually not a great motivator

Breakdown in communication
– Walls between teams are bad
– Everybody involved with product should be on the same team
– 2 pizza team
– Pairs with different skill sets are common
– Swarming can be done when required ( one on keyboard, everybody else watching and talking and helping on big screen)
– Blameless retrospectives are held
– No “Devops team”, creating a silo is not a solution for silos

Absence of Fairness
– You build it, you run it
– Everybody is responsible for quality
– Everybody is measured in the same way
– example Expedia – *everything* deployed has A/B testing
– everybody goes to release party

Conflicting Values
– In the broadest possible sense
– eg Company industry and values should match your own

Reminder: it is about you and how you fit in with the above

Pay attention to how you feel
– Increase your self awareness
– Maslach Burnout inventory
– Try not to focus on the negative.

Pay attention to work/life balance
– Ask for it, company might not know your needs
– If you can’t get it then quit

Talk to somebody
– Professional help is the best
– Trained to identify cause and effect
– can recommend treatment
– You’d call them if you broke your arm

Friends and family
– People who care, that you haven’t even met
– Empathy is great , but you aren’t a professional
– Don’t guess cause and effect
– Don’t recommend treatment if not a professional

Q: Is it gender specific for men (since IT is male dominated)?
– The “absence of fairness” problem is huge for women in IT

Q: How to promote Psychological safety?
– Blameless post-mortems

 

Damian Brady – Just let me do my job

After working in govt, went to work for new company and hoped to get stuff done

But whole dev team was unhappy
– Random work assigned
– All deadlines missed
– Lots of waste of time meetings

But 2 years later
– Hitting all deadlines
– Useful meetings

What changes were made?

New boss protected devs from MUD (Meetings, Uncertainty, Distractions)

Meetings
– In the broad sense: 1-1s, all hands, normal meetings
– People are averaging 7.5 hours/week in meetings
– On average 37% of meeting time is not relevant to person ( ~ $8,000 / year )
– Do meetings have goals and do they achieve those goals?
– 38% without goals
– only half of remaining meet those goals
– around 40% of meetings have and achieve goals
– Might not be wasted. Look at “What has changed as result of this meeting?”

Meetings fixes
– New Boss went to meetings for us (didn’t need everybody) as a representative
– Set a clear goal and agenda
– Avoid gimmicks
– don’t default to 30min or 1h

Distractions
– 60% of people are interrupted 10 or more times per day
– Good to stay in a “flow state”
– 40% of people say they are regularly focused in their work, but everyone is sometimes
– 35% of the time people lose focus when interrupted
– One study shows people can take up to 23 mins to get focus back after an interruption
– ~$25,000/year wasted per person due to interruptions

Distraction Fixes
– Allowing headphones, rule not to interrupt people wearing headphones
– “Do not disturb” times
– Little Signs
– Had “the finger” so that you could tell somebody you were busy right now and would come back to them
– Let devs go to meeting rooms or cafes to hide from interruptions
– “Go dark” periods where email and chat are turned off

Uncertainty
– 82% in the survey were clear on their priorities
– but for nearly 60% of people, their top priority changes before they can finish it
– Autonomy, mastery, purpose

Uncertainty Fixes
– Tried to let people get clear runs at work
– Helped people acknowledge the unexpected work, add to Sprint board
– Established a gate – Business person would have to go through the manager
– Make the requester responsible – made the requester decide what stuff didn’t get done by physically removing stuff from the sprint board to add their own
