Linux.conf.au 2017 – Tuesday – Session 2

Stephen King’s practical advice for tech writers – Rikki Endsley

  • Example What and Whys
    • Blog post, press release, talk to managers, tell devs the process
    • 3 types of readers: Lay, Managerial, Experts
  • Resources:
    • Press: The care and Feeding of the Press – Esther Schindler
    • Documentation: RTFM? How to write a manual worth reading

 

  • “On Writing: A memoir of the craft” by Stephen King
  • Good writing requires reading
    • You need to read what others in your area or topic or competition are writing
  • Be clear on Expectations
    • See examples
    • Howto Articles by others
    • Writing an Excellent Post-Event Wrap Up report by Leslie Hawthorn
  • Writing for the Expert Audience
    • New Process for acceptance of new modules in Extras – Greg DeKoenigserg (Ansible)
    • vs Ansible Extras Modules + You – Robyn Bergeon
      • Defines audience in the intro

 

  • Invite the reader in
  • Opening Line should Invite the reader to begin the story
  • Put in an explitit outline at the start

 

  • Tell a story
  • That is the object of the exercise
  • Don’t do other stuff

 

  • Leave out the boring parts
  • Just provides links to the details
  • But sometimes if people not experts you need to provide more detail

 

  • Sample outline
    • Intro (invite reader in)
    • Brief background
    • Share the news (explain solution)
    • Conclude (include important dates)

 

  • Sample Outline: Technical articles
  • Include a “get technical” section after the news.
  • Too much stuff to copy all down, see slides

 

  • To edit is divine
  • Come back and look at it afterwards
  • Get somebody who will be honest to do this

 

  • Write for OpenSource.com
  • opensource.com/story

 

  • Q: How do you deal with skimmers?   A: Structure, headers
  • Q: Pet Peeves?  A: Strong intro, People using “very” or “some” , Leaving out import stuff

 

 

Share

Linux.conf.au 2017 – Tuesday Session 1

Fishbowl discussion – GPL compliance Karen M. Sandler

  • Fishbowl format
    • 5 seats at front of the room, 4 must be occupied
    • If person has something to say they come up and sit in spare chair, then one existing person must sit down.
  • Topics
    • Conflicts of Law
    • Mixing licences
    • Implied warrenty
    • Corporate Procedures and application
    • Get knowledge of free licences into the law school curriculum
  • “Being the Open Source guy at Oracle has always been fun”
  • “Our large company has spent 2000 hours with a young company trying to fix things up because their license is not GPL compliant”
  • BlackDuck is a commercial company will review your company’s code looking for GPL violations. Some others too
    • “Not a perfect magical tool by any sketch”
    • Fossology is alternative open tool
    • Whole business model around license compliance, mixed in with security
    • Some of these companies are Kinda Ambulance chasers
    • “Don’t let those companies tell you how to tun your business”
    • “Compliance industry complex” , “Compliance racket”
  • At my employer with have a tool that just greps for a “GPL” license in code, better than nothing.
  • Lots of fear in this area over Open-source compliance lawsuits
    • Disagreements in community if this should be a good idea
    • More, Less, None?
    • “As a Lawyer I think there should definitely be more lawsuits”
    • “A lot of large organisations will ignore anything less than [a lawsuit] “
    • “Even today I deal with organisations who reference the SCO period and fear widespread lawsuits”
  • Have Lawsuits chilled adoption?
    • Yes
    • Chilled adoption of free software vs GPL software
    • “Android has a policy of no GPL in userspace” , “they would replace the kernel if they could”
    • “Busybox lawsuits were used as a club to get specs so the kernel devs could create drivers” , this is not really applicable outside the kernel
    • “My goal in doing enforcement was to ensure somebody with a busybox device could compile it”
    • “Lawyers hate any license that prevents them getting future work”
    • “The amount of GPL violations skyrocketed with embedded devices shipping with Linux and GPL software”
  • People are working on a freer (eg “Not GPL”) embeded stack to replace Android userspace: Toybox, Toolbox, No kernel replacement yet.
  • Employees and Compliance
    • Large company helping out with charities systems unable to put AGPL software from that company on their laptops
    • “Contributing software upstream makes you look good and makes your company look good” , Encourages others and you can use their contributions
    • Work you do on your volunteer days at company do not fill under software assignment policy etc, but they still can’t install random stuff on their machines.
  • Website’s often are not GPL compliance, heavy restrictions, users giving up their licenses.
  • “Send your lawyers a video of another person in a suit talking about that topic”

U 2 can U2F Rob N ★

  • Existing devices are not terribly but better than nothing, usability sucks
  • Universal Two-Factor
    • Open Standard by FIDO alliance
    • USB, NFC, Bluetooth
    • Multiple server and host implimentations
    • One device multi-sites
    • Cloning protection
  • Interesting Examples
  • User experience: Login, press the button twice.
  • Under the hood a lot more complicated
    • Challenge from site, send must sign challenge (including website  url to prevent phishing site proxying)
    • Multiple keypairs for each website on device
    • Has a login counter on the device included in signature, so server can panic then counter gets out of sync from a cloned device
  • Attestation Certificate
    • Shared across model or production batch
  • Browserland
    • Javascript
    • Chrome-based support are good
    • Firefox via extension (Native “real soon now”)
    • Mobile works on Android + Chrome + Google Authenticator
Share

Linux.conf.au 2017 – Tuesday Keynote – Pia Waugh

BTW: Conference Streams are online at linux.conf.au/stream

The Future of Humans – Pia Waugh

At a tipping point, we can’t reinvent everything or just do the past with shinny new things.

Started as a Sysadmin, helped her see things as Systems

Trying to make active choices about the future we want,

  • Started building tools, knowledge spread slowly
  • Created cities, people could specialise, knowledge faster
  • Surplus created, much went to rulers, sometimes rulers overthrown, but hierarchy started the same
  • More recently the surplus has got given to people
  • Last 250 years, people have seen themselves as having power, change their future, not just be a peasant.
  • As resources have increased power and resources have been distributed more widely
  • This has kept expanding, – overthrown you boss at work
  • We are on the cusp on a massive skyrocket in quality of live

 

  • Citizens have powers now that we previously centralized
  • We are now in a time of suplus not scaricity
  • Small groups and individual can now disrupt a country, industry or company
  • We made up all of our society, we can make it again to reflect the present not what was needed in the past.
  • Choose our own adventure or let others choose it for us. We have the option now that we didn’t previously
  • Most people’s eyes glaze over when they here that.
  • “You can’t do that” say many people when they find out what software can do.
  • People switch off their creativity when they come to work.

How Could the World be better

  • Property
    • 3D printing could print organs, food, just about anything
    • Why are we protecting business models that are already out of date (eg copyright) when we couple use them to eliminated scarcity
  • Work and Jobs
    • Everybody is scared about technology taking jobs
    • What do we care about the lose of jobs
    • Why is the value of a person defined by a full-time jobs?
  • Transhumanism
    • tatoos, peicing have been around forever
    • Obsession with the human “normal” , is this a recent thing from the media?
    • Society encourages people towards the Norm
    • Internet has demonstrated that not everybody is normal – Rule 34
    • “If you lose a leg, instead of getting a replacement leg, whey not have seven legs?”
    • Anyone who doesn’t make our definition of Normal is seen as something less even if they have amazing abilities
  • Spaceships
    • Still takes a day to get around the planet
    • If we are going to set up new worlds how are they going to run?
  • Global Citizenship
    • People are seen though the lens of their national citizenship
    • Governments are not the only representative of our rights

 

  • “How can we build a better world? Luckily we have git”
  • We have the power and knowledge to do things, but not all people do
  • If you are as powerful as the tools you use, where does that leave people who can’t use computers or program?

 

  • Systemic Change
    • What doesn’t you Doctor say about “scratching your itch” ?
    • Example: “diversity” , how do we deal with the problems that led us to not having it.
  • Who are you building for? Not building for?
  • What is the default position in society? Is it to no get knowledge, power?
  • What does human mean to you
  • Waht do we value
  • What assumptions and bias do you have?
  • How are you helping non-geeks help themselves
  • What future do you want to see?

 

  • How are Systems changing? How do out policies, assumptions laws reflect the older way?
    • Scarcity -> Surplus
    • Close -> Open
    • Centralise -> Distributed
    • Belief -> Rationalism
    • Win/Lose -> Cooperative competitive
    • Nationalism -> World Citizen
    • Normative Human -> Formative Human
  • I believe the Open Source Culture is a good model for society
  • But in Inventing the future we have to be careful not to drag the legacy systems and values from the past.
Share

2017 SysAdmin Miniconf – Session 3

Turtles all the way down – Thin LVM + KVM tips and Tricks – Steven Ellis

  • ssd -> partition -> encryption -> LVM -> [..] -> filesystem
  • Lots of examples see the online Slides
  • https://github.com/steven-ellis/ansible-playpen

Samba and the road to 100,000 user – Andrew Bartlett

  • Release cycle is every 6 months
  • Samba 4.0 is 4 years p;d
  • 4.2 and older are out of security support by Samba team (support by distros sometimes)
  • Much faster adding users to AD DC. 55k users added in 50 minutes
  • Performance issues, not bugs, are now the biggest area of work
    • Customer deploying SAMBA at scale
  • Looking for Volunteers running AD will to run a tshark script
    • What does your busy hour look like?
    • What is the pattern of requests?

The School for Sysadmins Who Can’t Timesync Good and Wanna Learn To Do Other Stuff Good Too – Paul Gear

  • Aim is 1-10ms accuracy
  • Using Standard Linux reference distribution etc
  • Why care
    • Same apps need time sync
    • Log matching
  • Network Time Foundation needs support
  • NTP
    • Not widely understood
    • Unglamorous
    • Daunting documentation
    • old protocol, chequered secrity history
    • The first Google result may not be accurate
  • Set clock
    • step – jump clock to new time
    • slew – gradually adjust the time
  • NTP Assumption
    • The is one true time – UTC
    • Nobody really has it
    • bad time servers may be present
    • networks change

I ran out of power on my laptop at this point so not many more notes. Paul gave a very good set of recommendations and myth-busting for those running NTP though. His notes will be online on the Sysadmin Miniconf site and he has also posted more detail online.

Share

2017 Sysadmin Miniconf – Session 2

Running production workloads in a programmable infrastructure – Alejandro Tesch

Managing performance parameters through systemd – Sander van Vugt

  • Mostly Demos in this talk too.
  • Using CPUShare parameter as an example
  • systemd-cgtop and systemd-cgls
  • “systemctl show stress1.service” will show available parameters
  • “man 5 systemd.resource-control” gives a lot more details.

Go for DevOps – Caskey L. Dickson

  • SideBar: The Platform Wars are over
    • Hint: We all won
    • As long as have an API we are all cool
  • Always builds staticly linked binaries, should work on just about any Linux system. Just one file.
  • Built in cross compiler (eg for Windows, Mac) via just enviroment variable “GOOS=darwin” and 32bit “GOARCH=32”
  • Bash is great, Python is great, Go is better
  • Microservices are Services
  • No Small Systems
    • Our Scripts are no longer dozens of lines long, they are thousands of lines long
    • Need full software engineering
  • Sysops pushing buttons and running scripts are dying
  • Platform Specific Code
    • main_linux.go main_windows.go and compiler find.
    • // +build linux darwin     <– At the top of the file
  • “Once I got my head around channels Go really opened up for me”

 

Share

2017 Sysadmin Miniconf – Session 1

The Opposite of the Cloud – Tom Eastman

  • Korinates Data gateway – an appliance onsite at customers
  • Requirements
    • A bootable images ova, AMI/cloud images
    • Needs network access
    • Sounds like an IoT device
  • Opoossite of cloud is letting somebody outsource their stuff onto your infrastructure
  • Tom’s job has been making a nice and tidy appliance
  • What does IoT get wrong
    • Don’t do updates, security patches
    • Don’t treat network as hostile
    • Hard to remotely admin
  • How to make them secure
    • no default or static credentials
    • reduce the attack surface
    • secure all networks comms
    • ensure it fails securely
  • Solution
    • Don’t treat appliances like appliances
    • Treat like tightly orchestrated Linux Servers
  • Stick to conserative archetecture
    • Use standard distribution like Debian
    • You can trust the standard security updates
  • Solution Components
    • aspen: A customized Debian machine image built with Packer
    • pando: orchestration server/C&C network
    • hakea: A Django/Rest microservice API in charge
  • saltstack command and control
    • Normal orchestration stuff
    • Can works as a distributed command execution
    • The minions on each server connect to the central node, means you don’t need to connect into a remote appliance (no incoming connections needed to appliance)
    • OpenVPN as Internet transport
    • Outgoing just port 443 and openvpn protocol. Everything else via OpenVPN
  • What is the Appliance
    • A lightly mangled Debian Jessie VM image
    • Easy to maintain by customer, just reboot, activate or reinstall to fix any problems.
    • Appliance is running a bunch of docker containers
  • Appliance authentication
    • Needs to connect via 443 with activation code to download VPN and Salt short-lived certificates to get started
    • Auth keys only last for 24 hours.
    • If I can’t reach it it kills itself.
  • Hakea: REST control
    • Django REST framework microservices
    • Self documenting using DRF amd CoreAPI Schema
  • DevOps Principals apply beyonf the cloud

Inventory Management with Pallet Jack – Karl-Johan Karlsson

  • Goals
    • Single source of truth
    • Version control
    • Scaleable (to around 1000 machines, 10k objects)
  • Stuff stored as just a file structure
  • Some tools to access
  • Tools to export, eg to kea DHCP config
  • Tools as post-commit hooks for git. Pushes out update via salt etc
  • Various Integrations
    • API
    • Salt

Continuous Dashboard – You DevOps Airbag – Christopher Biggs

  • Dashboard traditionally targeted at OPs
  • Also need to target Devs
    • KPIs and
  • Sales and Support need to know everything to
  • Management want reassurance, Shipping a new feature, you have a hotline to the CEO
  • Customer, do you have something you are ashamed of?
    • Take notice of load spikes
    • Assume customers errors are being acted on, option to notify then when a fix happens
    • What is relivant to support call, most recent outages affecting this customer
    • Remember recent behavour of this customer
  • What kinds of data?
    • Tradditionally: System load indicators, transtion numbers etc
    • Now: Business Goals, unavoidable errors, spikes of errors, location of errors, user experience metrics, health of 3rd party interfaces, App and product reviews
  • What should I put in dashboards
    • Understand the Status-quo
    • Continuously
    • Look at trends over time and releases
    • Think about features holisticly
  • How to get there
    • Like you data as much as your code
    • Experiment with your data
    • tools: nodered.org, blynk.cc, elastic
  • Insert Dashboards into your dev pipeline
    • Code Review, CI, Unit Test, Confirm that alarms actually work via test errors
    • Automate deployment
  • Tools
    • ELK – off the shelf images, good import/export
    • Node-RED – Flow based data processing, nice visual editor, built in dashboarding
    • Blynk – Nice dashboards in Ios or Android. Interactive dashboard editor. Easy to share
  • Social Media integration
    • Receive from twitter, facebook, apps stores reviews
    • Post to slack and monitoring channels
    • Forward to internal groups

The Sound of Silencing – Julien Goodwin

  • Humans know to ignore “expected” alerts during maintenance
    • Hard to know what is expected vs unexpected
    • Major events can lead to alert overload
  • Level 1 – Turn it all off
    • Can work on small scale
  • Level 2 – Turn off a localtion while working on it.
    • What if something happens while you are doing the work?
    • May work with single-service deployments
  • Level 3 – Turn off the expect alerts
    • Hard to get exactly right
  • Level 4 – Change mngt integration
    • Link the generator up to th change mngt automation system
    • What about changes too small to track?
    • What about changes too big for a simple silence?
  • Level 5 – Inhibiting Alerts
    • Use Service level indigations to avoid alerts on expected failures
    • Fire “goes nowhere” alert
  • Level 6 – Global monitoring and preventing over-siliencing
    • Alert if too many sites down
    • Need to have explicit alerts to spot when somebody silences “*”
  • How to get there from here
    • Incrementally
    • Choose a bad alert and change it to make it better
    • Regularly

 

Share

Linux.conf.au 2017 – Conference Opening

  • Wear SunScreen
  • Karen Sandler introduces Outreachy and it is announced as the raffle cause for 2017
  • Overview of people
    • 462 From Aus
    • 43 from NZ
    • 62 From USA
    • Lots of other countries
    • Gender breakdown lots of no answers so a stats a bit rough
  • Talks
    • 421 Proposals
    • 80-ish talks and 6 tutorials
    • Questions
      • Please ask questions during the question time
  • Looking for Volunteers – look at a session and click to signup
  • Keynotes – A quick profile
  • All the rooms are booked till 11pm! for BOF sessions
  • Lightning talks, Coffee, Lunch, dinners

 

 

Share

DevOpsDays Wellington 2016 – Day 2, Session 3

Ignites

Mrinal Mukherjee – How to choose a DevOps tool

Right Tool
– Does the job
– People will accept

Wrong tool
– Never ending Poc
– Doesn’t do the job

How to pick
– Budget / Licensing
– does it address your pain points
– Learning cliff
– Community support
– API
– Enterprise acceptability
– Config in version control?

Central tooling team
– Pro standardize, educate, education
– Constant Bottleneck, delays, stifles innovation, not in sync with teams

DevOps != Tool
Tools != DevOps

Tools facilitate it not define it.

Howard Duff – Eric and his blue boxes

Physical example of KanBan in an underwear factory

Lindsey Holmwood – Deepening people to weather the organisation

Note: Lindsey presents really fast so I missed recording a lot from the talk

His Happy, High performing Team -> He left -> 6 months later half of team had left

How do you create a resilient culture?

What is culture?
– Lots of research in organisation psychology
– Edgar Schein – 3 levels of culture
– Artefacts, Values, Assumptions

Artefacts
– Physical manifestations of our culture
– Standups, Org charts, desk layout, documentation
– actual software written
– Easiest to see and adopt

Values
– Goals, strategies and philosophise
– “we will dominate the market”
– “Management if available”
– “nobody is going to be fired for making a mistake”
– lived values vs aspiration values (People have good nose for bullshit)
– Example, cores values of Enron vs reality
– Work as imagined vs Work is actually done

Assumptions
– beliefs, perceptions, thoughts and feelings
– exist on an unconscious level
– hard to discern
– “bad outcomes come from bad people”
– “it is okay to withhold information”
– “we can’t trust that team”
– “profits over people”

If we can change our people, we can change our culture

What makes a good team member?

Trust
– Vulnerability
– Assume the best of others
– Aware of their cognitive bias
– Aware of the fundamental attribution error (judge others by actions, judge ourselves by our intentions)
– Aware of hindsight bias. Hindsight bias is your culture killer
– When bad things happen explain in terms of foresight
– Regular 1:1s
Eliminate performance reviews
Willing to play devils advocate

Commit and acting
– Shared goal settings
– Don’t solutioneer
– Provide context about strategy, about desired outcome
What makes a good team?

Influence of hiring process
– Willingness to adapt and adopt working in new team
– Qualify team fit, tech talent then rubber stamp from team lead
– have a consistent script, but be prepared to improvise
– Everyone has the veto power
– Leadership is vetoing at the last minute, thats a systemic problem with team alignment not the system
– Benefit: team talks to candidate (without leadership present)
– Many different perspectives
– unblock management bottlenecks
– Risk: uncovering dysfunctions and misalignment in your teams
– Hire good people, get out of their way

Diversity and inclusion
– includes: race, gender, sexual orientation, location, disability, level of experience, work hours
– Seek out diverse candidates.
– Sponsor events and meetups
– Make job description clear you are looking for diverse background
– Must include and embrace differences once they actually join
– Safe mechanism for people to raise criticisms, and acting on them

Leadership and Absence of leadership
– Having a title isn’t required
– If leader steps aware things should continue working right
– Team is their own shit umbrella
– empowerment vs authority
– empowerment is giving permission from above (potentially temporary)
– authority is giving power (granting autonomy)

Part of something bigger than the team
– help people build up for the next job
– Guilds in the Spotify model
– Run them like meetups
– Get senior management to come and observe
– What we’re talking about is tech culture

We can change tech culture
– How to make it resist the culture of the rest of the organisation
– Artefacts influence behaviour
– Artifact fast builds -> value: make better quality
– Artifact: post incident reviews -> Value: Failure is an opportunity for learning

Q: What is a pre-incident review
A: Brainstorm beforehand (eg before a big rollout) what you think might go wrong if something is coming up
then afterwards do another review of what just went wrong

Q: what replaces performance reviews
A: One on ones

Q: Overcoming Resistance
A: Do it and point back at the evidence. Hard to argue with an artifact

Q: First step?
A: One on 1s

Getting started, reading books by Patrick Lencioni:
– Solos, Politics and turf wars
– 5 Dysfunctions of a team

Share

DevOpsDays Wellington 2016 – Day 2, Session 2

Troy Cornwall & Alex Corkin – Health is hard: A Story about making healthcare less hard, and faster!

Maybe title should be “Culture is Hard”

@devtroy @4lexNZ

Working at HealthLink
– Windows running Java stuff
– Out of date and poorly managed
– Deployments manual, thrown over the wall by devs to ops

Team Death Star
– Destroy bad processes
– Change deployment process

Existing Stack
– VMware
– Windows
– Puppet
– PRTG

CD and CI Requirements
– Goal: Time to regression test under 2 mins, time to deploy under 2 mins (from 2 weeks each)
– Puppet too slow to deploy code in a minute or two. App deply vs Conf mngt
– Can’t use (then) containers on Windows so not an option

New Stack
– VMware
– Ubuntu
– Puppet for Server config
– Docker
– rancher

Smashed the 2 minute target!

But…
– We focused on the tech side and let the people side slip
– Windows shop, hard work even to get a Linux VM at the start
– Devs scared to run on Linux. Some initial deploy problems burnt people
– Lots of different new technologies at once all pushed to devs, no pull from them.

Blackout where we weren’t allowed to talk to them for four weeks
– Should have been a warning sign…

We thought we were ready.
– Ops was not ready

“5 dysfunctions of a team”
– Trust as at the bottom, we didn’t have that

Empathy
– We were aware of this, but didn’t follow though
– We were used to disruption but other teams were not

Note: I’m not sure how the story ended up, they sort of left it hanging.

Pavel Jelinek – Kubernetes in production

Works at Movio
– Software for Cinema chains (eg Loyalty cards)
– 100million emails per month. million of SMS and push notifications (less push cause ppl hate those)

Old Stack
– Started with mysql and php application
– AWS from the beginning
– On largest aws instance but still slow.

Decided to go with Microservices
– Put stuff in Docker
– Used Jenkins, puppet, own docker registery, rundeck (see blog post)
– Devs didn’t like writing puppet code and other manual setup

Decided to go to new container management at start of 2016
– Was pushing for Nomad but devs liked Kubernetes

Kubernetes
– Built in ports, HA, LB, Health-checks

Concepts in Kub
– POD – one or more containers
– Deployment, Daemon, Pet Set – Scaling of a POD
– Service- resolvable name, load balancing
– ConfigMap, Volume, Secret – Extended Docker Volume

Devs look after some kub config files
– Brings them closer to how stuff is really working

Demo
– Using kubectl to create pod in his work’s lab env
– Add load balancer in front of it
– Add a configmap to update the container’s nginx config
– Make it public
– LB replicas, Rolling updates

Best Practices
– lots of small containers are better
– log on container stdout, preferable via json
– Test and know your resource requirements (at movio devs teams specify, check and adjust)
– Be aware of the node sizes
– Stateless please
– if not stateless than clustered please
– Must handle unexpected immediate restarts

Share

DevOpsDays Wellington 2016 – Day 2, Session 1

Jethro Carr – Powering stuff.co.nz with DevOps goodness

Stuff.co.nz
– “News” Website
– 5 person DevOps team

Devops
– “Something you do because Gartner said it’s cool”
– Sysadmin -> InfraCoder/SRE -> Dev Shepherd -> Dev
– Stuff in the middle somewhere
– DevSecOps

Company Structure drives DevOps structure
– Lots of products – one team != one product
– Dev teams with very specific focus
– Scale – too big, yet to small

About our team
– Mainly Ops focus
– small number compared to developers
– Operate like an agency model for developers
– “If you buy the Dom Post it would help us grow our team”
– Lots of different vendors with different skill levels and technology

Work process
– Use KanBan with Jira
– Works for Ops focussed team
– Not so great for long running projects

War Against OnCall
– Biggest cause of burnout
– focus on minimising callouts
– Zero alarm target
– Love pagerduty

Commonalities across platforms
– Everyone using compute
– Most Java and javascript
– Using Public Cloud
– Using off the shelf version control, deployment solutions
– Don’t get overly creative and make things too complex
– Proven technology that is well tried and tested and skills available in marketplace
– Classic technologist like Nginx, Java, Varnish still have their place. Don’t always need latest fashion

Stack
– AWS
– Linux, ubuntu
– Adobe AEM Java CMS
– AWS 14x c4.2xlarge
– Varnish in front, used by everybody else. Makes ELB and ALB look like toys

How use Varnish
– Retries against backends if 500 replies, serve old copies
– split routes to various backends
– Control CDN via header
– Dynamic Configuration via puppet

CDN
– Akamai
– Keeps online during breaking load
– 90% cache offload
– Management is a bit slow and manual

Lamda
– Small batch jobs
– Check mail reputation score
– “Download file from a vendor” type stuff
– Purge cache when static file changes
– Lamda webapps – Hopefully soon, a bit immature

Increasing number of microservices

Standards are vital for microservices
– Simple and reasonable
– Shareable vendors and internal
– flexible
– grow organicly
– Needs to be detail
– 12 factor App
– 3 languages Node, Java, Ruby
– Common deps (SQL, varnish, memcache, Redis)
– Build pipeline standardise. Using Codeship
– Standardise server builds
– Everything Automated with puppet
– Puppet building docker containers (w puppet + puppetstry)
– Std Application deployment

Init systems
– Had proliferation
– pm2, god, supervisord, systemvinit are out
– systemd and upstart are in

Always exceptions
– “Enterprise ___” is always bad
– Educating the business is a forever job
– Be reasonable, set boundaries

More Stuff at
http://tinyurl.com/notclickbaithonest

Q: Pull request workflow
A: Largely replaced traditional review

Q: DR eg AWS outage
A: Documented process if codeship dies can manually push, Rest in 2*AZs, Snapshots

Q: Dev teams structure
A: Project specific rather than product specific.

Q: Puppet code tested?
A: Not really, Kinda tested via the pre-prod environment, Would prefer result (server spec) testing rather than low level testing of each line
A: Code team have good test coverage though. 80-90% in many cases.

Q: Load testing, APM
A: Use New Relic. Not much luck with external load testing companies

Q: What is somebody wants something non-standard?
A: Case-by-case. Allowed if needed but needs a good reason.

Q: What happens when automation breaks?
A: Documentation is actually pretty good.

Share