Linux.conf.au 2016 – Sysadmin Miniconf – Session 2

Site Reliability Engineering at Dropbox – Tammy Butow

  • Having an SLA and measuring against it. Cap ops work, Blameless Post Mortems, Always be coding
  • 400M customers, a billion files every day
  • Very hard to find people to scale, so build tools to scale instead
  • Team looks after 6,000 DB machines, the whole machine not just the app
  • Build a lot of tools in Python and Go
  • PygerDuty – Python library for the PagerDuty API
    • Easy to find the top things paging, write tools to reduce these
    • Regular weekly meeting to review those problems and make them better
    • If work is happening on machines then turn off monitoring on them so people don’t get woken up for things they don’t need to.
    • Going for days without any pages
  • Self-Healing and auto-remediation scripts
  • Hermes
    • Allocate and track tasks to groups
  • Automation of DB tasks
  • Bot copies PagerDuty alerts into Slack
  • Aim Higher
    • Create a roadmap for next 12 months
    • Building a rocketship while it is flying through the sky
  • Follow the Sun so people are working days
  • Post Mortem for every page
  • Frequent DR testing
  • Take time out to celebrate
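
The "find the top things paging" step could look roughly like the sketch below. It is only an illustration: the incident records are made-up dictionaries with hypothetical `service` and `summary` fields, not real output from PygerDuty or the PagerDuty API.

```python
from collections import Counter


def top_pagers(incidents, n=3):
    """Return the n most frequent (service, summary) pairs that paged."""
    counts = Counter((i["service"], i["summary"]) for i in incidents)
    return counts.most_common(n)


# Hypothetical sample data standing in for a week of pages.
incidents = [
    {"service": "mysql", "summary": "replication lag"},
    {"service": "mysql", "summary": "disk full"},
    {"service": "mysql", "summary": "replication lag"},
    {"service": "proxy", "summary": "high latency"},
    {"service": "mysql", "summary": "replication lag"},
]

for (service, summary), count in top_pagers(incidents):
    print(f"{count:3d}  {service}: {summary}")
```

A list like this is the sort of thing you would bring to the weekly review meeting, so the noisiest alert gets a tool or a fix first.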

I missed out writing up the next couple of talks due to technical problems

 


Linux.conf.au 2016 – Sysadmin Miniconf – Session 1

Is that a Cloud in your pocket – Steven Ellis

  • What if you could have a demo of a stack on a phone
  • or on a memory stick or a mini raspberry-pi type PC
  • Nested Virtualisation
  • Hardware
    • Using Linux as host env, not so good on Win and Mac
    • ThinkPad, Fedora or CentOS, 128GB SSD
  • Nested Virtualisation
    • Huge performance boost over qemu
    • Use SSD
    • enable options in modules kvm-intel or kvm-amd
    • Confirm SSD perf 1st – hdparm -t /dev/sdX
    • Create base env for VMs, enable vmx in features
    • Make sure it uses a different network so doesn’t badly interact with ones further out
  • Think LVM
    • Create thin pool for all envs
    • Think about “issue_discards = 1” in lvm.conf
  • Base image
    • Doesn’t have to be minimal
    • update the base regularly
    • How do you build your base image?
      • Thin may go weirdly wrong
      • Always use kickstart to re-create it.
    • Think of your use case, don’t skimp on the disk (eg 40G disk for image)
    • ssh keys, Enable yum cache
    • Patch once kicked
    • keep a content cache, maybe with rsync or mrepo
  • Turn off the VM and then use fstrim and partx to make it nice and small.
  • virt-manager can’t manage thin volumes, DON’T manually add the path there
  • use virsh to manually add the path instead.
  • snapshots of snapshots, great performance on SSD
  • Thin no longer activates automatically on distros
  • packstack is a simple way to install a simple OpenStack setup
  • LVM vs QCOW
    • qcow okay for smaller images
    • cloud-init with atomic
    • do not snapshot a qcow image when it is running
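
Before building nested VMs it is worth confirming that nesting is really switched on in the host's KVM module. A minimal sketch, assuming a Linux host where the kvm_intel or kvm_amd module exposes its `nested` parameter under /sys/module (the function names are my own):

```python
from pathlib import Path


def nested_enabled(value):
    """Interpret the module's 'nested' parameter ('Y' or '1' mean on)."""
    return value.strip() in ("Y", "1")


def check_nested(sys_root="/sys/module"):
    """Report nested virtualisation status for whichever KVM module is loaded."""
    results = {}
    for module in ("kvm_intel", "kvm_amd"):
        param = Path(sys_root) / module / "parameters" / "nested"
        if param.exists():
            results[module] = nested_enabled(param.read_text())
    return results


# An empty dict means neither KVM module is loaded on this host.
print(check_nested())
```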

Revisiting Unix principles for modern system automation – Martin Krafft

  • SSH Botnet
  • OSI of System Automation
  • Transport unix style, both push and pull
  • uses socat for low level data moving
  • autossh <- restarts ssh connection automatically
  • creates control socket

A Gentle Introduction to Ceph – Tim Serong

  • Ceph gives a storage cluster that is self healing and self managed
  • 3 interfaces, object, block, distributed fs
  • OSD with files on them, monitor nodes
  • OSD will forward writes to other replicas of the data
  • clients can read from any OSD
  • Software defined storage vs legacy appliances
  • Network
    • Fastest you can, separate public and cluster networks
    • cluster faster than public
  • Nodes
    • 1-2G ram per TB of storage
    • read the recommendations
  • SSD journals to cache writes
  • Redundancy
    • Replications – capacity impact but usually good performance
    • Erasure coding – Like raid – better space efficiency but impact in most other areas
  • Adding more nodes
    • tends to work
    • temp impact during rebalancing
  • How to size
    • understand your workload
    • make a guess
    • Build a 10% pilot
    • refine until perf is achieved
    • scale up the pilot
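
The "make a guess" step is mostly arithmetic. Here is a rough sketch of the capacity trade-off between replication and erasure coding plus the RAM guideline above. The 3 copies, the 4+2 erasure coding profile and the 120 TB figure are illustrative assumptions of mine, not numbers from the talk.

```python
def usable_replicated(raw_tb, copies=3):
    """Usable capacity when every object is stored `copies` times."""
    return raw_tb / copies


def usable_erasure_coded(raw_tb, k=4, m=2):
    """Usable capacity with k data chunks plus m coding chunks per object."""
    return raw_tb * k / (k + m)


def ram_range_gb(storage_tb):
    """RAM guideline from the talk: 1-2 GB per TB of storage."""
    return (storage_tb * 1, storage_tb * 2)


raw = 120  # hypothetical raw capacity in TB
print(usable_replicated(raw))      # usable TB with 3 copies
print(usable_erasure_coded(raw))   # usable TB with 4+2 erasure coding
print(ram_range_gb(raw))           # (low, high) GB of RAM
```

With these assumed settings erasure coding gives twice the usable space of replication, which is the "better space efficiency" side of the trade-off; the performance cost does not show up in a calculation like this, which is why the pilot matters.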

Keeping Pinterest running – Joe Gordon

  • Software vs service
    • No stable versions
    • Only one version is live
    • Devs support their own service – aligns incentives, eg monitoring built in
    • Testing against production traffic
  • SRE at Pinterest
    • Like a pit crew in F1
    • firefighting at scale
    • changing tires while moving
  • Operation Maturity
  • Operation Excellence
    • Have the best practices, docs, process, improvements
    • Repeatable deploys
  • Visibility
    • data driven company
    • Lots of Time series data – TSDB
    • Using ELK
  • Deployments
    • no impact to end user
    • easy to do, every few minutes
  • Canary vs Staging
    • Send dark traffic (copies) to canary box without sending anything back to user
    • Bounce back to starting if problems
  • Teletraan
    • Rollback, hotfix, rolling deploy, starting and testing, visibility and usability
    • client-server model
    • pre/post download, restart, etc scripts included with every deployment
    • pause/resume various testing
  • Postmortems and Production readiness reviews
  • Cloud is not infinite, often will hit AWS capacity limits or even no available stuff in the region
  • Need to be able to make sure you know what you are running and if it is efficiently used
  • Open sourced tools
    • mysql_utils – lots of tools to manage many DBs
    • Thrift tools
    • Teletraan – open sourced in Feb 2016
    • github.com/pinterest

Linux.conf.au 2016 – Tuesday – Keynote: George Fong

George Fong – Chair of Internet Australia

The Challenges of the Changing Social Significance of the Nerd

  • “This is the first conference I’ve been to where there’s an extremely high per capita number of ponytails”
  • Linux is not just running web servers and other servers but also network devices
  • Linux and the Web aren’t the same thing, but they’ve grown symbiotically and neither would be the same without the other
  • “One of the lessons we’ve learned in Australia is that when you mix technology with politics, you get into trouble”
  • “We have proof in Australia that if you take guns away from people, people stop getting killed”

 


Linux.conf.au 2016 – Monday – Session 3

Cloud Anti-Patterns – Casey West

  • The 5 stages of Cloud Native
  • Deploying my apps to the cloud is painful – why?
  • Denial
    • “Containers are like tiny VMs”
    • Anti-Pattern 1 – do not assume what you have now is what you want to put into the cloud or a container
    • “We don’t need to automate continuous delivery”
    • We shouldn’t automate what we have until it is perfect. Automate to make things consistent (not always perfect at least at the start)
  • Anger
    • “works on my machine”
    • Devs just push straight from dev boxes to production
    • Not about making worse code go to production faster
    • Aim for repeatable, testable builds, just faster
  • Bargaining
    • “We crammed the monolith into a container and called it a microservice”
    • Anti-Pattern: Critically think on what you need to re-factor (or “re-platforming” )
    • “Bi-modal IT”
    • Some stuff on fast lane, some stuff on old-way slow lane
    • Anti-pattern: legacy products put into slow lane, these are often the ones that really need to be fixed.
    • “Microservices” talking to same data-source, not APIs
  • Depression
    • “200 microservices but forgot to setup Jenkins”
    • “We have an automated build pipeline but only release twice per year”
  • Acceptance
    • All software sucks, even the stuff we write
    • Respect CAP theorem
    • Respect Conway’s Law
    • Small batch sizes work for replatforming too
  • Microservices architecture, Devops culture, Continuous delivery – Pick all three

Cloud Crafting – Public / Private / Hybrid  – Steven Ellis

  • What does Hybrid mean to you?
  • What is private Cloud (IAAS)
  • Hybrid – communicate to public cloud and manage local stuff
  • ManageIQ – single pane of glass for hardware, VMs, clouds, containers
  • What does it do?
    • Brownfields as well as Greenfields, gathers current setup
    • Discovery, API presentations, control and detect when env non-compliant (eg not fully patched)
    • On-premises or public cloud
    • Supplied as a virtual appliance, HA, scale out
    • Platform – CentOS 7, Rails, PostgreSQL, GUI, some dashboards out of the box.
  • Get involved
    • Online, roadmap is public
    • Various contributors
  • DEMO
  • Just put in credentials to allow access and then it can gather the data straight away

Live Migration of Linux Containers by Tycho Andersen

  • LXC / LXD
  • LXD is a REST API that you use to control the container system
  • tool -> REST -> Daemon -> lxc -> Kernel
  • “lxc move host1:c1 host2: ” – Live migrations
    • Needs a bit of work since lots moving, lots of ways it could fail
    • 3 channels created, control, filesystem, container processes state
  • CRIU
    • 5 years of check-pointing
    • Lots based off OpenVZ initial work
    • All sorts of things need to support check-pointing and moving (eg selinux)
    • Iterative migration added
    • Lots of hooks needed for very privileged kernel features
  • Filesystems
    • btrfs, lvm, zfs, (swift, nfs), have special support for migration that it hooks into
    • rsync between incompatible hosts
  • Memory State
    • Stop the world and move it all
    • Iterative incremental transfer (via p.haul) being worked on.
  • LXC + LXD 2.0 should be in Ubuntu 16.04 LTS
  • Need to use latest versions and latest kernels for best results.
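
The difference between "stop the world" and iterative transfer can be shown with a toy model. This is a simplified simulation of the general pre-copy idea, not how CRIU or p.haul actually work; the page counts and the fixed dirty percentage are assumptions for illustration.

```python
def precopy_rounds(total_pages, dirty_percent, threshold, max_rounds=10):
    """Simulate iterative pre-copy: resend dirtied pages until few remain.

    Returns the pages sent in each live round and the pages left for the
    final stop-the-world copy.
    """
    sent = []
    to_send = total_pages
    for _ in range(max_rounds):
        if to_send <= threshold:
            break
        sent.append(to_send)
        # While this round was being copied, some of those pages got dirtied.
        to_send = to_send * dirty_percent // 100
    return sent, to_send


rounds, final = precopy_rounds(total_pages=100_000, dirty_percent=10, threshold=500)
print(rounds, final)
```

In this model the container only has to be frozen for the last small batch of pages, rather than for the whole memory image.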

Linux.conf.au 2016 – Monday – Session 1

Open Cloud Miniconf – Continuous Delivery using blue-green deployments and immutable infrastructure by Ruben Rubio Rey

  • Lots of things can go wrong in a deployment
  • Often hard to do rollbacks once upgrade happens
  • Blue-Green deployment is running several envs at the same time, each potentially with different versions
  • Immutable infrastructure , split between data (which changes) and everything else only gets replaced fully by deployments, not changed
  • When you use Docker, don’t store data in the container; that makes it immutable. But containers are not required to do this.
  • Rule 1 – Never modify the infrastructure
  • Rule 2 – Instead of modifying – always create from ground up everything that is not data.
  • Advantages
    • Rollbacks easy
    • Avoid Configuration drift
    • Updated and accurate infrastructure documentation
  • Split things up
    • No State – LBs, Web servers, App Servers
    • Temp data , Volatile State – message queues, email servers
    • Persistent data – Databases, Filesystems, slow warming cache
  • In case of temp data you have to be able to drain
  • Use LBs and multiple servers to split up infrastructure, more bits give more room to split up the upgrades.
  • If pending jobs require old/new version of app then route to servers that have not/have been upgraded yet.
  • Put toy rocket launcher in devs office, shoots person who broke the build.
  • Need to “use activity script” to bleed traffic off section of the “temp data” layer of infrastructure, determine when it is empty and then re-create.
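
The drain-then-recreate step for the "temp data" layer might be sketched like this. Everything here is a hypothetical stand-in: the server is a plain dict, and `pending_jobs` and `recreate` are callables you would replace with your own activity script and deployment tooling.

```python
import time


def drain_and_recreate(server, pending_jobs, recreate, poll_seconds=0, max_polls=100):
    """Stop new traffic, wait until the server is empty, then rebuild it."""
    server["accepting"] = False          # the load balancer stops routing here
    for _ in range(max_polls):
        if pending_jobs(server) == 0:    # activity check: is it empty yet?
            return recreate(server)      # replace the server, never modify it
        time.sleep(poll_seconds)
    raise TimeoutError(f"{server['name']} did not drain")


# Toy example: a queue server that finishes one job per poll.
queue = {"name": "mq-blue-1", "accepting": True, "jobs": 3, "version": "v1"}


def pending(server):
    remaining = server["jobs"]
    server["jobs"] = max(0, remaining - 1)
    return remaining


def rebuild(server):
    return {"name": server["name"], "accepting": True, "jobs": 0, "version": "v2"}


new_server = drain_and_recreate(queue, pending, rebuild)
print(new_server)
```

Note that the old server is never patched in place; the function hands back a freshly built replacement, which is the point of Rule 1 and Rule 2.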

Priorities for 2016

This is almost a New Year’s resolutions page but not quite. It is a list of the stuff that will take priority over other things in 2016.

  • Chess – Aim to play regularly in tournaments, do weekly coaching and study at least 7 hours per week on tactics, endgames and openings.
  • Programming – Continue improving my programming skills, finish the book I am on, do a few exercises and create a few things
  • Blogging – At least 1 post each month to both my personal blog and the Auckland Chess Centre website
  • Driving – Get my Restricted Driver License
  • Reading – Read books (not online) at least half an hour per day
  • Health – 7500 steps every weekday plus get to goal weight
  • Conference – Run successful Sysadmin Miniconf at Linux.conf.au 2016

Stretch Goals – If I am keeping up with the above

  • Start working my way through Shakespeare’s plays
  • Do a couple of new website projects I’ve been putting off
  • Watch 2-3 hours of TV each week.

Studying for Driver license test with Anki

In 2014 I decided to do a bit of work to finally get my New Zealand driver license. The first step towards this was passing the theory test, which is a 35 question test given on computer. You have to get at least 32 questions right to pass.

After spending a bit of time looking at the roadcode book I decided to go with just learning the questions. I did this by:

  1. Buying some of the official practice exams
  2. Grabbing other questions from unofficial sites
  3. Entering some other questions manually from the books

I took all these questions and created an Anki deck. Anki is spaced repetition software that I use to learn things. I tell it to ask me a few new questions every day; if I get them wrong it asks me again tomorrow, if I get them right it asks me again next week. Gradually as I learn something it asks me less often (see the more technical explanation here).
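
The basic idea can be sketched in a few lines. This is a deliberately simplified model of my description above, not Anki's real scheduler (which is based on the SM-2 algorithm and has many more knobs); the "double the gap after each correct answer" rule is an assumption made for illustration.

```python
from datetime import date, timedelta


def next_review(today, interval_days, correct):
    """Return (next due date, new interval) for one card.

    Wrong answer: see the card again tomorrow.
    Right answer: wait a week at first, then double the gap each time.
    """
    if not correct:
        new_interval = 1
    elif interval_days < 7:
        new_interval = 7
    else:
        new_interval = interval_days * 2
    return today + timedelta(days=new_interval), new_interval


due, interval = next_review(date(2014, 6, 1), 0, correct=True)
print(due, interval)
due, interval = next_review(due, interval, correct=True)
print(due, interval)
due, interval = next_review(due, interval, correct=False)
print(due, interval)
```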

A typical question on an Anki deck looks like these screenshots:

[Screenshots: Screenshot_2015-12-10-21-05-24 and Screenshot_2015-12-10-21-04-24]

The screenshot on the left shows me being asked the question. Once I pick my answer I look at the actual answer (see rightmost screenshot).

If I get it wrong I get the card again in 10 minutes. If I get it right then, depending on how easy I judged it, I may only see it again in months.

I ended up entering just on 400 questions and told Anki to give me 5 new cards every day plus whatever old ones I had to review. After a few months I had gone through all the questions and had a good feel for them. I also did some of the official practice exams.

Eventually in December 2014 I sat the exam and got 100 percent correct.

I’ll make my deck available at the link below. There are just over 400 cards in it, some with pictures. There are a few duplications but no errors as far as I am aware. They are current as of late 2014 (including the give-way rules change that year).

To use them you’ll need a copy of Anki and it is probably easiest to use the desktop edition to import the file and then use an Ankiweb account to Synchronize to a copy on your phone.

Download NZ Driver license Theory Anki Deck (2MB .apkg file)


Donations 2015

Up until a couple of years ago my main charity was a regular payment to Oxfam. However I cancelled this after I decided I disliked their fund-raising methods and also read that they were probably not in the top few percent of charities. Since then I’ve been tending to do things all in one go.

I just finished doing this year’s so I thought I’d document it here. It does feel a little weird to post about it but I’ve seen others do it. The theory I guess is that you the reader might be convinced that giving to charity is a good thing and do likewise.

My main donation was to the top four charities rated by GiveWell:

  • Against Malaria Foundation – $US 150
  • Schistosomiasis Control Initiative – $US 150
  • Deworm the World Initiative – $US 150
  • GiveDirectly – $US 150

Next were a series of Open Source projects

  • Debian – $US 50
  • Freedesktop.org – $US 30
  • LibreOffice – $US 30
  • OpenBSD – $US 30
  • Python – $US 30
  • Gnome – $US 30

Interestingly enough I hadn’t originally intended to donate to LibreOffice and Freedesktop.org but Debian handles donations via Software in the Public Interest and those two showed up on the same donation page.

and some others

I thought about a few others including The Internet Archive, Anki and Mozilla. Perhaps next year


OSCON 2015

No, I didn’t attend 🙁

But I had a look through the list of talks and read a tonne of slides. Here are some interesting ones I found, which I hope to watch when the videos go up.

See also:


Gather 2015 – Afternoon Sessions

Panel: “How we work” featuring Lance Wiggs, Dale Clareburt, Robyn Kamira, Amie Holman – Moderated by Nat Torkington

  • Flipside of Startups given by Nat
  • Amie – UX and Services Designer for the Govt, thinks her job is pretty cool. Puts services online.
  • Lance – Works for NZTE Better by Capital programme. Also runs an early stage fund. Multiple starts and fails
  • Dale – Founder of Weirdly. Worked her way up to top of recruitment company (small to big). Decided to found something for herself.
  • Robyn – Started business 25 years ago. IT consultant, musician, writer.
  • Nat – Look at what you are getting from the new job. Transition to new phase in life. Want to be positive.
  • Types of jobs: Working for someone else, work for yourself, hire other people, investor. Each has own perks, rewards and downsides.
  • Self employed
    • Big risk around income, peaks and troughs. Robyn always lived at the bottom of the trough level of income. Some people have big fear where next job is coming from.
    • Robyn – Charged Govt as much as possible. Later on charged just below what the really big boys charged. Also has lower rates for community orgs. Sniffed around to find out the rates. Sometimes asked the client. Often RFPs don’t explicitly say so you have to ask.
    • Pricing – You should be embarrassed about how much you charge for services.
    • Robyn – Self promotion is really hard. Found that contracts came out of Wellington. Book meetings in cafes back to back. Chat to people, don’t sell directly.
  • Working for others
    • Amie – Working in a new area of government. But it is an area that is growing. Fairly permissive area, lots of gaps that they can fill.
    • Dale – Great experience as an employee. In environment with lot of autonomy in a fast growing company.
    • Lance – Worked for Mobil – Lots of training courses, overseas 6 months after hired. 4 years 4 different cities, steep learning curve, subsidized housing etc. “Learning curve stopped after 4 years and then I left”.
    • Big companies downside: Multiple stakeholders, Lots of rules
    • Big company upside: Can do startup on the side, eg a family. Secure income. Get to play with big money and big toys.
  • Startup
    • Everything on steroids
    • Really exciting
    • Starting all parts of a company at once
    • Responsibility for business and people in it
    • Crazy ups and downs. Brutal emotional roller-coaster
    • Lance lists 5 businesses off the top of his head that failed that he was at. 3 of which he was the founder
    • Worst that can happen is that you can lose your house
    • Is this life for everyone? – Dale “yes it can be, need to go in with your eyes open”.  “Starting a business can be for everyone. I’m the poorest I’ve ever been now but I’m the happiest I’ve ever been”
    • At a startup you are not working for yourself, you are working for everybody else. Dale says she tries to avoid that.
    • Robyn – “If your life is gone when you are in a business then you are doing it wrong.”
    • If you are working from home you can get isolated, get some peer support and have a drink, coffee with some others.
  • Robyn – Recommends “How to Win Friends and Influence People”
  • Dale
    • Jobhunters – Look for companies 1st and specific job 2nd
    • Startup – Meet everyone that you know and ask their opinion on your pitch
    • Young People going to Uni – You have to get work experience, as a recruiter she looks at experience 1st and pure academic history second.
  • Lance
    • Balance between creating income, creating wealth, learning
    • Know what you are passionate about and good at
    • It is part of our jobs to support everyone around us. Promote other people
  • Amie
    • Find the thing that is your passion
    • When you are delivering your passion then you are delivering something relevant

 Pick and Mix

  • Random Flag generator – @polemic
    • See Wikipedia page for parts of a flag
    • 3 hex numbers are the palette
    • 4 numbers represent the pattern
    • Next number will be the location
    • next number which color will be assigned
    • Last number will be a tweak number
    • Up to 8 or 9 of the above
    • Took Python pyevolve and ran evolution on them.
  • Alex @4lexNZ , @overtime
    • E-sports corporate gaming league
    • untested in NZ
    • Someone suggested cold calling CEOs or writing them letter
  • Simon @slyall (yes me)
    • Low volume site for announcements
  • Mutation testing
    • Tweak test values of code, to reverse fuzzing
  • Landway learning  – @kiwimrdee
    • Looking for computers to borrow for class
    • They teach lots of stuff
  • Poetry for computers – @kiwimrdee
    • Hire somebody with an English/arts background who understands language rather than somebody from a CS background who understands machines
  • the.dosprompt.com
    • Lossless image compression for the web
    • Tools vary across the platform
  • Glen – Make computers learn to play Starcraft 1
    • Takes replays of humans playing starcraft
    • Getting computer to learn to play from that DB
    • It is struggling
  • Emergent political structures in tabletop games

Never check in a bag – How to pack

  • 48 hour bag
    • Laptop and power
    • Always – Zip up pouch, tissues , hand sanitizer, universal phone charger, breath mints, the littlest power plug (check will work in multiple voltages), Food bar, chocolate.
    • If more than 48 hours – notebook, miso soup, headphones, pen, laptop charger, apple plugs ( See “world travel kit” on apple site)
    • Get smallest power plug that will charge your laptop
    • Bag 3 – Every video adapter in the world, universal power adapter, airport express.
    • TP-link battery powered wifi adapters
    • If going away just moves laptop etc to this bag
    • Packing Cell
      • Enough clothes to get me through 48 hours
      • 2 * rolled tshirts (ranger rolling)
      • 2 pairs of underwear
      • 2 pairs of socks
      • Toiletries. Ziplock bag that complies with TSA rules for gels etc.
      • Other toiletries in different bag
      • Rip off stuff from hotels, also Kmart and local stores.
      • Put toiletries ziplock near door to other bag so easy to get out for security.
      • Leave packing cell in Hotel when you go out
    • Learn to Ranger roll socks and shirts etc.
  • 6 weeks worth of stuff
    • In the US you can have huge carry-on
    • Packs 2 weeks worth of clothes
    • Minaal Bag (expensive but cool).
    • Schnozzel bag – Vacuum pack clothing bag
  • Airlines allow 1 carryon bag up to 7 kgs + 1 bag for other items (heavy stuff can go into that)
  • Pick multi-color packing cells so you can color-code them.
  • Elizabeth Holmes and Matilda Kahl and Steve Jobs all wear same stuff every day.
  • Wear Ballet Heals on the plane
  • Women: no more than 2 pairs of shoes ever, one of which must be good for walking long distances
  • Always be charging

 Show us your stack

  • I was running this session so didn’t take any notes.
  • We had people from about 5 companies give a quick overview of some stuff they are running.
  • A bit of chat beforehand
  • Next year if I do this I probably need to do 5 minutes time limits for everyone

Close from Rochelle

  • Thanks to Sponsors
  • Thanks to Panellists
  • Thanks to catering and volunteer teams
  • Will be back in 2016

 
