Linux.conf.au 2016 – Sysadmin Miniconf – Session 1

Is that a Cloud in you packet – Steven Ellis

  • What if you could have a demo of a stack on a phone
  • or on a memory stick or a mini raspberry-pi type PC
  • Nested Virtualisation
  • Hardware
    • Using Linux as host env, not so good on Win and Mac
    • Thinkpad, fedora or Centos, 128GB SSD
  • Nested Virtualisation
    • Huge perforance boost over qemu
    • Use SSD
    • enable options in modules kvm-intel or kvm-amd
    • Confirm SSD perf 1st – hdparm -t /dev/sdX
    • Create base env for VMs, enable vmx in features
    • Make sure it uses a different network so doesn’t badly interact with ones further out
  • Think LVM
    • Creat ethin pool for all envs
    • Think on lvm ” issue_discards = 1 “
  • Base image
    • Doesn’t have to be minimal
    • update the base regularly
    • How do you build your base image?
      • Thin may go weirdly wrong
      • Always use kickstart to re-create it.
    • Think of your use case, don’t skim on the disk (eg 40G disk for image)
    • ssh keys, Enable yum cache
    • Patch once kicked
    • keep a content cache, maybe with rsync or mrepo
  • Turn off VM and hen use fsrim and partx to make it nice and smaller.
  • virt-manager can’t manage thin volumes, DONT manually add the path
  • use virsh to manually add the path.
  • snapshots or snapshots great performance on SSD
  • Thin longer activates automatically on distros
  • packstack simple way to install simple openstack setup
  • LVM vs QCOW
    • qcow okay for smaller images
    • cloud-init with atomic
    • do not snapshot a qcow image when it is running

Revisiting Unix principles for modern system automation – Martin Krafft

  • SSH Botnet
  • OSI of System Automation
  • Transport unix style, both push and pull
  • uses socat for low level data moving
  • autossh <- restarts ssh connection automatically
  • creates control socket

A Gentle Introduction to Ceph – Tim Serong

  • Ceph gives a storage cluster that is self healing and self managed
  • 3 interfaces, object, block, distributed fs
  • OSD with files on them, monitor nodes
  • OSD will forward writes to other replics of the data
  • clients can read from any OSD
  • Software defined storage vs legacy appliances
  • Network
    • Fastest you can, seperate public and cluster networks
    • cluster fatsre than public
  • Nodes
    • 1-2G ram per TB of storage
    • read recomendations
  • SSD journals to cache writes
  • Redundancy
    • Replications – capacity impact but usually good performance
    • Erasure coding – Like raid – better space efficiency but impact in most other areas
  • Adding more nodes
    • tends to work
    • temp impact during rebalancing
  • How to size
    • understand you workload
    • make a guess
    • Build a 10% pilot
    • refine to until perf is achieved
    • scale up the the pilot

Keeping Pinterest running – Joe Gordon

  • Software vs service
    • No stable versions
    • Only one version is live
    • Devs support their own service – alligns incentives, eg monitoring built in
    • Testing against production traffic
  • SRE at Pinterest
    • Like a pit crew in F1
    • firefighting at scale
    • changing tires while moving
  • Operation Maturity
  • Operation Excellence
    • Have the best practices, docs, process, imporvements
    • Repeatable deploys
  • Visability
    • data driven company
    • Lots of Time series data – TSDB
    • Using ELK
  • Deployments
    • no impact to end user
    • easy to do, every few minutes
  • Canary vs Staging
    • Send dark (copies) of traffic to canary box without sending anything back to user
    • Bounce back to starting if problems
  • Teletran
    • Rollback, hotfix, rolling deploy, starting and testing, visibility and useability
    • client-server model
    • pre/post download, restart, etc scripts included with every deployment
    • puase/resume various testing
  • Postmortums and Production readyness reviews
  • Cloud is not infinite, often will hit AWS capacity limits or even no avaialble stuff in the region
  • Need to be able to make sure you know what you are running and if it i seffecintly used
  • Open sourced tools
    • mysql_utils – lots of tools to manage many DBs
    • Thrift tools
    • Teletraan – open sourced in Feb 2016
    • github.com/pinterest
Share