Linux.conf.au 2017 – Wednesday – Session 2

400,000 ephemeral containers: testing entire ecosystems with Docker – Daniel Axtens

  • A pretty interesting talk. It was largely a demo so I didn’t grab many notes

Community Building Beyond the Black Stump – Josh Simmons

  • How to build communities when you don’t live in a big city
  • Whats in a meetup?
  • Santa Rosa County, north of San Franscisco
    • Not easy to get to SF
    • SF meetups not always relevant
  • After meeting with one other person, created “North Bay web Professionals”, minimal existing groups
  • Multidisciplinary community worked better
    • Designers, Marketers, Web Devs, writers, etc
    • Hired each other
    • Seemed to work better, fewer toxic dynamics
    • Safe space for beginners
  • 23 People at first event (worked hard to tell people)
    • Told everyone that we knew even if not interested
    • Contacted the competitors
    • Contacting firms, schools
    • Co-working spaces (formal of de-facto like cafes)
    • Other meetup groups, even in unrelated areas.
  • Adapting to the needs of the community
    • You might have a vision
    • But you must adapt to who turns up and what they want/need
  • First meeting
    • Asked people to bring food
    • Fluffy start time so could greet people and mingle
    • Went round room and got people to introduce themselves
      • Intro ended up being a thing they always did
      • Helped people remember names
      • Got everyone to say a little
      • put people in a social mindset
    • Framework for events decided
    • Decided on next meeting date, some prep
    • Ended up going late
      • Format became. Social -> talk -> Social on each night.
  • Tools
    • Used facebook and meetup
    • 1/3 of people came just from meetup promoting automatically
    • Go where people already are
  • Renamed from “North Bay Web professions” to “North Bay Web and Interactive Media professionals”
  • “Ask a person, not a search engine”
  • Hosted over 169 events – Core was the monthly meeting
    • Tried to keep the topics a little broad
    • Often the talk was narrow but compensated with a broad Q&A afterwards
  • Thinking of people as “members” not “attendees” , have to work at getting them come back
  • Also hosted
    • Lunches, rotated all around the region so eventually near everywhere, Casual
    • Unconfernces
    • Topical meetups
    • Charity Hackathon, teamed up with students and non-profits to do website for non-profit. Student was an apprentice.
    • Hosted Ag+Tech mixers with local farmers groups
    • Helped local cities put out tech RFPs
  • Q: Success measures? A: Survey of member, things like Job referrals, what have learnt

 

 

Linux.conf.au 2017 – Wednesday – Session 1

Servo Architecture: Safety and Performance – Jack Moffitt

  • History
    • 1994 Netscape Navigator
    • 2002 Mozilla Release
    • 2008 multi-core CPU stuff not making firefox faster
    • 2016 CPUs now have on-chip GPUs
    • Very hard to write multi-threaded C++ to allow mozilla to take advantage of many cores
  • How to make Servo Faster?
  • Constellation
    • In the past – Monolithic browser engines
      • Single browser engine handling multiple tabs
      • Two processes – Pool Content processes vs Chrome process
        • If one process dies on a page doesn’t take out whole browser
      • Sanboxing lets webpage copies have less privs
    • Threads
      • Less overhead than whole processes
      • Thread per page
      • More responsive
      • Sandboxing
      • More robust to failure
    • Is this the best we can do?
      • Run Javascript and layout simultaniously
      • Pipeline splitting them up
      • Child pipelines for inner iframes (eg ads)
  • Constellation
    • Rust can fail better
    • Most failures stop at thread boundaries
    • Still do sandbox and privledges
    • Option to still have some tabs in multiple processes
  • Webrender
    • Using the GPU
      • Frees up main CPU
      • Are VERY fast at some stuff
      • Easiest place to start is rendering
    • Don’t browsers already use the GPU?
      • Only in a limited way for compositing
    • Key ideas
      • Retain mode not immediate mode (put things in optimal order first)
      • Designed to render CSS content (CSS is actually pretty simple)
      • Draw the whole frame every frame (things are fast enough, simpler to not try to optimise)
    • Pipeline
      • Chop screen into 256×256 tiles
      • Tile assignment
      • Create a big tree
      • merge and assign render targets
      • create and execute batches
    • Text
      • Rasterize on CPU and upload glyth to GPU
      • Paste and shadow usign the GPU
  • Project Quantum
    •  Taking technology we made in servo and put it in gecko
  • Research in progress
    • Pathfinder – GPU font rasterizer – Now faster than everything else
    • Magic DOM
      • Wins in JS/DOM intergration
      • Fusing reflectors and DOM objects
      • Self hosted JS
    • External colaborations: ML, Power Mngt, WebBluetooth, etc
  • Get involved
    • Test nightlies
    • Curated bugs for new contributors
    • servo.org

In Case of Emergency: Break Glass – BCP, DRP, & Digital Legacy – David Bell

  • Definitions
    • BCP = Business continuity Plan
    • A process to prevent and recover from business continuity plans
    • BIP = Business interuptions plan
    • BRP = Recovery plan
    • RPO = Recovery point objective, targetted recovery point (when you last backed up)
    • RTO = Recovery time objective
  • Why?
    • Because things will go wrong
    • Because things should not go even more wrong
  • Create your BCP
    • Brainstorm
    • Identify events that may interrupt, loss access to physical site, loss of staff
    • Backups
      • 3 copies
      • 2 different media/formats
      • 1 offsite and online
      • Check how long it will take to download or fetch
    • Test
    • Who has the Authority
    • Communication chains, phone trees, contact details
    • Practice Early, Practice often
      • Real-world scenarios
      • Measure, measure, measure
      • Record your results
      • Convert your into an action item
      • Have different people on the tests
    • Each Biz Unit or team should have their own BCP
    • Recovery can be expensive, make sure you know what your insurance will cover
  • Breaking the Glass
    • Documentation is the Key
    • Secure credentials super important
    • Shamir secret sharing, need number of people to re-create the share
  • Digital Legacy
    • Do the same for your personal data
    • Document
      • Credentials
      • Services
        • What uses them
        • billing arrangments
        • Credentials
      • What are your wishes for the above.
    • Talk to your family and friends
    • Backups
    • Document backups and backup your documentation
    • Secret sharing, offer to do the same for your friends
  • Other / Questions
    • Think about 2-Facter devices
    • Google and some others companies can setup “Next of Kin” contacts

 

 

Linux.conf.au 2017 – Wednesday Keynote – Dan Callahan

Designing for failure: On the decommissioning of Persona

  • Worked for Mozilla on Persona
  • Persona did authentication on the web
    • You would go to a website
    • Type in your email address
    • Redirects via login page by your email provider
    • You login and redirect back
  • Started centralised, designed to be uncentralised as it is taken up
  • Some sites were only offering login via social media
    • Some didn’t offer traditional logins for emails or local usernames
    • Imposes 3rd party between you and your user.
    • Those 3rd parties have their own rules, eg real name requirements
  • Persona Failed
    • Traditional logins now more common
  • Cave Diving
    • Equipment and procedures designed to let you still survive if something fails
    • Training review deaths and determines how can be prevented
    • “5 rules of accident analysis” for cave diving
  • Three weeks ago switched off Persona
    • Encourage others to share mistakes

 

  • Just having a free license is not enough to succeed
  • Had a built in centralisation point
    • Protocol designed so browser could eventually natively implement but initially login.persona.com was using it.
    • Relay between provider and website went via Mozilla until browser natively implemented
    • No ability to fork the project
  • Bits rot more quickly online
    • Stuff that is online must be continually maintain (especially security)
    • Need a way to have software maintained without experts
  • Complexity Limits agency
    • Limits who can run project at all
    • Lots of work for those people who can run it
  • A free license don’t further my feeedom if we can’t run the software

 

  • Prolong Your Project’s Life
  • Bad ideas
    • We used popups and people reflexively closed them
    • API wasn’t great
  • Didn’t measure the right thing
    • Is persona product or infrastructure?
    • Treated like a product, not a good fit
  • Explicitly define and communicate your scope
    • “Solves authentication” or “Authenticate email addresses”
    • Broke some sites
    • Got used by FireFoxOS which was not a good fit
  • Ruthlessly oppose complexity
    • Tried to do too much mean’t it was overly complex
    • Complex hard to maintain and review and grow
    • Hard for newbies to join
    • If it is complex then it is hard to even test that is is working as expected
    • Focus and simplify
    • Almost no outside contributors, especially bad when mozilla dropped it.

 

  • Plan for Your Projects Failure
  • “Sometimes that [bus failure] is just a commuter bus that picks up that person and takes them to another job”
  • If you know you are dead say it
    • 3 years after we pulled people off project till officially killed
    • Might work for local software but services cost money to run
    • Sooner you admit you are dead the sooner people can plan to your departure
  • Ensure your users can recover without your involvement
    • Hard to do when you think your project is going to save the world
    • Example firefox sync has a copy of the data locally so even if it dies user will survive
  • Use standard data formats
    • eg OPML for RSS providers
  • Minimise the harm caused when your project goes away

 

Linux.conf.au 2017 – Tuesday – Session 3

The Internet of Scary Things – tips to deploy and manage IoT safely Christopher Biggs

  • What you need to know about the Toaster Apocalypse
  • Late 2016 brought to prominence when major sites hit by DDOS from compromised devices
  • Risks present of grabbing images
    • Targeted intrusion
    • Indiscriminate harvesting of images
    • Drive-by pervs
    • State actors
  • Unorthorized control
    • Hit traffic lights, doorbells
  • Takeover of entire devices
    • Used for DDOS
    • Demanding payment for the owner to get control of them back.
  • “The firewall doesn’t divide the scary Internet from the safe LAN, the monsters are in the room”

 

  • Poor Security
    • Mostly just lazyness and bad practices
    • Hard for end-users to configure (especially non-techies)
    • Similar to how servers and Internet software, PCs were 20 years ago
  • Low Interop
    • Everyone uses own cloud services
    • Only just started getting common protocols and stds
  • Limited Maint
    • No support, no updates, no patches
  • Security is Hard
  • Laziness
    • Threat service is too large
    • Telnet is too easy for devs
    • Most things don’t need full Linux installs
  • No incentives
    • Owner might not even notice if compromised
    • No incentive for vendors to make them better

 

  • Examples
    • Cameras with telenet open, default passwords (that can not be changed)
    • exe to access
    • Send UDP to enable a telnet port
    • Bad Mobile apps

 

  • Selecting a device
    • Accept you will get bad ones, will have to return
    • Scan your own network, you might not know something is even wifi enabled
    • Port scan devices
    • Stick with the “Big 3” ramework ( Apple, Google, Amazon )
    • Make sure it supports open protocols (indicates serious vendor)
    • Check if open source firmward or clients exists
    • Check for reviews (especially nagative) or teardowns

 

  • Defensive arch
    • Put on it’s own network
    • Turn off or block uPNP opening firewall holes
    • Plan for breaches
      • Firewall rules, rate limited, recheck now and then
    • BYO cloud (dont use the vendor cloud)
      • HomeBridge
      • Node-RED (Alexa)
      • Zoneminder, Motion for cameras
  • Advice for devs
    • Apple HomeKit (or at least support for Homebridge for less commercial)
    • Amazon Alexa and AWS IoT
      • Protocols open but look nice
    • UCF uPnP and SNP profiles
      • Device discovery and self discovery
      • Ref implimentations availabel
    • NoApp setup as an alternative
      • Have an API
    • Support MQTT
    • Long Term support
      • Put copy of docs in device
      • Decide up from what and how long you will support and be up front
    • Limit what you put on the device
      • Don’t just ship a Unix PC
      • Take out debug stuff when you ship

 

  • Trends
    • Standards
      • BITAG
      • Open Connectivity founddation
      • Regulation?
    • Google Internet of things
    • Apple HomeHit
    • Amazon Alexa
      • Worry about privacy
    • Open Connectivity Foundation – IoTivity
    • Resin.io
      • Open source etc
      • Linux and Docket based
    • Consumer IDS – FingBox
  • Missing
    • Network access policy framework shipped
    • Initial network authentication
    • Vulnerbility alerting
    • Patch distribution

Rage Against the Ghost in the Machine – Lilly Ryan

  • What is a Ghost?
    • The split between the mind and the body (dualism)
    • The thing that makes you you, seperate to the meat of your body
  • Privacy
    • Privacy for information not physcial
    • The mind has been a private place
    • eg “you might have thought about robbing a bank”
    • The thoughts we express are what what is public.
    • Always been private since we never had technology to get in there
    • Companies and governments can look into your mind via things like your google queries
    • We can emulate the inner person not just the outer expression
  • How to Summon a Ghost
    • Digital re-creation of a person by a bot or another machine
    • Take information that post online
    • Likes on facebook, length of time between clicks
  • Ecto-meta-data
    • Take meta data and create something like you that interacts
  • The Smartphone
    • Collects meta-data that doesn’t get posted publicly
    • deleted documents
    • editing of stuff
    • search history
    • patten of jumping between apps
  • The Public meta-data that you don’t explicitly publish
    • Future could emulate you sum of oyu public bahavour
  • What do we do with a ghost?
    • Create chatbots or online profiles that emulate a person
    • Talk to a Ghost of yourself
    • Put a Ghost to work. They 3rd party owns the data
    • Customer service bot, PA
    • Chris Helmsworth could be your PA
    • Money will go to facebook or Google
  • Less legal stuff
    • Information can leak from big companies
  • How to Banish a Ghost
    • Option to donating to the future
    • currently no regulation or code of conduct
    • Restrict data you send out
      • Don’t use the Internet
      • Be anonymous
      • Hard to do when cookies match you across many sites
        • You can install cookie blocker
    • Which networks you connect to
      • eg list of Wifi networks match you with places and people
      • Mobile network streams location data
      • location data reveals not just where you go but what stores, houses or people you are near
      • Turn off wifi, bluetooth or data when you are not using. Use VPNs
    • Law
      • Lobby and push politicians
      • Push back on comapnies
    • For technologiest
      • Collect the minimum, not the maximum

FreeIPA project update (turbo talk) – Fraser Tweedale

  • Central Identity manager
  • Ldap + Kerberos, CA, DNS, admin tools, client. Hooks into AD
  • NAnage via web or client
  • Client SSSD. Used by various distros
  • What is in the next release
    • Sub-CAs
    • Can require 2FA for important serices
    • KDC Proxy
    • Network bound encryption. ie Needs to talk to local server to unencrypt a disk
    • User Session recording

 

Minimum viable magic

Politely socially engineering IRL using sneaky magician techniques – Alexander Hogue

  • Puttign things up your sleeve is actually hard
  • Minimum viable magic
  • Miss-direct the eyes
  • Eyes only move in a straight line
  • Exploit pattern recognition
  • Exploit the spot light
  • Your attention is a resource

Linux.conf.au 2017 – Tuesday – Session 2

Stephen King’s practical advice for tech writers – Rikki Endsley

  • Example What and Whys
    • Blog post, press release, talk to managers, tell devs the process
    • 3 types of readers: Lay, Managerial, Experts
  • Resources:
    • Press: The care and Feeding of the Press – Esther Schindler
    • Documentation: RTFM? How to write a manual worth reading

 

  • “On Writing: A memoir of the craft” by Stephen King
  • Good writing requires reading
    • You need to read what others in your area or topic or competition are writing
  • Be clear on Expectations
    • See examples
    • Howto Articles by others
    • Writing an Excellent Post-Event Wrap Up report by Leslie Hawthorn
  • Writing for the Expert Audience
    • New Process for acceptance of new modules in Extras – Greg DeKoenigserg (Ansible)
    • vs Ansible Extras Modules + You – Robyn Bergeon
      • Defines audience in the intro

 

  • Invite the reader in
  • Opening Line should Invite the reader to begin the story
  • Put in an explitit outline at the start

 

  • Tell a story
  • That is the object of the exercise
  • Don’t do other stuff

 

  • Leave out the boring parts
  • Just provides links to the details
  • But sometimes if people not experts you need to provide more detail

 

  • Sample outline
    • Intro (invite reader in)
    • Brief background
    • Share the news (explain solution)
    • Conclude (include important dates)

 

  • Sample Outline: Technical articles
  • Include a “get technical” section after the news.
  • Too much stuff to copy all down, see slides

 

  • To edit is divine
  • Come back and look at it afterwards
  • Get somebody who will be honest to do this

 

  • Write for OpenSource.com
  • opensource.com/story

 

  • Q: How do you deal with skimmers?   A: Structure, headers
  • Q: Pet Peeves?  A: Strong intro, People using “very” or “some” , Leaving out import stuff

 

 

Linux.conf.au 2017 – Tuesday Session 1

Fishbowl discussion – GPL compliance Karen M. Sandler

  • Fishbowl format
    • 5 seats at front of the room, 4 must be occupied
    • If person has something to say they come up and sit in spare chair, then one existing person must sit down.
  • Topics
    • Conflicts of Law
    • Mixing licences
    • Implied warrenty
    • Corporate Procedures and application
    • Get knowledge of free licences into the law school curriculum
  • “Being the Open Source guy at Oracle has always been fun”
  • “Our large company has spent 2000 hours with a young company trying to fix things up because their license is not GPL compliant”
  • BlackDuck is a commercial company will review your company’s code looking for GPL violations. Some others too
    • “Not a perfect magical tool by any sketch”
    • Fossology is alternative open tool
    • Whole business model around license compliance, mixed in with security
    • Some of these companies are Kinda Ambulance chasers
    • “Don’t let those companies tell you how to tun your business”
    • “Compliance industry complex” , “Compliance racket”
  • At my employer with have a tool that just greps for a “GPL” license in code, better than nothing.
  • Lots of fear in this area over Open-source compliance lawsuits
    • Disagreements in community if this should be a good idea
    • More, Less, None?
    • “As a Lawyer I think there should definitely be more lawsuits”
    • “A lot of large organisations will ignore anything less than [a lawsuit] “
    • “Even today I deal with organisations who reference the SCO period and fear widespread lawsuits”
  • Have Lawsuits chilled adoption?
    • Yes
    • Chilled adoption of free software vs GPL software
    • “Android has a policy of no GPL in userspace” , “they would replace the kernel if they could”
    • “Busybox lawsuits were used as a club to get specs so the kernel devs could create drivers” , this is not really applicable outside the kernel
    • “My goal in doing enforcement was to ensure somebody with a busybox device could compile it”
    • “Lawyers hate any license that prevents them getting future work”
    • “The amount of GPL violations skyrocketed with embedded devices shipping with Linux and GPL software”
  • People are working on a freer (eg “Not GPL”) embeded stack to replace Android userspace: Toybox, Toolbox, No kernel replacement yet.
  • Employees and Compliance
    • Large company helping out with charities systems unable to put AGPL software from that company on their laptops
    • “Contributing software upstream makes you look good and makes your company look good” , Encourages others and you can use their contributions
    • Work you do on your volunteer days at company do not fill under software assignment policy etc, but they still can’t install random stuff on their machines.
  • Website’s often are not GPL compliance, heavy restrictions, users giving up their licenses.
  • “Send your lawyers a video of another person in a suit talking about that topic”

U 2 can U2F Rob N ★

  • Existing devices are not terribly but better than nothing, usability sucks
  • Universal Two-Factor
    • Open Standard by FIDO alliance
    • USB, NFC, Bluetooth
    • Multiple server and host implimentations
    • One device multi-sites
    • Cloning protection
  • Interesting Examples
  • User experience: Login, press the button twice.
  • Under the hood a lot more complicated
    • Challenge from site, send must sign challenge (including website  url to prevent phishing site proxying)
    • Multiple keypairs for each website on device
    • Has a login counter on the device included in signature, so server can panic then counter gets out of sync from a cloned device
  • Attestation Certificate
    • Shared across model or production batch
  • Browserland
    • Javascript
    • Chrome-based support are good
    • Firefox via extension (Native “real soon now”)
    • Mobile works on Android + Chrome + Google Authenticator

Linux.conf.au 2017 – Tuesday Keynote – Pia Waugh

BTW: Conference Streams are online at linux.conf.au/stream

The Future of Humans – Pia Waugh

At a tipping point, we can’t reinvent everything or just do the past with shinny new things.

Started as a Sysadmin, helped her see things as Systems

Trying to make active choices about the future we want,

  • Started building tools, knowledge spread slowly
  • Created cities, people could specialise, knowledge faster
  • Surplus created, much went to rulers, sometimes rulers overthrown, but hierarchy started the same
  • More recently the surplus has got given to people
  • Last 250 years, people have seen themselves as having power, change their future, not just be a peasant.
  • As resources have increased power and resources have been distributed more widely
  • This has kept expanding, – overthrown you boss at work
  • We are on the cusp on a massive skyrocket in quality of live

 

  • Citizens have powers now that we previously centralized
  • We are now in a time of suplus not scaricity
  • Small groups and individual can now disrupt a country, industry or company
  • We made up all of our society, we can make it again to reflect the present not what was needed in the past.
  • Choose our own adventure or let others choose it for us. We have the option now that we didn’t previously
  • Most people’s eyes glaze over when they here that.
  • “You can’t do that” say many people when they find out what software can do.
  • People switch off their creativity when they come to work.

How Could the World be better

  • Property
    • 3D printing could print organs, food, just about anything
    • Why are we protecting business models that are already out of date (eg copyright) when we couple use them to eliminated scarcity
  • Work and Jobs
    • Everybody is scared about technology taking jobs
    • What do we care about the lose of jobs
    • Why is the value of a person defined by a full-time jobs?
  • Transhumanism
    • tatoos, peicing have been around forever
    • Obsession with the human “normal” , is this a recent thing from the media?
    • Society encourages people towards the Norm
    • Internet has demonstrated that not everybody is normal – Rule 34
    • “If you lose a leg, instead of getting a replacement leg, whey not have seven legs?”
    • Anyone who doesn’t make our definition of Normal is seen as something less even if they have amazing abilities
  • Spaceships
    • Still takes a day to get around the planet
    • If we are going to set up new worlds how are they going to run?
  • Global Citizenship
    • People are seen though the lens of their national citizenship
    • Governments are not the only representative of our rights

 

  • “How can we build a better world? Luckily we have git”
  • We have the power and knowledge to do things, but not all people do
  • If you are as powerful as the tools you use, where does that leave people who can’t use computers or program?

 

  • Systemic Change
    • What doesn’t you Doctor say about “scratching your itch” ?
    • Example: “diversity” , how do we deal with the problems that led us to not having it.
  • Who are you building for? Not building for?
  • What is the default position in society? Is it to no get knowledge, power?
  • What does human mean to you
  • Waht do we value
  • What assumptions and bias do you have?
  • How are you helping non-geeks help themselves
  • What future do you want to see?

 

  • How are Systems changing? How do out policies, assumptions laws reflect the older way?
    • Scarcity -> Surplus
    • Close -> Open
    • Centralise -> Distributed
    • Belief -> Rationalism
    • Win/Lose -> Cooperative competitive
    • Nationalism -> World Citizen
    • Normative Human -> Formative Human
  • I believe the Open Source Culture is a good model for society
  • But in Inventing the future we have to be careful not to drag the legacy systems and values from the past.

2017 SysAdmin Miniconf – Session 3

Turtles all the way down – Thin LVM + KVM tips and Tricks – Steven Ellis

  • ssd -> partition -> encryption -> LVM -> [..] -> filesystem
  • Lots of examples see the online Slides
  • https://github.com/steven-ellis/ansible-playpen

Samba and the road to 100,000 user – Andrew Bartlett

  • Release cycle is every 6 months
  • Samba 4.0 is 4 years p;d
  • 4.2 and older are out of security support by Samba team (support by distros sometimes)
  • Much faster adding users to AD DC. 55k users added in 50 minutes
  • Performance issues, not bugs, are now the biggest area of work
    • Customer deploying SAMBA at scale
  • Looking for Volunteers running AD will to run a tshark script
    • What does your busy hour look like?
    • What is the pattern of requests?

The School for Sysadmins Who Can’t Timesync Good and Wanna Learn To Do Other Stuff Good Too – Paul Gear

  • Aim is 1-10ms accuracy
  • Using Standard Linux reference distribution etc
  • Why care
    • Same apps need time sync
    • Log matching
  • Network Time Foundation needs support
  • NTP
    • Not widely understood
    • Unglamorous
    • Daunting documentation
    • old protocol, chequered secrity history
    • The first Google result may not be accurate
  • Set clock
    • step – jump clock to new time
    • slew – gradually adjust the time
  • NTP Assumption
    • The is one true time – UTC
    • Nobody really has it
    • bad time servers may be present
    • networks change

I ran out of power on my laptop at this point so not many more notes. Paul gave a very good set of recommendations and myth-busting for those running NTP though. His notes will be online on the Sysadmin Miniconf site and he has also posted more detail online.

2017 Sysadmin Miniconf – Session 2

Running production workloads in a programmable infrastructure – Alejandro Tesch

Managing performance parameters through systemd – Sander van Vugt

  • Mostly Demos in this talk too.
  • Using CPUShare parameter as an example
  • systemd-cgtop and systemd-cgls
  • “systemctl show stress1.service” will show available parameters
  • “man 5 systemd.resource-control” gives a lot more details.

Go for DevOps – Caskey L. Dickson

  • SideBar: The Platform Wars are over
    • Hint: We all won
    • As long as have an API we are all cool
  • Always builds staticly linked binaries, should work on just about any Linux system. Just one file.
  • Built in cross compiler (eg for Windows, Mac) via just enviroment variable “GOOS=darwin” and 32bit “GOARCH=32”
  • Bash is great, Python is great, Go is better
  • Microservices are Services
  • No Small Systems
    • Our Scripts are no longer dozens of lines long, they are thousands of lines long
    • Need full software engineering
  • Sysops pushing buttons and running scripts are dying
  • Platform Specific Code
    • main_linux.go main_windows.go and compiler find.
    • // +build linux darwin     <– At the top of the file
  • “Once I got my head around channels Go really opened up for me”

 

2017 Sysadmin Miniconf – Session 1

The Opposite of the Cloud – Tom Eastman

  • Korinates Data gateway – an appliance onsite at customers
  • Requirements
    • A bootable images ova, AMI/cloud images
    • Needs network access
    • Sounds like an IoT device
  • Opoossite of cloud is letting somebody outsource their stuff onto your infrastructure
  • Tom’s job has been making a nice and tidy appliance
  • What does IoT get wrong
    • Don’t do updates, security patches
    • Don’t treat network as hostile
    • Hard to remotely admin
  • How to make them secure
    • no default or static credentials
    • reduce the attack surface
    • secure all networks comms
    • ensure it fails securely
  • Solution
    • Don’t treat appliances like appliances
    • Treat like tightly orchestrated Linux Servers
  • Stick to conserative archetecture
    • Use standard distribution like Debian
    • You can trust the standard security updates
  • Solution Components
    • aspen: A customized Debian machine image built with Packer
    • pando: orchestration server/C&C network
    • hakea: A Django/Rest microservice API in charge
  • saltstack command and control
    • Normal orchestration stuff
    • Can works as a distributed command execution
    • The minions on each server connect to the central node, means you don’t need to connect into a remote appliance (no incoming connections needed to appliance)
    • OpenVPN as Internet transport
    • Outgoing just port 443 and openvpn protocol. Everything else via OpenVPN
  • What is the Appliance
    • A lightly mangled Debian Jessie VM image
    • Easy to maintain by customer, just reboot, activate or reinstall to fix any problems.
    • Appliance is running a bunch of docker containers
  • Appliance authentication
    • Needs to connect via 443 with activation code to download VPN and Salt short-lived certificates to get started
    • Auth keys only last for 24 hours.
    • If I can’t reach it it kills itself.
  • Hakea: REST control
    • Django REST framework microservices
    • Self documenting using DRF amd CoreAPI Schema
  • DevOps Principals apply beyonf the cloud

Inventory Management with Pallet Jack – Karl-Johan Karlsson

  • Goals
    • Single source of truth
    • Version control
    • Scaleable (to around 1000 machines, 10k objects)
  • Stuff stored as just a file structure
  • Some tools to access
  • Tools to export, eg to kea DHCP config
  • Tools as post-commit hooks for git. Pushes out update via salt etc
  • Various Integrations
    • API
    • Salt

Continuous Dashboard – You DevOps Airbag – Christopher Biggs

  • Dashboard traditionally targeted at OPs
  • Also need to target Devs
    • KPIs and
  • Sales and Support need to know everything to
  • Management want reassurance, Shipping a new feature, you have a hotline to the CEO
  • Customer, do you have something you are ashamed of?
    • Take notice of load spikes
    • Assume customers errors are being acted on, option to notify then when a fix happens
    • What is relivant to support call, most recent outages affecting this customer
    • Remember recent behavour of this customer
  • What kinds of data?
    • Tradditionally: System load indicators, transtion numbers etc
    • Now: Business Goals, unavoidable errors, spikes of errors, location of errors, user experience metrics, health of 3rd party interfaces, App and product reviews
  • What should I put in dashboards
    • Understand the Status-quo
    • Continuously
    • Look at trends over time and releases
    • Think about features holisticly
  • How to get there
    • Like you data as much as your code
    • Experiment with your data
    • tools: nodered.org, blynk.cc, elastic
  • Insert Dashboards into your dev pipeline
    • Code Review, CI, Unit Test, Confirm that alarms actually work via test errors
    • Automate deployment
  • Tools
    • ELK – off the shelf images, good import/export
    • Node-RED – Flow based data processing, nice visual editor, built in dashboarding
    • Blynk – Nice dashboards in Ios or Android. Interactive dashboard editor. Easy to share
  • Social Media integration
    • Receive from twitter, facebook, apps stores reviews
    • Post to slack and monitoring channels
    • Forward to internal groups

The Sound of Silencing – Julien Goodwin

  • Humans know to ignore “expected” alerts during maintenance
    • Hard to know what is expected vs unexpected
    • Major events can lead to alert overload
  • Level 1 – Turn it all off
    • Can work on small scale
  • Level 2 – Turn off a localtion while working on it.
    • What if something happens while you are doing the work?
    • May work with single-service deployments
  • Level 3 – Turn off the expect alerts
    • Hard to get exactly right
  • Level 4 – Change mngt integration
    • Link the generator up to th change mngt automation system
    • What about changes too small to track?
    • What about changes too big for a simple silence?
  • Level 5 – Inhibiting Alerts
    • Use Service level indigations to avoid alerts on expected failures
    • Fire “goes nowhere” alert
  • Level 6 – Global monitoring and preventing over-siliencing
    • Alert if too many sites down
    • Need to have explicit alerts to spot when somebody silences “*”
  • How to get there from here
    • Incrementally
    • Choose a bad alert and change it to make it better
    • Regularly