Linux.conf.au 2017 – Conference Opening

  • Wear SunScreen
  • Karen Sandler introduces Outreachy and it is announced as the raffle cause for 2017
  • Overview of people
    • 462 From Aus
    • 43 from NZ
    • 62 From USA
    • Lots of other countries
    • Gender breakdown: lots of no answers, so the stats are a bit rough
  • Talks
    • 421 Proposals
    • 80-ish talks and 6 tutorials
    • Questions
      • Please ask questions during the question time
  • Looking for Volunteers – look at a session and click to signup
  • Keynotes – A quick profile
  • All the rooms are booked till 11pm for BOF sessions
  • Lightning talks, Coffee, Lunch, dinners

 

 


Passengers vs “50 Girls 50”

Spoilers: Minor for Passengers, Major for 50 Girls 50.

In late 2016 the movie “Passengers” came out, starring Jennifer Lawrence and Chris Pratt. The movie is set aboard a sleeper spaceship and the plot centers around the two lead characters waking up early. I won’t say more about the movie but there is a summary of the plot in the Wikipedia entry for the movie. You can compare it to the comic below to see the similarities and differences.

When I first saw the trailer it reminded me of a Sci-Fi comic I read years ago. Others noticed the similarity too and identified the comic as “50 Girls 50” by Al Williamson. I couldn’t find a summary of the short story so I thought I’d write it up here.

50 Girls 50 by Al Williamson – Plot summary

The story is a 6 page comic with one-off characters, originally published in 1953. It is set in the distant future aboard a spaceship making humanity’s first journey to a nearby star. Since the trip will take 100 years the crew/passengers of 50 women and 50 men (hence the title) will be frozen for the whole journey. However the freezing technology used only works on a person once: if you attempt to refreeze somebody they will die.

The plot of the story is partially told through flashbacks but I’ll tell it in chronological order.

The main character is Sid, who before the voyage starts is attracted to one of the other passengers, Wendy. Wendy notices his attraction and they get together. After a time Wendy has a proposition for him. She suggests that Sid sabotage the Deep-freeze (D-F) units so that he wakes up early. He can then wake her up and they can wake up the others one at a time and “make them our slaves”.

Sid however has his own idea. What he wants to do is just have a series of girlfriends. He’ll set his clock for two years into the voyage. Then he will wake up Wendy and live with her for a while. When he gets tired of Wendy he will get rid of her and move to the next girl and so on.

Once the voyage starts things go according to Sid’s plan. He thaws out 2 years in but instead of waking up Wendy he decides to thaw out Laura first. He then pretends to Laura that they both accidentally thawed out.

“Almost a year” later he gets tired of Laura, shoots her with a “Paralyzer” gun and stuffs her back in a Freeze-chamber to die.

He then prepares to wake Wendy. First he sets the ship’s clock to say they will reach the destination in 3 years, to give him enough time to get tired of Wendy. Things don’t go according to plan however when Wendy wakes up:

Not really a happy ending for anyone, although it is not like Sid or Wendy really deserved one.


Donations 2016

Like last year I am doing all my charity donations at once and blogging about it. The theory with doing it all at once is that it is more efficient and less impulsive, while blogging about it might encourage others to do similarly. Note that all amounts are in $US.

I found one downside of doing it all at once (especially around midnight) is that my bank suspended my card for suspicious activity. All sorted out with a quick phone call though.

Once more this year I gave the majority of my money to those charities recommended by GiveWell. This year, instead of spreading my donation evenly among the top charities, I followed their recommendations (see the right sidebar on the link above).

Next were a series of Open Source projects, trying to concentrate on software I use:

And some tech content or advocacy groups

Additionally I gave some money to MSF via a campaign by Zeynep Tufekci highlighting Yemen

Hoping to do the same again next year, feel free to recommend other organizations you think might be a good place for me to donate towards. I’m thinking about


DevOpsDays Wellington 2016 – Day 2, Session 3

Ignites

Mrinal Mukherjee – How to choose a DevOps tool

Right Tool
– Does the job
– People will accept

Wrong tool
– Never-ending PoC
– Doesn’t do the job

How to pick
– Budget / Licensing
– Does it address your pain points?
– Learning cliff
– Community support
– API
– Enterprise acceptability
– Config in version control?

Central tooling team
– Pros: standardise, educate
– Cons: constant bottleneck, delays, stifles innovation, not in sync with teams

DevOps != Tool
Tools != DevOps

Tools facilitate it, they don’t define it.

Howard Duff – Eric and his blue boxes

Physical example of KanBan in an underwear factory

Lindsey Holmwood – Deepening people to weather the organisation

Note: Lindsey presents really fast so I missed recording a lot from the talk

His Happy, High performing Team -> He left -> 6 months later half of team had left

How do you create a resilient culture?

What is culture?
– Lots of research in organisation psychology
– Edgar Schein – 3 levels of culture
– Artefacts, Values, Assumptions

Artefacts
– Physical manifestations of our culture
– Standups, Org charts, desk layout, documentation
– actual software written
– Easiest to see and adopt

Values
– Goals, strategies and philosophies
– “we will dominate the market”
– “Management is available”
– “nobody is going to be fired for making a mistake”
– lived values vs aspirational values (People have a good nose for bullshit)
– Example: core values of Enron vs reality
– Work as imagined vs work as actually done

Assumptions
– beliefs, perceptions, thoughts and feelings
– exist on an unconscious level
– hard to discern
– “bad outcomes come from bad people”
– “it is okay to withhold information”
– “we can’t trust that team”
– “profits over people”

If we can change our people, we can change our culture

What makes a good team member?

Trust
– Vulnerability
– Assume the best of others
– Aware of their cognitive biases
– Aware of the fundamental attribution error (judge others by actions, judge ourselves by our intentions)
– Aware of hindsight bias. Hindsight bias is your culture killer
– When bad things happen explain in terms of foresight
– Regular 1:1s
– Eliminate performance reviews
– Willing to play devil’s advocate

Committing and acting
– Shared goal setting
– Don’t solutioneer
– Provide context about strategy, about desired outcome

What makes a good team?

Influence of hiring process
– Willingness to adapt and adopt working in new team
– Qualify team fit, tech talent then rubber stamp from team lead
– have a consistent script, but be prepared to improvise
– Everyone has the veto power
– If leadership is vetoing at the last minute, that’s a systemic problem with team alignment, not the system
– Benefit: team talks to candidate (without leadership present)
– Many different perspectives
– unblock management bottlenecks
– Risk: uncovering dysfunctions and misalignment in your teams
– Hire good people, get out of their way

Diversity and inclusion
– includes: race, gender, sexual orientation, location, disability, level of experience, work hours
– Seek out diverse candidates.
– Sponsor events and meetups
– Make job description clear you are looking for diverse background
– Must include and embrace differences once they actually join
– Safe mechanism for people to raise criticisms, and acting on them

Leadership and Absence of leadership
– Having a title isn’t required
– If the leader steps away things should continue working right
– Team is their own shit umbrella
– empowerment vs authority
– empowerment is giving permission from above (potentially temporary)
– authority is giving power (granting autonomy)

Part of something bigger than the team
– help people build up for the next job
– Guilds in the Spotify model
– Run them like meetups
– Get senior management to come and observe
– What we’re talking about is tech culture

We can change tech culture
– How to make it resist the culture of the rest of the organisation
– Artefacts influence behaviour
– Artifact: fast builds -> Value: make better quality
– Artifact: post incident reviews -> Value: Failure is an opportunity for learning

Q: What is a pre-incident review
A: Brainstorm beforehand (eg before a big rollout) what you think might go wrong if something is coming up
then afterwards do another review of what just went wrong

Q: what replaces performance reviews
A: One on ones

Q: Overcoming Resistance
A: Do it and point back at the evidence. Hard to argue with an artifact

Q: First step?
A: One on ones

Getting started: read these books by Patrick Lencioni:
– Silos, Politics and Turf Wars
– The Five Dysfunctions of a Team


DevOpsDays Wellington 2016 – Day 2, Session 2

Troy Cornwall & Alex Corkin – Health is hard: A Story about making healthcare less hard, and faster!

Maybe title should be “Culture is Hard”

@devtroy @4lexNZ

Working at HealthLink
– Windows running Java stuff
– Out of date and poorly managed
– Deployments manual, thrown over the wall by devs to ops

Team Death Star
– Destroy bad processes
– Change deployment process

Existing Stack
– VMware
– Windows
– Puppet
– PRTG

CD and CI Requirements
– Goal: Time to regression test under 2 mins, time to deploy under 2 mins (from 2 weeks each)
– Puppet too slow to deploy code in a minute or two. App deployment vs config management
– Couldn’t use containers on Windows (then), so not an option

New Stack
– VMware
– Ubuntu
– Puppet for Server config
– Docker
– Rancher

Smashed the 2 minute target!

But…
– We focused on the tech side and let the people side slip
– Windows shop, hard work even to get a Linux VM at the start
– Devs scared to run on Linux. Some initial deploy problems burnt people
– Lots of different new technologies at once all pushed to devs, no pull from them.

Blackout where we weren’t allowed to talk to them for four weeks
– Should have been a warning sign…

We thought we were ready.
– Ops was not ready

“5 dysfunctions of a team”
– Trust is at the bottom, we didn’t have that

Empathy
– We were aware of this, but didn’t follow through
– We were used to disruption but other teams were not

Note: I’m not sure how the story ended up, they sort of left it hanging.

Pavel Jelinek – Kubernetes in production

Works at Movio
– Software for Cinema chains (eg Loyalty cards)
– 100 million emails per month, millions of SMS and push notifications (fewer pushes because people hate those)

Old Stack
– Started with a MySQL and PHP application
– AWS from the beginning
– On largest aws instance but still slow.

Decided to go with Microservices
– Put stuff in Docker
– Used Jenkins, Puppet, own Docker registry, Rundeck (see blog post)
– Devs didn’t like writing puppet code and other manual setup

Decided to go to new container management at start of 2016
– Was pushing for Nomad but devs liked Kubernetes

Kubernetes
– Built in ports, HA, LB, Health-checks

Concepts in Kub
– POD – one or more containers
– Deployment, Daemon, Pet Set – Scaling of a POD
– Service- resolvable name, load balancing
– ConfigMap, Volume, Secret – Extended Docker Volume

Devs look after some kub config files
– Brings them closer to how stuff is really working

Demo
– Using kubectl to create pod in his work’s lab env
– Add load balancer in front of it
– Add a configmap to update the container’s nginx config
– Make it public
– LB replicas, Rolling updates

Best Practices
– lots of small containers are better
– log to container stdout, preferably as JSON
– Test and know your resource requirements (at Movio dev teams specify, check and adjust)
– Be aware of the node sizes
– Stateless please
– if not stateless then clustered please
– Must handle unexpected immediate restarts


DevOpsDays Wellington 2016 – Day 2, Session 1

Jethro Carr – Powering stuff.co.nz with DevOps goodness

Stuff.co.nz
– “News” Website
– 5 person DevOps team

Devops
– “Something you do because Gartner said it’s cool”
– Sysadmin -> InfraCoder/SRE -> Dev Shepherd -> Dev
– Stuff in the middle somewhere
– DevSecOps

Company Structure drives DevOps structure
– Lots of products – one team != one product
– Dev teams with very specific focus
– Scale – too big, yet too small

About our team
– Mainly Ops focus
– small number compared to developers
– Operate like an agency model for developers
– “If you buy the Dom Post it would help us grow our team”
– Lots of different vendors with different skill levels and technology

Work process
– Use KanBan with Jira
– Works for Ops focussed team
– Not so great for long running projects

War Against OnCall
– Biggest cause of burnout
– focus on minimising callouts
– Zero alarm target
– Love pagerduty

Commonalities across platforms
– Everyone using compute
– Most Java and javascript
– Using Public Cloud
– Using off the shelf version control, deployment solutions
– Don’t get overly creative and make things too complex
– Proven technology that is well tried and tested and skills available in marketplace
– Classic technologies like Nginx, Java, Varnish still have their place. Don’t always need the latest fashion

Stack
– AWS
– Linux, Ubuntu
– Adobe AEM Java CMS
– AWS 14x c4.2xlarge
– Varnish in front, used by everybody else. Makes ELB and ALB look like toys

How we use Varnish
– Retries against backends if 500 replies, serve old copies
– split routes to various backends
– Control CDN via header
– Dynamic Configuration via puppet

CDN
– Akamai
– Keeps online during breaking load
– 90% cache offload
– Management is a bit slow and manual

Lambda
– Small batch jobs
– Check mail reputation score
– “Download file from a vendor” type stuff
– Purge cache when static file changes
– Lambda webapps – Hopefully soon, a bit immature

Increasing number of microservices

Standards are vital for microservices
– Simple and reasonable
– Shareable vendors and internal
– flexible
– grow organically
– Need to be detailed
– 12 factor App
– 3 languages: Node, Java, Ruby
– Common deps (SQL, Varnish, memcache, Redis)
– Standardise the build pipeline. Using Codeship
– Standardise server builds
– Everything Automated with puppet
– Puppet building docker containers (w puppet + puppetstry)
– Std Application deployment

Init systems
– Had proliferation
– pm2, god, supervisord, SysV init are out
– systemd and upstart are in

Always exceptions
– “Enterprise ___” is always bad
– Educating the business is a forever job
– Be reasonable, set boundaries

More Stuff at
http://tinyurl.com/notclickbaithonest

Q: Pull request workflow
A: Largely replaced traditional review

Q: DR eg AWS outage
A: Documented process: if Codeship dies we can push manually. The rest is in 2 AZs, with snapshots

Q: Dev teams structure
A: Project specific rather than product specific.

Q: Puppet code tested?
A: Not really, Kinda tested via the pre-prod environment, Would prefer result (server spec) testing rather than low level testing of each line
A: Code team have good test coverage though. 80-90% in many cases.

Q: Load testing, APM
A: Use New Relic. Not much luck with external load testing companies

Q: What if somebody wants something non-standard?
A: Case-by-case. Allowed if needed but needs a good reason.

Q: What happens when automation breaks?
A: Documentation is actually pretty good.


DevOpsDays Wellington 2016 – Day 1, Session 3

Owen Evans – DevOps is Dead, long live DevOps

Theory: DevOps is a role that never existed.

In the old days
– Shipping used to be hard and expensive, eg on physical media
– High cost of release
– but everybody else was the same.
– Lots of QA and red tape, no second chances

Then we got the Internet
– Speed became everything
– You just shipped enough

But Hardware still was a limiting factor
– Virtual machines
– IaaS
– Containers

This led to complacency
– Still had a physical server under it all

Birth of devops
– Software got faster but still had to have hardware under there somewhere
– Disparity between operations cadence and devs cadence
– things got better
– But we didn’t free ourselves from hardware
– Now everything is much more complex

Developers are now divorced from the platform
– Everything is abstracted
– It is leaky buckets all the way down

Solutions
– Education of developers as to what happens under the hood
– Stop reinventing the wheel
– Harmony is much more productive
– Lots of tools means that you don’t have enough expertise on each
– Reduce fiefdoms
– Push responsibility but not ownership (you own it but the devs make some of the changes)
– Live with the code
– Pit of success, easy ways to fail that don’t break stuff (eg test environments, by default it will do the right thing)
– Be Happy. Everybody needs to be a bit devops and know a bit of everything.


DevOpsDays Wellington 2016 – Day 1, Session 2

Martina Iglesias – Automatic Discovery of Service metadata for systems at scale

Backend developer at Spotify

Spotify Scale
– 100m active users
– 800+ tech employees
– 120 teams
– Microservices architecture

Walk through sample artist’s page
– each component (playlist, play count, discography) is a separate service
– Aggregated to send result back to client

Hard to co-ordinate between services as scale grows
– 1000+ services
– Each need to use each others APIs
– Dev teams all around the world

Previous Solution
– Teams had docs in different places
– Some in Wiki, Readme, markdown, all different

Current Solution – System Z
– Centralise in one place, as automated as possible
– Internal application
– Web app, catalog of all systems and their parts
– Well integrated with Apollo service

Web Page for each service
– Various tabs
– Configuration (showing versions of build and uptimes)
– API – list of all endpoints for service, scheme, errors codes, etc (automatically populated)
– System tab – Overview on how service is connected to other services, dependencies (generated automatically)

Registration
– System Z gets information from Apollo and prod servers about each service that has been registered

Apollo
– Java libs for writing microservices
– Open source

Apollo-meta
– Metadata module
– Exposes endpoint with metadata for each service
– Exposes
– instance info – versions, uptime
– configuration – currently loaded config of the service
– endpoints –
– call information – monitors the service, learns what incoming and outgoing calls it actually makes and to/from which other services, and returns that
– Automatically builds dependencies
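
As a rough illustration of how a dependency graph can fall out of observed call information, here is a minimal Python sketch. It is not how Apollo-meta or System Z is implemented; the record format (caller, callee) and all the service names are made up for the example.

```python
from collections import defaultdict

# Hypothetical call records as (caller, callee) pairs.
# The service names are invented for illustration only.
OBSERVED_CALLS = [
    ("artist-page", "playlist"),
    ("artist-page", "play-count"),
    ("artist-page", "discography"),
    ("playlist", "play-count"),
    ("artist-page", "playlist"),  # repeated calls collapse into one edge
]


def build_dependencies(calls):
    """Return (outgoing, incoming) maps of service name -> set of services."""
    outgoing = defaultdict(set)
    incoming = defaultdict(set)
    for caller, callee in calls:
        outgoing[caller].add(callee)
        incoming[callee].add(caller)
    return outgoing, incoming


outgoing, incoming = build_dependencies(OBSERVED_CALLS)
print(sorted(outgoing["artist-page"]))  # services that artist-page depends on
print(sorted(incoming["play-count"]))   # services that depend on play-count
```

Using sets means a service that is called thousands of times still shows up as a single dependency edge.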

Situation Now
– Quicker access to relevant information
– Automated boring stuff
– All in one place

Learnings
– Think about growth and scaling at the start of the project

Documentation generators
– Apollo
– Swagger.io
– raml.org

Blog: labs.spotify.com
Jobs: spotify.com/jobs

Q: How to handle breaking APIs
A: We create new version of API endpoint and encourage people to move over.

Bridget Cowie – The story of a performance outage, and how we could have prevented it

– Works for Datacom
– Consultant in Application performance management team

Story from Start of 2015

– Friday night phone calls from your boss are never good.
– Dropped in application monitoring tools (Dynatrace) on Friday night, watch over weekend
– Prev team pretty sure problem is a memory leak but had not been able to find it (for two weeks)
– If somebody tells you they know what is wrong but can’t find it, give details or fix it, then be suspicious

Book: Java Enterprise performance

– Monday prod load goes up and app starts crashing
– Told ops team but since crash wasn’t visible yet, was not believed. Waited

Tech Stack
– Java App, JBoss on Linux
– Multiple JVMs
– Oracle DBs, Mulesoft ESB, ActiveMQ, HornetQ

Ah Ha moment
– Had a look at import process
– 2.3 million DB queries per half hour
– With max of 260 users, seems way more than what is needed
– Happens even when nobody is logged in
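
To put those numbers in perspective, a quick back-of-the-envelope calculation (just arithmetic on the two figures quoted above, nothing from the talk itself):

```python
QUERIES_PER_HALF_HOUR = 2_300_000  # figure quoted in the talk
MAX_USERS = 260                    # figure quoted in the talk
SECONDS_PER_HALF_HOUR = 30 * 60

# Average query rate, and queries per user if every user were active.
queries_per_second = QUERIES_PER_HALF_HOUR / SECONDS_PER_HALF_HOUR
queries_per_user = QUERIES_PER_HALF_HOUR / MAX_USERS

print(f"{queries_per_second:.0f} queries/second")
print(f"{queries_per_user:.0f} queries per user per half hour")
```

That works out to roughly 1,300 queries every second, or nearly 9,000 queries per user each half hour even at the maximum user count.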

Tip: Typically 80% of all issues can be detected in dev or test if you look for them.

Where did this code come from?
– Process to import a csv into the database
– 1 call mule -> 12 calls to AMQ -> 12 calls to App -> 102 db queries
– Passes all the tests… But
– Still shows huge growth in queries as we go through layers
– DB queries grow bigger with each run

Tip: Know how your code behaves and track how this behaviour changes with each code change (or even with no code change)

Q: Why Dynatrace?
A: Quick to deploy, useful info back in only a couple of hours


DevOpsDays Wellington 2016 – Day 1, Session 1

Ken Mugrage – What we’re learning from burnout and how DevOps culture can help

Originally in the Marines, environment where burnout not tolerated
Works for Thoughtworks – not a mental health professional

Devops could make this worse
Some clichéd places say: “Teach the devs puppet and fire all the Ops people”

Why should we address burnout?
– Google found psychological safety was the number 1 indicator of an effective team
– Not just a negative, people do better job when feeling good.

What is burnout
– The Truth about burnout – Maslach and Leiter
– The Dimensions of Burnout
– Exhaustion
– Cynicism
– Mismatch between work and the person
– Work overload
– Lack of control
– Insufficient reward
– Breakdown of communication

Work overload
– Various prioritisation methods
– More load sharing
– Fewer deploy marathons
– Some orgs see devops as a cost saving
– There is no such thing as a full stack engineer
– team has skills, not a person

Lack of Control
– Team is ultimately responsible for the decisions
– Use the right technology and tools for the team
– This doesn’t mean a “Devops team” controlling what others do

Insufficient Reward
– Actually not a great motivator

Breakdown in communication
– Walls between teams are bad
– Everybody involved with product should be on the same team
– 2 pizza team
– Pairs with different skill sets are common
– Swarming can be done when required ( one on keyboard, everybody else watching and talking and helping on big screen)
– Blameless retrospectives are held
– No “Devops team”, creating a silo is not a solution for silos

Absence of Fairness
– You build it, you run it
– Everybody is responsible for quality
– Everybody is measured in the same way
– example Expedia – *everything* deployed has A/B testing
– everybody goes to release party

Conflicting Values
– In the broadest possible sense
– eg Company industry and values should match your own

Reminder: it is about you and how you fit in with the above

Pay attention to how you feel
– Increase your self awareness
– Maslach Burnout inventory
– Try not to focus on the negative.

Pay attention to work/life balance
– Ask for it, company might not know your needs
– If you can’t get it then quit

Talk to somebody
– Professional help is the best
– Trained to identify cause and effect
– can recommend treatment
– You’d call them if you broke your arm

Friends and family
– People who care, that you haven’t even met
– Empathy is great, but you aren’t a professional
– Don’t guess cause and effect
– Don’t recommend treatment if not a professional

Q: Is it Gender specific for men (since IT is male dominated) ?
– The “absence of fairness” problem is huge for women in IT

Q: How to promote Psychological safety?
– Blameless post-mortems

 

Damian Brady – Just let me do my job

After working in govt, went to work for new company and hoped to get stuff done

But whole dev team was unhappy
– Random work assigned
– All deadlines missed
– Lots of waste of time meetings

But 2 years later
– Hitting all deadlines
– Useful meetings

What changes were made?

New boss, who protected devs from MUD (Meetings, Uncertainty, Distractions)

Meetings
– In the broad sense: 1-1s, all hands, normal meetings
– People are averaging 7.5 hours/week in meetings
– On average 37% of meeting time is not relevant to person ( ~ $8,000 / year )
– Do meetings have goals and do they achieve those goals?
– 38% without goals
– only half of remaining meet those goals
– around 40% of meetings have and achieve goals
– Might not be wasted. Look at “What has changed as result of this meeting?”

Meetings fixes
– New Boss went to meetings for us (didn’t need everybody) as a representative
– Set a clear goal and agenda
– Avoid gimmicks
– don’t default to 30min or 1h

Distractions
– 60% of people interrupted 10 or more times per day
– Good to stay in a “flow state”
– 40% of people say they are regularly focussed in their work, but all are sometimes
– 35% of the time, focus is lost when interrupted
– Study shows people can take up to 23mins to get focus back after interruption
– $25,000/year wasted due to interruptions

Distraction Fixes
– Allowing headphones, rule not to interrupt people wearing headphones
– “Do not disturb” times
– Little Signs
– Had “the finger” so that you could tell somebody you were busy right now and would come back to them
– Let devs go to meeting rooms or cafes to hide from interruptions
– All “go dark” where email and chat turned off

Uncertainty
– 82% in survey were clear
– For nearly 60% of people, their top priority changes before they can finish it
– Autonomy, mastery, purpose

Uncertainty Fixes
– Tried to let people get clear runs at work
– Helped people acknowledge the unexpected work, add to Sprint board
– Established a gate – Business person would have to go through the manager
– Make the requester responsible – made the requester decide what stuff didn’t get done by physically removing stuff from the sprint board to add their own


Putting Prometheus node_exporter behind apache proxy

I’ve been playing with Prometheus monitoring lately. It is fairly new software that is getting popular. Prometheus works using a pull architecture. A central server connects to each thing you want to monitor every few seconds and grabs stats from it.

In the simplest case you run the node_exporter on each machine which gathers about 600-800 (!) metrics such as load, disk space and interface stats. This exporter listens on port 9100 and effectively works as an http server that responds to “GET /metrics HTTP/1.1” and spits several hundred lines of:

node_forks 7916
node_intr 3.8090539e+07
node_load1 0.47
node_load15 0.21
node_load5 0.31
node_memory_Active 6.23935488e+08
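
Since the output is just lines of "name value", it is easy to read from a script. Below is a minimal Python sketch that parses the sample above into a dictionary. It assumes plain lines without a trailing timestamp, and it keeps any {label} part as part of the metric name, so it is a simplification rather than a full parser for the Prometheus text format.

```python
def parse_metrics(text):
    """Parse simple 'name value' lines into a dict of floats.

    Blank lines and comment lines (starting with '#') are skipped.
    Assumes no trailing timestamp; labels stay part of the name.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the last space: everything before it is the name.
        name, _, value = line.rpartition(" ")
        metrics[name] = float(value)
    return metrics


SAMPLE = """\
node_forks 7916
node_intr 3.8090539e+07
node_load1 0.47
node_load15 0.21
node_load5 0.31
node_memory_Active 6.23935488e+08
"""

metrics = parse_metrics(SAMPLE)
print(metrics["node_load1"])
print(len(metrics), "metrics parsed")
```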

Other exporters listen on different ports and export stats for apache or mysql while more complicated ones will act as proxies for outgoing tests (via snmp, icmp, http). The full list of them is on the Prometheus website.

So my problem was that I wanted to check my virtual machine that is on Linode. The machine only has a public IP and I didn’t want to:

  1. Allow random people to check my server’s stats
  2. Have to setup some sort of VPN.

So I decided that the best way was to just put a user/password on the exporter.

However the node_exporter does not implement authentication itself since the authors wanted to avoid maintaining lots of security code. So I decided to put it behind a reverse proxy using apache mod_proxy.

Step 1 – Install node_exporter

Node_exporter is a single binary that I started via an upstart script. As part of the upstart script I told it to listen on localhost port 19100 instead of port 9100 on all interfaces

# cat /etc/init/prometheus_node_exporter.conf
description "Prometheus Node Exporter"

start on startup

chdir /home/prometheus/

script
/home/prometheus/node_exporter -web.listen-address 127.0.0.1:19100
end script

Once I start the exporter a simple “curl 127.0.0.1:19100/metrics” makes sure it is working and returning data.

Step 2 – Add Apache proxy entry

First make sure apache is listening on port 9100 . On Ubuntu edit the /etc/apache2/ports.conf file and add the line:

Listen 9100

Next create a simple apache proxy without authentication (don’t forget to enable mod_proxy and mod_proxy_http too):

# more /etc/apache2/sites-available/prometheus.conf 
<VirtualHost *:9100>
  ServerName prometheus

  CustomLog /var/log/apache2/prometheus_access.log combined
  ErrorLog /var/log/apache2/prometheus_error.log

  ProxyRequests Off
  <Proxy *>
    Allow from all
  </Proxy>

  ProxyErrorOverride On
  ProxyPass / http://127.0.0.1:19100/
  ProxyPassReverse / http://127.0.0.1:19100/
</VirtualHost>

This simply takes requests on port 9100 and forwards them to localhost port 19100 . Now reload apache and test via curl to port 9100. You can also use netstat to see what is listening on which ports:

Proto Recv-Q Send-Q Local Address   Foreign Address State  PID/Program name
tcp   0      0      127.0.0.1:19100 0.0.0.0:*       LISTEN 8416/node_exporter
tcp6  0      0      :::9100         :::*            LISTEN 8725/apache2

 

Step 3 – Get Prometheus working

I’ll assume at this point you have other servers working. What you need to do now is add the following entries for your server in your prometheus.yml file.

First add basic_auth into your scrape config for “node” and then add your servers, eg:

- job_name: 'node'

  scrape_interval: 15s

  basic_auth: 
    username: prom
    password: mypassword

  static_configs:
    - targets: ['myserver.example.com:9100']
      labels: 
         group: 'servers'
         alias: 'myserver'

Now restart Prometheus and make sure it is working. You should see the following lines in your apache logs plus stats for the server should start appearing:

10.212.62.207 - - [31/Jul/2016:11:31:38 +0000] "GET /metrics HTTP/1.1" 200 11377 "-" "Go-http-client/1.1"
10.212.62.207 - - [31/Jul/2016:11:31:53 +0000] "GET /metrics HTTP/1.1" 200 11398 "-" "Go-http-client/1.1"
10.212.62.207 - - [31/Jul/2016:11:32:08 +0000] "GET /metrics HTTP/1.1" 200 11377 "-" "Go-http-client/1.1"

Notice that connections are 15 seconds apart, get http code 200 and are 11k in size. The Prometheus server is using Authentication but apache doesn’t need it yet.
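
If you would rather check the spacing with a script than by eye, here is a small Python sketch that pulls the timestamps out of the log lines above and prints the gap between consecutive requests. It assumes the standard combined log format with the timestamp in square brackets and English month names.

```python
from datetime import datetime

# The three access-log lines shown above.
LOG_LINES = [
    '10.212.62.207 - - [31/Jul/2016:11:31:38 +0000] "GET /metrics HTTP/1.1" 200 11377 "-" "Go-http-client/1.1"',
    '10.212.62.207 - - [31/Jul/2016:11:31:53 +0000] "GET /metrics HTTP/1.1" 200 11398 "-" "Go-http-client/1.1"',
    '10.212.62.207 - - [31/Jul/2016:11:32:08 +0000] "GET /metrics HTTP/1.1" 200 11377 "-" "Go-http-client/1.1"',
]


def scrape_gaps(lines):
    """Return the seconds between consecutive requests in access-log lines."""
    times = []
    for line in lines:
        # The timestamp is the text between the first '[' and the next ']'.
        stamp = line.split("[", 1)[1].split("]", 1)[0]
        times.append(datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z"))
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


gaps = scrape_gaps(LOG_LINES)
print(gaps)  # each gap should match the scrape_interval of 15s
```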

Step 4 – Enable Authentication.

Now create an apache password file:

htpasswd -cb /home/prometheus/passwd prom mypassword

and update your apache entry to the following to enable authentication:

# more /etc/apache2/sites-available/prometheus.conf
<VirtualHost *:9100>
  ServerName prometheus

  CustomLog /var/log/apache2/prometheus_access.log combined
  ErrorLog /var/log/apache2/prometheus_error.log

  ProxyRequests Off
  <Proxy *>
    Order deny,allow
    Allow from all
    #
    AuthType Basic
    AuthName "Password Required"
    AuthBasicProvider file
    AuthUserFile "/home/prometheus/passwd"
    Require valid-user
  </Proxy>

  ProxyErrorOverride On
  ProxyPass / http://127.0.0.1:19100/
  ProxyPassReverse / http://127.0.0.1:19100/
</VirtualHost>

After you reload apache you should see the following:

10.212.56.135 - prom [01/Aug/2016:04:42:08 +0000] "GET /metrics HTTP/1.1" 200 11394 "-" "Go-http-client/1.1"
10.212.56.135 - prom [01/Aug/2016:04:42:23 +0000] "GET /metrics HTTP/1.1" 200 11392 "-" "Go-http-client/1.1"
10.212.56.135 - prom [01/Aug/2016:04:42:38 +0000] "GET /metrics HTTP/1.1" 200 11391 "-" "Go-http-client/1.1"

Note that the “prom” in field 3 indicates that we are logging in for each connection. If you try to connect to the port without authentication you will get:

Unauthorized
This server could not verify that you
are authorized to access the document
requested. Either you supplied the wrong
credentials (e.g., bad password), or your
browser doesn't understand how to supply
the credentials required.
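
For testing from a script, HTTP basic auth is just an extra header carrying the base64 of "user:password". Here is a minimal Python sketch that builds such a request with the standard library. The hostname and credentials are the example values used earlier in this post, and the actual fetch is left as a comment since it needs a reachable server.

```python
import base64
import urllib.request


def build_request(url, username, password):
    """Build a GET request carrying an HTTP basic auth header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Basic {token}")
    return request


# Example values from this post; substitute your own host and password.
request = build_request(
    "http://myserver.example.com:9100/metrics", "prom", "mypassword"
)
print(request.get_header("Authorization"))

# To actually fetch the metrics (needs a reachable server):
# body = urllib.request.urlopen(request, timeout=10).read().decode()
```

Keep in mind that basic auth over plain http sends the password in an easily decoded form, so anybody who can see the traffic can read it.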

That is pretty much it. Note that you will need to add additional VirtualHost entries for more ports if you run other exporters on the server.

 
