Everything Open 2025 – Day 2 – Afternoon

I skipped a couple of talks to do the hallway track and other stuff.

Koha – not your average library system by Aleisha Amohia

  • Named Koha (Māori for gift) because the software was made open source as a gift to the community
  • Started in 1999
  • First fully web-based open source library system
  • Bugs and external patches soon after
  • Customizable and Configurable
  • Used in 18,000+ libraries
  • It is just a big database
    • Can be used for more than just a library system
    • Can catalogue other things, such as documents, at organisations other than libraries
  • Configurable via CSS, fonts, languages, CMS, feature toggles, etc
  • Customisable views for each branch are possible
  • Special Beyond the code
    • Offline circulation
    • Supports non-ascii characters
    • Translation capability
  • Is it harder to find people to work on it since it is written in Perl, which is effectively a legacy language? – It has good onboarding and support for devs, and things still work
  • What are the challenges with it being open source? –
    • People worry about the quality of OSS. Fix: Good robust quality procedures
    • People think it is free. Fix: Have good support that is worth paying for
    • DB backend – MySQL and MariaDB

The circle of life: The Digital Skills GitBook project by Sara King

  • Working on project for last 5 years that is in the process of winding up
  • tinyurl.com/5539zzpx <- more information
  • Starting early 2019
  • 5 years later project is coming to the end of a natural cycle
  • Context
    • Group of 60 libraries looking for projects – CAUL Digital Librarians
    • Is there a book that teaches modern not-quite-technical computer skills?
  • With Pandemic lockdowns everybody started working from home
  • Why Gitbook?
    • Having “Book” in the name helped
    • Similar projects were using GitHub etc
    • CAUL eventually went with Pressbooks, but not till later
    • Also qualified for free version
    • Learning git was a useful thing
  • Did the community really need this? – Wasn’t checked in detail, but seemed a cool idea
  • Happened at start of pandemic
    • Everyone online
    • Supportive community was good at start of pandemic
  • Took some courses in git and other tools
  • Did a prototype book on another subject to get the hang of the tech
  • “Gave ourselves permission to not know what we were doing”
  • Created chapters of the books to give outline
    • Each Chapter had 3 levels of knowledge in it. Novice, proficient, advanced
  • Went public in late-2021
  • Also did code of conduct, license, contributions guidelines
  • Told people about it via various methods
  • Worked to get people to contribute ad-hoc
  • But didn’t get the amount of contributions they were expecting
  • In 2023 University libraries having problems, budgets shrinking etc
    • People leaving or too busy
    • Some used experience on the project to get new more technical jobs
  • No new people joining to replace those leaving
  • 2025 reflecting on the project
  • Process and product are different
  • We equated enthusiasm about the idea and the process with interest in the product. But people didn’t join in or weren’t super into the product
  • It wasn’t shared a lot and didn’t get many hits
  • Goal of training people to create stuff was a big success
  • People gained lots of confidence with new tech
  • Support of CAUL was great, but is no longer available
  • Next? – If people like the process maybe we should talk about that
  • Create a roadmap for other projects
  • Hand it over to somebody else? Doesn’t seem to be interest

Everything Open 2025 – Day 2 – Morning

Skill Trees: Gamifying The Hard Things by Steph Piper

  • A list of skills
  • Each area has a series of skills that can be colored in.
  • Design
    • Hexagons are good
    • Can be done in any order, hard to connect meaningfully
    • Simple, flexible milestones
  • Reception
    • First one was 3D printing & modeling
    • Tested on makerspace student staff members. Good to identify gaps
  • Benefits
    • Reduce imposter syndrome or, on the other side, overconfidence
    • Target areas for improvement
  • Online on git – https://github.com/sjpiper145/MakerSkillTree
  • How to make a skill tree
    • Flexibility, not too cost restrictive, globally applicable
    • Peer reviewed
    • Final skill tree and translation
  • Book – The Learning Game by Ana Lorena Fabrega
  • Beta testing book of a collection of these skills.
    • Got published through “Make: Magazine”
    • 68 tiles per tree, 1020 skill tiles in the book
  • Tips for writing
    • Continue to evolve and improve
    • Doing your own illustrations was a huge time saver for the publisher
    • Have confidence in your work. The publisher will only do the final publishing
  • Looking to fill the gaps
  • Working on a kids version of the book

The Token Wars: Why not everything should be open by Kathy Reid

  • The Token Wars
    • A resource conflict fought through technical, social and legal means
  • What is a token?
    • An atomic unit of text taken from a larger collection called a corpus
    • text -> subword tokens -> vectorization
    • Transformer architecture
    • Word embeddings capture semantic closeness of words
  • Scaling up to billions of tokens
    • Train the relationships between tokens based on all the text
  • The value of tokens and token economics and the actors in the token wars
    • Are they a public good?
    • No, they are rivalrous and either excludable or non-excludable
    • LLMs in 2024 were trained on 4 orders of magnitude more data than 5 years ago.
    • Estimated 60-160 trillion tokens on the public web and some LLMs are trained on close to all of those
    • Synthetic Data especially low quality slop is polluting the Internet
    • Scrapers pick this up and train on it, concern about Model Collapse (like a photocopy of a photocopy). Reduces the diversity of what it will produce.
  • Key actors in the token wars
  • Individual content creators
    • Included in corpus without permission
  • Platforms with user-generated content
    • Seeking to get paid for their content (eg Reddit deal with OpenAI)
  • Archival Institutions
    • Australian National Film and Sound Archive: Maintain Trust, Transparency, Create Public Value
  • Private Companies
    • Anthropic: Model Context Protocol
  • The AI Companies
    • Have relied on fair use, although some countries don’t have that provision
    • Companies blocking the Common Crawl
  • Governments
    • Having trouble balancing interests
  • Token Tactics – Protecting your token treasure
    • Data poisoning
    • Blocking bots and scrapers
  • Data Sovereignty
  • Futures
    • Hunt for more tokens
    • Better ways to block/prevent
    • Better understanding of the collateral damage of the resource conflicts
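
The "text -> subword tokens -> vectorization" step from the talk can be sketched roughly as below. This is a toy illustration only: the vocabulary, the greedy longest-match splitting and the random embedding table are all assumptions made up for the example, and real LLM tokenizers learn their vocabularies and embeddings from a corpus.

```python
import random

# Hypothetical toy vocabulary of subword pieces (real vocabularies are learned from a corpus).
VOCAB = ["token", "open", "source", "s", "ize", "r", " ", "<unk>"]
TOKEN_IDS = {piece: i for i, piece in enumerate(VOCAB)}


def tokenize(text):
    """Greedy longest-match split of text into subword pieces from VOCAB."""
    pieces = []
    pos = 0
    while pos < len(text):
        # Try the longest vocabulary piece that matches at this position.
        match = None
        for piece in sorted(VOCAB, key=len, reverse=True):
            if piece != "<unk>" and text.startswith(piece, pos):
                match = piece
                break
        if match is None:
            pieces.append("<unk>")  # character not covered by the vocabulary
            pos += 1
        else:
            pieces.append(match)
            pos += len(match)
    return pieces


def vectorize(pieces, dim=4, seed=0):
    """Map each piece to its id, then look the id up in a random embedding table."""
    rng = random.Random(seed)
    table = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in VOCAB]
    ids = [TOKEN_IDS[p] for p in pieces]
    return ids, [table[i] for i in ids]


pieces = tokenize("tokenizers open source")
ids, vectors = vectorize(pieces)
print(pieces)
print(ids)
```

In a trained model the embedding table is learned rather than random, which is what lets nearby vectors capture the semantic closeness mentioned above.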

Everything Open 2025 – Day 1 – Afternoon

The Storage Shift by Steven Ellis

  • Storage Data is critical for business
  • Requirements are always growing
  • Organisations already have existing solutions and relationships
  • Three Dimensions of data
  • Participants ( dev, ops, product ) all have different requirements and views
  • Where did you first store your data?
    • As spinning drives have gotten smaller the capacity has increased
    • Now people have small local storage and storage is not directly attached
  • Storage platforms / API driven storage
    • Block vs Files vs Object
  • Options for Kubernetes storage.
    • CSI operates on all levels
    • Able to create and destroy storage at Kubernetes speed rather than waiting for storage admin (or even cloud storage API)
  • Workload Examples
    • Kubevirt and Kubernetes centric but applicable elsewhere
  • What about prosumers?
    • Be careful with clouds except as backups
    • zfs and btrfs
    • Steven uses TrueNAS
    • 3 copies of all data. RAID isn’t a backup

What happened in production?! Instrumenting with OpenTelemetry by David Bell

  • A sample problem
    • Microservice based system
    • What happened in Production?
    • Errors up high, response time went bad
  • What about the logs?
    • 200s and then 500s. What does that mean?
  • Kept happening at 2pm every day. Sometimes bad, sometimes worse
  • O11y and OpenTelemetry
    • Find the internal state of a system just by asking questions
  • What about metrics
    • Pre-aggregated, No “connective tissue”, Can’t drill down
    • Answering known questions, good for alarms, graphs and dashboards
    • known-knowns and known-unknowns
  • What about Logs?
    • unstructured strings
    • Many logs lines per piece of work. Maybe with a request-id but not often
    • no schema or index, so can be quite slow to parse
    • structured logs sometimes work
    • expensive to store yourself or pay to have stored
    • But we should still log – audit logging and security logging
  • Tracing is good
    • separate tooling from logs and metrics
    • often limited fields
    • often limited traces to even look at (just the bad ones)
  • OpenTelemetry
    • covers metrics, logs and traces
    • wide language support and auto-instrumentation out of the box
    • Easy to get started
    • wrappers and external hooks
    • distributed tracing
  • Otel Traces
    • Traces are Directed Acyclic Graphs (DAGs) of Spans
    • Spans are sort of structured logs with required fields
    • Spans contain many attributes
    • Attributes can have high cardinality
    • Spans have high dimensionality
  • Otel isn’t for everything
    • Don’t put your secret data in it
    • Maybe not business logic
    • no guarantee on delivery (sometimes traces get lost)
    • Not for security/audit logging
  • Sampling can be useful
    • head-based sampling (decides whether to keep a trace at its start)
    • rule-based/tail-based grabs all and keeps some that are interesting
  • Setup (for Python) – no code changes
    • install a couple of packages. One to gather, one to send
    • pass in some env variables
    • Change docker run command to wrap your existing code
  • Setup (code changes )
    • Import packages
    • Shove attributes into a span in code (see example code in talk)
  • Demo of App (using Honeycomb)
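
The difference between head-based and tail-based sampling can be sketched with plain Python. This does not use the OpenTelemetry SDK; the function names, span fields and thresholds are all hypothetical and only illustrate where each decision gets made.

```python
import hashlib


def head_sample(trace_id, rate=0.1):
    """Head-based: decide from the trace id alone, before any spans are recorded."""
    digest = hashlib.sha256(trace_id.encode()).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


def tail_sample(spans, slow_ms=500):
    """Tail-based: look at the finished trace and keep it only if it is interesting."""
    has_error = any(span["status_code"] >= 500 for span in spans)
    is_slow = any(span["duration_ms"] >= slow_ms for span in spans)
    return has_error or is_slow


# Hypothetical finished traces: lists of spans with a few attributes each.
traces = {
    "trace-a": [{"name": "GET /cart", "status_code": 200, "duration_ms": 40}],
    "trace-b": [{"name": "GET /cart", "status_code": 200, "duration_ms": 35},
                {"name": "db.query", "status_code": 500, "duration_ms": 20}],
    "trace-c": [{"name": "POST /pay", "status_code": 200, "duration_ms": 900}],
}

kept_by_tail = [tid for tid, spans in traces.items() if tail_sample(spans)]
print(kept_by_tail)
```

Head sampling is cheap because nothing needs to be buffered, but it can throw away the one bad trace. Tail sampling has to hold every span until the trace finishes, which is the cost of only keeping the interesting ones.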

Please don’t forget my parents! – Digital Exclusion is happening, so you all better know about it by Sae Ra Germaine

  • Various Background Stuff
  • Her Parents retired to rural property near outer suburb of Melbourne
  • Two phone lines
  • Mobile reception only available standing outside of the house
  • Point-to-point wireless. Approx 1Mb/s but vulnerable to animals chewing through it
  • NBN
    • Originally was going to be Fiber to the premises.
    • Then it went cheaper, with fiber-to-the-curb or fiber-to-the-node and copper the rest of the way
    • Today 98% on NBN but not everybody well connected
    • Parents’ land line got cut off regularly due to errors
    • Then 3G got cut off. 4G at parents’ place doesn’t really work
  • Digital Divide
    • Everything is now all online ( jobs, doctors, social services )
    • Satellite based Internet a lot more expensive than comparable options in cities
    • During covid lockdowns they were over 5km from various services which was a problem with movement restrictions
  • Libraries had to pivot during lockdowns
    • wifi hotspots outside, accepting deliveries
    • Mobile libraries provide access to government services
    • Various other stuff on libraries

Open source voice interfaces in 2025 by Kit Biggs

  • Big changes in the last 12 months
  • AI has zoomed past inflated expectations and is now in the trough of disillusionment
  • Where are we with conversational user interfaces?
  • What are the steps/software needed for this?
  • Get the sound
    • Digital microphones are good and do the first rough filtering
  • Is somebody actually speaking?
    • xiao_respeaker – example software project
  • Wake word recognisers
    • Commercial software works with a “wake word” (Hey Siri)
    • Used to be hard to do, now easier
  • Word recognition just looks for specific words
    • Getting better
  • Continuous voice recognition
    • Also better
  • Intent recognition
    • Usually hooked in with communication to outside world
  • Feedback
    • Speech Synthesis is pretty much a solved problem
  • Looking at software you can use. Not cloud based
  • Wake Word
    • Picovoice Porcupine (non commercial or licensed). 16 languages
    • OpenWakeWord
      • Great docs
      • Trains on Synthetic speech
      • More than good enough
  • Speech to Text
    • OpenAI Whisper was leader
    • Lots of new ones. Look at Moonshine
  • Text to Speech
    • Piper is the stand-out, actively developed
    • Others mostly good for english-only
    • Emotional synthesis is getting better
  • Hardware
    • Raspberry Pi 4 or 5
      • 5 has the ability to plug in an accelerator
    • Rockchip Arm64 with neural coprocessor
    • AI in A Box ( Radxa Rock 5A)
  • Voice on a Microcontroller, the time has arrived
  • ESP32 processor is the most common option – $10 each
    • Dev board plus microphone maybe for $20 or so
    • Can do the wake word stuff and then stream audio to something with a higher spec
  • How small can you go?
    • What can you do with a small board just by itself?
    • Speech recognition on micro-controller not there yet but phrase and wake word recognition works
  • Glasses display looking almost there
    • Can have microphones
    • Avoid cameras to avoid privacy concerns
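
The "is somebody actually speaking?" step can be approximated with a simple energy threshold, sketched below. This is an assumption for illustration, not how xiao_respeaker or any of the projects above do it; the frame size, threshold and 16 kHz sample rate are made-up values.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of audio samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(samples, frame_size=160, threshold=0.1):
    """Return one True/False flag per frame: True when the frame is loud enough."""
    flags = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        flags.append(rms(frame) >= threshold)
    return flags


# Synthetic test signal: one quiet frame, one loud 440 Hz frame, one quiet frame,
# at an assumed 16 kHz sample rate (so 160 samples is 10 ms).
rate = 16000
quiet = [0.01 * math.sin(2 * math.pi * 440 * n / rate) for n in range(160)]
loud = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(160)]
flags = detect_speech(quiet + loud + quiet)
print(flags)
```

A threshold like this is cheap enough for a microcontroller, but it reacts to any loud noise, which is why a wake word recogniser normally sits behind it.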

Everything Open 2025 – Day 1 – Morning

Keynote: Sustaining Open Source Software by Justin Warren

Good talk. I advise you to watch it on video. Good thoughts on the economics of Open Source

Sandboxing untrusted code with WebAssembly by Katie Bell

  • Works for MongoDB. Webscale!
  • Untrusted Code
  • Example Shopify
    • Supports 3rd party apps
    • What happens when a 3rd-party app used by a lot of stores goes offline?
    • What if it is slow and inserts itself into the customer flow, making the experience bad?
    • Decided to host 3rd-party apps in their cloud to provide better reliability
    • Shopify decided to go with WebAssembly
  • Some alternatives for sandboxing
    • Small VM like Firecracker – 4MB memory, 125ms startup
    • Docker – Using Shared Kernel still
    • V8 Isolates – Used for isolating processes within a Chrome tab. Cloudflare runs many workers in a process, startup 5ms
    • But not a fair comparison. Lots of tradeoffs on security vs speed vs flexibility
  • WebAssembly
    • Designed to compile big apps to run in a browser (eg Photoshop)
    • Is a compile target – .wasm binary
    • Originally designed to usually be called from JavaScript (in browser)
    • Is a tiny simulated computer, very locked down, can’t interact with anything outside. Can just provide and call functions
    • When you build, the compiler will usually create a JavaScript wrapper to make it easier to use so you don’t have to call wasm directly.
  • WASI
    • An API lets you run webassembly programs as regular programs
    • wasmtime – program to run .wasm directly
    • Keeps things sandboxed but can optionally provide it with a very limited set of capabilities that must be explicitly granted
  • Sandboxing WebAssembly in the real world
    • Shopify use this. See their docs and definitions
    • Firefox and Graphite font shaping library
      • Compiled from native code into wasm to ensure memory safety rather than audit or rewrite in Rust
  • Is it secure?
    • Sometimes. But WASI is built with holes intentionally so can have bugs
    • Wasmtime has a lot of work put into sandboxing though
    • Use multiple layers of security
  • WASI standard is in progress (WebAssembly itself is fairly stable)

80% faster, 70% less memory: building a new high-performance, low-cost Prometheus query engine by Joshua Hesketh, Charles Korn

  • Work at Grafana Labs on the Mimir database
  • Explains time-series database. (Name+Labels)+time+number
  • Talk covers the query app which turns PromQL requests into a result
  • Memory used by the old software was bouncing around, so it had to be over-provisioned, which wastes money, or it sends back an error to the user if it runs out of memory.
  • Prometheus PromQL engine has little room for extensions
  • Problem
    • The Prometheus PromQL engine loads the entire series into memory before processing it further
    • Fix would require a rewrite.
    • Which they did
  • MQE engine
    • Loads a bunch of samples and then streams to operator(s). Then repeats a bit at a time
    • Will fall back to the Prometheus engine if a function is not yet implemented
    • Very efficient on range queries
  • He explained memory allocation strategy using pooling. I got a little lost
    • “That was a very oversimplified example”
  • query-tee
    • Send queries to two different engines and ensure they return the same result for testing
    • Has a test group of data that they can run this over as well as live queries. Might do fuzz query testing in future
  • Engine is available and can be switched in via command line
    • Does fall-back if things are not implemented
    • Implements the most common queries (above 90% of actual requests)
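
The load-everything versus streaming difference, plus the query-tee idea of checking both engines against each other, can be sketched as below. This is only loosely modelled on the description in the talk: the function names, the batch size and the sum-across-series "query" are hypothetical, and the real engine is far more involved.

```python
def load_then_sum(series):
    """Old-style approach: materialise every sample of every series, then process."""
    loaded = [list(samples) for samples in series]  # whole data set held in memory
    return [sum(column) for column in zip(*loaded)]


def stream_sum(series, batch_size=2):
    """Streaming approach: pull a small batch of series at a time into the operator."""
    totals = None
    batch = []
    for samples in series:
        batch.append(list(samples))
        if len(batch) == batch_size:
            totals = _accumulate(totals, batch)
            batch = []  # the batch can be released (or returned to a pool) here
    if batch:
        totals = _accumulate(totals, batch)
    return totals or []


def _accumulate(totals, batch):
    """Add each series in the batch into the running per-timestep totals."""
    for samples in batch:
        if totals is None:
            totals = [0.0] * len(samples)
        for i, value in enumerate(samples):
            totals[i] += value
    return totals


def make_series(count=5, steps=4):
    """Hypothetical generator of series; each series is a generator of samples."""
    for s in range(count):
        yield (float(s + t) for t in range(steps))


# query-tee style check: both engines must agree on the same query.
assert stream_sum(make_series()) == load_then_sum(make_series())
print(stream_sum(make_series()))
```

The streaming version only ever holds one batch plus the running totals, so peak memory depends on the batch size rather than on how many series the query touches.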


Audiobooks – December 2024

Kubrick: An Odyssey by Robert P Kolker & Nathan Abrams

A fairly straightforward telling of Kubrick’s life and films. Well researched and interesting. 3/5

Mercury Rising: John Glenn, John Kennedy, and the New Battleground of the Cold War by Jeff Shesol

The early days of US manned spaceflight centered around the story of John Glenn and to a lesser extent JFK. Interesting with a good hook. 4/5

Cities in the Sky: The Quest to Build the World’s Tallest Skyscrapers by Jason M Barr

A continent-by-continent tour of the history of Skyscrapers. Good coverage of developers, governments and economics. 3/5

Troublemakers: Silicon Valley’s Coming of Age by Leslie Berlin

The stories of seven important but lesser known pioneers in personal computing, video games, advanced semiconductor logic, modern venture capital, and biotechnology during the 1960s-1980s. 4/5

My Scoring System

  • 5/5 = Brilliant, top 5 book of the year
  • 4/5 = Above average, strongly recommend
  • 3/5 = Average, in the middle 70% of books I read
  • 2/5 = Disappointing
  • 1/5 = Did not like at all

AudioBooks – November 2024

Churchill: A Life by Martin Gilbert

Fairly but not exhaustively detailed, it is readable to someone casually interested. Authorised, so generally positive towards Churchill. 3/5

Tyranny of the Minority: How to Reverse an Authoritarian Turn, and Forge a Democracy for All by Steven Levitsky & Daniel Ziblatt

How various parts of the US constitution thwart the will of an expanding multicultural majority in favor of a shrinking rural white minority. Interesting. 3/5

The Human Tide: How Population Shaped the Modern World by Paul Morland

Explains the demographic transition and how it has flowed from the UK to Europe to the rest of the world and how this has and will influence history. 3/5

Accidental Astronomy: How Random Discoveries Shape the Science of Space by Chris Lintott

Covers the last 60 years, so many less well-known stories. Fun and interesting read. 4/5


Donations 2024

Each year I do the majority of my Charity donations in early December, timed to be around my birthday.

I do a blog post about it to hopefully inspire others. See previous years: 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015

All amounts are in $US unless otherwise stated.

General Charities

$895 to Givewell Top Charities fund. I’ve been donating to Givewell as my main “help the poor” charity since they have fairly low overheads and try and get the most impact from their donations. They also get good reviews for living up to these goals.

My employer matched this donation so total given to Givewell was $1790.

Software and Internet Infrastructure Projects

Software in the Public Interest, The Software Freedom Conservancy and LibreOffice all use Paypal which is blocking charity donations from Asia/Pacific so I was unable to donate to them.

Content creators

Other Projects

$NZ 100 to Greater Auckland for the Transport Advocacy and Content

Payments via Patreon / Nebula

No change from last year. I pay around $1/month to most of the below creators and I also pay $30/year for a Nebula subscription.


Audiobooks – September/October 2024

We Are the Nerds: The Birth and Tumultuous Life of Reddit, the Internet’s Culture Laboratory by Christine Lagorio-Chafkin

A history of reddit up to 2018. A little gushing and gossipy but mostly interesting. 3/5

Truman and the Bomb: The Untold Story by D. M. Giangreco

Fifty percent about the historical controversy rather than the events themselves. Lots of sniping at opponents. For friends of the author only. 2/5

Lakes: Their Birth, Life, and Death by John Richard Saylor

Delivers on the title. Interesting explanations of types of lakes, how they came to be and how they evolve. Great writing and lots of interesting information 4/5

Running The Show: Television from the Inside by Jeff Melvoin

A Veteran TV Writer and Showrunner writes about his career, the business and how to make it as a TV writer and possibly eventually a showrunner. Excellent 4/5


Audiobooks – August 2024

Pandora’s Box: How Guts, Guile, and Greed Upended TV by Peter Biskind

Covers the rise of HBO, Cable channels and Streamers since 1990. Lots of Gossip and corporate shuffles but not the best book on the subject. 3/5

Redshirts by John Scalzi

A Star Trek parody from the POV of five ensigns who realise something is very strange on their ship. Plot moves steadily and the humour and action mostly work. 3/5

The World Before Us: The New Science Behind Our Human Origins by Tom Higham

An account of the discovery and lives of Neanderthals, Denisovans and other hominids who shared the earth with Homo sapiens in the last 300,000 years. 4/5


Expanding the reach of Parnell Station

The Problem with Parnell Station

Since it opened in 2017 Parnell Station has been one of the least busy stations in Auckland. In the year to June 2019 there were just 168,000 boardings at the Station, ranking 36th out of 40 stations on the network.

While the suburb of Parnell is fairly high density and has a good mixture of retail, entertainment, office and residential, it is under-served by the station.

Parnell Station’s main problem is that it is in a valley with the Auckland Domain on one side and a steep hill to Parnell Road on the other. The way up the hill is steep, indirect and is not suitable for people with mobility issues. The route to the museum is a rough walking track. There is a dedicated path to the Carlaw Park student village and business centre however.

The poor accessibility to the main Parnell Road shopping/business area and even worse access to the St Georges Bay Road business area have hurt the station’s usage. These problems have been written about previously on Greater Auckland, twice.

A wheelchair accessible underpass between the two platforms was added to the station in early 2024. This enabled safer and easier transfer between platforms and access to the boardwalk to Carlaw Park. However the hill to Parnell Road is still a problem.

A Possible Solution – A Pedestrian Tunnel

My proposal is a pedestrian tunnel running from near the Parnell Station to the North-West under the main hill and emerging on St Georges Bay Road. Around the middle of the tunnel there would be elevators going up to Parnell Road. The tunnel would be around 550 metres long. The ends are at similar heights so the tunnel would be relatively flat while the central elevators would need to travel around 20 metres. The tunnel should be wide, well-lit and have security cameras etc to make people using it feel safe.

The elevators would be around 3 minutes walk from Parnell Station and 4-5 minutes from St Georges Bay Road. I’ve placed the street level access to the elevators in Heard Park on the corner of Parnell Road and Ruskin Street (at the bend in the above map). Probably several elevators would be required for redundancy and since traffic will probably be bursty.

The St Georges Bay Road entrance could be at the bottom of Garfield Street. It would probably be easiest to take up some street/footpath space and run parallel to the road before turning South-West once significantly deep. There are several hundred jobs within a couple of minutes walk of this entrance. There is also a Saturday Market nearby.

Overall the project should be only moderately expensive to build and improve the catchment and value of Parnell Station as well as linking three parts of Parnell better together.
