Everything Open 2025 – Day 2 – Morning

Skill Trees: Gamifying The Hard Things by Steph Piper

  • A list of skills
  • Each area has a series of skills that can be colored in.
  • Design
    • Hexagons are good
    • Can be done in any order, hard to connect meaningfully
    • Simple, flexible milestones
  • Reception
    • First one was 3D printing & modeling
    • Tested on makerspace student staff members. Good to identify gaps
  • Benefits
    • Reduce imposter syndrome or, on the other side, overconfidence
    • Target areas for improvement
  • Online on GitHub – https://github.com/sjpiper145/MakerSkillTree
  • How to make a skill tree
    • Flexibility, not too cost restrictive, globally applicable
    • Peer reviewed
    • Final skill tree and translation
  • Book – The Learning Game by Ana Lorena Fabrega
  • Beta testing a book that collects these skill trees.
    • Got published through “Make: Magazine”
    • 68 tiles per tree, 1020 skill tiles in the book
  • Tips for writing
    • Continue to evolve and improve
    • Doing your own illustrations is a huge time saver for the publisher
    • Confidence in your work. The publisher will only do the final publishing
  • Looking to fill the gaps
  • Working on a kids version of the book
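
As a rough sketch of the idea above (tiles that can be colored in, in any order, with the gaps visible at a glance), a tree could be modeled as below. The class, method names and tile labels are hypothetical and are not taken from the MakerSkillTree repository; only the figures of 68 tiles per tree and 1020 tiles in the book come from the talk.

```python
from dataclasses import dataclass, field

TILES_PER_TREE = 68   # figure quoted in the talk
TILES_IN_BOOK = 1020  # figure quoted in the talk


@dataclass
class SkillTree:
    """One tree: a named set of tiles that can be colored in, in any order."""
    name: str
    tiles: list[str]
    done: set[str] = field(default_factory=set)

    def color_in(self, tile: str) -> None:
        """Mark a tile as achieved; order does not matter."""
        if tile not in self.tiles:
            raise ValueError(f"unknown tile: {tile!r}")
        self.done.add(tile)

    def gaps(self) -> list[str]:
        """Tiles not yet colored in, i.e. areas to target for improvement."""
        return [t for t in self.tiles if t not in self.done]

    def progress(self) -> float:
        """Fraction of the tree that has been colored in."""
        return len(self.done) / len(self.tiles)


# Hypothetical placeholder tile names; the real trees use descriptive skills.
tree = SkillTree(
    "3D printing & modeling",
    [f"tile {i}" for i in range(1, TILES_PER_TREE + 1)],
)
tree.color_in("tile 40")  # a later tile first: any order is allowed
tree.color_in("tile 3")

remaining = len(tree.gaps())
trees_in_book = TILES_IN_BOOK // TILES_PER_TREE
print(remaining, trees_in_book)
```

At 68 tiles per tree, the 1020 tiles in the book work out to 15 trees.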

The Token Wars: Why not everything should be open by Kathy Reid

  • The Token Wars
    • A resource conflict fought through technical, social and legal means
  • What is a token?
    • An atomic unit of text taken from a larger collection called a corpus
    • text -> subword tokens -> vectorization
    • Transformer architecture
    • Word embeddings capture semantic closeness of words
  • Scaling up to billions of tokens
    • Train the relationships between tokens based on all the text
  • The value of tokens and token economics and the actors in the token wars
    • Are they a public good?
    • No, they are rivalrous, and either excludable or non-excludable
    • LLMs in 2024 were trained on 4 orders of magnitude more data than 5 years earlier.
    • Estimated 60-160 trillion tokens on the public web and some LLMs are trained on close to all of those
    • Synthetic data, especially low-quality slop, is polluting the Internet
    • Scrapers pick this up and train on it, raising concern about model collapse (like a photocopy of a photocopy). It reduces the diversity of what the model will produce.
  • Key actors in the token wars
  • Individual content creators
    • Included in corpus without permission
  • Platforms with user-generated content
    • Seeking to get paid for their content (e.g. Reddit's deal with OpenAI)
  • Archival Institutions
    • Australian National Film and Sound Archive: Maintain Trust, Transparency, Create Public Value
  • Private Companies
    • Anthropic: Model Context Protocol
  • The AI Companies
    • Have relied on fair use, although some countries don’t have that provision
    • Companies blocking the Common Crawl
  • Governments
    • Having trouble balancing interests
  • Token Tactics – Protecting your token treasure
    • Data poisoning
    • Blocking bots and scrapers
  • Data Sovereignty
  • Futures
    • Hunt for more tokens
    • Better ways to block/prevent
    • Better understanding of the collateral damage of the resource conflicts
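
The "text -> subword tokens -> vectorization" pipeline noted under "What is a token?" can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the five-entry vocabulary, the greedy longest-match splitting rule and the three-dimensional vectors are invented, whereas real models learn their subword vocabulary and embeddings from the corpus.

```python
import math

# Hypothetical toy vocabulary of subword pieces (real ones hold tens of thousands).
VOCAB = ["cat", "kitten", "invoice", "s", " "]
TOKEN_ID = {piece: i for i, piece in enumerate(VOCAB)}

# Made-up embedding vectors, one row per token ID, in the same order as VOCAB.
EMBEDDINGS = [
    [0.9, 0.1, 0.0],  # cat
    [0.8, 0.2, 0.1],  # kitten
    [0.0, 0.1, 0.9],  # invoice
    [0.1, 0.9, 0.1],  # s
    [0.1, 0.1, 0.1],  # space
]


def tokenize(text: str) -> list[str]:
    """Split text into subword pieces by greedy longest match against VOCAB."""
    pieces = []
    pos = 0
    while pos < len(text):
        match = max(
            (p for p in VOCAB if text.startswith(p, pos)),
            key=len,
            default=None,
        )
        if match is None:
            raise ValueError(f"no subword covers {text[pos:]!r}")
        pieces.append(match)
        pos += len(match)
    return pieces


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: close to 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


pieces = tokenize("kittens cats")              # text -> subword tokens
ids = [TOKEN_ID[p] for p in pieces]            # tokens -> integer IDs
vectors = [EMBEDDINGS[i] for i in ids]         # IDs -> vectors

cat, kitten, invoice = EMBEDDINGS[0], EMBEDDINGS[1], EMBEDDINGS[2]
print(pieces, ids)
print(cosine(cat, kitten) > cosine(cat, invoice))
```

With these invented values, "cat" sits much closer to "kitten" than to "invoice", which is the sense in which word embeddings capture semantic closeness.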